Given a function that produces a random integer in the range 1 to 5, write a function that produces a random integer in the range 1 to 7.


Current answer

(I have stolen Adam Rosenfield's answer and made it run about 7% faster.)

Assume rand5() returns one of {0,1,2,3,4} with equal probability, and the goal is to return one of {0,1,2,3,4,5,6} with equal probability.

int rand7() {
  int i = 5 * rand5() + rand5();
  int max = 25;
  //i is uniform among {0 ... max-1}
  while(i < max%7) {
    //i is uniform among {0 ... (max%7 - 1)}
    i *= 5;
    i += rand5(); //i is uniform {0 ... (((max%7)*5) - 1)}
    max %= 7;
    max *= 5; //once again, i is uniform among {0 ... max-1}
  }
  return(i%7);
}

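We keep track, in the variable max, of the largest value the loop could have produced so far. If the result so far lies between max%7 and max-1, it is uniformly distributed over that range, and because that range holds a multiple of 7 consecutive values, i%7 is uniform as well. Otherwise we keep the remainder, which is uniform between 0 and max%7-1, and combine it with another call to rand5() to make a new number and a new max. Then we start again.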
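The argument above can be checked without any randomness by pushing the exact probability of every value of i through the loop. The following is a minimal sketch, assuming rand5() is uniform on {0,...,4}; the name rand7_output_mass and the constant MAX_STATE are made up for this illustration and are not part of the answer.

```c
#include <assert.h>

#define MAX_STATE 30  /* max is 25 at first and at most 6 * 5 = 30 afterwards */

/* Hypothetical helper: follows the loop for `steps` evaluations of the
   while condition and accumulates in out[r] the probability that rand7()
   has returned r by then. */
void rand7_output_mass(int steps, double out[7])
{
    double weight[MAX_STATE] = {0.0};  /* weight[i] = P(current value is i) */
    double next[MAX_STATE];
    int max = 25;
    int i, d, s;

    assert(steps >= 0);
    for (i = 0; i < 7; i++)
        out[i] = 0.0;
    for (i = 0; i < 25; i++)
        weight[i] = 1.0 / 25.0;        /* i = 5 * rand5() + rand5() */

    for (s = 0; s < steps; s++) {
        int rem = max % 7;

        for (i = 0; i < MAX_STATE; i++)
            next[i] = 0.0;
        for (i = 0; i < max; i++) {
            if (i < rem) {
                /* loop body: i = 5 * i + rand5() */
                for (d = 0; d < 5; d++)
                    next[5 * i + d] += weight[i] / 5.0;
            } else {
                /* loop exits: return i % 7 */
                out[i % 7] += weight[i];
            }
        }
        for (i = 0; i < MAX_STATE; i++)
            weight[i] = next[i];
        max = rem * 5;
    }
}
```

Calling rand7_output_mass(12, out) should leave all seven entries of out equal and just below 1/7; the small shortfall is the probability that the loop is still running after 12 checks.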

Edit: the expected number of calls to rand5() is the x that solves this equation:

x =  2     * 21/25
   + 3     *  4/25 * 14/20
   + 4     *  4/25 *  6/20 * 28/30
   + 5     *  4/25 *  6/20 *  2/30 * 7/10
   + 6     *  4/25 *  6/20 *  2/30 * 3/10 * 14/15
   + (6+x) *  4/25 *  6/20 *  2/30 * 3/10 *  1/15
x = about 2.21 calls to rand5()
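The same figure can be reproduced numerically by walking through the states of the loop and summing (number of calls) × (probability of returning at that point). This is only a sketch under the answer's assumptions; the function name expected_rand5_calls and its steps parameter are invented here.

```c
#include <assert.h>

/* Hypothetical helper: sums the first `steps` terms of the series for the
   expected number of rand5() calls made by the rand7() above. */
double expected_rand5_calls(int steps)
{
    double expected = 0.0;
    double reach = 1.0;  /* probability of getting as far as this check */
    int calls = 2;       /* rand5() calls made before the first check */
    int max = 25;
    int s;

    assert(steps > 0);
    for (s = 0; s < steps; s++) {
        int rem = max % 7;
        double p_continue = (double)rem / max;  /* chance that i < max % 7 */

        expected += calls * reach * (1.0 - p_continue);
        reach *= p_continue;
        max = rem * 5;
        calls++;
    }
    return expected;
}
```

The states repeat every six checks (max goes 25, 20, 30, 10, 15, 5 and back to 25), which is where the (6+x) term in the equation comes from. With a few dozen steps the sum settles at roughly 2.2123.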

Other answers

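There is no (exactly correct) solution that runs in a constant amount of time, because 1/7 has an infinite expansion in base 5. One simple solution is to use rejection sampling, e.g.: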


int rand7()
{
  int i;
  do
  {
    i = 5 * (rand5() - 1) + rand5();  // i is now uniformly random between 1 and 25
  } while(i > 21);
  // i is now uniformly random between 1 and 21
  return i % 7 + 1;  // result is now uniformly random between 1 and 7
}

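The loop runs an expected 25/21 ≈ 1.19 times. It can in principle run arbitrarily long, but the probability of needing more than n iterations is only (4/25)^n.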
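To see why the accepted values give a uniform result, one can enumerate all 25 pairs of rand5() outcomes and count how many accepted pairs land on each result. A small sketch, assuming rand5() returns 1 to 5 as in this answer; the helper name accepted_pairs_for is hypothetical.

```c
#include <assert.h>

/* Hypothetical helper: counts the (first, second) pairs of rand5() results
   that are accepted by the loop and mapped to `result` (1..7). */
int accepted_pairs_for(int result)
{
    int a, b;
    int count = 0;

    assert(result >= 1 && result <= 7);
    for (a = 1; a <= 5; a++) {
        for (b = 1; b <= 5; b++) {
            int i = 5 * (a - 1) + b;  /* 1..25, each exactly once */

            if (i <= 21 && i % 7 + 1 == result)
                count++;
        }
    }
    return count;
}
```

Every result from 1 to 7 is reached by exactly 3 of the 21 accepted pairs, and the remaining 4 pairs are rejected, which gives the acceptance probability 21/25 behind the 25/21 figure.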

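This solution was inspired by Rob McAfee's. It does not need a loop, and the long-run frequencies come out uniform. Note, however, that the lastEdge fallback is a deterministic counter, so successive results are not independent random draws: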

// Returns 1-5
var rnd5 = function(){
   return Math.floor(Math.random() * 5) + 1;
}
// Helper
var lastEdge = 0;
// Returns 1-7
var rnd7 = function () {
  var map = [
     [ 1, 2, 3, 4, 5 ],
     [ 6, 7, 1, 2, 3 ],
     [ 4, 5, 6, 7, 1 ],
     [ 2, 3, 4, 5, 6 ],
     [ 7, 0, 0, 0, 0 ]
  ];
  var result = map[rnd5() - 1][rnd5() - 1];
  if (result > 0) {
    return result;
  }
  lastEdge++;
  if (lastEdge > 7 ) {
    lastEdge = 1;
  }
  return lastEdge;
};

// Test for a uniform distribution
var results = {}; for (var i = 0; i < 700000; i++) { var rand = rnd7(); results[rand] = results[rand] ? results[rand] + 1 : 1; }
console.log(results)

Results: [1:99560,2:99932,3:100355,4:100262,5:99603,6:100062,7:100226]


How about this:

rand5()%2 + rand5()%2 + rand5()%2 + rand5()%2 + rand5()%2 + rand5()%2

I'm not sure this is uniformly distributed. Any suggestions?
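One way to settle the question is to enumerate every outcome. The sketch below assumes the expression above is the sum of six rand5() % 2 terms and that rand5() returns 1 to 5 with equal probability; the helper name ways_to_get_sum is made up for the illustration.

```c
#include <assert.h>

/* Hypothetical helper: of the 5^6 = 15625 equally likely outcomes of six
   rand5() calls, counts those whose six "% 2" terms add up to `target`. */
long ways_to_get_sum(int target)
{
    long ways = 0;
    int n, k;

    assert(target >= 0 && target <= 6);
    for (n = 0; n < 15625; n++) {
        int digits = n;  /* the six base-5 digits of n encode one outcome */
        int sum = 0;

        for (k = 0; k < 6; k++) {
            sum += (digits % 5 + 1) % 2;  /* one simulated rand5() % 2 */
            digits /= 5;
        }
        if (sum == target)
            ways++;
    }
    return ways;
}
```

Under these assumptions a sum of 0 occurs in 64 outcomes, a sum of 4 in 4860 and a sum of 6 in 729, so the distribution is far from uniform. Since 15625 is not divisible by 7, no fixed number of rand5() calls can give an exactly uniform result.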

Here is a solution that tries to minimize the number of calls to rand5() while keeping the implementation simple and efficient; in particular, it does not require arbitrarily large integers, unlike Adam Rosenfield’s second answer. It exploits the fact that 23/19 = 1.21052... is a good rational approximation to log(7)/log(5) = 1.20906..., so we can generate 19 random elements of {1,...,7} from 23 random elements of {1,...,5} by rejection sampling with only a small rejection probability. On average, the algorithm below takes about 1.266 calls to rand5() for each call to rand7(). If the distribution of rand5() is uniform, so is that of rand7().

#include <stdint.h>

int rand5 (void);  /* the given generator, returns 1..5 */

uint_fast64_t pool;
int capacity = 0;

void new_batch (void)
{
  uint_fast64_t r;
  int i;

  do {
    r = 0;
    for (i = 0; i < 23; i++)
      r = 5 * r + (rand5() - 1);
  } while (r >= 11398895185373143ULL);  /* 7**19, a bit less than 5**23 */

  pool = r;
  capacity = 19;
}

int rand7 (void)
{
  int r;

  if (capacity == 0)
    new_batch();

  r = pool % 7;
  pool /= 7;
  capacity--;

  return r + 1;
}
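The figure of about 1.266 calls can be recomputed from the batch sizes: each attempt costs 23 calls, succeeds with probability 7^19 / 5^23, and a successful batch serves 19 results. The following sketch does that arithmetic; the names int_pow and rand5_calls_per_rand7 are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: base raised to exp, with no overflow check. */
uint64_t int_pow(uint64_t base, int exp)
{
    uint64_t result = 1;

    while (exp-- > 0)
        result *= base;
    return result;
}

/* Hypothetical helper: average rand5() calls per rand7() result when
   `inputs` base-5 digits are turned into `outputs` base-7 digits by
   rejection sampling, as in new_batch() above. */
double rand5_calls_per_rand7(int inputs, int outputs)
{
    uint64_t in_space, out_space;

    assert(inputs > 0 && inputs <= 27);    /* 5^27 still fits in 64 bits */
    assert(outputs > 0 && outputs <= 22);  /* 7^22 still fits in 64 bits */
    in_space = int_pow(5, inputs);
    out_space = int_pow(7, outputs);
    assert(out_space <= in_space);

    /* expected attempts = in_space / out_space, each costing `inputs` calls */
    return (double)inputs * ((double)in_space / (double)out_space) / outputs;
}
```

rand5_calls_per_rand7(23, 19) comes out at roughly 1.266, and int_pow(7, 19) matches the constant used in new_batch().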

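Some elegant algorithms have been given above, but here is another way to approach the problem, roundabout as it may be. I am assuming values start from 0.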

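R2 = random number generator giving values less than 2 (sample space = {0, 1})
R8 = random number generator giving values less than 8 (sample space = {0, 1, 2, 3, 4, 5, 6, 7})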

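To generate R8 from R2, run R2 three times and use the combined result of the three runs as a 3-digit binary number. Here is the range of values when R2 is run three times: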

0, 0, 0 --> 0
.
.
1, 1, 1 --> 7

Now, to generate R7 from R8, we simply run R8 again whenever it returns 7:

int R7() {
  int x;
  do {
    x = R8();
  } while (x > 6);
  return x;
}

The roundabout solution is to generate R2 from R5 (just as we generated R7 from R8), then R8 from R2, and then R7 from R8.
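The whole chain might look like the sketch below. It is one possible reading of the description, not the author's code: R5 here is a stand-in built on rand() % 5 (slightly biased, used only so the sketch runs), and R2 is obtained by rejecting 4 and taking the parity of the remaining values 0 to 3.

```c
#include <assert.h>
#include <stdlib.h>

/* ASSUMPTION: stand-in for the given generator, returning 0..4. */
int R5(void)
{
    return rand() % 5;
}

/* R2 from R5: reject 4, then 0..3 splits evenly into even and odd. */
int R2(void)
{
    int x;

    do {
        x = R5();
        assert(x >= 0 && x <= 4);  /* contract of the stand-in */
    } while (x > 3);
    return x % 2;
}

/* R8 from R2: three runs of R2 read as a 3-digit binary number. */
int R8(void)
{
    int b2 = R2();
    int b1 = R2();
    int b0 = R2();

    return 4 * b2 + 2 * b1 + b0;
}

/* R7 from R8: run R8 again whenever it returns 7. */
int R7(void)
{
    int x;

    do {
        x = R8();
    } while (x > 6);
    return x;
}
```

R7() returns values from 0 to 6; add 1 to match the 1 to 7 range in the question. This chain spends noticeably more R5 calls per result than the other answers, which is the price of the roundabout route.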