


one is valid for all generators. It is easier to see in a limit case. If your generator has a RAND_MAX of 2 (which isn't compliant with the C standard) and you want only 0 or 1 as a value, using modulo will generate 0 twice as often (when the generator produces 0 or 2) as it will generate 1 (when the generator produces 1). Note that this holds as long as you don't drop values: whatever mapping you use from the generator's values to the wanted ones, one value will occur twice as often as the other.

Some kinds of generators have low-order bits that are less random than the others, at least for some of their parameters, but sadly those parameters have other interesting characteristics (such as allowing RAND_MAX to be one less than a power of 2). The problem is well known, and library implementations have probably avoided it for a long time (for instance, the sample rand() implementation in the C standard uses this kind of generator but drops the 16 least significant bits), but some like to complain about that and you may have bad luck


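#include <assert.h>
#include <stdlib.h>

/* Returns a uniformly distributed integer in [0, n]. */
int alea(int n){
 assert (0 < n && n <= RAND_MAX);
 /* Size of each of the n+1 equal partitions of the generator's range. */
 int partSize =
      n == RAND_MAX ? 1 : 1 + (RAND_MAX-n)/(n+1);
 int maxUsefull = partSize * n + (partSize-1);
 int draw;
 do {
   draw = rand();   /* redraw values past the last full partition */
 } while (draw > maxUsefull);
 return draw/partSize;
}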

Generating a random number between 0 and n this way avoids both problems (and it avoids overflow when RAND_MAX == INT_MAX).

BTW, C++11 introduced standard ways to do the reduction and to use generators other than rand().



rand() % 3   // if RAND_MAX were only 10, gives
output of rand()   |   rand()%3
0                  |   0
1                  |   1
2                  |   2
3                  |   0
4                  |   1
5                  |   2
6                  |   0
7                  |   1
8                  |   2
9                  |   0




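A better solution than a loop (the loop is very inefficient and shouldn't even be suggested) is to use a PRNG with a much larger output range. The Mersenne Twister algorithm has a maximum output of 4,294,967,295. With it, MersenneTwister::genrand_int32() % 10 will, for all practical purposes, be evenly distributed, and the modulo bias effect will all but disappear.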
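To put a number on "all but disappear", here is a minimal sketch that counts how many 32-bit outputs land on each residue. The function name count_residue_32 is made up for this illustration, and it assumes the generator produces every value from 0 to 4,294,967,295 equally often.

```c
#include <assert.h>
#include <stdint.h>

/* Number of values in 0..2^32-1 that reduce to residue r modulo n.
   Hypothetical helper: assumes a perfectly uniform 32-bit generator. */
uint64_t count_residue_32(uint32_t n, uint32_t r) {
    assert(n > 0 && r < n);
    const uint64_t total = UINT64_C(1) << 32;   /* 4,294,967,296 outputs */
    /* Every residue gets total / n hits; the first total % n residues
       get one extra hit each. */
    return total / n + (r < total % n ? 1 : 0);
}
```

For n = 10 the residues 0 to 5 each get exactly one more hit than the residues 6 to 9, so the bias is still there, just tiny relative to roughly 429 million hits per residue.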


int unbiased_random_bit() {
    int x1, x2, prev;
    prev = 2;
    x1 = rand() % 2;
    x2 = rand() % 2;

    for (;; x1 = rand() % 2, x2 = rand() % 2)
    {
        if (x1 ^ x2)      // 01 -> 1, or 10 -> 0.
        {
            return x2;
        }
        else if (x1 & x2)
        {
            if (!prev)    // 0011
                return 1;
            else
                prev = 1; // 1111 -> continue, bias unresolved
        }
        else
        {
            if (prev == 1)// 1100
                return 0;
            else          // 0000 -> continue, bias unresolved
                prev = 0;
        }
    }
}


1 % 2 = 1
2 % 2 = 0
3 % 2 = 1
Random_between(1,3) % 2 = more likely a 1

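In this example, you should not use % 2 when you want a random number between 0 and 1. You could get a random number between 0 and 2 by using % 3, though, because in this case RAND_MAX is a multiple of 3.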


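There are simpler ways, but to add to the other answers, here is my solution for getting a random number between 0 and n - 1, so n different possibilities, without bias.

The number of bits (not bytes) needed to encode the number of possibilities is the number of bits of random data you'll need.
Encode the number from random bits.
If this number is >= n, restart (no modulo).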



next: n

    | bitSize r from to |
    n < 0 ifTrue: [^0 - (self next: 0 - n)].
    n = 0 ifTrue: [^nil].
    n = 1 ifTrue: [^0].
    cache isNil ifTrue: [cache := OrderedCollection new].
    cache size < (self randmax highBit) ifTrue: [
        Security.DSSRandom default next asByteArray do: [ :byte |
            (1 to: 8) do: [ :i |    cache add: (byte bitAt: i)]
        ]
    ].
    r := 0.
    bitSize := n highBit.
    to := cache size.
    from := to - bitSize + 1.
    (from to: to) do: [ :i |
        r := r bitAt: i - from + 1 put: (cache at: i)
    ].
    cache removeFrom: from to: to.
    r >= n ifTrue: [^self next: n].
    ^r
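For readers who don't speak Smalltalk, the same idea can be sketched in C. This is not a translation of the method above, only an illustration of the three steps: the names uniform_from_bits and rand_bit are invented here, the bit source is passed in as a function pointer, and rand_bit is a stand-in that assumes RAND_MAX is odd (so that the upper half of the range is exactly half of the values). The sketch also sizes the draw for n - 1 rather than n, which is slightly tighter.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical bit source for illustration: the upper half of rand()'s
   range counts as 1. The split is only even when RAND_MAX is odd. */
static int rand_bit(void) {
    return rand() > RAND_MAX / 2;
}

/* Returns a value in [0, n) built from single random bits, retrying
   (no modulo) whenever the assembled value is >= n. */
unsigned uniform_from_bits(unsigned n, int (*next_bit)(void)) {
    assert(n > 0);

    /* Number of bits needed to encode the largest result, n - 1. */
    int bits = 0;
    for (unsigned m = n - 1; m != 0; m >>= 1)
        bits++;

    for (;;) {
        unsigned r = 0;
        for (int i = 0; i < bits; i++)
            r = (r << 1) | (unsigned)(next_bit() & 1);
        if (r < n)          /* in range: accept */
            return r;
        /* out of range: throw the bits away and draw again */
    }
}
```

A call such as uniform_from_bits(5, rand_bit) draws 3 bits at a time and accepts 5 of the 8 possible patterns, so on average it needs fewer than two attempts.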



When rand() returns 0, 3, 6, or 9, rand()%3 == 0. Therefore, P(0) = 4/11

When rand() returns 1, 4, 7, or 10, rand()%3 == 1. Therefore, P(1) = 4/11

When rand() returns 2, 5, or 8, rand()%3 == 2. Therefore, P(2) = 3/11
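The counts behind those three probabilities can be checked by brute force. A minimal sketch, where count_residue is a name made up for this example and max stands in for a toy RAND_MAX of 10 (no real implementation has one that small):

```c
#include <assert.h>

/* Counts how many values in 0..max (inclusive) reduce to residue r
   modulo n. Hypothetical helper; max plays the role of RAND_MAX. */
int count_residue(int max, int n, int r) {
    assert(n > 0 && max >= 0 && 0 <= r && r < n);
    int count = 0;
    for (int v = 0; v <= max; v++)
        if (v % n == r)
            count++;
    return count;
}
```

With max = 10 and n = 3, the residues 0, 1 and 2 are hit 4, 4 and 3 times out of 11 equally likely outputs.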


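So when does rand()%n return the numbers from 0 to n-1 with equal probability? When RAND_MAX%n == n - 1. In that case, together with our earlier assumption that rand() returns every number between 0 and RAND_MAX with equal probability, the residue classes modulo n are also equally distributed.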
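That condition is a one-liner to test. A small sketch, with modulo_is_uniform as an invented name and max standing in for RAND_MAX so that toy values can be tried:

```c
#include <assert.h>
#include <stdbool.h>

/* True when reducing a uniform draw from 0..max modulo n gives every
   residue the same number of hits. Hypothetical helper for illustration. */
bool modulo_is_uniform(int max, int n) {
    assert(n > 0 && max >= 0);
    return max % n == n - 1;
}
```

For instance, a max of 32767 passes for n = 2 (32767 % 2 == 1) but fails for n = 3 (32767 % 3 == 1, not 2).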


int x; 
do {
    x = rand();
} while (x >= n);


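A more efficient approach is to take a large range whose length is divisible by n, such as RAND_MAX - RAND_MAX % n, keep generating random numbers until you get one that lies in that range, and then take the modulus: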

int x;

do {
    x = rand();
} while (x >= (RAND_MAX - RAND_MAX % n));

x %= n;
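Wrapped up as a function, that could look like the sketch below. Note that this is a variant, not the exact loop above: the names accept_limit and uniform_below are invented for this example, and the cutoff is computed so that no values are thrown away when RAND_MAX + 1 happens to be an exact multiple of n. It assumes 0 < n <= RAND_MAX.

```c
#include <assert.h>
#include <stdlib.h>

/* Largest draw to accept, chosen so that 0..limit holds a whole number
   of blocks of n values. max plays the role of RAND_MAX. Written to
   avoid computing max + 1, which could overflow. */
int accept_limit(int max, int n) {
    assert(0 < n && n <= max);
    return max - (max % n + 1) % n;
}

/* Returns a value in [0, n) by rejecting draws above the cutoff and
   reducing the accepted draw modulo n. */
int uniform_below(int n) {
    int limit = accept_limit(RAND_MAX, n);
    int x;
    do {
        x = rand();
    } while (x > limit);
    return x % n;
}
```

With a toy max of 10 and n = 3 the cutoff is 8, so only 9 and 10 are redrawn; with a max of 7 and n = 4 nothing is redrawn at all.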



@user1413793 is correct about the problem. I'm not going to discuss that further, except to make one point: yes, for small values of n and large values of RAND_MAX, the modulo bias can be very small. But using a bias-inducing pattern means that you must consider the bias every time you calculate a random number and choose different patterns for different cases. And if you make the wrong choice, the bugs it introduces are subtle and almost impossible to unit test. Compared to just using the proper tool (such as arc4random_uniform), that's extra work, not less work. Doing more work and getting a worse solution is terrible engineering, especially when doing it right every time is easy on most platforms.





/*
 * Calculate a uniformly distributed random number less than upper_bound
 * avoiding "modulo bias".
 *
 * Uniformity is achieved by generating new random numbers until the one
 * returned is outside the range [0, 2**32 % upper_bound).  This
 * guarantees the selected random number will be inside
 * [2**32 % upper_bound, 2**32) which maps back to [0, upper_bound)
 * after reduction modulo upper_bound.
 */
u_int32_t
arc4random_uniform(u_int32_t upper_bound)
{
    u_int32_t r, min;

    if (upper_bound < 2)
        return 0;

    /* 2**32 % x == (2**32 - x) % x */
    min = -upper_bound % upper_bound;

    /*
     * This could theoretically loop forever but each retry has
     * p > 0.5 (worst case, usually far better) of selecting a
     * number inside the range we need, so it should rarely need
     * to re-roll.
     */
    for (;;) {
        r = arc4random();
        if (r >= min)
            break;
    }

    return r % upper_bound;
}


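Change arc4random_uniform() to calculate 2**32 % upper_bound as -upper_bound % upper_bound. Simplifies the code and makes it the same on both ILP32 and LP64 architectures, and also slightly faster on LP64 architectures by using a 32-bit remainder instead of a 64-bit remainder.

Pointed out by Jorden Verwer on tech@
ok deraadt; no objections from djm or otto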
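The identity that commit relies on is easy to check in isolation. A minimal sketch, using the standard uint32_t in place of u_int32_t; the two function names are made up for the comparison and are not part of any library:

```c
#include <assert.h>
#include <stdint.h>

/* 2**32 % ub computed the obvious way, with a 64-bit remainder. */
uint32_t min_64bit(uint32_t ub) {
    assert(ub > 0);
    return (uint32_t)((UINT64_C(1) << 32) % ub);
}

/* The same value using only 32-bit arithmetic: unsigned negation wraps
   around to 2**32 - ub, and (2**32 - ub) % ub == 2**32 % ub. */
uint32_t min_32bit(uint32_t ub) {
    assert(ub > 0);
    return (uint32_t)(0u - ub) % ub;
}
```

For an upper bound of 10 both give 6, which means the draws 0 to 5 are the ones that get rejected.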


public int nextInt(int n) {
   if (n <= 0)
     throw new IllegalArgumentException("n must be positive");

   if ((n & -n) == n)  // i.e., n is a power of 2
     return (int)((n * (long)next(31)) >> 31);

   int bits, val;
   do {
       bits = next(31);
       val = bits % n;
   } while (bits - val + (n-1) < 0);
   return val;
}