


import java.util.Random;

public class RandomUnif {

    public static final int BASE_NUMBER = 5;

    private static Random rand = new Random();

    /** given generator, returns uniform integer in the range 0..BASE_NUMBER-1 */
    public static int randomBASE() {
        return rand.nextInt(BASE_NUMBER);
    }

    /** returns uniform integer in the range 0..n-1 using randomBASE() */
    public static int randomUnif(int n) {
        int rand, factor;
        if( n <= 1 ) return 0;
        else if( n == BASE_NUMBER ) return randomBASE();
        if( n < BASE_NUMBER ) {
            // collapse the base range into n buckets, rejecting the overflow
            factor = BASE_NUMBER / n;
            do {
                rand = randomBASE() / factor;
            } while(rand >= n);
            return rand;
        } else {
            // one base digit on top of a recursively generated smaller range
            factor = (n - 1) / BASE_NUMBER + 1;
            do {
                rand = factor * randomBASE() + randomUnif(factor);
            } while(rand >= n);
            return rand;
        }
    }
}


 n  calls (average calls to randomBASE() per call to randomUnif(n))
 2  1.250 
 3  1.644 
 4  1.252 
 5  1.000 
 6  3.763 
 7  3.185 
 8  2.821 
 9  2.495 
10  2.250 
11  3.646 
12  3.316 
13  3.060 
14  2.853 
15  2.650 
16  2.814 
17  2.644 
18  2.502 
19  2.361 
20  2.248 
21  2.382 
22  2.277 
23  2.175 
24  2.082 
25  2.000 
26  5.472 
27  5.280 
28  5.119 
29  4.899 



$num = 0;
$possibilities = 1;

sub rand7
{
  # top up the accumulator with base-5 digits until it can yield a base-7 digit
  while( $possibilities < 7 )
  {
    $num = $num * 5 + int(rand(5));
    $possibilities *= 5;
  }
  my $result = $num % 7;
  $num = int( $num / 7 );
  $possibilities /= 7;
  return $result;
}


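function rand7() {

    var returnVal = 4;

    for (var n=0; n<3; n++) {
        var rand = rand5();

        // assumed step sizes: +1 for 1 or 2, -1 for 3 or 4, no change for 5
        if (rand==1||rand==2){
            returnVal += 1;
        }
        else if (rand==3||rand==4) {
            returnVal -= 1;
        }
    }

    return returnVal;
}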

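I built a test function that loops rand7() 10,000 times, sums all of the return values, and divides by 10,000. If rand7() is working correctly, the calculated average should be 4, since (1+2+3+4+5+6+7) / 7 = 4. After running the test several times, the average is indeed 4 :)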


int rand5 (void);  /* assumed: returns 1..5, defined elsewhere */

static struct rand_buffer {
  unsigned v, count;
} buf2, buf3;

/* Append one base-n digit v to the buffer. */
void push (struct rand_buffer *buf, unsigned n, unsigned v)
{
  buf->v = buf->v * n + v;
  ++buf->count;
}

#define PUSH(n, v)  push (&buf##n, n, v)

/* Take four buffered bits as a number in 0..15. */
int rand16 (void)
{
  int v = buf2.v & 0xf;
  buf2.v >>= 4;
  buf2.count -= 4;
  return v;
}

/* Take two buffered base-3 digits as a number in 0..8. */
int rand9 (void)
{
  int v = buf3.v % 9;
  buf3.v /= 9;
  buf3.count -= 2;
  return v;
}

int rand7 (void)
{
  if (buf3.count >= 2) {
    int v = rand9 ();

    if (v < 7)
      return v % 7 + 1;

    PUSH (2, v - 7);
  }

  for (;;) {
    if (buf2.count >= 4) {
      int v = rand16 ();

      if (v < 14) {
        PUSH (2, v / 7);
        return v % 7 + 1;
      }

      PUSH (2, v - 14);
    }

    // Get a number between 0 and 24
    int v = 5 * (rand5 () - 1) + rand5 () - 1;

    if (v < 21) {
      PUSH (3, v / 7);
      return v % 7 + 1;
    }

    // Leftover values 21..24 carry two random bits; save them
    v -= 21;
    PUSH (2, v & 1);
    PUSH (2, v >> 1);
  }
}



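function rnd7() {
    var r1, r2, result;
    do {
        r1 = rnd5() - 1;          // uniform in 0..4
        do {
            r2 = rnd5() - 1;
        } while (r2 > 1);         // keep only 0 or 1: one fair bit
        result = r2 * 5 + r1;     // uniform in 0..9
    } while (result > 6);         // reject 7..9
    return result + 1;
}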


The entropy of a random variable is a well-defined quantity. For a random variable which takes on N states with equal probabilities (a uniform distribution), the entropy is log2 N. Thus, rand5() has approximately 2.32193 bits of entropy, and rand7() has about 2.80735 bits of entropy. If we hope to maximize our use of randomness, we need to use all 2.32193 bits of entropy from each call to rand5(), and apply them to generating 2.80735 bits of entropy needed for each call to rand7(). The fundamental limit, then, is that we can do no better than log(7)/log(5) = 1.20906 calls to rand5() per call to rand7().
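The figures quoted above can be reproduced in a few lines. This is only an illustrative sketch; the helper name `entropy_bits` is made up for the example and is not part of the answer.

```python
import math

def entropy_bits(n_states):
    """Entropy, in bits, of a uniform distribution over n_states outcomes."""
    return math.log2(n_states)

h5 = entropy_bits(5)   # bits supplied by one rand5() call
h7 = entropy_bits(7)   # bits needed for one rand7() result
min_calls = h7 / h5    # lower bound on rand5() calls per rand7() call

print(f"rand5: {h5:.5f} bits, rand7: {h7:.5f} bits, bound: {min_calls:.5f}")
```

The ratio is the same whichever logarithm base is used, which is why log(7)/log(5) gives the bound directly.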


So how do we do it? We generate an infinitely precise random real number between 0 and 1 (pretend for the moment that we could actually compute and store such an infinitely precise number -- we'll fix this later). We can generate such a number by generating its digits in base 5: we pick the random number 0.a_1a_2a_3..., where each digit a_i is chosen by a call to rand5(). For example, if our RNG chose a_i = 1 for all i, then ignoring the fact that that isn't very random, that would correspond to the real number 1/5 + 1/5^2 + 1/5^3 + ... = 1/4 (sum of a geometric series).
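To make the geometric-series example concrete, here is a small sketch that evaluates a finite base-5 expansion exactly with Python's `fractions` module. The function name `base5_value` and the cutoff of 20 digits are illustrative assumptions.

```python
from fractions import Fraction

def base5_value(digits):
    """Exact value of 0.d1d2d3... (base 5) for a finite list of digits."""
    value = Fraction(0)
    for i, d in enumerate(digits, start=1):
        value += Fraction(d, 5 ** i)
    return value

partial = base5_value([1] * 20)
gap = Fraction(1, 4) - partial   # remaining tail of the geometric series
print(float(partial), float(gap))
```

After 20 digits the partial sum is already within 5^-20 / 4 of 1/4, and every further digit shrinks the gap by another factor of 5.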

Ok, so we've picked a random real number between 0 and 1. I now claim that such a random number is uniformly distributed. Intuitively, this is easy to understand, since each digit was picked uniformly, and the number is infinitely precise. However, a formal proof of this is somewhat more involved, since now we're dealing with a continuous distribution instead of a discrete distribution, so we need to prove that the probability that our number lies in an interval [a, b] equals the length of that interval, b - a. The proof is left as an exercise for the reader =).
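The full proof is not attempted here, but a finite version of the claim can be checked exactly. If only the first k digits are kept, all 5^k prefixes are equally likely, and the share of them landing in [a, b) should match b - a up to the truncation error of 5^-k. The sketch below counts this by brute force; the name `prefix_share` and the choice k = 4 are assumptions made for the example.

```python
from fractions import Fraction
from itertools import product

def prefix_share(a, b, k):
    """Share of the 5**k equally likely k-digit base-5 prefixes whose
    truncated value lies in [a, b)."""
    hits = 0
    for digits in product(range(5), repeat=k):
        value = sum(Fraction(d, 5 ** i) for i, d in enumerate(digits, start=1))
        if a <= value < b:
            hits += 1
    return Fraction(hits, 5 ** k)

a, b = Fraction(1, 5), Fraction(3, 5)
share = prefix_share(a, b, 4)
print(share, b - a)
```

When the endpoints are themselves multiples of 5^-k, as here, the share equals b - a exactly; for other endpoints it is off by less than 5^-k, which vanishes as k grows.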





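1. Multiply by 7.
2. The integer part of the result is the next base-7 digit.
3. Subtract the integer part, leaving only the fractional part.
4. Go to step 1.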
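The conversion loop above can be sketched with exact rational arithmetic. The function name `base7_digits` is hypothetical, and the input 1/4 is the value from the earlier all-ones example.

```python
from fractions import Fraction

def base7_digits(x, count):
    """First `count` base-7 digits of x, where 0 <= x < 1."""
    digits = []
    for _ in range(count):
        x *= 7                 # multiply by 7
        digit = int(x)         # integer part is the next base-7 digit
        digits.append(digit)
        x -= digit             # keep only the fractional part
    return digits

print(base7_digits(Fraction(1, 4), 6))   # prints [1, 5, 1, 5, 1, 5]
```

Each digit produced this way is one candidate output in the range 0..6.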

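To deal with the problem of infinite precision, we compute a partial result, and we also store an upper bound on what the result could be. That is, suppose we've called rand5() twice and it returned 1 both times. The number we've generated so far is 0.11 (base 5). Whatever the rest of the infinite series of calls to rand5() produces, the random real number we're generating will never be larger than 0.12: it is always true that 0.11 ≤ 0.11xyz... < 0.12.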
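The bound-tracking idea can be tried out on the 0.11 example. The sketch below computes the interval implied by a digit prefix and reports the first base-7 digit as soon as the whole interval agrees on it. The names `bounds` and `first_base7_digit` are assumptions for illustration, and the sketch uses exact fractions for readability rather than scaled integers.

```python
from fractions import Fraction

def bounds(digits):
    """Interval [low, high) containing every number whose base-5
    expansion starts with the given digits."""
    low = sum((Fraction(d, 5 ** i) for i, d in enumerate(digits, start=1)),
              Fraction(0))
    high = low + Fraction(1, 5 ** len(digits))
    return low, high

def first_base7_digit(digits):
    """The first base-7 digit if the prefix already pins it down, else None."""
    low, high = bounds(digits)
    digit = int(low * 7)
    # high is exclusive, so the digit is fixed when 7*high <= digit + 1
    return digit if high * 7 <= digit + 1 else None

print(bounds([1, 1]))             # low = 6/25, high = 7/25
print(first_base7_digit([1]))     # one digit is not enough here
print(first_base7_digit([1, 1]))  # two digits fix the first output
```

With the prefix 0.11 the interval is [6/25, 7/25), and multiplying both ends by 7 gives [1.68, 1.96), so the first base-7 digit must be 1 no matter what digits follow.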



import math
import random

rand5_calls = 0
def rand5():
    global rand5_calls
    rand5_calls += 1
    return random.randint(0, 4)

def rand7_gen():
    state = 0   # lower bound of the current interval, scaled by pow5
    pow5 = 1    # 5**(number of base-5 digits consumed)
    pow7 = 7    # width of the interval in the same scaled units
    while True:
        # Both ends of the interval agree on the next base-7 digit: emit it
        if state // pow5 == (state + pow7) // pow5:
            result = state // pow5
            state = (state - result * pow5) * 7
            pow7 *= 7
            yield result
        else:
            # Not determined yet: pull in another base-5 digit
            state = 5 * state + pow7 * rand5()
            pow5 *= 5

if __name__ == '__main__':
    r7 = rand7_gen()
    N = 10000
    x = list(next(r7) for i in range(N))
    distr = [x.count(i) for i in range(7)]
    expmean = N / 7.0
    expstddev = math.sqrt(N * (1.0/7.0) * (6.0/7.0))

    print('%d TRIALS' % N)
    print('Expected mean: %.1f' % expmean)
    print('Expected standard deviation: %.1f' % expstddev)
    print('DISTRIBUTION:')
    for i in range(7):
        print('%d: %d   (%+.3f stddevs)' % (i, distr[i], (distr[i] - expmean) / expstddev))
    print('Calls to rand5: %d (average of %f per call to rand7)' % (rand5_calls, float(rand5_calls) / N))

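Note that rand7_gen() returns a generator, since it has internal state involving the conversion of the number to base 7. The test harness calls next(r7) 10000 times to produce 10000 random numbers, and then it measures their distribution. Only integer math is used, so the results are exactly correct.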


