给定一个函数,它产生的是1到5之间的随机整数,写一个函数,它产生的是1到7之间的随机整数。


当前回答

这个答案更像是一个从Rand5函数中获得最大熵的实验。因此,T有点不清楚,几乎可以肯定比其他实现慢得多。

假设0-4为均匀分布,0-6为均匀分布:

public class SevenFromFive
{
  public SevenFromFive()
  {
    // this outputs a uniform ditribution but for some reason including it 
    // screws up the output distribution
    // open question Why?
    this.fifth = new ProbabilityCondensor(5, b => {});
    this.eigth = new ProbabilityCondensor(8, AddEntropy);
  } 

  private static Random r = new Random();
  private static uint Rand5()
  {
    return (uint)r.Next(0,5);
  }

  private class ProbabilityCondensor
  {
    private readonly int samples;
    private int counter;
    private int store;
    private readonly Action<bool> output;

    public ProbabilityCondensor(int chanceOfTrueReciprocal,
      Action<bool> output)
    {
      this.output = output;
      this.samples = chanceOfTrueReciprocal - 1;  
    }

    public void Add(bool bit)
    {
      this.counter++;
      if (bit)
        this.store++;   
      if (counter == samples)
      {
        bool? e;
        if (store == 0)
          e = false;
        else if (store == 1)
          e = true;
        else
          e = null;// discard for now       
        counter = 0;
        store = 0;
        if (e.HasValue)
          output(e.Value);
      }
    }
  }

  ulong buffer = 0;
  const ulong Mask = 7UL;
  int bitsAvail = 0;
  private readonly ProbabilityCondensor fifth;
  private readonly ProbabilityCondensor eigth;

  private void AddEntropy(bool bit)
  {
    buffer <<= 1;
    if (bit)
      buffer |= 1;      
    bitsAvail++;
  }

  private void AddTwoBitsEntropy(uint u)
  {
    buffer <<= 2;
    buffer |= (u & 3UL);    
    bitsAvail += 2;
  }

  public uint Rand7()
  {
    uint selection;   
    do
    {
      while (bitsAvail < 3)
      {
        var x = Rand5();
        if (x < 4)
        {
          // put the two low order bits straight in
          AddTwoBitsEntropy(x);
          fifth.Add(false);
        }
        else
        { 
          fifth.Add(true);
        }
      }
      // read 3 bits
      selection = (uint)((buffer & Mask));
      bitsAvail -= 3;     
      buffer >>= 3;
      if (selection == 7)
        eigth.Add(true);
      else
        eigth.Add(false);
    }
    while (selection == 7);   
    return selection;
  }
}

每次调用Rand5添加到缓冲区的比特数目前是4/5 * 2,所以是1.6。 如果包括1/5的概率值,则增加0.05,因此增加1.65,但请参阅代码中的注释,我不得不禁用它。

调用Rand7消耗的比特数= 3 + 1/8 *(3 + 1/8 *(3 + 1/8 *(… 这是3 + 3/8 + 3/64 + 3/512…大约是3.42

通过从7中提取信息,我每次调用回收1/8*1/7位,大约0.018

这使得每次调用的净消耗为3.4比特,这意味着每一次Rand7调用到Rand5的比率为2.125。最优值应该是2.1。

我可以想象这种方法比这里的许多其他方法都要慢得多,除非调用Rand5的代价非常昂贵(比如调用一些外部熵源)。

其他回答

以下是我的发现:

Random5产生1~5的范围,随机分布 如果我们运行3次并将它们加在一起,我们将得到3~15个随机分布的范围 在3~15范围内执行算术 (3~15) - 1 = (2~14) (2~14)/2 = (1~7)

然后我们得到1~7的范围,这是我们正在寻找的Random7。

通过使用滚动总数,您可以同时

保持平均分配;而且 不需要牺牲随机序列中的任何元素。

这两个问题都是简单的rand(5)+rand(5)…类型的解决方案。下面的Python代码展示了如何实现它(其中大部分是证明发行版)。

import random
x = []
for i in range (0,7):
    x.append (0)
t = 0
tt = 0
for i in range (0,700000):
    ########################################
    #####            qq.py             #####
    r = int (random.random () * 5)
    t = (t + r) % 7
    ########################################
    #####       qq_notsogood.py        #####
    #r = 20
    #while r > 6:
        #r =     int (random.random () * 5)
        #r = r + int (random.random () * 5)
    #t = r
    ########################################
    x[t] = x[t] + 1
    tt = tt + 1
high = x[0]
low = x[0]
for i in range (0,7):
    print "%d: %7d %.5f" % (i, x[i], 100.0 * x[i] / tt)
    if x[i] < low:
        low = x[i]
    if x[i] > high:
        high = x[i]
diff = high - low
print "Variation = %d (%.5f%%)" % (diff, 100.0 * diff / tt)

这个输出显示了结果:

pax$ python qq.py
0:   99908 14.27257
1:  100029 14.28986
2:  100327 14.33243
3:  100395 14.34214
4:   99104 14.15771
5:   99829 14.26129
6:  100408 14.34400
Variation = 1304 (0.18629%)

pax$ python qq.py
0:   99547 14.22100
1:  100229 14.31843
2:  100078 14.29686
3:   99451 14.20729
4:  100284 14.32629
5:  100038 14.29114
6:  100373 14.33900
Variation = 922 (0.13171%)

pax$ python qq.py
0:  100481 14.35443
1:   99188 14.16971
2:  100284 14.32629
3:  100222 14.31743
4:   99960 14.28000
5:   99426 14.20371
6:  100439 14.34843
Variation = 1293 (0.18471%)

一个简单的rand(5)+rand(5),忽略那些返回大于6的情况,其典型变化为18%,是上面所示方法的100倍:

pax$ python qq_notsogood.py
0:   31756 4.53657
1:   63304 9.04343
2:   95507 13.64386
3:  127825 18.26071
4:  158851 22.69300
5:  127567 18.22386
6:   95190 13.59857
Variation = 127095 (18.15643%)

pax$ python qq_notsogood.py
0:   31792 4.54171
1:   63637 9.09100
2:   95641 13.66300
3:  127627 18.23243
4:  158751 22.67871
5:  126782 18.11171
6:   95770 13.68143
Variation = 126959 (18.13700%)

pax$ python qq_notsogood.py
0:   31955 4.56500
1:   63485 9.06929
2:   94849 13.54986
3:  127737 18.24814
4:  159687 22.81243
5:  127391 18.19871
6:   94896 13.55657
Variation = 127732 (18.24743%)

并且,根据Nixuz的建议,我已经清理了脚本,所以您可以提取并使用rand7…材料:

import random

# rand5() returns 0 through 4 inclusive.

def rand5():
    return int (random.random () * 5)

# rand7() generator returns 0 through 6 inclusive (using rand5()).

def rand7():
    rand7ret = 0
    while True:
        rand7ret = (rand7ret + rand5()) % 7
        yield rand7ret

# Number of test runs.

count = 700000

# Work out distribution.

distrib = [0,0,0,0,0,0,0]
rgen =rand7()
for i in range (0,count):
    r = rgen.next()
    distrib[r] = distrib[r] + 1

# Print distributions and calculate variation.

high = distrib[0]
low = distrib[0]
for i in range (0,7):
    print "%d: %7d %.5f" % (i, distrib[i], 100.0 * distrib[i] / count)
    if distrib[i] < low:
        low = distrib[i]
    if distrib[i] > high:
        high = distrib[i]
diff = high - low
print "Variation = %d (%.5f%%)" % (diff, 100.0 * diff / count)

下面是Adam回答的Python实现。

import random

def rand5():
    return random.randint(1, 5)

def rand7():
    while True:
        r = 5 * (rand5() - 1) + rand5()
        #r is now uniformly random between 1 and 25
        if (r <= 21):
            break
    #result is now uniformly random between 1 and 7
    return r % 7 + 1

我喜欢把我正在研究的算法扔进Python,这样我就可以摆弄它们,我想我把它贴在这里,希望它对外面的人有用,而不是花很长时间来拼凑。

为什么不简单一点呢?

int random7() {
  return random5() + (random5() % 3);
}

由于取模,在这个解中得到1和7的几率较低,然而,如果你只是想要一个快速和可读的解,这是一种方法。

extern int r5();

int r7() {
    return ((r5() & 0x01) << 2 ) | ((r5() & 0x01) << 1 ) | (r5() & 0x01);
}