我需要写一个加权版的random。选择(列表中的每个元素有不同的被选择的概率)。这是我想到的:

def weightedChoice(choices):
    """Like random.choice, but each element can have a different chance of
    being selected.

    choices can be any iterable containing iterables with two items each.
    Technically, they can have more than two items, the rest will just be
    ignored.  The first item is the thing being chosen, the second item is
    its weight.  The weights can be any numeric values, what matters is the
    relative differences between them.
    """
    space = {}
    current = 0
    for choice, weight in choices:
        if weight > 0:
            space[current] = choice
            current += weight
    rand = random.uniform(0, current)
    for key in sorted(space.keys() + [current]):
        if rand < key:
            return choice
        choice = space[key]
    return None

这个函数对我来说太复杂了,而且很丑。我希望这里的每个人都能提供一些改进的建议或其他方法。对我来说,效率没有代码的整洁和可读性重要。


当前回答

如果不介意使用numpy,可以使用numpy.random.choice。

例如:

import numpy

items  = [["item1", 0.2], ["item2", 0.3], ["item3", 0.45], ["item4", 0.05]
elems = [i[0] for i in items]
probs = [i[1] for i in items]

trials = 1000
results = [0] * len(items)
for i in range(trials):
    res = numpy.random.choice(items, p=probs)  #This is where the item is selected!
    results[items.index(res)] += 1
results = [r / float(trials) for r in results]
print "item\texpected\tactual"
for i in range(len(probs)):
    print "%s\t%0.4f\t%0.4f" % (items[i], probs[i], results[i])

如果你知道你需要提前做多少选择,你可以不像这样循环:

numpy.random.choice(items, trials, p=probs)

其他回答

加权选择的一个非常基本和简单的方法如下:

np.random.choice(['A', 'B', 'C'], p=[0.3, 0.4, 0.3])

我可能已经来不及提供任何有用的东西了,但这里有一个简单,简短,非常有效的片段:

def choose_index(probabilies):
    cmf = probabilies[0]
    choice = random.random()
    for k in xrange(len(probabilies)):
        if choice <= cmf:
            return k
        else:
            cmf += probabilies[k+1]

不需要排序你的概率或用你的cmf创建一个向量,它一旦找到它的选择就会终止。内存:O(1),时间:O(N),平均运行时间~ N/2。

如果你有权重,只需添加一行:

def choose_index(weights):
    probabilities = weights / sum(weights)
    cmf = probabilies[0]
    choice = random.random()
    for k in xrange(len(probabilies)):
        if choice <= cmf:
            return k
        else:
            cmf += probabilies[k+1]

如果你碰巧有Python 3,并且害怕安装numpy或编写自己的循环,你可以这样做:

import itertools, bisect, random

def weighted_choice(choices):
   weights = list(zip(*choices))[1]
   return choices[bisect.bisect(list(itertools.accumulate(weights)),
                                random.uniform(0, sum(weights)))][0]

因为你可以用一袋管道适配器做任何东西!尽管……我必须承认,尼德的回答虽然稍长一些,但比较容易理解。

def weighted_choice(choices):
   total = sum(w for c, w in choices)
   r = random.uniform(0, total)
   upto = 0
   for c, w in choices:
      if upto + w >= r:
         return c
      upto += w
   assert False, "Shouldn't get here"

通解:

import random
def weighted_choice(choices, weights):
    total = sum(weights)
    treshold = random.uniform(0, total)
    for k, weight in enumerate(weights):
        total -= weight
        if total < treshold:
            return choices[k]