我需要写一个加权版的random。选择(列表中的每个元素有不同的被选择的概率)。这是我想到的:
def weightedChoice(choices):
"""Like random.choice, but each element can have a different chance of
being selected.
choices can be any iterable containing iterables with two items each.
Technically, they can have more than two items, the rest will just be
ignored. The first item is the thing being chosen, the second item is
its weight. The weights can be any numeric values, what matters is the
relative differences between them.
"""
space = {}
current = 0
for choice, weight in choices:
if weight > 0:
space[current] = choice
current += weight
rand = random.uniform(0, current)
for key in sorted(space.keys() + [current]):
if rand < key:
return choice
choice = space[key]
return None
这个函数对我来说太复杂了,而且很丑。我希望这里的每个人都能提供一些改进的建议或其他方法。对我来说,效率没有代码的整洁和可读性重要。
如果你没有提前定义你想要选择多少项(所以,你没有做k=10这样的事情),你只有概率,你可以做下面的事情。注意,你的概率加起来不需要等于1,它们可以相互独立:
soup_items = ['pepper', 'onion', 'tomato', 'celery']
items_probability = [0.2, 0.3, 0.9, 0.1]
selected_items = [item for item,p in zip(soup_items,items_probability) if random.random()<p]
print(selected_items)
>>>['pepper','tomato']
下面是Python 3.6标准库中包含的版本:
import itertools as _itertools
import bisect as _bisect
class Random36(random.Random):
"Show the code included in the Python 3.6 version of the Random class"
def choices(self, population, weights=None, *, cum_weights=None, k=1):
"""Return a k sized list of population elements chosen with replacement.
If the relative weights or cumulative weights are not specified,
the selections are made with equal probability.
"""
random = self.random
if cum_weights is None:
if weights is None:
_int = int
total = len(population)
return [population[_int(random() * total)] for i in range(k)]
cum_weights = list(_itertools.accumulate(weights))
elif weights is not None:
raise TypeError('Cannot specify both weights and cumulative weights')
if len(cum_weights) != len(population):
raise ValueError('The number of weights does not match the population')
bisect = _bisect.bisect
total = cum_weights[-1]
return [population[bisect(cum_weights, random() * total)] for i in range(k)]
来源:https://hg.python.org/cpython/file/tip/Lib/random.py l340
如果你碰巧有Python 3,并且害怕安装numpy或编写自己的循环,你可以这样做:
import itertools, bisect, random
def weighted_choice(choices):
weights = list(zip(*choices))[1]
return choices[bisect.bisect(list(itertools.accumulate(weights)),
random.uniform(0, sum(weights)))][0]
因为你可以用一袋管道适配器做任何东西!尽管……我必须承认,尼德的回答虽然稍长一些,但比较容易理解。
我不喜欢它们的语法。我只想具体说明这些项目是什么以及每项的权重是多少。我意识到我可以用随机。选项,但我很快就写了下面的类。
import random, string
from numpy import cumsum
class randomChoiceWithProportions:
'''
Accepts a dictionary of choices as keys and weights as values. Example if you want a unfair dice:
choiceWeightDic = {"1":0.16666666666666666, "2": 0.16666666666666666, "3": 0.16666666666666666
, "4": 0.16666666666666666, "5": .06666666666666666, "6": 0.26666666666666666}
dice = randomChoiceWithProportions(choiceWeightDic)
samples = []
for i in range(100000):
samples.append(dice.sample())
# Should be close to .26666
samples.count("6")/len(samples)
# Should be close to .16666
samples.count("1")/len(samples)
'''
def __init__(self, choiceWeightDic):
self.choiceWeightDic = choiceWeightDic
weightSum = sum(self.choiceWeightDic.values())
assert weightSum == 1, 'Weights sum to ' + str(weightSum) + ', not 1.'
self.valWeightDict = self._compute_valWeights()
def _compute_valWeights(self):
valWeights = list(cumsum(list(self.choiceWeightDic.values())))
valWeightDict = dict(zip(list(self.choiceWeightDic.keys()), valWeights))
return valWeightDict
def sample(self):
num = random.uniform(0,1)
for key, val in self.valWeightDict.items():
if val >= num:
return key