找到Python列表中最常见元素的有效方法是什么?
我的列表项可能不是可哈希的,所以不能使用字典。 同样,在抽取的情况下,应返回索引最低的项。例子:
>>> most_common(['duck', 'duck', 'goose'])
'duck'
>>> most_common(['goose', 'duck', 'duck', 'goose'])
'goose'
找到Python列表中最常见元素的有效方法是什么?
我的列表项可能不是可哈希的,所以不能使用字典。 同样,在抽取的情况下,应返回索引最低的项。例子:
>>> most_common(['duck', 'duck', 'goose'])
'duck'
>>> most_common(['goose', 'duck', 'duck', 'goose'])
'goose'
当前回答
你可能不再需要这个了,但这是我对一个类似问题所做的。(因为评论,它看起来比实际要长。)
itemList = ['hi', 'hi', 'hello', 'bye']
counter = {}
maxItemCount = 0
for item in itemList:
try:
# Referencing this will cause a KeyError exception
# if it doesn't already exist
counter[item]
# ... meaning if we get this far it didn't happen so
# we'll increment
counter[item] += 1
except KeyError:
# If we got a KeyError we need to create the
# dictionary key
counter[item] = 1
# Keep overwriting maxItemCount with the latest number,
# if it's higher than the existing itemCount
if counter[item] > maxItemCount:
maxItemCount = counter[item]
mostPopularItem = item
print mostPopularItem
其他回答
我在最近的一个项目中需要这样做。我承认,我无法理解Alex的回答,所以这就是我最后得到的答案。
def mostPopular(l):
mpEl=None
mpIndex=0
mpCount=0
curEl=None
curCount=0
for i, el in sorted(enumerate(l), key=lambda x: (x[1], x[0]), reverse=True):
curCount=curCount+1 if el==curEl else 1
curEl=el
if curCount>mpCount \
or (curCount==mpCount and i<mpIndex):
mpEl=curEl
mpIndex=i
mpCount=curCount
return mpEl, mpCount, mpIndex
我根据Alex的解决方案计时,对于短列表,它要快10-15%,但一旦超过100个或更多元素(测试多达20万个),它就会慢20%。
numbers = [1, 3, 7, 4, 3, 0, 3, 6, 3]
max_repeat_num = max(numbers, key=numbers.count) *# which number most* frequently
max_repeat = numbers.count(max_repeat_num) *#how many times*
print(f" the number {max_repeat_num} is repeated{max_repeat} times")
ans = [1, 1, 0, 0, 1, 1]
all_ans = {ans.count(ans[i]): ans[i] for i in range(len(ans))}
print(all_ans)
all_ans={4: 1, 2: 0}
max_key = max(all_ans.keys())
4
print(all_ans[max_key])
1
这是一个很简单的解,时间复杂度是线性的
L =['鹅','鸭','鸭']
def most_common (L):
current_winner = 0
max_repeated = None
for i in L:
amount_times = L.count(i)
if amount_times > current_winner:
current_winner = amount_times
max_repeated = i
return max_repeated
打印(most_common (L)
“鸭子”
number是列表中重复次数最多的元素吗
如果排序和哈希都不可行,这是一个明显的缓慢的解决方案(O(n²)),但相等比较(==)可用:
def most_common(items):
if not items:
raise ValueError
fitems = []
best_idx = 0
for item in items:
item_missing = True
i = 0
for fitem in fitems:
if fitem[0] == item:
fitem[1] += 1
d = fitem[1] - fitems[best_idx][1]
if d > 0 or (d == 0 and fitems[best_idx][2] > fitem[2]):
best_idx = i
item_missing = False
break
i += 1
if item_missing:
fitems.append([item, 1, i])
return items[best_idx]
但是,如果你的列表(n)的长度很大,那么让你的项目可哈希或可排序(正如其他答案所建议的那样)几乎总是能更快地找到最常见的元素。哈希时平均为O(n),排序时最差为O(n*log(n))。