找到Python列表中最常见元素的有效方法是什么?

我的列表项可能不是可哈希的,所以不能使用字典。 同样,在抽取的情况下,应返回索引最低的项。例子:

>>> most_common(['duck', 'duck', 'goose'])
'duck'
>>> most_common(['goose', 'duck', 'duck', 'goose'])
'goose'

当前回答

从这里借鉴,这可以在Python 2.7中使用:

from collections import Counter

def Most_Common(lst):
    data = Counter(lst)
    return data.most_common(1)[0][0]

比Alex的解决方案快4-6倍,比newacct提出的一行程序快50倍。

在CPython 3.6+(任何Python 3.7+)上,上面将选择第一个看到的元素。如果你在旧的Python上运行,为了检索列表中第一个出现的元素,你需要进行两次传递来保持顺序:

# Only needed pre-3.6!
def most_common(lst):
    data = Counter(lst)
    return max(lst, key=data.get)

其他回答

numbers = [1, 3, 7, 4, 3, 0, 3, 6, 3]
max_repeat_num = max(numbers, key=numbers.count)     *# which number most* frequently
max_repeat = numbers.count(max_repeat_num)           *#how many times*
print(f" the number {max_repeat_num} is repeated{max_repeat} times")

我在最近的一个项目中需要这样做。我承认,我无法理解Alex的回答,所以这就是我最后得到的答案。

def mostPopular(l):
    mpEl=None
    mpIndex=0
    mpCount=0
    curEl=None
    curCount=0
    for i, el in sorted(enumerate(l), key=lambda x: (x[1], x[0]), reverse=True):
        curCount=curCount+1 if el==curEl else 1
        curEl=el
        if curCount>mpCount \
        or (curCount==mpCount and i<mpIndex):
            mpEl=curEl
            mpIndex=i
            mpCount=curCount
    return mpEl, mpCount, mpIndex

我根据Alex的解决方案计时,对于短列表,它要快10-15%,但一旦超过100个或更多元素(测试多达20万个),它就会慢20%。

简单的一行解决方案

moc= max([(lst.count(chr),chr) for chr in set(lst)])

它会返回频率最高的元素。

# use Decorate, Sort, Undecorate to solve the problem

def most_common(iterable):
    # Make a list with tuples: (item, index)
    # The index will be used later to break ties for most common item.
    lst = [(x, i) for i, x in enumerate(iterable)]
    lst.sort()

    # lst_final will also be a list of tuples: (count, index, item)
    # Sorting on this list will find us the most common item, and the index
    # will break ties so the one listed first wins.  Count is negative so
    # largest count will have lowest value and sort first.
    lst_final = []

    # Get an iterator for our new list...
    itr = iter(lst)

    # ...and pop the first tuple off.  Setup current state vars for loop.
    count = 1
    tup = next(itr)
    x_cur, i_cur = tup

    # Loop over sorted list of tuples, counting occurrences of item.
    for tup in itr:
        # Same item again?
        if x_cur == tup[0]:
            # Yes, same item; increment count
            count += 1
        else:
            # No, new item, so write previous current item to lst_final...
            t = (-count, i_cur, x_cur)
            lst_final.append(t)
            # ...and reset current state vars for loop.
            x_cur, i_cur = tup
            count = 1

    # Write final item after loop ends
    t = (-count, i_cur, x_cur)
    lst_final.append(t)

    lst_final.sort()
    answer = lst_final[0][2]

    return answer

print most_common(['x', 'e', 'a', 'e', 'a', 'e', 'e']) # prints 'e'
print most_common(['goose', 'duck', 'duck', 'goose']) # prints 'goose'

这是O(n)解。

mydict   = {}
cnt, itm = 0, ''
for item in reversed(lst):
     mydict[item] = mydict.get(item, 0) + 1
     if mydict[item] >= cnt :
         cnt, itm = mydict[item], item

print itm

(reversed用于确保它返回最低的索引项)