Given a single item, how do I count its occurrences in a list in Python?


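A related but different problem is counting the occurrences of each distinct element in a collection, getting a dictionary or list as a histogram result instead of a single integer. For that problem, see Using a dictionary to count the items in a list.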


Current answer

Why not use Pandas?

import pandas as pd

my_list = ['a', 'b', 'c', 'd', 'a', 'd', 'a']

# converting the list to a Series and counting the values
my_count = pd.Series(my_list).value_counts()
my_count

Output:

a    3
d    2
b    1
c    1
dtype: int64

If you are looking for the count of a particular element, say a, try:

my_count['a']

Output:

3

Other answers

Counting all elements with itertools.groupby()

Another way to get the count of every element in a list is itertools.groupby().

With "duplicate" counts

from itertools import groupby

L = ['a', 'a', 'a', 't', 'q', 'a', 'd', 'a', 'd', 'c']  # Input list

counts = [(i, len(list(c))) for i,c in groupby(L)]      # Create value-count pairs as list of tuples 
print(counts)

Returns

[('a', 3), ('t', 1), ('q', 1), ('a', 1), ('d', 1), ('a', 1), ('d', 1), ('c', 1)]

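Notice how it combined the first three a's into one group, while the other groups of a appear further down the list. This happens because the input list L was not sorted, and groupby() only groups consecutive equal items. This can sometimes be a benefit if the groups should in fact be kept separate.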
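
One case where keeping the runs separate is useful is run-length encoding. The following is only a minimal sketch of that idea; the helper names rle_encode and rle_decode are made up for illustration and do not come from the answer above.

```python
from itertools import groupby


def rle_encode(items):
    # One (value, run_length) pair per run of consecutive equal items
    return [(value, sum(1 for _ in run)) for value, run in groupby(items)]


def rle_decode(pairs):
    # Expand each (value, run_length) pair back into the original run
    return [value for value, n in pairs for _ in range(n)]


L = ['a', 'a', 'a', 't', 'q', 'a', 'd', 'a', 'd', 'c']
encoded = rle_encode(L)
print(encoded)
print(rle_decode(encoded) == L)  # True: the round trip preserves order
```

Sorting the input first would lose the order of the runs, so the list could no longer be rebuilt from the pairs.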

With unique counts

If unique group counts are desired, just sort the input list:

counts = [(i, len(list(c))) for i,c in groupby(sorted(L))]
print(counts)

Returns

[('a', 5), ('c', 1), ('d', 2), ('q', 1), ('t', 1)]

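Note: for unique counts, many of the other answers provide simpler and more readable code than the groupby solution. It is shown here only as a parallel to the duplicate-count example.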
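
For comparison, here is a sketch of the same unique counts using collections.Counter from the standard library, which does not need the input to be sorted:

```python
from collections import Counter

L = ['a', 'a', 'a', 't', 'q', 'a', 'd', 'a', 'd', 'c']

counts = Counter(L)              # maps each distinct element to its count
print(sorted(counts.items()))    # sorted only to match the groupby output order
print(counts['a'])               # count of a single element
print(counts['z'])               # missing elements count as 0, no KeyError
```

The sorted() call is only there to line the result up with the groupby output; Counter itself works on the unsorted list.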

list.count(x) returns the number of times x appears in a list

See: http://docs.python.org/tutorial/datastructures.html#more-on-lists
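
A short sketch of list.count() in use, with a sample list chosen only for illustration:

```python
my_list = ['a', 'b', 'c', 'd', 'a', 'd', 'a']

print(my_list.count('a'))   # 3
print(my_list.count('z'))   # 0 for an item that is not in the list

# count() compares with ==, so equal values of different types match
print([1, 1.0, True, '1'].count(1))   # 3
```

Each call scans the whole list, so counting many different items this way gets slow on long lists.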

Here are three solutions:

The fastest is to use a for loop and store the counts in a dict.

import time
from collections import Counter


def countElement(a):
    g = {}
    for i in a:
        if i in g: 
            g[i] +=1
        else: 
            g[i] =1
    return g


z = [1,1,1,1,2,2,2,2,3,3,4,5,5,234,23,3,12,3,123,12,31,23,13,2,4,23,42,42,34,234,23,42,34,23,423,42,34,23,423,4,234,23,42,34,23,4,23,423,4,23,4]


#Solution 1 - Faster
st = time.monotonic()
for i in range(1000000):
    b = countElement(z)
et = time.monotonic()
print(b)
print('Simple for loop and storing it in dict - Duration: {}'.format(et - st))

#Solution 2 - Fast
st = time.monotonic()
for i in range(1000000):
    a = Counter(z)
et = time.monotonic()
print (a)
print('Using collections.Counter - Duration: {}'.format(et - st))

#Solution 3 - Slow
st = time.monotonic()
for i in range(1000000):
    g = dict([(i, z.count(i)) for i in set(z)])
et = time.monotonic()
print(g)
print('Using list comprehension - Duration: {}'.format(et - st))

Result

#Solution 1 - Faster

{1: 4, 2: 5, 3: 4, 4: 6, 5: 2, 234: 3, 23: 10, 12: 2, 123: 1, 31: 1, 13: 1, 42: 5, 34: 4, 423: 3}
Simple for loop and storing it in dict - Duration: 12.032000000000153

#Solution 2 - Fast

Counter({23: 10, 4: 6, 2: 5, 42: 5, 1: 4, 3: 4, 34: 4, 234: 3, 423: 3, 5: 2, 12: 2, 123: 1, 31: 1, 13: 1})
Using collections.Counter - Duration: 15.889999999999418

#Solution 3 - Slow

{1: 4, 2: 5, 3: 4, 4: 6, 5: 2, 34: 4, 423: 3, 234: 3, 42: 5, 12: 2, 13: 1, 23: 10, 123: 1, 31: 1}
Using list comprehension - Duration: 33.0
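
The timing loops above can also be written with the standard-library timeit module. This is only a sketch: it uses a shorter sample list and far fewer repetitions, so the absolute numbers will not match the ones above, and count_with_dict is an illustrative name.

```python
import timeit
from collections import Counter


def count_with_dict(items):
    # Plain loop that accumulates counts in a dict
    counts = {}
    for item in items:
        counts[item] = counts.get(item, 0) + 1
    return counts


z = [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 4, 5, 5, 234, 23, 3, 12, 3, 123, 12]

candidates = {
    'dict loop': lambda: count_with_dict(z),
    'Counter': lambda: Counter(z),
    'list.count': lambda: {i: z.count(i) for i in set(z)},
}

for name, func in candidates.items():
    # number=10_000 keeps the run short; raise it for steadier timings
    duration = timeit.timeit(func, number=10_000)
    print('{}: {:.4f} s'.format(name, duration))

# All three approaches should agree on the counts themselves
results = [dict(func()) for func in candidates.values()]
print(results[0] == results[1] == results[2])
```

Timings depend on the Python version, the machine, and the size and variety of the data, so the ranking should be checked on your own input.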

I've compared all the suggested solutions (and a few new ones) with perfplot (a small project of mine).

Counting one item

For large enough arrays, it turns out that

numpy.sum(numpy.array(a) == 1)

is slightly faster than the other solutions.

Counting all items

As established before,

numpy.bincount(a)

is what you want.
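
numpy.bincount only accepts non-negative integers and returns an array in which position i holds the number of occurrences of i. The following pure-Python sketch mimics that result for illustration only; bincount_like is a made-up helper, not part of NumPy, and it makes no attempt to match NumPy's speed.

```python
def bincount_like(values):
    # Position i of the result holds how many times i occurs in values
    if not values:
        return []
    if min(values) < 0:
        raise ValueError('only non-negative integers are supported')
    bins = [0] * (max(values) + 1)
    for v in values:
        bins[v] += 1
    return bins


a = [0, 1, 1, 3, 2, 1, 7]
print(bincount_like(a))   # [1, 3, 1, 1, 0, 0, 0, 1]
```

Note that the result has one slot for every integer up to the maximum, so a single large value makes the output long even if the list is short.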


Code to reproduce the plots:

from collections import Counter
from collections import defaultdict
import numpy
import operator
import pandas
import perfplot


def counter(a):
    return Counter(a)


def count(a):
    return dict((i, a.count(i)) for i in set(a))


def bincount(a):
    return numpy.bincount(a)


def pandas_value_counts(a):
    return pandas.Series(a).value_counts()


def occur_dict(a):
    d = {}
    for i in a:
        if i in d:
            d[i] = d[i]+1
        else:
            d[i] = 1
    return d


def count_unsorted_list_items(items):
    counts = defaultdict(int)
    for item in items:
        counts[item] += 1
    return dict(counts)


def operator_countof(a):
    return dict((i, operator.countOf(a, i)) for i in set(a))


perfplot.show(
    setup=lambda n: list(numpy.random.randint(0, 100, n)),
    n_range=[2**k for k in range(20)],
    kernels=[
        counter, count, bincount, pandas_value_counts, occur_dict,
        count_unsorted_list_items, operator_countof
        ],
    equality_check=None,
    logx=True,
    logy=True,
    )


b = perfplot.bench(
    setup=lambda n: list(numpy.random.randint(0, 100, n)),
    n_range=[2 ** k for k in range(20)],
    kernels=[
        counter,
        count,
        bincount,
        pandas_value_counts,
        occur_dict,
        count_unsorted_list_items,
        operator_countof,
    ],
    equality_check=None,
)
b.save("out.png")
b.show()

Although this is a very old question, I couldn't find a one-liner, so I made one.

# original numbers in list
l = [1, 2, 2, 3, 3, 3, 4]

# empty dictionary to hold pair of number and its count
d = {}

# loop through all elements and store count
[ d.update( {i:d.get(i, 0)+1} ) for i in l ]

print(d)
# {1: 1, 2: 2, 3: 3, 4: 1}
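
The list comprehension above is used only for its side effect and builds a throwaway list of None values. Here is a sketch of two alternatives that avoid this, using only the standard library:

```python
from collections import Counter

l = [1, 2, 2, 3, 3, 3, 4]

# Dict comprehension: short, but calls l.count once per distinct value
d1 = {i: l.count(i) for i in set(l)}

# Counter: a single pass over the list
d2 = dict(Counter(l))

print(d1 == d2 == {1: 1, 2: 2, 3: 3, 4: 1})   # True
```

The dict comprehension stays a one-liner; the Counter version is usually the better choice for long lists with many distinct values.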