如何在整数列表中找到重复项并创建重复项的另一个列表?


当前回答

集合。Counter是python 2.7中的新功能:


Python 2.5.4 (r254:67916, May 31 2010, 15:03:39) 
[GCC 4.1.2 20080704 (Red Hat 4.1.2-46)] on linux2
a = [1,2,3,2,1,5,6,5,5,5]
import collections
print [x for x, y in collections.Counter(a).items() if y > 1]
Type "help", "copyright", "credits" or "license" for more information.
  File "", line 1, in 
AttributeError: 'module' object has no attribute 'Counter'
>>> 

在早期版本中,你可以使用传统的字典:

a = [1,2,3,2,1,5,6,5,5,5]
d = {}
for elem in a:
    if elem in d:
        d[elem] += 1
    else:
        d[elem] = 1

print [x for x, y in d.items() if y > 1]

其他回答

我想在列表中找到重复项最有效的方法是:

from collections import Counter

def duplicates(values):
    dups = Counter(values) - Counter(set(values))
    return list(dups.keys())

print(duplicates([1,2,3,6,5,2]))

它对所有元素使用一次Counter,然后对所有唯一元素使用一次Counter。用第二个减去第一个,只剩下重复的部分。

下面是一个快速生成器,它使用dict将每个元素存储为一个带有布尔值的键,用于检查是否已经产生了重复项。

对于所有元素都是可哈希类型的列表:

def gen_dupes(array):
    unique = {}
    for value in array:
        if value in unique and unique[value]:
            unique[value] = False
            yield value
        else:
            unique[value] = True

array = [1, 2, 2, 3, 4, 1, 5, 2, 6, 6]
print(list(gen_dupes(array)))
# => [2, 1, 6]

对于可能包含列表的列表:

def gen_dupes(array):
    unique = {}
    for value in array:
        is_list = False
        if type(value) is list:
            value = tuple(value)
            is_list = True

        if value in unique and unique[value]:
            unique[value] = False
            if is_list:
                value = list(value)

            yield value
        else:
            unique[value] = True

array = [1, 2, 2, [1, 2], 3, 4, [1, 2], 5, 2, 6, 6]
print(list(gen_dupes(array)))
# => [2, [1, 2], 6]

在没有任何python数据结构的帮助下,你可以简单地尝试下面的代码。这将工作于寻找重复的各种输入,如字符串,列表等。

# finding duplicates in unsorted an array 
def duplicates(numbers):
    store=[]
    checked=[]
    for i in range(len(numbers)):
        counter =1 
        for j in range(i+1,len(numbers)):
            if numbers[i] not in checked and numbers[j]==numbers[i] :
                counter +=1 
        if counter > 1 :
            store.append(numbers[i])
            checked.append(numbers[i])
    return store

print(duplicates([1,2,2,3,3,3,4,4,5]))  # output:  [2, 3, 4]
print(duplicates("madam"))              # output:  ['m', 'a']

我必须这样做,因为我挑战自己不使用其他方法:

def dupList(oldlist):
    if type(oldlist)==type((2,2)):
        oldlist=[x for x in oldlist]
    newList=[]
    newList=newList+oldlist
    oldlist=oldlist
    forbidden=[]
    checkPoint=0
    for i in range(len(oldlist)):
        #print 'start i', i
        if i in forbidden:
            continue
        else:
            for j in range(len(oldlist)):
                #print 'start j', j
                if j in forbidden:
                    continue
                else:
                    #print 'after Else'
                    if i!=j: 
                        #print 'i,j', i,j
                        #print oldlist
                        #print newList
                        if oldlist[j]==oldlist[i]:
                            #print 'oldlist[i],oldlist[j]', oldlist[i],oldlist[j]
                            forbidden.append(j)
                            #print 'forbidden', forbidden
                            del newList[j-checkPoint]
                            #print newList
                            checkPoint=checkPoint+1
    return newList

所以你的样本工作如下:

>>>a = [1,2,3,3,3,4,5,6,6,7]
>>>dupList(a)
[1, 2, 3, 4, 5, 6, 7]

我们可以使用itertools。Groupby,以便找到所有有dup的项:

from itertools import groupby

myList  = [2, 4, 6, 8, 4, 6, 12]
# when the list is sorted, groupby groups by consecutive elements which are similar
for x, y in groupby(sorted(myList)):
    #  list(y) returns all the occurences of item x
    if len(list(y)) > 1:
        print x  

输出将是:

4
6