如何在整数列表中找到重复项并创建重复项的另一个列表?


当前回答

这里有很多答案,但我认为这是一个相对易于阅读和理解的方法:

def get_duplicates(sorted_list):
    duplicates = []
    last = sorted_list[0]
    for x in sorted_list[1:]:
        if x == last:
            duplicates.append(x)
        last = x
    return set(duplicates)

注:

如果您希望保留重复计数,请去掉强制转换 'set'在底部获得完整的列表 如果您更喜欢使用生成器,请将duplicate .append(x)替换为yield x和底部的return语句(您可以稍后强制转换为set)

其他回答

使用sort()函数。重复项可以通过遍历它并检查l1[i] == l1[i+1]来识别。

为了实现这个问题,我们可以使用多种不同的方法来解决它,这两种是常见的解决方案,但在实际场景中实现它们时,我们还必须考虑时间复杂性。

import random
import time

dupl_list = [random.randint(1,1000) for x in range(500)]
print("List with duplicate integers")
print (dupl_list)


#Method 1 
print("******************Method 1 *************")

def Repeat_num(x):
    _size = len(x)
    repeated = []
    for i in range(_size):
        # print(i)
        k = i + 1
        for j in range(k, _size):
            # print(j)
            if x[i] == x[j] and x[i] not in repeated:
                repeated.append(x[i])
    return repeated

start = time.time()
print(Repeat_num(dupl_list))
end = time.time()
print("The time of execution of above program is :",(end-start) * 10**3, "ms")

print("***************Method 2****************")

#method 2 - using count()
def repeast_count(dup_list):
  new = []
  for a in dup_list:
      # print(a)
      # checking the occurrence of elements
      n = dup_list.count(a)
      # if the occurrence is more than
      # one we add it to the output list
      if n > 1:
          if new.count(a) == 0:  # condition to check
              new.append(a)
  return new


start = time.time()
print(repeast_count(dupl_list))
end = time.time()
print("The time of execution of above program is :",(end-start) * 10**3, "ms")

# #输出示例:

List with duplicate integers
[5, 45, 28, 81, 32, 98, 8, 83, 47, 95, 41, 49, 4, 1, 85, 26, 38, 82, 54, 11]
******************Method 1 *************
[]
The time of execution of above program is : 1.1069774627685547 ms
***************Method 2****************
[]
The time of execution of above program is : 0.1881122589111328 ms

对于一般的理解,方法1是好的,但是对于真正的实现,我更喜欢方法2,因为它比方法1花费的时间更少。

在Python中,只需一次迭代就可以找到被愚弄的人,这是一个非常简单快速的方法:

testList = ['red', 'blue', 'red', 'green', 'blue', 'blue']

testListDict = {}

for item in testList:
  try:
    testListDict[item] += 1
  except:
    testListDict[item] = 1

print testListDict

输出内容如下:

>>> print testListDict
{'blue': 3, 'green': 1, 'red': 2}

这和更多在我的博客http://www.howtoprogramwithpython.com

在没有任何python数据结构的帮助下,你可以简单地尝试下面的代码。这将工作于寻找重复的各种输入,如字符串,列表等。

# finding duplicates in unsorted an array 
def duplicates(numbers):
    store=[]
    checked=[]
    for i in range(len(numbers)):
        counter =1 
        for j in range(i+1,len(numbers)):
            if numbers[i] not in checked and numbers[j]==numbers[i] :
                counter +=1 
        if counter > 1 :
            store.append(numbers[i])
            checked.append(numbers[i])
    return store

print(duplicates([1,2,2,3,3,3,4,4,5]))  # output:  [2, 3, 4]
print(duplicates("madam"))              # output:  ['m', 'a']

使用toolz时:

from toolz import frequencies, valfilter

a = [1,2,2,3,4,5,4]
>>> list(valfilter(lambda count: count > 1, frequencies(a)).keys())
[2,4]