我如何检查一个列表是否有任何重复,并返回一个没有重复的新列表?
当前回答
如果你想删除重复项(就地编辑而不是返回新列表)而不使用内置set, dict,请检查此选项。键,唯一,计数器
>>> t = [1, 2, 3, 1, 2, 5, 6, 7, 8]
>>> for i in t:
... if i in t[t.index(i)+1:]:
... t.remove(i)
...
>>> t
[3, 1, 2, 5, 6, 7, 8]
其他回答
到目前为止,我看到的所有保持顺序的方法要么使用朴素比较(时间复杂度最多为O(n^2)),要么使用限制于可哈希输入的重载OrderedDicts/set+list组合。下面是一个与哈希无关的O(nlogn)解决方案:
更新增加了关键参数、文档和Python 3兼容性。
# from functools import reduce <-- add this import on Python 3
def uniq(iterable, key=lambda x: x):
"""
Remove duplicates from an iterable. Preserves order.
:type iterable: Iterable[Ord => A]
:param iterable: an iterable of objects of any orderable type
:type key: Callable[A] -> (Ord => B)
:param key: optional argument; by default an item (A) is discarded
if another item (B), such that A == B, has already been encountered and taken.
If you provide a key, this condition changes to key(A) == key(B); the callable
must return orderable objects.
"""
# Enumerate the list to restore order lately; reduce the sorted list; restore order
def append_unique(acc, item):
return acc if key(acc[-1][1]) == key(item[1]) else acc.append(item) or acc
srt_enum = sorted(enumerate(iterable), key=lambda item: key(item[1]))
return [item[1] for item in sorted(reduce(append_unique, srt_enum, [srt_enum[0]]))]
这只是一个可读的函数,很容易理解,我已经使用了dict数据结构,我已经使用了一些内置函数和更好的复杂度O(n)
def undup(dup_list):
b={}
for i in dup_list:
b.update({i:1})
return b.keys()
a=["a",'b','a']
print undup(a)
免责声明:你可能会得到缩进错误(如果复制和粘贴),使用上述代码与适当的缩进粘贴之前
Python的魔力内置类型
在python中,仅通过python的内置类型就可以很容易地处理这样复杂的情况。
让我告诉你怎么做!
方法一:一般情况
方法(1行代码)删除重复的元素在列表中仍然保持排序顺序
line = [1, 2, 3, 1, 2, 5, 6, 7, 8]
new_line = sorted(set(line), key=line.index) # remove duplicated element
print(new_line)
你会得到结果的
[1, 2, 3, 5, 6, 7, 8]
方法二:特殊情况
TypeError: unhashable type: 'list'
处理不可哈希的特殊情况(3行代码)
line=[['16.4966155686595', '-27.59776154691', '52.3786295521147']
,['16.4966155686595', '-27.59776154691', '52.3786295521147']
,['17.6508629295574', '-27.143305738671', '47.534955022564']
,['17.6508629295574', '-27.143305738671', '47.534955022564']
,['18.8051102904552', '-26.688849930432', '42.6912804930134']
,['18.8051102904552', '-26.688849930432', '42.6912804930134']
,['19.5504702331098', '-26.205884452727', '37.7709192714727']
,['19.5504702331098', '-26.205884452727', '37.7709192714727']
,['20.2929416861422', '-25.722717575124', '32.8500163147157']
,['20.2929416861422', '-25.722717575124', '32.8500163147157']]
tuple_line = [tuple(pt) for pt in line] # convert list of list into list of tuple
tuple_new_line = sorted(set(tuple_line),key=tuple_line.index) # remove duplicated element
new_line = [list(t) for t in tuple_new_line] # convert list of tuple into list of list
print (new_line)
你会得到这样的结果:
[
['16.4966155686595', '-27.59776154691', '52.3786295521147'],
['17.6508629295574', '-27.143305738671', '47.534955022564'],
['18.8051102904552', '-26.688849930432', '42.6912804930134'],
['19.5504702331098', '-26.205884452727', '37.7709192714727'],
['20.2929416861422', '-25.722717575124', '32.8500163147157']
]
因为元组是可哈希的,你可以很容易地在列表和元组之间转换数据
这里有一个例子,返回没有重复的列表,保持顺序。不需要任何外部导入。
def GetListWithoutRepetitions(loInput):
# return list, consisting of elements of list/tuple loInput, without repetitions.
# Example: GetListWithoutRepetitions([None,None,1,1,2,2,3,3,3])
# Returns: [None, 1, 2, 3]
if loInput==[]:
return []
loOutput = []
if loInput[0] is None:
oGroupElement=1
else: # loInput[0]<>None
oGroupElement=None
for oElement in loInput:
if oElement<>oGroupElement:
loOutput.append(oElement)
oGroupElement = oElement
return loOutput
使用set:
a = [0,1,2,3,4,3,3,4]
a = list(set(a))
print a
使用unique:
import numpy as np
a = [0,1,2,3,4,3,3,4]
a = np.unique(a).tolist()
print a