How do I check whether a list contains any duplicates and return a new list without them?
Current answer
Simple and easy:
myList = [1, 2, 3, 1, 2, 5, 6, 7, 8]
cleanlist = []
for x in myList:
    if x not in cleanlist:  # keep only the first occurrence of each value
        cleanlist.append(x)
Output:
>>> cleanlist
[1, 2, 3, 5, 6, 7, 8]
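The question also asks how to check whether a list has duplicates at all, which the snippet above does not show. Here is a minimal sketch, assuming every element is hashable; the name has_duplicates is made up for illustration:

```python
def has_duplicates(items):
    """Return True if any element of items occurs more than once."""
    seen = set()
    for item in items:
        if item in seen:
            return True  # stop at the first repeat
        seen.add(item)
    return False


my_list = [1, 2, 3, 1, 2, 5, 6, 7, 8]
print(has_duplicates(my_list))    # True
print(has_duplicates([1, 2, 3]))  # False
```

Returning at the first repeat avoids scanning the rest of the list when a duplicate shows up early.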
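Other answers
Here is an example that returns a list without repetitions, preserving order. It needs no external imports. Note that it only compares each element with the one before it, so it removes repeats only when equal elements sit next to each other.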
def GetListWithoutRepetitions(loInput):
    # Return a list of the elements of list/tuple loInput, with runs of
    # consecutive repetitions collapsed to a single element.
    # Example: GetListWithoutRepetitions([None,None,1,1,2,2,3,3,3])
    # Returns: [None, 1, 2, 3]
    if not loInput:  # also covers an empty tuple
        return []
    loOutput = []
    # Pick a starting sentinel that is guaranteed to differ from the first element.
    if loInput[0] is None:
        oGroupElement = 1
    else:  # loInput[0] is not None
        oGroupElement = None
    for oElement in loInput:
        if oElement != oGroupElement:  # "<>" is Python 2 only; use "!="
            loOutput.append(oElement)
            oGroupElement = oElement
    return loOutput
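The function above compares each element only with its predecessor, so it collapses runs of equal neighbours rather than removing every repeat. The same behaviour can be sketched with itertools.groupby from the standard library; the name collapse_consecutive is made up for illustration:

```python
from itertools import groupby


def collapse_consecutive(items):
    """Collapse each run of equal adjacent elements into one element."""
    # groupby yields one (key, group) pair per run of consecutive equal items
    return [key for key, _group in groupby(items)]


print(collapse_consecutive([None, None, 1, 1, 2, 2, 3, 3, 3]))  # [None, 1, 2, 3]
print(collapse_consecutive([1, 2, 1]))  # [1, 2, 1] because the two 1s are not adjacent
```

If the input is not already grouped, one of the "seen so far" approaches elsewhere in this thread is needed instead.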
>>> t = [1, 2, 3, 1, 2, 5, 6, 7, 8]
>>> t
[1, 2, 3, 1, 2, 5, 6, 7, 8]
>>> s = []
>>> for i in t:
...     if i not in s:
...         s.append(i)
...
>>> s
[1, 2, 3, 5, 6, 7, 8]
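The magic of Python's built-in types
In Python, complicated cases like this are easy to handle using nothing but the built-in types.
Let me show you how!
Method 1: the general case
A one-line way to remove duplicate elements from a list while keeping the original order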
line = [1, 2, 3, 1, 2, 5, 6, 7, 8]
new_line = sorted(set(line), key=line.index) # remove duplicated element
print(new_line)
You will get this result:
[1, 2, 3, 5, 6, 7, 8]
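Method 2: a special case
set() needs hashable elements, so calling it on a list of lists fails with:
TypeError: unhashable type: 'list'
Handling the special case of unhashable elements (3 lines of code)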
line=[['16.4966155686595', '-27.59776154691', '52.3786295521147']
,['16.4966155686595', '-27.59776154691', '52.3786295521147']
,['17.6508629295574', '-27.143305738671', '47.534955022564']
,['17.6508629295574', '-27.143305738671', '47.534955022564']
,['18.8051102904552', '-26.688849930432', '42.6912804930134']
,['18.8051102904552', '-26.688849930432', '42.6912804930134']
,['19.5504702331098', '-26.205884452727', '37.7709192714727']
,['19.5504702331098', '-26.205884452727', '37.7709192714727']
,['20.2929416861422', '-25.722717575124', '32.8500163147157']
,['20.2929416861422', '-25.722717575124', '32.8500163147157']]
tuple_line = [tuple(pt) for pt in line] # convert list of list into list of tuple
tuple_new_line = sorted(set(tuple_line),key=tuple_line.index) # remove duplicated element
new_line = [list(t) for t in tuple_new_line] # convert list of tuple into list of list
print(new_line)
You will get a result like this:
[
['16.4966155686595', '-27.59776154691', '52.3786295521147'],
['17.6508629295574', '-27.143305738671', '47.534955022564'],
['18.8051102904552', '-26.688849930432', '42.6912804930134'],
['19.5504702331098', '-26.205884452727', '37.7709192714727'],
['20.2929416861422', '-25.722717575124', '32.8500163147157']
]
This works because tuples are hashable, and you can easily convert data back and forth between lists and tuples.
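One caveat with sorted(set(...), key=....index): list.index does a linear scan for every element, so the cost grows quadratically with the length of the list. Here is a hedged alternative sketch that keeps the same tuple conversion but relies on dict keys keeping insertion order (guaranteed from Python 3.7). The name unique_rows and the sample data are made up for illustration:

```python
def unique_rows(rows):
    """Return the rows without duplicates, in first-seen order."""
    # Tuples are hashable, so they can serve as dict keys;
    # dict.fromkeys keeps only the first occurrence of each key.
    unique_tuples = dict.fromkeys(tuple(row) for row in rows)
    return [list(t) for t in unique_tuples]


points = [['1.0', '2.0'], ['1.0', '2.0'], ['3.0', '4.0'], ['1.0', '2.0']]
print(unique_rows(points))  # [['1.0', '2.0'], ['3.0', '4.0']]
```

This assumes the inner lists contain only hashable items; deeper nesting would need a recursive conversion.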
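You can remove duplicates with a Python set or with the dict.fromkeys() method. dict.fromkeys() turns a list into a dictionary whose keys are the list's elements. A dictionary cannot contain duplicate keys, so dict.fromkeys() returns a dictionary with only the unique values as keys. A set, like a dictionary's keys, cannot contain duplicate elements, so converting a list to a set removes all duplicates, although the original order is not guaranteed to be preserved.
Method 1: the naive approach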
mylist = [5, 10, 15, 20, 3, 15, 25, 20, 30, 10, 100]
uniques = []
for i in mylist:
    if i not in uniques:
        uniques.append(i)
print(uniques)
Method 2: using set()
mylist = [5, 10, 15, 20, 3, 15, 25, 20, 30, 10, 100]
myset = set(mylist)
print(list(myset))
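This answer mentions dict.fromkeys() but never shows it in code. Here is a minimal sketch, assuming Python 3.7 or later (where dictionaries keep insertion order) and hashable elements:

```python
mylist = [5, 10, 15, 20, 3, 15, 25, 20, 30, 10, 100]

# Each element becomes a dict key; later repeats of a key are ignored,
# and the keys stay in the order in which they were first seen.
uniques = list(dict.fromkeys(mylist))
print(uniques)  # [5, 10, 15, 20, 3, 25, 30, 100]
```

Unlike the set() version, this keeps the first-seen order of the elements.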
It requires installing a third-party module, but the package iteration_utilities contains a unique_everseen [1] function that removes all duplicates while preserving order:
>>> from iteration_utilities import unique_everseen
>>> list(unique_everseen(['a', 'b', 'c', 'd'] + ['a', 'c', 'd']))
['a', 'b', 'c', 'd']
If you want to avoid the overhead of the list addition, you can use itertools.chain instead:
>>> from itertools import chain
>>> list(unique_everseen(chain(['a', 'b', 'c', 'd'], ['a', 'c', 'd'])))
['a', 'b', 'c', 'd']
unique_everseen also works when the list contains unhashable items (for example, lists):
>>> from iteration_utilities import unique_everseen
>>> list(unique_everseen([['a'], ['b'], 'c', 'd'] + ['a', 'c', 'd']))
[['a'], ['b'], 'c', 'd', 'a']
However, this will be (much) slower than when the items are hashable.
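To see why unhashable items cost more, here is a pure-Python sketch of one way such a function could work. This is an assumption for illustration, not the actual implementation in iteration_utilities, and the name unique_keep_order is made up: hashable items go into a set with fast lookups, while unhashable items fall back to a list that has to be scanned linearly.

```python
def unique_keep_order(iterable):
    """Yield each distinct item once, in first-seen order."""
    seen_hashable = set()   # fast membership test for hashable items
    seen_unhashable = []    # linear-scan fallback for unhashable items
    for item in iterable:
        try:
            if item in seen_hashable:
                continue
            seen_hashable.add(item)
        except TypeError:  # raised when the item is unhashable, e.g. a list
            if item in seen_unhashable:
                continue
            seen_unhashable.append(item)
        yield item


print(list(unique_keep_order([['a'], ['b'], 'c', 'd'] + ['a', 'c', 'd'])))
# [['a'], ['b'], 'c', 'd', 'a']
```

Every unhashable item is compared against all unhashable items seen before it, which is where the extra time goes in this sketch.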
[1] Disclosure: I am the author of the iteration_utilities library.