如何从列表中删除重复项,同时保持顺序?使用集合删除重复项会破坏原始顺序。 是否有内置的或python的习语?


当前回答

消除序列中的重复值,但保留其余项的顺序。使用通用发电机功能。

# for hashable sequence
def remove_duplicates(items):
    seen = set()
    for item in items:
        if item not in seen:
            yield item
            seen.add(item)

a = [1, 5, 2, 1, 9, 1, 5, 10]
list(remove_duplicates(a))
# [1, 5, 2, 9, 10]



# for unhashable sequence
def remove_duplicates(items, key=None):
    seen = set()
    for item in items:
        val = item if key is None else key(item)
        if val not in seen:
            yield item
            seen.add(val)

a = [ {'x': 1, 'y': 2}, {'x': 1, 'y': 3}, {'x': 1, 'y': 2}, {'x': 2, 'y': 4}]
list(remove_duplicates(a, key=lambda d: (d['x'],d['y'])))
# [{'x': 1, 'y': 2}, {'x': 1, 'y': 3}, {'x': 2, 'y': 4}]

其他回答

消除序列中的重复值,但保留其余项的顺序。使用通用发电机功能。

# for hashable sequence
def remove_duplicates(items):
    seen = set()
    for item in items:
        if item not in seen:
            yield item
            seen.add(item)

a = [1, 5, 2, 1, 9, 1, 5, 10]
list(remove_duplicates(a))
# [1, 5, 2, 9, 10]



# for unhashable sequence
def remove_duplicates(items, key=None):
    seen = set()
    for item in items:
        val = item if key is None else key(item)
        if val not in seen:
            yield item
            seen.add(val)

a = [ {'x': 1, 'y': 2}, {'x': 1, 'y': 3}, {'x': 1, 'y': 2}, {'x': 2, 'y': 4}]
list(remove_duplicates(a, key=lambda d: (d['x'],d['y'])))
# [{'x': 1, 'y': 2}, {'x': 1, 'y': 3}, {'x': 2, 'y': 4}]

这里有一个简单的方法:

list1 = ["hello", " ", "w", "o", "r", "l", "d"]
sorted(set(list1 ), key=list1.index)

输出如下:

["hello", " ", "w", "o", "r", "l", "d"]
l = [1,2,2,3,3,...]
n = []
n.extend(ele for ele in l if ele not in set(n))

一个生成器表达式,它使用集合的O(1)查找来确定是否在新列表中包含元素。

在Python 3.7及以上版本中,字典可以保证记住它们的键插入顺序。这个问题的答案概括了当前的状况。

OrderedDict解决方案因此变得过时,没有任何import语句,我们可以简单地发出:

>>> lst = [1, 2, 1, 3, 3, 2, 4]
>>> list(dict.fromkeys(lst))
[1, 2, 3, 4]

不使用导入模块或集的解决方案:

text = "ask not what your country can do for you ask what you can do for your country"
sentence = text.split(" ")
noduplicates = [(sentence[i]) for i in range (0,len(sentence)) if sentence[i] not in sentence[:i]]
print(noduplicates)

给输出:

['ask', 'not', 'what', 'your', 'country', 'can', 'do', 'for', 'you']