假设我有一个字典列表:

[
    {'id': 1, 'name': 'john', 'age': 34},
    {'id': 1, 'name': 'john', 'age': 34},
    {'id': 2, 'name': 'hanna', 'age': 30},
]

如何获得唯一字典的列表(删除重复项)?

[
    {'id': 1, 'name': 'john', 'age': 34},
    {'id': 2, 'name': 'hanna', 'age': 30},
]

当前回答

我总结了我最喜欢的尝试:

https://repl.it/@SmaMa/Python-List-of-unique-dictionaries

# ----------------------------------------------
# Setup
# ----------------------------------------------

myList = [
  {"id":"1", "lala": "value_1"},
  {"id": "2", "lala": "value_2"}, 
  {"id": "2", "lala": "value_2"}, 
  {"id": "3", "lala": "value_3"}
]
print("myList:", myList)

# -----------------------------------------------
# Option 1 if objects has an unique identifier
# -----------------------------------------------

myUniqueList = list({myObject['id']:myObject for myObject in myList}.values())
print("myUniqueList:", myUniqueList)

# -----------------------------------------------
# Option 2 if uniquely identified by whole object
# -----------------------------------------------

myUniqueSet = [dict(s) for s in set(frozenset(myObject.items()) for myObject in myList)]
print("myUniqueSet:", myUniqueSet)

# -----------------------------------------------
# Option 3 for hashable objects (not dicts)
# -----------------------------------------------

myHashableObjects = list(set(["1", "2", "2", "3"]))
print("myHashAbleList:", myHashableObjects)

其他回答

在python 3中,简单的技巧,但基于唯一字段(id):

data = [ {'id': 1}, {'id': 1}]

list({ item['id'] : item for item in data}.values())

因此,创建一个临时字典,键为id。这将过滤掉重复的内容。 dict的values()将是列表

在Python2.7

>>> L=[
... {'id':1,'name':'john', 'age':34},
... {'id':1,'name':'john', 'age':34},
... {'id':2,'name':'hanna', 'age':30},
... ]
>>> {v['id']:v for v in L}.values()
[{'age': 34, 'id': 1, 'name': 'john'}, {'age': 30, 'id': 2, 'name': 'hanna'}]

在Python3

>>> L=[
... {'id':1,'name':'john', 'age':34},
... {'id':1,'name':'john', 'age':34},
... {'id':2,'name':'hanna', 'age':30},
... ] 
>>> list({v['id']:v for v in L}.values())
[{'age': 34, 'id': 1, 'name': 'john'}, {'age': 30, 'id': 2, 'name': 'hanna'}]

在Python2.5/2.6

>>> L=[
... {'id':1,'name':'john', 'age':34},
... {'id':1,'name':'john', 'age':34},
... {'id':2,'name':'hanna', 'age':30},
... ] 
>>> dict((v['id'],v) for v in L).values()
[{'age': 34, 'id': 1, 'name': 'john'}, {'age': 30, 'id': 2, 'name': 'hanna'}]

我们可以用熊猫

import pandas as pd
yourdict=pd.DataFrame(L).drop_duplicates().to_dict('r')
Out[293]: [{'age': 34, 'id': 1, 'name': 'john'}, {'age': 30, 'id': 2, 'name': 'hanna'}]

注意与接受答案略有不同。

drop_duplicate将检查pandas中的所有列,如果都相同则删除该行。

例如:

如果我们把第二个字典的名字从约翰改为彼得

L=[
    {'id': 1, 'name': 'john', 'age': 34},
    {'id': 1, 'name': 'peter', 'age': 34},
    {'id': 2, 'name': 'hanna', 'age': 30},
]
pd.DataFrame(L).drop_duplicates().to_dict('r')
Out[295]: 
[{'age': 34, 'id': 1, 'name': 'john'},
 {'age': 34, 'id': 1, 'name': 'peter'},# here will still keeping the dict in the out put 
 {'age': 30, 'id': 2, 'name': 'hanna'}]

扩展John La Rooy (Python -唯一字典列表)的答案,使其更加灵活:

def dedup_dict_list(list_of_dicts: list, columns: list) -> list:
    return list({''.join(row[column] for column in columns): row
                for row in list_of_dicts}.values())

调用函数:

sorted_list_of_dicts = dedup_dict_list(
    unsorted_list_of_dicts, ['id', 'name'])

如果字典仅由所有项唯一标识(ID不可用),则可以使用JSON来使用答案。下面是一个不使用JSON的替代方法,只要所有字典值都是不可变的,它就可以工作

[dict(s) for s in set(frozenset(d.items()) for d in L)]