假设我有一个字典列表:
[
{'id': 1, 'name': 'john', 'age': 34},
{'id': 1, 'name': 'john', 'age': 34},
{'id': 2, 'name': 'hanna', 'age': 30},
]
如何获得唯一字典的列表(删除重复项)?
[
{'id': 1, 'name': 'john', 'age': 34},
{'id': 2, 'name': 'hanna', 'age': 30},
]
在python 3.6+(我已经测试过了)中,只需使用:
import json
#Toy example, but will also work for your case
myListOfDicts = [{'a':1,'b':2},{'a':1,'b':2},{'a':1,'b':3}]
#Start by sorting each dictionary by keys
myListOfDictsSorted = [sorted(d.items()) for d in myListOfDicts]
#Using json methods with set() to get unique dict
myListOfUniqueDicts = list(map(json.loads,set(map(json.dumps, myListOfDictsSorted))))
print(myListOfUniqueDicts)
解释:我们正在映射json。转储将字典编码为json对象,这是不可变的。Set可用于生成包含唯一不可变对象的迭代对象。最后,我们使用json.loads转换回字典表示。注意,一开始,必须按键排序才能以唯一的形式排列字典。这对于Python 3.6+是有效的,因为字典在默认情况下是有序的。
在python 3.6+(我已经测试过了)中,只需使用:
import json
#Toy example, but will also work for your case
myListOfDicts = [{'a':1,'b':2},{'a':1,'b':2},{'a':1,'b':3}]
#Start by sorting each dictionary by keys
myListOfDictsSorted = [sorted(d.items()) for d in myListOfDicts]
#Using json methods with set() to get unique dict
myListOfUniqueDicts = list(map(json.loads,set(map(json.dumps, myListOfDictsSorted))))
print(myListOfUniqueDicts)
解释:我们正在映射json。转储将字典编码为json对象,这是不可变的。Set可用于生成包含唯一不可变对象的迭代对象。最后,我们使用json.loads转换回字典表示。注意,一开始,必须按键排序才能以唯一的形式排列字典。这对于Python 3.6+是有效的,因为字典在默认情况下是有序的。
这里有一个相当紧凑的解决方案,尽管我怀疑不是特别有效(委婉地说):
>>> ds = [{'id':1,'name':'john', 'age':34},
... {'id':1,'name':'john', 'age':34},
... {'id':2,'name':'hanna', 'age':30}
... ]
>>> map(dict, set(tuple(sorted(d.items())) for d in ds))
[{'age': 30, 'id': 2, 'name': 'hanna'}, {'age': 34, 'id': 1, 'name': 'john'}]
我总结了我最喜欢的尝试:
https://repl.it/@SmaMa/Python-List-of-unique-dictionaries
# ----------------------------------------------
# Setup
# ----------------------------------------------
myList = [
{"id":"1", "lala": "value_1"},
{"id": "2", "lala": "value_2"},
{"id": "2", "lala": "value_2"},
{"id": "3", "lala": "value_3"}
]
print("myList:", myList)
# -----------------------------------------------
# Option 1 if objects has an unique identifier
# -----------------------------------------------
myUniqueList = list({myObject['id']:myObject for myObject in myList}.values())
print("myUniqueList:", myUniqueList)
# -----------------------------------------------
# Option 2 if uniquely identified by whole object
# -----------------------------------------------
myUniqueSet = [dict(s) for s in set(frozenset(myObject.items()) for myObject in myList)]
print("myUniqueSet:", myUniqueSet)
# -----------------------------------------------
# Option 3 for hashable objects (not dicts)
# -----------------------------------------------
myHashableObjects = list(set(["1", "2", "2", "3"]))
print("myHashAbleList:", myHashableObjects)
扩展John La Rooy (Python -唯一字典列表)的答案,使其更加灵活:
def dedup_dict_list(list_of_dicts: list, columns: list) -> list:
return list({''.join(row[column] for column in columns): row
for row in list_of_dicts}.values())
调用函数:
sorted_list_of_dicts = dedup_dict_list(
unsorted_list_of_dicts, ['id', 'name'])