假设我有一个字典列表:
[
{'id': 1, 'name': 'john', 'age': 34},
{'id': 1, 'name': 'john', 'age': 34},
{'id': 2, 'name': 'hanna', 'age': 30},
]
如何获得唯一字典的列表(删除重复项)?
[
{'id': 1, 'name': 'john', 'age': 34},
{'id': 2, 'name': 'hanna', 'age': 30},
]
扩展John La Rooy (Python -唯一字典列表)的答案,使其更加灵活:
def dedup_dict_list(list_of_dicts: list, columns: list) -> list:
return list({''.join(row[column] for column in columns): row
for row in list_of_dicts}.values())
调用函数:
sorted_list_of_dicts = dedup_dict_list(
unsorted_list_of_dicts, ['id', 'name'])
这里有一个相当紧凑的解决方案,尽管我怀疑不是特别有效(委婉地说):
>>> ds = [{'id':1,'name':'john', 'age':34},
... {'id':1,'name':'john', 'age':34},
... {'id':2,'name':'hanna', 'age':30}
... ]
>>> map(dict, set(tuple(sorted(d.items())) for d in ds))
[{'age': 30, 'id': 2, 'name': 'hanna'}, {'age': 34, 'id': 1, 'name': 'john'}]
a = [
{'id':1,'name':'john', 'age':34},
{'id':1,'name':'john', 'age':34},
{'id':2,'name':'hanna', 'age':30},
]
b = {x['id']:x for x in a}.values()
print(b)
输出:
[{“年龄”:34岁“id”:1、“名称”:“约翰”},{“id”:“年龄”:30日2时,“名字”:“汉娜”}]
对象可以放入集合中。您可以使用对象而不是字典,如果需要,在所有set插入后转换回字典列表。例子
class Person:
def __init__(self, id, age, name):
self.id = id
self.age = age
self.name = name
my_set = {Person(id=2, age=3, name='Jhon')}
my_set.add(Person(id=3, age=34, name='Guy'))
my_set.add({Person(id=2, age=3, name='Jhon')})
# if needed convert to list of dicts
list_of_dict = [{'id': obj.id,
'name': obj.name,
'age': obj.age} for obj in my_set]
可能有更优雅的解决方案,但我认为最好添加一个更详细的解决方案,使其更容易遵循。这里假设没有唯一键,你有一个简单的k,v结构,并且你使用的python版本保证了列表顺序。这适用于原来的职位。
data_set = [
{'id': 1, 'name': 'john', 'age': 34},
{'id': 1, 'name': 'john', 'age': 34},
{'id': 2, 'name': 'hanna', 'age': 30},
]
# list of keys
keys = [k for k in data_set[0]]
# Create a List of Lists of the values from the data Set
data_set_list = [[v for v in v.values()] for v in data_set]
# Dedupe
new_data_set = []
for lst in data_set_list:
# Check if list exists in new data set
if lst in new_data_set:
print(lst)
continue
# Add list to new data set
new_data_set.append(lst)
# Create dicts
new_data_set = [dict(zip(keys,lst)) for lst in new_data_set]
print(new_data_set)