我需要合并多个字典,这是我有例如:
dict1 = {1:{"a":{A}}, 2:{"b":{B}}}
dict2 = {2:{"c":{C}}, 3:{"d":{D}}}
A、B、C和D是树的叶子,比如{"info1":"value", "info2":"value2"}
字典的级别(深度)未知,可能是{2:{"c":{"z":{"y":{c}}}}}
在我的例子中,它表示一个目录/文件结构,节点是文档,叶子是文件。
我想将它们合并得到:
dict3 = {1:{"a":{A}}, 2:{"b":{B},"c":{C}}, 3:{"d":{D}}}
我不确定如何用Python轻松做到这一点。
由于dictviews支持集合操作,我能够极大地简化jterrace的答案。
def merge(dict1, dict2):
for k in dict1.keys() - dict2.keys():
yield (k, dict1[k])
for k in dict2.keys() - dict1.keys():
yield (k, dict2[k])
for k in dict1.keys() & dict2.keys():
yield (k, dict(merge(dict1[k], dict2[k])))
任何将dict与非dict(技术上讲,一个带有'keys'方法的对象和一个没有'keys'方法的对象)组合的尝试都会引发AttributeError。这包括对函数的初始调用和递归调用。这正是我想要的,所以我离开了。您可以很容易地捕获递归调用抛出的AttributeErrors,然后生成您想要的任何值。
def m(a,b):
aa = {
k : dict(a.get(k,{}), **v) for k,v in b.items()
}
aap = print(aa)
return aap
d1 = {1:{"a":"A"}, 2:{"b":"B"}}
d2 = {2:{"c":"C"}, 3:{"d":"D"}}
dict1 = {1:{"a":{1}}, 2:{"b":{2}}}
dict2 = {2:{"c":{222}}, 3:{"d":{3}}}
m(d1,d2)
m(dict1,dict2)
"""
Output :
{2: {'b': 'B', 'c': 'C'}, 3: {'d': 'D'}}
{2: {'b': {2}, 'c': {222}}, 3: {'d': {3}}}
"""
这个简单的递归过程将一个字典合并到另一个字典,同时覆盖冲突的键:
#!/usr/bin/env python2.7
def merge_dicts(dict1, dict2):
""" Recursively merges dict2 into dict1 """
if not isinstance(dict1, dict) or not isinstance(dict2, dict):
return dict2
for k in dict2:
if k in dict1:
dict1[k] = merge_dicts(dict1[k], dict2[k])
else:
dict1[k] = dict2[k]
return dict1
print (merge_dicts({1:{"a":"A"}, 2:{"b":"B"}}, {2:{"c":"C"}, 3:{"d":"D"}}))
print (merge_dicts({1:{"a":"A"}, 2:{"b":"B"}}, {1:{"a":"A"}, 2:{"b":"C"}}))
输出:
{1: {'a': 'A'}, 2: {'c': 'C', 'b': 'B'}, 3: {'d': 'D'}}
{1: {'a': 'A'}, 2: {'b': 'C'}}
在不影响输入字典的情况下返回一个合并。
def _merge_dicts(dictA: Dict = {}, dictB: Dict = {}) -> Dict:
# it suffices to pass as an argument a clone of `dictA`
return _merge_dicts_aux(dictA, dictB, copy(dictA))
def _merge_dicts_aux(dictA: Dict = {}, dictB: Dict = {}, result: Dict = {}, path: List[str] = None) -> Dict:
# conflict path, None if none
if path is None:
path = []
for key in dictB:
# if the key doesn't exist in A, add the B element to A
if key not in dictA:
result[key] = dictB[key]
else:
# if the key value is a dict, both in A and in B, merge the dicts
if isinstance(dictA[key], dict) and isinstance(dictB[key], dict):
_merge_dicts_aux(dictA[key], dictB[key], result[key], path + [str(key)])
# if the key value is the same in A and in B, ignore
elif dictA[key] == dictB[key]:
pass
# if the key value differs in A and in B, raise error
else:
err: str = f"Conflict at {'.'.join(path + [str(key)])}"
raise Exception(err)
return result
灵感来自@andrew cooke的解决方案
基于@andrew cooke。这个版本处理字典的嵌套列表,还允许选项更新值
def merge(a, b, path=None, update=True):
"http://stackoverflow.com/questions/7204805/python-dictionaries-of-dictionaries-merge"
"merges b into a"
if path is None: path = []
for key in b:
if key in a:
if isinstance(a[key], dict) and isinstance(b[key], dict):
merge(a[key], b[key], path + [str(key)])
elif a[key] == b[key]:
pass # same leaf value
elif isinstance(a[key], list) and isinstance(b[key], list):
for idx, val in enumerate(b[key]):
a[key][idx] = merge(a[key][idx], b[key][idx], path + [str(key), str(idx)], update=update)
elif update:
a[key] = b[key]
else:
raise Exception('Conflict at %s' % '.'.join(path + [str(key)]))
else:
a[key] = b[key]
return a
我有一个迭代的解决方案-工作得更好的大字典&很多(例如jsons等):
import collections
def merge_dict_with_subdicts(dict1: dict, dict2: dict) -> dict:
"""
similar behaviour to builtin dict.update - but knows how to handle nested dicts
"""
q = collections.deque([(dict1, dict2)])
while len(q) > 0:
d1, d2 = q.pop()
for k, v in d2.items():
if k in d1 and isinstance(d1[k], dict) and isinstance(v, dict):
q.append((d1[k], v))
else:
d1[k] = v
return dict1
注意,这将使用d2中的值来覆盖d1,以防它们都不是字典。(与python的dict.update()相同)
一些测试:
def test_deep_update():
d = dict()
merge_dict_with_subdicts(d, {"a": 4})
assert d == {"a": 4}
new_dict = {
"b": {
"c": {
"d": 6
}
}
}
merge_dict_with_subdicts(d, new_dict)
assert d == {
"a": 4,
"b": {
"c": {
"d": 6
}
}
}
new_dict = {
"a": 3,
"b": {
"f": 7
}
}
merge_dict_with_subdicts(d, new_dict)
assert d == {
"a": 3,
"b": {
"c": {
"d": 6
},
"f": 7
}
}
# test a case where one of the dicts has dict as value and the other has something else
new_dict = {
'a': {
'b': 4
}
}
merge_dict_with_subdicts(d, new_dict)
assert d['a']['b'] == 4
我已经测试了大约1200个字典——这种方法花了0.4秒,而递归的解决方案花了2.5秒。