我需要合并多个字典,这是我有例如:
dict1 = {1:{"a":{A}}, 2:{"b":{B}}}
dict2 = {2:{"c":{C}}, 3:{"d":{D}}}
A、B、C和D是树的叶子,比如{"info1":"value", "info2":"value2"}
字典的级别(深度)未知,可能是{2:{"c":{"z":{"y":{c}}}}}
在我的例子中,它表示一个目录/文件结构,节点是文档,叶子是文件。
我想将它们合并得到:
dict3 = {1:{"a":{A}}, 2:{"b":{B},"c":{C}}, 3:{"d":{D}}}
我不确定如何用Python轻松做到这一点。
这实际上是相当棘手的-特别是如果你想要一个有用的错误消息时,事情是不一致的,同时正确地接受重复但一致的条目(这是这里没有其他答案做的..)。
假设你没有大量的条目,递归函数是最简单的:
from functools import reduce
def merge(a, b, path=None):
"merges b into a"
if path is None: path = []
for key in b:
if key in a:
if isinstance(a[key], dict) and isinstance(b[key], dict):
merge(a[key], b[key], path + [str(key)])
elif a[key] == b[key]:
pass # same leaf value
else:
raise Exception('Conflict at %s' % '.'.join(path + [str(key)]))
else:
a[key] = b[key]
return a
# works
print(merge({1:{"a":"A"},2:{"b":"B"}}, {2:{"c":"C"},3:{"d":"D"}}))
# has conflict
merge({1:{"a":"A"},2:{"b":"B"}}, {1:{"a":"A"},2:{"b":"C"}})
注意,这会使a发生变化——b的内容被添加到a(也会返回a)。如果你想保留a,你可以叫它merge(dict(a) b)
Agf指出(下面),你可能有两个以上的字典,在这种情况下,你可以使用:
reduce(merge, [dict1, dict2, dict3...])
所有内容都将被添加到dict1中。
注意:我编辑了我的初始答案以改变第一个参数;这使得“reduce”更容易解释
由于dictviews支持集合操作,我能够极大地简化jterrace的答案。
def merge(dict1, dict2):
for k in dict1.keys() - dict2.keys():
yield (k, dict1[k])
for k in dict2.keys() - dict1.keys():
yield (k, dict2[k])
for k in dict1.keys() & dict2.keys():
yield (k, dict(merge(dict1[k], dict2[k])))
任何将dict与非dict(技术上讲,一个带有'keys'方法的对象和一个没有'keys'方法的对象)组合的尝试都会引发AttributeError。这包括对函数的初始调用和递归调用。这正是我想要的,所以我离开了。您可以很容易地捕获递归调用抛出的AttributeErrors,然后生成您想要的任何值。
在不影响输入字典的情况下返回一个合并。
def _merge_dicts(dictA: Dict = {}, dictB: Dict = {}) -> Dict:
# it suffices to pass as an argument a clone of `dictA`
return _merge_dicts_aux(dictA, dictB, copy(dictA))
def _merge_dicts_aux(dictA: Dict = {}, dictB: Dict = {}, result: Dict = {}, path: List[str] = None) -> Dict:
# conflict path, None if none
if path is None:
path = []
for key in dictB:
# if the key doesn't exist in A, add the B element to A
if key not in dictA:
result[key] = dictB[key]
else:
# if the key value is a dict, both in A and in B, merge the dicts
if isinstance(dictA[key], dict) and isinstance(dictB[key], dict):
_merge_dicts_aux(dictA[key], dictB[key], result[key], path + [str(key)])
# if the key value is the same in A and in B, ignore
elif dictA[key] == dictB[key]:
pass
# if the key value differs in A and in B, raise error
else:
err: str = f"Conflict at {'.'.join(path + [str(key)])}"
raise Exception(err)
return result
灵感来自@andrew cooke的解决方案
这里有一个使用生成器的简单方法:
def mergedicts(dict1, dict2):
for k in set(dict1.keys()).union(dict2.keys()):
if k in dict1 and k in dict2:
if isinstance(dict1[k], dict) and isinstance(dict2[k], dict):
yield (k, dict(mergedicts(dict1[k], dict2[k])))
else:
# If one of the values is not a dict, you can't continue merging it.
# Value from second dict overrides one in first and we move on.
yield (k, dict2[k])
# Alternatively, replace this with exception raiser to alert you of value conflicts
elif k in dict1:
yield (k, dict1[k])
else:
yield (k, dict2[k])
dict1 = {1:{"a":"A"},2:{"b":"B"}}
dict2 = {2:{"c":"C"},3:{"d":"D"}}
print dict(mergedicts(dict1,dict2))
这个打印:
{1: {'a': 'A'}, 2: {'c': 'C', 'b': 'B'}, 3: {'d': 'D'}}
换个答案怎么样?!?这也避免了突变/副作用:
def merge(dict1, dict2):
output = {}
# adds keys from `dict1` if they do not exist in `dict2` and vice-versa
intersection = {**dict2, **dict1}
for k_intersect, v_intersect in intersection.items():
if k_intersect not in dict1:
v_dict2 = dict2[k_intersect]
output[k_intersect] = v_dict2
elif k_intersect not in dict2:
output[k_intersect] = v_intersect
elif isinstance(v_intersect, dict):
v_dict2 = dict2[k_intersect]
output[k_intersect] = merge(v_intersect, v_dict2)
else:
output[k_intersect] = v_intersect
return output
dict1 = {1:{"a":{"A"}}, 2:{"b":{"B"}}}
dict2 = {2:{"c":{"C"}}, 3:{"d":{"D"}}}
dict3 = {1:{"a":{"A"}}, 2:{"b":{"B"},"c":{"C"}}, 3:{"d":{"D"}}}
assert dict3 == merge(dict1, dict2)