我需要合并多个字典,这是我有例如:
dict1 = {1:{"a":{A}}, 2:{"b":{B}}}
dict2 = {2:{"c":{C}}, 3:{"d":{D}}}
A、B、C和D是树的叶子,比如{"info1":"value", "info2":"value2"}
字典的级别(深度)未知,可能是{2:{"c":{"z":{"y":{c}}}}}
在我的例子中,它表示一个目录/文件结构,节点是文档,叶子是文件。
我想将它们合并得到:
dict3 = {1:{"a":{A}}, 2:{"b":{B},"c":{C}}, 3:{"d":{D}}}
我不确定如何用Python轻松做到这一点。
class Utils(object):
"""
>>> a = { 'first' : { 'all_rows' : { 'pass' : 'dog', 'number' : '1' } } }
>>> b = { 'first' : { 'all_rows' : { 'fail' : 'cat', 'number' : '5' } } }
>>> Utils.merge_dict(b, a) == { 'first' : { 'all_rows' : { 'pass' : 'dog', 'fail' : 'cat', 'number' : '5' } } }
True
>>> main = {'a': {'b': {'test': 'bug'}, 'c': 'C'}}
>>> suply = {'a': {'b': 2, 'd': 'D', 'c': {'test': 'bug2'}}}
>>> Utils.merge_dict(main, suply) == {'a': {'b': {'test': 'bug'}, 'c': 'C', 'd': 'D'}}
True
"""
@staticmethod
def merge_dict(main, suply):
"""
获取融合的字典,以main为主,suply补充,冲突时以main为准
:return:
"""
for key, value in suply.items():
if key in main:
if isinstance(main[key], dict):
if isinstance(value, dict):
Utils.merge_dict(main[key], value)
else:
pass
else:
pass
else:
main[key] = value
return main
if __name__ == '__main__':
import doctest
doctest.testmod()
我一直在测试你的解决方案,并决定在我的项目中使用这个:
def mergedicts(dict1, dict2, conflict, no_conflict):
for k in set(dict1.keys()).union(dict2.keys()):
if k in dict1 and k in dict2:
yield (k, conflict(dict1[k], dict2[k]))
elif k in dict1:
yield (k, no_conflict(dict1[k]))
else:
yield (k, no_conflict(dict2[k]))
dict1 = {1:{"a":"A"}, 2:{"b":"B"}}
dict2 = {2:{"c":"C"}, 3:{"d":"D"}}
#this helper function allows for recursion and the use of reduce
def f2(x, y):
return dict(mergedicts(x, y, f2, lambda x: x))
print dict(mergedicts(dict1, dict2, f2, lambda x: x))
print dict(reduce(f2, [dict1, dict2]))
将函数作为参数传递是将jterrace解决方案扩展为所有其他递归解决方案的关键。
这里有一个使用生成器的简单方法:
def mergedicts(dict1, dict2):
for k in set(dict1.keys()).union(dict2.keys()):
if k in dict1 and k in dict2:
if isinstance(dict1[k], dict) and isinstance(dict2[k], dict):
yield (k, dict(mergedicts(dict1[k], dict2[k])))
else:
# If one of the values is not a dict, you can't continue merging it.
# Value from second dict overrides one in first and we move on.
yield (k, dict2[k])
# Alternatively, replace this with exception raiser to alert you of value conflicts
elif k in dict1:
yield (k, dict1[k])
else:
yield (k, dict2[k])
dict1 = {1:{"a":"A"},2:{"b":"B"}}
dict2 = {2:{"c":"C"},3:{"d":"D"}}
print dict(mergedicts(dict1,dict2))
这个打印:
{1: {'a': 'A'}, 2: {'c': 'C', 'b': 'B'}, 3: {'d': 'D'}}
这个版本的函数将处理N个字典,并且只处理字典——不能传递不恰当的参数,否则将引发TypeError。合并本身解释了键冲突,它不是覆盖来自合并链下的字典的数据,而是创建一组值并追加到该值;没有数据丢失。
它可能不是页面上最有效的,但它是最彻底的,当你合并2到N字典时,你不会丢失任何信息。
def merge_dicts(*dicts):
if not reduce(lambda x, y: isinstance(y, dict) and x, dicts, True):
raise TypeError, "Object in *dicts not of type dict"
if len(dicts) < 2:
raise ValueError, "Requires 2 or more dict objects"
def merge(a, b):
for d in set(a.keys()).union(b.keys()):
if d in a and d in b:
if type(a[d]) == type(b[d]):
if not isinstance(a[d], dict):
ret = list({a[d], b[d]})
if len(ret) == 1: ret = ret[0]
yield (d, sorted(ret))
else:
yield (d, dict(merge(a[d], b[d])))
else:
raise TypeError, "Conflicting key:value type assignment"
elif d in a:
yield (d, a[d])
elif d in b:
yield (d, b[d])
else:
raise KeyError
return reduce(lambda x, y: dict(merge(x, y)), dicts[1:], dicts[0])
print merge_dicts({1:1,2:{1:2}},{1:2,2:{3:1}},{4:4})
输出:{1:[1,2],2:{1:2,3:1},4:4}
如果有人想要另一种方法来解决这个问题,这是我的解决方案。
优点:简洁、声明性和函数式风格(递归,没有突变)。
潜在缺点:这可能不是你想要的合并。查阅文档字符串以了解语义。
def deep_merge(a, b):
"""
Merge two values, with `b` taking precedence over `a`.
Semantics:
- If either `a` or `b` is not a dictionary, `a` will be returned only if
`b` is `None`. Otherwise `b` will be returned.
- If both values are dictionaries, they are merged as follows:
* Each key that is found only in `a` or only in `b` will be included in
the output collection with its value intact.
* For any key in common between `a` and `b`, the corresponding values
will be merged with the same semantics.
"""
if not isinstance(a, dict) or not isinstance(b, dict):
return a if b is None else b
else:
# If we're here, both a and b must be dictionaries or subtypes thereof.
# Compute set of all keys in both dictionaries.
keys = set(a.keys()) | set(b.keys())
# Build output dictionary, merging recursively values with common keys,
# where `None` is used to mean the absence of a value.
return {
key: deep_merge(a.get(key), b.get(key))
for key in keys
}
如果你有一个未知级别的字典,那么我会建议一个递归函数:
def combineDicts(dictionary1, dictionary2):
output = {}
for item, value in dictionary1.iteritems():
if dictionary2.has_key(item):
if isinstance(dictionary2[item], dict):
output[item] = combineDicts(value, dictionary2.pop(item))
else:
output[item] = value
for item, value in dictionary2.iteritems():
output[item] = value
return output