我需要合并多个字典,这是我有例如:

dict1 = {1:{"a":{A}}, 2:{"b":{B}}}

dict2 = {2:{"c":{C}}, 3:{"d":{D}}}

A、B、C和D是树的叶子,比如{"info1":"value", "info2":"value2"}

字典的级别(深度)未知,可能是{2:{"c":{"z":{"y":{c}}}}}

在我的例子中,它表示一个目录/文件结构,节点是文档,叶子是文件。

我想将它们合并得到:

 dict3 = {1:{"a":{A}}, 2:{"b":{B},"c":{C}}, 3:{"d":{D}}}

我不确定如何用Python轻松做到这一点。


当前回答

在不影响输入字典的情况下返回一个合并。

def _merge_dicts(dictA: Dict = {}, dictB: Dict = {}) -> Dict:
    # it suffices to pass as an argument a clone of `dictA`
    return _merge_dicts_aux(dictA, dictB, copy(dictA))


def _merge_dicts_aux(dictA: Dict = {}, dictB: Dict = {}, result: Dict = {}, path: List[str] = None) -> Dict:

    # conflict path, None if none
    if path is None:
        path = []

    for key in dictB:

        # if the key doesn't exist in A, add the B element to A
        if key not in dictA:
            result[key] = dictB[key]

        else:
            # if the key value is a dict, both in A and in B, merge the dicts
            if isinstance(dictA[key], dict) and isinstance(dictB[key], dict):
                _merge_dicts_aux(dictA[key], dictB[key], result[key], path + [str(key)])

            # if the key value is the same in A and in B, ignore
            elif dictA[key] == dictB[key]:
                pass

            # if the key value differs in A and in B, raise error
            else:
                err: str = f"Conflict at {'.'.join(path + [str(key)])}"
                raise Exception(err)

    return result

灵感来自@andrew cooke的解决方案

其他回答

字典的字典合并

由于这是一个规范的问题(尽管存在某些非泛化性),所以我提供了规范的python方法来解决这个问题。

最简单的情况:“叶子是嵌套的字典,以空字典结尾”:

d1 = {'a': {1: {'foo': {}}, 2: {}}}
d2 = {'a': {1: {}, 2: {'bar': {}}}}
d3 = {'b': {3: {'baz': {}}}}
d4 = {'a': {1: {'quux': {}}}}

这是递归最简单的情况,我推荐两种简单的方法:

def rec_merge1(d1, d2):
    '''return new merged dict of dicts'''
    for k, v in d1.items(): # in Python 2, use .iteritems()!
        if k in d2:
            d2[k] = rec_merge1(v, d2[k])
    d3 = d1.copy()
    d3.update(d2)
    return d3

def rec_merge2(d1, d2):
    '''update first dict with second recursively'''
    for k, v in d1.items(): # in Python 2, use .iteritems()!
        if k in d2:
            d2[k] = rec_merge2(v, d2[k])
    d1.update(d2)
    return d1

我相信我更喜欢第二个,而不是第一个,但请记住,第一个的原始状态必须从它的起源重建。用法如下:

>>> from functools import reduce # only required for Python 3.
>>> reduce(rec_merge1, (d1, d2, d3, d4))
{'a': {1: {'quux': {}, 'foo': {}}, 2: {'bar': {}}}, 'b': {3: {'baz': {}}}}
>>> reduce(rec_merge2, (d1, d2, d3, d4))
{'a': {1: {'quux': {}, 'foo': {}}, 2: {'bar': {}}}, 'b': {3: {'baz': {}}}}

复杂情况:“叶子是任何其他类型的:”

所以如果它们以字典结尾,这是一个简单的合并结尾空字典的例子。如果不是,也不是那么微不足道。如果是字符串,怎么合并?集合也可以类似地更新,所以我们可以这样处理,但我们失去了它们合并的顺序。那么顺序重要吗?

因此,代替更多信息,最简单的方法是给它们一个标准的更新处理,如果两个值都不是dict:即第二个dict的值将覆盖第一个dict,即使第二个dict的值是None,而第一个dict的值是一个包含大量信息的dict。

d1 = {'a': {1: 'foo', 2: None}}
d2 = {'a': {1: None, 2: 'bar'}}
d3 = {'b': {3: 'baz'}}
d4 = {'a': {1: 'quux'}}

from collections.abc import MutableMapping

def rec_merge(d1, d2):
    '''
    Update two dicts of dicts recursively, 
    if either mapping has leaves that are non-dicts, 
    the second's leaf overwrites the first's.
    '''
    for k, v in d1.items():
        if k in d2:
            # this next check is the only difference!
            if all(isinstance(e, MutableMapping) for e in (v, d2[k])):
                d2[k] = rec_merge(v, d2[k])
            # we could further check types and merge as appropriate here.
    d3 = d1.copy()
    d3.update(d2)
    return d3

现在

from functools import reduce
reduce(rec_merge, (d1, d2, d3, d4))

返回

{'a': {1: 'quux', 2: 'bar'}, 'b': {3: 'baz'}}

适用于原问题:

我不得不删除字母周围的花括号,并将它们放在单引号中,以使其成为合法的Python(否则它们将在Python 2.7+中设置字面量),并附加一个缺少的大括号:

dict1 = {1:{"a":'A'}, 2:{"b":'B'}}
dict2 = {2:{"c":'C'}, 3:{"d":'D'}}

rec_merge(dict1, dict2)现在返回:

{1: {'a': 'A'}, 2: {'c': 'C', 'b': 'B'}, 3: {'d': 'D'}}

匹配原始问题的期望结果(在改变后,例如{A}为'A')。

我能想到的最简单的方法是:

#!/usr/bin/python

from copy import deepcopy
def dict_merge(a, b):
    if not isinstance(b, dict):
        return b
    result = deepcopy(a)
    for k, v in b.iteritems():
        if k in result and isinstance(result[k], dict):
                result[k] = dict_merge(result[k], v)
        else:
            result[k] = deepcopy(v)
    return result

a = {1:{"a":'A'}, 2:{"b":'B'}}
b = {2:{"c":'C'}, 3:{"d":'D'}}

print dict_merge(a,b)

输出:

{1: {'a': 'A'}, 2: {'c': 'C', 'b': 'B'}, 3: {'d': 'D'}}

如果有人想要另一种方法来解决这个问题,这是我的解决方案。

优点:简洁、声明性和函数式风格(递归,没有突变)。

潜在缺点:这可能不是你想要的合并。查阅文档字符串以了解语义。

def deep_merge(a, b):
    """
    Merge two values, with `b` taking precedence over `a`.

    Semantics:
    - If either `a` or `b` is not a dictionary, `a` will be returned only if
      `b` is `None`. Otherwise `b` will be returned.
    - If both values are dictionaries, they are merged as follows:
        * Each key that is found only in `a` or only in `b` will be included in
          the output collection with its value intact.
        * For any key in common between `a` and `b`, the corresponding values
          will be merged with the same semantics.
    """
    if not isinstance(a, dict) or not isinstance(b, dict):
        return a if b is None else b
    else:
        # If we're here, both a and b must be dictionaries or subtypes thereof.

        # Compute set of all keys in both dictionaries.
        keys = set(a.keys()) | set(b.keys())

        # Build output dictionary, merging recursively values with common keys,
        # where `None` is used to mean the absence of a value.
        return {
            key: deep_merge(a.get(key), b.get(key))
            for key in keys
        }

如果你有一个未知级别的字典,那么我会建议一个递归函数:

def combineDicts(dictionary1, dictionary2):
    output = {}
    for item, value in dictionary1.iteritems():
        if dictionary2.has_key(item):
            if isinstance(dictionary2[item], dict):
                output[item] = combineDicts(value, dictionary2.pop(item))
        else:
            output[item] = value
    for item, value in dictionary2.iteritems():
         output[item] = value
    return output

基于@andrew cooke。这个版本处理字典的嵌套列表,还允许选项更新值

def merge(a, b, path=None, update=True):
    "http://stackoverflow.com/questions/7204805/python-dictionaries-of-dictionaries-merge"
    "merges b into a"
    if path is None: path = []
    for key in b:
        if key in a:
            if isinstance(a[key], dict) and isinstance(b[key], dict):
                merge(a[key], b[key], path + [str(key)])
            elif a[key] == b[key]:
                pass # same leaf value
            elif isinstance(a[key], list) and isinstance(b[key], list):
                for idx, val in enumerate(b[key]):
                    a[key][idx] = merge(a[key][idx], b[key][idx], path + [str(key), str(idx)], update=update)
            elif update:
                a[key] = b[key]
            else:
                raise Exception('Conflict at %s' % '.'.join(path + [str(key)]))
        else:
            a[key] = b[key]
    return a