I need to merge multiple dictionaries. Here's what I have, for instance:

dict1 = {1:{"a":{A}}, 2:{"b":{B}}}

dict2 = {2:{"c":{C}}, 3:{"d":{D}}}

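A, B, C and D are leaves of the tree, like {"info1":"value", "info2":"value2"}

The level (depth) of the dictionaries is unknown; it could be {2:{"c":{"z":{"y":{C}}}}}

In my case it represents a directory/file structure, with nodes being documents and leaves being files.

I want to merge them to obtain: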

 dict3 = {1:{"a":{A}}, 2:{"b":{B},"c":{C}}, 3:{"d":{D}}}

I'm not sure how I could do that easily with Python.


Current answer

Here is a simple way to do it using generators:

def mergedicts(dict1, dict2):
    for k in set(dict1.keys()).union(dict2.keys()):
        if k in dict1 and k in dict2:
            if isinstance(dict1[k], dict) and isinstance(dict2[k], dict):
                yield (k, dict(mergedicts(dict1[k], dict2[k])))
            else:
                # If one of the values is not a dict, you can't continue merging it.
                # Value from second dict overrides one in first and we move on.
                yield (k, dict2[k])
                # Alternatively, replace this with exception raiser to alert you of value conflicts
        elif k in dict1:
            yield (k, dict1[k])
        else:
            yield (k, dict2[k])

dict1 = {1:{"a":"A"},2:{"b":"B"}}
dict2 = {2:{"c":"C"},3:{"d":"D"}}

print(dict(mergedicts(dict1, dict2)))

This prints (the key order inside the nested dicts may vary):

{1: {'a': 'A'}, 2: {'c': 'C', 'b': 'B'}, 3: {'d': 'D'}}
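
The comment in the function above suggests raising an exception instead of letting the second value win. Here is a minimal sketch of that variant, assuming Python 3. The name mergedicts_strict and the choice of ValueError are illustrative and not part of the original answer; identical values on both sides are treated as "no conflict", which is also an assumption.

```python
def mergedicts_strict(dict1, dict2):
    """Yield merged (key, value) pairs; raise on conflicting non-dict values."""
    for k in dict1.keys() | dict2.keys():
        if k in dict1 and k in dict2:
            if isinstance(dict1[k], dict) and isinstance(dict2[k], dict):
                # Both sides are dicts: merge them recursively.
                yield (k, dict(mergedicts_strict(dict1[k], dict2[k])))
            elif dict1[k] == dict2[k]:
                # Same value on both sides: nothing to resolve.
                yield (k, dict1[k])
            else:
                raise ValueError("conflicting values for key %r" % (k,))
        elif k in dict1:
            yield (k, dict1[k])
        else:
            yield (k, dict2[k])


merged = dict(mergedicts_strict({1: {"a": "A"}, 2: {"b": "B"}},
                                {2: {"c": "C"}, 3: {"d": "D"}}))
print(merged == {1: {"a": "A"}, 2: {"b": "B", "c": "C"}, 3: {"d": "D"}})  # True
```

Because this is a generator, the exception only surfaces once the result is consumed, for example by the dict(...) call.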

Other answers

Short-n-sweet:

from collections.abc import MutableMapping as Map

def nested_update(d, v):
    """
    Nested update of dict-like 'd' with dict-like 'v'.
    """
    for key in v:
        if key in d and isinstance(d[key], Map) and isinstance(v[key], Map):
            nested_update(d[key], v[key])
        else:
            d[key] = v[key]

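This works like (and is built on) Python's dict.update method. It returns None (you can always add return d if you prefer), because it updates the dict d in place. Keys in v overwrite any existing keys in d (it does not try to interpret the dict's contents).

It also works for other ("dict-like") mappings.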
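
A short usage sketch of the function above, restated with return d added (which the answer mentions as an option). The sample data and the use of collections.OrderedDict as the "dict-like" mapping are assumptions made for illustration.

```python
from collections import OrderedDict
from collections.abc import MutableMapping


def nested_update(d, v):
    """Nested update of dict-like 'd' with dict-like 'v'; returns 'd'."""
    for key in v:
        if (key in d and isinstance(d[key], MutableMapping)
                and isinstance(v[key], MutableMapping)):
            nested_update(d[key], v[key])
        else:
            d[key] = v[key]
    return d


target = OrderedDict({1: {"a": "A"}, 2: {"b": "B"}})
result = nested_update(target, {2: {"c": "C"}, 3: {"d": "D"}})

print(result is target)  # True: the update happens in place
print(dict(target))      # {1: {'a': 'A'}, 2: {'b': 'B', 'c': 'C'}, 3: {'d': 'D'}}
```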

The following builds on the answer from @andrew cooke. It handles nested lists in a better way.

def deep_merge_lists(original, incoming):
    """
    Deep merge two lists. Modifies original.
    Recursively call deep merge on each correlated element of list. 
    If item type in both elements are
     a. dict: Call deep_merge_dicts on both values.
     b. list: Recursively call deep_merge_lists on both values.
     c. any other type: Value is overridden.
     d. conflicting types: Value is overridden.

    If the incoming list is longer than the original, the extra values are appended.
    """
    common_length = min(len(original), len(incoming))
    for idx in range(common_length):
        if isinstance(original[idx], dict) and isinstance(incoming[idx], dict):
            deep_merge_dicts(original[idx], incoming[idx])

        elif isinstance(original[idx], list) and isinstance(incoming[idx], list):
            deep_merge_lists(original[idx], incoming[idx])

        else:
            original[idx] = incoming[idx]

    for idx in range(common_length, len(incoming)):
        original.append(incoming[idx])


def deep_merge_dicts(original, incoming):
    """
    Deep merge two dictionaries. Modifies original.
    For key conflicts if both values are:
     a. dict: Recursively call deep_merge_dicts on both values.
     b. list: Call deep_merge_lists on both values.
     c. any other type: Value is overridden.
     d. conflicting types: Value is overridden.

    """
    for key in incoming:
        if key in original:
            if isinstance(original[key], dict) and isinstance(incoming[key], dict):
                deep_merge_dicts(original[key], incoming[key])

            elif isinstance(original[key], list) and isinstance(incoming[key], list):
                deep_merge_lists(original[key], incoming[key])

            else:
                original[key] = incoming[key]
        else:
            original[key] = incoming[key]

I haven't tested this extensively, so your feedback is encouraged.
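
To make the list handling concrete, here is a small worked example. The two functions are restated in condensed form only so the snippet runs on its own; the config/override data is made up for illustration. Note that lists are merged position by position, not concatenated.

```python
def deep_merge_lists(original, incoming):
    """Merge 'incoming' into 'original' element by element (in place)."""
    common_length = min(len(original), len(incoming))
    for idx in range(common_length):
        if isinstance(original[idx], dict) and isinstance(incoming[idx], dict):
            deep_merge_dicts(original[idx], incoming[idx])
        elif isinstance(original[idx], list) and isinstance(incoming[idx], list):
            deep_merge_lists(original[idx], incoming[idx])
        else:
            original[idx] = incoming[idx]
    # Extra trailing items of 'incoming' are appended.
    original.extend(incoming[common_length:])


def deep_merge_dicts(original, incoming):
    """Merge 'incoming' into 'original' key by key (in place)."""
    for key, value in incoming.items():
        if isinstance(original.get(key), dict) and isinstance(value, dict):
            deep_merge_dicts(original[key], value)
        elif isinstance(original.get(key), list) and isinstance(value, list):
            deep_merge_lists(original[key], value)
        else:
            original[key] = value


# Hypothetical sample data.
config = {"servers": [{"host": "a", "port": 80}, {"host": "b"}], "tags": ["x"]}
override = {"servers": [{"port": 8080}], "tags": ["y", "z"], "debug": True}
deep_merge_dicts(config, override)

print(config["servers"])  # [{'host': 'a', 'port': 8080}, {'host': 'b'}]
print(config["tags"])     # ['y', 'z']
```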

# Hypothetical sample data; the original left the keys and values unspecified.
dict1 = dict(zip(["a", "b"], [1, 2]))
dict2 = dict(zip(["b", "c"], [3, 4]))


def mergeDict(dict1, dict2):
    # Start from a shallow union; for common keys the value from dict2 wins for now.
    dict3 = {**dict1, **dict2}
    for key, value in dict3.items():
        if key in dict1 and key in dict2:
            # Keep both values for common keys: [value from dict2, value from dict1].
            dict3[key] = [value, dict1[key]]
    return dict3


dict3 = mergeDict(dict1, dict2)

# Keys sorted alphabetically (dict3.keys() alone does not sort).
sorted(dict3.keys())

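This merges two dictionaries and collects the values of common keys into a list.

As noted in many other answers, a recursive algorithm makes the most sense here. In general, when working with recursion, it is preferable to create new values rather than trying to modify any input data structure.

We need to define what happens at each merge step. If both inputs are dictionaries, this is easy: we copy across the unique keys from each side, and recursively merge the values of the duplicated keys. It's the base cases that cause a problem. The logic is easier to understand if we pull out a separate function for that. As a placeholder, we can just wrap the two values in a tuple: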

def merge_leaves(x, y):
    return (x, y)

Now the core of our logic looks like this:

def merge(x, y):
    if not(isinstance(x, dict) and isinstance(y, dict)):
        return merge_leaves(x, y)
    x_keys, y_keys = x.keys(), y.keys()
    result = { k: merge(x[k], y[k]) for k in x_keys & y_keys }
    result.update({k: x[k] for k in x_keys - y_keys})
    result.update({k: y[k] for k in y_keys - x_keys})
    return result

Let's test it:

>>> x = {'a': {'b': 'c', 'd': 'e'}, 'f': 1, 'g': {'h', 'i'}, 'j': None}
>>> y = {'a': {'d': 'e', 'h': 'i'}, 'f': {'b': 'c'}, 'g': 1, 'k': None}
>>> merge(x, y)
{'f': (1, {'b': 'c'}), 'g': ({'h', 'i'}, 1), 'a': {'d': ('e', 'e'), 'b': 'c', 'h': 'i'}, 'j': None, 'k': None}
>>> x # The originals are unmodified.
{'a': {'b': 'c', 'd': 'e'}, 'f': 1, 'g': {'h', 'i'}, 'j': None}
>>> y
{'a': {'d': 'e', 'h': 'i'}, 'f': {'b': 'c'}, 'g': 1, 'k': None}

We can easily modify the leaf-merging rule, for example:

def merge_leaves(x, y):
    try:
        return x + y
    except TypeError:
        return Ellipsis

and observe the effects:

>>> merge(x, y)
{'f': Ellipsis, 'g': Ellipsis, 'a': {'d': 'ee', 'b': 'c', 'h': 'i'}, 'j': None, 'k': None}
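
One possible variation, sketched here as an assumption rather than as part of the original answer: pass the leaf rule in as a parameter instead of redefining a global merge_leaves. The parameter name and the prefer_right helper are hypothetical.

```python
def merge(x, y, merge_leaves=lambda a, b: (a, b)):
    """Return a new merged structure; 'merge_leaves' resolves non-dict pairs."""
    if not (isinstance(x, dict) and isinstance(y, dict)):
        return merge_leaves(x, y)
    result = {k: merge(x[k], y[k], merge_leaves) for k in x.keys() & y.keys()}
    result.update({k: x[k] for k in x.keys() - y.keys()})
    result.update({k: y[k] for k in y.keys() - x.keys()})
    return result


def prefer_right(a, b):
    # Hypothetical rule: on conflict, the value from the second input wins.
    return b


x = {"a": {"b": "c", "d": "e"}, "f": 1}
y = {"a": {"d": "E", "h": "i"}, "f": {"b": "c"}}

print(merge(x, y, prefer_right) == {"a": {"b": "c", "d": "E", "h": "i"}, "f": {"b": "c"}})  # True
print(merge(x, y)["f"])  # (1, {'b': 'c'})
```

This keeps several leaf rules usable side by side, at the cost of threading one extra argument through the recursion.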

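We can also potentially clean this up by using a third-party library to dispatch based on the types of the inputs. For example, using multipledispatch, we can do things like: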

from multipledispatch import dispatch

@dispatch(dict, dict)
def merge(x, y):
    x_keys, y_keys = x.keys(), y.keys()
    result = { k: merge(x[k], y[k]) for k in x_keys & y_keys }
    result.update({k: x[k] for k in x_keys - y_keys})
    result.update({k: y[k] for k in y_keys - x_keys})
    return result

@dispatch(str, str)
def merge(x, y):
    return x + y

@dispatch(tuple, tuple)
def merge(x, y):
    return x + y

@dispatch(list, list)
def merge(x, y):
    return x + y

@dispatch(int, int)
def merge(x, y):
    raise ValueError("integer value conflict")

@dispatch(object, object)
def merge(x, y):
    # Fallback for any other pair of types.
    return (x, y)

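This allows us to handle various combinations of leaf-type special cases without writing our own type checking, and it also replaces the type check in the main recursive function.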
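
If adding a third-party dependency is not an option, a similar effect can be approximated with a plain lookup table from the standard library alone. This is a hand-rolled sketch, not the library's mechanism: the HANDLERS table and the helper names are hypothetical, and it matches exact types only, so subclasses (for example bool as a subclass of int) fall through to the tuple fallback.

```python
def _concat(x, y):
    return x + y


def _int_conflict(x, y):
    raise ValueError("integer value conflict")


def _merge_dicts(x, y):
    result = {k: merge(x[k], y[k]) for k in x.keys() & y.keys()}
    result.update({k: x[k] for k in x.keys() - y.keys()})
    result.update({k: y[k] for k in y.keys() - x.keys()})
    return result


# Maps an exact (type, type) pair to its handler.
HANDLERS = {
    (dict, dict): _merge_dicts,
    (str, str): _concat,
    (tuple, tuple): _concat,
    (list, list): _concat,
    (int, int): _int_conflict,
}


def merge(x, y):
    # Exact type match only; any unlisted pair falls back to a tuple.
    handler = HANDLERS.get((type(x), type(y)))
    return handler(x, y) if handler else (x, y)


print(merge({"a": "x", "b": [1]}, {"a": "y", "b": [2], "c": None})
      == {"a": "xy", "b": [1, 2], "c": None})  # True
print(merge(1.5, "s"))  # (1.5, 's')
```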

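This is actually quite tricky, particularly if you want a useful error message when things are inconsistent, while correctly accepting duplicate but consistent entries (something no other answer here does).

Assuming you don't have a huge number of entries, a recursive function is easiest: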

from functools import reduce

def merge(a, b, path=None):
    "merges b into a"
    if path is None: path = []
    for key in b:
        if key in a:
            if isinstance(a[key], dict) and isinstance(b[key], dict):
                merge(a[key], b[key], path + [str(key)])
            elif a[key] == b[key]:
                pass # same leaf value
            else:
                raise Exception('Conflict at %s' % '.'.join(path + [str(key)]))
        else:
            a[key] = b[key]
    return a

# works
print(merge({1:{"a":"A"},2:{"b":"B"}}, {2:{"c":"C"},3:{"d":"D"}}))
# has conflict
merge({1:{"a":"A"},2:{"b":"B"}}, {1:{"a":"A"},2:{"b":"C"}})

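Note that this mutates a: the contents of b are added to a (which is also returned). If you want to keep a, you could call it like merge(dict(a), b). Be aware that dict(a) is only a shallow copy, so nested dicts inside a are still modified; copy.deepcopy(a) avoids that.

agf pointed out (below) that you may have more than two dicts, in which case you can use: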

reduce(merge, [dict1, dict2, dict3...])

Everything will be added to dict1.
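
A runnable sketch of the reduce call with three dicts. The merge function is restated in slightly condensed form so the snippet is self-contained, dict3 is made-up sample data, and the deep copies are an addition of this sketch: they keep all inputs untouched, whereas the plain call above accumulates into dict1.

```python
import copy
from functools import reduce


def merge(a, b, path=None):
    """Merge b into a in place; raise on conflicting leaf values."""
    if path is None:
        path = []
    for key in b:
        if key in a:
            if isinstance(a[key], dict) and isinstance(b[key], dict):
                merge(a[key], b[key], path + [str(key)])
            elif a[key] != b[key]:
                raise Exception('Conflict at %s' % '.'.join(path + [str(key)]))
        else:
            a[key] = b[key]
    return a


dict1 = {1: {"a": "A"}, 2: {"b": "B"}}
dict2 = {2: {"c": "C"}, 3: {"d": "D"}}
dict3 = {3: {"e": "E"}, 4: {"f": "F"}}  # hypothetical third input

# Deep-copy every input so that none of them (nor their nested dicts) is
# modified; drop the copies to accumulate into dict1 instead.
merged = reduce(merge, (copy.deepcopy(d) for d in [dict1, dict2, dict3]))

print(merged)
# {1: {'a': 'A'}, 2: {'b': 'B', 'c': 'C'}, 3: {'d': 'D', 'e': 'E'}, 4: {'f': 'F'}}
print(dict1)  # {1: {'a': 'A'}, 2: {'b': 'B'}}
```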

Note: I edited my initial answer to mutate the first argument; that makes the "reduce" easier to explain.