我需要合并多个字典,这是我有例如:

dict1 = {1:{"a":{A}}, 2:{"b":{B}}}

dict2 = {2:{"c":{C}}, 3:{"d":{D}}}

A、B、C和D是树的叶子,比如{"info1":"value", "info2":"value2"}

字典的级别(深度)未知,可能是{2:{"c":{"z":{"y":{c}}}}}

在我的例子中,它表示一个目录/文件结构,节点是文档,叶子是文件。

我想将它们合并得到:

 dict3 = {1:{"a":{A}}, 2:{"b":{B},"c":{C}}, 3:{"d":{D}}}

我不确定如何用Python轻松做到这一点。


当前回答

概述

下面的方法将字典的深度合并问题细分为:

使用A的参数化浅归并函数merge(f)(A,b) 函数f归并两个字典a和b 与归并一起使用的递归归并函数f


实现

合并两个(非嵌套的)字典的函数可以用很多种方式编写。我个人喜欢

def merge(f):
    def merge(a,b): 
        keys = a.keys() | b.keys()
        return {key:f(a.get(key), b.get(key)) for key in keys}
    return merge

定义一个合适的递归归并函数f的一个好方法是使用multidispatch,它允许定义函数根据参数的类型沿着不同的路径求值。

from multipledispatch import dispatch

#for anything that is not a dict return
@dispatch(object, object)
def f(a, b):
    return b if b is not None else a

#for dicts recurse 
@dispatch(dict, dict)
def f(a,b):
    return merge(f)(a,b)

例子

要合并两个嵌套字典,只需使用merge(f),例如:

dict1 = {1:{"a":"A"},2:{"b":"B"}}
dict2 = {2:{"c":"C"},3:{"d":"D"}}
merge(f)(dict1, dict2)
#returns {1: {'a': 'A'}, 2: {'b': 'B', 'c': 'C'}, 3: {'d': 'D'}} 

注:

这种方法的优点是:

该函数由较小的函数构建而成,每个函数只做一件事 这使得代码更容易推理和测试 这种行为不是硬编码的,但可以根据需要进行更改和扩展,从而提高代码重用(参见下面的示例)。


定制

一些答案还考虑了包含列表的字典,例如其他(可能嵌套的)字典。在这种情况下,可能需要映射列表并根据位置合并它们。这可以通过在归并函数f中添加另一个定义来实现:

import itertools
@dispatch(list, list)
def f(a,b):
    return [merge(f)(*arg) for arg in itertools.zip_longest(a, b)]

其他回答

这应该有助于将所有项从dict2合并到dict1:

for item in dict2:
    if item in dict1:
        for leaf in dict2[item]:
            dict1[item][leaf] = dict2[item][leaf]
    else:
        dict1[item] = dict2[item]

请测试一下,告诉我们这是否是你想要的。

编辑:

上述解决方案只合并了一个级别,但正确地解决了op给出的例子。如果合并多个级别,应该使用递归。

这实际上是相当棘手的-特别是如果你想要一个有用的错误消息时,事情是不一致的,同时正确地接受重复但一致的条目(这是这里没有其他答案做的..)。

假设你没有大量的条目,递归函数是最简单的:

from functools import reduce

def merge(a, b, path=None):
    "merges b into a"
    if path is None: path = []
    for key in b:
        if key in a:
            if isinstance(a[key], dict) and isinstance(b[key], dict):
                merge(a[key], b[key], path + [str(key)])
            elif a[key] == b[key]:
                pass # same leaf value
            else:
                raise Exception('Conflict at %s' % '.'.join(path + [str(key)]))
        else:
            a[key] = b[key]
    return a

# works
print(merge({1:{"a":"A"},2:{"b":"B"}}, {2:{"c":"C"},3:{"d":"D"}}))
# has conflict
merge({1:{"a":"A"},2:{"b":"B"}}, {1:{"a":"A"},2:{"b":"C"}})

注意,这会使a发生变化——b的内容被添加到a(也会返回a)。如果你想保留a,你可以叫它merge(dict(a) b)

Agf指出(下面),你可能有两个以上的字典,在这种情况下,你可以使用:

reduce(merge, [dict1, dict2, dict3...])

所有内容都将被添加到dict1中。

注意:我编辑了我的初始答案以改变第一个参数;这使得“reduce”更容易解释

def m(a,b):
    aa = {
        k : dict(a.get(k,{}), **v) for k,v in b.items()
        }
    aap = print(aa)
    return aap

d1 = {1:{"a":"A"}, 2:{"b":"B"}}

d2 = {2:{"c":"C"}, 3:{"d":"D"}}

dict1 = {1:{"a":{1}}, 2:{"b":{2}}}

dict2 = {2:{"c":{222}}, 3:{"d":{3}}}

m(d1,d2)

m(dict1,dict2)

"""
Output :

{2: {'b': 'B', 'c': 'C'}, 3: {'d': 'D'}}


{2: {'b': {2}, 'c': {222}}, 3: {'d': {3}}}

"""
from collections import defaultdict
from itertools import chain

class DictHelper:

@staticmethod
def merge_dictionaries(*dictionaries, override=True):
    merged_dict = defaultdict(set)
    all_unique_keys = set(chain(*[list(dictionary.keys()) for dictionary in dictionaries]))  # Build a set using all dict keys
    for key in all_unique_keys:
        keys_value_type = list(set(filter(lambda obj_type: obj_type != type(None), [type(dictionary.get(key, None)) for dictionary in dictionaries])))
        # Establish the object type for each key, return None if key is not present in dict and remove None from final result
        if len(keys_value_type) != 1:
            raise Exception("Different objects type for same key: {keys_value_type}".format(keys_value_type=keys_value_type))

        if keys_value_type[0] == list:
            values = list(chain(*[dictionary.get(key, []) for dictionary in dictionaries]))  # Extract the value for each key
            merged_dict[key].update(values)

        elif keys_value_type[0] == dict:
            # Extract all dictionaries by key and enter in recursion
            dicts_to_merge = list(filter(lambda obj: obj != None, [dictionary.get(key, None) for dictionary in dictionaries]))
            merged_dict[key] = DictHelper.merge_dictionaries(*dicts_to_merge)

        else:
            # if override => get value from last dictionary else make a list of all values
            values = list(filter(lambda obj: obj != None, [dictionary.get(key, None) for dictionary in dictionaries]))
            merged_dict[key] = values[-1] if override else values

    return dict(merged_dict)



if __name__ == '__main__':
  d1 = {'aaaaaaaaa': ['to short', 'to long'], 'bbbbb': ['to short', 'to long'], "cccccc": ["the is a test"]}
  d2 = {'aaaaaaaaa': ['field is not a bool'], 'bbbbb': ['field is not a bool']}
  d3 = {'aaaaaaaaa': ['filed is not a string', "to short"], 'bbbbb': ['field is not an integer']}
  print(DictHelper.merge_dictionaries(d1, d2, d3))

  d4 = {"a": {"x": 1, "y": 2, "z": 3, "d": {"x1": 10}}}
  d5 = {"a": {"x": 10, "y": 20, "d": {"x2": 20}}}
  print(DictHelper.merge_dictionaries(d4, d5))

输出:

{'bbbbb': {'to long', 'field is not an integer', 'to short', 'field is not a bool'}, 
'aaaaaaaaa': {'to long', 'to short', 'filed is not a string', 'field is not a bool'}, 
'cccccc': {'the is a test'}}

{'a': {'y': 20, 'd': {'x1': 10, 'x2': 20}, 'z': 3, 'x': 10}}

Short-n-sweet:

from collections.abc import MutableMapping as Map

def nested_update(d, v):
"""
Nested update of dict-like 'd' with dict-like 'v'.
"""

for key in v:
    if key in d and isinstance(d[key], Map) and isinstance(v[key], Map):
        nested_update(d[key], v[key])
    else:
        d[key] = v[key]

这类似于(并且构建在)Python的字典上。更新方法。它返回None(如果你喜欢,你总是可以添加返回d),因为它在原地更新dict d。v中的键将覆盖d中任何现有的键(它不会尝试解释字典的内容)。

它也适用于其他(“类字典”)映射。