我想将两个词典融入一个新的词典。

x = {'a': 1, 'b': 2}
y = {'b': 3, 'c': 4}
z = merge(x, y)

>>> z
{'a': 1, 'b': 3, 'c': 4}

每当两个字典中都有一个关键 k 时,只应保留 y(k) 的值。


当前回答

重复 / 深度更新 a dict

def deepupdate(original, update):
    """
    Recursively update a dict.
    Subdict's won't be overwritten but also updated.
    """
    for key, value in original.iteritems(): 
        if key not in update:
            update[key] = value
        elif isinstance(value, dict):
            deepupdate(value, update[key]) 
    return update

示威:

pluto_original = {
    'name': 'Pluto',
    'details': {
        'tail': True,
        'color': 'orange'
    }
}

pluto_update = {
    'name': 'Pluutoo',
    'details': {
        'color': 'blue'
    }
}

print deepupdate(pluto_original, pluto_update)

结果:

{
    'name': 'Pluutoo',
    'details': {
        'color': 'blue',
        'tail': True
    }
}

谢谢Radnaw的编辑。

其他回答

您可以使用 toolz.merge([x, y]) 为此。

在 Python 3.9 中

基于PEP 584的,Python的新版本引入了两个新的词典操作器:union(<unk>)和in-place union(<unk>=)。您可以使用<unk>来结合两个词典,而<unk>=将更新一个词典:

>>> pycon = {2016: "Portland", 2018: "Cleveland"}
>>> europython = {2017: "Rimini", 2018: "Edinburgh", 2019: "Basel"}

>>> pycon | europython
{2016: 'Portland', 2018: 'Edinburgh', 2017: 'Rimini', 2019: 'Basel'}

>>> pycon |= europython
>>> pycon
{2016: 'Portland', 2018: 'Edinburgh', 2017: 'Rimini', 2019: 'Basel'}

使用<unk>的优点之一是它在不同的字典类型上工作,并通过合并保持类型:

>>> from collections import defaultdict
>>> europe = defaultdict(lambda: "", {"Norway": "Oslo", "Spain": "Madrid"})
>>> africa = defaultdict(lambda: "", {"Egypt": "Cairo", "Zimbabwe": "Harare"})

>>> europe | africa
defaultdict(<function <lambda> at 0x7f0cb42a6700>,
  {'Norway': 'Oslo', 'Spain': 'Madrid', 'Egypt': 'Cairo', 'Zimbabwe': 'Harare'})

>>> {**europe, **africa}
{'Norway': 'Oslo', 'Spain': 'Madrid', 'Egypt': 'Cairo', 'Zimbabwe': 'Harare'}

您可以使用默认定义,当您想要有效处理丢失的密钥时,请注意, <unk> 保留默认定义,而 {**europe, **africa} 不。

基本用途是更新现有字典,类似于.update():

>>> libraries = {
...     "collections": "Container datatypes",
...     "math": "Mathematical functions",
... }
>>> libraries |= {"zoneinfo": "IANA time zone support"}
>>> libraries
{'collections': 'Container datatypes', 'math': 'Mathematical functions',
 'zoneinfo': 'IANA time zone support'}

当您将字典与字典合并时,两个字典都必须具有适当的字典类型,另一方面,现场运营商(字典=)很高兴与任何字典类似的数据结构合作:

>>> libraries |= [("graphlib", "Functionality for graph-like structures")]
>>> libraries
{'collections': 'Container datatypes', 'math': 'Mathematical functions',
 'zoneinfo': 'IANA time zone support',
 'graphlib': 'Functionality for graph-like structures'}
from collections import Counter
dict1 = {'a':1, 'b': 2}
dict2 = {'b':10, 'c': 11}
result = dict(Counter(dict1) + Counter(dict2))

这应该解决你的问题。

我很想知道我能否用一行严格的方法击败接受答案的时间:

我尝试了5种方法,前面没有一个 - 所有一个线路 - 所有产生正确的答案 - 我无法接近。

所以......为了拯救你麻烦,也许满足好奇心:

import json
import yaml
import time
from ast import literal_eval as literal

def merge_two_dicts(x, y):
    z = x.copy()   # start with x's keys and values
    z.update(y)    # modifies z with y's keys and values & returns None
    return z

x = {'a':1, 'b': 2}
y = {'b':10, 'c': 11}

start = time.time()
for i in range(10000):
    z = yaml.load((str(x)+str(y)).replace('}{',', '))
elapsed = (time.time()-start)
print (elapsed, z, 'stringify yaml')

start = time.time()
for i in range(10000):
    z = literal((str(x)+str(y)).replace('}{',', '))
elapsed = (time.time()-start)
print (elapsed, z, 'stringify literal')

start = time.time()
for i in range(10000):
    z = eval((str(x)+str(y)).replace('}{',', '))
elapsed = (time.time()-start)
print (elapsed, z, 'stringify eval')

start = time.time()
for i in range(10000):
    z = {k:int(v) for k,v in (dict(zip(
            ((str(x)+str(y))
            .replace('}',' ')
            .replace('{',' ')
            .replace(':',' ')
            .replace(',',' ')
            .replace("'",'')
            .strip()
            .split('  '))[::2], 
            ((str(x)+str(y))
            .replace('}',' ')
            .replace('{',' ').replace(':',' ')
            .replace(',',' ')
            .replace("'",'')
            .strip()
            .split('  '))[1::2]
             ))).items()}
elapsed = (time.time()-start)
print (elapsed, z, 'stringify replace')

start = time.time()
for i in range(10000):
    z = json.loads(str((str(x)+str(y)).replace('}{',', ').replace("'",'"')))
elapsed = (time.time()-start)
print (elapsed, z, 'stringify json')

start = time.time()
for i in range(10000):
    z = merge_two_dicts(x, y)
elapsed = (time.time()-start)
print (elapsed, z, 'accepted')

结果:

7.693928956985474 {'c': 11, 'b': 10, 'a': 1} stringify yaml
0.29134678840637207 {'c': 11, 'b': 10, 'a': 1} stringify literal
0.2208399772644043 {'c': 11, 'b': 10, 'a': 1} stringify eval
0.1106564998626709 {'c': 11, 'b': 10, 'a': 1} stringify replace
0.07989692687988281 {'c': 11, 'b': 10, 'a': 1} stringify json
0.005082368850708008 {'c': 11, 'b': 10, 'a': 1} accepted

我從這裡學到的是,JSON的方法是最快的方式(那些試圖)從字典的字典返回;比我認為是正常的方法的速度更快(約四分之一的時間)我也學到,YAML的方法應該以任何代價避免。

是的,我明白这不是最好的 / 正确的方式. 我很好奇它是否更快,而且不是; 我发表以证明它是这样。

x = {'a': 1, 'b': 2}
y = {'b': 3, 'c': 4}

>>> z
{'a': 1, 'b': 3, 'c': 4}

z = {**x, **y}

z = {**x, 'foo': 1, 'bar': 2, **y}

>>> z
{'a': 1, 'b': 3, 'foo': 1, 'bar': 2, 'c': 4}

z = x.copy()
z.update(y) # which returns None since it mutates z

def merge_two_dicts(x, y):
    """Given two dictionaries, merge them into a new dict as a shallow copy."""
    z = x.copy()
    z.update(y)
    return z

z = merge_two_dicts(x, y)

def merge_dicts(*dict_args):
    """
    Given any number of dictionaries, shallow copy and merge into a new dict,
    precedence goes to key-value pairs in latter dictionaries.
    """
    result = {}
    for dictionary in dict_args:
        result.update(dictionary)
    return result

z = merge_dicts(a, b, c, d, e, f, g) 

和 g 的关键值对将先行于字典 a 到 f 等。

z = dict(x.items() + y.items())

>>> c = dict(a.items() + b.items())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'dict_items' and 'dict_items'

同样,在 Python 3 (viewitems() 在 Python 2.7) 中采取元素的合并也会失败,当值是不可破坏的对象(如列表,例如)。即使您的值是可破坏的,因为套件是无形的,行为与先例无定义。

>>> c = dict(a.items() | b.items())

>>> x = {'a': []}
>>> y = {'b': []}
>>> dict(x.items() | y.items())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'

>>> x = {'a': 2}
>>> y = {'a': 1}
>>> dict(x.items() | y.items())
{'a': 2}

另一个你不应该使用的黑客:

z = dict(x, **y)

字典的目的是采取可触摸的密钥(例如,frozensets或tuples),但这种方法在Python 3中失败,当密钥不是线条时。

>>> c = dict(a, **b)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: keyword arguments must be strings

dict(a=1, b=10, c=11)

{'a': 1, 'b': 10, 'c': 11}

>>> foo(**{('a', 'b'): None})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: foo() keywords must be strings
>>> dict(**{('a', 'b'): None})
{('a', 'b'): None}

我的答案: merge_two_dicts(x,y)实际上对我来说看起来更清楚,如果我们实际上对可读性感兴趣。

from copy import deepcopy

def dict_of_dicts_merge(x, y):
    z = {}
    overlapping_keys = x.keys() & y.keys()
    for key in overlapping_keys:
        z[key] = dict_of_dicts_merge(x[key], y[key])
    for key in x.keys() - overlapping_keys:
        z[key] = deepcopy(x[key])
    for key in y.keys() - overlapping_keys:
        z[key] = deepcopy(y[key])
    return z

>>> x = {'a':{1:{}}, 'b': {2:{}}}
>>> y = {'b':{10:{}}, 'c': {11:{}}}
>>> dict_of_dicts_merge(x, y)
{'b': {2: {}, 10: {}}, 'a': {1: {}}, 'c': {11: {}}}

{k: v for d in dicts for k, v in d.items()} # iteritems in Python 2.7

dict((k, v) for d in dicts for k, v in d.items()) # iteritems in Python 2

from itertools import chain
z = dict(chain(x.items(), y.items())) # iteritems in Python 2

from timeit import repeat
from itertools import chain

x = dict.fromkeys('abcdefg')
y = dict.fromkeys('efghijk')

def merge_two_dicts(x, y):
    z = x.copy()
    z.update(y)
    return z

min(repeat(lambda: {**x, **y}))
min(repeat(lambda: merge_two_dicts(x, y)))
min(repeat(lambda: {k: v for d in (x, y) for k, v in d.items()}))
min(repeat(lambda: dict(chain(x.items(), y.items()))))
min(repeat(lambda: dict(item for d in (x, y) for item in d.items())))

>>> min(repeat(lambda: {**x, **y}))
1.0804965235292912
>>> min(repeat(lambda: merge_two_dicts(x, y)))
1.636518670246005
>>> min(repeat(lambda: {k: v for d in (x, y) for k, v in d.items()}))
3.1779992282390594
>>> min(repeat(lambda: dict(chain(x.items(), y.items()))))
2.740647904574871
>>> min(repeat(lambda: dict(item for d in (x, y) for item in d.items())))
4.266070580109954
$ uname -a
Linux nixos 4.19.113 #1-NixOS SMP Wed Mar 25 07:06:15 UTC 2020 x86_64 GNU/Linux

词典中的资源