我想将两个词典融入一个新的词典。
x = {'a': 1, 'b': 2}
y = {'b': 3, 'c': 4}
z = merge(x, y)
>>> z
{'a': 1, 'b': 3, 'c': 4}
每当两个字典中都有一个关键 k 时,只应保留 y(k) 的值。
我想将两个词典融入一个新的词典。
x = {'a': 1, 'b': 2}
y = {'b': 3, 'c': 4}
z = merge(x, y)
>>> z
{'a': 1, 'b': 3, 'c': 4}
每当两个字典中都有一个关键 k 时,只应保留 y(k) 的值。
当前回答
深深的定律:
from typing import List, Dict
from copy import deepcopy
def merge_dicts(*from_dicts: List[Dict], no_copy: bool=False) -> Dict :
""" no recursion deep merge of two dicts
By default creates fresh Dict and merges all to it.
no_copy = True, will merge all dicts to a fist one in a list without copy.
Why? Sometime I need to combine one dictionary from "layers".
The "layers" are not in use and dropped immediately after merging.
"""
if no_copy:
xerox = lambda x:x
else:
xerox = deepcopy
result = xerox(from_dicts[0])
for _from in from_dicts[1:]:
merge_queue = [(result, _from)]
for _to, _from in merge_queue:
for k, v in _from.items():
if k in _to and isinstance(_to[k], dict) and isinstance(v, dict):
# key collision add both are dicts.
# add to merging queue
merge_queue.append((_to[k], v))
continue
_to[k] = xerox(v)
return result
使用:
print("=============================")
print("merge all dicts to first one without copy.")
a0 = {"a":{"b":1}}
a1 = {"a":{"c":{"d":4}}}
a2 = {"a":{"c":{"f":5}, "d": 6}}
print(f"a0 id[{id(a0)}] value:{a0}")
print(f"a1 id[{id(a1)}] value:{a1}")
print(f"a2 id[{id(a2)}] value:{a2}")
r = merge_dicts(a0, a1, a2, no_copy=True)
print(f"r id[{id(r)}] value:{r}")
print("=============================")
print("create fresh copy of all")
a0 = {"a":{"b":1}}
a1 = {"a":{"c":{"d":4}}}
a2 = {"a":{"c":{"f":5}, "d": 6}}
print(f"a0 id[{id(a0)}] value:{a0}")
print(f"a1 id[{id(a1)}] value:{a1}")
print(f"a2 id[{id(a2)}] value:{a2}")
r = merge_dicts(a0, a1, a2)
print(f"r id[{id(r)}] value:{r}")
其他回答
x = {'a': 1, 'b': 2}
y = {'b': 3, 'c': 4}
>>> z
{'a': 1, 'b': 3, 'c': 4}
z = {**x, **y}
z = {**x, 'foo': 1, 'bar': 2, **y}
>>> z
{'a': 1, 'b': 3, 'foo': 1, 'bar': 2, 'c': 4}
z = x.copy()
z.update(y) # which returns None since it mutates z
def merge_two_dicts(x, y):
"""Given two dictionaries, merge them into a new dict as a shallow copy."""
z = x.copy()
z.update(y)
return z
z = merge_two_dicts(x, y)
def merge_dicts(*dict_args):
"""
Given any number of dictionaries, shallow copy and merge into a new dict,
precedence goes to key-value pairs in latter dictionaries.
"""
result = {}
for dictionary in dict_args:
result.update(dictionary)
return result
z = merge_dicts(a, b, c, d, e, f, g)
和 g 的关键值对将先行于字典 a 到 f 等。
z = dict(x.items() + y.items())
>>> c = dict(a.items() + b.items())
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'dict_items' and 'dict_items'
同样,在 Python 3 (viewitems() 在 Python 2.7) 中采取元素的合并也会失败,当值是不可破坏的对象(如列表,例如)。即使您的值是可破坏的,因为套件是无形的,行为与先例无定义。
>>> c = dict(a.items() | b.items())
>>> x = {'a': []}
>>> y = {'b': []}
>>> dict(x.items() | y.items())
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'
>>> x = {'a': 2}
>>> y = {'a': 1}
>>> dict(x.items() | y.items())
{'a': 2}
另一个你不应该使用的黑客:
z = dict(x, **y)
字典的目的是采取可触摸的密钥(例如,frozensets或tuples),但这种方法在Python 3中失败,当密钥不是线条时。
>>> c = dict(a, **b)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: keyword arguments must be strings
和
dict(a=1, b=10, c=11)
{'a': 1, 'b': 10, 'c': 11}
>>> foo(**{('a', 'b'): None})
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: foo() keywords must be strings
>>> dict(**{('a', 'b'): None})
{('a', 'b'): None}
我的答案: merge_two_dicts(x,y)实际上对我来说看起来更清楚,如果我们实际上对可读性感兴趣。
from copy import deepcopy
def dict_of_dicts_merge(x, y):
z = {}
overlapping_keys = x.keys() & y.keys()
for key in overlapping_keys:
z[key] = dict_of_dicts_merge(x[key], y[key])
for key in x.keys() - overlapping_keys:
z[key] = deepcopy(x[key])
for key in y.keys() - overlapping_keys:
z[key] = deepcopy(y[key])
return z
>>> x = {'a':{1:{}}, 'b': {2:{}}}
>>> y = {'b':{10:{}}, 'c': {11:{}}}
>>> dict_of_dicts_merge(x, y)
{'b': {2: {}, 10: {}}, 'a': {1: {}}, 'c': {11: {}}}
{k: v for d in dicts for k, v in d.items()} # iteritems in Python 2.7
dict((k, v) for d in dicts for k, v in d.items()) # iteritems in Python 2
from itertools import chain
z = dict(chain(x.items(), y.items())) # iteritems in Python 2
from timeit import repeat
from itertools import chain
x = dict.fromkeys('abcdefg')
y = dict.fromkeys('efghijk')
def merge_two_dicts(x, y):
z = x.copy()
z.update(y)
return z
min(repeat(lambda: {**x, **y}))
min(repeat(lambda: merge_two_dicts(x, y)))
min(repeat(lambda: {k: v for d in (x, y) for k, v in d.items()}))
min(repeat(lambda: dict(chain(x.items(), y.items()))))
min(repeat(lambda: dict(item for d in (x, y) for item in d.items())))
>>> min(repeat(lambda: {**x, **y}))
1.0804965235292912
>>> min(repeat(lambda: merge_two_dicts(x, y)))
1.636518670246005
>>> min(repeat(lambda: {k: v for d in (x, y) for k, v in d.items()}))
3.1779992282390594
>>> min(repeat(lambda: dict(chain(x.items(), y.items()))))
2.740647904574871
>>> min(repeat(lambda: dict(item for d in (x, y) for item in d.items())))
4.266070580109954
$ uname -a
Linux nixos 4.19.113 #1-NixOS SMP Wed Mar 25 07:06:15 UTC 2020 x86_64 GNU/Linux
词典中的资源
z1 = dict(x.items() + y.items())
z2 = dict(x, **y)
在我的机器上,至少(一个相当常见的x86_64运行Python 2.5.2),替代Z2不仅更短,更简单,而且更快。
% python -m timeit -s 'x=y=dict((i,i) for i in range(20))' 'z1=dict(x.items() + y.items())'
100000 loops, best of 3: 5.67 usec per loop
% python -m timeit -s 'x=y=dict((i,i) for i in range(20))' 'z2=dict(x, **y)'
100000 loops, best of 3: 1.53 usec per loop
示例2:不超越的字典,将252条短线地图到整条,反之亦然:
% python -m timeit -s 'from htmlentitydefs import codepoint2name as x, name2codepoint as y' 'z1=dict(x.items() + y.items())'
1000 loops, best of 3: 260 usec per loop
% python -m timeit -s 'from htmlentitydefs import codepoint2name as x, name2codepoint as y' 'z2=dict(x, **y)'
10000 loops, best of 3: 26.9 usec per loop
z2赢得了大约10的因素,这在我的书中是一个相当大的胜利!
在比较这两个之后,我想知道 z1 的不良性能是否可以归功于构建两个项目列表的顶端,这反过来导致我想知道这个变量是否会更好地工作:
from itertools import chain
z3 = dict(chain(x.iteritems(), y.iteritems()))
% python -m timeit -s 'from itertools import chain; from htmlentitydefs import codepoint2name as x, name2codepoint as y' 'z3=dict(chain(x.iteritems(), y.iteritems()))'
10000 loops, best of 3: 66 usec per loop
z0 = dict(x)
z0.update(y)
% python -m timeit -s 'from htmlentitydefs import codepoint2name as x, name2codepoint as y' 'z0=dict(x); z0.update(y)'
10000 loops, best of 3: 26.9 usec per loop
你也可以这样写作
z0 = x.copy()
z0.update(y)
正如托尼所做的那样,但(不令人惊讶)评分的差异显然没有对性能的测量效应。 使用任何人看起来对你是正确的。
用一个细致的理解,你可以
x = {'a':1, 'b': 2}
y = {'b':10, 'c': 11}
dc = {xi:(x[xi] if xi not in list(y.keys())
else y[xi]) for xi in list(x.keys())+(list(y.keys()))}
给予
>>> dc
{'a': 1, 'c': 11, 'b': 10}
注意合成,如果不明白
{ (some_key if condition else default_key):(something_if_true if condition
else something_if_false) for key, value in dict_.items() }
Python 3.9 + 仅限
合并(<unk>)和更新(<unk>=)运营商已被添加到内置的<unk>类。
>>> d = {'spam': 1, 'eggs': 2, 'cheese': 3}
>>> e = {'cheese': 'cheddar', 'aardvark': 'Ethel'}
>>> d | e
{'spam': 1, 'eggs': 2, 'cheese': 'cheddar', 'aardvark': 'Ethel'}
扩展任务版本在现场运行:
>>> d |= e
>>> d
{'spam': 1, 'eggs': 2, 'cheese': 'cheddar', 'aardvark': 'Ethel'}
查看 PEP 584
此外,当您使用.items() (Python 3.0 之前),您正在创建一个新的列表,包含从字典中的项目. 如果您的字典是大,那么它是相当多的顶部(两个大列表将被扔掉,一旦合并的字典创建)。更新() 可以更有效地工作,因为它可以通过第二个字典项目为项目。
在时间方面:
>>> timeit.Timer("dict(x, **y)", "x = dict(zip(range(1000), range(1000)))\ny=dict(zip(range(1000,2000), range(1000,2000)))").timeit(100000)
15.52571702003479
>>> timeit.Timer("temp = x.copy()\ntemp.update(y)", "x = dict(zip(range(1000), range(1000)))\ny=dict(zip(range(1000,2000), range(1000,2000)))").timeit(100000)
15.694622993469238
>>> timeit.Timer("dict(x.items() + y.items())", "x = dict(zip(range(1000), range(1000)))\ny=dict(zip(range(1000,2000), range(1000,2000)))").timeit(100000)
41.484580039978027
此外,字典创建的关键词论点仅在Python 2.3中添加,而复制()和更新()将在较旧版本中工作。