假设你有一本这样的字典:
{'a': 1,
'c': {'a': 2,
'b': {'x': 5,
'y' : 10}},
'd': [1, 2, 3]}
你会如何把它平摊成这样:
{'a': 1,
'c_a': 2,
'c_b_x': 5,
'c_b_y': 10,
'd': [1, 2, 3]}
假设你有一本这样的字典:
{'a': 1,
'c': {'a': 2,
'b': {'x': 5,
'y' : 10}},
'd': [1, 2, 3]}
你会如何把它平摊成这样:
{'a': 1,
'c_a': 2,
'c_b_x': 5,
'c_b_y': 10,
'd': [1, 2, 3]}
当前回答
这一变化扁平化嵌套字典,压缩键与max_level和自定义减速器。
def flatten(d, max_level=None, reducer='tuple'):
if reducer == 'tuple':
reducer_seed = tuple()
reducer_func = lambda x, y: (*x, y)
else:
raise ValueError(f'Unknown reducer: {reducer}')
def impl(d, pref, level):
return reduce(
lambda new_d, kv:
(max_level is None or level < max_level)
and isinstance(kv[1], dict)
and {**new_d, **impl(kv[1], reducer_func(pref, kv[0]), level + 1)}
or {**new_d, reducer_func(pref, kv[0]): kv[1]},
d.items(),
{}
)
return impl(d, reducer_seed, 0)
其他回答
def flatten_nested_dict(_dict, _str=''):
'''
recursive function to flatten a nested dictionary json
'''
ret_dict = {}
for k, v in _dict.items():
if isinstance(v, dict):
ret_dict.update(flatten_nested_dict(v, _str = '_'.join([_str, k]).strip('_')))
elif isinstance(v, list):
for index, item in enumerate(v):
if isinstance(item, dict):
ret_dict.update(flatten_nested_dict(item, _str= '_'.join([_str, k, str(index)]).strip('_')))
else:
ret_dict['_'.join([_str, k, str(index)]).strip('_')] = item
else:
ret_dict['_'.join([_str, k]).strip('_')] = v
return ret_dict
使用flatdict库:
dic={'a': 1,
'c': {'a': 2,
'b': {'x': 5,
'y' : 10}},
'd': [1, 2, 3]}
import flatdict
f = flatdict.FlatDict(dic,delimiter='_')
print(f)
#output
{'a': 1, 'c_a': 2, 'c_b_x': 5, 'c_b_y': 10, 'd': [1, 2, 3]}
我尝试了本页上的一些解决方案-虽然不是全部-但我尝试的那些都无法处理dict的嵌套列表。
考虑这样一个词典:
d = {
'owner': {
'name': {'first_name': 'Steven', 'last_name': 'Smith'},
'lottery_nums': [1, 2, 3, 'four', '11', None],
'address': {},
'tuple': (1, 2, 'three'),
'tuple_with_dict': (1, 2, 'three', {'is_valid': False}),
'set': {1, 2, 3, 4, 'five'},
'children': [
{'name': {'first_name': 'Jessica',
'last_name': 'Smith', },
'children': []
},
{'name': {'first_name': 'George',
'last_name': 'Smith'},
'children': []
}
]
}
}
以下是我的临时解决方案:
def flatten_dict(input_node: dict, key_: str = '', output_dict: dict = {}):
if isinstance(input_node, dict):
for key, val in input_node.items():
new_key = f"{key_}.{key}" if key_ else f"{key}"
flatten_dict(val, new_key, output_dict)
elif isinstance(input_node, list):
for idx, item in enumerate(input_node):
flatten_dict(item, f"{key_}.{idx}", output_dict)
else:
output_dict[key_] = input_node
return output_dict
生产:
{
owner.name.first_name: Steven,
owner.name.last_name: Smith,
owner.lottery_nums.0: 1,
owner.lottery_nums.1: 2,
owner.lottery_nums.2: 3,
owner.lottery_nums.3: four,
owner.lottery_nums.4: 11,
owner.lottery_nums.5: None,
owner.tuple: (1, 2, 'three'),
owner.tuple_with_dict: (1, 2, 'three', {'is_valid': False}),
owner.set: {1, 2, 3, 4, 'five'},
owner.children.0.name.first_name: Jessica,
owner.children.0.name.last_name: Smith,
owner.children.1.name.first_name: George,
owner.children.1.name.last_name: Smith,
}
一个临时的解决方案,但并不完美。 注意:
它不保留空字典,例如地址:{}k/v对。 它不会将嵌套元组中的字典平铺——尽管使用python元组类似于列表的事实很容易添加它。
使用生成器的Python 3.3解决方案:
def flattenit(pyobj, keystring=''):
if type(pyobj) is dict:
if (type(pyobj) is dict):
keystring = keystring + "_" if keystring else keystring
for k in pyobj:
yield from flattenit(pyobj[k], keystring + k)
elif (type(pyobj) is list):
for lelm in pyobj:
yield from flatten(lelm, keystring)
else:
yield keystring, pyobj
my_obj = {'a': 1, 'c': {'a': 2, 'b': {'x': 5, 'y': 10}}, 'd': [1, 2, 3]}
#your flattened dictionary object
flattened={k:v for k,v in flattenit(my_obj)}
print(flattened)
# result: {'c_b_y': 10, 'd': [1, 2, 3], 'c_a': 2, 'a': 1, 'c_b_x': 5}
如果你使用pandas,有一个函数隐藏在pandas.io.json中。_normalize1调用nested_to_record来完成这个操作。
from pandas.io.json._normalize import nested_to_record
flat = nested_to_record(my_dict, sep='_')
1在熊猫0.24版本。X及以上版本使用panda .io.json.normalize(不带_)