Suppose I have this:
[
{"name": "Tom", "age": 10},
{"name": "Mark", "age": 5},
{"name": "Pam", "age": 7}
]
By searching for "Pam" as the name, I want to retrieve the matching dictionary: {"name": "Pam", "age": 7}
How can I do this?
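Current answer
I tested various ways of going through a list of dictionaries and returning the dictionaries in which key x has a certain value.
Results:
Speed: list comprehension > generator expression >> normal list iteration >>> filter. All of them scale linearly with the number of dicts in the list (10x list size -> 10x time). The number of keys per dictionary does not significantly affect speed, even for large numbers (thousands) of keys. See the graph I calculated: https://imgur.com/a/quQzv (method names are given below).
All tests were done with Python 3.6.4, W7x64.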
from random import randint
from timeit import timeit

list_dicts = []
for _ in range(1000):  # number of dicts in the list
    dict_tmp = {}
    for i in range(10):  # number of keys for each dict
        dict_tmp[f"key{i}"] = randint(0, 50)
    list_dicts.append(dict_tmp)

def a():
    # normal iteration over all elements
    for dict_ in list_dicts:
        if dict_["key3"] == 20:
            pass

def b():
    # use 'generator'
    for dict_ in (x for x in list_dicts if x["key3"] == 20):
        pass

def c():
    # use 'list'
    for dict_ in [x for x in list_dicts if x["key3"] == 20]:
        pass

def d():
    # use 'filter'
    for dict_ in filter(lambda x: x['key3'] == 20, list_dicts):
        pass
Results:
1.7303 # normal list iteration
1.3849 # generator expression
1.3158 # list comprehension
7.7848 # filter
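The answer does not show how the functions were timed. Below is a minimal sketch of one way to do it with `timeit`, covering only two of the four variants. The function names, the repeat count of 200, and the dict-comprehension setup are assumptions for illustration, not the author's actual harness, and absolute numbers will differ from machine to machine.

```python
from random import randint
from timeit import timeit

# Assumed test data: 1000 dicts with 10 integer-valued keys each.
list_dicts = [{f"key{i}": randint(0, 50) for i in range(10)} for _ in range(1000)]

def plain_loop():
    # normal iteration over all elements
    for dict_ in list_dicts:
        if dict_["key3"] == 20:
            pass

def list_comprehension():
    # build the filtered list first, then iterate over it
    for dict_ in [x for x in list_dicts if x["key3"] == 20]:
        pass

# number=200 is an assumed repeat count; timeit accepts a callable directly.
timings = {
    fn.__name__: timeit(fn, number=200)
    for fn in (plain_loop, list_comprehension)
}
for name, seconds in timings.items():
    print(f"{seconds:.4f} # {name}")
```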
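Other answers
Adding a little bit to @FrédéricHamidi's answer.
If you are not sure that the key is present in every dictionary in the list, something like this helps: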
next((item for item in dicts if item.get("name") and item["name"] == "Pam"), None)
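To see why `.get()` matters here, the sketch below uses a made-up list in which one entry has no "name" key. Indexing with `item["name"]` would raise `KeyError` on that entry, while `item.get("name")` returns `None`, which simply fails the comparison. Since `None != "Pam"`, the comparison on `.get()` alone is enough in this case.

```python
# Sample data is hypothetical: the second dict deliberately lacks "name".
dicts = [
    {"name": "Tom", "age": 10},
    {"age": 5},
    {"name": "Pam", "age": 7},
]

# .get() returns None for the entry without a "name" key instead of raising.
found = next((item for item in dicts if item.get("name") == "Pam"), None)
missing = next((item for item in dicts if item.get("name") == "Sam"), None)

print(found)    # {'name': 'Pam', 'age': 7}
print(missing)  # None
```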
Have you tried the pandas package? It is well suited to this kind of search task and is optimized as well.
import pandas as pd
listOfDicts = [
{"name": "Tom", "age": 10},
{"name": "Mark", "age": 5},
{"name": "Pam", "age": 7}
]
# Create a data frame, keys are used as column headers.
# Dict items with the same key are entered into the same respective column.
df = pd.DataFrame(listOfDicts)
# The pandas dataframe allows you to pick out specific values like so:
df2 = df[ (df['name'] == 'Pam') & (df['age'] == 7) ]
# Alternate syntax, same thing
df2 = df[ (df.name == 'Pam') & (df.age == 7) ]
I have added some benchmarks below to illustrate pandas' faster runtimes at larger scale (i.e. 100k+ entries):
setup_large = 'dicts = [];\
[dicts.extend(({ "name": "Tom", "age": 10 },{ "name": "Mark", "age": 5 },\
{ "name": "Pam", "age": 7 },{ "name": "Dick", "age": 12 })) for _ in range(25000)];\
from operator import itemgetter;import pandas as pd;\
df = pd.DataFrame(dicts);'
setup_small = 'dicts = [];\
dicts.extend(({ "name": "Tom", "age": 10 },{ "name": "Mark", "age": 5 },\
{ "name": "Pam", "age": 7 },{ "name": "Dick", "age": 12 }));\
from operator import itemgetter;import pandas as pd;\
df = pd.DataFrame(dicts);'
method1 = '[item for item in dicts if item["name"] == "Pam"]'
method2 = 'df[df["name"] == "Pam"]'
import timeit
t = timeit.Timer(method1, setup_small)
print('Small Method LC: ' + str(t.timeit(100)))
t = timeit.Timer(method2, setup_small)
print('Small Method Pandas: ' + str(t.timeit(100)))
t = timeit.Timer(method1, setup_large)
print('Large Method LC: ' + str(t.timeit(100)))
t = timeit.Timer(method2, setup_large)
print('Large Method Pandas: ' + str(t.timeit(100)))
#Small Method LC: 0.000191926956177
#Small Method Pandas: 0.044392824173
#Large Method LC: 1.98827004433
#Large Method Pandas: 0.324505090714
@Frédéric Hamidi's answer is great. In Python 3.x the syntax of .next() changed slightly, so a small modification is needed:
>>> dicts = [
{ "name": "Tom", "age": 10 },
{ "name": "Mark", "age": 5 },
{ "name": "Pam", "age": 7 },
{ "name": "Dick", "age": 12 }
]
>>> next(item for item in dicts if item["name"] == "Pam")
{'age': 7, 'name': 'Pam'}
As @Matt mentioned in the comments, you can add a default value:
>>> next((item for item in dicts if item["name"] == "Pam"), False)
{'name': 'Pam', 'age': 7}
>>> next((item for item in dicts if item["name"] == "Sam"), False)
False
>>>
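The default matters because `next()` without one raises `StopIteration` when the generator is exhausted. A small sketch of both behaviours follows; `find_first` is a hypothetical helper name introduced here for illustration, not part of the answer.

```python
dicts = [
    {"name": "Tom", "age": 10},
    {"name": "Mark", "age": 5},
    {"name": "Pam", "age": 7},
    {"name": "Dick", "age": 12},
]

def find_first(items, key, value, default=None):
    """Return the first dict whose `key` equals `value`, else `default`."""
    return next((item for item in items if item.get(key) == value), default)

# Without a default, a failed search raises StopIteration.
try:
    next(item for item in dicts if item["name"] == "Sam")
    raised = False
except StopIteration:
    raised = True

print(raised)                             # True
print(find_first(dicts, "name", "Pam"))   # {'name': 'Pam', 'age': 7}
print(find_first(dicts, "name", "Sam"))   # None
```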
A simple way is to use a list comprehension. If l is the list
l = [
{"name": "Tom", "age": 10},
{"name": "Mark", "age": 5},
{"name": "Pam", "age": 7}
]
then
[d['age'] for d in l if d['name']=='Tom']
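Note that the expression above yields the ages of the matching entries rather than the dictionaries themselves. A sketch of both forms on the same list, with an assumed fallback to `None` when nothing matches:

```python
l = [
    {"name": "Tom", "age": 10},
    {"name": "Mark", "age": 5},
    {"name": "Pam", "age": 7},
]

ages = [d["age"] for d in l if d["name"] == "Tom"]   # values only
matches = [d for d in l if d["name"] == "Pam"]       # whole dictionaries
first = matches[0] if matches else None               # None when nothing matched

print(ages)     # [10]
print(matches)  # [{'name': 'Pam', 'age': 7}]
```

Unlike `next()` on a generator, the comprehension always scans the whole list, which is what you want when several entries may share the same name.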
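You can achieve this with the built-in filter and next functions in Python.
filter filters the given sequence and returns an iterator. next accepts an iterator and returns its next element.
So you can find the element with: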
my_dict = [
{"name": "Tom", "age": 10},
{"name": "Mark", "age": 5},
{"name": "Pam", "age": 7}
]
next(filter(lambda obj: obj.get('name') == 'Pam', my_dict), None)
The output is:
{'name': 'Pam', 'age': 7}
Note: the above code returns None if the name being searched for is not found.