Access multiple elements of a list knowing their indices

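I need to select some elements from a given list, knowing their indices. Say I want to create a new list containing the elements with indices 1, 2, 5 from the given list [-2, 1, 5, 3, 8, 5, 6]. What I did is: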

a = [-2,1,5,3,8,5,6]
b = [1,2,5]
c = [ a[i] for i in b]

Is there a better way to do it? Something like c = a[b]?

Current answer

Basic and not very extensive testing, comparing the execution time of the five supplied answers:

import operator
import numpy as np

def numpyIndexValues(a, b):
    na = np.array(a)
    nb = np.array(b)
    out = list(na[nb])
    return out

def mapIndexValues(a, b):
    out = map(a.__getitem__, b)
    return list(out)

def getIndexValues(a, b):
    out = operator.itemgetter(*b)(a)
    return out

def pythonLoopOverlap(a, b):
    c = [ a[i] for i in b]
    return c

multipleListItemValues = lambda searchList, ind: [searchList[i] for i in ind]

Using the following input:

a = range(0, 10000000)
b = range(500, 500000)

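The simple Python loop was the quickest, with the lambda operation a close second. mapIndexValues and getIndexValues were consistently very similar to each other, and the numpy method was significantly slower once the lists had to be converted to numpy arrays. If the data is already in numpy arrays, the numpyIndexValues method with the np.array conversion removed is the quickest.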

numpyIndexValues -> time:1.38940598 (when converted the lists to numpy arrays)
numpyIndexValues -> time:0.0193445 (using numpy array instead of python list as input, and conversion code removed)
mapIndexValues -> time:0.06477512099999999
getIndexValues -> time:0.06391049500000001
multipleListItemValues -> time:0.043773591
pythonLoopOverlap -> time:0.043021754999999995
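
The answer does not show how the timings were taken. A minimal sketch of one way to measure such functions with the standard-library `timeit` module follows. The helper name `time_call`, the repetition count, and the smaller input sizes are assumptions for illustration, not the original harness, so absolute numbers will differ.

```python
import operator
import timeit

def loop_index_values(a, b):
    # Plain list comprehension, as in pythonLoopOverlap above.
    return [a[i] for i in b]

def map_index_values(a, b):
    # map over the bound __getitem__ method, as in mapIndexValues above.
    return list(map(a.__getitem__, b))

def getter_index_values(a, b):
    # itemgetter returns a tuple when given two or more indices.
    return list(operator.itemgetter(*b)(a))

def time_call(func, a, b, repeat=5):
    """Hypothetical helper: best wall-clock time of `repeat` single calls."""
    return min(timeit.repeat(lambda: func(a, b), number=1, repeat=repeat))

# Smaller inputs than in the answer so the sketch runs quickly.
a = list(range(100_000))
b = list(range(500, 5_000))

timings = {f.__name__: time_call(f, a, b)
           for f in (loop_index_values, map_index_values, getter_index_values)}
for name, seconds in timings.items():
    print(f"{name} -> time:{seconds:.6f}")
```

Taking the minimum over several repeats is a common way to reduce noise from other processes; results still depend on the machine and the Python version.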

2015-09-11 06:54:52

Other answers

My answer does not use numpy or Python collections.

One trivial way to find the elements would be as follows:

a = [-2, 1, 5, 3, 8, 5, 6]
b = [1, 2, 5]
# Note: `i in b` tests the values of a against b, not their positions.
c = [i for i in a if i in b]

Drawback: this method may not work for larger lists. Using numpy is recommended for larger lists.
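
Because `i in b` compares the elements of `a` with `b` by value, the result above matches an indexed selection only for this particular data. A short sketch with hypothetical data (the list `letters` and the name `idx` are assumptions for illustration) shows how filtering by value differs from selecting by position:

```python
letters = ["w", "x", "y", "z"]   # hypothetical data
idx = [0, 2]

by_value = [item for item in letters if item in idx]   # tests values against idx
by_index = [letters[i] for i in idx]                   # picks positions 0 and 2

print(by_value)   # []
print(by_index)   # ['w', 'y']
```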

2014-08-28 10:02:08

Here is a simpler way:

a = [-2,1,5,3,8,5,6]
b = [1,2,5]
c = [e for i, e in enumerate(a) if i in b]
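
Two properties of this approach may be worth illustrating: the result follows the order of `a` rather than the order of `b`, and `i in b` scans the list `b` once per element of `a`. A hedged sketch, assuming that order and repeated indices do not matter, converts `b` to a set first (the names `wanted`, `c_sorted` and `c_direct` are illustrative):

```python
a = [-2, 1, 5, 3, 8, 5, 6]
b = [5, 1, 2, 1]          # unsorted, with a repeated index

wanted = set(b)           # constant-time membership tests instead of scanning a list
c_sorted = [e for i, e in enumerate(a) if i in wanted]
c_direct = [a[i] for i in b]

print(c_sorted)   # [1, 5, 5]      (order of a, repeated index used once)
print(c_direct)   # [5, 1, 5, 1]   (order of b, repeated index kept)
```

If the order of `b` or repeated indices are significant, the direct comprehension `[a[i] for i in b]` is the form that preserves them.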

2019-09-06 16:49:27

Another solution could be via a pandas Series:

import pandas as pd

a = pd.Series([-2, 1, 5, 3, 8, 5, 6])
b = [1, 2, 5]
c = a[b]

You can then convert c back to a list if you want:

c = list(c)

2017-09-16 17:56:55

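List comprehension is clearly the most immediate and easiest to remember, besides being quite pythonic!

In any case, among the proposed solutions, it is not the fastest (I ran my tests on Windows using Python 3.8.3):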

import timeit
from itertools import compress
import random
from operator import itemgetter
import numpy as np
import pandas as pd

__N_TESTS__ = 10_000

vector = [str(x) for x in range(100)]
filter_indeces = sorted(random.sample(range(100), 10))
filter_boolean = random.choices([True, False], k=100)

# Different ways of selecting elements given indices

# list comprehension
def f1(v, f):
    return [v[i] for i in f]

# itemgetter
def f2(v, f):
    return itemgetter(*f)(v)

# using pandas.Series
# this is immensely slow
def f3(v, f):
    return list(pd.Series(v)[f])

# using map and __getitem__
def f4(v, f):
    return list(map(v.__getitem__, f))

# using enumerate!
def f5(v, f):
    return [x for i, x in enumerate(v) if i in f]

# using numpy array
def f6(v, f):
    return list(np.array(v)[f])

print("{:30s}:{:f} secs".format("List comprehension", timeit.timeit(lambda: f1(vector, filter_indeces), number=__N_TESTS__)))
print("{:30s}:{:f} secs".format("Operator.itemgetter", timeit.timeit(lambda: f2(vector, filter_indeces), number=__N_TESTS__)))
print("{:30s}:{:f} secs".format("Using Pandas series", timeit.timeit(lambda: f3(vector, filter_indeces), number=__N_TESTS__)))
print("{:30s}:{:f} secs".format("Using map and __getitem__", timeit.timeit(lambda: f4(vector, filter_indeces), number=__N_TESTS__)))
print("{:30s}:{:f} secs".format("Enumeration (Why anyway?)", timeit.timeit(lambda: f5(vector, filter_indeces), number=__N_TESTS__)))
print("{:30s}:{:f} secs".format("Numpy", timeit.timeit(lambda: f6(vector, filter_indeces), number=__N_TESTS__)))

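My results are:

List comprehension            :0.007113 secs
Operator.itemgetter           :0.003247 secs
Using Pandas series           :2.977286 secs
Using map and __getitem__     :0.005029 secs
Enumeration (Why anyway?)     :0.135156 secs
Numpy                         :0.157018 secs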
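
Since `itemgetter` comes out fastest in this test, one caveat applies when using it as a drop-in replacement: it returns a tuple for two or more indices, a bare element for a single index, and raises `TypeError` when constructed with no indices at all. A minimal sketch of a wrapper that always returns a list follows; the function name `select` is hypothetical.

```python
from operator import itemgetter

def select(values, indices):
    """Return the items at the given indices as a list (hypothetical helper)."""
    indices = list(indices)
    if not indices:
        return []                                  # itemgetter() needs at least one item
    if len(indices) == 1:
        return [values[indices[0]]]                # itemgetter(i) returns a bare element
    return list(itemgetter(*indices)(values))      # tuple for two or more indices

a = [-2, 1, 5, 3, 8, 5, 6]
print(select(a, [1, 2, 5]))   # [1, 5, 5]
print(select(a, [3]))         # [3]
print(select(a, []))          # []
```

The extra branches add a small overhead, so the timing advantage measured above may shrink for short index lists.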

2021-10-09 14:35:09
