index() returns only the first occurrence of an item in a list. Is there a neat trick that returns all indices of an element in a list?
Current answer
There's an answer using np.where to find the indices of a single value, which is not faster than a list comprehension if the time to convert the list to an array is included.
The overhead of importing numpy and converting a list to a numpy.array probably makes numpy a less efficient option in most circumstances. A careful timing analysis would be necessary. In cases where multiple functions/operations need to be performed on the list, converting the list to an array once and then using numpy functions will likely be a faster option.
This solution uses np.where and np.unique to find the indices of all unique elements in a list. Using np.where on an array (including the time to convert the list to an array) is slightly slower than a list comprehension on a list for finding all indices of all unique elements. This has been tested on a 2M-element list with 4 unique values; the size of the list/array and the number of unique elements will have an impact.
Other solutions using numpy on an array can be found in "Get a list of all indices of repeated elements in a numpy array".
Tested in [python 3.10.4, numpy 1.23.1] and [python 3.11.0, numpy 1.23.4].
import numpy as np
import random # to create test list
# create sample list
random.seed(365)
l = [random.choice(['s1', 's2', 's3', 's4']) for _ in range(20)]
# convert the list to an array for use with these numpy methods
a = np.array(l)
# create a dict of each unique entry and the associated indices
idx = {v: np.where(a == v)[0].tolist() for v in np.unique(a)}
# print(idx)
{'s1': [7, 9, 10, 11, 17],
's2': [1, 3, 6, 8, 14, 18, 19],
's3': [0, 2, 13, 16],
's4': [4, 5, 12, 15]}
%timeit on a 2M-element list with 4 unique str elements
# create 2M element list
random.seed(365)
l = [random.choice(['s1', 's2', 's3', 's4']) for _ in range(2000000)]
Functions
def test1():
    # np.where: convert list to array and find indices of a single element
    a = np.array(l)
    return np.where(a == 's1')

def test2():
    # list-comprehension: on list l and find indices of a single element
    return [i for i, x in enumerate(l) if x == "s1"]

def test3():
    # filter: on list l and find indices of a single element
    return list(filter(lambda i: l[i] == "s1", range(len(l))))

def test4():
    # use np.where and np.unique to find indices of all unique elements: convert list to array
    a = np.array(l)
    return {v: np.where(a == v)[0].tolist() for v in np.unique(a)}

def test5():
    # list comprehension inside dict comprehension: on list l and find indices of all unique elements
    return {req_word: [idx for idx, word in enumerate(l) if word == req_word] for req_word in set(l)}
Function calls
%timeit test1()
%timeit test2()
%timeit test3()
%timeit test4()
%timeit test5()
Results, python 3.10.4
214 ms ± 19.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
85.1 ms ± 1.48 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
146 ms ± 1.65 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
365 ms ± 11.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
360 ms ± 5.82 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Results, python 3.11.0
209 ms ± 15.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
70.4 ms ± 1.86 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
132 ms ± 4.65 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
371 ms ± 20.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
314 ms ± 15.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
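The %timeit calls above are IPython magics. For readers working in a plain Python interpreter, a comparable measurement can be sketched with the standard-library timeit module. The function names with_comprehension and with_filter, the list name sample, and the reduced list size are assumptions made for this illustration; they are not part of the original benchmark, and the numbers it prints will differ from those reported above.

```python
import random
import timeit

random.seed(365)
# Assumption: a 100k-element list (smaller than the 2M list above) so the sketch runs quickly.
sample = [random.choice(["s1", "s2", "s3", "s4"]) for _ in range(100_000)]

def with_comprehension():
    # same technique as test2: list comprehension over enumerate
    return [i for i, x in enumerate(sample) if x == "s1"]

def with_filter():
    # same technique as test3: filter over the index range
    return list(filter(lambda i: sample[i] == "s1", range(len(sample))))

for func in (with_comprehension, with_filter):
    # take the best of 3 repeats, each repeat calling the function 5 times
    best = min(timeit.repeat(func, number=5, repeat=3)) / 5
    print(f"{func.__name__}: {best * 1e3:.2f} ms per call")
```

Taking the minimum over several repeats is a common way to reduce noise from other processes; it is not the same statistic as the mean ± std. dev. that %timeit reports.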
Other answers
You can use a list comprehension with enumerate:
indices = [i for i, x in enumerate(my_list) if x == "whatever"]
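The iterator enumerate(my_list) yields a pair (index, item) for each item in the list. Using i, x as the loop-variable target unpacks each pair into the index i and the list item x. We filter down to every x that matches the criterion and select the indices i of those elements.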
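As a small illustration of the comprehension described above, here is a sketch on a made-up list. The helper name all_indices and the sample values are assumptions for the example, not part of the original answer.

```python
def all_indices(items, target):
    """Return every index i for which items[i] == target."""
    return [i for i, x in enumerate(items) if x == target]

my_list = ["a", "b", "a", "c", "a"]
print(all_indices(my_list, "a"))  # [0, 2, 4]
print(all_indices(my_list, "z"))  # [] (no match gives an empty list, not an error)
```

Unlike list.index, this does not raise ValueError when the value is absent.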
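Using a for-loop:
Answers that use enumerate and a list comprehension are more pythonic, but not necessarily faster. This answer, however, is aimed at students who may not be allowed to use some of those built-in features.
- Create an empty list, indices.
- Create the loop with for i in range(len(x)):, which essentially iterates through a list of index positions [0, 1, 2, 3, ..., len(x)-1].
- In the loop, append to indices any i for which x[i] matches value.
- x[i] accesses the list by index.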
def get_indices(x: list, value: int) -> list:
    indices = list()
    for i in range(len(x)):
        if x[i] == value:
            indices.append(i)
    return indices

n = [1, 2, 3, -50, -60, 0, 6, 9, -60, -60]
print(get_indices(n, -60))
>>> [4, 8, 9]
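The function get_indices is implemented with type hints. In this case, the list n is a list of ints, so the value we search for is also declared as an int.
Using a while-loop and .index:
With .index, use try-except for error handling, because a ValueError is raised if value is not in the list.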
def get_indices(x: list, value: int) -> list:
    indices = list()
    i = 0
    while True:
        try:
            # find an occurrence of value and update i to that index
            i = x.index(value, i)
            # add i to the list
            indices.append(i)
            # advance i by 1
            i += 1
        except ValueError:
            break
    return indices

print(get_indices(n, -60))
>>> [4, 8, 9]
A solution based on a dynamic list comprehension, in case we do not know in advance which element we are looking for:
lst = ['to', 'be', 'or', 'not', 'to', 'be']
{req_word: [idx for idx, word in enumerate(lst) if word == req_word] for req_word in set(lst)}
Result:
{'be': [1, 5], 'or': [2], 'to': [0, 4], 'not': [3]}
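You can approach all the other methods along the same lines, but with index() you can find only one index at a time, although you can choose which occurrence to look for yourself.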
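The dict comprehension above rescans the whole list once for every distinct value. A possible alternative, sketched here as an assumption rather than as part of the original answer, groups the indices in a single pass with collections.defaultdict. The helper name indices_by_value is hypothetical, and whether it is actually faster on a given list would need to be timed.

```python
from collections import defaultdict

def indices_by_value(items):
    """Map each distinct value to the list of positions where it occurs."""
    positions = defaultdict(list)
    for i, x in enumerate(items):
        # each element is visited exactly once
        positions[x].append(i)
    return dict(positions)

lst = ['to', 'be', 'or', 'not', 'to', 'be']
print(indices_by_value(lst))
# {'to': [0, 4], 'be': [1, 5], 'or': [2], 'not': [3]}
```

Like the set(lst) version, this requires the list elements to be hashable. Keys appear in order of first occurrence rather than in set order.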
Or use range (python 3):
l=[i for i in range(len(lst)) if lst[i]=='something...']
For python 2:
l=[i for i in xrange(len(lst)) if lst[i]=='something...']
And then (in both cases):
print(l)
prints the indices, as expected.