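I'm iterating over a list of tuples in Python, and I'm trying to remove them if they meet certain criteria.
for tup in somelist:
    if determine(tup):
        code_to_remove_tup
What should I use in place of code_to_remove_tup? I can't figure out how to remove the item in this fashion.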
Current answer
For those who like functional programming:
somelist[:] = filter(lambda tup: not determine(tup), somelist)
or
from itertools import ifilterfalse  # Python 2 only; in Python 3 the name is filterfalse
somelist[:] = list(ifilterfalse(determine, somelist))
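Other answers
You can use a list comprehension to create a new list containing only the elements you don't want to remove:
somelist = [x for x in somelist if not determine(x)]
Or, by assigning to the slice somelist[:], you can mutate the existing list so that it contains only the items you want:
somelist[:] = [x for x in somelist if not determine(x)]
This approach can be useful if there are other references to somelist that need to reflect the changes.
Instead of a comprehension, you could also use itertools. In Python 2:
from itertools import ifilterfalse
somelist[:] = ifilterfalse(determine, somelist)
Or in Python 3:
from itertools import filterfalse
somelist[:] = filterfalse(determine, somelist)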
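A small end-to-end run of the Python 3 variant may make this concrete. The predicate determine and the sample data below are made up for illustration; they are not from the question.

```python
from itertools import filterfalse

def determine(tup):
    # Hypothetical predicate: True means "remove this tuple".
    return tup[0] < 0

somelist = [(1, 'a'), (-2, 'b'), (3, 'c'), (-4, 'd')]

# filterfalse yields the items for which determine() is false;
# slice assignment writes them back into the same list object.
somelist[:] = filterfalse(determine, somelist)
print(somelist)  # [(1, 'a'), (3, 'c')]
```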
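The answers suggesting list comprehensions are ALMOST correct, except that they build a completely new list and then give it the same name as the old list; they do NOT modify the old list in place. That is different from selective removal, as in Lennart's suggestion: it's faster, but if your list is accessed via multiple references, you are only reseating one of those references and NOT altering the list object itself, which can lead to subtle, disastrous bugs.
Fortunately, it's extremely easy to get both the speed of a list comprehension AND the required semantics of in-place alteration. Just code: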
somelist[:] = [tup for tup in somelist if not determine(tup)]
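Note the subtle difference from the other answers: this one is NOT assigning to a bare name. It assigns to a list slice that just happens to be the entire list, thereby replacing the list contents within the same Python list object, rather than just reseating one reference (from the previous list object to a new list object) as the other answers do.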
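The difference only shows up when a second reference to the list exists. A minimal sketch, assuming a made-up predicate determine that returns True for items to remove, and a hypothetical second reference named other_ref:

```python
def determine(tup):
    # Hypothetical predicate: True means "remove this tuple".
    return tup[0] < 0

somelist = [(1, 'a'), (-2, 'b'), (3, 'c')]
other_ref = somelist  # a second reference to the same list object

# Bare-name assignment: somelist now points at a brand-new list.
somelist = [t for t in somelist if not determine(t)]
print(somelist)   # [(1, 'a'), (3, 'c')]
print(other_ref)  # [(1, 'a'), (-2, 'b'), (3, 'c')]  (old object, unchanged)

# Slice assignment: the existing list object is modified in place.
somelist = other_ref
somelist[:] = [t for t in somelist if not determine(t)]
print(other_ref)              # [(1, 'a'), (3, 'c')]
print(other_ref is somelist)  # True
```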
Overview of workarounds
Either:
- use a linked list implementation/roll your own. A linked list is the proper data structure to support efficient item removal, and does not force you to make space/time tradeoffs. A CPython list is implemented as a dynamic array, which is not a good data type to support removals. There doesn't seem to be a linked list in the standard library, however: see "Is there a linked list predefined library in Python?" and https://github.com/ajakubek/python-llist
- start a new list() from scratch, and .append() back at the end, as mentioned at: https://stackoverflow.com/a/1207460/895245 This is time efficient, but less space efficient, because it keeps an extra copy of the array around during iteration.
- use del with an index, as mentioned at: https://stackoverflow.com/a/1207485/895245 This is more space efficient since it dispenses with the array copy, but it is less time efficient, because removal from a dynamic array requires shifting all following items back by one, which is O(N).
Generally, if you are doing it quick and dirty and don't want to add a custom LinkedList class, you just want to go for the faster .append() option by default, unless memory is a big concern.
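The two array-based options can be sketched side by side. The function names, the is_even predicate, and the list size below are all illustrative assumptions, and the timing at the end is only a rough indication that varies by machine.

```python
import timeit

def remove_with_append(items, should_remove):
    """Build a fresh list of kept items, then copy it back in place."""
    kept = []
    for item in items:
        if not should_remove(item):
            kept.append(item)
    items[:] = kept  # O(N) time overall, but a second list exists temporarily

def remove_with_del(items, should_remove):
    """Delete matching items in place; each del shifts the tail of the array."""
    i = 0
    while i < len(items):
        if should_remove(items[i]):
            del items[i]  # O(N) per deletion, so O(N^2) in the worst case
        else:
            i += 1

def is_even(n):
    # Hypothetical predicate used only for this demonstration.
    return n % 2 == 0

a = list(range(10))
b = list(range(10))
remove_with_append(a, is_even)
remove_with_del(b, is_even)
print(a == b == [1, 3, 5, 7, 9])  # True

# Rough timing on a larger list; absolute numbers depend on the machine.
n = 20_000
t_append = timeit.timeit(lambda: remove_with_append(list(range(n)), is_even), number=1)
t_del = timeit.timeit(lambda: remove_with_del(list(range(n)), is_even), number=1)
print(f"append: {t_append:.4f}s  del: {t_del:.4f}s")
```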
Official Python 2 tutorial 4.2. "for Statements"
https://docs.python.org/2/tutorial/controlflow.html#for-statements
This part of the docs makes it clear that:
- you need to make a copy of the iterated list to modify it
- one way to do it is with the slice notation [:]
If you need to modify the sequence you are iterating over while inside the loop (for example to duplicate selected items), it is recommended that you first make a copy. Iterating over a sequence does not implicitly make a copy. The slice notation makes this especially convenient:
>>> words = ['cat', 'window', 'defenestrate']
>>> for w in words[:]:  # Loop over a slice copy of the entire list.
...     if len(w) > 6:
...         words.insert(0, w)
...
>>> words
['defenestrate', 'cat', 'window', 'defenestrate']
Python 2 documentation 7.3. "The for statement"
https://docs.python.org/2/reference/compound_stmts.html#for
This part of the docs says once again that you have to make a copy, and gives an actual removal example:
Note: There is a subtlety when the sequence is being modified by the loop (this can only occur for mutable sequences, i.e. lists). An internal counter is used to keep track of which item is used next, and this is incremented on each iteration. When this counter has reached the length of the sequence the loop terminates. This means that if the suite deletes the current (or a previous) item from the sequence, the next item will be skipped (since it gets the index of the current item which has already been treated). Likewise, if the suite inserts an item in the sequence before the current item, the current item will be treated again the next time through the loop. This can lead to nasty bugs that can be avoided by making a temporary copy using a slice of the whole sequence, e.g.,
for x in a[:]:
    if x < 0: a.remove(x)
However, I disagree with this implementation, since .remove() has to iterate over the entire list to find the value.
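The skipping behavior described in the quoted note is easy to reproduce. In this small sketch the sample values are arbitrary; the first loop omits the copy and leaves negative numbers behind, while the second follows the documented a[:] pattern.

```python
# Without a copy: removing the current item makes the loop skip the next one.
a = [-1, -2, 3, -4, -5]
for x in a:
    if x < 0:
        a.remove(x)
print(a)  # [-2, 3, -5]  (two negative values survived)

# With a slice copy: the loop walks the snapshot, so nothing is skipped.
b = [-1, -2, 3, -4, -5]
for x in b[:]:
    if x < 0:
        b.remove(x)
print(b)  # [3]
```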
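Could Python do this better?
It seems like this particular Python API could be improved. Compare it, for instance, with:
- Java ListIterator::remove, which documents "This call can only be made once per call to next or previous"
- C++ std::vector::erase, which returns a valid iterator to the element after the one removed
Both make it crystal clear that you cannot modify a list being iterated except through the iterator itself, and both give you efficient ways to do so without copying the list.
Perhaps the underlying rationale is that Python lists are assumed to be backed by dynamic arrays, so any type of removal will be time inefficient anyway, while Java has a nicer interface hierarchy with both ArrayList and LinkedList implementations of ListIterator.
There doesn't seem to be an explicit linked list type in the Python standard library either: see "Python Linked List".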
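Since nothing of the kind ships with the standard library, the "roll your own" option from the workarounds might look roughly like the sketch below. The Node and LinkedList classes and the remove_if method are hypothetical names invented for this illustration, not an existing API; the point is only that unlinking a node during traversal is O(1) and needs no copy.

```python
class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

class LinkedList:
    """Minimal singly linked list (illustrative only)."""

    def __init__(self, values=()):
        self.head = None
        tail = None
        for v in values:
            node = Node(v)
            if tail is None:
                self.head = node
            else:
                tail.next = node
            tail = node

    def __iter__(self):
        node = self.head
        while node is not None:
            yield node.value
            node = node.next

    def remove_if(self, should_remove):
        # A dummy node in front of the head avoids special-casing the first item.
        dummy = Node(None, self.head)
        prev = dummy
        while prev.next is not None:
            if should_remove(prev.next.value):
                prev.next = prev.next.next  # unlink in O(1), no shifting
            else:
                prev = prev.next
        self.head = dummy.next

ll = LinkedList([1, -2, 3, -4])
ll.remove_if(lambda v: v < 0)
print(list(ll))  # [1, 3]
```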
You can try for-looping in reverse, so for some_list you'll do something like:
list_len = len(some_list)
for i in range(list_len):
    reverse_i = list_len - 1 - i
    cur = some_list[reverse_i]
    # some logic with cur element
    if some_condition:
        some_list.pop(reverse_i)
This way the index stays aligned and is not affected by the list updates (regardless of whether you pop the cur element or not).
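A runnable variant of the same idea, with the placeholder condition filled in by an arbitrary "is negative" check and reversed() used to produce the descending indices directly:

```python
some_list = [5, -1, 7, -3, -8, 2]

# Walk the indices from last to first; popping index i only shifts
# items at higher indices, which have already been visited.
for i in reversed(range(len(some_list))):
    cur = some_list[i]
    if cur < 0:  # hypothetical condition standing in for some_condition
        some_list.pop(i)

print(some_list)  # [5, 7, 2]
```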
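Most of the answers here want you to create a copy of the list. I had a use case where the list was quite long (110K items) and it was smarter to keep reducing the list instead.
First of all, you'll need to replace the foreach loop with a while loop:
i = 0
while i < len(somelist):
    if determine(somelist[i]):
        del somelist[i]
    else:
        i += 1
The value of i is not changed in the if block because, once the old item is deleted, you'll want to get the value of the new item FROM THE SAME INDEX.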