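Getting the number of elements in an iterator in Python

Is there an efficient way, in general, to know how many elements an iterator in Python contains without iterating through each one and counting?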

Current answer

In theory this is impossible: it is, in fact, the halting problem.

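Proof

Assume, for the sake of contradiction, that a function len(g) could determine the length (or infinite length) of any generator g.

For any program P, convert P into a generator g(P): at every return or exit point in P, yield a value instead of returning it.

If len(g(P)) == infinity, P does not halt.

This would solve the halting problem, which is known to be impossible (see Wikipedia). Contradiction.

Therefore, it is impossible to count the elements of a generic generator without iterating over it (== actually running the whole program).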
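As a rough illustration of that transformation (a sketch only: `collatz_program` and `collatz_generator` are hypothetical names chosen here, and the Collatz iteration is just a convenient example of a loop whose termination is not obvious in general), the exit point of a program can be turned into a `yield`:

```python
def collatz_program(n):
    """An ordinary program P: returns once the Collatz iteration reaches 1."""
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
    return "halted"


def collatz_generator(n):
    """g(P): the same program, with the return point replaced by a yield."""
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
    yield "halted"


# The generator produces its single item only if P halts on n, so
# counting its items means running P to completion.
items = list(collatz_generator(6))
```

A length function that answered without running the loop would have to decide whether the loop ever terminates, which is exactly what the argument above rules out.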

More concretely, consider

def g():
    while True:
        yield "more?"

Its length is infinite, and there are infinitely many such generators.

2022-01-16 16:10:03

Other answers

As for your original question, the answer is still that in general there is no way to know the length of an iterator in Python.

Given that your question is motivated by an application of the pysam library, I can give a more specific answer: I'm a contributor to PySAM, and the definitive answer is that SAM/BAM files do not provide an exact count of aligned reads. Nor is this information easily available from a BAM index file. The best one can do is estimate the number of alignments by noting the location of the file pointer after reading a number of alignments and extrapolating from the total size of the file. This is enough to implement a progress bar, but it is not a method of counting alignments in constant time.
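The extrapolation idea can be sketched on a plain, uncompressed, line-based file. This is an assumption-laden stand-in, not pysam code: `estimate_record_count` and `sample_size` are hypothetical names, no pysam calls are used, and a real BAM file is compressed, so the byte accounting there would differ.

```python
import os
import tempfile


def estimate_record_count(path, sample_size=10):
    """Estimate the number of newline-terminated records in a file.

    Reads up to `sample_size` records, measures how many bytes they
    occupied, and extrapolates to the full file size.
    """
    total_bytes = os.path.getsize(path)
    bytes_read = 0
    records_read = 0
    with open(path, "rb") as handle:
        for line in handle:
            bytes_read += len(line)
            records_read += 1
            if records_read >= sample_size:
                break
    if bytes_read == 0:
        return 0
    # Extrapolate: records per byte in the sample times the total size.
    return round(records_read * total_bytes / bytes_read)


# Demo on a temporary file holding 100 records of 10 bytes each.
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(b"ACGTACGTA\n" * 100)
    demo_path = tmp.name

estimate = estimate_record_count(demo_path)
os.remove(demo_path)
```

The estimate is exact here only because every record has the same size. With variable-length records it is an approximation, which is good enough for a progress bar but not for an exact count.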

2010-08-17 18:57:51

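The usual practice is to put this kind of information in the file header and have pysam give you access to it. I don't know the format, but have you checked the API?

As others have said, you can't know the length from the iterator itself.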

2010-07-27 17:37:22

No, you can't (unless the type of the particular iterator implements some specific methods that make it possible).
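One example of such a method in the standard library is `__length_hint__`, which `operator.length_hint` consults. The snippet below is a small sketch of how that behaves for a list iterator versus a generator. The result is only a hint and may be inaccurate for other iterator types.

```python
from operator import length_hint

list_iterator = iter([10, 20, 30])
hint_before = length_hint(list_iterator)  # list iterators report remaining items
next(list_iterator)
hint_after = length_hint(list_iterator)   # one item has been consumed

generator = (x for x in range(5))
# Generators expose no hint, so the supplied default comes back instead.
hint_generator = length_hint(generator, -1)
```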

In general, the only way to count an iterator's items is to consume the iterator. One of the most efficient ways of doing that:

import itertools
from collections import deque

def count_iter_items(iterable):
    """
    Consume an iterable not reading it into memory; return the number of items.
    """
    counter = itertools.count()
    deque(zip(iterable, counter), maxlen=0)  # (consume at C speed)
    return next(counter)

(On Python 2, use itertools.izip in place of zip.)
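Whichever counting routine is used, counting consumes the iterator. A minimal sketch of that side effect, using a simpler and typically slower counter (`count_items` is a hypothetical helper name):

```python
def count_items(iterable):
    """Count items by consuming the iterable one element at a time."""
    return sum(1 for _ in iterable)


squares = (n * n for n in range(1000))
first_count = count_items(squares)
# The generator is now exhausted, so a second pass finds nothing.
second_count = count_items(squares)
```

If the items are still needed afterwards, they have to be stored somewhere (for example in a list), at which point `len()` gives the count directly.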

2013-02-27 12:22:39

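I like the cardinality package: it is very lightweight and tries to use the fastest implementation available, depending on the iterable.

Usage: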

>>> import cardinality
>>> cardinality.count([1, 2, 3])
3
>>> cardinality.count(i for i in range(500))
500
>>> def gen():
...     yield 'hello'
...     yield 'world'
>>> cardinality.count(gen())
2

The actual implementation of count() is as follows:

import collections

def count(iterable):
    if hasattr(iterable, '__len__'):
        return len(iterable)

    d = collections.deque(enumerate(iterable, 1), maxlen=1)
    return d[0][0] if d else 0
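To see why the `deque` trick works, here is a small trace of the same two steps on a four-item iterator and on an empty one (variable names are illustrative only):

```python
import collections

letters = iter("abcd")
# enumerate(..., 1) pairs each item with its 1-based position;
# maxlen=1 makes the deque discard all but the most recent pair.
last_pair = collections.deque(enumerate(letters, 1), maxlen=1)
count_nonempty = last_pair[0][0] if last_pair else 0

# An empty iterable leaves the deque empty, hence the fallback to 0.
empty = collections.deque(enumerate(iter(()), 1), maxlen=1)
count_empty = empty[0][0] if empty else 0
```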

2016-04-15 10:32:48

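This goes against the very definition of an iterator, which is a pointer to an object plus information about how to get to the next object.

An iterator does not know how many more times it can iterate before it terminates. That number could be infinite, so infinity might be your answer.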
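Since the count may be infinite, one practical compromise is to count only up to a cap. A minimal sketch, assuming a caller-chosen limit (`count_up_to` and `limit` are hypothetical names):

```python
import itertools


def count_up_to(iterable, limit):
    """Return the item count if it is at most `limit`, else None."""
    # Look at no more than limit + 1 items, so an endless iterator
    # still terminates after a bounded amount of work.
    seen = sum(1 for _ in itertools.islice(iterable, limit + 1))
    return seen if seen <= limit else None


short_result = count_up_to(iter(range(7)), 100)
endless_result = count_up_to(itertools.count(), 100)
```

A result of `None` only says "more than `limit` items". It cannot distinguish a very long iterator from an infinite one.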

2013-11-08 00:53:59
