What is the difference between iterators and generators? Some examples of when you would use each would be helpful.


Current answer

For the same data, you can compare the two approaches:

def myGeneratorList(n):
    for i in range(n):
        yield i

def myIterableList(n):
    ll = n*[None]
    for i in range(n):
        ll[i] = i
    return ll

# Same values
ll1 = myGeneratorList(10)
ll2 = myIterableList(10)
for i1, i2 in zip(ll1, ll2):
    print("{} {}".format(i1, i2))

# Generator can only be read once
ll1 = myGeneratorList(10)
ll2 = myIterableList(10)

print("{} {}".format(len(list(ll1)), len(ll2)))
print("{} {}".format(len(list(ll1)), len(ll2)))

# Generator output can be read several times once it is stored in a list
ll1 = list(myGeneratorList(10))
ll2 = myIterableList(10)

print("{} {}".format(len(list(ll1)), len(ll2)))
print("{} {}".format(len(list(ll1)), len(ll2)))

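In addition, if you check the memory footprint, the generator uses much less memory, because it does not need to hold all the values in memory at the same time.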
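A rough way to see the memory difference is sys.getsizeof. This is only a sketch: my_generator_list is a renamed copy of the generator above, the size n is an arbitrary choice, and getsizeof reports the container itself (for the list, the array of references), not the integers it points to.

```python
import sys

def my_generator_list(n):
    # Same generator as above: values are produced one at a time
    for i in range(n):
        yield i

n = 100_000  # arbitrary size chosen for illustration
as_generator = my_generator_list(n)
as_list = list(range(n))

# Size of the generator object vs. size of the list's reference array
generator_size = sys.getsizeof(as_generator)
list_size = sys.getsizeof(as_list)

print("generator: {} bytes, list: {} bytes".format(generator_size, list_size))
```

The generator's size stays the same whatever n is, while the list grows with n.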

Other answers

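The previous answers missed this point: a generator has a close method, while typical iterators don't. The close method raises a GeneratorExit exception inside the generator at the point where it is paused, so a finally clause in the generator gets a chance to run some clean-up. This abstraction makes generators more useful than simple iterators in larger programs. You can close a generator as you would close a file, without having to worry about what's underneath.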
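A minimal sketch of the close behaviour described above. The names read_items and cleanup_log are made up for illustration; the finally clause stands in for whatever clean-up a real generator would need.

```python
def read_items(cleanup_log):
    try:
        for i in range(100):
            yield i
    finally:
        # Runs when the generator is closed or exhausted
        cleanup_log.append("cleaned up")

cleanup_log = []
items = read_items(cleanup_log)

first = next(items)   # the generator is now paused at the yield
items.close()         # GeneratorExit is raised at the paused yield

# A closed generator is finished: next() gets the default instead of a value
after_close = next(items, "finished")
print(first, cleanup_log, after_close)
```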

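That said, my personal answer to the first question would be: an iterable only needs an __iter__ method, an iterator adds a __next__ method (the protocol also expects an __iter__ that returns the iterator itself), and a generator has both __iter__ and __next__ plus an additional close.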

For the second question, my personal answer would be: in a public interface, I tend to favor generators a lot, since they are more resilient: they have the close method and greater composability with yield from. Locally, I may use iterators, but only if the structure is flat and simple (iterators do not compose easily) and if there are reasons to believe the sequence is rather short, especially if it may be stopped before it reaches the end. I tend to look at iterators as a low-level primitive, except as literals.
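As one illustration of composing generators with yield from, here is a small sketch that flattens nested lists. The function name flatten and the sample data are assumptions, not part of the original answer.

```python
def flatten(nested):
    for item in nested:
        if isinstance(item, list):
            # Delegate to a sub-generator; its values flow straight through
            yield from flatten(item)
        else:
            yield item

flat = list(flatten([1, [2, [3, 4]], 5]))
print(flat)
```

Writing the same thing as a hand-written iterator class would need an explicit stack to track the position in each nested list.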

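For control flow, generators are a concept as important as promises: both are abstract and composable.

It is hard to answer this question without two other concepts: iterables and the iterator protocol.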

What is the difference between an iterator and an iterable? Conceptually, you iterate over an iterable with the help of its corresponding iterator. A few differences help to tell them apart in practice. First, an iterator has a __next__ method and an iterable does not. Second, both have an __iter__ method, but on an iterable it returns the corresponding iterator, while on an iterator it returns the iterator itself.

>>> x = [1, 2, 3]
>>> dir(x) 
[... __iter__ ...]
>>> x_iter = iter(x)
>>> dir(x_iter)
[... __iter__ ... __next__ ...]
>>> type(x_iter)
list_iterator
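The same checks can be written as a short script instead of reading dir() output. This is a sketch with made-up variable names:

```python
numbers = [1, 2, 3]
numbers_iter = iter(numbers)

# An iterable has no __next__; its iterator does
list_has_next = hasattr(numbers, "__next__")
iter_has_next = hasattr(numbers_iter, "__next__")

# __iter__ on an iterator returns the iterator itself
iter_returns_self = iter(numbers_iter) is numbers_iter

# __iter__ on an iterable returns a fresh iterator each time
list_returns_new = iter(numbers) is not iter(numbers)

print(list_has_next, iter_has_next, iter_returns_self, list_returns_new)
```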

What are iterables in Python? list, str, range, etc. What are iterators? enumerate, zip, reversed, etc. We can check this using the approach above. It is somewhat confusing, and it might be easier if there were only one kind of object. So why are range and zip different? One reason is that range has a lot of additional functionality: we can index it, check whether it contains some number, and so on (see details here).

How can we create an iterator ourselves? In theory we can implement the iterator protocol (see here): write the __next__ and __iter__ methods, raise StopIteration when the items run out, and so on (see Alex Martelli's answer for an example and possible motivation, see also here). In practice we use generators, which seem to be by far the most common way to create iterators in Python.
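To make the range versus zip contrast concrete, here is a small sketch with arbitrary sample values:

```python
evens_range = range(0, 10, 2)   # an iterable with sequence features
pairs = zip("ab", [1, 2])       # an iterator

# range supports indexing, membership tests and len
third = evens_range[2]
has_six = 6 in evens_range
length = len(evens_range)

# range can be iterated repeatedly; zip is exhausted after one pass
range_first_pass = list(evens_range)
range_second_pass = list(evens_range)
zip_first_pass = list(pairs)
zip_second_pass = list(pairs)

print(third, has_six, length)
print(zip_first_pass, zip_second_pass)
```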

I can give you some more interesting examples that show confusing uses of these concepts in practice:

In Keras we have tf.keras.preprocessing.image.ImageDataGenerator. This class has no __next__ or __iter__ method, so it is not an iterator (or a generator). If you call its flow_from_dataframe() method you get a DataFrameIterator, which does have those methods, but it never raises StopIteration (which is unusual among built-in iterators in Python). The documentation says "A DataFrameIterator yielding tuples of (x, y)", again a confusing use of terminology. Keras also has a Sequence class, a custom implementation of generator functionality (regular generators are not suitable for multithreading), but it does not implement __next__ and __iter__; rather, it is a wrapper around generators (it uses the yield statement).

A no-code, 4-line cheat sheet:

A generator function is a function with yield in it.

A generator expression is like a list comprehension. It uses "()" vs "[]"

A generator object (often called 'a generator') is returned by both above.

A generator is also a subtype of iterator.
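For readers who prefer to verify the cheat sheet, here is a sketch that checks each line. The function name squares is made up for illustration.

```python
import types
from collections.abc import Iterator

def squares(n):
    # A generator function: a function with yield in it
    for i in range(n):
        yield i * i

from_function = squares(3)
from_expression = (i * i for i in range(3))   # "()" instead of "[]"

# Both are generator objects, and generator objects are iterators
both_generators = (isinstance(from_function, types.GeneratorType)
                   and isinstance(from_expression, types.GeneratorType))
both_iterators = (isinstance(from_function, Iterator)
                  and isinstance(from_expression, Iterator))

print(both_generators, both_iterators)
```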

I highly recommend Ned Batchelder's examples of iterators and generators.

A function without generators that does something with even numbers:

def evens(stream):
    them = []
    for n in stream:
        if n % 2 == 0:
            them.append(n)
    return them

While using a generator:

def evens(stream):
    for n in stream:
        if n % 2 == 0:
            yield n

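We don't need a list or a return statement. It is efficient for large or infinitely long streams: it just walks along and yields values.

Calling the evens function (generator) works the same as usual: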

num = [...]
for n in evens(num):
   do_smth(n)
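To illustrate the point about infinite streams, here is a small sketch that feeds the generator version of evens an endless counter. itertools.count and itertools.islice are standard-library helpers that the original answer does not mention.

```python
from itertools import count, islice

def evens(stream):
    for n in stream:
        if n % 2 == 0:
            yield n

# count() never ends, so the list-building version would never return;
# the generator version produces values only as they are requested.
first_five = list(islice(evens(count()), 5))
print(first_five)
```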

Generators are also used to break out of a double (nested) loop.
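One way to read that remark: move the nested loops into a generator, so the caller has a single loop and a single break is enough. The sketch below uses a made-up generator pairs and an arbitrary search condition.

```python
def pairs(rows, cols):
    # The nested loops live inside the generator
    for r in range(rows):
        for c in range(cols):
            yield r, c

found = None
for r, c in pairs(5, 5):
    if r * c == 6:
        found = (r, c)
        break   # one break leaves the whole traversal

print(found)
```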

Iterator

A book full of pages is an iterable; a bookmark is an iterator.

And this bookmark can do nothing except move to the next page.

litr = iter([1,2,3])
next(litr) ## 1
next(litr) ## 2
next(litr) ## 3
next(litr) ## StopIteration  (Exception) as we got end of the iterator

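To use a generator ... we need a function.

To use an iterator ... we need next and iter.

As already said:

A generator function returns an iterator object.

The whole benefit of an iterator:

It stores one element at a time in memory.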

What is the difference between iterators and generators? Some examples of when you would use each would be helpful.

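In summary: iterators are objects that have an __iter__ and a __next__ (next in Python 2) method. Generators provide an easy, built-in way to create instances of iterators.

A function with yield in it is still a function that, when called, returns an instance of a generator object: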

def a_function():
    "when called, returns generator object"
    yield

A generator expression also returns a generator:

a_generator = (i for i in range(0))

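For a more in-depth exposition and examples, keep reading.

A Generator is an Iterator

Specifically, generator is a subtype of iterator.

>>> import collections.abc, types  # collections.Iterator was removed in Python 3.10
>>> issubclass(types.GeneratorType, collections.abc.Iterator)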
True

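We can create a generator in several ways. A very common and simple way to do so is with a function.

Specifically, a function with yield in it is a function that, when called, returns a generator: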

>>> def a_function():
        "just a function definition with yield in it"
        yield
>>> type(a_function)
<class 'function'>
>>> a_generator = a_function()  # when called
>>> type(a_generator)           # returns a generator
<class 'generator'>

And a generator, again, is an Iterator:

>>> isinstance(a_generator, collections.abc.Iterator)
True

An Iterator is an Iterable

An Iterator is an Iterable,

>>> issubclass(collections.abc.Iterator, collections.abc.Iterable)
True

which requires an __iter__ method that returns an Iterator:

>>> collections.abc.Iterable()
Traceback (most recent call last):
  File "<pyshell#79>", line 1, in <module>
    collections.abc.Iterable()
TypeError: Can't instantiate abstract class Iterable with abstract methods __iter__

Some examples of iterables are the built-in tuples, lists, dictionaries, sets, frozen sets, strings, byte strings, byte arrays, ranges and memoryviews:

>>> all(isinstance(element, collections.abc.Iterable) for element in (
...     (), [], {}, set(), frozenset(), '', b'', bytearray(), range(0), memoryview(b'')))
True

Iterators require a next or __next__ method

In Python 2:

>>> collections.Iterator()
Traceback (most recent call last):
  File "<pyshell#80>", line 1, in <module>
    collections.Iterator()
TypeError: Can't instantiate abstract class Iterator with abstract methods next

And in Python 3:

>>> collections.abc.Iterator()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Can't instantiate abstract class Iterator with abstract methods __next__

We can get the iterators from the built-in objects (or custom objects) with the iter function:

>>> all(isinstance(iter(element), collections.abc.Iterator) for element in (
...     (), [], {}, set(), frozenset(), '', b'', bytearray(), range(0), memoryview(b'')))
True

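The __iter__ method is called when you attempt to use an object with a for loop. Then the __next__ method is called on the iterator object to get each item out for the loop. The iterator raises StopIteration when you have exhausted it, and it cannot be reused at that point.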
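The steps a for loop performs can be written out by hand. This is a sketch of the equivalent logic, not how the interpreter is actually implemented, and manual_for is a made-up name.

```python
def manual_for(iterable):
    seen = []
    iterator = iter(iterable)        # calls iterable.__iter__()
    while True:
        try:
            item = next(iterator)    # calls iterator.__next__()
        except StopIteration:
            break                    # the loop ends when the iterator is exhausted
        seen.append(item)            # stands in for the loop body
    return seen

result = manual_for("abc")
print(result)
```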

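From the documentation

From the Generator Types section of the Iterator Types section of the Built-in Types documentation:

Python's generators provide a convenient way to implement the iterator protocol. If a container object's __iter__() method is implemented as a generator, it will automatically return an iterator object (technically, a generator object) supplying the __iter__() and next() [__next__() in Python 3] methods. More information about generators can be found in the documentation for the yield expression.

(Emphasis added.)

So from this we learn that Generators are a (convenient) type of Iterator.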
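The documentation's remark about implementing a container's __iter__() as a generator can be sketched like this. The Countdown class is a made-up example.

```python
import types

class Countdown:
    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # Because of the yield, each call returns a new generator object
        n = self.start
        while n > 0:
            yield n
            n -= 1

countdown = Countdown(3)
first_pass = list(countdown)
second_pass = list(countdown)   # the container can be iterated again
print(first_pass, second_pass)
```

The container itself stays reusable, because every for loop gets its own fresh generator.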

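Example Iterator Objects

You might create an object that implements the Iterator protocol by creating or extending your own object.

import collections.abc  # collections.Iterator was removed in Python 3.10

class Yes(collections.abc.Iterator):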

    def __init__(self, stop):
        self.x = 0
        self.stop = stop

    def __iter__(self):
        return self

    def next(self):
        if self.x < self.stop:
            self.x += 1
            return 'yes'
        else:
            # Iterators must raise when done, else considered broken
            raise StopIteration

    __next__ = next # Python 3 compatibility

But it's easier to simply use a Generator to do this:

def yes(stop):
    for _ in range(stop):
        yield 'yes'

Or perhaps simpler, a Generator Expression (works similarly to list comprehensions):

yes_expr = ('yes' for _ in range(stop))

They can all be used in the same way:

>>> stop = 4
>>> for i, y1, y2, y3 in zip(range(stop), Yes(stop), yes(stop),
...                          ('yes' for _ in range(stop))):
...     print('{0}: {1} == {2} == {3}'.format(i, y1, y2, y3))
...
0: yes == yes == yes
1: yes == yes == yes
2: yes == yes == yes
3: yes == yes == yes

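Conclusion

You can use the Iterator protocol directly when you need to extend a Python object as an object that can be iterated over.

However, in the vast majority of cases, you are best suited to use yield to define a function that returns a Generator Iterator, or to consider Generator Expressions.

Finally, note that generators provide even more functionality as coroutines. I explain Generators, along with the yield statement, in depth in my answer to "What does the "yield" keyword do?".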
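As a small taste of the coroutine side mentioned above, here is a sketch that uses the generator's send method. The name running_total and the values sent are assumptions chosen for illustration.

```python
def running_total():
    total = 0
    while True:
        # yield hands back the current total and receives the next value
        value = yield total
        total += value

accumulator = running_total()
initial = next(accumulator)        # advance to the first yield
after_five = accumulator.send(5)
after_ten = accumulator.send(10)
accumulator.close()

print(initial, after_five, after_ten)
```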