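The word "yield" has two meanings: to produce something (e.g., to yield corn), and to halt to let someone or something else continue (e.g., cars yielding to pedestrians). Both definitions apply to Python's yield keyword; what makes generator functions special is that, unlike in regular functions, values can be "returned" to the caller while merely pausing, not terminating, the generator function.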
It is easiest to imagine a generator as one end of a bidirectional pipe with a "left" end and a "right" end; this pipe is the medium over which values are sent between the generator itself and the generator function's body. Each end of the pipe has two operations: push, which sends a value and blocks until the other end of the pipe pulls the value, and returns nothing; and pull, which blocks until the other end of the pipe pushes a value, and returns the pushed value. At runtime, execution bounces back and forth between the contexts on either side of the pipe -- each side runs until it sends a value to the other side, at which point it halts, lets the other side run, and waits for a value in return, at which point the other side halts and it resumes. In other words, each end of the pipe runs from the moment it receives a value to the moment it sends a value.
The pipe is functionally symmetric, but -- by convention I'm defining in this answer -- the left end is only available inside the generator function's body and is accessible via the yield keyword, while the right end is the generator and is accessible via the generator's send function. As singular interfaces to their respective ends of the pipe, yield and send do double duty: they each both push and pull values to/from their ends of the pipe, yield pushing rightward and pulling leftward while send does the opposite. This double duty is the crux of the confusion surrounding the semantics of statements like x = yield y. Breaking yield and send down into two explicit push/pull steps will make their semantics much more clear:
1. Suppose g is the generator. g.send pushes a value leftward through the right end of the pipe.
2. Execution within the context of g pauses, allowing the generator function's body to run.
3. The value pushed by g.send is pulled leftward by yield and received on the left end of the pipe. In x = yield y, x is assigned to the pulled value.
4. Execution continues within the generator function's body until the next line containing yield is reached.
5. yield pushes a value rightward through the left end of the pipe, back up to g.send. In x = yield y, y is pushed rightward through the pipe.
6. Execution within the generator function's body pauses, allowing the outer scope to continue where it left off.
7. g.send resumes and pulls the value and returns it to the user.
8. When g.send is next called, go back to Step 1.
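While cyclical, this procedure does have a beginning: when g.send(None) -- which is what next(g) is short for -- is first called (it is illegal to pass anything other than None to the first send call). And it may have an end: when there are no more yield statements to be reached in the generator function's body.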
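The cycle above can be watched directly by logging from both sides of the pipe. This is a minimal sketch: the generator function f and the log list it appends to are hypothetical names chosen for illustration, and the log messages simply label which side is running.

```python
def f(y, log):
    # `log` is a hypothetical list used only to record the order of events.
    while True:
        log.append(f"body: pushing {y!r} with yield")
        x = yield y  # push y rightward, pause, then pull the sent value into x
        log.append(f"body: pulled {x!r} from yield")
        y = x


log = []
g = f(1, log)          # creating the generator runs none of the body yet
first = g.send(None)   # starts the body, which runs up to the first yield
log.append(f"caller: send(None) returned {first!r}")
second = g.send(2)     # resumes the body at the paused yield
log.append(f"caller: send(2) returned {second!r}")

for line in log:
    print(line)
```

The printed order alternates between "body" and "caller" lines, which is the bouncing back and forth described above: each side runs from the moment it receives a value to the moment it sends one.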
Do you see what makes the yield statement (or more accurately, generators) so special? Unlike the measly return keyword, yield is able to pass values to its caller and receive values from its caller all without terminating the function it lives in! (Of course, if you do wish to terminate a function -- or a generator -- it's handy to have the return keyword as well.) When a yield statement is encountered, the generator function merely pauses, and then picks back up right where it left off upon being sent another value. And send is just the interface for communicating with the inside of a generator function from outside it.
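If we really want to break this push/pull/pipe analogy down as far as we can, we end up with the following pseudocode, which really drives home that, aside from steps 1-5, yield and send are two sides of the same coin/pipe: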
1. right_end.push(None) # the first half of g.send; sending None is what starts a generator
2. right_end.pause()
3. left_end.start()
4. initial_value = left_end.pull()
5. if initial_value is not None: raise TypeError("can't send non-None value to a just-started generator")
6. left_end.do_stuff()
7. left_end.push(y) # the first half of yield
8. left_end.pause()
9. right_end.resume()
10. value1 = right_end.pull() # the second half of g.send
11. right_end.do_stuff()
12. right_end.push(value2) # the first half of g.send (again, but with a different value)
13. right_end.pause()
14. left_end.resume()
15. x = left_end.pull() # the second half of yield
16. goto 6
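The key transformation is that we have split x = yield y and value1 = g.send(value2) each into two statements: left_end.push(y) and x = left_end.pull(); and value1 = right_end.pull() and right_end.push(value2). There are two special cases of the yield keyword: x = yield and yield y. These are syntactic sugar, respectively, for x = yield None and _ = yield y # discarding value.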
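The two special cases can be checked with a short sketch. The names receiver, producer and received are made up for this illustration; the behavior shown is just what a bare yield and an unassigned yield do.

```python
def receiver(received):
    # `x = yield` pushes None rightward, then pulls whatever is sent.
    while True:
        x = yield  # same as: x = yield None
        received.append(x)


def producer():
    # `yield y` pushes y rightward and discards whatever is sent back.
    yield "a"
    yield "b"


received = []
r = receiver(received)
primed = next(r)             # runs to the first bare yield, which pushes None
r.send(10)
r.send(20)

p = producer()
first = next(p)
second = p.send("ignored")   # the sent value is pulled and thrown away

print(primed, received, first, second)
```

Here receiver only ever pulls useful values and producer only ever pushes them, yet both still go through the same push-then-pull pair at every yield.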
For specific details regarding the precise order in which values are sent through the pipe, see below.
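What follows is a rather long concrete model of the above. First, note that for any generator g, next(g) is exactly equivalent to g.send(None). With this in mind we can focus only on how send works and talk only about advancing the generator with send.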
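A quick way to convince yourself of that equivalence, and of the rule that the first value sent must be None, is the sketch below. The counter generator function is a hypothetical example, not part of the model that follows.

```python
def counter():
    # Hypothetical generator that ignores sent values and counts upward.
    n = 0
    while True:
        yield n
        n += 1


a = counter()
b = counter()
via_next = [next(a), next(a), next(a)]
via_send = [b.send(None), b.send(None), b.send(None)]

c = counter()
try:
    c.send("too early")  # the body has not reached a yield that could pull this
except TypeError as exc:
    error_message = str(exc)

print(via_next, via_send)
print(error_message)
```

Both lists come out the same, and the failed first send raises a TypeError instead of silently dropping the value.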
Suppose we have
def f(y):  # This is the "generator function" referenced above
    while True:
        x = yield y
        y = x

g = f(1)
g.send(None)  # yields 1
g.send(2)     # yields 2
Now, the definition of f roughly desugars to the following ordinary (non-generator) function:
def f(y):
    bidirectional_pipe = BidirectionalPipe()
    left_end = bidirectional_pipe.left_end
    right_end = bidirectional_pipe.right_end

    def impl():
        initial_value = left_end.pull()
        if initial_value is not None:
            raise TypeError(
                "can't send non-None value to a just-started generator"
            )

        while True:
            left_end.push(y)
            x = left_end.pull()
            y = x

    def send(value):
        right_end.push(value)
        return right_end.pull()

    right_end.send = send

    # This isn't real Python; normally, returning exits the function. But
    # pretend that it's possible to return a value from a function and then
    # continue execution -- this is exactly the problem that generators were
    # designed to solve!
    return right_end
    impl()
In this transformation of f, the following has happened:
We've moved the implementation into a nested function.
We've created a bidirectional pipe whose left_end will be accessed by the nested function and whose right_end will be returned and accessed by the outer scope -- right_end is what we know as the generator object.
Within the nested function, the very first thing we do is check that left_end.pull() is None, consuming a pushed value in the process.
Within the nested function, the statement x = yield y has been replaced by two lines: left_end.push(y) and x = left_end.pull().
We've defined the send function for right_end, which is the counterpart to the two lines we replaced the x = yield y statement with in the previous step.
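In this fantasy world where functions can continue after returning, g is assigned right_end and then impl() is called. So in our example above, were we to follow execution line by line, what would happen is roughly the following: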
left_end = bidirectional_pipe.left_end
right_end = bidirectional_pipe.right_end
y = 1 # from g = f(1)
# None pushed by first half of g.send(None)
right_end.push(None)
# The above push blocks, so the outer scope halts and lets `f` run until
# *it* blocks
# Receive the pushed value, None
initial_value = left_end.pull()
if initial_value is not None: # ok, `g` sent None
    raise TypeError(
        "can't send non-None value to a just-started generator"
    )
left_end.push(y)
# The above line blocks, so `f` pauses and g.send picks up where it left off
# y, aka 1, is pulled by right_end and returned by `g.send(None)`
right_end.pull()
# Rinse and repeat
# 2 pushed by first half of g.send(2)
right_end.push(2)
# Once again the above blocks, so g.send (the outer scope) halts and `f` resumes
# Receive the pushed value, 2
x = left_end.pull()
y = x # y == x == 2
left_end.push(y)
# The above line blocks, so `f` pauses and g.send(2) picks up where it left off
# y, aka 2, is pulled by right_end and returned to the outer scope
right_end.pull()
x = left_end.pull()
# blocks until the next call to g.send
This maps exactly to the 16-step pseudocode above.
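The desugared f cannot run as written, since no real function keeps executing after it returns. One way to approximate it in real Python is to let a background thread stand in for the generator function's body. The sketch below is only that, an approximation under stated assumptions: BidirectionalPipe and PipeEnd are hypothetical classes built on queue.Queue, push does not block here (the blocking comes from the pull that each side performs right after pushing), and an exception raised inside impl would not propagate to the caller the way it does with a real generator.

```python
import queue
import threading


class PipeEnd:
    """One end of the pipe: push to the outgoing queue, pull from the incoming one."""

    def __init__(self, outgoing, incoming):
        self._outgoing = outgoing
        self._incoming = incoming

    def push(self, value):
        self._outgoing.put(value)

    def pull(self):
        return self._incoming.get()  # blocks until the other end pushes


class BidirectionalPipe:
    def __init__(self):
        leftward = queue.Queue()   # values travelling right -> left (send)
        rightward = queue.Queue()  # values travelling left -> right (yield)
        self.left_end = PipeEnd(outgoing=rightward, incoming=leftward)
        self.right_end = PipeEnd(outgoing=leftward, incoming=rightward)


def f(y):
    pipe = BidirectionalPipe()
    left_end, right_end = pipe.left_end, pipe.right_end

    def impl():
        nonlocal y  # needed in real Python because impl rebinds y
        initial_value = left_end.pull()
        if initial_value is not None:
            raise TypeError("can't send non-None value to a just-started generator")
        while True:
            left_end.push(y)     # first half of `x = yield y`
            x = left_end.pull()  # second half of `x = yield y`
            y = x

    def send(value):
        right_end.push(value)
        return right_end.pull()

    right_end.send = send
    # The thread plays the role of "continuing to execute after returning".
    threading.Thread(target=impl, daemon=True).start()
    return right_end


g = f(1)
first = g.send(None)
second = g.send(2)
third = g.send("hello")
print(first, second, third)
```

Because each side pulls immediately after it pushes, only one of the two threads makes progress at any moment, which mirrors the halting and resuming in the walkthrough.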
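There are some other details, like how errors are propagated and what happens when you reach the end of the generator (the pipe is closed), but this should make clear how the basic control flow works when send is used.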
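Those two details can at least be observed from the outside. In this small sketch, two_values is a hypothetical generator function: running off its end surfaces as StopIteration in the caller, carrying the return value, and the generator's throw method raises an exception at the paused yield.

```python
def two_values():
    # Hypothetical generator used to show termination and error delivery.
    try:
        x = yield "first"
        yield f"got {x}"
    except ValueError as exc:
        yield f"caught {exc}"
    return "done"


g = two_values()
results = [g.send(None), g.send("hi")]
try:
    g.send(None)  # no more yields to reach, so the body returns
except StopIteration as stop:
    final = stop.value

h = two_values()
next(h)                                # pause at the first yield
caught = h.throw(ValueError("boom"))   # the error is raised at that yield

print(results, final, caught)
```

In pipe terms, StopIteration is what the right end sees when the left end closes, and throw is a way of pushing an exception leftward instead of a value.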
Using these same desugaring rules, let's look at two special cases:
def f1(x):
    while True:
        x = yield x

def f2():  # No parameter
    while True:
        x = yield x
For the most part, they desugar the same way as f; the only differences are in how the yield statements are transformed:
def f1(x):
    # ... set up pipe

    def impl():
        # ... check that initial sent value is None

        while True:
            left_end.push(x)
            x = left_end.pull()

    # ... set up right_end

def f2():
    # ... set up pipe

    def impl():
        # ... check that initial sent value is None

        while True:
            left_end.push(x)
            x = left_end.pull()

    # ... set up right_end
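In the first, the value passed to f1 is pushed (yielded) initially, and then all values pulled (sent) are pushed (yielded) right back. In the second, x has no value (yet) when it first comes time to push, so an UnboundLocalError is raised.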
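Both outcomes are easy to reproduce with the real generator functions. The sketch below reuses f1 and f2 as defined earlier; the variable names echoes and f2_error are only there to hold the results.

```python
def f1(x):
    while True:
        x = yield x


def f2():  # No parameter
    while True:
        x = yield x


g1 = f1("start")
# The argument is yielded first, then every sent value is echoed back.
echoes = [g1.send(None), g1.send("a"), g1.send("b")]

g2 = f2()
try:
    g2.send(None)  # the body tries to push x before x has ever been assigned
except UnboundLocalError as exc:
    f2_error = type(exc).__name__

print(echoes, f2_error)
```

Note that creating g2 succeeds; the error only appears on the first send, because that is when the body actually starts running.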