有人能解释一下吗?我理解它们背后的基本概念,但我经常看到它们互换使用,我感到困惑。
现在我们到了这里,它们和普通函数有什么不同?
有人能解释一下吗?我理解它们背后的基本概念,但我经常看到它们互换使用,我感到困惑。
现在我们到了这里,它们和普通函数有什么不同?
并不是所有的闭包都是lambdas,也不是所有的lambdas都是闭包。两者都是函数,但不一定是以我们习惯的方式。
lambda本质上是一个内联定义的函数,而不是声明函数的标准方法。lambda可以经常作为对象传递。
闭包是一种函数,它通过引用其主体外部的字段来封闭其周围的状态。封闭状态在闭包调用之间保持不变。
在面向对象语言中,闭包通常是通过对象提供的。然而,一些面向对象语言(如c#)实现的特殊功能更接近于纯函数式语言(如lisp)所提供的闭包的定义,后者没有对象来封装状态。
有趣的是,在c#中引入lambda和闭包使函数式编程更接近主流用法。
当大多数人想到函数时,他们想到的是命名函数:
function foo() { return "This string is returned from the 'foo' function"; }
当然,这些都是按名字命名的:
foo(); //returns the string above
使用lambda表达式,你可以有匿名函数:
@foo = lambda() {return "This is returned from a function without a name";}
在上面的例子中,你可以通过赋值的变量来调用lambda:
foo();
然而,比将匿名函数分配给变量更有用的是将它们传递给或从高阶函数传递,即接受/返回其他函数的函数。在很多情况下,命名一个函数是不必要的:
function filter(list, predicate)
{ @filteredList = [];
for-each (@x in list) if (predicate(x)) filteredList.add(x);
return filteredList;
}
//filter for even numbers
filter([0,1,2,3,4,5,6], lambda(x) {return (x mod 2 == 0)});
闭包可以是命名函数或匿名函数,但当它“关闭”函数定义范围内的变量时,即闭包仍将引用闭包本身中使用的任何外部变量的环境。这是一个命名闭包:
@x = 0;
function incrementX() { x = x + 1;}
incrementX(); // x now equals 1
这看起来并不多,但如果这都是在另一个函数中,你将incrementX传递给一个外部函数呢?
function foo()
{ @x = 0;
function incrementX()
{ x = x + 1;
return x;
}
return incrementX;
}
@y = foo(); // y = closure of incrementX over foo.x
y(); //returns 1 (y.x == 0 + 1)
y(); //returns 2 (y.x == 1 + 1)
这就是在函数式编程中获得有状态对象的方法。因为不需要命名“incrementX”,你可以在这种情况下使用lambda:
function foo()
{ @x = 0;
return lambda()
{ x = x + 1;
return x;
};
}
lambda只是一个匿名函数——一个没有名字定义的函数。在Scheme等某些语言中,它们等同于命名函数。实际上,函数定义在内部被重写为将lambda绑定到变量。在其他语言中,如Python,它们之间有一些(相当不必要的)区别,但它们在其他方面的行为是相同的。
闭包是在定义它的环境上关闭的任何函数。这意味着它可以访问不在其参数列表中的变量。例子:
def func(): return h
def anotherfunc(h):
return func()
这将导致一个错误,因为func不会在另一个环境中关闭func - h是undefined。Func只在全局环境下关闭。这是可行的:
def anotherfunc(h):
def func(): return h
return func()
因为这里,func是在anotherfunc中定义的,而在python 2.3及以上版本(或类似这样的数字)中,当他们几乎正确地获得闭包时(突变仍然无效),这意味着它在anotherfunc的环境中关闭,并可以访问其中的变量。在Python 3.1+中,使用nonlocal关键字时也可以使用突变。
还有一点很重要——func将继续在另一个func的环境中关闭,即使它不再在另一个func中被求值。这段代码也可以工作:
def anotherfunc(h):
def func(): return h
return func
print anotherfunc(10)()
这将输出10。
正如您所注意到的,这与lambda无关——它们是两个不同(尽管相关)的概念。
从编程语言的角度来看,它们完全是两种不同的东西。
基本上,对于图灵完备语言,我们只需要非常有限的元素,例如抽象、应用和还原。抽象和应用提供了构建lambda表达式的方法,而约简决定了lambda表达式的含义。
Lambda提供了一种将计算过程抽象出来的方法。 例如,要计算两个数字的和,可以抽象出一个接受两个参数x、y并返回x+y的过程。在scheme中,可以写成
(lambda (x y) (+ x y))
您可以重命名参数,但它完成的任务不会改变。 在几乎所有的编程语言中,您都可以为lambda表达式指定一个名称,即命名函数。但是没有太大的区别,它们在概念上可以被认为只是语法糖。
好,现在想象一下这是如何实现的。当我们将lambda表达式应用于某些表达式时,例如。
((lambda (x y) (+ x y)) 2 3)
We can simply substitute the parameters with the expression to be evaluated. This model is already very powerful. But this model doesn't enable us to change the values of symbols, e.g. We can't mimic the change of status. Thus we need a more complex model. To make it short, whenever we want to calculate the meaning of the lambda expression, we put the pair of symbol and the corresponding value into an environment(or table). Then the rest (+ x y) is evaluated by looking up the corresponding symbols in the table. Now if we provide some primitives to operate on the environment directly, we can model the changes of status!
有了这个背景,检查这个函数:
(lambda (x y) (+ x y z))
我们知道,当我们求lambda表达式的值时,x y将被绑定到一个新的表中。但是我们怎样才能查到z呢?实际上z是一个自由变量。一定有一个外在的 否则,表达式的含义不能仅通过绑定x和y来确定。为了清楚地说明这一点,可以在scheme中这样写:
((lambda (z) (lambda (x y) (+ x y z))) 1)
所以z在外层表中被绑定为1。我们仍然得到一个接受两个参数的函数,但它的真正含义也取决于外部环境。 换句话说,外部环境对自由变量关闭。在set的帮助下!,我们可以使函数有状态,也就是说,它不是数学意义上的函数。它的返回值不仅取决于输入,还取决于z。
这是你们已经非常了解的东西,对象的方法几乎总是依赖于对象的状态。这就是为什么有人说“闭包是穷人的东西”。但我们也可以把对象看作穷人的闭包,因为我们真的喜欢第一类函数。
我用scheme来说明这个想法,因为scheme是最早的有真正闭包的语言之一。这里的所有材料在SICP第3章中都有更好的呈现。
总之,lambda和闭包是完全不同的概念。A是一个函数。闭包是一对lambda和对应的闭包环境。
它很简单:lambda是一种语言结构,即匿名函数的简单语法;闭包是一种实现它的技术——或者任何一类函数,无论是命名的还是匿名的。
更准确地说,闭包是一类函数在运行时如何表示的,作为它的一对“代码”和一个环境在该代码中使用的所有非局部变量上的“闭包”。这样,即使它们产生的外部作用域已经退出,这些变量仍然可以访问。
不幸的是,有许多语言不支持函数作为第一类值,或者只支持残差形式的函数。所以人们经常用“闭合”这个词来区分“实物”。
概念与上面描述的相同,但如果您有PHP背景,则使用PHP代码进一步解释。
$input = array(1, 2, 3, 4, 5);
$output = array_filter($input, function ($v) { return $v > 2; });
函数($v){返回$v > 2}是lambda函数定义。我们甚至可以将它存储在一个变量中,这样它就可以被重用:
$max = function ($v) { return $v > 2; };
$input = array(1, 2, 3, 4, 5);
$output = array_filter($input, $max);
现在,如果您想更改筛选数组中允许的最大数量,该怎么办?你必须编写另一个lambda函数或创建一个闭包(PHP 5.3):
$max_comp = function ($max) {
return function ($v) use ($max) { return $v > $max; };
};
$input = array(1, 2, 3, 4, 5);
$output = array_filter($input, $max_comp(2));
闭包是在自己的环境中求值的函数,该环境具有一个或多个绑定变量,在调用该函数时可以访问这些变量。它们来自函数式编程领域,其中有许多概念。闭包类似于lambda函数,但更聪明的地方在于它们能够与定义闭包的外部环境中的变量进行交互。
下面是一个简单的PHP闭包示例:
$string = "Hello World!";
$closure = function() use ($string) { echo $string; };
$closure();
这篇文章解释得很好。
这个问题很老了,有很多答案。 现在有了Java 8和官方Lambda这两个非官方的闭包项目,这个问题又重新出现了。
Java上下文中的答案(通过Lambdas和闭包-有什么区别?):
闭包是一个lambda表达式,它与一个环境配对,将它的每个自由变量绑定到一个值。在Java中,lambda表达式将通过闭包的方式实现,因此这两个术语在社区中可以互换使用。”
关于lambda和闭包有很多困惑,甚至在这个StackOverflow问题的答案中也是如此。与其询问那些从某些编程语言的实践中学习闭包的随机程序员或其他一无所知的程序员,不如去寻找源头(一切开始的地方)。因为Lambda和闭包来自于Lambda微积分,由Alonzo Church在30年代发明,那时第一台电子计算机还不存在,这就是我说的来源。
Lambda Calculus是世界上最简单的编程语言。你唯一能做的事情是:►
APPLICATION: Applying one expression to another, denoted f x.(Think of it as a function call, where f is the function and x is its only parameter) ABSTRACTION: Binds a symbol occurring in an expression to mark that this symbol is just a "slot", a blank box waiting to be filled with value, a "variable" as it were. It is done by prepending a Greek letter λ (lambda), then the symbolic name (e.g. x), then a dot . before the expression. This then converts the expression into a function expecting one parameter.For example: λx.x+2 takes the expression x+2 and tells that the symbol x in this expression is a bound variable – it can be substituted with a value you supply as a parameter. Note that the function defined this way is anonymous – it doesn't have a name, so you can't refer to it yet, but you can immediately call it (remember application?) by supplying it the parameter it is waiting for, like this: (λx.x+2) 7. Then the expression (in this case a literal value) 7 is substituted as x in the subexpression x+2 of the applied lambda, so you get 7+2, which then reduces to 9 by common arithmetics rules.
所以我们解开了其中一个谜团: 是上面例子中的匿名函数,λx.x+2。 在不同的编程语言中,函数抽象(lambda)的语法可能不同。例如,在JavaScript中是这样的:
function(x) { return x+2; }
你可以马上把它应用到这样的参数上:
(function(x) { return x+2; })(7)
或者你可以将这个匿名函数(lambda)存储到某个变量中:
var f = function(x) { return x+2; }
这实际上给了它一个名字f,允许你引用它,并在以后多次调用它,例如:
alert( f(7) + f(10) ); // should print 21 in the message box
但你不需要说出它的名字。你可以立即调用它:
alert( function(x) { return x+2; } (7) ); // should print 9 in the message box
在LISP中,lambda是这样的:
(lambda (x) (+ x 2))
你可以通过将它立即应用到一个参数来调用这样的lambda:
( (lambda (x) (+ x 2)) 7 )
好了,现在是时候解决另一个谜团了:什么是闭包。 为了做到这一点,让我们讨论一下lambda表达式中的符号(变量)。
As I said, what the lambda abstraction does is binding a symbol in its subexpression, so that it becomes a substitutible parameter. Such a symbol is called bound. But what if there are other symbols in the expression? For example: λx.x/y+2. In this expression, the symbol x is bound by the lambda abstraction λx. preceding it. But the other symbol, y, is not bound – it is free. We don't know what it is and where it comes from, so we don't know what it means and what value it represents, and therefore we cannot evaluate that expression until we figure out what y means.
事实上,另外两个符号2和+也是如此。只是我们对这两个符号太熟悉了,以至于我们经常忘记计算机不认识它们,我们需要通过在某个地方定义它们来告诉它它们的意思,例如在图书馆或语言本身。
你可以认为自由符号是在表达式之外的其他地方定义的,在它的“周围上下文”中,这被称为它的环境。环境可能是一个更大的表达式,这个表达式是其中的一部分(就像奎冈·金说的:“总有更大的鱼”;)),或者在一些库中,或者在语言本身中(作为原始语言)。
这让我们将lambda表达式分为两类:
CLOSED表达式:出现在这些表达式中的每个符号都被一些lambda抽象绑定。换句话说,它们是独立的;它们不需要任何周围的上下文来计算。它们也被称为组合子。 OPEN表达式:这些表达式中的一些符号是不绑定的——也就是说,其中出现的一些符号是自由的,它们需要一些外部信息,因此在您提供这些符号的定义之前,它们不能被求值。
你可以通过提供环境来关闭一个开放lambda表达式,环境通过将这些自由符号绑定到一些值(可能是数字、字符串、匿名函数或lambdas等等)来定义所有这些自由符号。
下面是结尾部分: lambda表达式的闭包是在外部上下文(环境)中定义的一组特定的符号,这些符号为表达式中的自由符号赋值,使它们不再是自由的。它将一个仍然包含一些“未定义的”自由符号的开放lambda表达式转换为一个不再包含任何自由符号的封闭lambda表达式。
例如,如果你有这样的λ表达式:λx。X /y+2,符号X是有界的,而符号y是自由的,因此表达式是开放的,除非你说出y的意思,否则无法计算(+和2也是一样,它们也是自由的)。但假设你也有一个这样的环境:
{ y: 3,
+: [built-in addition],
2: [built-in number],
q: 42,
w: 5 }
这个环境为lambda表达式(y, +, 2)中的所有“未定义”(自由)符号和几个额外的符号(q, w)提供定义。我们需要定义的符号是环境的这个子集:
{ y: 3,
+: [built-in addition],
2: [built-in number] }
这正是lambda表达式的闭包:>
换句话说,它关闭了一个开lambda表达式。这就是名称闭包最初的来源,这也是为什么在这个帖子中有那么多人的答案不太正确的原因:P 那么,他们为什么错了呢?为什么这么多人说闭包是内存中的一些数据结构,或者他们使用的语言的一些特性,或者为什么他们把闭包和lambdas混淆?: P
Well, the corporate marketoids of Sun/Oracle, Microsoft, Google etc. are to blame, because that's what they called these constructs in their languages (Java, C#, Go etc.). They often call "closures" what are supposed to be just lambdas. Or they call "closures" a particular technique they used to implement lexical scoping, that is, the fact that a function can access the variables that were defined in its outer scope at the time of its definition. They often say that the function "encloses" these variables, that is, captures them into some data structure to save them from being destroyed after the outer function finishes executing. But this is just made-up post factum "folklore etymology" and marketing, which only makes things more confusing, because every language vendor uses its own terminology.
更糟糕的是,他们说的话总有一点是真的,这让你不能轻易地把它当作错误来看待。
If you want to implement a language that uses lambdas as first-class citizens, you need to allow them to use symbols defined in their surrounding context (that is, to use free variables in your lambdas). And these symbols must be there even when the surrounding function returns. The problem is that these symbols are bound to some local storage of the function (usually on the call stack), which won't be there anymore when the function returns. Therefore, in order for a lambda to work the way you expect, you need to somehow "capture" all these free variables from its outer context and save them for later, even when the outer context will be gone. That is, you need to find the closure of your lambda (all these external variables it uses) and store it somewhere else (either by making a copy, or by preparing space for them upfront, somewhere else than on the stack). The actual method you use to achieve this goal is an "implementation detail" of your language. What's important here is the closure, which is the set of free variables from the environment of your lambda that need to be saved somewhere.
没过多久,人们就开始将他们在语言实现中使用的实际数据结构称为“闭包”本身。结构通常是这样的:
Closure {
[pointer to the lambda function's machine code],
[pointer to the lambda function's environment]
}
这些数据结构作为参数传递给其他函数,从函数返回,并存储在变量中,以表示lambdas,并允许它们访问其封闭环境以及在该上下文中运行的机器代码。但这只是实现闭包的一种方法(众多方法之一),而不是闭包本身。
如上所述,lambda表达式的闭包是其环境中定义的子集,这些定义将值赋给该lambda表达式中包含的自由变量,从而有效地关闭表达式(将目前还不能求值的开放lambda表达式转换为随后可以求值的封闭lambda表达式,因为现在已经定义了其中包含的所有符号)。
其他任何东西都只是程序员和语言供应商的“货物崇拜”和“巫毒魔法”,他们不知道这些概念的真正根源。
我希望这能回答你的问题。但如果你有任何后续问题,请在评论中提出,我会试着更好地解释。
它取决于函数是否使用外部变量来执行操作。
外部变量——定义在函数作用域之外的变量。
Lambda表达式是无状态的,因为它依赖于参数、内部变量或常量来执行操作。 函数<Integer,Integer> lambda = t -> { Int n = 2 返回t * n } 闭包保持状态,因为它使用外部变量(即函数体范围之外定义的变量)以及参数和常量来执行操作。 Int n = 2 函数<Integer,Integer>闭包= t -> { 返回t * n }
当Java创建闭包时,它将变量n与函数一起保存,以便在传递给其他函数或在任何地方使用时可以引用它。
Lambda表达式只是一个匿名函数。例如,在纯java中,你可以这样写:
Function<Person, Job> mapPersonToJob = new Function<Person, Job>() {
public Job apply(Person person) {
Job job = new Job(person.getPersonId(), person.getJobDescription());
return job;
}
};
其中Function类是在java代码中构建的。现在你可以在某处调用mapPersonToJob.apply(person)来使用它。这只是一个例子。在没有语法之前,这是一个lambda。lambda是一个捷径。
关闭:
当Lambda可以访问此作用域之外的变量时,它就成为闭包。我猜你可以说它的魔力,它神奇地包裹着它被创建的环境,并使用其作用域之外的变量。因此,闭包意味着lambda可以访问它的OUTER SCOPE。
在Kotlin中,lambda总是可以访问它的闭包(在它的外部作用域中的变量)
Lambda是一个匿名函数定义,它(不一定)绑定到标识符。
“匿名函数起源于阿朗佐·丘奇(Alonzo Church)发明的lambda微积分,其中所有函数都是匿名的”——维基百科
闭包是lambda函数的实现。
Peter J. Landin在1964年将闭包定义为拥有一个环境部分和一个控制部分,由他的SECD机器用于计算表达式
Lambda和闭包的一般解释在其他响应中涵盖。
对于c++背景的人来说,Lambda表达式是在c++ 11中引入的。可以把Lambdas看作是一种创建匿名函数和函数对象的方便方法。
"The distinction between a lambda and the corresponding closure is precisely equivalent to the distinction between a class and an instance of the class. A class exists only in source code; it doesn’t exist at runtime. What exists at runtime are objects of the class type. Closures are to lambdas as objects are to classes. This should not be a surprise, because each lambda expression causes a unique class to be generated (during compilation) and also causes an object of that class type, a closure to be created (at runtime)." - Scott Myers
c++允许我们检查Lambda和Closure之间的细微差别,因为您必须显式地指定要捕获的自由变量。
在下面的示例中,Lambda表达式没有自由变量,是一个空捕获列表([])。它本质上是一个普通函数,严格意义上不需要闭包。所以它甚至可以作为函数指针参数传递。
void register_func(void(*f)(int val)) // Works only with an EMPTY capture list
{
int val = 3;
f(val);
}
int main()
{
int env = 5;
register_func( [](int val){ /* lambda body can access only val variable*/ } );
}
只要在捕获列表([env])中引入了来自周围环境的自由变量,就必须生成一个Closure。
register_func( [env](int val){ /* lambda body can access val and env variables*/ } );
由于这不再是一个普通的函数,而是一个闭包,因此会产生编译错误。 不存在从"lambda []void (int val)->void"到"void (*)(int val)"的合适转换函数
这个错误可以用函数包装器std::function来修复,它接受任何可调用的目标,包括生成的闭包。
void register_func(std::function<void(int val)> f)
有关c++示例的详细解释,请参阅Lambda和闭包。
Lambda vs闭包
是匿名函数(方法)
闭包是封闭(捕获)其封闭范围内的变量的函数。非本地变量)
Java
interface Runnable {
void run();
}
class MyClass {
void foo(Runnable r) {
}
//Lambda
void lambdaExample() {
foo(() -> {});
}
//Closure
String s = "hello";
void closureExample() {
foo(() -> { s = "world";});
}
}
斯威夫特(结束)
class MyClass {
func foo(r:() -> Void) {}
func lambdaExample() {
foo(r: {})
}
var s = "hello"
func closureExample() {
foo(r: {s = "world"})
}
}
这个问题已经12年了,我们仍然把它作为谷歌中“闭包vs lambda”的第一个链接。 所以我不得不说,因为没有人明确地说过。
Lambda表达式是一个匿名函数(声明)。
一个闭包,引用Scott的《编程语言语用学》解释为:
创建引用环境的显式表示(通常是当前调用子例程时执行的环境),并将其与子例程的引用捆绑在一起,称为闭包。
也就是说,它就像我们所说的“函数+投降上下文”的捆绑。
在这个问题的各种现有答案中,有许多技术模糊或“甚至没有错误”的人造珍珠的声音,所以我最终添加了一个新的……
对术语的澄清
最好知道,术语“闭包”和“lambda”都可以表示不同的东西,这是上下文相关的。
这是一个形式问题,因为所讨论的PL(编程语言)规范可能明确地定义了这些术语。
例如,通过ISO c++(从c++ 11开始):
lambda表达式的类型(也是闭包对象的类型)是一种唯一的、未命名的非并集类类型,称为闭包类型,其属性如下所述。
由于类C语言的用户每天都会混淆“指针”(类型)到“指针值”或“指针对象”(类型的居民),这里也有混淆的风险:大多数c++用户实际上是通过使用术语“闭包”来谈论“闭包对象”。小心歧义。
NOTE To make things generally clearer and more precise, I'd seldom deliberately use some language-neutral terms (usually specific to the PL theory instead of the language-defined terminology. For instance, type inhabitant used above covers the language-specific "(r)values" and "lvalues" in a broader sense. (Since the syntactic essence of C++'s value category definition is irrelevant, avoiding "(l/r)values" may reduce confusion). (Disclaimer: lvalues and rvalues are common enough in many other contexts.) Terms not defined formally among different PLs may be in quotes. Verbatim copy from referenced materials may be also in quotes, with typos unchanged.
这甚至与“lambda”更相关。(小写)字母lambda (λ)是希腊字母表中的一个元素。通过比较“lambda”和“闭包”,人们当然不是在谈论字母本身,而是使用“lambda”派生概念的语法背后的某些东西。
现代PLs中的相关构造通常被命名为“lambda表达式”。它来源于下面讨论的“lambda抽象”。
在详细讨论之前,我建议阅读一些问题本身的评论。我觉得它们比这里的大多数问题的答案更安全,更有帮助,因为混淆的风险更小。(可悲的是,这是我决定在这里提供答案的最重要原因…)
Lambdas:一个简短的历史
在PLs中命名为“lambda”的结构,无论是“lambda表达式”还是其他东西,都是语法的。换句话说,这些语言的用户可以找到这样的源语言结构,这些结构被用来构建其他的东西。粗略地说,“其他”在实践中只是“匿名函数”。
这样的构造源自lambda抽象,它是A. Church开发的(无类型的)lambda演算的三个语法类别(“表达式的种类”)之一。
Lambda演算是一个演绎系统(更准确地说,是一个TRS(术语重写系统)),可以对计算进行普遍建模。对lambda项进行约简,就像在普通的PLs中计算表达式一样。利用内置的约简规则,定义各种计算方法就足够了。(正如你所知道的,它是图灵完备的。)因此,它可以用作PL。
一般来说,PL中的表达式求值与TRS中的项约简是不可互换的。然而,lambda演算是一种可以在源语言中表示所有约简结果的语言(即lambda项),因此它们具有相同的意义。几乎所有的PLs在实践中都没有这个属性;用于描述其语义的演算可能包含不是源语言表达式的术语,并且还原可能比求值具有更详细的效果。
Every terms ("expressions") in the lambda calculus (lambda terms) are either variable, abstraction or application. "Variable" here is the syntax (just the variable's name) of symbol, which can refer to an existing "variable" (semantically, an entity which may reduce to some other lambda term) introduced previously. The ability to introduce a variable is provided by the abstraction syntax, which has a leading letter λ, followed by a bound variable, a dot and a lambda term. The bound variable is similar to the formal parameter name both in the syntax and semantic among many languages, and the followed lambda term inside the lambda abstraction is just like the function body. The application syntax combines a lambda term ("actual argument") to some abstraction, like the function call expression in many PLs.
说明一个lambda抽象只能引入一个参数。要克服微积分内部的限制,请参阅curiling。
The ability of introducing variables makes lambda calculus a typical high-level language (albeit simple). On the other hand, combinatory logics can be treated as PLs by remove away the variable and abstraction features from the lambda calculus. Combinatory logics are low-level exactly in this sense: they are like plain-old assembly languages which do not allow to introduce variables named by the user (despite macros, which requires additional preprocessing). (... If not more low-level... typically assembly languages can at least introduce user-named labels.)
注意,lambda抽象可以构建在任何其他lambda术语中,而不需要指定名称来表示抽象。因此,lambda抽象整体上形成了匿名函数(可能是嵌套的)。这是一个相当高级的特性(与ISO C相比,后者不允许匿名或嵌套函数)。
非类型化lambda演算的继承者包括各种类型化lambda演算(如lambda cube)。它们更像是静态类型语言,需要在函数的形式参数上添加类型注释。尽管如此,lambda抽象在这里仍然具有相同的角色。
Although lambda calculi are not intended to be directly used as PLs implemented in computers, they do have affected PLs in practice. Notably, J. McCarthy introduced the LAMBDA operator in LISP to provide functions exactly following the idea of Church's untyped lambda calculus. Apparently, the name LAMBDA comes from the letter λ. LISP (later) has a different syntax (S-expression), but all programmable elements in the LAMBDA expressions can be directly mapped to the lambda abstractions in the untyped lambda calculus by trivial syntactic conversions.
On the other hand, many other PLs express similar functionalities by other means. A slightly different way to introduce reusable computations are named functions (or more exactly, named subroutines), which are supported by earlier PLs like FORTRAN, and languages derived from ALGOL. They are introduced by syntaxes specifying a named entity being a function at the same time. This is simpler in some sense compared to LISP dialects (esp. in the aspect of implementation), and it seems more popular than LISP dialects for decades. Named functions may also allow extensions not shared by anonymous functions like function overloading.
Nevertheless, more and more industrial programmers finally find the usefulness of first-class functions, and demands of the ability to introduce function definitions in-place (in the expressions in the arbitrary contexts, say, as an argument of some other function) are increasing. It is natural and legitimate to avoid naming a thing not required to be, and any named functions fail here by definition. (You may know, naming things correctly is one of the well-known hard problems in the computer science.) To address the problem, anonymous functions are introduced to languages traditionally only providing named functions (or function-like constructs like "methods", whatsoever), like C++ and Java. Many of them name the feature as "lambda expressions" or similar lambda things, because they are basically reflecting the essentially same idea in lambda calculi. Renaissance.
有点模棱两可:在lambda演算中,所有的术语(变量、抽象和应用程序)都是PL中的有效表达式;在这个意义上,它们都是“lambda表达式”。但是,为了丰富其特性而添加lambda抽象的pl可能会专门将抽象的语法命名为“lambda表达式”,以区别于现有的其他类型的表达式。
闭包:历史记录
数学中的闭包与PLs中的闭包是不一样的。
在后一种情况下,该术语是由P. J. Landin于1964年创造的,用于在评估“Church's λ notation建模”的PLs时提供一级函数的支持。
具体到Landin提出的模型(SECD机器),闭包由λ表达式和相对于它被评估的环境组成,或者更准确地说:
一个环境部分,它是一个列表,其两项是(1)一个环境(2)一个标识符列表的标识符
以及控制部分,该控制部分由唯一项为AE的列表组成
NOTE AE is abbreviated for applicative expression in the paper. This is the syntax exposing more or less the same functionality of application in the lambda calculus. There are also some additional pieces of details like "applicative" not that interesting in the lambda calculus (because it is purely functional), though. SECD is not consistent with the original lambda calculus for these minor differences. For example, SECD halts on arbitrary single lambda abstraction whether the subterm ("body") has a normal form, because it will not reduce the subterm ("evaluate the body") without the abstraction has been applied ("called"). However, such behavior may be more like the PLs today than the lambda calculus. SECD is also not the only abstract machine can evaluate lambda terms; although most other abstract machines for the similar purpose may also have environments. Contrast to the lambda calculus (which is pure), these abstract machines can support mutation in some degrees.
因此,在这个特定的上下文中,闭包是一种内部数据结构,用于实现使用ae对PLs进行特定的评估。
The discipline of accessing the variables in closures reflects lexical scoping, first used in early 1960s by the imperative language ALGOL 60. ALGOL 60 does support nested procedures and passing procedures to parameters, but not returning procedures as results. For languages has full support of first-class functions which can be returned by functions, the static chain in ALGOL 60-style implementations does not work because free variables used by the function being returned may be no longer present on the call stack. This is the upwards funarg problem. Closures resolve the problem by capturing the free variable in the environment parts and avoiding allocating them on the stack.
On the other hand, early LISP implementations all use dynamic scope. This makes the variable bindings referenced all reachable in the global store, and name hiding (if any) is implemented as per-variable basis: once a variable is created with an existing name, the old one is backed by a LIFO structure; in other words, each variable's name can access a corresponding global stack. This effectively cancels the need of the per-function environments because no free variables are ever captured in the function (they are already "captured" by the stacks).
Despite imitating the lambda notation at the first, LISP is very different to the lambda calculus here. The lambda calculus is statically scoped. That is, each variable denotes to the instance bounded by the nearest same named-formal parameter of a lambda abstraction which contains the variable before its reduction. In the semantics of lambda calculus, reducing an application substitute the term ("argument") to the bound variable ("formal parameter") in the abstraction. Since all values can be represented as lambda terms in the lambda calculus, this can be done by direct rewriting by replacing specific subterms in each step of the reducing.
注:因此,环境不是减少lambda项的必要条件。然而,扩展lambda演算的演算可以在语法中显式地引入环境,即使它只是为纯计算建模(没有突变)。通过显式地添加环境,可以在环境上有专门的约束规则来强制环境规范化,这加强了微积分的方程理论。(见§9.1。)
LISP is quite different, because its underlying semantic rules are based on neither lambda calculus nor term rewriting. Therefore, LISP needs some different mechanism to maintaining the scoping discipline. It adopted the mechanism based on the environment data structures saving the variable to value mappings (i.e. variable bindings). There may be more sophisticated structure in an environment in new variants of LISP (e.g. lexically scoped Lisp allows mutations), but the simplest structure conceptually equivalent to the environment defined by the Landin's paper, discussed below.
LISP实现在早期确实支持第一类函数,但是使用纯动态作用域,就不存在真正的funargs问题:它们可以避免堆栈上的分配,并让全局所有者(GC、垃圾收集器)管理引用变量的环境中的资源(和激活记录)。此时不需要闭包。这是闭包发明之前的早期实现。
深度绑定近似于静态(词法)绑定,是在1962年左右的LISP 1.5中通过FUNARG设备引入的。这最终使这个问题以“funarg问题”的名字闻名于世。
AIM-199指出,这本质上是关于环境的。
Scheme是第一个默认支持词法作用域的Lisp方言(在现代版本的Scheme中,动态作用域可以通过make-parameter/parameterize形式模拟)。在后来的十年中有一些争论,但最终大多数Lisp方言采用了默认词汇作用域的想法,就像许多其他语言一样。从此,闭包作为一种实现技术,在不同类型的PLs中得到了更广泛的传播和流行。
闭包:演变
The original paper of Landin first defines an environment being a mathematical function mapping the name ("constant") to the named object ("primitive"). Then, it specifies the environment as "a list-structure made up of name/value pairs". The latter is also implemented in early Lisp implementation as alists (associative lists), but modern language implementations do not necessarily follow such detail. In particular, environments can be linked to support nested closures, which is unlikely directly supported by abstract machines like SECD.
除了环境,Landin论文中“环境部分”的另一个组件用于保存lambda抽象(函数的形式参数)的绑定变量的名称。对于不需要反映源信息的现代实现来说,这也是可选的(而且很可能没有),在现代实现中,参数的名称可以被静态优化(由lambda calculi的alpha-重命名规则精神上授予)。
类似地,现代实现可能不会将语法构造(AEs或lambda术语)直接保存为控制部分。相反,它们可能使用一些内部IR(中间表示)或“编译”形式(例如Lisp方言的一些实现使用的FASL)。这样的IR甚至不保证是由lambda表单生成的(例如,它可以来自一些命名函数的函数体)。
此外,环境部分可以为lambda演算保存其他不用于求值的信息。例如,它可以保留一个额外的标识符,以提供额外的绑定来命名调用站点上的环境。这可以实现基于lambda calculi扩展的语言。
重温pl专用术语
此外,一些语言可能在其规范中定义“闭包”相关术语,以命名闭包实现的实体。这是不幸的,因为它会导致许多误解,如“闭包是一个函数”。但幸运的是,大多数语言似乎都避免将其直接命名为语言中的语法结构。
尽管如此,这仍然比通过语言规范任意重载更完善的通用概念要好。举几个例子:
"objects" are redirected to "instance of classes" (in Java/CLR/"OOP" languages) instead of traditional "typed storage" (in C and C++) or just "values" (in many Lisps); "variables" are redirected to something traditional called "objects" (in Golang) as well as mutable states (in many new languages), so it is no longer compatible to mathematics and pure functional languages; "polymorphism" is restricted to inclusion polymorphism (in C++/"OOP" languages) even these languages do have other kinds of polymorphism (parametric polymorphism and ad-hoc polymorphism).
关于资源管理
尽管在现代实现中省略了这些组件,但Landin论文中的定义是相当灵活的。它没有限制如何将组件(如环境)存储在SECD机器的上下文之外。
在实践中,使用了各种策略。最常见和传统的方法是使所有资源都由一个全局所有者拥有,该所有者可以收集不再使用的资源,即(全局)GC,首先在LISP中使用。
其他方法可能不需要全局所有者,并且对闭包有更好的局部性,例如:
In C++, resources of entities captured in closures are allowed being managed explicitly by users, by specifying how to capture each variable in the lambda-expression's capture list (by value copy, by reference, or even by an explicit initializer) and the exact type of each variable (smart pointers or other types). This can be unsafe, but it gains more flexibility when used correctly. In Rust, resources are captured with different capture modes (by immutable borrow, by borrow, by move) tried in turn (by the implementation), and users can specify explicit move. This is more conservative than C++, but safer in some sense (since borrows are statically checked, compared to unchecked by-reference captures in C++).
以上所有策略都支持闭包(c++和Rust确实有特定于语言的“闭包类型”概念定义)。管理闭包所使用的资源的规程与闭包的资格无关。
So, (although not seen here,) the claim of the necessity of graph tracing for closures by Thomas Lord at LtU is also technically incorrect. Closures can solve the funarg problem because it allows preventing invalid accesses to the activation record (the stack), but the fact does not magically assert every operations on the resources comprising the closure will be valid. Such mechanism depend on the external execution environment. It should be clear, even in traditional implementations, the implicit owner (GC) is not a component in the closures, and the existence of the owner is the implementation detail of SECD machine (so it is one of the "high-order" details to the users). Whether such detail supports graph tracing or not has no effects on the qualification of closures. Besides, AFAIK, the language constructs let combined with rec is first introduced (again by P. Landin) in ISWIM in 1966, which could not have effects to enforce the original meaning of the closures invented earlier than itself.
的关系
综上所述,闭包可以(非正式地)定义为:
(1)特定于PL实现的数据结构,包括环境部分和类函数实体的控制部分,其中:
(1.1)控制部分来源于指定类函数实体的求值构造的一些源语言构造;
(1.2)环境部分由环境和可选的其他实现定义的数据组成;
(1.3)(1.2)中的环境由类函数实体的潜在上下文依赖的源语言构造决定,用于保存在创建类函数实体的源语言构造的评估构造中捕获的自由变量。
(2)或者,在(1)中使用名为“闭包”的实体的实现技术的总称。
Lambda表达式(抽象)只是源语言中引入(创建)未命名函数类实体的语法构造之一。PL可以提供它作为引入类函数实体的唯一方式。
一般来说,源程序中的lambda表达式和程序执行中闭包的存在之间没有明确的对应关系。由于实现细节对程序的可观察行为没有影响,所以PL实现通常允许在可能的情况下合并为闭包分配的资源,或者在对程序语义无关紧要的情况下完全忽略创建闭包:
The implementation can check the set of the free variables to be captured in the lambda expression, and when the set is empty, it can avoid introducing the environment part, so the function-like entity will not require a closure to be maintained. Such strategy is usually mandated in the rules of static languages. Otherwise, the implementation may or may not always create a closure for a function-like entity resulted by evaluating the lambda expression whether there are variables to be captured.
Lambda表达式可以计算为类函数实体。一些PLs的用户可以把这样一个类似函数的实体称为“闭包”。在这种情况下,“匿名函数”应该是这种“闭包”的更中性的名称。
附录:功能:混乱的历史
这与问题没有直接联系,但值得注意的是,“函数”可以在不同的上下文中命名不同的实体。
这在数学上已经是一团糟了。
目前我懒得在PLs的上下文中总结它们,但作为警告:注意上下文,以确保不同PLs中“函数”的各种定义不会使你的推理偏离主题。
至于一般使用“匿名函数”(在实践中由pl共享),我相信它不会在这个主题上引入重大的混淆和误解。
Named functions may have a slightly more problems. Functions may denote the entity of the name themselves (the "symbols"), as well as the evaluated values of these names. Given that the fact that most PLs don't have unevaluated context to differentiate a function with some other entities carrying interesting meaning (e.g. sizeof(a_plain_cxx_function) in C++ just ill-formed), users may not observe the differences of the misinterpretation between unevaluated operand and evaluated values. That will be problematic with some Lisp dialects having QUOTE. Even experienced PL specialists can easily miss something important; this is also why I emphasize to distinguish syntactic constructs with other entities.