最近我读了很多关于函数式编程的东西,大部分我都能理解,但有一件事我就是搞不懂,那就是无状态编码。在我看来,通过删除可变状态来简化编程就像通过删除仪表盘来“简化”一辆汽车:最终产品可能更简单,但希望它能与最终用户交互。

几乎我能想到的每个用户应用程序都将状态作为核心概念。如果你写了一个文档(或一个SO post),状态会随着每一个新的输入而改变。或者如果你玩电子游戏,会有大量的状态变量,从所有角色的位置开始,这些角色往往会不断移动。如果不跟踪不断变化的值,您怎么可能做任何有用的事情呢?

每次我发现一些讨论这个问题的东西,它都是用真正的技术函数语言写的,假设我没有浓厚的FP背景。有谁知道如何向那些对命令式编码有很好的、扎实的理解,但在函数方面完全是n00b的人解释这一点吗?

编辑:到目前为止,一堆回复似乎试图让我相信不可变值的优点。我懂你的意思。这很有道理。我不明白的是,在没有可变变量的情况下,如何跟踪必须不断变化的值。


当前回答

我觉得这里有点误会。纯函数式程序有状态。不同之处在于该状态是如何建模的。在纯函数式编程中,状态是由接受某个状态并返回下一个状态的函数操作的。然后通过将状态传递给纯函数序列来实现状态排序。

甚至全局可变状态也可以这样建模。例如,在Haskell中,程序是一个从World到World的函数。也就是说,你传入整个宇宙,程序返回一个新的宇宙。但是,在实践中,您只需要传入您的程序实际感兴趣的部分。程序实际上返回一系列动作,作为程序运行的操作环境的指令。

您希望从命令式编程的角度对此进行解释。好的,让我们看一些用函数式语言编写的非常简单的命令式编程。

考虑下面的代码:

int x = 1;
int y = x + 1;
x = x + y;
return x;

相当标准的命令式代码。没有做什么有趣的事情,但这是可以用来说明的。我想你们会同意这里涉及到国家。变量x的值随时间变化。现在,让我们稍微改变一下符号,发明一个新的语法:

let x = 1 in
let y = x + 1 in
let z = x + y in z 

加上括号使它的意思更清楚:

let x = 1 in (let y = x + 1 in (let z = x + y in (z)))

可以看到,状态是由一系列纯表达式建模的,这些纯表达式绑定了下面表达式的自由变量。

您会发现这个模式可以建模任何类型的状态,甚至IO。

其他回答

除了别人给出的很好的答案,想想Java中的Integer和String类。这些类的实例是不可变的,但这并不意味着仅仅因为它们的实例不可更改,这些类就毫无用处。不可变性给了你一定的安全性。您知道,如果使用String或Integer实例作为Map的键,则该键不能更改。将其与Java中的Date类进行比较:

Date date = new Date();
mymap.put(date, date.toString());
// Some time later:
date.setTime(new Date().getTime());

你已经无声地改变了地图中的一个键!使用不可变对象(如在函数式编程中)要干净得多。更容易推断会发生什么副作用——没有!这意味着程序员更容易,优化器也更容易。

我觉得这里有点误会。纯函数式程序有状态。不同之处在于该状态是如何建模的。在纯函数式编程中,状态是由接受某个状态并返回下一个状态的函数操作的。然后通过将状态传递给纯函数序列来实现状态排序。

甚至全局可变状态也可以这样建模。例如,在Haskell中,程序是一个从World到World的函数。也就是说,你传入整个宇宙,程序返回一个新的宇宙。但是,在实践中,您只需要传入您的程序实际感兴趣的部分。程序实际上返回一系列动作,作为程序运行的操作环境的指令。

您希望从命令式编程的角度对此进行解释。好的,让我们看一些用函数式语言编写的非常简单的命令式编程。

考虑下面的代码:

int x = 1;
int y = x + 1;
x = x + y;
return x;

相当标准的命令式代码。没有做什么有趣的事情,但这是可以用来说明的。我想你们会同意这里涉及到国家。变量x的值随时间变化。现在,让我们稍微改变一下符号,发明一个新的语法:

let x = 1 in
let y = x + 1 in
let z = x + y in z 

加上括号使它的意思更清楚:

let x = 1 in (let y = x + 1 in (let z = x + y in (z)))

可以看到,状态是由一系列纯表达式建模的,这些纯表达式绑定了下面表达式的自由变量。

您会发现这个模式可以建模任何类型的状态,甚至IO。

事实上,即使在没有可变状态的语言中,也很容易有一些看起来像可变状态的东西。

Consider a function with type s -> (a, s). Translating from Haskell syntax, it means a function which takes one parameter of type "s" and returns a pair of values, of types "a" and "s". If s is the type of our state, this function takes one state and returns a new state, and possibly a value (you can always return "unit" aka (), which is sort of equivalent to "void" in C/C++, as the "a" type). If you chain several calls of functions with types like this (getting the state returned from one function and passing it to the next), you have "mutable" state (in fact you are in each function creating a new state and abandoning the old one).

It might be easier to understand if you imagine the mutable state as the "space" where your program is executing, and then think of the time dimension. At instant t1, the "space" is in a certain condition (say for example some memory location has value 5). At a later instant t2, it is in a different condition (for example that memory location now has value 10). Each of these time "slices" is a state, and it is immutable (you cannot go back in time to change them). So, from this point of view, you went from the full spacetime with a time arrow (your mutable state) to a set of slices of spacetime (several immutable states), and your program is just treating each slice as a value and computing each of them as a function applied to the previous one.

好吧,也许这并不容易理解:-)

It might seem inneficient to explicitly represent the whole program state as a value, which has to be created only to be discarded the next instant (just after a new one is created). For some algorithms it might be natural, but when it is not, there is another trick. Instead of a real state, you can use a fake state which is nothing more than a marker (let's call the type of this fake state State#). This fake state exists from the point of view of the language, and is passed like any other value, but the compiler completely omits it when generating the machine code. It only serves to mark the sequence of execution.

举个例子,假设编译器给了我们以下函数:

readRef :: Ref a -> State# -> (a, State#)
writeRef :: Ref a -> a -> State# -> (a, State#)

从这些类似haskell的声明中转换,readRef接收到类似于指向“a”类型值和假状态的指针或句柄的东西,并返回由第一个形参和新的假状态指向的“a”类型值。writeRef类似,但是改变了所指向的值。

If you call readRef and then pass it the fake state returned by writeRef (perhaps with other calls to unrelated functions in the middle; these state values create a "chain" of function calls), it will return the value written. You can call writeRef again with the same pointer/handle and it will write to the same memory location — but, since conceptually it is returning a new (fake) state, the (fake) state is still imutable (a new one has been "created"). The compiler will call the functions in the order it would have to call them if there was a real state variable which had to be computed, but the only state which there is is the full (mutable) state of the real hardware.

(了解Haskell的人会注意到我简化了很多东西,并省略了一些重要的细节。对于那些想要看到更多细节的人,看看mtl中的Control.Monad.State,以及ST和IO(又名ST RealWorld)单子。

You might wonder why doing it in such a roundabout way (instead of simply having mutable state in the language). The real advantage is that you have reified your program's state. What before was implicit (your program state was global, allowing for things like action at a distance) is now explicit. Functions which do not receive and return the state cannot modify it or be influenced by it; they are "pure". Even better, you can have separate state threads, and with a bit of type magic, they can be used to embed an imperative computation within a pure one, without making it impure (the ST monad in Haskell is the one normally used for this trick; the State# I mentioned above is in fact GHC's State# s, used by its implementation of the ST and IO monads).

让我们来回答更普遍的问题:

没有状态,你怎么做有用的事情?

你不。

寻找传统语言的替代品 我们必须首先认识到,一种制度不可能成为历史 敏感(允许执行一个程序来影响 一个后续的行为),除非系统已经 某种状态(第一个程序可以改变这种状态) 而第二个可以访问)。因此,历史敏感 计算系统的模型必须具有状态转换 语义学,至少在这个弱意义上。 约翰·巴克斯。

(由我强调)

重要的是巴克斯随后的观察:

但这不是 意味着每个计算都必须严重依赖于 复杂状态[…]

像Haskell或Clean这样的函数式语言允许您轻松地将这种观察付诸实践:大多数定义都是普通的函数,就像您在数学教育中看到的那样。这就留下了一小群“杂七杂八的人”来处理所有恼人的外部状态,例如:

与用户互动, 与远程服务通信, 处理模拟使用随机抽样, 打印出一个SVG文件(例如海报), 定期备份,

...两种语言都使用类型来区分普通和混杂。


有时,如果您试图实现的算法使用私有、局部可变状态实现,则效果最好。在这种情况下,你可以使用Haskell扩展来做到这一点,而不会让整个程序变得“内部杂乱”——详情请参阅John Launchbury和Simon Peyton Jones编写的State In Haskell。

JavaScript provides very clear examples of the different ways of approaching mutable or immutable state\values within its core because the ECMAScript specifications were not able to settle on a universal standard so one must continue to memorize or doublecheck which functions create a new object that they return or modify the original object passed to it. If your entire language is immutable then you know you are always getting a new (copied & possibly modified) result and never have to worry about accidentally modifying the variable before passing it into a function.

你知道哪个会返回一个新的对象,哪个会改变下面例子中的原始对象吗?

Array.prototype.push()
String.prototype.slice()
Array.prototype.splice()
String.prototype.trim()