Does anyone know what the worst possible asymptotic slowdown is that can happen when programming purely functionally as opposed to imperatively (i.e. allowing side effects)?

Clarification, from itowlson's comment: is there any problem for which the best known non-destructive algorithm is asymptotically worse than the best known destructive algorithm, and if so, by how much?


This paper claims that the known purely functional implementations of union-find all have worse asymptotic complexity than the algorithm they publish, which has a purely functional interface but uses mutable data internally.

The fact that other answers claim there can never be any difference, and that, for instance, the only "downside" of purely functional code is that it can be parallelized, gives you an idea of how informed/objective the functional programming community is on these matters.

Edit:

The comments below point out that a biased discussion of the pros and cons of purely functional programming may not come from the "functional programming community". Good point. Perhaps the advocates I have seen are just, to quote a comment, "illiterate".

For example, I think the author of this blog post can arguably be said to represent the functional programming community, and since it is a list of "points for lazy evaluation", it would have been a good place to mention any drawbacks that lazy and purely functional programming might have. A good place would have been in place of the following (technically true, but biased to the point of not being funny) dismissal:

"If a strict function has complexity O(f(n)) in a strict language, then it also has complexity O(f(n)) in a lazy language. Why worry? :)"


According to Pippenger [1996], when comparing a Lisp system that is purely functional (and has strict evaluation semantics, not lazy) to one that can mutate data, an algorithm written for the impure Lisp that runs in O(n) can be translated to an algorithm in the pure Lisp that runs in O(n log n) time (based on work by Ben-Amram and Galil [1992] about simulating random access memory using only pointers). Pippenger also establishes that there are algorithms for which that is the best you can do; there are problems which are O(n) in the impure system which are Ω(n log n) in the pure system.

There are a few things to note about this paper. Most importantly, it does not address lazy functional languages such as Haskell. Bird, Jones, and De Moor [1997] show that the problem constructed by Pippenger can be solved in a lazy functional language in O(n) time, but they do not establish (and, as far as I know, no one has) whether a lazy functional language can solve all problems in the same asymptotic running time as a language with mutation.

The problem constructed by Pippenger that requires Ω(n log n) is specifically constructed to achieve this result, and is not necessarily representative of practical, real-world problems. There are a few restrictions on the problem that are a bit unexpected, but necessary for the proof to work; in particular, the problem requires that results are computed on-line, without being able to access future input, and that the input consists of a sequence of atoms from an unbounded set of possible atoms, rather than a fixed size set. And the paper only establishes (lower bound) results for an impure algorithm of linear running time; for problems that require a greater running time, it is possible that the extra O(log n) factor seen in the linear problem may be able to be "absorbed" in the process of extra operations necessary for algorithms with greater running times. These clarifications and open questions are explored briefly by Ben-Amram [1996].

In practice, many algorithms can be implemented in a purely functional language with the same efficiency as in a language with mutable data structures. For a good reference on techniques for efficiently implementing purely functional data structures, see Chris Okasaki's "Purely Functional Data Structures" [Okasaki 1998] (an expanded version of his thesis [Okasaki 1996]).

Anyone who needs to implement algorithms on purely-functional data structures should read Okasaki. You can always get at worst an O(log n) slowdown per operation by simulating mutable memory with a balanced binary tree, but in many cases you can do considerably better than that, and Okasaki describes many useful techniques, from amortized techniques to real-time ones that do the amortized work incrementally. Purely functional data structures can be a bit difficult to work with and analyze, but they provide many benefits like referential transparency that are helpful in compiler optimization, in parallel and distributed computing, and in implementation of features like versioning, undo, and rollback.
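
As a minimal sketch of the balanced-tree simulation described above (not one of Okasaki's optimized structures; the names Mem, readMem, and writeMem are purely illustrative), mutable memory can be modelled in Haskell with a persistent Data.Map, paying O(log n) per read or write:

    import qualified Data.Map.Strict as M

    -- Mutable memory simulated with a persistent balanced binary tree:
    -- every read and write costs O(log n) instead of the imperative O(1),
    -- which is exactly the worst-case per-operation slowdown noted above.
    type Mem = M.Map Int Int

    writeMem :: Int -> Int -> Mem -> Mem
    writeMem addr val = M.insert addr val        -- O(log n)

    readMem :: Int -> Mem -> Int
    readMem addr = M.findWithDefault 0 addr      -- O(log n)

    -- An "in-place" increment returns a new version of the memory; the
    -- old version remains usable, so persistence comes for free.
    incr :: Int -> Mem -> Mem
    incr addr mem = writeMem addr (readMem addr mem + 1) mem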

Note also that all of this discusses only asymptotic running times. Many techniques for implementing purely functional data structures introduce a certain constant-factor slowdown, due to the extra bookkeeping they need to work and the implementation details of the language in question. The benefits of purely functional data structures may outweigh these constant-factor slowdowns, so you will generally need to make trade-offs based on the problem at hand.

References

Ben-Amram, Amir and Galil, Zvi 1992. "On Pointers versus Addresses." Journal of the ACM, 39(3), pp. 617–648, July 1992.
Ben-Amram, Amir 1996. "Notes on Pippenger's Comparison of Pure and Impure Lisp." Unpublished manuscript, DIKU, University of Copenhagen, Denmark.
Bird, Richard, Jones, Geraint, and De Moor, Oege 1997. "More haste, less speed: lazy versus eager evaluation." Journal of Functional Programming, 7(5), pp. 541–547, September 1997.
Okasaki, Chris 1996. "Purely Functional Data Structures." PhD Thesis, Carnegie Mellon University.
Okasaki, Chris 1998. "Purely Functional Data Structures." Cambridge University Press, Cambridge, UK.
Pippenger, Nicholas 1996. "Pure versus Impure Lisp." ACM Symposium on Principles of Programming Languages, pp. 104–109, January 1996.


With a fixed upper bound on memory usage, there should be no difference.

Proof sketch: Given a fixed upper bound on memory usage, one should be able to write a virtual machine which executes an imperative instruction set with the same asymptotic complexity as if you were actually executing on that machine. This is so because you can manage the mutable memory as a persistent data structure, giving O(log(n)) read and writes, but with a fixed upper bound on memory usage, you can have a fixed amount of memory, causing these to decay to O(1). Thus the functional implementation can be the imperative version running in the functional implementation of the VM, and so they should both have the same asymptotic complexity.
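
As a toy illustration of that argument (the Instr type and VM record below are hypothetical, chosen only for the sketch): because the address space is bounded by a constant, each map operation costs O(log constant) = O(1), so running an imperative program on this purely functional VM preserves its asymptotic complexity.

    import Data.List (foldl')
    import qualified Data.Map.Strict as M

    -- A tiny imperative instruction set over a single accumulator.
    data Instr = Load Int | Store Int | AddConst Int

    -- The VM's memory is a persistent map over a fixed address space,
    -- so lookups and inserts are O(log constant) = O(1).
    type Mem = M.Map Int Int
    data VM  = VM { acc :: Int, mem :: Mem }

    step :: Instr -> VM -> VM
    step (Load a)     vm = vm { acc = M.findWithDefault 0 a (mem vm) }
    step (Store a)    vm = vm { mem = M.insert a (acc vm) (mem vm) }
    step (AddConst k) vm = vm { acc = acc vm + k }

    -- Run an imperative program purely functionally, one instruction
    -- at a time, threading the persistent machine state through.
    run :: [Instr] -> VM -> VM
    run prog vm0 = foldl' (flip step) vm0 prog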


There are indeed algorithms and data structures for which no asymptotically efficient purely functional solution (i.e. one implementable in the pure lambda calculus) is known, even with laziness.

The aforementioned union-find, hash tables, arrays, some graph algorithms, and so on.
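
To make the union-find case concrete, here is a rough purely functional sketch (not the algorithm from the paper cited in the question; fromElems, find, and union are illustrative names). The parent links live in a persistent map, so every step of find pays an O(log n) lookup, and without mutation the path cannot be compressed destructively the way the classic imperative structure does.

    import qualified Data.Map.Strict as M

    -- Parent map: every element points to its parent; roots point to themselves.
    type UnionFind a = M.Map a a

    -- Each element starts out as its own root.
    fromElems :: Ord a => [a] -> UnionFind a
    fromElems xs = M.fromList [(x, x) | x <- xs]

    -- Walk up to the representative; each step is an O(log n) lookup.
    find :: Ord a => UnionFind a -> a -> a
    find uf x = case M.lookup x uf of
      Just p | p /= x -> find uf p
      _               -> x

    -- Link one representative under the other (union by rank omitted to
    -- keep the sketch short, so chains may grow long).
    union :: Ord a => UnionFind a -> a -> a -> UnionFind a
    union uf x y
      | rx == ry  = uf
      | otherwise = M.insert rx ry uf
      where
        rx = find uf x
        ry = find uf y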

However, this assumes that in "imperative" languages access to memory is O(1), whereas in theory it cannot be so asymptotically (i.e. for unbounded problem sizes): access to memory within a huge data set is always O(log n), and that can be emulated in a functional language.

Furthermore, we must remember that practically all modern functional languages provide mutable data, and Haskell even provides it without sacrificing purity (the ST monad).
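
For example, a minimal sketch of the ST monad approach (sumST is just an illustrative name): the mutation is confined to runST, so the function exposes a purely functional interface even though it updates an STRef internally.

    import Control.Monad.ST (runST)
    import Data.STRef (newSTRef, readSTRef, modifySTRef')

    -- Sum a list using an internally mutable accumulator. runST guarantees
    -- that the mutation cannot leak, so sumST as a whole remains pure.
    sumST :: [Int] -> Int
    sumST xs = runST $ do
      acc <- newSTRef 0
      mapM_ (\x -> modifySTRef' acc (+ x)) xs
      readSTRef acc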