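I agree with Dietrich Epp: it's a combination of several things that makes GHC fast.
First and foremost, Haskell is very high-level. This enables the compiler to perform aggressive optimizations without breaking your code.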
Think about SQL. When I write a SELECT statement, it might look like an imperative loop, but it isn't. It might look like it loops over all rows in that table trying to find the ones that match the specified conditions, but actually the "compiler" (the DB engine) could be doing an index lookup instead, which has completely different performance characteristics. Because SQL is so high-level, the "compiler" can substitute totally different algorithms, apply multiple processors or I/O channels or entire servers transparently, and more.
I think of Haskell the same way. You might think you just asked Haskell to map the input list to a second list, filter the second list into a third list, and then count how many items resulted. But you didn't see GHC apply fusion rewrite rules behind the scenes (foldr/build fusion for ordinary lists, stream fusion in libraries such as vector), transforming the entire thing into a single tight machine-code loop that does the whole job in one pass over the data with little or no intermediate allocation: the kind of thing that would be tedious, error-prone and unmaintainable to write by hand. That's only really possible because of the lack of low-level details in the code.
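To make the map/filter/count example concrete, here is a minimal sketch. The function names and the choice of "even squares" are mine, purely for illustration. The first version is written as three separate list operations; the second is roughly the single-pass loop you would otherwise write by hand. Whether GHC actually fuses the first into something like the second depends on the optimization level (`-O`) and the compiler version, so treat this as an illustration of the idea rather than a guarantee.

```haskell
import Data.List (foldl')

-- High-level version: conceptually builds two intermediate lists.
-- (Hypothetical example function, not from the original answer.)
countEvenSquares :: [Int] -> Int
countEvenSquares xs = length (filter even (map (\x -> x * x) xs))

-- Hand-written single pass: one strict accumulator, no intermediate lists.
countEvenSquaresLoop :: [Int] -> Int
countEvenSquaresLoop = foldl' step 0
  where
    step acc x
      | even (x * x) = acc + 1
      | otherwise    = acc

main :: IO ()
main = do
  print (countEvenSquares [1 .. 10])      -- prints 5
  print (countEvenSquaresLoop [1 .. 10])  -- prints 5
```

Both functions compute the same result; the point is that you get to write the first one and let the compiler worry about turning it into something like the second.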
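From a different angle, why shouldn't Haskell be fast? What does it do that should make it slow?
It's not an interpreted language like Perl or JavaScript. It's not even a virtual machine system like Java or C#. It compiles all the way down to native machine code, so there's no overhead there.
Unlike OO languages (Java, C#, JavaScript and so on), Haskell has full type erasure (like C, C++, Pascal and so on). All type checking happens at compile-time only. So there's no run-time type checking to slow you down either. (No null-pointer checks, for that matter. In, say, Java, the JVM must check for null pointers and throw an exception if you dereference one. Haskell doesn't have to bother with that check.)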
You say it sounds slow to "create functions on the fly at run-time", but if you look very carefully, you don't actually do that. It might look like you do, but you don't. If you say (+5), well, that's hard-coded into your source code. It cannot change at run-time. So it's not really a dynamic function. Even curried functions are really just saving parameters into a data block. All the executable code actually exists at compile-time; there is no run-time interpretation. (Unlike some other languages that have an "eval function".)
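A small sketch of what that means in practice. The names `addFive`, `scale` and `triple` are made up for this example. The section `(+ 5)` is fixed in the source, and the partially applied `scale 3` is, conceptually, just the already-compiled code for `scale` paired with the saved argument `3`; no new code is generated at run-time.

```haskell
-- A section: the 5 is baked into the source, nothing dynamic about it.
addFive :: Int -> Int
addFive = (+ 5)

-- An ordinary two-argument function, compiled once.
scale :: Int -> Int -> Int
scale k x = k * x

main :: IO ()
main = do
  -- Partial application: conceptually a data block holding the argument 3
  -- plus a reference to the existing code for scale.
  let triple = scale 3
  print (map addFive [1, 2, 3])  -- prints [6,7,8]
  print (map triple [1, 2, 3])   -- prints [3,6,9]
```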
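Think about Pascal. It's old and nobody really uses it any more, but nobody would complain that Pascal is slow. There are plenty of things to dislike about it, but slowness isn't really one of them. Haskell isn't really doing that much that's different from Pascal, other than having garbage collection rather than manual memory management. And immutable data allows several optimizations in the GC engine (which lazy evaluation then complicates somewhat).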
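I think the thing is that Haskell looks advanced and sophisticated and high-level, and everybody thinks "oh wow, this is really powerful, it must be amazingly slow!" But it isn't. Or at least, it isn't in the way you'd expect. Yes, it's got an amazing type system. But you know what? That all happens at compile-time. By run-time, it's gone. Yes, it allows you to construct complicated ADTs with a line of code. But you know what? An ADT is just a plain ordinary C union of structs. Nothing more.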
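As a rough illustration (the `Shape` type is a made-up example), a one-line ADT like the one below behaves much like a tagged union in C: each value carries a constructor tag plus that constructor's fields, and pattern matching is a branch on the tag. GHC's actual heap layout differs in its details, so take the C analogy as a mental model rather than a specification.

```haskell
-- Conceptually: a tag saying "Circle" or "Rectangle", followed by the fields.
data Shape
  = Circle Double            -- radius
  | Rectangle Double Double  -- width, height

-- Pattern matching inspects the tag and then reads the fields.
area :: Shape -> Double
area (Circle r)      = pi * r * r
area (Rectangle w h) = w * h

main :: IO ()
main = mapM_ (print . area) [Circle 1.0, Rectangle 2.0 3.0]
```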
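The real killer is lazy evaluation. When you get the strictness/laziness of your code right, you can write stupidly fast code that is still elegant and beautiful. But if you get this stuff wrong, your program goes thousands of times slower, and it's really non-obvious why this is happening.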
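For example, I wrote a trivial little program to count how many times each byte appears in a file. For a 25 KB input file, the program took 20 minutes to run and swallowed 6 GB of RAM! That's absurd!! But then I realized what the problem was, added a single bang pattern, and the run-time dropped to 0.02 seconds.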
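The original program isn't shown here, so the following is only a simplified sketch of the kind of fix involved, not a reconstruction of it. It counts occurrences of a single byte in a list (the function names and test input are hypothetical). In the lazy version the accumulator can pile up as a chain of unevaluated thunks; the bang pattern in the strict version forces it at every step. Note that with `-O`, GHC's strictness analysis will often make the lazy version strict on its own in a case this simple, so the difference shows up most clearly without optimization or in more complicated code.

```haskell
{-# LANGUAGE BangPatterns #-}

import Data.Word (Word8)

-- Lazy accumulator: acc may grow into a long chain of suspended
-- computations that is only evaluated at the very end.
countByteLazy :: Word8 -> [Word8] -> Int
countByteLazy target = go 0
  where
    go acc []       = acc
    go acc (b : bs) = go (if b == target then acc + 1 else acc) bs

-- Strict accumulator: the bang pattern forces acc on every iteration.
countByteStrict :: Word8 -> [Word8] -> Int
countByteStrict target = go 0
  where
    go !acc []       = acc
    go !acc (b : bs) = go (if b == target then acc + 1 else acc) bs

main :: IO ()
main = do
  let input = concat (replicate 1000 [0, 1, 2, 1]) :: [Word8]
  print (countByteLazy 1 input)    -- prints 2000
  print (countByteStrict 1 input)  -- prints 2000
```

The two functions return the same answer; the only textual difference is the `!`, which is what makes this class of bug so easy to miss.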
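This is where Haskell goes unexpectedly slowly. And it sure takes a while to get used to it. But over time, it gets easier to write really fast code.
What makes Haskell so fast? Purity. Static types. Laziness. But above all, being sufficiently high-level that the compiler can radically change the implementation without breaking your code's expectations.
But I guess that's just my opinion...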