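I believe the Erlang community is not envious of Node.js, since Erlang does non-blocking I/O natively and has ways to scale deployments easily to more than one processor (something that is not even built into Node.js). More details at http://journal.dedasys.com/2010/04/29/erlang-vs-node-js and in "Node.js or Erlang".

What about Haskell? Can Haskell provide some of the benefits of Node.js, namely a clean solution to avoid blocking I/O without having recourse to multi-threaded programming?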


There are many things that are attractive about Node.js:

- Events: no thread manipulation, the programmer only provides callbacks (as in the Snap framework).
- Callbacks are guaranteed to run in a single thread: no race condition possible.
- Nice and simple UNIX-friendly API. Bonus: excellent HTTP support. DNS is also available.
- Every I/O is asynchronous by default. This makes it easier to avoid locks. However, too much CPU processing in a callback will impact other connections (in this case, the task should be split into smaller sub-tasks and rescheduled).
- Same language for client-side and server-side. (I don't see too much value in this one, however. jQuery and Node.js share the event programming model, but the rest is very different. I just can't see how sharing code between server-side and client-side could be useful in practice.)
- All this packaged in a single product.


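"Can Haskell provide some of the benefits of Node.js, namely a clean solution to avoid blocking I/O without having recourse to multi-threaded programming?"

Yes, in fact events and threads are unified in Haskell.

- You can program with explicit lightweight threads (e.g. millions of threads on a single laptop).
- Or you can program in an asynchronous event-driven style, based on scalable event notification.

Threads are actually implemented in terms of events and run across multiple cores, with seamless thread migration, documented performance, and real applications.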
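To make the "lightweight threads" point concrete, here is a minimal sketch using only the `base` library. The function name `spawnWorkers` and the choice of 100,000 threads are illustrative assumptions, not something taken from the answer; the point is only that `forkIO` threads are cheap enough to create in bulk.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM)

-- Spawn n lightweight threads (hypothetical helper). Thread i hands back
-- i * 2 through its own MVar, and the caller collects every result.
spawnWorkers :: Int -> IO [Int]
spawnWorkers n = do
  boxes <- forM [1 .. n] $ \i -> do
    box <- newEmptyMVar
    _ <- forkIO (putMVar box (i * 2))
    return box
  -- takeMVar blocks only the calling thread until that worker has answered.
  mapM takeMVar boxes

main :: IO ()
main = do
  results <- spawnWorkers 100000
  print (sum results)
```

Compiling with `ghc -threaded` and running with `+RTS -N` lets the runtime spread these threads over the available cores; the program text stays the same either way.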

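For example:

- massively concurrent job orchestration
- concurrent collections scaling on 32 or 48 cores
- tool support for profiling and debugging multi-threaded/multi-event programs
- high-performance event-driven web servers
- interesting users, such as high-frequency trading

(Figure: concurrent collections n-body benchmark on 32 cores.)

In Haskell you have both events and threads, because it is all events under the hood.

Read the paper describing the implementation.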


Ok, so having watched a little of the node.js presentation that @gawi pointed me at, I can say a bit more about how Haskell compares to node.js. In the presentation, Ryan describes some of the benefits of Green Threads, but then goes on to say that he doesn't find the lack of a thread abstraction to be a disadvantage. I'd disagree with his position, particularly in the context of Haskell: I think the abstractions that threads provide are essential for making server code easier to get right, and more robust. In particular:

- Using one thread per connection lets you write code that expresses the communication with a single client, rather than writing code that deals with all the clients at the same time. Think of it like this: a server that handles multiple clients with threads looks almost the same as one that handles a single client; the main difference is that there's a fork somewhere in the former. If the protocol you're implementing is at all complex, managing the state machine for multiple clients simultaneously gets quite tricky, whereas threads let you just script the communication with a single client. The code is easier to get right, and easier to understand and maintain.
- Callbacks on a single OS thread are cooperative multitasking, as opposed to preemptive multitasking, which is what you get with threads. The main disadvantage of cooperative multitasking is that the programmer is responsible for making sure that there's no starvation. It loses modularity: make a mistake in one place, and it can screw up the whole system. This is really something you don't want to have to worry about, and preemption is the simple solution. Moreover, communication between callbacks isn't possible (it would deadlock).
- Concurrency isn't hard in Haskell, because most code is pure and so is thread-safe by construction. There are simple communication primitives. It's much harder to shoot yourself in the foot with concurrency in Haskell than in a language with unrestricted side effects.
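The "script one client, then add a fork" idea can be sketched as follows. To stay self-contained this uses in-memory channels rather than real sockets: the `Conn` type, the `session` protocol and the `demo` driver are all hypothetical names invented for this illustration, and a real server would use a networking library instead.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
import Control.Monad (forM)

-- A stand-in for a network connection (assumption: two in-memory channels).
data Conn = Conn { fromClient :: Chan String, toClient :: Chan String }

-- The protocol for ONE client, written as a plain sequential script.
session :: Conn -> IO ()
session conn = do
  writeChan (toClient conn) "name?"
  name <- readChan (fromClient conn)
  writeChan (toClient conn) ("hello, " ++ name)

-- Serving many clients is the same script plus a fork per connection.
serve :: [Conn] -> IO ()
serve = mapM_ (forkIO . session)

-- Drive the server with one simulated client per name.
demo :: [String] -> IO [String]
demo names = do
  conns <- mapM (const (Conn <$> newChan <*> newChan)) names
  serve conns
  forM (zip names conns) $ \(name, conn) -> do
    _prompt <- readChan (toClient conn)
    writeChan (fromClient conn) name
    readChan (toClient conn)

main :: IO ()
main = demo ["alice", "bob"] >>= mapM_ putStrLn
```

Note that `session` holds its per-client state (here just `name`) in ordinary local variables; there is no explicit state machine shared across clients.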


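First off, I don't share your view that node.js is doing the right thing by exposing all of those callbacks. You end up writing your program in CPS (continuation-passing style), and I think it should be the compiler's job to do that transformation.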
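As a rough illustration of the difference, the sketch below writes the same two-step lookup once in callback (CPS) style and once in direct style. `fetchUser` and `fetchOrders` are hypothetical stand-ins for real I/O operations, invented for this example.

```haskell
-- Hypothetical lookups standing in for I/O operations.
fetchUser :: Int -> IO String
fetchUser uid = return ("user" ++ show uid)

fetchOrders :: String -> IO [String]
fetchOrders name = return [name ++ "-order1", name ++ "-order2"]

-- Callback versions: each step takes "what to do next" as an argument.
fetchUserK :: Int -> (String -> IO r) -> IO r
fetchUserK uid k = fetchUser uid >>= k

fetchOrdersK :: String -> ([String] -> IO r) -> IO r
fetchOrdersK name k = fetchOrders name >>= k

-- CPS style: the control flow is nested inside the callbacks.
reportK :: Int -> (String -> IO r) -> IO r
reportK uid k =
  fetchUserK uid $ \name ->
    fetchOrdersK name $ \orders ->
      k (name ++ ": " ++ show (length orders) ++ " orders")

-- Direct style: the same logic as a sequential script.
report :: Int -> IO String
report uid = do
  name <- fetchUser uid
  orders <- fetchOrders name
  return (name ++ ": " ++ show (length orders) ++ " orders")

main :: IO ()
main = do
  reportK 7 putStrLn
  report 7 >>= putStrLn
```

Both versions produce the same string; the direct one leaves the sequencing to `do` notation and, for blocking operations, to the runtime's scheduler.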

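"Events: no thread manipulation, the programmer only provides callbacks (as in the Snap framework)."

So with this in mind, you can write in an asynchronous style if you so wish, but by doing so you'd miss out on writing in an efficient synchronous style, with one thread per request. Haskell is ludicrously efficient at synchronous code, especially when compared to other languages. It's all events underneath.

"Callbacks are guaranteed to run in a single thread: no race condition possible."

You can still have a race condition in node.js, but it's more difficult.

Every request is in its own thread. When you write code that has to communicate with other threads, it's very simple to make it thread-safe thanks to Haskell's concurrency primitives.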
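As one example of such a primitive, the sketch below shares a hit counter between many threads through an `MVar` from `base`. The name `countHits` and the thread counts are assumptions made for illustration; the answer itself does not name a specific primitive.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (modifyMVar_, newEmptyMVar, newMVar, putMVar, readMVar, takeMVar)
import Control.Monad (forM_, replicateM_)

-- Each of `threads` workers increments a shared counter `perThread` times.
-- modifyMVar_ takes the value, applies the update and puts it back, so
-- concurrent increments cannot interleave and get lost.
countHits :: Int -> Int -> IO Int
countHits threads perThread = do
  counter <- newMVar 0
  done <- newEmptyMVar
  forM_ [1 .. threads] $ \_ -> forkIO $ do
    replicateM_ perThread (modifyMVar_ counter (\n -> return $! n + 1))
    putMVar done ()
  -- Wait for every worker to signal completion before reading the total.
  replicateM_ threads (takeMVar done)
  readMVar counter

main :: IO ()
main = countHits 100 1000 >>= print
```

For state spanning several variables, software transactional memory (the `stm` package) is the usual next step, but that goes beyond this small sketch.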

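"Nice and simple UNIX-friendly API. Bonus: excellent HTTP support. DNS is also available."

Take a look on Hackage and see for yourself.

"Every I/O is asynchronous by default (this can be annoying sometimes, though). This makes it easier to avoid locks. However, too much CPU processing in a callback will impact other connections (in this case, the task should be split into smaller sub-tasks and rescheduled)."

You have no such problems: GHC will distribute your work among real OS threads.

"Same language for client-side and server-side. (I don't see too much value in this one, however. jQuery and Node.js share the event programming model, but the rest is very different. I just can't see how sharing code between server-side and client-side could be useful in practice.)"

Haskell can't possibly win here... right? Think again: http://www.haskell.org/haskellwiki/Haskell_in_web_browser

"All this packaged in a single product."

Download GHC, fire up cabal. There's a package for every need.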


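The question is pretty ridiculous, because 1) Haskell has already solved this issue in a much better way and 2) it has done so in roughly the same way Erlang has. Here's a benchmark against node: http://www.yesodweb.com/blog/2011/03/preliminary-warp-cross-language-benchmarks

Give Haskell 4 cores and it can handle 100k (simple) requests per second in a single application. Node can't do as many, and it can't scale a single application across cores. You don't have to do anything to reap this benefit, because the Haskell runtime is non-blocking. Another (relatively common) language with non-blocking IO built into the runtime is Erlang.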


IMHO events are good, but programming by means of callbacks is not.

Most of the problems that make coding and debugging web applications distinctive come from what makes them scalable and flexible, above all the stateless nature of HTTP. This enhances navigability, but it imposes an inversion of control where the IO element (the web server in this case) calls different handlers in the application code. This event model, or more accurately callback model, is a nightmare, since callbacks do not share variable scopes and an intuitive view of the navigation is lost. It is very difficult to anticipate all the possible state changes when the user navigates back and forth, among other problems.

It may be said that the problems are similar to GUI programming, where the event model works fine, but GUIs have no navigation and no back button. That multiplies the possible state transitions in web applications. The attempts to solve these problems have resulted in heavy frameworks with complicated configurations and plenty of pervasive magic identifiers, without questioning the root of the problem: the callback model, with its inherent lack of shared variable scopes and its lack of sequencing, so that the sequence has to be constructed by linking identifiers.

There are sequential frameworks like ocsigen (OCaml), seaside (Smalltalk), WASH (discontinued, Haskell) and mflow (Haskell) that solve the problem of state management while maintaining navigability and RESTfulness. Within these frameworks, the programmer can express the navigation as an imperative sequence where the program sends pages and waits for responses in a single thread, variables are in scope, and the back button works automatically. This inherently produces shorter, safer, more readable code where the navigation is clearly visible to the programmer. (Fair warning: I'm the developer of mflow.)


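I personally see Node.js and programming with callbacks as unnecessarily low-level and a bit unnatural. Why program with callbacks when a good runtime, such as the one found in GHC, can handle callbacks for you and do so quite efficiently?

In the meantime, the GHC runtime has improved greatly: it now features a "new new IO manager" called MIO, where the "M" stands for multicore, I believe. It builds on the foundation of the existing IO manager, and its main goal is to overcome the cause of the performance degradation on 4+ cores. The performance numbers provided in the paper are pretty impressive. See for yourself:

"With Mio, realistic HTTP servers in Haskell scale to 20 CPU cores, achieving peak performance up to a factor of 6.5x compared to the same servers using previous versions of GHC. The latency of Haskell servers is also improved: [...] under a moderate load, reduces expected response time by 5.7x when compared with previous versions of GHC."

And:

"We also show that with Mio, McNettle (an SDN controller written in Haskell) can scale effectively to 40+ cores, reach a throughput of over 20 million new requests per second on a single machine, and hence become the fastest of all existing SDN controllers."

Mio has made it into the GHC 7.8.1 release. I personally see this as a major step forward in Haskell performance. It would be very interesting to compare the performance of existing web applications compiled with the previous GHC version and with 7.8.1.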


Just as nodejs dropped libev, the Snap Haskell web framework has dropped libev as well.