I worry that I'm "doing the wrong thing" here; if so, delete this and I apologize. In particular, I can't see how to create the neat little annotations that some folks have created. Still, I have a number of concerns/observations about this topic.
1) The commented element in the pseudocode of one of the popular answers,
    result = query( "select smurfs from some_mushroom" );
    // twiddle fingers
    go_do_something_with_result( result );
is essentially bogus. If the thread is computing, it's not twiddling its thumbs, it's doing necessary work. If, on the other hand, it's simply waiting for IO to complete, then it's not using CPU time at all; the whole point of the thread-control infrastructure in the kernel is that the CPU will find something useful to do. The only way to "twiddle your fingers" as suggested here would be to write a polling loop, and nobody who has written real web server code would ever do that.
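To make the contrast concrete, here's a minimal Java sketch of the two behaviours; the class and method names are mine, purely for illustration:

    import java.io.IOException;
    import java.io.InputStream;
    import java.net.Socket;

    public class BlockingVsPolling {
        // Blocking read: the kernel parks this thread until data arrives.
        // It burns no CPU while waiting; the scheduler runs other threads.
        static int blockingRead(Socket socket, byte[] buf) throws IOException {
            InputStream in = socket.getInputStream();
            return in.read(buf); // thread sleeps here, costing nothing
        }

        // Busy polling: the only way to genuinely "twiddle fingers", and
        // something nobody writing a real web server would ever do.
        static int pollingRead(Socket socket, byte[] buf) throws IOException {
            InputStream in = socket.getInputStream();
            while (in.available() == 0) {
                // spin: burns a whole CPU doing nothing useful
            }
            return in.read(buf);
        }
    }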
2) "Threads are hard", only makes sense in the context of data sharing. If you have essentially independent threads such as is the case when handling independent web requests, then threading is trivially simple, you just code up the linear flow of how to handle one job, and sit pretty knowing that it will handle multiple requests, and each will be effectively independent. Personally, I would venture that for most programmers, learning the closure/callback mechanism is more complex than simply coding the top-to-bottom thread version. (But yes, if you have to communicate between the threads, life gets really hard really fast, but then I'm unconvinced that the closure/callback mechanism really changes that, it just restricts your options, because this approach is still achievable with threads. Anyway, that's a whole other discussion that's really not relevant here).
3) So far, nobody has presented any real evidence as to why one particular kind of context switch should be more or less expensive than another kind. My experience building multitasking kernels (on small embedded controllers, nothing as fancy as a "real" OS) suggests this would not be the case.
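If anyone wants to gather such evidence, a crude probe is easy to write. This Java toy of mine (not a benchmark from this discussion) only measures thread-to-thread handoff cost on one machine, and the queue overhead is included in the number:

    import java.util.concurrent.SynchronousQueue;

    public class PingPong {
        // Every put/take pair forces the kernel to switch between the two
        // threads, so elapsed/rounds roughly approximates the cost of a
        // pair of context switches (plus queue overhead).
        public static void main(String[] args) throws InterruptedException {
            SynchronousQueue<Integer> ping = new SynchronousQueue<>();
            SynchronousQueue<Integer> pong = new SynchronousQueue<>();
            final int rounds = 100_000;
            Thread echo = new Thread(() -> {
                try {
                    for (int i = 0; i < rounds; i++) pong.put(ping.take());
                } catch (InterruptedException ignored) { }
            });
            echo.start();
            long start = System.nanoTime();
            for (int i = 0; i < rounds; i++) {
                ping.put(i);
                pong.take();
            }
            long ns = System.nanoTime() - start;
            echo.join();
            System.out.printf("%.1f us per round trip%n", ns / 1000.0 / rounds);
        }
    }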
4) All the illustrations that I have seen to date that purport to show how much faster Node is than other webservers are horribly flawed; however, they're flawed in a way that does indirectly illustrate one advantage I would definitely accept for Node (and it's by no means insignificant). Node doesn't look like it needs (nor even permits, actually) tuning. If you have a threaded model, you need to create sufficient threads to handle the expected load. Do this badly and you'll end up with poor performance. If there are too few threads, the CPU is idle but unable to accept more requests; create too many, and you will waste kernel memory, and in the case of a Java environment you'll also be wasting main heap memory. Now, for Java, wasting heap is the first and best way to screw up the system's performance, because efficient garbage collection (currently; this might change with G1, but it seems the jury is still out on that point as of early 2013 at least) depends on having lots of spare heap. So there's the issue: tune it with too few threads and you have idle CPUs and poor throughput; tune it with too many and it bogs down in other ways.
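As a concrete illustration of that tuning knob, a minimal Java sketch; the property name server.threads and the default of 64 are made up for the example:

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    public class TunedServer {
        public static void main(String[] args) {
            // The tuning knob described above: too few threads and CPUs sit
            // idle while requests queue; too many and each thread's stack
            // plus per-thread bookkeeping eats kernel (and, on Java, heap)
            // memory.
            int poolSize = Integer.getInteger("server.threads", 64);
            ExecutorService pool = Executors.newFixedThreadPool(poolSize);
            // ... submit request handlers to the pool ...
            pool.shutdown();
        }
    }

There is no equivalent number to get right in Node, which is exactly the advantage I'm conceding.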
5) There is another way in which I accept the logic of the claim that Node's approach "is faster by design", and it is this. Most thread models use a time-sliced context-switch model, layered on top of the more appropriate (value judgement alert :) and more efficient (not a value judgement) preemptive model. This happens for two reasons: first, most programmers don't seem to understand priority preemption, and second, if you learn threading in a Windows environment, the timeslicing is there whether you like it or not (of course, this reinforces the first point; notably, the first versions of Java used priority preemption on Solaris implementations and timeslicing on Windows. Because most programmers didn't understand this and complained that "threading doesn't work in Solaris", they changed the model to timeslice everywhere). Anyway, the bottom line is that timeslicing creates additional (and potentially unnecessary) context switches. Every context switch takes CPU time, and that time is effectively removed from the work that can be done on the real job at hand. However, the amount of time invested in context switching because of timeslicing should not be more than a very small percentage of the overall time, unless something pretty outlandish is happening, and there's no reason I can see to expect that to be the case in a simple webserver. So, yes, the excess context switches involved in timeslicing are inefficient (and these don't happen in kernel threads as a rule, btw), but the difference will be a few percent of throughput, not the kind of whole-number factors implied in the performance claims often made for Node.
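To put rough numbers on "a very small percentage", here is a back-of-envelope calculation in Java; the switch rate and per-switch cost are illustrative assumptions on my part, not measurements:

    public class SwitchOverhead {
        public static void main(String[] args) {
            // purely illustrative assumptions:
            double switchesPerSec = 1_000;  // e.g. a fully consumed 1 ms timeslice
            double costPerSwitchUs = 5;     // a few microseconds per switch
            double overhead = switchesPerSec * costPerSwitchUs / 1_000_000;
            System.out.printf("timeslice overhead: %.2f%%%n", overhead * 100);
            // prints 0.50% -- a few percent at worst, not a whole-number factor
        }
    }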
Anyway, apologies for going on at such length, but I really feel that the discussion so far hasn't proven anything, and I would be pleased to hear from anyone with either of the following:
a) A real explanation of why Node should be better (beyond the two scenarios I've outlined above, the first of which (poor tuning) I believe is the real explanation for all the tests I've seen so far). ([edit] Actually, the more I think about it, the more I wonder whether the memory used by huge numbers of stacks might be significant here. The default stack sizes for modern threads tend to be pretty big, whereas the memory allocated by a closure-based event system would be only what's actually needed; see the sketch after this list.)
b) A real benchmark that actually gives the threaded server a fair chance. At least that way, I wouldn't have to go on believing that the claims are essentially false ;> ([edit] that's probably rather stronger than I intended, but I do feel that the explanations given for the performance benefits are incomplete at best, and the benchmarks shown are unreasonable).
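On the stack-memory point in (a), a minimal Java sketch of the contrast; the 64 KB figure and the class name are my own illustrative assumptions, and the stackSize argument is only a hint the JVM may ignore:

    public class StackCost {
        public static void main(String[] args) {
            // Default thread stacks are commonly 512 KB - 1 MB (the JVM's
            // -Xss default), reserved per thread: a few thousand mostly-idle
            // threads can reserve gigabytes of address space. A closure in
            // an event system allocates only the variables it captures.
            Runnable task = () -> { /* handle one request, top to bottom */ };
            // Java does let you request a smaller stack explicitly, though
            // the value is a hint and may be ignored on some platforms:
            Thread worker = new Thread(null, task, "worker", 64 * 1024);
            worker.start();
        }
    }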
Cheers,
Toby