In general, how does Node.js handle 10,000 concurrent requests?

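In node.js, requests should be I/O-bound, not CPU-bound. That means no single request should force node.js to do a lot of computation. If resolving a request involves heavy computation, node.js is not a good choice. I/O-bound work needs very little computation: most of a request's time is spent waiting on calls to a DB or another service.

Node.js has a single-threaded event loop, but that loop is only the chef. Behind the scenes, most of the work is done by the operating system, and libuv handles the communication with the OS. From the libuv docs:

In event-driven programming, an application expresses interest in certain events and respond to them when they occur. The responsibility of gathering events from the operating system or monitoring other sources of events is handled by libuv, and the user can register callbacks to be invoked when an event occurs.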

Incoming requests are handled by the operating system. This is true of almost all servers based on the request-response model. Incoming network calls are queued in the OS's non-blocking I/O queue. The event loop constantly polls that queue, which is how it learns about incoming client requests. "Polling" means checking the status of some resource at regular intervals. If there is an incoming request, the event loop takes it and executes it synchronously. If an async call (e.g. setTimeout) is made during execution, its callback is put into the callback queue. After the event loop finishes the synchronous calls, it polls the callbacks, and if it finds one that is ready to run, it executes it. Then it polls for incoming requests again. If you check the node.js docs there is this image:

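From the phases overview in the docs:

poll: retrieve new I/O events; execute I/O related callbacks (almost all with the exception of close callbacks, the ones scheduled by timers, and setImmediate()); node will block here when appropriate.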
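The phase ordering can be observed with a small sketch. It assumes the ordering described in the Node.js event loop guide: inside an I/O callback, a setImmediate() callback (check phase) runs before a zero-delay timer (timers phase of the next iteration). The function name phaseOrder and the use of fs.stat on the current directory are illustrative choices, not part of the original answer.

```javascript
const fs = require('fs');

// Resolves with the order in which the two callbacks ran.
function phaseOrder() {
  return new Promise((resolve) => {
    const order = [];
    // The fs.stat callback is an I/O callback, so it runs in the poll phase.
    fs.stat('.', () => {
      // Timers phase: only reached on the next loop iteration.
      setTimeout(() => {
        order.push('timeout');
        resolve(order);
      }, 0);
      // Check phase: comes right after poll in the same iteration.
      setImmediate(() => order.push('immediate'));
    });
  });
}

phaseOrder().then((order) => console.log(order)); // [ 'immediate', 'timeout' ]
```

Scheduled from the main module instead of from an I/O callback, the same two calls have no guaranteed order, which is why the sketch schedules them inside fs.stat.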

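The event loop constantly polls different queues. If a request needs an external call or disk access, that work is handed to the OS, which also has 2 different queues. As soon as the event loop detects that something has to be done asynchronously, it puts it into a queue. Once it is queued, the event loop moves on to the next task.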
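A minimal sketch of that hand-off, with a zero-delay timer standing in for any asynchronous call. The function name runOrder and the labels are made up for illustration.

```javascript
// Resolves with the order in which the steps executed.
function runOrder() {
  return new Promise((resolve) => {
    const order = [];
    order.push('request received');
    // The callback is queued; the event loop does not wait for it here.
    setTimeout(() => {
      order.push('async callback');
      resolve(order);
    }, 0);
    // Runs before the callback, even though it appears later in the source.
    order.push('synchronous work done');
  });
}

runOrder().then((order) => console.log(order));
// [ 'request received', 'synchronous work done', 'async callback' ]
```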

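One thing to mention here is that the event loop runs continuously. Only the operating system's scheduler can take this thread off the CPU; the event loop itself will not do so.

From the docs: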

The secret to the scalability of Node.js is that it uses a small number of threads to handle many clients. If Node.js can make do with fewer threads, then it can spend more of your system's time and memory working on clients rather than on paying space and time overheads for threads (memory, context-switching). But because Node.js has only a few threads, you must structure your application to use them wisely. Here's a good rule of thumb for keeping your Node.js server speedy: Node.js is fast when the work associated with each client at any given time is "small".

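Note that "small" tasks means I/O-bound tasks, not CPU-bound ones. A single event loop can handle the client load only if the work for each request is mostly I/O.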
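To see why CPU-bound work is the problem, here is a rough sketch in which a busy loop stands in for heavy computation. busyWait and measureTimerDelay are hypothetical helpers; the point is only that a 10 ms timer cannot fire until the synchronous loop has finished.

```javascript
// Burns CPU for roughly `ms` milliseconds, like a heavy computation would.
function busyWait(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // spin: nothing else can run on this thread meanwhile
  }
}

// Schedules a 10 ms timer, blocks the thread, and reports the real delay.
function measureTimerDelay(blockMs) {
  return new Promise((resolve) => {
    const scheduled = Date.now();
    setTimeout(() => resolve(Date.now() - scheduled), 10);
    busyWait(blockMs);
  });
}

measureTimerDelay(200).then((delay) => {
  console.log(`10 ms timer fired after ${delay} ms`);
});
```

Every other client served by the same event loop would see the same stall, which is what the rule of thumb above is warning about.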

A context switch basically means the CPU has no free resources, so it must stop executing one process to let another one run. The OS first evicts process1: it takes the process off the CPU and saves its state in main memory. Next, the OS restores process2 by loading its process control block from memory and putting it on the CPU. Then process2 starts executing. Between process1 stopping and process2 starting, some time is lost. A large number of threads can cause a heavily loaded system to spend precious cycles on thread scheduling and context switching, which adds latency and limits scalability and throughput.

2022-09-27 00:57:20

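If you have to ask this question, you are probably unfamiliar with what most web applications/services do. You probably think that all software does this: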

user does an action
       │
       v
 application starts processing action
   └──> loop ...
          └──> busy processing
 end loop
   └──> send result to user

However, this is not how web applications work, nor indeed any application with a database as its back end. Web apps do this:

user does an action
       │
       v
 application starts processing action
   └──> make database request
          └──> do nothing until request completes
 request complete
   └──> send result to user

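In this scenario, the software spends most of its running time using 0% CPU, waiting for the database to return.

Multithreaded network app:

Multithreaded network apps handle the above workload like this: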

request ──> spawn thread
              └──> wait for database request
                     └──> answer request
request ──> spawn thread
              └──> wait for database request
                     └──> answer request
request ──> spawn thread
              └──> wait for database request
                     └──> answer request

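So the threads spend most of their time using 0% CPU, waiting for the database to return data. While doing so, they have had to allocate the memory a thread requires, which includes a completely separate program stack for each thread, and so on. They also have to start a thread, which is not as expensive as starting a full process but is still not exactly cheap.

Single-threaded event loop

Since we spend most of our time using 0% CPU, why not run some code while we are not using the CPU? That way, each request still gets the same amount of CPU time as in a multithreaded application, but we don't need to start a thread. So we do this: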

request ──> make database request
request ──> make database request
request ──> make database request
database request complete ──> send response
database request complete ──> send response
database request complete ──> send response

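In practice, both approaches return data with roughly the same latency, since it is the database response time that dominates the processing.

The main advantage here is that we don't need to spawn a new thread, so we don't need to do lots and lots of malloc, which would slow us down.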
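The diagram above can be approximated in a few lines. This is only a sketch: fakeDbQuery is a hypothetical stand-in that uses a timer instead of a real database, and handleMany is an illustrative name. With 10,000 simulated requests each waiting 100 ms, the total elapsed time should stay close to a single wait rather than the sum of all waits, because the waiting overlaps.

```javascript
// Hypothetical stand-in for a database call: resolves after `ms` milliseconds.
function fakeDbQuery(id, ms) {
  return new Promise((resolve) => setTimeout(() => resolve(`row ${id}`), ms));
}

// Starts `count` queries without waiting in between, then awaits them all.
async function handleMany(count, ms) {
  const started = Date.now();
  const pending = [];
  for (let i = 0; i < count; i++) {
    pending.push(fakeDbQuery(i, ms)); // request ──> make database request
  }
  const results = await Promise.all(pending); // database request complete
  return { handled: results.length, elapsedMs: Date.now() - started };
}

handleMany(10000, 100).then(({ handled, elapsedMs }) => {
  console.log(`${handled} requests handled in ${elapsedMs} ms on one thread`);
});
```

Awaiting each query inside the loop instead would serialize the waits and take roughly count × ms.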

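Magic, invisible threading

The seemingly mysterious thing is how both approaches above manage to run the workload in "parallel". The answer is that the database is threaded. So our single-threaded app is actually leveraging the multithreaded behaviour of another process: the database.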

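Where the single-threaded approach fails

A single-threaded app fails badly if you need to do a lot of CPU calculation before returning the data. I don't mean a for loop processing the database result; that is still mostly O(n). I mean things like doing a Fourier transform (MP3 encoding, for example), ray tracing (3D rendering), and so on.

Another pitfall of single-threaded apps is that they use only a single CPU core. So if you have a quad-core server (not uncommon nowadays), you are not using the other 3 cores.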

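Where the multithreaded approach fails

A multithreaded app fails badly if you need to allocate a lot of RAM per thread. First, the RAM usage itself means you can't handle as many requests as a single-threaded app. Worse, malloc is slow. Allocating lots and lots of objects (which is common in modern web frameworks) means we can end up slower than a single-threaded app. This is where node.js usually wins.

One use case that makes multithreading even worse is when you need to run another scripting language in your thread. First you usually need to malloc the entire runtime for that language, then you need to malloc the variables used by your script.

So if you are writing network apps in C or Go or Java, the overhead of threading will usually not be too bad. If you are writing a C web server to serve PHP or Ruby, then it is very easy to write a faster server in JavaScript or Ruby or Python.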

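Hybrid approach

Some web servers use a hybrid approach. Nginx and Apache2, for example, implement their network processing code as a thread pool of event loops. Each thread runs an event loop that processes requests single-threaded, but requests are load-balanced among multiple threads.

Some single-threaded architectures also use a hybrid approach. Instead of launching multiple threads from a single process, you can launch multiple applications, for example 4 node.js servers on a quad-core machine. Then you use a load balancer to spread the workload among the processes. The cluster module in node.js does exactly this.

In effect, the two approaches are technically identical mirror images of each other.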
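For the cluster module mentioned above, a minimal sketch might look like the following. The port, the workerCount and start helpers, and the START_CLUSTER environment variable are all assumptions made for this example; only cluster.fork(), cluster.isPrimary (older name: cluster.isMaster), and http.createServer() come from Node.js itself.

```javascript
const cluster = require('cluster');
const http = require('http');
const os = require('os');

// One worker per CPU core, and never fewer than one.
function workerCount(cores = os.cpus().length) {
  return Math.max(1, cores);
}

function start(port = 3000) {
  // isPrimary is the current name; isMaster is the older alias.
  const isPrimary = cluster.isPrimary ?? cluster.isMaster;
  if (isPrimary) {
    // The primary process only forks workers; it serves no requests itself.
    for (let i = 0; i < workerCount(); i++) cluster.fork();
    cluster.on('exit', (worker) => {
      console.log(`worker ${worker.process.pid} exited`);
    });
  } else {
    // Each worker runs its own event loop; they all share the listening port.
    http.createServer((req, res) => {
      res.end(`handled by worker ${process.pid}\n`);
    }).listen(port);
  }
}

// Hypothetical opt-in switch so that merely loading the file starts nothing.
// Forked workers inherit the variable and therefore take the worker branch.
if (process.env.START_CLUSTER === '1') start();
```

Saved as, say, server.js, it could be launched with START_CLUSTER=1 node server.js, and repeated requests to the port should then be answered by different worker PIDs.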

2016-01-18 14:37:56

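Here is a good explanation from this Medium article:

Given a NodeJS application, since Node is single-threaded, say processing involves a Promise.all that takes 8 seconds. Does this mean that a client request arriving after this one has to wait 8 seconds? No. The NodeJS event loop is single-threaded. The entire server architecture of NodeJS is not single-threaded.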

Before getting into the Node server architecture, take a look at the typical multithreaded request/response model. The web server has multiple threads. When concurrent requests reach the web server, it picks threadOne from the thread pool; threadOne processes requestOne and responds to clientOne. When the second request comes in, the web server picks a second thread from the pool, which processes requestTwo and responds to clientTwo. threadOne is responsible for every operation that requestOne demands, including any blocking I/O operations.

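The fact that a thread has to wait on blocking I/O operations is what makes this model inefficient. With this model, the web server can serve only as many requests as there are threads in the thread pool.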

The NodeJS web server maintains a limited thread pool to serve client requests. Multiple clients make multiple requests to the NodeJS server. NodeJS receives these requests and places them into the EventQueue. The NodeJS server has an internal component called the EventLoop, an infinite loop that receives requests and processes them. The EventLoop is single-threaded. In other words, the EventLoop is the listener for the EventQueue. So we have an event queue where requests are placed, and an event loop listening for requests in that queue. What happens next? The listener (the event loop) processes the request. If it can do so without any blocking I/O operations, the event loop processes the request itself and sends the response back to the client. If the current request uses blocking I/O operations, the event loop checks whether a thread is available in the thread pool, picks one, and assigns the request to it. That thread performs the blocking I/O and sends the result back to the event loop, and once the result reaches the event loop, the event loop sends the response back to the client.
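A small sketch of the thread pool hand-off described above. It uses crypto.pbkdf2, one of the Node.js APIs that runs on the libuv thread pool; the password, salt, iteration count, and the hashInBackground name are placeholders chosen for illustration. Note that in Node.js it is file system, DNS lookup, and crypto calls of this kind that use the pool, while network sockets rely on the OS's non-blocking facilities.

```javascript
const crypto = require('crypto');

// Resolves with the order of events around one thread pool job.
function hashInBackground() {
  return new Promise((resolve, reject) => {
    const order = [];
    // The key derivation runs on a libuv worker thread, not on the event loop.
    crypto.pbkdf2('secret', 'salt', 100000, 64, 'sha512', (err, key) => {
      if (err) return reject(err);
      order.push(`hash done (${key.length} bytes)`);
      resolve(order);
    });
    // Reached immediately: the event loop is free to serve other clients.
    order.push('event loop is free again');
  });
}

hashInBackground().then((order) => console.log(order));
// [ 'event loop is free again', 'hash done (64 bytes)' ]
```

The synchronous variant, crypto.pbkdf2Sync, would do the same work on the event loop thread and stall every other request until it finished.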

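How is NodeJS better than the traditional multithreaded request/response model? In the traditional model, every client gets a different thread, whereas in NodeJS the simpler requests are all handled directly by the EventLoop. This is an optimization of thread pool resources, and there is no overhead of creating a thread for every client request.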

2021-05-19 14:31:40

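You might think that most processing is handled in the Node event loop. Node actually farms I/O work out to threads. I/O operations typically take orders of magnitude longer than CPU operations, so why make the CPU wait for them? Besides, the OS can already handle I/O tasks very well. In fact, because Node does not wait around, it achieves much higher CPU utilisation.

By way of analogy, think of NodeJS as one waiter taking customer orders while the I/O chefs prepare them in the kitchen. Other systems have multiple chefs, each of whom takes a customer's order, prepares the meal, and clears the table, and only then attends to the next customer.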

2016-01-18 13:51:59

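The blocking part of a multithreaded blocking system is what makes it less efficient. A blocked thread cannot be used for anything else while it waits for a response.

A non-blocking single-threaded system, by contrast, makes the best use of its single thread.

See the diagram below: here, waiting at the kitchen door, or waiting while the customer chooses food, "blocks" the waiter's full capacity. In computing terms, this could be waiting for I/O, a DB response, or anything else that blocks the whole thread, even though the thread could be doing other work while it waits.

Let's see how non-blocking works:

In a non-blocking system, the waiter only takes orders and serves food, and never waits anywhere. He shares his mobile number with the customers so they can call him back once they have decided on their order. Likewise, he shares his number with the kitchen so it can call him back when an order is ready to serve.

This is how the event loop works in NodeJS, and it performs better than a blocking multithreaded system.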

2021-10-13 15:01:20

