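The word "deployment" can have two meanings depending on the context. You are also confusing the roles of Apache/Nginx with the roles of other components.
Historical note: This article was originally written on November 6, 2010, when the Ruby app server ecosystem was limited. I updated it on March 15, 2013 with all the latest changes in the ecosystem.
Disclaimer: I am one of the authors of Phusion Passenger, one of the app servers.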
Apache vs Nginx
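They're both web servers. They can serve static files but, with the right modules, can also serve dynamic web apps, e.g. those written in PHP. Apache is more popular and has more features; Nginx is smaller and faster and has fewer features.
Neither Apache nor Nginx can serve Ruby web apps out of the box. To do that you need to use Apache/Nginx in combination with some kind of add-on, described later.
Apache and Nginx can also act as reverse proxies, meaning that they can take an incoming HTTP request and forward it to another server, which also speaks HTTP. When that server responds with an HTTP response, Apache/Nginx will forward the response back to the client. You will learn later why this is relevant.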
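Mongrel and other production app servers vs WEBrick
Mongrel is a Ruby "application server". In concrete terms this means that Mongrel is an application which:
Loads your Ruby app inside its own process space.
Sets up a TCP socket, allowing it to communicate with the outside world (e.g. the Internet).
Mongrel listens for HTTP requests on this socket and passes the request data to the Ruby web app.
The Ruby web app then returns an object which describes what the HTTP response should look like, and Mongrel takes care of converting it to an actual HTTP response (the actual bytes) and sending it back over the socket.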
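The steps above can be sketched in plain Ruby. This is a minimal illustration of the idea, not how Mongrel is actually implemented: the names HELLO_APP, build_response and serve_once are invented for this example, the app is assumed to return a Rack-style triple of status, headers and body parts, and the request parsing is deliberately naive.

```ruby
require "socket"

REASONS = { 200 => "OK", 404 => "Not Found", 500 => "Internal Server Error" }.freeze

# A hypothetical Ruby web app: it receives request data and returns an
# object describing the response (status, headers, body parts).
HELLO_APP = lambda do |env|
  body = "Hello from #{env['PATH_INFO']}\n"
  [200, { "Content-Type" => "text/plain" }, [body]]
end

# Turn the app's return value into the actual bytes of an HTTP response.
def build_response(status, headers, body_parts)
  body = body_parts.join
  all_headers = headers.merge("Content-Length" => body.bytesize.to_s,
                              "Connection" => "close")
  lines = ["HTTP/1.1 #{status} #{REASONS.fetch(status, 'Unknown')}"]
  all_headers.each { |name, value| lines << "#{name}: #{value}" }
  lines.join("\r\n") + "\r\n\r\n" + body
end

# Accept one connection on the listening socket, pass the request data to
# the app, and send the response back over the same socket.
def serve_once(listener, app)
  client = listener.accept
  request_line = client.gets.to_s # e.g. "GET /foo HTTP/1.1"
  method, path, _version = request_line.split(" ")
  # Skip the request headers; a real server would parse them.
  loop do
    line = client.gets
    break if line.nil? || line == "\r\n"
  end
  status, headers, body = app.call("REQUEST_METHOD" => method, "PATH_INFO" => path)
  client.write(build_response(status, headers, body))
ensure
  client&.close
end
```

To try it, create a listener with TCPServer.new("127.0.0.1", 3000) (the port is arbitrary), call serve_once(listener, HELLO_APP) in a loop, and request http://127.0.0.1:3000/ with a browser or curl.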
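However, Mongrel is quite dated and is no longer maintained. Newer alternative application servers are:
Phusion Passenger
Unicorn
Thin
Puma
Trinidad (JRuby only)
TorqueBox (JRuby only)
I'll cover them later and describe how they differ from each other and from Mongrel.
WEBrick does the same thing as Mongrel, but the differences are: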
WEBrick is not fit for production, unlike everything else that I mentioned before. WEBrick is written entirely in Ruby. Mongrel (and most other Ruby app servers) is part Ruby and part C (Mostly Ruby), but its HTTP parser is written in C for performance.
WEBrick is slower and less robust. It has some known memory leaks and some known HTTP parsing problems.
WEBrick is usually only used as the default server during development because WEBrick is included in Ruby by default. Mongrel and other app servers need to be installed separately. It's not recommended to use WEBrick in production environments, though for some reason Heroku chose WEBrick as its default server. They were using Thin before, so I have no idea why they switched to WEBrick.
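App servers and the world
All current Ruby app servers speak HTTP. However, some app servers may be directly exposed to the Internet on port 80, while others may not.
App servers that can be directly exposed to the Internet: Phusion Passenger, Rainbows
App servers that may not be directly exposed to the Internet: Mongrel, Unicorn, Thin, Puma. These app servers must be put behind a reverse proxy web server like Apache or Nginx.
I don't know enough about Trinidad and TorqueBox, so I've omitted them.
Why must some app servers be put behind a reverse proxy?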
Some app servers can only handle 1 request concurrently, per process. If you want to handle 2 requests concurrently you need to run multiple app server instances, each serving the same Ruby app. This set of app server processes is called an app server cluster (hence the names Mongrel Cluster, Thin Cluster, etc). You must then set up Apache or Nginx to reverse proxy to this cluster. Apache/Nginx will take care of distributing requests between the instances in the cluster (more on this in section "I/O concurrency models").
The web server can buffer requests and responses, protecting the app server from "slow clients" - HTTP clients that don't send or accept data very quickly. You don't want your app server to do nothing while waiting for the client to send the full request or to receive the full response, because during that time the app server may not be able to do anything else. Apache and Nginx are very good at doing many things at the same time because they're either multithreaded or evented.
Most app servers can serve static files, but are not particularly good at it. Apache and Nginx can do it faster.
People typically set up Apache/Nginx to serve static files directly, and to forward only the requests that don't correspond to static files to the app server. This is good security practice: Apache and Nginx are very mature and can shield the app server from (perhaps maliciously) corrupted requests.
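The first point, distributing requests between the instances in a cluster, can be illustrated with a small round-robin picker. This is only a sketch of the idea, not how Apache or Nginx implement load balancing; the class name RoundRobinCluster and the backend addresses are hypothetical.

```ruby
# Hypothetical illustration of how a reverse proxy might spread requests
# over a cluster of app server processes that each handle 1 request at a time.
class RoundRobinCluster
  def initialize(backends)
    raise ArgumentError, "need at least one backend" if backends.empty?

    @backends = backends
    @next_index = 0
    @lock = Mutex.new
  end

  # Return the backend that should receive the next request.
  def next_backend
    @lock.synchronize do
      backend = @backends[@next_index]
      @next_index = (@next_index + 1) % @backends.size
      backend
    end
  end
end

# Three app server processes, each listening on its own (made-up) port.
cluster = RoundRobinCluster.new(["127.0.0.1:3000", "127.0.0.1:3001", "127.0.0.1:3002"])
4.times { puts cluster.next_backend } # cycles through the list and wraps around
```

Round-robin is just the simplest policy to show; a real proxy may also take into account which backends are busy or down.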
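Why can some app servers be directly exposed to the Internet?
Phusion Passenger is a very different beast from all the other app servers. One of its unique features is that it integrates into the web server.
The author of Rainbows has publicly stated that it's safe to expose it directly to the Internet. The author is fairly sure that there are no vulnerabilities in the HTTP parser (and similar). Still, the author provides no warranty and says that usage is at your own risk.
App servers compared
In this section I'll compare most of the app servers I've mentioned, but not Phusion Passenger. Phusion Passenger is such a different beast from the rest that I've given it a dedicated section. I've also omitted Trinidad and TorqueBox because I don't know them well enough, but they're only relevant anyway if you use JRuby.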
Mongrel was pretty bare bones. As mentioned earlier, Mongrel is purely single-threaded multi-process, so it is only useful in a cluster. There is no process monitoring: if a process in the cluster crashes (e.g. because of a bug in the app) then it needs to be manually restarted. People tend to use external process monitoring tools such as Monit and God.
Unicorn is a fork of Mongrel. It supports limited process monitoring: if a process crashes it is automatically restarted by the master process. It can make all processes listen on a single shared socket, instead of a separate socket for each process. This simplifies reverse proxy configuration. Like Mongrel, it is purely single-threaded multi-process.
Thin uses the evented I/O model by utilizing the EventMachine library. Other than using the Mongrel HTTP parser, it is not based on Mongrel in any way. Its cluster mode has no process monitoring so you need to monitor crashes etc. There is no Unicorn-like shared socket, so each process listens on its own socket. In theory, Thin's I/O model allows high concurrency, but in most practical situations that Thin is used for, one Thin process can only handle 1 concurrent request, so you still need a cluster. More about this peculiar property in section "I/O concurrency models".
Puma was also forked from Mongrel, but unlike Unicorn, Puma is designed to be purely multi-threaded. There is therefore currently no builtin cluster support. You need to take special care to ensure that you can utilize multiple cores (More about this in section "I/O concurrency models").
Rainbows supports multiple concurrency models through the use of different libraries.
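Unicorn's shared socket and limited process monitoring can be pictured as a master process that opens one listening socket, forks workers that all call accept on it, and replaces a worker when it exits. The sketch below shows only that idea, assuming a Unix-like system where fork is available. It is not Unicorn's actual code, and worker_loop, spawn_workers and reap_and_respawn are names invented here.

```ruby
require "socket"

# Body of each worker process: handle one connection at a time, forever.
def worker_loop(listener)
  loop do
    client = listener.accept
    client.write("handled by worker #{Process.pid}\n")
    client.close
  end
end

# Fork `count` workers that all accept on the same inherited socket.
# Returns the list of worker PIDs.
def spawn_workers(listener, count)
  Array.new(count) { fork { worker_loop(listener) } }
end

# Limited process monitoring: block until any worker exits, then fork a
# replacement. Returns the updated list of worker PIDs.
def reap_and_respawn(listener, pids)
  dead_pid = Process.wait
  (pids - [dead_pid]) << fork { worker_loop(listener) }
end
```

Because every worker accepts on the same socket, a reverse proxy only needs to know one address, which is the configuration simplification described above. A master would typically call reap_and_respawn in a loop.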
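Phusion Passenger
Phusion Passenger works very differently from all the other ones. Phusion Passenger integrates directly into Apache or Nginx, and so can be compared to mod_php for Apache. Just like mod_php allows Apache to serve PHP apps, Phusion Passenger allows Apache (and also Nginx!) to serve Ruby apps, almost magically. Phusion Passenger's goal is to make everything Just Work(tm) with as little hassle as possible.
Instead of starting a process or cluster for your app, and configuring Apache/Nginx to serve static files and/or reverse proxy requests to the process/cluster, with Phusion Passenger you only need to:
Edit the web server config file and specify the location of your Ruby app's 'public' directory.
There is no step 2.
All configuration is done within the web server config file. Phusion Passenger automates pretty much everything. There is no need to start a cluster and manage processes. Starting and stopping processes, restarting them when they crash, and so on are all automated. Compared to other app servers, Phusion Passenger has far fewer moving parts. This ease of use is one of the primary reasons why people use Phusion Passenger.
Also unlike other app servers, Phusion Passenger is primarily written in C++, making it very fast.
There's also an Enterprise variant of Phusion Passenger with even more features, such as automated rolling restarts, multithreading support, deployment error resistance, etc.
For the above reasons, Phusion Passenger is currently the most popular Ruby app server, powering over 150,000 websites, including large ones such as the New York Times, Pixar, Airbnb, etc.
Phusion Passenger vs other app servers
Phusion Passenger provides many more features and many advantages over other app servers, such as: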
Dynamically adjusting the number of processes based on traffic. We run a ton of Rails apps on our resource-constrained server that are not public-facing, and that people in our organization only use at most a few times a day. Things like Gitlab, Redmine, etc. Phusion Passenger can spin down those processes when they're not used, and spin them up when they're used, allowing more resources to be available for more important apps. With other app servers, all your processes are turned on all the time.
Some app servers are not good at certain workloads, by design. For example Unicorn is designed for fast-running requests only: See the Unicorn website section "Just Worse in Some Cases".
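Workloads that Unicorn is not good at are:
Streaming workloads (e.g. Rails 4 live streaming or Rails 4 template streaming).
Workloads in which the app performs HTTP API calls.
The hybrid I/O model in Phusion Passenger Enterprise 4 or later makes it an excellent choice for these kinds of workloads.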
Other app servers require the user to run at least one instance per application. By contrast, Phusion Passenger supports multiple applications in a single instance. This greatly reduces administration overhead.
Automatic user switching, a convenient security feature.
Phusion Passenger supports MRI Ruby, JRuby and Rubinius. Mongrel, Unicorn and Thin only support MRI. Puma also supports all 3.
Phusion Passenger actually supports more than just Ruby! It also supports Python WSGI, so it can for example also run Django and Flask apps. In fact Phusion Passenger is moving in the direction of becoming a polyglot server. Node.js support is on the todo list.
Out-of-band garbage collection. Phusion Passenger can run the Ruby garbage collector outside the normal request/response cycle, potentially reducing request times by hundreds of milliseconds. Unicorn also has a similar feature, but Phusion Passenger's version is more flexible because
1) it's not limited to GC and can be used for arbitrary work.
2) Phusion Passenger's version works well with multithreaded apps, while Unicorn's does not.
Automated rolling restarts. Rolling restarts on Unicorn and other servers require some scripting work. Phusion Passenger Enterprise completely automates this for you.
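There are more features and advantages, but the list is really long. You should refer to the comprehensive Phusion Passenger manual (Apache version, Nginx version) or the Phusion Passenger website for information.
I/O concurrency models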
Single-threaded multi-process. This is traditionally the most popular I/O model for Ruby app servers, partially because multithreading support in the Ruby ecosystem was very bad. Each process can handle exactly 1 request at a time. The web server load balances between processes. This model is very robust and there is little chance for the programmer to introduce concurrency bugs. However, its I/O concurrency is extremely limited (limited by the number of processes). This model is very suitable for fast, short-running workloads. It is very unsuitable for slow, long-running blocking I/O workloads, e.g. workloads involving the calling of HTTP APIs.
Purely multi-threaded. Nowadays the Ruby ecosystem has excellent multithreading support, so this I/O model has become very viable. Multithreading allows high I/O concurrency, making it suitable for both short-running and long-running blocking I/O workloads. The programmer is more likely to introduce concurrency bugs, but luckily most web frameworks are designed in such a way that this is still very unlikely. One thing to note however is that the MRI Ruby interpreter cannot leverage multiple CPU cores even when there are multiple threads, due to the use of the Global Interpreter Lock (GIL). You can work around this by using multiple multi-threaded processes, because each process can leverage a CPU core. JRuby and Rubinius have no GIL, so they can fully leverage multiple cores in a single process.
Hybrid multi-threaded multi-process. Primarily implemented by Phusion Passenger Enterprise 4 and later. You can easily switch between single-threaded multi-process, purely multithreaded, or perhaps even multiple processes each with multiple threads. This model gives the best of both worlds.
Evented. This model is completely different from the previously mentioned models. It allows very high I/O concurrency and is therefore excellent for long-running blocking I/O workloads. To utilize it, explicit support from the application and the framework is required. However, all the major frameworks like Rails and Sinatra do not support evented code. This is why in practice a Thin process still cannot handle more than 1 request at a time, making it effectively behave the same as the single-threaded multi-process model. There are specialized frameworks that can take advantage of evented I/O, such as Cramp.
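The difference between handling blocking I/O one request at a time and handling it with threads can be seen in a small timing experiment. This is a sketch under simplifying assumptions: sleep stands in for a slow HTTP API call, and the helper names are hypothetical. On MRI the threads still overlap here, because the GIL is released while a thread is blocked in sleep or I/O.

```ruby
# Simulate a request whose handler blocks on I/O (e.g. a slow HTTP API call).
def handle_blocking_request(seconds)
  sleep(seconds)
  :done
end

# One process, one thread: requests are handled strictly one after another.
# Returns the elapsed wall-clock time in seconds.
def serve_sequentially(request_count, seconds)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  request_count.times { handle_blocking_request(seconds) }
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

# One process, one thread per request: the blocking waits overlap.
# Returns the elapsed wall-clock time in seconds.
def serve_with_threads(request_count, seconds)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  threads = Array.new(request_count) do
    Thread.new { handle_blocking_request(seconds) }
  end
  threads.each(&:join)
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end
```

Calling serve_sequentially(4, 0.1) should take roughly four times as long as serve_with_threads(4, 0.1); exact timings depend on the machine. For CPU-bound work on MRI the threaded version would not show this speedup, which is the GIL limitation described above.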
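An article was recently posted on the Phusion blog about optimally tuning the number of processes and threads for a given workload. See Tuning Phusion Passenger's concurrency settings.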
Capistrano
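Capistrano is something completely different. In all the previous sections, "deployment" refers to the act of starting your Ruby app in an application server so that it becomes accessible to visitors. But before that can happen, one typically needs to do some preparation work, such as:
Uploading the Ruby app's code and files to the server machine.
Installing libraries that your app depends on.
Setting up or migrating the database.
Starting and stopping any daemons that your app might rely on, such as Sidekiq/Resque workers or whatever.
Any other things that need to be done when you're setting up your application.
In the context of Capistrano, "deployment" refers to doing all this preparation work. Capistrano is not an application server. Instead, it is a tool for automating all that preparation work. You tell Capistrano where your server is and which commands need to be run every time you deploy a new version of your app, and Capistrano will take care of uploading the Rails app to the server for you and running the commands you specified.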
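To make the preparation work concrete, here is a plain-Ruby sketch of the kind of command list that such a tool automates. It deliberately does not use Capistrano's own DSL. The function name, the server address, the directory layout and the specific shell commands are all assumptions chosen for illustration, and nothing is actually executed.

```ruby
# Build (but do not run) the shell commands for one deployment.
# Every path and command here is an illustrative assumption.
def deployment_commands(server:, app_dir:, release:)
  release_dir = "#{app_dir}/releases/#{release}"
  [
    # Upload the app's code and files to the server machine.
    "rsync -az ./ #{server}:#{release_dir}/",
    # Install the libraries the app depends on.
    "ssh #{server} 'cd #{release_dir} && bundle install'",
    # Set up or migrate the database.
    "ssh #{server} 'cd #{release_dir} && bundle exec rake db:migrate'",
    # Make the new release the one the app server picks up.
    "ssh #{server} 'ln -sfn #{release_dir} #{app_dir}/current'"
  ]
end

commands = deployment_commands(server: "deploy@example.com",
                               app_dir: "/var/www/myapp",
                               release: "20130315")
commands.each { |command| puts command }
```

Running the same list by hand on every release is exactly the repetitive work that a deployment tool takes over.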
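Capistrano is always used in combination with an application server. It does not replace application servers. Vice versa, application servers do not replace Capistrano; they can be used in combination with Capistrano.
Of course you don't have to use Capistrano. If you prefer to upload your Ruby app with FTP and manually run the same sequence of commands every time, then you can do that. Other people have gotten tired of it, so they automate those steps with Capistrano.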