我曾被要求评估RabbitMQ而不是Kafka,但发现很难找到一个消息队列比Kafka更适合的情况。有人知道在哪些用例中消息队列在吞吐量、持久性、延迟或易用性方面更适合吗?
当前回答
我知道有点晚了,也许你已经间接地说过了,但是,Kafka根本不是一个队列,它是一个日志(就像上面有人说的,基于民意调查)。
简单来说,当你更喜欢RabbitMQ(或任何队列技术)而不是Kafka时,最明显的用例是:
You have multiple consumers consuming from a queue and whenever there is a new message in the queue and an available consumer, you want this message to be processed. If you look closely at how Kafka works, you'll notice it does not know how to do that, because of partition scaling, you'll have a consumer dedicated to a partition and you'll get into starvation issue. Issue that is easily avoided by using simple queue techno. You can think of using a thread that will dispatch the different messages from same partition, but again, Kafka does not have any selective acknowledgment mechanisms.
你能做的最多的就是像那些家伙一样,试着把Kafka转换成一个队列: https://github.com/softwaremill/kmq
雅尼克
其他回答
我每周都听到这个问题。RabbitMQ(类似于IBM MQ或JMS或其他消息传递解决方案)用于传统消息传递,Apache Kafka用作流媒体平台(消息传递+分布式存储+数据处理)。两者都是为不同的用例构建的。
你可以在“传统消息传递”中使用Kafka,但不能在Kafka特定的场景中使用MQ。
文章“Apache Kafka vs.企业服务总线——朋友、敌人还是亦敌亦友?”(https://www.confluent.io/blog/apache-kafka-vs-enterprise-service-bus-esb-friends-enemies-or-frenemies/)讨论了为什么Kafka对集成和消息解决方案(包括RabbitMQ)不是竞争的,而是互补的,以及如何将两者集成。
Apache Kafka is a popular choice for powering data pipelines. Apache kafka added kafka stream to support popular etl use cases. KSQL makes it simple to transform data within the pipeline, readying messages to cleanly land in another system. KSQL is the streaming SQL engine for Apache Kafka. It provides an easy-to-use yet powerful interactive SQL interface for stream processing on Kafka, without the need to write code in a programming language such as Java or Python. KSQL is scalable, elastic, fault-tolerant, and real-time. It supports a wide range of streaming operations, including data filtering, transformations, aggregations, joins, windowing, and sessionization.
https://docs.confluent.io/current/ksql/docs/index.html
对于etl系统来说,Rabbitmq并不是一个受欢迎的选择,它更适合那些需要简单的消息传递系统和更低吞吐量的系统。
Scaling both is hard in a distributed fault tolerant way but I'd make a case that it's much harder at massive scale with RabbitMQ. It's not trivial to understand Shovel, Federation, Mirrored Msg Queues, ACK, Mem issues, Fault tollerance etc. Not to say you won't also have specific issues with Zookeeper etc on Kafka but there are less moving parts to manage. That said, you get a Polyglot exchange with RMQ which you don't with Kafka. If you want streaming, use Kafka. If you want simple IoT or similar high volume packet delivery, use Kafka. It's about smart consumers. If you want msg flexibility and higher reliability with higher costs and possibly some complexity, use RMQ.
我知道有点晚了,也许你已经间接地说过了,但是,Kafka根本不是一个队列,它是一个日志(就像上面有人说的,基于民意调查)。
简单来说,当你更喜欢RabbitMQ(或任何队列技术)而不是Kafka时,最明显的用例是:
You have multiple consumers consuming from a queue and whenever there is a new message in the queue and an available consumer, you want this message to be processed. If you look closely at how Kafka works, you'll notice it does not know how to do that, because of partition scaling, you'll have a consumer dedicated to a partition and you'll get into starvation issue. Issue that is easily avoided by using simple queue techno. You can think of using a thread that will dispatch the different messages from same partition, but again, Kafka does not have any selective acknowledgment mechanisms.
你能做的最多的就是像那些家伙一样,试着把Kafka转换成一个队列: https://github.com/softwaremill/kmq
雅尼克
简短的回答是“消息确认”。RabbitMQ可以配置为需要消息确认。如果接收方失败,消息将返回队列,另一个接收方可以再次尝试。虽然你可以用自己的代码在Kafka中完成这个任务,但它可以在RabbitMQ中开箱即用。
根据我的经验,如果你有一个需要查询信息流的应用程序,Kafka和KSql是你最好的选择。如果你想要一个排队系统,你最好使用RabbitMQ。