If you know ZooKeeper, how would you describe it to a layman?
I have tried the Apache wiki and the ZooKeeper SourceForge page... but I still can't relate to it.
Paper: https://pdos.csail.mit.edu/6.824/papers/zookeeper.pdf
Lecture from MIT 6.824, starting at 36:00: https://youtu.be/pbmyrNjzdDk?t=2198
I understand ZooKeeper in general but had trouble with the terms "quorum" and "split brain", so maybe I can share my findings with you (I consider myself a layman too).
Take a ZooKeeper cluster of 5 servers. These 5 servers vote on who should be the leader, and a quorum is the number of them that has to agree before a decision counts. The voting is based on majority. Majority simply means "more than half", so more than half of the servers must agree for a specific server to become the leader.
There is a bad thing that may happen called "split brain". As far as I understand, a split brain is simply this: the cluster of 5 servers splits into two parts, let's call them "server teams", with maybe 2 servers in one part and 3 in the other. This is a really bad situation: if both server teams must execute a specific order, how would you decide which team should be preferred? They might have received different information from the clients. So it is really important to know which server team is still relevant and which one can/should be ignored.
Majority is also the reason you should use an odd number of servers. If you have 4 servers and a split brain where 2 servers separate, both server teams could say "hey, we want to decide who is the leader!", but neither team has more than half of the 4 servers, so how should you decide which 2 to choose? With 5 servers it's simple: the server team with 3 servers has the majority and is allowed to select the new leader. Even if you have just 3 servers and one of them fails, the other 2 still form the majority and can agree that one of them will become the new leader.
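The majority rule above is easy to check with a few lines of arithmetic. This is only a sketch of the counting, not ZooKeeper code; the function names are made up for illustration.

```python
def majority(ensemble_size):
    """Smallest number of servers that is more than half of the ensemble."""
    return ensemble_size // 2 + 1


def has_quorum(team_size, ensemble_size):
    """True if a 'server team' is big enough to elect a leader."""
    return team_size >= majority(ensemble_size)


def tolerated_failures(ensemble_size):
    """How many servers can be lost while a majority still remains."""
    return ensemble_size - majority(ensemble_size)


# 5 servers split 3/2: only the team of 3 may elect a leader.
print(has_quorum(3, 5), has_quorum(2, 5))
# 4 servers split 2/2: neither team has more than half.
print(has_quorum(2, 4))
```

Note that `tolerated_failures(4)` and `tolerated_failures(3)` are both 1, so a fourth server adds no extra fault tolerance over three. That is the arithmetic behind the odd-number advice.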
ZooKeeper also provides a very easy-to-use API. The blog post "Zookeeper Java API examples" has some examples if you are looking for them.
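Apache ZooKeeper is an open-source technology for coordinating and managing configuration in distributed applications. It simplifies tasks such as maintaining configuration details, enabling distributed synchronization, and managing naming registries.
Apache ZooKeeper can be used with Apache projects such as Apache Pinot or Apache Flink. Apache Kafka has also used ZooKeeper to manage broker, topic, and partition information. Because Apache ZooKeeper is open source, you can pair it with any technology or project you choose, not just Apache Foundation projects.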
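Let's imagine there are a million files in a file store, and the number of files grows every minute. Our task is to process these files and then delete them. One approach we can think of is to write a script that does this and run multiple instances of it in parallel on multiple servers. We can even scale the number of servers up or down based on demand. This is basically a distributed computing/data processing application.
Here, how do we make sure the same file is not picked up and processed by multiple servers at the same time? To solve this, all the servers should share the information about which file is currently being processed.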
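The example above is a rough one and needs some other guardrails, but I hope it gives you an idea of what ZooKeeper is. ZK is basically a data store that can be accessed through the ZK API. But it should not be used as a database. Only small amounts of data should be stored (usually measured in KB). The upper limit is 1 MB per znode. ZK is purpose-built so that distributed applications can communicate with each other.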
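To make the file-processing example a bit more concrete, here is a toy sketch in Python. `ToyCoordinator` is a hypothetical in-memory stand-in for the shared store, not the ZooKeeper API; it only shows the idea that a claim on a file must be atomic so that exactly one worker wins. With real ZooKeeper, I assume one would get a similar effect by having each worker try to create a znode per file, since creating a node that already exists fails.

```python
import threading


class ToyCoordinator:
    """In-memory stand-in for a shared coordination store (NOT the ZooKeeper API)."""

    def __init__(self):
        self._claims = {}              # file path -> worker that claimed it
        self._lock = threading.Lock()

    def try_claim(self, path, worker):
        """Atomically claim a file; only the first caller for a path succeeds."""
        with self._lock:
            if path in self._claims:
                return False
            self._claims[path] = worker
            return True


def run_workers(files, worker_names):
    """Run one thread per worker; return which files each worker processed."""
    coordinator = ToyCoordinator()
    processed = {name: [] for name in worker_names}

    def work(name):
        for path in files:
            if coordinator.try_claim(path, name):
                # "Process, then delete the file" would happen here.
                processed[name].append(path)

    threads = [threading.Thread(target=work, args=(n,)) for n in worker_names]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return processed


result = run_workers(["file-%d" % i for i in range(20)], ["w1", "w2", "w3"])
handled = [path for paths in result.values() for path in paths]
print(len(handled), len(set(handled)))  # every file handled by exactly one worker
```

How the files are divided between workers varies from run to run, but no file is ever claimed twice. One of the guardrails this sketch leaves out is what happens when a worker crashes after claiming a file.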
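Configuration storage: store configuration that is accessed across your distributed application.
Naming service: store information such as service name to IP address mappings in a central place, so that users and applications can communicate across the network.
Group membership: all applications running on the distributed servers can connect to ZK and send heartbeats. If any server/application goes down, ZK can notify the other servers/applications about this event.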
Other features have to be built on top of the ZooKeeper API.
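Locks and queues: useful for distributed synchronization.
Two-phase commit: useful when we have to commit/rollback across servers.
Leader election: your distributed application can use ZK to hold leader elections for automatic failover.
Shared counters.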
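The following page explains how these features can be implemented: https://zookeeper.apache.org/doc/current/recipes.html
ZooKeeper can have many more applications. These features have to be built on top of the ZK API, based on the needs of the distributed system.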
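As a rough illustration of how such a feature can be layered on top of ZK, the leader election recipe on that page has every candidate create an ephemeral sequential znode and treats the candidate with the lowest sequence number as the leader. The sketch below only models the "pick the leader from the child names" step in plain Python. The `candidate-` prefix and the function names are made up for illustration; the 10-digit zero-padded suffix follows the counter format described in the ZooKeeper documentation.

```python
def sequence_number(znode_name):
    """Parse the 10-digit counter that ZooKeeper appends to sequential znodes."""
    return int(znode_name[-10:])


def elect_leader(children):
    """The candidate whose znode has the lowest sequence number leads."""
    return min(children, key=sequence_number)


def node_to_watch(children, me):
    """Return the znode just ahead of `me`, or None if `me` is the leader.

    Watching only the predecessor avoids waking every candidate at once.
    """
    ordered = sorted(children, key=sequence_number)
    index = ordered.index(me)
    return ordered[index - 1] if index > 0 else None


children = [
    "candidate-0000000012",
    "candidate-0000000007",
    "candidate-0000000009",
]
print(elect_leader(children))
print(node_to_watch(children, "candidate-0000000009"))
```

If the leader's session ends, its ephemeral znode disappears, and the candidate that was watching it reruns the same check to see whether it is now the lowest.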
Updates are atomic and durable.
ZooKeeper itself is distributed and fault tolerant. The architecture involves one leader node and multiple follower nodes. If a ZK follower node goes down, its clients fail over automatically. Client sessions are replicated, so ZK can move clients to a different node. If the leader node goes down, a new leader is elected using the ZK consensus algorithm.
Reads are very fast since they are served from an in-memory store.
Writes are applied in the sequence in which they arrived, so ordering is maintained.
Watches send a notification to the client that set the watch on some data. This reduces the need to poll ZK. Note that watches are one-time triggers: if you get a watch event and want to be notified of future changes, you must set another watch.
Persistent and ephemeral znodes are available. Both are stored on ZK disks. Persistent here means that the data stays after the client that created it disconnects. Ephemeral means the data is removed automatically when the client disconnects. Ephemeral znodes are not allowed to have children.
There are also persistent sequential and ephemeral sequential znodes. Here the names of the znodes get a sequence number as a suffix. Similar to DB auto-increment IDs, these sequence numbers keep increasing and are managed by ZK. This can be useful to implement queues, locks, etc.
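The one-time nature of watches is the part that tends to surprise people, so here is a small toy model of it. `ToyZnode` is a hypothetical in-memory class, not the ZooKeeper client API; it only mimics the behavior described above: a watch is removed as soon as it fires, and the client has to read again and set a new watch to hear about later changes.

```python
class ToyZnode:
    """In-memory model of one znode with one-shot watches (NOT the ZooKeeper API)."""

    def __init__(self, data):
        self._data = data
        self._watchers = []

    def get(self, watch=None):
        """Read the data and optionally leave a one-time watch."""
        if watch is not None:
            self._watchers.append(watch)
        return self._data

    def set(self, data):
        """Write new data and fire each pending watch exactly once."""
        self._data = data
        fired, self._watchers = self._watchers, []  # clear before firing
        for callback in fired:
            callback("changed")


# A watch that is not set again only sees the first change.
events = []
node = ToyZnode("v1")
node.get(watch=events.append)
node.set("v2")
node.set("v3")

# A watch that re-reads and sets itself again sees every change.
seen = []
other = ToyZnode("a")


def on_change(event):
    seen.append(other.get(watch=on_change))


other.get(watch=on_change)
other.set("b")
other.set("c")
print(events, seen)
```

The notification itself carries no data here, which is why `on_change` reads the znode again and uses that same read to set the next watch.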
