ZK currentEpoch & acceptedEpoch

Author: Alen_ab56 | Published 2021-11-09 15:23

While running a multi-datacenter drill that switched Kafka over to a different ZooKeeper cluster, we found that when a zk node from the original cluster joined the new cluster, it failed with:
Leaders epoch, 6 is less than accepted epoch, 9

Checking the /data/zookeeper/data/version-2 directory confirmed that there are indeed two such files, acceptedEpoch and currentEpoch, and both contained the value 9.
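
Both files store the epoch as a plain decimal string, so they can be inspected directly. A minimal sketch (the data directory below is the one from this incident; adjust as needed):

import java.nio.file.Files;
import java.nio.file.Path;

// Print the two epoch files from a ZooKeeper data directory; each file
// holds a single decimal integer written as plain text.
public class EpochFiles {
    public static void main(String[] args) throws Exception {
        Path dir = Path.of("/data/zookeeper/data/version-2");
        for (String name : new String[] { "acceptedEpoch", "currentEpoch" }) {
            String value = Files.readString(dir.resolve(name)).trim();
            System.out.println(name + " = " + value);
        }
    }
}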

Why is that, and what are these two files for?
They record, respectively, the epoch number this server process has seen (acceptedEpoch) and the epoch it has participated in (currentEpoch). Although they contain no application-level data, they are important for data consistency, and they determine whether leader election in the cluster can succeed.

https://issues.apache.org/jira/browse/ZOOKEEPER-335?focusedCommentId=16975961&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16975961

These two variables exist mainly to handle the cluster failure-recovery scenario:
As mentioned, the implementation up to version 3.3.3 has not included epoch variables acceptedEpoch and currentEpoch. This omission has generated problems [5] (issue ZOOKEEPER-335 in Apache's issue tracking system) in a production version and was noticed by many ZooKeeper clients. The origin of this problem is at the beginning of Recovery Phase (Algorithm 4 line 2), when the leader increments its epoch (contained in lastZxid) even before acquiring a quorum of successfully connected followers (such leader is called false leader). Since a follower goes back to FLE if its epoch is larger than the leader's epoch (line 25), when a false leader drops leadership and becomes a follower of a leader from a previous epoch, it finds a smaller epoch (line 26) and goes back to FLE. This behavior can loop, switching from Recovery Phase to FLE.
Quoted from: http://www.tcs.hut.fi/Studies/T-79.5001/reports/2012-deSouzaMedeiros.pdf

In short: acceptedEpoch and currentEpoch used not to be distinguished; the epoch was extracted directly from the high 32 bits of the zxid (see the packing sketch after this list). That leads to the following problem. Suppose there are three servers s1, s2, s3, where s1 and s2 are in contact, s1 is the leader, and s3 is LOOKING:

1. s2 restarts and, together with s3's own vote, elects s3 as the new leader.
2. s3 considers itself leader and increments its epoch, but cannot reach the other servers. Meanwhile, s1 still believes it is the leader (why is explained below).
3. s2 cannot reach s3 and, on receiving s1's LEADING notification, rejoins s1's old cluster.
4. s3 cannot reach anyone, gives up leadership, falls back to FLE, then receives messages from the old-cluster leader s1 and rejoins the old cluster as a follower.
5. As a follower, s3 finds that its own epoch is larger than the old leader's epoch, so it falls back to FLE again.

From then on, s3 keeps oscillating between steps 4 and 5, looping between the FLE phase and the Recovery phase.
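
For reference, here is how the epoch and counter are packed into a 64-bit zxid; this mirrors the helpers in ZooKeeper's ZxidUtils (the class below is a standalone sketch, not the real one):

// A zxid is a 64-bit value: the high 32 bits hold the epoch and the low
// 32 bits hold a per-epoch transaction counter.
public final class ZxidSketch {
    public static long epochOf(long zxid) {
        return zxid >> 32L;
    }

    public static long counterOf(long zxid) {
        return zxid & 0xffffffffL;
    }

    public static long makeZxid(long epoch, long counter) {
        return (epoch << 32L) | (counter & 0xffffffffL);
    }
}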

As for why s1 still considers itself the leader: the leader has a grace period that keeps transient faults from ending its term.
This grace period is built on heartbeats: within the heartbeat window, leader s1 has not yet detected that the LearnerHandler threads serving s2 and s3 have died, so its leader state stays in effect. That state is only a flag, though, and cannot compromise writes, because a write must be acknowledged by more than half of the nodes, a requirement that cannot be met during this window.
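
As a rough illustration of that grace period (a hypothetical sketch, not ZooKeeper's actual Leader.lead() loop; all names here are invented), a leader that checks once per tick for a quorum of responsive learners might look like this:

import java.util.Set;

// The leader keeps its LEADING state as long as each tick still sees a
// quorum of live follower connections; one missed heartbeat from a single
// follower does not end the term.
class LeaderTickSketch {
    private final int clusterSize;
    private boolean leading = true;

    LeaderTickSketch(int clusterSize) {
        this.clusterSize = clusterSize;
    }

    // called once per tick with the ids of followers that answered a ping
    void onTick(Set<Long> respondingFollowers) {
        // +1 counts the leader itself toward the quorum
        boolean hasQuorum = respondingFollowers.size() + 1 > clusterSize / 2;
        if (!hasQuorum) {
            leading = false; // step down and return to leader election
        }
    }

    boolean isLeading() {
        return leading;
    }
}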

So how do acceptedEpoch and currentEpoch fix the failure-recovery problem? The follower-side half of the answer is the check in the learner's registration path (ZooKeeper's Learner#registerWithLeader):
if (newEpoch > self.getAcceptedEpoch()) {
    wrappedEpochBytes.putInt((int) self.getCurrentEpoch());
    self.setAcceptedEpoch(newEpoch);
} else if (newEpoch == self.getAcceptedEpoch()) {
    // since we have already acked an epoch equal to the leaders, we cannot ack
    // again, but we still need to send our lastZxid to the leader so that we can
    // sync with it if it does assume leadership of the epoch.
    // the -1 indicates that this reply should not count as an ack for the new epoch
    wrappedEpochBytes.putInt(-1);
} else {
    throw new IOException("Leaders epoch, " + newEpoch
            + " is less than accepted epoch, " + self.getAcceptedEpoch());
}
The connection fails outright: a node whose persisted acceptedEpoch is greater than the epoch proposed by the leader is simply not allowed to join the cluster. This is exactly the error from the drill, where the rejoining node carried acceptedEpoch 9 against the new cluster's epoch 6.
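
The leader-side half is that a prospective leader no longer bumps its epoch unilaterally: it proposes newEpoch = max(acceptedEpoch over connected servers) + 1 and blocks until a quorum has checked in. A simplified sketch of that idea (modeled loosely on Leader#getEpochToPropose; names and details here are illustrative, not the actual implementation):

import java.util.HashSet;
import java.util.Set;

// The new epoch is only established once a quorum of servers (leader
// included) has connected, so a "false leader" can never advance the
// epoch on its own.
class EpochProposalSketch {
    private final int clusterSize;
    private final Set<Long> connected = new HashSet<>();
    private long proposedEpoch = -1;

    EpochProposalSketch(int clusterSize) {
        this.clusterSize = clusterSize;
    }

    // called for the leader itself and for every learner that connects,
    // passing that server's persisted acceptedEpoch
    synchronized long epochToPropose(long sid, long lastAcceptedEpoch)
            throws InterruptedException {
        if (lastAcceptedEpoch >= proposedEpoch) {
            proposedEpoch = lastAcceptedEpoch + 1;
        }
        connected.add(sid);
        while (connected.size() <= clusterSize / 2) {
            wait(); // block until a majority has checked in
        }
        notifyAll();
        return proposedEpoch;
    }
}

Together with the follower-side check above, this breaks the FLE/Recovery loop: epochs only advance with quorum agreement, and any node whose acceptedEpoch is ahead of the leader's is rejected outright.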
