One broker in a Kafka cluster went down: how to fix it

Author: 邵红晓 | Published 2021-08-25 13:26

Background

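One of our four Kafka servers (host 39) went down and would not come back up. The disk holding the Kafka data log directory on that server was more than 80% full, which had triggered an alert.
Data logs: the directories set by the log.dirs option in $KAFKA_HOME/config/server.properties
Operational logs: the $KAFKA_HOME/logs directory
I tried to balance the load across the different disks of that one machine with kafka-reassign-partitions.sh. The reassignment failed, and after that Kafka would not restart.
Trying to rebalance again failed with: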
Partitions reassignment failed due to from /export1/kafka-log/ to /export2/kafka-log/
Reference: https://www.cnblogs.com/set-cookie/p/9614241.html
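For context, a move between log directories on the same broker is described to kafka-reassign-partitions.sh through a JSON plan. The following is a minimal sketch of such a plan, not the exact file used here. It assumes a Kafka version that accepts the log_dirs field (1.1 or later), a single-replica partition, and that the failed broker has id 1004; the helper name build_logdir_move_plan is hypothetical.

```python
import json


def build_logdir_move_plan(topic, partition, broker_id, target_dir):
    """Build a reassignment plan that keeps the replica on the same broker
    but places it in a different log directory (hypothetical helper)."""
    return {
        "version": 1,
        "partitions": [
            {
                "topic": topic,
                "partition": partition,
                # One entry per replica; log_dirs must line up with replicas.
                "replicas": [broker_id],
                "log_dirs": [target_dir],
            }
        ],
    }


# Assumption: broker id 1004 is the machine whose disks are being balanced.
plan = build_logdir_move_plan("filebeat-dwlong-brush", 0, 1004, "/export2/kafka-log")
print(json.dumps(plan, indent=2))
```

The printed JSON can be saved to a file and passed with --reassignment-json-file. Whether that is safe on a disk that is already nearly full depends on the cluster, so treat it as an illustration only.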

Solution: manually trigger a leader election for the partition so that the partition leader is assigned to an available node

1. Check the partition

The check reports this error:
Error: partition 0 does not have a leader. Skip getting offsets
2. Modify the metadata in ZooKeeper
/usr/hdf/current/kafka-broker/bin/zookeeper-shell.sh xx.xx.40.38:2181
get /brokers/topics/filebeat-dwlong-brush/partitions/0/state
{"controller_epoch":55,"leader":1004,"version":1,"leader_epoch":74,"isr":[1004]}
Reset isr and leader to an available node and increase leader_epoch by 1 (the path and the new value go on one line):
set /brokers/topics/filebeat-dwlong-brush/partitions/0/state {"controller_epoch":55,"leader":1001,"version":1,"leader_epoch":75,"isr":[1001]}
3. Re-run the election:
Create a file named preferred-leader-plan.json with the following content:
{"partitions":[{"topic":"filebeat-dwlong-brush","partition":0}]}
Then run:
/usr/hdf/current/kafka-broker/bin/kafka-preferred-replica-election.sh --zookeeper xx.xx.40.40:2181 --path-to-json-file preferred-leader-plan.json
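When more than one partition has lost its leader, the plan file can be generated instead of written by hand. A minimal sketch follows; the helper name build_election_plan is hypothetical, and the topic and partition number come from the example above.

```python
import json


def build_election_plan(topic, partitions):
    """Build the JSON structure read by --path-to-json-file: one entry per
    topic-partition that should get a preferred-replica election."""
    return {"partitions": [{"topic": topic, "partition": p} for p in partitions]}


election_plan = build_election_plan("filebeat-dwlong-brush", [0])
plan_text = json.dumps(election_plan, separators=(",", ":"))
print(plan_text)
```

Redirecting the output to preferred-leader-plan.json gives the file used by the command above.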
4. Verify:
/usr/hdf/current/kafka-broker/bin/kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list xx.xx.40.38:9092 --topic action --time -1
Also check the log: /var/log/kafka/server.log
If "Error: partition 0 does not have a leader. Skip getting offsets" no longer appears, the fix worked. If it still appears, restart the machine.

Normal output looks like this:
/usr/hdf/current/kafka-broker/bin/kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list xx.xx.40.38:9092 --topic filebeat-dwlong-brush --time -1
filebeat-dwlong-brush:2:2964861
filebeat-dwlong-brush:1:2964843
filebeat-dwlong-brush:3:2948928
filebeat-dwlong-brush:0:2964749
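Every partition reports a valid offset. These are the latest offsets, so the values keep increasing while data is still flowing through the topic.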
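The check in step 4 can also be scripted. The sketch below parses text in the format shown above and collects any partitions still reported as leaderless. It assumes the output consists only of topic:partition:offset lines and the error line quoted earlier; the function name parse_offsets is hypothetical.

```python
def parse_offsets(output):
    """Split GetOffsetShell-style output into offsets per partition and a
    list of partitions reported as having no leader (hypothetical helper)."""
    offsets = {}
    leaderless = []
    for raw in output.splitlines():
        line = raw.strip()
        if not line:
            continue
        if "does not have a leader" in line:
            # Format: "Error: partition 0 does not have a leader. ..."
            leaderless.append(int(line.split()[2]))
            continue
        parts = line.rsplit(":", 2)
        if len(parts) == 3 and parts[1].isdigit() and parts[2].isdigit():
            offsets[(parts[0], int(parts[1]))] = int(parts[2])
    return offsets, leaderless


sample = """filebeat-dwlong-brush:2:2964861
filebeat-dwlong-brush:1:2964843
filebeat-dwlong-brush:3:2948928
filebeat-dwlong-brush:0:2964749"""
offsets, leaderless = parse_offsets(sample)
print(offsets, leaderless)
```

An empty leaderless list together with an offset for every partition corresponds to the healthy case described above.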
