1. Background
The RabbitMQ deployment is a cluster in mirrored-queue mode, with keepalived providing the virtual IP.
rabbitmq01 192.168.1.101
rabbitmq02 192.168.1.102
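VIP 192.168.1.110 (currently on rabbitmq02)
rabbitmq01 has failed and will not start, and queue data is not syncing correctly with rabbitmq02.
The fix is to remove the failed node rabbitmq01 from the cluster and then re-join it.
2. Re-joining the cluster
On rabbitmq02 (192.168.1.102):
Remove the rabbitmq01 node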
# rabbitmqctl cluster_status
# rabbitmqctl forget_cluster_node rabbit@192-168-1-101
# rabbitmqctl cluster_status
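The node name passed to forget_cluster_node has to match the name that cluster_status prints. The following is a minimal sketch, assuming (as in this setup) that node names are `rabbit@` followed by the IP with dots replaced by dashes. The helper names `node_name_from_ip` and `status_mentions_node` are hypothetical and not part of RabbitMQ.

```shell
#!/bin/sh
# Hypothetical helper: build a node name such as rabbit@192-168-1-101 from an IP.
# Assumes the hostname convention used in this cluster (dots replaced by dashes).
node_name_from_ip() {
    printf 'rabbit@%s\n' "$(printf '%s' "$1" | tr '.' '-')"
}

# Hypothetical helper: succeed if the text on stdin contains the given node name.
# This is a plain fixed-string search, so it does not depend on the exact
# output layout of cluster_status.
status_mentions_node() {
    grep -qF -- "$1"
}

# Usage sketch on the surviving node (commented out so the snippet has no side effects):
# node=$(node_name_from_ip 192.168.1.101)
# rabbitmqctl forget_cluster_node "$node"
# if rabbitmqctl cluster_status | status_mentions_node "$node"; then
#     echo "$node is still listed" >&2
# fi
```

Because the check is only a substring search, it is meant as a quick sanity check after the removal, not as a parser for the status output.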
On rabbitmq01 (192.168.1.101):
Stop all RabbitMQ processes
# systemctl stop rabbitmq-server
# ps aux | grep rabbit | grep -v grep | awk '{print $2}' | xargs kill -9
Move the RabbitMQ data files out of the way
# mkdir /kingdee/rabbitmqBackup
# mv /var/lib/rabbitmq/* /kingdee/rabbitmqBackup/
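With default shell settings, `*` does not match names that start with a dot, so the `mv` above leaves any dotfiles in /var/lib/rabbitmq in place. On typical Linux package installs that includes `.erlang.cookie`, which the node needs in order to authenticate to the cluster again. The following is a minimal sketch of the same move into a timestamped backup directory. The function name `backup_visible_files` is hypothetical, and the timestamp format is an arbitrary choice.

```shell
#!/bin/sh
# Hypothetical helper: move every non-hidden entry of $1 into a fresh
# timestamped directory under $2 and print that directory's path.
# Dotfiles (for example .erlang.cookie) stay where they are, because
# the * glob does not match names starting with a dot.
backup_visible_files() {
    src=$1
    dest="$2/$(date +%Y%m%d-%H%M%S)"
    mkdir -p "$dest" || return 1
    for entry in "$src"/*; do
        # If the glob matched nothing, the literal pattern is left over; skip it.
        [ -e "$entry" ] || continue
        mv -- "$entry" "$dest"/ || return 1
    done
    printf '%s\n' "$dest"
}

# Usage sketch (paths from this article, commented out to avoid side effects):
# backup_visible_files /var/lib/rabbitmq /kingdee/rabbitmqBackup
```

Keeping each backup in its own timestamped directory avoids overwriting an earlier backup if the procedure has to be repeated.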
Start RabbitMQ again
# systemctl start rabbitmq-server
# ps aux | grep rabbit
Create the RabbitMQ user
# rabbitmqctl add_user mquser rabbitMQ@123
Note: this is the RabbitMQ password that was set at installation time.
# rabbitmqctl list_users
# rabbitmqctl set_user_tags mquser administrator
# rabbitmqctl set_permissions -p / mquser '.*' '.*' '.*'
Re-join the cluster, then start keepalived
# rabbitmqctl stop_app
# rabbitmqctl join_cluster rabbit@192-168-1-102
# rabbitmqctl cluster_status
# rabbitmqctl start_app
# systemctl start keepalived
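Starting keepalived makes this node eligible to take the VIP, so it can be worth confirming that the node answers rabbitmqctl before doing so. The following is a minimal sketch of a generic retry wrapper. The function name `retry` and the attempt and delay values in the usage line are hypothetical. Note that a successful `cluster_status` only shows that the node responds, not that the queues have finished synchronizing.

```shell
#!/bin/sh
# Hypothetical helper: run a command up to $1 times, sleeping $2 seconds
# between attempts. Returns 0 as soon as the command succeeds,
# or 1 if every attempt fails.
retry() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        # Sleep only if another attempt will follow.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}

# Usage sketch (commented out so the snippet has no side effects):
# retry 10 3 rabbitmqctl cluster_status && systemctl start keepalived
```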
On rabbitmq02 (192.168.1.102):
Add the mirroring policy with automatic synchronization
# rabbitmqctl set_policy ha-all "^" '{"ha-mode":"all","ha-sync-mode":"automatic"}'
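In the policy above, the pattern "^" matches every queue name, "ha-mode":"all" mirrors each matching queue to all nodes in the cluster, and "ha-sync-mode":"automatic" makes a newly joined mirror synchronize without a manual step. This assumes a RabbitMQ release that still supports classic queue mirroring. The following is a minimal sketch that builds the same definition with the sync mode as a parameter. The function name `ha_policy_definition` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the mirroring policy definition used above,
# with ha-sync-mode as a parameter. Only the two values that classic
# queue mirroring accepts ("automatic" and "manual") are allowed.
ha_policy_definition() {
    case $1 in
        automatic|manual) ;;
        *) echo "unsupported ha-sync-mode: $1" >&2; return 1 ;;
    esac
    printf '{"ha-mode":"all","ha-sync-mode":"%s"}\n' "$1"
}

# Usage sketch (commented out so the snippet has no side effects):
# rabbitmqctl set_policy ha-all "^" "$(ha_policy_definition automatic)"
# rabbitmqctl list_policies -p /
```

Running `rabbitmqctl list_policies` afterwards is a simple way to confirm that the policy was stored on the default vhost.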
3. References
RabbitMQ cluster installation and configuration
https://www.cnblogs.com/elvi/p/7736661.html
Network partition detected
Mnesia reports that this RabbitMQ cluster has experienced a network partition.
There is a risk of losing data. Please read RabbitMQ documentation about network partitions and the possible solutions.