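Tip: A configuration file is mandatory when running Sentinel, because Sentinel uses this file to save its current state, which is reloaded on restart. If no configuration file is given, or the configuration file path is not writable, Sentinel simply refuses to start.
Tip: By default Sentinel listens for connections on TCP port 26379. For Sentinel to work, port 26379 on each server must be open to connections from the IP addresses of the other Sentinel instances. Otherwise the Sentinels cannot talk to each other or agree on what to do, so a failover will never be performed.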
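As a rough way to confirm that the Sentinel port is reachable between hosts, a small check like the following can be run from each server. This is only a sketch: check_port and check_sentinels are hypothetical helper names, and it assumes bash (for /dev/tcp) and the coreutils timeout command are available.

```shell
#!/bin/sh
# Hypothetical helper (not part of Redis): succeeds if a TCP connection
# to host $1, port $2 can be opened within 2 seconds.
# Assumption: bash and timeout are installed on the machine.
check_port() {
    timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Report whether the default Sentinel port 26379 is reachable on each
# host passed as an argument.
check_sentinels() {
    for host in "$@"; do
        if check_port "$host" 26379; then
            echo "$host:26379 reachable"
        else
            echo "$host:26379 NOT reachable"
        fi
    done
}
```

For the environment used in this article it could be called as: check_sentinels 192.168.100.161 192.168.100.162 192.168.100.163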
Test environment
192.168.100.161 Redis-Master 6379 |{Sentinel01} 26379
192.168.100.162 Redis-Slave 6379 |{Sentinel02} 26379
192.168.100.163 Redis-Slave 6379 |{Sentinel03} 26379
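Add the following setting to the configuration file on each slave node. After a failover promotes a slave to master, the remaining slave nodes can then sync data from the new master with the same password: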
requirepass "password"
After the change, restart the redis service.
Sentinel configuration file:
cp /usr/local/redis/sentinel.conf /etc/redis/sentinel.conf
vim /etc/redis/sentinel.conf
# Modify as follows:
port 26379
daemonize yes
protected-mode no
pidfile "/var/run/redis-sentinel.pid"
logfile "/var/log/redis-sentinel.log"
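# Working directory (the comment sits on its own line because the config format does not accept a trailing comment after a directive)
dir "/tmp"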
sentinel monitor mymaster 192.168.100.161 6379 2
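#sentinel monitor <master-name> <ip> <port> <quorum>
# master-name: the name of the monitored master (arbitrary); here mymaster is the name of the cluster.
# ip port: the IP address and port of the master node; the slave nodes do not need to be configured.
# quorum: the number of votes required to finally judge the master node unreachable.
# Here 2 is the quorum, that is, how many Sentinels must consider the master down before a failover is started. It is normally set to a majority of all Sentinel nodes (the total is usually an odd number >= 3, such as 3, 5 or 7). With 3 Sentinels, 3/2 = 1.5, so the value is 2. It is the basis for marking the master ODOWN (objectively down).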
sentinel auth-pass mymaster password
## Password of the master in the mymaster cluster. Note: this line must come after the sentinel monitor line above.
sentinel down-after-milliseconds mymaster 5000
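#sentinel down-after-milliseconds <master-name> <times>:
# Each Sentinel node periodically sends PING commands to check whether the Redis data nodes and the other Sentinel nodes are reachable. If no valid reply arrives within the time set by down-after-milliseconds, the node is judged unreachable. <times> is in milliseconds (1 second = 1000 milliseconds).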
acllog-max-len 128
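# The ACL log tracks failed commands and authentication events associated with ACLs. The setting above defines the maximum number of entries kept in the ACL log.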
sentinel parallel-syncs mymaster 1
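#sentinel parallel-syncs <master-name> <nums>
# Number of slaves that may sync data from the new master at the same time after a failover. The smaller the number, the longer the total sync time, but the lower the load on the new master.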
sentinel failover-timeout mymaster 50000
#sentinel failover-timeout <master-name> <times>:
# Failover timeout, in milliseconds.
sentinel deny-scripts-reconfig yes
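# Do not allow the script settings to be changed at runtime
# sentinel notification-script <master-name> <script-path>
# During a failover, when a warning-level Sentinel event occurs (an important event, for example -sdown: subjectively down, -odown: objectively down), the script at the given path is triggered and the corresponding event parameters are passed to it.
# sentinel client-reconfig-script <master-name> <script-path>
# After a failover ends, the script at the given path is triggered and parameters describing the failover result are passed to it. Similar to notification-script.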
# HOSTNAMES SUPPORT
SENTINEL resolve-hostnames no
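# Enable hostname support by enabling resolve-hostnames. Note that you must make sure DNS is configured correctly and that DNS resolution does not introduce long delays.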
SENTINEL announce-hostnames no
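# When resolve-hostnames is enabled, Sentinel still uses IP addresses when exposing instances to users, configuration files, and so on. To retain the hostnames when announced, enable announce-hostnames.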
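The quorum rule described in the comments on sentinel monitor (more than half of the Sentinels, rounded up to a whole number) can be written as a one-line calculation. This is a minimal sketch; majority_quorum is a hypothetical helper name, not a Redis command.

```shell
#!/bin/sh
# Hypothetical helper: majority quorum for a given number of Sentinels.
# Integer division drops the fraction, so 3 -> 2, 5 -> 3, 7 -> 4.
majority_quorum() {
    echo $(( $1 / 2 + 1 ))
}
```

With three Sentinels, majority_quorum 3 prints 2, which matches the value used in the sentinel monitor line above.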
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
Less commonly used settings
# SCRIPTS EXECUTION
sentinel notification-script mymaster /var/redis/notify.sh
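# The sentinel notification-script and sentinel client-reconfig-script directives configure scripts that are called after a failover to notify the system administrator or to reconfigure clients. Scripts are executed with the following error-handling rules:
# If the script exits with "1", execution is retried later (the maximum number of retries is currently 10).
# If the script exits with "2" (or a higher value), execution is not retried.
# If the script terminates because it received a signal, the behavior is the same as exit code 1.
# A script has a maximum running time of 60 seconds. After this limit is reached, the script is terminated with SIGKILL and execution is retried.
# NOTIFICATION SCRIPT
# Calls the specified notification script for any Sentinel event generated at the WARNING level (for example -sdown, -odown, and so on).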
# Example:
# sentinel notification-script <master-name> <script-path>
sentinel client-reconfig-script mymaster /var/redis/reconfig.sh
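# CLIENTS RECONFIGURATION SCRIPT (may be called multiple times)
# When the master changes because of a failover, a script can be called to perform application-specific tasks, notifying clients that the configuration has changed and that the master is at a different address.
# The from-ip, from-port, to-ip and to-port arguments carry the old address of the master and the new address of the elected replica (now the master).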
# The following arguments are passed to the script:
# <master-name> <role> <state> <from-ip> <from-port> <to-ip> <to-port>
# <state> is currently always "failover"
# <role> is either "leader" or "observer"
# Example:
# sentinel client-reconfig-script <master-name> <script-path>
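To make the argument list and the exit-code rules above more concrete, here is a minimal sketch of what the body of a script such as /var/redis/reconfig.sh might look like. The function name handle_reconfig and the message format are assumptions made for illustration; only the seven arguments and the meaning of exit codes 1 and 2 come from the comments above.

```shell
#!/bin/sh
# Sketch of a client-reconfig handler. Sentinel passes seven arguments:
# <master-name> <role> <state> <from-ip> <from-port> <to-ip> <to-port>
handle_reconfig() {
    if [ "$#" -ne 7 ]; then
        echo "expected 7 arguments, got $#" >&2
        return 2    # 2 or higher: Sentinel does not retry the script
    fi
    master_name=$1
    role=$2
    state=$3
    from_ip=$4
    from_port=$5
    to_ip=$6
    to_port=$7
    # A real script would update client configuration here; this sketch
    # only prints a one-line summary of the master change.
    echo "$master_name $state ($role): $from_ip:$from_port -> $to_ip:$to_port"
}
```

In a real script file the last line would be something like handle_reconfig "$@", so that the return value becomes the exit code Sentinel sees.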
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
# Generated by CONFIG REWRITE (all settings below were written dynamically by Sentinel)
protected-mode no
user default on nopass sanitize-payload ~* &* +@all
sentinel myid ab45d4d4b7484cbc231f96f3b8f1f502950a7627
sentinel leader-epoch mymaster 0
sentinel current-epoch 1
# Configured password
sentinel auth-pass mymaster password
sentinel config-epoch mymaster 3
sentinel leader-epoch mymaster 0
# Two slave nodes were discovered
sentinel known-replica mymaster 192.168.100.163 6379
sentinel known-replica mymaster 192.168.100.162 6379
# Two other Sentinel nodes were discovered
sentinel known-sentinel mymaster 192.168.100.163 26379 0d004b8d55fa5af749bf4114211274c51e65493a
sentinel known-sentinel mymaster 192.168.100.162 26379 f2b4e02c2d08b328b4f1ba89cb6a72be76822c9b
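Since Sentinel writes the peers it has discovered back into the configuration file, that file can be inspected to see which other Sentinels a node knows about. The following is a small sketch; list_known_sentinels is a hypothetical helper name, and the field positions are taken from the known-sentinel lines shown above.

```shell
#!/bin/sh
# Hypothetical helper: print "ip:port run-id" for every known-sentinel
# line in the Sentinel configuration file given as $1.
# Line layout: sentinel known-sentinel <master-name> <ip> <port> <run-id>
list_known_sentinels() {
    awk '$1 == "sentinel" && $2 == "known-sentinel" { print $4 ":" $5, $6 }' "$1"
}
```

For example: list_known_sentinels /etc/redis/sentinel.conf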
Configure redis-sentinel as a service (init script)
vim /etc/init.d/redis-sentinel
#!/bin/sh
#chkconfig: 2345 80 90
#description:auto_run
REDISPORT=26379
# Note: use the root directory of your own redis installation
REDISPATH=/usr/local/redis
EXEC=${REDISPATH}/src/redis-sentinel
CLIEXEC=${REDISPATH}/src/redis-cli
PIDFILE=/var/run/redis-sentinel.pid
CONF="/etc/redis/sentinel.conf"
case "$1" in
    start)
        # Refuse to start while a PID file is present
        if [ -f "$PIDFILE" ]
        then
            echo "$PIDFILE exists, process is already running or crashed"
        else
            echo "Starting Redis sentinel..."
            "$EXEC" "$CONF"
        fi
        ;;
    stop)
        if [ ! -f "$PIDFILE" ]
        then
            echo "$PIDFILE does not exist, process is not running"
        else
            PID=$(cat "$PIDFILE")
            echo "Stopping ..."
            "$CLIEXEC" -p "$REDISPORT" shutdown
            # Wait until the process entry disappears from /proc
            while [ -x "/proc/${PID}" ]
            do
                echo "Waiting for Redis-sentinel to shutdown ..."
                sleep 1
            done
            echo "Redis-sentinel stopped"
        fi
        ;;
    *)
        echo "Please use start or stop as first argument"
        ;;
esac
# Set permissions so that Linux can execute it
chmod 755 /etc/init.d/redis-sentinel
Start
service redis-sentinel start
Stop
service redis-sentinel stop
Enable start on boot
chkconfig redis-sentinel on
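The start branch of the init script above only checks whether the PID file exists, so a file left behind by a crash blocks the next start. One possible refinement is to check whether the recorded process is still alive. This is a sketch only; is_running is a hypothetical helper that is not part of the script above.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the PID file $1 exists and the
# process it names is still alive. A stale file left by a crash fails.
is_running() {
    [ -f "$1" ] || return 1
    pid=$(cat "$1")
    # kill -0 sends no signal; it only tests that the process exists
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}
```

It could be used as: is_running /var/run/redis-sentinel.pid && echo "redis-sentinel is running"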
192.168.100.161:26379> sentinel masters # show the state of all masters
192.168.100.161:26379> sentinel master mymaster # show the state of the given master
192.168.100.161:26379> SENTINEL replicas mymaster # show the candidate slave nodes
192.168.100.161:26379> SENTINEL sentinels mymaster # show the state of the other Sentinels
192.168.100.161:26379> SENTINEL get-master-addr-by-name mymaster # get the address of the current master
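The same queries can be run non-interactively, which is handy in scripts. When its output is piped, redis-cli prints the reply to SENTINEL get-master-addr-by-name as two lines, the IP and then the port. The sketch below joins them; format_master_addr is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read the two reply lines (ip, then port) from
# stdin and print them joined as "ip:port".
format_master_addr() {
    read -r ip && read -r port && echo "$ip:$port"
}
```

Usage, assuming the Sentinel at 192.168.100.161 from this article: redis-cli -h 192.168.100.161 -p 26379 SENTINEL get-master-addr-by-name mymaster | format_master_addr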
Add a key on the master node to verify that master-slave replication works
# 1. Add a key on the master (6379)
192.168.100.161:6379> set username WeiyiGeek
OK
# 2. Query the added key on the slave (6379)
192.168.100.162:6379> get username # "WeiyiGeek"
192.168.100.163:6379> set demo redis # note: slave nodes are read-only; only the master accepts writes
# (error) READONLY You can't write against a read only replica.
# 3. Query the added key on Slave2 (6379)
echo "get username" | redis-cli -p 6379 -a 123456 # "WeiyiGeek"
Testing failover in Sentinel mode
Manually stop the redis service on the master node:
service redis stop
Check the Sentinel log:
tail -f /var/log/redis-sentinel.log
12538:X 27 Feb 2022 17:48:36.548 # +sdown master mymaster 192.168.100.161 6379
12538:X 27 Feb 2022 17:48:36.604 # +odown master mymaster 192.168.100.161 6379 #quorum 2/2
12538:X 27 Feb 2022 17:48:36.604 # +new-epoch 18
12538:X 27 Feb 2022 17:48:36.604 # +try-failover master mymaster 192.168.100.161 6379
12538:X 27 Feb 2022 17:48:36.628 # +vote-for-leader ab45d4d4b7484cbc231f96f3b8f1f502950a7627 18
12538:X 27 Feb 2022 17:48:36.637 # f2b4e02c2d08b328b4f1ba89cb6a72be76822c9b voted for ab45d4d4b7484cbc231f96f3b8f1f502950a7627 18
12538:X 27 Feb 2022 17:48:36.663 # 0d004b8d55fa5af749bf4114211274c51e65493a voted for ab45d4d4b7484cbc231f96f3b8f1f502950a7627 18
12538:X 27 Feb 2022 17:48:36.687 # +elected-leader master mymaster 192.168.100.161 6379
12538:X 27 Feb 2022 17:48:36.687 # +failover-state-select-slave master mymaster 192.168.100.161 6379
12538:X 27 Feb 2022 17:48:36.742 # +selected-slave slave 192.168.100.163:6379 192.168.100.163 6379 @ mymaster 192.168.100.161 6379
12538:X 27 Feb 2022 17:48:36.742 * +failover-state-send-slaveof-noone slave 192.168.100.163:6379 192.168.100.163 6379 @ mymaster 192.168.100.161 6379
12538:X 27 Feb 2022 17:48:36.819 * +failover-state-wait-promotion slave 192.168.100.163:6379 192.168.100.163 6379 @ mymaster 192.168.100.161 6379
12538:X 27 Feb 2022 17:48:37.418 # +promoted-slave slave 192.168.100.163:6379 192.168.100.163 6379 @ mymaster 192.168.100.161 6379
12538:X 27 Feb 2022 17:48:37.418 # +failover-state-reconf-slaves master mymaster 192.168.100.161 6379
12538:X 27 Feb 2022 17:48:37.464 * +slave-reconf-sent slave 192.168.100.162:6379 192.168.100.162 6379 @ mymaster 192.168.100.161 6379
12538:X 27 Feb 2022 17:48:37.765 # -odown master mymaster 192.168.100.161 6379
12538:X 27 Feb 2022 17:48:38.443 * +slave-reconf-inprog slave 192.168.100.162:6379 192.168.100.162 6379 @ mymaster 192.168.100.161 6379
12538:X 27 Feb 2022 17:48:38.443 * +slave-reconf-done slave 192.168.100.162:6379 192.168.100.162 6379 @ mymaster 192.168.100.161 6379
12538:X 27 Feb 2022 17:48:38.543 # +failover-end master mymaster 192.168.100.161 6379
12538:X 27 Feb 2022 17:48:38.543 # +switch-master mymaster 192.168.100.161 6379 192.168.100.163 6379
12538:X 27 Feb 2022 17:48:38.543 * +slave slave 192.168.100.162:6379 192.168.100.162 6379 @ mymaster 192.168.100.163 6379
12538:X 27 Feb 2022 17:48:38.543 * +slave slave 192.168.100.161:6379 192.168.100.161 6379 @ mymaster 192.168.100.163 6379
12538:X 27 Feb 2022 17:49:28.625 # +sdown slave 192.168.100.161:6379 192.168.100.161 6379 @ mymaster 192.168.100.163 6379
12538:X 27 Feb 2022 17:50:25.616 # -sdown slave 192.168.100.161:6379 192.168.100.161 6379 @ mymaster 192.168.100.163 6379
12538:X 27 Feb 2022 17:55:27.912 * +reboot slave 192.168.100.161:6379 192.168.100.161 6379 @ mymaster 192.168.100.163 6379
The master switch succeeded.
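The +switch-master line in the log above carries the old and the new master address as its last four fields. A short sketch for pulling the new address out of the log file follows; last_switch_master is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print the new master as "ip:port" from the most
# recent +switch-master event in the Sentinel log file given as $1.
# Event layout: +switch-master <name> <old-ip> <old-port> <new-ip> <new-port>
last_switch_master() {
    awk 'index($0, "+switch-master") { addr = $(NF-1) ":" $NF }
         END { if (addr != "") print addr }' "$1"
}
```

For example: last_switch_master /var/log/redis-sentinel.log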
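The stopped master node needs its configuration file modified before it can join the cluster as a new slave node (in particular, it needs masterauth to authenticate against the new master).
vim /etc/redis/6379.conf
Add the following settings: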
replicaof 192.168.100.163 6379
masterauth password
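Save and exit, start the service, then check the status on the new master node.
If the old master is started before its configuration file is modified, the new master shows only one slave node: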
# Replication
role:master
connected_slaves:1
slave0:ip=192.168.100.162,port=6379,state=online,offset=67839,lag=0
master_failover_state:no-failover
master_replid:023ecee75b3b328bdf3a93640577a362dd6c66d1
master_replid2:e57722cc8053e394e707d735b9cec3f96a830430
master_repl_offset:67839
second_repl_offset:15004
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:491
repl_backlog_histlen:67349
Looking at the 6379.conf file on node 162 shows that
the replication setting has already been rewritten:
replicaof 192.168.100.163 6379
After modifying the configuration file on the original master node, restart its redis service and check the status on the new master 163 again:
# Replication
role:master
connected_slaves:2
slave0:ip=192.168.100.162,port=6379,state=online,offset=113579,lag=0
slave1:ip=192.168.100.161,port=6379,state=online,offset=113565,lag=1
master_failover_state:no-failover
master_replid:023ecee75b3b328bdf3a93640577a362dd6c66d1
master_replid2:e57722cc8053e394e707d735b9cec3f96a830430
master_repl_offset:113579
second_repl_offset:15004
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:491
repl_backlog_histlen:113089
There are now two slave nodes.
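Instead of reading the INFO output by eye, a single field such as connected_slaves can be extracted in a script. This is a minimal sketch; info_field is a hypothetical helper name, and it assumes the key:value line format shown in the output above.

```shell
#!/bin/sh
# Hypothetical helper: read INFO output on stdin and print the value of
# the field named in $1 (for example role or connected_slaves).
# Carriage returns are stripped because Redis ends INFO lines with \r\n.
info_field() {
    tr -d '\r' | awk -F: -v key="$1" '$1 == key { print $2 }'
}
```

Usage on the new master, with the password used in this article: redis-cli -a password info replication | info_field connected_slaves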
Add a key on the new master node to verify that it is writable
New master node 163
redis-cli
127.0.0.1:6379> auth password
OK
127.0.0.1:6379> set hello world
OK
127.0.0.1:6379> get hello
"world"
127.0.0.1:6379> keys *
1) "username"
2) "hello"
127.0.0.1:6379>
New slave nodes 161 and 162

