1. Building the Zabbix Cluster:
MySQL MHA architecture inventory:
Hostname | IP | Role | MHA HA software | Read/write splitting software |
---|---|---|---|---|
db01 | 192.168.98.5 | Master | VIP 192.168.98.92 bound here | |
db02 | 192.168.98.6 | Slave | | |
db03 | 192.168.98.13 | Slave | MHA manager deployed here | Read/write splitting software deployed here |
Zabbix web tier inventory:
Hostname | IP | Role | Function |
---|---|---|---|
zabbix01 | 192.168.98.5 | Primary | Zabbix server |
zabbix02 | 192.168.98.6 | Primary | Zabbix server |
Load balancer | 192.168.98.6 | | Provides load balancing |
2. Deploy MySQL on the db01, db02, and db03 servers:
Note: only the steps for db01 are shown below; build db02 and db03 the same way.
2.1 Create the required directories:
[root@db01 ~]# mkdir /application ----> directory for the software
[root@db01 /]# mkdir /data/3306/data -p ----> directory for the data
2.2 Create the mysql user:
[root@db01 /]# useradd -s /sbin/nologin -M mysql
2.3 Upload the software and extract it:
[root@db01 application]# ll
total 1079692
-rw-r--r-- 1 mysql mysql 644869837 Aug 5 09:31 mysql-5.7.26-linux-glibc2.12-x86_64.tar.gz
[root@db01 /application]# tar xf mysql-5.7.26-linux-glibc2.12-x86_64.tar.gz
[root@db01 /application]# mv mysql-5.7.26-linux-glibc2.12-x86_64 mysql
[root@db01 application]# ll
total 1079692
drwxr-xr-x 9 mysql mysql 129 Aug 5 12:22 mysql
-rw-r--r-- 1 mysql mysql 644869837 Aug 5 09:31 mysql-5.7.26-linux-glibc2.12-x86_64.tar.gz
Note: the package can be downloaded from the official site, mysql.com.
2.4 Initialize the data directory (create the system databases):
[root@db01 ~]# rpm -qa|grep maria
mariadb-libs-5.5.60-1.el7_5.x86_64
[root@db01 ~]# yum remove mariadb-libs-5.5.60-1.el7_5.x86_64 -y
"注:由于centos7系统自带mariadb数据库,所以必须要将其删除。"
[root@db01 ~]# vim /etc/profile
export PATH=/application/mysql/bin:$PATH
[root@db01 ~]# mysql -V //check the MySQL version (run "source /etc/profile" or log in again first so the new PATH takes effect)
mysql Ver 14.14 Distrib 5.7.26, for linux-glibc2.12 (x86_64) using EditLine wrapper
[root@db01 data]# mysqld --initialize-insecure --user=mysql --basedir=/application/mysql --datadir=/data/3306/data //initialize the database
2019-08-05T08:41:34.391208Z 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
2019-08-05T08:41:36.705525Z 0 [Warning] InnoDB: New log files created, LSN=45790
2019-08-05T08:41:36.910614Z 0 [Warning] InnoDB: Creating foreign key constraint system tables.
2019-08-05T08:41:37.021365Z 0 [Warning] No existing UUID has been found, so we assume that this is the first time that this server has been started. Generating a new UUID: d63578ab-b75c-11e9-a331-000c29a71104.
2019-08-05T08:41:37.022771Z 0 [Warning] Gtid table is not ready to be used. Table 'mysql.gtid_executed' cannot be opened.
2019-08-05T08:41:37.023736Z 1 [Warning] root@localhost is created with an empty password ! Please consider switching off the --initialize-insecure option.
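If the initialization succeeded, the data directory now holds the system databases. A quick sanity check (the exact file list may differ slightly between versions):
[root@db01 ~]# ls /data/3306/data //should list, among others: auto.cnf, ibdata1, ib_logfile0, ib_logfile1, mysql, performance_schema, sys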
2.5 Write the configuration file:
cat >/etc/my.cnf <<EOF
[mysqld]
user=mysql
basedir=/application/mysql
datadir=/data/3306/data
socket=/tmp/mysql.sock
[mysql]
socket=/tmp/mysql.sock
prompt=wuxin[\\d]>
EOF
Note: the "wuxin" string in the prompt is arbitrary; set it to whatever you like.
2.6 Prepare the startup script:
[root@db01 ~]#chown -R mysql.mysql /data/* /application/mysql/*
[root@db01 ~]#cp /application/mysql/support-files/mysql.server /etc/init.d/mysqld
[root@db01 ~]# /etc/init.d/mysqld start //start MySQL
Starting MySQL.Logging to '/data/3306/data/db01.err'.
SUCCESS!
2.7 Manage MySQL with systemd:
[root@db01 ~]# /etc/init.d/mysqld stop //stop the instance started via the init script first, since systemd will manage it from now on
[root@db01 ~]# cat > /etc/systemd/system/mysqld.service <<EOF
[Unit]
Description=MySQL Server
Documentation=man:mysqld(8)
Documentation=http://dev.mysql.com/doc/refman/en/using-systemd.html
After=network.target
After=syslog.target
[Install]
WantedBy=multi-user.target
[Service]
User=mysql
Group=mysql
ExecStart=/application/mysql/bin/mysqld --defaults-file=/etc/my.cnf
LimitNOFILE = 5000
EOF
Note: stop the mysqld instance that was started the old way before switching to systemd management.
2.8 Start the database with systemctl:
[root@db01 ~]#systemctl start mysqld
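A small follow-up sketch (assuming the unit file above is in place): reload systemd so it sees the new unit, enable it at boot, and confirm both the service and a client connection work.
[root@db01 ~]# systemctl daemon-reload //make systemd re-read unit files and pick up mysqld.service
[root@db01 ~]# systemctl enable mysqld //start MySQL automatically at boot
[root@db01 ~]# systemctl status mysqld //should report active (running)
[root@db01 ~]# mysql -e "select @@version, @@datadir;" //confirm the client connects through /tmp/mysql.sock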
3. Build the master-slave relationship across db01, db02, and db03, with GTID-based replication enabled:
3.1 Clean the environment (run on all three database servers):
pkill mysqld
\rm -rf /data/3306/data/*
\rm -rf /data/binlog/*
mkdir -p /data/3306/data
mkdir -p /data/binlog/
chown -R mysql.mysql /data/*
3.2 Prepare the configuration files; each server gets its own (only server_id and the client prompt differ):
# Master (db01):
cat > /etc/my.cnf <<EOF
[mysqld]
basedir=/application/mysql/
datadir=/data/3306/data
socket=/tmp/mysql.sock
server_id=51
port=3306
secure-file-priv=/tmp
autocommit=0
log_bin=/data/binlog/mysql-bin
binlog_format=row
gtid-mode=on
enforce-gtid-consistency=true
log-slave-updates=1
[mysql]
prompt=db01 [\\d]>
EOF
# Slave 1 (db02):
cat > /etc/my.cnf <<EOF
[mysqld]
basedir=/application/mysql
datadir=/data/3306/data
socket=/tmp/mysql.sock
server_id=52
port=3306
secure-file-priv=/tmp
autocommit=0
log_bin=/data/binlog/mysql-bin
binlog_format=row
gtid-mode=on
enforce-gtid-consistency=true
log-slave-updates=1
[mysql]
prompt=db02 [\\d]>
EOF
# Slave 2 (db03):
cat > /etc/my.cnf <<EOF
[mysqld]
basedir=/application/mysql
datadir=/data/3306/data
socket=/tmp/mysql.sock
server_id=53
port=3306
secure-file-priv=/tmp
autocommit=0
log_bin=/data/binlog/mysql-bin
binlog_format=row
gtid-mode=on
enforce-gtid-consistency=true
log-slave-updates=1
[mysql]
prompt=db03 [\\d]>
EOF
3.3 Initialize the data (run on all three servers):
mysqld --initialize-insecure --user=mysql --basedir=/application/mysql --datadir=/data/3306/data
3.4 Start the databases (on all three servers):
systemctl start mysqld
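Before wiring up replication, a quick per-node sanity check (run locally on each of the three servers) confirms that every instance came up with its own server_id, binary logging, and GTID mode enabled:
mysql -e "select @@server_id, @@log_bin, @@gtid_mode;"
# expected: server_id 51/52/53 depending on the host, log_bin = 1, gtid_mode = ON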
3.5 Build the master-slave relationship:
Run on the master, db01:
mysql -e "grant replication slave on *.* to repl@'192.168.%.%' identified by '123456';"
Run on the slaves, db02 and db03:
mysql -e "change master to master_host='192.168.98.5',master_user='repl',master_password='123456' ,MASTER_AUTO_POSITION=1;"
mysql -e "start slave;"
[root@db02 ~]# mysql -e "show slave status \G"|grep Running:
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
[root@db03 ~]# mysql -e "show slave status \G"|grep Running:
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Summary: with the steps above, the one-master, two-slave setup is complete.
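As a quick smoke test of the new replication (a sketch; the repl_test database name is arbitrary), create a database on the master and confirm it shows up on both slaves, then drop it:
# on db01 (master)
mysql -e "create database repl_test;"
# on db02 and db03 (slaves) - the new database should appear almost immediately
mysql -e "show databases like 'repl_test';"
# clean up on db01; the drop replicates as well
mysql -e "drop database repl_test;"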
4. MHA Environment Setup
Plan:
Master:
192.168.98.5 master
VIP: 192.168.98.92
Slaves:
192.168.98.6 slave
192.168.98.13 slave, also the MHA manager node
4.1 Create symlinks for the key programs (run on all three database servers):
ln -s /application/mysql/bin/mysqlbinlog /usr/bin/mysqlbinlog
ln -s /application/mysql/bin/mysql /usr/bin/mysql
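The MHA Node tools call mysqlbinlog and mysql from SSH sessions in which the custom /application/mysql/bin path is usually not in PATH, which is why the symlinks are needed. A quick check that they resolve:
ls -l /usr/bin/mysql /usr/bin/mysqlbinlog   # both should point into /application/mysql/bin/
mysqlbinlog --version
mysql --version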
4.2 Configure SSH trust between the nodes (run on db01):
rm -rf /root/.ssh
ssh-keygen
cd /root/.ssh
mv id_rsa.pub authorized_keys
scp -r /root/.ssh 192.168.98.6:/root
scp -r /root/.ssh 192.168.98.13:/root
Verify on each node (or use the loop sketch after these checks):
db01:
ssh 192.168.98.5 date
ssh 192.168.98.6 date
ssh 192.168.98.13 date
db02:
ssh 192.168.98.5 date
ssh 192.168.98.6 date
ssh 192.168.98.13 date
db03:
ssh 192.168.98.5 date
ssh 192.168.98.6 date
ssh 192.168.98.13 date
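The per-node checks above can also be collapsed into a short loop run on each host; BatchMode makes ssh fail immediately instead of prompting if key authentication is broken (a convenience sketch, not part of the original procedure):
for ip in 192.168.98.5 192.168.98.6 192.168.98.13; do
    ssh -o BatchMode=yes root@$ip date    # should print the date with no password prompt
done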
4.3 Install the software
4.3.1 Download the MHA software
MHA project page: https://code.google.com/archive/p/mysql-master-ha/
GitHub download page: https://github.com/yoshinorim/mha4mysql-manager/wiki/Downloads
4.3.2 Install the Node package and its dependency on all nodes (run on all three database servers):
yum install perl-DBD-MySQL -y
rpm -ivh mha4mysql-node-0.56-0.el6.noarch.rpm
4.3.3 Create the user MHA needs, on the master db01 (replication is already running, so the mha user created on db01 also ends up on db02 and db03):
grant all privileges on *.* to mha@'192.168.%.%' identified by 'mha';
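Because the GRANT runs on the master, it travels through replication; a quick check on db02 and db03 confirms that the mha user now exists there as well:
mysql -e "select user,host from mysql.user where user='mha';"
# expected on every node: mha | 192.168.%.%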
4.3.4 Install the Manager package (db03 acts as the MHA manager, so install it only there):
yum install -y epel-release
yum install -y perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-Time-HiRes
Note: epel-release is installed first, in its own transaction, because some of these Perl modules come from the EPEL repository.
rpm -ivh mha4mysql-manager-0.56-0.el6.noarch.rpm
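A quick check that both MHA packages are now present on db03 (the manager host also needs the Node package installed in 4.3.2):
rpm -qa | grep -i mha4mysql
# expected: mha4mysql-node-0.56-0.el6.noarch and mha4mysql-manager-0.56-0.el6.noarch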
4.3.5 Prepare the configuration file (the MHA configuration on db03)
Create the configuration directory:
mkdir -p /etc/mha
Create the log directory:
mkdir -p /var/log/mha/app1
Edit the MHA configuration file:
vim /etc/mha/app1.cnf
[server default]
manager_log=/var/log/mha/app1/manager ##log of the MHA manager (its workflow log)
manager_workdir=/var/log/mha/app1 ##working directory
master_binlog_dir=/data/binlog ##binary log directory on the master
user=mha ##database user MHA connects as
password=mha ##password of that database user
ping_interval=2 ##ping interval in seconds
repl_password=123456 ##password of the replication user
repl_user=repl ##replication user
ssh_user=root ##SSH trust user
[server1]
hostname=192.168.98.5
port=3306
[server2]
hostname=192.168.98.6
port=3306
[server3]
hostname=192.168.98.13
port=3306
Note: the trailing ## annotations above are explanatory only; strip them from the real /etc/mha/app1.cnf, since trailing text on a line may be read as part of the value.
4.3.6 Status checks
4.3.6.1 SSH trust check (run on db03). The sample output below is from an earlier capture using 10.0.0.x addresses; in this environment expect the 192.168.98.x hosts.
[root@db03 ~]# masterha_check_ssh --conf=/etc/mha/app1.cnf
Fri Apr 19 16:39:34 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Fri Apr 19 16:39:34 2019 - [info] Reading application default configuration from /etc/mha/app1.cnf..
Fri Apr 19 16:39:34 2019 - [info] Reading server configuration from /etc/mha/app1.cnf..
Fri Apr 19 16:39:34 2019 - [info] Starting SSH connection tests..
Fri Apr 19 16:39:35 2019 - [debug]
Fri Apr 19 16:39:34 2019 - [debug] Connecting via SSH from root@10.0.0.51(10.0.0.51:22) to root@10.0.0.52(10.0.0.52:22)..
Fri Apr 19 16:39:34 2019 - [debug] ok.
Fri Apr 19 16:39:34 2019 - [debug] Connecting via SSH from root@10.0.0.51(10.0.0.51:22) to root@10.0.0.53(10.0.0.53:22)..
Fri Apr 19 16:39:35 2019 - [debug] ok.
Fri Apr 19 16:39:36 2019 - [debug]
Fri Apr 19 16:39:35 2019 - [debug] Connecting via SSH from root@10.0.0.52(10.0.0.52:22) to root@10.0.0.51(10.0.0.51:22)..
Fri Apr 19 16:39:35 2019 - [debug] ok.
Fri Apr 19 16:39:35 2019 - [debug] Connecting via SSH from root@10.0.0.52(10.0.0.52:22) to root@10.0.0.53(10.0.0.53:22)..
Fri Apr 19 16:39:35 2019 - [debug] ok.
Fri Apr 19 16:39:37 2019 - [debug]
Fri Apr 19 16:39:35 2019 - [debug] Connecting via SSH from root@10.0.0.53(10.0.0.53:22) to root@10.0.0.51(10.0.0.51:22)..
Fri Apr 19 16:39:35 2019 - [debug] ok.
Fri Apr 19 16:39:35 2019 - [debug] Connecting via SSH from root@10.0.0.53(10.0.0.53:22) to root@10.0.0.52(10.0.0.52:22)..
Fri Apr 19 16:39:36 2019 - [debug] ok.
Fri Apr 19 16:39:37 2019 - [info] All SSH connection tests passed successfully.
4.3.6.2 Replication check (on db03). Note that the output below reflects a configuration that already includes the master_ip_failover_script and candidate_master entries shown later in section 4.4, step (4).
[root@db03 ~]# masterha_check_repl --conf=/etc/mha/app1.cnf
Fri Jun 18 05:40:31 2021 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Fri Jun 18 05:40:31 2021 - [info] Reading application default configuration from /etc/mha/app1.cnf..
Fri Jun 18 05:40:31 2021 - [info] Reading server configuration from /etc/mha/app1.cnf..
Fri Jun 18 05:40:31 2021 - [info] MHA::MasterMonitor version 0.56.
Fri Jun 18 05:40:32 2021 - [info] GTID failover mode = 1
Fri Jun 18 05:40:32 2021 - [info] Dead Servers:
Fri Jun 18 05:40:32 2021 - [info] Alive Servers:
Fri Jun 18 05:40:32 2021 - [info] 192.168.98.5(192.168.98.5:3306)
Fri Jun 18 05:40:32 2021 - [info] 192.168.98.6(192.168.98.6:3306)
Fri Jun 18 05:40:32 2021 - [info] 192.168.98.13(192.168.98.13:3306)
Fri Jun 18 05:40:32 2021 - [info] Alive Slaves:
Fri Jun 18 05:40:32 2021 - [info] 192.168.98.6(192.168.98.6:3306) Version=5.7.26-log (oldest major version between slaves) log-bin:enabled
Fri Jun 18 05:40:32 2021 - [info] GTID ON
Fri Jun 18 05:40:32 2021 - [info] Replicating from 192.168.98.5(192.168.98.5:3306)
Fri Jun 18 05:40:32 2021 - [info] Primary candidate for the new Master (candidate_master is set)
Fri Jun 18 05:40:32 2021 - [info] 192.168.98.13(192.168.98.13:3306) Version=5.7.26-log (oldest major version between slaves) log-bin:enabled
Fri Jun 18 05:40:32 2021 - [info] GTID ON
Fri Jun 18 05:40:32 2021 - [info] Replicating from 192.168.98.5(192.168.98.5:3306)
Fri Jun 18 05:40:32 2021 - [info] Current Alive Master: 192.168.98.5(192.168.98.5:3306)
Fri Jun 18 05:40:32 2021 - [info] Checking slave configurations..
Fri Jun 18 05:40:32 2021 - [info] read_only=1 is not set on slave 192.168.98.6(192.168.98.6:3306).
Fri Jun 18 05:40:32 2021 - [info] read_only=1 is not set on slave 192.168.98.13(192.168.98.13:3306).
Fri Jun 18 05:40:32 2021 - [info] Checking replication filtering settings..
Fri Jun 18 05:40:32 2021 - [info] binlog_do_db= , binlog_ignore_db=
Fri Jun 18 05:40:32 2021 - [info] Replication filtering check ok.
Fri Jun 18 05:40:32 2021 - [info] GTID (with auto-pos) is supported. Skipping all SSH and Node package checking.
Fri Jun 18 05:40:32 2021 - [info] Checking SSH publickey authentication settings on the current master..
Fri Jun 18 05:40:32 2021 - [info] HealthCheck: SSH to 192.168.98.5 is reachable.
Fri Jun 18 05:40:32 2021 - [info]
192.168.98.5(192.168.98.5:3306) (current master)
+--192.168.98.6(192.168.98.6:3306)
+--192.168.98.13(192.168.98.13:3306)
Fri Jun 18 05:40:32 2021 - [info] Checking replication health on 192.168.98.6..
Fri Jun 18 05:40:32 2021 - [info] ok.
Fri Jun 18 05:40:32 2021 - [info] Checking replication health on 192.168.98.13..
Fri Jun 18 05:40:32 2021 - [info] ok.
Fri Jun 18 05:40:32 2021 - [info] Checking master_ip_failover_script status:
Fri Jun 18 05:40:32 2021 - [info] /usr/local/bin/master_ip_failover --command=status --ssh_user=root --orig_master_host=192.168.98.5 --orig_master_ip=192.168.98.5 --orig_master_port=3306
IN SCRIPT TEST====/sbin/ifconfig ens160:1 down==/sbin/ifconfig ens160:1 192.168.98.92/24===
Checking the Status of the script.. OK
Fri Jun 18 05:40:32 2021 - [info] OK.
Fri Jun 18 05:40:32 2021 - [warning] shutdown_script is not defined.
Fri Jun 18 05:40:32 2021 - [info] Got exit code 0 (Not master dead).
MySQL Replication Health is OK.
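The check above references /usr/local/bin/master_ip_failover, a VIP-switching script whose body the article does not show (the stock MHA sample is a Perl script). Below is a minimal shell sketch of what such a script typically does, using the interface and VIP visible in the output (ens160:1, 192.168.98.92/24); the option handling and SSH calls are assumptions, so adapt and test it with masterha_check_repl before relying on it, make it executable, and reference it from /etc/mha/app1.cnf via master_ip_failover_script=/usr/local/bin/master_ip_failover.
#!/bin/bash
# /usr/local/bin/master_ip_failover -- minimal sketch of an MHA VIP-switching script (assumed layout).
# MHA invokes it with --command=status|start|stop|stopssh plus host arguments.
vip=192.168.98.92/24
device=ens160:1
ssh_user=root

# pick out only the long options we need from what MHA passes in
for arg in "$@"; do
    case $arg in
        --command=*)          command=${arg#*=} ;;
        --orig_master_host=*) orig_master_host=${arg#*=} ;;
        --new_master_host=*)  new_master_host=${arg#*=} ;;
    esac
done

case $command in
    start)
        # failover finished: bring the VIP up on the new master
        ssh "$ssh_user@$new_master_host" "/sbin/ifconfig $device $vip"
        exit $?
        ;;
    stop|stopssh)
        # the old master is being demoted (or is dead): try to take the VIP down there,
        # ignoring SSH errors in case the host is unreachable
        ssh "$ssh_user@$orig_master_host" "/sbin/ifconfig $device down" || true
        exit 0
        ;;
    status)
        # health check from masterha_check_repl / masterha_manager; exit 0 means OK
        exit 0
        ;;
    *)
        echo "Usage: master_ip_failover --command=start|stop|stopssh|status ..."
        exit 1
        ;;
esac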
4.3.7 Start MHA (on db03):
[root@db03 ~]#nohup masterha_manager --conf=/etc/mha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null> /var/log/mha/app1/manager.log 2>&1 &
To stop MHA:
[root@db03 ~]#masterha_stop --conf=/etc/mha/app1.cnf
4.3.8 Check the MHA status (on db03):
[root@db03 ~]# masterha_check_status --conf=/etc/mha/app1.cnf
app1 (pid:1101) is running(0:PING_OK), master:192.168.98.5
4.4 Failure Simulation and Recovery
(1) Stop the master, db01:
systemctl stop mysqld
Note: watch the manager log with tail -f /var/log/mha/app1/manager.
The switchover only counts as successful if the end of the log shows "successfully".
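A quick way to confirm the switch from the manager log (the exact wording may vary slightly between MHA versions):
tail -n 30 /var/log/mha/app1/manager
grep -i "completed successfully" /var/log/mha/app1/manager    # the final failover report should match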
(2) The host went down; repair the old master
[root@db01 ~]# systemctl start mysqld
Note: after the old master is started again, the cluster does not know about it yet, so the following steps are needed.
(3) Repair replication (rejoin it to the MHA cluster as a slave)
[root@db01 ~]#mysql -e "change master to master_host='192.168.98.6',master_user='repl',master_password='123456',MASTER_AUTO_POSITION=1;" //master_host is the new master's IP (db02)
[root@db01 ~]#mysql -e "start slave;"
[root@db01 ~]# mysql -e "show slave status \G"|grep Running:
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
(4) Restore the configuration file (on db03)
[root@db03 /etc/mha]# cat app1.cnf
[server default]
manager_log=/var/log/mha/app1/manager
manager_workdir=/var/log/mha/app1
master_binlog_dir=/data/binlog
master_ip_failover_script=/usr/local/bin/master_ip_failover
password=mha
ping_interval=2
repl_password=123456
repl_user=repl
report_script=/usr/local/bin/send
ssh_user=root
user=mha
[server1] ##re-add the server1 entry; it was removed automatically after the node went down
candidate_master=1
hostname=192.168.98.5
port=3306
[server2]
candidate_master=1
hostname=192.168.98.6
port=3306
[server3]
hostname=192.168.98.13
port=3306
(5) Start MHA
nohup masterha_manager --conf=/etc/mha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null> /var/log/mha/app1/manager.log 2>&1 &
Check the status:
masterha_check_status --conf=/etc/mha/app1.cnf
Note: a problem hit during the failover drill, and how to fix it:
When switching the master from db02 back to db01, running CHANGE MASTER on db02 to point at db01 raised an error:
[root@db02 ~]# mysql -e "change master to master_host='192.168.98.5',master_user='repl',master_password='123456' ,MASTER_AUTO_POSITION=1;"
ERROR 3081 (HY000) at line 1: This operation cannot be performed with running replication threads; run STOP SLAVE FOR CHANNEL '' first
Fix: stop replication first, then point at the new master, then start replication again:
[root@db02 ~]# mysql -e "stop slave;"
[root@db02 ~]# mysql -e "change master to master_host='192.168.98.5',master_user='repl',master_password='123456' ,MASTER_AUTO_POSITION=1;"
[root@db02 ~]# mysql -e "start slave;"
[root@db02 ~]# mysql -e "show slave status \G"|grep Running:
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Check from the MHA manager:
[root@db03 ~]# masterha_check_repl --conf=/etc/mha/app1.cnf
MySQL Replication Health is OK.
After that, MHA can be started normally; check the master information again and the repair is complete.
Start MHA:
nohup masterha_manager --conf=/etc/mha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null> /var/log/mha/app1/manager.log 2>&1 &
Check the status:
masterha_check_status --conf=/etc/mha/app1.cnf