
Linux - drbd9+corosync+pacemaker

Author: 找呀找提莫 | Published 2020-03-03 16:00

    1.1 Deploy drbd9

    1. Install drbd-utils

    yum install -y gcc gcc-c++ make autoconf automake flex kernel-devel libxslt libxslt-devel asciidoc
    
    tar -zxvf drbd-utils-9.10.0.tar.gz
    
    cd drbd-utils-9.10.0
    
    ./autogen.sh
    
    ./configure --prefix=/usr/local/drbd-utils-9.10.0 \
    --localstatedir=/var \
    --sysconfdir=/etc \
    --without-83support \
    --without-84support \
    --without-manual
    
    make KDIR=/usr/src/kernels/3.10.0-693.el7.x86_64
    
    make install
    
    cp scripts/drbd-overview.pl /usr/bin/drbd-overview
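A quick sanity check after the install (the PATH adjustment assumes the --prefix used in the ./configure line above):

```shell
# Make the freshly installed userland tools visible and confirm the version.
# The prefix matches ./configure --prefix above; adjust if you changed it.
export PATH=/usr/local/drbd-utils-9.10.0/sbin:$PATH
drbdadm --version
```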
    

    2. Install drbd

    
    tar -xf drbd-9.0.19-1.tar.gz
    
    cd drbd-9.0.19-1
    
    make
    
    make install
    

    3. Load the kernel module

    depmod
    
    modprobe drbd
    
    lsmod | grep drbd
    drbd 555120 0
    libcrc32c 12644 2 xfs,drbd
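To have the module loaded again after a reboot, a systemd modules-load entry can be added (the drbd service normally loads it as well; this is just a belt-and-braces sketch):

```shell
# systemd-modules-load reads this directory at boot and loads the listed modules.
echo drbd > /etc/modules-load.d/drbd.conf
```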
    

    4. Configure drbd

    vim /etc/drbd.d/global_common.conf
    global {
     usage-count yes;
     udev-always-use-vnr;
    }
    common {
     handlers {
     }
     startup {
     wfc-timeout 100;
     degr-wfc-timeout 120;
     }
     options {
     auto-promote yes;
     }
     disk {
     }
     net {
     protocol C;
     # transport "tcp";
     }
    }
    
    vim /etc/drbd.d/data.res
    resource data {
     on node1 {
     node-id 0;
     device /dev/drbd0 minor 0;
     disk /dev/sdb;
     meta-disk internal;
     address ipv4 192.168.111.129:7788;
     }
     on node2 {
     node-id 1;
     device /dev/drbd0 minor 0;
     disk /dev/sdb;
     meta-disk internal;
     address ipv4 192.168.111.130:7788;
     }
     connection-mesh {
     hosts node1 node2;
     }
     disk {
     resync-rate 100M;
     }
    }
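Before going further, drbdadm can parse the configuration and echo it back, which catches syntax errors in global_common.conf and data.res early:

```shell
# Parse the config under /etc/drbd.d/ and dump the effective resource
# definition; a syntax error aborts with a pointer to the offending line.
drbdadm dump data
```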
    

    5. Start drbd

    systemctl start drbd
    systemctl enable drbd
    

    6. Initialize the drbd device

    1) Create metadata

    # Zero the start of the disk to clear any existing filesystem or metadata signatures
    dd if=/dev/zero of=/dev/sdb bs=1M count=100
    drbdadm create-md data
    # Run on one node only: force it to Primary to kick off the initial sync
    drbdadm primary --force data
    

    2) Check the sync progress

    drbdadm status data
    
    While the resource is syncing:
    data role:Primary
     disk:UpToDate
     node2 role:Secondary
    replication:SyncSource peer-disk:Inconsistent done:20.30
    
    After the sync completes:
    data role:Secondary
     disk:UpToDate
     node1 role:Primary
     peer-disk:UpToDate
    

    3) Format the drbd device

    mkfs.xfs /dev/drbd0
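The new filesystem can be mounted by hand on the primary node for a quick write test (the /mydata mount point is an assumption here; it matches the directory given to the Filesystem resource later):

```shell
# Mount on the current primary, write a file, then unmount again so that
# pacemaker can manage the mount from here on.
mkdir -p /mydata
mount /dev/drbd0 /mydata
touch /mydata/drbd-test && ls -l /mydata
umount /mydata
```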
    

    1.2 Deploy corosync

    1. Install corosync

    yum install -y corosync
    

    2. Configure corosync (on CentOS 7 the pcs service generates this file automatically, so this step can be skipped)

    1) Edit the corosync configuration file

    cd /etc/corosync
    cp corosync.conf.example.udpu corosync.conf
    
    Note:
    corosync.conf.example       sample config for multicast communication
    corosync.conf.example.udpu  sample config for unicast (udpu) communication
    
    vim corosync.conf
    totem {
     version: 2
     crypto_cipher: aes256
     crypto_hash: sha1
     token: 10000
     interface {
     ringnumber: 0
     bindnetaddr: 192.168.111.0
     mcastport: 5405
     ttl: 1
     }
     transport: udpu
    }
    logging {
     fileline: off
     to_logfile: no
     to_syslog: yes
     logfile: /var/log/cluster/corosync.log
     debug: off
     timestamp: on
     logger_subsys {
     subsys: QUORUM
     debug: off
     }
    }
    nodelist {
     node {
     ring0_addr: 192.168.111.129
     nodeid: 1
     }
     node {
     ring0_addr: 192.168.111.130
     nodeid: 2
     }
    }
    quorum {
     provider: corosync_votequorum
    }
    

    2) Generate the corosync auth key

    corosync-keygen -l
    scp corosync.conf authkey node2:/etc/corosync/
    

    3. Start corosync

    systemctl start corosync
    systemctl enable corosync
    

    1.3 Deploy pacemaker

    1. Install pacemaker

    yum install -y pacemaker
    

    2. Start pacemaker

    systemctl start pacemaker
    systemctl enable pacemaker
    

    1.4 Deploy pcs

    1. Overview

    When managing corosync+pacemaker with pcs, disable the standalone start and autostart of corosync and pacemaker; the pcsd service takes over starting and stopping both of them.
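In practice that means undoing the enable/start from the previous sections on every node, roughly:

```shell
# pcsd starts corosync and pacemaker itself, so stop the daemons and
# remove them from the boot sequence on each node.
systemctl stop pacemaker corosync
systemctl disable pacemaker corosync
```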

    2. Install pcs

    yum install -y pcs
    

    3. Start pcsd

    systemctl start pcsd
    systemctl enable pcsd
    

    4. Configure pcsd

    1) Set the hacluster account password

    passwd hacluster
    Changing password for user hacluster.
    New password: Password@123
    BAD PASSWORD: The password is shorter than 8 characters
    Retype new password: Password@123
    passwd: all authentication tokens updated successfully.
    

    2) Authenticate the cluster nodes

    pcs cluster auth node1 node2
    Username: hacluster
    Password: Password@123
    node1: Authorized
    node2: Authorized
    

    3) Sync the corosync configuration

    pcs cluster setup --name hacluster node1 node2
    Error: node1: node is already in a cluster
    Error: node2: node is already in a cluster
    Error: nodes availability check failed, use --force to override. WARNING: This will destroy existing cluster on the nodes.
    
    pcs cluster setup --name hacluster node1 node2 --force
    Destroying cluster on nodes: node1, node2...
    node1: Stopping Cluster (pacemaker)...
    node2: Stopping Cluster (pacemaker)...
    node2: Successfully destroyed cluster
    node1: Successfully destroyed cluster
    Sending 'pacemaker_remote authkey' to 'node1', 'node2'
    node1: successful distribution of the file 'pacemaker_remote authkey'
    node2: successful distribution of the file 'pacemaker_remote authkey'
    Sending cluster config files to the nodes...
    node1: Succeeded
    node2: Succeeded
    Synchronizing pcsd certificates on nodes node1, node2...
    node1: Success
    node2: Success
    Restarting pcsd on the nodes in order to reload the certificates...
    node1: Success
    node2: Success
    

    4) Start the cluster

    pcs cluster start --all
    node1: Starting Cluster...
    node2: Starting Cluster...
    

    5) Enable the cluster at boot

    pcs cluster enable --all
    node1: Cluster Enabled
    node2: Cluster Enabled
    

    6) Verify the corosync ring status

    corosync-cfgtool -s
    Printing ring status.
    Local node ID 1
    RING ID 0
     id = 192.168.111.129
     status = ring 0 active with no faults
    Printing ring status.
    Local node ID 2
    RING ID 0
     id = 192.168.111.130
     status = ring 0 active with no faults
    

    7) Disable STONITH
    If there is no fence device, it is recommended to disable STONITH;

    pcs property set stonith-enabled=false
    

    8) Ignore loss of quorum
    A healthy cluster normally requires more than half of the votes for quorum; on a two-node cluster, set the policy to ignore;

    pcs property set no-quorum-policy=ignore
    

    9) Check the configuration (should report no errors)

    crm_verify -L -V
    

    10) View the cluster properties

    pcs property show
    Cluster Properties:
     cluster-infrastructure: corosync
     cluster-name: hacluster
     dc-version: 1.1.16-12.el7-94ff4df
     have-watchdog: false
     no-quorum-policy: ignore
     stonith-enabled: false
    

    5. Add cluster resources

    1) Add a VIP

    pcs resource create vip ocf:heartbeat:IPaddr2 ip=192.168.111.131 cidr_netmask=24 op monitor interval=30s --group group
    

    Use --group <group name> to place resources in the same resource group.
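Resources in one group are implicitly colocated and started in the order they were added (and stopped in reverse). Without a group, the same behavior needs explicit constraints, for example (pcs 0.9 syntax on CentOS 7, using the resource names from above):

```shell
# Keep drbdstone on the same node as vip, and start vip first.
pcs constraint colocation add drbdstone with vip INFINITY
pcs constraint order vip then drbdstone
```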

    2) Add the drbd filesystem mount

    pcs resource create drbdstone ocf:heartbeat:Filesystem device=/dev/drbd0 directory=/mydata fstype=xfs op monitor interval=30s --group pachira
    

    3) Add a mysql resource
    The mysql installation itself is not covered here;

    pcs resource create mysqld service:mysqld op monitor interval=30s --group group
    
    pcs resource create pgsql service:postgresql op monitor interval=30s --group group
    

    6. Troubleshooting

    1) View the cluster status

    pcs status
    Cluster name: hacluster
    Stack: corosync
    Current DC: node1 (version 1.1.16-12.el7-94ff4df) - partition with quorum
    Last updated: Thu Jul 18 01:48:50 2019
    Last change: Thu Jul 18 01:34:24 2019 by hacluster via crmd on node1
    2 nodes configured
    3 resources configured
    Online: [ node1 node2 ]
    Full list of resources:
     Resource Group: group
     vip  (ocf::heartbeat:IPaddr2): Started node2
     drbdstone (ocf::heartbeat:Filesystem): Started node2
     mysqld  (service:mysqld): Started node2
    Daemon Status:
     corosync: active/disabled
     pacemaker: active/disabled
     pcsd: active/enabled
    

    2) Resource fails with "unknown error"
    Symptom:

    pcs status
    Cluster name: hacluster
    Stack: corosync
    Current DC: node1 (version 1.1.16-12.el7-94ff4df) - partition with quorum
    Last updated: Thu Jul 18 01:32:22 2019
    Last change: Thu Jul 18 01:31:57 2019 by root via cibadmin on node1
    2 nodes configured
    3 resources configured
    Node node1: standby
    Online: [ node2 ]
    Full list of resources:
     Resource Group: pachira
     vip  (ocf::heartbeat:IPaddr2): Started node2
     drbdstone (ocf::heartbeat:Filesystem): Started node2
     mysqld  (service:mysqld): Stopped
    Failed Actions:
    * mysqld_start_0 on node2 'unknown error' (1): call=56, status=complete, exitreason='none',
     last-rc-change='Thu Jul 18 01:32:05 2019', queued=0ms, exec=1036ms
    Daemon Status:
     corosync: active/disabled
     pacemaker: active/disabled
     pcsd: active/enabled
    

    Solution:

    (1) Find out why the resource failed to start;
    
    (2) Run pcs resource cleanup mysqld and the resource recovers automatically;
    Cleaning up vip on node1, removing fail-count-vip
    Cleaning up vip on node2, removing fail-count-vip
    Cleaning up drbdstone on node1, removing fail-count-drbdstone
    Cleaning up drbdstone on node2, removing fail-count-drbdstone
    Cleaning up mysqld on node1, removing fail-count-mysqld
    Cleaning up mysqld on node2, removing fail-count-mysqld
    Waiting for 6 replies from the CRMd...... OK
    

    3) Simulate a node failure and recovery

    Set one cluster node to standby, then bring it back:
    pcs node standby node1
    pcs node unstandby node1
    

    4) Delete a misconfigured resource

    pcs resource delete mysqld
    

    1.5 pcsd web UI

    1. Change the pcsd listening address

    pcsd listens on tcp6 port 2224 by default; bind it to an IPv4 address manually;

    netstat -antp | grep 2224 | grep LIST
    tcp6 0 0 :::2224 :::*              LISTEN 3831/ruby
    
    vim /usr/lib/pcsd/ssl.rb
    webrick_options = {
     :Port => 2224,
     #:BindAddress => primary_addr,
     #:Host => primary_addr,
     :BindAddress => '0.0.0.0',
     :Host => '0.0.0.0',
    
    systemctl restart pcsd
    
    netstat -antp | grep 2224 | grep LIST
    tcp 0 0 0.0.0.0:2224 0.0.0.0:* LISTEN 96985/ruby
    

    2. Visit https://IP:2224

    (Screenshots pcs1.png – pcs4.png: the pcsd web UI)

    You can then see the cluster that was configured earlier on the command line.

    1.6 crmsh


Original link: https://www.haomeiwen.com/subject/zazclhtx.html