K8S v1.15.1 High-Availability Cluster (Hands-On Pitfall Guide)


Author: 托瓦斯克一 | Published 2020-07-21 15:29

    Architecture Overview

    In simple terms, K8S high availability means making each of the core K8S components highly available.

    • apiserver HA: implemented with haproxy + keepalived;
    • controller-manager HA: k8s elects a leader internally (controlled by the --leader-elect flag, default true), so only one controller-manager instance is active in the cluster at any given moment;
    • scheduler HA: likewise elected internally by k8s (controlled by --leader-elect, default true), so only one scheduler instance is active at any given moment;
    • etcd HA: two topologies are possible (stacked and external); I recommend external etcd.
    1. Stacked: the etcd service runs on the same nodes as the control plane, i.e. it is deployed together with kubeadm, and each etcd member talks only to its local apiserver. This topology needs less infrastructure but is also less resilient to failures.
    2. External: etcd is separated from the control plane and every etcd member can talk to every apiserver node. It needs more hardware but gives stronger guarantees. It can still run on the servers that already host kubeadm; the separation is architectural rather than physical. If hardware allows, prefer dedicated idle nodes.
    (figure: k8s-ha.png, HA architecture diagram)

    The architecture diagram is taken from https://www.kubernetes.org.cn/6964.html


    Preparation

    Hostname IP Address Roles OS Version
    k8s-master90 192.168.1.90 Master,Haproxy,Keepalived,Etcd CentOS 7.7
    k8s-master91 192.168.1.91 Master,Haproxy,Keepalived,Etcd CentOS 7.7
    k8s-master93 192.168.1.93 Master,Haproxy,Keepalived,Etcd CentOS 7.7

    To save server resources, only three machines are used here; the K8S HA cluster is built with kubeadm.


    Environment Setup

    I use a one-click shell script to configure the base environment on each node, download the required packages, and start the relevant services. Still, I recommend typing every command by hand on one machine first, so you can see exactly what each command does; once everything goes smoothly, run the script on the remaining machines in one shot.
    The script assumes a working yum repository and that the firewall may be stopped; adjust its contents to your environment and read it carefully before running!

    auto_configure_env.sh

    #!/bin/bash
    echo "##### Update /etc/hosts #####"
    cat >> /etc/hosts <<EOF
    192.168.1.90 k8s-master90
    192.168.1.91 k8s-master91
    192.168.1.93 k8s-master93
    EOF
    
    echo "##### Stop firewalld #####"
    systemctl stop firewalld
    systemctl disable firewalld
    
    echo "##### Modify iptables FORWARD policy #####"
    iptables -P FORWARD ACCEPT
    
    echo "##### Close selinux #####"
    setenforce 0 
    sed -i "s/SELINUX=enforcing/SELINUX=disabled/g" /etc/selinux/config
    
    echo "##### Close swap #####"
    swapoff -a
    
    echo "##### Modify limits #####"
    cat > /etc/security/limits.d/kubernetes.conf <<EOF
    *       soft    nproc   131072
    *       hard    nproc   131072
    *       soft    nofile  131072
    *       hard    nofile  131072
    root    soft    nproc   131072
    root    hard    nproc   131072
    root    soft    nofile  131072
    root    hard    nofile  131072
    EOF
    
    echo "##### Create /etc/sysctl.d/k8s.conf #####"
    cat >> /etc/sysctl.d/k8s.conf <<EOF
    net.bridge.bridge-nf-call-ip6tables = 1
    net.bridge.bridge-nf-call-iptables = 1
    net.ipv4.ip_forward = 1
    net.ipv4.tcp_tw_recycle = 0
    vm.swappiness = 0
    vm.overcommit_memory = 1
    vm.panic_on_oom = 0
    fs.inotify.max_user_instances = 8192
    fs.inotify.max_user_watches = 1048576
    fs.file-max = 52706963
    fs.nr_open = 52706963
    net.ipv6.conf.all.disable_ipv6 = 1
    EOF
    
    echo "##### Add kernel module and Sysctl #####"
    modprobe br_netfilter
    sysctl -p /etc/sysctl.d/k8s.conf
    
    echo "##### Add ipvs modules #####"
    cat > /etc/sysconfig/modules/ipvs.modules <<EOF
    modprobe -- ip_vs
    modprobe -- ip_vs_rr
    modprobe -- ip_vs_wrr
    modprobe -- ip_vs_sh
    modprobe -- nf_conntrack_ipv4
    EOF
    chmod 755 /etc/sysconfig/modules/ipvs.modules
    bash /etc/sysconfig/modules/ipvs.modules
    
    echo "##### Install ipset ipvsadm #####"
    yum -y install ipset ipvsadm
    
    echo "##### Install docker-ce-18.09.7 #####"
    # https://download.docker.com/linux/centos/7/x86_64/stable/Packages/docker-ce-18.09.7-3.el7.x86_64.rpm
    # https://download.docker.com/linux/centos/7/x86_64/stable/Packages/docker-ce-cli-18.09.7-3.el7.x86_64.rpm
    # https://download.docker.com/linux/centos/7/x86_64/stable/Packages/containerd.io-1.2.13-3.1.el7.x86_64.rpm
    yum -y install yum-utils device-mapper-persistent-data lvm2
    yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
    yum makecache fast
    yum -y install docker-ce-18.09.7-3.el7.x86_64 docker-ce-cli-18.09.7-3.el7.x86_64 containerd.io-1.2.13-3.1.el7.x86_64
    systemctl start docker
    systemctl enable docker
    
    echo "##### Modify docker cgroup driver #####"
    cat > /etc/docker/daemon.json <<EOF
    {
      "exec-opts": ["native.cgroupdriver=systemd"]
      "insecure-registries": ["https://hub.atguigu.com"]
    }
    EOF
    
    echo "##### Restart docker service #####"
    systemctl restart docker
    
    echo "##### Install kubeadm kubelet kubectl #####"
    cat > /etc/yum.repos.d/kubernetes.repo << EOF
    [kubernetes]
    name=Kubernetes
    baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
    enabled=1
    gpgcheck=1
    repo_gpgcheck=1
    gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
    EOF
    yum -y install kubeadm-1.15.1-0.x86_64 kubelet-1.15.1-0.x86_64 kubectl-1.15.1-0.x86_64
    
    echo "##### Modify kubelet config #####"
    sed -i "s/KUBELET_EXTRA_ARGS=/KUBELET_EXTRA_ARGS=--fail-swap-on=false/g" /etc/sysconfig/kubelet
    
    echo "##### Enable kubelet service #####"
    systemctl enable kubelet.service
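The sed edit of /etc/sysconfig/kubelet above can be sanity-checked before touching real nodes; a minimal dry-run against a throwaway copy (hypothetical /tmp path):

```shell
# Dry-run of the KUBELET_EXTRA_ARGS edit on a throwaway copy (hypothetical path)
printf 'KUBELET_EXTRA_ARGS=\n' > /tmp/kubelet.sysconfig.demo
sed -i 's/KUBELET_EXTRA_ARGS=/KUBELET_EXTRA_ARGS=--fail-swap-on=false/g' /tmp/kubelet.sysconfig.demo
cat /tmp/kubelet.sysconfig.demo    # KUBELET_EXTRA_ARGS=--fail-swap-on=false
```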
    

    Deploying the haproxy Service

    haproxy provides high availability, load balancing, and proxying for TCP- and HTTP-based applications. Compared with nginx it offers better load-balancing performance, supports tens of thousands of concurrent connections, session persistence, and cookie-based routing, ships with a powerful web page for monitoring backend server status, and offers a wide range of balancing strategies.
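As a toy illustration of the roundrobin balancing used in the config below (not haproxy code, just the rotation idea over the three apiserver backends):

```shell
# Toy round-robin rotation over the three apiserver backends (illustration only)
backends=(192.168.1.90 192.168.1.91 192.168.1.93)
i=0
next_backend() { echo "${backends[i % ${#backends[@]}]}"; i=$((i + 1)); }
next_backend   # 192.168.1.90
next_backend   # 192.168.1.91
next_backend   # 192.168.1.93
next_backend   # 192.168.1.90 (wraps around)
```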

    This service must be deployed on every master, with identical configuration everywhere. The script again requires a working yum repository, and you must adjust the backend apiserver addresses, so read it carefully!

    auto_install_haproxy.sh

    #!/bin/bash
    echo "##### Install harpoxy service #####"
    # http://mirror.centos.org/centos/7/os/x86_64/Packages/haproxy-1.5.18-9.el7.x86_64.rpm
    yum -y install haproxy-1.5.18-9.el7.x86_64
    
    echo "##### Modify harpoxy cfg #####"
    cat > /etc/haproxy/haproxy.cfg <<EOF
    global
        log         127.0.0.1 local2
        pidfile     /var/run/haproxy.pid
        maxconn     4000
        stats socket /var/lib/haproxy/stats
    
    defaults
        mode                    http
        log                     global
        option                  httplog
        option                  dontlognull
        option http-server-close
        option forwardfor       except 127.0.0.0/8
        option                  redispatch
        retries                 3
        timeout http-request    10s
        timeout queue           1m
        timeout connect         10s
        timeout client          1m
        timeout server          1m
        timeout http-keep-alive 10s
        timeout check           10s
        maxconn                 3000
    
    frontend  kubernetes-apiserver
        mode tcp
        bind *:12567    # custom listening port
    
        default_backend      kubernetes-apiserver
    
    backend kubernetes-apiserver
        mode tcp
        balance     roundrobin
        # the apiserver backends
        server  k8s-master90 192.168.1.90:6443 check
        server  k8s-master91 192.168.1.91:6443 check
        server  k8s-master93 192.168.1.93:6443 check
    EOF
    
    echo "##### Restart harpoxy service #####"
    systemctl restart haproxy
    systemctl enable haproxy
    

    Deploying the keepalived Service

    keepalived is built on VRRP (Virtual Router Redundancy Protocol) and is a service dedicated to cluster high availability. The node with the highest priority claims the vip and serves traffic; when that node goes down, the vip automatically floats to the next-highest-priority node, giving automatic failover among machines providing the same service and eliminating the single point of failure.
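The priority rule can be sketched with a tiny helper (hypothetical name:priority pairs; real keepalived performs this election itself via VRRP advertisements):

```shell
# Toy VRRP election: among the live nodes, the highest priority holds the VIP
elect_master() {
  printf '%s\n' "$@" | sort -t: -k2,2 -rn | head -n1 | cut -d: -f1
}
elect_master k8s-master90:100 k8s-master91:90 k8s-master93:80   # k8s-master90
# If master90 dies, the VIP floats to the next-highest priority:
elect_master k8s-master91:90 k8s-master93:80                    # k8s-master91
```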

    My health-check script chk_haproxy.sh works as follows: on each check, if the probe for the haproxy port returns an empty string (i.e. the port is gone), it stops keepalived on this node so that the vip can float to a healthy host.

    This service is deployed on every master with nearly identical configuration; only the priority value differs per node. The script requires a working yum repository and some adjustments to its contents. I recommend picking a free address in the servers' subnet as the vip. Read carefully!

    auto_install_keepalived.sh

    #!/bin/bash
    echo "##### Install keepalived service #####"
    # http://mirror.centos.org/centos/7/os/x86_64/Packages/keepalived-1.3.5-16.el7.x86_64.rpm
    yum -y install keepalived-1.3.5-16.el7.x86_64
    
    echo "##### Add check haproxy.service live script #####"
    # the heredoc delimiter must be quoted, or $(netstat ...) runs at script-creation time
    cat > /etc/keepalived/chk_haproxy.sh <<'EOF'
    #!/bin/bash
    ID=$(netstat -tunlp | grep haproxy)
    if [ -z "$ID" ]; then
       systemctl stop keepalived
       sleep 3
    fi
    EOF
    
    echo "##### Chmod script #####"
    chmod 755 /etc/keepalived/chk_haproxy.sh
    
    echo "##### Modify keepalived conf #####"
    cat > /etc/keepalived/keepalived.conf <<EOF
    global_defs {
        router_id kv90      # node identifier; must be unique per node
    }
    
    vrrp_script chk_haproxy {
        script "/etc/keepalived/chk_haproxy.sh"    # 脚本检测
        interval 2
        weight -20
        fail 10
        rise 2
    }
    
    vrrp_instance VI_1 {
        state BACKUP            # recommended: set BACKUP on all nodes and let priority decide the MASTER
        interface enp5s0f0      # bind to the host's physical NIC
        virtual_router_id 33    # must be identical on all nodes (same VRRP group); customizable, range 0-255
        priority 100            # priority, i.e. initial weight; customizable, range 1-254
        nopreempt               # strongly recommended: non-preemptive mode avoids needless failbacks; requires state BACKUP on all nodes
    
        authentication {
            auth_type PASS
            auth_pass 123456    # custom authentication password
        }
    
        virtual_ipaddress {
            192.168.1.33        # the custom vip address
        }
    
        track_script {
            chk_haproxy         # must match the vrrp_script name
        }
    }
    EOF
    
    echo "##### Restart keepalived service #####"
    systemctl restart keepalived
    systemctl enable keepalived
    

    Building the etcd HA Cluster

    etcd is a strongly consistent, distributed key-value store that provides a reliable way to hold data that a distributed system or cluster of machines needs to access. It handles leader elections gracefully during network partitions and tolerates machine failures, including failure of the leader node; communication between etcd members is handled by the Raft consensus algorithm.
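Raft's fault tolerance for the three-member cluster built below comes down to simple quorum arithmetic:

```shell
# Quorum math for an n-member etcd cluster: writes need a majority,
# so the cluster tolerates floor((n-1)/2) failed members
n=3
echo "quorum=$(( n / 2 + 1 ))"                # quorum=2
echo "tolerated_failures=$(( (n - 1) / 2 ))"  # tolerated_failures=1
```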

    I deploy etcd on the same three k8s servers, so their base environment is already configured; if you deploy etcd on separate idle servers, you likewise need to configure the base environment, stop the firewall, and disable selinux. This article builds a TLS-hardened external etcd HA cluster.

    The service must run on every etcd server. Running the deployment script below on k8s-master90 scp's the certificates, the etcd binaries, and related files to the other etcd nodes, edits their etcd config over ssh, and starts the etcd service remotely. Read carefully!

    auto_install_etcd.sh

    #!/bin/bash
    echo "##### Create CA certificate and private key #####"
    mkdir -p /etc/ssl/etcd/ssl/
    mkdir -p /var/lib/etcd/
    mkdir -p /home/ssl
    cd /home/ssl
    openssl genrsa -out ca.key 2048
    openssl req -x509 -new -nodes -key ca.key -subj "/CN=home" -days 10000 -out ca.crt
    
    echo "##### Create etcd certificate and private key #####"
    cat > /home/ssl/etcd-ca.conf <<EOF
    [ req ]
    default_bits = 2048
    prompt = no
    default_md = sha256
    req_extensions = req_ext
    distinguished_name = dn
    
    [ dn ]
    C = CN
    ST = Guangdong
    L = Guangzhou
    O = etcd
    OU = home
    CN = etcd
    
    [ req_ext ]
    subjectAltName = @alt_names
    
    [ alt_names ]
    DNS.1 = localhost
    DNS.2 = k8s-master90
    DNS.3 = k8s-master91
    DNS.4 = k8s-master93
    IP.1 = 127.0.0.1
    IP.2 = 192.168.1.90
    IP.3 = 192.168.1.91
    IP.4 = 192.168.1.93
    
    [ v3_ext ]
    authorityKeyIdentifier=keyid,issuer:always
    basicConstraints=CA:FALSE
    keyUsage=keyEncipherment,dataEncipherment
    extendedKeyUsage=serverAuth,clientAuth
    subjectAltName=@alt_names
    EOF
    
    openssl genrsa -out etcd.key 2048
    openssl req -new -key etcd.key -out etcd.csr -config etcd-ca.conf
    openssl x509 -req -in etcd.csr -CA ca.crt -CAkey ca.key \
    -CAcreateserial -out etcd.crt -days 10000 \
    -extensions v3_ext -extfile etcd-ca.conf
    openssl verify -CAfile ca.crt etcd.crt
    # copy the certs into the path referenced by etcd.conf before scp'ing that dir
    cp -a ca.crt etcd.crt etcd.key /etc/ssl/etcd/ssl/
    
    echo "##### Scp certificates to other etcd nodes #####"
    scp -r /etc/ssl/etcd 192.168.1.91:/etc/ssl/
    scp -r /etc/ssl/etcd 192.168.1.93:/etc/ssl/
    
    echo "##### Install etcd v3.4.7 #####"
    mkdir -p /home/etcd-pkg
    cd /home/etcd-pkg
    wget https://github.com/coreos/etcd/releases/download/v3.4.7/etcd-v3.4.7-linux-amd64.tar.gz
    tar -xf etcd-v3.4.7-linux-amd64.tar.gz
    cd ./etcd-v3.4.7-linux-amd64/
    cp -a {etcd,etcdctl} /usr/local/bin/    # assumes /usr/local/bin is in $PATH
    
    echo "##### Scp etcd to other etcd nodes #####"
    scp etcd etcdctl 192.168.1.91:/usr/local/bin/
    scp etcd etcdctl 192.168.1.93:/usr/local/bin/
    
    echo "##### Modify etcd cluster conf #####"
    mkdir -p /etc/etcd/
    cat > /etc/etcd/etcd.conf <<EOF
    # [Member Flags]
    # ETCD_ELECTION_TIMEOUT=1000
    # ETCD_HEARTBEAT_INTERVAL=100
    ETCD_NAME=k8s-master90
    ETCD_DATA_DIR=/var/lib/etcd/
    
    # [Cluster Flags]
    # ETCD_AUTO_COMPACTION_RETENTION=0
    ETCD_INITIAL_CLUSTER_STATE=new
    ETCD_ADVERTISE_CLIENT_URLS=https://192.168.1.90:2379
    ETCD_INITIAL_ADVERTISE_PEER_URLS=https://192.168.1.90:2380
    ETCD_LISTEN_CLIENT_URLS=https://192.168.1.90:2379,https://127.0.0.1:2379
    ETCD_INITIAL_CLUSTER_TOKEN=etcd-cluster
    ETCD_LISTEN_PEER_URLS=https://192.168.1.90:2380
    ETCD_INITIAL_CLUSTER=k8s-master90=https://192.168.1.90:2380,k8s-master91=https://192.168.1.91:2380,k8s-master93=https://192.168.1.93:2380
    
    # [Proxy Flags]
    ETCD_PROXY=off
    
    # [Security flags]
    # ETCD_CLIENT_CERT_AUTH=
    # ETCD_PEER_CLIENT_CERT_AUTH=
    ETCD_TRUSTED_CA_FILE=/etc/ssl/etcd/ssl/ca.crt
    ETCD_CERT_FILE=/etc/ssl/etcd/ssl/etcd.crt
    ETCD_KEY_FILE=/etc/ssl/etcd/ssl/etcd.key
    ETCD_PEER_TRUSTED_CA_FILE=/etc/ssl/etcd/ssl/ca.crt
    ETCD_PEER_CERT_FILE=/etc/ssl/etcd/ssl/etcd.crt
    ETCD_PEER_KEY_FILE=/etc/ssl/etcd/ssl/etcd.key
    
    # [Profiling flags]
    # ETCD_METRICS={{ etcd_metrics }}
    EOF
    
    echo "##### Scp etcd conf to other etcd nodes #####"
    scp -r /etc/etcd 192.168.1.91:/etc/
    scp -r /etc/etcd 192.168.1.93:/etc/
    
    echo "##### Modify etcd.conf with ssh #####"
    ssh 192.168.1.91 "sed -i 's/k8s-master90/k8s-master91/g' /etc/etcd/etcd.conf ; sed -i '10,14s/192.168.1.90/192.168.1.91/' /etc/etcd/etcd.conf"
    ssh 192.168.1.93 "sed -i 's/k8s-master90/k8s-master93/g' /etc/etcd/etcd.conf ; sed -i '10,14s/192.168.1.90/192.168.1.93/' /etc/etcd/etcd.conf"
    
    echo "##### Create systemd etcd.service #####"
    cat > /usr/lib/systemd/system/etcd.service <<EOF
    [Unit]
    Description=etcd server
    After=network.target
    After=network-online.target
    Wants=network-online.target
    
    [Service]
    Type=notify
    WorkingDirectory=/var/lib/etcd/
    EnvironmentFile=-/etc/etcd/etcd.conf
    ExecStart=/usr/local/bin/etcd
    NotifyAccess=all
    Restart=always
    RestartSec=5s
    LimitNOFILE=40000
    
    [Install]
    WantedBy=multi-user.target
    EOF
    
    echo "##### Scp systemd to other etcd nodes #####"
    scp /usr/lib/systemd/system/etcd.service 192.168.1.91:/usr/lib/systemd/system/
    scp /usr/lib/systemd/system/etcd.service 192.168.1.93:/usr/lib/systemd/system/
    
    # all three etcd services must be started within a short window of each other, otherwise startup fails
    echo "##### Start etcd.service #####"
    systemctl daemon-reload
    systemctl restart etcd
    systemctl enable etcd
    systemctl status etcd
    ssh 192.168.1.91 "systemctl restart etcd ; systemctl enable etcd ; systemctl status etcd"
    ssh 192.168.1.93 "systemctl restart etcd ; systemctl enable etcd ; systemctl status etcd"
    
    echo "##### Check etcd cluster status #####"
    etcdctl \
      --cacert=/etc/ssl/etcd/ssl/ca.crt \
      --cert=/etc/ssl/etcd/ssl/etcd.crt \
      --key=/etc/ssl/etcd/ssl/etcd.key \
      --endpoints=https://192.168.1.90:2379,https://192.168.1.91:2379,https://192.168.1.93:2379 \
      endpoint health
    
    # if the output looks like the following and all three endpoints report healthy, the etcd cluster is working
    # https://192.168.1.93:2379 is healthy: successfully committed proposal: took = 22.044394ms
    # https://192.168.1.91:2379 is healthy: successfully committed proposal: took = 23.946175ms
    # https://192.168.1.90:2379 is healthy: successfully committed proposal: took = 26.130848ms
    
    # tip: to avoid retyping the TLS flags every time, define a permanent etcdctl alias
    cat >> /root/.bashrc <<'EOF'
    alias etcdctl='etcdctl \
      --cacert=/etc/ssl/etcd/ssl/ca.crt \
      --cert=/etc/ssl/etcd/ssl/etcd.crt \
      --key=/etc/ssl/etcd/ssl/etcd.key \
      --endpoints=https://192.168.1.90:2379,https://192.168.1.91:2379,https://192.168.1.93:2379'
    EOF
    
    # with the alias in place, list the etcd members as a table
    echo "##### List etcd member #####"
    etcdctl --write-out=table member list
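The ssh + sed step above can be rehearsed locally on a throwaway copy of the config (hypothetical /tmp path; the real script additionally limits the IP substitution to lines 10-14 so the ETCD_INITIAL_CLUSTER peer list keeps all three addresses):

```shell
# Dry-run of the per-node rewrite on a throwaway two-line excerpt (hypothetical path)
cat > /tmp/etcd.conf.demo <<'DEMO'
ETCD_NAME=k8s-master90
ETCD_ADVERTISE_CLIENT_URLS=https://192.168.1.90:2379
DEMO
sed -i 's/k8s-master90/k8s-master91/g; s/192.168.1.90/192.168.1.91/g' /tmp/etcd.conf.demo
cat /tmp/etcd.conf.demo
# ETCD_NAME=k8s-master91
# ETCD_ADVERTISE_CLIENT_URLS=https://192.168.1.91:2379
```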
    

    Initializing the Cluster with kubeadm

    kubeadm is a tool whose kubeadm init and kubeadm join commands are the best-practice fast path for creating a kubernetes cluster; this article uses kubeadm v1.15.1. Run the init script below on k8s-master90 only, after adjusting the relevant values. Read carefully!

    Notes on the init flags: --upload-certs makes kubeadm distribute the certificates automatically when additional control-plane nodes join later; tee copies the output log into the given file.

    kubeadm-init.sh

    #!/bin/bash
    echo "##### Create kubeadm config #####"
    cat > kubeadm-config.yaml << EOF
    apiVersion: kubeadm.k8s.io/v1beta2
    imageRepository: docker.io/mirrorgooglecontainers
    controlPlaneEndpoint: 192.168.1.33:12567    # the vip address and haproxy port
    kind: ClusterConfiguration
    kubernetesVersion: v1.15.1          # the k8s version
    networking:
      podSubnet: 10.244.0.0/16          # internal pod subnet; best left unchanged, since flannel's default Network is this range
      serviceSubnet: 10.10.0.0/16       # internal service subnet; customizable, but must not overlap other ranges
    apiServer:
      certSANs:             # best to list every kube-apiserver hostname, device IP, and the vip
        - 192.168.1.33
        - 192.168.1.90
        - 192.168.1.91
        - 192.168.1.93
        - k8s-master90
        - k8s-master91
        - k8s-master93
        - 127.0.0.1
        - localhost
    etcd:
      external:             # use an external etcd cluster
        endpoints:
        - https://192.168.1.90:2379
        - https://192.168.1.91:2379
        - https://192.168.1.93:2379
        caFile: /etc/ssl/etcd/ssl/ca.crt
        certFile: /etc/ssl/etcd/ssl/etcd.crt
        keyFile: /etc/ssl/etcd/ssl/etcd.key
    ---
    apiVersion: kubeproxy.config.k8s.io/v1alpha1
    kind: KubeProxyConfiguration
    mode: "ipvs"        # 修改kube-proxy的工作模式,默认使用的是iptables
    EOF
    
    echo "##### Pull docker images #####"
    kubeadm config images list --config kubeadm-config.yaml
    kubeadm config images pull --config kubeadm-config.yaml
    docker images | grep mirrorgooglecontainers | awk '{print "docker tag ",$1":"$2,$1":"$2}' | sed -e 's#mirrorgooglecontainers#k8s.gcr.io#2' | sh -x 
    docker images | grep mirrorgooglecontainers | awk '{print "docker rmi ", $1":"$2}' | sh -x 
    docker pull coredns/coredns:1.3.1
    docker tag coredns/coredns:1.3.1 k8s.gcr.io/coredns:1.3.1
    docker rmi coredns/coredns:1.3.1
    
    echo "##### Init kubeadm #####"
    kubeadm init --config kubeadm-config.yaml --upload-certs | tee kubeadm-init.log
    

    Running the script produces output similar to the following:

    ......
    Your Kubernetes control-plane has initialized successfully!
    
    To start using your cluster, you need to run the following as a regular user:
    
      mkdir -p $HOME/.kube
      sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
      sudo chown $(id -u):$(id -g) $HOME/.kube/config
    
    You should now deploy a pod network to the cluster.
    Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
      https://kubernetes.io/docs/concepts/cluster-administration/addons/
    
    You can now join any number of the control-plane node running the following command on each as root:
    
      # to join additional master nodes, use this longer join command
      kubeadm join 192.168.1.33:12567 --token 3vjfpc.647mossfkxl2v6u6 \
        --discovery-token-ca-cert-hash sha256:64891f8de74bc48c969446061bd60069643de2a70732631301fc0eb8283d4cc3 \
        --control-plane --certificate-key ee78fe3d5d0666503018dccb3a0e664f3c8e3b65ba6ad1362804de63ff451737
    
    Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
    As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use 
    "kubeadm init phase upload-certs --upload-certs" to reload certs afterward.
    
    Then you can join any number of worker nodes by running the following on each as root:
    
    # to join worker nodes, use this shorter join command
    kubeadm join 192.168.1.33:12567 --token 3vjfpc.647mossfkxl2v6u6 \
        --discovery-token-ca-cert-hash sha256:64891f8de74bc48c969446061bd60069643de2a70732631301fc0eb8283d4cc3
    

    Finally, as the output suggests, configure kubectl on the current node:

    mkdir -p $HOME/.kube
    sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
    sudo chown $(id -u):$(id -g) $HOME/.kube/config
    

    To verify, the following should print the node list:

    kubectl get nodes
    

    The goal here is a three-master k8s HA cluster, so the longer join command is run on the 91 and 93 servers. Note that master nodes carry a NoSchedule taint by default, so run kubectl taint nodes --all node-role.kubernetes.io/master- to allow workloads to be scheduled onto the masters. A final kubectl get nodes shows:

    NAME           STATUS      ROLES    AGE   VERSION
    k8s-master90   NotReady    master   30h   v1.15.1
    k8s-master91   NotReady    master   30h   v1.15.1
    k8s-master93   NotReady    master   30h   v1.15.1
    

    The nodes report NotReady because cross-host container networking has not been set up yet; this article uses the flannel network plugin to open that channel.


    The flannel Network Plugin

    flannel is one of the most widely used kubernetes network plugins. It helps kubernetes assign each Node's Docker containers IP addresses that never conflict with one another, and builds an overlay network between those addresses that delivers packets unmodified to the target container.

    As the figure shows, flannel creates a network interface named flannel0 on each Node; one end connects to the docker0 bridge and the other to an agent process named flanneld. flanneld in turn connects to etcd, using it to manage the pool of assignable IP ranges and to watch the actual address of every Pod, which is also why flannel never hands out conflicting IP addresses.
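That address management can be pictured as flannel carving the podSubnet into disjoint per-node ranges; a toy sketch (the real allocation is recorded in etcd and is more involved, but a /24 per node is flannel's default split for a 10.244.0.0/16 Network):

```shell
# Toy sketch: flannel hands each node a disjoint /24 out of 10.244.0.0/16,
# so pod IPs cannot collide across nodes
flannel_subnet() { echo "10.244.$1.0/24"; }
flannel_subnet 0   # 10.244.0.0/24  (e.g. k8s-master90)
flannel_subnet 1   # 10.244.1.0/24  (e.g. k8s-master91)
flannel_subnet 2   # 10.244.2.0/24  (e.g. k8s-master93)
```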

    To fetch flannel's yml file you may first need to look up the IP of the hosting domain and add a local hosts mapping (for reasons readers in China will understand)!

    (figure: flannel.png, flannel architecture)

    The figure is taken from https://www.kubernetes.org.cn/4105.html

    auto_install_flannel.sh

    #!/bin/bash
    # append (>>) rather than overwrite, so the k8s-master hosts entries survive
    cat >> /etc/hosts <<EOF
    151.101.108.133 raw.githubusercontent.com
    EOF
    
    echo "##### Install pod network (Flannel) #####"
    curl -O https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
    kubectl apply -f kube-flannel.yml
    

    After applying the flannel yml file, use kubectl get pods -A to check that the corresponding pods were created and are in the Running state:

    (screenshot: kubectl get pods -A output)

    In that output, the flannel pods and every other pod are in the Running state, and each core component runs three pods, which is exactly the goal. Note that no etcd pods appear, because etcd is deployed externally; the health of the etcd cluster can still be seen with kubectl get cs:

    (screenshot: kubectl get cs output)

    Problems Encountered

    With k8s already running, suppose kubeadm-config.yaml was written with the wrong pod or service subnet, say serviceSubnet needs changing to 100.100.0.0/16, and the cluster must be re-initialized. I cleaned up with kubeadm reset and related commands, deleted $HOME/.kube/config, and ran kubeadm init again, only to hit an error like this:

    (screenshot: kubeadm init error message)

    I forced my way past the error and finished the whole init procedure, only to find that the services' cluster IPs were still in the old 10.10.0.0/16 range, as if the cluster had never been cleaned up. After much consulting of experts and searching, the root cause turned out to be etcd: the components all store their state in etcd, and kubeadm reset does not clear etcd's data. The fix is to delete the stale data from etcd first (be very careful in production!) and then re-run the initialization.

    # list all keys in etcd
    etcdctl get --keys-only=true --prefix /
    
    # delete ALL data in etcd (the moral equivalent of rm -rf /)
    etcdctl del / --prefix
    


        Permalink: https://www.haomeiwen.com/subject/qzbikktx.html