Binary Installation of a K8s High-Availability Cluster (10): Deploying kube-scheduler

Author: Chris0Yang | Published 2021-09-01 13:03

    This document describes how to deploy a highly available kube-scheduler cluster.

    The cluster consists of 3 nodes. After startup, a leader is chosen through a competitive election; the remaining nodes block in standby. If the leader becomes unavailable, the surviving nodes elect a new leader, keeping the service available.

    To secure communication, this document first generates an x509 certificate and private key. kube-scheduler uses the certificate in two situations:

    • communicating with the secure port of kube-apiserver;
    • exposing Prometheus-format metrics (note: in the version deployed here, the metrics endpoint on port 10251 is plain http, not https; see the --address flag below);

    Components such as kubelet and flannel must be in place before this configuration. They were installed in the previous parts, so we can proceed directly to the configuration.

    1. Create the kube-scheduler certificate and private key

    Create the certificate signing request:

    cat > kube-scheduler-csr.json <<EOF
    {
        "CN": "system:kube-scheduler",
        "hosts": [
          "127.0.0.1",
          "172.68.96.101",
          "172.68.96.102",
          "172.68.96.103"
        ],
        "key": {
            "algo": "rsa",
            "size": 2048
        },
        "names": [
          {
            "C": "CN",
            "ST": "BeiJing",
            "L": "BeiJing",
            "O": "system:kube-scheduler",
            "OU": "4Paradigm"
          }
        ]
    }
    EOF
    
    • The hosts list contains the IPs of all kube-scheduler nodes;
    • CN and O are both system:kube-scheduler; the ClusterRoleBinding system:kube-scheduler built into kubernetes grants kube-scheduler the permissions it needs to work (a quick check is shown below).
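
    As a sanity check (assuming kubectl is already configured against this cluster), you can confirm that the built-in binding exists:

    kubectl get clusterrolebinding system:kube-scheduler -o wide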

    Generate the certificate and private key:

    cfssl gencert -ca=/etc/kubernetes/cert/ca.pem \
      -ca-key=/etc/kubernetes/cert/ca-key.pem \
      -config=/etc/kubernetes/cert/ca-config.json \
      -profile=kubernetes kube-scheduler-csr.json | cfssljson -bare kube-scheduler
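
    To confirm that the generated certificate carries the expected subject (CN/O) and validity period, a quick look with openssl (assuming it is installed) is:

    openssl x509 -in kube-scheduler.pem -noout -subject -dates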
    

    2. Create and distribute the kubeconfig file

    The kubeconfig file contains all the information needed to access the apiserver, such as the apiserver address, the CA certificate, and the client certificate kube-scheduler itself uses.

    source /opt/k8s/bin/environment.sh
    
    kubectl config set-cluster kubernetes \
      --certificate-authority=/etc/kubernetes/cert/ca.pem \
      --embed-certs=true \
      --server=${KUBE_APISERVER} \
      --kubeconfig=kube-scheduler.kubeconfig
    kubectl config set-credentials system:kube-scheduler \
      --client-certificate=kube-scheduler.pem \
      --client-key=kube-scheduler-key.pem \
      --embed-certs=true \
      --kubeconfig=kube-scheduler.kubeconfig
    kubectl config set-context system:kube-scheduler \
      --cluster=kubernetes \
      --user=system:kube-scheduler \
      --kubeconfig=kube-scheduler.kubeconfig
    kubectl config use-context system:kube-scheduler --kubeconfig=kube-scheduler.kubeconfig
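
    Before distributing the file, you can inspect it; kubectl prints the cluster, user, and context, with the embedded certificates shown as REDACTED:

    kubectl config view --kubeconfig=kube-scheduler.kubeconfig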
    
    • The certificate and private key created in the previous step, together with the kube-apiserver address, are written into the kubeconfig file.

    Distribute the kubeconfig to all master nodes:
    cat > magic46_distribute_kubeconfig_All_NodeServier.sh << "EOF"
    #!/bin/bash
    # Distribute the kubeconfig to all master nodes
    source /opt/k8s/bin/environment.sh
    for node_ip in ${NODE_IPS[@]}
    do
        echo ">>> ${node_ip}" 
        scp kube-scheduler.kubeconfig k8s@${node_ip}:/etc/kubernetes/
    done
    EOF
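
    Then run the script (it assumes environment.sh defines NODE_IPS as the array of master node IPs, as in the earlier parts of this series):

    bash magic46_distribute_kubeconfig_All_NodeServier.sh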
    

    3. Create and distribute the kube-scheduler systemd unit file

    cat > kube-scheduler.service <<EOF
    [Unit]
    Description=Kubernetes Scheduler
    Documentation=https://github.com/GoogleCloudPlatform/kubernetes
    [Service]
    ExecStart=/opt/k8s/bin/kube-scheduler \\
      --address=127.0.0.1 \\
      --kubeconfig=/etc/kubernetes/kube-scheduler.kubeconfig \\
      --leader-elect=true \\
      --alsologtostderr=true \\
      --logtostderr=false \\
      --log-dir=/var/log/kubernetes \\
      --v=2
    Restart=on-failure
    RestartSec=5
    User=k8s
    [Install]
    WantedBy=multi-user.target
    EOF
    
    • --address: receive http /metrics requests on 127.0.0.1:10251; this version of kube-scheduler does not yet accept https requests;
    • --kubeconfig: path to the kubeconfig file that kube-scheduler uses to connect to and authenticate against kube-apiserver;
    • --leader-elect=true: cluster mode with leader election enabled; the node elected as leader does the scheduling work while the other nodes block in standby;
    • User=k8s: run the service as the k8s account;

    Distribute the systemd unit file to all master nodes:

    cat > magic47_distribute_kube-scheduler_All_NodeServier.sh << "EOF"
    #!/bin/bash
    # Distribute the systemd unit file to all master nodes
    source /opt/k8s/bin/environment.sh
    for node_ip in ${NODE_IPS[@]}
    do
        echo ">>> ${node_ip}" 
        scp kube-scheduler.service root@${node_ip}:/etc/systemd/system/
    done
    EOF
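
    Run it the same way:

    bash magic47_distribute_kube-scheduler_All_NodeServier.sh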
    

    4. Start the kube-scheduler service

    cat > magic48_start_kube-scheduler_servier.sh << "EOF"
    #!/bin/bash
    # Start the kube-scheduler service
    source /opt/k8s/bin/environment.sh
    for node_ip in ${NODE_IPS[@]}
    do
        echo ">>> ${node_ip}" 
        ssh root@${node_ip} "mkdir -p /var/log/kubernetes && chown -R k8s /var/log/kubernetes"
        ssh root@${node_ip} "systemctl daemon-reload && systemctl enable kube-scheduler && systemctl start kube-scheduler"
    done
    EOF
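
    Run it to start the service on every master node:

    bash magic48_start_kube-scheduler_servier.sh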
    

    5. Check the service status

    cat > magic49_check_servier.sh << "EOF"
    #!/bin/bash
    # Check the service status
    source /opt/k8s/bin/environment.sh
    for node_ip in ${NODE_IPS[@]}
    do
        echo ">>> ${node_ip}" 
        ssh k8s@${node_ip} "systemctl status kube-scheduler|grep Active"
    done
    EOF
    

    If you see output like the following:

    bash magic49_check_servier.sh
    >>> 172.68.96.101
       Active: active (running) since Wed 20XX-XX-XX XX:XX:XX CST; XXh ago
    >>> 172.68.96.102
       Active: active (running) since Wed 20XX-XX-XX XX:XX:XX CST; XXh ago
    >>> 172.68.96.103
       Active: active (running) since Wed 20XX-XX-XX XX:XX:XX CST; XXh ago
    

    the service is running normally. If it failed to start, check the logs:

    journalctl -xu kube-scheduler
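
    Besides the systemd status, you can probe the scheduler's health endpoint directly on a node (assuming the default insecure port 10251 used in this deployment); it should return ok:

    curl -s http://127.0.0.1:10251/healthz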
    

    6. View the exposed metrics

    Note: run the following commands on a kube-scheduler node.
    kube-scheduler listens on port 10251 and accepts http requests:

    sudo netstat -lnpt|grep kube-sche
    tcp        0      0 127.0.0.1:10251         0.0.0.0:*               LISTEN      15377/kube-schedule
    
    curl -s http://127.0.0.1:10251/metrics |head
    # HELP apiserver_audit_event_total Counter of audit events generated and sent to the audit backend.
    # TYPE apiserver_audit_event_total counter
    apiserver_audit_event_total 0
    # HELP go_gc_duration_seconds A summary of the GC invocation durations.
    # TYPE go_gc_duration_seconds summary
    go_gc_duration_seconds{quantile="0"} 6.3423e-05
    go_gc_duration_seconds{quantile="0.25"} 0.000120079
    go_gc_duration_seconds{quantile="0.5"} 0.000146495
    go_gc_duration_seconds{quantile="0.75"} 0.000174475
    go_gc_duration_seconds{quantile="1"} 0.001807813
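
    To look at scheduler-specific series rather than the generic Go and apiserver ones, you can filter the output (exact metric names vary across kube-scheduler versions):

    curl -s http://127.0.0.1:10251/metrics | grep '^scheduler_' | head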
    

    7. View the current leader

    kubectl get endpoints kube-scheduler --namespace=kube-system  -o yaml
    apiVersion: v1
    kind: Endpoints
    metadata:
      annotations:
        control-plane.alpha.kubernetes.io/leader: '{"holderIdentity":"kube-node2_XXXXXX-XXXXXX-XXXXXX","leaseDurationSeconds":15,"acquireTime":"20XX-XX-XX XX:XX:XX","renewTime":"20XX-XX-XX XX:XX:XX","leaderTransitions":1}'
      creationTimestamp: 20XX-XX-XX XX:XX:XX
      name: kube-scheduler
      namespace: kube-system
      resourceVersion: "30835"
      selfLink: /api/v1/namespaces/kube-system/endpoints/kube-scheduler
      uid: XXXXXXXXX
    

    As shown above, the current leader is the kube-node2 node.
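
    If you only want the leader identity rather than the full object, a jsonpath query over the leader-election annotation also works (the dots in the annotation key must be escaped):

    kubectl get endpoints kube-scheduler --namespace=kube-system \
      -o jsonpath='{.metadata.annotations.control-plane\.alpha\.kubernetes\.io/leader}'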

    8. Test kube-scheduler high availability

    Pick one or two master nodes, stop the kube-scheduler service on them, and watch whether another node acquires leadership (check the systemd logs).

    Now stop the kube-scheduler service on kube-node2:

    systemctl stop kube-scheduler
    systemctl status kube-scheduler | grep Active
       Active: inactive (dead) since Sat 20XX-XX-XX XX:XX:XX CST; XXs ago
    

    Then check who the leader is now:

    kubectl get endpoints kube-scheduler --namespace=kube-system  -o yaml
    apiVersion: v1
    kind: Endpoints
    metadata:
      annotations:
        control-plane.alpha.kubernetes.io/leader: '{"holderIdentity":"kube-node3_XXXXXX-XXXXXX-XXXXXX","leaseDurationSeconds":15,"acquireTime":"20XX-XX-XX XX:XX:XXZ","renewTime":"20XX-XX-XX XX:XX:XXZ","leaderTransitions":2}'
      creationTimestamp: 20XX-XX-XX XX:XX:XXZ
      name: kube-scheduler
      namespace: kube-system
      resourceVersion: "30984"
      selfLink: /api/v1/namespaces/kube-system/endpoints/kube-scheduler
      uid: XXXXXXXX
    

    As you can see, leadership has moved to the kube-node3 node.
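
    Finally, bring the stopped instance on kube-node2 back up so the cluster returns to three schedulers; it rejoins as a standby:

    systemctl start kube-scheduler
    systemctl status kube-scheduler | grep Active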
