Deploying the Calico Network Plugin on a Binary-Installed k8s Cluster with Ansible

Author: Chris0Yang | Published: 2022-09-06 18:41

    How to handle Pods stuck in CrashLoopBackOff after deploying Calico?

    Link: https://www.jianshu.com/p/87a01ec9964c

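    Environment preparation

    As the title suggests, we need a clean Kubernetes cluster before starting, where "clean" means a cluster that no network plugin has touched yet. I prepared the following three nodes: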

    IP           Role            OS
    10.0.1.111   Master, Node    Ubuntu 18.04
    10.0.1.112   Master, Node    Ubuntu 18.04
    10.0.1.113   Node            Ubuntu 18.04
    

    All nodes run Ubuntu 18.04. I use an Ansible Role to quickly create a clean Kubernetes cluster from binaries.

    Deploying a binary k8s with Ansible

    Link: https://www.jianshu.com/p/85edca636ddc

    Edit the host inventory hosts.yaml as follows:

    all:
      vars:
        ansible_user: root
        ansible_ssh_pass: root1234
        ansible_sudo_pass: root1234
        is_mutil_master: yes
        virtual_ip: 10.0.1.110
        virtual_ip_device: ens33
        service_net: 10.0.0.0/24
        pod_net: 10.244.0.0/16
        proxy_master_port: 7443
        install_dir: /opt/apps/
        package_dir: /opt/packages/
        tls_dir: /opt/k8s_tls
        ntp_host: ntp1.aliyun.com
        have_network: yes
        replace_repo: yes
        docker_registry_mirrors: https://7hsct51i.mirror.aliyuncs.com
        kubelet_bootstrap_token: 8fba966b6e3b5d182960a30f6cb94428
        pause_image: registry.cn-shenzhen.aliyuncs.com/zze/pause:3.2
        dashboard_port: 30001
        dashboard_token_file: dashboard_token.txt
        ingress_controller_type: nginx
      hosts:
        10.0.1.111:
          hostname: k8s-master1
          master: yes
          node: yes
          etcd: yes
          proxy_master: yes
          proxy_priority: 110
        10.0.1.112:
          hostname: k8s-master2
          master: yes
          node: yes
          etcd: yes
          proxy_master: yes
          proxy_priority: 100
        10.0.1.113: 
          hostname: k8s-node1
          etcd: yes
          node: yes
          ingress: yes
    

    The configuration above builds a three-node Kubernetes cluster with 2 Masters and 3 Nodes (both masters also act as nodes). Start the Playbook run:

    $ ansible-playbook -i hosts.yaml run.yml --skip-tags=deploy_manifests
    ...
    TASK [start_service : 签发 Kubelet 申请的证书 - 签发证书 (2/2)] ***************************************************************************************************************************************************************************
    skipping: [10.0.1.112]
    skipping: [10.0.1.113]
    changed: [10.0.1.111]
    
    PLAY RECAP *********************************************************************************************************************************************************************************************************************
    10.0.1.111                 : ok=88   changed=46   unreachable=0    failed=0    skipped=20   rescued=0    ignored=0   
    10.0.1.112                 : ok=68   changed=32   unreachable=0    failed=0    skipped=15   rescued=0    ignored=0   
    10.0.1.113                 : ok=48   changed=20   unreachable=0    failed=0    skipped=35   rescued=0    ignored=0 
    

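    By default, this Ansible role also installs add-ons such as the network plugin, CoreDNS, and Dashboard after deploying the basic Kubernetes components. Passing --skip-tags=deploy_manifests skips those steps.

    Because this Ansible role uses Flannel as its default CNI plugin, some Flannel binaries are pre-installed. Delete them on every node: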

    $ rm -f /opt/apps/cni/bin/*
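
    Since the cleanup has to run on every node, it can be scripted instead of repeated by hand. The following is a minimal sketch, assuming root SSH access to the three node IPs listed earlier; the function name clean_cni_bins and the RUNNER override are hypothetical helpers, not part of the Ansible role.

```shell
# Node IPs taken from the inventory above.
NODES="10.0.1.111 10.0.1.112 10.0.1.113"

# Remove the pre-installed CNI binaries on every node.
# RUNNER defaults to ssh; set RUNNER=echo for a dry run that only prints
# the target and the command instead of executing it.
clean_cni_bins() {
    for node in $NODES; do
        ${RUNNER:-ssh} "root@$node" 'rm -f /opt/apps/cni/bin/*'
    done
}

# Usage:
#   RUNNER=echo clean_cni_bins   # dry run
#   clean_cni_bins               # real run over ssh
```

    An Ansible ad-hoc command such as ansible all -i hosts.yaml -m shell -a 'rm -f /opt/apps/cni/bin/*' should achieve the same result with the existing inventory.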
    

    At this point a clean Kubernetes cluster is ready. Its nodes look like this:

    $ kubectl get node 
    NAME          STATUS     ROLES    AGE   VERSION
    k8s-master1   NotReady   <none>   38m   v1.19.3
    k8s-master2   NotReady   <none>   38m   v1.19.3
    k8s-node1     NotReady   <none>   38m   v1.19.3
    

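    The nodes are NotReady because no network plugin is installed yet. They become Ready once the Calico deployment steps below are complete.

    Calico deployment

    Download the manifest from the official site: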

    $ wget https://docs.projectcalico.org/manifests/calico-etcd.yaml
    

    Only the modified parts are listed below:

    $ vim calico-etcd.yaml
    ...
    # Backtick-wrapped content below is a command: run it and paste its output in place
      # etcd private key
      etcd-key: `cat /opt/k8s_tls/etcd/server-key.pem | base64 -w 0`
      # etcd certificate
      etcd-cert: `cat /opt/k8s_tls/etcd/server.pem | base64 -w 0`
      # etcd CA certificate
      etcd-ca: `cat /opt/k8s_tls/etcd/ca.pem | base64 -w 0`
    ...
      # etcd cluster endpoints
      etcd_endpoints: "https://10.0.1.111:2379,https://10.0.1.112:2379,https://10.0.1.113:2379"
      etcd_ca: "/calico-secrets/etcd-ca"
      etcd_cert: "/calico-secrets/etcd-cert"
      etcd_key: "/calico-secrets/etcd-key"
    ...
                # Disable IPIP mode
                - name: CALICO_IPV4POOL_IPIP
                  value: "Never"
                # Pod IP range; this value should match the pod_net variable in hosts.yaml
                - name: CALICO_IPV4POOL_CIDR
                  value: "10.244.0.0/16"
    ...
            # Host directory for the CNI plugin binaries; /opt/apps matches the install_dir variable in hosts.yaml
            - name: cni-bin-dir
              hostPath:
                path: /opt/apps/cni/bin
            # Host directory for the CNI configuration; /opt/apps matches the install_dir variable in hosts.yaml
            - name: cni-net-dir
              hostPath:
                path: /opt/apps/cni/conf
            # Host directory for the CNI logs; /opt/apps matches the install_dir variable in hosts.yaml
            - name: cni-log-dir
              hostPath:
                path: /opt/apps/cni/log
            # Set the mount mode of this volume to 0440 (it appears in two places)
            - name: etcd-certs
              secret:
                secretName: calico-etcd-secrets
                defaultMode: 0440
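
    The three base64 values can also be filled in by a script rather than pasted by hand. The sketch below assumes the downloaded manifest still contains the commented placeholders "# etcd-key: null", "# etcd-cert: null" and "# etcd-ca: null" in the calico-etcd-secrets Secret, and that GNU base64 and GNU sed are available; check the placeholder text in your copy first. The function name fill_etcd_secrets is hypothetical.

```shell
# Replace the commented etcd TLS placeholders in the manifest with the
# base64-encoded contents of the certificate files.
#   $1: path to calico-etcd.yaml
#   $2: directory holding server-key.pem, server.pem and ca.pem
fill_etcd_secrets() {
    manifest="$1"
    tls_dir="$2"
    # -w 0 disables line wrapping so each value stays on one line.
    key=$(base64 -w 0 "$tls_dir/server-key.pem")
    cert=$(base64 -w 0 "$tls_dir/server.pem")
    ca=$(base64 -w 0 "$tls_dir/ca.pem")
    # Base64 output never contains '|', so it is a safe sed delimiter.
    # \1 keeps the original indentation of each placeholder line.
    sed -i \
        -e "s|^\( *\)# etcd-key: null|\1etcd-key: ${key}|" \
        -e "s|^\( *\)# etcd-cert: null|\1etcd-cert: ${cert}|" \
        -e "s|^\( *\)# etcd-ca: null|\1etcd-ca: ${ca}|" \
        "$manifest"
}

# Usage:
#   fill_etcd_secrets calico-etcd.yaml /opt/k8s_tls/etcd
```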
    

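    The images referenced by this manifest are hosted overseas, so I downloaded them and pushed them to an Aliyun registry. Run the following to switch the image references: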

    $ sed -i 's#docker.io/calico/cni:v3.18.0#registry.cn-shenzhen.aliyuncs.com/zze/calico-cni:v3.18.0#g;s#docker.io/calico/pod2daemon-flexvol:v3.18.0#registry.cn-shenzhen.aliyuncs.com/zze/calico-pod2daemon-flexvol:v3.18.0#g;s#docker.io/calico/node:v3.18.0#registry.cn-shenzhen.aliyuncs.com/zze/calico-node:v3.18.0#g;s#docker.io/calico/kube-controllers:v3.18.0#registry.cn-shenzhen.aliyuncs.com/zze/calico-kube-controllers:v3.18.0#g' calico-etcd.yaml
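
    The same four substitutions can be written as a loop, which is easier to read and to adapt to another version. This is a minimal sketch of an equivalent rewrite, assuming GNU sed; the function name mirror_calico_images and the CALICO_VERSION variable are hypothetical.

```shell
# Image version referenced by the downloaded manifest.
CALICO_VERSION="v3.18.0"

# Point every Calico image in the manifest at the Aliyun mirror.
#   $1: path to calico-etcd.yaml
mirror_calico_images() {
    manifest="$1"
    for name in cni pod2daemon-flexvol node kube-controllers; do
        sed -i "s#docker.io/calico/${name}:${CALICO_VERSION}#registry.cn-shenzhen.aliyuncs.com/zze/calico-${name}:${CALICO_VERSION}#g" "$manifest"
    done
}

# Usage:
#   mirror_calico_images calico-etcd.yaml
#   grep 'image:' calico-etcd.yaml   # review the result
```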
    

    Note: the manifest downloaded at the time of writing uses image version v3.18.0.
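
    Before applying, it may be worth confirming that the Pod CIDR in the manifest really matches pod_net in the inventory, since a mismatch is easy to miss. The following is a rough sketch that assumes the value line directly follows the uncommented "- name: CALICO_IPV4POOL_CIDR" line and that pod_net sits on its own line in hosts.yaml; all three function names are hypothetical.

```shell
# Print the CIDR on the line after the uncommented CALICO_IPV4POOL_CIDR entry.
manifest_pod_cidr() {
    awk '/^[ \t]*- name: CALICO_IPV4POOL_CIDR/ {
             getline          # move to the "value:" line
             gsub(/"/, "")    # strip the surrounding quotes
             print $2
         }' "$1"
}

# Print the pod_net value from the Ansible inventory.
inventory_pod_net() {
    awk '$1 == "pod_net:" { print $2 }' "$1"
}

# Compare both values; returns non-zero on a mismatch or a missing value.
#   $1: path to calico-etcd.yaml   $2: path to hosts.yaml
check_pod_cidr() {
    m=$(manifest_pod_cidr "$1")
    i=$(inventory_pod_net "$2")
    if [ -n "$m" ] && [ "$m" = "$i" ]; then
        echo "ok: $m"
    else
        echo "mismatch: manifest='$m' inventory='$i'"
        return 1
    fi
}

# Usage:
#   check_pod_cidr calico-etcd.yaml hosts.yaml
```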

    Apply the modified manifest:

    $ kubectl apply -f calico-etcd.yaml
    secret/calico-etcd-secrets created
    configmap/calico-config created
    clusterrole.rbac.authorization.k8s.io/calico-kube-controllers created
    clusterrolebinding.rbac.authorization.k8s.io/calico-kube-controllers created
    clusterrole.rbac.authorization.k8s.io/calico-node created
    clusterrolebinding.rbac.authorization.k8s.io/calico-node created
    daemonset.apps/calico-node created
    serviceaccount/calico-node created
    deployment.apps/calico-kube-controllers created
    serviceaccount/calico-kube-controllers created
    poddisruptionbudget.policy/calico-kube-controllers created
    

    After a short wait, the following Pods start in the kube-system namespace:

    $ kubectl get pod -n kube-system -o wide
    NAME                                       READY   STATUS    RESTARTS   AGE   IP               NODE          NOMINATED NODE   READINESS GATES
    calico-kube-controllers-79678fdb96-5w4kl   1/1     Running   0          16m   10.0.1.111       k8s-master1   <none>           <none>
    calico-node-hsm8s                          1/1     Running   0          16m   10.0.1.112       k8s-master2   <none>           <none>
    calico-node-qnm9r                          1/1     Running   0          16m   10.0.1.113       k8s-node1     <none>           <none>
    calico-node-t4cjq                          1/1     Running   0          16m   10.0.1.111       k8s-master1   <none>           <none>
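
    If a Pod ends up in a state such as CrashLoopBackOff instead, a small filter over the kubectl output makes it easy to spot. This is a minimal sketch that reads the table from standard input and assumes STATUS is the third column, as in the output above; the function name pods_not_running is hypothetical.

```shell
# Read "kubectl get pod" output on stdin and print the name and status
# of every Pod whose STATUS column is not "Running".
pods_not_running() {
    # NR > 1 skips the header row; $3 is the STATUS column.
    awk 'NR > 1 && $3 != "Running" { print $1, $3 }'
}

# Usage:
#   kubectl get pod -n kube-system | pods_not_running
```

    An empty result means every listed Pod is Running. Pods that finished normally (for example Completed) would also be printed by this filter.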
    

    To test cross-host Pod communication, apply the following manifest:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      labels:
        app: test
      name: test
    spec:
      replicas: 3
      selector:
        matchLabels:
          app: test
      strategy: {}
      template:
        metadata:
          labels:
            app: test
        spec:
          containers:
          - image: busybox:latest
            command: ['sleep','3000']
            name: busybox
    

    Once applied, it creates the following three Pods:

    $ kubectl get pod -o wide
    NAME                   READY   STATUS    RESTARTS   AGE     IP           NODE          NOMINATED NODE   READINESS GATES
    test-c4f594994-nl2ks   1/1     Running   0          2m49s   10.244.0.5   k8s-node1     <none>           <none>
    test-c4f594994-s48pl   1/1     Running   0          2m49s   10.244.2.2   k8s-master1   <none>           <none>
    test-c4f594994-wv6nj   1/1     Running   0          2m49s   10.244.1.2   k8s-master2   <none>           <none>
    

    You can also inspect the routes managed by Calico on each Node.
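
    One way to look at those routes is to filter the kernel routing table. The sketch below assumes Calico is running with its default BIRD BGP agent, which normally installs routes tagged "proto bird"; the function name calico_routes is hypothetical.

```shell
# Read "ip route show" output on stdin and keep only the routes
# installed by BIRD, the BGP agent inside calico-node.
calico_routes() {
    grep 'proto bird'
}

# Usage (run on a node):
#   ip route show | calico_routes
```

    With IPIP set to "Never" as above, routes to other nodes' Pod ranges should point at the peer node's IP on the physical interface rather than at a tunnel device.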

    Enter any Pod and ping the other two:

    $ kubectl exec -it test-c4f594994-nl2ks -- sh
    / # ping 10.244.2.2
    PING 10.244.2.2 (10.244.2.2): 56 data bytes
    64 bytes from 10.244.2.2: seq=0 ttl=62 time=0.474 ms
    ^C
    --- 10.244.2.2 ping statistics ---
    1 packets transmitted, 1 packets received, 0% packet loss
    round-trip min/avg/max = 0.474/0.474/0.474 ms
    / # ping 10.244.1.2
    PING 10.244.1.2 (10.244.1.2): 56 data bytes
    64 bytes from 10.244.1.2: seq=0 ttl=62 time=0.321 ms
    ^C
    --- 10.244.1.2 ping statistics ---
    1 packets transmitted, 1 packets received, 0% packet loss
    round-trip min/avg/max = 0.321/0.321/0.321 ms
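
    The manual ping test can be repeated for every Pod of the test Deployment with a short loop. This is a sketch under the assumption that the Pods carry the app=test label from the manifest above and that the image ships busybox ping; the function name ping_all_test_pods and the KUBECTL override are hypothetical.

```shell
# Command used to talk to the cluster; override it for testing.
KUBECTL="${KUBECTL:-kubectl}"

# From the Pod named in $1, ping every Pod IP of the "test" Deployment once.
# Prints one line per target and returns non-zero if any ping fails.
ping_all_test_pods() {
    src="$1"
    # Space-separated list of Pod IPs selected by label.
    ips=$($KUBECTL get pod -l app=test -o jsonpath='{.items[*].status.podIP}')
    rc=0
    for ip in $ips; do
        # -c 1: send one packet; -W 2: wait at most two seconds for a reply.
        if $KUBECTL exec "$src" -- ping -c 1 -W 2 "$ip" >/dev/null 2>&1; then
            echo "ok   $ip"
        else
            echo "FAIL $ip"
            rc=1
        fi
    done
    return $rc
}

# Usage:
#   ping_all_test_pods test-c4f594994-nl2ks
```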
    

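    The Pods can reach each other, which shows that Calico is working correctly in the Kubernetes cluster.
    Official reference: https://docs.projectcalico.org/getting-started/kubernetes/self-managed-onprem/onpremises

    After finishing this article, I enhanced the Ansible Role used above to support Calico. If you want to use Calico in your Kubernetes cluster, you can deploy it in one step with my Ansible Role. Link: https://www.jianshu.com/p/85edca636ddc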
