kube-on-kube-operator and node-operator


Author: davisgao | Published 2019-11-09 15:31

    1. Overview

    The overall design follows the ideas described in the Ant Financial article 《深度 | 蚂蚁金服自动化运维大规模 Kubernetes 集群的实践之路》 (on automating the operation of large-scale Kubernetes clusters).

    The operators described here were developed with operator-sdk.

    The goal is to let a meta cluster bring business clusters online and offline quickly and manage their life cycle — for example, rapidly creating and reclaiming clusters for tenants on a public cloud.

    2. Overall architecture

    (architecture diagram omitted)

    3. Main workflow

    • Build the meta cluster (platform deployment)
      As the supporting system of the whole container platform, it lives in the support zone; it can be built either with kubeadm or by installing Kubernetes manually.

      Register Master-Operator and Node-Operator with the meta cluster. Master covers etcd, apiserver, scheduler and controller-manager; Node covers kubelet and the container runtime (Docker or another runtime).

      Also write the node certificates into a ConfigMap, so that when Node-Operator joins a node it can fetch the certificates from the cluster and distribute them to the new node (a configuration center would work just as well). In fact the CA certificate is already in the cluster by default — for example in the ConfigMap extension-apiserver-authentication.

      kubectl -n kube-system create configmap ca --from-file=/etc/kubernetes/ca.pem
      kubectl -n kube-system create configmap bootstrap --from-file=/etc/kubernetes/bootstrap.conf
      
    • Host provisioning or configuration (tenant operation)
      If an underlying IaaS layer is available, hosts can be created through the IaaS API from the given specification; otherwise configure existing hosts directly.

    • Certificate generation
      From the supplied cluster information (hosts, POD_CIDR, SERVICE_CIDR, etc.), generate the cluster certificates and write them into the configuration center (Nacos) or a ConfigMap in the meta cluster. Note that some certificates may exceed 1 MB; etcd values should stay under 1 MB, but certificates are rarely that large, so in practice this is not a problem. The author registered the certificate-generation tool as a serverless function in the meta cluster, built on kubeless — use this only if it fits your setup (see the serverless article for details).
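The post does not show the certificate-generation function itself. As a minimal sketch of what it could do, the snippet below self-signs a cluster CA using only Go's standard library; the function name and subject fields are illustrative, and a real implementation would also issue the apiserver, kubelet and etcd certificates from this CA:

```go
package main

import (
	"crypto/rand"
	"crypto/rsa"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"fmt"
	"math/big"
	"time"
)

// newClusterCA generates a self-signed CA certificate and key for one
// business cluster; the resulting PEM blocks can then be written to Nacos
// or to a ConfigMap in the meta cluster.
func newClusterCA(cluster string) (certPEM, keyPEM []byte, err error) {
	key, err := rsa.GenerateKey(rand.Reader, 2048)
	if err != nil {
		return nil, nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(time.Now().UnixNano()),
		Subject:               pkix.Name{CommonName: cluster + "-ca"},
		NotBefore:             time.Now(),
		NotAfter:              time.Now().AddDate(10, 0, 0),
		IsCA:                  true,
		KeyUsage:              x509.KeyUsageCertSign | x509.KeyUsageDigitalSignature,
		BasicConstraintsValid: true,
	}
	// self-signed: the template is its own parent
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, nil, err
	}
	certPEM = pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: der})
	keyPEM = pem.EncodeToMemory(&pem.Block{Type: "RSA PRIVATE KEY", Bytes: x509.MarshalPKCS1PrivateKey(key)})
	return certPEM, keyPEM, nil
}

func main() {
	cert, _, err := newClusterCA("tenant1-cluster1")
	if err != nil {
		panic(err)
	}
	fmt.Printf("generated CA certificate, %d bytes of PEM\n", len(cert))
}
```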

    • Use the meta cluster's Node-Operator to join the business cluster's control-plane nodes to the meta cluster as regular nodes
      When joining the nodes to the meta cluster, give kubelet default labels so they can be selected at deployment time, configured as follows

      --node-labels=kubelet.kubernetes.io/tenant=<tenant-id>,kubelet.kubernetes.io/cluster=<cluster-name> \
      

      In addition, when the meta cluster was built, the certificates kubelet needs were placed in ConfigMaps; at this point they are distributed to kubelet (inside Node-Operator this is done mainly through Ansible)

        // only write the files when the ConfigMaps are actually found
        bootstrap := &corev1.ConfigMap{}
        err = r.client.Get(context.TODO(), types.NamespacedName{Name: "bootstrap", Namespace: "kube-system"}, bootstrap)
        if err == nil {
            ioutil.WriteFile("/etc/kubernetes/bootstrap.conf", []byte(bootstrap.Data["bootstrap.conf"]), os.ModePerm)
        }
        ca := &corev1.ConfigMap{}
        err = r.client.Get(context.TODO(), types.NamespacedName{Name: "ca", Namespace: "kube-system"}, ca)
        if err == nil {
            ioutil.WriteFile("/etc/kubernetes/ca.pem", []byte(ca.Data["ca.pem"]), os.ModePerm)
        }
      

    Deploy the remaining containerized components, such as kube-proxy, monitoring and log collection.

    • Use Master-Operator to deploy the master components on the nodes carrying the matching labels (tenant and cluster name)

      etcd: a Deployment with a node label selector, using HostNetwork. An init container is added which, given the tenant and cluster name, fetches the generated cluster certificates from the meta cluster (or the configuration center) — including the business-cluster node certificates bootstrap.conf and ca.pem — and places them in the host directory /etc/kubernetes/pki

      func initContainers() []corev1.Container{
        return []corev1.Container{
            {
              Name: "certs",
              Image: "cloud.org/cert:latest",
              ImagePullPolicy: corev1.PullIfNotPresent,
              VolumeMounts:[]corev1.VolumeMount{
                  {
                      Name:      "cert",
                      MountPath: "/etc/kubernetes/pki",
                  },
               },
            },
          }
      }
      

      Later, for disaster recovery, etcd itself can be managed by an operator; CoreOS already provides one.

      apiserver: a Deployment with a node label selector, using HostNetwork
      scheduler: a Deployment with a node label selector, using HostNetwork
      controller-manager: a Deployment with a node label selector, using HostNetwork
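The assemblyContainers helper is not shown in the post. As a hedged sketch, the apiserver container's command could be assembled from the collected parameters like this — the flags are standard kube-apiserver flags, while the PKI file names follow the /etc/kubernetes/pki mount populated by the init container and are assumptions:

```go
package main

import (
	"fmt"
	"strings"
)

// apiserverCommand assembles the kube-apiserver command line from the
// etcd endpoints and service CIDR gathered by the controller.
func apiserverCommand(etcdEndpoints, serviceCIDR string) []string {
	return []string{
		"kube-apiserver",
		"--etcd-servers=" + etcdEndpoints,
		"--service-cluster-ip-range=" + serviceCIDR,
		"--client-ca-file=/etc/kubernetes/pki/ca.pem",
		"--tls-cert-file=/etc/kubernetes/pki/apiserver.pem",
		"--tls-private-key-file=/etc/kubernetes/pki/apiserver-key.pem",
		"--allow-privileged=true",
	}
}

func main() {
	cmd := apiserverCommand("https://192.168.1.11:2379,https://192.168.1.12:2379", "10.96.0.0/12")
	fmt.Println(strings.Join(cmd, " \\\n  "))
}
```

The pod CIDR collected alongside would go to the controller-manager container instead, via its --cluster-cidr flag.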

    After startup completes, two more actions are required:
    ① write the certificates the business cluster's nodes need into a ConfigMap in the business cluster

    ② register Node-Operator in the business cluster

    • Use the business cluster's Node-Operator to join worker nodes to the business cluster

    4. Implementation

    4.1. Development environment

    • Environment setup (go, git)
    export GOROOT=/data/go
    export GOPATH=/data/work/go
    export PATH=$PATH:$GOROOT/bin:$GOPATH/bin
    #enable Go module support
    export GO111MODULE=on
    #use a module proxy for faster downloads
    export GOPROXY=https://goproxy.cn
    
    • operator-sdk installation
    #/data/work/go/src
    [root@node1 src]# wget https://github.com/operator-framework/operator-sdk/releases/download/v0.11.0/operator-sdk-v0.11.0-x86_64-linux-gnu
    [root@node1 src]# chmod +x operator-sdk-v0.11.0-x86_64-linux-gnu
    [root@node1 src]# mv operator-sdk-v0.11.0-x86_64-linux-gnu operator-sdk
    [root@node1 src]# mv operator-sdk /usr/local/bin
    

    4.2. Master-Operator

    [root@node operator]# operator-sdk new master --repo cloud.org/operator/master
    INFO[0000] Creating new Go operator 'master'.           
    INFO[0000] Created go.mod                               
    # ... output omitted ...
    INFO[0014] Project validation successful.               
    INFO[0014] Project creation complete.
    [root@node master]# operator-sdk add api --api-version=crd.cloud.org/v1alpha1 --kind=KubeMaster
    INFO[0000] Generating api version crd.cloud.org/v1alpha1 for kind KubeMaster. 
    INFO[0000] Created pkg/apis/crd/group.go                
    INFO[0012] Created pkg/apis/crd/v1alpha1/kubemaster_types.go 
    INFO[0012] Created pkg/apis/addtoscheme_crd_v1alpha1.go 
    INFO[0012] Created pkg/apis/crd/v1alpha1/register.go    
    INFO[0012] Created pkg/apis/crd/v1alpha1/doc.go         
    # ... output omitted ...
    deploy/crds/crd.cloud.org_kubemasters_crd.yaml 
    INFO[0026] Code-generation complete.                    
    INFO[0026] API generation complete.
    [root@node master]# operator-sdk add controller --api-version=crd.cloud.org/v1alpha1 --kind=KubeMaster
    INFO[0000] Generating controller version crd.cloud.org/v1alpha1 for kind KubeMaster. 
    INFO[0000] Created pkg/controller/kubemaster/kubemaster_controller.go 
    INFO[0000] Created pkg/controller/add_kubemaster.go     
    INFO[0000] Controller generation complete. 
    
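With the API scaffolded, a custom resource for a business cluster might look like the following. The spec fields mirror those referenced by the controller code below (Masters, Etcds, PodCIDR, ServiceCIDR), but the exact schema — including the field casing and the kind name — depends on how kubemaster_types.go is filled in, so treat this as an illustrative sketch:

```yaml
apiVersion: crd.cloud.org/v1alpha1
kind: KubeMaster
metadata:
  name: tenant1-cluster1
  namespace: kube-system
spec:
  masters:
    - ip: 192.168.1.11
    - ip: 192.168.1.12
    - ip: 192.168.1.13
  etcds:
    - ip: 192.168.1.11
    - ip: 192.168.1.12
    - ip: 192.168.1.13
  podCIDR: 10.244.0.0/16
  serviceCIDR: 10.96.0.0/12
```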

    Create the Deployment

    
    // newDeploymentForCR returns a Deployment running the master components for the given CR
    func newDeploymentForCR(cr *cloudv1alpha1.MasterOperator) *appv1.Deployment {
        labels := map[string]string{
            "app": cr.Name,
        }
        nodeSelector := map[string]string{
            "kubelet.kubernetes.io/tenant": cr.Name, // TODO: should be tenant ID + cluster name
        }
    
        buffer := bytes.NewBufferString("")
        etcdSize := len(cr.Spec.Etcds)
        for index,node := range cr.Spec.Etcds {
            buffer.WriteString("https://"+node.IP+":2379")
            if etcdSize-1 != index  {
                buffer.WriteString(",")
            }
        }
    
        matchLabels := map[string]string{
            "app" : cr.Name + "-master",
        }
        if cr.Selector == nil {
            cr.Selector = &metav1.LabelSelector{
                MatchLabels: matchLabels,
            }
        } else if cr.Selector.MatchLabels == nil {
            cr.Selector.MatchLabels = matchLabels
        } else {
            cr.Selector.MatchLabels["app"] = cr.Name + "-master"
        }
        // TODO: etcd, apiserver, controller-manager and scheduler run in a single Pod here; in production deploy each as its own Deployment
        size := int32(len(cr.Spec.Masters))
        return &appv1.Deployment{
            ObjectMeta: metav1.ObjectMeta{
                Name:      cr.Name + "-master",
                Namespace: cr.Namespace,
                Labels:    labels,
            },
            Spec: appv1.DeploymentSpec{
                Selector: cr.Selector,
                Replicas: &size,
                Template: corev1.PodTemplateSpec{
                    ObjectMeta: metav1.ObjectMeta{
                        Labels: cr.Selector.MatchLabels,
                        Name:  cr.Name + "-master",
                    },
                    Spec: corev1.PodSpec{
                        NodeSelector: nodeSelector,
                        HostNetwork: true,
                        DNSPolicy: corev1.DNSClusterFirstWithHostNet,
                        //Priority: 2000000000,
                        PriorityClassName: "system-cluster-critical",
                        // init container responsible for downloading the certificates
                        InitContainers: initContainers(),
                        Containers: assemblyContainers(buffer.String(),cr.Spec.ServiceCIDR,cr.Spec.PodCIDR),
                        Volumes: []corev1.Volume{
                            certVolume(),
                            localtimeVolume("localtime","/etc/localtime","file"),
                        },
                    },
    
                },
            },
    
        }
    }
    

    4.3. Node-Operator

    [root@node operator]# operator-sdk new node --repo cloud.org/operator/node
    # ... output omitted ...
    INFO[0006] Project validation successful.               
    INFO[0006] Project creation complete.
    [root@node node]# operator-sdk add api --api-version=crd.cloud.org/v1alpha1 --kind=KubeNode
    # ... output omitted ...
    INFO[0011] Code-generation complete.                    
    INFO[0011] API generation complete.
    [root@node node]# operator-sdk add controller --api-version=crd.cloud.org/v1alpha1 --kind=KubeNode
    INFO[0000] Generating controller version crd.cloud.org/v1alpha1 for kind KubeNode. 
    INFO[0000] Created pkg/controller/kubenode/kubenode_controller.go 
    INFO[0000] Created pkg/controller/add_kubenode.go       
    INFO[0000] Controller generation complete.
    

    Initialize the Ansible configuration files

        // 1. initialize the Ansible hosts (inventory) file
        err = initAnsibleHosts(instance.Spec.Nodes)
        if err != nil {
            fmt.Println("failed to initialize the Ansible hosts file")
            return reconcile.Result{}, err
        }
    

    Fetch the certificates the nodes need

            // only write the files when the ConfigMaps are actually found
            bootstrap := &corev1.ConfigMap{}
            err = r.client.Get(context.TODO(), types.NamespacedName{Name: "bootstrap", Namespace: "kube-system"}, bootstrap)
            if err == nil {
                ioutil.WriteFile("/etc/kubernetes/bootstrap.conf", []byte(bootstrap.Data["bootstrap.conf"]), os.ModePerm)
            }
            clientCA := &corev1.ConfigMap{}
            err = r.client.Get(context.TODO(), types.NamespacedName{Name: "extension-apiserver-authentication", Namespace: "kube-system"}, clientCA)
            if err == nil {
                ioutil.WriteFile("/etc/kubernetes/ca.pem", []byte(clientCA.Data["client-ca-file"]), os.ModePerm)
            }
    

    The Ansible playbook

    - hosts: kubernetes
      remote_user: root
      tasks:
        - name: "1. Create the working directories"
          file:
            path: "{{ item.path }}"
            state: "{{ item.state }}"
          with_items:
            - { path: "{{WORK_PATH}}/", state: "directory" }
            - { path: "/etc/kubernetes/pki", state: "directory" }
            - { path: "/etc/kubernetes/manifests", state: "directory" }
        - name: "2. Copy the installation files (could later come from a download center); TODO: certificates"
          copy:
            src: "{{ item.src }}"
            dest: "{{ item.dest }}"
          with_items:
            - { src: "/assets/systemd/docker.service", dest: "/usr/lib/systemd/system/" }
            - { src: "/assets/systemd/kubelet.service", dest: "/usr/lib/systemd/system/" }
            - { src: "/assets/pki/ca.pem", dest: "/etc/kubernetes/pki" }
            - { src: "/assets/pki/bootstrap.conf", dest: "/etc/kubernetes/" }
            - { src: "/assets/script/kubelet.sh", dest: "/data/" }
        - name: "3. Install the node"
          shell: sh /data/kubelet.sh {{WORK_PATH}} {{TENANT}}
        - name: "4. Clean up the installation files"
          shell: rm -rf /data/kubelet.sh
    
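The operator then shells out to ansible-playbook against the rendered inventory. A sketch of how that invocation could be built — the file paths and extra-variable names are assumptions, while -i and -e are standard ansible-playbook flags:

```go
package main

import (
	"fmt"
	"os/exec"
)

// playbookCmd builds the ansible-playbook invocation that distributes the
// files and runs kubelet.sh on every host in the inventory.
func playbookCmd(inventory, playbook, workPath, tenant string) *exec.Cmd {
	return exec.Command("ansible-playbook",
		"-i", inventory,
		"-e", fmt.Sprintf("WORK_PATH=%s TENANT=%s", workPath, tenant),
		playbook,
	)
}

func main() {
	cmd := playbookCmd("/etc/ansible/hosts", "node.yaml", "/data/kubernetes", "tenant1")
	// in the operator this would be cmd.CombinedOutput() with the output logged
	fmt.Println(cmd.Args)
}
```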

    The node bootstrap script

    #!/bin/bash
    # ----------------------------------------------------------------------
    # name:         kubelet.sh
    # version:      1.0
    # createTime:   2019-06-25
    # description:  node initialization
    # author:       doublegao
    # email:        doublegao@gmail.com
    # params:       work path, tenant, dockerVersion, kubeletVersion
    # example:      kubelet.sh path tenant 18.09.9 v1.15.3
    # ----------------------------------------------------------------------
    
    
    WORK_PATH=$1
    #strip a trailing slash
    WORK_PATH=${WORK_PATH%*/}
    DOCKER_PATH=$WORK_PATH/docker
    KUBELET_PATH=$WORK_PATH/kubelet
    mkdir -p $DOCKER_PATH
    mkdir -p $KUBELET_PATH
    DOCKER_VERSION=18.09.9
    KUBELET_VERSION=v1.15.3
    
    
    echo "#############   1. System initialization"
    setenforce 0
    sed -i "s/^SELINUX=enforcing/SELINUX=disabled/g" /etc/selinux/config
    systemctl disable firewalld
    systemctl stop firewalld
    swapoff -a
    sysctl -p
    sed -i 's/.*swap.*/#&/' /etc/fstab
    iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X
    
    cat > /etc/sysctl.d/k8s.conf <<EOF
    net.bridge.bridge-nf-call-ip6tables = 1
    net.bridge.bridge-nf-call-iptables = 1
    EOF
    #load the bridge module and apply the settings written above
    modprobe br_netfilter
    sysctl --system
    
    echo "#############   2. Add the docker and kubernetes yum repos"
    wget -P /etc/yum.repos.d https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
    
    cat <<EOF > /etc/yum.repos.d/kubernetes.repo
    [kubernetes]
    name=Kubernetes
    baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
    enabled=1
    gpgcheck=1
    repo_gpgcheck=1
    gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
    EOF
    
    echo "#############   3. Check yum availability"
    yum repolist
    
    echo "#############   4. Install and enable kubelet and docker"
    yum install -y docker-ce-$DOCKER_VERSION-3.el7.x86_64 kubelet-$KUBELET_VERSION-0.x86_64
    systemctl enable docker.service && systemctl enable kubelet.service
    
    echo "#############   5. Patch the configuration files"
    sed -i "s#_DATA_ROOT_#$DOCKER_PATH#g" /usr/lib/systemd/system/docker.service
    sed -i "s#_HOSTNAME_#$HOSTNAME#g" /usr/lib/systemd/system/kubelet.service
    sed -i "s#_KUBELET_PATH_#$KUBELET_PATH#g" /usr/lib/systemd/system/kubelet.service
    sed -i "s#_TENANT_#$2#g" /usr/lib/systemd/system/kubelet.service
    
    
    
    echo "#############   6. Start docker and pull the pause image"
    systemctl daemon-reload && systemctl start docker.service
    docker pull k8s.gcr.io/pause-amd64:3.1
    
    echo "#############   7. Start kubelet"
    systemctl daemon-reload && systemctl start kubelet.service
    

        Original link: https://www.haomeiwen.com/subject/omoimctx.html