Kubernetes 1.9 Deployment

Author: 一个技术 | Published 2018-04-06 19:42
    install docker/kubeadm

    https://kubernetes.io/docs/setup/independent/install-kubeadm/

    On all Kubernetes nodes, configure the kubelet to use cgroupfs so it matches dockerd; otherwise the kubelet will fail to start.

    Docker configuration:
    cat << EOF > /etc/docker/daemon.json
    {
      "exec-opts": ["native.cgroupdriver=cgroupfs"]
    }
    EOF
    systemctl daemon-reload && systemctl restart docker
    
    By default the kubelet uses cgroup-driver=systemd; change it to cgroup-driver=cgroupfs:
    vi /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
    #Environment="KUBELET_CGROUP_ARGS=--cgroup-driver=systemd"
    Environment="KUBELET_CGROUP_ARGS=--cgroup-driver=cgroupfs"
    
    Reload the unit files and restart the kubelet service:
    systemctl daemon-reload && systemctl restart kubelet
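    To confirm both sides now agree (a quick check; the exact output wording varies by Docker version):
    docker info 2>/dev/null | grep -i cgroup    # expect: Cgroup Driver: cgroupfs
    grep cgroup-driver /etc/systemd/system/kubelet.service.d/10-kubeadm.conf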
    

    Disable swap

    swapoff -a
    vim /etc/fstab  # comment out the swap line
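    If you prefer a one-liner instead of editing by hand, something like the following works with GNU sed (a sketch; double-check the result in /etc/fstab afterwards):
    sed -i '/\sswap\s/ s/^#*/#/' /etc/fstab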
    
    Master node: kubeadm init, method 1

    Initialize the cluster; this also starts an etcd pod:

    kubeadm init --pod-network-cidr=10.244.0.0/16
    
    Master node: kubeadm init, method 2

    Initialize from a configuration file (a complete example is written out in the kubeadm config step below, after etcd is set up).
    The full list of config-file options is in the official docs:
    https://kubernetes.io/docs/reference/setup-tools/kubeadm/kubeadm-init/#config-file

    install etcd

    1. Prepare certificates
      https://www.kubernetes.org.cn/3096.html

    The CFSSL tools need to be installed on master1; they will be used to create the TLS certificates.

    export CFSSL_URL="https://pkg.cfssl.org/R1.2"
    wget "${CFSSL_URL}/cfssl_linux-amd64" -O /usr/local/bin/cfssl
    wget "${CFSSL_URL}/cfssljson_linux-amd64" -O /usr/local/bin/cfssljson
    chmod +x /usr/local/bin/cfssl /usr/local/bin/cfssljson
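    A quick sanity check that the binaries work:
    cfssl version    # should print the CFSSL release, e.g. 1.2.0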
    

    Create the cluster CA and certificates
    In this part we generate the client and server certificates for each component, and a client certificate for the Kubernetes admin user.

    Create the /etc/etcd/ssl directory, then cd into it and complete the following steps there.

     mkdir -p /etc/etcd/ssl && cd /etc/etcd/ssl
     export PKI_URL="https://kairen.github.io/files/manual-v1.8/pki"
    

    Download ca-config.json and etcd-ca-csr.json, then generate the CA key and certificate:

    wget "${PKI_URL}/ca-config.json" "${PKI_URL}/etcd-ca-csr.json"
    cfssl gencert -initca etcd-ca-csr.json | cfssljson -bare etcd-ca
    ls etcd-ca*.pem
    etcd-ca-key.pem  etcd-ca.pem
    

    Download etcd-csr.json and generate the etcd certificate:

    wget "${PKI_URL}/etcd-csr.json"   # change the IPs to the local ones; for a cluster, add every node's IP
    cfssl gencert \
      -ca=etcd-ca.pem \
      -ca-key=etcd-ca-key.pem \
      -config=ca-config.json \
      -profile=kubernetes \
      etcd-csr.json | cfssljson -bare etcd
    
    ls etcd*.pem
    etcd-ca-key.pem  etcd-ca.pem  etcd-key.pem  etcd.pem
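    Optionally inspect the generated certificate to confirm its SANs and validity (assuming openssl is installed):
    openssl x509 -in etcd.pem -noout -text | grep -A1 'Subject Alternative Name'
    openssl x509 -in etcd.pem -noout -dates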
    

    If the node IPs differ, edit the hosts field of etcd-csr.json; a sketch is shown below.
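    A minimal hand-written equivalent of etcd-csr.json for a three-node cluster (a sketch, not the downloaded file verbatim; the downloaded version carries additional name fields you can keep):
    cat << EOF > etcd-csr.json
    {
      "CN": "etcd",
      "hosts": [
        "127.0.0.1",
        "192.168.1.144",
        "192.168.1.145",
        "192.168.1.146"
      ],
      "key": {
        "algo": "rsa",
        "size": 2048
      }
    }
    EOF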

    When done, delete the files that are no longer needed: rm -rf *.json

    Confirm that /etc/etcd/ssl contains the following files:

    ls /etc/etcd/ssl
    etcd-ca.csr  etcd-ca-key.pem  etcd-ca.pem  etcd.csr  etcd-key.pem  etcd.pem
    
    2. Install and configure etcd
      First, on master1, download etcd, extract it, and install the binaries into /usr/local/bin:
    export ETCD_URL="https://github.com/coreos/etcd/releases/download"
    cd && wget -qO- --show-progress "${ETCD_URL}/v3.2.9/etcd-v3.2.9-linux-amd64.tar.gz" | tar -zx
    mv etcd-v3.2.9-linux-amd64/etcd* /usr/local/bin/ && rm -rf etcd-v3.2.9-linux-amd64
    

    When that is done, create the etcd group and user (the /etc/etcd configuration directory already exists from the certificate step):

    groupadd etcd && useradd -c "Etcd user" -g etcd -s /sbin/nologin -r etcd
    

    Download the etcd configuration file and systemd unit that we will use to manage etcd:

    export ETCD_CONF_URL="https://kairen.github.io/files/manual-v1.8/master"
    wget "${ETCD_CONF_URL}/etcd.conf" -O /etc/etcd/etcd.conf
    wget "${ETCD_CONF_URL}/etcd.service" -O /lib/systemd/system/etcd.service
    

    Edit /etc/etcd/etcd.conf:
    Change the IPs to the local node's IP; leave the 0.0.0.0 entries as they are.
    For an etcd cluster, set ETCD_INITIAL_CLUSTER="master1=https://192.168.1.144:2380,node1=https://192.168.1.145:2380,node2=https://192.168.1.146:2380"
    The names master1, node1 and node2 must match each node's ETCD_NAME. A sketch of the entries that usually need editing follows.
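    For reference, on master1 the edited entries would look roughly like this (a sketch using standard etcd environment-variable names; the downloaded etcd.conf may include more entries or order them differently):
    ETCD_NAME="master1"
    ETCD_DATA_DIR="/var/lib/etcd"
    ETCD_LISTEN_PEER_URLS="https://192.168.1.144:2380"
    ETCD_LISTEN_CLIENT_URLS="https://192.168.1.144:2379,https://127.0.0.1:2379"
    ETCD_INITIAL_ADVERTISE_PEER_URLS="https://192.168.1.144:2380"
    ETCD_ADVERTISE_CLIENT_URLS="https://192.168.1.144:2379"
    ETCD_INITIAL_CLUSTER="master1=https://192.168.1.144:2380,node1=https://192.168.1.145:2380,node2=https://192.168.1.146:2380"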

    Create the data directory under /var for etcd and fix ownership (the service itself is started in a later step):

    mkdir -p /var/lib/etcd && chown etcd:etcd -R /var/lib/etcd /etc/etcd
    
    3. Install etcd on node1 and node2 (skip this step for a single-node etcd)
      Copy the configuration files over from master1:
    mkdir -p /etc/etcd/ssl && cd /etc/etcd/ssl
    scp  192.168.1.144:/etc/etcd/ssl/* .
    scp  192.168.1.144:/usr/local/bin/etcd* /usr/local/bin/
    groupadd etcd && useradd -c "Etcd user" -g etcd -s /sbin/nologin -r etcd
    scp  192.168.1.144:/etc/etcd/etcd.conf /etc/etcd/etcd.conf
    scp  192.168.1.144:/lib/systemd/system/etcd.service /lib/systemd/system/etcd.service
    mkdir -p /var/lib/etcd && chown etcd:etcd -R /var/lib/etcd /etc/etcd
    

    vim /etc/etcd/etcd.conf
    Change ETCD_NAME to node1 / node2 respectively, and update the IPs (a quick review command follows).
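    One way to list every line of the copied file that still needs adjusting on the node (a sketch; verify the values by eye before starting etcd):
    grep -E 'ETCD_NAME|192\.168\.1\.' /etc/etcd/etcd.conf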

    4. Start etcd
      systemctl enable etcd.service && systemctl start etcd.service
      For a cluster, start the service on every node.
      To verify (for a cluster, make sure the nodes' clocks are in sync):
     export CA="/etc/etcd/ssl"
     ETCDCTL_API=3 etcdctl  --cacert=${CA}/etcd-ca.pem \
        --cert=${CA}/etcd.pem  --key=${CA}/etcd-key.pem \
        --endpoints="https://192.168.1.144:2379" \
        endpoint health
     ETCDCTL_API=3 etcdctl  --cacert=${CA}/etcd-ca.pem \
        --cert=${CA}/etcd.pem  --key=${CA}/etcd-key.pem \
        --endpoints="https://192.168.1.144:2379" \
        member list
    
    5. Write the kubeadm configuration file
    root@instance-1:/home# cat cluster 
    apiVersion: kubeadm.k8s.io/v1alpha1
    kind: MasterConfiguration
    etcd:
      endpoints:
      - https://192.168.1.144:2379
      - https://192.168.1.145:2379
      - https://192.168.1.146:2379
      caFile: /etc/etcd/ssl/etcd-ca.pem
      certFile: /etc/etcd/ssl/etcd.pem
      keyFile: /etc/etcd/ssl/etcd-key.pem
    networking:
      podSubnet: 10.244.0.0/16
    

    kubeadm init
    root@master1:~# kubeadm init --config=cluster

    Troubleshooting kubeadm init

    If kubectl get node / get po fails after init completes:

    root@master1:/etc/kubernetes# kubectl get node
    Unable to connect to the server: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes")
    

    Clear /root/.kube and recreate the kubeconfig:

    root@master1:/etc/kubernetes# rm -rf /root/.kube/
    
    root@master1:/etc/kubernetes#   mkdir -p $HOME/.kube
    root@master1:/etc/kubernetes#   sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
    root@master1:/etc/kubernetes#   sudo chown $(id -u):$(id -g) $HOME/.kube/config
    
    Deploy the network: flannel (pick either flannel or calico)

    Before a network plugin is deployed, kubectl get node shows the nodes as NotReady:

    root@master1:~# kubectl get no
    NAME      STATUS     ROLES     AGE       VERSION
    master1   NotReady   master    22m       v1.9.2
    node1     NotReady   <none>    14m       v1.9.2
    

    kube-dns is also stuck in Pending, since it depends on the network.

    Import flannel from the official manifest. The YAML contains the parameter "Network": "10.244.0.0/16"; this is the pod IP range and must match the value passed to kubeadm init (a quick check is shown below).
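    To confirm the CIDR before applying (optional; assumes wget is available):
    wget -q https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
    grep -n '"Network"' kube-flannel.yml    # expect: "Network": "10.244.0.0/16"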

    kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
    root@master1:~# kubectl get no
    NAME      STATUS    ROLES     AGE       VERSION
    master1   Ready     master    24m       v1.9.2
    node1     Ready     <none>    16m       v1.9.2
    
    Deploy the network: calico (pick either calico or flannel)
    1. If you are using the etcd built into Kubernetes
      Apply directly; make sure the pod IP range in calico.yaml matches the one used at init.
    kubectl taint node master1 node-role.kubernetes.io/master-  # with only one master, allow it to schedule pods
    kubectl apply -f https://docs.projectcalico.org/v3.0/getting-started/kubernetes/installation/rbac.yaml
    kubectl apply -f https://docs.projectcalico.org/v3.0/getting-started/kubernetes/installation/hosted/calico.yaml
    
    2. Using a separately deployed etcd
      RBAC. If deploying Calico on an RBAC enabled cluster, you should first apply the ClusterRole and ClusterRoleBinding specs:
      kubectl apply -f https://docs.projectcalico.org/v3.0/getting-started/kubernetes/installation/rbac.yaml

    Download calico.yaml
    Configure etcd_endpoints in the provided ConfigMap to match your etcd cluster.
    Then simply apply the manifest:

    wget https://docs.projectcalico.org/v3.0/getting-started/kubernetes/installation/hosted/calico.yaml
    
    vim calico.yaml  
    Edit the ConfigMap:
    etcd_endpoints: "https://192.168.1.144:2379,https://192.168.1.145:2379,https://192.168.1.146:2379"
      etcd_ca: "/calico-secrets/etcd-ca"     # backed by /etc/etcd/ssl/etcd-ca.pem
      etcd_cert: "/calico-secrets/etcd-cert" # backed by /etc/etcd/ssl/etcd.pem
      etcd_key: "/calico-secrets/etcd-key"   # backed by /etc/etcd/ssl/etcd-key.pem
    
    Base64-encode each of the three pem files:
     base64 /etc/etcd/ssl/etcd-ca.pem  | tr -d '\n'
     base64 /etc/etcd/ssl/etcd.pem  | tr -d '\n'
     base64 /etc/etcd/ssl/etcd-key.pem | tr -d '\n'
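    The base64 strings are pasted into the Secret in the same calico.yaml. The following sketch prints ready-to-paste data lines; the Secret and key names (assumed here to be calico-etcd-secrets with etcd-ca / etcd-cert / etcd-key) depend on the manifest version, so match them against your file:
    echo "  etcd-ca: $(base64 /etc/etcd/ssl/etcd-ca.pem | tr -d '\n')"
    echo "  etcd-cert: $(base64 /etc/etcd/ssl/etcd.pem | tr -d '\n')"
    echo "  etcd-key: $(base64 /etc/etcd/ssl/etcd-key.pem | tr -d '\n')"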
    
    Change the pod IP pool so it matches the CIDR used at kubeadm init (here 10.244.0.0/16); the default in calico.yaml is:
                - name: CALICO_IPV4POOL_CIDR
                  value: "192.168.0.0/16"
    
    kubectl apply -f calico.yaml
    kubectl taint node master1 node-role.kubernetes.io/master-  # with only one master, allow it to schedule pods
    
    root@master1:/home# kubectl logs calico-node-vvl87 calico-node  -n kube-system 
    2018-01-29 09:46:16.638 [INFO][9] startup.go 187: Early log level set to info
    2018-01-29 09:46:16.639 [INFO][9] startup.go 198: NODENAME environment not specified - check HOSTNAME
    
    3. calicoctl
    curl -O -L https://github.com/projectcalico/calicoctl/releases/download/v2.0.0/calicoctl
    mv calicoctl /usr/local/bin/
    chmod a+x /usr/local/bin/calicoctl
    mkdir -p /etc/calico
    
    vim /etc/calico/calicoctl.cfg
    apiVersion: projectcalico.org/v3
    kind: CalicoAPIConfig
    metadata:
    spec:
      etcdEndpoints: https://192.168.1.144:2379,https://192.168.1.145:2379,https://192.168.1.146:2379
      etcdKeyFile: /etc/etcd/ssl/etcd-key.pem
      etcdCertFile: /etc/etcd/ssl/etcd.pem
      etcdCACertFile: /etc/etcd/ssl/etcd-ca.pem
    
    calicoctl get ippools
    calicoctl get node
    
    Node: kubeadm join

    If the node fails to join:

    kubeadm join --token 55c2c6.2a4bde1bc73a6562 192.168.1.144:6443 --discovery-token-ca-cert-hash sha256:0fdf8cfc6fecc18fded38649a4d9a81d043bf0e4bf57341239250dcc62d2c832
    
    [discovery] Failed to request cluster info, will try again: [Get https://192.168.1.144:6443/api/v1/namespaces/kube-public/configmaps/cluster-info: x509: certificate has expired or is not yet valid]
    

    Check the KUBECONFIG variable on the master node; if it is not set, export it:

    root@master1:~# echo $KUBECONFIG
    export KUBECONFIG=$HOME/.kube/config
    

    Run kubeadm reset on the node, then join again.

    Addons

    https://github.com/kubernetes/kubernetes/tree/master/cluster/addons

    Deploy the dashboard

    The dashboard (UI) is not deployed by default and has to be imported manually.
    https://github.com/kubernetes/dashboard
    The dashboard picks its display language automatically from the browser's language settings.

    Method 1: HTTPS

    wget https://raw.githubusercontent.com/kubernetes/dashboard/master/src/deploy/recommended/kubernetes-dashboard.yaml
    
    Edit the Service, otherwise the dashboard cannot be reached from outside the cluster.
    Under kind: Service, add type: NodePort and nodePort: 32001:
      type: NodePort
      ports:
        - port: 443
          targetPort: 8443
          nodePort: 32001
    
    kubectl create -f kubernetes-dashboard.yaml
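    Verify that the NodePort took effect (the Service is named kubernetes-dashboard in kube-system):
    kubectl -n kube-system get svc kubernetes-dashboard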
    
    Grant the dashboard's service account cluster-admin rights:
    cat admin-role.yaml
    
    kind: ClusterRoleBinding
    apiVersion: rbac.authorization.k8s.io/v1beta1
    metadata:
      name: kubernetes-dashboard
      annotations:
        rbac.authorization.kubernetes.io/autoupdate: "true"
    roleRef:
      kind: ClusterRole
      name: cluster-admin
      apiGroup: rbac.authorization.k8s.io
    subjects:
    - kind: ServiceAccount
      name: kubernetes-dashboard
      namespace: kube-system
    
    root@master1:~# kubectl create -f admin-role.yaml 
    clusterrolebinding "admin" created
    
    Once created, the dashboard is reachable at https://<any node IP>:32001 (note: https, not http).
    If Chrome refuses to open it, switch to Firefox; the certificate generated by kubeadm has a short validity period and may already be expired.
    On the login page you can simply click Skip, because the default service account already has cluster-admin rights.
    
    To log in with a token instead, fetch it with the command below; it is the kubernetes-dashboard-token secret:
    kubectl -n kube-system describe secret `kubectl -n kube-system get secret|grep kubernetes-dashboard-token|cut -d " " -f1`|grep "token:"|tr -s " "|cut -d " " -f2
    

    Method 2: HTTP (no authentication prompt at login)

    wget https://raw.githubusercontent.com/kubernetes/dashboard/master/src/deploy/alternative/kubernetes-dashboard.yaml
    
    Edit the Service, otherwise the dashboard cannot be reached from outside the cluster.
    Under kind: Service, add type: NodePort and nodePort: 32002:
      type: NodePort
      ports:
      - port: 80
        targetPort: 9090
        nodePort: 32002
    
    kubectl create -f kubernetes-dashboard.yaml
    
    Grant the dashboard's service account cluster-admin rights:
    cat admin-role.yaml
    
    kind: ClusterRoleBinding
    apiVersion: rbac.authorization.k8s.io/v1beta1
    metadata:
      name: kubernetes-dashboard
      annotations:
        rbac.authorization.kubernetes.io/autoupdate: "true"
    roleRef:
      kind: ClusterRole
      name: cluster-admin
      apiGroup: rbac.authorization.k8s.io
    subjects:
    - kind: ServiceAccount
      name: kubernetes-dashboard
      namespace: kube-system
    
    root@master1:~# kubectl create -f admin-role.yaml 
    clusterrolebinding "admin" created
    

    Once created, the dashboard is reachable at http://<any node IP>:32002.

    Deploy monitoring

    Heapster + InfluxDB + Grafana
    https://github.com/kubernetes/heapster/blob/master/docs/influxdb.md

    kubectl create -f https://raw.githubusercontent.com/kubernetes/heapster/master/deploy/kube-config/rbac/heapster-rbac.yaml
    kubectl create -f https://raw.githubusercontent.com/kubernetes/heapster/master/deploy/kube-config/influxdb/grafana.yaml
    kubectl create -f https://raw.githubusercontent.com/kubernetes/heapster/master/deploy/kube-config/influxdb/heapster.yaml
    kubectl create -f https://raw.githubusercontent.com/kubernetes/heapster/master/deploy/kube-config/influxdb/influxdb.yaml
    

    After deployment, the dashboard shows CPU and memory usage for pods.

    To view the Grafana UI, edit its Service and expose it through a NodePort:

     kubectl edit svc monitoring-grafana  -n kube-system
     change type: ClusterIP to type: NodePort
     
     kubectl get svc monitoring-grafana  -n kube-system
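    Alternatively, patch the Service in one step instead of using kubectl edit (a sketch):
    kubectl -n kube-system patch svc monitoring-grafana -p '{"spec":{"type":"NodePort"}}'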
    

    Open issue: pod monitoring data for the master node is not displayed.

    Deploy EFK
    https://github.com/kubernetes/kubernetes/tree/master/cluster/addons/fluentd-elasticsearch
    
    kubectl create -f https://raw.githubusercontent.com/kubernetes/kubernetes/master/cluster/addons/fluentd-elasticsearch/es-service.yaml
    kubectl create -f https://raw.githubusercontent.com/kubernetes/kubernetes/master/cluster/addons/fluentd-elasticsearch/es-statefulset.yaml
    kubectl create -f https://raw.githubusercontent.com/kubernetes/kubernetes/master/cluster/addons/fluentd-elasticsearch/fluentd-es-configmap.yaml
    kubectl create -f https://raw.githubusercontent.com/kubernetes/kubernetes/master/cluster/addons/fluentd-elasticsearch/fluentd-es-ds.yaml
    kubectl create -f https://raw.githubusercontent.com/kubernetes/kubernetes/master/cluster/addons/fluentd-elasticsearch/kibana-deployment.yaml
    kubectl create -f https://raw.githubusercontent.com/kubernetes/kubernetes/master/cluster/addons/fluentd-elasticsearch/kibana-service.yaml
    
    Label every node, otherwise the fluentd-es DaemonSet will not be scheduled:
    kubectl label node node1 beta.kubernetes.io/fluentd-ds-ready=true
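    To label every node in one go (a sketch; --overwrite keeps the command idempotent):
    for n in $(kubectl get nodes -o name); do
      kubectl label "$n" beta.kubernetes.io/fluentd-ds-ready=true --overwrite
    done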
    
    
    
    Deploy the ingress controller

    https://github.com/kubernetes/ingress-nginx

    Mandatory commands

    curl https://raw.githubusercontent.com/kubernetes/ingress-nginx/master/deploy/namespace.yaml  | kubectl apply -f -
    curl https://raw.githubusercontent.com/kubernetes/ingress-nginx/master/deploy/default-backend.yaml | kubectl apply -f -
    curl https://raw.githubusercontent.com/kubernetes/ingress-nginx/master/deploy/configmap.yaml  | kubectl apply -f -
    curl https://raw.githubusercontent.com/kubernetes/ingress-nginx/master/deploy/tcp-services-configmap.yaml | kubectl apply -f -
    curl https://raw.githubusercontent.com/kubernetes/ingress-nginx/master/deploy/udp-services-configmap.yaml | kubectl apply -f -
    

    Install with RBAC roles

    wget https://raw.githubusercontent.com/kubernetes/ingress-nginx/master/deploy/rbac.yaml
    wget https://raw.githubusercontent.com/kubernetes/ingress-nginx/master/deploy/with-rbac.yaml
       
    vim with-rbac.yaml  # three changes
    1. Change kind: Deployment to kind: DaemonSet
    2. Delete the replicas: 1 line
    3. Add hostNetwork: true on the line above initContainers:, at the same indentation level
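    A quick check that all three edits are in place (replicas: should no longer appear in the output):
    grep -nE 'kind: DaemonSet|hostNetwork: true|replicas:' with-rbac.yaml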
    
    Apply:
    kubectl create -f rbac.yaml 
    kubectl create -f with-rbac.yaml
    

    Check:

    root@master1:~# kubectl get po -n ingress-nginx  -o wide
    NAME                                    READY     STATUS    RESTARTS   AGE       IP              NODE
    default-http-backend-55c6c69b88-6n88s   1/1       Running   0          10m       10.244.1.15     node2
    nginx-ingress-controller-2w9g4          1/1       Running   0          7m        192.168.1.146   node2
    nginx-ingress-controller-bzfzr          1/1       Running   0          7m        192.168.1.145   node1
    nginx-ingress-controller-j9lds          1/1       Running   3          7m        192.168.1.144   master1
    

    FAQ

    kubectl command completion
    root@master1:/# vim /etc/profile   # add the line below, then source the file
    source <(kubectl completion bash)
    root@master1:/# source /etc/profile
    
    By default, pods are not scheduled on the master node

    Run the following; the node-role.kubernetes.io/master taint can be found under the taints field in kubectl edit node master1:

    root@master1:/var/lib/kubelet# kubectl taint node master1 node-role.kubernetes.io/master-
    node "master1" untainted
    
    Pods on a node cannot start / resetting a node's network after removal

    node1 had been added and removed repeatedly before; the network state has to be cleaned up before adding it again.

    root@master1:/var/lib/kubelet# kubectl get po -o wide
    NAME                   READY     STATUS              RESTARTS   AGE       IP           NODE
    nginx-8586cf59-6zw9k   1/1       Running             0          9m        10.244.3.3   node2
    nginx-8586cf59-jk5pc   0/1       ContainerCreating   0          9m        <none>       node1
    nginx-8586cf59-vm9h4   0/1       ContainerCreating   0          9m        <none>       node1
    nginx-8586cf59-zjb84   1/1       Running             0          9m        10.244.3.2   node2
    
    root@node1:~# journalctl -u kubelet
     failed: rpc error: code = Unknown desc = NetworkPlugin cni failed to set up pod "nginx-8586cf59-rm4sh_default" network: failed to set bridge addr: "cni0" already has an IP address different from 10.244.2.1/24
    12252 cni.go:227] Error while adding to cni network: failed to set bridge addr: "cni0" already
    

    Reset the Kubernetes services and the network: delete the network configuration and the links.

    kubeadm reset
    systemctl stop kubelet
    systemctl stop docker
    rm -rf /var/lib/cni/
    rm -rf /var/lib/kubelet/*
    rm -rf /etc/cni/
    ifconfig cni0 down
    ifconfig flannel.1 down
    ifconfig docker0 down
    ip link delete cni0
    ip link delete flannel.1
    

    Rejoin the node:
    systemctl start docker

    kubeadm join --token 55c2c6.2a4bde1bc73a6562 192.168.1.144:6443 --discovery-token-ca-cert-hash sha256:0fdf8cfc6fecc18fded38649a4d9a81d043bf0e4bf57341239250dcc62d2c832
    
