Setting up a k8s cluster with metrics-server and Prometheus

Author: Li_MAX | Published 2020-08-27 14:28

    Preparation

    Prepare three Linux servers:

    Role     Hostname    IP address
    master   k8s         192.168.5.159
    node1    k8s-node    192.168.5.160
    node2    k8s-node2   192.168.5.161

    vim /etc/hosts

    192.168.5.159 master
    192.168.5.160 node1
    192.168.5.161 node2
    

    Disable the firewall

    systemctl stop firewalld
    systemctl disable firewalld
    

    Synchronize the system time on all three servers

    # install ntp
    yum install -y ntp
    # sync the time
    ntpdate cn.pool.ntp.org
    

    Disable SELinux

    sed -i 's/enforcing/disabled/' /etc/selinux/config
    setenforce 0
    

    Disable swap (Kubernetes does not support running with swap enabled)
    Edit /etc/fstab and comment out or delete the swap line:

    vim /etc/fstab
    #/dev/mapper/centos-swap swap                    swap    defaults        0 0
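
    Commenting out the line in /etc/fstab only takes effect after a reboot; to turn swap off for the running system right away:

    swapoff -a
    # the Swap line should now show 0
    free -m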
    

    Pass bridged IPv4 traffic to iptables chains

    cat > /etc/sysctl.d/k8s.conf << EOF
    net.bridge.bridge-nf-call-ip6tables = 1
    net.bridge.bridge-nf-call-iptables = 1
    EOF
    
    sysctl --system
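
    If sysctl reports that the bridge keys do not exist, the br_netfilter kernel module likely needs to be loaded first:

    modprobe br_netfilter
    sysctl --system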
    

    Install docker, kubeadm, and kubelet

    Perform the following steps on all nodes.

    # wget https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo -O /etc/yum.repos.d/docker-ce.repo
    # yum -y install docker-ce-18.06.1.ce-3.el7
    # systemctl enable docker && systemctl start docker
    # docker --version
    Docker version 18.06.1-ce, build e68fc7a
    

    Add the Aliyun Kubernetes yum repository

    # cat > /etc/yum.repos.d/kubernetes.repo << EOF
    [kubernetes]
    name=Kubernetes
    baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
    enabled=1
    gpgcheck=1
    repo_gpgcheck=1
    gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
    EOF
    

    Install kubeadm, kubelet, and kubectl

    yum install -y kubelet-1.17.3 kubeadm-1.17.3 kubectl-1.17.3
    systemctl enable kubelet
    

    Deploy the master node

    The default image registry k8s.gcr.io is unreachable from mainland China, so specify the Aliyun mirror (registry.aliyuncs.com/google_containers) instead. The official recommendation is at least 2 CPUs and 2 GB of RAM.
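
    Optionally, the control-plane images can be pre-pulled before running init (same repository and version as the init command below):

    kubeadm config images pull \
    --image-repository registry.aliyuncs.com/google_containers \
    --kubernetes-version v1.17.3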

    kubeadm init \
    --apiserver-advertise-address=192.168.5.159 \
    --image-repository registry.aliyuncs.com/google_containers \
    --kubernetes-version v1.17.3 \
    --service-cidr=10.1.0.0/16 \
    --pod-network-cidr=10.244.0.0/16
    

    Set up the kubectl config file

    mkdir -p $HOME/.kube
    sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
    sudo chown $(id -u):$(id -g) $HOME/.kube/config
    # kubectl get nodes
    

    Deploy the pod network add-on

    Required on all nodes. Because quay.io is blocked, pull a mirrored copy of the flannel image and re-tag it to match the tag referenced by the manifest:

    docker pull lizhenliang/flannel:v0.11.0-amd64
    docker tag lizhenliang/flannel:v0.11.0-amd64 quay.io/coreos/flannel:v0.12.0-amd64
    
    kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
    

    Check the status
    All pods should be Running. Any pod in another state, such as Pending or ImagePullBackOff, is not ready yet.

    kubectl get pod --all-namespaces
    

    If a pod is not Running, describe it to see the specific error, for example for the kube-flannel-ds-amd64-xpd82 pod:

    kubectl describe pod kube-flannel-ds-amd64-xpd82 -n kube-system
    

    Join the worker nodes to the master

    kubeadm join 192.168.5.159:6443 --token 1l64hh.7z7xgdjp4bu58720     --discovery-token-ca-cert-hash sha256:8c4bafd2aa326a7c45754f982132a38a8b4f651ca6d052dc4294424e93fe7129
    

    If the master's token has been lost, list the existing tokens with:

    kubeadm token list 
    

    Tokens are valid for 24 hours by default; once expired they no longer show up in the list, so generate a new one:

    kubeadm token create
    

    Get the sha256 hash of the CA certificate:

    openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
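
    Alternatively, a new token and the complete join command (including the CA certificate hash) can be printed in one step:

    kubeadm token create --print-join-command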
    

    Deploy metrics-server

    Download the deployment manifests:

    for file in auth-delegator.yaml auth-reader.yaml metrics-apiservice.yaml metrics-server-deployment.yaml metrics-server-service.yaml resource-reader.yaml ; do wget https://raw.githubusercontent.com/kubernetes/kubernetes/master/cluster/addons/metrics-server/$file;done
    
    Modify metrics-server-deployment.yaml

    Because the default registries are blocked, change the image addresses in the manifest to a domestic mirror.
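
    A rough sketch of the kind of change (the exact image names and tags depend on the manifest revision downloaded, and registry.cn-hangzhou.aliyuncs.com/google_containers is only one possible mirror):

    # replace the blocked k8s.gcr.io prefix with a reachable mirror for every image in the file
    sed -i 's#k8s.gcr.io/#registry.cn-hangzhou.aliyuncs.com/google_containers/#g' metrics-server-deployment.yaml
    # confirm the new image addresses
    grep 'image:' metrics-server-deployment.yaml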
    Modify resource-reader.yaml

    Add nodes/stats to the resources list of the ClusterRole so that metrics-server is allowed to read node stats.
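
    A rough sketch of that edit (assuming the ClusterRole lists "- nodes" with two-space indentation; adjust to match the actual file):

    # append nodes/stats right after the existing nodes entry in the rules
    sed -i 's#^  - nodes$#  - nodes\n  - nodes/stats#' resource-reader.yaml
    # confirm the new entry
    grep 'nodes' resource-reader.yaml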
    Apply all the manifests:

    kubectl apply -f .

    Test


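    With metrics-server running, resource usage can be queried directly:

    kubectl top nodes
    kubectl top pods -n kube-system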

    Using the metrics-server API:

    The available metrics-server API endpoints are:

    • http://127.0.0.1:8001/apis/metrics.k8s.io/v1beta1/nodes
    • http://127.0.0.1:8001/apis/metrics.k8s.io/v1beta1/nodes/<node-name>
    • http://127.0.0.1:8001/apis/metrics.k8s.io/v1beta1/pods
    • http://127.0.0.1:8001/apis/metrics.k8s.io/v1beta1/namespaces/<namespace-name>/pods/<pod-name>

    Because k8s dropped the insecure port 8080 after v1.10, access these APIs either through kubectl proxy or with proper authentication:

    $ kubectl proxy
    $ curl http://127.0.0.1:8001/apis/metrics.k8s.io/v1beta1/nodes
    {
      "kind": "NodeMetricsList",
      "apiVersion": "metrics.k8s.io/v1beta1",
      "metadata": {
        "selfLink": "/apis/metrics.k8s.io/v1beta1/nodes"
      },
      "items": [
        {
          "metadata": {
            "name": "node2",
            "selfLink": "/apis/metrics.k8s.io/v1beta1/nodes/node2",
            "creationTimestamp": "2020-08-28T02:24:07Z"
          },
          "timestamp": "2020-08-28T02:23:42Z",
          "window": "30s",
          "usage": {
            "cpu": "37549321n",
            "memory": "302864Ki"
          }
        },
        {
          "metadata": {
            "name": "master",
            "selfLink": "/apis/metrics.k8s.io/v1beta1/nodes/master",
            "creationTimestamp": "2020-08-28T02:24:07Z"
          },
          "timestamp": "2020-08-28T02:24:30Z",
          "window": "30s",
          "usage": {
            "cpu": "174668532n",
            "memory": "1105964Ki"
          }
        },
        {
          "metadata": {
            "name": "node1",
            "selfLink": "/apis/metrics.k8s.io/v1beta1/nodes/node1",
            "creationTimestamp": "2020-08-28T02:24:07Z"
          },
          "timestamp": "2020-08-28T02:23:43Z",
          "window": "30s",
          "usage": {
            "cpu": "22156105n",
            "memory": "362676Ki"
          }
        }
      ]
    }
    

    You can also query these APIs directly with kubectl, for example:

    $ kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes
    $ kubectl get --raw /apis/metrics.k8s.io/v1beta1/pods
    $ kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes/<node-name>
    $ kubectl get --raw /apis/metrics.k8s.io/v1beta1/namespaces/<namespace-name>/pods/<pod-name>
    

    Prometheus setup

    git clone https://github.com/iKubernetes/k8s-prom.git
    
    cd k8s-prom
    
    # create the prom namespace
    [root@master k8s-prom]# kubectl apply -f namespace.yaml
    namespace/prom created
    
    # deploy node_exporter:
    [root@master k8s-prom]# cd node_exporter/
    [root@master node_exporter]# ls
    node-exporter-ds.yaml  node-exporter-svc.yaml
    [root@master node_exporter]# kubectl apply -f .
    daemonset.apps/prometheus-node-exporter created
    service/prometheus-node-exporter created
    
    [root@master node_exporter]# kubectl get pods -n prom
    NAME                             READY     STATUS    RESTARTS   AGE
    prometheus-node-exporter-dmmjj   1/1       Running   0          7m
    prometheus-node-exporter-ghz2l   1/1       Running   0          7m
    prometheus-node-exporter-zt2lw   1/1       Running   0          7m
    
    # deploy prometheus
    [root@master k8s-prom]# cd prometheus/
    [root@master prometheus]# ls
    prometheus-cfg.yaml  prometheus-deploy.yaml  prometheus-rbac.yaml  prometheus-svc.yaml
    [root@master prometheus]# kubectl apply -f .
    configmap/prometheus-config created
    deployment.apps/prometheus-server created
    clusterrole.rbac.authorization.k8s.io/prometheus created
    serviceaccount/prometheus created
    clusterrolebinding.rbac.authorization.k8s.io/prometheus created
    service/prometheus created
    

    The prometheus-server pod stays Pending; describe it to see why it cannot be scheduled

    [root@master prometheus]# kubectl describe pod prometheus-server-556b8896d6-dfqkp -n prom
    Warning  FailedScheduling  2m52s (x2 over 2m52s)  default-scheduler  0/3 nodes are available: 3 Insufficient memory.
    

    Modify prometheus-deploy.yaml and delete the three lines of the memory limit:

    resources:
      limits:
        memory: 2Gi
    

    Re-apply:

    [root@master prometheus]# kubectl apply -f prometheus-deploy.yaml
    
    [root@master prometheus]# kubectl get all -n prom
    NAME                                     READY     STATUS    RESTARTS   AGE
    pod/prometheus-node-exporter-dmmjj       1/1       Running   0          10m
    pod/prometheus-node-exporter-ghz2l       1/1       Running   0          10m
    pod/prometheus-node-exporter-zt2lw       1/1       Running   0          10m
    pod/prometheus-server-65f5d59585-6l8m8   1/1       Running   0          55s
    NAME                               TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
    service/prometheus                 NodePort    10.111.127.64   <none>        9090:30090/TCP   56s
    service/prometheus-node-exporter   ClusterIP   None            <none>        9100/TCP         10m
    NAME                                      DESIRED   CURRENT   READY     UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
    daemonset.apps/prometheus-node-exporter   3         3         3         3            3           <none>          10m
    NAME                                DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
    deployment.apps/prometheus-server   1         1         1            1           56s
    NAME                                           DESIRED   CURRENT   READY     AGE
    replicaset.apps/prometheus-server-65f5d59585   1         1         1         56s
    

    As shown above, the Service is exposed as a NodePort, so the Prometheus web UI running in the container can be reached on port 30090 of any node.


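    For example, a quick reachability check from any machine that can reach the node IPs (the UI itself is best viewed in a browser):

    curl -s http://192.168.5.159:30090/graph | head -n 5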

    Deploy kube-state-metrics, which turns Kubernetes object state into metrics for Prometheus to scrape

    [root@master k8s-prom]# cd kube-state-metrics/
    [root@master kube-state-metrics]# ls
    kube-state-metrics-deploy.yaml  kube-state-metrics-rbac.yaml  kube-state-metrics-svc.yaml
    [root@master kube-state-metrics]# kubectl apply -f .
    deployment.apps/kube-state-metrics created
    serviceaccount/kube-state-metrics created
    clusterrole.rbac.authorization.k8s.io/kube-state-metrics created
    clusterrolebinding.rbac.authorization.k8s.io/kube-state-metrics created
    service/kube-state-metrics created
    

    The gcr.io image referenced by the deployment cannot be pulled because the registry is blocked; pull a reachable copy and re-tag it instead:

    docker pull quay.io/coreos/kube-state-metrics:v1.3.1
    docker tag quay.io/coreos/kube-state-metrics:v1.3.1   gcr.io/google_containers/kube-state-metrics-amd64:v1.3.1
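
    Re-tag on every node that might run the pod (so the kubelet finds the image locally), then confirm the deployment comes up:

    kubectl -n prom rollout status deployment/kube-state-metrics
    kubectl -n prom get pods | grep kube-state-metrics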
    

    Deploy k8s-prometheus-adapter, which needs a self-signed serving certificate:

    [root@master k8s-prometheus-adapter]# cd /etc/kubernetes/pki/
    [root@master pki]# (umask 077; openssl genrsa -out serving.key 2048)
    Generating RSA private key, 2048 bit long modulus
    ...........................................................................................+++
    ...............+++
    e is 65537 (0x10001)
    

    Create the certificate signing request:

    [root@master pki]#  openssl req -new -key serving.key -out serving.csr -subj "/CN=serving"
    

    Sign the certificate with the cluster CA:

    [root@master pki]# openssl  x509 -req -in serving.csr -CA ./ca.crt -CAkey ./ca.key -CAcreateserial -out serving.crt -days 3650
    Signature ok
    subject=/CN=serving
    Getting CA Private Key
    

    Create the secret that holds the certificate:
    Note: cm-adapter-serving-certs is the secret name referenced in custom-metrics-apiserver-deployment.yaml.

    [root@master pki]# kubectl create secret generic cm-adapter-serving-certs --from-file=serving.crt=./serving.crt --from-file=serving.key=./serving.key  -n prom
    secret/cm-adapter-serving-certs created
    
    [root@master pki]# kubectl get secrets -n prom
    NAME                             TYPE                                  DATA      AGE
    cm-adapter-serving-certs         Opaque                                2         51s
    default-token-knsbg              kubernetes.io/service-account-token   3         1h
    kube-state-metrics-token-sccdf   kubernetes.io/service-account-token   3         1h
    prometheus-token-nqzbz           kubernetes.io/service-account-token   3         1h
    

    Deploy k8s-prometheus-adapter:

    [root@master k8s-prom]# cd k8s-prometheus-adapter/
    [root@master k8s-prometheus-adapter]# ls
    custom-metrics-apiserver-auth-delegator-cluster-role-binding.yaml   custom-metrics-apiserver-service.yaml
    custom-metrics-apiserver-auth-reader-role-binding.yaml              custom-metrics-apiservice.yaml
    custom-metrics-apiserver-deployment.yaml                            custom-metrics-cluster-role.yaml
    custom-metrics-apiserver-resource-reader-cluster-role-binding.yaml  custom-metrics-resource-reader-cluster-role.yaml
    custom-metrics-apiserver-service-account.yaml                       hpa-custom-metrics-cluster-role-binding.yaml
    

    You may run into a compatibility problem with these manifests. To fix it, download the latest custom-metrics-apiserver-deployment.yaml from https://github.com/DirectXMan12/k8s-prometheus-adapter/tree/master/deploy/manifests and change its namespace to prom; also download custom-metrics-config-map.yaml and change its namespace to prom.
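
    A sketch of that fix, assuming the upstream file names are unchanged and the manifests still default to the custom-metrics namespace:

    wget -O custom-metrics-apiserver-deployment.yaml https://raw.githubusercontent.com/DirectXMan12/k8s-prometheus-adapter/master/deploy/manifests/custom-metrics-apiserver-deployment.yaml
    wget -O custom-metrics-config-map.yaml https://raw.githubusercontent.com/DirectXMan12/k8s-prometheus-adapter/master/deploy/manifests/custom-metrics-config-map.yaml
    sed -i 's/namespace: custom-metrics/namespace: prom/' custom-metrics-apiserver-deployment.yaml custom-metrics-config-map.yaml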

    [root@master k8s-prometheus-adapter]# kubectl apply -f .
    

    Check the status; everything should be Running:

    [root@master k8s-prometheus-adapter]# kubectl get all -n prom
    

    Check that the custom.metrics.k8s.io/v1beta1 API group is registered:

    [root@master k8s-prometheus-adapter]# kubectl api-versions
    custom.metrics.k8s.io/v1beta1
    

    Start a proxy and test:

    [root@master k8s-prometheus-adapter]# kubectl proxy --port=8080
    [root@master pki]# curl  http://localhost:8080/apis/custom.metrics.k8s.io/v1beta1/
    {
      "kind": "APIResourceList",
      "apiVersion": "v1",
      "groupVersion": "custom.metrics.k8s.io/v1beta1",
      "resources": [
        {
          "name": "namespaces/kube_endpoint_info",
          "singularName": "",
          "namespaced": false,
          "kind": "MetricValueList",
          "verbs": [
            "get"
          ]
        },
        {
          "name": "namespaces/kube_hpa_status_desired_replicas",
          "singularName": "",
          "namespaced": false,
          "kind": "MetricValueList",
          "verbs": [
            "get"
          ]
        },
        {
          "name": "namespaces/kube_pod_container_status_waiting",
          "singularName": "",
          "namespaced": false,
          "kind": "MetricValueList",
          "verbs": [
            "get"
          ]
        },
        {
          "name": "namespaces/kube_hpa_labels",
          "singularName": "",
          "namespaced": false,
          "kind": "MetricValueList",
          "verbs": [
            "get"
          ]
        },
        {
          "name": "jobs.batch/kube_hpa_spec_min_replicas",
          "singularName": "",
          "namespaced": true,
          "kind": "MetricValueList",
          "verbs": [
            "get"
          ]
        }
      ]
    }
    

    Common problems:

    1. Running kubectl on a worker node fails with: The connection to the server localhost:8080 was refused - did you specify the right host or port?

    Copy admin.conf from the master node into $HOME/.kube/ on the worker node and rename it to config.
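
    For example (assuming root SSH access from the worker node to the master):

    mkdir -p $HOME/.kube
    scp root@192.168.5.159:/etc/kubernetes/admin.conf $HOME/.kube/config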

    2. Certificates already exist: delete the corresponding directory and run the command again.

    3. kubelet and kubeadm versions do not match: reinstall matching versions.

    yum remove kubelet
    yum install -y kubelet-1.17.3 kubeadm-1.17.3 kubectl-1.17.3
    
