Monitoring k8s Cluster Resources with an External Prometheus

Author: 阿当运维 | Published: 2022-09-22 10:22

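    Monitoring k8s resources with Prometheus

    Use Prometheus to monitor the resources in a k8s cluster, such as microservices and container resource metrics, and display them in Grafana.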

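    Approach

    • An external Prometheus can connect to the apiserver and scrape the metrics inside the k8s cluster (prerequisite: the corresponding exporters are installed in the cluster).

    • Alternatively, deploy kube-prometheus (a monitoring stack that runs inside the cluster) and collect its data through federation.

    This article takes the first approach: an external Prometheus scrapes cadvisor and kube-state-metrics to obtain the k8s cluster metrics.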

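    Preparation

    About the exporters

    To monitor k8s resource metrics comprehensively, the cluster needs the corresponding exporters. Here we rely on cadvisor and kube-state-metrics.

    1. cadvisor: built into the kubelet, so it does not need to be installed separately. It collects CPU, memory and other metrics for the containers in the cluster.

    2. kube-state-metrics: polls the api-server and watches add, delete and update events. The basic metrics from cadvisor alone do not provide enough dimensions:
      k8s resource objects such as Deployment, Pod, DaemonSet and CronJob are not covered. For example: how many replicas does a Deployment currently have? What state is a Pod in (Pending or Running)? cadvisor does not monitor these resource objects, so an additional exporter, kube-state-metrics, is needed to expose those metrics.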
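
To make the difference concrete, here is a minimal sketch of reading the kind of series kube-state-metrics exposes. `kube_deployment_status_replicas` and `kube_pod_status_phase` are kube-state-metrics series names, but the sample text, the `demo` namespace and the object names are made up for illustration, and the parser is a simplified reader of the Prometheus text format (it ignores timestamps and escaped characters in label values).

```python
import re

# Illustrative sample in the Prometheus text exposition format.
# The metric names come from kube-state-metrics; all label values are made up.
SAMPLE = """\
# TYPE kube_deployment_status_replicas gauge
kube_deployment_status_replicas{namespace="demo",deployment="web"} 3
# TYPE kube_pod_status_phase gauge
kube_pod_status_phase{namespace="demo",pod="web-0",phase="Pending"} 0
kube_pod_status_phase{namespace="demo",pod="web-0",phase="Running"} 1
"""

LINE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')


def parse_samples(text):
    """Return a list of (metric_name, labels_dict, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip anything this simplified parser cannot read
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


def deployment_replicas(samples):
    """Map deployment name -> current replica count."""
    return {
        labels["deployment"]: value
        for name, labels, value in samples
        if name == "kube_deployment_status_replicas"
    }


def pod_phases(samples):
    """Map pod name -> the phase whose kube_pod_status_phase value is 1."""
    return {
        labels["pod"]: labels["phase"]
        for name, labels, value in samples
        if name == "kube_pod_status_phase" and value == 1
    }


samples = parse_samples(SAMPLE)
print(deployment_replicas(samples))
print(pod_phases(samples))
```

Questions such as "how many replicas" or "which phase" are answered by object-level series like these, which cadvisor's container metrics do not provide.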

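    Installing and deploying kube-state-metrics

    1. Download the kube-state-metrics installation package

    Note: my k8s version is v1.20.6. Check the compatibility notes on GitHub and choose the kube-state-metrics version that matches your own k8s version.

    kube-state-metrics_v2.2.1
    Download address: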

    https://github.com/starsliao/Prometheus/tree/master/kubernetes
    

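    Upload the installation package to the server.

    2. Deploy kube-state-metrics

    Adjust the exposed ports in service.yml as needed. After modification it looks like this: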

    apiVersion: v1
    kind: Service
    metadata:
    #  annotations:
    #    prometheus.io/scrape: 'true'
      labels:
        app.kubernetes.io/name: kube-state-metrics
        app.kubernetes.io/version: v2.2.1
      name: kube-state-metrics
      namespace: ops-monit
    spec:
      type: NodePort
      ports:
      - name: http-metrics
        port: 8080
        targetPort: http-metrics
        nodePort: 30866
      - name: telemetry
        port: 8081
        targetPort: telemetry
        nodePort: 30867
      selector:
        app.kubernetes.io/name: kube-state-metrics
    
    

    On the k8s cluster:

    kubectl create namespace ops-monit
    cd kube-state-metrics
    kubectl apply -f .
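
Once the Service is applied, the two NodePorts should answer on any node address. The sketch below only builds the URLs to check and verifies that the ports fall inside the default NodePort range (30000-32767). The node address 192.168.1.21 is taken from the Prometheus configuration later in this article; `metrics_url` and `fetch_metrics` are hypothetical helper names, and `fetch_metrics` is defined but not called, since it needs network access to the cluster.

```python
from urllib.request import urlopen

NODE_IP = "192.168.1.21"  # node address used in the Prometheus config of this article
NODE_PORTS = {"http-metrics": 30866, "telemetry": 30867}  # nodePort values from service.yml
DEFAULT_NODEPORT_RANGE = range(30000, 32768)  # default range; a cluster may be configured differently


def metrics_url(node_ip, node_port):
    """Build the /metrics URL for a NodePort, rejecting ports outside the default range."""
    if node_port not in DEFAULT_NODEPORT_RANGE:
        raise ValueError(f"nodePort {node_port} is outside 30000-32767")
    return f"http://{node_ip}:{node_port}/metrics"


def fetch_metrics(url, timeout=5):
    """Fetch the raw metrics text; call this only from a host that can reach the node."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


urls = {name: metrics_url(NODE_IP, port) for name, port in NODE_PORTS.items()}
for name, url in urls.items():
    print(name, url)
```

Calling `fetch_metrics(urls["http-metrics"])` from the Prometheus host is a quick way to confirm that the port is reachable before editing prometheus.yml.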
    

    Configuring Prometheus

    vim prometheus.yml

    Add the following:

      - job_name: 'k8s-cadvisor'
        scrape_interval: 60s
        scrape_timeout: 60s
        metrics_path: /metrics/cadvisor
        kubernetes_sd_configs:  # Kubernetes service discovery
        - api_server: https://192.168.1.21:6443  # apiserver address
          role: node  # discover targets of type node
          namespaces:
            names:
            - ops-monit
          bearer_token_file: k8s.token
          tls_config:
            insecure_skip_verify: true
        bearer_token_file: k8s.token
        tls_config:
          insecure_skip_verify: true
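        relabel_configs:
        # Rewrite the kubelet port 10250 to the read-only port 10255
        # (this assumes the read-only port is enabled on the kubelet)
        - source_labels: [__address__]
          regex: '(.*):10250'
          replacement: '${1}:10255'
          target_label: __address__
          action: replace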
        - action: labelmap
          regex: __meta_kubernetes_node_label_(.+)
    
        metric_relabel_configs:
        - source_labels: [instance]
          separator: ;
          regex: (.+)
          target_label: node
          replacement: $1
          action: replace
    
        - source_labels: [pod_name]
          separator: ;
          regex: (.+)
          target_label: pod
          replacement: $1
          action: replace
        - source_labels: [container_name]
          separator: ;
          regex: (.+)
          target_label: container
          replacement: $1
          action: replace
    
      - job_name: kube-state-metrics-1
        kubernetes_sd_configs:
        - api_server: https://192.168.1.21:6443  # apiserver address
          role: endpoints  # discover targets of type endpoints
          namespaces:
            names:
            - ops-monit      
          bearer_token_file: k8s.token
          tls_config:
            insecure_skip_verify: true
        bearer_token_file: k8s.token
        tls_config:
          insecure_skip_verify: true
        relabel_configs:
        - action: labelmap
          regex: __meta_kubernetes_node_label_(.+)
        - separator: ;
          regex: (.*)
          target_label: __address__
          replacement: 192.168.1.21:30866
        - source_labels: [__meta_kubernetes_service_label_app_kubernetes_io_name]
          regex: kube-state-metrics
          replacement: $1
          action: keep
        - action: labelmap
          regex: __meta_kubernetes_service_label_(.+)
        - source_labels: [__meta_kubernetes_namespace]
          action: replace
          target_label: k8s_namespace
        - source_labels: [__meta_kubernetes_service_name]
          action: replace
          target_label: k8s_sname
          
      - job_name: kube-state-metrics-2
        kubernetes_sd_configs:
        - api_server: https://192.168.1.21:6443  # apiserver address
          role: endpoints  # discover targets of type endpoints
          namespaces:
            names:
            - ops-monit
          bearer_token_file: k8s.token
          tls_config:
            insecure_skip_verify: true
        bearer_token_file: k8s.token
        tls_config:
          insecure_skip_verify: true
        relabel_configs:
        - action: labelmap
          regex: __meta_kubernetes_node_label_(.+)
        - separator: ;
          regex: (.*)
          target_label: __address__
          replacement: 192.168.1.21:30867
        - source_labels: [__meta_kubernetes_service_label_app_kubernetes_io_name]
          regex: kube-state-metrics
          replacement: $1
          action: keep
        - action: labelmap
          regex: __meta_kubernetes_service_label_(.+)
        - source_labels: [__meta_kubernetes_namespace]
          action: replace
          target_label: k8s_namespace
        - source_labels: [__meta_kubernetes_service_name]
          action: replace
          target_label: k8s_sname
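
The relabel rules above can be hard to read at first. The following is a simplified emulation of the `replace` and `labelmap` actions, meant only to show what the rules do to a target's labels. It assumes Prometheus semantics as documented (the regex is anchored at both ends, source label values are joined with the separator, `$1` / `${1}` refer to capture groups) and uses Python's `re` module rather than the RE2 engine Prometheus uses. The function names, the node address 192.168.1.31 and the label value `node1` are illustrative.

```python
import re


def _to_python_template(replacement):
    """Convert Prometheus-style $1 / ${1} references into Python's \\g<1> form."""
    return re.sub(r"\$\{?(\w+)\}?", r"\\g<\1>", replacement)


def relabel_replace(labels, source_labels, regex, replacement, target_label, separator=";"):
    """Emulate `action: replace`: write the expanded replacement into target_label."""
    value = separator.join(labels.get(name, "") for name in source_labels)
    match = re.fullmatch(regex, value)  # the whole value must match
    if match is None:
        return dict(labels)  # no match: labels stay unchanged
    result = dict(labels)
    result[target_label] = match.expand(_to_python_template(replacement))
    return result


def relabel_labelmap(labels, regex, replacement="$1"):
    """Emulate `action: labelmap`: copy matching labels to names built from the regex groups."""
    result = dict(labels)
    template = _to_python_template(replacement)
    for name, value in labels.items():
        match = re.fullmatch(regex, name)
        if match is not None:
            result[match.expand(template)] = value
    return result


# Labels as service discovery might produce them for one node (values are made up).
discovered = {
    "__address__": "192.168.1.31:10250",
    "__meta_kubernetes_node_label_kubernetes_io_hostname": "node1",
}

# k8s-cadvisor job: switch the port, then copy the node labels.
step1 = relabel_replace(discovered, ["__address__"], r"(.*):10250", "${1}:10255", "__address__")
step2 = relabel_labelmap(step1, r"__meta_kubernetes_node_label_(.+)")
print(step2["__address__"], step2["kubernetes_io_hostname"])

# kube-state-metrics jobs: no source_labels, so the rule always sets a fixed address.
fixed = relabel_replace(discovered, [], r"(.*)", "192.168.1.21:30866", "__address__")
print(fixed["__address__"])
```

The last call shows why the kube-state-metrics jobs need no `source_labels`: with an empty source value, `(.*)` always matches, so every discovered endpoint is pointed at the NodePort address.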
    
    
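    Notes on the configuration fragment used above:
      This is the part that lets the external Prometheus connect to the k8s cluster.
      api_server: the address of the apiserver.
      namespaces: the namespace in which kube-state-metrics runs.
      bearer_token_file: the token used for authentication. I created a file named k8s.token in the same directory as prometheus.yml; its content is the token from the secret prometheus-token-wq9fd in the ops-monit namespace of the k8s cluster.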
    kubernetes_sd_configs:
        - api_server: https://192.168.1.21:6443  # apiserver address
          role: endpoints  # discover targets of type endpoints
          namespaces:
            names:
            - ops-monit
          bearer_token_file: k8s.token
          tls_config:
            insecure_skip_verify: true
        bearer_token_file: k8s.token
        tls_config:
          insecure_skip_verify: true
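
One detail worth keeping in mind when creating k8s.token: the values under a Secret's `data` field are base64-encoded, while `bearer_token_file` must contain the decoded token. A minimal sketch of the decoding step follows. The function names are hypothetical and the encoded value is a placeholder generated in the code, not a real token.

```python
import base64


def decode_secret_token(encoded):
    """Decode the base64 value of a Secret's data.token field into the raw token."""
    return base64.b64decode(encoded, validate=True).decode("utf-8").strip()


def write_token_file(token, path="k8s.token"):
    """Write the decoded token to the file referenced by bearer_token_file."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(token)


# Placeholder standing in for the data.token value read from the Secret.
example_encoded = base64.b64encode(b"example-token-value").decode("ascii")
decoded = decode_secret_token(example_encoded)
print(decoded)
```

In practice the encoded value would come from the Secret itself, and `write_token_file(decoded)` would be run in the directory that holds prometheus.yml.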
    

    Configuring the dashboard

    Import dashboard ID 13105 into Grafana.

    For details on the configuration, see the template's description: https://grafana.com/grafana/dashboards/13105-1-k8s-for-prometheus-dashboard-20211010/

    [Figure: grafana监控k8s.png, the k8s dashboard in Grafana]
