Monitoring Kubernetes with Prometheus Operator

Author: Anson前行 | Published 2020-10-27 17:52

    Original post: http://www.unmin.club/2020/07/prometheus-operator/

    1. Introduction to Prometheus Operator

    Prometheus Operator is designed to simplify deploying, managing, and running Prometheus and Alertmanager clusters on Kubernetes. It provides simple definitions for monitoring Kubernetes resources and for managing Prometheus instances, and it ships with a set of built-in alerting rules and dashboards.

    Prometheus Operator architecture diagram

    Prometheus Operator components:

    • Operator: the controller, which deploys and manages Prometheus Servers based on custom resources;
    • Prometheus Server: the Prometheus cluster deployed according to what is defined in a Prometheus custom resource; these custom resources can be thought of as the StatefulSets used to manage the Prometheus Server cluster;
    • Prometheus: declares the desired state of a Prometheus resource object; the Operator ensures the running state always matches the definition;
    • ServiceMonitor: declares which services to monitor, i.e. an abstraction over exporters; it selects the corresponding Service endpoints via labels, so that Prometheus Server scrapes metrics from the selected Services;
    • Service: the service to be monitored; simply put, the object that Prometheus monitors.

    2. Installing Prometheus Operator

    It can be installed with Helm; here we install it manually.

    Clone the kube-prometheus project to a local server:

    $ git clone https://github.com/coreos/kube-prometheus.git
    $ cd kube-prometheus/manifests
    

    Apply the CRDs and the Operator objects under the setup directory:

    $ kubectl  apply  -f setup/
    namespace/monitoring created
    customresourcedefinition.apiextensions.k8s.io/alertmanagers.monitoring.coreos.com created
    customresourcedefinition.apiextensions.k8s.io/podmonitors.monitoring.coreos.com created
    customresourcedefinition.apiextensions.k8s.io/probes.monitoring.coreos.com created
    customresourcedefinition.apiextensions.k8s.io/prometheuses.monitoring.coreos.com created
    customresourcedefinition.apiextensions.k8s.io/prometheusrules.monitoring.coreos.com created
    customresourcedefinition.apiextensions.k8s.io/servicemonitors.monitoring.coreos.com created
    customresourcedefinition.apiextensions.k8s.io/thanosrulers.monitoring.coreos.com created
    clusterrole.rbac.authorization.k8s.io/prometheus-operator created
    clusterrolebinding.rbac.authorization.k8s.io/prometheus-operator created
    deployment.apps/prometheus-operator created
    service/prometheus-operator created
    serviceaccount/prometheus-operator created
    

    Create the resources under the manifests directory:

    $ kubectl  apply -f  .
    alertmanager.monitoring.coreos.com/main created
    secret/alertmanager-main created
    service/alertmanager-main created
    serviceaccount/alertmanager-main created
    .............
    $ kubectl  get pods -n monitoring
    NAME                                   READY   STATUS    RESTARTS   AGE
    alertmanager-main-0                    1/2     Running   0          2d17h
    alertmanager-main-1                    1/2     Running   0          2d17h
    alertmanager-main-2                    1/2     Running   0          2d17h
    grafana-86445dccbb-m7kzg               1/1     Running   0          2d17h
    kube-state-metrics-5b67d79459-zf27k    3/3     Running   0          2d17h
    node-exporter-blx8m                    2/2     Running   0          2d17h
    node-exporter-zpns2                    2/2     Running   0          2d17h
    node-exporter-zrd6g                    2/2     Running   0          2d17h
    prometheus-adapter-66b855f564-mf9mc    1/1     Running   0          2d17h
    prometheus-k8s-0                       3/3     Running   1          2d17h
    prometheus-k8s-1                       3/3     Running   1          2d17h
    prometheus-operator-78fcb48ccf-sgklz   2/2     Running   0          2d17h
    

    3. Accessing the components via Ingress

    Because these components' default Services are of type ClusterIP, they cannot be reached from outside the cluster. We could change the Service type to NodePort with kubectl edit to expose them externally.

    Here we instead use an Ingress to create a domain name each for Prometheus, Alertmanager, and Grafana.

    apiVersion: extensions/v1beta1
    kind: Ingress
    metadata:
      namespace: monitoring
      name: prometheus-ingress
    spec:
      rules:
      - host: k8s.grafana.com
        http:
          paths:
          - backend:
              serviceName: grafana
              servicePort: 3000
      - host: k8s.prometheus.com
        http:
          paths:
          - backend:
              serviceName: prometheus-k8s
              servicePort: 9090
      - host: k8s.alertmanager.com
        http:
          paths:
          - backend:
              serviceName: alertmanager-main
              servicePort: 9093
    

    Create the Ingress object:

    $ kubectl  create -f ingress.yaml 
    ingress.extensions/prometheus-ingress created
    $ kubectl  get ingress -A
    NAMESPACE    NAME                 CLASS    HOSTS                                                     ADDRESS   PORTS     AGE
    default      my-nginx             <none>   nginx.ingress.com                                                   80, 443   2d21h                        
    monitoring   prometheus-ingress   <none>   k8s.grafana.com,k8s.prometheus.com,k8s.alertmanager.com             80        4s
    

    As shown above, a domain name now exists for each component. Add the Ingress node's IP address to your local hosts file and the UIs can be reached by domain name.
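    For example, assuming the Ingress controller listens on a node at 192.168.16.173 (a placeholder address; substitute your own node's IP), the hosts entries would look like this:

    ```
    # /etc/hosts on the workstation (example IP -- substitute the address
    # of the node where the Ingress controller is exposed)
    192.168.16.173  k8s.grafana.com
    192.168.16.173  k8s.prometheus.com
    192.168.16.173  k8s.alertmanager.com
    ```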



    4. Adding monitoring targets

    kube-prometheus monitors a number of system components by default, but real workloads require adding your own targets. The steps to add a custom monitoring target are:

    1. Create a ServiceMonitor object, which adds a scrape job to Prometheus
    2. Point the ServiceMonitor at a Service object that exposes the metrics endpoint
    3. Make sure the Service can actually serve the metrics data

    For example, to monitor the etcd service, first create a ServiceMonitor object:

    apiVersion: monitoring.coreos.com/v1
    kind: ServiceMonitor
    metadata:
      name: etcd-k8s
      namespace: monitoring
      labels:
        k8s-app: etcd-k8s
    spec:
      jobLabel: k8s-app
      endpoints:
      - port: port
        interval: 15s
      selector:
        matchLabels:
          k8s-app: etcd
      namespaceSelector:
        matchNames:
        - kube-system
    

    This definition matches Services in the kube-system namespace that carry the label k8s-app=etcd; jobLabel names the label whose value is used as the job name.
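    The selector semantics can be sketched in a few lines (illustrative Python, not the actual Kubernetes implementation): a Service matches when it carries every key/value pair listed under matchLabels.

    ```python
    # Illustrative sketch of Kubernetes matchLabels semantics: a Service
    # matches when its labels contain every pair from the selector.
    def selector_matches(service_labels: dict, match_labels: dict) -> bool:
        return all(service_labels.get(k) == v for k, v in match_labels.items())

    # The ServiceMonitor above selects Services labeled k8s-app=etcd
    print(selector_matches({"k8s-app": "etcd"}, {"k8s-app": "etcd"}))      # True
    print(selector_matches({"k8s-app": "kube-dns"}, {"k8s-app": "etcd"}))  # False
    ```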

    Create the ServiceMonitor object:

    $ kubectl apply -f prometheus-serviceMonitorEtcd.yaml
    servicemonitor.monitoring.coreos.com "etcd-k8s" created
    

    Then create a Service (and Endpoints) object for etcd:

    apiVersion: v1
    kind: Service
    metadata:
      name: etcd-k8s
      namespace: kube-system
      labels:
        k8s-app: etcd
    spec:
      type: ClusterIP
      clusterIP: None  # must be None (a headless Service)
      ports:
      - name: port
        port: 2381
    ---
    apiVersion: v1
    kind: Endpoints
    metadata:
      name: etcd-k8s
      namespace: kube-system
      labels:
        k8s-app: etcd
    subsets:
    - addresses:
      - ip: 192.168.16.173  # the etcd node address; for an etcd cluster, add one entry per member
        nodeName: etc-master
      ports:
      - name: port
        port: 2381
    

    The manifest above registers the external etcd service in the cluster via an Endpoints object and creates a matching Service for it.

    Create the Service object:

    $ kubectl apply -f etcd-service.yaml
    service/etcd-k8s configured
    endpoints/etcd-k8s configured
    $ kubectl get svc -n kube-system -l k8s-app=etcd
    NAME       TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)    AGE
    etcd-k8s   ClusterIP   None         <none>        2381/TCP   1d
    

    After creation, check the targets page in Prometheus.

    The connection to port 2381 is refused. This is because etcd starts with its metrics endpoint bound to localhost (--listen-metrics-urls=http://127.0.0.1:2381). Edit the etcd.yaml file under /etc/kubernetes/manifests/ and change listen-metrics-urls to the node IP:

    - --listen-metrics-urls=http://192.168.16.173:2381
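    A sed one-liner is one way to make this change. The sketch below edits a local copy of the flag so it can be tried safely; on a real control-plane node the file is /etc/kubernetes/manifests/etcd.yaml and the IP is your node's address:

    ```shell
    # Work on a local copy of the manifest line; editing the real static-pod
    # manifest at /etc/kubernetes/manifests/etcd.yaml requires root on the node.
    NODE_IP=192.168.16.173   # example node IP from this article
    printf '%s\n' '    - --listen-metrics-urls=http://127.0.0.1:2381' > etcd-demo.yaml
    sed -i "s|http://127.0.0.1:2381|http://${NODE_IP}:2381|" etcd-demo.yaml
    cat etcd-demo.yaml   # now binds the node IP instead of localhost
    ```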
    

    etcd restarts automatically after the change; check Prometheus again to confirm the target is up.

    Then import the dashboard with ID 3070 into Grafana to get the etcd monitoring charts (the default Grafana credentials are admin/admin).



    5. Custom alerting rules

    1. Specifying the Alertmanager address

    When deploying Prometheus by hand, pointing it at Alertmanager only took an edit to the prometheus.yml configuration file. How is the address specified for an Operator-managed Prometheus? Start by checking the configuration shown on the Configuration page of the Prometheus web UI.

    There you can see that the alertmanagers section uses Kubernetes service discovery with role: endpoints, matching the Service named alertmanager-main on the port named web. That is how this Prometheus is pointed at Alertmanager.

    2. Adding alerting rules

    The configuration above shows that the rule file path is /etc/prometheus/rules/prometheus-k8s-rulefiles-0/*.yaml; we can inspect it inside the Prometheus container:

    $ kubectl exec -it prometheus-k8s-0 /bin/sh -n monitoring
    Defaulting container name to prometheus.
    Use 'kubectl describe pod/prometheus-k8s-0 -n monitoring' to see all of the containers in this pod.
    /prometheus $ ls /etc/prometheus/rules/prometheus-k8s-rulefiles-0/
    monitoring-prometheus-k8s-rules.yaml
    /prometheus $ cat /etc/prometheus/rules/prometheus-k8s-rulefiles-0/monitoring-pr
    ometheus-k8s-rules.yaml
    groups:
    - name: k8s.rules
      rules:
      - expr: |
          sum(rate(container_cpu_usage_seconds_total{job="kubelet", image!="", container_name!=""}[5m])) by (namespace)
        record: namespace:container_cpu_usage_seconds_total:sum_rate
    ......
    

    This file is in fact the content of the PrometheusRule object we created earlier:

    $ cat manifests/prometheus-rules.yaml |  head -10
    apiVersion: monitoring.coreos.com/v1
    kind: PrometheusRule
    metadata:
      labels:
        prometheus: k8s
        role: alert-rules
      name: prometheus-k8s-rules
      namespace: monitoring
    spec:
      groups:
    

    The Prometheus resource definition contains one very important property, ruleSelector, the filter used to select rule objects: it matches PrometheusRule resources labeled prometheus=k8s and role=alert-rules.

    ruleSelector:
      matchLabels:
        prometheus: k8s
        role: alert-rules
    

    So to add a rule, we only need to create a PrometheusRule object carrying the labels prometheus=k8s and role=alert-rules. For example, here is an availability alert for the etcd service we just added:

    apiVersion: monitoring.coreos.com/v1
    kind: PrometheusRule
    metadata:
      labels:
        prometheus: k8s
        role: alert-rules
      name: etcd-rules
      namespace: monitoring
    spec:
      groups:
      - name: etcd
        rules:
        - alert: EtcdClusterUnavailable
          annotations:
            summary: etcd cluster small
            description: If one more etcd peer goes down the cluster will be unavailable
          expr: |
            count(up{job="etcd"} == 0) > (count(up{job="etcd"}) / 2 - 1)
          for: 3m
          labels:
            severity: critical
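    The expr encodes quorum arithmetic: the alert fires once enough members are down that one more failure would make the cluster unavailable. A quick, purely illustrative Python check of the formula:

    ```python
    # count(up{job="etcd"} == 0) > count(up{job="etcd"}) / 2 - 1
    def etcd_alert_fires(members: int, down: int) -> bool:
        """True when losing one more etcd member would cost quorum."""
        return down > members / 2 - 1

    print(etcd_alert_fires(3, 0))  # False: all members healthy
    print(etcd_alert_fires(3, 1))  # True: a 3-node cluster is one failure from losing quorum
    print(etcd_alert_fires(5, 2))  # True: same situation for a 5-node cluster
    ```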
    

    Create the PrometheusRule object, then check whether the rule file shows up in the Prometheus container:

    $ kubectl  create -f etcd-rules.yaml 
    prometheusrule.monitoring.coreos.com/etcd-rules created
    [root@k8s-master01 manifests]# kubectl exec -it prometheus-k8s-0 /bin/sh -n monitoring
    kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl kubectl exec [POD] -- [COMMAND] instead.
    Defaulting container name to prometheus.
    Use 'kubectl describe pod/prometheus-k8s-0 -n monitoring' to see all of the containers in this pod.
    /prometheus $ ls  /etc/prometheus/rules/prometheus-k8s-rulefiles-0/
    monitoring-etcd-rules.yaml            monitoring-prometheus-k8s-rules.yaml
    

    Then check the rules in the Prometheus web UI.

    6. Configuring WeChat alerts

    The alerting rules are in place, but no notification channel is configured yet, so next we modify the Alertmanager configuration. First, look at the status page of the Alertmanager web UI to see the current configuration.

    This configuration comes from the alertmanager-secret file we created earlier:

    $ cat  manifests/alertmanager-secret.yaml 
    apiVersion: v1
    data: {}
    kind: Secret
    metadata:
      name: alertmanager-main
      namespace: monitoring
    stringData:
      alertmanager.yaml: |-
        "global":
          "resolve_timeout": "5m"
        "inhibit_rules":
        - "equal":
          - "namespace"
          - "alertname"
          "source_match":
            "severity": "critical"
          "target_match_re":
            "severity": "warning|info"
        - "equal":
          - "namespace"
          - "alertname"
          "source_match":
            "severity": "warning"
          "target_match_re":
            "severity": "info"
        "receivers":
        - "name": "Default"
        - "name": "Watchdog"
        - "name": "Critical"
        "route":
          "group_by":
          - "namespace"
          "group_interval": "5m"
          "group_wait": "30s"
          "receiver": "Default"
          "repeat_interval": "12h"
          "routes":
          - "match":
              "alertname": "Watchdog"
            "receiver": "Watchdog"
          - "match":
              "severity": "critical"
            "receiver": "Critical"
    type: Opaque
    

    We can modify this YAML to set up our notification channel; here we send critical-severity alerts to WeChat.

    apiVersion: v1
    data: {}
    kind: Secret
    metadata:
      name: alertmanager-main
      namespace: monitoring
    stringData:
      alertmanager.yaml: |-
        "global":
          "resolve_timeout": "5m"
        "inhibit_rules":
        - "equal":
          - "namespace"
          - "alertname"
          "source_match":
            "severity": "critical"
          "target_match_re":
            "severity": "warning|info"
        - "equal":
          - "namespace"
          - "alertname"
          "source_match":
            "severity": "warning"
          "target_match_re":
            "severity": "info"
        "receivers":
        - "name": "Default"
        - "name": "Watchdog"
        - "name": "Critical"
      "wechat_configs":  # add the WeChat credentials
           - "corp_id": 'ww314010b4720f24'
             "to_party": '1'
             "agent_id": '1000002'
             "api_secret": '9nmYzEg8X860ZBIoOkToCbh_oNc'
             "send_resolved": true
        "route":
          "group_by":
          - "namespace"
          "group_interval": "5m"
          "group_wait": "30s"
          "receiver": "Default"
          "repeat_interval": "12h"
          "routes":
          - "match":
              "alertname": "Watchdog"
            "receiver": "Watchdog"
          - "match":
              "severity": "critical"
            "receiver": "Critical"
    type: Opaque
    

    Then force-update the alertmanager-secret object (delete and re-apply):

    $ kubectl  delete -f alertmanager-secret.yaml 
    secret "alertmanager-main" deleted
    $ kubectl   apply -f alertmanager-secret.yaml 
    secret/alertmanager-main created
    

    Then check the Alertmanager web UI to confirm the new configuration is loaded.

    When a critical-severity alert fires, the notification arrives in WeChat.

    7. Auto-discovery configuration

    As the number of Services and Pods in the cluster grows, creating a ServiceMonitor by hand for every service becomes tedious. To solve this, Prometheus Operator supports additional scrape configurations: extra configuration that performs service discovery and picks up targets automatically.

    Create prometheus-additional.yaml:

    - job_name: 'kubernetes-endpoints'
      kubernetes_sd_configs:
      - role: endpoints
      relabel_configs:
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
        action: replace
        target_label: __scheme__
        regex: (https?)
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
        action: replace
        target_label: __address__
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
      - action: labelmap
        regex: __meta_kubernetes_service_label_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        action: replace
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_service_name]
        action: replace
        target_label: kubernetes_name
      - source_labels: [__meta_kubernetes_pod_name]
        action: replace
        target_label: kubernetes_pod_name
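    The __address__ rewrite above works because Prometheus joins the source labels with ';' before matching. A small, purely illustrative Python check of that regex (copied from the rule):

    ```python
    import re

    # Regex from the relabel rule: host (optionally :port) ; annotation port
    pattern = re.compile(r"([^:]+)(?::\d+)?;(\d+)")

    def rewrite_address(address: str, port_annotation: str) -> str:
        # Prometheus concatenates source_labels with ';' before matching
        return pattern.sub(r"\1:\2", f"{address};{port_annotation}")

    print(rewrite_address("10.244.0.2:53", "9153"))  # 10.244.0.2:9153
    print(rewrite_address("10.244.0.2", "9153"))     # 10.244.0.2:9153
    ```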
    

    Create a Secret object from this file:

    $ kubectl create secret generic additional-configs --from-file=prometheus-additional.yaml -n monitoring
    secret "additional-configs" created
    

    Then add this extra configuration to the Prometheus resource definition via the additionalScrapeConfigs property:

    $ cat prometheus-prometheus.yaml
    .................
      serviceAccountName: prometheus-k8s
      serviceMonitorNamespaceSelector: {}
      serviceMonitorSelector: {}
      version: v2.20.0
      additionalScrapeConfigs:
        name: additional-configs
        key: prometheus-additional.yaml
    

    After that, update the prometheus CRD resource object:

    $ kubectl apply -f prometheus-prometheus.yaml
    prometheus.monitoring.coreos.com/k8s configured
    

    We can then check the Prometheus web UI to see whether the configuration was loaded.

    However, no corresponding job shows up on the targets page. Check the Prometheus Pod logs:

    $ kubectl logs -f prometheus-k8s-0 prometheus -n monitoring
    .............
    level=error ts=2020-10-27T06:01:30.129Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:361: Failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:monitoring:prometheus-k8s\" cannot list resource \"endpoints\" in API group \"\" at the cluster scope"
    level=error ts=2020-10-27T06:01:39.194Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:362: Failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:monitoring:prometheus-k8s\" cannot list resource \"services\" in API group \"\" at the cluster scope"
    

    The errors occur because Prometheus runs under a ServiceAccount named prometheus-k8s, which is bound to a ClusterRole of the same name. Check that ClusterRole's permissions:

    $ cat prometheus-clusterRole.yaml 
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      name: prometheus-k8s
    rules:
    - apiGroups:
      - ""
      resources:
      - nodes/metrics
      verbs:
      - get
    - nonResourceURLs:
      - /metrics
      verbs:
      - get
    

    This ClusterRole has no list permission on Services or Pods; adding the missing permissions should fix it:

    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      name: prometheus-k8s
    rules:
    - apiGroups:
      - ""
      resources:
      - nodes
      - services
      - endpoints
      - pods
      - nodes/proxy
      verbs:
      - get
      - list
      - watch
    - apiGroups:
      - ""
      resources:
      - configmaps
      - nodes/metrics
      verbs:
      - get
    - nonResourceURLs:
      - /metrics
      verbs:
      - get
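    As a sanity check, the difference between the two ClusterRoles can be sketched with a tiny, illustrative matcher (real Kubernetes RBAC also evaluates apiGroups, resourceNames, and nonResourceURLs):

    ```python
    # Illustrative only: check a (verb, resource) pair against simplified rules.
    old_rules = [
        {"resources": ["nodes/metrics"], "verbs": ["get"]},
    ]
    new_rules = [
        {"resources": ["nodes", "services", "endpoints", "pods", "nodes/proxy"],
         "verbs": ["get", "list", "watch"]},
        {"resources": ["configmaps", "nodes/metrics"], "verbs": ["get"]},
    ]

    def allowed(rules, verb, resource):
        return any(resource in r["resources"] and verb in r["verbs"] for r in rules)

    print(allowed(old_rules, "list", "endpoints"))  # False: the error in the Pod log
    print(allowed(new_rules, "list", "endpoints"))  # True after the fix
    ```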
    

    Recreate all of the Prometheus objects:

    $ cd manifests
    $ mkdir prometheus
    $ mv prometheus-* prometheus
    $ cd prometheus
    $ kubectl delete -f .
    $ kubectl apply  -f .
    

    After the rebuild, the kubernetes-endpoints job appears on the targets page.

    The kube-dns Service is among the scraped targets because it carries the prometheus.io/scrape=true annotation; check the kube-dns Service:
    $  kubectl   describe svc kube-dns -n kube-system
    Name:              kube-dns
    Namespace:         kube-system
    Labels:            k8s-app=kube-dns
                       kubernetes.io/cluster-service=true
                       kubernetes.io/name=KubeDNS
    Annotations:       prometheus.io/port: 9153
                       prometheus.io/scrape: true 
    Selector:          k8s-app=kube-dns
    Type:              ClusterIP
    IP:                10.96.0.10
    Port:              dns  53/UDP
    TargetPort:        53/UDP
    Endpoints:         10.244.0.2:53,10.244.0.3:53
    Port:              dns-tcp  53/TCP
    TargetPort:        53/TCP
    Endpoints:         10.244.0.2:53,10.244.0.3:53
    Port:              metrics  9153/TCP
    TargetPort:        9153/TCP
    Endpoints:         10.244.0.2:9153,10.244.0.3:9153
    Session Affinity:  None
    Events:            <none>
    

    So when creating a Service, add the prometheus.io/scrape=true annotation so that Prometheus service discovery can pick it up:

    apiVersion: v1
    kind: Service
    metadata:
      annotations:
        prometheus.io/port: "9153"
        prometheus.io/scrape: "true"
      creationTimestamp: "2020-10-26T08:39:47Z"
      labels:
        k8s-app: kube-dns
        kubernetes.io/cluster-service: "true"
    ............
    

    Note: even with this annotation, Prometheus can only scrape the target if the application actually exposes a metrics endpoint; the endpoint's port can be specified with the prometheus.io/port: "9153" annotation. If Prometheus cannot collect metrics from the target, it is reported as DOWN.
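    The discovery behavior described above boils down to two annotation checks; a sketch (illustrative Python, mirroring the keep rule and the port rewrite in prometheus-additional.yaml):

    ```python
    # Illustrative sketch of the two annotation checks made by the
    # kubernetes-endpoints job in prometheus-additional.yaml.
    def scrape_target(annotations, address):
        """Return the scrape address for a Service, or None if it is dropped."""
        if annotations.get("prometheus.io/scrape") != "true":
            return None  # the 'keep' relabel action drops this target
        port = annotations.get("prometheus.io/port")
        host = address.split(":")[0]
        return f"{host}:{port}" if port else address

    print(scrape_target({"prometheus.io/scrape": "true",
                         "prometheus.io/port": "9153"}, "10.244.0.2:53"))  # 10.244.0.2:9153
    print(scrape_target({}, "10.244.0.2:53"))  # None
    ```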


    8. Persisting data with NFS

    After rebuilding the Prometheus Pods above, you will find that all previous data is gone: the Prometheus created through the prometheus CRD has no data persistence configured.

    $ kubectl get pod prometheus-k8s-0 -n monitoring -o yaml
    ......
        volumeMounts:
        - mountPath: /etc/prometheus/config_out
          name: config-out
          readOnly: true
        - mountPath: /prometheus
          name: prometheus-k8s-db
    ......
      volumes:
    ......
      - emptyDir: {}
        name: prometheus-k8s-db
    ......
    

    The Prometheus data directory /prometheus is mounted via emptyDir, and emptyDir volumes are deleted together with the Pod, so we need to persist Prometheus's data.

    Prometheus is deployed through a StatefulSet controller, so we persist its data through a StorageClass; here we use the NFS StorageClass set up earlier.

    Add a storage property to the Prometheus CRD object:

    $ cat  prometheus-prometheus.yaml 
    apiVersion: monitoring.coreos.com/v1
    kind: Prometheus
    metadata:
      labels:
        prometheus: k8s
      name: k8s
      namespace: monitoring
    spec:
      alerting:
        alertmanagers:
        - name: alertmanager-main
          namespace: monitoring
          port: web
      storage:         # persistent storage
        volumeClaimTemplate:
          spec:
            storageClassName: nfs-data-db
            resources:
              requests:
                storage: 10Gi
      image: quay.io/prometheus/prometheus:v2.20.0
      nodeSelector:
        kubernetes.io/os: linux
      podMonitorNamespaceSelector: {}
      podMonitorSelector: {}
      probeNamespaceSelector: {}
      probeSelector: {}
      replicas: 2
      resources:
        requests:
          memory: 400Mi
      ruleSelector:
        matchLabels:
          prometheus: k8s
          role: alert-rules
      securityContext:
        fsGroup: 2000
        runAsNonRoot: true
        runAsUser: 1000
      serviceAccountName: prometheus-k8s
      serviceMonitorNamespaceSelector: {}
      serviceMonitorSelector: {}
      version: v2.20.0
      additionalScrapeConfigs:
        name: additional-configs
        key: prometheus-additional.yaml
    

    Update the prometheus CRD resource:

    $ kubectl  apply -f prometheus-prometheus.yaml 
    prometheus.monitoring.coreos.com/k8s configured
    

    Check the PVC status:

    $ kubectl   get pvc -n monitoring
    NAME                                 STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
    prometheus-k8s-db-prometheus-k8s-0   Bound    pvc-8e73e8b2-1e98-452c-8aaa-a9ba694fe234   10Gi       RWO            nfs-data-db    2m
    prometheus-k8s-db-prometheus-k8s-1   Bound    pvc-fe817cdd-812f-489e-b82c-d5de7f0dbf93   10Gi       RWO            nfs-data-db    2m
    


    Post link: https://www.haomeiwen.com/subject/bpvggctx.html