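Using Prometheus Operator. Prometheus Operator is an open-source project from CoreOS that makes Prometheus easier to manage and operate on Kubernetes. It relies on Kubernetes custom resource definitions (CRDs) to manage the Prometheus monitoring and alerting stack declaratively.
Deploying with the Helm chart
The chart repository used here is stable: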
helm repo add stable https://charts.helm.sh/stable
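Wait a moment for the repository index to finish caching, much like running yum makecache.
Create the namespace (Prometheus conventionally lives in the monitoring namespace):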
kubectl create namespace monitoring
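To monitor etcd, create a secret that holds its certificates
Prometheus Operator defines a ServiceMonitor for etcd, but the etcd metrics endpoint is only reachable over HTTPS. If the certificates are not imported, Prometheus cannot reach the endpoint and etcd goes unmonitored.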
kubectl create secret generic etcd-certs -nmonitoring \
--from-file=/etc/kubernetes/pki/etcd/healthcheck-client.crt \
--from-file=/etc/kubernetes/pki/etcd/healthcheck-client.key \
--from-file=/etc/kubernetes/pki/etcd/ca.crt
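Creating the secret fails if any of the three files is missing, so it can help to check them first. The following is a minimal sketch; the helper name missing_files is hypothetical, and the paths are the ones used in the command above.

```shell
# Print each argument that is not a readable regular file.
# Returns 0 only when every file is present.
missing_files() {
  rc=0
  for f in "$@"; do
    if [ ! -f "$f" ] || [ ! -r "$f" ]; then
      printf '%s\n' "$f"
      rc=1
    fi
  done
  return "$rc"
}

# Usage on a control-plane node (not run here):
#   missing_files /etc/kubernetes/pki/etcd/healthcheck-client.crt \
#                 /etc/kubernetes/pki/etcd/healthcheck-client.key \
#                 /etc/kubernetes/pki/etcd/ca.crt \
#     && echo "all certificate files present"
```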
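When Helm installs the chart, the manifests in its crds/ directory are submitted to Kubernetes automatically, which is why prometheusOperator.createCustomResource is set to false in the command below.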
helm install prometheus stable/prometheus-operator \
--namespace monitoring \
--set prometheusOperator.createCustomResource=false \
--set kubeEtcd.serviceMonitor.scheme=https \
--set kubeEtcd.serviceMonitor.caFile=/etc/prometheus/secrets/etcd-certs/ca.crt \
--set kubeEtcd.serviceMonitor.certFile=/etc/prometheus/secrets/etcd-certs/healthcheck-client.crt \
--set kubeEtcd.serviceMonitor.keyFile=/etc/prometheus/secrets/etcd-certs/healthcheck-client.key \
--set prometheus.prometheusSpec.secrets={etcd-certs}
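The same settings can also be kept in a values file, which avoids shell quoting problems with the {etcd-certs} list syntax. This is a sketch only: the function name write_etcd_values and the file name etcd-values.yaml are hypothetical, and the keys simply mirror the --set flags above.

```shell
# Write the --set options from the install command to a values file.
# $1: path of the file to create.
write_etcd_values() {
  cat > "$1" <<'EOF'
prometheusOperator:
  createCustomResource: false
kubeEtcd:
  serviceMonitor:
    scheme: https
    caFile: /etc/prometheus/secrets/etcd-certs/ca.crt
    certFile: /etc/prometheus/secrets/etcd-certs/healthcheck-client.crt
    keyFile: /etc/prometheus/secrets/etcd-certs/healthcheck-client.key
prometheus:
  prometheusSpec:
    secrets:
      - etcd-certs
EOF
}

# Usage (not run here):
#   write_etcd_values etcd-values.yaml
#   helm install prometheus stable/prometheus-operator \
#     --namespace monitoring -f etcd-values.yaml
```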
View the Kubernetes resources
[root@master1 ~]# kubectl -n monitoring get all
NAME READY STATUS RESTARTS AGE
pod/alertmanager-prometheus-prometheus-oper-alertmanager-0 2/2 Running 0 3d18h
pod/prometheus-grafana-7c78857f5c-nc9nn 2/2 Running 0 3d18h
pod/prometheus-kube-state-metrics-95d956569-gr29p 1/1 Running 0 3d18h
pod/prometheus-prometheus-node-exporter-d92j6 1/1 Running 0 3d18h
pod/prometheus-prometheus-node-exporter-fvcps 1/1 Running 0 3d18h
pod/prometheus-prometheus-node-exporter-kd7dj 1/1 Running 0 3d18h
pod/prometheus-prometheus-node-exporter-xj9vb 1/1 Running 0 3d18h
pod/prometheus-prometheus-oper-operator-6d9c4bdb9f-xfq8w 2/2 Running 0 3d18h
pod/prometheus-prometheus-prometheus-oper-prometheus-0 3/3 Running 1 3d18h
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/alertmanager-operated ClusterIP None <none> 9093/TCP,9094/TCP,9094/UDP 3d18h
service/prometheus-grafana NodePort 10.99.178.85 <none> 80:32378/TCP 3d18h
service/prometheus-kube-state-metrics ClusterIP 10.99.192.121 <none> 8080/TCP 3d18h
service/prometheus-operated ClusterIP None <none> 9090/TCP 3d18h
service/prometheus-prometheus-node-exporter ClusterIP 10.101.141.120 <none> 9100/TCP 3d18h
service/prometheus-prometheus-oper-alertmanager ClusterIP 10.99.195.15 <none> 9093/TCP 3d18h
service/prometheus-prometheus-oper-operator ClusterIP 10.107.137.161 <none> 8080/TCP,443/TCP 3d18h
service/prometheus-prometheus-oper-prometheus NodePort 10.107.206.105 <none> 9090:30689/TCP 3d18h
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
daemonset.apps/prometheus-prometheus-node-exporter 4 4 4 4 4 <none> 3d18h
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/prometheus-grafana 1/1 1 1 3d18h
deployment.apps/prometheus-kube-state-metrics 1/1 1 1 3d18h
deployment.apps/prometheus-prometheus-oper-operator 1/1 1 1 3d18h
NAME DESIRED CURRENT READY AGE
replicaset.apps/prometheus-grafana-7c78857f5c 1 1 1 3d18h
replicaset.apps/prometheus-kube-state-metrics-95d956569 1 1 1 3d18h
replicaset.apps/prometheus-prometheus-oper-operator-6d9c4bdb9f 1 1 1 3d18h
NAME READY AGE
statefulset.apps/alertmanager-prometheus-prometheus-oper-alertmanager 1/1 3d18h
statefulset.apps/prometheus-prometheus-prometheus-oper-prometheus 1/1 3d18h
View the CRDs that were created
[root@master1 ~]# kubectl get crds |grep coreos
alertmanagers.monitoring.coreos.com 2021-03-22T03:46:43Z
podmonitors.monitoring.coreos.com 2021-03-22T03:46:43Z
prometheuses.monitoring.coreos.com 2021-03-22T03:46:43Z
prometheusrules.monitoring.coreos.com 2021-03-22T03:46:43Z
servicemonitors.monitoring.coreos.com 2021-03-22T03:46:43Z
thanosrulers.monitoring.coreos.com 2021-03-22T07:34:54Z
Check that the Prometheus pods are healthy
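One way to do this check from the command line is to filter the pod list for anything that is not fully ready. The helper below is a sketch with a hypothetical name, unhealthy_pods; it assumes the default column layout of kubectl get pods (NAME, READY, STATUS, ...) and would also flag pods in a Completed state.

```shell
# Read `kubectl get pods --no-headers` output on stdin and print the name
# of every pod whose STATUS is not Running or whose READY count (e.g. 1/2)
# is below the expected number of containers.
unhealthy_pods() {
  awk '{
    split($2, ready, "/")
    if ($3 != "Running" || ready[1] != ready[2]) print $1
  }'
}

# Usage against a live cluster (not run here):
#   kubectl -n monitoring get pods --no-headers | unhealthy_pods
```

Empty output means every pod in the namespace is Running with all containers ready.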
Once the pods are healthy, expose the services outside the cluster by switching them to NodePort
kubectl patch svc prometheus-grafana -p '{"spec":{"type":"NodePort"}}' -n monitoring
kubectl patch svc prometheus-prometheus-oper-prometheus -p '{"spec":{"type":"NodePort"}}' -n monitoring
Then look up the externally exposed ports
# kubectl -n monitoring get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
alertmanager-operated ClusterIP None <none> 9093/TCP,9094/TCP,9094/UDP 3d19h
prometheus-grafana NodePort 10.99.178.85 <none> 80:32378/TCP 3d19h
prometheus-kube-state-metrics ClusterIP 10.99.192.121 <none> 8080/TCP 3d19h
prometheus-operated ClusterIP None <none> 9090/TCP 3d19h
prometheus-prometheus-node-exporter ClusterIP 10.101.141.120 <none> 9100/TCP 3d19h
prometheus-prometheus-oper-alertmanager ClusterIP 10.99.195.15 <none> 9093/TCP 3d19h
prometheus-prometheus-oper-operator ClusterIP 10.107.137.161 <none> 8080/TCP,443/TCP 3d19h
prometheus-prometheus-oper-prometheus NodePort 10.107.206.105 <none> 9090:30689/TCP 3d19h
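In the PORT(S) column, a value such as 80:32378/TCP means service port 80 is published on node port 32378. As a small illustration, the hypothetical helper node_port_of below pulls the node port out of such a value; on a live cluster the jsonpath query shown in the comment should give the same number directly.

```shell
# Extract the node port from a single PORT(S) value such as "80:32378/TCP".
# Fails (returns 1) when the value has no colon, i.e. no node port is allocated.
node_port_of() {
  case $1 in
    *:*)
      p=${1#*:}                 # drop the service port and the colon
      printf '%s\n' "${p%%/*}"  # drop the protocol suffix
      ;;
    *)
      return 1
      ;;
  esac
}

# Usage (not run here):
#   node_port_of '80:32378/TCP'
#   kubectl -n monitoring get svc prometheus-grafana \
#     -o jsonpath='{.spec.ports[0].nodePort}'
```

With the output above, Grafana would be reachable on port 32378 and Prometheus on port 30689 of any node's IP address.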
Get the Grafana password
kubectl get secret --namespace monitoring prometheus-grafana -o jsonpath="{.data.admin-password}"
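Then log in through a browser with the user name admin. The value returned above is base64-encoded rather than encrypted; once decoded, the default password should be prom-operator.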
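Kubernetes stores secret data base64-encoded, so decoding is a single pipe. The sketch below wraps it in a hypothetical helper, decode_secret_value, and assumes a base64 binary that accepts -d (GNU coreutils, BusyBox, recent macOS).

```shell
# Decode one base64-encoded secret value passed as $1.
decode_secret_value() {
  printf '%s' "$1" | base64 -d
}

# Usage against a live cluster (not run here):
#   decode_secret_value "$(kubectl get secret --namespace monitoring \
#     prometheus-grafana -o jsonpath='{.data.admin-password}')"

# Offline check with the encoding of the chart's default password:
decode_secret_value 'cHJvbS1vcGVyYXRvcg=='   # prints prom-operator
```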