I. Introduction to Prometheus
Prometheus is an open-source monitoring and alerting system with a built-in time-series database (TSDB), originally developed at SoundCloud. Since 2012 many companies and organizations have adopted it, the project has a very active developer and user community, and it is now maintained as an independent open-source project. In 2016 Prometheus joined the CNCF (Cloud Native Computing Foundation) as the second hosted project after Kubernetes. Its design draws on Google's internal monitoring systems, which makes it a natural fit alongside Kubernetes. Compared with an InfluxDB-based solution it performs better and ships with alerting built in. It is designed for large clusters and uses a pull model for data collection: an application only needs to expose a metrics endpoint and register that endpoint with Prometheus, and collection is taken care of.
The diagram below shows the Prometheus architecture.
1. Features of Prometheus:
1. A multi-dimensional data model: a time series is identified by a metric name plus a set of key/value labels
2. A flexible query language, PromQL, for querying across those dimensions
3. No reliance on distributed storage; single server nodes are autonomous
4. Time-series collection over HTTP via a pull model (see the short example after this list)
5. Pushing time series is also supported, through an intermediary gateway (Pushgateway)
6. Targets are found through service discovery or static configuration
7. Multiple modes of graphing and dashboard support
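To make the pull model concrete, here is a minimal sketch of a prometheus.yml scrape entry; the job name and target address are placeholders for illustration, not part of this article's deployment:

scrape_configs:
  - job_name: my-app                    # placeholder job name
    scrape_interval: 15s                # how often Prometheus pulls
    metrics_path: /metrics              # endpoint exposed by the application (this is the default)
    static_configs:
      - targets: ["10.0.0.100:8080"]    # placeholder host:port of the application

With this entry Prometheus scrapes http://10.0.0.100:8080/metrics every 15 seconds and stores the samples in its TSDB.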
Prometheus components. The Prometheus ecosystem consists of multiple components, many of which are optional:
1. The main Prometheus server, which scrapes and stores time-series data
2. Client libraries for instrumenting application code or writing exporters (Go, Java, Python, Ruby, ...)
3. A push gateway for supporting short-lived jobs
4. Visualization dashboards (two options, PromDash and Grafana; Grafana is the mainstream choice today)
5. Special-purpose exporters for services such as HAProxy, StatsD, Graphite, etc.
6. Alertmanager, which handles alert aggregation, routing and silencing separately from the Prometheus server
2. What to Monitor
1. Node-level resource usage (hardware, OS)
2. Pod status and resource usage (OS, application)
3. Kubernetes resource objects (application)
4. Business-level metrics (business)
5. In short: hardware, OS, application, API and business monitoring, plus traffic analysis.
| Target | Implemented by | Example |
| --- | --- | --- |
| Pod performance | cAdvisor | Container CPU and memory utilization |
| Node performance | node-exporter | Node CPU and memory utilization |
| K8s resource objects | kube-state-metrics | Pod / Deployment / Service status |
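This article does not include manifests for node-exporter or kube-state-metrics themselves; they are normally deployed from their upstream projects. Purely as a hedged sketch of the idea (namespace and image tag are assumptions, and upstream manifests also mount the host's /proc and /sys, omitted here), a minimal node-exporter DaemonSet could look like this; the prometheus.io/* annotations are what the kubernetes-pods scrape job configured later in this article keys on:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-exporter
  namespace: prometheus               # assumed namespace
spec:
  selector:
    matchLabels:
      app: node-exporter
  template:
    metadata:
      labels:
        app: node-exporter
      annotations:
        prometheus.io/scrape: "true"  # opts the Pod in to annotation-based scraping
        prometheus.io/port: "9100"
    spec:
      hostNetwork: true               # metrics are served on each node's own IP at :9100
      containers:
      - name: node-exporter
        image: prom/node-exporter:v1.0.1   # assumed tag
        ports:
        - containerPort: 9100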
1. How Pods are monitored:
When a project is deployed there may be many Pods and many containers. How are they discovered dynamically? Prometheus's built-in Kubernetes service discovery performs a list/watch against the kube-apiserver, so newly created objects are noticed in real time and automatically added as scrape targets; kube-state-metrics likewise list/watches the API server and exposes the state of those resource objects as metrics for Prometheus to scrape.
Related notes:
* Besides Kubernetes service discovery, Prometheus supports many other discovery mechanisms, for example azure_sd_config, consul_sd_config, ec2_sd_config, openstack_sd_config, file_sd_config, and so on; they are documented at the link below (a short file_sd_config sketch follows this list).
* [https://prometheus.io/docs/prometheus/latest/configuration/configuration/#kubernetes_sd_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#kubernetes_sd_config)
* Grafana download: [https://grafana.com/grafana/download](https://grafana.com/grafana/download)
* Dashboard templates: [https://grafana.com/grafana/dashboards](https://grafana.com/grafana/dashboards)
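As an illustration of one of those mechanisms, a file-based discovery job lets Prometheus pick up targets from a file that can be rewritten without restarting the server. This is only a sketch; the path, addresses and label are made up for illustration:

scrape_configs:
  - job_name: file-discovered
    file_sd_configs:
      - files:
          - /etc/prometheus/targets/*.json   # assumed path; re-read automatically when the files change
        refresh_interval: 1m
# Example content of /etc/prometheus/targets/web.json:
# [ { "targets": ["10.0.0.21:9100", "10.0.0.22:9100"], "labels": { "env": "test" } } ]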
II. Deploying Prometheus
The deployment needs persistent storage first, i.e. PVs and PVCs.
node02 is used as the NFS server, but nfs-utils and rpcbind are installed on all three machines.
node02:
yum -y install nfs-utils rpcbind
mkdir -p /nfs/kubernetes
systemctl enable rpcbind
systemctl enable nfs
systemctl start nfs
vim /etc/exports
/nfs/kubernetes *(rw,no_root_squash)
exportfs -r
Test the mount from another node:
mount -t nfs 10.0.0.13:/nfs/kubernetes /mnt
[root@k8s-master prometheus]# mount |grep nfs
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw,relatime)
nfsd on /proc/fs/nfsd type nfsd (rw,relatime)
10.0.0.13:/nfs/kubernetes on /mnt type nfs4 (rw,relatime,vers=4.1,rsize=262144,wsize=262144,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.0.0.11,local_lock=none,addr=10.0.0.13)
Create three directories on the NFS export:
mkdir /nfs/kubernetes/{pv01,pv02,pv03}
1. Static PV provisioning
Create the PVs:
[root@k8s-master prometheus]# cat pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
name: pv01
spec:
capacity:
storage: 5Gi
accessModes:
- ReadWriteMany
nfs:
path: /nfs/kubernetes/pv01
server: 10.0.0.13
---
apiVersion: v1
kind: PersistentVolume
metadata:
name: pv02
spec:
capacity:
storage: 10Gi
accessModes:
- ReadWriteMany
nfs:
path: /nfs/kubernetes/pv02
server: 10.0.0.13
---
apiVersion: v1
kind: PersistentVolume
metadata:
name: pv03
spec:
capacity:
storage: 50Gi
accessModes:
- ReadWriteMany
nfs:
path: /nfs/kubernetes/pv03
server: 10.0.0.13
Create the PVC, together with a Deployment that mounts it:
[root@k8s-master prometheus]# cat deploy-pvc.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: web
name: web
spec:
replicas: 1
selector:
matchLabels:
app: web
template:
metadata:
labels:
app: web
spec:
containers:
- image: nginx
name: nginx
volumeMounts:
- name: data
mountPath: /usr/share/nginx/html
volumes:
- name: data
persistentVolumeClaim:
claimName: my-pvc
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: my-pvc
spec:
accessModes:
- ReadWriteMany
resources:
requests:
storage: 5Gi
Check the result (the PVC requests 5Gi RWX, so it binds to pv01, the smallest PV that satisfies the request):
[root@k8s-master prometheus]# kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pv01 5Gi RWX Retain Bound default/my-pvc 83s
pv02 10Gi RWX Retain Available 83s
pv03 50Gi RWX Retain Available 83s
[root@k8s-master prometheus]# kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
my-pvc Bound pv01 5Gi RWX 9s
2. Dynamic PV provisioning
A provisioner plugin has to be installed first:
[root@k8s-master prometheus]# cat nfs-client.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
name: nfs-client-provisioner
---
kind: Deployment
apiVersion: apps/v1
metadata:
name: nfs-client-provisioner
spec:
replicas: 1
strategy:
type: Recreate
selector:
matchLabels:
app: nfs-client-provisioner
template:
metadata:
labels:
app: nfs-client-provisioner
spec:
serviceAccountName: nfs-client-provisioner
containers:
- name: nfs-client-provisioner
image: quay.io/external_storage/nfs-client-provisioner:latest
volumeMounts:
- name: nfs-client-root
mountPath: /persistentvolumes
env:
- name: PROVISIONER_NAME
value: fuseim.pri/nfs
- name: NFS_SERVER
value: 10.0.0.13
- name: NFS_PATH
value: /nfs/kubernetes
volumes:
- name: nfs-client-root
nfs:
server: 10.0.0.13
path: /nfs/kubernetes
Create the StorageClass (archiveOnDelete: "true" means the backing directory is archived rather than removed when the PVC is deleted):
[root@k8s-master prometheus]# cat class.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: managed-nfs-storage
provisioner: fuseim.pri/nfs # or choose another name, must match the deployment's env PROVISIONER_NAME
parameters:
archiveOnDelete: "true"
Grant RBAC permissions to the provisioner:
[root@k8s-master prometheus]# cat rbac.yaml
kind: ServiceAccount
apiVersion: v1
metadata:
name: nfs-client-provisioner
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: nfs-client-provisioner-runner
rules:
- apiGroups: [""]
resources: ["persistentvolumes"]
verbs: ["get", "list", "watch", "create", "delete"]
- apiGroups: [""]
resources: ["persistentvolumeclaims"]
verbs: ["get", "list", "watch", "update"]
- apiGroups: ["storage.k8s.io"]
resources: ["storageclasses"]
verbs: ["get", "list", "watch"]
- apiGroups: [""]
resources: ["events"]
verbs: ["create", "update", "patch"]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: run-nfs-client-provisioner
subjects:
- kind: ServiceAccount
name: nfs-client-provisioner
namespace: default
roleRef:
kind: ClusterRole
name: nfs-client-provisioner-runner
apiGroup: rbac.authorization.k8s.io
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: leader-locking-nfs-client-provisioner
rules:
- apiGroups: [""]
resources: ["endpoints"]
verbs: ["get", "list", "watch", "create", "update", "patch"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: leader-locking-nfs-client-provisioner
subjects:
- kind: ServiceAccount
name: nfs-client-provisioner
# replace with namespace where provisioner is deployed
namespace: default
roleRef:
kind: Role
name: leader-locking-nfs-client-provisioner
apiGroup: rbac.authorization.k8s.io
Volumes are now provisioned automatically; the PVC just references the StorageClass:
[root@k8s-master prometheus]# cat web.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: web
name: web
spec:
replicas: 1
selector:
matchLabels:
app: web
template:
metadata:
labels:
app: web
spec:
containers:
- image: nginx
name: nginx
volumeMounts:
- name: data
mountPath: /usr/share/nginx/html
volumes:
- name: data
persistentVolumeClaim:
claimName: my-pvc
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: my-pvc
spec:
storageClassName: "managed-nfs-storage"
accessModes:
- ReadWriteMany
resources:
requests:
storage: 5Gi
[root@k8s-master prometheus]# kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
my-pvc Bound pvc-bcecf31e-ab0b-4428-bbab-8dcae441b51c 5Gi RWX managed-nfs-storage 10s
[root@k8s-master prometheus]# kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pv01 5Gi RWX Retain Released default/my-pvc 41m
pv02 10Gi RWX Retain Available 41m
pv03 50Gi RWX Retain Available 41m
pvc-bcecf31e-ab0b-4428-bbab-8dcae441b51c 5Gi RWX Delete Bound default/my-pvc managed-nfs-storage 113s
The PVC was bound to an automatically provisioned PV (the provisioner creates a matching subdirectory under the NFS export), so dynamic provisioning works.
3. Deploying Prometheus
Create the namespace:
kubectl create ns prometheus
[root@k8s-master prometheus]# cat prome-rbac.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
name: prometheus
namespace: prometheus
labels:
kubernetes.io/cluster-service: "true"
addonmanager.kubernetes.io/mode: Reconcile
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
name: prometheus
labels:
kubernetes.io/cluster-service: "true"
addonmanager.kubernetes.io/mode: Reconcile
rules:
- apiGroups:
- ""
resources:
- nodes
- nodes/metrics
- services
- endpoints
- pods
verbs:
- get
- list
- watch
- apiGroups:
- ""
resources:
- configmaps
verbs:
- get
- nonResourceURLs:
- "/metrics"
verbs:
- get
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
name: prometheus
labels:
kubernetes.io/cluster-service: "true"
addonmanager.kubernetes.io/mode: Reconcile
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: prometheus
subjects:
- kind: ServiceAccount
name: prometheus
namespace: prometheus
Create the ConfigMap holding prometheus.yml:
[root@k8s-master prometheus]# cat configmap.yaml
# Prometheus configuration format https://prometheus.io/docs/prometheus/latest/configuration/configuration/
apiVersion: v1
kind: ConfigMap
metadata:
name: prometheus-config
namespace: prometheus
labels:
kubernetes.io/cluster-service: "true"
addonmanager.kubernetes.io/mode: EnsureExists
data:
prometheus.yml: |
rule_files:
- /etc/config/rules/*.rules
scrape_configs:
- job_name: prometheus
static_configs:
- targets:
- localhost:9090
- job_name: kubernetes-apiservers
kubernetes_sd_configs:
- role: endpoints
relabel_configs:
- action: keep
regex: default;kubernetes;https
source_labels:
- __meta_kubernetes_namespace
- __meta_kubernetes_service_name
- __meta_kubernetes_endpoint_port_name
scheme: https
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
insecure_skip_verify: true
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
- job_name: kubernetes-nodes-kubelet
kubernetes_sd_configs:
- role: node
relabel_configs:
- action: labelmap
regex: __meta_kubernetes_node_label_(.+)
scheme: https
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
insecure_skip_verify: true
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
- job_name: kubernetes-nodes-cadvisor
kubernetes_sd_configs:
- role: node
relabel_configs:
- action: labelmap
regex: __meta_kubernetes_node_label_(.+)
- target_label: __metrics_path__
replacement: /metrics/cadvisor
scheme: https
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
insecure_skip_verify: true
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
- job_name: kubernetes-service-endpoints
kubernetes_sd_configs:
- role: endpoints
relabel_configs:
- action: keep
regex: true
source_labels:
- __meta_kubernetes_service_annotation_prometheus_io_scrape
- action: replace
regex: (https?)
source_labels:
- __meta_kubernetes_service_annotation_prometheus_io_scheme
target_label: __scheme__
- action: replace
regex: (.+)
source_labels:
- __meta_kubernetes_service_annotation_prometheus_io_path
target_label: __metrics_path__
- action: replace
regex: ([^:]+)(?::\d+)?;(\d+)
replacement: $1:$2
source_labels:
- __address__
- __meta_kubernetes_service_annotation_prometheus_io_port
target_label: __address__
- action: labelmap
regex: __meta_kubernetes_service_label_(.+)
- action: replace
source_labels:
- __meta_kubernetes_namespace
target_label: kubernetes_namespace
- action: replace
source_labels:
- __meta_kubernetes_service_name
target_label: kubernetes_name
- job_name: kubernetes-services
kubernetes_sd_configs:
- role: service
metrics_path: /probe
params:
module:
- http_2xx
relabel_configs:
- action: keep
regex: true
source_labels:
- __meta_kubernetes_service_annotation_prometheus_io_probe
- source_labels:
- __address__
target_label: __param_target
- replacement: blackbox
target_label: __address__
- source_labels:
- __param_target
target_label: instance
- action: labelmap
regex: __meta_kubernetes_service_label_(.+)
- source_labels:
- __meta_kubernetes_namespace
target_label: kubernetes_namespace
- source_labels:
- __meta_kubernetes_service_name
target_label: kubernetes_name
- job_name: kubernetes-pods
kubernetes_sd_configs:
- role: pod
relabel_configs:
- action: keep
regex: true
source_labels:
- __meta_kubernetes_pod_annotation_prometheus_io_scrape
- action: replace
regex: (.+)
source_labels:
- __meta_kubernetes_pod_annotation_prometheus_io_path
target_label: __metrics_path__
- action: replace
regex: ([^:]+)(?::\d+)?;(\d+)
replacement: $1:$2
source_labels:
- __address__
- __meta_kubernetes_pod_annotation_prometheus_io_port
target_label: __address__
- action: labelmap
regex: __meta_kubernetes_pod_label_(.+)
- action: replace
source_labels:
- __meta_kubernetes_namespace
target_label: kubernetes_namespace
- action: replace
source_labels:
- __meta_kubernetes_pod_name
target_label: kubernetes_pod_name
alerting:
alertmanagers:
- static_configs:
- targets: ["alertmanager:80"]
Check:
[root@k8s-master prometheus]# kubectl get cm -n prometheus
NAME DATA AGE
prometheus-config 1 7s
Create the alerting rules:
[root@k8s-master prometheus]# cat prome-rules.yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: prometheus-rules
namespace: prometheus
data:
general.rules: |
groups:
- name: general.rules
rules:
- alert: InstanceDown
expr: up == 0
for: 1m
labels:
severity: error
annotations:
summary: "Instance {{ $labels.instance }} 停止工作"
description: "{{ $labels.instance }} job {{ $labels.job }} 已经停止5分钟以上."
node.rules: |
groups:
- name: node.rules
rules:
- alert: NodeFilesystemUsage
expr: |
100 - (node_filesystem_free_bytes{fstype=~"ext4|xfs"} /
node_filesystem_size_bytes{fstype=~"ext4|xfs"} * 100) > 80
for: 1m
labels:
severity: warning
annotations:
summary: "Instance {{ $labels.instance }} : {{ $labels.mountpoint }} 分区使用率过高"
description: "{{ $labels.instance }}: {{ $labels.mountpoint }} 分区使用大于80% (当前值: {{ $value }})"
- alert: NodeMemoryUsage
expr: |
100 - (node_memory_MemFree_bytes+node_memory_Cached_bytes+node_memory_Buffers_bytes) /
node_memory_MemTotal_bytes * 100 > 80
for: 1m
labels:
severity: warning
annotations:
summary: "Instance {{ $labels.instance }} 内存使用率过高"
description: "{{ $labels.instance }}内存使用大于80% (当前值: {{ $value }})"
- alert: NodeCPUUsage
expr: |
100 - (avg(irate(node_cpu_seconds_total{mode="idle"}[5m])) by (instance) * 100) > 60
for: 1m
labels:
severity: warning
annotations:
summary: "Instance {{ $labels.instance }} CPU使用率过高"
description: "{{ $labels.instance }}CPU使用大于60% (当前值: {{ $value }})"
- alert: KubeNodeNotReady
expr: |
kube_node_status_condition{condition="Ready",status="true"} == 0
for: 1m
labels:
severity: error
annotations:
message: '{{ $labels.node }} has not been Ready for more than 1 minute.'
pod.rules: |
groups:
- name: pod.rules
rules:
- alert: PodCPUUsage
expr: |
sum(rate(container_cpu_usage_seconds_total{image!=""}[1m]) * 100) by (pod_name, namespace) > 80
for: 5m
labels:
severity: warning
annotations:
summary: "命名空间: {{ $labels.namespace }} | Pod名称: {{ $labels.pod_name }} CPU使用大于80% (当前值: {{ $value }})"
- alert: PodMemoryUsage
expr: |
sum(container_memory_rss{image!=""}) by(pod_name, namespace) /
sum(container_spec_memory_limit_bytes{image!=""}) by(pod_name, namespace) * 100 != +inf > 80
for: 5m
labels:
severity: warning
annotations:
summary: "命名空间: {{ $labels.namespace }} | Pod名称: {{ $labels.pod_name }} 内存使用大于80% (当前值: {{ $value }})"
- alert: PodNetworkReceive
expr: |
sum(rate(container_network_receive_bytes_total{image!="",name=~"^k8s_.*"}[5m]) /1000) by (pod_name,namespace) > 30000
for: 5m
labels:
severity: warning
annotations:
summary: "命名空间: {{ $labels.namespace }} | Pod名称: {{ $labels.pod_name }} 入口流量大于30MB/s (当前值: {{ $value }}K/s)"
- alert: PodNetworkTransmit
expr: |
sum(rate(container_network_transmit_bytes_total{image!="",name=~"^k8s_.*"}[5m]) /1000) by (pod_name,namespace) > 30000
for: 5m
labels:
severity: warning
annotations:
summary: "命名空间: {{ $labels.namespace }} | Pod名称: {{ $labels.pod_name }} 出口流量大于30MB/s (当前值: {{ $value }}/K/s)"
- alert: PodRestart
expr: |
sum(changes(kube_pod_container_status_restarts_total[1m])) by (pod,namespace) > 0
for: 1m
labels:
severity: warning
annotations:
summary: "命名空间: {{ $labels.namespace }} | Pod名称: {{ $labels.pod }} Pod重启 (当前值: {{ $value }})"
- alert: PodFailed
expr: |
sum(kube_pod_status_phase{phase="Failed"}) by (pod,namespace) > 0
for: 5s
labels:
severity: error
annotations:
summary: "命名空间: {{ $labels.namespace }} | Pod名称: {{ $labels.pod }} Pod状态Failed (当前值: {{ $value }})"
- alert: PodPending
expr: |
sum(kube_pod_status_phase{phase="Pending"}) by (pod,namespace) > 0
for: 1m
labels:
severity: error
annotations:
summary: "命名空间: {{ $labels.namespace }} | Pod名称: {{ $labels.pod }} Pod状态Pending (当前值: {{ $value }})"
Check:
[root@k8s-master prometheus]# kubectl get cm -n prometheus
NAME DATA AGE
prometheus-config 1 3m
prometheus-rules 3 16s
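The alerting section of prometheus.yml above points at a Service named alertmanager on port 80, which this article does not deploy. Purely as a hedged sketch (image tag and ports are assumptions, and configuring receivers in alertmanager.yml is out of scope here), a minimal Alertmanager in the same namespace might look like this:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: alertmanager
  namespace: prometheus
spec:
  replicas: 1
  selector:
    matchLabels:
      app: alertmanager
  template:
    metadata:
      labels:
        app: alertmanager
    spec:
      containers:
      - name: alertmanager
        image: prom/alertmanager:v0.21.0   # assumed tag
        ports:
        - containerPort: 9093              # Alertmanager's default listen port
---
apiVersion: v1
kind: Service
metadata:
  name: alertmanager                       # must match the target in prometheus.yml
  namespace: prometheus
spec:
  selector:
    app: alertmanager
  ports:
  - port: 80                               # prometheus.yml targets alertmanager:80
    targetPort: 9093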
Deploy the Prometheus StatefulSet and its Service:
[root@k8s-master prometheus]# cat prome-statefulset-svc.yaml
kind: Service
apiVersion: v1
metadata:
name: prometheus
namespace: prometheus
labels:
kubernetes.io/name: "Prometheus"
kubernetes.io/cluster-service: "true"
addonmanager.kubernetes.io/mode: Reconcile
spec:
type: NodePort
ports:
- name: http
port: 9090
protocol: TCP
targetPort: 9090
nodePort: 30090
selector:
k8s-app: prometheus
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: prometheus
namespace: prometheus
labels:
k8s-app: prometheus
kubernetes.io/cluster-service: "true"
addonmanager.kubernetes.io/mode: Reconcile
version: v2.2.1
spec:
serviceName: "prometheus"
replicas: 1
podManagementPolicy: "Parallel"
updateStrategy:
type: "RollingUpdate"
selector:
matchLabels:
k8s-app: prometheus
template:
metadata:
labels:
k8s-app: prometheus
annotations:
scheduler.alpha.kubernetes.io/critical-pod: ''
spec:
priorityClassName: system-cluster-critical
serviceAccountName: prometheus
initContainers:
- name: "init-chown-data"
image: "busybox:latest"
imagePullPolicy: "IfNotPresent"
command: ["chown", "-R", "65534:65534", "/data"]
volumeMounts:
- name: prometheus-data
mountPath: /data
subPath: ""
containers:
- name: prometheus-server-configmap-reload
image: "jimmidyson/configmap-reload:v0.1"
imagePullPolicy: "IfNotPresent"
args:
- --volume-dir=/etc/config
- --webhook-url=http://localhost:9090/-/reload
volumeMounts:
- name: config-volume
mountPath: /etc/config
readOnly: true
resources:
limits:
cpu: 10m
memory: 10Mi
requests:
cpu: 10m
memory: 10Mi
- name: prometheus-server
image: "prom/prometheus:v2.2.1"
imagePullPolicy: "IfNotPresent"
args:
- --config.file=/etc/config/prometheus.yml
- --storage.tsdb.path=/data
- --web.console.libraries=/etc/prometheus/console_libraries
- --web.console.templates=/etc/prometheus/consoles
- --web.enable-lifecycle
ports:
- containerPort: 9090
readinessProbe:
httpGet:
path: /-/ready
port: 9090
initialDelaySeconds: 30
timeoutSeconds: 30
livenessProbe:
httpGet:
path: /-/healthy
port: 9090
initialDelaySeconds: 30
timeoutSeconds: 30
# based on 10 running nodes with 30 pods each
resources:
limits:
cpu: 200m
memory: 1000Mi
requests:
cpu: 200m
memory: 1000Mi
volumeMounts:
- name: config-volume
mountPath: /etc/config
- name: prometheus-data
mountPath: /data
subPath: ""
- name: prometheus-rules
mountPath: /etc/config/rules
terminationGracePeriodSeconds: 300
volumes:
- name: config-volume
configMap:
name: prometheus-config
- name: prometheus-rules
configMap:
name: prometheus-rules
volumeClaimTemplates:
- metadata:
name: prometheus-data
spec:
storageClassName: managed-nfs-storage
accessModes:
- ReadWriteOnce
resources:
requests:
storage: "16Gi"
Check (the Pod shows 2/2 because it runs the configmap-reload sidecar alongside the Prometheus server container):
[root@k8s-master prometheus]# kubectl get pods -n prometheus
NAME READY STATUS RESTARTS AGE
prometheus-0 2/2 Running 0 3m55s
[root@k8s-master prometheus]# kubectl get svc -n prometheus
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
prometheus NodePort 10.105.99.207 <none> 9090:30090/TCP 4m4s
(Screenshot: the Prometheus web UI, reachable on any node at port 30090.)
4. Deploy Grafana to visualize the data
[root@k8s-master prometheus]# cat granfana.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: grafana
namespace: prometheus
spec:
replicas: 1
selector:
matchLabels:
app: grafana
template:
metadata:
labels:
app: grafana
spec:
containers:
- name: grafana
image: grafana/grafana
ports:
- containerPort: 3000
protocol: TCP
resources:
limits:
cpu: 100m
memory: 256Mi
requests:
cpu: 100m
memory: 256Mi
volumeMounts:
- name: grafana-data
mountPath: /var/lib/grafana
subPath: grafana
securityContext:
fsGroup: 472
runAsUser: 472
volumes:
- name: grafana-data
persistentVolumeClaim:
claimName: grafana
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: grafana
namespace: prometheus
spec:
storageClassName: "managed-nfs-storage"
accessModes:
- ReadWriteMany
resources:
requests:
storage: 5Gi
---
apiVersion: v1
kind: Service
metadata:
name: grafana
namespace: prometheus
spec:
type: NodePort
ports:
- port: 80
targetPort: 3000
nodePort: 30007
selector:
app: grafana
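Once Grafana is up, add Prometheus as a data source pointing at the in-cluster Service, http://prometheus:9090 (both run in the prometheus namespace). That can be done in the web UI, or, as a hedged sketch, provisioned declaratively; the ConfigMap below is hypothetical and is not part of the manifests above, and it would have to be mounted into the grafana container at /etc/grafana/provisioning/datasources to take effect:

apiVersion: v1
kind: ConfigMap
metadata:
  name: grafana-datasources           # hypothetical name
  namespace: prometheus
data:
  prometheus.yaml: |
    apiVersion: 1
    datasources:
      - name: Prometheus
        type: prometheus
        access: proxy
        url: http://prometheus:9090   # the Prometheus Service created above
        isDefault: true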
Check:
[root@k8s-master prometheus]# kubectl get pods -n prometheus
NAME READY STATUS RESTARTS AGE
grafana-85dc8f8cd4-bcjkk 1/1 Running 0 2m59s
prometheus-0 2/2 Running 0 18m
[root@k8s-master prometheus]# kubectl get svc -n prometheus
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
grafana NodePort 10.101.145.125 <none> 80:30007/TCP 3m7s
prometheus NodePort 10.105.99.207 <none> 9090:30090/TCP 18m
[root@k8s-master prometheus]# kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pv01 5Gi RWX Retain Released default/my-pvc 76m
pv02 10Gi RWX Retain Available 76m
pv03 50Gi RWX Retain Available 76m
pvc-2db8aab7-2325-4f13-8f24-c6ba31fcb876 16Gi RWO Delete Bound prometheus/prometheus-data-prometheus-0 managed-nfs-storage 18m
pvc-53e19eb4-4f97-4fad-ad88-75ce607632d4 5Gi RWX Delete Bound prometheus/grafana managed-nfs-storage 3m16s
pvc-bcecf31e-ab0b-4428-bbab-8dcae441b51c 5Gi RWX Delete Bound default/my-pvc managed-nfs-storage 36m
Deploy an Ingress for Grafana:
[root@k8s-master prometheus]# cat granfana-ingress.yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
name: grafana
namespace: prometheus
spec:
rules:
- host: k8s.grafana
http:
paths:
- path: /
backend:
serviceName: grafana
servicePort: 80
Check:
[root@k8s-master prometheus]# kubectl get ingress -n prometheus
NAME CLASS HOSTS ADDRESS PORTS AGE
grafana <none> k8s.grafana 80 10s
Add a hosts entry mapping k8s.grafana to a node running the Ingress controller, then open it in a browser; the default username and password are both admin.
Import a dashboard template: https://grafana.com/grafana/dashboards/315