Alertmanager Alerting Configuration and PrometheusAlert


Author: Rami | Published: 2023-10-30 16:33

1. Alertmanager alerting configuration

The operator deployed earlier has already created the Alertmanager instance:

# kubectl get po -n monitoring |grep alertmanager
alertmanager-main-0                         2/2     Running   0          14m

1.2 Modify the Alertmanager configuration

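The default configuration does not cover our alerting needs, so it has to be changed.
When Alertmanager is deployed by the operator, its configuration is stored in a Secret and mounted into the container. Secret data is base64-encoded, not encrypted; the manifest below uses stringData, so the configuration can be edited as plain text.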
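
Because base64 is an encoding rather than encryption, the stored configuration can be read back by anyone who can read the Secret. A minimal Python sketch, assuming a Secret object shaped like the output of `kubectl get secret alertmanager-main -n monitoring -o json`; the sample data and the helper name `decode_secret_data` are illustrative and not part of the original setup:

```python
import base64
import json


def decode_secret_data(secret_json: str) -> dict:
    """Return the Secret's data entries decoded from base64 to text."""
    secret = json.loads(secret_json)
    return {
        key: base64.b64decode(value).decode("utf-8")
        for key, value in secret.get("data", {}).items()
    }


# Illustrative stand-in for real `kubectl get secret ... -o json` output.
sample_config = "global:\n  resolve_timeout: 5m\n"
sample_secret = json.dumps({
    "kind": "Secret",
    "metadata": {"name": "alertmanager-main", "namespace": "monitoring"},
    "data": {
        "alertmanager.yaml": base64.b64encode(sample_config.encode()).decode()
    },
})

decoded = decode_secret_data(sample_secret)
print(decoded["alertmanager.yaml"])
```

No key is involved at any point, which is why access to the Secret itself has to be restricted.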

# Go to the directory where your prometheus-operator manifests are stored
cd /opt/basic-server-charts/prometheus/kube-prometheus-0.10.0/manifests/
cp  alertmanager-secret.yaml{,.bak}
vim alertmanager-secret.yaml
apiVersion: v1
kind: Secret
metadata:
  labels:
    app.kubernetes.io/component: alert-router
    app.kubernetes.io/instance: main
    app.kubernetes.io/name: alertmanager
    app.kubernetes.io/part-of: kube-prometheus
    app.kubernetes.io/version: 0.23.0
  name: alertmanager-main
  namespace: monitoring
stringData:
  alertmanager.yaml: |-
    "global":       # Global settings
      # For alerts that carry no end time, resolve_timeout is how long Alertmanager
      # waits without an update before marking the alert "resolved". Prometheus
      # always sets an end time, so this mainly affects other clients. Webhook
      # receivers are also notified of resolved alerts by default (send_resolved: true).
      "resolve_timeout": "5m"
    "inhibit_rules":  # Inhibition rules
    # While a "critical" alert is firing, "warning" and "info" alerts with the same
    # "namespace" and "alertname" are inhibited and send no notifications.
    - "equal":
      - "namespace"
      - "alertname"
      "source_matchers":
      - "severity = critical"
      "target_matchers":
      - "severity =~ warning|info"
    # Likewise, a firing "warning" alert inhibits "info" alerts with the same
    # "namespace" and "alertname".
    - "equal":
      - "namespace"
      - "alertname"
      "source_matchers":
      - "severity = warning"
      "target_matchers":
      - "severity = info"
    "receivers":
    - "name": "prometheusalert"
      "webhook_configs":
       - "url": 'http://prometheusalert-dingding:8080/prometheusalert?type=dd&tpl=prometheus-dd&ddurl=https://oapi.dingtalk.com/robot/send?access_token=xxxxxxxxxxxx'
    # Root route. Every alert enters the routing tree here, and child routes inherit
    # any option they do not set themselves. An alert is handed to the first child
    # route under "routes" whose "matchers" all match its labels; if none matches,
    # the root route's receiver handles it.
    "route":
      "group_by":
      - "namespace"
      "group_interval": "5m"     # How long to wait before notifying about new alerts added to a group that has already been notified
      "group_wait": "30s"   # How long to wait before the first notification for a new group, so that alerts firing in quick succession are merged into one message
      "receiver": "prometheusalert"   # Default receiver
      "repeat_interval": "10m"  # If a notification was sent successfully and the alert is still not resolved after repeat_interval, the notification is sent again
      "routes":           # Child routes
      - "matchers":    # Alerts labelled severity = critical are sent to prometheusalert
        - "severity = critical"
        "receiver": "prometheusalert"
      - "matchers":
        - "severity = warning"
        "receiver": "prometheusalert"
type: Opaque
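
The effect of the two inhibit rules can be sketched in Python. This is a simplified model for illustration, not Alertmanager's implementation: it supports only the `=` and `=~` operators used above, treats regex matchers as fully anchored, and compares missing labels as empty strings. The alerts are hypothetical.

```python
import re


def matches(labels, matchers):
    """Check labels against (name, op, value) matchers; '=~' is a fully anchored regex."""
    for name, op, value in matchers:
        actual = labels.get(name, "")
        if op == "=" and actual != value:
            return False
        if op == "=~" and re.fullmatch(value, actual) is None:
            return False
    return True


# The two inhibit rules from the configuration above.
RULES = [
    {"source": [("severity", "=", "critical")],
     "target": [("severity", "=~", "warning|info")],
     "equal": ["namespace", "alertname"]},
    {"source": [("severity", "=", "warning")],
     "target": [("severity", "=", "info")],
     "equal": ["namespace", "alertname"]},
]


def is_inhibited(target, firing, rules=RULES):
    """True if some other firing alert suppresses `target` under one of the rules."""
    for rule in rules:
        if not matches(target, rule["target"]):
            continue
        for source in firing:
            if source is target or not matches(source, rule["source"]):
                continue
            if all(source.get(name, "") == target.get(name, "") for name in rule["equal"]):
                return True
    return False


# Hypothetical alerts, for illustration only.
critical = {"alertname": "KubePodCrashLooping", "namespace": "prod", "severity": "critical"}
warning = {"alertname": "KubePodCrashLooping", "namespace": "prod", "severity": "warning"}
other_ns = {"alertname": "KubePodCrashLooping", "namespace": "dev", "severity": "warning"}
firing = [critical, warning, other_ns]

for alert in firing:
    print(alert["namespace"], alert["severity"], "inhibited:", is_inhibited(alert, firing))
```

In this sample the `prod` warning is suppressed because a critical alert with the same `namespace` and `alertname` is firing, while the `dev` warning still notifies because its `namespace` differs.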

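After editing, apply the manifest and restart the service. The operator's config-reloader sidecar normally picks up the change on its own; deleting the pod simply forces an immediate restart. This completes the Alertmanager configuration.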

kubectl apply  -f alertmanager-secret.yaml
kubectl delete  po -n monitoring  alertmanager-main-0
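
To confirm that the new configuration was actually loaded, the running instance can be asked for it. A minimal Python sketch, assuming the Alertmanager v2 API is reachable on `127.0.0.1:9093` (for example through `kubectl port-forward -n monitoring svc/alertmanager-main 9093:9093`, where the service name and port are the kube-prometheus defaults and are an assumption here); the helper names and the sample response are illustrative:

```python
import json
import urllib.request


def loaded_config(status: dict) -> str:
    """Return the raw configuration text Alertmanager reports as currently loaded."""
    return status["config"]["original"]


def fetch_status(base_url="http://127.0.0.1:9093", timeout=10):
    """GET /api/v2/status from Alertmanager and parse the JSON body."""
    with urllib.request.urlopen(f"{base_url}/api/v2/status", timeout=timeout) as resp:
        return json.load(resp)


# Illustrative response fragment; a real response carries more fields.
sample_status = {"config": {"original": "receivers:\n- name: prometheusalert\n"}}
print("prometheusalert" in loaded_config(sample_status))

# Against a live instance, after the port-forward is running:
# print("prometheusalert" in loaded_config(fetch_status()))
```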

Note: details of the webhook_configs URL in the Alertmanager configuration
API reference: https://github.com/feiyu563/PrometheusAlert/blob/master/doc/readme/base-restful.md

- "url": 'http://prometheusalert-dingding:8080/prometheusalert?type=dd&tpl=prometheus-dd&ddurl=https://oapi.dingtalk.com/robot/send?access_token=xxxxxxxxxxxx'

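prometheusalert-dingding: the Service (svc) name of the prometheusalert-dingding deployment described below
8080: the port of that Service
type=dd: sends the alert through the DingTalk channel
tpl=prometheus-dd: the name of the PrometheusAlert message template used to render the alert
ddurl: the DingTalk robot webhook address that receives the message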
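
The `ddurl` value is itself a URL with its own query string. The form shown above works as long as the nested address contains no `&`, but percent-encoding the parameters removes the ambiguity once the robot URL carries additional parameters. A minimal Python sketch; the helper name `build_webhook_url` is illustrative, and the token is the placeholder from the article:

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_webhook_url(service, port, channel, template, robot_url):
    """Assemble the PrometheusAlert webhook URL with a percent-encoded query string."""
    query = urlencode({"type": channel, "tpl": template, "ddurl": robot_url})
    return f"http://{service}:{port}/prometheusalert?{query}"


url = build_webhook_url(
    service="prometheusalert-dingding",
    port=8080,
    channel="dd",
    template="prometheus-dd",
    # Placeholder token, as in the article.
    robot_url="https://oapi.dingtalk.com/robot/send?access_token=xxxxxxxxxxxx",
)
print(url)

# Round-trip check: the nested robot URL survives parsing intact.
params = parse_qs(urlsplit(url).query)
print(params["ddurl"][0])
```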

2. Pushing alerts with PrometheusAlert

Reference: https://github.com/feiyu563/PrometheusAlert

2.1 Deployment (there are several ways to deploy; I use Helm here)

Reference: https://github.com/feiyu563/PrometheusAlert/blob/master/doc/readme/base-install.md

The Helm chart supports an Ingress hostname, which can be set in values.yaml.

git clone https://github.com/feiyu563/PrometheusAlert.git
cd PrometheusAlert/example/helm/prometheusalert
ll
total 40
drwxr-xr-x 4 root root 4096 Oct 20 14:03 ./
drwxr-xr-x 3 root root 4096 Oct 20 13:29 ../
-rw-r--r-- 1 root root  399 Oct 20 13:29 Chart.yaml
drwxr-xr-x 2 root root 4096 Oct 20 15:36 config/
-rw-r--r-- 1 root root  333 Oct 20 13:29 .helmignore
-rw-r--r-- 1 root root 9355 Oct 20 13:29 README.md
drwxr-xr-x 2 root root 4096 Oct 20 14:01 templates/
-rw-r--r-- 1 root root  828 Oct 20 14:03 values.yaml

Edit the configuration. Only DingTalk alerting is configured here.

cd config && cp  app.conf{,.bak}
vim  app.conf
#---------------------↓Global settings-----------------------
appname = PrometheusAlert
# Login user name
login_user=prometheusalert
# Login password
login_password=prometheusalert
# Listen address
httpaddr = "0.0.0.0"
# Listen port
httpport = 8080
runmode = dev
# Proxy setting, e.g. proxy = http://123.123.123.123:8080
proxy =
# Enable JSON request bodies
copyrequestbody = true
# Alert message title
title=PrometheusAlert
# Log output target: file or console
logtype=file
# Log file path
logpath=logs/prometheusalertcenter.log
# Convert the time zone of Prometheus and Graylog alert messages to CST (do not enable if they are already in CST)
prometheus_cst_time=0
# Database driver; sqlite3, mysql and postgres are supported. For mysql or postgres, uncomment db_host, db_port, db_user, db_password and db_name
db_driver=sqlite3

#---------------------↓webhook-----------------------
# Enable the DingTalk alert channel; several channels can be enabled at once (0 = off, 1 = on)
open-dingding=1
# Default DingTalk robot address
ddurl=https://oapi.dingtalk.com/robot/send?access_token=xxxxxxxxxxxxxxxxxxxx
# Whether to @ everyone (0 = off, 1 = on)
dd_isatall=1

2.1.1 Start

# Return to the helm directory so that the chart path "prometheusalert" resolves
cd ../..
helm upgrade --install prometheusalert-dingding  prometheusalert -n monitoring

# kubectl get po -n monitoring |grep prometheusalert-dingding
prometheusalert-dingding-56d7848dd8-lkjwc   1/1     Running   0          11d
# kubectl get svc -n monitoring |grep prometheusalert-dingding
prometheusalert-dingding   ClusterIP   192.168.124.248   <none>        8080/TCP                     11d
# kubectl get ing -n monitoring |grep prometheusalert-dingding
prometheusalert-dingding   <none>   test.prom-alter.test.cn     172.23.11.36   80      11d
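
Once the pod is running, delivery can be checked without waiting for a real alert by posting a hand-built payload to the same endpoint Alertmanager uses. A minimal Python sketch, assuming the payload follows the shape Alertmanager sends to webhook receivers and that PrometheusAlert falls back to the default `ddurl` from app.conf when the parameter is omitted (an assumption; pass `ddurl` explicitly if in doubt). The alert name, annotations, and helper names are hypothetical:

```python
import json
import urllib.request
from datetime import datetime, timezone


def build_test_payload(alertname="TestAlert", namespace="monitoring", severity="warning"):
    """Build a payload in the shape Alertmanager sends to webhook receivers."""
    labels = {"alertname": alertname, "namespace": namespace, "severity": severity}
    annotations = {"summary": "Test alert", "description": "Sent manually to verify delivery"}
    return {
        "version": "4",
        "status": "firing",
        "receiver": "prometheusalert",
        "groupLabels": {"namespace": namespace},
        "commonLabels": labels,
        "commonAnnotations": annotations,
        "externalURL": "",
        "alerts": [{
            "status": "firing",
            "labels": labels,
            "annotations": annotations,
            "startsAt": datetime.now(timezone.utc).isoformat(),
            "endsAt": "0001-01-01T00:00:00Z",
            "generatorURL": "",
        }],
    }


def send_test_alert(url, payload, timeout=10):
    """POST the payload as JSON and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


payload = build_test_payload()
print(json.dumps(payload, indent=2))

# Uncomment after `kubectl port-forward -n monitoring svc/prometheusalert-dingding 8080:8080`:
# send_test_alert("http://127.0.0.1:8080/prometheusalert?type=dd&tpl=prometheus-dd", payload)
```

If the message arrives in the DingTalk group, the PrometheusAlert side is working and any remaining problem lies in the Alertmanager route or receiver.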
