1. File-based service discovery
Prometheus configuration
# cd /root/prometheus-2.25.0.linux-amd64
# cp -p prometheus.yml prometheus.yml.`date +%Y%m%d`
# vi prometheus.yml
Add the following under scrape_configs:
  - job_name: node.search
    file_sd_configs:   # file_sd_configs replaces static_configs, i.e. file-based service discovery
      - files:
          - targets/nodes/*.yml
        refresh_interval: 5m
  - job_name: mysql
    file_sd_configs:
      - files:
          - targets/db/*.yml
        refresh_interval: 5m
Create the target directories
# mkdir -p targets/{nodes,db}
Target file configuration
Example nodes.yml
# cat <<- 'EOF' | tee -a targets/nodes/nodes.yml
- targets:
    - '192.168.8.254:9100'
  labels:
    hostname: basefilesearchnode1
    job: nodes
- targets:
    - '192.168.8.254:9100'
  labels:
    hostname: basefilesearchnode2
    job: nodes
EOF
Example mysql.yml
# cat <<- 'EOF' | tee -a targets/db/mysql.yml
- targets:
    - '192.168.8.254:9104'
  labels:
    hostname: basefilesearchmysql1
    job: mysql
EOF
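Once Prometheus is running with the file_sd_configs shown earlier, further targets can be added by dropping another file into the watched directory. Prometheus detects changes to these files on its own (refresh_interval is only the fallback re-read period), so no restart is needed for new targets. The following is a minimal sketch, not part of the original setup: the helper name add_node_target and the host 192.168.8.253 are illustrative assumptions. The file is written under a temporary name that does not match *.yml and then renamed, so Prometheus never reads a half-written file.

```shell
# Hypothetical helper: add one node_exporter target to the file SD directory.
# TARGET_DIR defaults to the directory used earlier in this section.
TARGET_DIR="${TARGET_DIR:-targets/nodes}"

add_node_target() {
  local host_port="$1"    # e.g. 192.168.8.253:9100 (assumed example host)
  local label_host="$2"   # value for the hostname label, also used as file name
  local tmp
  tmp="$(mktemp "${TARGET_DIR}/.tmp.XXXXXX")" || return 1
  cat > "$tmp" <<EOF
- targets:
    - '${host_port}'
  labels:
    hostname: ${label_host}
    job: nodes
EOF
  chmod 644 "$tmp"
  # A rename inside one directory is atomic, so the *.yml glob only ever
  # sees a complete file.
  mv "$tmp" "${TARGET_DIR}/${label_host}.yml"
}
```

Usage, run from the Prometheus directory: add_node_target 192.168.8.253:9100 basefilesearchnode3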
Check the configuration file
# ./promtool check config prometheus.yml
Checking prometheus.yml
SUCCESS: 2 rule files found
Checking rules/linux_rule.yml
SUCCESS: 8 rules found
Checking rules/node_alerts.yml
SUCCESS: 1 rules found
Restart Prometheus
# systemctl restart prometheus
# systemctl status prometheus
Confirm the discovered targets
Check in the web UI: http://192.168.8.124:9090/service-discovery#
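The same information is also available from the command line through the Prometheus HTTP API, which can be handy on a server without a browser. A minimal sketch, assuming curl is installed; the function name and the PROM_URL variable are illustrative, only the /api/v1/targets endpoint comes from Prometheus:

```shell
# Base URL of the Prometheus server used in this section.
PROM_URL="${PROM_URL:-http://192.168.8.124:9090}"

# Print the JSON document describing all targets Prometheus currently knows,
# including their labels and last scrape health.
list_targets() {
  curl -sS --max-time 5 "${PROM_URL}/api/v1/targets"
}
```

Calling list_targets and searching the output for the hostname labels set in the target files is a quick way to confirm that the files were picked up.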
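2. Consul-based service discovery
About Consul
Consul is an open-source tool written in Go that provides service registration, service discovery, and configuration management for distributed, service-oriented systems. Its features include service registration and discovery, health checks, a key/value store, multi-datacenter support, and distributed consistency guarantees. With the setup so far, adding a new target means changing configuration on the Prometheus server; even with file_sd_configs you still have to log in to the server and edit the corresponding target file (YAML or JSON), which quickly becomes tedious. Prometheus natively supports several service discovery mechanisms, and Consul is one of them.
Download and unpack Consul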
# cd
# mkdir -p consul/{data,conf}
# wget https://releases.hashicorp.com/consul/1.10.3/consul_1.10.3_linux_amd64.zip
# ll consul_1.10.3_linux_amd64.zip
# unzip consul_1.10.3_linux_amd64.zip -d consul/
Consul configuration
# cat <<- 'EOF' | tee -a /root/consul/conf/nodes.json
{
  "services": [
    {
      "ID": "node-exporter-8-100",
      "Name": "consultest1:192.168.8.100",
      "Address": "192.168.8.100",
      "Port": 9100,
      "Tags": ["nodes", "test"],
      "Meta": {
        "os": "linux",
        "group": "nodes",
        "project": "xxx.xxx"
      },
      "Check": [{
        "http": "http://192.168.8.100:9100/metrics",
        "interval": "60s"
      }]
    },
    {
      "ID": "node-exporter-8-101",
      "Name": "consultest1:192.168.8.101",
      "Address": "192.168.8.101",
      "Port": 9100,
      "Tags": ["nodes", "test"],
      "Meta": {
        "os": "linux",
        "group": "nodes",
        "project": "xxx.xxx"
      },
      "Check": [{
        "http": "http://192.168.8.101:9100/metrics",
        "interval": "60s"
      }]
    }
  ]
}
EOF
# cat <<- 'EOF' | tee -a /root/consul/conf/prometheus.json
{
  "services": [
    {
      "Id": "prometheus-000-090",
      "Name": "prometheus:192.168.8.124",
      "Address": "192.168.8.124",
      "Port": 9090,
      "Tags": ["prometheus", "test"],
      "Meta": {
        "os": "Linux",
        "version": "2.25.0",
        "group": "devops",
        "service": "prometheus",
        "roles": "server01",
        "project": "xxx.xxx"
      },
      "Check": [{
        "http": "http://192.168.8.124:9090/metrics",
        "interval": "60s"
      }]
    }
  ]
}
EOF
# cat <<- 'EOF' | tee -a /root/consul/conf/alertmanager.json
{
  "services": [
    {
      "Id": "alertmanager-000-090",
      "Name": "alertmanager:192.168.8.124",
      "Address": "192.168.8.124",
      "Port": 9093,
      "Tags": ["alertmanager", "test"],
      "Meta": {
        "os": "linux",
        "version": "0.21.0",
        "group": "devops",
        "service": "alertmanager",
        "roles": "server01",
        "project": "xxx.ffcs"
      },
      "Check": [{
        "http": "http://192.168.8.124:9093/metrics/",
        "interval": "60s"
      }]
    }
  ]
}
EOF
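Create the Consul systemd unit file
Note: the -dev flag starts a single-node, in-memory agent intended for testing; its state is not persisted across restarts.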
# cat <<- 'EOF' | tee -a /lib/systemd/system/consul.service
[Unit]
Description=consul
Documentation=https://www.consul.io/
After=network.target
[Service]
Type=simple
User=root
Group=root
ExecStart=/root/consul/consul agent -dev -ui -data-dir=/root/consul/data/ -config-dir=/root/consul/conf/ -client=0.0.0.0
ExecReload=/root/consul/consul reload
Restart=on-failure
LimitNOFILE=65535
TimeoutSec=0
PermissionsStartOnly=true
RestartPreventExitStatus=1
PrivateTmp=false
[Install]
WantedBy=multi-user.target
EOF
# systemctl daemon-reload
Enable Consul at boot
# systemctl enable consul
Start Consul
# systemctl start consul
# systemctl status consul
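Besides systemctl status, the Consul CLI itself can confirm that the agent is up and that the service definitions from the configuration directory were loaded. A minimal sketch, assuming the binary sits at /root/consul/consul as in the unit file's ExecStart line; the function name and the CONSUL_BIN variable are illustrative:

```shell
# Path to the consul binary; matches ExecStart in the unit file.
CONSUL_BIN="${CONSUL_BIN:-/root/consul/consul}"

# Show cluster members, then every registered service with its tags.
check_consul_agent() {
  "$CONSUL_BIN" members &&
    "$CONSUL_BIN" catalog services -tags
}
```

If the definitions loaded correctly, the second command should list the service names from the JSON files together with tags such as nodes, prometheus, alertmanager, and test.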
Confirm in the Consul UI
URL: http://192.168.8.124:8500/ui/
Services are registered in Consul, and Prometheus discovers its scrape targets from the services Consul knows about.
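Registration does not have to go through files in the configuration directory: the Consul agent also accepts registrations over its HTTP API, which is what removes the need to log in and edit files for each new target. A minimal sketch, assuming curl is installed; the function names, the service name node-exporter, and the host 192.168.8.102 are illustrative assumptions, while the /v1/agent/service/register and /v1/agent/service/deregister endpoints are Consul's own:

```shell
# Address of the Consul agent started earlier in this section.
CONSUL_ADDR="${CONSUL_ADDR:-192.168.8.124:8500}"

# Build a minimal service definition for one node_exporter instance.
# Arguments: service ID, address, port.
service_payload() {
  printf '{"ID":"%s","Name":"node-exporter","Address":"%s","Port":%s,"Tags":["nodes","test"]}\n' \
    "$1" "$2" "$3"
}

# Register the service through the agent's HTTP API.
register_service() {
  service_payload "$@" |
    curl -sS --max-time 5 -X PUT --data @- \
      "http://${CONSUL_ADDR}/v1/agent/service/register"
}

# Remove a service again by its ID.
deregister_service() {
  curl -sS --max-time 5 -X PUT \
    "http://${CONSUL_ADDR}/v1/agent/service/deregister/$1"
}
```

Usage: register_service node-exporter-8-102 192.168.8.102 9100, and later deregister_service node-exporter-8-102. A Prometheus job that filters on tags only discovers the service if it carries all of those tags, so the tag list in the payload would need to match the job's filter.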
Prometheus configuration
# cd /root/prometheus-2.25.0.linux-amd64
# cp -p prometheus.yml prometheus.yml.`date +%Y%m%d`
# vi prometheus.yml
Add the following under scrape_configs:
  - job_name: "prometheus_consul-test"
    scrape_interval: 60s
    consul_sd_configs:   # consul_sd_configs replaces static_configs, i.e. Consul-based service discovery
      - server: "192.168.8.124:8500"
        services: []
        tags: ["prometheus", "test"]   # a service must carry all listed tags to be discovered
        refresh_interval: 5m
Check the Prometheus configuration file
# ./promtool check config prometheus.yml
Checking prometheus.yml
SUCCESS: 2 rule files found
Checking rules/linux_rule.yml
SUCCESS: 8 rules found
Checking rules/node_alerts.yml
SUCCESS: 1 rules found
Restart Prometheus
# systemctl restart prometheus
# systemctl status prometheus