Preface
As system architectures grow more complex, logging into every server to read logs becomes cumbersome and insecure. With containerized deployments the problem gets worse: every container produces its own logs, which must be aggregated before they can be inspected, and doing that by hand hurts productivity. What is needed is a log management system that collects, processes, and aggregates logs centrally, so that every role can reach the logs it needs through a single entry point. The most popular and mature solution on the market today is the ELK stack.
ELK is a complete log collection and presentation solution from Elastic. The name is an acronym of three products:
E: Elasticsearch
L: Logstash
K: Kibana
Elasticsearch is a real-time full-text search and analytics engine; it provides three core capabilities: search, analysis, and data storage
Logstash is a tool for collecting, parsing, and filtering logs
Kibana is a web-based GUI for searching, analyzing, and visualizing log data stored in Elasticsearch indices
Because of Logstash's performance and efficiency problems, the stack has since evolved from ELK to EFK, and EFK is now especially common in the container world
E: Elasticsearch
F: Filebeat or Fluentd
K: Kibana
One reason is that Docker's own logging driver integrates natively with Fluentd:
docker.png
Deployment Architecture
A commonly used deployment architecture is shown below:
常用架构.png
Pipelines built to handle large volumes of production data may add extra components to the logging architecture for resilience (Kafka, RabbitMQ, Redis) and security (nginx):
缓存架构.png
Container-based deployment architecture:
容器架构.png
Elasticsearch
Elasticsearch is a document-oriented database. Since it is a database, its concepts map onto a relational database as follows:
Relational database ⇒ Database ⇒ Table ⇒ Row ⇒ Column
Elasticsearch (non-relational) ⇒ Index ⇒ Type ⇒ Document ⇒ Field
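To make the mapping concrete, here is a minimal sketch using the REST API (assuming an Elasticsearch 6.x node on localhost:9200; the index name app-logs and the field values are illustrative):

# Index one document (index: app-logs, type: doc, id: 1)
curl -X PUT "localhost:9200/app-logs/doc/1" \
  -H 'Content-Type: application/json' \
  -d '{"service": "nginx", "level": "INFO", "message": "started"}'

# Fetch it back by id
curl -X GET "localhost:9200/app-logs/doc/1?pretty"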
mkdir -p /data/elk-data && chmod 755 /data/elk-data
chown -R 1000:1000 /data/elk-data  # uid 1000 is the elasticsearch user inside the container
docker run -d -p 9200:9200 -p 9300:9300 \
-v /data/elk-data:/usr/share/elasticsearch/data \
--name esforlog \
docker.elastic.co/elasticsearch/elasticsearch:6.7.0
# Pin an explicit version; 6.7.0 matches the Logstash and Kibana images used below.
# For a single-node deployment, also pass -e "discovery.type=single-node"
# Master node: node-1
# Enter the container: docker exec -it [container_id] bash
# e.g. docker exec -it 70ada825aae1 bash
# vi /usr/share/elasticsearch/config/elasticsearch.yml
cluster.name: "esforlog"
network.host: 0.0.0.0
node.master: true
node.data: true
node.name: node-1
network.publish_host: 10.5.11.145
discovery.zen.ping.unicast.hosts: ["10.5.11.145:9300","10.5.11.146:9300","10.5.11.147:9300"]
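The other nodes use the same configuration except for node.name, network.publish_host, and the node roles. A sketch for a data-only node-2, assuming the same three-host cluster (adjust addresses to your environment):
cluster.name: "esforlog"
network.host: 0.0.0.0
node.master: false
node.data: true
node.name: node-2
network.publish_host: 10.5.11.146
discovery.zen.ping.unicast.hosts: ["10.5.11.145:9300","10.5.11.146:9300","10.5.11.147:9300"]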
Check the cluster status:
curl http://10.5.11.145:9200/_cluster/health?pretty
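A healthy cluster reports status green (all shards allocated); yellow means replica shards are unassigned, which is normal on a single node; red means primary shards are missing. The individual nodes can be listed the same way:
curl http://10.5.11.145:9200/_cat/nodes?v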
https://www.elastic.co/guide/en/elasticsearch/reference/master/index.html
Filebeat or Fluentd or Logstash
Beats
It includes four tools:
Packetbeat (collects network traffic data)
Topbeat (collects system-, process-, and filesystem-level CPU and memory usage data)
Filebeat (collects file and log data)
Winlogbeat (collects Windows event log data)
https://www.elastic.co/guide/en/beats/filebeat/master/index.html
https://www.elastic.co/guide/en/logstash/master/index.html
Fluentd
Two variants are in common use:
Fluentd
Fluent Bit (a lighter-weight forwarder)
Fluentd
Fluentd can send logs directly to Elasticsearch through its plugin (fluent-plugin-elasticsearch):
fluent-gem install fluent-plugin-elasticsearch
or bake the plugin into a custom image:
FROM fluent/fluentd:v1.13.3
USER root
RUN ["gem", "install", "elasticsearch", "--no-document", "--version", "7.13.3"]
RUN ["gem", "install", "fluent-plugin-elasticsearch", "--no-document", "--version", "5.0.5"]
USER fluent
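Build and tag the image from the Dockerfile above (the tag my-fluentd-es is only an example):
docker build -t my-fluentd-es .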
Configuration file:
<source>
@type forward
@id input1
@label @mainstream
port 24224
</source>
<filter **>
@type stdout
</filter>
<label @mainstream>
<match docker.**>
@type file
@id output_docker1
path /fluentd/log/docker.*.log
symlink_path /fluentd/log/docker.log
append true
time_slice_format %Y%m%d
time_slice_wait 1m
time_format %Y%m%dT%H%M%S%z
</match>
<match **>
@type file
@id output1
path /fluentd/log/data.*.log
symlink_path /fluentd/log/data.log
append true
time_slice_format %Y%m%d
time_slice_wait 10m
time_format %Y%m%dT%H%M%S%z
</match>
</label>
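The sample configuration above writes events to files. To ship them straight to Elasticsearch instead, the file-based <match docker.**> block inside <label @mainstream> can be swapped for an elasticsearch output; a minimal sketch, assuming the Elasticsearch node set up earlier at 10.5.11.145:

<match docker.**>
  @type elasticsearch
  host 10.5.11.145
  port 9200
  logstash_format true
  logstash_prefix docker
</match>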
# Place the configuration above at /data/fluentd/etc/fluent.conf; the image reads it from /fluentd/etc
docker run -it -d --name fluentd \
-p 24224:24224 \
-p 24224:24224/udp \
-v /data/fluentd/log:/fluentd/log \
-v /data/fluentd/etc:/fluentd/etc \
fluent/fluentd:latest
To verify the setup, run two containers, each with its log-driver set to fluentd:
docker run -d \
--log-driver=fluentd \
--log-opt fluentd-address=localhost:24224 \
--log-opt tag="test-docker-A" \
busybox sh -c 'while true; do echo "This is a log message from container A"; sleep 10; done;'
docker run -d \
--log-driver=fluentd \
--log-opt fluentd-address=localhost:24224 \
--log-opt tag="test-docker-B" \
busybox sh -c 'while true; do echo "This is a log message from container B"; sleep 10; done;'
Fluent Bit
Official image: docker pull fluent/fluent-bit
Configuration files live in /fluent-bit/etc inside the container.
fluent-bit.conf
cat >fluent-bit.conf <<EOF
[SERVICE]
flush 1
daemon off
log_level debug
parsers_file parsers.conf
http_port 2020
storage.sync normal
storage.checksum off
storage.metrics on
plugins_file plugins.conf
http_server off
http_listen 0.0.0.0
@INCLUDE inputs.conf
@INCLUDE outputs.conf
EOF
cat >inputs.conf<<EOF
[INPUT]
name tail
path /logs/*.log
Skip_Long_Lines on
parser json
tag prod
Buffer_Chunk_Size 2M
Buffer_Max_Size 3M
EOF
cat >outputs.conf<<EOF
[OUTPUT]
name es
Host ${eshost}
Port 9200
Logstash_Format On
Logstash_Prefix testxxx
Logstash_DateFormat %Y%m%d
Retry_Limit 5
EOF
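The tail input above sets parser json, which must be defined in the parsers.conf referenced by the [SERVICE] section. The official image already ships a suitable definition; a minimal equivalent for reference:

[PARSER]
    name json
    format json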
Values such as ${eshost} in the configuration can be injected through environment variables:
docker run --restart=always -d -v /data/logs/testlogdir:/logs -e "eshost=10.1.10.1" hub.test.com/tools/fluentbit:test
Some commonly used OUTPUT configurations:
[OUTPUT]
Name kafka
Match api*
Brokers 10.1.1.3:9300,10.1.1.4:9300,10.1.1.5:9300
Topics test.api.log
rdkafka.message.max.bytes 200000000
rdkafka.fetch.message.max.bytes 204857600
[OUTPUT] # values can be injected via env vars / templating
Name es # plugin type
Match {{.Host}}_{{.ESIndex}}* # match sources whose tag is {{.Host}}_{{.ESIndex}}*
Host ${.eshost} # Elasticsearch hostname; a domain name or an IP
Port {{.esport}} # Elasticsearch port
Index {{.ESIndex}}
HTTP_User {{.UserName}}
HTTP_Passwd {{.Password}}
Logstash_Format On # use Logstash-style index names, so the index name can carry a date
Logstash_Prefix logstash # prefix of the index name
Logstash_DateFormat %Y.%m.%d # format of the date suffix
Time_Key tail-time # when Logstash_Format is enabled, each record gets a new timestamp under this key
Time_Key_Format %Y-%m-%dT%H:%M:%S # format of that timestamp
Generate_ID On # de-duplicate records; may cost some performance
Trace_Output Off # print the elasticsearch API calls; useful when debugging
Logstash_Prefix_Key ttt # take the index prefix from this record field instead of Logstash_Prefix
Retry_Limit 5 # retries after a failed delivery; default 2; set to False to retry indefinitely
Logstash
docker run -d -p 5044:5044 --name logstash docker.elastic.co/logstash/logstash:6.7.0
# vi /usr/share/logstash/pipeline/logstash.conf
# See the links below for configuration details; make sure the output hosts point at the Elasticsearch IPs.
# Elasticsearch's default port is 9200, so it can be omitted, as for the third address below.
hosts => ["IP Address 1:port1", "IP Address 2:port2", "IP Address 3"]
vi /usr/share/logstash/config/logstash.yml
# Point the url at the IP of the Elasticsearch master node
http.host: "0.0.0.0"
xpack.monitoring.elasticsearch.url: http://elasticsearch_master_IP:9200
node.name: "testelk"
pipeline.workers: 4 # set to the number of CPU cores
Pipeline configuration file:
input {
beats {
port => 5044
#ssl => true
#ssl_certificate => "/etc/logstash/logstash.crt"
#ssl_key => "/etc/logstash/logstash.key"
# 1. see the SSL configuration reference
}
}
# The filter stage pre-processes events, extracting fields so Elasticsearch can classify and store them properly.
# 2. grok regex capture
# 3. grok plugin syntax
# 4. logstash configuration syntax
# 5. grok built-in patterns
filter {
grok {
match => {"message" => "%{EXIM_DATE:timestamp}\|%{LOGLEVEL:log_level}\|%{INT:pid}\|%{GREEDYDATA}"}
# The message field holds the raw log line, e.g. 2018-12-11 23:46:47.051|DEBUG|3491|helper.py:85|helper._save_to_cache|shop_session
# Here we extract timestamp, log_level, and pid; grok ships predefined patterns such as EXIM_DATE, LOGLEVEL, and INT.
# GREEDYDATA matches whatever remains of the line.
}
# If filebeat added the [fields][function] field, run the matching rule below to extract path.
# The source field is the path the log came from, e.g. /var/log/nginx/feiyang233.club.access.log
# After the match we get path=feiyang233.club.access
if [fields][function]=="nginx" {
grok {
match => {"source" => "/var/log/nginx/%{GREEDYDATA:path}.log%{GREEDYDATA}"}
}
}
# e.g. ims logs come from /var/log/ims_logic/debug.log
# after the match we get path=ims_logic
else if [fields][function]=="ims" {
grok {
match => {"source" => "/var/log/%{GREEDYDATA:path}/%{GREEDYDATA}"}
}
}
else {
grok {
match => {"source" => "/var/log/app/%{GREEDYDATA:path}/%{GREEDYDATA}"}
}
}
# When filebeat defines [fields][function], copy it into a top-level function field, e.g. QA
if [fields][function] {
mutate {
add_field => {
"function" => "%{[fields][function]}"
}
}
}
# Production machines are far more numerous and do not set function in filebeat, so the else branch labels them live.
else {
mutate {
add_field => {
"function" => "live"
}
}
}
# We extracted timestamp in the filter above; now normalize its format and attach a timezone.
date {
match => ["timestamp" , "yyyy-MM-dd HH:mm:ss Z"]
target => "@timestamp"
timezone => "Asia/Singapore"
}
# Replace / with - in the path obtained earlier, because Elasticsearch index names must not contain /
# e.g. feiyang/test becomes feiyang-test
mutate {
gsub => ["path","/","-"]
add_field => {"host_ip" => "%{[fields][host]}"}
remove_field => ["tags","@version","offset","beat","fields","exim_year","exim_month","exim_day","exim_time","timestamp"]
}
# remove_field drops fields we no longer need
}
# On a single node the output stays local and SSL is unnecessary, but the index naming rules still need care.
output {
elasticsearch {
hosts => ["localhost:9200"]
index => "sg-%{function}-%{path}-%{+xxxx.ww}"
# e.g. sg-nginx-feiyang233.club.access-2019.13, where ww is the week number
}
}
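When a grok pattern misbehaves, it is easiest to debug in a throwaway pipeline that reads stdin and prints each parsed event. A sketch using the same image (the official image's entrypoint passes flag arguments through to logstash; paste a sample log line once it starts):

docker run --rm -it docker.elastic.co/logstash/logstash:6.7.0 \
  -e 'input { stdin {} } filter { grok { match => { "message" => "%{EXIM_DATE:timestamp}\|%{LOGLEVEL:log_level}\|%{INT:pid}\|%{GREEDYDATA}" } } } output { stdout { codec => rubydebug } }'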
Kibana
docker run -p 5601:5601 docker.elastic.co/kibana/kibana:6.7.0
# vi /usr/share/kibana/config/kibana.yml
# Point hosts at the IP of the elasticsearch container.
# Here the elasticsearch container's IP is 172.17.0.2.
# Find it with: docker inspect elasticsearch_ID
server.name: kibana
server.host: "0.0.0.0"
elasticsearch.hosts: [ "http://172.17.0.2:9200" ]
xpack.monitoring.ui.container.elasticsearch.enabled: true
# Exit the container, then restart it:
docker restart [container_ID]
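Once Kibana is back up, its status endpoint confirms whether it reached Elasticsearch (port 5601 assumes the mapping above):
curl http://localhost:5601/api/status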
Usage in practice
Basic Elasticsearch operations (these snippets use the Kibana Dev Tools console syntax):
GET _search
{
"query": {
"match_all": {}
}
}
GET /_cat/health?v
GET /_cat/nodes?v
GET /_cluster/allocation/explain
GET /_cluster/state
GET /_cat/thread_pool?v
GET /_cat/indices?health=red&v
GET /_cat/indices?v
# Set replicas to 0 for all existing indices
PUT /*/_settings
{
"index" : {
"number_of_replicas" : 0,
"refresh_interval": "30s"
}
}
GET /_template
# On a single node there is nowhere to place replicas, so set them to 0
PUT _template/app-logstash
{
"index_patterns": ["app-*"],
"settings": {
"number_of_shards": 3,
"number_of_replicas": 0,
"refresh_interval": "30s"
}
}
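To confirm the template was stored:
GET _template/app-logstash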
Example 1: collecting nginx logs
Modify nginx.conf to switch the log format to JSON:
log_format main '{"@timestamp":"$time_iso8601",'
'"@source":"$server_addr",'
'"hostname":"$hostname",'
'"ip":"$http_x_forwarded_for",'
'"client":"$remote_addr",'
'"request_method":"$request_method",'
'"scheme":"$scheme",'
'"domain":"$server_name",'
'"referer":"$http_referer",'
'"request":"$request_uri",'
'"args":"$args",'
'"size":$body_bytes_sent,'
'"status": $status,'
'"responsetime":$request_time,'
'"upstreamtime":"$upstream_response_time",'
'"upstreamaddr":"$upstream_addr",'
'"http_user_agent":"$http_user_agent",'
'"https":"$https"'
'}';
Reference it in the server (virtual host) block:
server {
listen 443 ssl;
server_name rdc-test.xxx.com;
client_body_in_file_only clean;
client_body_buffer_size 32K;
client_max_body_size 150M;
#charset koi8-r;
access_log /data/nginx/logs/host.access.log main;
···
Sample log entries:
{"@timestamp":"2021-09-23T10:16:00+08:00","@source":"10.1.36.x","hostname":"rdc-web","ip":"222.175.xx.x, 100.116.233.200:20778","client":"114.55.xxx.xx","request_method":"GET","scheme":"https","domain":"rdc-test.mingyuanyun.com","referer":"-","request":"//api/v2/upgrade/executive-commands/prepared?CustomerId=2499e3ef-c528-4d35-b713-175396676964&ServerUniqId=432267103072017461784841264064569636499355933101121867611029926&CustomerId=2499e3ef-c528-4d35-b713-175396676964&ServerUniqId=432267103072017461784841264064569636499355933101121867611029926","args":"CustomerId=2499e3ef-c528-4d35-b713-175396676964&ServerUniqId=432267103072017461784841264064569636499355933101121867611029926&CustomerId=2499e3ef-c528-4d35-b713-175396676964&ServerUniqId=432267103072017461784841264064569636499355933101121867611029926","size":99,"status": 200,"responsetime":0.042,"upstreamtime":"0.042","upstreamaddr":"10.4.36.211:8077","http_user_agent":"-","https":"on"}
{"@timestamp":"2021-09-23T10:16:00+08:00","@source":"10.1.36.x","hostname":"rdc-web","ip":"218.97.xx.x, 100.116.233.227:10828","client":"114.55.xxx.xx","request_method":"POST","scheme":"https","domain":"rdc-test.mingyuanyun.com","referer":"-","request":"//api/v2/upgrade/customers/8dd0f7e1-759e-4bed-aba3-e3e4f3e43855/inform","args":"-","size":148,"status": 200,"responsetime":0.028,"upstreamtime":"0.028","upstreamaddr":"10.4.36.208:8077","http_user_agent":"-","https":"on"}
{"@timestamp":"2021-09-23T10:16:00+08:00","@source":"10.1.36.x","hostname":"rdc-web","ip":"121.8.xx.x, 100.116.233.195:22542","client":"114.55.xxx.xx","request_method":"POST","scheme":"https","domain":"rdc-test.mingyuanyun.com","referer":"-","request":"//api/v2/upgrade/products/register","args":"-","size":101,"status": 200,"responsetime":0.083,"upstreamtime":"0.083","upstreamaddr":"10.4.36.209:8077","http_user_agent":"-","https":"on"}
{"@timestamp":"2021-09-23T10:16:00+08:00","@source":"10.1.36.x","hostname":"rdc-web","ip":"219.145.xx.x, 100.116.233.132:38136","client":"114.55.xxx.xx","request_method":"POST","scheme":"https","domain":"rdc-test.mingyuanyun.com","referer":"-","request":"//api/v2/upgrade/products/register","args":"-","size":101,"status": 200,"responsetime":0.036,"upstreamtime":"0.036","upstreamaddr":"10.4.36.208:8077","http_user_agent":"-","https":"on"}
{"@timestamp":"2021-09-23T10:16:00+08:00","@source":"10.1.36.x","hostname":"rdc-web","ip":"120.55.xx.x, 100.116.233.234:23986","client":"114.55.xxx.xx","request_method":"POST","scheme":"https","domain":"rdc-test.mingyuanyun.com","referer":"-","request":"//api/v2/upgrade/customers/5d1f1d5f-ddf2-4243-8bc6-c2d1381fc396/inform","args":"-","size":148,"status": 200,"responsetime":0.027,"upstreamtime":"0.027","upstreamaddr":"10.4.36.207:8077","http_user_agent":"-","https":"on"}
https://www.elastic.co/guide/en/kibana/master/index.html
References
https://wsgzao.github.io/post/elk/
https://my.oschina.net/u/2392330/blog/1558733