1. Background
- Tomcat's access log records the request details (URLs and more) for the applications deployed in Tomcat, as shown in the figure below. This post documents how I collect the access log with Filebeat, stage it in Redis, filter it with Logstash, store it in Elasticsearch, and visualize it with Kibana.
[Figure: sample Tomcat access log]
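- For reference, a line produced by Tomcat's default AccessLogValve pattern (%h %l %u %t "%r" %s %b) looks roughly like the one below; the IP, path, and parameters are made up for illustration:
203.0.113.9 - - [11/Apr/2019:09:22:28 +0800] "GET /api/publicScreen?shopId=12&deviceNo=123456&status=1 HTTP/1.1" 200 1024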
2. Configuring Filebeat to collect the access log
- Edit filebeat.yml:
vi /etc/filebeat/filebeat.yml
###################### Filebeat Configuration Example #########################
# This file is an example configuration file highlighting only the most common
# options. The filebeat.reference.yml file from the same directory contains all the
# supported options with more comments. You can use it as a reference.
#
# You can find the full configuration reference here:
# https://www.elastic.co/guide/en/beats/filebeat/index.html
# For more available modules and options, please see the filebeat.reference.yml sample
# configuration file.
#=========================== Filebeat inputs =============================
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /var/log/nginx/access-json.log   # path of the file to read
  tags: ["nginx-access"]               # used for filtering in logstash
- type: log
  enabled: true
  paths:
    - /var/log/nginx/error.log         # path of the file to read
  tags: ["nginx-error"]                # used for filtering in logstash
- type: log
  enabled: true
  paths:
    - /root/server/apache-tomcat-aic/logs/localhost_access_log.*.txt   # location of the tomcat access log files
  tags: ["tomcat-access"]              # used for filtering in logstash
#============================= Filebeat modules ===============================
filebeat.config.modules:
  # Glob pattern for configuration loading
  path: ${path.config}/modules.d/*.yml
  # Set to true to enable config reloading
  reload.enabled: false
  # Period on which files under path should be checked for changes
  #reload.period: 10s
#==================== Elasticsearch template setting ==========================
setup.template.settings:
  index.number_of_shards: 3
  #index.codec: best_compression
  #_source.enabled: false
#================================ Outputs =====================================
# Configure what output to use when sending the data collected by the beat.
#-------------------------- Redis output ------------------------------
output.redis:
  hosts: ["192.168.1.110:6379"]   # redis host to send output to
  password: "123456"
  key: "filebeat:test16"          # redis key under which the log events are stored
  db: 0
  timeout: 5
#================================ Processors =====================================
# Configure processors to enhance or manipulate events generated by the beat.
processors:
  - add_host_metadata: ~
  - add_cloud_metadata: ~
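- After saving, restart Filebeat and confirm that events are accumulating in Redis. This check is optional and assumes redis-cli is available on the machine; the key name matches the output.redis section above:
systemctl restart filebeat
redis-cli -h 192.168.1.110 -a 123456 llen filebeat:test16   # the list length should grow as requests come in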
3. Configuring Logstash to filter the access log
- Create a new Logstash config file:
vi /etc/logstash/conf.d/tomcat110-access.conf
- Put in the following content:
input {
  redis {
    data_type => "list"
    key => "filebeat:test16"
    host => "192.168.1.110"
    port => 6379
    password => "123456"
    threads => "8"
    db => 0
    #codec => json
  }
}
filter {
  if "tomcat-access" in [tags] {
    grok {
      match => ["message", "%{IPORHOST:client} (%{USER:ident}|-) (%{USER:auth}|-) \[%{HTTPDATE:timestamp}\] \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:http_version})?|-)\" %{NUMBER:response} %{NUMBER:bytes}"]
    }
    date {
      match => ["timestamp", "dd/MMM/yyyy:HH:mm:ss Z"]
      target => "@timestamp"
    }
    mutate {
      split => [ "request", "?" ]
      add_field => ["url", "%{[request][0]}"]
    }
  }
  geoip {
    source => "client"
    database => "/opt/GeoLite2-City/GeoLite2-City.mmdb"
    remove_field => ["[geoip][latitude]", "[geoip][longitude]", "[geoip][country_code]", "[geoip][country_code2]", "[geoip][country_code3]", "[geoip][timezone]", "[geoip][continent_code]", "[geoip][region_code]", "[geoip][ip]"]
    target => "geoip"
  }
  mutate {
    convert => [ "[geoip][coordinates]", "float" ]
    remove_field => "log"
    remove_field => "beat"
    remove_field => "meta"
    remove_field => "prospector"
  }
}
output {
  if "tomcat-access" in [tags] {
    elasticsearch {
      hosts => ["192.168.1.110:9200"]
      index => "logstash-test16-tomcat-access-%{+yyyy.MM.dd}"
    }
  }
}
- A brief explanation of the Logstash config follows.
The input block reads events from the Redis list stored under the key filebeat:test16.
input {
  redis {
    data_type => "list"
    key => "filebeat:test16"
    host => "192.168.1.110"
    port => 6379
    password => "123456"
    threads => "8"
    db => 0
    #codec => json
  }
}
- The filter block:
filter {
  # When tags contains tomcat-access, run the logic below. The tag was set in filebeat.yml.
  if "tomcat-access" in [tags] {
    # Parse the raw log line with grok
    grok {
      match => ["message", "%{IPORHOST:client} (%{USER:ident}|-) (%{USER:auth}|-) \[%{HTTPDATE:timestamp}\] \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:http_version})?|-)\" %{NUMBER:response} %{NUMBER:bytes}"]
    }
    # Convert a time such as 11/Apr/2019:09:22:28 +0800 into 2019-04-11T01:22:28.000Z
    date {
      match => ["timestamp", "dd/MMM/yyyy:HH:mm:ss Z"]
      target => "@timestamp"
    }
    # Split the request field on "?" and add a url field holding the first part.
    # This extracts the endpoint path without the query string, e.g. for
    # /api/publicScreen?shopId=12&deviceNo=123456&status=1 the url becomes /api/publicScreen.
    mutate {
      split => [ "request", "?" ]
      add_field => ["url", "%{[request][0]}"]
    }
  }
  # Geolocate the client IP using the GeoIP database
  geoip {
    source => "client"   # the field holding the external client IP parsed from the log
    database => "/opt/GeoLite2-City/GeoLite2-City.mmdb"
    # Drop the geoip fields we don't need to display
    remove_field => ["[geoip][latitude]", "[geoip][longitude]", "[geoip][country_code]", "[geoip][country_code2]", "[geoip][country_code3]", "[geoip][timezone]", "[geoip][continent_code]", "[geoip][region_code]", "[geoip][ip]"]
    target => "geoip"
  }
  # Remove Filebeat metadata fields that aren't useful in Elasticsearch
  mutate {
    convert => [ "[geoip][coordinates]", "float" ]
    remove_field => "log"
    remove_field => "beat"
    remove_field => "meta"
    remove_field => "prospector"
  }
}
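- After these filters, an event indexed into Elasticsearch looks roughly like the sketch below. The values are illustrative, not from a real run, and the exact geoip fields depend on the database and Logstash version:
{
  "@timestamp": "2019-04-11T01:22:28.000Z",
  "tags": ["tomcat-access"],
  "client": "203.0.113.9",
  "verb": "GET",
  "request": ["/api/publicScreen", "shopId=12&deviceNo=123456&status=1"],
  "url": "/api/publicScreen",
  "http_version": "1.1",
  "response": "200",
  "bytes": "1024",
  "geoip": { "country_name": "...", "city_name": "...", "location": { "lat": 0.0, "lon": 0.0 } }
}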
- In addition, grok patterns can be tested against sample lines in Kibana's Grok Debugger.
[Figure: Grok Debugger in Kibana]
- Restart Logstash and check the startup log for errors:
systemctl restart logstash                     # restart
tail -f /var/log/logstash/logstash-plain.log   # follow the run log
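- You can also syntax-check the pipeline without restarting. The binary path below assumes a package install of Logstash; adjust it for your layout:
/usr/share/logstash/bin/logstash --config.test_and_exit -f /etc/logstash/conf.d/tomcat110-access.conf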
4. Configuring Kibana to display the access log
- Once the configuration above is correct, the log index we created shows up in Kibana.
[Figure: the new index visible in Kibana]
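- Optionally, confirm the index exists directly against Elasticsearch:
curl -s 'http://192.168.1.110:9200/_cat/indices?v' | grep tomcat-access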
- Create a Kibana index pattern for the index so the logs can be visualized.
[Figures: creating the index pattern]
- View the logs.
[Figure: browsing the logs]
- Create a Dashboard for this index.
- First add the Visualizations the Dashboard needs. Start by creating a top-10 ranking of requested URLs.
[Figures: building the top-10 URLs visualization]
- Click Save and give the visualization a name. Next, create a map visualization of the client IPs; the steps are similar to the above, so no more screenshots here, but note that the visualization type must be Coordinate Map. The map relies on geoip.location being mapped as geo_point, which the default logstash-* index template provides; this is one reason the index name above starts with "logstash-".
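- A quick, optional way to confirm the geo_point mapping is in place (substitute your actual index name):
curl -s 'http://192.168.1.110:9200/logstash-test16-tomcat-access-*/_mapping?pretty' | grep -A 2 '"location"'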