ELK: Logstash

Author: 漫画三毛 | Published: 2022-09-15 22:24


Use the same Logstash version as your Elasticsearch installation.

1. Download and install the Linux archive

The Logstash 7.17.6 Linux archive can be downloaded and installed as follows:

wget https://artifacts.elastic.co/downloads/logstash/logstash-7.17.6-linux-x86_64.tar.gz
wget https://artifacts.elastic.co/downloads/logstash/logstash-7.17.6-linux-x86_64.tar.gz.sha512
shasum -a 512 -c logstash-7.17.6-linux-x86_64.tar.gz.sha512
tar -xzf logstash-7.17.6-linux-x86_64.tar.gz
cd logstash-7.17.6/
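
For readers who want to script the checksum step, the verification done by shasum can be sketched in Python. This is a minimal sketch, assuming the .sha512 file holds the hex digest followed by the file name on one line; the function name sha512_matches is hypothetical.

```python
import hashlib
from pathlib import Path


def sha512_matches(archive: Path, checksum_file: Path) -> bool:
    """Return True when the archive's SHA-512 digest equals the published one."""
    # Assumption: the checksum file looks like "<hex digest>  <file name>".
    expected = checksum_file.read_text().split()[0].lower()
    digest = hashlib.sha512()
    with archive.open("rb") as fh:
        # Read in 1 MiB chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected
```

Calling sha512_matches(Path("logstash-7.17.6-linux-x86_64.tar.gz"), Path("logstash-7.17.6-linux-x86_64.tar.gz.sha512")) should return True for an intact download.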

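2. The logstash.yml configuration file

Basic settings:

# When true, quoted strings in pipeline configs process escape sequences (enabling this avoids losing time to escaping mistakes)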
config.support_escapes: true

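When Elasticsearch has authentication enabled:

xpack.monitoring.enabled: true
xpack.monitoring.elasticsearch.username: logstash_system
xpack.monitoring.elasticsearch.password: "****"
xpack.monitoring.elasticsearch.hosts: ["http://ip:9200"]

With authentication enabled, any pipeline output that writes to Elasticsearch must also set user and password (elastic/****).

The settings are described below: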

# ------------ Node identity ------------
# Node name; defaults to the hostname
node.name: test

# ------------ Data path ------------------
# Data directory; defaults to LOGSTASH_HOME/data
path.data:

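# ------------ Pipeline Settings --------------
# Pipeline ID; defaults to main
pipeline.id: main

# Number of workers that run the filter and output stages in parallel; defaults to the number of CPU cores
pipeline.workers:

# Maximum number of events a single worker thread collects from inputs before running its filters and outputs; defaults to 125
pipeline.batch.size: 125

# How long, in milliseconds, to wait for each event before dispatching an undersized batch to the pipeline workers; defaults to 50
pipeline.batch.delay: 50

# When true, forces Logstash to exit during shutdown even if there are still in-flight events in memory
# Defaults to false
pipeline.unsafe_shutdown: false

# Pipeline event ordering
# Options: auto, true, false; defaults to auto
pipeline.ordered: auto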

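# ------------ Pipeline Configuration Settings --------------
# Path to the pipeline configuration files
path.config:

# Pipeline configuration string for the main pipeline
config.string:

# When true, checks that the configuration is valid and then exits; defaults to false
config.test_and_exit: false

# When true, periodically checks whether the configuration has changed and reloads it when it has; defaults to false
config.reload.automatic: false

# How often to check the configuration files for changes; defaults to 3s
config.reload.interval: 3s

# When true, shows the fully compiled configuration as a debug log message; defaults to false
config.debug: false

# When true, quoted strings process escape sequences; defaults to false
config.support_escapes: false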

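# ------------ HTTP API Settings -------------
# Whether the HTTP API is enabled; defaults to true
http.enabled: true

# Bind address, either an IP address or a hostname; defaults to 127.0.0.1
http.host: 127.0.0.1

# Listening port, either a single port or a port range; defaults to 9600-9700
http.port: 9600-9700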

# ------------ Module Settings ---------------
# Module definitions; must be an array
# Module variable names must follow the format var.PLUGIN_TYPE.PLUGIN_NAME.KEY
modules:
  - name: MODULE_NAME
    var.PLUGINTYPE1.PLUGINNAME1.KEY1: VALUE
    var.PLUGINTYPE1.PLUGINNAME1.KEY2: VALUE
    var.PLUGINTYPE2.PLUGINNAME1.KEY1: VALUE
    var.PLUGINTYPE3.PLUGINNAME3.KEY1: VALUE

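# ------------ Queuing Settings --------------
# Internal queuing model used for event buffering; options: memory, persisted; defaults to memory
queue.type: memory

# With the persistent queue enabled (queue.type: persisted), the directory where its data files are stored
# Defaults to path.data/queue
path.queue:

# With the persistent queue enabled (queue.type: persisted), the size of each page data file
# Defaults to 64mb
queue.page_capacity: 64mb

# With the persistent queue enabled (queue.type: persisted), the maximum number of unread events in the queue
# Defaults to 0 (unlimited)
queue.max_events: 0

# With the persistent queue enabled (queue.type: persisted), the total capacity of the queue in bytes; defaults to 1024mb
queue.max_bytes: 1024mb

# With the persistent queue enabled (queue.type: persisted), the maximum number of ACKed events before a checkpoint is forced; defaults to 1024
queue.checkpoint.acks: 1024

# With the persistent queue enabled (queue.type: persisted), the maximum number of written events before a checkpoint is forced; defaults to 1024
queue.checkpoint.writes: 1024

# With the persistent queue enabled (queue.type: persisted), the interval in milliseconds at which a checkpoint is forced on the head page
# Defaults to 1000; 0 disables periodic checkpoints
queue.checkpoint.interval: 1000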

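# ------------ Dead-Letter Queue Settings --------------
# Whether to enable the dead-letter queue (DLQ) feature for plugins that support it; defaults to false
dead_letter_queue.enable: false

# When dead_letter_queue.enable is true, the maximum size of each dead-letter queue
# Entries that would push the queue beyond this size are dropped; defaults to 1024mb
dead_letter_queue.max_bytes: 1024mb

# Directory where dead-letter queue data is stored; defaults to path.data/dead_letter_queue
path.dead_letter_queue: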

# ------------ Debugging Settings --------------
# Log level; options: fatal, error, warn, info, debug, trace; defaults to info
log.level: info

# Log format; options: json, plain; defaults to plain
log.format:

# Log directory; defaults to LOGSTASH_HOME/logs
path.logs:

# ------------ Other Settings --------------
# Directories to search for custom plugins
path.plugins: []

# Whether to write each pipeline's log output to its own log file
# Defaults to false
pipeline.separate_logs: false

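3. Pipeline configuration

Multiple pipelines

If you need to run more than one pipeline in the same process, Logstash provides a way to do so through a configuration file named pipelines.yml. This file must be placed in the path.settings folder and follow this structure: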

- pipeline.id: my-pipeline_1
  path.config: "/etc/path/to/p1.config"
  pipeline.workers: 3
- pipeline.id: my-other-pipeline
  path.config: "/etc/different/path/p2.cfg"
  queue.type: persisted

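The file is written in YAML and contains a list of dictionaries. Each dictionary describes one pipeline, and each key/value pair specifies a setting for that pipeline. The example shows two pipelines, each identified by its ID and configuration path. The first pipeline sets pipeline.workers to 3, while the second enables the persistent queue. Any setting not explicitly set in pipelines.yml falls back to the value specified in the logstash.yml settings file.

When Logstash is started with no arguments, it reads pipelines.yml and instantiates every pipeline defined in it. When you use -e or -f, on the other hand, Logstash ignores pipelines.yml and logs a warning about it.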
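
The fallback behaviour can be pictured as a dictionary merge. The sketch below is only an illustration of that rule, not how Logstash implements it; the values in LOGSTASH_YML_DEFAULTS and the function name effective_settings are hypothetical.

```python
# Hypothetical values standing in for what logstash.yml would provide.
LOGSTASH_YML_DEFAULTS = {"pipeline.workers": 8, "queue.type": "memory"}

# The two pipelines from the pipelines.yml example above.
PIPELINES = [
    {
        "pipeline.id": "my-pipeline_1",
        "path.config": "/etc/path/to/p1.config",
        "pipeline.workers": 3,
    },
    {
        "pipeline.id": "my-other-pipeline",
        "path.config": "/etc/different/path/p2.cfg",
        "queue.type": "persisted",
    },
]


def effective_settings(pipeline: dict, defaults: dict) -> dict:
    """Per-pipeline values win; anything missing falls back to the defaults."""
    return {**defaults, **pipeline}


for pipeline in PIPELINES:
    print(effective_settings(pipeline, LOGSTASH_YML_DEFAULTS))
```

Under these assumed defaults, the first pipeline runs with 3 workers and an in-memory queue, while the second keeps the default worker count and uses the persistent queue.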

4. Startup commands

sudo -u elasticsearch ./bin/logstash

# Run with a specific pipeline config file
sudo -u elasticsearch ./bin/logstash -f aaa.config

# -t checks the given config file for errors and exits
sudo -u elasticsearch ./bin/logstash -f aaa.config -t

# Run in the background, sending stdout and stderr to a log file
nohup sudo -u elasticsearch ./logstash/bin/logstash > /www/logs/logstash.log 2>&1 &

nohup sudo -u elasticsearch ./bin/logstash > /www/logs/logstash.log 2>&1 &

5. Filtering logs of a specific format with Logstash

Sample log line: [2018-11-24 08:33:43,253][ERROR][http-nio-8080-exec-4][com.hh.test.logs.LogsApplication][code:200,msg:测试录入错误日志,param:{}]

filter {
  if "nova" in [tags] {
    grok {
      # Extract date, level, thread, class and msg from the log line
      match => {
        "message" => "(?<date>\d{4}\-\d{2}\-\d{2}\s\d{2}:\d{2}:\d{2},\d{3})\]\[(?<level>[A-Z]{4,5})\]\[(?<thread>[A-Za-z0-9/\-]{4,40})\]\[(?<class>[A-Za-z0-9/.]{4,40})\]\[(?<msg>.*)"
      }
    }
    # The parsed fields replace the raw line
    mutate {
      remove_field => [
        "message"
      ]
    }
    # Drop events whose level does not match the regex (use =~ to test for a match)
    if [level] !~ "(ERROR|WARN|INFO)" {
      # Discard the event
      drop {}
    }
  }
}

An online grok debugger, for reference: http://grokdebug.herokuapp.com/
Note: the hyphen "-" must be escaped, otherwise a parse error occurs.
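
To check the expression against the sample log line without starting Logstash, it can be ported to Python. This is a minimal sketch, not a replacement for the grok debugger: Python's re module needs (?P<name>...) instead of (?<name>...) and is a different regex engine from the one grok uses, and the names SAMPLE_LINE and parse_line are hypothetical.

```python
import re
from typing import Optional

SAMPLE_LINE = (
    "[2018-11-24 08:33:43,253][ERROR][http-nio-8080-exec-4]"
    "[com.hh.test.logs.LogsApplication][code:200,msg:测试录入错误日志,param:{}]"
)

# Same expression as in the grok filter, with Python-style named groups.
LOG_PATTERN = re.compile(
    r"(?P<date>\d{4}\-\d{2}\-\d{2}\s\d{2}:\d{2}:\d{2},\d{3})\]"
    r"\[(?P<level>[A-Z]{4,5})\]"
    r"\[(?P<thread>[A-Za-z0-9/\-]{4,40})\]"
    r"\[(?P<class>[A-Za-z0-9/.]{4,40})\]"
    r"\[(?P<msg>.*)"
)


def parse_line(line: str) -> Optional[dict]:
    """Return the extracted fields, or None when the line would be dropped."""
    match = LOG_PATTERN.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Mirrors the conditional in the filter: keep only ERROR, WARN and INFO.
    if not re.search(r"(ERROR|WARN|INFO)", fields["level"]):
        return None
    return fields


print(parse_line(SAMPLE_LINE))
```

Because the last group is a greedy .*, the msg field keeps the closing bracket of the log line.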

6. Troubleshooting

Configuration file format errors

"LogStash::ConfigurationError", :message=>"Expected one of #, input, filter, output at line 1
- Option 1: set the following in the configuration file
```
config.support_escapes: true
```
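- Option 2: the cause depends on the operating system of the deployment (Windows/Linux)
  1. Windows
    
    This is a file-encoding problem: save the file as UTF-8 without BOM (see https://blog.csdn.net/qq_32131499/article/details/91338972).
    
    If that does not help, see http://www.04007.cn/article/835.html
  2. Linux
    
    If Option 1 has no effect, write a minimal config directly with vi and test with that. The problem is very likely caused by a file that was created and committed on Windows.
  3. If the cause is still unclear, run the file explicitly from the command line with logstash -f **.config -t and check whether it passes validation.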
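
The encoding check in Option 2 can be scripted. The following is a minimal sketch, assuming the file is otherwise valid UTF-8; the function names are hypothetical, and treating CRLF line endings as a second symptom of a Windows-created file is an assumption rather than something the error message confirms.

```python
import codecs


def inspect_config_bytes(data: bytes) -> dict:
    """Report the two usual traces of a file created on Windows."""
    return {
        "has_bom": data.startswith(codecs.BOM_UTF8),
        "has_crlf": b"\r\n" in data,  # assumption: CRLF may also matter
    }


def normalize_config_bytes(data: bytes) -> bytes:
    """Strip a leading UTF-8 BOM and convert CRLF line endings to LF."""
    if data.startswith(codecs.BOM_UTF8):
        data = data[len(codecs.BOM_UTF8):]
    return data.replace(b"\r\n", b"\n")
```

Reading the config file with open(path, "rb").read() and passing the bytes to inspect_config_bytes shows whether a BOM precedes the first character, which would explain a parse error reported at line 1.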
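If the same format error appears after enabling pipelines.yml
- First check whether the log contains "Ignoring the 'pipelines.yml' file because modules or command line options are specified". If it does, pipelines.yml was ignored and the pipeline configuration never took effect.
  
  For example:
  
  a single-pipeline configuration in logstash.yml is enabled at the same time as pipelines.yml
- If that does not resolve it, go back to the first problem above.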

📌References:

http://www.noobyard.com/article/p-ucnfmacz-gk.html

https://www.elastic.co/guide/en/logstash/7.17/introduction.html

https://blog.51cto.com/u_15047490/4228036

Example:

input {
    file {
        path => "/www/log/java-web/log_info.log"
        type => "java-web-info"
        start_position => "beginning"
        # Multi-line input: lines that do not start with a timestamp are appended to the previous event
        codec => multiline {
            pattern => "^%{TIMESTAMP_ISO8601} "
            negate => true
            what => previous
        }
    }

    file {
        path => "/www/log/java-web/log_error.log"
        type => "java-web-error"
        start_position => "beginning"
        # Multi-line input: lines that do not start with a timestamp are appended to the previous event
        codec => multiline {
            pattern => "^%{TIMESTAMP_ISO8601} "
            negate => true
            what => previous
        }
    }
}

filter {

    if [type] == "java-web-info" {
        grok {
            match => {
                "message" => "(?<date>\d{4}\-\d{2}\-\d{2}\s\d{2}:\d{2}:\d{2}.\d{3})\s(?<level>[A-Z]{4,5})\s{1,2}(?<pid>\d{1,20})\s\-{1,5}\s\[(?<thread>[A-Za-z0-9./-]{4,40})\]\s(?<method>[A-Za-z0-9.\\(\\:/\-\\)]{0,200}):\s(?<msg>.*)"
            }
        }

        # Drop DEBUG and any other level outside ERROR/WARN/INFO
        if [level] !~ "(ERROR|WARN|INFO)" {
            drop {}
        }
        # Drop events from a specific thread
        if [thread] == "com.alibaba.nacos.naming.push.receiver" {
            drop {}
        }
    }
}

output {

    if [type] == "java-web-info" {
        elasticsearch {
            hosts => ["127.0.0.1:9200"]
            index => "java-web-info-%{+YYYY.MM.dd}"
            # Required when Elasticsearch authentication is enabled
            #user => "elastic"
            #password => "******"
        }
    }
    
    if [type] == "java-web-error" {
        elasticsearch {
            hosts => ["127.0.0.1:9200"]
            index => "java-web-error-%{+YYYY.MM.dd}"
            # Required when Elasticsearch authentication is enabled
            #user => "elastic"
            #password => "******"
        }
    }
}
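
The index setting in the output above produces one index per day. As a rough illustration, the sketch below computes the name that the pattern java-web-info-%{+YYYY.MM.dd} would yield for a given event time. It assumes the date comes from the event's @timestamp rendered in UTC; the function name daily_index and the sample time are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def daily_index(prefix: str, event_time: datetime) -> str:
    """Build the daily index name from an event time, formatted in UTC."""
    utc_time = event_time.astimezone(timezone.utc)
    return f"{prefix}-{utc_time:%Y.%m.%d}"


# An event logged at 07:30 on 16 September in UTC+8 is still 15 September in UTC.
utc_plus_8 = timezone(timedelta(hours=8))
event_time = datetime(2022, 9, 16, 7, 30, tzinfo=utc_plus_8)
print(daily_index("java-web-info", event_time))  # java-web-info-2022.09.15
```

This is worth keeping in mind when looking for early-morning events in Kibana: under that assumption they can land in the previous day's index.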
