logstash out file to HDFS


Author: 北极企鹅ys | Published 2018-07-26 16:09


    Logstash can write events directly into HDFS and supports HDFS compression formats. This requires a third-party output plugin, the webhdfs plugin, which writes through HDFS's WebHDFS REST interface:
    http://namenode00:50070/webhdfs/v1/
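
    All plugin traffic goes through that REST endpoint. As a minimal sketch (not the plugin's actual implementation), the request URLs it issues have this shape, with host, port, and user following the examples in this article:

    ```python
    def webhdfs_url(host, port, hdfs_path, op, user):
        """Build a WebHDFS v1 REST URL (e.g. op=LISTSTATUS, op=CREATE)."""
        # hdfs_path must start with "/", e.g. "/" or "/Service-Data"
        return f"http://{host}:{port}/webhdfs/v1{hdfs_path}?user.name={user}&op={op}"

    print(webhdfs_url("namenode00", 50070, "/", "LISTSTATUS", "hadoop"))
    # http://namenode00:50070/webhdfs/v1/?user.name=hadoop&op=LISTSTATUS
    ```

    This is the same URL exercised with curl in the interface check below.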

    Installation

    The matching Logstash release can be found on the official site; we use version 2.3.1. Download:

    https://www.elastic.co/downloads/past-releases  
    

    webhdfs plugin location

    GitHub repository:
      git clone  https://github.com/heqin5136/logstash-output-webhdfs-discontinued.git
    
    Official documentation and usage instructions:
      https://www.elastic.co/guide/en/logstash/current/plugins-outputs-webhdfs.html
    

    Plugin installation:

    # logstash is installed under /home/mtime/logstash-2.3.1
    git clone  https://github.com/heqin5136/logstash-output-webhdfs-discontinued.git
    cd logstash-output-webhdfs-discontinued
    /home/mtime/logstash-2.3.1/bin/plugin install logstash-output-webhdfs-discontinued
    

    Check HDFS's WebHDFS interface

        curl -i  "http://namenode:50070/webhdfs/v1/?user.name=hadoop&op=LISTSTATUS"   
        
        HTTP/1.1 200 OK
        Cache-Control: no-cache
        Expires: Thu, 13 Jul 2017 04:53:39 GMT
        Date: Thu, 13 Jul 2017 04:53:39 GMT
        Pragma: no-cache
        Expires: Thu, 13 Jul 2017 04:53:39 GMT
        Date: Thu, 13 Jul 2017 04:53:39 GMT
        Pragma: no-cache
        Content-Type: application/json
        Set-Cookie: hadoop.auth="u=hadoop&p=hadoop&t=simple&e=1499957619679&s=KSxdSAtjXAllhn73vh1MAurG9Bk="; Path=/; Expires=Thu, 13-Jul-2017 14:53:39 GMT; HttpOnly
        Transfer-Encoding: chunked
        Server: Jetty(6.1.26) 
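
    The response body (omitted above) is JSON in WebHDFS's documented `FileStatuses` shape. A minimal sketch of pulling entry names out of a LISTSTATUS reply, using an invented sample payload:

    ```python
    import json

    # Sample LISTSTATUS payload in the documented WebHDFS shape (values invented).
    sample = '''
    {"FileStatuses": {"FileStatus": [
      {"pathSuffix": "Service-Data", "type": "DIRECTORY", "length": 0},
      {"pathSuffix": "tmp", "type": "DIRECTORY", "length": 0}
    ]}}
    '''

    # Each FileStatus entry names one child of the listed directory.
    entries = json.loads(sample)["FileStatuses"]["FileStatus"]
    names = [e["pathSuffix"] for e in entries]
    print(names)  # ['Service-Data', 'tmp']
    ```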
    

    Note: the active namenode returns 200; the standby namenode returns 403.
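
    The failover behavior behind that note, which the plugin's `standby_host` option relies on, can be sketched as: probe each namenode and use whichever answers 200. A minimal sketch (hostnames and probe results are hypothetical):

    ```python
    def pick_active(probe_results):
        """Return the first namenode whose WebHDFS probe returned HTTP 200.

        probe_results: list of (host, status_code) pairs, e.g. gathered with curl.
        An active namenode answers 200; a standby answers 403.
        """
        for host, status in probe_results:
            if status == 200:
                return host
        raise RuntimeError("no active namenode found")

    # Hypothetical probe results for a two-namenode HA pair:
    print(pick_active([("standbynamenode", 403), ("namenode", 200)]))  # namenode
    ```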

    Configuration

    Add a Logstash configuration file:

    vim /home/mtime/logstash-2.3.1/conf/hdfs.conf

    input {
      kafka {
        zk_connect => "192.168.51.191:2181,192.168.51.192:2181,192.168.51.193:2181"   ## Kafka ZooKeeper connection string
        group_id => 'hdfs'   # consumer group
        topic_id => 'tracks'  # topic name
        consumer_threads => 1  
        codec => 'json'  
      }
    }
    
    filter {   ## fix the 8-hour offset when events are written into HDFS
            date {  
                    match => [ "time" , "yyyy-MM-dd HH:mm:ss" ]
                    locale => "zh"
                    timezone => "-00:00:00"
                    target => "@timestamp"
            }
    }
    
    output {
    #if [app] == "mx.tc.virtualcard.service" {
        webhdfs {
               workers => 2
               host => "namenode"
               standby_host => "standbynamenode"
               port => 50070
               user => "loguser"
               path => "/Service-Data/%{+YYYY}-%{+MM}-%{+dd}/%{app}/logstash-%{+HH}.log"
               flush_size => 100
               idle_flush_time => 10
               compression => "gzip"
               retry_interval => 3
               codec => 'json'   # write the event as JSON; otherwise the file content is just %{message}
           }
    #   }
      stdout { codec => rubydebug }
    }
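
    The reason for the date filter above: the `%{+...}` patterns in the output path are expanded from @timestamp, which Logstash keeps in UTC, so events stamped in China local time (UTC+8) would otherwise land in a file named 8 hours off. A minimal illustration of the offset (the timestamp value is invented):

    ```python
    from datetime import datetime, timedelta, timezone

    # A log line stamped in China local time (UTC+8) -- value invented for illustration.
    local = datetime(2018, 7, 26, 16, 9, 0, tzinfo=timezone(timedelta(hours=8)))

    # Logstash stores @timestamp in UTC, so the same instant reads 8 hours earlier:
    utc = local.astimezone(timezone.utc)
    print(utc.isoformat())  # 2018-07-26T08:09:00+00:00
    ```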
    

    The HDFS-related settings are documented on the official plugins-outputs-webhdfs page.
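
    The `%{+YYYY}`-style tokens in the `path` setting are time patterns expanded from the event's @timestamp, while `%{app}` is read from the event itself. A rough Python equivalent of that expansion (the field and timestamp values are invented for illustration):

    ```python
    from datetime import datetime, timezone

    def render_path(event, ts):
        """Mimic the plugin's interpolation for the path pattern used above."""
        return "/Service-Data/{:%Y-%m-%d}/{app}/logstash-{:%H}.log".format(
            ts, ts, app=event["app"])

    ts = datetime(2018, 7, 26, 8, 9, tzinfo=timezone.utc)
    print(render_path({"app": "mx.tc.virtualcard.service"}, ts))
    # /Service-Data/2018-07-26/mx.tc.virtualcard.service/logstash-08.log
    ```

    One HDFS file per app per hour keeps files append-friendly while avoiding a flood of tiny files.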

    Start Logstash

    cd /home/mtime/logstash-2.3.1/bin/
    ./logstash -f ../conf/hdfs.conf    # runs in the foreground
    

    My GitHub blog post: https://sukbeta.github.io/2018/05/29/logstash-out-file-to-HDFS/


    Original link: https://www.haomeiwen.com/subject/otvbmftx.html