Using Flume's Taildir Source to Collect Data into HDFS


Author: 喵星人ZC | Published: 2019-06-06 21:59

    Architecture:
    Taildir source --> memory channel --> HDFS sink

    A script appends 100 log entries to access.log every five minutes.

    [hadoop@hadoop000 conf]$ cd /home/hadoop/soul/data/flume/tairdir/
    [hadoop@hadoop000 tairdir]$ ll
    total 8
    -rw-r--r-- 1 hadoop hadoop 7881 Jun  6 21:55 access.log
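
The generator script is not shown in this post. A minimal sketch of what it could look like, assuming a made-up tab-separated line format and a function name (`gen_logs`) chosen only for illustration:

```shell
#!/bin/sh
# Hypothetical log generator: appends COUNT synthetic lines to TARGET.
# The line format (timestamp, method, path, status) is an assumption,
# not the format of the author's real access.log.
gen_logs() {
  gl_target="$1"
  gl_count="${2:-100}"   # default to 100 lines per run
  gl_i=1
  while [ "$gl_i" -le "$gl_count" ]; do
    printf '%s\tGET\t/page/%d\t200\n' \
      "$(date '+%Y-%m-%d %H:%M:%S')" "$gl_i" >> "$gl_target"
    gl_i=$((gl_i + 1))
  done
}
```

Run from cron with a `*/5 * * * *` schedule, a call such as `gen_logs /home/hadoop/soul/data/flume/tairdir/access.log 100` would reproduce the load described above.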
    

    Configuration file ($FLUME_HOME/conf/tairdir-hdfs.conf):

    tairdir-hdfs-agent.sources = tairdir-source
    tairdir-hdfs-agent.sinks = hdfs-sink
    tairdir-hdfs-agent.channels = memory-channel
    
    tairdir-hdfs-agent.sources.tairdir-source.type = TAILDIR
    tairdir-hdfs-agent.sources.tairdir-source.filegroups = f1
    tairdir-hdfs-agent.sources.tairdir-source.filegroups.f1 = /home/hadoop/soul/data/flume/tairdir/.*
    # Location of the position file (read offsets for each tailed file)
    tairdir-hdfs-agent.sources.tairdir-source.positionFile = /home/hadoop/soul/data/flume/taildir_position.json
    
    
    tairdir-hdfs-agent.channels.memory-channel.type = memory
    tairdir-hdfs-agent.channels.memory-channel.capacity = 1000
    tairdir-hdfs-agent.channels.memory-channel.transactionCapacity = 100
    
    
    tairdir-hdfs-agent.sinks.hdfs-sink.type = hdfs
    tairdir-hdfs-agent.sinks.hdfs-sink.hdfs.path = hdfs://hadoop000:8020/g6/flume/tairDir/%Y%m%d/%H%M
    tairdir-hdfs-agent.sinks.hdfs-sink.hdfs.filePrefix = baidu
    tairdir-hdfs-agent.sinks.hdfs-sink.hdfs.rollInterval = 30
    tairdir-hdfs-agent.sinks.hdfs-sink.hdfs.rollSize = 20000000
    tairdir-hdfs-agent.sinks.hdfs-sink.hdfs.rollCount = 0
    # hdfs.fileType keeps its default (SequenceFile); set it to CompressedStream for plain gzip files
    tairdir-hdfs-agent.sinks.hdfs-sink.hdfs.codeC = gzip
    tairdir-hdfs-agent.sinks.hdfs-sink.hdfs.writeFormat = Text
    tairdir-hdfs-agent.sinks.hdfs-sink.hdfs.useLocalTimeStamp = true
    
    
    tairdir-hdfs-agent.sources.tairdir-source.channels = memory-channel
    tairdir-hdfs-agent.sinks.hdfs-sink.channel = memory-channel
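
The Taildir source records how far it has read in each file in the position file, as a JSON array of entries with `inode`, `pos` and `file` keys, so a restarted agent can resume from the stored offset. A small sketch for printing those offsets; the function name `show_tail_offsets` is hypothetical, and the parsing assumes the keys appear in the order `pos`, `file` within each entry:

```shell
#!/bin/sh
# Hypothetical helper: print "<offset> <path>" for every entry in a
# Taildir position file. Assumes each entry lists "pos" before "file".
show_tail_offsets() {
  # Put each JSON object on its own line, then pull out pos and file.
  tr '{' '\n' < "$1" |
    sed -n 's/.*"pos"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\)[[:space:]]*,[[:space:]]*"file"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1 \2/p'
}
```

For this setup the call would be `show_tail_offsets /home/hadoop/soul/data/flume/taildir_position.json`. Comparing the printed offset with `wc -c < access.log` gives a rough check of whether the source has caught up with the file.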
    

    Start Flume:

    flume-ng agent \
    --conf $FLUME_HOME/conf \
    --conf-file $FLUME_HOME/conf/tairdir-hdfs.conf \
    --name tairdir-hdfs-agent \
    -Dflume.root.logger=INFO,console
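
With `hdfs.useLocalTimeStamp = true`, the sink fills in the `%Y%m%d/%H%M` escapes in `hdfs.path` from the agent's local clock, so a new directory appears for each minute in which events are written. A sketch of computing the directory for the current minute; `current_sink_dir` is a hypothetical helper, and it assumes the shell and the agent run in the same time zone:

```shell
#!/bin/sh
# Hypothetical helper: resolve the sink's %Y%m%d/%H%M path escapes
# for the current local time, mirroring the hdfs.path setting.
current_sink_dir() {
  date '+/g6/flume/tairDir/%Y%m%d/%H%M'
}

# Usage against a running cluster (commented out, needs the hdfs client):
# hdfs dfs -ls "$(current_sink_dir)"
# hdfs dfs -text "$(current_sink_dir)/baidu.*" | head
```

`hdfs dfs -text` decodes compressed files and SequenceFiles, which makes it convenient here because the sink writes with a gzip codec. Events written just before a minute boundary land in the previous minute's directory, so listing the parent date directory with `hdfs dfs -ls -R` may be the more practical check.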
    

    Result:

    [Two screenshots in the original post; images not recovered]
