Flume: Using the TAILDIR Source to Send Data to Both HDFS and Kafka

Author: 喵星人ZC | Published 2019-06-06 22:04

A script appends 100 log lines to access.log every five minutes:

[hadoop@hadoop000 conf]$ cd /home/hadoop/soul/data/flume/tairdir/
[hadoop@hadoop000 tairdir]$ ll
total 8
-rw-r--r-- 1 hadoop hadoop 7881 Jun  6 21:55 access.log
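The post doesn't show the generator script itself. A minimal sketch of one way to produce that traffic (the `append_batch` function name and the message format are assumptions, not the original script):

```shell
#!/bin/bash
# append_batch: append 100 timestamped log lines to the file given as $1.
# The line format here is illustrative; the original script is not shown.
append_batch() {
  local log_file="$1"
  for i in $(seq 1 100); do
    echo "$(date '+%Y-%m-%d %H:%M:%S') request-$i" >> "$log_file"
  done
}

# To run it every five minutes, a cron entry such as the following would work:
# */5 * * * * /path/to/append_batch.sh /home/hadoop/soul/data/flume/tairdir/access.log
```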

The Flume configuration file (tairdir-hdfs-kafka.conf):

tairdir-hdfs-agent.sources = tairdir-source
tairdir-hdfs-agent.sinks = hdfs-sink kafka-sink
tairdir-hdfs-agent.channels = c1 c2

tairdir-hdfs-agent.sources.tairdir-source.type = TAILDIR
tairdir-hdfs-agent.sources.tairdir-source.filegroups = f1
tairdir-hdfs-agent.sources.tairdir-source.filegroups.f1 = /home/hadoop/soul/data/flume/tairdir/.*
# position file that tracks read offsets, so tailing resumes after a restart
tairdir-hdfs-agent.sources.tairdir-source.positionFile = /home/hadoop/soul/data/flume/taildir_position.json
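The position file is a JSON array with one entry per tailed file; a typical entry looks like the following (the inode and pos values here are illustrative):

```json
[
  {"inode": 522074, "pos": 7881, "file": "/home/hadoop/soul/data/flume/tairdir/access.log"}
]
```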

tairdir-hdfs-agent.channels.c1.type = memory
tairdir-hdfs-agent.channels.c1.capacity = 1000
tairdir-hdfs-agent.channels.c1.transactionCapacity = 100

tairdir-hdfs-agent.channels.c2.type = memory
tairdir-hdfs-agent.channels.c2.capacity = 1000
tairdir-hdfs-agent.channels.c2.transactionCapacity = 100

tairdir-hdfs-agent.sinks.hdfs-sink.type = hdfs
tairdir-hdfs-agent.sinks.hdfs-sink.hdfs.path = hdfs://hadoop000:8020/g6/flume/tairDir/%Y%m%d/%H%M
tairdir-hdfs-agent.sinks.hdfs-sink.hdfs.filePrefix = baidu
tairdir-hdfs-agent.sinks.hdfs-sink.hdfs.rollInterval = 30
tairdir-hdfs-agent.sinks.hdfs-sink.hdfs.rollSize = 20000000
tairdir-hdfs-agent.sinks.hdfs-sink.hdfs.rollCount = 0
# note: the prefix must be hdfs., and gzip only takes effect with fileType = CompressedStream
tairdir-hdfs-agent.sinks.hdfs-sink.hdfs.codeC = gzip
tairdir-hdfs-agent.sinks.hdfs-sink.hdfs.fileType = CompressedStream
tairdir-hdfs-agent.sinks.hdfs-sink.hdfs.writeFormat = Text
tairdir-hdfs-agent.sinks.hdfs-sink.hdfs.useLocalTimeStamp = true

tairdir-hdfs-agent.sinks.kafka-sink.type = org.apache.flume.sink.kafka.KafkaSink
tairdir-hdfs-agent.sinks.kafka-sink.brokerList = localhost:9092
tairdir-hdfs-agent.sinks.kafka-sink.topic = baidu
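`brokerList` and `topic` are legacy property names; in more recent Flume releases (1.7+) the documented Kafka sink properties carry a `kafka.` prefix, so an equivalent modern form would be:

```properties
tairdir-hdfs-agent.sinks.kafka-sink.type = org.apache.flume.sink.kafka.KafkaSink
tairdir-hdfs-agent.sinks.kafka-sink.kafka.bootstrap.servers = localhost:9092
tairdir-hdfs-agent.sinks.kafka-sink.kafka.topic = baidu
```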

tairdir-hdfs-agent.sources.tairdir-source.channels = c1 c2
tairdir-hdfs-agent.sinks.hdfs-sink.channel= c1
tairdir-hdfs-agent.sinks.kafka-sink.channel= c2

Start Flume:

flume-ng agent \
--conf $FLUME_HOME/conf \
--conf-file $FLUME_HOME/conf/tairdir-hdfs-kafka.conf  \
--name tairdir-hdfs-agent \
-Dflume.root.logger=INFO,console

Start a Kafka console consumer:

kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic baidu --from-beginning

Results:
HDFS


[screenshot: files written under /g6/flume/tairDir/ on HDFS]

Kafka


[screenshot: log lines received by the console consumer on topic baidu]

