Flume Load Balancing Configuration


Author: baker_dai | Published: 2018-07-04 10:54

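    Flume implements load balancing through sink groups: each time events are taken from the channel, a selection algorithm picks one sink in the group to deliver them. When the output volume is large, load balancing is well worth having, because spreading the output across several sinks relieves the pressure on any single one.
    Flume's built-in load-balancing selector defaults to round robin.
    In this setup, files are shipped from a host to HDFS.
    Cluster layout:
    The Flume cluster uses four hosts:
    Flumeapp1    load_balance
    Flumeapp2    sink1
    Flumeapp3    sink2
    Flumeapp4    sink3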

    The load_balance agent (Flumeapp1) is configured as follows, using the default configuration file name conf/flume-conf.properties:

    agent1.sources=source1
    agent1.sinks=sink1 sink2 sink3
    agent1.channels = channel1

    # source

    agent1.sources.source1.type = spooldir
    agent1.sources.source1.spoolDir = /e3base/spooldir

    # Put the original file's base name in a header so the target file can use the same name

    agent1.sources.source1.basenameHeader=true
    agent1.sources.source1.basenameHeaderKey=fileName

    # sink group

    agent1.sinkgroups=group1
    agent1.sinkgroups.group1.sinks=sink1 sink2 sink3
    agent1.sinkgroups.group1.processor.type=load_balance
    agent1.sinkgroups.group1.processor.backoff=true
    agent1.sinkgroups.group1.processor.selector=round_robin

    # sink1

    agent1.sinks.sink1.type=avro
    agent1.sinks.sink1.hostname=134.32.50.13
    agent1.sinks.sink1.port=21000

    # sink2

    agent1.sinks.sink2.type=avro
    agent1.sinks.sink2.hostname=134.32.50.14
    agent1.sinks.sink2.port=21000

    # sink3

    agent1.sinks.sink3.type=avro
    agent1.sinks.sink3.hostname=134.32.152.49
    agent1.sinks.sink3.port=21000

    # channel

    agent1.channels.channel1.type = memory
    agent1.channels.channel1.capacity = 1000
    agent1.channels.channel1.transactionCapacity=100

    # bind

    agent1.sources.source1.channels = channel1
    agent1.sinks.sink1.channel = channel1
    agent1.sinks.sink2.channel = channel1
    agent1.sinks.sink3.channel = channel1
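To make the sink group settings above more concrete, here is a minimal Python sketch of what `processor.type=load_balance` with `selector=round_robin` and `backoff=true` roughly amounts to. It is a toy model, not Flume's actual implementation: the class name `RoundRobinSelector`, its methods, and the backoff values (1 second base, 30 second cap) are all assumptions chosen for illustration.

```python
class RoundRobinSelector:
    """Toy model of round-robin sink selection with exponential backoff.

    Hypothetical illustration only; names and timings are not Flume's.
    """

    def __init__(self, sinks, base_backoff=1.0, max_backoff=30.0):
        self.sinks = list(sinks)
        self.next_index = 0                      # where the next rotation starts
        self.base_backoff = base_backoff         # assumed first penalty, in seconds
        self.max_backoff = max_backoff           # assumed cap on the penalty
        self.failures = {s: 0 for s in self.sinks}
        self.blocked_until = {s: 0.0 for s in self.sinks}

    def candidates(self, now):
        """Return the sinks to try at time `now`, in round-robin order,
        skipping any sink that is still backing off."""
        n = len(self.sinks)
        rotated = [self.sinks[(self.next_index + i) % n] for i in range(n)]
        self.next_index = (self.next_index + 1) % n
        return [s for s in rotated if self.blocked_until[s] <= now]

    def report_failure(self, sink, now):
        """Block a failed sink; the penalty doubles with each consecutive failure."""
        self.failures[sink] += 1
        delay = min(self.base_backoff * 2 ** (self.failures[sink] - 1),
                    self.max_backoff)
        self.blocked_until[sink] = now + delay

    def report_success(self, sink):
        """A successful delivery clears the sink's failure history."""
        self.failures[sink] = 0
        self.blocked_until[sink] = 0.0


selector = RoundRobinSelector(["sink1", "sink2", "sink3"])
print(selector.candidates(now=0.0))   # ['sink1', 'sink2', 'sink3']
print(selector.candidates(now=0.0))   # ['sink2', 'sink3', 'sink1']
selector.report_failure("sink3", now=0.0)
print(selector.candidates(now=0.5))   # ['sink1', 'sink2']
```

The point of the rotation is that each sink takes its turn at the head of the list, while backoff keeps a temporarily unreachable collector from being retried on every pass.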

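    Flumeapp2 through Flumeapp4 share the configuration below, again in the default file conf/flume-conf.properties. Only the Avro source bind address differs from host to host: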

    agent1.sources=source1
    agent1.channels=channel1
    agent1.sinks = sink1

    # source

    agent1.sources.source1.type=avro
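    # Set bind to the local address of the host this agent runs on
    # (the load_balance sinks target 134.32.50.13, 134.32.50.14 and 134.32.152.49)
    agent1.sources.source1.bind=134.32.152.49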
    agent1.sources.source1.port=21000
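    # basenameHeader and basenameHeaderKey are spooldir source options and are not used by an avro source.
    # The fileName header set on Flumeapp1 travels inside each Avro event, so nothing needs to be set here.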

    # channels

    agent1.channels.channel1.type=memory
    agent1.channels.channel1.capacity=1000
    agent1.channels.channel1.transactionCapacity=100

    # sinks

    agent1.sinks.sink1.type=hdfs
    agent1.sinks.sink1.hdfs.path=hdfs://drmcluster/test_bak/flume/
    agent1.sinks.sink1.hdfs.filePrefix=%{fileName}
    agent1.sinks.sink1.hdfs.fileType=DataStream
    agent1.sinks.sink1.hdfs.rollCount=0
    agent1.sinks.sink1.hdfs.rollSize=134217728
    agent1.sinks.sink1.hdfs.rollInterval=60
    agent1.sinks.sink1.hdfs.writeFormat=Text
    agent1.sinks.sink1.hdfs.useLocalTimeStamp=true

    # bind

    agent1.sources.source1.channels=channel1
    agent1.sinks.sink1.channel=channel1
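The HDFS sink's `hdfs.filePrefix=%{fileName}` only produces the original file name if the header key matches the `basenameHeaderKey` set on the spooldir source exactly, including case. The sketch below imitates that substitution in Python. It is an illustration under assumptions, not Flume's code: the helper `expand_headers` is hypothetical, the sample name `access.log` is made up, and the choice to expand a missing header to an empty string is an assumption about the behavior.

```python
import re

# Matches %{headerName} placeholders like the one used in hdfs.filePrefix.
HEADER_ESCAPE = re.compile(r"%\{([^}]+)\}")


def expand_headers(template, headers):
    """Replace each %{key} with headers[key].

    Lookup is case-sensitive; a missing key expands to an empty string
    (assumed behavior for this sketch).
    """
    return HEADER_ESCAPE.sub(lambda m: headers.get(m.group(1), ""), template)


# Header as written by the spooldir source with basenameHeaderKey=fileName.
event_headers = {"fileName": "access.log"}

matching = expand_headers("%{fileName}", event_headers)    # key matches
mismatched = expand_headers("%{filename}", event_headers)  # wrong case, no match
print(repr(matching), repr(mismatched))
```

With the matching key the prefix becomes the source file's base name; with a differently cased key the lookup finds nothing, so the output files lose the name they were meant to carry.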


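    Start flume-ng on all four hosts in the cluster.
    Start command (use absolute paths for the conf directory and the properties file where possible; relative paths can lead to unexpected errors):
    ./flume-ng agent -c /e3base/flume/conf -f /e3base/flume/conf/flume-conf.properties -n agent1 -Dflume.root.logger=DEBUG,console -Dorg.apache.flume.log.printconfig=true -Dorg.apache.flume.log.rawdata=true
    Start order: bring up the agents on the sink hosts (sink1 to sink3) first, then start the load_balance agent. Otherwise the load_balance agent reports that it cannot reach the sink ports.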
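One way to enforce that start order is to check that every collector's Avro port accepts connections before launching the load_balance agent. The following Python sketch does that with plain TCP connects. The function names, retry count and delays are assumptions for illustration; the addresses and port are the ones used in the sink definitions of the load_balance configuration.

```python
import socket
import time

# Avro source endpoints taken from the load_balance agent's sink definitions.
COLLECTORS = [
    ("134.32.50.13", 21000),
    ("134.32.50.14", 21000),
    ("134.32.152.49", 21000),
]


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_collectors(endpoints, retries=30, delay=2.0):
    """Poll until every endpoint accepts connections.

    Returns True once all are reachable, False if some are still down
    after `retries` rounds. Retry count and delay are illustrative.
    """
    pending = list(endpoints)
    for _ in range(retries):
        pending = [ep for ep in pending if not port_open(*ep)]
        if not pending:
            return True
        time.sleep(delay)
    return False
```

Calling `wait_for_collectors(COLLECTORS)` on Flumeapp1 and starting flume-ng only when it returns `True` avoids the connection errors described above. A successful TCP connect only shows that something is listening on the port, not that the agent behind it is healthy.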
