Flume Problems and Solutions

Author: 米卡啦 | Published 2018-09-13 14:23

1. Installation: JAVA_HOME must be set

Specify JAVA_HOME in conf/flume-env.sh.
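For example, add a line like the following to conf/flume-env.sh (the JDK path matches the one visible in the log below; substitute the path of your own installation):

```shell
# conf/flume-env.sh
# Point JAVA_HOME at the local JDK; this path is just an example.
export JAVA_HOME=/usr/local/java/jdk1.8.0_131
```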

2. With a file channel configured, the agent throws an OOM error at runtime:

    [hdfs]$ bin/flume-ng agent --conf conf/ --conf-file conf/kafka-hdfs-pv.conf --name tier1
    Info: Sourcing environment configuration script /data2/apache-flume-1.8.0-bin/conf/flume-env.sh
    Info: Including Hive libraries found via () for Hive access
    + exec /usr/local/java/jdk1.8.0_131/bin/java -Xmx20m -cp '/data2/apache-flume-1.8.0-bin/conf:/data2/apache-flume-1.8.0-bin/lib/*:/lib/*' -Djava.library.path= org.apache.flume.node.Application --conf-file conf/kafka-hdfs-pv.conf --name tier1
    Exception in thread "PollableSourceRunner-KafkaSource-source1" java.lang.OutOfMemoryError: GC overhead limit exceeded
            at org.apache.kafka.common.record.Record.computeChecksum(Record.java:166)
            at org.apache.kafka.common.record.Record.computeChecksum(Record.java:204)
            at org.apache.kafka.common.record.Record.isValid(Record.java:218)
            at org.apache.kafka.common.record.Record.ensureValid(Record.java:225)
            at org.apache.kafka.clients.consumer.internals.Fetcher.parseRecord(Fetcher.java:617)
            at org.apache.kafka.clients.consumer.internals.Fetcher.handleFetchResponse(Fetcher.java:566)
            at org.apache.kafka.clients.consumer.internals.Fetcher.access$000(Fetcher.java:69)
            at org.apache.kafka.clients.consumer.internals.Fetcher$1.onSuccess(Fetcher.java:139)
            at org.apache.kafka.clients.consumer.internals.Fetcher$1.onSuccess(Fetcher.java:136)
            at org.apache.kafka.clients.consumer.internals.RequestFuture.fireSuccess(RequestFuture.java:133)
            at org.apache.kafka.clients.consumer.internals.RequestFuture.complete(RequestFuture.java:107)
            at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler.onComplete(ConsumerNetworkClient.java:380)
            at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:274)
            at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.clientPoll(ConsumerNetworkClient.java:320)
            at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:213)
            at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.quickPoll(ConsumerNetworkClient.java:202)
            at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:864)
            at org.apache.flume.source.kafka.KafkaSource.doProcess(KafkaSource.java:202)
            at org.apache.flume.source.AbstractPollableSource.process(AbstractPollableSource.java:60)
            at org.apache.flume.source.PollableSourceRunner$PollingRunner.run(PollableSourceRunner.java:133)
            at java.lang.Thread.run(Thread.java:748)
    ^CAttempting to shutdown background worker.

Solution:
The Flume agent's default JVM heap is too small, only 20 MB (note the -Xmx20m in the exec line above). Raise the allocation in conf/flume-env.sh:

vim conf/flume-env.sh

export JAVA_OPTS="-Xms50m -Xmx50m -Dcom.sun.management.jmxremote"

After raising the heap to 50 MB, the error no longer occurs.
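To confirm the override took effect, restart the agent with the same invocation as in the log above; flume-ng echoes the final java command line before exec'ing it, which should now contain -Xms50m -Xmx50m instead of the old -Xmx20m:

```shell
bin/flume-ng agent --conf conf/ --conf-file conf/kafka-hdfs-pv.conf --name tier1
```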

3. With Flume installed on a node outside the Hadoop cluster, collecting data from Kafka to HDFS, the write to HDFS fails with:

    java.lang.NoClassDefFoundError: org/apache/hadoop/io/SequenceFile$CompressionType
        at org.apache.flume.sink.hdfs.HDFSEventSink.configure(HDFSEventSink.java:235)
        at org.apache.flume.conf.Configurables.configure(Configurables.java:41)
        at org.apache.flume.node.AbstractConfigurationProvider.loadSinks(AbstractConfigurationProvider.java:411)
        at org.apache.flume.node.AbstractConfigurationProvider.getConfiguration(AbstractConfigurationProvider.java:102)
        at org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:141)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:748)
    Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.io.SequenceFile$CompressionType
            at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
            at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
            at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
            at java.lang.ClassLoader.loadClass(ClassLoader.java:357)

Cause:
Hadoop-related jars are missing from Flume's classpath. Copy them from the Hadoop cluster into flume/lib/ and the error goes away.
The missing jars are:

commons-configuration-1.6.jar
hadoop-auth-2.6.0-cdh5.15.0.jar
hadoop-common-2.6.0-cdh5.15.0.jar
hadoop-hdfs-2.6.0.jar
htrace-core-3.2.0-incubating.jar
htrace-core4-4.0.1-incubating.jar
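A sketch of copying the jars over. The HADOOP_HOME and FLUME_HOME locations here are assumptions to be replaced with your real paths; the sketch searches the whole Hadoop tree because CDH distributes jars across several subdirectories:

```shell
# Copy the jars the HDFS sink needs from a Hadoop installation into
# Flume's lib directory. Both paths are arguments supplied by the caller.
copy_hdfs_sink_jars() {
    local hadoop_home=$1 flume_home=$2
    local jar
    for jar in commons-configuration-1.6.jar \
               hadoop-auth-2.6.0-cdh5.15.0.jar \
               hadoop-common-2.6.0-cdh5.15.0.jar \
               hadoop-hdfs-2.6.0.jar \
               htrace-core-3.2.0-incubating.jar \
               htrace-core4-4.0.1-incubating.jar; do
        # Search the Hadoop tree for each jar and drop it into flume/lib/.
        find "$hadoop_home" -name "$jar" -exec cp {} "$flume_home/lib/" \;
    done
}

# Example (hypothetical Hadoop path; Flume path taken from the log above):
# copy_hdfs_sink_jars /opt/hadoop /data2/apache-flume-1.8.0-bin
```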



Original (Chinese): https://www.haomeiwen.com/subject/qnuwgftx.html