
2021-06-28/2022 Error Summary

Author: 勇于自信 | Published 2021-06-28 14:45
    1. Runtime memory error

    While running the script that loads data from the ODS layer into the DWD layer, the MapReduce job failed with the following output:

    Logging initialized using configuration in jar:file:/usr/local/src/apache-hive-3.1.0-bin/lib/hive-common-3.1.0.jar!/hive-log4j2.properties Async: true
    Hive Session ID = 2928985c-003d-4e65-9c9f-7df704a0e929
    Query ID = root_20210628143131_870136d1-7127-4fdc-a8b5-518215736057
    Total jobs = 3
    Launching Job 1 out of 3
    Number of reduce tasks not specified. Estimated from input data size: 1
    In order to change the average load for a reducer (in bytes):
      set hive.exec.reducers.bytes.per.reducer=<number>
    In order to limit the maximum number of reducers:
      set hive.exec.reducers.max=<number>
    In order to set a constant number of reducers:
      set mapreduce.job.reduces=<number>
    Starting Job = job_1624695615043_0009, Tracking URL = http://master:8088/proxy/application_1624695615043_0009/
    Kill Command = /usr/local/src/hadoop-2.7.3/bin/mapred job  -kill job_1624695615043_0009
    Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
    2021-06-28 14:31:52,049 Stage-1 map = 0%,  reduce = 0%
    2021-06-28 14:32:16,979 Stage-1 map = 100%,  reduce = 100%
    Ended Job = job_1624695615043_0009 with errors
    Error during job, obtaining debugging information...
    Examining task ID: task_1624695615043_0009_m_000000 (and more) from job job_1624695615043_0009
    
    Task with the most failures(4):
    -----
    Task ID:
      task_1624695615043_0009_m_000000
    
    URL:
      http://master:8088/taskdetails.jsp?jobid=job_1624695615043_0009&tipid=task_1624695615043_0009_m_000000
    -----
    Diagnostic Messages for this Task:
    Container [pid=20836,containerID=container_1624695615043_0009_01_000005] is running beyond virtual memory limits. Current usage: 372.0 MB of 1 GB physical memory used; 6.9 GB of 2.1 GB virtual memory used. Killing container.
    Dump of the process-tree for container_1624695615043_0009_01_000005 :
            |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
            |- 21057 20836 20836 20836 (java) 765 25 7280078848 94845 /usr/local/src/jdk1.8.0_144/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx5006m -Djava.io.tmpdir=/usr/local/src/hadoop-2.7.3/tmp/nm-local-dir/usercache/root/appcache/application_1624695615043_0009/container_1624695615043_0009_01_000005/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/usr/local/src/hadoop-2.7.3/logs/userlogs/application_1624695615043_0009/container_1624695615043_0009_01_000005 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog org.apache.hadoop.mapred.YarnChild 172.16.157.182 41533 attempt_1624695615043_0009_m_000000_3 5
            |- 20836 20834 20836 20836 (bash) 4 5 116170752 375 /bin/bash -c /usr/local/src/jdk1.8.0_144/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN  -Xmx5006m -Djava.io.tmpdir=/usr/local/src/hadoop-2.7.3/tmp/nm-local-dir/usercache/root/appcache/application_1624695615043_0009/container_1624695615043_0009_01_000005/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/usr/local/src/hadoop-2.7.3/logs/userlogs/application_1624695615043_0009/container_1624695615043_0009_01_000005 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog org.apache.hadoop.mapred.YarnChild 172.16.157.182 41533 attempt_1624695615043_0009_m_000000_3 5 1>/usr/local/src/hadoop-2.7.3/logs/userlogs/application_1624695615043_0009/container_1624695615043_0009_01_000005/stdout 2>/usr/local/src/hadoop-2.7.3/logs/userlogs/application_1624695615043_0009/container_1624695615043_0009_01_000005/stderr
    
    Container killed on request. Exit code is 143
    Container exited with a non-zero exit code 143
    
    
    FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
    MapReduce Jobs Launched:
    Stage-Stage-1: Map: 1  Reduce: 1   HDFS Read: 0 HDFS Write: 0 FAIL
    
    Total MapReduce CPU Time Spent: 0 msec 
    

    Cause:
    The container running the MapReduce task tried to use more memory than it was allowed (here, 6.9 GB of virtual memory against a 2.1 GB limit), so the NodeManager killed it.
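
    To dig into a failure like this, the full container logs can be pulled from YARN. This is an illustrative step, not part of the original post; the application ID is taken from the log above, and it assumes log aggregation is enabled:

    # Fetch the aggregated logs for the failed application.
    yarn logs -applicationId application_1624695615043_0009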

    Solution:

    In mapred-site.xml, set the map and reduce task memory as follows (adjust the values to your machine's memory and workload). A related yarn-site.xml tweak for the virtual-memory check is sketched after this block.

    <property>
      <name>mapreduce.map.memory.mb</name>
      <value>1536</value>
    </property>
    <property>
      <name>mapreduce.map.java.opts</name>
      <value>-Xmx1024M</value>
    </property>
    <property>
      <name>mapreduce.reduce.memory.mb</name>
      <value>3072</value>
    </property>
    <property>
      <name>mapreduce.reduce.java.opts</name>
      <value>-Xmx2560M</value>
    </property>
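
    The log shows the container being killed for exceeding its virtual memory limit (the 2.1 GB cap is 1 GB of physical memory times the default yarn.nodemanager.vmem-pmem-ratio of 2.1). If raising task memory alone is not enough, the check itself can be relaxed in yarn-site.xml. This is a minimal sketch with example values, not taken from the original post:

    <!-- Sketch: relax YARN's virtual-memory enforcement (example values). -->
    <property>
      <!-- Allow more virtual memory per MB of physical memory (default is 2.1). -->
      <name>yarn.nodemanager.vmem-pmem-ratio</name>
      <value>4</value>
    </property>
    <property>
      <!-- Or disable the virtual-memory check entirely. -->
      <name>yarn.nodemanager.vmem-check-enabled</name>
      <value>false</value>
    </property>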
    
    2. Memory problem when running a Hadoop MapReduce jar

    The error is as follows (screenshot not reproduced):

    Cause:
    A container running on a worker node tried to use too much memory and was killed by the NodeManager.
    Solution:
    Add the following configuration to mapred-site.xml:

    <property>
      <name>mapreduce.map.memory.mb</name>
      <value>1536</value>
    </property>
    <property>
      <name>mapreduce.map.java.opts</name>
      <value>-Xmx1024M</value>
    </property>
    <property>
      <name>mapreduce.reduce.memory.mb</name>
      <value>3072</value>
    </property>
    <property>
      <name>mapreduce.reduce.java.opts</name>
      <value>-Xmx2560M</value>
    </property>
    

    Then restart YARN; a restart sketch follows below.
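
    A minimal sketch of restarting YARN with the standard scripts, assuming the Hadoop 2.7.3 layout used earlier in this post (adjust paths to your installation):

    # Run on the ResourceManager node.
    /usr/local/src/hadoop-2.7.3/sbin/stop-yarn.sh
    /usr/local/src/hadoop-2.7.3/sbin/start-yarn.sh
    # Confirm the NodeManagers have re-registered.
    yarn node -list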

    3. Insert statement fails after installing Hive 3.1.2

    The error message was recorded only as a screenshot and is not reproduced here.


    4. Error when running the Hadoop MapReduce example

    The error information is as follows (screenshot not reproduced):

    The log stops at "mapreduce.Job: Running job: job_1649499142410_0001" and the job stays stuck there for a long time.
    Cause:
    YARN's default memory requests are larger than the memory actually available to the virtual machine, so the job cannot get the resources it asks for. Reduce the ApplicationMaster's memory request by adding the following to yarn-site.xml (a companion heap setting is sketched after this block):

    <property>
      <name>yarn.app.mapreduce.am.resource.mb</name>
      <value>256</value>
    </property>
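
    When the ApplicationMaster container is shrunk this far, it can also help to shrink its JVM heap to fit. This companion setting is a sketch with an example value, not from the original post:

    <property>
      <!-- Keep the AM heap comfortably below the 256 MB container size above. -->
      <name>yarn.app.mapreduce.am.command-opts</name>
      <value>-Xmx200m</value>
    </property>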
    

    Running the MapReduce example again, it completed successfully.
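
    The original post does not show the exact command it reran; as an illustration, a typical run of the bundled pi example looks like the following (the jar path and version are assumptions based on the Hadoop 3.x layout mentioned later in this post):

    # Hypothetical example run; adjust the jar path/version to your installation.
    hadoop jar /usr/local/src/hadoop-3.1.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.3.jar pi 2 10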



    5. Sqoop import from MySQL to HDFS hangs

    Import command:

    sqoop import \
      --connect jdbc:mysql://bigdata101:3306/company \
      --username root \
      --password 123456 \
      --table staff \
      --target-dir /user/company \
      --delete-target-dir \
      --num-mappers 1 \
      --fields-terminated-by "\t"
    

    Error message (screenshot not reproduced):

    The command hung at this point for a long time and eventually failed.
    Solution:
    Add the following settings in Sqoop's conf directory:

    [root@bigdata101 bin]# cd /usr/local/src/sqoop-1.4.7.bin__hadoop-2.6.0/conf/
    [root@bigdata101 conf]# vim sqoop-env.sh
    Add:
    export ZOOKEEPER_HOME=/usr/local/src/zookeeper-3.4.10
    export ZOOCFGDIR=/usr/local/src/zookeeper-3.4.10
    

    The full configuration is:

    export HADOOP_COMMON_HOME=/usr/local/src/hadoop-3.1.3
    export HADOOP_MAPRED_HOME=/usr/local/src/hadoop-3.1.3
    export HIVE_HOME=/usr/local/src/apache-hive-3.1.2-bin
    export ZOOKEEPER_HOME=/usr/local/src/zookeeper-3.4.10
    export ZOOCFGDIR=/usr/local/src/zookeeper-3.4.10
    

    After adding the ZooKeeper settings, rerunning the import completed quickly.
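
    As a quick check that the data landed (this step is not in the original post; the target directory comes from the import command above):

    # List and inspect the imported files; with one mapper the file is typically part-m-00000.
    hdfs dfs -ls /user/company
    hdfs dfs -cat /user/company/part-m-00000 | head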


    6. DataX deployment pitfalls

    1) Download the DataX package and upload it to /opt/software on hadoop102
    Download URL: http://datax-opensource.oss-cn-hangzhou.aliyuncs.com/datax.tar.gz
    2) Extract datax.tar.gz to /opt/module

    [root@bigdata101 datax]# tar -zxvf datax.tar.gz -C /usr/local/src/                                                                                                                                                 
    

    3) Self-check: run the following command

    [root@bigdata101 datax]# python bin/datax.py job/job.json 
    

    The self-check fails with an error (screenshot not reproduced).



    Solution:

    1. Delete all ._xxxx hidden files under datax/plugin/reader.
       Note: match the files with the "._*er" pattern; a broader pattern would also match the hidden jar files inside the plugin directories.
    [root@bigdata101 datax]# find /opt/module/datax/plugin/reader/ -type f -name "._*er" | xargs rm -rf
    

    Similarly, delete all ._xxxx hidden files under datax/plugin/writer/:

    [root@bigdata101 datax]# find /opt/module/datax/plugin/writer/ -type f -name "._*er" | xargs rm -rf
    

    Running the self-check again succeeds.
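
    As an optional check (not part of the original post), you can confirm that no hidden ._ files remain under the plugin directories:

    # Should print nothing once the hidden ._ files are gone.
    find /opt/module/datax/plugin -type f -name "._*"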
