NodeManager crashing repeatedly because new threads cannot be created

Author: invincine | Posted: 2018-08-21 16:22

    While a Hadoop cluster was running a MapReduce job, the NodeManager on one node kept crashing. The log recorded the following error:

    2018-08-21 14:31:05,210 FATAL org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread
    java.lang.OutOfMemoryError: unable to create new native thread
        at java.lang.Thread.start0(Native Method)
        at java.lang.Thread.start(Thread.java:714)
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:521)
        at org.apache.hadoop.util.Shell.run(Shell.java:455)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
        at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.containerIsAlive(DefaultContainerExecutor.java:430)
        at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.signalContainer(DefaultContainerExecutor.java:401)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.cleanupContainer(ContainerLaunch.java:419)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:139)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:55)
        at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
        at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
        at java.lang.Thread.run(Thread.java:745)
    2018-08-21 14:31:05,214 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Exiting, bbye.
    

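    The message says the JVM could not create a new native thread, which happens when the OS thread limit is reached or when no memory is left for the new thread.
    Since this node was a newly added server, the first suspect was the system limit on the number of threads a user may create.
    Checking with ulimit -u showed that it was indeed the default, 1024.
    Raise the limit: ulimit -u 102400 (this applies only to the current shell and the processes started from it)
    Start the NodeManager.
    After a while, however, the NodeManager crashed again with the same error: unable to create new native thread.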
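
The limit check above can also be scripted. The following is a minimal sketch, not part of the original troubleshooting: it assumes a Linux host with /proc mounted, and the helper names (read_int, count_user_threads, thread_limits) are made up for illustration. It reports the per-user process/thread limit that ulimit -u shows, two kernel-wide ceilings, and how many threads the current user already has. Run it as the user that starts the NodeManager.

```python
import os
import resource


def read_int(path):
    """Return the integer stored in a /proc file, or None if it cannot be read."""
    try:
        with open(path) as f:
            return int(f.read().split()[0])
    except (OSError, ValueError, IndexError):
        return None


def count_user_threads(uid):
    """Sum the 'Threads:' field of every /proc/<pid>/status owned by uid."""
    total = 0
    try:
        entries = os.listdir("/proc")
    except OSError:
        return 0  # no /proc on this system
    for entry in entries:
        if not entry.isdigit():
            continue  # not a process directory
        try:
            if os.stat(f"/proc/{entry}").st_uid != uid:
                continue
            with open(f"/proc/{entry}/status") as f:
                for line in f:
                    if line.startswith("Threads:"):
                        total += int(line.split()[1])
                        break
        except (OSError, ValueError, IndexError):
            continue  # process exited or is not readable
    return total


def thread_limits():
    """Collect the limits that can cap thread creation for the current user."""
    # RLIMIT_NPROC is the value reported by `ulimit -u`;
    # resource.RLIM_INFINITY means "unlimited".
    soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
    return {
        "nproc_soft": soft,
        "nproc_hard": hard,
        "threads_max": read_int("/proc/sys/kernel/threads-max"),
        "pid_max": read_int("/proc/sys/kernel/pid_max"),
        "user_threads_now": count_user_threads(os.getuid()),
    }


if __name__ == "__main__":
    for name, value in thread_limits().items():
        print(f"{name}: {value}")
```

If user_threads_now is far below nproc_soft when the error occurs, as it was here after raising the limit, the thread limit is probably not the cause and memory is the next thing to check.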

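    An article finally solved the problem:
    Article: https://blog.csdn.net/hw446/article/details/47908571
    MapReduce had been allocated too much memory, leaving none for the JVM to create threads.
    The fix is to lower the relevant parameters in the mapred-site.xml configuration file.
    Before the change: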

    mapred-site.xml
        <property>
            <name>mapreduce.map.memory.mb</name>
            <value>4096</value>
        </property>
    
        <property>
            <name>mapreduce.map.java.opts</name>
            <value>-Xmn1200m -Xms3600m  -Xmx3600m -XX:MaxPermSize=100m -XX:PermSize=100m -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+UseCMSCompactAtFullCollection -XX:+DisableExplicitGC -Dfile.encoding=UTF-8</value>
        </property>
    
        <property>
            <name>mapreduce.reduce.memory.mb</name>
            <value>8192</value>
        </property>
    
        <property>
            <name>mapreduce.reduce.java.opts</name>
            <value>-Xmn2000m -Xms7200m  -Xmx7200m -XX:MaxPermSize=200m -XX:PermSize=200m -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+UseCMSCompactAtFullCollection -XX:+DisableExplicitGC -Dfile.encoding=UTF-8</value>
        </property>
    

    After the change:

        <property>
            <name>mapreduce.map.java.opts</name>
            <value>-Xmn1200m -Xms1600m  -Xmx1600m -XX:MaxPermSize=100m -XX:PermSize=100m -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+UseCMSCompactAtFullCollection -XX:+DisableExplicitGC -Dfile.encoding=UTF-8</value>
        </property>
    
        <property>
            <name>mapreduce.reduce.memory.mb</name>
            <value>4096</value>
        </property>
    
        <property>
            <name>mapreduce.reduce.java.opts</name>
            <value>-Xmn2000m -Xms3072m  -Xmx3072m -XX:MaxPermSize=200m -XX:PermSize=200m -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+UseCMSCompactAtFullCollection -XX:+DisableExplicitGC -Dfile.encoding=UTF-8</value>
        </property>
    

    Finally, restart the NodeManager.
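
One way to read the change is to compare each task's JVM heap (-Xmx) with the size of its YARN container. Whatever is left in the container has to cover everything outside the heap, including thread stacks. The calculation below is an illustrative sketch rather than something from the original post. It uses only the values shown in the two configurations and assumes mapreduce.map.memory.mb stays at 4096, since the modified configuration does not list it. The function name summarize is hypothetical.

```python
# (container size in MB, -Xmx in MB) taken from the configurations above.
SETTINGS = {
    "map before": (4096, 3600),
    "map after": (4096, 1600),  # assumption: mapreduce.map.memory.mb is unchanged
    "reduce before": (8192, 7200),
    "reduce after": (4096, 3072),
}


def summarize(settings):
    """Return the heap share and non-heap headroom for each setting."""
    rows = {}
    for name, (container_mb, xmx_mb) in settings.items():
        rows[name] = {
            # Fraction of the container occupied by the Java heap.
            "heap_share": round(xmx_mb / container_mb, 2),
            # Memory left in the container for everything outside the heap.
            "non_heap_mb": container_mb - xmx_mb,
        }
    return rows


if __name__ == "__main__":
    for name, row in summarize(SETTINGS).items():
        print(f"{name}: heap share {row['heap_share']}, "
              f"non-heap headroom {row['non_heap_mb']} MB")
```

With the original settings the heap takes about 88% of each container. After the change it takes about 39% for map tasks and 75% for reduce tasks, and because -Xms equals -Xmx, each task JVM also requests a smaller initial heap.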
