Analysis of the NodeManager container launch scripts

Author: JX907 | Published: 2019-04-19 11:44

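Before it launches a container, the ContainerLaunch class generates three files in a temporary directory: default_container_executor.sh, default_container_executor_session.sh, and launch_container.sh. The rest of this post follows the launch of one container to show how its process is started.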

The first script to run is:

    /tmp/hadoop-hadoop/nm-local-dir/usercache/hadoop/appcache/application_1555117719646_0008/container_1555117719646_0008_01_000001/default_container_executor.sh

Contents of default_container_executor.sh:

    /bin/bash "/tmp/hadoop-hadoop/nm-local-dir/usercache/hadoop/appcache/application_1555117719646_0008/container_1555117719646_0008_01_000001/default_container_executor_session.sh"
    rc=$?
    echo $rc > "/tmp/hadoop-hadoop/nm-local-dir/nmPrivate/application_1555117719646_0008/container_1555117719646_0008_01_000001/container_1555117719646_0008_01_000001.pid.exitcode.tmp"
    /bin/mv -f "/tmp/hadoop-hadoop/nm-local-dir/nmPrivate/application_1555117719646_0008/container_1555117719646_0008_01_000001/container_1555117719646_0008_01_000001.pid.exitcode.tmp" "/tmp/hadoop-hadoop/nm-local-dir/nmPrivate/application_1555117719646_0008/container_1555117719646_0008_01_000001/container_1555117719646_0008_01_000001.pid.exitcode"
    exit $rc
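
The wrapper above writes the exit code to a .tmp file first and only then renames it to its final name. Presumably this is so that anything watching for the .pid.exitcode file never reads a half-written one, since a rename within one filesystem is atomic. Below is a minimal sketch of that pattern. The helper name `write_atomically`, the temporary directory, and the stand-in command `exit 3` are illustrative assumptions, not part of Hadoop.

```shell
#!/bin/bash
# Sketch of the write-to-temp-then-rename pattern; names are illustrative.
work=$(mktemp -d)

# write_atomically FILE VALUE  (hypothetical helper, not a Hadoop function)
write_atomically() {
  local target=$1 value=$2
  echo "$value" > "$target.tmp"   # readers only look for the final name
  mv -f "$target.tmp" "$target"   # rename is atomic within one filesystem
}

# Stand-in for the session script: a child shell that exits with status 3.
/bin/bash -c 'exit 3'
rc=$?
write_atomically "$work/container.pid.exitcode" "$rc"

recorded=$(cat "$work/container.pid.exitcode")
echo "recorded exit code: $recorded"
rm -rf "$work"
```

The same two-step write is used for the .pid file in default_container_executor_session.sh.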
    

Contents of default_container_executor_session.sh:

    echo $$ > /tmp/hadoop-hadoop/nm-local-dir/nmPrivate/application_1555117719646_0008/container_1555117719646_0008_01_000001/container_1555117719646_0008_01_000001.pid.tmp
    
    /bin/mv -f /tmp/hadoop-hadoop/nm-local-dir/nmPrivate/application_1555117719646_0008/container_1555117719646_0008_01_000001/container_1555117719646_0008_01_000001.pid.tmp /tmp/hadoop-hadoop/nm-local-dir/nmPrivate/application_1555117719646_0008/container_1555117719646_0008_01_000001/container_1555117719646_0008_01_000001.pid
    
    exec setsid /bin/bash "/tmp/hadoop-hadoop/nm-local-dir/usercache/hadoop/appcache/application_1555117719646_0008/container_1555117719646_0008_01_000001/launch_container.sh"
    

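default_container_executor_session.sh first writes the shell's own pid ($$) to the .pid.tmp file, then renames the file to drop the .tmp suffix, and finally calls launch_container.sh to start the container process.

Note that launch_container.sh is started with exec setsid. exec replaces the default_container_executor_session.sh process instead of forking a child, and setsid moves it into a new session, so the pid recorded in the .pid file becomes the leader of that session. launch_container.sh in turn puts exec in front of the container command (see the listing below), so the container process keeps that same pid.

The purpose of this arrangement is to make killing a container reliable. The session leader is also the leader of a new process group, so a kill -15 or kill -9 addressed to that group (the negated pid) reaches the container process and every child process it has spawned.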
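
The pid handling described above can be tried outside YARN. The following is a sketch under stated assumptions: session.sh and launch.sh are simplified stand-ins for the generated scripts, `sleep 30` plays the role of the container JVM, and everything lives in a throwaway temporary directory. It assumes a Linux system where `setsid` is installed.

```shell
#!/bin/bash
# Sketch only: session.sh, launch.sh and the sleep workload are stand-ins,
# not the scripts Hadoop generates.
work=$(mktemp -d)

# Stand-in for launch_container.sh: record the pid, then exec the workload.
cat > "$work/launch.sh" <<EOF
echo \$\$ > "$work/launch.pid"
exec sleep 30
EOF

# Stand-in for default_container_executor_session.sh.
cat > "$work/session.sh" <<EOF
echo \$\$ > "$work/session.pid.tmp"
mv -f "$work/session.pid.tmp" "$work/session.pid"
exec setsid /bin/bash "$work/launch.sh"
EOF

# Stand-in for default_container_executor.sh: run the session script in the
# foreground and record its exit code. Backgrounded here so we can signal it.
( /bin/bash "$work/session.sh"; echo $? > "$work/exitcode" ) &
wrapper=$!

# Wait (up to about 5 s) for both pid files to appear.
for _ in $(seq 1 50); do
  [ -s "$work/session.pid" ] && [ -s "$work/launch.pid" ] && break
  sleep 0.1
done

session_pid=$(cat "$work/session.pid")
launch_pid=$(cat "$work/launch.pid")
echo "pid written by session.sh: $session_pid"
echo "pid seen by launch.sh:     $launch_pid"

# Signal the process group led by the recorded pid, then collect the result.
kill -TERM -- "-$session_pid"
wait "$wrapper"
exit_code=$(cat "$work/exitcode")
echo "exit code recorded by the wrapper: $exit_code"
rm -rf "$work"
```

The two pids should be identical, because neither exec nor setsid creates a new process here. The recorded exit code should be 143, which is 128 plus the signal number 15 (SIGTERM).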

    [hadoop@node1 testshell]$ cat /tmp/hadoop-hadoop/nm-local-dir/usercache/hadoop/appcache/application_1555117719646_0011/container_1555117719646_0011_01_000001/launch_container.sh

    
    export HADOOP_CONF_DIR="/home/hadoop/hadoop-2.6.5/etc/hadoop"
    export MAX_APP_ATTEMPTS="2"
    export JAVA_HOME="/usr/local/jdk1.8.0_121"
    export LEVER_APPLICATION_ID="application_1555117719646_0011"
    export LEVER_APPLICATION_QUEUE="default"
    export APP_SUBMIT_TIME_ENV="1555164926233"
    export NM_HOST="node1"
    export HADOOP_HDFS_HOME="/home/hadoop/hadoop-2.6.5"
    export LOGNAME="hadoop"
    export JVM_PID="$$"
    export PWD="/tmp/hadoop-hadoop/nm-local-dir/usercache/hadoop/appcache/application_1555117719646_0011/container_1555117719646_0011_01_000001"
    export HADOOP_COMMON_HOME="/home/hadoop/hadoop-2.6.5"
    export LOCAL_DIRS="/tmp/hadoop-hadoop/nm-local-dir/usercache/hadoop/appcache/application_1555117719646_0011"
    export APPLICATION_WEB_PROXY_BASE="/proxy/application_1555117719646_0011"
    export NM_HTTP_PORT="8042"
    export LOG_DIRS="/home/hadoop/hadoop-2.6.5/logs/userlogs/application_1555117719646_0011/container_1555117719646_0011_01_000001"
    export NM_AUX_SERVICE_mapreduce_shuffle="AAA0+gAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=
    "
    export NM_PORT="45301"
    export USER="hadoop"
    export HADOOP_YARN_HOME="/home/hadoop/hadoop-2.6.5"
    export CLASSPATH="$CLASSPATH:./*:$HADOOP_CONF_DIR:$HADOOP_COMMON_HOME/share/hadoop/common/*:$HADOOP_COMMON_HOME/share/hadoop/common/lib/*:$HADOOP_HDFS_HOME/share/hadoop/hdfs/*:$HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/*:$HADOOP_YARN_HOME/share/hadoop/yarn/*:$HADOOP_YARN_HOME/share/hadoop/yarn/lib/*"
    export HADOOP_TOKEN_FILE_LOCATION="/tmp/hadoop-hadoop/nm-local-dir/usercache/hadoop/appcache/application_1555117719646_0011/container_1555117719646_0011_01_000001/container_tokens"
    export NM_AUX_SERVICE_spark_shuffle=""
    export HOME="/home/"
    export CONTAINER_ID="container_1555117719646_0011_01_000001"
    export MALLOC_ARENA_MAX="4"
    ln -sf "/tmp/hadoop-hadoop/nm-local-dir/usercache/hadoop/appcache/application_1555117719646_0011/filecache/10/action.conf.4master" "action.conf.4master"
    hadoop_shell_errorcode=$?
    if [ $hadoop_shell_errorcode -ne 0 ]
    then
      exit $hadoop_shell_errorcode
    fi
    ln -sf "/tmp/hadoop-hadoop/nm-local-dir/usercache/hadoop/appcache/application_1555117719646_0011/filecache/11/lever-master-0.1.0-jar-with-dependencies.jar" "lever-master-0.1.0-jar-with-dependencies.jar"
    hadoop_shell_errorcode=$?
    if [ $hadoop_shell_errorcode -ne 0 ]
    then
      exit $hadoop_shell_errorcode
    fi
    exec /bin/bash -c "$JAVA_HOME/bin/java -Xmx512m com.lucky.lever.master.LeverMaster --container_memory 128 --container_vcores 1 1>/home/hadoop/hadoop-2.6.5/logs/userlogs/application_1555117719646_0011/container_1555117719646_0011_01_000001/application_master.stdout 2>/home/hadoop/hadoop-2.6.5/logs/userlogs/application_1555117719646_0011/container_1555117719646_0011_01_000001/application_master.stderr "
    hadoop_shell_errorcode=$?
    if [ $hadoop_shell_errorcode -ne 0 ]
    then
      exit $hadoop_shell_errorcode
    fi
    
