Ambari HDP: Running Multiple Spark Versions Side by Side

Author: Respect123 | Published 2019-04-02 10:02

    This walkthrough is based on Ambari HDP 2.6.0.3 (which ships Spark 2.1.0) and adds Spark 2.2.2 alongside it; the same approach applies to other versions (Spark 2.3 has also been tested and works).

    1. Download spark-2.2.2-bin-hadoop2.7.tgz and extract it to any directory.

    2. Modify the configuration

      cd spark-2.2.2-bin-hadoop2.7/
      1. Copy Ambari's hive-site.xml into conf
          cp /usr/hdp/2.6.0.3-8/hive/conf/hive-site.xml conf/
              
      2. Copy Ambari's spark-defaults.conf into conf
          cp /usr/hdp/2.6.0.3-8/spark2/conf/spark-defaults.conf conf/
          and modify the following entries (a sketch of the resulting file follows the note at the end of this step):
          spark.history.fs.logDirectory hdfs:///spark2.2.2-history/
          spark.yarn.historyServer.address [ip/domain name]:18081
          spark.driver.extraJavaOptions -Dhdp.version=2.6.0.3-8
          spark.yarn.am.extraJavaOptions -Dhdp.version=2.6.0.3-8
              
      3. Copy Ambari's spark-env.sh into conf
          cp /usr/hdp/2.6.0.3-8/spark2/conf/spark-env.sh conf/
          and modify:
          export SPARK_CONF_DIR=${SPARK_CONF_DIR:-/home/install/spark-2.2.2-bin-hadoop2.7/conf}
          export SPARK_LOG_DIR=/var/log/spark2.2.2
          export SPARK_PID_DIR=/var/run/spark2.2.2
              
      4. Resolve the Jersey version conflict
          cd spark-2.2.2-bin-hadoop2.7/jars
          mv jersey-client-2.22.2.jar jersey-client-2.22.2.jar.bak
          cp /usr/hdp/2.6.0.3-8/hadoop-yarn/lib/jersey-client-1.9.jar .
          cp /usr/hdp/2.6.0.3-8/hadoop-yarn/lib/jersey-core-1.9.jar .
      

      Note: to keep this installation separate from the Spark that ships with Ambari, a dedicated Spark2 History Server HDFS directory and log directory are used; [ip/domain name] is the IP address or domain name of the machine that will run the new Spark2 History Server.
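
      For orientation, after step 2.2 the relevant part of conf/spark-defaults.conf might look like the sketch below. The last three properties are assumptions added here rather than part of the original steps: the history server's own UI port needs to match the 18081 referenced by spark.yarn.historyServer.address (and must not clash with the port of Ambari's original Spark2 History Server if both run on the same host), and pointing the event-log settings at the new HDFS path keeps the two versions' application histories apart.

          spark.history.fs.logDirectory     hdfs:///spark2.2.2-history/
          spark.yarn.historyServer.address  [ip/domain name]:18081
          spark.driver.extraJavaOptions     -Dhdp.version=2.6.0.3-8
          spark.yarn.am.extraJavaOptions    -Dhdp.version=2.6.0.3-8
          # assumptions, not shown in the original steps:
          spark.history.ui.port             18081
          spark.eventLog.enabled            true
          spark.eventLog.dir                hdfs:///spark2.2.2-history/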

    3. To make sure yarn cluster mode runs correctly, make the following change:

      In the Ambari web UI, go to MapReduce2 -> Configs -> Advanced -> Advanced mapred-site
              
      and replace every occurrence of ${hdp.version} with 2.6.0.3-8
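
      For reference, the substitution looks like this for one of the affected entries (the property name is taken from a typical HDP mapred-site; your cluster may have several more entries that reference ${hdp.version}, such as mapreduce.application.classpath):

          # before
          mapreduce.application.framework.path=/hdp/apps/${hdp.version}/mapreduce/mapreduce.tar.gz#mr-framework
          # after
          mapreduce.application.framework.path=/hdp/apps/2.6.0.3-8/mapreduce/mapreduce.tar.gz#mr-framework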
      
    4. On the server that will run the new Spark2 History Server, create the following directories:

      1. Create the log directory and the pid directory
          cd /var/log
          mkdir spark2.2.2
          chown spark:hadoop spark2.2.2
              
          cd /var/run
          mkdir spark2.2.2
          chown spark:hadoop spark2.2.2
          
      2. Create the HDFS history directory
          hadoop fs -mkdir /spark2.2.2-history
          hadoop fs -chown spark:hadoop /spark2.2.2-history
          hadoop fs -chmod 777 /spark2.2.2-history
      
    5. If YARN's ResourceManager runs in HA mode, the following changes are also needed:

      cd spark-2.2.2-bin-hadoop2.7/jars
      mv hadoop-auth-2.7.3.jar hadoop-auth-2.7.3.jar.bak
      mv hadoop-common-2.7.3.jar hadoop-common-2.7.3.jar.bak
      mv hadoop-yarn-api-2.7.3.jar hadoop-yarn-api-2.7.3.jar.bak
      mv hadoop-yarn-client-2.7.3.jar hadoop-yarn-client-2.7.3.jar.bak
      mv hadoop-yarn-common-2.7.3.jar hadoop-yarn-common-2.7.3.jar.bak
      mv hadoop-yarn-server-common-2.7.3.jar hadoop-yarn-server-common-2.7.3.jar.bak
          
      cp /usr/hdp/2.6.0.3-8/hadoop-yarn/hadoop-yarn-api-2.7.3.2.6.0.3-8.jar .
      cp /usr/hdp/2.6.0.3-8/hadoop-yarn/hadoop-yarn-client-2.7.3.2.6.0.3-8.jar .
      cp /usr/hdp/2.6.0.3-8/hadoop-yarn/hadoop-yarn-common-2.7.3.2.6.0.3-8.jar .
      cp /usr/hdp/2.6.0.3-8/hadoop-yarn/hadoop-yarn-server-common-2.7.3.2.6.0.3-8.jar .
      cp /usr/hdp/2.6.0.3-8/hadoop/hadoop-common-2.7.3.2.6.0.3-8.jar .
      cp /usr/hdp/2.6.0.3-8/hadoop/hadoop-auth-2.7.3.2.6.0.3-8.jar .
          
      # The same HDP-built jars are also available under the spark2-client directory;
      # copying from either location should work:
      cp /usr/hdp/current/spark2-client/jars/hadoop-yarn-api-2.7.3.2.6.0.3-8.jar .
      cp /usr/hdp/current/spark2-client/jars/hadoop-yarn-client-2.7.3.2.6.0.3-8.jar .
      cp /usr/hdp/current/spark2-client/jars/hadoop-yarn-common-2.7.3.2.6.0.3-8.jar .
      cp /usr/hdp/current/spark2-client/jars/hadoop-yarn-server-common-2.7.3.2.6.0.3-8.jar .
      cp /usr/hdp/current/spark2-client/jars/hadoop-common-2.7.3.2.6.0.3-8.jar .
      cp /usr/hdp/current/spark2-client/jars/hadoop-auth-2.7.3.2.6.0.3-8.jar .
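
      A quick sanity check (a sketch; run from spark-2.2.2-bin-hadoop2.7/jars) is to list the six jars renamed above; each should now appear as its 2.7.3.2.6.0.3-8 HDP build plus the original .jar.bak copy, which Spark ignores:

          ls hadoop-auth-* hadoop-common-* hadoop-yarn-api-* hadoop-yarn-client-* hadoop-yarn-common-* hadoop-yarn-server-common-*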
      
    6. Start the new version's Spark2 History Server
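
      The original does not spell out the start command; a minimal sketch, assuming the new distribution sits at the /home/install path used in spark-env.sh above and is launched as the spark user (the owner of the directories created in step 4):

          su - spark -c "/home/install/spark-2.2.2-bin-hadoop2.7/sbin/start-history-server.sh"

      start-history-server.sh reads spark-env.sh and spark-defaults.conf from this installation's conf directory, so it should log to /var/log/spark2.2.2 and serve the event logs under hdfs:///spark2.2.2-history/.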

    7. Submit applications with the new version's spark-submit
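
      A minimal submission sketch using the SparkPi example bundled with the distribution (the jar name assumes the stock Scala 2.11 build of spark-2.2.2-bin-hadoop2.7):

          cd /home/install/spark-2.2.2-bin-hadoop2.7
          ./bin/spark-submit \
              --master yarn \
              --deploy-mode cluster \
              --class org.apache.spark.examples.SparkPi \
              examples/jars/spark-examples_2.11-2.2.2.jar 100

      Because spark-env.sh points SPARK_CONF_DIR at this installation's conf directory, the job picks up the spark-defaults.conf edited in step 2, including the -Dhdp.version=2.6.0.3-8 options that yarn cluster mode needs.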
