Integrating Tez with CDH 5.9.2


Author: Jack_Wonng | Published: 2017-09-27 11:38

    1. Installing and configuring Tez

    1.1 Requirements

    • CDH 5.9.2 (Hadoop 2.6.0)
    • Build toolchain: gcc, gcc-c++, make, build
    • Node.js and npm (required by tez-ui)
    • Git
    • Protocol Buffers 2.5.0
    • Maven 3
    • Tez 0.8.5
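
    Before starting, it can help to confirm that the tools listed above are actually on the PATH. The following is a minimal sketch, not part of the original procedure: the function name missing_tools is hypothetical, and the command names (g++ for gcc-c++, mvn for Maven, protoc for Protocol Buffers) are assumptions about what each package installs.

```shell
#!/usr/bin/env bash
# Print each given command that is not on PATH, one per line.
# Returns 1 if anything is missing, 0 otherwise.
missing_tools() {
  local tool status=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      status=1
    fi
  done
  return "$status"
}

# Assumed command names for the requirements list above.
if missing=$(missing_tools gcc g++ make node npm git protoc mvn); then
  echo "all build prerequisites found"
else
  echo "missing: $(echo "$missing" | tr '\n' ' ')"
fi
```

    Anything reported as missing is covered by the installation steps in section 1.2.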

    1.2 Preparing the build environment

    Install gcc, gcc-c++, make, and build

    yum install gcc gcc-c++ libstdc++-devel make build
    

    Install Node.js and npm

    wget http://nodejs.org/dist/v0.8.14/node-v0.8.14.tar.gz
    tar -xzf node-v0.8.14.tar.gz
    cd node-v0.8.14
    ./configure
    make && make install
    

    Install Git

    Download the source from https://git-scm.com/download, then extract it and run:

    ./configure
    make
    make install
    

    Install Protocol Buffers 2.5.0

    Download the source from https://github.com/google/protobuf/releases/tag/v2.5.0, then extract it and run:

    ./configure
    make && make install
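
    Since the requirements call for exactly version 2.5.0, it is worth checking which protoc the shell picks up after installation. A minimal sketch, assuming protoc reports its version as "libprotoc 2.5.0"; the function name protoc_status is hypothetical.

```shell
# Classify the output of `protoc --version`:
# prints "ok" for 2.5.0, otherwise "mismatch".
protoc_status() {
  if [ "$1" = "libprotoc 2.5.0" ]; then
    echo ok
  else
    echo mismatch
  fi
}

if command -v protoc >/dev/null 2>&1; then
  echo "protoc: $(protoc_status "$(protoc --version)")"
else
  echo "protoc: not on PATH"
fi
```

    A "mismatch" result usually means another protoc installation appears earlier on the PATH.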
    

    1.3 Building Tez

    1.3.1 Download Tez from the official site

    1.3.2 Extract the archive

    1.3.3 Patch the source

    /tez-mapreduce/src/main/java/org/apache/tez/mapreduce/hadoop/mapreduce/JobContextImpl.java

    diff --git a/tez-mapreduce/src/main/java/org/apache/tez/mapreduce/hadoop/mapreduce/JobContextImpl.java b/tez-mapreduce/src/main/java/org/apache/tez/mapreduce/hadoop/mapreduce/JobContextImpl.java
    index 12491ed..b4ca24c 100644
    --- a/tez-mapreduce/src/main/java/org/apache/tez/mapreduce/hadoop/mapreduce/JobContextImpl.java
    +++ b/tez-mapreduce/src/main/java/org/apache/tez/mapreduce/hadoop/mapreduce/JobContextImpl.java
    @@ -475,5 +475,16 @@ public class JobContextImpl implements JobContext {
       public Progressable getProgressible() {
         return progress;
       }
    +
    +  /**
    +   * Get the boolean value for the property that specifies which classpath
    +   * takes precedence when tasks are launched. True - user's classes takes
    +   * precedence. False - system's classes takes precedence.
    +   * @return true if user's classes should take precedence
    +   */
    +  @Override
    +  public boolean userClassesTakesPrecedence() {
    +    return getJobConf().getBoolean(MRJobConfig.MAPREDUCE_JOB_USER_CLASSPATH_FIRST, false);
    +  }
     
     }
    

    /tez-ext-service-tests/src/test/java/org/apache/tez/shufflehandler/ShuffleHandler.java

    In this file, call headers().set() and headers().get() instead of setHeader() and getHeader() (a fix suggested by a coworker).
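
    To find the call sites that need this change, something like the following grep can be used. It is only a sketch for locating lines such as response.setHeader(name, value), which would become response.headers().set(name, value); the function name find_old_header_calls is hypothetical, and the script assumes it is run from the root of the Tez source tree.

```shell
# Print "line:text" for each line of a Java file that still calls
# setHeader( or getHeader( on some object.
find_old_header_calls() {
  grep -nE '\.(setHeader|getHeader)\(' "$1"
}

# Path taken from the article, relative to the Tez source root.
src=tez-ext-service-tests/src/test/java/org/apache/tez/shufflehandler/ShuffleHandler.java
if [ -f "$src" ]; then
  find_old_header_calls "$src" || echo "no setHeader/getHeader calls left"
fi
```

    The edits themselves are best made by hand, since the sketch only reports where the old calls are.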
    

    1.3.4 Update the build configuration

    vi pom.xml

    <profile>
      <id>cdh5.9.2</id>
      <activation>
        <activeByDefault>false</activeByDefault>
      </activation>
      <properties>
        <hadoop.version>2.6.0-cdh5.9.2</hadoop.version>
      </properties>
      <pluginRepositories>
        <pluginRepository>
          <id>cloudera</id>
          <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
        </pluginRepository>
      </pluginRepositories>
      <repositories>
        <repository>
          <id>cloudera</id>
          <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
        </repository>
      </repositories>
    </profile>
    

    vi tez-ui/pom.xml

    <nodeVersion>v0.12.9</nodeVersion>
    <npmVersion>2.14.9</npmVersion>
    

    1.3.5 Run the build

    mvn clean package -Pcdh5.9.2 -DskipTests=true -Dmaven.javadoc.skip=true -Dfrontend-maven-plugin.version=0.0.23
    

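    Note: if the download of the Node.js tar.gz archive fails, build manually from inside the tez-ui directory.

    Build tez-ui and tez-ui2 manually, using the Taobao npm registry: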

    npm --registry=https://registry.npm.taobao.org install --verbose
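
    Once the build finishes, a quick check that the distribution directories exist avoids surprises in the deployment steps. A minimal sketch: the function name missing_dist_dirs is hypothetical, tez-dist/target/tez-0.8.5-minimal is the path used later in this article, and the sibling tez-0.8.5 directory is an assumption about the build output.

```shell
# Print every expected output directory that is missing under
# <root>/tez-dist/target; prints nothing when all are present.
#   $1: root of the Tez source tree
#   remaining arguments: directory names to look for
missing_dist_dirs() {
  local root=$1 name
  shift
  for name in "$@"; do
    [ -d "$root/tez-dist/target/$name" ] || echo "$root/tez-dist/target/$name"
  done
}

# Run from the root of the Tez source tree.
missing_dist_dirs . tez-0.8.5-minimal tez-0.8.5
```

    Empty output means both directories were produced.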
    

    2. Hive on Tez

    1. Copy the tez-0.8.5-minimal directory to HDFS
    hdfs dfs -mkdir -p /tez-dir
    hdfs dfs -put tez-dist/target/tez-0.8.5-minimal /tez-dir/
    
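    2. Copy hadoop-mapreduce-client-common-2.6.0-cdh5.9.2.jar into the /tez-dir/tez-0.8.5-minimal directory on HDFS.
    3. Copy the jars under tez-0.8.5 and under its lib directory into the lib directory of the Hive client installation. Delete hive-exec-1.1.0-cdh5.9.2-core.jar and hive-exec-core.jar from hive/auxlib; otherwise Kryo errors occur.
    4. Create tez-site.xml and save it to /etc/hive/conf/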
    <?xml version="1.0"?>
    <!--
      Licensed under the Apache License, Version 2.0 (the "License");
      you may not use this file except in compliance with the License.
      You may obtain a copy of the License at
     
        http://www.apache.org/licenses/LICENSE-2.0
     
      Unless required by applicable law or agreed to in writing, software
      distributed under the License is distributed on an "AS IS" BASIS,
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
      See the License for the specific language governing permissions and
      limitations under the License. See accompanying LICENSE file.
    -->
     
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
     
    <!-- Put site-specific property overrides in this file. -->
     
    <configuration>
     <property>
       <name>tez.lib.uris</name>
       <value>${fs.defaultFS}/tez-dir/tez-0.8.5-minimal,${fs.defaultFS}/tez-dir/tez-0.8.5-minimal/lib</value>
     </property>
     <property>
       <name>tez.use.cluster.hadoop-libs</name>
       <value>true</value>
     </property>
    </configuration>
    
    5. Use Tez in Hive
    set hive.execution.engine=tez;
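
    The two HDFS upload steps in the list above can be scripted so that they are easy to repeat after a rebuild. This is a sketch under assumptions, not the author's script: upload_tez and the HDFS_CMD override are hypothetical names, and /tez-dir matches the paths used in tez.lib.uris.

```shell
# HDFS_CMD is a hypothetical override hook: set HDFS_CMD="echo hdfs"
# to print the commands instead of running them.
HDFS_CMD=${HDFS_CMD:-hdfs}

# Upload the minimal Tez build and the CDH MapReduce client jar to HDFS.
#   $1: local tez-0.8.5-minimal directory
#   $2: local hadoop-mapreduce-client-common-2.6.0-cdh5.9.2.jar
upload_tez() {
  $HDFS_CMD dfs -mkdir -p /tez-dir &&
  $HDFS_CMD dfs -put "$1" /tez-dir/ &&
  $HDFS_CMD dfs -put "$2" /tez-dir/tez-0.8.5-minimal/ &&
  $HDFS_CMD dfs -ls /tez-dir/tez-0.8.5-minimal
}
```

    A call would pass tez-dist/target/tez-0.8.5-minimal as the first argument and the local path of the client jar as the second. Running it once with HDFS_CMD="echo hdfs" shows the exact commands before anything is written to the cluster.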
    

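    As of 2017-09-27, this setup has been running in our company's production environment for more than three months. For lack of time, tez-ui has not been integrated successfully; I will look into the UI problem when time permits. If anyone has managed to integrate tez-ui, please share how.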
