Building Spark 2.4.2 from Source

Author: 吃货大米饭 | Published: 2019-08-06 17:21

    1. Download the source

    https://archive.apache.org/dist/spark/spark-2.4.2/spark-2.4.2-bin-sources.tgz

    2. Extract the source

    tar -xzvf spark-2.4.2-bin-sources.tgz

    3. Pin the versions

    File: dev/make-distribution.sh
    # Comment out lines 128-146, which detect the versions through Maven:
    #VERSION=$("$MVN" help:evaluate -Dexpression=project.version $@ 2>/dev/null\
    #    | grep -v "INFO"\
    #    | grep -v "WARNING"\
    #    | tail -n 1)
    #SCALA_VERSION=$("$MVN" help:evaluate -Dexpression=scala.binary.version $@ 2>/dev/null\
    #    | grep -v "INFO"\
    #    | grep -v "WARNING"\
    #    | tail -n 1)
    #SPARK_HADOOP_VERSION=$("$MVN" help:evaluate -Dexpression=hadoop.version $@ 2>/dev/null\
    #    | grep -v "INFO"\
    #    | grep -v "WARNING"\
    #    | tail -n 1)
    #SPARK_HIVE=$("$MVN" help:evaluate -Dexpression=project.activeProfiles -pl sql/hive $@ 2>/dev/null\
    #    | grep -v "INFO"\
    #    | grep -v "WARNING"\
    #    | fgrep --count "<id>hive</id>";\
    #    # Reset exit status to 0, otherwise the script stops here if the last grep finds nothing\
    #    # because we use "set -o pipefail"
    #    echo -n)
    # Hard-code the values instead, so the build skips the Maven detection above
    VERSION=2.4.2                          # Spark version
    SCALA_VERSION=2.11                     # Scala binary version
    SPARK_HADOOP_VERSION=2.6.0-cdh5.16.1   # matching Hadoop version
    SPARK_HIVE=1                           # 1 = build with Hive support
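    To make it clearer what the commented-out block was doing, here is a minimal sketch of the same filter applied to made-up text instead of a real Maven run. The helper name `extract_last_value` and the sample output are assumptions for illustration only; they are not part of `make-distribution.sh`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: same filter chain as the commented-out block.
# It drops lines containing INFO or WARNING and keeps the last remaining line.
extract_last_value() {
  grep -v "INFO" | grep -v "WARNING" | tail -n 1
}

# Made-up stand-in for the output of "mvn help:evaluate"; no Maven call happens here.
sample_output='[INFO] Scanning for projects...
[WARNING] Some plugin warning
2.4.2'

detected=$(printf '%s\n' "$sample_output" | extract_last_value)
echo "detected: $detected"

# Hard-coding the value, as in the step above, skips this detection entirely.
pinned=2.4.2
if [ "$detected" = "$pinned" ]; then
  echo "pinned value matches the detected value"
fi
```

    Each real detection starts a separate Maven process, so pinning the four values presumably saves that startup cost and avoids failures when Maven cannot evaluate the expression.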
    

    4. Edit the repositories in pom.xml to add the Aliyun and Cloudera repositories

     <repositories> 
        <!-- This should be at top, it makes maven try the central repo first and then others and hence faster dep resolution
        <repository>
            <id>central</id>      
            <name>Maven Repository</name>      
            <url>https://repo.maven.apache.org/maven2</url>      
            <releases>        
                <enabled>true</enabled>      
            </releases>      
            <snapshots>        
                <enabled>false</enabled>      
            </snapshots>    
        </repository>  -->    
        <repository>      
            <id>maven-ali</id>      
            <url>http://maven.aliyun.com/nexus/content/groups/public//</url>      
            <releases>        
                <enabled>true</enabled>      
            </releases>      
            <snapshots>        
                <enabled>true</enabled>        
                <updatePolicy>always</updatePolicy>        
                <checksumPolicy>fail</checksumPolicy>      
            </snapshots>    
        </repository>    
        <repository>      
            <id>cloudera</id>      
            <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>    
        </repository>  
    </repositories> 
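    Before starting a long build, it can help to confirm that the edited pom.xml really declares both repositories. The following is a rough sketch; the helper `has_repo_id` and the `POM` variable are hypothetical names, and the check is a plain text match, so it would also match an id that sits inside an XML comment.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the POM file contains <id>REPO_ID</id>.
# This is a text search, not an XML parse.
has_repo_id() {
  pom_file=$1
  repo_id=$2
  grep -q "<id>${repo_id}</id>" "$pom_file"
}

# Assumed location: pom.xml in the current directory (the Spark source root).
POM=${POM:-pom.xml}

if [ -f "$POM" ]; then
  for id in maven-ali cloudera; do
    if has_repo_id "$POM" "$id"; then
      echo "found repository: $id"
    else
      echo "missing repository: $id"
    fi
  done
fi
```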
    

    5. Build

    ./dev/make-distribution.sh \
     --name cdh5.16.1 \
     --tgz \
     -Dhadoop.version=2.6.0-cdh5.16.1 \
     -Phadoop-2.6 \
     -Phive \
     -Phive-thriftserver \
     -Pyarn 
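    With `--tgz` and `--name cdh5.16.1`, the script should leave a tarball named from the pinned version and the name in the source root. The sketch below derives that file name and lists a few entries if the file exists; the naming pattern is an assumption based on how the 2.4.x script builds the archive name, so verify it against your copy of `dev/make-distribution.sh`.

```shell
#!/usr/bin/env bash
# Values taken from the steps above.
VERSION=2.4.2
NAME=cdh5.16.1

# Assumed naming pattern: spark-<version>-bin-<name>.tgz
TARBALL="spark-${VERSION}-bin-${NAME}.tgz"
echo "expected tarball: $TARBALL"

# Only inspect the archive if the build actually produced it.
if [ -f "$TARBALL" ]; then
  tar -tzf "$TARBALL" | head -n 5
else
  echo "tarball not found in $(pwd)"
fi
```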
    
