
Compiling Flink 1.8.1 from Source

Author: 吃货大米饭 | Published 2019-08-02 14:58

    1. Environment Preparation

    Maven 3.3.x or later
    JDK 8 (a recent update)
    Scala 2.11.x
    Flink 1.8.1 source package
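
    A quick sanity check of the toolchain before building; a minimal sketch (the version output will of course differ on your machine, only the minimums above matter):

    mvn -version      # should report Maven 3.3.x or later running on JDK 8
    java -version     # should report a recent JDK 8 build
    scala -version    # should report Scala 2.11.x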

    2. Maven settings.xml Configuration

    <mirror>
      <id>nexus-aliyun</id>
      <mirrorOf>*,!jeecg,!jeecg-snapshots,!mapr-releases,!cloudera-releases,!confluent</mirrorOf>
      <name>Nexus aliyun</name>
      <url>http://maven.aliyun.com/nexus/content/groups/public</url>
    </mirror>
    
    • What the mirrorOf values mean:
    * = everything
    external:* = everything not on the localhost and not file based.
    repo,repo1 = repo or repo1
    *,!repo1 = everything except repo1
    

    If <mirrorOf> is set to *, the mirror matches every repository and all remote repository requests are routed to this mirror's URL (having it mirror all repository requests). I therefore changed mirrorOf to *,!jeecg,!jeecg-snapshots,!mapr-releases,!cloudera-releases,!confluent, so the mirror intercepts requests for every repository except jeecg, jeecg-snapshots, mapr-releases, cloudera-releases and confluent; requests that are not intercepted go to the repositories declared in the pom files.
    Note that cloudera-releases refers to the repository id, i.e. the <id> of the Cloudera repository added to the pom in section 3 below.

    Other questions:
    What is the difference between a repository configured in settings.xml and one configured in pom.xml?
    They serve the same purpose: both specify the use of multiple repositories. A repository configured in pom.xml, however, only applies to the current project and its child projects, whereas one configured in settings.xml is a global setting that applies to all projects.

    3. Download and Build the flink-shaded Source

    • Download the source
    wget https://archive.apache.org/dist/flink/flink-shaded-7.0/flink-shaded-7.0-src.tgz
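
    • Extract the archive (a sketch; the tarball is expected to unpack into a flink-shaded-7.0 directory, adjust if yours differs)
    tar -zxvf flink-shaded-7.0-src.tgz
    cd flink-shaded-7.0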
    
    • After extracting the archive, edit the pom.xml and add the Cloudera repository
    <repositories>
     <repository>
         <id>cloudera-releases</id>
         <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
     </repository>
    </repositories>
    
    • Build the flink-shaded source
    mvn clean install -DskipTests -Dhadoop.version=2.6.0-cdh5.15.1
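
    If the build succeeds, the shaded Hadoop artifact is installed into the local Maven repository; a quick way to check (a sketch, assuming the default ~/.m2 location):

    ls ~/.m2/repository/org/apache/flink/flink-shaded-hadoop-2/2.6.0-cdh5.15.1-7.0/
    # flink-shaded-hadoop-2-2.6.0-cdh5.15.1-7.0.jar should be listed here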
    

    4. Build Flink

    mvn clean install -DskipTests -Pvendor-repos -Dfast -Dhadoop.version=2.6.0-cdh5.15.1
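
    After this build finishes, the runnable distribution is assembled under flink-dist; a sketch of where to find it (the path assumes the standard 1.8.x layout, run from the Flink source root):

    ls flink-dist/target/flink-1.8.1-bin/flink-1.8.1/
    # bin/, lib/, conf/, examples/ make up the distribution used in the test below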
    

    5. Build Result

    (screenshot of the build result)

    6. Test

    Run a Flink on YARN example:
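
    The example reads its input from HDFS, so upload a file first; a minimal sketch, assuming a local copy of LICENSE-2.0.txt:

    hdfs dfs -mkdir -p /ruozedata
    hdfs dfs -put LICENSE-2.0.txt /ruozedata/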

    flink run -m yarn-cluster ./examples/batch/WordCount.jar \
    --input /ruozedata/LICENSE-2.0.txt --output /ruozedata/wordcount-result.txt
    

    Error:

    ------------------------------------------------------------
     The program finished with the following exception:
    
    java.lang.RuntimeException: Could not identify hostname and port in 'yarn-cluster'.
            at org.apache.flink.client.ClientUtils.parseHostPortAddress(ClientUtils.java:47)
            at org.apache.flink.client.cli.AbstractCustomCommandLine.applyCommandLineOptionsToConfiguration(AbstractCustomCommandLine.java:83)
            at org.apache.flink.client.cli.DefaultCLI.createClusterDescriptor(DefaultCLI.java:60)
            at org.apache.flink.client.cli.DefaultCLI.createClusterDescriptor(DefaultCLI.java:35)
            at org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:216)
            at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:205)
            at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1010)
            at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:1083)
            at org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
            at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1083)
    

    This error usually means the Hadoop classes are not on Flink's classpath, so the YARN-specific command line is never activated and the default CLI tries to parse 'yarn-cluster' as a plain host:port address.

    Solution:

    • Copy the flink-shaded-hadoop-2-2.6.0-cdh5.15.1-7.0.jar built in the flink-shaded-7.0 directory into Flink's lib directory (see the sketch after this list)

    • Specify the Hadoop dependency, either by setting it in the system environment variables or by exporting it right before submitting to YARN:

    [hadoop@hadoop001 lib]$ export HADOOP_CLASSPATH=`hadoop classpath`
    [hadoop@hadoop001 flink]$ bin/flink run -m yarn-cluster -yn 3 -s 4 examples/batch/WordCount.jar
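
    For the first step, a sketch of the copy; the jar path inside the flink-shaded build tree and the Flink install location ($FLINK_HOME) are assumptions, adjust them to your layout:

    cp flink-shaded-7.0/flink-shaded-hadoop-2/target/flink-shaded-hadoop-2-2.6.0-cdh5.15.1-7.0.jar \
       $FLINK_HOME/lib/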
    
