Prerequisites
1.) Configure the Java environment
2.) Configure the Hadoop cluster environment
Download the Flink package
# Download steps 1~3
1 https://flink.apache.org/downloads.html
2 https://archive.apache.org/dist/flink/
3 https://archive.apache.org/dist/flink/flink-1.10.0/flink-1.10.0-bin-scala_2.11.tgz
# Install steps 1~3
1 tar -zxvf flink-1.10.0-bin-scala_2.11.tgz
2 mv flink-1.10.0 ../flink
# Start/stop the local cluster (web UI on port 8081)
3 bin/start-cluster.sh / bin/stop-cluster.sh
# Start/stop a YARN session (random port)
4 ./bin/yarn-session.sh -jm 1024m -tm 4096m
# Submit a job to YARN
5 ./bin/flink run ./examples/batch/WordCount.jar --input hdfs:/user/yuan/input/wc.count --output hdfs:/user/yuan/swww
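After step 3 it is worth confirming that the local cluster actually came up before moving on to YARN. A minimal sketch, assuming the default web UI port 8081 and that curl and a JDK are installed:
# Start the local cluster and check the JobManager REST endpoint.
bin/start-cluster.sh
curl -s http://localhost:8081/overview   # should return a small JSON cluster summary
# The Flink JVMs should also show up in the process list:
jps   # expect StandaloneSessionClusterEntrypoint and TaskManagerRunner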
Configure the Hadoop dependencies that Flink requires
1.) Configure environment variables
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HADOOP_CLASSPATH=/home/soft/hadoop/lib/*
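These variables must be visible to the shell that launches the Flink scripts, so they should be persisted rather than set once in an interactive session. A minimal sketch, assuming a per-user ~/.bashrc and the example paths above:
# Append the exports to the login shell config (adjust paths to your install).
cat >> ~/.bashrc <<'EOF'
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HADOOP_CLASSPATH=/home/soft/hadoop/lib/*
EOF
source ~/.bashrc
echo "$HADOOP_CONF_DIR"   # verify before running bin/yarn-session.sh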
2.) Build the Hadoop package that Flink depends on (it must be rebuilt to match your Hadoop version)
(1) Download the build project and check out the branch matching your Flink version; Flink 1.10 corresponds to flink-shaded 9.0
https://github.com/apache/flink-shaded
(2) Edit the flink-shaded-hadoop-2-uber pom.xml and add the following dependencies:
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>2.7.6</version>
    <scope>provided</scope>
</dependency>
<dependency>
    <groupId>commons-cli</groupId>
    <artifactId>commons-cli</artifactId>
    <version>1.3.1</version>
</dependency>
(3) Build the project (a sketch for verifying the resulting jar follows the restart commands below)
mvn clean install -Dhadoop.version=2.7.6
(4) Copy the built jar into the flink/lib/ directory
(5) Restart the Flink cluster
# Start/stop a YARN session (random port)
./bin/yarn-session.sh -jm 1024m -tm 4096m
# Submit a job to YARN
./bin/flink run ./examples/batch/WordCount.jar --input hdfs:/user/yuan/input/wc.count --output hdfs:/user/yuan/swww
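Before restarting, it helps to confirm that the rebuilt uber jar actually bundles commons-cli. A minimal check, assuming the default artifact name produced by the flink-shaded 9.0 build against Hadoop 2.7.6 (adjust the jar name if yours differs):
# List the commons-cli classes inside the shaded jar that was copied to flink/lib/.
jar tf flink/lib/flink-shaded-hadoop-2-uber-2.7.6-9.0.jar | grep 'org/apache/commons/cli/Option'
# Expected output includes:
#   org/apache/commons/cli/Option.class
#   org/apache/commons/cli/Option$Builder.class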
Startup errors
# Cause: the commons-cli dependency was not added to the flink-shaded-hadoop-2-uber pom.xml (a diagnostic sketch follows below)
1. java.lang.NoSuchMethodError: org.apache.commons.cli.Option.builder(Ljava/lang/String;)Lorg/apache/commons/cli/Option$Builder;
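When this error appears, some jar on Flink's classpath is shipping an old commons-cli that lacks Option.builder. A hypothetical way to track down the offending jar, assuming the command is run from the Flink install directory:
# Print every jar in lib/ that contains the commons-cli Option class.
for j in lib/*.jar; do
  if unzip -l "$j" 2>/dev/null | grep -q 'org/apache/commons/cli/Option.class'; then
    echo "$j"
  fi
done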
# Cause: the requested memory exceeds the limit; edit etc/hadoop/yarn-site.xml on every node
2. The Flink Yarn cluster has failed.
etc/hadoop/yarn-site.xml
<property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
</property>
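The NodeManagers only pick up this setting after a restart. A minimal sketch, assuming a standard $HADOOP_HOME layout and that the edited yarn-site.xml has been copied to every node:
# Restart YARN so the vmem-check change takes effect on all NodeManagers.
$HADOOP_HOME/sbin/stop-yarn.sh
$HADOOP_HOME/sbin/start-yarn.sh
# Then retry the session:
./bin/yarn-session.sh -jm 1024m -tm 4096m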