Machine information
ip1: 172.16.201.121 (external IP: 9.30.251.112)
ip2: 172.16.201.122
ip3: 172.16.201.123
Prerequisite: a Java environment
sudo apt-get install openjdk-8-jdk
Verify: java -version
which java
If the machine has multiple Java versions installed, switch the active one with: /usr/sbin/alternatives --config java
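Several later steps (spark-env.sh and ~/.bashrc) need the JDK's install path for JAVA_HOME. A quick way to find it on the current machine (plain coreutils, nothing Spark-specific):
- readlink -f $(which java)
The printed path, minus the trailing /bin/java, is the directory to use as JAVA_HOME.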
1. Set hostnames and name resolution
1.1 Set the hostname
- vi /etc/hostname
Set it to master on ip1; on ip2 and ip3 set it to slave1 and slave2 respectively.
- Apply the new hostname: reboot
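On systemd-based distributions the same change can also be applied without rebooting (an alternative to the step above, not part of the original procedure):
- hostnamectl set-hostname master
Run the matching command with slave1 or slave2 on the other two machines.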
1.2 Edit the hosts file
- vi /etc/hosts
ip1 -- master
ip2 -- slave1
ip3 -- slave2
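Concretely, the entries to append to /etc/hosts on all three machines, using the internal IPs from the machine list above:
172.16.201.121 master
172.16.201.122 slave1
172.16.201.123 slave2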
2. Spark installation and configuration
2.1 Download and unpack
- wget http://d3kbcqa49mib13.cloudfront.net/spark-2.0.2-bin-hadoop2.7.tgz
- tar -zxvf spark-2.0.2-bin-hadoop2.7.tgz
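The paths used in the rest of this guide assume the unpacked directory lives under /root/spark, so move it there after extracting:
- mkdir -p /root/spark
- mv spark-2.0.2-bin-hadoop2.7 /root/spark/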
2.2 Configuration
- cd /root/spark/spark-2.0.2-bin-hadoop2.7/conf
- cp spark-env.sh.template spark-env.sh (the conf directory ships only templates)
- vi spark-env.sh
- export SPARK_MASTER_HOST=172.16.201.121 (the master's IP; the web UI reports the master under this address. SPARK_MASTER_IP is the deprecated pre-2.0 name for the same setting.)
- export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.111-2.b15.el7_3.x86_64/jre
- export PATH=$PATH:$JAVA_HOME/bin
- cp slaves.template slaves
- vi slaves (one worker hostname per line):
- slave1
- slave2
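A quick sanity check from the master that the worker names resolve and are reachable (plain ping, nothing Spark-specific):
- ping -c 1 slave1
- ping -c 1 slave2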
3. Slave configuration
- Copy the master's Spark directory to slave1 and slave2
- scp -r spark-2.0.2-bin-hadoop2.7 root@172.16.201.123:/root/spark/
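The same copy is needed for slave1 (172.16.201.122 in the machine list):
- scp -r spark-2.0.2-bin-hadoop2.7 root@172.16.201.122:/root/spark/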
4. Start the cluster
- Start the master (the start scripts live in Spark's sbin directory): ./start-master.sh
- Start the slaves: ./start-slaves.sh
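start-slaves.sh logs into every host listed in the slaves file over ssh, so the master needs password-less access to the workers. A minimal sketch, assuming root logins are permitted:
- ssh-keygen -t rsa (accept the defaults)
- ssh-copy-id root@slave1
- ssh-copy-id root@slave2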
- Visit http://9.30.251.112:8080/ to check the cluster status (8080 is the default port of the master's web UI)
- Alternatively, run jps in a shell on each node to check that the cluster processes are up
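With the standalone daemons running, jps shows a Master process on the master node and a Worker process on each slave; typical output looks roughly like this (PIDs will differ):
1234 Master
5678 Jps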
5. Submitting jobs
Spark supports two submission modes:
5.1 Standalone cluster mode
With --deploy-mode cluster the job goes through the master's REST submission endpoint (port 6066 by default) and the driver runs on one of the workers:
./bin/spark-submit --master spark://172.16.201.121:6066 --deploy-mode cluster --class org.apache.spark.examples.SparkPi examples/jars/spark-examples_2.11-2.0.2.jar 10
- Console output: (screenshot)
- Web UI output: (screenshots)
5.2 Standalone client mode
Without --deploy-mode cluster the driver runs in the submitting process, connecting to the master's default port 7077:
./bin/spark-submit --master spark://172.16.201.121:7077 --class org.apache.spark.examples.SparkPi examples/jars/spark-examples_2.11-2.0.2.jar 10
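In client mode SparkPi prints its result straight to the submitting console; the line to look for reads roughly as follows (the exact value varies from run to run):
Pi is roughly 3.14...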
Problems encountered
- jps: command not found
- Check whether jps exists under the JDK directory.
If there is no jps under /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.121-0.b13.el7_3.x86_64/jre/bin/,
install openjdk-devel: yum install java-1.8.0-openjdk-devel
alias jps='/usr/lib/jvm/java-1.8.0-openjdk/bin/jps'
- Check whether the Java environment variables are set:
vi ~/.bashrc
Add the Java environment variables:
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.111-2.b15.el7_3.x86_64/jre
export PATH=$PATH:$JAVA_HOME/bin
source ~/.bashrc
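A quick check that the variables took effect in the current shell:
- echo $JAVA_HOME
- jps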