Preparation
Cluster machine IPs and hostnames
192.168.153.128 mongodb01
192.168.153.129 mongodb02
192.168.153.130 mongodb03
Check the hostname:
[mongodb@mongodb02 conf]$ hostname
mongodb02
Make sure the machines can reach one another by hostname.
Issue 0: the IP is reachable, but the hostname is not
[mongodb@mongodb02 conf]$ ping 192.168.153.128
PING 192.168.153.128 (192.168.153.128) 56(84) bytes of data.
64 bytes from 192.168.153.128: icmp_seq=1 ttl=64 time=77.4 ms
64 bytes from 192.168.153.128: icmp_seq=2 ttl=64 time=0.305 ms
64 bytes from 192.168.153.128: icmp_seq=3 ttl=64 time=0.301 ms
^C
--- 192.168.153.128 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2002ms
rtt min/avg/max/mdev = 0.301/26.020/77.455/36.370 ms
[mongodb@mongodb02 conf]$ ping mongodb01
ping: mongodb01: Name or service not known
[mongodb@mongodb02 conf]$
Edit the local hosts file on each machine, adding entries so the nodes can reach one another by name:
[mongodb@mongodb02 conf]$ cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.153.128 mongodb01
192.168.153.129 mongodb02
192.168.153.130 mongodb03
Ping each machine by hostname to confirm:
[mongodb@mongodb02 conf]$ ping mongodb01
PING mongodb01 (192.168.153.128) 56(84) bytes of data.
64 bytes from mongodb01 (192.168.153.128): icmp_seq=1 ttl=64 time=0.500 ms
64 bytes from mongodb01 (192.168.153.128): icmp_seq=2 ttl=64 time=0.304 ms
64 bytes from mongodb01 (192.168.153.128): icmp_seq=3 ttl=64 time=0.309 ms
^C
--- mongodb01 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2001ms
rtt min/avg/max/mdev = 0.304/0.371/0.500/0.091 ms
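Rather than editing each node by hand, the same entries can be appended on every node with a small sketch like the one below (run as root on each machine; the IPs and names are this cluster's, adjust to your environment):

# Append the cluster name mappings to /etc/hosts (once per node, as root)
cat >> /etc/hosts <<'EOF'
192.168.153.128 mongodb01
192.168.153.129 mongodb02
192.168.153.130 mongodb03
EOF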
Passwordless login between machines
Log in as root and generate a key pair:
[root@mongodb01 ~]# ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
/root/.ssh/id_rsa already exists.
Overwrite (y/n)?
[root@mongodb01 ~]#
Copy the key to the other servers:
[root@mongodb01 ~]# ssh-copy-id 192.168.153.130
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub"
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
root@192.168.153.130's password:
Number of key(s) added: 1
Now try logging into the machine, with: "ssh '192.168.153.130'"
and check to make sure that only the key(s) you wanted were added.
[root@mongodb01 ~]#
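Repeating ssh-copy-id for each peer works, but a short loop covers all three nodes in one pass; a minimal sketch, assuming the key pair from ssh-keygen above already exists (each root password is asked for once):

# Push the public key to every node, then spot-check passwordless login
for host in mongodb01 mongodb02 mongodb03; do
    ssh-copy-id "root@$host"
done
ssh root@mongodb03 hostname   # should print mongodb03 with no password prompt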
Installation directory
[mongodb@mongodb02 ~]$ cd /home/spark/software/spark-2.4.4-bin-hadoop2.7/
[mongodb@mongodb02 spark-2.4.4-bin-hadoop2.7]$
Add the environment variables:
[mongodb@mongodb02 spark-2.4.4-bin-hadoop2.7]$ sudo vim /etc/profile
export SPARK_HOME=/home/spark/software/spark-2.4.4-bin-hadoop2.7
export PATH=/bin:/usr/bin:$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin
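The new variables only take effect in fresh login shells. To apply them to the current shell and confirm they resolve, something like:

# Reload the profile and verify the Spark variables
source /etc/profile
echo "$SPARK_HOME"      # expect /home/spark/software/spark-2.4.4-bin-hadoop2.7
which spark-submit      # expect $SPARK_HOME/bin/spark-submit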
If a bad edit to the environment variables breaks basic commands:
[root@mongodb01]$ ll
bash: ll: command not found...
Similar command is: 'll'
To fix it: cd into the /usr/bin directory, run ./sudo su there to regain root privileges (the broken PATH can no longer locate sudo by name), then run sudo vim /etc/profile and correct the environment variables.
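Two standard alternatives that avoid changing directories (general shell tricks, not specific to this setup): call the tools by absolute path, or temporarily restore a sane PATH in the broken shell:

# Option 1: invoke the editor by absolute path
/usr/bin/sudo /usr/bin/vim /etc/profile
# Option 2: restore PATH for the current shell only, then fix the file
export PATH=/usr/bin:/usr/sbin:/bin:/sbin
sudo vim /etc/profile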
Configuring the cluster
Configure slaves
[mongodb@mongodb02 spark-2.4.4-bin-hadoop2.7]$ cd ./conf
[mongodb@mongodb02 conf]$ ll
total 36
-rw-r--r--. 1 mongodb mongodb 996 Aug 28 05:30 docker.properties.template
-rw-r--r--. 1 mongodb mongodb 1105 Aug 28 05:30 fairscheduler.xml.template
-rw-r--r--. 1 mongodb mongodb 2025 Aug 28 05:30 log4j.properties.template
-rw-r--r--. 1 mongodb mongodb 7801 Aug 28 05:30 metrics.properties.template
-rw-r--r--. 1 mongodb mongodb 865 Aug 28 05:30 slaves.template
-rw-r--r--. 1 mongodb mongodb 1292 Aug 28 05:30 spark-defaults.conf.template
-rwxr-xr-x. 1 mongodb mongodb 4221 Aug 28 05:30 spark-env.sh.template
[mongodb@mongodb02 conf]$ cp slaves.template slaves
[mongodb@mongodb02 conf]$
[mongodb@mongodb02 conf]$ sudo vim slaves
Add the cluster hostnames to the slaves file:
# A Spark Worker will be started on each of the machines listed below.
#localhost
mongodb01
mongodb02
mongodb03
Configure spark-env.sh
[mongodb@mongodb02 conf]$ cp spark-env.sh.template spark-env.sh
[mongodb@mongodb02 conf]$ sudo vim spark-env.sh
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.181-7.b13.el7.x86_64/jre
export SPARK_MASTER_IP=mongodb01
export SPARK_MASTER_PORT=7077
export SPARK_WORKER_CORES=1
export SPARK_WORKER_INSTANCES=1
export SPARK_WORKER_MEMORY=800M
Note: do not set the worker memory higher than the memory allocated to the virtual machine.
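To see how much memory the VM actually has before choosing SPARK_WORKER_MEMORY:

# Total/used/free memory in megabytes
free -m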
Checking whether the JDK is installed, and finding its path:
[mongodb@mongodb03 sbin]$ java -version
openjdk version "1.8.0_181"
OpenJDK Runtime Environment (build 1.8.0_181-b13)
OpenJDK 64-Bit Server VM (build 25.181-b13, mixed mode)
[mongodb@mongodb03 sbin]$ which java
/usr/bin/java
[mongodb@mongodb03 alternatives]$ ls -lrt /usr/bin/java
lrwxrwxrwx. 1 root root 22 Sep 18 15:06 /usr/bin/java -> /etc/alternatives/java
[mongodb@mongodb03 alternatives]$ ls -lrt /etc/alternatives/java
lrwxrwxrwx. 1 root root 71 Sep 18 15:06 /etc/alternatives/java -> /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.181-7.b13.el7.x86_64/jre/bin/java
[mongodb@mongodb03 alternatives]$ cd /usr/lib/jvm
[mongodb@mongodb03 jvm]$ ls
java-1.7.0-openjdk-1.7.0.191-2.6.15.5.el7.x86_64
java-1.8.0-openjdk-1.8.0.181-7.b13.el7.x86_64
jre
jre-1.7.0
jre-1.7.0-openjdk
jre-1.7.0-openjdk-1.7.0.191-2.6.15.5.el7.x86_64
jre-1.8.0
jre-1.8.0-openjdk
jre-1.8.0-openjdk-1.8.0.181-7.b13.el7.x86_64
jre-openjdk
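Instead of following the symlink chain by hand, the JRE path used above can be derived in one step with readlink (standard coreutils, nothing Spark-specific):

# Resolve the real java binary behind /usr/bin/java ...
readlink -f /usr/bin/java
# ... and strip the trailing /bin/java to get the JAVA_HOME value for spark-env.sh
dirname "$(dirname "$(readlink -f /usr/bin/java)")"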
Copying the directory to the other servers
Create the corresponding directory on each target server (mongodb01, mongodb03):
[mongodb@mongodb03 ~]$ sudo mkdir -p /home/spark/software/spark-2.4.4-bin-hadoop2.7/
Copy the files to each server (mongodb01, mongodb03):
scp -r /home/spark/software/spark-2.4.4-bin-hadoop2.7/ root@mongodb03:/home/spark/software/
Then go to each server and check the directory to confirm the contents are identical:
[mongodb@mongodb03]$ cd /home/spark/software/spark-2.4.4-bin-hadoop2.7/
[mongodb@mongodb03 spark-2.4.4-bin-hadoop2.7]$ ll
total 108
drwxr-xr-x. 2 root root 4096 Nov 11 15:54 bin
drwxr-xr-x. 2 root root 264 Nov 11 15:54 conf
drwxr-xr-x. 5 root root 50 Nov 11 15:54 data
drwxr-xr-x. 4 root root 29 Nov 11 15:54 examples
drwxr-xr-x. 2 root root 12288 Nov 11 15:54 jars
drwxr-xr-x. 4 root root 38 Nov 11 15:54 kubernetes
-rw-r--r--. 1 root root 21316 Nov 11 15:54 LICENSE
drwxr-xr-x. 2 root root 4096 Nov 11 15:54 licenses
-rw-r--r--. 1 root root 42919 Nov 11 15:54 NOTICE
drwxr-xr-x. 9 root root 4096 Nov 11 15:54 python
drwxr-xr-x. 3 root root 17 Nov 11 15:54 R
-rw-r--r--. 1 root root 3952 Nov 11 15:54 README.md
-rw-r--r--. 1 root root 164 Nov 11 15:54 RELEASE
drwxr-xr-x. 2 root root 4096 Nov 11 15:54 sbin
drwxr-xr-x. 2 root root 42 Nov 11 15:54 yarn
[mongodb@mongodb03 spark-2.4.4-bin-hadoop2.7]$
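As an extra sanity check, the trees on two nodes can be compared directly; a sketch assuming passwordless ssh is already in place (no output means the listings match):

# Compare the local Spark tree with the copy on mongodb03
diff <(ls -R /home/spark/software/spark-2.4.4-bin-hadoop2.7/) \
     <(ssh mongodb03 ls -R /home/spark/software/spark-2.4.4-bin-hadoop2.7/)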
Starting the cluster
Start command
[mongodb@mongodb01 ~]$ cd /home/spark/software/spark-2.4.4-bin-hadoop2.7/sbin/
[mongodb@mongodb01 sbin]$ sudo ./start-all.sh
starting org.apache.spark.deploy.master.Master, logging to /home/spark/software/spark-2.4.4-bin-hadoop2.7/logs/spark-root-org.apache.spark.deploy.master.Master-1-mongodb01.out
mongodb03: starting org.apache.spark.deploy.worker.Worker, logging to /home/spark/software/spark-2.4.4-bin-hadoop2.7/logs/spark-root-org.apache.spark.deploy.worker.Worker-1-mongodb03.out
mongodb02: starting org.apache.spark.deploy.worker.Worker, logging to /home/spark/software/spark-2.4.4-bin-hadoop2.7/logs/spark-root-org.apache.spark.deploy.worker.Worker-1-mongodb02.out
mongodb01: starting org.apache.spark.deploy.worker.Worker, logging to /home/spark/software/spark-2.4.4-bin-hadoop2.7/logs/spark-root-org.apache.spark.deploy.worker.Worker-1-mongodb01.out
[mongodb@mongodb01 sbin]$
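Besides the web UI check below, the daemons can be verified from a shell on each node (the ps variant works everywhere; jps requires the JDK development tools):

# List the running Spark daemons on this node
ps -ef | grep 'org.apache.spark.deploy' | grep -v grep
# Expect Master plus Worker on mongodb01, and Worker on the other nodes
jps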
Startup errors
Pay attention to the startup output; if anything fails, use the cat command on the corresponding server to inspect the file referenced after "full log in ......".
Verification
Open http://192.168.153.128:8080/ in a browser. If the page loads and the node count is correct, the cluster is working.
Spark Master at spark://mongodb01:7077
URL: spark://mongodb01:7077
Alive Workers: 3
Cores in use: 3 Total, 0 Used
Memory in use: 2.3 GB Total, 0.0 B Used
Applications: 0 Running, 0 Completed
Drivers: 0 Running, 0 Completed
Status: ALIVE
Workers (3)
Worker Id Address State Cores Memory
worker-20191112151337-192.168.153.130-45924 192.168.153.130:45924 ALIVE 1 (0 Used) 800.0 MB (0.0 B Used)
worker-20191112151339-192.168.153.129-37468 192.168.153.129:37468 ALIVE 1 (0 Used) 800.0 MB (0.0 B Used)
worker-20191112151347-192.168.153.128-43606 192.168.153.128:43606 ALIVE 1 (0 Used) 800.0 MB (0.0 B Used)
Running Applications (0)
Application ID Name Cores Memory per Executor Submitted Time User State Duration
Completed Applications (0)
Application ID Name Cores Memory per Executor Submitted Time User State Duration
Stop command
[mongodb@mongodb01 sbin]$ sudo ./stop-all.sh
mongodb01: stopping org.apache.spark.deploy.worker.Worker
mongodb03: stopping org.apache.spark.deploy.worker.Worker
mongodb02: stopping org.apache.spark.deploy.worker.Worker
stopping org.apache.spark.deploy.master.Master
[mongodb@mongodb01 sbin]$
Issue 1: No such file or directory
mongodb01: /home/spark/software/spark-2.4.4-bin-hadoop2.7/bin/spark-class: line 71: /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.181-7.b13.el7.x86_64/jre/bin/java: No such file or directory
This is a path misconfiguration, most likely because the environments differ across servers (some machines had their JDK upgraded and others did not). Check the actual directory on the affected server and fix the path accordingly:
[mongodb@mongodb01 lib]$ cd /usr/lib/jvm
[mongodb@mongodb01 jvm]$ ll
total 0
drwxr-xr-x. 3 root root 17 Oct 29 16:22 java-1.7.0-openjdk-1.7.0.191-2.6.15.5.el7.x86_64
drwxr-xr-x. 4 root root 100 Oct 29 16:16 java-1.7.0-openjdk-1.7.0.241-2.6.20.0.el7_7.x86_64
drwxr-xr-x. 3 root root 17 Oct 23 00:34 java-1.8.0-openjdk-1.8.0.232.b09-0.el7_7.x86_64
lrwxrwxrwx. 1 root root 21 Oct 29 16:22 jre -> /etc/alternatives/jre
lrwxrwxrwx. 1 root root 27 Oct 29 16:22 jre-1.7.0 -> /etc/alternatives/jre_1.7.0
lrwxrwxrwx. 1 root root 35 Oct 29 16:22 jre-1.7.0-openjdk -> /etc/alternatives/jre_1.7.0_openjdk
lrwxrwxrwx. 1 root root 54 Oct 29 16:16 jre-1.7.0-openjdk-1.7.0.241-2.6.20.0.el7_7.x86_64 -> java-1.7.0-openjdk-1.7.0.241-2.6.20.0.el7_7.x86_64/jre
lrwxrwxrwx. 1 root root 27 Oct 29 16:22 jre-1.8.0 -> /etc/alternatives/jre_1.8.0
lrwxrwxrwx. 1 root root 35 Oct 29 16:22 jre-1.8.0-openjdk -> /etc/alternatives/jre_1.8.0_openjdk
lrwxrwxrwx. 1 root root 51 Oct 29 16:15 jre-1.8.0-openjdk-1.8.0.232.b09-0.el7_7.x86_64 -> java-1.8.0-openjdk-1.8.0.232.b09-0.el7_7.x86_64/jre
lrwxrwxrwx. 1 root root 29 Oct 29 16:22 jre-openjdk -> /etc/alternatives/jre_openjdk
[mongodb@mongodb01 jvm]$ cd /home/spark/software/spark-2.4.4-bin-hadoop2.7/conf
[mongodb@mongodb01 conf]$ sudo vim spark-env.sh
In spark-env.sh, change java-1.8.0-openjdk-1.8.0.181-7.b13.el7.x86_64
to java-1.8.0-openjdk-1.8.0.232.b09-0.el7_7.x86_64.
Issue 2: No route to host
mongodb02: failed to launch: nice -n 0 /home/spark/software/spark-2.4.4-bin-hadoop2.7/bin/spark-class org.apache.spark.deploy.worker.Worker --webui-port 8081 spark://mongodb01:7077
mongodb02: full log in /home/spark/software/spark-2.4.4-bin-hadoop2.7/logs/spark-root-org.apache.spark.deploy.worker.Worker-1-mongodb02.out
# Inspect the log file
Caused by: java.net.NoRouteToHostException: No route to host
Open the required ports in the firewall on each server:
[mongodb@mongodb01 sbin]$ sudo firewall-cmd --zone=public --add-port=8080/tcp --permanent
success
[mongodb@mongodb01 sbin]$ sudo firewall-cmd --zone=public --add-port=8081/tcp --permanent
success
[mongodb@mongodb01 sbin]$ sudo firewall-cmd --zone=public --add-port=7077/tcp --permanent
success
[mongodb@mongodb01 sbin]$ sudo firewall-cmd --reload
success
[mongodb@mongodb01 sbin]$ sudo firewall-cmd --list-all
public (active)
target: default
icmp-block-inversion: no
interfaces: ens33
sources:
services: dhcpv6-client ssh
ports: 27017/tcp 8080/tcp 7077/tcp
protocols:
masquerade: no
forward-ports:
source-ports:
icmp-blocks:
rich rules:
[mongodb@mongodb01 sbin]$
Issue 3: Processes already running, or workers missing from the web page
org.apache.spark.deploy.master.Master running as process 52557. Stop it first.
mongodb02: org.apache.spark.deploy.worker.Worker running as process 11938. Stop it first.
mongodb03: org.apache.spark.deploy.worker.Worker running as process 58232. Stop it first.
mongodb01: org.apache.spark.deploy.worker.Worker running as process 52650. Stop it first.
Stop the corresponding processes on each server (look especially for the ones on ports 8080, 8081, and 7077), then restart the service:
[mongodb@mongodb01 sbin]$ sudo netstat -ntl
[mongodb@mongodb01 sbin]$ sudo kill -9 53435
[mongodb@mongodb01 sbin]$ sudo kill -9 53523
[mongodb@mongodb01 sbin]$ sudo ./start-all.sh
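To see which process owns a port before killing it, netstat's -p flag adds the PID/program column (same net-tools package as the netstat call above):

# Show listeners on the Spark ports together with their PIDs
sudo netstat -ntlp | grep -E '8080|8081|7077'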
Issue 4: Startup succeeds, but no workers appear under the Master on the page
The machine where the start command was run is probably not the one configured as the Master (SPARK_MASTER_IP) in spark-env.sh. Run the stop command, switch to the Master machine, and start the cluster from there.