This environment uses CentOS 8.
Hostname | Services | IP | Role |
---|---|---|---|
hadoop301 | HDFS NameNode, HDFS DataNode, YARN ResourceManager, YARN NodeManager, JournalNode, ZooKeeper | 192.168.142.101 | YARN and HDFS master, worker |
hadoop302 | HDFS NameNode, HDFS DataNode, YARN ResourceManager, YARN NodeManager, JournalNode, ZooKeeper | 192.168.142.102 | YARN and HDFS master, worker |
hadoop303 | HDFS NameNode, HDFS DataNode, YARN NodeManager, JournalNode, ZooKeeper | 192.168.142.103 | HDFS master, worker |
Prerequisites
Complete the common Linux setup first
Configure passwordless SSH login among the three machines
Download the hadoop-3.2.1 and zookeeper-3.5.6 tarballs
Installation steps
1 Run on all machines
- 1.1 Install the JDK
yum install -y java-1.8.0-openjdk-devel.x86_64 java-1.8.0-openjdk.x86_64
- 1.2 Edit /etc/profile and append the following at the end
export HADOOP_HOME=/opt/hadoop-3.2.1
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
export JAVA_HOME=/usr/lib/jvm/jre-1.8.0
export ZOOKEEPER_HOME=/opt/zookeeper-3.5.6
export PATH=$PATH:$ZOOKEEPER_HOME/bin
- 1.3 Create the directories; these particular paths are the ones referenced by the configuration files below.
mkdir -p /tmp/hadoop/tmpdir
mkdir -p /tmp/hadoop/journalnode/data
mkdir -p /tmp/hadoop/hdfs/namenode
mkdir -p /tmp/hadoop/hdfs/datanode
mkdir -p /tmp/zookeeper
echo 1 > /tmp/zookeeper/myid # on hadoop301
echo 2 > /tmp/zookeeper/myid # on hadoop302
echo 3 > /tmp/zookeeper/myid # on hadoop303
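Instead of typing a different myid on each machine, the id can be derived from the hostname. A minimal sketch, assuming the hadoop30N naming used in this guide (the derivation itself is a hypothetical helper, not part of ZooKeeper):

```shell
# Derive the ZooKeeper myid from the hadoop30N hostname suffix (assumed naming).
# In practice you would use: host=$(hostname)
host=hadoop302
id=${host#hadoop30}        # strip the common prefix: hadoop302 -> 2
echo "$id"                 # this value is what goes into /tmp/zookeeper/myid
```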
- 1.4 Set up /etc/hosts on every machine
192.168.142.101 hadoop301
192.168.142.102 hadoop302
192.168.142.103 hadoop303
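Since the addresses above follow a fixed pattern, the hosts entries can also be generated rather than typed; a small sketch using the 192.168.142.10N scheme from this guide:

```shell
# Emit the /etc/hosts entries for the three nodes used in this guide.
# Append the output to /etc/hosts on every machine.
for i in 1 2 3; do
  printf '192.168.142.10%d hadoop30%d\n' "$i" "$i"
done
```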
2 Run on hadoop301
2.1 Install ZooKeeper
2.1.1 Extract ZooKeeper into /opt
tar -zxf zookeeper-3.5.6.tar.gz -C /opt
2.1.2 Configure zoo.cfg
cd /opt/zookeeper-3.5.6/conf
mv zoo_sample.cfg zoo.cfg
vim zoo.cfg
zoo.cfg should read as follows
# The number of milliseconds of each tick. The basic heartbeat unit (ms);
# nearly every other ZK time setting is an integer multiple of it.
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
# Expressed in ticks: how long followers may take to sync with the leader
# after an election. With many followers, or a leader holding a lot of data,
# the sync takes longer and this value should grow accordingly. It is also the
# maximum wait (setSoTimeout) for followers and observers when they begin
# syncing the leader's data.
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
# Also expressed in ticks, and easy to confuse with the above: the maximum
# wait for follower/observer-to-leader exchanges, but during normal request
# forwarding and ping traffic after the initial sync has completed.
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
# Where in-memory database snapshots are stored; unless a separate
# transaction-log location (dataLogDir) is set, transaction logs go here too.
# Keeping the two on different devices is recommended.
dataDir=/tmp/zookeeper
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
server.1=hadoop301:2888:3888
server.2=hadoop302:2888:3888
server.3=hadoop303:2888:3888
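A common misconfiguration is a myid that does not match the server.N lines above. The id for the local host can be extracted from zoo.cfg and compared against /tmp/zookeeper/myid; a sketch with the server list inlined as a variable so the snippet is self-contained (in practice, read the real file):

```shell
# Extract this host's server id from the zoo.cfg server.N lines.
# In practice: cfg=$(cat /opt/zookeeper-3.5.6/conf/zoo.cfg); host=$(hostname)
cfg='server.1=hadoop301:2888:3888
server.2=hadoop302:2888:3888
server.3=hadoop303:2888:3888'
host=hadoop302
id=$(printf '%s\n' "$cfg" | awk -F'[.=:]' -v h="$host" '$1=="server" && $3==h {print $2}')
echo "$id"    # should equal the contents of /tmp/zookeeper/myid on this host
```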
2.1.3 Sync to hadoop302 and hadoop303
yum install -y rsync
rsync -auvp /opt/zookeeper-3.5.6 root@hadoop302:/opt
rsync -auvp /opt/zookeeper-3.5.6 root@hadoop303:/opt
2.2 Install Hadoop
- 2.2.1 Upload hadoop-3.2.1.tar.gz to /opt and extract it
- 2.2.2 Edit /opt/hadoop-3.2.1/etc/hadoop/core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://mycluster</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/tmp/hadoop/tmpdir</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>hadoop301:2181,hadoop302:2181,hadoop303:2181</value>
</property>
</configuration>
- 2.2.3 Edit /opt/hadoop-3.2.1/etc/hadoop/hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<!-- hdfs HA configuration-->
<!-- All defaults are documented at https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml ("stable" can be replaced by a version such as r3.2.1) -->
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<!-- dfs.nameservices must match the name used in fs.defaultFS in core-site.xml -->
<property>
<name>dfs.nameservices</name>
<value>mycluster</value>
</property>
<!-- The NameNodes in the cluster; three are defined here: nn1, nn2, nn3 -->
<property>
<name>dfs.ha.namenodes.mycluster</name>
<value>nn1,nn2,nn3</value>
</property>
<!-- RPC address of each NameNode; the suffixes must match the dfs.ha.namenodes.mycluster list -->
<property>
<name>dfs.namenode.rpc-address.mycluster.nn1</name>
<value>hadoop301:8020</value>
</property>
<property>
<name>dfs.namenode.rpc-address.mycluster.nn2</name>
<value>hadoop302:8020</value>
</property>
<property>
<name>dfs.namenode.rpc-address.mycluster.nn3</name>
<value>hadoop303:8020</value>
</property>
<!-- HTTP address of each NameNode; the suffixes must match the dfs.ha.namenodes.mycluster list -->
<property>
<name>dfs.namenode.http-address.mycluster.nn1</name>
<value>hadoop301:9870</value>
</property>
<property>
<name>dfs.namenode.http-address.mycluster.nn2</name>
<value>hadoop302:9870</value>
</property>
<property>
<name>dfs.namenode.http-address.mycluster.nn3</name>
<value>hadoop303:9870</value>
</property>
<!-- Where NameNode edit logs are stored on the JournalNode quorum -->
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://hadoop301:8485;hadoop302:8485;hadoop303:8485/mycluster</value>
</property>
<!-- Where each JournalNode keeps its data on local disk -->
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/tmp/hadoop/journalnode/data</value>
</property>
<!-- Proxy class clients use to locate the active NameNode during failover -->
<property>
<name>dfs.client.failover.proxy.provider.mycluster</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<!-- Fencing methods; to configure several, put one per line -->
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<!-- The sshfence method requires passwordless SSH -->
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/root/.ssh/id_rsa</value>
</property>
<!-- sshfence connection timeout -->
<property>
<name>dfs.ha.fencing.ssh.connect-timeout</name>
<value>30000</value>
</property>
<property>
<name>dfs.journalnode.http-address</name>
<value>0.0.0.0:8480</value>
</property>
<property>
<name>dfs.journalnode.rpc-address</name>
<value>0.0.0.0:8485</value>
</property>
<!-- hdfs HA configuration end-->
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/tmp/hadoop/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/tmp/hadoop/hdfs/datanode</value>
</property>
<!-- Enable the WebHDFS REST interface -->
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
<!-- Disable permission checking so clients such as Hive can connect directly -->
<property>
<name>dfs.permissions.enabled</name>
<value>false</value>
</property>
</configuration>
2.2.4 Edit /opt/hadoop-3.2.1/etc/hadoop/yarn-site.xml
<?xml version="1.0"?>
<configuration>
<!-- yarn ha configuration-->
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<!-- Cluster id -->
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>cluster1</value>
</property>
<!-- This machine's id in the HA pair; must be one of the values in yarn.resourcemanager.ha.rm-ids. Remove this property on machines that are not ResourceManagers. -->
<property>
<name>yarn.resourcemanager.ha.id</name>
<value>rm1</value>
</property>
<!-- The id list for the HA ResourceManagers -->
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>
<!-- Which machines the HA ResourceManagers run on -->
<property>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>hadoop301</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>hadoop302</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address.rm1</name>
<value>hadoop301:8088</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address.rm2</name>
<value>hadoop302:8088</value>
</property>
<property>
<name>hadoop.zk.address</name>
<value>hadoop301:2181,hadoop302:2181,hadoop303:2181</value>
</property>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
2.2.5 Edit /opt/hadoop-3.2.1/etc/hadoop/mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.application.classpath</name>
<value>
/opt/hadoop-3.2.1/share/hadoop/common/*,
/opt/hadoop-3.2.1/share/hadoop/common/lib/*,
/opt/hadoop-3.2.1/share/hadoop/hdfs/*,
/opt/hadoop-3.2.1/share/hadoop/hdfs/lib/*,
/opt/hadoop-3.2.1/share/hadoop/mapreduce/*,
/opt/hadoop-3.2.1/share/hadoop/mapreduce/lib/*,
/opt/hadoop-3.2.1/share/hadoop/yarn/*,
/opt/hadoop-3.2.1/share/hadoop/yarn/lib/*
</value>
</property>
</configuration>
2.2.6 Edit /opt/hadoop-3.2.1/etc/hadoop/hadoop-env.sh
# The java implementation to use. By default, this environment
# variable is REQUIRED on ALL platforms except OS X!
# export JAVA_HOME=
export JAVA_HOME=/usr/lib/jvm/jre-1.8.0
# Some parts of the shell code may do special things dependent upon
# the operating system. We have to set this here. See the next
# section as to why....
export HADOOP_OS_TYPE=${HADOOP_OS_TYPE:-$(uname -s)}
export HADOOP_PID_DIR=/opt/hadoop-3.2.1/pid
export HADOOP_LOG_DIR=/var/log/hadoop
2.2.7 Edit /opt/hadoop-3.2.1/etc/hadoop/yarn-env.sh
# Specify the max heapsize for the ResourceManager. If no units are
# given, it will be assumed to be in MB.
# This value will be overridden by an Xmx setting specified in either
# HADOOP_OPTS and/or YARN_RESOURCEMANAGER_OPTS.
# Default is the same as HADOOP_HEAPSIZE_MAX
#export YARN_RESOURCEMANAGER_HEAPSIZE=
export JAVA_HOME=/usr/lib/jvm/jre-1.8.0
2.2.8 Edit /opt/hadoop-3.2.1/sbin/start-dfs.sh and /opt/hadoop-3.2.1/sbin/stop-dfs.sh, adding at the top of each script
HDFS_NAMENODE_USER=root
HDFS_DATANODE_USER=root
HDFS_JOURNALNODE_USER=root
HDFS_ZKFC_USER=root
2.2.9 Edit /opt/hadoop-3.2.1/sbin/start-yarn.sh and /opt/hadoop-3.2.1/sbin/stop-yarn.sh, adding at the top of each script
YARN_RESOURCEMANAGER_USER=root
YARN_NODEMANAGER_USER=root
2.2.10 Set /opt/hadoop-3.2.1/etc/hadoop/workers to the following
hadoop301
hadoop302
hadoop303
2.2.11 Copy hadoop-3.2.1 to hadoop302 and hadoop303
rsync -auvp /opt/hadoop-3.2.1 root@hadoop302:/opt
rsync -auvp /opt/hadoop-3.2.1 root@hadoop303:/opt
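The pairs of rsync invocations used throughout this guide generalize to a loop over the remote hosts, which scales better as nodes are added. A dry-run sketch (echo prints each command instead of running it; remove the echo to execute):

```shell
# Print the sync command for every other node (remove `echo` to actually run).
for h in hadoop302 hadoop303; do
  echo rsync -auvp /opt/hadoop-3.2.1 "root@${h}:/opt"
done
```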
3 Run on hadoop302
Change yarn.resourcemanager.ha.id in yarn-site.xml to the following
<property>
<name>yarn.resourcemanager.ha.id</name>
<value>rm2</value>
</property>
4 Run on hadoop303
Delete the following property (hadoop303 does not run a ResourceManager)
<property>
<name>yarn.resourcemanager.ha.id</name>
<value>rm1</value>
</property>
5 Startup
Startup order: ZooKeeper -> JournalNode -> format NameNode -> initialize the ZKFC znode (hdfs zkfc -formatZK) -> NameNode -> DataNode -> ResourceManager -> NodeManager
5.1 Start ZooKeeper
Run on all machines, in the order hadoop301, hadoop302, hadoop303
# Note: if your login shell is zsh, switch back to bash first
#chsh -s /usr/bin/bash
# To run directly from zsh instead, use the following; emulate is a zsh builtin
# emulate sh -c '/opt/zookeeper-3.5.6/bin/zkServer.sh start'
/opt/zookeeper-3.5.6/bin/zkServer.sh start
/opt/zookeeper-3.5.6/bin/zkServer.sh status
5.2 Start the JournalNodes
Run on all machines, in the order hadoop301, hadoop302, hadoop303
# Note: if your login shell is zsh, switch back to bash first
#chsh -s /usr/bin/bash
/opt/hadoop-3.2.1/sbin/hadoop-daemon.sh start journalnode
# or: /opt/hadoop-3.2.1/bin/hdfs --daemon start journalnode
5.3 Format the NameNode
Run on hadoop301
# Note: if your login shell is zsh, switch back to bash first
#chsh -s /usr/bin/bash
/opt/hadoop-3.2.1/bin/hdfs namenode -format
# Sync the formatted metadata to the other NameNodes, otherwise they may fail to start
rsync -auvp /tmp/hadoop/hdfs/namenode/current root@hadoop302:/tmp/hadoop/hdfs/namenode
rsync -auvp /tmp/hadoop/hdfs/namenode/current root@hadoop303:/tmp/hadoop/hdfs/namenode
# Initialize the HA state znode in ZooKeeper
hdfs zkfc -formatZK
5.4 Stop the JournalNodes
Run on all machines
/opt/hadoop-3.2.1/sbin/hadoop-daemon.sh stop journalnode
# or: /opt/hadoop-3.2.1/bin/hdfs --daemon stop journalnode
5.5 Start Hadoop
Run on hadoop301
# Must be run under bash; zsh's sh-emulation mode will not work here
start-dfs.sh
start-yarn.sh
hdfs haadmin -getAllServiceState
# Processes you should see after a normal start (check with jps)
2193 QuorumPeerMain
5252 JournalNode
4886 NameNode
5016 DataNode
5487 DFSZKFailoverController
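Whether all expected daemons came up can also be checked mechanically against the jps output. A sketch, with the sample listing above inlined so the snippet is self-contained (in practice, capture jps directly):

```shell
# Verify every expected daemon name appears in the jps output.
# In practice: jps_out=$(jps)
jps_out='2193 QuorumPeerMain
5252 JournalNode
4886 NameNode
5016 DataNode
5487 DFSZKFailoverController'
ok=1
for p in QuorumPeerMain JournalNode NameNode DataNode DFSZKFailoverController; do
  printf '%s\n' "$jps_out" | grep -qw "$p" || { echo "missing: $p"; ok=0; }
done
[ "$ok" -eq 1 ] && echo "all expected daemons running"
```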
Hadoop Classpath
Many other compute engines build on Hadoop's HDFS and YARN, and they locate Hadoop through the Hadoop classpath. The following command shows what is on it:
/opt/hadoop-3.2.1/bin/hadoop classpath
The "Spark without Hadoop" packages require you to point Spark at an existing Hadoop installation's classpath; add one of the following to spark-env.sh:
### in conf/spark-env.sh ###
# If 'hadoop' binary is on your PATH
export SPARK_DIST_CLASSPATH=$(hadoop classpath)
# With explicit path to 'hadoop' binary
export SPARK_DIST_CLASSPATH=$(/path/to/hadoop/bin/hadoop classpath)
# Passing a Hadoop configuration directory
export SPARK_DIST_CLASSPATH=$(hadoop --config /path/to/configs classpath)
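The colon-separated classpath printed by hadoop classpath is easier to read one entry per line. A sketch using a shortened sample value (in practice, substitute hcp=$(/opt/hadoop-3.2.1/bin/hadoop classpath); the hcp variable name is just for this example):

```shell
# Split a colon-separated classpath into one entry per line for inspection.
# In practice: hcp=$(/opt/hadoop-3.2.1/bin/hadoop classpath)
hcp='/opt/hadoop-3.2.1/etc/hadoop:/opt/hadoop-3.2.1/share/hadoop/common/*:/opt/hadoop-3.2.1/share/hadoop/hdfs/*'
printf '%s\n' "$hcp" | tr ':' '\n'
```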
Troubleshooting
- All NameNodes are standby
This usually means the DFSZKFailoverController did not start. The typical cause is that hdfs zkfc -formatZK failed initially, or its data in ZooKeeper was later corrupted; re-run hdfs zkfc -formatZK to reinitialize.