HDFS & YARN HA Cluster Installation

Author: 吃货大米饭 | Published 2019-08-19 18:15

I. Versions

Component  Version
CentOS     CentOS-7-x86_64-DVD-1611.iso
JDK        jdk-8u45-linux-x64.gz
Hadoop     hadoop-2.6.0-cdh5.15.1.tar.gz
Zookeeper  zookeeper-3.4.6.tar.gz

II. Host Plan

IP               Host       Installed software  Processes
192.168.174.121  hadoop001  hadoop, zookeeper   NameNode (active), ZKFC, JournalNode, DataNode, ResourceManager (active), JobHistoryServer, NodeManager, QuorumPeerMain
192.168.174.122  hadoop002  hadoop, zookeeper   NameNode (standby), ZKFC, JournalNode, DataNode, ResourceManager (standby), NodeManager, QuorumPeerMain
192.168.174.123  hadoop003  hadoop, zookeeper   JournalNode, DataNode, NodeManager, QuorumPeerMain

III. Directory Plan

User    Name         Path                      Purpose
hadoop  app          /home/hadoop/app          Final software installation directory
hadoop  data         /home/hadoop/data         Test data
hadoop  lib          /home/hadoop/lib          Developed jars
hadoop  maven_repos  /home/hadoop/maven_repos  Local Maven repository
hadoop  software     /home/hadoop/software     Software packages
hadoop  script       /home/hadoop/script       Scripts
hadoop  source       /home/hadoop/source       Source code
hadoop  tmp          /home/hadoop/tmp          Temporary files

IV. Environment Installation

1. CentOS 7.2, hostname, static IP, firewall, and outbound network access (all three nodes)

See my earlier post for details: https://www.jianshu.com/p/482cbff461bf

2. Bind IP addresses to hostnames (all three nodes)

[root@hadoop001 ~]# vi /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

192.168.174.121 hadoop001
192.168.174.122 hadoop002
192.168.174.123 hadoop003

3. Disable SELinux on all nodes

vi /etc/selinux/config 
Change SELINUX=enforcing to SELINUX=disabled.
A reboot is required for the change to take effect.
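If you prefer not to reboot right away, SELinux can also be relaxed for the running session; a minimal sketch using standard CentOS 7 commands (run as root on each node):

# Check the current mode and relax it for this session; the config-file edit
# above still makes the change permanent after the next reboot.
getenforce        # prints Enforcing / Permissive / Disabled
setenforce 0      # switch to permissive until the next reboot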

4. Use a consistent time zone and synchronized clocks on all nodes

1. Time zone setting

[root@hadoop001 ~]# date
Mon Aug 19 10:54:12 CST 2019
[root@hadoop001 ~]# timedatectl
      Local time: Mon 2019-08-19 10:55:58 CST
  Universal time: Mon 2019-08-19 02:55:58 UTC
        RTC time: Mon 2019-08-19 02:55:53
       Time zone: Asia/Shanghai (CST, +0800)
     NTP enabled: n/a
NTP synchronized: no
 RTC in local TZ: no
      DST active: n/a
#Set the Asia/Shanghai time zone on all nodes 
[root@hadoop001 ~]# timedatectl set-timezone Asia/Shanghai 
[root@hadoop002 ~]# timedatectl set-timezone Asia/Shanghai 
[root@hadoop003 ~]# timedatectl set-timezone Asia/Shanghai

2. Clock synchronization

#Install ntp on all nodes
[root@hadoop001 ~]# yum install -y ntp

#Use hadoop001 as the NTP server node
[root@hadoop001 ~]# vi /etc/ntp.conf

#server 0.centos.pool.ntp.org iburst
#server 1.centos.pool.ntp.org iburst
#server 2.centos.pool.ntp.org iburst
#server 3.centos.pool.ntp.org iburst

#time 
server 0.asia.pool.ntp.org 
server 1.asia.pool.ntp.org 
server 2.asia.pool.ntp.org 
server 3.asia.pool.ntp.org 
#Fall back to the local hardware clock when external time sources are unreachable 
server 127.127.1.0 iburst   # local clock
#Allow machines on this subnet to synchronize time; 192.168.174 is your network segment
restrict 192.168.174.0 mask 255.255.255.0 nomodify notrap

#Start ntpd and check its status 
[root@hadoop001 ~]# systemctl start ntpd
[root@hadoop001 ~]# systemctl status ntpd
● ntpd.service - Network Time Service
   Loaded: loaded (/usr/lib/systemd/system/ntpd.service; disabled; vendor preset: disabled)
   Active: active (running) since Mon 2019-08-19 11:13:20 CST; 19s ago
  Process: 9154 ExecStart=/usr/sbin/ntpd -u ntp:ntp $OPTIONS (code=exited, status=0/SUCCESS)
 Main PID: 9155 (ntpd)
   CGroup: /system.slice/ntpd.service
           └─9155 /usr/sbin/ntpd -u ntp:ntp -g

Aug 19 11:13:20 hadoop001 ntpd[9155]: Listen normally on 2 lo 127.0.0.1 UDP 123
Aug 19 11:13:20 hadoop001 ntpd[9155]: Listen normally on 3 ens33 192.168.174.121 UDP 123
Aug 19 11:13:20 hadoop001 ntpd[9155]: Listen normally on 4 ens33 fe80::f3a2:882:b52f:d0b UDP 123
Aug 19 11:13:20 hadoop001 ntpd[9155]: Listen normally on 5 lo ::1 UDP 123
Aug 19 11:13:20 hadoop001 ntpd[9155]: Listening on routing socket on fd #22 for interface updates
Aug 19 11:13:20 hadoop001 systemd[1]: Started Network Time Service.
Aug 19 11:13:20 hadoop001 ntpd[9155]: 0.0.0.0 c016 06 restart
Aug 19 11:13:20 hadoop001 ntpd[9155]: 0.0.0.0 c012 02 freq_set kernel 0.000 PPM
Aug 19 11:13:20 hadoop001 ntpd[9155]: 0.0.0.0 c011 01 freq_not_set
Aug 19 11:13:21 hadoop001 ntpd[9155]: 0.0.0.0 c514 04 freq_mode

#Verify 
[root@hadoop001 ~]# ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*LOCAL(0)        .LOCL.           5 l   12   64    3    0.000    0.000   0.000
 send.mx.cdnetwo 216.239.35.8     2 u   11   64    3   48.414  -5021.1  16.238
 ntp.gnc.am      42.204.179.159   2 u    9   64    3  377.631  -5058.7  15.043
 27.54.120.10    207.148.72.47    3 u   10   64    1  447.310  -5079.8   0.000
 202.28.116.236  .INIT.          16 u    -   64    0    0.000    0.000   0.000

#Stop and disable the ntpd service on the other (client) nodes 
[root@hadoop002 ~]# systemctl stop ntpd
[root@hadoop002 ~]# systemctl disable ntpd
[root@hadoop002 ~]# /usr/sbin/ntpdate hadoop001
19 Aug 11:17:35 ntpdate[9154]: step time server 192.168.174.121 offset 0.696211 sec
#Have the other nodes sync their time against hadoop001 every minute
[root@hadoop002 ~]# crontab -e
*/1 * * * * /usr/sbin/ntpdate hadoop001
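As a quick sanity check once the cron entry is in place, each client node can query (without adjusting) its offset from hadoop001; a minimal sketch, run on hadoop002 and hadoop003:

# -q only queries and prints the offset; it should stay close to zero
# after the cron job has run a few times.
/usr/sbin/ntpdate -q hadoop001
crontab -l        # confirm the */1 entry is installed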

5. Create the hadoop user and the corresponding directories (all three nodes)

[root@hadoop001 ~]# useradd hadoop
[root@hadoop001 ~]# su - hadoop
[hadoop@hadoop001 ~]$ mkdir app data lib maven_repos software script source tmp

6. Install lrzsz and upload the installation packages to the software directory

[root@hadoop001 ~]# yum install -y lrzsz
[root@hadoop001 ~]# su - hadoop
[hadoop@hadoop001 ~]$ cd software/
[hadoop@hadoop001 software]$ rz -be

7. Configure passwordless SSH login between the three machines (all three nodes)

[hadoop@hadoop001 software]$ ssh-keygen
[hadoop@hadoop001 .ssh]$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
[hadoop@hadoop001 .ssh]$ chmod 600 ~/.ssh/authorized_keys
#Send every machine's public key to your local Windows host, merge them into a single authorized_keys file, then copy that file back to each machine to replace its own authorized_keys.

#Test
[hadoop@hadoop001 .ssh]$ ssh hadoop001 date
Mon Aug 19 14:56:54 CST 2019
[hadoop@hadoop001 .ssh]$ ssh hadoop002 date
Mon Aug 19 14:57:01 CST 2019
[hadoop@hadoop001 .ssh]$ ssh hadoop003 date
Mon Aug 19 14:57:08 CST 2019

Note: Linux may not handle files edited on Windows correctly, so run dos2unix on them; dos2unix is a utility that converts Windows-format files to Unix/Linux format.
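If you would rather skip the round trip through Windows, the merge can also be done directly from hadoop001; a minimal sketch, assuming every node has already run ssh-keygen as the hadoop user and that password login is still enabled for the initial copy:

# Run on hadoop001 as the hadoop user. Collect each node's public key into one
# authorized_keys file, then push it out to the other nodes (each ssh/scp will
# prompt for the hadoop password once).
for host in hadoop001 hadoop002 hadoop003; do
  ssh hadoop@$host cat .ssh/id_rsa.pub
done > ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
for host in hadoop002 hadoop003; do
  scp ~/.ssh/authorized_keys hadoop@$host:.ssh/authorized_keys
done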

8. Install the JDK (all three nodes)

#Copy the packages to the other machines
[hadoop@hadoop001 software]$ scp ./* hadoop002:/home/hadoop/software/
hadoop-2.6.0-cdh5.15.1.tar.gz                                                                       100%  241MB 120.5MB/s   00:02    
jdk-8u45-linux-x64.gz                                                                               100%  165MB 165.2MB/s   00:00    
zookeeper-3.4.6.tar.gz

#As root, extract jdk-8u45-linux-x64.gz into /usr/java/
[root@hadoop001 ~]# mkdir /usr/java/
[root@hadoop001 ~]# tar -zxvf /home/hadoop/software/jdk-8u45-linux-x64.gz -C /usr/java/

#Fix the owner and group of the extracted files
[root@hadoop001 java]# chown -R root:root jdk1.8.0_45/

[root@hadoop001 java]# echo "export JAVA_HOME=/usr/java/jdk1.8.0_45" >> /etc/profile 
[root@hadoop001 java]# echo "export PATH=${JAVA_HOME}/bin:${PATH}" >> /etc/profile 
[root@hadoop001 java]# source /etc/profile 
[root@hadoop001 java]# which java

9. Install ZooKeeper (all three nodes)

[hadoop@hadoop001 software]$ tar -zxvf zookeeper-3.4.6.tar.gz -C /home/hadoop/app/
[hadoop@hadoop001 app]$ ln -s zookeeper-3.4.6 zookeeper

[hadoop@hadoop001 conf]$ cp zoo_sample.cfg zoo.cfg

[hadoop@hadoop001 conf]$ vi zoo.cfg 
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/home/hadoop/data/zookeeper
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1

server.1=hadoop001:2888:3888 
server.2=hadoop002:2888:3888
server.3=hadoop003:2888:3888
"zoo.cfg" 32L, 1023C written


[hadoop@hadoop001 data]$ mkdir zookeeper
#Note: there is a space on each side of the > symbol
[hadoop@hadoop001 data]$ echo 1 > /home/hadoop/data/zookeeper/myid
[hadoop@hadoop002 data]$ echo 2 > /home/hadoop/data/zookeeper/myid
[hadoop@hadoop003 data]$ echo 3 > /home/hadoop/data/zookeeper/myid
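With the myid files in place the ensemble can be brought up and checked (the same start command is used again in section 12); a minimal sketch, run on each of the three nodes as the hadoop user:

# Expect "Mode: leader" on exactly one node and "Mode: follower" on the other two.
~/app/zookeeper/bin/zkServer.sh start
~/app/zookeeper/bin/zkServer.sh status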

10. Install Hadoop HA (all three nodes)

[hadoop@hadoop001 software]$ tar -zxvf hadoop-2.6.0-cdh5.15.1.tar.gz -C /home/hadoop/app/
[hadoop@hadoop001 app]$ ln -s hadoop-2.6.0-cdh5.15.1 hadoop

Configure hadoop-env.sh

[hadoop@hadoop001 hadoop]$ vi hadoop-env.sh
export JAVA_HOME=/usr/java/jdk1.8.0_45
export HADOOP_CONF_DIR=/home/hadoop/app/hadoop/etc/hadoop

Configure core-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <!-- YARN needs fs.defaultFS to specify the NameNode URI -->
        <property>
                <name>fs.defaultFS</name>
                <value>hdfs://ruozeclusterg7</value>
        </property>
        <!--============================== Trash mechanism ======================================= -->
        <property>
                <!-- How often (minutes) the checkpointer running on the NameNode creates a checkpoint from Current; default 0, which means the fs.trash.interval value is used -->
                <name>fs.trash.checkpoint.interval</name>
                <value>0</value>
        </property>
        <property>
                <!-- Number of minutes before a checkpoint under .Trash is deleted; the server-side setting takes precedence over the client's; default 0, never delete -->
                <name>fs.trash.interval</name>
                <value>1440</value>
        </property>

         <!-- Hadoop temporary directory. hadoop.tmp.dir is the base setting that many other paths depend on; if hdfs-site.xml does not configure the namenode and datanode storage locations, they default to this path -->
        <property>   
                <name>hadoop.tmp.dir</name>
                <value>/home/hadoop/tmp/hadoop</value>
        </property>

         <!-- ZooKeeper quorum addresses -->
        <property>
                <name>ha.zookeeper.quorum</name>
                <value>hadoop001:2181,hadoop002:2181,hadoop003:2181</value>
        </property>
         <!-- ZooKeeper session timeout, in milliseconds -->
        <property>
                <name>ha.zookeeper.session-timeout.ms</name>
                <value>2000</value>
        </property>

        <!-- The second "hadoop" is the user under which Hadoop is installed -->
        <property>
           <name>hadoop.proxyuser.hadoop.hosts</name>
           <value>*</value> 
        </property> 
        <property> 
            <name>hadoop.proxyuser.hadoop.groups</name> 
            <value>*</value> 
       </property> 


      <property>
          <name>io.compression.codecs</name>
          <value>org.apache.hadoop.io.compress.GzipCodec,
            org.apache.hadoop.io.compress.DefaultCodec,
            org.apache.hadoop.io.compress.BZip2Codec,
            org.apache.hadoop.io.compress.SnappyCodec
          </value>
      </property>
</configuration>
[hadoop@hadoop001 hadoop]$ mkdir /home/hadoop/tmp/hadoop
[hadoop@hadoop001 hadoop]$ chmod 777 /home/hadoop/tmp/hadoop
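With fs.trash.interval set to 1440, a delete from the HDFS shell is moved into the user's trash for a day instead of being removed immediately; a minimal sketch to try once HDFS is up (section 12), using a hypothetical test file:

# /tmp/trash-test is just an example path.
hdfs dfs -touchz /tmp/trash-test
hdfs dfs -rm /tmp/trash-test                                          # moved to .Trash, not deleted
hdfs dfs -ls /user/hadoop/.Trash/Current/tmp                          # the file shows up here
hdfs dfs -rm -skipTrash /user/hadoop/.Trash/Current/tmp/trash-test    # permanent delete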

Configure hdfs-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <!-- HDFS superuser group -->
    <property>
        <name>dfs.permissions.superusergroup</name>
        <value>hadoop</value>
    </property>

    <!-- Enable WebHDFS -->
    <property>
        <name>dfs.webhdfs.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/home/hadoop/data/dfs/name</value>
        <description>Local directory where the NameNode stores the name table (fsimage); change as needed</description>
    </property>
    <property>
        <name>dfs.namenode.edits.dir</name>
        <value>${dfs.namenode.name.dir}</value>
        <description>Local directory where the NameNode stores the transaction files (edits); change as needed</description>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/home/hadoop/data/dfs/data</value>
        <description>Local directory where the DataNode stores blocks; change as needed</description>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>
    <!-- Block size 128 MB (the default) -->
    <property>
        <name>dfs.blocksize</name>
        <value>134217728</value>
    </property>
    <!--======================================================================= -->
    <!-- HDFS high-availability configuration -->
    <!-- Set the HDFS nameservice to ruozeclusterg7; it must match the value in core-site.xml -->
    <property>
        <name>dfs.nameservices</name>
        <value>ruozeclusterg7</value>
    </property>
    <property>
        <!-- NameNode IDs; this version supports at most two NameNodes -->
        <name>dfs.ha.namenodes.ruozeclusterg7</name>
        <value>nn1,nn2</value>
    </property>

    <!-- HDFS HA: dfs.namenode.rpc-address.[nameservice ID], RPC address -->
    <property>
        <name>dfs.namenode.rpc-address.ruozeclusterg7.nn1</name>
        <value>hadoop001:8020</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.ruozeclusterg7.nn2</name>
        <value>hadoop002:8020</value>
    </property>

    <!-- HDFS HA: dfs.namenode.http-address.[nameservice ID], HTTP address -->
    <property>
        <name>dfs.namenode.http-address.ruozeclusterg7.nn1</name>
        <value>hadoop001:50070</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.ruozeclusterg7.nn2</name>
        <value>hadoop002:50070</value>
    </property>

    <!--================== NameNode editlog synchronization ============================================ -->
    <!-- Ensures edit log data can be recovered -->
    <property>
        <name>dfs.journalnode.http-address</name>
        <value>0.0.0.0:8480</value>
    </property>
    <property>
        <name>dfs.journalnode.rpc-address</name>
        <value>0.0.0.0:8485</value>
    </property>
    <property>
        <!-- JournalNode server addresses; the QuorumJournalManager stores the editlog on them -->
        <!-- Format: qjournal://<host1:port1>;<host2:port2>;<host3:port3>/<journalId>; the port matches journalnode.rpc-address -->
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://hadoop001:8485;hadoop002:8485;hadoop003:8485/ruozeclusterg7</value>
    </property>

    <property>
        <!-- Directory where JournalNodes store their data -->
        <name>dfs.journalnode.edits.dir</name>
        <value>/home/hadoop/data/dfs/jn</value>
    </property>
    <!--================== Client failover to the active NameNode ============================================ -->
    <property>
        <!-- Strategy DataNodes and clients use to identify and connect to the active NameNode -->
        <!-- i.e. the automatic failover proxy implementation -->
        <name>dfs.client.failover.proxy.provider.ruozeclusterg7</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <!--================== NameNode fencing =============================================== -->
    <!-- After a failover, prevents the previously active NameNode from starting again and creating two active services -->
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence</value>
    </property>
    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/home/hadoop/.ssh/id_rsa</value>
    </property>
    <property>
        <!-- Milliseconds after which fencing is considered to have failed -->
        <name>dfs.ha.fencing.ssh.connect-timeout</name>
        <value>30000</value>
    </property>

    <!--================== NameNode auto failover based on ZKFC and ZooKeeper ====================== -->
    <!-- Enable ZooKeeper-based automatic failover -->
    <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>
    <!-- File listing the DataNodes allowed to connect to the NameNode -->
     <property>
       <name>dfs.hosts</name>
       <value>/home/hadoop/app/hadoop/etc/hadoop/slaves</value>
     </property>
</configuration>

Configure mapred-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <!-- Run MapReduce applications on YARN -->
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <!-- JobHistory Server ============================================================== -->
    <!-- MapReduce JobHistory Server address, default port 10020 -->
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>hadoop001:10020</value>
    </property>
    <!-- MapReduce JobHistory Server web UI address, default port 19888 -->
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>hadoop001:19888</value>
    </property>

<!-- Compress the map-side output with Snappy -->
  <property>
      <name>mapreduce.map.output.compress</name> 
      <value>true</value>
  </property>
              
  <property>
      <name>mapreduce.map.output.compress.codec</name> 
      <value>org.apache.hadoop.io.compress.SnappyCodec</value>
   </property>

</configuration>
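Snappy compression of the map output only works if the Hadoop native libraries, including the Snappy codec, are available on every node; a minimal check once Hadoop is unpacked and on the PATH:

# Run on each node; the "snappy" line should report "true".
hadoop checknative -a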

Configure slaves

hadoop001
hadoop002
hadoop003

Configure yarn-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <!-- NodeManager configuration ================================================= -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
        <name>yarn.nodemanager.localizer.address</name>
        <value>0.0.0.0:23344</value>
        <description>Address where the localizer IPC is.</description>
    </property>
    <property>
        <name>yarn.nodemanager.webapp.address</name>
        <value>0.0.0.0:23999</value>
        <description>NM Webapp address.</description>
    </property>

    <!-- HA configuration =============================================================== -->
    <!-- Resource Manager Configs -->
    <property>
        <name>yarn.resourcemanager.connect.retry-interval.ms</name>
        <value>2000</value>
    </property>
    <property>
        <name>yarn.resourcemanager.ha.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>
    <!-- Enable embedded automatic failover; in an HA setup it works with the ZKRMStateStore to handle fencing -->
    <property>
        <name>yarn.resourcemanager.ha.automatic-failover.embedded</name>
        <value>true</value>
    </property>
    <!-- Cluster name; ensures the HA election targets the right cluster -->
    <property>
        <name>yarn.resourcemanager.cluster-id</name>
        <value>yarn-cluster</value>
    </property>
    <property>
        <name>yarn.resourcemanager.ha.rm-ids</name>
        <value>rm1,rm2</value>
    </property>


    <!-- The active/standby RM id would be specified separately on each node here (optional)
    <property>
         <name>yarn.resourcemanager.ha.id</name>
         <value>rm2</value>
     </property>
     -->

    <property>
        <name>yarn.resourcemanager.scheduler.class</name>
        <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
    </property>
    <property>
        <name>yarn.resourcemanager.recovery.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.app.mapreduce.am.scheduler.connection.wait.interval-ms</name>
        <value>5000</value>
    </property>
    <!-- ZKRMStateStore configuration -->
    <property>
        <name>yarn.resourcemanager.store.class</name>
        <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
    </property>
    <property>
        <name>yarn.resourcemanager.zk-address</name>
        <value>hadoop001:2181,hadoop002:2181,hadoop003:2181</value>
    </property>
    <property>
        <name>yarn.resourcemanager.zk.state-store.address</name>
        <value>hadoop001:2181,hadoop002:2181,hadoop003:2181</value>
    </property>
    <!-- RPC address clients use to reach the RM (applications manager interface) -->
    <property>
        <name>yarn.resourcemanager.address.rm1</name>
        <value>hadoop001:23140</value>
    </property>
    <property>
        <name>yarn.resourcemanager.address.rm2</name>
        <value>hadoop002:23140</value>
    </property>
    <!-- RPC address ApplicationMasters use to reach the RM (scheduler interface) -->
    <property>
        <name>yarn.resourcemanager.scheduler.address.rm1</name>
        <value>hadoop001:23130</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address.rm2</name>
        <value>hadoop002:23130</value>
    </property>
    <!-- RM admin interface -->
    <property>
        <name>yarn.resourcemanager.admin.address.rm1</name>
        <value>hadoop001:23141</value>
    </property>
    <property>
        <name>yarn.resourcemanager.admin.address.rm2</name>
        <value>hadoop002:23141</value>
    </property>
    <!-- RPC port NodeManagers use to reach the RM -->
    <property>
        <name>yarn.resourcemanager.resource-tracker.address.rm1</name>
        <value>hadoop001:23125</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address.rm2</name>
        <value>hadoop002:23125</value>
    </property>
    <!-- RM web application addresses -->
    <property>
        <name>yarn.resourcemanager.webapp.address.rm1</name>
        <value>hadoop001:8088</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address.rm2</name>
        <value>hadoop002:8088</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.https.address.rm1</name>
        <value>hadoop001:23189</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.https.address.rm2</name>
        <value>hadoop002:23189</value>
    </property>



    <property>
       <name>yarn.log-aggregation-enable</name>
       <value>true</value>
    </property>
    <property>
         <name>yarn.log.server.url</name>
         <value>http://hadoop001:19888/jobhistory/logs</value>
    </property>


    <property>
        <name>yarn.nodemanager.resource.memory-mb</name>
        <value>2048</value>
    </property>
    <property>
        <name>yarn.scheduler.minimum-allocation-mb</name>
        <value>1024</value>
        <description>Minimum memory a single container can request; default 1024 MB</description>
     </property>

  
  <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>2048</value>
    <description>Maximum memory a single container can request; default 8192 MB</description>
  </property>

   <property>
       <name>yarn.nodemanager.resource.cpu-vcores</name>
       <value>2</value>
    </property>

</configuration>

11. Copy the modified files to the other machines

[hadoop@hadoop001 hadoop]$ scp *.xml slaves hadoop-env.sh hadoop002:/home/hadoop/app/hadoop/etc/hadoop/
[hadoop@hadoop001 hadoop]$ scp *.xml slaves hadoop-env.sh hadoop003:/home/hadoop/app/hadoop/etc/hadoop/
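The commands in the next sections (zkServer.sh, start-dfs.sh, hdfs, and so on) are invoked by name, which assumes the ZooKeeper and Hadoop directories are on the hadoop user's PATH; a minimal sketch of the ~/.bash_profile entries on each node, based on the symlinks created above:

# Append to /home/hadoop/.bash_profile on all three nodes, then run: source ~/.bash_profile
export ZOOKEEPER_HOME=/home/hadoop/app/zookeeper
export HADOOP_HOME=/home/hadoop/app/hadoop
export PATH=$ZOOKEEPER_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH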

12. Start the cluster

1. Start ZooKeeper

command: ./zkServer.sh start|stop|status

2. Start Hadoop HA

  • Before formatting, start the JournalNode process on each JournalNode machine
    hadoop-daemon.sh start journalnode

  • Format the NameNode (on hadoop001)
    hadoop namenode -format

  • Synchronize the NameNode metadata
    Copy the hadoop001 metadata to hadoop002
    [hadoop@hadoop001 dfs]$ scp -r name hadoop002:/home/hadoop/data/dfs/

  • Initialize ZKFC
    hdfs zkfc -formatZK

19/08/19 17:42:09 INFO ha.ActiveStandbyElector: Session connected.
19/08/19 17:42:09 INFO ha.ActiveStandbyElector: Successfully created /hadoop-ha/ruozeclusterg7 in ZK.
19/08/19 17:42:09 INFO zookeeper.ZooKeeper: Session: 0x26ca8e014380000 closed
19/08/19 17:42:09 INFO zookeeper.ClientCnxn: EventThread shut down
19/08/19 17:42:09 INFO tools.DFSZKFailoverController: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down DFSZKFailoverController at hadoop001/192.168.174.121
************************************************************/
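Optionally, the znode that zkfc -formatZK created can be checked from the ZooKeeper CLI; a minimal sketch:

# /hadoop-ha/ruozeclusterg7 should be listed in the output.
zkCli.sh -server hadoop001:2181 ls /hadoop-ha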
  • Start the HDFS distributed storage system
    start-dfs.sh

  • Start the YARN framework
    start-yarn.sh

[hadoop@hadoop001 current]$ start-yarn.sh 
starting yarn daemons
starting resourcemanager, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.15.1/logs/yarn-hadoop-resourcemanager-hadoop001.out
hadoop003: starting nodemanager, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.15.1/logs/yarn-hadoop-nodemanager-hadoop003.out
hadoop002: starting nodemanager, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.15.1/logs/yarn-hadoop-nodemanager-hadoop002.out
hadoop001: starting nodemanager, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.15.1/logs/yarn-hadoop-nodemanager-hadoop001.out
  • Start the standby ResourceManager on hadoop002
[hadoop@hadoop002 dfs]$ yarn-daemon.sh start resourcemanager  
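At this point the processes on each node should match the host plan in section II (JobHistoryServer is only added in section 14); a minimal check from hadoop001, using the full jps path so it also works over non-interactive SSH:

for host in hadoop001 hadoop002 hadoop003; do
  echo "== $host =="
  ssh $host /usr/java/jdk1.8.0_45/bin/jps | grep -v Jps
done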

13. Stop the cluster

  • Stop Hadoop (YARN first, then HDFS)
[hadoop@hadoop001 sbin]$ stop-yarn.sh  
[hadoop@hadoop002 sbin]$ yarn-daemon.sh stop resourcemanager  
[hadoop@hadoop001 sbin]$ stop-dfs.sh  

14. Start the cluster again

  • Start ZooKeeper
[hadoop@hadoop001 bin]$ zkServer.sh start 
[hadoop@hadoop002 bin]$ zkServer.sh start 
[hadoop@hadoop003 bin]$ zkServer.sh start
  • Start Hadoop (HDFS first, then YARN)
[hadoop@hadoop001 sbin]$ start-dfs.sh 
[hadoop@hadoop001 sbin]$ start-yarn.sh 
[hadoop@hadoop002 sbin]$ yarn-daemon.sh start resourcemanager 
[hadoop@hadoop001 ~]$ mr-jobhistory-daemon.sh start historyserver

15. Monitor the cluster

[root@hadoop001 ~]# hdfs dfsadmin -report

HDFS (NameNode 1): http://192.168.174.121:50070/
HDFS (NameNode 2): http://192.168.174.122:50070/

ResourceManager (active): http://192.168.174.121:8088
ResourceManager (standby): http://192.168.174.122:8088/cluster/cluster

JobHistory: http://192.168.174.121:19888/jobhistory
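The HA state of both NameNodes and both ResourceManagers can also be queried from the command line, using the service IDs configured above; a minimal sketch, run as the hadoop user on any node:

hdfs haadmin -getServiceState nn1    # prints active or standby
hdfs haadmin -getServiceState nn2
yarn rmadmin -getServiceState rm1
yarn rmadmin -getServiceState rm2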
