Hadoop 3.2.x High-Availability Cluster Setup


Author: 陈sir的知识图谱 | Published 2020-02-14 12:55

    This environment uses CentOS 8.

    HOSTNAME   Services                                                                                      IP               Role
    hadoop301  hdfs namenode, hdfs datanode, yarn resourcemanager, yarn nodemanager, journalnode, zookeeper  192.168.142.101  yarn and hdfs master, worker
    hadoop302  hdfs namenode, hdfs datanode, yarn resourcemanager, yarn nodemanager, journalnode, zookeeper  192.168.142.102  yarn and hdfs master, worker
    hadoop303  hdfs namenode, hdfs datanode, yarn nodemanager, journalnode, zookeeper                        192.168.142.103  hdfs master, worker

    Prerequisites

    Apply the usual Linux base settings first.
    Set up passwordless SSH login among the three machines (a minimal sketch follows this list).
    Download the hadoop-3.2.1 and zookeeper-3.5.6 tarballs.
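
    A minimal sketch of the passwordless-login setup, assuming you operate as root (matching the rsync commands and sshfence configuration later in this guide) and that the /etc/hosts entries from step 1.4 are in place:

    # run on each of the three machines
    ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa
    # copy the public key to every host, including the local one
    for h in hadoop301 hadoop302 hadoop303; do
      ssh-copy-id root@$h
    done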

    Installation Steps


    1 Run on all machines


    • 1.1 Install the JDK
      yum install -y java-1.8.0-openjdk-devel.x86_64 java-1.8.0-openjdk.x86_64
    • 1.2 Edit /etc/profile and append the following
    export HADOOP_HOME=/opt/hadoop-3.2.1
    export PATH=$PATH:$HADOOP_HOME/bin
    export PATH=$PATH:$HADOOP_HOME/sbin
    export JAVA_HOME=/usr/lib/jvm/jre-1.8.0
    export ZOOKEEPER_HOME=/opt/zookeeper-3.5.6
    export PATH=$PATH:$ZOOKEEPER_HOME/bin
    
    • 1.3 Create the following directories. These exact paths matter because the configuration files below refer to them.
    mkdir -p /tmp/hadoop/tmpdir
    mkdir -p /tmp/hadoop/journalnode/data
    mkdir -p /tmp/hadoop/hdfs/namenode
    mkdir -p /tmp/hadoop/hdfs/datanode
    mkdir -p /tmp/zookeeper
    echo 1 > /tmp/zookeeper/myid   # on hadoop301
    echo 2 > /tmp/zookeeper/myid   # on hadoop302
    echo 3 > /tmp/zookeeper/myid   # on hadoop303
    
    • 1.4 Add the following entries to /etc/hosts
    192.168.142.101 hadoop301
    192.168.142.102 hadoop302
    192.168.142.103 hadoop303
    

    2 Run on hadoop301

    2.1 Install ZooKeeper

    2.1.1 Extract ZooKeeper

    tar -zxf zookeeper-3.5.6.tar.gz
    

    2.1.2 Configure zoo.cfg

    cd zookeeper-3.5.6/conf
    mv zoo_sample.cfg zoo.cfg
    vim zoo.cfg
    

    The contents of zoo.cfg:

    # The number of milliseconds of each tick. This is ZooKeeper's basic heartbeat time unit; virtually every other ZooKeeper time setting is an integer multiple of it.
    tickTime=2000
    # The number of ticks that the initial 
    # synchronization phase can take 
    # Expressed in ticks: how long followers are given to sync with the leader after leader election. If there are many followers, or the leader holds a lot of data, syncing takes longer and this value should be raised accordingly. It is also the maximum wait (socket timeout) for followers and observers when they begin syncing the leader's data.
    initLimit=10
    # The number of ticks that can pass between 
    # sending a request and getting an acknowledgement
    # Also expressed in ticks, and easy to confuse with the value above: this is the maximum wait time for follower/observer exchanges with the leader after the initial sync has completed, i.e. the timeout for normal request forwarding, pings, and similar traffic.
    syncLimit=5
    # the directory where the snapshot is stored.
    # do not use /tmp for storage, /tmp here is just 
    # example sakes.
    # Where the in-memory database snapshots are stored. If no separate transaction-log directory (dataLogDir) is configured, transaction logs are stored here too; keeping the two on different devices is recommended.
    dataDir=/tmp/zookeeper
    # the port at which the clients will connect
    clientPort=2181
    # the maximum number of client connections.
    # increase this if you need to handle more clients
    #maxClientCnxns=60
    #
    # Be sure to read the maintenance section of the 
    # administrator guide before turning on autopurge.
    #
    # http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
    #
    # The number of snapshots to retain in dataDir
    #autopurge.snapRetainCount=3
    # Purge task interval in hours
    # Set to "0" to disable auto purge feature
    #autopurge.purgeInterval=1
    server.1=hadoop301:2888:3888
    server.2=hadoop302:2888:3888
    server.3=hadoop303:2888:3888
    

    2.1.3 Sync to hadoop302 and hadoop303

    yum install -y rsync
    rsync -auvp /opt/zookeeper-3.5.6 root@hadoop302:/opt 
    rsync -auvp /opt/zookeeper-3.5.6 root@hadoop303:/opt 
    

    2.2 Install Hadoop

    • 2.2.1 Upload hadoop-3.2.1.tar.gz to the /opt directory and extract it. A minimal sketch, assuming the tarball is already in /opt:
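      cd /opt
      tar -zxf hadoop-3.2.1.tar.gz
    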
    • 2.2.2 Edit /opt/hadoop-3.2.1/etc/hadoop/core-site.xml
    <?xml version="1.0" encoding="UTF-8"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    <configuration>
      <property>
        <name>fs.defaultFS</name>
        <value>hdfs://mycluster</value>
      </property>
      <property>
        <name>hadoop.tmp.dir</name>
        <value>/tmp/hadoop/tmpdir</value>
      </property>
      <property>
        <name>ha.zookeeper.quorum</name>
        <value>hadoop301:2181,hadoop302:2181,hadoop303:2181</value>
      </property>
    </configuration>
    
    • 2.2.3 Edit /opt/hadoop-3.2.1/etc/hadoop/hdfs-site.xml
    <?xml version="1.0" encoding="UTF-8"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    
    <configuration>
      <!-- hdfs HA configuration-->
      <!-- All default values are documented at https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml ("stable" may be replaced with a version such as r3.2.1) -->
      
      <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
      </property>
      <!-- dfs.nameservices must match the name used in fs.defaultFS in core-site.xml -->
      <property>
        <name>dfs.nameservices</name>
        <value>mycluster</value>
      </property>
      <!-- The list of NameNodes in the cluster; three are defined here: nn1, nn2, nn3 -->
      <property>
        <name>dfs.ha.namenodes.mycluster</name>
        <value>nn1,nn2,nn3</value>
      </property>
      <!-- RPC address of each NameNode; the suffix must correspond to the list defined in dfs.ha.namenodes.mycluster -->
      <property>
        <name>dfs.namenode.rpc-address.mycluster.nn1</name>
        <value>hadoop301:8020</value>
      </property>
      <property>
        <name>dfs.namenode.rpc-address.mycluster.nn2</name>
        <value>hadoop302:8020</value>
      </property>
      <property>
        <name>dfs.namenode.rpc-address.mycluster.nn3</name>
        <value>hadoop303:8020</value>
      </property>
      <!-- HTTP address of each NameNode; the suffix must likewise correspond to the list defined in dfs.ha.namenodes.mycluster -->
      <property>
        <name>dfs.namenode.http-address.mycluster.nn1</name>
        <value>hadoop301:9870</value>
      </property>
      <property>
        <name>dfs.namenode.http-address.mycluster.nn2</name>
        <value>hadoop302:9870</value>
      </property>
      <property>
        <name>dfs.namenode.http-address.mycluster.nn3</name>
        <value>hadoop303:9870</value>
      </property>
      <!-- Where the NameNodes' shared edit log (metadata) is stored on the JournalNodes -->
      <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://hadoop301:8485;hadoop302:8485;hadoop303:8485/mycluster</value>
      </property>
      <!-- Where each JournalNode stores its data on local disk -->
      <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/tmp/hadoop/journalnode/data</value>
      </property>
      <!-- How clients locate the active NameNode during automatic failover -->
      <property>
        <name>dfs.client.failover.proxy.provider.mycluster</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
      </property>
      <!-- Fencing methods used during failover; to specify several mechanisms, put one per line -->
      <property>
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence</value>
      </property>
      <!-- sshfence requires passwordless SSH, so point it at the private key -->
      <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/root/.ssh/id_rsa</value>
      </property>
      <!-- Timeout for the sshfence mechanism -->
      <property>
        <name>dfs.ha.fencing.ssh.connect-timeout</name>
        <value>30000</value>
      </property>
      <property>
        <name>dfs.journalnode.http-address</name>
        <value>0.0.0.0:8480</value>
      </property>
      <property>
        <name>dfs.journalnode.rpc-address</name>
        <value>0.0.0.0:8485</value>
      </property>
      <!-- hdfs HA configuration end-->
    
      <property>
        <name>dfs.replication</name>
        <value>1</value>
      </property>
      <property>
        <name>dfs.namenode.name.dir</name>
        <value>/tmp/hadoop/hdfs/namenode</value>
      </property>
      <property>
        <name>dfs.datanode.data.dir</name>
        <value>/tmp/hadoop/hdfs/datanode</value>
      </property>
      <!-- Enable the WebHDFS REST interface -->
      <property>
        <name>dfs.webhdfs.enabled</name>
        <value>true</value>
      </property>
      <!-- Disable permission checking, so that e.g. Hive can connect directly -->
      <property>
        <name>dfs.permissions.enabled</name>
        <value>false</value>
      </property>
    </configuration>
    

    2.2.4 Edit /opt/hadoop-3.2.1/etc/hadoop/yarn-site.xml

    <?xml version="1.0"?>
    <configuration>
    
      <!-- yarn ha configuration-->
      <property>
        <name>yarn.resourcemanager.ha.enabled</name>
        <value>true</value>
      </property>
      <!-- Name of the YARN HA cluster -->
      <property>
        <name>yarn.resourcemanager.cluster-id</name>
        <value>cluster1</value>
      </property>
      <!-- This machine's id within the HA cluster; must match one of the values in yarn.resourcemanager.ha.rm-ids. On hosts that do not act as a ResourceManager, delete this property. -->
      <property>
        <name>yarn.resourcemanager.ha.id</name>
        <value>rm1</value>
      </property>
      <!-- The list of ResourceManager ids in the HA cluster -->
      <property>
        <name>yarn.resourcemanager.ha.rm-ids</name>
        <value>rm1,rm2</value>
      </property>
      <!-- Which machines the HA ResourceManagers run on -->
      <property>
        <name>yarn.resourcemanager.hostname.rm1</name>
        <value>hadoop301</value>
      </property>
      <property>
        <name>yarn.resourcemanager.hostname.rm2</name>
        <value>hadoop302</value>
      </property>
      <property>
        <name>yarn.resourcemanager.webapp.address.rm1</name>
        <value>hadoop301:8088</value>
      </property>
      <property>
        <name>yarn.resourcemanager.webapp.address.rm2</name>
        <value>hadoop302:8088</value>
      </property>
      <property>
        <name>hadoop.zk.address</name>
        <value>hadoop301:2181,hadoop302:2181,hadoop303:2181</value>
      </property>
    
      <!-- Site specific YARN configuration properties -->
      <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
      </property>
    </configuration>
    

    2.2.5 Edit /opt/hadoop-3.2.1/etc/hadoop/mapred-site.xml

    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    <configuration>
      <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
      </property>
      <property>
        <name>mapreduce.application.classpath</name>
        <value>  
            /opt/hadoop-3.2.1/share/hadoop/common/*,
            /opt/hadoop-3.2.1/share/hadoop/common/lib/*,
            /opt/hadoop-3.2.1/share/hadoop/hdfs/*,
            /opt/hadoop-3.2.1/share/hadoop/hdfs/lib/*,
            /opt/hadoop-3.2.1/share/hadoop/mapreduce/*,
            /opt/hadoop-3.2.1/share/hadoop/mapreduce/lib/*,
            /opt/hadoop-3.2.1/share/hadoop/yarn/*,
            /opt/hadoop-3.2.1/share/hadoop/yarn/lib/*
        </value>
      </property>
    
    </configuration>
    

    2.2.6 Edit /opt/hadoop-3.2.1/etc/hadoop/hadoop-env.sh

    # The java implementation to use. By default, this environment
    # variable is REQUIRED on ALL platforms except OS X!
    # export JAVA_HOME=
    export JAVA_HOME=/usr/lib/jvm/jre-1.8.0
    
    # Some parts of the shell code may do special things dependent upon
    # the operating system.  We have to set this here. See the next
    # section as to why....
    export HADOOP_OS_TYPE=${HADOOP_OS_TYPE:-$(uname -s)}
    export HADOOP_PID_DIR=/opt/hadoop-3.2.1/pid
    export HADOOP_LOG_DIR=/var/log/hadoop
    

    2.2.7 Edit /opt/hadoop-3.2.1/etc/hadoop/yarn-env.sh

    # Specify the max heapsize for the ResourceManager.  If no units are
    # given, it will be assumed to be in MB.
    # This value will be overridden by an Xmx setting specified in either
    # HADOOP_OPTS and/or YARN_RESOURCEMANAGER_OPTS.
    # Default is the same as HADOOP_HEAPSIZE_MAX
    #export YARN_RESOURCEMANAGER_HEAPSIZE=
    export JAVA_HOME=/usr/lib/jvm/jre-1.8.0
    

    2.2.8 Edit /opt/hadoop-3.2.1/sbin/start-dfs.sh and /opt/hadoop-3.2.1/sbin/stop-dfs.sh, adding the following at the top of each script

    HDFS_NAMENODE_USER=root
    HDFS_DATANODE_USER=root
    HDFS_JOURNALNODE_USER=root
    HDFS_ZKFC_USER=root
    

    2.2.9 Edit /opt/hadoop-3.2.1/sbin/start-yarn.sh and /opt/hadoop-3.2.1/sbin/stop-yarn.sh, adding the following at the top of each script

    YARN_RESOURCEMANAGER_USER=root
    YARN_NODEMANAGER_USER=root
    

    2.2.10 Change /opt/hadoop-3.2.1/etc/hadoop/workers to the following

    hadoop301
    hadoop302
    hadoop303
    

    2.2.11 Copy hadoop-3.2.1 to hadoop302 and hadoop303
    rsync -auvp /opt/hadoop-3.2.1 root@hadoop302:/opt
    rsync -auvp /opt/hadoop-3.2.1 root@hadoop303:/opt

    3 Run on hadoop302

    Change yarn.resourcemanager.ha.id in yarn-site.xml to the following

      <property>
        <name>yarn.resourcemanager.ha.id</name>
        <value>rm2</value>
      </property>
    

    4 Run on hadoop303

    Delete the following property from yarn-site.xml, since hadoop303 does not run a ResourceManager

      <property>
        <name>yarn.resourcemanager.ha.id</name>
        <value>rm1</value>
      </property>
    

    5 Startup

    Startup order: ZooKeeper -> JournalNode -> format the NameNode -> initialize the HA namespace in ZooKeeper (zkfc) -> NameNode -> DataNode -> ResourceManager -> NodeManager

    5.1 Start ZooKeeper

    Run on all machines, in the order hadoop301, hadoop302, hadoop303

    # Note: if you use zsh, switch back to bash first
    # chsh -s /usr/bin/bash
    # To run it from zsh directly, you can use zsh's emulate builtin:
    # emulate sh -c '/opt/zookeeper-3.5.6/bin/zkServer.sh start'
    /opt/zookeeper-3.5.6/bin/zkServer.sh start
    /opt/zookeeper-3.5.6/bin/zkServer.sh status
    

    5.2 Start the JournalNodes

    Run on all machines, in the order hadoop301, hadoop302, hadoop303

    # Note: if you use zsh, switch back to bash first
    # chsh -s /usr/bin/bash
    /opt/hadoop-3.2.1/sbin/hadoop-daemon.sh start journalnode
    # or: /opt/hadoop-3.2.1/bin/hdfs --daemon start journalnode
    

    5.3 Format the NameNode

    Run on hadoop301

    # Note: if you use zsh, switch back to bash first
    # chsh -s /usr/bin/bash
    /opt/hadoop-3.2.1/bin/hdfs namenode -format
    # Sync the freshly formatted metadata to the other NameNodes, or they may fail to start
    rsync -auvp /tmp/hadoop/hdfs/namenode/current root@hadoop302:/tmp/hadoop/hdfs/namenode
    rsync -auvp /tmp/hadoop/hdfs/namenode/current root@hadoop303:/tmp/hadoop/hdfs/namenode
    # Initialize the HA state in ZooKeeper
    hdfs zkfc -formatZK
    

    5.4 Stop the JournalNodes

    Run on all machines

    /opt/hadoop-3.2.1/sbin/hadoop-daemon.sh stop journalnode
    # or: /opt/hadoop-3.2.1/bin/hdfs --daemon stop journalnode
    

    5.5 Start Hadoop

    Run on hadoop301

    # Must be run under bash; even zsh in compatibility mode will not work
    start-dfs.sh
    start-yarn.sh
    hdfs haadmin -getAllServiceState
    
    # Processes you should see (via jps) after a normal startup
    2193 QuorumPeerMain
    5252 JournalNode
    4886 NameNode
    5016 DataNode
    5487 DFSZKFailoverController
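
    # The YARN HA state can be verified the same way; a quick sketch,
    # where rm1 and rm2 are the ids defined in yarn-site.xml:
    yarn rmadmin -getAllServiceState
    # or query a single ResourceManager:
    yarn rmadmin -getServiceState rm1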
    
    

    Hadoop Classpath

    Many other compute engines use Hadoop's HDFS and YARN, and they typically find them through the Hadoop classpath. The following command prints what is on Hadoop's classpath:

    /opt/hadoop-3.2.1/bin/hadoop classpath
    

    The "Spark without Hadoop" distribution, for example, requires you to point it at the classpath of an existing Hadoop installation; add one of the following to spark-env.sh:

    ### in conf/spark-env.sh ###
    
    # If 'hadoop' binary is on your PATH
    export SPARK_DIST_CLASSPATH=$(hadoop classpath)
    
    # With explicit path to 'hadoop' binary
    export SPARK_DIST_CLASSPATH=$(/path/to/hadoop/bin/hadoop classpath)
    
    # Passing a Hadoop configuration directory
    export SPARK_DIST_CLASSPATH=$(hadoop --config /path/to/configs classpath)
    

    Troubleshooting

    1. All NameNodes are in standby
      This usually means the DFSZKFailoverController did not start. The typical cause is that hdfs zkfc -formatZK failed during initialization, or the data in ZooKeeper was corrupted later; re-running hdfs zkfc -formatZK re-initializes it.
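
    A quick way to confirm the diagnosis before re-formatting (nn1, nn2, nn3 are the NameNode ids from hdfs-site.xml; jps should also show a DFSZKFailoverController process on every NameNode host):

    hdfs haadmin -getServiceState nn1
    jps | grep DFSZKFailoverController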

