Setting Up a Hadoop 3.1.1 Cluster with High Availability

Author: 暴走的MINE | Published 2018-11-17 21:37

    Setting up a Hadoop 3.1.1 cluster (four-node layout)

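    I. Introduction

    Hadoop is a top-level project of the Apache Software Foundation. Its earliest releases date back more than a decade, and after rapid iteration it reached version 3.1.1 in 2018. Most guides online still describe older versions; this one covers the latest release. Taking Hadoop 3.1.1 as the example, it walks through building a Hadoop cluster from scratch.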

    II. Preparation

    1. Platform

    Hadoop has to be installed on a Linux system. Ubuntu and CentOS are two distributions widely used in China.

    Official download pages:

    Ubuntu : https://www.ubuntu.com/download/desktop

    CentOS : https://www.centos.org/download/

    2. Software packages

    (1) JDK: Hadoop is written in Java, so it needs a JVM to run. I use JDK 1.8. Download page:

    https://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html

    (2) Hadoop: a mirror of the official release: http://mirror.bit.edu.cn/apache/hadoop/common/hadoop-3.1.1/

    (3) ZooKeeper

    Download page: http://archive.apache.org/dist/zookeeper/zookeeper-3.4.6/

    (4) Other useful tools

    Windows: VMware for virtual machines, Xshell for managing Linux connections, Xftp for uploading files to Linux.

    macOS: VMware for virtual machines, ZOC7 for managing Linux connections, FileZilla for uploading files to Linux.

    III. Let's get started

    1. Configure a static IP

    Edit the network configuration file. CentOS 7 is used as the example here.

    vi /etc/sysconfig/network-scripts/ifcfg-ens33
    

    Set it as follows:

    
    DEVICE="ens33"            # must match the interface name used in the file name
    BOOTPROTO="static"        # changed from "dhcp" to "static"
    HWADDR="00:0C:29:F2:4E:96"
    IPV6INIT="yes"
    NM_CONTROLLED="yes"
    ONBOOT="yes"
    TYPE="Ethernet"
    UUID="b68b1ef8-13a0-4d11-a738-1ae704e6a0a4"
    IPADDR=192.168.1.16       # the IP address you want to assign
    NETMASK=255.255.255.0     # subnet mask
    GATEWAY=192.168.1.1       # default gateway
    
    

    Save and exit.
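
    Before restarting the network, it can help to confirm that the file really contains the static-address settings. The helper below is a minimal sketch and not part of the original setup: `check_static_ifcfg` is a hypothetical name, and it only checks that `BOOTPROTO` is `static` and that `IPADDR`, `NETMASK`, and `GATEWAY` have values.

```shell
# check_static_ifcfg FILE
# Returns 0 and prints "ok" if FILE looks like a static-IP ifcfg file.
# Hypothetical helper for illustration; it only inspects the keys used above.
check_static_ifcfg() (
    file=$1
    [ -r "$file" ] || { echo "cannot read $file" >&2; exit 1; }

    # BOOTPROTO must be "static" (with or without quotes).
    grep -Eq '^[[:space:]]*BOOTPROTO="?static"?' "$file" || {
        echo "BOOTPROTO is not static" >&2; exit 1; }

    # Each of these keys must be assigned a non-empty value.
    for key in IPADDR NETMASK GATEWAY; do
        grep -Eq "^[[:space:]]*${key}=[^[:space:]#]" "$file" || {
            echo "missing $key" >&2; exit 1; }
    done
    echo "ok"
)
```

    For example: check_static_ifcfg /etc/sysconfig/network-scripts/ifcfg-ens33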

    Restart the network service:

    service network restart
    

    Check the status:

    ifconfig -a
    
    
    ens33   Link encap:Ethernet  HWaddr 00:0C:29:F2:4E:96 
              inet addr:192.168.1.16  Bcast:192.168.1.255  Mask:255.255.255.0
              inet6 addr: fe80::20c:29ff:fef2:4e96/64 Scope:Link
              UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
              RX packets:17017 errors:0 dropped:0 overruns:0 frame:0
              TX packets:9586 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:1000
              RX bytes:7803412 (7.4 MiB)  TX bytes:1613751 (1.5 MiB)

    lo        Link encap:Local Loopback 
              inet addr:127.0.0.1  Mask:255.0.0.0
              inet6 addr: ::1/128 Scope:Host
              UP LOOPBACK RUNNING  MTU:16436  Metric:1
              RX packets:21844 errors:0 dropped:0 overruns:0 frame:0
              TX packets:21844 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:0
              RX bytes:2042507 (1.9 MiB)  TX bytes:2042507 (1.9 MiB)
    
    

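    At this point, ping an external host:

    ping baidu.com
    

    I ran into two outcomes:

    1) The ping succeeded. Oh yeah!

    2) It failed with: ping: unknown host baidu.com

    My fix for the second case was:

    dhclient
    

    After running this command, ping again: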

    
      PING baidu.com (180.149.132.47) 56(84) bytes of data.
      64 bytes from 180.149.132.47: icmp_seq=1 ttl=54 time=38.3 ms
      64 bytes from 180.149.132.47: icmp_seq=2 ttl=54 time=38.7 ms
      64 bytes from 180.149.132.47: icmp_seq=3 ttl=54 time=49.7 ms
      64 bytes from 180.149.132.47: icmp_seq=4 ttl=54 time=38.1 ms
      64 bytes from 180.149.132.47: icmp_seq=5 ttl=54 time=37.9 ms
      64 bytes from 180.149.132.47: icmp_seq=6 ttl=54 time=38.3 ms
    
    

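    That is how I solved it, anyway.

    Other people fix it by rebooting after configuring the static IP.

    Another case: pinging an external IP address works, but pinging a domain name does not. My fix was to set a DNS server:

    
      vi /etc/resolv.conf
    
      nameserver 114.114.114.114   # I found this value in the status details of my local network connection
    
    

    Save, exit, and ping again!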
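
    The DNS edit can also be scripted so that it is safe to repeat on every node. This is only a sketch under assumptions: `ensure_nameserver` is a hypothetical helper, and the default file path is the one edited above.

```shell
# ensure_nameserver IP [FILE]
# Appends "nameserver IP" to FILE (default /etc/resolv.conf)
# unless that exact line is already present.
ensure_nameserver() (
    ip=$1
    file=${2:-/etc/resolv.conf}
    # -x matches the whole line, -F treats the dots in the IP literally.
    if grep -qxF "nameserver $ip" "$file" 2>/dev/null; then
        echo "already present"
    else
        echo "nameserver $ip" >> "$file" && echo "added"
    fi
)
```

    Usage: ensure_nameserver 114.114.114.114. Note that on CentOS 7 this file may be regenerated by the network service; adding a DNS1=114.114.114.114 line to the ifcfg file is likely the more durable option.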

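    2. Configure passwordless SSH

    (1) How it works

    SSH key authentication relies on a key pair: the private key stays on the machine you log in from, and the public key is stored in ~/.ssh/authorized_keys on every machine you log in to. A host that finds your public key there lets you in without asking for a password.

    (2) Configuration

    Run ssh-keygen -t rsa.

    Go to the SSH directory with cd ~/.ssh.

    There, id_rsa is the private key and id_rsa.pub is the public key.

    Append the public key to authorized_keys: cat id_rsa.pub >> authorized_keys

    This gives passwordless login to the local machine.

    To operate on the other nodes, first copy id_rsa.pub to each of them:

    scp  id_rsa.pub user@hostname-or-ip:"directory"
    

    Then repeat the append step above on each node.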
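
    The copy-and-append steps can be combined with ssh-copy-id, which appends the public key to authorized_keys on the target host. The sketch below only prints the commands so they can be reviewed first. `print_key_distribution` is a hypothetical helper, and the user and node names in the usage line are assumptions taken from the configuration files later in this article.

```shell
# print_key_distribution USER NODE...
# Prints one ssh-copy-id command per node instead of running it,
# so the list can be checked before anything is executed.
print_key_distribution() (
    user=$1
    shift
    for node in "$@"; do
        # ssh-copy-id appends the public key to ~/.ssh/authorized_keys on the target.
        echo "ssh-copy-id -i ~/.ssh/id_rsa.pub ${user}@${node}"
    done
)
```

    Usage: print_key_distribution root node2 node3 node4, then run the printed commands (each one asks for the password once).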

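    3. Install JDK 1.8 and set the environment variables

    (1) Extract the JDK archive

    tar xzvf jdk-8u191-linux-x64.tar.gz
    

    (2) Set the environment variables

    On CentOS 7, for example: vi /etc/profile

    Append at the end:

    
    # adjust JAVA_HOME to the actual install path
    export JAVA_HOME=/usr/java/jdk1.8.0_191
    export PATH=$JAVA_HOME/bin:$PATH
    export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
    

    Run source /etc/profile afterwards so the current shell picks up the new variables.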
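
    A wrong JAVA_HOME is easy to miss until Hadoop fails to start, so a quick check can save time. This is a minimal sketch; `check_java_home` is a hypothetical helper that only tests for an executable bin/java under the given directory.

```shell
# check_java_home DIR
# Succeeds if DIR contains an executable bin/java.
check_java_home() (
    dir=$1
    if [ -x "$dir/bin/java" ]; then
        echo "JAVA_HOME ok: $dir"
    else
        echo "no executable java under $dir/bin" >&2
        exit 1
    fi
)
```

    Usage: check_java_home "$JAVA_HOME" && java -version. If the Hadoop scripts later complain that JAVA_HOME is not set, setting the same path in etc/hadoop/hadoop-env.sh is the usual remedy.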
    
    

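    4. Install ZooKeeper, the cluster coordination service

    (1) Extract

    tar zxvf zookeeper-3.4.6.tar.gz

    (2) Configure

    My previous post covers this in detail, so I will not repeat it here: https://blog.csdn.net/u011328843/article/details/84190285

    The zoo.cfg configuration file is as follows: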

    
    # The number of milliseconds of each tick
    tickTime=2000
    # The number of ticks that the initial
    # synchronization phase can take
    initLimit=10
    # The number of ticks that can pass between
    # sending a request and getting an acknowledgement
    syncLimit=5
    # the directory where the snapshot is stored.
    # do not use /tmp for storage, /tmp here is just
    # example sakes.
    dataDir=/opt/zookeepertmp
    dataLogDir=/opt/zookeepertmp/log
    # the port at which the clients will connect
    clientPort=2181
    # the maximum number of client connections.
    # increase this if you need to handle more clients
    #maxClientCnxns=60
    #
    # Be sure to read the maintenance section of the
    # administrator guide before turning on autopurge.
    #
    # http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
    #
    # The number of snapshots to retain in dataDir
    #autopurge.snapRetainCount=3
    # Purge task interval in hours
    # Set to "0" to disable auto purge feature
    #autopurge.purgeInterval=1
    server.1=node2:2888:3888
    server.2=node3:2888:3888
    server.3=node4:2888:3888
    
    

    Remember to create the dataDir (/opt/zookeepertmp) and dataLogDir (/opt/zookeepertmp/log) directories.
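
    Each ZooKeeper server in a cluster also needs a file named myid in its dataDir, holding the N from its own server.N line. The sketch below derives that number from zoo.cfg. Both function names are hypothetical, and it assumes the host name passed in is spelled exactly as in zoo.cfg.

```shell
# myid_for_host CFG HOST
# Prints the N of the "server.N=HOST:..." line in CFG.
myid_for_host() (
    cfg=$1
    host=$2
    # Split each line on "=" and ":" so that $1 is "server.N" and $2 is the host.
    awk -F'[=:]' -v h="$host" '
        $1 ~ /^server\.[0-9]+$/ && $2 == h { sub(/^server\./, "", $1); print $1 }
    ' "$cfg"
)

# write_myid CFG HOST DATADIR
# Creates DATADIR if needed and writes the id for HOST into DATADIR/myid.
write_myid() (
    cfg=$1
    host=$2
    datadir=$3
    id=$(myid_for_host "$cfg" "$host")
    [ -n "$id" ] || { echo "$host is not listed in $cfg" >&2; exit 1; }
    mkdir -p "$datadir" && echo "$id" > "$datadir/myid"
)
```

    On node3, for example, write_myid /opt/zookeeper-3.4.6/conf/zoo.cfg node3 /opt/zookeepertmp would write 2, followed by mkdir -p /opt/zookeepertmp/log for the log directory.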

    (3) Set the environment variables and start

    Environment variables:

    
    export ZOOKEEPER_HOME=/opt/zookeeper-3.4.6
    export PATH=$ZOOKEEPER_HOME/bin:$PATH
    

    Start command: zkServer.sh start

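    5. Install Hadoop 3.1.1 in fully distributed mode and configure high availability

    (1) Extract

    Upload hadoop-3.1.1.tar.gz to the Linux machine with a file-transfer tool, then extract it into the current directory:

    tar zxvf hadoop-3.1.1.tar.gz
    

    (2) Configure

    I will not go into the underlying principles here; a later post will cover them. The configuration files are given directly below.

    They are located in this directory:

      cd hadoop-3.1.1/etc/hadoop/
    

    The main ones are hdfs-site.xml and core-site.xml.

    hdfs-site.xml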

    
    <?xml version="1.0" encoding="UTF-8"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    <configuration>
    <!-- No SecondaryNameNode is configured: in an HA cluster the standby NameNode performs checkpointing. -->
    <property>
      <name>dfs.replication</name>
      <value>2</value>
    </property>
    <property>
      <name>dfs.nameservices</name>
      <value>mycluster</value>
    </property>
    <property>
      <name>dfs.ha.namenodes.mycluster</name>
      <value>nn1,nn2</value>
    </property>
    <property>
      <name>dfs.namenode.rpc-address.mycluster.nn1</name>
      <value>node1:8020</value>
    </property>
    <property>
      <name>dfs.namenode.rpc-address.mycluster.nn2</name>
      <value>node2:8020</value>
    </property>
    <property>
      <name>dfs.namenode.http-address.mycluster.nn1</name>
      <value>node1:9870</value>
    </property>
    <property>
      <name>dfs.namenode.http-address.mycluster.nn2</name>
      <value>node2:9870</value>
    </property>
    <!-- JournalNode quorum. An odd number of JournalNodes (three or more) is recommended so that one can fail. -->
    <property>
      <name>dfs.namenode.shared.edits.dir</name>
      <value>qjournal://node1:8485;node2:8485/mycluster</value>
    </property>
    <property>
      <name>dfs.client.failover.proxy.provider.mycluster</name>
      <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <property>
      <name>dfs.ha.fencing.methods</name>
      <value>sshfence</value>
    </property>
    <property>
      <name>dfs.ha.fencing.ssh.private-key-files</name>
      <value>/root/.ssh/id_rsa</value>
    </property>
    <property>
      <name>dfs.journalnode.edits.dir</name>
      <value>/opt/hadooptmp/ha/journalnode</value>
    </property>
    <property>
      <name>dfs.ha.automatic-failover.enabled</name>
      <value>true</value>
    </property>
    </configuration>
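
    The HA property names depend on each other: the nameservice ID appears in dfs.ha.namenodes.*, and every NameNode ID needs its own rpc-address entry. The sketch below checks that chain with plain text tools. `get_prop` and `check_ha_addresses` are hypothetical helpers, and the parsing assumes each `<value>` follows its `<name>` as in the file above; it is not a general XML parser.

```shell
# get_prop FILE NAME
# Prints the <value> that follows <name>NAME</name> in a Hadoop *-site.xml file.
get_prop() (
    file=$1
    name=$2
    awk -v n="$name" '
        # Note when the <name> element of the wanted property has been seen.
        index($0, "<name>" n "</name>") { found = 1 }
        # The first <value> at or after that point holds the setting.
        found && /<value>/ {
            sub(/^.*<value>/, ""); sub(/<\/value>.*$/, "")
            print; exit
        }
    ' "$file"
)

# check_ha_addresses FILE
# Prints "id -> host:port" for every NameNode ID of the nameservice,
# and fails if an ID has no rpc-address.
check_ha_addresses() (
    file=$1
    ns=$(get_prop "$file" dfs.nameservices)
    ids=$(get_prop "$file" "dfs.ha.namenodes.$ns")
    [ -n "$ids" ] || { echo "no NameNode IDs for nameservice $ns" >&2; exit 1; }
    # The ID list is comma-separated, for example nn1,nn2.
    for id in $(echo "$ids" | tr ',' ' '); do
        addr=$(get_prop "$file" "dfs.namenode.rpc-address.$ns.$id")
        [ -n "$addr" ] || { echo "missing rpc-address for $id" >&2; exit 1; }
        echo "$id -> $addr"
    done
)
```

    Usage: check_ha_addresses hdfs-site.xml. Once Hadoop is on the PATH, hdfs getconf -confKey dfs.nameservices and hdfs getconf -namenodes report what Hadoop itself reads from the files.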
    
    

    core-site.xml

    
    <?xml version="1.0" encoding="UTF-8"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    <configuration>
    <property>
      <name>fs.defaultFS</name>
      <value>hdfs://mycluster</value>
    </property>
    <property>
      <name>hadoop.tmp.dir</name>
      <value>/opt/hadooptmp/ha</value>
    </property>
    <property>
      <name>hadoop.http.staticuser.user</name>
      <value>root</value>
    </property>
    <property>
      <name>ha.zookeeper.quorum</name>
      <value>node2:2181,node3:2181,node4:2181</value>
    </property>
    </configuration>
    
    

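    (3) Startup order for the high-availability cluster:

    1. Start ZooKeeper (on every ZooKeeper node)

      zkServer.sh start

    2. Start the JournalNodes (on every JournalNode host)

      hdfs --daemon start journalnode

    3. Format the ZKFC state in ZooKeeper

      hdfs zkfc -formatZK

    4. Format the primary NameNode

      hdfs namenode -format

    5. Sync the standby NameNode from the primary. The primary must be running for this, so first start it on node1 with hdfs --daemon start namenode, then run on node2:

      hdfs namenode -bootstrapStandby

    6. Start the cluster

      start-dfs.sh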
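
    Because the order and the host matter for each step, it may help to print the whole sequence as a checklist before running anything. The sketch below does only that. `ha_first_start_plan` is a hypothetical helper, and the host roles are assumptions read from the configuration files above (node1 as nn1, node2 as nn2, ZooKeeper on node2 to node4, JournalNodes on node1 and node2).

```shell
# ha_first_start_plan
# Prints "host: command" lines for the first start of the HA cluster, in order.
# Nothing is executed; the format steps are destructive and should be run by hand.
ha_first_start_plan() {
    zk_nodes="node2 node3 node4"   # from zoo.cfg and ha.zookeeper.quorum
    jn_nodes="node1 node2"         # from dfs.namenode.shared.edits.dir

    for h in $zk_nodes; do echo "$h: zkServer.sh start"; done
    for h in $jn_nodes; do echo "$h: hdfs --daemon start journalnode"; done
    echo "node1: hdfs zkfc -formatZK"
    echo "node1: hdfs namenode -format"
    # The standby copies its metadata from a running NameNode.
    echo "node1: hdfs --daemon start namenode"
    echo "node2: hdfs namenode -bootstrapStandby"
    echo "node1: start-dfs.sh"
    # Afterwards, one NameNode should report active and the other standby.
    for nn in nn1 nn2; do echo "node1: hdfs haadmin -getServiceState $nn"; done
}
```

    Running ha_first_start_plan prints twelve lines; work through them top to bottom on the named hosts. The format steps in lines 6 and 7 are needed only for the very first start.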

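    (4) Result

    The NameNode web UIs are served on port 9870 (node1:9870 and node2:9870, as set in hdfs-site.xml).

    [Figure: web UI on node1 (Overview)]

    [Figure: web UI on node1 (Datanode Information)]

    [Figure: web UI on node2 (Overview)]

    Once you get this far, you are done. Congratulations! Keep following my CSDN blog for more.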
