Configuring Hadoop HA (New Version)


Author: 0_9f3a | Published 2017-10-27 16:30

    Cluster network plan

    master 192.168.111.111
    slave1 192.168.111.112
    slave2 192.168.111.113
    slave3 192.168.111.114
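    These hostnames must resolve on every node. A typical /etc/hosts mapping (implied by the plan above, not shown in the original) would be:

    # /etc/hosts on every node
    192.168.111.111 master
    192.168.111.112 slave1
    192.168.111.113 slave2
    192.168.111.114 slave3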
    

    High-availability node layout

    (Figure: screenshot of the per-node HA role assignment)

    1. Configure hdfs-site.xml
    File location: /usr/local/hadoop/etc/hadoop/

    vim /usr/local/hadoop/etc/hadoop/hdfs-site.xml
    
    <property>
      <name>dfs.nameservices</name>
      <value>mycluster</value>
    </property>
    <property>
      <name>dfs.ha.namenodes.mycluster</name>
      <value>nn1,nn2</value>
    </property>
    <property>
      <name>dfs.namenode.rpc-address.mycluster.nn1</name>
      <value>master:8020</value>
    </property>
    <property>
      <name>dfs.namenode.rpc-address.mycluster.nn2</name>
      <value>slave1:8020</value>
    </property>
    <property>
      <name>dfs.namenode.http-address.mycluster.nn1</name>
      <value>master:50070</value>
    </property>
    <property>
      <name>dfs.namenode.http-address.mycluster.nn2</name>
      <value>slave1:50070</value>
    </property>
    <property>
      <name>dfs.namenode.shared.edits.dir</name>
      <value>qjournal://master:8485;slave1:8485;slave2:8485/mycluster</value>
    </property>
    
    
    <property>
      <name>dfs.client.failover.proxy.provider.mycluster</name>
      <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <property>
      <name>dfs.ha.fencing.methods</name>
      <value>sshfence</value>
    </property>
    <property>
      <name>dfs.ha.fencing.ssh.private-key-files</name>
      <value>/root/.ssh/id_rsa</value>
    </property>
    <property>
       <name>dfs.ha.automatic-failover.enabled</name>
       <value>true</value>
     </property>
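    One property that commonly accompanies this setup but is not set here is dfs.journalnode.edits.dir, the local directory where each JournalNode stores its copy of the edits (it defaults to a path under /tmp, which does not survive reboots). A possible addition, with an assumed path:

    <!-- assumed path; pick any persistent local directory -->
    <property>
      <name>dfs.journalnode.edits.dir</name>
      <value>/usr/local/hadoop/journal</value>
    </property>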
    

    The details of how this works are covered in the official documentation.
    2. Configure core-site.xml
    File location: /usr/local/hadoop/etc/hadoop/

    vim /usr/local/hadoop/etc/hadoop/core-site.xml
    
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://mycluster</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/usr/local/hadoop/current/hdfs/ha</value>
    </property>
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>slave1:2181,slave2:2181,slave3:2181</value>
    </property>
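    With HA enabled, fs.defaultFS must name the nameservice (hdfs://mycluster) rather than a single NameNode such as hdfs://master:9000; otherwise clients would bypass failover. The value the client actually resolves can be checked with:

    hdfs getconf -confKey fs.defaultFS
    # expected: hdfs://mycluster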
    

    Note: hdfs-site.xml and core-site.xml must be identical on every machine.
    Copy the files configured on master to the slaves (see the loop sketch below):

    scp core-site.xml hdfs-site.xml slaveX:/usr/local/hadoop/etc/hadoop/
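    Since there are three slaves, a small loop covers them all (hostnames from the network plan above):

    for h in slave1 slave2 slave3; do
        scp core-site.xml hdfs-site.xml $h:/usr/local/hadoop/etc/hadoop/
    done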
    

    3. Install ZooKeeper (on all machines)
    Version 3.4.6
    3.1 Upload the tarball to /usr/local/ with Xftp
    3.2 Extract the archive

    tar -xzvf zookeeper-3.4.6.tar.gz
    

    3.3 Rename the extracted directory (optional)

    mv zookeeper-3.4.6 zookeeper
    

    3.4 Configure the environment variables (identical on all machines)

    vim ~/.bash_profile 
    
    export ZK_HOME=/usr/local/zookeeper
    export PATH=$PATH:$ZK_HOME/bin:$ZK_HOME/sbin
    export JAVA_HOME=/usr/java/jdk1.8.0_91
    export PATH=$PATH:$JAVA_HOME/bin
    export HADOOP_HOME=/usr/local/hadoop/
    export HADOOP_PREFIX=$HADOOP_HOME
    export HADOOP_COMMON_HOME=$HADOOP_PREFIX
    export HADOOP_CONF_DIR=$HADOOP_PREFIX/etc/hadoop
    export HADOOP_HDFS_HOME=$HADOOP_PREFIX
    export HADOOP_MAPRED_HOME=$HADOOP_PREFIX
    export HADOOP_YARN_HOME=$HADOOP_PREFIX
    export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
    

    Make the environment variables take effect:

    source ~/.bash_profile
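    A quick sanity check that the variables took effect (expected values follow this guide's paths):

    echo $ZK_HOME         # /usr/local/zookeeper
    which zkServer.sh     # /usr/local/zookeeper/bin/zkServer.sh
    hadoop version        # prints the Hadoop version banner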
    

    4. Configure ZooKeeper (on all machines)

    cd /usr/local/zookeeper/conf
    mv zoo_sample.cfg zoo.cfg
    vim zoo.cfg
    Append the following at the end:
    dataDir=/usr/local/zookeeper/zk
    server.1=slave1:2888:3888
    server.2=slave2:2888:3888
    server.3=slave3:2888:3888
    

    Create the ZooKeeper data directory on each slave:

    mkdir /usr/local/zookeeper/zk
    

    Inside that directory create a file named myid containing 1, 2, and 3 on slave1, slave2, and slave3 respectively.
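    A minimal way to do this, matching the server.N lines in zoo.cfg:

    echo 1 > /usr/local/zookeeper/zk/myid    # on slave1
    echo 2 > /usr/local/zookeeper/zk/myid    # on slave2
    echo 3 > /usr/local/zookeeper/zk/myid    # on slave3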
    ****** The Hadoop HA configuration is now complete ******
    5. First startup
    On each ZooKeeper node (slave1, slave2, and slave3, per zoo.cfg):

    zkServer.sh start
    

    Wait a moment; the ensemble elects its leader and followers automatically. Check with:

    zkServer.sh status
    
    (Screenshot: zkServer.sh status output showing Mode: follower / Mode: leader)

    Start the JournalNodes on master, slave1, and slave2 (the hosts listed in dfs.namenode.shared.edits.dir):

    hadoop-daemon.sh start journalnode
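    To confirm each daemon came up, jps should list a JournalNode process on each of those hosts:

    jps
    # expected among the output:
    # <pid> JournalNode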
    

    On master:

    hdfs namenode -format
    hadoop-daemon.sh start namenode
    

    On slave1:

    hdfs namenode -bootstrapStandby
    

    On master:

    start-dfs.sh
    

    On slave1:

    hdfs zkfc -formatZK
    hadoop-daemon.sh start zkfc
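    Automatic failover should now be active. Which NameNode holds the active role can be checked with the standard haadmin tool and the nn1/nn2 IDs configured earlier:

    hdfs haadmin -getServiceState nn1    # reports active or standby
    hdfs haadmin -getServiceState nn2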
    

    Next, configure yarn-site.xml for ResourceManager HA:

    <property>
      <name>yarn.nodemanager.aux-services</name>
      <value>mapreduce_shuffle</value>
    </property>
    <property>
      <name>yarn.resourcemanager.ha.enabled</name>
      <value>true</value>
    </property>
    <property>
      <name>yarn.resourcemanager.cluster-id</name>
      <value>sxt2yarn</value>
    </property>
    <property>
      <name>yarn.resourcemanager.ha.rm-ids</name>
      <value>rm1,rm2</value>
    </property>
    <property>
      <name>yarn.resourcemanager.hostname.rm1</name>
      <value>slave2</value>
    </property>
    <property>
      <name>yarn.resourcemanager.hostname.rm2</name>
      <value>slave3</value>
    </property>
    <property>
      <name>yarn.resourcemanager.webapp.address.rm1</name>
      <value>slave2:8088</value>
    </property>
    <property>
      <name>yarn.resourcemanager.webapp.address.rm2</name>
      <value>slave3:8088</value>
    </property>
    <property>
      <name>yarn.resourcemanager.zk-address</name>
      <value>slave1:2181,slave2:2181,slave3:2181</value>
    </property>
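    Once the ResourceManagers have been started (see the end of this guide), their HA state can be checked with the standard rmadmin tool and the rm1/rm2 IDs configured above:

    yarn rmadmin -getServiceState rm1    # one of the two reports active
    yarn rmadmin -getServiceState rm2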
    

    At this point the first startup of the cluster is complete.


    For subsequent startups you only need to:
    1. Start the ZooKeeper ensemble
    2. Run start-dfs.sh

    zkServer.sh start
    start-dfs.sh
    

    The ResourceManagers on slave2 and slave3 must each be started separately, on the host itself:

    yarn-daemon.sh start resourcemanager
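    The whole routine can be collected into a helper script. The following start-ha.sh is a hypothetical sketch, run from master, assuming passwordless root SSH (already required by the sshfence setting) and the install paths used in this guide:

    #!/bin/bash
    # start-ha.sh -- hypothetical convenience script, run on master;
    # assumes JAVA_HOME is visible to non-interactive SSH shells
    # 1. Start the ZooKeeper ensemble (slave1-3, per zoo.cfg)
    for h in slave1 slave2 slave3; do
        ssh $h /usr/local/zookeeper/bin/zkServer.sh start
    done
    # 2. Start HDFS: NameNodes, DataNodes, JournalNodes, ZKFCs
    /usr/local/hadoop/sbin/start-dfs.sh
    # 3. Start the two ResourceManagers
    for h in slave2 slave3; do
        ssh $h /usr/local/hadoop/sbin/yarn-daemon.sh start resourcemanager
    done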
