Hadoop Environment Setup

Author: hello高world | Published 2018-03-31 00:01

I. Cluster Nodes and Services

Hostname   Role     Services                    Install directory
tinygao1   Master   NameNode, ResourceManager   /data/program/hadoop-2.8.0/
tinygao2   Slave    DataNode, NodeManager       /data/program/hadoop-2.8.0/
tinygao3   Slave    DataNode, NodeManager       /data/program/hadoop-2.8.0/

II. Environment Preparation

1. Set hostnames and configure hosts

  • Change the hostname:

    # hostnamectl set-hostname tinygao1
    # hostnamectl status   # verify it now reports "Static hostname: tinygao1"
    Or edit /etc/hostname and reboot.

  • Edit /etc/hosts on every node:

    192.168.17.128 tinygao1
    192.168.17.129 tinygao2
    192.168.17.130 tinygao3
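These three entries must exist on every node. A small sketch that appends them idempotently, so it is safe to re-run (the `add_hosts` helper name and the file argument are illustrative):

```shell
# Append the cluster's host entries to a hosts file, skipping ones already there.
add_hosts() {
  local hosts_file=$1
  for entry in "192.168.17.128 tinygao1" "192.168.17.129 tinygao2" "192.168.17.130 tinygao3"; do
    # match on the hostname field; append the full entry only when absent
    grep -qw "${entry#* }" "$hosts_file" 2>/dev/null || echo "$entry" >> "$hosts_file"
  done
}

# On each node (as root):
# add_hosts /etc/hosts
```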

2. Passwordless SSH login

The client installs its public key on the server; at login the server runs a challenge-response against that key (encrypting a random string and checking that the client can decrypt it), so no password has to be typed.

Example: tinygao1 logs in to tinygao2 and tinygao3 without a password.

On tinygao1:

# ssh-keygen -t rsa
# Pressing Enter through the prompts produces two files: id_rsa (private key) and id_rsa.pub (public key)
# ssh-copy-id root@tinygao2
# ssh-copy-id root@tinygao3
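The interactive key generation above can also be done in one non-interactive step; a sketch (`ensure_key` is an illustrative helper, not a standard tool):

```shell
# Create an RSA key pair without prompts: -N '' sets an empty passphrase,
# -q suppresses output, and an existing key is left untouched.
ensure_key() {
  local keydir=$1
  mkdir -p "$keydir" && chmod 700 "$keydir"
  [ -f "$keydir/id_rsa" ] || ssh-keygen -t rsa -N '' -f "$keydir/id_rsa" -q
}

# ensure_key ~/.ssh
# ssh-copy-id root@tinygao2
# ssh-copy-id root@tinygao3
```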

3. Firewall

Two options:

  • Allow the cluster subnet through:

# iptables -A INPUT -i ens33 -s 192.168.17.0/24 -j ACCEPT

  • Disable the firewall outright (blunt, but simple):

# systemctl stop firewalld
# systemctl disable firewalld   # optional: keep it off across reboots

4. Environment variables (Hadoop as the example)

# touch /etc/profile.d/hadoop.sh
# Add the following and save (export makes the variables visible to child processes):
export HADOOP_HOME=/data/program/hadoop-2.8.0
export PATH=$PATH:$HADOOP_HOME/bin
# . /etc/profile   # apply to the current shell
# env | grep -i hadoop   # confirm the variables are set

III. Cluster Setup

1. Edit the configuration files (minimal setup)

On each node, modify the following files under $HADOOP_HOME/etc/hadoop:

  • slaves — lists the machines whose services are started remotely from tinygao1 in one step (hence the passwordless SSH requirement). Contents:

    tinygao2
    tinygao3
    
  • core-site.xml

    Reference: http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/core-default.xml. A minimal configuration:

    <?xml version="1.0" encoding="UTF-8"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    <configuration>
       <property>
            <name>hadoop.tmp.dir</name>
            <value>/data/data/hadoop/tmpdir</value>
        </property>
        <property>
            <name>fs.defaultFS</name>
            <value>hdfs://tinygao1:9000</value>
        </property>
        <property>
            <name>io.file.buffer.size</name>
            <value>131072</value>
        </property>
    </configuration>
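To double-check a value such as fs.defaultFS after editing, a quick line-oriented extractor can help; a sketch that suits the flat layout above (`get_prop` is an illustrative helper, not a Hadoop tool):

```shell
# Print the <value> that follows a given <name> in a Hadoop *-site.xml.
# A grep/sed hack for simple one-property-per-pair files, not a real XML parser.
get_prop() {
  local file=$1 name=$2
  grep -A1 "<name>$name</name>" "$file" \
    | sed -n 's:.*<value>\(.*\)</value>.*:\1:p'
}

# get_prop $HADOOP_HOME/etc/hadoop/core-site.xml fs.defaultFS
```

On a node with Hadoop on the PATH, `hdfs getconf -confKey fs.defaultFS` gives the authoritative answer, defaults included.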
    
    
  • hdfs-site.xml

    Reference: http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml. (Note: dfs.replication is set to 3 below, but with only two DataNodes HDFS can hold at most two replicas; 2 would match this cluster.)

    <?xml version="1.0" encoding="UTF-8"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    <configuration>
       <property>
        <name>dfs.replication</name>
        <value>3</value>
      </property>
      <property>
            <name>dfs.namenode.name.dir</name>
            <value>file:///data/data/hadoop/namenode</value>
        </property>
       <property>
            <name>dfs.webhdfs.enabled</name>
            <value>true</value>
        </property>
       <property>
            <name>dfs.namenode.handler.count</name>
            <value>20</value>
        </property>
    </configuration>
    
  • mapred-site.xml

    Reference: http://hadoop.apache.org/docs/current/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml

    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    <configuration>
        <property>
            <name>mapreduce.framework.name</name>
            <value>yarn</value>
            <description>Execution framework set to Hadoop YARN.</description>
        </property>
        <property>
            <name>mapreduce.jobhistory.address</name>
            <value>tinygao1:10020</value>
            <description>MapReduce JobHistory Server host:port, default port is 10020.</description>
        </property>
        <property>
            <name>mapreduce.jobhistory.webapp.address</name>
            <value>tinygao1:19888</value>
            <description>MapReduce JobHistory Server Web UI host:port, default port is 19888.</description>
        </property>
    </configuration>
    
  • yarn-site.xml

    Reference: http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-common/yarn-default.xml

    <?xml version="1.0"?>
    <configuration>
       <property>
            <name>yarn.nodemanager.aux-services</name>
            <value>mapreduce_shuffle</value>
        </property>
        <property>
            <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
            <value>org.apache.hadoop.mapred.ShuffleHandler</value>
        </property>
        <property>
            <name>yarn.resourcemanager.address</name>
            <value>tinygao1:8032</value>
        </property>
        <property>
            <name>yarn.resourcemanager.scheduler.address</name>
            <value>tinygao1:8030</value>
        </property>
        <property>
            <name>yarn.resourcemanager.resource-tracker.address</name>
            <value>tinygao1:8031</value>
        </property>
        <property>
            <name>yarn.resourcemanager.admin.address</name>
            <value>tinygao1:8033</value>
        </property>
        <property>
           <name>yarn.resourcemanager.webapp.address</name>
           <value>tinygao1:8088</value>
        </property>
    </configuration>
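All four files (plus slaves) must be identical on every node. A sketch for pushing them out from tinygao1, relying on the passwordless SSH set up in section II (`push_conf` is an illustrative helper; pass `echo` as the first argument to preview the commands without copying):

```shell
# Copy the key Hadoop config files to each slave node.
push_conf() {
  local run=$1   # empty string to execute, "echo" to dry-run
  local conf=${HADOOP_HOME:-/data/program/hadoop-2.8.0}/etc/hadoop
  for h in tinygao2 tinygao3; do
    $run scp "$conf"/*.xml "$conf/slaves" "root@$h:$conf/"
  done
}

# push_conf ""     # actually copy
# push_conf echo   # just print the scp commands
```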
    

2. Start the Hadoop services

  • On tinygao1, format the NameNode. This initializes the HDFS metadata and creates a current directory under /data/data/hadoop/namenode:

    # hdfs namenode -format
    # cd $HADOOP_HOME/sbin
    # ./start-dfs.sh   # starts the NameNode locally and, via the slaves file, the DataNodes on tinygao2 and tinygao3 over SSH
    # ./start-yarn.sh

    Running jps -ml on each node should now show:

    tinygao1

    10288 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager
    9897 org.apache.hadoop.hdfs.server.namenode.NameNode
    10109 org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode
    

    tinygao2

    2755 org.apache.hadoop.hdfs.server.datanode.DataNode
    2874 org.apache.hadoop.yarn.server.nodemanager.NodeManager
    

    tinygao3

    11184 org.apache.hadoop.yarn.server.nodemanager.NodeManager
    10994 org.apache.hadoop.hdfs.server.datanode.DataNode
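A quick way to compare such jps output against what a node should be running; a sketch (`check_daemons` is an illustrative helper that takes the jps -ml output as its first argument):

```shell
# Report whether each expected daemon class appears in jps -ml output.
check_daemons() {
  local jps_out=$1; shift
  local d
  for d in "$@"; do
    # jps -ml prints the fully qualified class, so match ".<Daemon>" at end of line
    if echo "$jps_out" | grep -q "\.$d\$"; then
      echo "$d: running"
    else
      echo "$d: MISSING"
    fi
  done
}

# On the master:
# check_daemons "$(jps -ml)" NameNode SecondaryNameNode ResourceManager
# On a slave:
# check_daemons "$(jps -ml)" DataNode NodeManager
```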
    
