
    I. HDFS Cluster Setup

    Preparation

    1) Disable the firewall

    systemctl stop firewalld.service

    systemctl disable firewalld.service

    2) Install the JDK and set environment variables

    vi ~/.bash_profile

    JAVA_HOME=/root/training/jdk1.8.0_144

    export JAVA_HOME

    PATH=$JAVA_HOME/bin:$PATH

    export PATH

    source ~/.bash_profile
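
    To confirm the JDK is active in the current shell:

    java -version

    The reported version should be 1.8.0_144.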

    3) Configure host name mappings: vi /etc/hosts

    172.16.112.11 bigdata11

    172.16.112.12 bigdata12

    172.16.112.13 bigdata13

    4) Configure passwordless SSH login between every pair of nodes

    ssh-keygen -t rsa

    Press Enter at every prompt during key generation to accept the defaults.

    ssh-copy-id -i ~/.ssh/id_rsa.pub root@bigdata11

    ssh-copy-id -i ~/.ssh/id_rsa.pub root@bigdata12

    ssh-copy-id -i ~/.ssh/id_rsa.pub root@bigdata13
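
    Repeat the key generation and ssh-copy-id steps on each of the three nodes so that every pair can log in without a password. A quick check, which should not prompt for a password:

    ssh bigdata12 date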

    5) Keep the clocks on all machines in sync

    If the clocks differ, MapReduce jobs may run into problems.

    In SecureCRT, open View -> Command Window, then right-click in the input box and choose Send Command to All Sessions.

    Type date -s 2018-11-04 to set the same time on all of the VMs.
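
    Note that date -s only sets the clock once. If the VMs have outbound network access, an NTP-based sync is more reliable (this assumes the ntpdate tool is installed):

    ntpdate pool.ntp.org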

    Setup

    1) Set environment variables

    vi ~/.bash_profile

    HADOOP_HOME=/root/training/hadoop-2.7.3

    export HADOOP_HOME

    PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH

    export PATH

    source ~/.bash_profile
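
    A quick sanity check that Hadoop is on the PATH:

    hadoop version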

    2) Edit the configuration files

    cd ~/training/hadoop-2.7.3/etc/hadoop/

    hadoop-env.sh

    export JAVA_HOME=/root/training/jdk1.8.0_144

    hdfs-site.xml: set the block replication factor (the default is 3)

    <property>

        <name>dfs.replication</name>

        <value>3</value>

    </property>

    core-site.xml: set the NameNode address and the HDFS data directory (the default is the Linux tmp directory, which is cleared on reboot)

    Create the /root/training/hadoop-2.7.3/tmp directory yourself beforehand.
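
    For example:

    mkdir /root/training/hadoop-2.7.3/tmp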

    <property>

        <name>fs.defaultFS</name>

        <value>hdfs://bigdata11:9000</value>

    </property>   

    <property>

        <name>hadoop.tmp.dir</name>

        <value>/root/training/hadoop-2.7.3/tmp</value>

    </property>

    slaves

    bigdata12

    bigdata13

    3) Format the NameNode

    hdfs namenode -format
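
    On success, the output should contain a line similar to the following (the path follows hadoop.tmp.dir):

    Storage directory /root/training/hadoop-2.7.3/tmp/dfs/name has been successfully formatted.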

    4) Copy the configured Hadoop directory from the master node to the slave nodes

    scp -r hadoop-2.7.3/ root@bigdata12:/root/training

    scp -r hadoop-2.7.3/ root@bigdata13:/root/training

    5) Start HDFS on the master node

    start-dfs.sh

    Startup log output:

    Starting namenodes on [bigdata11]

    bigdata11: starting namenode, logging to /root/training/hadoop-2.7.3/logs/hadoop-root-namenode-bigdata11.out

    bigdata13: starting datanode, logging to /root/training/hadoop-2.7.3/logs/hadoop-root-datanode-bigdata13.out

    bigdata12: starting datanode, logging to /root/training/hadoop-2.7.3/logs/hadoop-root-datanode-bigdata12.out

    Starting secondary namenodes [0.0.0.0]

    0.0.0.0: starting secondarynamenode, logging to /root/training/hadoop-2.7.3/logs/hadoop-root-secondarynamenode-bigdata11.out

    Optimization: set the SecondaryNameNode location (by default it starts on 0.0.0.0, as seen in the log above)

    hdfs-site.xml

    <property>

      <name>dfs.namenode.secondary.http-address</name>

      <value>bigdata12:50090</value>

    </property>
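
    Because the Hadoop directory was already copied to the slaves, the updated hdfs-site.xml has to be pushed out again and HDFS stopped before restarting; a minimal sketch, assuming the current directory is the etc/hadoop configuration directory:

    scp hdfs-site.xml root@bigdata12:$PWD

    scp hdfs-site.xml root@bigdata13:$PWD

    stop-dfs.sh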

    Then start HDFS again on the master node

    start-dfs.sh

    Startup log output:

    Starting namenodes on [bigdata11]

    bigdata11: starting namenode, logging to /root/training/hadoop-2.7.3/logs/hadoop-root-namenode-bigdata11.out

    bigdata13: starting datanode, logging to /root/training/hadoop-2.7.3/logs/hadoop-root-datanode-bigdata13.out

    bigdata12: starting datanode, logging to /root/training/hadoop-2.7.3/logs/hadoop-root-datanode-bigdata12.out

    Starting secondary namenodes [bigdata12]

    bigdata12: starting secondarynamenode, logging to /root/training/hadoop-2.7.3/logs/hadoop-root-secondarynamenode-bigdata12.out

    Check the running services on each node with jps:

    bigdata11

    3170 NameNode

    3398 Jps

    bigdata12

    1861 SecondaryNameNode

    1768 DataNode

    1902 Jps

    bigdata13

    1843 Jps

    1769 DataNode
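
    To confirm that both DataNodes have registered with the NameNode, you can also run the following on the master:

    hdfs dfsadmin -report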

    Access HDFS through the web UI:

    http://bigdata11:50070

    http://bigdata12:50090
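
    A quick read/write smoke test against the new file system (the /input path is just an example):

    hdfs dfs -mkdir /input

    hdfs dfs -put ~/.bash_profile /input

    hdfs dfs -ls /input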

    II. YARN Cluster Setup

    Setup

    1) Edit the configuration files

    yarn-site.xml: set the ResourceManager host and the MapReduce shuffle service

    <property>

        <name>yarn.resourcemanager.hostname</name>

        <value>bigdata11</value>

    </property>   

    <property>

        <name>yarn.nodemanager.aux-services</name>

        <value>mapreduce_shuffle</value>

    </property>

    mapred-site.xml: set the MapReduce execution framework

    This file does not exist by default; create it from the template:

    cp mapred-site.xml.template mapred-site.xml

    <property>

        <name>mapreduce.framework.name</name>

        <value>yarn</value>

    </property>

    slaves

    bigdata12

    bigdata13

    2) Copy the changed files to the same directory ($PWD) on every machine. The slaves file was already copied to the other machines during the HDFS setup, so it is not copied again here.

    scp yarn-site.xml root@bigdata12:$PWD

    scp yarn-site.xml root@bigdata13:$PWD

    scp mapred-site.xml root@bigdata12:$PWD

    scp mapred-site.xml root@bigdata13:$PWD

    3) Start the YARN cluster with the startup script

    start-yarn.sh

    Startup log output:

    starting yarn daemons

    starting resourcemanager, logging to /root/training/hadoop-2.7.3/logs/yarn-root-resourcemanager-bigdata11.out

    bigdata12: starting nodemanager, logging to /root/training/hadoop-2.7.3/logs/yarn-root-nodemanager-bigdata12.out

    bigdata13: starting nodemanager, logging to /root/training/hadoop-2.7.3/logs/yarn-root-nodemanager-bigdata13.out

    Check the running services on each node with jps:

    bigdata11

    4337 ResourceManager

    4594 Jps

    bigdata12

    2123 NodeManager

    2223 Jps

    bigdata13

    2032 Jps

    1931 NodeManager
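
    The registered NodeManagers can also be listed from the command line:

    yarn node -list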

    Access YARN through the web UI:

    http://bigdata11:8088
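
    To exercise HDFS, YARN, and MapReduce together, the example jar bundled with the distribution can be submitted (assuming the standard 2.7.3 layout under HADOOP_HOME):

    hadoop jar /root/training/hadoop-2.7.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar pi 2 10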

