Hadoop Chapter 3: Pseudo-Distributed Mode (not much use)

Author: 张磊_e325 | Published 2019-10-20 20:24

    Official documentation

    1. Configure the Java path

    Look up the JDK path, then set it explicitly in hadoop-env.sh:

    [atguigu@hadoop101 ~]$ echo $JAVA_HOME
    /opt/module/jdk1.8.0_144
    [atguigu@hadoop101 hadoop-2.7.2]$ vi etc/hadoop/hadoop-env.sh

    export JAVA_HOME=/opt/module/jdk1.8.0_144/
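
    A quick sanity check (not part of the original notes): once hadoop-env.sh is saved, bin/hadoop version should print the version banner without complaining about JAVA_HOME.

    [atguigu@hadoop101 hadoop-2.7.2]$ bin/hadoop version
    Hadoop 2.7.2
    ...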
    

    2. Configure core-site.xml

    [atguigu@hadoop101 hadoop-2.7.2]$ vi etc/hadoop/core-site.xml

    <configuration>
            <!-- Address of the HDFS NameNode (this walkthrough runs everything on hadoop101) -->
            <property>
                <name>fs.defaultFS</name>
                <value>hdfs://hadoop101:9000</value>
            </property>

            <!-- Base directory for the files Hadoop generates at runtime -->
            <property>
                <name>hadoop.tmp.dir</name>
                <value>/opt/module/hadoop-2.7.2/data/tmp</value>
            </property>
    </configuration>
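
    To confirm Hadoop is actually picking this file up, you can query the effective value (a quick check, not in the original post):

    [atguigu@hadoop101 hadoop-2.7.2]$ bin/hdfs getconf -confKey fs.defaultFS
    hdfs://hadoop101:9000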
    

    3. Configure hdfs-site.xml

    [atguigu@hadoop101 hadoop-2.7.2]$ vi etc/hadoop/hdfs-site.xml

    <configuration>
            <!-- Number of replicas (the default is 3, but 3 copies are pointless on a single pseudo-distributed node) -->
            <property>
                <name>dfs.replication</name>
                <value>1</value>
            </property>
    </configuration>
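
    The same kind of check works here (again, just a sanity check and not part of the original steps):

    [atguigu@hadoop101 hadoop-2.7.2]$ bin/hdfs getconf -confKey dfs.replication
    1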
    

    4. Configure SSH

    4.1 Test SSH

    [atguigu@hadoop101 hadoop-2.7.2]$ ssh localhost
    The authenticity of host 'localhost (::1)' can't be established.
    ECDSA key fingerprint is SHA256:a/fLj+zcF6SbAWiYcfZ/fy15Sky+kIyHEpBDHks+VDI.
    ECDSA key fingerprint is MD5:49:5b:e0:13:a4:45:fd:25:6a:68:74:ca:26:2b:b7:54.
    Are you sure you want to continue connecting (yes/no)? yes
    Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
    atguigu@localhost's password: 
    Last login: Sun Oct 20 12:43:35 2019 from 192.168.37.1
    

    4.2 If passwordless SSH to the local machine is not set up yet, run the following commands:

    [atguigu@hadoop101 ~]$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
    Generating public/private dsa key pair.
    Your identification has been saved in /home/atguigu/.ssh/id_dsa.
    Your public key has been saved in /home/atguigu/.ssh/id_dsa.pub.
    The key fingerprint is:
    SHA256:toc5O2iScootgkTJN5Dim45WuhPGft0m8BpKEaqlkYo atguigu@hadoop101
    The key's randomart image is:
    +---[DSA 1024]----+
    |  .              |
    |.o               |
    |+.o              |
    |.*.o             |
    |*.= .   S        |
    |+@.o   . +       |
    |E.+.+...= .      |
    |*Oo.=+oo.+       |
    |++*=.oo ..       |
    +----[SHA256]-----+
    [atguigu@hadoop101 ~]$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
    [atguigu@hadoop101 ~]$ chmod 0600 ~/.ssh/authorized_keys
    [atguigu@hadoop101 ~]$ ssh localhost
    Last login: Sun Oct 20 12:55:26 2019 from localhost
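
    To confirm the login is really passwordless now (a small extra check, not in the original transcript), force a non-interactive connection; it should print OK without prompting for a password:

    [atguigu@hadoop101 ~]$ ssh -o BatchMode=yes localhost 'echo OK'
    OK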
    

    5. Run a MapReduce job locally

    5.1 Format the filesystem:

    [atguigu@hadoop101 hadoop-2.7.2]$ bin/hdfs namenode -format
    ...
    common.Storage: Storage directory /opt/module/hadoop-2.7.2/data/tmp/dfs/name has been successfully formatted.
    ...
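
    Formatting writes a fresh cluster ID into the NameNode's metadata directory (the path follows from the hadoop.tmp.dir set above). A sketch of where to find it, with the actual ID replaced by a placeholder:

    [atguigu@hadoop101 hadoop-2.7.2]$ cat data/tmp/dfs/name/current/VERSION
    ...
    clusterID=CID-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
    ...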
    

    5.2 Start the NameNode and DataNode

    [atguigu@hadoop101 hadoop-2.7.2]$ hadoop-daemon.sh start namenode
    starting namenode, logging to /opt/module/hadoop-2.7.2/logs/hadoop-atguigu-namenode-hadoop101.out
    [atguigu@hadoop101 hadoop-2.7.2]$ hadoop-daemon.sh start datanode
    starting datanode, logging to /opt/module/hadoop-2.7.2/logs/hadoop-atguigu-datanode-hadoop101.out
    [atguigu@hadoop101 hadoop-2.7.2]$ jps
    7300 Jps
    7237 DataNode
    7147 NameNode
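
    Besides jps, you can ask the NameNode whether the DataNode has registered (output abbreviated; this check is not in the original notes):

    [atguigu@hadoop101 hadoop-2.7.2]$ bin/hdfs dfsadmin -report
    ...
    Live datanodes (1):
    ...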
    

    5.3 Open the web UI

    http://192.168.37.101:50070
    If the DataNode shows up there, the setup worked.
    Note: reformatting the NameNode generates a new cluster ID, so the NameNode's and DataNode's cluster IDs no longer match and the cluster can no longer find its old data. Before reformatting the NameNode, always delete the data directory and the log directory first, then format.
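
    If you do need to reformat, a safe sequence looks roughly like this (a sketch based on the note above; it also stops the daemons first, and the relative paths follow from the hadoop.tmp.dir configured earlier):

    [atguigu@hadoop101 hadoop-2.7.2]$ hadoop-daemon.sh stop datanode
    [atguigu@hadoop101 hadoop-2.7.2]$ hadoop-daemon.sh stop namenode
    [atguigu@hadoop101 hadoop-2.7.2]$ rm -rf data/ logs/
    [atguigu@hadoop101 hadoop-2.7.2]$ bin/hdfs namenode -format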

    6. Work with HDFS

    [atguigu@hadoop101 hadoop-2.7.2]$ bin/hdfs dfs -mkdir -p /user/atguigu/input
    [atguigu@hadoop101 hadoop-2.7.2]$ bin/hdfs dfs -put wcinput/wc.input /user/atguigu/input
    [atguigu@hadoop101 hadoop-2.7.2]$ bin/hdfs dfs -ls /user/atguigu/input
    Found 1 items
    -rw-r--r--   1 atguigu supergroup         57 2019-10-20 13:44 /user/atguigu/input/wc.input
    [atguigu@hadoop101 hadoop-2.7.2]$ bin/hdfs dfs -cat /user/atguigu/input/wc.input
    hadoop hdfs
    hadoop mapreduce
    hadoop yarn
    atguigu
    atguigu
    [atguigu@hadoop101 hadoop-2.7.2]$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar wordcount /user/atguigu/input /user/atguigu/output
    [atguigu@hadoop101 hadoop-2.7.2]$ bin/hdfs dfs -cat /user/atguigu/output/*
    atguigu 2
    hadoop  3
    hdfs    1
    mapreduce   1
    yarn    1
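
    The job also leaves a _SUCCESS marker and one part file per reducer in the output directory, which you can list before reading the results (not shown in the original run; sizes and timestamps omitted):

    [atguigu@hadoop101 hadoop-2.7.2]$ bin/hdfs dfs -ls /user/atguigu/output
    Found 2 items
    ...  /user/atguigu/output/_SUCCESS
    ...  /user/atguigu/output/part-r-00000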
    
    

    (Screenshot: web UI)

    7. Start YARN and run a MapReduce job on it

    [atguigu@hadoop101 hadoop-2.7.2]$ vi etc/hadoop/yarn-env.sh

    export JAVA_HOME=/opt/module/jdk1.8.0_144
    

    [atguigu@hadoop101 hadoop-2.7.2]$ vi etc/hadoop/yarn-site.xml

    <configuration>
            <!-- How reducers fetch map output (the shuffle auxiliary service) -->
            <property>
                    <name>yarn.nodemanager.aux-services</name>
                    <value>mapreduce_shuffle</value>
            </property>
            <!-- Hostname of the YARN ResourceManager -->
            <property>
                    <name>yarn.resourcemanager.hostname</name>
                    <value>hadoop101</value>
            </property>
    </configuration>
    

    [atguigu@hadoop101 hadoop-2.7.2]$ vi etc/hadoop/mapred-env.sh

    export JAVA_HOME=/opt/module/jdk1.8.0_144
    
    [atguigu@hadoop101 hadoop-2.7.2]$ mv etc/hadoop/mapred-site.xml.template etc/hadoop/mapred-site.xml
    [atguigu@hadoop101 hadoop-2.7.2]$ vi etc/hadoop/mapred-site.xml
    
    <configuration>
            <!-- Run MapReduce jobs on YARN instead of the local runner -->
            <property>
                    <name>mapreduce.framework.name</name>
                    <value>yarn</value>
            </property>
    </configuration>
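
    A quick way to re-check both new files without opening them again (just a grep; not part of the original steps):

    [atguigu@hadoop101 hadoop-2.7.2]$ grep -A1 '<name>' etc/hadoop/yarn-site.xml etc/hadoop/mapred-site.xml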
    

    Note: make sure HDFS is already running before starting YARN.

    [atguigu@hadoop101 hadoop-2.7.2]$ jps
    7237 DataNode
    7147 NameNode
    7643 Jps
    [atguigu@hadoop101 hadoop-2.7.2]$ sbin/yarn-daemon.sh start resourcemanager
    starting resourcemanager, logging to /opt/module/hadoop-2.7.2/logs/yarn-atguigu-resourcemanager-hadoop101.out
    [atguigu@hadoop101 hadoop-2.7.2]$ sbin/yarn-daemon.sh start nodemanager
    starting nodemanager, logging to /opt/module/hadoop-2.7.2/logs/yarn-atguigu-nodemanager-hadoop101.out
    [atguigu@hadoop101 hadoop-2.7.2]$ jps
    8450 Jps
    7237 DataNode
    7909 NodeManager
    7671 ResourceManager
    7147 NameNode
    [atguigu@hadoop101 hadoop-2.7.2]$ bin/hdfs dfs -rm -R /user/atguigu/output
    19/10/20 14:02:06 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 0 minutes.
    Deleted /user/atguigu/output
    [atguigu@hadoop101 hadoop-2.7.2]$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar wordcount /user/atguigu/input /user/atguigu/output
    [atguigu@hadoop101 hadoop-2.7.2]$ bin/hdfs dfs -cat /user/atguigu/output/*
    atguigu 2
    hadoop  3
    hdfs    1
    mapreduce   1
    yarn    1
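
    To confirm this second run actually went through YARN rather than the local runner, list finished applications or open the ResourceManager UI at http://hadoop101:8088 (a sanity check that is not in the original transcript; output abbreviated):

    [atguigu@hadoop101 hadoop-2.7.2]$ bin/yarn application -list -appStates FINISHED
    ...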
    
