Getting Started with Hadoop 2.8

Author: 大奇聊数据 | Published 2019-01-24 23:00

    1. Environment Preparation

    [root@host196 hadoop-2.8.5]# cat /etc/hosts
    127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
    ::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
    192.168.74.10  host196
    192.168.74.29  host197
    192.168.74.30  host198
    
    

    Install JDK 1.8 and the ZooKeeper cluster, and plan which role each machine will run.

    Enable passwordless SSH login between the nodes.
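
    A minimal sketch of setting up passwordless SSH from the master to all nodes (assuming the root account, as in the examples below):

    # on host196: generate a key pair, accepting the defaults
    ssh-keygen -t rsa
    # push the public key to every node, including host196 itself
    ssh-copy-id root@host196
    ssh-copy-id root@host197
    ssh-copy-id root@host198
    # verify that no password prompt appears
    ssh root@host197 hostname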

    2. Installation Steps

    cd /opt/
    wget https://mirrors.tuna.tsinghua.edu.cn/apache/hadoop/common/hadoop-2.8.5/hadoop-2.8.5.tar.gz
    tar -zxvf hadoop-2.8.5.tar.gz
    
    Configure the environment variables:
    vi /etc/profile
    
    export HADOOP_HOME=/opt/hadoop-2.8.5
    export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
    export ZOOKEEPER_HOME=/opt/zookeeper-3.4.8
    
    export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$ZOOKEEPER_HOME/bin:$PATH
    
    source /etc/profile
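
    A quick sanity check that the tools are on the PATH (assuming JAVA_HOME is already set from the JDK install):

    java -version     # should report 1.8.x
    hadoop version    # should print Hadoop 2.8.5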
    
    Edit hadoop-env.sh and set JAVA_HOME explicitly:
    vi hadoop-env.sh
    export JAVA_HOME=/usr/local/jdk1.8.0_111
    
    Edit core-site.xml:
    vi core-site.xml
    <configuration>
        <property>
            <name>fs.defaultFS</name>
            <value>hdfs://host196:9000</value>
        </property>
        <property>
            <name>hadoop.tmp.dir</name>
            <value>file:/opt/hadoop-2.8.5/tmp</value>
        </property>
    </configuration>
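
    fs.defaultFS points all clients at the NameNode on host196; hadoop.tmp.dir is the local base directory HDFS stores its data under. Creating it up front on every node avoids startup surprises (a precaution, not a required step):

    mkdir -p /opt/hadoop-2.8.5/tmp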
    
    Edit hdfs-site.xml (dfs.replication is 2 to match the two DataNodes):
    vi hdfs-site.xml
    <configuration>
        <property>
            <name>dfs.namenode.secondary.http-address</name>
            <value>host196:50090</value>
        </property>
        <property>
            <name>dfs.replication</name>
            <value>2</value>
        </property>
        <property>
            <name>dfs.namenode.name.dir</name>
            <value>file:/opt/hadoop-2.8.5/tmp/dfs/name</value>
        </property>
        <property>
            <name>dfs.datanode.data.dir</name>
            <value>file:/opt/hadoop-2.8.5/tmp/dfs/data</value>
        </property>
    </configuration>
    
    Edit mapred-site.xml, copying it from the bundled template first:
    cp mapred-site.xml.template mapred-site.xml
    vi mapred-site.xml
    <configuration>
        <property>
            <name>mapreduce.framework.name</name>
            <value>yarn</value>
        </property>
        <property>
            <name>mapreduce.jobhistory.address</name>
            <value>host196:10020</value>
        </property>
        <property>
            <name>mapreduce.jobhistory.webapp.address</name>
            <value>host196:19888</value>
        </property>
    </configuration>
    
    Edit yarn-site.xml:
    vi yarn-site.xml
    <configuration>
    <!-- Site specific YARN configuration properties -->
        <property>
            <name>yarn.resourcemanager.hostname</name>
            <value>host196</value>
        </property>
        <property>
            <name>yarn.nodemanager.aux-services</name>
            <value>mapreduce_shuffle</value>
        </property>
    </configuration>
    
    Edit the slaves file to list the worker nodes:
    vi slaves
    host197
    host198
    
    Copy the Hadoop directory to host197 and host198 with scp:
    scp -r hadoop-2.8.5 host197:/opt/
    scp -r hadoop-2.8.5 host198:/opt/
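
    The /etc/hosts and /etc/profile changes must be present on the workers too; one way (assuming all three nodes are laid out identically) is to copy the files wholesale:

    scp /etc/hosts /etc/profile host197:/etc/
    scp /etc/hosts /etc/profile host198:/etc/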
    
    Format HDFS. This is needed only once, on the master node; look for "successfully formatted" in the output (DataNodes are not formatted):
    hdfs namenode -format
    
    

    3. Starting and Stopping the Services

    • [Start]
    Disable the firewall first:
    systemctl stop firewalld.service
    start-dfs.sh
    start-yarn.sh
    mr-jobhistory-daemon.sh start historyserver
        or
    start-all.sh
    mr-jobhistory-daemon.sh start historyserver
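
    Once started, running jps on each node is a quick way to confirm the daemons are up. Expected process names for this layout:

    jps
    # host196: NameNode, SecondaryNameNode, ResourceManager, JobHistoryServer
    # host197 / host198: DataNode, NodeManager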
    
    
    • [Stop]
    stop-all.sh
    
    

    Access the web UIs: http://192.168.74.10:50070 (HDFS NameNode)

    http://192.168.74.10:8088 (YARN ResourceManager)

    4. Quick Test

    Create a directory on HDFS:

    hadoop fs -mkdir -p /opt/hdfs_test/input
    
    

    List the directories that were created:

    [root@host196 hadoop]# hadoop fs -ls /
    Found 2 items
    drwxr-xr-x   - root supergroup          0 2018-10-18 17:54 /opt
    drwxrwx---   - root supergroup          0 2018-10-18 17:40 /tmp
    [root@host196 hadoop]# hadoop fs -ls /opt/hdfs_test
    Found 2 items
    drwxr-xr-x   - root supergroup          0 2018-10-18 18:06 /opt/hdfs_test/input
    drwxr-xr-x   - root supergroup          0 2018-10-19 10:11 /opt/hdfs_test/output
    
    

    Create a file named words.txt:

    vi words.txt
    hello zhangsan
    hello lisi
    hello wangwu
    
    

    Upload it to the /opt/hdfs_test/input directory on HDFS:

    [root@host196 ~]# hadoop fs -put words.txt /opt/hdfs_test/input
    
    

    Download the file just uploaded into the local ~/data directory (which must already exist):

    mkdir -p ~/data
    hadoop fs -get /opt/hdfs_test/input/words.txt ~/data
    
    

    Delete a directory on HDFS (-rmdir only removes empty directories; use -rm -r otherwise):

    hadoop fs -rm -r /opt/hdfs_test/output
    
    

    Run one of the bundled MapReduce example programs (wordcount):

    hadoop jar /opt/hadoop-2.8.5/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.5.jar wordcount /opt/hdfs_test/input /opt/hdfs_test/output
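
    Once the job finishes, the counts are written to part files under the output directory. With the default single reducer, the result for the words.txt above should look like this:

    hadoop fs -cat /opt/hdfs_test/output/part-r-00000
    hello     3
    lisi      1
    wangwu    1
    zhangsan  1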
    
    

    5. References

    1. Hadoop-2.7.7 集群快速搭建: https://blog.csdn.net/qq_33857413/article/details/82853037
    2. Hadoop学习之路(四)Hadoop集群搭建和简单应用: https://www.cnblogs.com/qingyunzong/p/8496127.html

    6. FAQ

    1. HDFS file operations from a client fail with the exception: Permission denied: user=administrator, access=WRITE, inode="/":root:supergroup:drwxr-xr-x

    Solution: add the following to hdfs-site.xml to disable permission checking, which resolves the error above (at the cost of turning off all HDFS permission checks):
    
    <property>
        <name>dfs.permissions.enabled</name>
        <value>false</value>
    </property>
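
    The setting only takes effect after HDFS is restarted, e.g.:

    stop-dfs.sh && start-dfs.sh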
    
    Alternatively, keep permission checking enabled and operate as the hdfs superuser:

    sudo -u hdfs hadoop fs -mkdir /user/root

    Switching to the hdfs user to run commands is enough.
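
    A slightly safer follow-up (hypothetical, building on the command above) is to hand the new home directory over to root, so root can write there with permission checks still on:

    sudo -u hdfs hadoop fs -chown root:root /user/root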
    
