Big Data: Installing Hadoop (macOS Mojave)

Author: etrols | Published: 2019-01-05 13:15

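    This tutorial uses the CDH distribution of Hadoop to avoid errors caused by version dependency conflicts. It applies equally to Linux (CentOS recommended).
    Hadoop runs in pseudo-distributed mode throughout.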

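    Hadoop run modes

    Local (standalone) mode

    Hadoop's default mode is non-distributed (local) mode. It runs without any configuration as a single Java process, which is convenient for debugging.

    Pseudo-distributed mode

    Hadoop can run on a single node in pseudo-distributed fashion: each Hadoop daemon runs in its own Java process, the node acts as both NameNode and DataNode, and files are read from HDFS.

    Fully distributed mode

    Multiple nodes form a cluster that runs Hadoop.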

    Downloading the CDH release of Hadoop

    Download URL: https://archive.cloudera.com/cdh5/cdh/5/
    Version: hadoop-2.6.0-cdh5.9.3.tar.gz

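    Environment setup

    Passwordless SSH login (optional, but without it Hadoop asks for a password every time it starts)

    Run the following commands in a terminal:

    zhangzhaodeMacBook-Pro:~ zhangzhao$ ssh-keygen -t rsa -P ""   # press Enter at each prompt
    zhangzhaodeMacBook-Pro:~ zhangzhao$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys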
    

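    Verify passwordless login (on macOS, Remote Login must be enabled under System Preferences > Sharing for ssh localhost to connect). A "Last login" line with no password prompt means it works:

    zhangzhaodeMacBook-Pro:~ zhangzhao$ ssh localhost
    Last login: Fri Jan  4 13:45:54 2019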
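
    Running the cat ... >> step more than once appends the same key repeatedly, and with the default StrictModes setting sshd rejects key files that other users can write to. A minimal sketch of a repeatable version of the key setup above follows. The function name authorize_key and its two arguments are hypothetical helpers for illustration, not part of OpenSSH.

```shell
#!/usr/bin/env bash
# authorize_key PUBKEY_FILE SSH_DIR   (hypothetical helper)
# Appends the public key to SSH_DIR/authorized_keys only if that exact
# line is not already present, then tightens permissions.
authorize_key() {
    local pubkey_file="$1"
    local ssh_dir="$2"
    local auth_file="$ssh_dir/authorized_keys"

    mkdir -p "$ssh_dir"
    touch "$auth_file"

    # -q quiet, -x whole-line match, -F fixed strings, -f patterns from file
    if ! grep -qxFf "$pubkey_file" "$auth_file"; then
        cat "$pubkey_file" >> "$auth_file"
    fi

    chmod 700 "$ssh_dir"
    chmod 600 "$auth_file"
}
```

    For the setup in this tutorial it would be called as: authorize_key ~/.ssh/id_rsa.pub ~/.ssh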
    

    Installing the JDK

    JDK version:
            macOS: jdk-8u192-macosx-x64.dmg
            Linux: jdk-8u192-linux-x64.tar.gz
    On macOS, double-click the .dmg to install; on Linux, extract the archive.

    Configuring the JDK environment variables:

    macOS:

    Open .bash_profile in your home directory (~):

    zhangzhaodeMacBook-Pro:~ zhangzhao$ vim .bash_profile
    

    Add the following:

    JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_192.jdk/Contents/Home/
    CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
    PATH=$JAVA_HOME/bin:$PATH
    export JAVA_HOME
    export CLASSPATH
    export PATH
    

    Finally, apply the new environment variables:

    zhangzhaodeMacBook-Pro:~ zhangzhao$ source .bash_profile
    

    Verify the JDK:

    zhangzhaodeMacBook-Pro:~ zhangzhao$ java -version
    java version "1.8.0_192"
    Java(TM) SE Runtime Environment (build 1.8.0_192-b12)
    Java HotSpot(TM) 64-Bit Server VM (build 25.192-b12, mixed mode)
    

    Linux (skip this if a default OpenJDK is already installed):

    Open .bash_profile in your home directory (~):

    vim .bash_profile
    

    Add the following:

    JAVA_HOME=/usr/lib/jdk1.8.0_192
    CLASSPATH=.:$JAVA_HOME/lib/tools.jar:$JAVA_HOME/lib/dt.jar
    PATH=$JAVA_HOME/bin:$HOME/bin:$HOME/.local/bin:$PATH
    export JAVA_HOME CLASSPATH PATH
    

    Finally, apply the new environment variables:

    source .bash_profile
    

    Verify the JDK:

    java -version
    java version "1.8.0_192"
    Java(TM) SE Runtime Environment (build 1.8.0_192-b12)
    Java HotSpot(TM) 64-Bit Server VM (build 25.192-b12, mixed mode)
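
    The hard-coded JDK paths above break as soon as the JDK is updated or installed elsewhere. As a rough sketch, the path can be looked up instead. On macOS this relies on the system helper /usr/libexec/java_home; on Linux it assumes java is already on PATH and that readlink supports -f. The function name resolve_java_home is made up for this example. With some OpenJDK 8 packages the Linux branch returns the jre subdirectory rather than the JDK root.

```shell
#!/usr/bin/env bash
# resolve_java_home (hypothetical helper): print a Java home directory,
# or return non-zero if none can be found.
resolve_java_home() {
    if [ -x /usr/libexec/java_home ]; then
        # macOS helper; -v 1.8 asks for a Java 8 installation
        /usr/libexec/java_home -v 1.8 2>/dev/null
        return
    fi

    local java_bin
    java_bin=$(command -v java) || return 1
    # Follow symlinks such as /usr/bin/java to the real binary,
    # then strip the trailing /bin/java
    java_bin=$(readlink -f "$java_bin") || return 1
    dirname "$(dirname "$java_bin")"
}
```

    In .bash_profile this could replace the fixed assignment, for example: export JAVA_HOME="$(resolve_java_home)"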
    

    Downloading Hadoop

    Use wget, or download the archive manually.
    Here it is downloaded to /Users/zhangzhao/develop/hadoop:

    zhangzhaodeMacBook-Pro:hadoop zhangzhao$ wget https://archive.cloudera.com/cdh5/cdh/5/hadoop-2.6.0-cdh5.9.3.tar.gz
    

    macOS does not ship with wget; install it with Homebrew (skip on Linux):

    zhangzhaodeMacBook-Pro:~ zhangzhao$ brew install wget
    

    Installing Homebrew, as described on the Homebrew website (skip on Linux):

    zhangzhaodeMacBook-Pro:~ zhangzhao$ /usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
    

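    See the Homebrew website for usage. Newer Homebrew releases install through a Bash script instead of this Ruby one, so check the site for the current install command.

    Extracting Hadoop

    zhangzhaodeMacBook-Pro:hadoop zhangzhao$ tar -zxvf hadoop-2.6.0-cdh5.9.3.tar.gz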
    zhangzhaodeMacBook-Pro:hadoop zhangzhao$ ls
    hadoop-2.6.0-cdh5.9.3
    hadoop-2.6.0-cdh5.9.3.tar.gz
    

    Hadoop directory layout

    zhangzhaodeMacBook-Pro:hadoop zhangzhao$ cd hadoop-2.6.0-cdh5.9.3/
    zhangzhaodeMacBook-Pro:hadoop-2.6.0-cdh5.9.3 zhangzhao$ ls
    LICENSE.txt        cloudera                     lib
    NOTICE.txt         etc                          libexec
    README.txt         examples                     sbin
    bin                examples-mapreduce1          share
    bin-mapreduce1     include                      src
    

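    bin: basic management and usage scripts. The management scripts in sbin are built on these, and users can call them directly to manage and use Hadoop.
    etc: configuration files, including core-site.xml, hdfs-site.xml, mapred-site.xml and yarn-site.xml. Files ending in .template are templates.
    lib: Hadoop's native libraries (used for compressing and decompressing data).
    sbin: scripts that start or stop the Hadoop cluster services.
    share: Hadoop's dependency JARs, documentation, and official examples.
    libexec: shell configuration files for the individual services, used to set basics such as the log output directory and startup parameters (for example JVM options).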

    Configuring Hadoop's core configuration files

    Configuration directory: ~/develop/hadoop/hadoop-2.6.0-cdh5.9.3/etc/hadoop

    hadoop-env.sh

    Open the file:

    vim hadoop-env.sh

    Set the JDK installation path:

    export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_192.jdk/Contents/Home/

    core-site.xml

    Open the file:

    vim core-site.xml

    Add the following inside the <configuration> element:

    <!-- HDFS address and port -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:8020</value>
    </property>
    <!-- Hadoop temporary data directory -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/Users/zhangzhao/develop/tmp</value>
    </property>

    hdfs-site.xml

    Open the file:

    vim hdfs-site.xml

    Add the following:

    <configuration>
        <!-- number of HDFS block replicas -->
        <property>
            <name>dfs.replication</name>
            <value>1</value>
        </property>
        <!-- where the NameNode stores the fsimage -->
        <property>
            <name>dfs.namenode.name.dir</name>
            <value>/Users/zhangzhao/develop/tmp/dfs/name</value>
        </property>
        <!-- where the DataNode stores data blocks -->
        <property>
            <name>dfs.datanode.data.dir</name>
            <value>/Users/zhangzhao/develop/tmp/dfs/data</value>
        </property>
    </configuration>
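
    Editing both files by hand makes it easy to leave a path pointing at another user's home directory. As an illustration only, the sketch below writes the same core-site.xml and hdfs-site.xml properties from a single base directory. The function name write_hadoop_conf and its arguments are hypothetical, and it overwrites any existing files in the target directory, so try it on a scratch directory first.

```shell
#!/usr/bin/env bash
# write_hadoop_conf CONF_DIR TMP_DIR   (hypothetical helper)
# Writes core-site.xml and hdfs-site.xml with the properties used in this
# tutorial. WARNING: existing files in CONF_DIR are overwritten.
write_hadoop_conf() {
    local conf_dir="$1"
    local tmp_dir="$2"
    mkdir -p "$conf_dir"

    cat > "$conf_dir/core-site.xml" <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:8020</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>${tmp_dir}</value>
    </property>
</configuration>
EOF

    cat > "$conf_dir/hdfs-site.xml" <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>${tmp_dir}/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>${tmp_dir}/dfs/data</value>
    </property>
</configuration>
EOF
}
```

    Called as write_hadoop_conf /tmp/hadoop-conf-test "$HOME/develop/tmp", it produces files that can be compared against the hand-edited ones before anything is copied into etc/hadoop.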

    Hadoop environment variables

    vim ~/.bash_profile
    

    Add the following:

    # added by Hadoop installer
    export HADOOP_HOME=/Users/zhangzhao/develop/hadoop/hadoop-2.6.0-cdh5.9.3
    export HADOOP_INSTALL=$HADOOP_HOME
    export HADOOP_MAPRED_HOME=$HADOOP_HOME
    export HADOOP_COMMON_HOME=$HADOOP_HOME
    export HADOOP_HDFS_HOME=$HADOOP_HOME
    export YARN_HOME=$HADOOP_HOME
    export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
    export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
    

    Apply the configuration:

    source ~/.bash_profile
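
    A quick sanity check after sourcing the profile can catch a mistyped HADOOP_HOME before any daemon is started. The following is a small sketch; check_hadoop_env is a hypothetical helper name, and it only inspects the HADOOP_HOME and PATH variables set above.

```shell
#!/usr/bin/env bash
# check_hadoop_env (hypothetical helper): verify that HADOOP_HOME points
# at a Hadoop install and that its bin directory is on PATH.
check_hadoop_env() {
    if [ -z "${HADOOP_HOME:-}" ]; then
        echo "HADOOP_HOME is not set" >&2
        return 1
    fi
    if [ ! -x "$HADOOP_HOME/bin/hdfs" ]; then
        echo "no executable at $HADOOP_HOME/bin/hdfs" >&2
        return 1
    fi
    # Wrap PATH in colons so the first and last entries match too
    case ":$PATH:" in
        *":$HADOOP_HOME/bin:"*) ;;
        *) echo "$HADOOP_HOME/bin is not on PATH" >&2; return 1 ;;
    esac
    echo "ok"
}
```

    If the check prints ok, running hadoop version should report the CDH build that was unpacked.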
    

    Formatting, starting, and stopping HDFS

    Formatting HDFS

    Note: run this step only once, at initial setup. Reformatting wipes all data on HDFS.

    zhangzhaodeMacBook-Pro:bin zhangzhao$ hdfs namenode -format
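
    Because an accidental second format destroys the data, it can help to guard the command. The sketch below assumes that a formatted NameNode directory contains a current/VERSION file, and the helper names namenode_is_formatted and format_once are invented for this example. The directory passed in should be the dfs.namenode.name.dir value from hdfs-site.xml.

```shell
#!/usr/bin/env bash
# namenode_is_formatted NAME_DIR   (hypothetical helper)
# Assumption: a formatted NameNode directory holds current/VERSION.
namenode_is_formatted() {
    [ -f "$1/current/VERSION" ]
}

# format_once NAME_DIR   (hypothetical helper)
# Runs the format command only when NAME_DIR is not formatted yet.
format_once() {
    local name_dir="$1"
    if namenode_is_formatted "$name_dir"; then
        echo "already formatted: $name_dir"
    else
        hdfs namenode -format
    fi
}
```

    With the paths used in this tutorial the call would be: format_once /Users/zhangzhao/develop/tmp/dfs/name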
    

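    A successful format ends with log output reporting that the storage directory has been successfully formatted.

    [Screenshot: HDFS format log]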

    Starting HDFS

    zhangzhaodeMacBook-Pro:hadoop-2.6.0-cdh5.9.3 zhangzhao$ cd sbin/
    zhangzhaodeMacBook-Pro:sbin zhangzhao$ start-dfs.sh 
    19/01/05 12:43:51 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    Starting namenodes on [localhost]
    localhost: starting namenode, logging to /Users/zhangzhao/develop/hadoop/hadoop-2.6.0-cdh5.9.3/logs/hadoop-zhangzhao-namenode-zhangzhaodeMacBook-Pro.local.out
    localhost: starting datanode, logging to /Users/zhangzhao/develop/hadoop/hadoop-2.6.0-cdh5.9.3/logs/hadoop-zhangzhao-datanode-zhangzhaodeMacBook-Pro.local.out
    Starting secondary namenodes [account.jetbrains.com]
    account.jetbrains.com: starting secondarynamenode, logging to /Users/zhangzhao/develop/hadoop/hadoop-2.6.0-cdh5.9.3/logs/hadoop-zhangzhao-secondarynamenode-zhangzhaodeMacBook-Pro.local.out
    19/01/05 12:44:06 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    

    Verify that HDFS started

    zhangzhaodeMacBook-Pro:sbin zhangzhao$ jps
    87715 NameNode
    87781 DataNode
    87871 SecondaryNameNode
    87950 Jps
    

    If NameNode, DataNode, and SecondaryNameNode are all listed, HDFS started successfully.
    HDFS web UI: http://localhost:50070
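
    Reading the jps listing by eye works for three daemons but is easy to get wrong later with five. As a sketch, the check can be scripted by comparing the second column of the jps output against a list of expected names. The function missing_daemons is a hypothetical helper; it assumes the "PID Name" line format shown above.

```shell
#!/usr/bin/env bash
# missing_daemons NAME...   (hypothetical helper)
# Reads jps-style lines ("PID Name") on stdin and prints every expected
# daemon name that does not appear in the listing.
missing_daemons() {
    local listing name
    listing=$(cat)
    for name in "$@"; do
        # Exact match on column 2, so NameNode does not match
        # SecondaryNameNode
        if ! printf '%s\n' "$listing" |
            awk -v n="$name" '$2 == n { found = 1 } END { exit !found }'; then
            echo "$name"
        fi
    done
}
```

    Running jps | missing_daemons NameNode DataNode SecondaryNameNode prints nothing when all three are up, and otherwise names the ones that are missing.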

    Stopping HDFS

    zhangzhaodeMacBook-Pro:sbin zhangzhao$ stop-dfs.sh 
    19/01/05 12:47:47 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    Stopping namenodes on [localhost]
    localhost: stopping namenode
    localhost: stopping datanode
    Stopping secondary namenodes [account.jetbrains.com]
    account.jetbrains.com: stopping secondarynamenode
    19/01/05 12:48:05 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    zhangzhaodeMacBook-Pro:sbin zhangzhao$ jps
    88263 Jps
    

    Starting the Hadoop cluster

    zhangzhaodeMacBook-Pro:sbin zhangzhao$ start-all.sh 
    This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
    19/01/05 13:13:07 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    Starting namenodes on [localhost]
    localhost: namenode running as process 88426. Stop it first.
    localhost: datanode running as process 88500. Stop it first.
    Starting secondary namenodes [account.jetbrains.com]
    account.jetbrains.com: secondarynamenode running as process 88592. Stop it first.
    19/01/05 13:13:10 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    starting yarn daemons
    starting resourcemanager, logging to /Users/zhangzhao/develop/hadoop/hadoop-2.6.0-cdh5.9.3/logs/yarn-zhangzhao-resourcemanager-zhangzhaodeMacBook-Pro.local.out
    localhost: starting nodemanager, logging to /Users/zhangzhao/develop/hadoop/hadoop-2.6.0-cdh5.9.3/logs/yarn-zhangzhao-nodemanager-zhangzhaodeMacBook-Pro.local.out
    zhangzhaodeMacBook-Pro:sbin zhangzhao$ jps
    88592 SecondaryNameNode
    88500 DataNode
    89591 NodeManager
    88426 NameNode
    89519 ResourceManager
    89615 Jps
    

    If jps lists the five services above, the cluster is running normally.

    Stopping the Hadoop cluster

    zhangzhaodeMacBook-Pro:sbin zhangzhao$ stop-all.sh

    Afterwards, jps should list only the Jps process itself, as in the HDFS stop example earlier.
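
    The start-all.sh output earlier in this section reports that the script is deprecated in favor of start-dfs.sh and start-yarn.sh. One possible way to follow that advice is a pair of thin wrappers, sketched below. The names start_cluster and stop_cluster are hypothetical, and the wrappers assume $HADOOP_HOME/sbin is on PATH as configured above.

```shell
#!/usr/bin/env bash
# start_cluster (hypothetical helper): bring up HDFS first, then YARN.
# YARN is only started if HDFS started without error.
start_cluster() {
    start-dfs.sh && start-yarn.sh
}

# stop_cluster (hypothetical helper): shut down in reverse order,
# YARN first, then HDFS.
stop_cluster() {
    stop-yarn.sh
    stop-dfs.sh
}
```

    After start_cluster, jps should show the five services; after stop_cluster, only Jps.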
    


    Article link: https://www.haomeiwen.com/subject/uhnhrqtx.html