
Installing Hadoop + Hive + HBase + Spark with Docker on macOS

Author: 菇菇菇呀 | Published 2020-11-17 13:52

    Versions:
    Hadoop 3.2.1 + apache-hive-3.1.2 + hbase-2.2.6 + spark-3.0.1 + mysql:8.0.22

    This installation is done on a Mac using Docker; routine Docker operations are not covered here.

    Since Hadoop, Hive, and the other components have version compatibility constraints, check the compatibility information on the official site before installing:

    http://hive.apache.org/downloads.html

    The stack could also be set up from a docker-compose.yml file, but since my Docker knowledge is still limited, I used the approach I am more familiar with.

    Hadoop

    1. Pull the Hadoop image

    docker pull registry.cn-hangzhou.aliyuncs.com/hadoop_test/hadoop_base
    
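    Optionally confirm that the image was pulled:

    docker images | grep hadoop_base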

    2. Run the containers

    As for the path of the workers file: the Hadoop installation directory can be found from the environment variables configured in /etc/profile.

    # Check the environment variable configuration
    cat /etc/profile
    

    Check the workers file (typically $HADOOP_HOME/etc/hadoop/workers); for this cluster it should list the Slave hostnames.

    • Create an internal Docker network for the Hadoop cluster

      # Specify a fixed IP subnet
      docker network create --driver=bridge --subnet=172.19.0.0/16  hadoop
      
    • Create the Master container and map the ports (port 10000 is the HiveServer2 port)

      docker run -it --network hadoop -h Master --name Master -p 9870:9870 -p 8088:8088 -p 10000:10000 registry.cn-hangzhou.aliyuncs.com/hadoop_test/hadoop_base bash
      
    • Create the Slave1 container

    docker run -it --network hadoop -h Slave1 --name Slave1 registry.cn-hangzhou.aliyuncs.com/hadoop_test/hadoop_base bash
    
    • Create the Slave2 container

      docker run -it --network hadoop -h Slave2 --name Slave2 registry.cn-hangzhou.aliyuncs.com/hadoop_test/hadoop_base bash
      
      

      Configure the hosts file in each container so the three hostnames resolve (see the check below):

      172.19.0.4    Master
      172.19.0.3    Slave1
      172.19.0.2    Slave2
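
      # To confirm the IPs Docker actually assigned on the hadoop network:
      docker network inspect hadoop | grep -E '"Name"|"IPv4Address"'

      # The entries can be appended from the host like this (repeat for Slave1 and Slave2);
      # note that Docker regenerates /etc/hosts when a container is recreated:
      docker exec -it Master bash -c 'printf "172.19.0.4    Master\n172.19.0.3    Slave1\n172.19.0.2    Slave2\n" >> /etc/hosts'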
      

    3. Start Hadoop

    Although the Hadoop paths are already set in the system environment variables inside the container, you must run source /etc/profile each time you enter a container for them to take effect.
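
    For example, right after entering a container (a minimal check; the HADOOP_HOME value comes from the image's /etc/profile):

    source /etc/profile
    echo $HADOOP_HOME   # should print the Hadoop installation directory, /usr/local/hadoop
    which hdfs          # resolves once PATH has been updated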

    List the running containers

    docker ps
    


    Enter the container with docker:

    # Enter the Master container
    docker exec -it Master /bin/bash
    # After entering, format HDFS
    hadoop namenode -format
    


    Start all services

    root@Master:/usr/local/hadoop/sbin# ./start-all.sh
    
    Starting namenodes on [Master]
    Master: Warning: Permanently added 'master,172.19.0.4' (ECDSA) to the list of known hosts.
    Starting datanodes
    Slave1: Warning: Permanently added 'slave1,172.19.0.3' (ECDSA) to the list of known hosts.
    Slave2: Warning: Permanently added 'slave2,172.19.0.2' (ECDSA) to the list of known hosts.
    Slave1: WARNING: /usr/local/hadoop/logs does not exist. Creating.
    Slave2: WARNING: /usr/local/hadoop/logs does not exist. Creating.
    Starting secondary namenodes [Master]
    Starting resourcemanager
    Starting nodemanagers
    

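    After start-all.sh completes, a quick sanity check is to run jps in each container (a minimal sketch; the exact process list depends on the workers configuration, but it should roughly match the jps output shown in the Spark section at the end):

    # On the Master container
    jps
    # On the Slave containers
    docker exec -it Slave1 bash -c 'source /etc/profile && jps'
    docker exec -it Slave2 bash -c 'source /etc/profile && jps'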

    Check the status of the distributed file system

    root@Master:/usr/local/hadoop/sbin#  hdfs dfsadmin -report
    bash: hdfs: command not found
    root@Master:/usr/local/hadoop/sbin# source /etc/profile
    root@Master:/usr/local/hadoop/sbin# hdfs dfsadmin -report
    Configured Capacity: 188176871424 (175.25 GB)
    Present Capacity: 152964861952 (142.46 GB)
    DFS Remaining: 152510214144 (142.04 GB)
    DFS Used: 454647808 (433.59 MB)
    DFS Used%: 0.30%
    Replicated Blocks:
        Under replicated blocks: 0
        Blocks with corrupt replicas: 0
        Missing blocks: 0
        Missing blocks (with replication factor 1): 0
        Low redundancy blocks with highest priority to recover: 0
        Pending deletion blocks: 0
    Erasure Coded Block Groups:
        Low redundancy block groups: 0
        Block groups with corrupt internal blocks: 0
        Missing block groups: 0
        Low redundancy blocks with highest priority to recover: 0
        Pending deletion blocks: 0
    
    

    Note the "hdfs: command not found" message above: it happens because /etc/profile was not sourced after entering the container, so although Hadoop is configured in the profile, the settings had not taken effect yet.

    4. WordCount example

    # Copy file contents into file1.txt
    root@Master:/usr/local/hadoop# cp LICENSE.txt  file1.txt
    # Create the input directory in HDFS
    root@Master:/usr/local/hadoop# hadoop fs -mkdir /input
    # Upload file1.txt to HDFS
    root@Master:/usr/local/hadoop# hadoop fs -put file1.txt /input
    2020-11-23 02:15:37,958 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
    # List the contents of the /input directory in HDFS
    root@Master:/usr/local/hadoop# hadoop fs -ls /input
    Found 1 items
    -rw-r--r--   2 root supergroup     150569 2020-11-23 02:15 /input/file1.txt
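    # Run the WordCount job from the bundled examples jar to produce /output
    # (jar name assumed from the standard Hadoop 3.2.1 distribution):
    root@Master:/usr/local/hadoop# hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.1.jar wordcount /input /output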
    
    # List the job output
    root@Master:/usr/local/hadoop#  hadoop fs -ls /output
    Found 2 items
    -rw-r--r--   2 root supergroup          0 2020-11-23 02:22 /output/_SUCCESS
    -rw-r--r--   2 root supergroup      35324 2020-11-23 02:22 /output/part-r-00000
    
    # View the actual result contents
    hadoop fs -cat /output/part-r-00000
    

    Hive

    The Hive version used is apache-hive-3.1.2.

    Download: https://mirror.bit.edu.cn/apache/hive/hive-3.1.2/

    1. Upload the Hive package to the Master container

    docker cp apache-hive-3.1.2-bin.tar.gz Master:/usr/local

    # Go into the directory and extract the package
    cd /usr/local/
    tar xvf apache-hive-3.1.2-bin.tar.gz
    

    2. Edit the configuration file

    root@Master:/usr/local/apache-hive-3.1.2-bin/conf# cp hive-default.xml.template hive-site.xml
    root@Master:/usr/local/apache-hive-3.1.2-bin/conf# vim hive-site.xml
    

    Delete the special character at line 3215, column 96 of hive-site.xml (an invalid character carried over from the template that breaks XML parsing).
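
    If you would rather not locate it by hand in vim, a one-line sketch that strips the offending entity (assuming it is the &#8; sequence commonly reported for this template):

    sed -i 's/&#8;//g' /usr/local/apache-hive-3.1.2-bin/conf/hive-site.xml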

    Then add the following to hive-site.xml:

    <property>
      <name>system:java.io.tmpdir</name>
      <value>/tmp/hive/java</value>
    </property>
    <property>
      <name>system:user.name</name>
      <value>${user.name}</value>
    </property>
    

    3. Configure the Hive environment variables

    root@Master:/usr/local/apache-hive-3.1.2-bin/conf# vi /etc/profile
    
    #hive
    export HIVE_HOME="/usr/local/apache-hive-3.1.2-bin"
    export PATH=$PATH:$HIVE_HOME/bin
    
    # Source the profile so the changes take effect
    root@Master:/usr/local/apache-hive-3.1.2-bin/conf# source /etc/profile
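    # A quick check (a minimal sketch) that the variables took effect:
    which hive        # should resolve to /usr/local/apache-hive-3.1.2-bin/bin/hive
    hive --version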
    

    4. Configure MySQL as the metastore database

    # Pull the MySQL image
    docker pull mysql:8.0.22
    # Create the container
    docker run --name mysql_hive -p 4306:3306 --net hadoop --ip 172.19.0.5 -v /root/mysql:/var/lib/mysql -e MYSQL_ROOT_PASSWORD=abc123456 -d mysql:8.0.22
    # Enter the container
    docker exec -it mysql_hive bash
    # Log in to MySQL
    mysql -uroot -p
    # The password abc123456 was set when the container was created
    # Create the hive database
    create database hive;
    # Change the root authentication plugin so remote connections from Hive work
    ALTER USER 'root'@'%' IDENTIFIED WITH mysql_native_password BY 'abc123456';
    
    Go back to the Master container and update the metastore connection settings:
    docker exec -it Master bash
    vi /usr/local/apache-hive-3.1.2-bin/conf/hive-site.xml
    
    <!-- Note: use &amp; as the separator in the JDBC URL; newer MySQL versions require SSL by default, which is disabled here with useSSL=false -->
    <property>
      <name>javax.jdo.option.ConnectionURL</name>
      <value>jdbc:mysql://172.19.0.5:3306/hive?createDatabaseIfNotExist=true&amp;useSSL=false</value>
      <description>JDBC connect string for a JDBC metastore</description>
    </property>
    <property>
      <name>javax.jdo.option.ConnectionDriverName</name>
      <value>com.mysql.jdbc.Driver</value>
      <!-- For the MySQL 8.x connector jar use com.mysql.cj.jdbc.Driver; for the 5.x connector use com.mysql.jdbc.Driver -->
      <description>MySQL JDBC driver</description>
    </property>
    <property>
      <name>javax.jdo.option.ConnectionUserName</name>
      <value>root</value>
      <description>MySQL user</description>
    </property>
    <property>
      <name>javax.jdo.option.ConnectionPassword</name>
      <value>abc123456</value>
      <description>MySQL password</description>
    </property>
    <property>
      <name>hive.metastore.schema.verification</name>
      <value>false</value>
    </property>
    
    # Copy the MySQL driver jar into Hive's lib directory
    # (first docker cp the jar from the host into Master:/usr/local, as with the tarballs above)
    root@Master:/usr/local# cp mysql-connector-java-5.1.49.jar /usr/local/apache-hive-3.1.2-bin/lib
    

    Some files in Hive's lib directory need to be adjusted, otherwise initializing the metastore database will fail.

    # Only one of Hadoop and Hive may provide the slf4j binding; delete Hive's copy here
    root@Master:/usr/local/apache-hive-3.1.2-bin/lib# rm log4j-slf4j-impl-2.10.0.jar
    
    # For guava, keep only the newer version on both sides: delete the older copy and replace it with the newer one. Here Hive's guava is removed and Hadoop's is copied over
    root@Master:/usr/local/hadoop/share/hadoop/common/lib# cp guava-27.0-jre.jar /usr/local/apache-hive-3.1.2-bin/lib
    root@Master:/usr/local/hadoop/share/hadoop/common/lib# rm /usr/local/apache-hive-3.1.2-bin/lib/guava-19.0.jar
    
    # Remove the special character at line 3225 of hive-site.xml (see the note in step 2)
    root@Master: vim /usr/local/apache-hive-3.1.2-bin/conf/hive-site.xml
    

    5. Initialize the metastore database

    root@Master:/usr/local/apache-hive-3.1.2-bin/bin# schematool -initSchema -dbType mysql
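    # Optional sanity check (a minimal sketch): the metastore tables should now exist in
    # the MySQL "hive" database, and the Hive CLI should start, e.g.:
    hive -e "show databases;"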
    

    HBase

    The HBase version used is hbase-2.2.6; download: https://mirror.bit.edu.cn/apache/hbase/2.2.6/

    1. Upload the HBase package to the Master container

    # Copy the package to the Master container
    docker cp hbase-2.2.6-bin.tar.gz Master:/usr/local
    
    # Go into the directory and extract
    root@Master:/# cd /usr/local/
    root@Master:/usr/local# tar -zxvf hbase-2.2.6-bin.tar.gz
    

    2. Configure the HBase environment variables (append these to /etc/profile and source it, as was done for Hive)

    #hbase
    export HBASE_HOME=/usr/local/hbase-2.2.6
    export PATH=$HBASE_HOME/bin:$PATH
    

    Copy core-site.xml and hdfs-site.xml from hadoop/etc/hadoop to hbase/conf/.
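
    With the paths used in this setup, that is:

    cp /usr/local/hadoop/etc/hadoop/core-site.xml /usr/local/hbase-2.2.6/conf/
    cp /usr/local/hadoop/etc/hadoop/hdfs-site.xml /usr/local/hbase-2.2.6/conf/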

    Configure hbase-site.xml:

    <property>
      <name>hbase.rootdir</name>
      <value>hdfs://localhost:9000/hbase</value>
    </property>
    <property>
      <name>hbase.cluster.distributed</name>
      <value>true</value>
    </property>
    <property>
      <name>hbase.master</name>
      <value>localhost:60000</value>
    </property>
    <property>
      <name>hbase.zookeeper.quorum</name>
      <value>localhost</value>
    </property>
    <property>
      <name>hbase.zookeeper.property.dataDir</name>
      <value>/home/yourname/zoodata</value>
    </property>
    <property>
      <name>hbase.unsafe.stream.capability.enforce</name>
      <value>false</value>
    </property>
    
    # In hbase/lib/client-facing-thirdparty:
    mv ./slf4j-log4j12-1.7.30.jar ./slf4j-log4j12-1.7.30.jar.bak
    # Rename (rather than delete) HBase's slf4j jar as a backup, to avoid a logging conflict with Hadoop
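    # Starting HBase: a minimal sketch of the usual way to start and check it
    # (start-hbase.sh and the hbase shell ship with HBase; which daemons appear
    # depends on the configuration above):
    source /etc/profile
    start-hbase.sh
    jps              # expect HMaster (plus HRegionServer / HQuorumPeer, depending on the setup)
    hbase shell      # opens the interactive shell; try: status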
    

    Spark

    The Spark version used is spark-3.0.1; download: https://mirror.bit.edu.cn/apache/spark/spark-3.0.1

    1. Upload the Spark package to the Master container

    docker cp spark-3.0.1-bin-hadoop3.2.tgz Master:/usr/local  
    
    # Go into the directory and extract
    root@Master:/# cd /usr/local/
    root@Master:/usr/local# tar -zxvf spark-3.0.1-bin-hadoop3.2.tgz
    

    2. Configure the Spark environment
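
    In a freshly extracted Spark distribution, spark-env.sh and spark-defaults.conf exist only as .template files, so they are usually created first (a small assumption about this setup; skip this if the files already exist):

    cd /usr/local/spark-3.0.1-bin-hadoop3.2/conf   # or /usr/local/spark-3.0.1/conf if the directory was renamed
    cp spark-env.sh.template spark-env.sh
    cp spark-defaults.conf.template spark-defaults.conf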

    root@Master:/usr/local/spark-3.0.1/conf# vi spark-env.sh
    #spark
    export SPARK_MASTER_HOST=Master
    export SPARK_MEM=1G
    export SPARK_MASTER_PORT=7077
    export SPARK_WORKER_MEMORY=1G
    
    #java
    export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
    #export SCALA_HOME=/usr/local/scala-2.12.12
    
    #hadoop
    export HADOOP_HOME=/usr/local/hadoop
    export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
    export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
    
    root@Master:/usr/local/spark-3.0.1/conf# vi spark-defaults.conf
    spark.master     yarn
    

    3. Start Spark

    # Start Spark (standalone scripts in spark/sbin)
    root@Master:/usr/local/spark-3.0.1/sbin# ./start-all.sh
    # Check the running daemons
    root@Master:/usr/local/hbase-2.2.6/conf# jps
    8449 Master
    8532 Worker
    8582 Jps
    7144 ResourceManager
    6810 SecondaryNameNode
    7275 NodeManager
    6621 DataNode
    6493 NameNode
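
    To confirm that Spark can actually run a job on YARN (matching the spark.master setting above), the bundled SparkPi example can be submitted. A minimal sketch; the jar path assumes the standard Spark 3.0.1 distribution layout under /usr/local/spark-3.0.1:

    /usr/local/spark-3.0.1/bin/spark-submit \
      --master yarn \
      --class org.apache.spark.examples.SparkPi \
      /usr/local/spark-3.0.1/examples/jars/spark-examples_2.12-3.0.1.jar 10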
    
