Setting up Hadoop + HBase (standalone on macOS)


Author: 死鱼 | Published 2019-11-16 19:18 | Source: https://www.haomeiwen.com/subject/hzoyictx.html

    Abstract

    A record of setting up Hadoop + HBase on a single machine.

    Reference reading:

    1. Standalone installation of Hadoop 2.7.4
    2. Setting up an HBase environment

    Downloads:

    hadoop-2.7.7 : https://mirrors.tuna.tsinghua.edu.cn/apache/hadoop/common/
    hbase-2.1.7 : https://hbase.apache.org/downloads.html

    Installing Hadoop

    (Setting up the Java environment is not covered here; configure it yourself.)

    1. Extract

    tar -zxvf hadoop-2.7.7.tar.gz
    mkdir -p /usr/local/hadoop
    # move the extracted contents so that /usr/local/hadoop itself is the Hadoop home
    # (the configuration below assumes paths like /usr/local/hadoop/share/...)
    mv hadoop-2.7.7/* /usr/local/hadoop/
    # symlink the binaries onto the PATH; /usr/local/bin is used because
    # /usr/bin is protected by SIP on macOS
    ln -s /usr/local/hadoop/bin/hadoop /usr/local/bin/hadoop
    ln -s /usr/local/hadoop/bin/hdfs /usr/local/bin/hdfs
    

    Verify:

    hadoop version
    

    2. Configuration files

    1. hadoop-env.sh
    # substitute your own JAVA_HOME path
    export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_202.jdk/Contents/Home
    
    2. yarn-env.sh
    # substitute your own JAVA_HOME path
    export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_202.jdk/Contents/Home
    
    3. core-site.xml
    <configuration>
    <property>
        <name>fs.default.name</name>
        <value>hdfs://localhost:9000</value>
        <description>the HDFS URI</description>
    </property>

    <property>
        <name>hadoop.tmp.dir</name>
        <value>/usr/local/hadoop/tmp</value>
        <description>the local Hadoop temp directory</description>
    </property>
    </configuration>
    

    This requires creating the /usr/local/hadoop/tmp directory.
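
    For example:

    mkdir -p /usr/local/hadoop/tmp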

    4. hdfs-site.xml
    <configuration>
    <property>
        <name>dfs.name.dir</name>
        <value>/usr/local/hadoop/data0/hadoop/hdfs/name</value>
        <description>where the NameNode stores the HDFS namespace metadata</description>
    </property>

    <property>
        <name>dfs.data.dir</name>
        <value>/usr/local/hadoop/data0/hadoop/hdfs/data</value>
        <description>physical storage location of blocks on the DataNode</description>
    </property>

    <property>
        <name>dfs.replication</name>
        <value>1</value>
        <description>replication factor; the default is 3, and it should not exceed the number of DataNodes</description>
    </property>
    </configuration>
    

    You also need to create the /usr/local/hadoop/data0/hadoop directory.
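
    For example (the NameNode and DataNode will also create the leaf directories themselves, but making them up front does no harm):

    mkdir -p /usr/local/hadoop/data0/hadoop/hdfs/{name,data}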

    5. mapred-site.xml
    <configuration>
    <property>
            <name>mapreduce.framework.name</name>
            <value>yarn</value>
    </property>
    </configuration>
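
    In a stock 2.7.7 tarball this file usually ships only as a template, so it may need to be copied into place first:

    cp $HADOOP_HOME/etc/hadoop/mapred-site.xml.template $HADOOP_HOME/etc/hadoop/mapred-site.xml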
    
    6. yarn-site.xml
    <configuration>
    <property>
            <name>yarn.nodemanager.aux-services</name>
            <value>mapreduce_shuffle</value>
    </property>
    <property>
            <name>yarn.resourcemanager.webapp.address</name>
            <value>192.168.31.250:8099</value>
    </property>
    </configuration>
    

    Set the webapp address to your own machine's address.

    3. Start Hadoop

    # format the NameNode (first run only; note the flag is -format)
    hdfs namenode -format
    # start the services (start-all.sh lives under sbin, not bin)
    $HADOOP_HOME/sbin/start-all.sh
    

    To check whether startup succeeded:
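
    A quick way to check is jps, which lists the running JVM processes; after start-all.sh the five Hadoop daemons should appear (a sketch, with PIDs omitted):

    jps
    # expected daemons:
    # NameNode
    # DataNode
    # SecondaryNameNode
    # ResourceManager
    # NodeManager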



    4. Using HDFS commands:

    # create a directory and list it
    hadoop fs -mkdir /xxx
    hadoop fs -ls /
    # upload files
    hadoop fs -put ./data/* /data/

    # delete files
    hadoop fs -rm -r /data/*
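
    Two more standard hadoop fs subcommands are handy for reading data back (the file name some.json below is just a placeholder):

    # print a file's contents to stdout
    hadoop fs -cat /data/some.json
    # copy a file from HDFS back to the local filesystem
    hadoop fs -get /data/some.json .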
    

    5. Some practice scripts

    mapper.py

    #!/usr/bin/env python
    # mapper: emit "<word>\t1" for every word on stdin
    import sys

    for line in sys.stdin:
        line = line.strip()
        words = line.split()
        for word in words:
            # parenthesized print works under both Python 2 and 3
            print('%s\t%s' % (word, 1))
    

    reducer.py

    #!/usr/bin/env python
    # reducer: sum the counts for each word; relies on the shuffle phase
    # delivering lines sorted by key
    import sys

    current_word = None
    current_count = 0
    word = None

    for line in sys.stdin:
        line = line.strip()
        word, count = line.split('\t', 1)
        count = int(count)

        if current_word == word:
            current_count += count
        else:
            if current_word:
                print('%s\t%s' % (current_word, current_count))
            current_count = count
            current_word = word

    # flush the last key (the None check guards against empty input)
    if current_word == word and current_word is not None:
        print('%s\t%s' % (current_word, current_count))
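
    Before running the job on Hadoop, the two scripts can be sanity-checked locally with a plain shell pipeline, using sort to stand in for the shuffle phase:

    chmod +x mapper.py reducer.py
    echo "test1 test2 test1" | ./mapper.py | sort | ./reducer.py
    # expected output:
    # test1   2
    # test2   1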
    

    run.sh

    STREAM=/usr/local/hadoop/share/hadoop/tools/lib/hadoop-streaming-2.7.7.jar
    WORKPLACE=/usr/local/hadoop/workplace

    # remove any previous output (-f suppresses the error on the first run)
    hadoop fs -rm -r -f /data/output

    # the scripts must be executable, since -files ships them into each
    # task's working directory where they are invoked by name
    hadoop jar $STREAM \
    -files $WORKPLACE/script/mapper.py,$WORKPLACE/script/reducer.py \
    -mapper mapper.py \
    -reducer reducer.py \
    -input /data/*.json \
    -output /data/output
    

    Of course, you first have to upload the input files (*.json) to /data on HDFS.
    The *.json format looks like this:

    test1 test1 test1 test1 test1 test2 test1 test1 test2 test2 test2 test2 test3 test3 test3 test3 test3 test3 test3 test3
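
    Once the job finishes, the result can be read straight off HDFS. Assuming the sample above is the only input under /data, the counts come out as 7, 5 and 8 (part-00000 is the default name of a streaming job's output file):

    hadoop fs -cat /data/output/part-00000
    # test1   7
    # test2   5
    # test3   8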
    

    Installing HBase

    1. Extract

    I put the extracted folder in:

    $HADOOP_HOME/softs/hbase
    

    2. Configuration

    1. hbase-site.xml
    <configuration>
      <property>
        <name>hbase.rootdir</name>
    <value>/usr/local/hadoop/softs/hbase</value>
      </property>
      <property>
        <name>hbase.zookeeper.property.dataDir</name>
        <value>/usr/local/hadoop/softs/hbase/zookeeper</value>
      </property>
      <property>
        <name>hbase.cluster.distributed</name>
        <value>true</value>
      </property>
      <property>
      <name>hbase.master.info.port</name>
        <value>60010</value>
      </property>
    </configuration>
    

    Note: all of these paths must be adjusted to wherever you extracted HBase.
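
    As an aside, since HDFS is already running, hbase.rootdir could instead point at HDFS rather than the local filesystem; a sketch, reusing the fs.default.name URI from core-site.xml above:

    <property>
      <name>hbase.rootdir</name>
      <value>hdfs://localhost:9000/hbase</value>
    </property>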

    3. Start

    # run from the HBase bin directory ($HADOOP_HOME/softs/hbase/bin here)
    bash hbase-daemon.sh start zookeeper
    bash hbase-daemon.sh start master
    bash hbase-daemon.sh start regionserver
    

    Verify that startup completed:
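
    jps works here too; with the three daemons above running, it should additionally list (a sketch, names as HBase reports them):

    jps
    # HQuorumPeer      <- zookeeper
    # HMaster          <- master
    # HRegionServer    <- regionserver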



    And the local web console:
    http://127.0.0.1:60010/master-status

    4. Basic data operations

    1. Enter the HBase shell
    hbase shell
    
    2. Create the user table:
    create 'user', 'info'
    
    3. Drop a table
    disable 'user'
    drop 'user'
    
    4. Create, read, update, delete
    # insert a row with rowkey 'id001' and a column 'info:name' set to 'SteveWooo'
    put 'user','id001','info:name','SteveWooo'

    # read the row 'id001'
    get 'user','id001'

    # add an 'age' column for 'id001'
    put 'user','id001','info:age','18'

    # update the age of 'id001'
    put 'user','id001','info:age','19'

    # delete all data for row 'id001'
    deleteall 'user','id001'
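
    scan is also handy while experimenting; it iterates over every row in a table:

    # list all rows in the 'user' table
    scan 'user'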
    
