Hive Installation and Usage - 1

Author: sparkle123 | Published 2018-02-27 16:08

    1. Download the tarball; run the following in a terminal on the client machine:
    wget -c http://archive.cloudera.com/cdh5/cdh/5/hive-1.1.0-cdh5.7.0.tar.gz
    2. Extract it into the ~/app directory:
    tar -zxvf hive-1.1.0-cdh5.7.0.tar.gz -C ~/app
    3. Configure ~/.bash_profile:

    export HIVE_HOME=/home/hadoop/app/hive-1.1.0-cdh5.7.0
    export PATH=$HIVE_HOME/bin:$PATH
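
    Then reload the profile so the new variables take effect in the current shell:

    source ~/.bash_profile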
    

    4. Environment configuration:

    cd hive-1.1.0-cdh5.7.0/conf/
    cp hive-env.sh.template hive-env.sh
    vi hive-env.sh
    
    # Set HADOOP_HOME to point to a specific hadoop install directory
    # HADOOP_HOME=${bin}/../../hadoop
    
      HADOOP_HOME=/home/hadoop/app/hadoop-2.6.0-cdh5.7.0
    
    

    5. Install MySQL: yum install mysql-xxx
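
    A minimal sketch, assuming a CentOS 6 host where the server package is named mysql-server (package and service names vary by distribution):

    sudo yum install -y mysql-server    # install the MySQL server package
    sudo service mysqld start           # start the MySQL daemon
    mysqladmin -u root password 'root'  # set the root password used in hive-site.xml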
    6. Configure hive-site.xml:

    <configuration>
        <property>
            <name>javax.jdo.option.ConnectionURL</name>
            <value>jdbc:mysql://hadoop000:3306/sparksql?createDatabaseIfNotExist=true</value>
        </property>
        <property>
            <name>javax.jdo.option.ConnectionDriverName</name>
            <value>com.mysql.jdbc.Driver</value>
            <description>Driver class name for a JDBC metastore</description>
        </property>
        <property>
            <name>javax.jdo.option.ConnectionUserName</name>
            <value>root</value>
            <description>username to use against metastore database</description>
        </property>
        <property>
            <name>javax.jdo.option.ConnectionPassword</name>
            <value>root</value>
            <description>password to use against metastore database</description>
        </property>
    </configuration>
    

    7. Copy the MySQL JDBC driver into hive-1.1.0-cdh5.7.0/lib:
    cp /home/hadoop/software/mysql-connector-java-5.1.27-bin.jar $HIVE_HOME/lib/
    8. Start Hive: bin/hive
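
    Once the CLI comes up, a quick sanity check of the metastore connection:

    hive> show databases;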


    9. In MySQL, check the sparksql metastore database:

    mysql> show databases;
    +--------------------+
    | Database           |
    +--------------------+
    | information_schema |
    | azkaban            |
    | hive               |
    | mysql              |
    | spark              |
    | sparksql           |
    | sqoop              |
    | ss                 |
    | test               |
    +--------------------+
    9 rows in set (0.00 sec)
    
    mysql> use sparksql
    Database changed
    mysql> show tables;
    +---------------------------+
    | Tables_in_sparksql        |
    +---------------------------+
    | BUCKETING_COLS            |
    | CDS                       |
    | COLUMNS_V2                |
    | DATABASE_PARAMS           |
    | DBS                       |
    | FUNCS                     |
    | FUNC_RU                   |
    | GLOBAL_PRIVS              |
    | PARTITIONS                |
    | PARTITION_KEYS            |
    | PARTITION_KEY_VALS        |
    | PART_COL_STATS            |
    | ROLES                     |
    | SDS                       |
    | SEQUENCE_TABLE            |
    | SERDES                    |
    | SKEWED_COL_NAMES          |
    | SKEWED_COL_VALUE_LOC_MAP  |
    | SKEWED_STRING_LIST        |
    | SKEWED_STRING_LIST_VALUES |
    | SKEWED_VALUES             |
    | SORT_COLS                 |
    | TAB_COL_STATS             |
    | TBLS                      |
    | VERSION                   |
    +---------------------------+
    25 rows in set (0.00 sec)
    

    10. Create a table in Hive:
    create table hive_count(context string);
    This fails with:

    FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:For direct MetaStore DB connections, we don't support retries at the client level.)

    Fix:

    This is caused by the character set of the metastore database; switch it to latin1 (note that the metastore database configured in hive-site.xml here is sparksql, not hive):
    mysql> alter database sparksql character set latin1;

    Then re-run the CREATE TABLE statement above.

    11. After the table is created, inspect the TBLS and COLUMNS_V2 tables in sparksql, as sketched below.
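
    A minimal sketch of the lookups (column names taken from the CDH 5.7 metastore schema; verify against your own install):

    mysql> select TBL_ID, TBL_NAME, TBL_TYPE from TBLS;
    mysql> select COLUMN_NAME, TYPE_NAME from COLUMNS_V2;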

    12. See the official Hive documentation:

    https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML#LanguageManualDML-Loadingfilesintotables
    Loading data from external files:

    LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename [PARTITION (partcol1=val1, partcol2=val2 ...)]
    

    13. There is external data in /home/hadoop/data/data.log; load it into the Hive table:
    LOAD DATA LOCAL INPATH '/home/hadoop/data/data.log' INTO TABLE hive_count;

    14. Run a select to check the result:
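
    For example (the rows returned depend on the contents of data.log):

    hive> select * from hive_count;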


    15. Once submitted, the HiveQL query is compiled into a MapReduce job that runs on YARN:

    select word,count(1) from hive_count lateral view explode(split(context,'\t')) wc as word group by word;
    

    lateral view explode(split(context,'\t')): splits each row on the tab delimiter and expands the resulting array into one output row per element, so the outer query can group and count the words.
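
    A worked example of one row flowing through the query (the row 'hello\tworld\thello' is hypothetical):

    -- context               = 'hello\tworld\thello'
    -- split(context, '\t')  -> ["hello", "world", "hello"]
    -- explode(...)          -> three rows: hello, world, hello
    -- group by word         -> hello 2, world 1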

    16. Verify the result:



    The YARN web UI:

    http://hostname:8088/cluster/apps
    
