Note23: Hive-2.3.6 Installation and Configuration

Author: K__3f8b | Published 2020-07-14 15:26

Download and unpack the installation package

[kevin@hadoop112 software]$ tar -zxvf apache-hive-2.3.6-bin.tar.gz -C /opt/module/
  • Rename the directory
[kevin@hadoop112 module]$ mv apache-hive-2.3.6-bin/ hive-2.3.6

Configuration

  • Rename hive-env.sh.template in the conf directory to hive-env.sh
[kevin@hadoop112 hive-2.3.6]$ cd conf/
[kevin@hadoop112 conf]$ mv hive-env.sh.template hive-env.sh
[kevin@hadoop112 conf]$ vim hive-env.sh
  • Edit hive-env.sh
# Set HADOOP_HOME
export HADOOP_HOME=/opt/module/hadoop-2.7.2

# Set HIVE_CONF_DIR
export HIVE_CONF_DIR=/opt/module/hive-2.3.6/conf
  • Configure Hive to store its metadata in MySQL

Copy mysql-connector-java-5.1.48.jar to /opt/module/hive-2.3.6/lib/

[kevin@hadoop112 conf]$ cp /opt/software/mysql-libs-CentOS6/mysql-connector-java-5.1.48.jar /opt/module/hive-2.3.6/lib/

Configure the metastore to use MySQL (this adds a metastore database to MySQL)

Create a hive-site.xml in the /opt/module/hive-2.3.6/conf directory

[kevin@hadoop112 conf]$ touch hive-site.xml
[kevin@hadoop112 conf]$ vim hive-site.xml

Following the official documentation, copy the parameters below into hive-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <!-- Print column headers in query results -->
    <property>
        <name>hive.cli.print.header</name>
        <value>true</value>
    </property>

    <!-- Show the current database in the CLI prompt -->
    <property>
        <name>hive.cli.print.current.db</name>
        <value>true</value>
    </property>
    
    <!-- Metastore database JDBC URL -->
    <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://hadoop112:3306/hive_metastore?createDatabaseIfNotExist=true&amp;useSSL=false</value>
        <description>JDBC connect string for a JDBC metastore</description>
    </property>
    <!-- JDBC driver class -->
    <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.jdbc.Driver</value>
        <description>Driver class name for a JDBC metastore</description>
    </property>
    <!-- Metastore database user -->
    <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>hive</value>
        <description>username to use against metastore database</description>
    </property>

    <!-- Metastore database password -->
    <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>hive</value>
        <description>password to use against metastore database</description>
    </property>

    <!-- Metastore thrift URI -->
    <property>
        <name>hive.metastore.uris</name>
        <value>thrift://hadoop112:9083</value>
    </property>

    <!-- Default warehouse location on HDFS -->
    <property>
        <name>hive.metastore.warehouse.dir</name>
        <value>/user/hive/warehouse</value>
        <description>location of default database for the warehouse</description>
    </property>

    <property>
        <name>hive.metastore.schema.verification</name>
        <value>false</value>
        <description>
            Enforce metastore schema version consistency.
            True: Verify that version information stored in is compatible with one from Hive jars.  Also disable automatic
                  schema migration attempt. Users are required to manually migrate schema after Hive upgrade which ensures
                  proper metastore schema migration. (Default)
            False: Warn if the version information stored in metastore doesn't match with one from in Hive jars.
        </description>
    </property>

    <property>
        <name>datanucleus.schema.autoCreateAll</name>
        <value>true</value>
        <description>
            Auto creates necessary schema on a startup if one doesn't exist. Set this to false, after creating it once.To enable 
            auto create also set hive.metastore.schema.verification=false. Auto creation is not recommended for production use 
            cases, run schematool command instead.
        </description>
    </property>

    <property>
        <name>hive.server2.thrift.port</name>
        <value>10000</value>
    </property>
    <property>
        <name>hive.server2.thrift.bind.host</name>
        <value>hadoop112</value>
    </property>

</configuration>
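After editing, a quick sanity check is to pull a single property value back out of a hive-site.xml-style file with sed. A sketch, assuming the one-value-per-line layout used above (the path /tmp/hive-site-demo.xml is just for illustration):

```shell
# Write a minimal property fragment in the same layout as hive-site.xml
cat > /tmp/hive-site-demo.xml <<'EOF'
<property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
</property>
EOF

# Extract the text between <value> and </value>
sed -n 's|.*<value>\(.*\)</value>.*|\1|p' /tmp/hive-site-demo.xml   # → /user/hive/warehouse
```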
  • Grant group write permission on Hive's HDFS directories
[kevin@hadoop112 conf]$ cd /opt/module/hadoop-2.7.2/
[kevin@hadoop112 hadoop-2.7.2]$ bin/hdfs dfs -mkdir -p /tmp
[kevin@hadoop112 hadoop-2.7.2]$ bin/hdfs dfs -mkdir -p /user/hive/warehouse # -p avoids an error if a parent path is missing
[kevin@hadoop112 hadoop-2.7.2]$ bin/hdfs dfs -chmod g+w /tmp # grant write permission to the directory's group
[kevin@hadoop112 hadoop-2.7.2]$ bin/hdfs dfs -chmod g+w /user/hive/warehouse
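The g+w flag behaves the same as on a local filesystem; a local sketch using a throwaway /tmp directory instead of HDFS:

```shell
mkdir -p /tmp/warehouse_demo
chmod 755 /tmp/warehouse_demo   # start from a known mode: rwxr-xr-x
chmod g+w /tmp/warehouse_demo   # same flag as the hdfs dfs -chmod command above
stat -c %A /tmp/warehouse_demo  # → drwxrwxr-x
```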
  • Configure where logs are stored
[kevin@hadoop112 conf]$ cp hive-log4j2.properties.template hive-log4j2.properties
[kevin@hadoop112 conf]$ vim hive-log4j2.properties

# Modify
property.hive.log.dir = /opt/module/hive-2.3.6/logs
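The same edit can be scripted with sed rather than done in vim. A sketch against a throwaway copy; the default value written first matches the Hive 2.x template, but treat it as illustrative:

```shell
# Simulate the template's default log-dir line
echo 'property.hive.log.dir = ${sys:java.io.tmpdir}/${sys:user.name}' > /tmp/hive-log4j2.properties

# Point it at the Hive logs directory, as done interactively above
sed -i 's|^property.hive.log.dir.*|property.hive.log.dir = /opt/module/hive-2.3.6/logs|' /tmp/hive-log4j2.properties
cat /tmp/hive-log4j2.properties   # → property.hive.log.dir = /opt/module/hive-2.3.6/logs
```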
  • Configure environment variables: vim /etc/profile.d/myPath.sh
#HIVE_HOME
export HIVE_HOME=/opt/module/hive-2.3.6
export PATH=$PATH:$HIVE_HOME/bin
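The new variables only take effect in a fresh login shell, or after running `source /etc/profile.d/myPath.sh`. A quick check of what the file exports, with the paths configured above:

```shell
# What /etc/profile.d/myPath.sh exports; normally `source` the file instead of retyping this
export HIVE_HOME=/opt/module/hive-2.3.6
export PATH=$PATH:$HIVE_HOME/bin

echo "$HIVE_HOME"   # → /opt/module/hive-2.3.6
```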
  • Create the metastore database and the Hive user in MySQL
mysql> CREATE DATABASE hive_metastore;
mysql> CREATE USER 'hive'@'localhost' IDENTIFIED BY 'hive';
mysql> GRANT ALL ON hive_metastore.* TO 'hive'@'localhost' IDENTIFIED BY 'hive';
mysql> GRANT ALL ON hive_metastore.* TO 'hive'@'%' IDENTIFIED BY 'hive';
mysql> FLUSH PRIVILEGES;
mysql> quit;
  • Initialize the metastore schema

Starting with Hive 2.1, the schematool command below must be run as an initialization step; here "mysql" is used as the db type.

[kevin@hadoop112 hive-2.3.6]$ cd bin/
[kevin@hadoop112 bin]$ schematool -dbType mysql -initSchema --verbose

Usage

  • Start the Hadoop cluster
[kevin@hadoop112 hive-2.3.6]$ hadoop-cluster.sh start
  • Start the metastore
[kevin@hadoop112 hive-2.3.6]$ bin/hive --service metastore
2021-07-06 23:49:57: Starting Hive Metastore Server
  • Start Hive
[kevin@hadoop112 hive-2.3.6]$ bin/hive
  • Test data

Suppose a file student.txt is to be loaded into Hive; its contents look like this:

95002,刘晨,女,19,IS
95017,王风娟,女,18,IS
95018,王一,女,19,IS
95013,冯伟,男,21,CS
95014,王小丽,女,19,CS
95019,邢小丽,女,19,IS
95020,赵钱,男,21,IS
95003,王敏,女,22,MA
95004,张立,男,19,IS
95012,孙花,女,20,CS
95010,孔小涛,男,19,CS
95005,刘刚,男,18,MA
95006,孙庆,男,23,CS
95007,易思玲,女,19,MA
95008,李娜,女,18,CS
95021,周二,男,17,MA
95022,郑明,男,20,MA
95001,李勇,男,20,CS
95011,包小柏,男,18,MA
95009,梦圆圆,女,18,MA
95015,王君,男,18,MA
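Before loading, the comma-delimited layout can be sanity-checked locally: awk with -F',' splits on the same delimiter that `fields terminated by ","` will use. A sketch over a four-line sample (the path /tmp/student_demo.txt is illustrative):

```shell
cat > /tmp/student_demo.txt <<'EOF'
95002,刘晨,女,19,IS
95013,冯伟,男,21,CS
95003,王敏,女,22,MA
95001,李勇,男,20,CS
EOF

# Count rows per department (field 5), mirroring a GROUP BY on the Hive table
awk -F',' '{count[$5]++} END {for (d in count) print d, count[d]}' /tmp/student_demo.txt | sort
```

This prints `CS 2`, `IS 1`, `MA 1`, one per line.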
  • Test: create a database
hive (default)> create database myhive;
OK
Time taken: 2.833 seconds
hive (default)>
  • Test: use the new database
hive (default)> use myhive;
OK
Time taken: 0.09 seconds
hive (myhive)> select current_database();
OK
_c0
myhive
Time taken: 1.458 seconds, Fetched: 1 row(s)
hive (myhive)>
  • Test: create a student table in the myhive database
hive (myhive)> create table student(id int, name string, sex string, age int, department string) row format delimited fields terminated by ",";
OK
Time taken: 0.785 seconds
hive (myhive)>
  • Test: load data into the table
hive (myhive)>  load data local inpath "/opt/module/datas/student.txt" into table student;
Loading data to table myhive.student
OK
Time taken: 1.075 seconds
hive (myhive)>
  • Test: query the table
hive (myhive)> select * from student;
OK
student.id  student.name    student.sex student.age student.department
95002   刘晨  女   19  IS
95017   王风娟 女   18  IS
95018   王一  女   19  IS
95013   冯伟  男   21  CS
95014   王小丽 女   19  CS
95019   邢小丽 女   19  IS
95020   赵钱  男   21  IS
95003   王敏  女   22  MA
95004   张立  男   19  IS
95012   孙花  女   20  CS
95010   孔小涛 男   19  CS
95005   刘刚  男   18  MA
95006   孙庆  男   23  CS
95007   易思玲 女   19  MA
95008   李娜  女   18  CS
95021   周二  男   17  MA
95022   郑明  男   20  MA
95001   李勇  男   20  CS
95011   包小柏 男   18  MA
95009   梦圆圆 女   18  MA
95015   王君  男   18  MA
Time taken: 2.467 seconds, Fetched: 21 row(s)
hive (myhive)>
  • View the table schema
hive (myhive)> desc student;
OK
col_name    data_type   comment
id                      int                                         
name                    string                                      
sex                     string                                      
age                     int                                         
department              string                                      
Time taken: 0.075 seconds, Fetched: 5 row(s)
hive (myhive)>
  • Quit
hive (myhive)> quit;

Using HiveServer2

  • Start the metastore
[kevin@hadoop112 hive-2.3.6]$ bin/hive --service metastore
2021-07-06 23:49:57: Starting Hive Metastore Server
  • Start HiveServer2
[kevin@hadoop112 hive-2.3.6]$ bin/hive --service hiveserver2
2021-07-08 16:26:00: Starting HiveServer2
Test HiveServer2
  • Start beeline
[kevin@hadoop112 hive-2.3.6]$ beeline
beeline> 
  • Connect to HiveServer2
beeline> !connect jdbc:hive2://192.168.1.101:10000/myhive
# output
Connecting to jdbc:hive2://192.168.1.101:10000/myhive
Enter username for jdbc:hive2://192.168.1.101:10000/myhive: kevin (the current OS user)
Enter password for jdbc:hive2://192.168.1.101:10000/myhive:  (may be empty)
Connected to: Apache Hive (version 2.3.6)
Driver: Hive JDBC (version 2.3.6)
Transaction isolation: TRANSACTION_REPEATABLE_READ

0: jdbc:hive2://192.168.1.101:10000/myhive>

The connection attempt above may fail with an error like:

21/07/09 00:12:56 [main]: WARN jdbc.HiveConnection: Failed to connect to 192.168.1.101:10000
Error: Could not open client transport with JDBC Uri: jdbc:hive2://192.168.1.101:10000/myhive: Failed to open new session: java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException): User: kevin is not allowed to impersonate kevin (state=08S01,code=0)

You need to add the following to Hadoop's core-site.xml, distribute it to every node, and restart HDFS. Here "kevin" is the user that connects through beeline; substitute your own username.

    <property>
        <name>hadoop.proxyuser.kevin.hosts</name>
        <value>*</value>
    </property>
    <property>
        <name>hadoop.proxyuser.kevin.groups</name>
        <value>*</value>
    </property>

HiveServer2 Web UI: http://hadoop112:10002/

Shell script to start the metastore and HiveServer2

#!/bin/bash

hive_home=${HIVE_HOME}
mkdir -p ${hive_home}/logs   # make sure the log directory exists before redirecting into it

case $1 in
"start"){

    echo "=================       Starting metastore (logs under HIVE_HOME/logs)       ==============="
    nohup ${hive_home}/bin/hive --service metastore >${hive_home}/logs/metastore.out 2>&1 &

    sleep 5s

    echo "=================       Starting hiveserver2 (logs under HIVE_HOME/logs)     ==============="
    nohup ${hive_home}/bin/hive --service hiveserver2 >${hive_home}/logs/hiveserver2.out 2>&1 &
};;
"stop"){

    echo "=================       Stopping hiveserver2          ==============="
    ps -ef | grep org.apache.hive.service.server.HiveServer2 | grep -v grep | awk '{print $2}' | xargs kill -15

    sleep 2s

    echo "=================       Stopping metastore            ==============="
    ps -ef | grep org.apache.hadoop.hive.metastore.HiveMetaStore | grep -v grep | awk '{print $2}' | xargs kill -15
};;
esac
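The `ps | grep | awk | xargs` pipeline in the stop branch can also be written with pgrep, whose -f flag matches full command lines. A minimal demonstration with a throwaway sleep process standing in for the Hive JVMs:

```shell
# Start a stand-in long-running process
sleep 300 &
demo_pid=$!

# pgrep -f matches against the full command line, like grepping ps -ef output;
# grep -cx counts the lines that exactly equal our pid
found=$(pgrep -f 'sleep 300' | grep -cx "$demo_pid")

# kill -15 (SIGTERM) asks for a graceful shutdown, as in the stop branch above
kill -15 "$demo_pid"
echo "$found"   # → 1
```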

Visualization

DBeaver

Version 7.3 is recommended; the 21.x releases are somewhat problematic.

When connecting, enter a username just as with beeline.

Source: https://www.haomeiwen.com/subject/kotjhktx.html