Download and install the package
- Download address:
- This guide uses apache-hive-2.3.6-bin.tar.gz
- Upload the archive to the /opt/software directory
- Extract it
[kevin@hadoop112 software]$ tar -zxvf apache-hive-2.3.6-bin.tar.gz -C /opt/module/
- Rename the directory
[kevin@hadoop112 module]$ mv apache-hive-2.3.6-bin/ hive-2.3.6
Configuration
- Rename hive-env.sh.template in the conf directory to hive-env.sh
[kevin@hadoop112 hive-2.3.6]$ cd conf/
[kevin@hadoop112 conf]$ mv hive-env.sh.template hive-env.sh
[kevin@hadoop112 conf]$ vim hive-env.sh
- Edit hive-env.sh
# Set the HADOOP_HOME path
export HADOOP_HOME=/opt/module/hadoop-2.7.2
# Set the HIVE_CONF_DIR path
export HIVE_CONF_DIR=/opt/module/hive-2.3.6/conf
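As a quick sanity check, the sketch below (the paths are the ones this guide assumes; adjust to your layout) verifies that both directories referenced in hive-env.sh actually exist:

```shell
# Check the two paths set in hive-env.sh
for d in /opt/module/hadoop-2.7.2 /opt/module/hive-2.3.6/conf; do
  if [ -d "$d" ]; then echo "ok: $d"; else echo "missing: $d"; fi
done
```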
- Store Hive's metadata in MySQL
Copy the MySQL JDBC driver mysql-connector-java-5.1.48.jar to /opt/module/hive-2.3.6/lib/
[kevin@hadoop112 conf]$ cp /opt/software/mysql-libs-CentOS6/mysql-connector-java-5.1.48.jar /opt/module/hive-2.3.6/lib/
Point the metastore at MySQL (a metastore database will be added in MySQL)
Create a hive-site.xml in /opt/module/hive-2.3.6/conf
[kevin@hadoop112 conf]$ touch hive-site.xml
[kevin@hadoop112 conf]$ vim hive-site.xml
Following the official documentation, copy the parameters below into hive-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<!-- Show column headers in query results -->
<property>
<name>hive.cli.print.header</name>
<value>true</value>
</property>
<!-- Show the current database name in the prompt -->
<property>
<name>hive.cli.print.current.db</name>
<value>true</value>
</property>
<!-- Metastore database JDBC URL (note: & must be escaped as &amp; inside XML) -->
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://hadoop112:3306/hive_metastore?createDatabaseIfNotExist=true&amp;useSSL=false</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
<!-- JDBC driver class -->
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
</property>
<!-- Database username -->
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hive</value>
<description>username to use against metastore database</description>
</property>
<!-- Database password -->
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>hive</value>
<description>password to use against metastore database</description>
</property>
<!-- Metastore Thrift service URI -->
<property>
<name>hive.metastore.uris</name>
<value>thrift://hadoop112:9083</value>
</property>
<!-- Location of the default warehouse on HDFS -->
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive/warehouse</value>
<description>location of default database for the warehouse</description>
</property>
<property>
<name>hive.metastore.schema.verification</name>
<value>false</value>
<description>
Enforce metastore schema version consistency.
True: Verify that version information stored in is compatible with one from Hive jars. Also disable automatic
schema migration attempt. Users are required to manually migrate schema after Hive upgrade which ensures
proper metastore schema migration. (Default)
False: Warn if the version information stored in metastore doesn't match with one from in Hive jars.
</description>
</property>
<property>
<name>datanucleus.schema.autoCreateAll</name>
<value>true</value>
<description>
Auto creates necessary schema on a startup if one doesn't exist. Set this to false, after creating it once.To enable
auto create also set hive.metastore.schema.verification=false. Auto creation is not recommended for production use
cases, run schematool command instead.
</description>
</property>
<property>
<name>hive.server2.thrift.port</name>
<value>10000</value>
</property>
<property>
<name>hive.server2.thrift.bind.host</name>
<value>hadoop112</value>
</property>
</configuration>
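Note the `&amp;` in the ConnectionURL value above: a bare `&` is not legal inside an XML document, so the `&` that separates JDBC URL parameters has to be escaped. A small sketch of the transformation (plain bash string replacement, nothing Hive-specific):

```shell
# The URL as typed on a command line...
url='jdbc:mysql://hadoop112:3306/hive_metastore?createDatabaseIfNotExist=true&useSSL=false'
# ...and the form it must take inside an XML <value> element
escaped=${url//&/&amp;}
echo "$escaped"
```

Forgetting this escape makes hive-site.xml malformed XML, and Hive fails at startup while parsing its configuration.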
- Give group users write permission on the Hive directories in HDFS
[kevin@hadoop112 conf]$ cd /opt/module/hadoop-2.7.2/
[kevin@hadoop112 hadoop-2.7.2]$ bin/hdfs dfs -mkdir -p /tmp
[kevin@hadoop112 hadoop-2.7.2]$ bin/hdfs dfs -mkdir -p /user/hive/warehouse # -p avoids an error when a parent path does not exist
[kevin@hadoop112 hadoop-2.7.2]$ bin/hdfs dfs -chmod g+w /tmp # grant write permission to the directory's group
[kevin@hadoop112 hadoop-2.7.2]$ bin/hdfs dfs -chmod g+w /user/hive/warehouse
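What `g+w` does can be demonstrated on the local filesystem too (a throwaway sketch; `hdfs dfs -chmod` applies the same symbolic mode to HDFS paths):

```shell
d=$(mktemp -d)          # mktemp -d creates the directory with mode 700
chmod g+w "$d"          # g+w adds the group write bit -> mode 720
ls -ld "$d" | cut -c6   # character 6 of the mode string is the group 'w' slot
rmdir "$d"
```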
- Configure the log location
[kevin@hadoop112 conf]$ cp hive-log4j2.properties.template hive-log4j2.properties
[kevin@hadoop112 conf]$ vim hive-log4j2.properties
# Change:
property.hive.log.dir = /opt/module/hive-2.3.6/logs
- Set environment variables
vim /etc/profile.d/myPath.sh
#HIVE_HOME
export HIVE_HOME=/opt/module/hive-2.3.6
export PATH=$PATH:$HIVE_HOME/bin
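A minimal sketch of what the profile snippet does: it appends the Hive bin directory to PATH, so `hive` and `beeline` resolve without a full path (HIVE_HOME is the install path this guide assumes):

```shell
export HIVE_HOME=/opt/module/hive-2.3.6
export PATH=$PATH:$HIVE_HOME/bin
# the Hive bin directory is now the last PATH entry
echo "$PATH" | tr ':' '\n' | tail -n 1
```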
- Create the metastore database and the hive user in MySQL
mysql> CREATE DATABASE hive_metastore;
mysql> CREATE USER 'hive'@'localhost' IDENTIFIED BY 'hive';
mysql> GRANT ALL ON hive_metastore.* TO 'hive'@'localhost' IDENTIFIED BY 'hive';
mysql> GRANT ALL ON hive_metastore.* TO 'hive'@'%' IDENTIFIED BY 'hive';
mysql> FLUSH PRIVILEGES;
mysql> quit;
- Initialize the metastore schema
Starting with Hive 2.1, the schematool command below must be run as an initialization step. Here "mysql" is used as the db type.
[kevin@hadoop112 hive-2.3.6]$ cd bin/
[kevin@hadoop112 bin]$ schematool -dbType mysql -initSchema --verbose
Usage
- Start the Hadoop cluster
[kevin@hadoop112 hive-2.3.6]$ hadoop-cluster.sh start
- Start the metastore
[kevin@hadoop112 hive-2.3.6]$ bin/hive --service metastore
2021-07-06 23:49:57: Starting Hive Metastore Server
- Start Hive
[kevin@hadoop112 hive-2.3.6]$ bin/hive
- Test data
There is a file student.txt to be loaded into Hive; its format is as follows:
95002,刘晨,女,19,IS
95017,王风娟,女,18,IS
95018,王一,女,19,IS
95013,冯伟,男,21,CS
95014,王小丽,女,19,CS
95019,邢小丽,女,19,IS
95020,赵钱,男,21,IS
95003,王敏,女,22,MA
95004,张立,男,19,IS
95012,孙花,女,20,CS
95010,孔小涛,男,19,CS
95005,刘刚,男,18,MA
95006,孙庆,男,23,CS
95007,易思玲,女,19,MA
95008,李娜,女,18,CS
95021,周二,男,17,MA
95022,郑明,男,20,MA
95001,李勇,男,20,CS
95011,包小柏,男,18,MA
95009,梦圆圆,女,18,MA
95015,王君,男,18,MA
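Because the table below will be declared with `fields terminated by ","`, every line must contain exactly five comma-separated fields. A quick check on a two-record sample (the /tmp path is arbitrary):

```shell
# Write two records in the same comma-delimited format as student.txt
printf '95002,刘晨,女,19,IS\n95001,李勇,男,20,CS\n' > /tmp/student_sample.txt
# NF is awk's per-line field count; sort -u collapses it to the distinct values
awk -F',' '{print NF}' /tmp/student_sample.txt | sort -u
```

A single line of output, `5`, confirms the file is uniformly delimited; any other value points at a malformed record.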
- Test: create a database
hive (default)> create database myhive;
OK
Time taken: 2.833 seconds
hive (default)>
- Test: use the new database
hive (default)> use myhive;
OK
Time taken: 0.09 seconds
hive (myhive)> select current_database();
OK
_c0
myhive
Time taken: 1.458 seconds, Fetched: 1 row(s)
hive (myhive)>
- Test: create a student table in the myhive database
hive (myhive)> create table student(id int, name string, sex string, age int, department string) row format delimited fields terminated by ",";
OK
Time taken: 0.785 seconds
hive (myhive)>
- Test: load data into the table
hive (myhive)> load data local inpath "/opt/module/datas/student.txt" into table student;
Loading data to table myhive.student
OK
Time taken: 1.075 seconds
hive (myhive)>
- Test: query the table
hive (myhive)> select * from student;
OK
student.id student.name student.sex student.age student.department
95002 刘晨 女 19 IS
95017 王风娟 女 18 IS
95018 王一 女 19 IS
95013 冯伟 男 21 CS
95014 王小丽 女 19 CS
95019 邢小丽 女 19 IS
95020 赵钱 男 21 IS
95003 王敏 女 22 MA
95004 张立 男 19 IS
95012 孙花 女 20 CS
95010 孔小涛 男 19 CS
95005 刘刚 男 18 MA
95006 孙庆 男 23 CS
95007 易思玲 女 19 MA
95008 李娜 女 18 CS
95021 周二 男 17 MA
95022 郑明 男 20 MA
95001 李勇 男 20 CS
95011 包小柏 男 18 MA
95009 梦圆圆 女 18 MA
95015 王君 男 18 MA
Time taken: 2.467 seconds, Fetched: 21 row(s)
hive (myhive)>
- View the table schema
hive (myhive)> desc student;
OK
col_name data_type comment
id int
name string
sex string
age int
department string
Time taken: 0.075 seconds, Fetched: 5 row(s)
hive (myhive)>
- Quit
hive (myhive)> quit;
Using HiveServer2
- Start the metastore
[kevin@hadoop112 hive-2.3.6]$ bin/hive --service metastore
2021-07-06 23:49:57: Starting Hive Metastore Server
- Start HiveServer2
[kevin@hadoop112 hive-2.3.6]$ bin/hive --service hiveserver2
2021-07-08 16:26:00: Starting HiveServer2
Testing HiveServer2
- Start beeline
[kevin@hadoop112 hive-2.3.6]$ beeline
beeline>
- Connect to HiveServer2
beeline> !connect jdbc:hive2://192.168.1.101:10000/myhive
# Output
Connecting to jdbc:hive2://192.168.1.101:10000/myhive
Enter username for jdbc:hive2://192.168.1.101:10000/myhive: kevin (the current OS username)
Enter password for jdbc:hive2://192.168.1.101:10000/myhive: (may be left empty)
Connected to: Apache Hive (version 2.3.6)
Driver: Hive JDBC (version 2.3.6)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://192.168.1.101:10000/myhive>
The connection above may fail with:
21/07/09 00:12:56 [main]: WARN jdbc.HiveConnection: Failed to connect to 192.168.1.101:10000
Error: Could not open client transport with JDBC Uri: jdbc:hive2://192.168.1.101:10000/myhive: Failed to open new session: java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException): User: kevin is not allowed to impersonate kevin (state=08S01,code=0)
If so, add the following to Hadoop's core-site.xml, distribute it to all nodes, and restart HDFS. The "kevin" in the property names is the user that connects through beeline; replace it with your own username.
<property>
<name>hadoop.proxyuser.kevin.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.kevin.groups</name>
<value>*</value>
</property>
HiveServer2 Web UI: http://hadoop112:10002/
Script to start the metastore and HiveServer2 together
#!/bin/bash
hive_home=${HIVE_HOME}
mkdir -p ${hive_home}/logs   # make sure the log directory exists before redirecting into it
case $1 in
"start"){
    echo "================= hadoop112 starting metastore (logged in HIVE_HOME/logs) ==============="
    nohup ${hive_home}/bin/hive --service metastore >${hive_home}/logs/metastore.out 2>&1 &
    sleep 5s
    echo "================= hadoop112 starting hiveserver2 (logged in HIVE_HOME/logs) ==============="
    nohup ${hive_home}/bin/hive --service hiveserver2 >${hive_home}/logs/hiveserver2.out 2>&1 &
};;
"stop"){
    echo "================= hadoop112 stopping hiveserver2 ==============="
    ps -ef | grep org.apache.hive.service.server.HiveServer2 | grep -v grep | awk '{print $2}' | xargs kill -15
    sleep 2s
    echo "================= hadoop112 stopping metastore ==============="
    ps -ef | grep org.apache.hadoop.hive.metastore.HiveMetaStore | grep -v grep | awk '{print $2}' | xargs kill -15
};;
esac
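The stop branch locates each service's JVM with a ps/grep/awk pipeline and sends SIGTERM. The same pattern can be exercised against a throwaway process (the `[s]leep` bracket trick stands in for `grep -v grep`; the sleep duration is arbitrary):

```shell
sleep 12345 &   # stand-in for the Hive JVM
pid=$!
# find the PID the same way the script does, then terminate it
found=$(ps -ef | grep '[s]leep 12345' | awk '{print $2}')
echo "$found" | xargs kill -15
sleep 1
kill -0 "$pid" 2>/dev/null && echo "still running" || echo "stopped"
```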
GUI client
Version 7.3 is recommended; version 21 has some issues.
When connecting, supply a username just as with beeline.