在Hive安装之前,我们需要安装好Hadoop,启动并运行所有Hadoop守护进程。参考 大数据工具Hadoop快速入门3安装
安装Hive
步骤1)下载: http://www.apache.org/dyn/closer.cgi/hive/
Installation and Configuration of HIVE and MYSQL
转到URL位置,然后单击链接以下载hive tar文件。
步骤2)提取tar文件。
转到下载的Tar文件位置 - >使用以下命令解压缩tar文件
$ tar -xvf apache-hive-3.1.1-bin.tar.gz
$ mv apache-hive-3.1.1-bin /opt/hive
步骤3)配置
$ vi ~/.bashrc
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export PATH=$PATH:$JAVA_HOME/bin
export HADOOP_HOME=/opt/hadoop
export HADOOP_CONF_DIR=/opt/hadoop/conf
export HIVE_HOME=/opt/hive
export HIVE_CONF_DIR=/opt/hive/conf
export PATH=$PATH:$HIVE_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
$ vi /opt/hive/bin/hive-config.sh
# 增加
export HADOOP_HOME=/opt/hadoop
步骤4)在Hadoop中创建Hive目录:
$ hdfs fs -mkdir /usr/hive/warehouse
$ hdfs fs -chmod g+w /usr/hive/warehouse
步骤5)进入Hive shell
$ hive
...
Hive Session ID = 5300ba85-e010-40dd-873d-574b8f44dbf3
hive>
# 如果出现 Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
# Derby - https://askubuntu.com/questions/1073035/failed-hiveexception-java-lang-runtimeexception-unable-to-instantiate-org-apac
$ rm -rf $HIVE_HOME/metastore_db
$ cd $HIVE_HOME
$ schematool -initSchema -dbType derby
...
Initialization script completed
schemaTool completed
SHELL命令
>create table product(product int, pname string, price float)
Row format delimited
Fields terminated by ',';
> describe product;
image.png
安装和配置MYSQL数据库
为什么在Hive中使用MySQL作为Meta存储:
- 默认情况下,Hive自带derby数据库作为Metastore。
- Derby数据库一次只能支持单个活动用户
- 在生产环境中不建议使用Derby
建议:
- 在后端使用MYSQL作为元存储,一次将多个用户与Hive连接
- MYSQL是独立Metastore的最佳选择
在Hadoop上的Hive中安装和配置MySQL数据库的步骤
步骤1)安装 如何在Ubuntu 18.04 LTS中安装和配置MySQL
步骤2)在Hive中配置MySql存储
mysql> CREATE USER 'hiveuser'@'%' IDENTIFIED BY 'hivepassword';
mysql> GRANT all on *.* to 'hiveuser'@localhost identified by 'hivepassword';
mysql> flush privileges;
步骤3)配置hive-site.xml
# 注意:之前这个文件是为空的。
$ vi hive-site.xml
<configuration>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost/metastore?createDatabaseIfNotExist=true&useSSL=false</value>
<description>metadata is stored in a MySQL server</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>MySQL JDBC driver class</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hive</value>
<description>user name for connecting to mysql server</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>12345678_</value>
<description>password for connecting to mysql server</description>
</property>
</configuration>
Step4)初始化数据库
$ rm -rf $HIVE_HOME/metastore_db
$ cd $HIVE_HOME
$ schematool -initSchema -dbType mysql
$ hive
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Hive Session ID = f62b701e-eb7b-4916-8405-611b7dd087a0
Logging initialized using configuration in file:/opt/hive/conf/hive-log4j2.properties Async: true
Hive Session ID = 9b33def0-0c0f-47a1-a60d-dd7b225e80f4
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive> show tables;
OK
Time taken: 2.434 seconds
hive> create table guru99(id int, name string);
OK
Time taken: 1.929 seconds
hive>
步骤5)进入MySql shell模式验证
```sql
mysql> use metastore;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> show tables;
+-------------------------------+
| Tables_in_metastore |
+-------------------------------+
| AUX_TABLE |
| BUCKETING_COLS |
| CDS |
| COLUMNS_V2 |
| COMPACTION_QUEUE |
| COMPLETED_COMPACTIONS |
| COMPLETED_TXN_COMPONENTS |
| CTLGS |
| DATABASE_PARAMS |
| DBS |
| DB_PRIVS |
| DELEGATION_TOKENS |
| FUNCS |
| FUNC_RU |
| GLOBAL_PRIVS |
| HIVE_LOCKS |
| IDXS |
| INDEX_PARAMS |
| I_SCHEMA |
| KEY_CONSTRAINTS |
| MASTER_KEYS |
| MATERIALIZATION_REBUILD_LOCKS |
| METASTORE_DB_PROPERTIES |
| MIN_HISTORY_LEVEL |
| MV_CREATION_METADATA |
| MV_TABLES_USED |
| NEXT_COMPACTION_QUEUE_ID |
| NEXT_LOCK_ID |
| NEXT_TXN_ID |
| NEXT_WRITE_ID |
| NOTIFICATION_LOG |
| NOTIFICATION_SEQUENCE |
| NUCLEUS_TABLES |
| PARTITIONS |
| PARTITION_EVENTS |
| PARTITION_KEYS |
| PARTITION_KEY_VALS |
| PARTITION_PARAMS |
| PART_COL_PRIVS |
| PART_COL_STATS |
| PART_PRIVS |
| REPL_TXN_MAP |
| ROLES |
| ROLE_MAP |
| RUNTIME_STATS |
| SCHEMA_VERSION |
| SDS |
| SD_PARAMS |
| SEQUENCE_TABLE |
| SERDES |
| SERDE_PARAMS |
| SKEWED_COL_NAMES |
| SKEWED_COL_VALUE_LOC_MAP |
| SKEWED_STRING_LIST |
| SKEWED_STRING_LIST_VALUES |
| SKEWED_VALUES |
| SORT_COLS |
| TABLE_PARAMS |
| TAB_COL_STATS |
| TBLS |
| TBL_COL_PRIVS |
| TBL_PRIVS |
| TXNS |
| TXN_COMPONENTS |
| TXN_TO_WRITE_ID |
| TYPES |
| TYPE_FIELDS |
| VERSION |
| WM_MAPPING |
| WM_POOL |
| WM_POOL_TO_TRIGGER |
| WM_RESOURCEPLAN |
| WM_TRIGGER |
| WRITE_SET |
+-------------------------------+
74 rows in set (0.00 sec)
mysql> select * from tbls;
ERROR 1146 (42S02): Table 'metastore.tbls' doesn't exist
mysql> select * from TBLS;
+--------+-------------+-------+------------------+---------+------------+-----------+-------+----------+---------------+--------------------+--------------------+--------------------+
| TBL_ID | CREATE_TIME | DB_ID | LAST_ACCESS_TIME | OWNER | OWNER_TYPE | RETENTION | SD_ID | TBL_NAME | TBL_TYPE | VIEW_EXPANDED_TEXT | VIEW_ORIGINAL_TEXT | IS_REWRITE_ENABLED |
+--------+-------------+-------+------------------+---------+------------+-----------+-------+----------+---------------+--------------------+--------------------+--------------------+
| 1 | 1560151436 | 1 | 0 | hduser_ | USER | 0 | 1 | guru99 | MANAGED_TABLE | NULL | NULL | |
+--------+-------------+-------+------------------+---------+------------+-----------+-------+----------+---------------+--------------------+--------------------+--------------------+
1 row in set (0.00 sec)
无论在Hive中创建的表是什么,元数据都对应于TBLS表。
网友评论