hive的数据分为两部分,一部分是真实的数据文件,存放在hdfs上,另一份是真实数据的元数据(即数据的描述信息,比如说存储位置、时间、大小之类的),一般存放在mysql中,存放的配置可以修改hive-site.xml来完成:
data:image/s3,"s3://crabby-images/68869/68869b66606558e22f7fd7c27709fd01898dd7bd" alt=""
表结构可以参考博客:[一起学Hive]之十四-Hive的元数据表结构详解 - 1130136248 - 博客园
1、hive版本表
version:存储hive的版本信息的表,有且只能有一条记录,多或者少都不行
data:image/s3,"s3://crabby-images/cd65b/cd65b4d2cd175ee339db38661ad641ba4aa58ab9" alt=""
2、数据库相关的表
data:image/s3,"s3://crabby-images/9bee9/9bee9793d03502db9c8d0426ffda23945beb60e9" alt=""
dbs是主表,其他表都是从表,根据DB_ID关联。其中tbls表也是dbs的从表,tbls表的从表也很多就不在这里显示了
dbs示例数据:
data:image/s3,"s3://crabby-images/2dc0d/2dc0d2533abdac8f668be71f7461d156c09b70ed" alt=""
data_params示例数据(创建表的时候指定的创建者和创建时间):
data:image/s3,"s3://crabby-images/515dd/515dd945383273bffcdbb7cf92f6a80a037ed38d" alt=""
funcs示例数据(自定义方法相关描述):
data:image/s3,"s3://crabby-images/5122a/5122aafdf8066a53c510c724f24d32c5ed527e55" alt=""
func_ru示例数据:
data:image/s3,"s3://crabby-images/92a7c/92a7c91cec157ee5c75afd4648e8612c1147ab42" alt=""
3、表相关的表
data:image/s3,"s3://crabby-images/34039/34039a520473a1bffa7d9663546f71c785c9f315" alt=""
tbls是表的主体部分,包括表名称、表的创建时间、所属数据库的id、所在目录等信息,示例数据如下:
data:image/s3,"s3://crabby-images/42884/4288495b255c79b877a032107e6dea59bb325937" alt=""
table_params示例数据,这里存放表的相关属性信息:
data:image/s3,"s3://crabby-images/675d1/675d1163bd048b0a408d3c69b318389d7c7e3a3d" alt=""
partitions示例数据:
data:image/s3,"s3://crabby-images/d32c6/d32c6f2c0d88c808e5cfc5504a3df361d8e919a6" alt=""
partition_keys示例数据:
data:image/s3,"s3://crabby-images/6e7eb/6e7ebb235d828836e3dd9ca2c4ab5b13b5017d3d" alt=""
idxs、tbl_col_privs、tbl_col_stats、tbl_privs表没数据。
4、分区相关
data:image/s3,"s3://crabby-images/13c22/13c229f2202f9e6b8dc5a97002f1910fc80fd261" alt=""
partition_params表示例数据:
data:image/s3,"s3://crabby-images/6b036/6b036dc49c8942570d2516f653315aeb60c602b9" alt=""
partition_params示例数据:
data:image/s3,"s3://crabby-images/d60d4/d60d49bdcb8a1f898d520cd8fbf97fec346c9f26" alt=""
part_privs、part_col_stats、part_col_privs无数据
5、文件相关
data:image/s3,"s3://crabby-images/ccca7/ccca7ac5385ad23c58b5975eac814809bd7209df" alt=""
sds示例数据:
data:image/s3,"s3://crabby-images/5856c/5856c28a0caeed0cc9f156477df8256c4d8d648e" alt=""
columns_v2示例数据:
data:image/s3,"s3://crabby-images/6c64a/6c64aa0cec95fd5602fe5cd37ec0f1488490f7f3" alt=""
serdes示例数据:
data:image/s3,"s3://crabby-images/d2b8c/d2b8c42e0657306e09a18565f7b34ed48fcad0f7" alt=""
serde_params示例数据:
data:image/s3,"s3://crabby-images/b599a/b599a447f02fac7edd403495b7984bd1ea3705f0" alt=""
元数据操作需求:
再不适用hive相关的客户端和其他类似直接操作hive的工具的情况下,删除一张表(test.a),如下:
data:image/s3,"s3://crabby-images/f757f/f757f07c74fbf991ef83c4a5eb39ea4a5232ff9c" alt=""
操作步骤如下:
第一步:根据表名和库名查询得出TBL_ID、SD_ID、PART_ID、CD_ID、SERDE_ID
SELECT
t.TBL_ID,t.SD_ID,p.PART_ID,s.CD_ID,s.SERDE_ID
FROM
(
SELECT
tt.TBL_ID,
tt.SD_ID
FROM
tbls tt,
dbs d
WHERE
d.DB_ID = tt.DB_ID
AND d.`NAME` = 'test'
AND tt.TBL_NAME = 'a'
) t
LEFT JOIN sds s ON t.sd_id = s.SD_ID
LEFT JOIN `partitions` p on t.TBL_ID=p.TBL_ID
结果如下:
data:image/s3,"s3://crabby-images/7a533/7a533e7f1772382e9a08213f497e509fb948b499" alt=""
第二步:按照顺序依次执行删除数据语句(要先删除子表数据)
先删除partition相关的(选的实验表没有分区,假设有分区):
delete from partition_params where PART_ID=
delete from partition_key_vals where PART_ID=
part_privs,part_col_stats,part_col_privs //这几个表基本没有数据,可以不用管,如果有数据则必须删除,否则partitions因为外键关系无法删除对应数据
delete from partitions where PART_ID=
再删除tbls相关的
delete from table_params where TBL_ID=16
delete from partition_keys where TBL_ID=16
tbl_col_privs,tbl_privs,tab_col_stats,index_params,idxs的原理同上(注意idxs是index_params的主表)
delete from tbls where TBL_ID=16
最后删除sds相关
sd_params,bucketing_cols,skewed_string_list_values,skewed_col_names,skewed_values
这些都是sds的从表,需要先行删除,但是基本上都没有数据
delete from sds where SD_ID=16
删除cds相关
delete from columns_v2 where CD_ID=21
delete from cds where CD_ID=21
删除serdes相关
delete from serde_params where SERDE_ID=16
delete from serdes where SERDE_ID=16
删除结果:
data:image/s3,"s3://crabby-images/f58fd/f58fd388f2c0d48720f5a5841591eb886f46feb6" alt=""
网友评论