Copyright notice: This is an original article by the author and may not be reproduced without permission. https://www.jianshu.com/p/34cc4e2cf181
#!/bin/bash
#date:2019-05-13
#author:Wang Kuan
dbname=$1
#categorize the table type
cat /home/tablename | while read table
do
  hive -e "use ${dbname}; desc $table;" 2>&1 | grep -q 'Table not found'
  if [ $? -ne 0 ]; then
    hive -e "use ${dbname}; desc formatted $table;" 2>&1 | grep -q 'EXTERNAL_TABLE'
    if [ $? -ne 0 ]; then
      echo "$table is an internal table"
      echo $table >> /home/internal_table.txt
    else
      echo "$table is an external table"
      echo $table >> /home/external_table.txt
    fi
  else
    echo "$table does not exist"
    echo $table >> /home/table_not_exist.txt
  fi
done
#repair the external table
cat /home/external_table.txt | while read externaltable
do
  hive -e "use ${dbname}; msck repair table $externaltable;"
  echo "$externaltable has been repaired!"
done
#repair the table_not_exist
cat /home/table_not_exist.txt | while read table_not_exist
do
  beeline -u jdbc:hive2://sourceip:10010 -e "use ${dbname}; show tables;" | grep -iw $table_not_exist
  if [ $? -ne 0 ]; then
    echo "$table_not_exist does not exist in the source cluster either"
  else
    beeline -u jdbc:hive2://sourceip:10010 --silent=true --outputformat=csv2 -e "use ${dbname}; show create table $table_not_exist;" >> TableDDL.txt
  fi
done
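Before the collected DDL can be replayed on the target cluster, TableDDL.txt needs light cleanup: beeline's csv2 output prints a `createtab_stmt` header line before each statement and omits the trailing semicolon, so the raw file is not directly executable. A hedged sketch of that post-processing (the sample statements below are placeholders):

```shell
# Sample of what two collected statements look like in TableDDL.txt
# (hypothetical tables, used here only to demonstrate the cleanup):
printf 'createtab_stmt\nCREATE EXTERNAL TABLE t1(x int)\ncreatetab_stmt\nCREATE EXTERNAL TABLE t2(y int)\n' > TableDDL.txt

# Turn each "createtab_stmt" header into a statement terminator,
# drop the leading one, and terminate the final statement.
sed 's/^createtab_stmt$/;/' TableDDL.txt | sed '1d' > TableDDL.hql
echo ';' >> TableDDL.hql

cat TableDDL.hql
# Replay on the target cluster with:  hive --database ${dbname} -f TableDDL.hql
```

The resulting TableDDL.hql alternates statements and `;` terminators, which `hive -f` accepts.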
#repair the internal table
start_day="20190303"
end_day="20190509"
cat /home/internal_table.txt | while read internaltable
do
  echo "$internaltable: begin to add partitions"
  day=$start_day
  while [[ $day < $end_day ]]
  do
    hive -e "use ${dbname}; alter table $internaltable add if not exists partition(dt='$day');"
    day=$(date -d "$day +1 day" +%Y%m%d)
  done
done
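The per-day loop relies on GNU `date` arithmetic and on the fact that `YYYYMMDD` strings compare correctly with a plain lexical `<`. A minimal standalone sketch of the date walk, printing each statement instead of invoking hive so the range logic can be checked without a cluster (table name and range are placeholders):

```shell
# Walk every day in [start_day, end_day) and print the partition DDL.
start_day="20190303"
end_day="20190307"
day=$start_day
while [[ $day < $end_day ]]   # lexical compare is valid for YYYYMMDD
do
  echo "alter table demo_table add partition(dt='$day')"
  day=$(date -d "$day +1 day" +%Y%m%d)   # GNU date relative-item arithmetic
done
```

Note that the end day itself is excluded, matching the `<` comparison in the script above.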
For an internal (managed) table, dropping the table also deletes its data on HDFS; for an external table it does not.
Therefore, before recreating an internal table, first move its corresponding HDFS directory to a temporary location, otherwise creating the table will wipe the original directory. Once the internal table has been created, move the files back on HDFS and add the partitions.
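The move-aside procedure above can be sketched as a dry run that only prints the commands it would execute; the database name, table name, warehouse path, and backup location are all assumptions for illustration:

```shell
# Dry-run sketch of recreating one internal table (hypothetical names/paths;
# `run` echoes each command instead of executing it).
dbname="mydb"                       # assumed target database
table="demo_table"                  # assumed internal table being recreated
warehouse="/user/hive/warehouse"    # assumed Hive warehouse root

run() { echo "$@"; }                # replace `echo` with real execution later

# 1. Move the existing HDFS directory out of the way.
run hdfs dfs -mv $warehouse/$dbname.db/$table /tmp/${table}_backup
# 2. Create the internal table from the DDL collected in TableDDL.txt
#    (after stripping beeline's csv2 headers).
run hive --database $dbname -f TableDDL.txt
# 3. Move the data back under the new table's directory.
run hdfs dfs -mv /tmp/${table}_backup/* $warehouse/$dbname.db/$table/
# 4. Register the partitions again, e.g. via the day loop above or msck.
run hive -e "use $dbname; msck repair table $table;"
```

Echoing first makes it easy to review the exact commands before running them against a live cluster.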
Hive Data Migration 4 (internal and external tables)
A typical fragment recorded in TableDDL.txt looks like:
ROW FORMAT SERDE
'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'.....