I. After a forced power-off, the CDH cluster reports errors about corrupt and missing blocks
1. The fault is shown in the figures below
[Figure: image.png] [Figure: image.png]
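2. Log in to the CDH master node and run the following command:
The root account does not have permission to run this check, so the command must be executed as the hdfs user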
sudo -u hdfs hdfs fsck / > test.log
3. Inspect the test.log file
It shows the path of each corrupt file, along with the block pool and block ID of every corrupt block.
[root@cdh1 ~]# cat test.log
FSCK started by hdfs (auth:SIMPLE) from /172.16.40.170 for path / at Tue Jan 18 16:03:34 CST 2022
/hbase/data/hbase/meta/1588230740/info/3f61550423ef4eef884ca6541b2a73c2: CORRUPT blockpool BP-2050597982-172.16.40.170-1642485592225 block blk_1073741884
/hbase/data/hbase/meta/1588230740/info/3f61550423ef4eef884ca6541b2a73c2: CORRUPT 1 blocks of total size 6809 B.
/hbase/data/hbase/meta/1588230740/info/834bd0386103452b9e943c975902416c: CORRUPT blockpool BP-2050597982-172.16.40.170-1642485592225 block blk_1073741883
/hbase/data/hbase/meta/1588230740/info/834bd0386103452b9e943c975902416c: CORRUPT 1 blocks of total size 6479 B.
Status: CORRUPT
Number of data-nodes: 3
Number of racks: 1
Total dirs: 49
Total symlinks: 0
Replicated Blocks:
Total size: 235179578 B (Total open files size: 248 B)
Total files: 20 (Files currently being written: 16)
Total blocks (validated): 19 (avg. block size 12377872 B) (Total open file blocks (not validated): 15)
********************************
UNDER MIN REPL'D BLOCKS: 2 (10.526316 %)
dfs.namenode.replication.min: 1
CORRUPT FILES: 2
CORRUPT BLOCKS: 2
CORRUPT SIZE: 13288 B
********************************
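On a larger cluster the log can be long, so it may help to pull out just the affected paths. The following is a minimal sketch, assuming the per-block line format shown in the log above (`<path>: CORRUPT blockpool ... block ...`); the function name `corrupt_paths` is hypothetical and not part of Hadoop.

```shell
# List the unique paths of files that fsck flagged as corrupt.
# Assumption: each corrupt block is reported on a line of the form
#   <path>: CORRUPT blockpool <pool-id> block <block-id>
# as in the test.log shown above.
corrupt_paths() {
    # $1 is the fsck log file to read
    grep ': CORRUPT blockpool ' "$1" \
        | sed 's/: CORRUPT blockpool .*$//' \
        | sort -u
}
```

Called as `corrupt_paths test.log`, it should print the two file paths under /hbase/data/hbase/meta/1588230740/info/ for the log above, one per line.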
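4. If the files are not important, delete the corrupt files directly (this removes each whole file, not just the damaged block, and has to be repeated for every corrupt file). For example: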
sudo -u hdfs hdfs fsck -delete /hbase/data/hbase/meta/1588230740/info/834bd0386103452b9e943c975902416c
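The log above lists two corrupt files, so the delete step has to be run once per file. The following is a rough sketch of how that could be scripted, reading one path per line from standard input and using the documented argument order `hdfs fsck <path> -delete`. The function name `delete_corrupt_files` and the `DRY_RUN` variable are hypothetical; by default the sketch only prints the commands so they can be reviewed first.

```shell
# Run "hdfs fsck <path> -delete" for every path read from stdin.
# DRY_RUN is a hypothetical safety switch introduced for this sketch:
#   DRY_RUN=1 (default) prints the commands without running them,
#   DRY_RUN=0 actually deletes the corrupt files.
delete_corrupt_files() {
    while IFS= read -r corrupt_path; do
        # Skip blank lines in the input
        [ -n "$corrupt_path" ] || continue
        if [ "${DRY_RUN:-1}" = "1" ]; then
            printf 'sudo -u hdfs hdfs fsck "%s" -delete\n' "$corrupt_path"
        else
            # Redirect stdin so the command cannot consume the path list
            sudo -u hdfs hdfs fsck "$corrupt_path" -delete < /dev/null
        fi
    done
}
```

For example, `delete_corrupt_files < paths.txt` prints one command per path in a hypothetical paths.txt; setting `DRY_RUN=0` performs the deletion, which cannot be undone.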
5. Refresh the page again and the status returns to normal.
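Recovery can also be confirmed from the command line by rerunning fsck into a new log and reading its status line. The following is a minimal sketch, assuming the summary contains a `Status: HEALTHY` or `Status: CORRUPT` entry as in the log shown earlier; the function name `fsck_status` and the file name after.log are hypothetical.

```shell
# Print the overall status word (e.g. HEALTHY or CORRUPT) from an fsck log.
# Assumption: the summary contains "Status: <WORD>", as in test.log above.
# The pattern also tolerates progress dots printed before "Status:".
fsck_status() {
    # $1 is the fsck log file to read
    sed -n 's/.*Status: *\([A-Z][A-Z]*\).*/\1/p' "$1" | head -n 1
}
```

A possible usage is `sudo -u hdfs hdfs fsck / > after.log` followed by `fsck_status after.log`; anything other than HEALTHY means corrupt files remain.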
Reference: https://blog.csdn.net/hcq_lxq/article/details/121628219