How to Find Where a File Is Stored on HDFS

Author: 润土1030 | Published: 2019-02-27 17:41

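    Files stored on HDFS normally have several replicas, three by default, and we access a file through its URL (path). Which DataNode actually holds the file, and where on that node's local disk it lives, is something many people are unsure about. HDFS provides a command that answers this question, so let's walk through it.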

    The hdfs fsck command

    HDFS supports the fsck command to check for various inconsistencies. It is designed for reporting problems with various files, for example, missing blocks for a file or under-replicated blocks. Unlike a traditional fsck utility for native file systems, this command does not correct the errors it detects. Normally NameNode automatically corrects most of the recoverable failures. By default fsck ignores open files but provides an option to select all files during reporting. The HDFS fsck command is not a Hadoop shell command. It can be run as bin/hdfs fsck. For command usage, see the fsck entry in the HDFS Commands Guide. fsck can be run on the whole file system or on a subset of files.

    Usage
    hdfs fsck file_path_on_hdfs -files -blocks -locations
    
    Run the command against our file
    [hdfs@dlbdn3 data]$ hdfs fsck /user/ericsson/eop/template_workflow.xml -files -blocks -locations
    Connecting to namenode via http://dlbdn3:50070
    FSCK started by hdfs (auth:SIMPLE) from /192.168.123.4 for path /user/ericsson/eop/template_workflow.xml at Wed Feb 27 17:28:57 CST 2019
    /user/ericsson/eop/template_workflow.xml 3685 bytes, 1 block(s):  OK
    0. BP-358999289-192.168.123.4-1530520401469:blk_1074308735_568435 len=3685 Live_repl=3 [DatanodeInfoWithStorage[192.168.123.4:7710,DS-c440ebd2-4553-4b87-b2e1-67a8ae1e29c1,DISK], DatanodeInfoWithStorage[192.168.123.3:7710,DS-4c6c7796-0027-4cb9-a476-041a13146dcf,DISK], DatanodeInfoWithStorage[192.168.123.2:7710,DS-83c58757-f199-48e1-9d04-bd09fc996fbc,DISK]]
    
    Status: HEALTHY
     Total size:    3685 B
     Total dirs:    0
     Total files:   1
     Total symlinks:        0
     Total blocks (validated):  1 (avg. block size 3685 B)
     Minimally replicated blocks:   1 (100.0 %)
     Over-replicated blocks:    0 (0.0 %)
     Under-replicated blocks:   0 (0.0 %)
     Mis-replicated blocks:     0 (0.0 %)
     Default replication factor:    3
     Average block replication: 3.0
     Corrupt blocks:        0
     Missing replicas:      0 (0.0 %)
     Number of data-nodes:      3
     Number of racks:       1
    FSCK ended at Wed Feb 27 17:28:57 CST 2019 in 1 milliseconds
    
    
    The filesystem under path '/user/ericsson/eop/template_workflow.xml' is HEALTHY
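The two pieces of the report that matter for locating the file are the block identifier and the DataNode addresses on the block line. As a minimal sketch, they can be pulled out with grep and sed. The function names here are hypothetical helpers, not part of Hadoop, and the patterns assume the report format shown above, which may differ between Hadoop versions.

```shell
# Hypothetical helpers (not part of Hadoop). Both read fsck output on stdin.

# Print every block identifier of the form blk_<id>_<generation stamp>.
extract_block_ids() {
  grep -oE 'blk_[0-9]+_[0-9]+'
}

# Print every DataNode address (ip:port), one per line.
extract_datanodes() {
  grep -oE 'DatanodeInfoWithStorage\[[0-9.]+:[0-9]+' |
    sed -E 's/^DatanodeInfoWithStorage\[//'
}

# Block line copied from the fsck report above, so the sketch runs without a cluster.
sample='0. BP-358999289-192.168.123.4-1530520401469:blk_1074308735_568435 len=3685 Live_repl=3 [DatanodeInfoWithStorage[192.168.123.4:7710,DS-c440ebd2-4553-4b87-b2e1-67a8ae1e29c1,DISK], DatanodeInfoWithStorage[192.168.123.3:7710,DS-4c6c7796-0027-4cb9-a476-041a13146dcf,DISK], DatanodeInfoWithStorage[192.168.123.2:7710,DS-83c58757-f199-48e1-9d04-bd09fc996fbc,DISK]]'

printf '%s\n' "$sample" | extract_block_ids   # prints blk_1074308735_568435
printf '%s\n' "$sample" | extract_datanodes   # prints the three ip:port pairs
```

On a live cluster the same helpers could be fed directly, for example `hdfs fsck /user/ericsson/eop/template_workflow.xml -files -blocks -locations | extract_datanodes`.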
    
    
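    Using the IP addresses listed in DatanodeInfoWithStorage, log in to one of those nodes and run find to locate the block. Note that the search pattern includes the generation stamp (_568435), so find matches only the .meta file; the block data file itself is named blk_1074308735 and sits in the same directory.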
    [root@dlbdn3 subdir166]# find / -name "*blk_1074308735_568435*"
    find: ‘/run/user/42/gvfs’: Permission denied
    /data/2/dfs/dn/current/BP-358999289-192.168.123.4-1530520401469/current/finalized/subdir8/subdir166/blk_1074308735_568435.meta
    [root@dlbdn3 subdir166]# cd /data/2/dfs/dn/current/BP-358999289-192.168.123.4-1530520401469/current/finalized/subdir8/subdir166
    [root@dlbdn3 subdir166]# ll | grep blk_1074308735
    -rw-r--r-- 1 hdfs hdfs   3685 Feb 27 16:29 blk_1074308735
    -rw-r--r-- 1 hdfs hdfs     39 Feb 27 16:29 blk_1074308735_568435.meta
    [root@dlbdn3 subdir166]# 
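Searching from / works but is slow and noisy. A hedged alternative is to search only the DataNode data directory and to match both the data file and the .meta file by exact name. The helpers below are hypothetical; the data directory (/data/2/dfs/dn in the session above) is an assumption that differs per cluster, and it should be possible to read the configured value with `hdfs getconf -confKey dfs.datanode.data.dir`.

```shell
# Hypothetical helpers (not part of Hadoop).

# Strip the generation stamp: blk_1074308735_568435 -> blk_1074308735
block_data_name() {
  printf '%s\n' "$1" | sed -E 's/^(blk_[0-9]+)_[0-9]+$/\1/'
}

# List the data file and the .meta file for one block under a data directory.
#   $1: DataNode data directory (assumption: e.g. /data/2/dfs/dn)
#   $2: block identifier as printed by fsck, e.g. blk_1074308735_568435
find_block_files() {
  data_dir=$1
  block_id=$2
  base=$(block_data_name "$block_id")
  find "$data_dir" -type f \( -name "$base" -o -name "$block_id.meta" \) 2>/dev/null
}

# Example call on a DataNode:
#   find_block_files /data/2/dfs/dn blk_1074308735_568435
```

Matching exact names rather than a wildcard avoids picking up other blocks whose IDs merely start with the same digits.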
    
    
    Look at the contents of the blk file to check that it is the file we are looking for
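    The contents confirm that it is the same file, so we have now found where the HDFS file is physically stored. It is simple and practical, and this information is needed more often than you might expect.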
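Instead of comparing the contents by eye, the check can be sketched as a byte-for-byte comparison between the file read through HDFS and the local block file. This only makes sense under some assumptions: the file fits in a single block (as the 3685-byte file here does), and it is stored as plain replicated data rather than in an encryption zone or with erasure coding. The helper name is hypothetical.

```shell
# Hypothetical helper (not part of Hadoop).
# Compare the data arriving on stdin with a local block file.
#   $1: path to the blk_<id> data file on the DataNode
compare_with_block() {
  if cmp -s - "$1"; then
    echo "match"
  else
    echo "differ"
  fi
}

# Example call on the DataNode (paths taken from the session above):
#   hdfs dfs -cat /user/ericsson/eop/template_workflow.xml |
#     compare_with_block /data/2/dfs/dn/current/BP-358999289-192.168.123.4-1530520401469/current/finalized/subdir8/subdir166/blk_1074308735
```

For files that span several blocks, each blk file holds only one slice of the data, so a whole-file comparison like this would report "differ".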
