4 HDFS Common Commands 2018-05-24

Author: lizhigang | Published 2018-05-25 09:51

    1. The jps command: list Java processes

    [hadoop@hadoop003 ~]$ jps
    2034 NameNode
    2148 DataNode
    2633 NodeManager
    5129 Jps
    2521 ResourceManager
    2364 SecondaryNameNode
    

    List processes with their fully qualified main-class names:

    [hadoop@hadoop003 ~]$ jps -l
    2034 org.apache.hadoop.hdfs.server.namenode.NameNode
    5170 sun.tools.jps.Jps
    2148 org.apache.hadoop.hdfs.server.datanode.DataNode
    2633 org.apache.hadoop.yarn.server.nodemanager.NodeManager
    2521 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager
    2364 org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode
    

    Check details of a specific process (here the DataNode, PID 2148):

    [hadoop@hadoop003 ~]$ ps -el | grep 2148
    0 S   515  2148     1  0  80   0 - 690720 futex_ ?       00:00:42 java
    

    If jps still lists a process that ps shows is no longer running, the entry is stale; delete the leftover jps perf-data file (named after the PID) under /tmp/hsperfdata_<user>, not the PID itself:

    [hadoop@hadoop003 ~]$ rm -f /tmp/hsperfdata_hadoop/2148
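
    The same cleanup can be scripted, checking first that the PID really is gone. A minimal sketch, assuming the standard jps behavior of reading perf-data files from /tmp/hsperfdata_<user>/ (PID 2148 is the DataNode example from above):

```shell
# jps lists JVMs by reading per-process perf-data files from
# /tmp/hsperfdata_<user>/. A PID that jps shows but ps does not is a
# stale entry; removing its file clears it from jps output.
PID=2148
DIR="/tmp/hsperfdata_$(whoami)"
if ! ps -p "$PID" > /dev/null 2>&1; then
    rm -f "$DIR/$PID"   # safe: the process no longer exists
fi
```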
    

    2. hadoop and hdfs filesystem commands

    hdfs dfs is equivalent to hadoop fs.
    List the contents of the root directory:

    [hadoop@hadoop003 hadoop]$ bin/hdfs dfs -ls /
    Found 3 items
    drwxr-xr-x   - hadoop supergroup          0 2018-05-23 12:27 /lizhigangdir001
    drwx------   - hadoop supergroup          0 2018-05-17 12:33 /tmp
    drwxr-xr-x   - hadoop supergroup          0 2018-05-17 12:33 /user
    

    Create a directory (-p creates parent directories as needed):

    [hadoop@hadoop003 hadoop]$ bin/hdfs dfs -mkdir -p /lizhigangdir001/001
    

    Create a local file containing 000000:

    [hadoop@hadoop003 hadoop]$ echo "000000" > lizhigang.log
    

    Upload lizhigang.log to /lizhigangdir001/001/:

    [hadoop@hadoop003 hadoop]$ bin/hdfs dfs -put lizhigang.log /lizhigangdir001/001
    

    View the contents of /lizhigangdir001/001/lizhigang.log:

    [hadoop@hadoop003 hadoop]$ bin/hdfs dfs -cat /lizhigangdir001/001/lizhigang.log
    

    Download lizhigang.log from /lizhigangdir001/001 to the local /tmp/ directory:

    [hadoop@hadoop003 hadoop]$ bin/hdfs dfs -get /lizhigangdir001/001/lizhigang.log /tmp/
    

    Download lizhigang.log from /lizhigangdir001/001 to /tmp/ and rename it lizhigang1.log:

    bin/hdfs dfs -get /lizhigangdir001/001/lizhigang.log /tmp/lizhigang1.log
    

    [-moveFromLocal <localsrc> ... <dst>] upload; the local source is deleted after the copy
    [-moveToLocal <src> <localdst>] download; listed in the usage text, but not yet implemented in Hadoop 2.x
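
    -moveFromLocal behaves like -put followed by deleting the local source. A sketch reusing the directory from above (move_me.log is an illustrative file name):

    [hadoop@hadoop003 hadoop]$ echo "111111" > move_me.log
    [hadoop@hadoop003 hadoop]$ bin/hdfs dfs -moveFromLocal move_me.log /lizhigangdir001/001
    [hadoop@hadoop003 hadoop]$ ls move_me.log
    ls: cannot access move_me.log: No such file or directory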

    Delete lizhigang.log from /lizhigangdir001/001. With the trash feature enabled, the file is moved to the trash and can be restored within the configured retention interval:

    [hadoop@hadoop003 hadoop]$ bin/hdfs dfs -rm -f /lizhigangdir001/001/lizhigang.log
    

    Same deletion, but -skipTrash bypasses the trash, so the file cannot be recovered:

    [hadoop@hadoop003 hadoop]$ bin/hdfs dfs -rm -f  -skipTrash /lizhigangdir001/001/lizhigang.log
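
    Whether -rm uses the trash at all is controlled by fs.trash.interval in core-site.xml; 0 (the default) disables it, in which case even a plain -rm is permanent. A sketch enabling a 7-day (10080-minute) retention; deleted files then land under /user/<username>/.Trash, from which they can be copied back before the interval expires:

    <property>
        <name>fs.trash.interval</name>
        <value>10080</value>
    </property>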
    

    3. hdfs dfsadmin
    View cluster capacity and DataNode status:

    [hadoop@hadoop003 hadoop]$ bin/hdfs dfsadmin -report
    Configured Capacity: 39900024832 (37.16 GB)
    Present Capacity: 23986335744 (22.34 GB)
    DFS Remaining: 23986073600 (22.34 GB)
    DFS Used: 262144 (256 KB)
    DFS Used%: 0.00%
    Under replicated blocks: 0
    Blocks with corrupt replicas: 0
    Missing blocks: 0
    Missing blocks (with replication factor 1): 0
    Pending deletion blocks: 0
    -------------------------------------------------
    Live datanodes (1):
    Name: 192.168.137.200:50010 (hadoop003)
    Hostname: hadoop003
    Decommission Status : Normal
    Configured Capacity: 39900024832 (37.16 GB)
    DFS Used: 262144 (256 KB)
    Non DFS Used: 13886844928 (12.93 GB)
    DFS Remaining: 23986073600 (22.34 GB)
    DFS Used%: 0.00%
    DFS Remaining%: 60.12%
    Configured Cache Capacity: 0 (0 B)
    Cache Used: 0 (0 B)
    Cache Remaining: 0 (0 B)
    Cache Used%: 100.00%
    Cache Remaining%: 0.00%
    Xceivers: 1
    Last contact: Fri May 25 09:38:02 CST 2018
    

    Safe mode: enter, leave, query status, or block until it turns off:

    [hadoop@hadoop003 hadoop]$ bin/hdfs dfsadmin -safemode [ enter | leave | get | wait ]
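
    The NameNode starts in safe mode (read-only) and leaves it once enough block reports have arrived; -safemode wait blocks until that happens, which is useful before scripted writes. A sketch:

    [hadoop@hadoop003 hadoop]$ bin/hdfs dfsadmin -safemode get
    Safe mode is OFF
    [hadoop@hadoop003 hadoop]$ bin/hdfs dfsadmin -safemode wait && bin/hdfs dfs -put lizhigang.log /lizhigangdir001/001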
    

    4. hadoop fsck
    Check the health of the entire filesystem:

    [hadoop@hadoop003 hadoop]$ bin/hadoop fsck /
    DEPRECATED: Use of this script to execute hdfs command is deprecated.
    Instead use the hdfs command for it.
    
    Connecting to namenode via http://hadoop003:50070/fsck?ugi=hadoop&path=%2F
    FSCK started by hadoop (auth:SIMPLE) from /192.168.137.200 for path / at Fri May 25 09:45:02 CST 2018
    ...Status: HEALTHY
     Total size:    194589 B
     Total dirs:    13
     Total files:   3
     Total symlinks:                0
     Total blocks (validated):      3 (avg. block size 64863 B)
     Minimally replicated blocks:   3 (100.0 %)
     Over-replicated blocks:        0 (0.0 %)
     Under-replicated blocks:       0 (0.0 %)
     Mis-replicated blocks:         0 (0.0 %)
     Default replication factor:    1
     Average block replication:     1.0
     Corrupt blocks:                0
     Missing replicas:              0 (0.0 %)
     Number of data-nodes:          1
     Number of racks:               1
    FSCK ended at Fri May 25 09:45:02 CST 2018 in 7 milliseconds
    The filesystem under path '/' is HEALTHY
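
    fsck can also drill into a single path and show where each block's replicas live; -files, -blocks and -locations are standard fsck options:

    [hadoop@hadoop003 hadoop]$ bin/hdfs fsck /lizhigangdir001/001/lizhigang.log -files -blocks -locations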
    

    Print the Hadoop classpath:

    [hadoop@hadoop003 hadoop]$ bin/hadoop classpath
    /opt/software/hadoop-2.8.1/etc/hadoop:/opt/software/hadoop-2.8.1/share/hadoop/common/lib/*:/opt/software/hadoop-2.8.1/share/hadoop/common/*:/opt/software/hadoop-2.8.1/share/hadoop/hdfs:/opt/software/hadoop-2.8.1/share/hadoop/hdfs/lib/*:/opt/software/hadoop-2.8.1/share/hadoop/hdfs/*:/opt/software/hadoop-2.8.1/share/hadoop/yarn/lib/*:/opt/software/hadoop-2.8.1/share/hadoop/yarn/*:/opt/software/hadoop-2.8.1/share/hadoop/mapreduce/lib/*:/opt/software/hadoop-2.8.1/share/hadoop/mapreduce/*:/contrib/capacity-scheduler/*.jar
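
    The classpath string is mainly useful for launching a standalone JVM client against the cluster; a sketch (myapp.jar and MyHdfsClient are hypothetical names):

    [hadoop@hadoop003 hadoop]$ java -cp "$(bin/hadoop classpath):myapp.jar" MyHdfsClient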
    

    5. start-balancer.sh
    Problem 1: data is unevenly distributed across the disks of multiple machines.
    Solutions:
    1.1 No new machines added; disk usage across the existing machines is uneven. Raise the balancer bandwidth (52428800 bytes = 50 MB/s), then run the balancer:

           [hadoop@rzdatahadoop002 ~]$ hdfs dfsadmin -setBalancerBandwidth  52428800
           Balancer bandwidth is set to 52428800
           [hadoop@rzdatahadoop002 ~]$ 
           [hadoop@rzdatahadoop002 sbin]$ ./start-balancer.sh  
           equivalent to:
           [hadoop@rzdatahadoop002 sbin]$ hdfs balancer 
           Apache Hadoop cluster: schedule the balancer from a shell script each night during the business off-peak window
           CDH cluster: this step can be skipped
           http://blog.itpub.net/30089851/viewspace-2052138/
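
           The balancer also accepts -threshold: the allowed deviation, in percentage points, of each DataNode's utilization from the cluster average (default 10). A tighter run:

           [hadoop@rzdatahadoop002 sbin]$ hdfs balancer -threshold 5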
    

    1.2 New machines added: the existing machines are nearly full (e.g. 450 GB used of 500 GB) while the new machines have 5 TB disks. During the business off-peak window, first add the new machines to HDFS as DataNodes; then take the old DataNodes out of service one at a time and wait for HDFS to re-replicate each one's blocks back to 3 copies (this is the most network- and IO-intensive step, and the riskiest).

    2. Problem 2: data is unevenly distributed across the multiple disks of a single machine.
    2.1 Whether or not new disks were added, usage across the disks is uneven; use the HDFS Disk Balancer:

        https://hadoop.apache.org/docs/r3.0.0-alpha2/hadoop-project-dist/hadoop-hdfs/HDFSDiskbalancer.html
        hdfs diskbalancer -plan node1.mycluster.com
        hdfs diskbalancer -execute /system/diskbalancer/nodename.plan.json
    
        Available in Apache Hadoop 3.x and CDH 5.12+.
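
    The full disk-balancer cycle is plan → execute → query; -query reports the status of the running plan on that node:

        hdfs diskbalancer -query node1.mycluster.com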
    

    Source: https://www.haomeiwen.com/subject/juysjftx.html