You can also flush your current trash folders, as shown further down. First, here are the related commands for seeing what is going on:
$ hadoop fs -df -h
Filesystem Size Used Available Use%
hdfs://soptct52-01.hursley.ibm.com:8020 284.7 G 132.5 G 10.7 G 47%
This shows the current state of the HDFS filesystem: total size, space used, and space available.
$ hdfs dfs -du -h /
908.2 K /app-logs
20.0 K /apps
0 /benchmarks
380.3 M /iop
0 /mapred
9.4 M /mr-history
1.1 M /tmp
53.3 G /user
This gives a breakdown of the filesystem structure, showing where most of the data is held.
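On a larger tree it can help to rank the entries rather than scan them by eye. The following is only a sketch: rank_du is a hypothetical helper name of mine, and it assumes the "size, optional unit, path" layout shown above (some Hadoop versions print an extra column, and paths containing spaces are not handled).

```shell
# rank_du: hypothetical helper, not part of Hadoop.
# Reads "hdfs dfs -du -h" style lines on stdin and prints the paths,
# largest first. Assumes "<size> [K|M|G|T] <path>" as in the listing above.
rank_du() {
  awk '
    BEGIN { m["K"] = 1024; m["M"] = 1024^2; m["G"] = 1024^3; m["T"] = 1024^4 }
    NF == 2 { printf "%.0f\t%s\n", $1, $2 }          # e.g. "0 /mapred"
    NF == 3 { printf "%.0f\t%s\n", $1 * m[$2], $3 }  # e.g. "53.3 G /user"
  ' | sort -t "$(printf '\t')" -k1,1nr | cut -f2
}
```

Typical use would be something like `hdfs dfs -du -h / | rank_du | head -5`, which for the listing above would put /user at the top.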
$ hadoop dfsadmin -report
Configured Capacity: 305667547136 (284.68 GB)
Present Capacity: 153351950336 (142.82 GB)
DFS Remaining: 11116261376 (10.35 GB)
DFS Used: 142235688960 (132.47 GB)
DFS Used%: 92.75%
Under replicated blocks: 855
Blocks with corrupt replicas: 0
Missing blocks: 4
Missing blocks (with replication factor 1): 136
Live datanodes (3):
Name: 9.20.170.23:50010 (soptct52-01.hursley.ibm.com)
Hostname: soptct52-01.hursley.ibm.com
Decommission Status : Normal
Configured Capacity: 101844443136 (94.85 GB)
DFS Used: 58115502080 (54.12 GB)
Non DFS Used: 39498178560 (36.79 GB)
DFS Remaining: 4230762496 (3.94 GB)
DFS Used%: 57.06%
DFS Remaining%: 4.15%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 3
Last contact: Thu Aug 13 15:49:44 BST 2015
Name: 9.20.170.39:50010 (soptct52-05.hursley.ibm.com)
Hostname: soptct52-05.hursley.ibm.com
Decommission Status : Normal
Configured Capacity: 101844443136 (94.85 GB)
DFS Used: 41994641408 (39.11 GB)
Non DFS Used: 59849801728 (55.74 GB)
DFS Remaining: 0 (0 B)
DFS Used%: 41.23%
DFS Remaining%: 0.00%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 3
Last contact: Thu Aug 13 15:49:44 BST 2015
Name: 9.20.170.35:50010 (soptct52-03.hursley.ibm.com)
Hostname: soptct52-03.hursley.ibm.com
Decommission Status : Normal
Configured Capacity: 101978660864 (94.98 GB)
DFS Used: 42125545472 (39.23 GB)
Non DFS Used: 52967616512 (49.33 GB)
DFS Remaining: 6885498880 (6.41 GB)
DFS Used%: 41.31%
DFS Remaining%: 6.75%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 2
Last contact: Thu Aug 13 15:49:44 BST 2015
This gives a breakdown of HDFS usage across the cluster and on each of its nodes.
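The 47% from -df and the 92.75% from the report look contradictory, but the numbers are consistent with the two commands dividing by different totals: -df by the configured capacity, the report by the present capacity (DFS Used plus DFS Remaining). The sketch below just redoes that arithmetic with the byte counts copied from the report; pct is a hypothetical helper name of mine.

```shell
# Byte counts copied from the cluster summary in the report above.
configured=305667547136
present=153351950336
dfs_used=142235688960
dfs_remaining=11116261376

# pct: hypothetical helper, prints 100 * $1 / $2 to two decimals.
pct() { awk -v n="$1" -v d="$2" 'BEGIN { printf "%.2f\n", 100 * n / d }'; }

pct "$dfs_used" "$present"       # 92.75, matches "DFS Used%" in the report
pct "$dfs_used" "$configured"    # 46.53, which -df rounds to 47%

# Present capacity is exactly DFS Used + DFS Remaining for these figures.
echo $((dfs_used + dfs_remaining))   # 153351950336
```

The gap between configured and present capacity (about 141.9 GB) equals the sum of the three "Non DFS Used" values, so on this cluster roughly half of the disk is taken by files outside HDFS.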
[hdfs@soptct52-01 ~]$ hadoop fs -expunge
15/08/13 16:03:18 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 360 minutes, Emptier interval = 0 minutes.
15/08/13 16:03:18 INFO fs.TrashPolicyDefault: Created trash checkpoint: /user/hdfs/.Trash/150813160318
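Note that the log only reports a new checkpoint being created. As far as I understand it, -expunge removes checkpoints older than the deletion interval (360 minutes here) and rolls the current trash into a fresh checkpoint, so the space may not come back straight away. The checkpoint name looks like a yyMMddHHmmss timestamp, since it matches the time on the log line. A small sketch for decoding it follows; checkpoint_time is a hypothetical helper of mine and assumes a 12-digit name from this century.

```shell
# checkpoint_time: hypothetical helper, not part of Hadoop.
# Turns a 12-digit trash checkpoint name (assumed yyMMddHHmmss)
# into a readable timestamp; other input is printed unchanged.
checkpoint_time() {
  printf '%s\n' "$1" |
    sed -E 's/^([0-9]{2})([0-9]{2})([0-9]{2})([0-9]{2})([0-9]{2})([0-9]{2})$/20\1-\2-\3 \4:\5:\6/'
}

checkpoint_time 150813160318   # 2015-08-13 16:03:18
```

To see how much the trash is actually holding, `hdfs dfs -du -s -h /user/hdfs/.Trash` before and after the expunge is a quick check.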
The other option when cleaning up your data is to use the -skipTrash flag, which deletes the data immediately instead of moving it to the trash:
$ hadoop fs -rm -R -skipTrash /folder-path
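Because -skipTrash cannot be undone, I find it safer to go through a small wrapper that prints the command first. This is only a sketch: rm_skip_trash and the DO_IT variable are hypothetical names of mine, and the list of refused paths is an example to adapt.

```shell
# rm_skip_trash: hypothetical wrapper, not a Hadoop command.
# Refuses a few obviously dangerous targets and only prints the
# command unless DO_IT=1 is set in the environment.
rm_skip_trash() {
  case "$1" in
    ""|/|/user|/user/|/tmp|/tmp/)
      echo "refusing to delete '$1'" >&2
      return 1 ;;
  esac
  if [ "${DO_IT:-0}" = "1" ]; then
    hadoop fs -rm -R -skipTrash "$1"
  else
    echo "would run: hadoop fs -rm -R -skipTrash $1"
  fi
}
```

Running `rm_skip_trash /folder-path` shows what would be executed; `DO_IT=1 rm_skip_trash /folder-path` performs the delete.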