DataNode fails to start when starting Hadoop


Author: cllblogs | Published 2019-08-26 09:02
    • After starting HDFS, jps shows no DataNode process
    [hadoop@hadoop001 sbin]$ start-dfs.sh 
    Starting namenodes on [hadoop001]
    hadoop001: starting namenode, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.15.1/logs/hadoop-hadoop-namenode-hadoop001.out
    hadoop001: starting datanode, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.15.1/logs/hadoop-hadoop-datanode-hadoop001.out
    Starting secondary namenodes [0.0.0.0]
    0.0.0.0: starting secondarynamenode, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.15.1/logs/hadoop-hadoop-secondarynamenode-hadoop001.out
    
    [hadoop@hadoop001 sbin]$ jps
    2770 ResourceManager
    2883 NodeManager
    5880 SecondaryNameNode
    5995 Jps
    5599 NameNode
    
    • Tried starting the DataNode on its own, but it still did not come up
    [hadoop@hadoop001 sbin]$ hadoop-daemon.sh start datanode
    starting datanode, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.15.1/logs/hadoop-hadoop-datanode-hadoop001.out
    
    [hadoop@hadoop001 sbin]$ jps
    2770 ResourceManager
    2883 NodeManager
    5880 SecondaryNameNode
    6107 Jps
    5599 NameNode
    
    • Next, check the DataNode log under the Hadoop logs directory.
      The error is as follows:
    2019-08-26 08:11:56,368 WARN org.apache.hadoop.hdfs.server.common.Storage: Failed to add storage directory [DISK]file:/tmp/hadoop-hadoop/dfs/data/
    java.io.IOException: Incompatible clusterIDs in /tmp/hadoop-hadoop/dfs/data: namenode clusterID = CID-56cb1b3a-d272-4b55-a560-93d34f3ea536; datanode clusterID = CID-f06280d7-1870-452d-a155-419b58c23f55
            at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:779)
            at org.apache.hadoop.hdfs.server.datanode.DataStorage.loadStorageDirectory(DataStorage.java:302)
            at org.apache.hadoop.hdfs.server.datanode.DataStorage.loadDataStorage(DataStorage.java:418)
            at org.apache.hadoop.hdfs.server.datanode.DataStorage.addStorageLocations(DataStorage.java:397)
            at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:575)
            at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1560)
            at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1520)
            at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:354)
            at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:219)
            at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:673)
            at java.lang.Thread.run(Thread.java:745)
    2019-08-26 08:11:56,371 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool ID needed, but service not yet registered with NN, trace:
    java.lang.Exception
            at org.apache.hadoop.hdfs.server.datanode.BPOfferService.getBlockPoolId(BPOfferService.java:190)
            at org.apache.hadoop.hdfs.server.datanode.BPOfferService.hasBlockPoolId(BPOfferService.java:200)
            at org.apache.hadoop.hdfs.server.datanode.BPOfferService.shouldRetryInit(BPOfferService.java:799)
            at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.shouldRetryInit(BPServiceActor.java:712)
            at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:678)
            at java.lang.Thread.run(Thread.java:745)
    2019-08-26 08:11:56,371 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for Block pool <registering> (Datanode Uuid c7fd2b2f-13af-4ec2-9d60-17a9122bc43d) service to hadoop001/172.19.6.118:9000. Exiting.
    java.io.IOException: All specified directories are failed to load.
            at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:576)
            at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1560)
            at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1520)
            at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:354)
            at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:219)
            at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:673)
            at java.lang.Thread.run(Thread.java:745)
    2019-08-26 08:11:56,371 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool <registering> (Datanode Uuid c7fd2b2f-13af-4ec2-9d60-17a9122bc43d) service to hadoop001/172.19.6.118:9000
    2019-08-26 08:11:56,472 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool ID needed, but service not yet registered with NN, trace:
    java.lang.Exception
            at org.apache.hadoop.hdfs.server.datanode.BPOfferService.getBlockPoolId(BPOfferService.java:190)
            at org.apache.hadoop.hdfs.server.datanode.BPOfferService.hasBlockPoolId(BPOfferService.java:200)
            at org.apache.hadoop.hdfs.server.datanode.BlockPoolManager.remove(BlockPoolManager.java:91)
            at org.apache.hadoop.hdfs.server.datanode.DataNode.shutdownBlockPool(DataNode.java:1475)
            at org.apache.hadoop.hdfs.server.datanode.BPOfferService.shutdownActor(BPOfferService.java:437)
            at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.cleanUp(BPServiceActor.java:457)
            at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:707)
            at java.lang.Thread.run(Thread.java:745)
    2019-08-26 08:11:56,473 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Removed Block pool <registering> (Datanode Uuid c7fd2b2f-13af-4ec2-9d60-17a9122bc43d)
    2019-08-26 08:11:56,473 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool ID needed, but service not yet registered with NN, trace:
    

    This reveals the problem:
    namenode clusterID = CID-56cb1b3a-d272-4b55-a560-93d34f3ea536
    datanode clusterID = CID-f06280d7-1870-452d-a155-419b58c23f55
    The failure is caused by the mismatch between these two IDs. This typically happens when the NameNode is re-formatted (which generates a new clusterID) while the DataNode keeps its old storage directory.
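    You can confirm the mismatch yourself by comparing the clusterID recorded on each side. A small sketch, assuming the default storage paths seen in the log above (on your own cluster, substitute the directories configured in dfs.namenode.name.dir / dfs.datanode.data.dir):

```shell
# Print the clusterID stored on the NameNode side and the DataNode side.
# Paths are the defaults from the log above; adjust to your own config.
for d in name data; do
  f="/tmp/hadoop-hadoop/dfs/$d/current/VERSION"
  if [ -f "$f" ]; then
    echo "$d: $(grep '^clusterID=' "$f")"
  else
    echo "$d: no VERSION file at $f"
  fi
done
```

    If the two printed IDs differ, the DataNode will refuse to register with the NameNode, which is exactly the "Incompatible clusterIDs" error above.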

    • Solutions
      Method 1: go to the path shown in the log: /tmp/hadoop-hadoop/dfs/data (accessed here as the root user)
    [root@hadoop001 dfs]# ll
    total 12
    drwx------ 3 hadoop hadoop 4096 Aug 26 08:48 data
    drwxrwxr-x 3 hadoop hadoop 4096 Aug 26 08:33 name
    drwxrwxr-x 3 hadoop hadoop 4096 Aug 26 08:33 namesecondary
    
    # Copy the clusterID value from name/current/VERSION
    # and paste it after "clusterID=" in data/current/VERSION,
    # i.e. make the clusterIDs under name and data identical
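    The copy step above can also be scripted. The sketch below first recreates the two mismatched VERSION entries from this log purely for demonstration (on a real cluster those files already exist, so skip that part), then writes the NameNode's clusterID into the DataNode's VERSION file:

```shell
DFS=/tmp/hadoop-hadoop/dfs

# Demo setup only: recreate the mismatched VERSION entries from the log.
# On a real cluster these files already exist; do NOT overwrite them.
mkdir -p "$DFS/name/current" "$DFS/data/current"
echo 'clusterID=CID-56cb1b3a-d272-4b55-a560-93d34f3ea536' > "$DFS/name/current/VERSION"
echo 'clusterID=CID-f06280d7-1870-452d-a155-419b58c23f55' > "$DFS/data/current/VERSION"

# The actual fix: read the NameNode's clusterID and write it into the
# DataNode's VERSION file so the two sides agree again.
NN_CID=$(grep '^clusterID=' "$DFS/name/current/VERSION" | cut -d= -f2)
sed -i "s/^clusterID=.*/clusterID=${NN_CID}/" "$DFS/data/current/VERSION"

grep '^clusterID=' "$DFS/data/current/VERSION"
```

    After this, the DataNode keeps its existing blocks; no data is lost, which is why Method 1 is preferable on a cluster that already holds data.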
    

    Method 2: delete everything under the data and name directories and re-format the NameNode (note: this destroys all existing HDFS data)
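    If the data on this single-node setup is disposable, Method 2 looks like the following sketch (paths are the defaults from the log above; the Hadoop commands are shown commented out since they must run against the cluster itself):

```shell
# Stop HDFS before touching the storage directories:
#   stop-dfs.sh

# Remove both storage dirs. WARNING: all HDFS data is lost.
rm -rf /tmp/hadoop-hadoop/dfs/data /tmp/hadoop-hadoop/dfs/name

# Re-format the NameNode (generates a fresh clusterID) and restart;
# the DataNode will then create a new data dir with the matching ID:
#   hdfs namenode -format
#   start-dfs.sh
```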

    • Restart HDFS
    [hadoop@hadoop001 sbin]$ start-dfs.sh
    Starting namenodes on [hadoop001]
    hadoop001: namenode running as process 5599. Stop it first.
    hadoop001: starting datanode, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.15.1/logs/hadoop-hadoop-datanode-hadoop001.out
    Starting secondary namenodes [0.0.0.0]
    0.0.0.0: secondarynamenode running as process 5880. Stop it first.
    [hadoop@hadoop001 sbin]$ jps
    2770 ResourceManager
    2883 NodeManager
    5880 SecondaryNameNode
    6331 DataNode  # started normally
    6556 Jps
    5599 NameNode
    

    Original link: https://www.haomeiwen.com/subject/jhilectx.html