[Ongoing] Personal notes: pitfalls hit along the big-data Hadoop road

Author: gk4030 | Published 2015-07-11 18:14

    Environment:
    CentOS-6.4-x86_64-bin-DVD1.iso
    hadoop-2.4.1.tar.gz
    hbase-0.98.3-hadoop2-bin.tar.gz
    jdk-7u79-linux-x64.tar.gz
    scala-2.10.4.tgz
    spark-1.2.0-bin-hadoop2.4.tgz.tar
    zookeeper-3.4.6.tar.gz

    Download links:

    hadoop2.4.1:http://archive.apache.org/dist/hadoop/common/hadoop-2.4.1/hadoop-2.4.1.tar.gz
    hbase0.98.3:http://archive.apache.org/dist/hbase/hbase-0.98.3/hbase-0.98.3-hadoop2-bin.tar.gz
    spark1.2.0:http://archive.apache.org/dist/spark/spark-1.2.0/spark-1.2.0-bin-hadoop2.4.tgz
    zookeeper3.4.6:http://apache.fayea.com/zookeeper/zookeeper-3.4.6/zookeeper-3.4.6.tar.gz
    scala2.10.4:http://www.scala-lang.org/files/archive/scala-2.10.4.tgz


    1. Hadoop and HBase version compatibility [pick the wrong combination and you will be tearing it all down and starting over later]

    See the Apache HBase Reference Guide:
    https://hbase.apache.org/book.html#configuration
    ---- Section 4.1, Hadoop

    (screenshot of the Hadoop/HBase support matrix from the reference guide)

    2. NameNode fails to start

    The log ends with "ulimit -a for user root".
    Fix:
    Reformat the NameNode, then start Hadoop; jps now shows the NameNode process.
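
    A minimal sketch of that fix, assuming Hadoop lives under /opt/program/hadoop-2.4.1 (reformatting wipes the HDFS metadata, so only do this on a fresh or disposable cluster):

    cd /opt/program/hadoop-2.4.1
    sbin/stop-dfs.sh                # stop any half-started daemons first
    bin/hdfs namenode -format       # reformat the NameNode metadata (destroys existing HDFS data)
    sbin/start-dfs.sh               # start HDFS again
    jps | grep NameNode             # confirm the NameNode process is present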

    3. The DataNode may also fail to start

    The log again shows "ulimit -a for user root".
    Cause: the number of files the DataNode opens at runtime hits the system's open-file limit.
    Current limit:
    [root@centos-FI hadoop-2.4.1]# ulimit -n
    1024
    Fix:
    Raise the maximum number of open files:
    [root@centos-FI hadoop-2.4.1]# ulimit -n 65536
    [root@centos-FI hadoop-2.4.1]# ulimit -n
    65536
    Start Hadoop again:
    [root@centos-FI hadoop-2.4.1]# jps
    6330 WebAppProxyServer
    6097 NameNode
    6214 ResourceManager
    6148 DataNode
    6441 Jps
    6271 NodeManager
    6390 JobHistoryServer
    PS: ulimit only changes the limit for the current session; after a reboot it reverts to the default. For a permanent change, edit the nofile limits in /etc/security/limits.conf.
    Reference: http://labs.chinamobile.com/mblog/225_17546
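
    A minimal sketch of the permanent change, assuming the Hadoop daemons run as root (log out and back in for it to take effect):

    # /etc/security/limits.conf
    root soft nofile 65536
    root hard nofile 65536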

    4. hadoop fs -ls warns "WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable"

    Hit this one too: a 64-bit operating system with a 32-bit JDK. Fortunately the warning has no real impact and can be ignored.
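
    A quick way to confirm the architecture mismatch (the paths below are assumptions based on this install layout):

    java -version                                                   # a 64-bit JDK reports "64-Bit Server VM"
    file /opt/program/hadoop-2.4.1/lib/native/libhadoop.so.1.0.0    # shows whether the native library is 32-bit or 64-bit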

    5. hadoop fs -ls reports "ls: `.': No such file or directory"

    This is a difference between Hadoop 2.x and 1.x:
    in 1.x the bare command works,
    while in 2.x you have to pass a path, e.g.: hadoop fs -ls /
    [What a pit]
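
    Alternatively, the bare hadoop fs -ls lists the current user's HDFS home directory, which simply does not exist yet; creating it once brings the 1.x-style command back (a sketch, assuming you run as root):

    hadoop fs -mkdir -p /user/root
    hadoop fs -ls                   # now lists /user/root instead of erroring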

    6. When starting HBase, the HMaster starts and then disappears, and after bin/hbase shell, running list fails with "ERROR: can't get master address from ZooKeeper; znode data == null"

    [root@centos-FI hbase-0.98.3-hadoop2]# jps
    28406 NameNode
    28576 ResourceManager
    32196 HRegionServer
    32079 HMaster
    28464 DataNode
    32253 Jps
    28748 JobHistoryServer
    28635 NodeManager
    24789 QuorumPeerMain
    [root@centos-FI hbase-0.98.3-hadoop2]# jps
    28406 NameNode
    28576 ResourceManager
    32196 HRegionServer
    28464 DataNode
    32293 Jps
    28748 JobHistoryServer
    28635 NodeManager
    24789 QuorumPeerMain
    [root@centos-FI hbase-0.98.3-hadoop2]#

    The log shows:

    2015-07-13 13:25:53,904 DEBUG [main-EventThread] master.ActiveMasterManager: A master is now available
    2015-07-13 13:25:53,912 INFO [master:centos-FI:60000] master.ActiveMasterManager: Registered Active Master=centos-FI,60000,1436819147113
    2015-07-13 13:25:53,921 INFO [master:centos-FI:60000] Configuration.deprecation: fs.default.name is deprecated. Instead, use fs.defaultFS
    2015-07-13 13:25:54,250 FATAL [master:centos-FI:60000] master.HMaster: Unhandled exception. Starting shutdown.
    java.net.ConnectException: Call From centos-FI/127.0.0.1 to centos-FI:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused

    Fix:
    Following this hint from http://wiki.apache.org/hadoop/ConnectionRefused: "Check that there isn't an entry for your hostname mapped to 127.0.0.1 or 127.0.1.1 in /etc/hosts (Ubuntu is notorious for this)", check:

    [root@centos-FI ~]# hostname -i
    127.0.0.1 192.168.128.120
    [root@centos-FI hbase-0.98.3-hadoop2]# cat /etc/hosts
    127.0.0.1 centos-FI localhost #localhost localhost.localdomain localhost4 localhost4.localdomain4

    ::1 centos-FI #localhost localhost.localdomain localhost6 localhost6.localdomain6

    192.168.128.120 master
    192.168.128.120 slave
    192.168.128.120 centos-FI

    Change it to:

    [root@centos-FI hbase-0.98.3-hadoop2]# cat /etc/hosts
    127.0.0.1 localhost #localhost localhost.localdomain localhost4 localhost4.localdomain4

    ::1 centos-FI #localhost localhost.localdomain localhost6 localhost6.localdomain6

    192.168.128.120 master
    192.168.128.120 slave
    192.168.128.120 centos-FI

    That is, make sure:

    [root@centos-FI hbase-0.98.3-hadoop2]# hostname -i
    192.168.128.120

    Once the HMaster process stays up, running list in bin/hbase shell works again.
    Of course, the "ERROR: can't get master address from ZooKeeper; znode data == null" error after bin/hbase shell can also have other causes; see:

    http://blog.csdn.net/u010022051/article/details/44176931
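
    A quick sanity check after fixing /etc/hosts and before restarting HBase (port 9000 comes from fs.defaultFS in core-site.xml; the netstat check is just a suggestion):

    netstat -tlnp | grep 9000
    # the NameNode RPC port should be bound to 192.168.128.120 (or 0.0.0.0), not 127.0.0.1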

    7. Starting HBase reports "localhost: ssh: Could not resolve hostname localhost: Temporary failure in name resolution"

    [root@centos-FI hbase-0.98.3-hadoop2]# bin/start-hbase.sh
    starting master, logging to /opt/program/hbase-0.98.3-hadoop2/bin/../logs/hbase-root-master-centos-FI.out
    localhost: ssh: Could not resolve hostname localhost: Temporary failure in name resolution

    For this error, make sure /etc/hosts contains the line "127.0.0.1 localhost".
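
    One way to apply that (a sketch; adjust as needed) is to append the line only if it is missing:

    grep -qw localhost /etc/hosts || echo "127.0.0.1 localhost" >> /etc/hosts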

    8. Backspace does not work in the hbase shell under SecureCRT

    In SecureCRT, go to Options > Session Options > Terminal > Emulation and set the terminal type on the right to Linux. Then, if you mistype in the hbase shell, hold Ctrl and press Backspace to delete.

    9. Running the Spark example HBaseTest.scala fails with:

    WARN TableInputFormatBase: Cannot resolve the host name for master/192.168.128.120 because of javax.naming.CommunicationException: DNS error [Root exception is java.net.PortUnreachableException: ICMP Port Unreachable]; remaining name '120.128.168.192.in-addr.arpa'

    This happens because the DNS server has no record for this node (no DNS server is actually in use here). Manually add an entry to /etc/hosts, i.e. "192.168.128.120 master".
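
    A minimal check that the entry resolves in both directions purely from the hosts file (getent is just one way to verify; no DNS server is involved):

    getent hosts master             # expect: 192.168.128.120 master
    getent hosts 192.168.128.120    # expect the same line back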

    10. Flume reports "java.lang.NullPointerException: Expected timestamp in the Flume event headers, but it was null"

    The sink is HDFS and the path uses automatic time-based directory generation. The error above appears because the HDFS sink needs a timestamp in each event's headers, and getting that timestamp set correctly can be fiddly. Alternatively you can set the hdfs.useLocalTimeStamp parameter; for example, to roll into one directory per hour, the configuration looks like this:

    a1.sinks.k1.hdfs.path = hdfs://ubuntu:9000/flume/events/%y-%m-%d/%H
    a1.sinks.k1.hdfs.filePrefix = events-
    a1.sinks.k1.hdfs.round = true
    a1.sinks.k1.hdfs.roundValue = 1
    a1.sinks.k1.hdfs.roundUnit = hour
    a1.sinks.k1.hdfs.useLocalTimeStamp = true

    After this change, running Flume again does generate the HDFS directories automatically.
    Flume log:

    2015-07-23 15:16:28,405 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:261)] Creating hdfs://master:9000/flume_test/15-07-23/events-.1437689788240.tmp
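
    The other route mentioned above is to stamp events at the source with Flume's built-in timestamp interceptor instead of using the sink host's local time; a minimal sketch, assuming a source named r1 on agent a1:

    a1.sources.r1.interceptors = i1
    a1.sources.r1.interceptors.i1.type = timestamp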


    Memo

    ResourceManager web UI: http://ip:8088/
    HDFS (NameNode) web UI: http://ip:50070/
    JobTracker web UI (Hadoop 1.x only): http://ip:50030/
    HBase master web UI: http://ip:60010/
    HBase RegionServer web UI: http://ip:60030/

    • Spark monitoring via the web UI:
      Each SparkContext starts its own UI, on port 4040 by default; additional SparkContexts on the same host take 4041, 4042, and so on. It shows application information:

    the scheduler's list of stages and tasks
    a summary of RDD sizes and memory usage
    environment information
    information about the running executors

    This information is only shown while the application is running. To view the web UI after the application has finished, set spark.eventLog.enabled to true in spark-defaults.conf before launching the application: Spark then records the information shown in the UI as Spark events and writes them to persistent storage:

    spark.eventLog.enabled true
    spark.eventLog.dir hdfs://centos-FI:9000/spark_eventLog
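
    To actually browse those persisted events after the application has finished, one option (a sketch, assuming the same HDFS event-log directory) is to run the Spark history server, which serves them on port 18080 by default:

    # in conf/spark-defaults.conf
    spark.history.fs.logDirectory hdfs://centos-FI:9000/spark_eventLog

    # then start the history server and open http://centos-FI:18080/
    sbin/start-history-server.sh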


