2018-01-27 7 HDFS Performance Tu


Author: 鸭鸭学语言 | Published: 2018-01-27 20:10

    Defining and changing performance parameters

    Parameters are defined in hdfs-site.xml.
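To make the file layout concrete, here is a minimal sketch that generates an hdfs-site.xml fragment for the four properties discussed in these notes. The property names come from the notes; the values are illustrative assumptions, not recommendations, and `build_hdfs_site` is a hypothetical helper written for this example.

```python
import xml.etree.ElementTree as ET

# Illustrative values only (assumptions); tune them for your own cluster.
SETTINGS = {
    "dfs.blocksize": "134217728",  # 128 MB expressed in bytes
    "dfs.replication": "3",
    "dfs.datanode.handler.count": "10",
    "dfs.namenode.fs-limits.max-blocks-per-file": "1048576",
}


def build_hdfs_site(settings):
    """Return a <configuration> document with one <property> per setting."""
    root = ET.Element("configuration")
    for name, value in settings.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    # ET.indent exists from Python 3.9; skip pretty-printing on older versions.
    if hasattr(ET, "indent"):
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


xml_text = build_hdfs_site(SETTINGS)
print(xml_text)
```

Each setting becomes a `<property>` element holding a `<name>` and a `<value>` child, which is the structure Hadoop expects inside `<configuration>`.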

    Cloudera Manager provides a GUI for changing these parameters, so end users do not have to edit the XML file by hand.

    Start Cloudera Manager:

        In a terminal, run:  $ sudo /home/cloudera/cloudera-manager --express --force

        Then, in Firefox, open: quickstart.cloudera:7180/cmf/services/8/config


    Four main parameters affect performance:

        DFS block size -- dfs.blocksize: default 64 MB in older Hadoop 1.x releases, 128 MB since Hadoop 2.x. It directly affects name node memory usage and the number of map tasks.

        HDFS replication -- dfs.replication: default 3. Replication protects against node failures, so reducing it trades robustness for lower storage use and less write traffic. Failure handling relies on the mechanisms below:

            Periodic heartbeats from each data node to the name node.

            Checksums kept for each file's blocks, so a corrupt block can be detected and re-read from another healthy replica.

        Number of handlers on each data node -- dfs.datanode.handler.count

        Maximum number of blocks per file -- dfs.namenode.fs-limits.max-blocks-per-file
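The effect of these parameters can be estimated with simple arithmetic. The sketch below is a back-of-the-envelope model, not an HDFS API: the helper names are hypothetical, the figure of roughly 150 bytes of name node memory per file or block object is a commonly quoted rule of thumb used here as an assumption, and "one map task per block" holds only approximately, for splittable inputs with default split settings.

```python
MB = 1024 ** 2
GB = 1024 ** 3


def block_count(file_size, block_size):
    """Number of blocks needed for a file (integer ceiling division)."""
    return -(-file_size // block_size)


def namenode_bytes(n_files, blocks_per_file, bytes_per_object=150):
    """Rough name node heap use: one object per file plus one per block.

    bytes_per_object is an assumed rule-of-thumb figure, not a measured value.
    """
    return n_files * (1 + blocks_per_file) * bytes_per_object


def raw_storage(file_size, replication):
    """Disk space consumed across the cluster for one file."""
    return file_size * replication


def max_file_size(block_size, max_blocks_per_file):
    """Largest file allowed by dfs.namenode.fs-limits.max-blocks-per-file."""
    return block_size * max_blocks_per_file


file_size = 10 * GB
for block_size in (64 * MB, 128 * MB):
    blocks = block_count(file_size, block_size)
    print(
        f"block size {block_size // MB} MB: {blocks} blocks, "
        f"about {blocks} map tasks, "
        f"about {namenode_bytes(1, blocks)} bytes of name node memory"
    )

print(f"raw storage at replication 3: {raw_storage(file_size, 3) // GB} GB")
```

Doubling the block size halves both the block count and the approximate number of map tasks for the same file, which is why the block size has such a direct effect on name node memory and job parallelism.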


    lesson 7 - slides
