HDFS Notes

Author: BIDIU猿 | Source: published 2016-02-06 09:27, read 40 times

    YARN Background

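    YARN stands for Yet Another Resource Negotiator. It is a resource manager responsible for managing and scheduling cluster resources, and it can allocate the cluster's resources such as CPU, memory, file system, and disk.
    YARN is the second version of Hadoop MapReduce (MRv2); it addresses some of the problems in version 1.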

    Terminology

    Application Master (AM):
    Resource Manager (RM):
    Node Manager (NM):

    The idea is to have a global ResourceManager (RM) and per-application ApplicationMaster (AM). An application is either a single job or a DAG of jobs.
    The ResourceManager and the NodeManager form the data-computation framework. The ResourceManager is the ultimate authority that arbitrates resources among all the applications in the system. The NodeManager is the per-machine framework agent who is responsible for containers, monitoring their resource usage (cpu, memory, disk, network) and reporting the same to the ResourceManager/Scheduler.
    The per-application ApplicationMaster is, in effect, a framework specific library and is tasked with negotiating resources from the ResourceManager and working with the NodeManager(s) to execute and monitor the tasks.

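    In other words, the RM and the NMs together form the data-computation framework. The RM is the authority that manages all resources in the system. The NM is the per-machine agent that manages containers, monitors the machine's CPU, memory, disk, and network usage, and reports it to the RM/Scheduler. The per-application AM is a framework-specific library that negotiates resources with the RM and works with the NMs to execute and monitor tasks.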
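    The division of labor described above can be sketched as a toy model. This is a minimal illustration under stated assumptions, not Hadoop code: the class names mirror the YARN roles, but every method (`can_fit`, `launch`, `report`, `allocate`, `run_tasks`), the first-fit placement rule, and the container ID format are hypothetical, and resources are reduced to memory and virtual cores.

```python
from dataclasses import dataclass, field


@dataclass
class NodeManager:
    """Per-machine agent: launches containers and tracks their resource usage."""
    name: str
    memory_mb: int
    vcores: int
    used_memory_mb: int = 0
    used_vcores: int = 0
    containers: list = field(default_factory=list)

    def can_fit(self, memory_mb, vcores):
        return (self.used_memory_mb + memory_mb <= self.memory_mb
                and self.used_vcores + vcores <= self.vcores)

    def launch(self, container_id, memory_mb, vcores):
        self.used_memory_mb += memory_mb
        self.used_vcores += vcores
        self.containers.append(container_id)

    def report(self):
        """Usage report, standing in for what the NM sends to the RM."""
        return {"node": self.name,
                "used_memory_mb": self.used_memory_mb,
                "used_vcores": self.used_vcores,
                "containers": list(self.containers)}


class ResourceManager:
    """Global authority that arbitrates resources among applications."""

    def __init__(self, nodes):
        self.nodes = nodes
        self._next_id = 0

    def allocate(self, app_id, memory_mb, vcores):
        """Grant one container on the first node with room (hypothetical
        first-fit rule), or return None when no node can host it."""
        for node in self.nodes:
            if node.can_fit(memory_mb, vcores):
                self._next_id += 1
                return node, f"{app_id}_container_{self._next_id}"
        return None


class ApplicationMaster:
    """Per-application: negotiates containers with the RM, runs tasks on NMs."""

    def __init__(self, app_id, rm):
        self.app_id = app_id
        self.rm = rm

    def run_tasks(self, num_tasks, memory_mb, vcores):
        launched = []
        for _ in range(num_tasks):
            grant = self.rm.allocate(self.app_id, memory_mb, vcores)
            if grant is None:
                break  # cluster is full; this sketch simply stops asking
            node, container_id = grant
            node.launch(container_id, memory_mb, vcores)
            launched.append((node.name, container_id))
        return launched


nodes = [NodeManager("nm1", memory_mb=4096, vcores=4),
         NodeManager("nm2", memory_mb=2048, vcores=2)]
rm = ResourceManager(nodes)
am = ApplicationMaster("app1", rm)
launched = am.run_tasks(num_tasks=4, memory_mb=2048, vcores=1)
reports = [nm.report() for nm in nodes]
```

    With these made-up capacities, three of the four requested tasks get a container (two on `nm1`, one on `nm2`) and the fourth request is refused because neither node has 2048 MB free.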

    The RM has two main components: the Scheduler and the ApplicationsManager (ASM).
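    The note does not spell out how the two components differ. According to the Apache Hadoop YARN documentation, the Scheduler only allocates resources and does no monitoring or restarting, while the ApplicationsManager accepts job submissions, negotiates the first container (the one that hosts the AM), and restarts the AM container on failure. A rough sketch of that split follows. The method names, the `"RUNNING"`/`"PENDING"` state strings, and the memory-only accounting are assumptions made for illustration and are not YARN's actual interfaces.

```python
class Scheduler:
    """Pure allocator: hands out memory, does no monitoring or restarting."""

    def __init__(self, total_memory_mb):
        self.free_memory_mb = total_memory_mb

    def allocate(self, memory_mb):
        if memory_mb > self.free_memory_mb:
            return False
        self.free_memory_mb -= memory_mb
        return True

    def release(self, memory_mb):
        self.free_memory_mb += memory_mb


class ApplicationsManager:
    """Accepts submissions, obtains the AM's container, restarts it on failure."""

    def __init__(self, scheduler, am_memory_mb=512):
        self.scheduler = scheduler
        self.am_memory_mb = am_memory_mb  # hypothetical size of an AM container
        self.states = {}

    def submit(self, app_id):
        # The first container of every application hosts its ApplicationMaster.
        if self.scheduler.allocate(self.am_memory_mb):
            self.states[app_id] = "RUNNING"
        else:
            self.states[app_id] = "PENDING"
        return self.states[app_id]

    def on_am_failure(self, app_id):
        # Only a running AM holds a container that can be freed and restarted.
        if self.states.get(app_id) != "RUNNING":
            return self.states.get(app_id)
        self.scheduler.release(self.am_memory_mb)
        return self.submit(app_id)


scheduler = Scheduler(total_memory_mb=1024)
asm = ApplicationsManager(scheduler)
first = asm.submit("app1")
second = asm.submit("app2")
third = asm.submit("app3")
restarted = asm.on_am_failure("app1")
```

    Keeping the allocator free of any application state is the point of the split: the Scheduler answers only "is there room?", and everything about application lifecycle lives in the ApplicationsManager.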

    [Figure: yarn_architecture.gif]

    YARN cannot be installed on its own; it can only be installed by deploying Hadoop.

    References

    Apache Hadoop YARN
    Architecture of Next Generation Apache Hadoop MapReduce Framework
    hadoop杂记-为什么会有Map-reduce v2 (Yarn) (Hadoop miscellany: why MapReduce v2 (YARN) exists)
    Deploying MapReduce v2 (YARN) on a Cluster
