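YARN stands for Yet Another Resource Negotiator. It is a resource manager responsible for managing and scheduling cluster resources, and it allocates resources such as CPU, memory, file system, and disk across the whole cluster (in practice, the scheduler mainly accounts for memory and CPU).
YARN is the second generation of Hadoop MapReduce (MapReduce v2, also called MRv2), introduced to address problems in version 1.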
Application Master (AM):
Resource Manager (RM):
Node Manager (NM):
The idea is to have a global ResourceManager (RM) and per-application ApplicationMaster (AM). An application is either a single job or a DAG of jobs.
The ResourceManager and the NodeManager form the data-computation framework. The ResourceManager is the ultimate authority that arbitrates resources among all the applications in the system. The NodeManager is the per-machine framework agent who is responsible for containers, monitoring their resource usage (cpu, memory, disk, network) and reporting the same to the ResourceManager/Scheduler.
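The reporting relationship described above can be illustrated with a small toy model. This is only a sketch, not the YARN API: `NodeReport`, `ToyResourceManager`, and the `heartbeat` and `available` methods are hypothetical names, and only memory and vcores are tracked.

```python
from dataclasses import dataclass, field


@dataclass
class NodeReport:
    """Hypothetical heartbeat payload: what one node has and what it is using."""
    node_id: str
    total_memory_mb: int
    total_vcores: int
    used_memory_mb: int = 0
    used_vcores: int = 0


@dataclass
class ToyResourceManager:
    """Keeps the latest report per node and answers cluster-wide questions."""
    nodes: dict = field(default_factory=dict)

    def heartbeat(self, report: NodeReport) -> None:
        # A newer report from the same node replaces the older one.
        self.nodes[report.node_id] = report

    def available(self) -> tuple:
        # Free capacity is summed over the most recent report of every node.
        mem = sum(n.total_memory_mb - n.used_memory_mb for n in self.nodes.values())
        cores = sum(n.total_vcores - n.used_vcores for n in self.nodes.values())
        return mem, cores


rm = ToyResourceManager()
rm.heartbeat(NodeReport("node-1", total_memory_mb=8192, total_vcores=4))
rm.heartbeat(NodeReport("node-2", total_memory_mb=8192, total_vcores=4,
                        used_memory_mb=2048, used_vcores=1))
print(rm.available())  # (14336, 7)
```

The point of the sketch is the direction of information flow: each node only knows about itself, and the central manager builds its global view from the per-node reports.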
The per-application ApplicationMaster is, in effect, a framework specific library and is tasked with negotiating resources from the ResourceManager and working with the NodeManager(s) to execute and monitor the tasks.
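One negotiation round between an ApplicationMaster and the ResourceManager might look roughly like the following toy model. It is a sketch under simplifying assumptions, not the real protocol: all class and method names are hypothetical, only memory is considered, placement is first-fit, and containers are never released or monitored.

```python
from dataclasses import dataclass


@dataclass
class Container:
    container_id: int
    node_id: str
    memory_mb: int


class ToyResourceManager:
    def __init__(self, free_memory_mb: dict):
        self.free_memory_mb = dict(free_memory_mb)  # node_id -> free MB
        self._next_id = 1

    def allocate(self, count: int, memory_mb: int) -> list:
        """Grant up to `count` containers, first-fit over nodes in name order."""
        granted = []
        for node_id in sorted(self.free_memory_mb):
            while len(granted) < count and self.free_memory_mb[node_id] >= memory_mb:
                self.free_memory_mb[node_id] -= memory_mb
                granted.append(Container(self._next_id, node_id, memory_mb))
                self._next_id += 1
        return granted


class ToyApplicationMaster:
    def __init__(self, rm: ToyResourceManager, tasks: list):
        self.rm = rm
        self.pending = list(tasks)
        self.placements = {}  # task name -> node_id

    def run_round(self, memory_mb: int) -> None:
        """Ask for one container per pending task and place whatever was granted."""
        for container in self.rm.allocate(len(self.pending), memory_mb):
            task = self.pending.pop(0)
            self.placements[task] = container.node_id


rm = ToyResourceManager({"node-1": 2048, "node-2": 1024})
am = ToyApplicationMaster(rm, ["map-0", "map-1", "map-2", "reduce-0"])
am.run_round(memory_mb=1024)
print(am.placements)  # {'map-0': 'node-1', 'map-1': 'node-1', 'map-2': 'node-2'}
print(am.pending)     # ['reduce-0']
```

The cluster here has room for only three 1024 MB containers, so the fourth task stays pending; in a real cluster the ApplicationMaster would ask again in a later round once resources are freed.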
The ResourceManager has two main components: the Scheduler and the ApplicationsManager (ASM).
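The division of labor between the two ResourceManager components can be sketched as a toy model: the Scheduler only decides whether a request fits and tracks no application status, while the ApplicationsManager accepts submissions and requests the first container for each application's ApplicationMaster. All names below are hypothetical, the state strings are illustrative labels rather than the full YARN state machine, and only memory is modeled.

```python
class ToyScheduler:
    """Pure allocator: decides whether a request fits, tracks no application state."""

    def __init__(self, total_memory_mb: int):
        self.free_memory_mb = total_memory_mb

    def allocate(self, memory_mb: int) -> bool:
        if memory_mb <= self.free_memory_mb:
            self.free_memory_mb -= memory_mb
            return True
        return False


class ToyApplicationsManager:
    """Accepts submissions and asks the scheduler for each app's first container."""

    def __init__(self, scheduler: ToyScheduler):
        self.scheduler = scheduler
        self.apps = {}  # app_id -> {"name": ..., "state": ...}
        self._next_id = 1

    def submit(self, name: str, am_memory_mb: int) -> str:
        app_id = f"app-{self._next_id}"
        self._next_id += 1
        # The first container is the one that would host the ApplicationMaster.
        started = self.scheduler.allocate(am_memory_mb)
        state = "RUNNING" if started else "ACCEPTED"  # illustrative labels only
        self.apps[app_id] = {"name": name, "state": state}
        return app_id


scheduler = ToyScheduler(total_memory_mb=2048)
asm = ToyApplicationsManager(scheduler)
first = asm.submit("wordcount", am_memory_mb=1024)
second = asm.submit("sort", am_memory_mb=1024)
third = asm.submit("join", am_memory_mb=1024)
print([asm.apps[a]["state"] for a in (first, second, third)])
# ['RUNNING', 'RUNNING', 'ACCEPTED']
```

Keeping application bookkeeping out of the scheduler is the design choice being illustrated: the scheduler's allocation policy could be changed without touching the submission logic.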
Apache Hadoop YARN
Architecture of Next Generation Apache Hadoop MapReduce Framework
Hadoop notes: why there is a MapReduce v2 (YARN)
Deploying MapReduce v2 (YARN) on a Cluster