Flink On Yarn Capacity Scheduler

By 淡淡的小番茄 | Published 2021-09-03 08:28

    Background

    We want to label the machines in the cluster so that different workloads run on different machines, in order to meet the business requirements of customers at different service levels. The planned queue hierarchy is:

                 root
               /      \
        default        perjob

    YARN scheduler options

    We are running Hadoop 3.1.4. YARN ships three schedulers: the FIFO scheduler, the CapacityScheduler, and the FairScheduler; in practice the latter two are the ones commonly used. We had no need for node labels before, so we had always run the FairScheduler, which is the simpler of the two. To use node labels, however, only the CapacityScheduler is an option.

    Configure yarn-site.xml

    <!-- Use the CapacityScheduler -->
    <property>
      <name>yarn.resourcemanager.scheduler.class</name>
      <!--value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value-->
      <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
    </property>
    <!-- Enable node labels -->
    <property>
      <name>yarn.node-labels.enabled</name>
      <value>true</value>
    </property>
    <!-- Where the node-label mappings are stored -->
    <property>
      <name>yarn.node-labels.fs-store.root-dir</name>
      <value>hdfs://node1:9900/yn/node-labels/</value>
    </property>
    <!-- Enable the scheduler monitor (required for preemption) -->
    <property>
      <name>yarn.resourcemanager.scheduler.monitor.enable</name>
      <value>true</value>
    </property>
    <!-- Fraction of resources that may be preempted in one round; default 0.1 -->
    <property>
      <name>yarn.resourcemanager.monitor.capacity.preemption.total_preemption_per_round</name>
      <value>0.3</value>
    </property>
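
    Note that switching the scheduler class and enabling node labels are ResourceManager-level settings: they only take effect after the ResourceManager is restarted (yarn rmadmin -refreshQueues is not enough for these two). A minimal sketch, assuming the Hadoop 3.x bin directory is on the PATH:

    # Restart the ResourceManager so the new scheduler class and the
    # node-label settings in yarn-site.xml are picked up
    yarn --daemon stop resourcemanager
    yarn --daemon start resourcemanager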

    Configure capacity-scheduler.xml

    This scheduler has by far the most configuration options and is the most complex of the three. The official documentation is very detailed, but you need an overall picture before it starts to make sense. Override the default capacity-scheduler.xml with the following:

    <configuration>
      <property>
        <name>yarn.scheduler.capacity.maximum-applications</name>
        <value>10000</value>
        <description>
          Maximum number of applications that can be pending and running.
        </description>
      </property>
      <property>
        <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
        <value>0.1</value>
        <description>
          Maximum percent of resources in the cluster which can be used to run
          application masters i.e. controls number of concurrent running
          applications.
        </description>
      </property>
      <property>
        <name>yarn.scheduler.capacity.resource-calculator</name>
        <value>org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator</value>
        <description>
          The ResourceCalculator implementation to be used to compare
          Resources in the scheduler.
          The default i.e. DefaultResourceCalculator only uses Memory while
          DominantResourceCalculator uses dominant-resource to compare
          multi-dimensional resources such as Memory, CPU etc.
        </description>
      </property>
      <property>
        <name>yarn.scheduler.capacity.root.queues</name>
        <value>default,perjob</value>
        <description>
          The queues at this level (root is the root queue).
        </description>
      </property>
      <property>
        <name>yarn.scheduler.capacity.root.default.capacity</name>
        <value>60</value>
      </property>
      <property>
        <name>yarn.scheduler.capacity.root.perjob.capacity</name>
        <value>40</value>
      </property>
      <property>
        <name>yarn.scheduler.capacity.root.default.maximum-capacity</name>
        <value>100</value>
      </property>
      <property>
        <name>yarn.scheduler.capacity.root.perjob.maximum-capacity</name>
        <value>80</value>
      </property>
      <property>
        <name>yarn.scheduler.capacity.root.default.accessible-node-labels</name>
        <value>SE</value>
      </property>
      <property>
        <name>yarn.scheduler.capacity.root.default.default-node-label-expression</name>
        <value>SE</value>
      </property>
      <property>
        <name>yarn.scheduler.capacity.root.perjob.accessible-node-labels</name>
        <value>AP</value>
      </property>
      <property>
        <name>yarn.scheduler.capacity.root.perjob.default-node-label-expression</name>
        <value>AP</value>
      </property>
      <property>
        <name>yarn.scheduler.capacity.root.accessible-node-labels.SE.capacity</name>
        <value>100</value>
      </property>
      <property>
        <name>yarn.scheduler.capacity.root.default.accessible-node-labels.SE.capacity</name>
        <value>100</value>
      </property>
      <property>
        <name>yarn.scheduler.capacity.root.accessible-node-labels.AP.capacity</name>
        <value>100</value>
      </property>
      <property>
        <name>yarn.scheduler.capacity.root.perjob.accessible-node-labels.AP.capacity</name>
        <value>100</value>
      </property>
      <property>
        <name>yarn.scheduler.capacity.root.default.user-limit-factor</name>
        <value>5</value>
      </property>
      <property>
        <name>yarn.scheduler.capacity.root.perjob.user-limit-factor</name>
        <value>5</value>
      </property>
      <property>
        <name>yarn.scheduler.capacity.root.default.default-application-priority</name>
        <value>10</value>
      </property>
      <property>
        <name>yarn.scheduler.capacity.root.perjob.default-application-priority</name>
        <value>100</value>
      </property>
      <property>
        <name>yarn.scheduler.capacity.root.leaf-queue-template.ordering-policy</name>
        <value>fair</value>
      </property>
    </configuration>
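
    Once the ResourceManager has loaded this file, the effective per-queue and per-partition capacities can be checked on the Scheduler page of the RM web UI or through its REST API. A quick sanity check, assuming the RM web UI is reachable at node1 on the default port 8088:

    # Dump the live scheduler and queue configuration as JSON; the capacities
    # reported for each queue should match the 60/40 split on the default
    # partition and 100/100 on the SE and AP partitions
    curl -s "http://node1:8088/ws/v1/cluster/scheduler" | python3 -m json.tool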

    Configure node labels

    Create the SE and AP labels:
    yarn rmadmin -addToClusterNodeLabels "SE,AP";

    Assign the labels to the nodes:

    yarn rmadmin -replaceLabelsOnNode "node1=SE node2=AP  node3=AP";
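
    To confirm the labels exist and the node-to-label mapping took effect, the cluster labels can be listed from the CLI, and the per-node assignment also shows up on the Nodes page of the RM web UI. A small verification sketch (the -showDetails flag of yarn node -list is available in Hadoop 3.x):

    # List the labels registered with the ResourceManager
    yarn cluster --list-node-labels
    # Show per-node details, including the label assigned to each NodeManager
    yarn node -list -showDetails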

    Apply the configuration:

    yarn rmadmin -refreshQueues
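
    refreshQueues reloads capacity-scheduler.xml without restarting the ResourceManager. Afterwards the queue state can be checked from the CLI, for example:

    # Confirm both queues are RUNNING with the expected capacities and labels
    yarn queue -status default
    yarn queue -status perjob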

    Conclusion

    Before the configuration finally worked, I ran into a fairly stubborn problem: Flink jobs submitted to YARN stayed in the ACCEPTED state forever, and the YARN ResourceManager log showed no related exception. So how do you get at the scheduler's diagnostics? More or less by accident I found that the Scheduler page of the RM web UI has a "Dump scheduler logs" button.

    Clicking it writes yarn-capacity-scheduler-debug.log into the Hadoop log directory.
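
    The dump covers a limited time window, so it helps to follow the file while re-submitting the failing job (the directory is whatever HADOOP_LOG_DIR points at in your installation):

    # Watch the scheduler debug dump while reproducing the stuck submission
    tail -f $HADOOP_LOG_DIR/yarn-capacity-scheduler-debug.log

    In my case the dump contained the following: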

    2021-09-02 15:29:18,687 DEBUG org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: Trying to assign containers to child-queue of root

    2021-09-02 15:29:18,687 DEBUG org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.AbstractCSQueue: Failed to assign to queue: root nodePatrition: AP, usedResources: <memory:0, vCores:0>, clusterResources: <memory:110592, vCores:96>, reservedResources: <memory:0, vCores:0>, maxLimitCapacity: <memory:0, vCores:0>, currTotalUsed:<memory:0, vCores:0>

    Cross-referencing the Hadoop source on GitHub, the message comes from AbstractCSQueue, and it is easy to find the line that logs it.

    You can see that many of the values are still the 0 defaults, which is why no resources could be assigned. In my case I had not configured yarn.scheduler.capacity.<queue-path>.accessible-node-labels.<label>.capacity, so the queues could never be allocated any resources on the labeled partitions. The default value of this property is 0, and the official documentation describes it in detail.

    After adding the missing settings, refresh capacity-scheduler.xml with yarn rmadmin -refreshQueues.

    A healthy yarn-capacity-scheduler-debug.log looks like this:

    2021-09-03 08:04:54,261 DEBUG org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.UsersManager: User limit computation for deployer, in queue: perjob, userLimitPercent=100, userLimitFactor=5.0, required=<memory:512, vCores:1>, consumed=<memory:0, vCores:0>, user-limit-resource=<memory:512, vCores:1>, queueCapacity=<memory:512, vCores:1>, qconsumed=<memory:0, vCores:0>, currentCapacity=<memory:512, vCores:1>, activeUsers=0.0, clusterCapacity=<memory:51200, vCores:32>, resourceByLabel=<memory:51200, vCores:32>, usageratio=0.0, Partition=SE, resourceUsed=<memory:512, vCores:1>, maxUserLimit=<memory:2560, vCores:5>, userWeight=1.0

    With that, the Capacity Scheduler setup is finally done. It took two days on and off, and it was not exactly easy; I'll treat myself to a chicken drumstick this weekend.
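
    With the queues and labels in place, Flink jobs are routed simply by choosing the queue at submission time; each queue's default-node-label-expression then pins the containers to the matching machines. A minimal sketch with the Flink 1.x CLI (the jar paths, memory sizes, and parallelism are placeholders; -yqu selects the YARN queue):

    # Per-job Flink cluster submitted to the perjob queue (runs on AP-labeled nodes)
    flink run -m yarn-cluster -yqu perjob -yjm 1024m -ytm 2048m -p 2 ./my-flink-job.jar
    # Regular workloads go through the default queue (runs on SE-labeled nodes)
    flink run -m yarn-cluster -yqu default ./other-job.jar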
