Volcano Scheduling Based on guarantee and capability

Author: oo的布丁 | Published: 2022-08-02 13:48


I. Problem Background

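As business scenarios on a Kubernetes cloud-native platform grow more complex, the default Kubernetes scheduler struggles to keep up. Many companies therefore replace it with a mature open-source scheduler, and Volcano is one of them. Volcano has many strengths, such as gang scheduling and backfill, which I won't go into here. It also has some rough edges in practice, such as the following scenario:
The cluster has 20C (20 CPU cores) in total, shared by 3 teams. Each team must be guaranteed at least 5C and may use at most 10C. If each team's Volcano queue is configured with guarantee set to 5C and capability set to 10C, the queues can end up overcommitted.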

II. Problem Analysis

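Let's look more closely at what causes the problem described in the background section. To support scheduling policies for a wide range of scenarios, Volcano gives each Queue several attributes: weight, guarantee, and capability. weight is the queue's share when the cluster's remaining resources are divided, and it can be applied repeatedly across allocation rounds; guarantee is reserved resources, a minimum the queue is assured of; capability is the maximum resources, an upper limit on the queue. In addition, deserved is the amount of resources assigned to the queue in the current session. When a queue's guarantee exceeds the share of cluster resources that its weight entitles it to, the scheduler has to honor both every queue's weight and that queue's guarantee, and the sum of deserved across all queues ends up exceeding the cluster's total resources.
Official code: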

```go
for {
    totalWeight := int32(0)
    for _, attr := range pp.queueOpts {
        if _, found := meet[attr.queueID]; found {
            continue
        }
        totalWeight += attr.weight
    }

    // If no queues, break
    if totalWeight == 0 {
        klog.V(4).Infof("Exiting when total weight is 0")
        break
    }

    oldRemaining := remaining.Clone()
    // Calculates the deserved of each Queue.
    // increasedDeserved is the increased value for attr.deserved of processed queues
    // decreasedDeserved is the decreased value for attr.deserved of processed queues
    increasedDeserved := api.EmptyResource()
    decreasedDeserved := api.EmptyResource()
    for _, attr := range pp.queueOpts {
        klog.V(4).Infof("Considering Queue <%s>: weight <%d>, total weight <%d>.",
            attr.name, attr.weight, totalWeight)
        if _, found := meet[attr.queueID]; found {
            continue
        }

        oldDeserved := attr.deserved.Clone()
        attr.deserved.Add(remaining.Clone().Multi(float64(attr.weight) / float64(totalWeight)))

        if attr.realCapability != nil {
            attr.deserved.MinDimensionResource(attr.realCapability, api.Infinity)
        }
        attr.deserved.MinDimensionResource(attr.request, api.Zero)

        klog.V(4).Infof("Format queue <%s> deserved resource to <%v>", attr.name, attr.deserved)

        if attr.request.LessEqual(attr.deserved, api.Zero) {
            meet[attr.queueID] = struct{}{}
            klog.V(4).Infof("queue <%s> is meet", attr.name)
        } else if reflect.DeepEqual(attr.deserved, oldDeserved) {
            meet[attr.queueID] = struct{}{}
            klog.V(4).Infof("queue <%s> is meet cause of the capability", attr.name)
        }
        attr.deserved = helpers.Max(attr.deserved, attr.guarantee)
        pp.updateShare(attr)

        klog.V(4).Infof("The attributes of queue <%s> in proportion: deserved <%v>, realCapability <%v>, allocate <%v>, request <%v>, share <%0.2f>",
            attr.name, attr.deserved, attr.realCapability, attr.allocated, attr.request, attr.share)

        increased, decreased := attr.deserved.Diff(oldDeserved, api.Zero)
        increasedDeserved.Add(increased)
        decreasedDeserved.Add(decreased)

        // Record metrics
        metrics.UpdateQueueDeserved(attr.name, attr.deserved.MilliCPU, attr.deserved.Memory)
    }

    remaining.Sub(increasedDeserved).Add(decreasedDeserved)
    klog.V(4).Infof("Remaining resource is  <%s>", remaining)
    if remaining.IsEmpty() || reflect.DeepEqual(remaining, oldRemaining) {
        klog.V(4).Infof("Exiting when remaining is empty or no queue has more reosurce request:  <%v>", remaining)
        break
    }
}
```

III. My Scenario and Design

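Scenario: suppose the cluster has 20C in total and there are 3 teams. Each team must be guaranteed at least 5C and must not use more than 10C.
Design: in this scenario we really only need to set each queue's guarantee to 5C and its capability to 10C; the weight attribute does not need to be considered. So we redesigned the approach for this scenario, with the following steps:
1. Allocate each queue's guarantee first. This requires that the sum of all guarantees be less than the cluster's total resources; otherwise we can simply panic.
2.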