Troubleshooting the kube-batch "overused" problem

Author: 陈先生_9e91 | Published: 2018-10-10 16:59


kube-batch is a batch scheduler built on Kubernetes, providing mechanisms for applications that would like to run batch jobs in Kubernetes.

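The default scheduler schedules only one pod at a time, so I use kube-batch to handle scheduling across multiple jobs and multiple pods.

Background

I created a distributed TensorFlow job consisting of four tasks: 2 ps and 2 workers. Each task is created as its own K8S Job with a parallelism of 1, i.e. one Pod per Job.

The configuration is as follows: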

# worker task: requests CPU and memory, limits one GPU
- apiVersion: batch/v1
  kind: Job
  metadata:
    name: cyx2-worker-0
    annotations:
      scheduling.k8s.io/group-name: cyx2
  spec:
    template:
      spec:
        containers:
        - resources:
            limits:
              nvidia.com/gpu: "1"
            requests:
              cpu: "1"
              memory: 1Gi
# ps task: no resource requests
- apiVersion: batch/v1
  kind: Job
  metadata:
    name: cyx2-ps-0
    annotations:
      scheduling.k8s.io/group-name: cyx2
  spec:
    template:
      spec:
        containers:

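All four K8S Jobs carry the same scheduling.k8s.io/group-name: cyx2 annotation, so they can be managed by a single PodGroup (a CRD created by kube-batch).

Problem

We want all four Jobs to run when resources are sufficient, and none of them to run otherwise. That is not what happened: the cluster had enough resources, yet all four Jobs stayed pending.

Logs

Note: only the key log lines are shown.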

I1009 21:42:36.472045   21605 allocate.go:42] Enter Allocate ...
I1009 21:42:36.472224   21605 allocate.go:118] Binding Task <mind-automl/cyx2-worker-0-2mr7q> to node <192.168.47.52>
I1009 21:42:36.472399   21605 allocate.go:118] Binding Task <mind-automl/cyx2-worker-1-hdz8r> to node <192.168.47.52>
I1009 21:42:36.472426   21605 allocate.go:72] Queue <mind-automl> is overused, ignore it.
I1009 21:42:36.472431   21605 allocate.go:155] Leaving Allocate ..

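Here we can see that the scheduler has entered the allocation phase, but after handling only the 2 worker tasks it reports the queue as overused. Clearly this is where the problem lies.

"Overused" is a queue-level concept: it means the resources in use have exceeded the total the queue is allowed to use. But I never configured a queue, so some default setting had to be responsible, and the only option left was to read the source code.

Source code

I will skip the roundabout path that led here and go straight to the key code.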

kube-batch\pkg\scheduler\plugins\proportion\proportion.go

remaining := pp.totalResource.Clone()

// Calculates the deserved of each Queue.
attr.deserved.Add(remaining.Clone().Multi(float64(attr.weight) / float64(totalWeight)))

if !attr.deserved.LessEqual(attr.request) {
        attr.deserved = helpers.Min(attr.deserved, attr.request)
}

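1. Compute the total resources of the cluster.
2. Set each queue's deserved resources according to its weight; by default a queue gets all of the resources.
3. Compare the deserved resources with the requested resources and take the smaller.

Because my ps tasks set no resource requests, the queue's deserved resources equal the combined requests of the two workers. Once the two workers have been scheduled, those resources are used up, hence the queue is overused.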
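The reasoning above can be replayed with a small self-contained model. This is only a sketch under stated assumptions, not kube-batch's actual implementation: the Resource type, the Deserved and Overused helpers, the cluster size, and the rule that a queue counts as overused once its deserved resources are less than or equal to its allocated resources are all simplifications introduced here for illustration.

```go
package main

import (
	"fmt"
	"math"
)

// Resource is a hypothetical, simplified stand-in for the scheduler's
// resource type: CPU in millicores and memory in bytes only.
type Resource struct {
	MilliCPU float64
	Memory   float64
}

func (r Resource) Add(o Resource) Resource {
	return Resource{r.MilliCPU + o.MilliCPU, r.Memory + o.Memory}
}

func (r Resource) Multi(ratio float64) Resource {
	return Resource{r.MilliCPU * ratio, r.Memory * ratio}
}

// LessEqual reports whether r fits within o in every dimension.
func (r Resource) LessEqual(o Resource) bool {
	return r.MilliCPU <= o.MilliCPU && r.Memory <= o.Memory
}

// Min takes the smaller value per dimension.
func Min(a, b Resource) Resource {
	return Resource{math.Min(a.MilliCPU, b.MilliCPU), math.Min(a.Memory, b.Memory)}
}

// Deserved follows the three steps: take the weighted share of the
// cluster total, then cap it at what the queue actually requested.
func Deserved(total Resource, weight, totalWeight int, request Resource) Resource {
	share := total.Multi(float64(weight) / float64(totalWeight))
	if !share.LessEqual(request) {
		share = Min(share, request)
	}
	return share
}

// Overused is an assumed rule for this sketch: the queue is overused
// once everything it deserves has already been allocated.
func Overused(deserved, allocated Resource) bool {
	return deserved.LessEqual(allocated)
}

func main() {
	const gi = float64(1 << 30)
	total := Resource{MilliCPU: 32000, Memory: 64 * gi} // hypothetical cluster size
	worker := Resource{MilliCPU: 1000, Memory: 1 * gi}
	ps := Resource{} // ps declares no requests

	// The queue's request is the sum over all four tasks: 2 workers + 2 ps.
	request := worker.Add(worker).Add(ps).Add(ps)
	deserved := Deserved(total, 1, 1, request)

	fmt.Println(Overused(deserved, Resource{}))         // false: nothing allocated yet
	fmt.Println(Overused(deserved, worker.Add(worker))) // true: both workers placed
}
```

In this model the two ps tasks contribute nothing to the request, so the cap in Deserved shrinks the queue to exactly two workers' worth of resources, however large the cluster is.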

Solution

The fix is simple: set resource requests on the ps tasks as well.
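To catch this kind of omission before submitting a job group, one could run a small check over the tasks. The following is a minimal sketch: the Task type and the MissingRequests helper are hypothetical and not part of kube-batch or Kubernetes, and the name cyx2-ps-1 is assumed from the "2 ps" described earlier.

```go
package main

import "fmt"

// Task is a hypothetical, simplified view of one pod in the job group.
type Task struct {
	Name          string
	RequestCPU    int64 // millicores; 0 means not set
	RequestMemory int64 // bytes; 0 means not set
}

// MissingRequests returns the names of tasks that declare neither a
// CPU nor a memory request.
func MissingRequests(tasks []Task) []string {
	var missing []string
	for _, t := range tasks {
		if t.RequestCPU == 0 && t.RequestMemory == 0 {
			missing = append(missing, t.Name)
		}
	}
	return missing
}

func main() {
	const gi = 1 << 30
	tasks := []Task{
		{Name: "cyx2-worker-0", RequestCPU: 1000, RequestMemory: gi},
		{Name: "cyx2-worker-1", RequestCPU: 1000, RequestMemory: gi},
		{Name: "cyx2-ps-0"}, // no requests, as in the original configuration
		{Name: "cyx2-ps-1"}, // assumed name of the second ps task
	}
	fmt.Println(MissingRequests(tasks)) // [cyx2-ps-0 cyx2-ps-1]
}
```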

QoS

Guaranteed: every container must set both limits and requests (the maximum and the minimum) for CPU and memory. This is the strictest class.
1. Every Container in the Pod must have a memory limit and a memory request, and they must be the same.
2. Every Container in the Pod must have a CPU limit and a CPU request, and they must be the same.
Burstable: the Pod does not qualify as Guaranteed, but sets at least one CPU or memory request.
1. The Pod does not meet the criteria for QoS class Guaranteed.
2. At least one Container in the Pod has a memory or CPU request.
BestEffort: nothing is set at all, a laissez-faire approach to requesting resources.
1. For a Pod to be given a QoS class of BestEffort, the Containers in the Pod must not have any memory or CPU limits or requests.
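The three rules above can be written down as a small classifier. This is a sketch under simplifying assumptions, not the Kubernetes implementation: the Resources and Container types and the QoSClass function are hypothetical, quantities are plain integers where 0 means "not set", only CPU and memory are modelled, and a missing request is treated as equal to the limit.

```go
package main

import "fmt"

// Resources holds CPU (millicores) and memory (bytes); 0 means "not set".
type Resources struct {
	MilliCPU int64
	Memory   int64
}

// Container is a hypothetical, simplified container spec.
type Container struct {
	Requests Resources
	Limits   Resources
}

// effective returns the request, falling back to the limit when the
// request is not set (an assumption of this sketch).
func effective(request, limit int64) int64 {
	if request == 0 {
		return limit
	}
	return request
}

// QoSClass applies the three rules to the containers of one Pod.
func QoSClass(containers []Container) string {
	guaranteed := true
	anySet := false
	for _, c := range containers {
		cpuReq := effective(c.Requests.MilliCPU, c.Limits.MilliCPU)
		memReq := effective(c.Requests.Memory, c.Limits.Memory)
		if cpuReq != 0 || memReq != 0 {
			anySet = true
		}
		// Guaranteed needs both limits set and equal to the requests.
		if c.Limits.MilliCPU == 0 || c.Limits.Memory == 0 ||
			cpuReq != c.Limits.MilliCPU || memReq != c.Limits.Memory {
			guaranteed = false
		}
	}
	switch {
	case !anySet:
		return "BestEffort"
	case guaranteed:
		return "Guaranteed"
	default:
		return "Burstable"
	}
}

func main() {
	const gi = 1 << 30
	worker := Container{Requests: Resources{MilliCPU: 1000, Memory: gi}}
	ps := Container{} // nothing set, as in the original configuration

	fmt.Println(QoSClass([]Container{worker})) // Burstable
	fmt.Println(QoSClass([]Container{ps}))     // BestEffort
}
```

Under this model the worker Pods from the configuration are Burstable (requests without matching CPU and memory limits), while the ps Pods, which set nothing, are BestEffort.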

Article link: https://www.haomeiwen.com/subject/poqmaftx.html