k8s-scheduler fetches from the api-server the pods that are pending and have no nodeName set, then creates a binding for each one that records which node the pod is scheduled to.
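The selection and binding step can be sketched in a few lines of Python. This is a minimal illustration, not the scheduler's actual code: pods are plain dicts shaped like the v1 Pod JSON, the helper names `needs_scheduling` and `make_binding` are hypothetical, and the sample pods and node name are made up. The dict returned by `make_binding` follows the shape of a v1 Binding object, which a real scheduler would POST to the pod's `binding` subresource.

```python
def needs_scheduling(pod: dict) -> bool:
    """True for a pod that is Pending and has no spec.nodeName yet."""
    phase = pod.get("status", {}).get("phase")
    node_name = pod.get("spec", {}).get("nodeName")
    return phase == "Pending" and not node_name


def make_binding(pod: dict, node_name: str) -> dict:
    """Build a v1 Binding body that assigns the pod to node_name."""
    meta = pod["metadata"]
    return {
        "apiVersion": "v1",
        "kind": "Binding",
        "metadata": {
            "name": meta["name"],
            "namespace": meta.get("namespace", "default"),
        },
        "target": {"apiVersion": "v1", "kind": "Node", "name": node_name},
    }


# Made-up sample data: only web-1 is pending without a node.
pods = [
    {"metadata": {"name": "web-1"}, "spec": {}, "status": {"phase": "Pending"}},
    {"metadata": {"name": "web-2"}, "spec": {"nodeName": "node-1"},
     "status": {"phase": "Pending"}},
    {"metadata": {"name": "web-3"}, "spec": {"nodeName": "node-2"},
     "status": {"phase": "Running"}},
]

queue = [p for p in pods if needs_scheduling(p)]
# "node-2" stands in for whatever node the scheduling algorithm picks.
bindings = [make_binding(p, "node-2") for p in queue]
```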
Scheduling algorithm
For given pod:

+---------------------------------------------+
|               Schedulable nodes:            |
|                                             |
| +--------+    +--------+      +--------+    |
| | node 1 |    | node 2 |      | node 3 |    |
| +--------+    +--------+      +--------+    |
|                                             |
+-------------------+-------------------------+
                    |
                    |
                    v
+-------------------+-------------------------+

Predicate filters: node 3 doesn't have enough resources

+-------------------+-------------------------+
                    |
                    |
                    v
+-------------------+-------------------------+
|             remaining nodes:                |
|   +--------+                 +--------+     |
|   | node 1 |                 | node 2 |     |
|   +--------+                 +--------+     |
|                                             |
+-------------------+-------------------------+
                    |
                    |
                    v
+-------------------+-------------------------+

Priority function:    node 1: p=2
                      node 2: p=5

+-------------------+-------------------------+
                    |
                    |
                    v
        select max{node priority} = node 2
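A scheduler schedules only one pod at a time. When several scheduler replicas are running, one of them is elected leader.
- Predicates (filtering): discard the nodes that do not meet the pod's requirements;
- Priorities (scoring): score the remaining nodes and pick the one with the highest score; if several nodes tie, pick one of them at random.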
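The two phases can be expressed as a small Python function. This is a sketch under simplifying assumptions, not the kube-scheduler implementation: nodes and pods are plain dicts, `free_cpu` and `cpu` are invented fields, and the predicate and priority functions are toy stand-ins chosen so that the numbers match the diagram (node 3 is filtered out, node 1 scores 2, node 2 scores 5).

```python
import random


def schedule(pod, nodes, predicates, priorities, rng=random):
    """Return the name of the chosen node, or None if no node is feasible."""
    # Phase 1 (predicates): keep only the nodes that pass every filter.
    feasible = [n for n in nodes if all(pred(pod, n) for pred in predicates)]
    if not feasible:
        return None
    # Phase 2 (priorities): sum the scores from every priority function.
    scores = {n["name"]: sum(prio(pod, n) for prio in priorities)
              for n in feasible}
    best = max(scores.values())
    # Ties are broken by picking one of the top nodes at random.
    winners = [name for name, score in scores.items() if score == best]
    return rng.choice(winners)


def fits_resources(pod, node):
    """Toy predicate: the node must have enough free CPU for the pod."""
    return node["free_cpu"] >= pod["cpu"]


def most_free_cpu(pod, node):
    """Toy priority: more free CPU gives a higher score."""
    return node["free_cpu"]


nodes = [
    {"name": "node 1", "free_cpu": 2},
    {"name": "node 2", "free_cpu": 5},
    {"name": "node 3", "free_cpu": 0},
]
pod = {"name": "demo", "cpu": 1}

chosen = schedule(pod, nodes, [fits_resources], [most_free_cpu])
print(chosen)
```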
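Extending the scheduler
policy-config
Pass the --policy-config-file command-line flag to point the scheduler at a policy file that changes the scheduling policy.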
{
  "kind" : "Policy",
  "apiVersion" : "v1",
  "predicates" : [
    {"name" : "PodFitsHostPorts"},
    {"name" : "PodFitsResources"},
    {"name" : "NoDiskConflict"},
    {"name" : "NoVolumeZoneConflict"},
    {"name" : "MatchNodeSelector"},
    {"name" : "HostName"}
  ],
  "priorities" : [
    {"name" : "LeastRequestedPriority", "weight" : 1},
    {"name" : "BalancedResourceAllocation", "weight" : 1},
    {"name" : "ServiceSpreadingPriority", "weight" : 1},
    {"name" : "EqualPriority", "weight" : 1}
  ],
  "hardPodAffinitySymmetricWeight" : 10,
  "alwaysCheckAllPredicates" : false
}
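To make the role of the "weight" field concrete, here is a small Python sketch of how per-priority scores could be combined into a node's final score as a weighted sum. The raw scores are invented for illustration, the helper `weighted_score` is hypothetical, and one weight is set to 2 (unlike the file above, where every weight is 1) so that the effect is visible.

```python
import json

# A trimmed-down policy; only the "priorities" section matters here.
policy = json.loads("""
{
  "kind": "Policy",
  "apiVersion": "v1",
  "priorities": [
    {"name": "LeastRequestedPriority", "weight": 1},
    {"name": "BalancedResourceAllocation", "weight": 2}
  ]
}
""")


def weighted_score(policy, raw_scores):
    """Sum weight * score over every priority listed in the policy."""
    return sum(p["weight"] * raw_scores[p["name"]]
               for p in policy["priorities"])


# Invented per-node scores, as if returned by each priority function.
raw = {
    "node-1": {"LeastRequestedPriority": 8, "BalancedResourceAllocation": 3},
    "node-2": {"LeastRequestedPriority": 5, "BalancedResourceAllocation": 6},
}

totals = {node: weighted_score(policy, scores) for node, scores in raw.items()}
print(totals)  # {'node-1': 14, 'node-2': 17}
```

With equal weights node-1 would win (11 against 11 is a tie only if the scores happen to match; here it is 11 against 11), so doubling the weight of BalancedResourceAllocation is what tips the choice to node-2.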
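Multiple schedulers
Develop your own scheduler based on the kube-scheduler source code. This is a large amount of work and requires a thorough understanding of the scheduler's source logic. See Configure Multiple Schedulers for details.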
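When several schedulers run side by side, each pod names the one it wants through `spec.schedulerName`, and pods that leave the field empty go to `default-scheduler`. The Python sketch below shows how a custom scheduler might pick out only its own unscheduled pods. The pods are plain dicts, and both the helper `pods_for` and the scheduler name `my-scheduler` are hypothetical.

```python
DEFAULT_SCHEDULER = "default-scheduler"


def pods_for(scheduler_name, pods):
    """Return unscheduled pods that ask for the given scheduler."""
    selected = []
    for pod in pods:
        spec = pod.get("spec", {})
        wanted = spec.get("schedulerName") or DEFAULT_SCHEDULER
        if wanted == scheduler_name and not spec.get("nodeName"):
            selected.append(pod)
    return selected


# Made-up sample pods.
pods = [
    {"metadata": {"name": "a"}, "spec": {"schedulerName": "my-scheduler"}},
    {"metadata": {"name": "b"}, "spec": {}},
    {"metadata": {"name": "c"},
     "spec": {"schedulerName": "my-scheduler", "nodeName": "node-1"}},
]

mine = [p["metadata"]["name"] for p in pods_for("my-scheduler", pods)]
default = [p["metadata"]["name"] for p in pods_for(DEFAULT_SCHEDULER, pods)]
```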
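Scheduler extender
Use the scheduler extender feature to enhance kube-scheduler. This approach is elegant and is the recommended one. The rough idea is to develop your own web scheduler service that exposes filter, prioritize, and bind endpoints. See Scheduler extender and "How to implement your own k8s scheduler" for details.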
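A minimal sketch of such a web service, using only the Python standard library, might look like the following. Several things here are assumptions to verify against the Kubernetes version in use: the JSON field names (`NodeNames`, `FailedNodes`, `Error`, `Host`, `Score`) follow my reading of the extender types, `NodeNames` is expected to be populated only when the extender is configured as node-cache capable, and scores are assumed to lie in the range 0 to 10. The URL paths, the port, and the filtering and scoring rules are all made up, and the bind endpoint is left out.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def filter_nodes(args):
    """Filter endpoint: drop nodes whose name starts with 'gpu-' (toy rule)."""
    names = args.get("NodeNames") or []
    keep = [n for n in names if not n.startswith("gpu-")]
    failed = {n: "reserved for GPU workloads" for n in names if n not in keep}
    return {"NodeNames": keep, "FailedNodes": failed, "Error": ""}


def prioritize(args):
    """Prioritize endpoint: prefer nodes whose name ends in '-ssd' (toy rule)."""
    names = args.get("NodeNames") or []
    return [{"Host": n, "Score": 10 if n.endswith("-ssd") else 0}
            for n in names]


# Hypothetical URL paths; they must match the extender configuration.
ROUTES = {"/filter": filter_nodes, "/prioritize": prioritize}


class ExtenderHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        handler = ROUTES.get(self.path)
        if handler is None:
            self.send_error(404)
            return
        # Read the JSON request body sent by the scheduler.
        length = int(self.headers.get("Content-Length", 0))
        args = json.loads(self.rfile.read(length) or b"{}")
        body = json.dumps(handler(args)).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port=8888):
    """Create the HTTP server; call .serve_forever() on it to start serving."""
    return HTTPServer(("", port), ExtenderHandler)
```

Keeping `filter_nodes` and `prioritize` as plain functions means the scheduling rules can be unit-tested without starting the HTTP server.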