Introduction
Pod priority and Preemption
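When Kubernetes schedules pods onto nodes, each pod can be given a priority through a PriorityClass, so that different pods have different priorities. The scheduler then schedules higher-priority pods ahead of lower-priority ones, and if resources turn out to be insufficient for a pod, preemption is triggered: lower-priority pods are evicted to make room for it.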
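Enabling Pod Priority and Preemption
- The feature is enabled by default in version 1.11 and later, and became stable in 1.14.
- In versions before 1.11 it has to be turned on explicitly by starting kube-scheduler with --feature-gates=PodPriority=true.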
Example
Create the PriorityClasses
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority
value: 1000000
globalDefault: false
description: "This priority class should be used for Test pods only."
---
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: low-priority
value: 10000
globalDefault: false
description: "This priority class should be used for Test pods only."
The YAML above defines two priority classes, high-priority and low-priority, with values 1000000 and 10000 respectively.
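To make clear how these classes turn into a number on each pod, here is a minimal Python sketch (not Kubernetes code) of the lookup that happens when a pod is admitted: the pod's priorityClassName is resolved to the class's value, and a pod that names no class gets the value of the class marked globalDefault, or 0 if no such class exists. The function name resolve_priority and the dictionary layout are assumptions made purely for this illustration.

```python
from typing import Optional

# The two PriorityClass objects defined above, reduced to the fields that matter here.
PRIORITY_CLASSES = {
    "high-priority": {"value": 1000000, "globalDefault": False},
    "low-priority": {"value": 10000, "globalDefault": False},
}


def resolve_priority(priority_class_name: Optional[str], classes: dict) -> int:
    """Return the integer priority for a pod (hypothetical helper, not a real API)."""
    if priority_class_name is not None:
        if priority_class_name not in classes:
            # Kubernetes rejects a pod that names a PriorityClass that does not exist.
            raise ValueError(f"no PriorityClass named {priority_class_name!r}")
        return classes[priority_class_name]["value"]
    # No class named: fall back to the globalDefault class, if any, else 0.
    for spec in classes.values():
        if spec["globalDefault"]:
            return spec["value"]
    return 0


print(resolve_priority("high-priority", PRIORITY_CLASSES))  # pod from nginx-deploy-high
print(resolve_priority("low-priority", PRIORITY_CLASSES))   # pod from nginx-deploy-low
print(resolve_priority(None, PRIORITY_CLASSES))             # pod with no priorityClassName
```

Since neither class here sets globalDefault to true, a pod without a priorityClassName ends up with priority 0 in this sketch, below both test classes.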
Create the Deployments
apiVersion: apps/v1 # for versions before 1.9.0 use apps/v1beta2
kind: Deployment
metadata:
  name: nginx-deploy-high
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 1 # tells deployment to run 1 pod matching the template
  template:
    metadata:
      labels:
        app: nginx
    spec:
      hostNetwork: true # the pod uses the node's network, so port 8088 is claimed on the node itself
      priorityClassName: high-priority
      containers:
      - name: nginx
        image: nginx:latest
        ports:
        - containerPort: 8088
---
apiVersion: apps/v1 # for versions before 1.9.0 use apps/v1beta2
kind: Deployment
metadata:
  name: nginx-deploy-low
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 1 # tells deployment to run 1 pod matching the template
  template:
    metadata:
      labels:
        app: nginx
    spec:
      hostNetwork: true # same host port as the high-priority pod, so the two conflict on one node
      priorityClassName: low-priority
      containers:
      - name: nginx
        image: nginx:latest
        ports:
        - containerPort: 8088
kubectl create -f ./nginx-deploy-low-priority.yaml
kubectl create -f ./nginx-deploy-high.yaml
The kube-scheduler log for the high-priority pod:
About to try and schedule pod prometheus/nginx-deploy-high-76b56d5cc5-vpfjn
I1225 15:10:09.753527 1 scheduler.go:456] Attempting to schedule pod: prometheus/nginx-deploy-high-76b56d5cc5-vpfjn
I1225 15:10:09.753643 1 generic_scheduler.go:648] since alwaysCheckAllPredicates has not been set, the predicate evaluation is short circuited and there are chances of other predicates failing as well.
I1225 15:10:09.753696 1 factory.go:665] Unable to schedule prometheus/nginx-deploy-high-76b56d5cc5-vpfjn: no fit: 0/1 nodes are available: 1 node(s) didn't have free ports for the requested pod ports.; waiting
I1225 15:10:09.753741 1 factory.go:736] Updating pod condition for prometheus/nginx-deploy-high-76b56d5cc5-vpfjn to (PodScheduled==False, Reason=Unschedulable)
I1225 15:10:09.755568 1 generic_scheduler.go:318] Pod prometheus/nginx-deploy-high-76b56d5cc5-vpfjn is not eligible for more preemption.
I1225 15:10:09.755726 1 scheduling_queue.gkube
I1225 15:10:11.729743 1 generic_scheduler.go:1147] Node host108752172 is a potential node for preemption.
I1225 15:10:11.729916 1 generic_scheduler.go:648] since alwaysCheckAllPredicates has not been set, the predicate evaluation is short circuited and there are chances of other predicates failing as well.
I1225 15:10:11.730407 1 cache.go:309] Finished binding for pod ac27a286-4272-47a6-8677-735b23e981fa. Can be expired.
I1225 15:10:11.730627 1 scheduler.go:593] pod prometheus/nginx-deploy-high-76b56d5cc5-vpfjn is bound successfully on node host108752172, 1 nodes evaluated, 1 nodes were found feasible
I1225 15:10:12.066208 1 leaderelection.go:276] successfully renewed lease kube-system/kube-scheduler
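Analysis
Two Deployments, nginx-deploy-low and nginx-deploy-high, were created with kubectl above; nginx-deploy-low was created first and nginx-deploy-high second. The log shows that when the scheduler tried to place nginx-deploy-high-76b56d5cc5-vpfjn, it found that port 8088 was already occupied by the nginx-deploy-low pod ("1 node(s) didn't have free ports for the requested pod ports"). Because the priority value of nginx-deploy-high-76b56d5cc5-vpfjn is higher than that of the low pod, the scheduler marks the node as a preemption candidate: "Node host108752172 is a potential node for preemption." The running nginx-deploy-low pod is then evicted, and the replacement pod created by its Deployment stays Pending because the port is no longer free, while the nginx-deploy-high pod becomes Running.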
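The decision described in this analysis can be sketched in a few lines of Python. This is a deliberately simplified model, not the scheduler's actual code: the only resource is the host port, there is a single node, and the names Pod and find_victims are invented for the illustration. It keeps the one rule that matters here, namely that only pods with a strictly lower priority than the pending pod may be chosen as victims.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Pod:
    name: str
    priority: int
    host_port: int


def find_victims(pending: Pod, running: List[Pod]) -> Optional[List[Pod]]:
    """Return the pods to evict so `pending` fits, or None if preemption cannot help.

    Simplified model (an assumption of this sketch): the host port is the only
    resource, and only strictly lower-priority pods may be evicted.
    """
    blockers = [p for p in running if p.host_port == pending.host_port]
    if not blockers:
        return []  # the port is free, no preemption needed
    if any(p.priority >= pending.priority for p in blockers):
        return None  # a blocker has equal or higher priority and cannot be evicted
    return blockers


low = Pod("nginx-deploy-low", priority=10000, host_port=8088)
high = Pod("nginx-deploy-high", priority=1000000, host_port=8088)

victims = find_victims(high, running=[low])
print([p.name for p in victims])          # the low pod is selected as the victim
print(find_victims(low, running=[high]))  # None: the low pod cannot preempt the high pod
```

The second call mirrors what happens to the recreated nginx-deploy-low pod: it finds the port taken by a higher-priority pod, has nothing it is allowed to evict, and therefore stays Pending.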
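Summary
- If two pods are waiting in the scheduling queue, one with a higher priority and one with a lower priority, the scheduler schedules the pod with the higher priority value first. This case is not demonstrated here because it is hard to reproduce in the test environment.
- If the scheduler finds that resources are insufficient when scheduling a pod, it preempts lower-priority pods and gives their resources to the higher-priority pod.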
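Since the queue-ordering case could not be reproduced in the experiment, a small Python sketch can stand in for it. It models the scheduling queue as a heap ordered by priority (highest first), with ties broken by arrival order; this is a simplification of the default queue sort, and the class SchedulingQueue, its add and pop methods, and the pod named another-low are all hypothetical names made up for the example.

```python
import heapq
import itertools
from typing import List, Tuple


class SchedulingQueue:
    """Toy model of a priority-ordered scheduling queue (illustrative only)."""

    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, str]] = []
        self._arrival = itertools.count()  # increasing counter used as a tie-breaker

    def add(self, pod_name: str, priority: int) -> None:
        # heapq is a min-heap, so negate the priority to pop the highest first.
        heapq.heappush(self._heap, (-priority, next(self._arrival), pod_name))

    def pop(self) -> str:
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)


queue = SchedulingQueue()
queue.add("nginx-deploy-low", 10000)     # arrives first
queue.add("nginx-deploy-high", 1000000)  # arrives second, but has a higher priority
queue.add("another-low", 10000)          # hypothetical third pod with the same low priority

order = [queue.pop() for _ in range(len(queue))]
print(order)  # the high-priority pod comes out first, then the low ones in arrival order
```

The point of the sketch is only the ordering: arrival time decides between pods of equal priority, but never lets a lower-priority pod jump ahead of a higher-priority one.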