Affinity and AntiAffinity
Affinity: similar to nodeSelector, but it offers more expressive ways to constrain where pods are placed. For example, you can require that certain pods run on the same node, or within the same zone.
There are two types: requiredDuringSchedulingIgnoredDuringExecution and preferredDuringSchedulingIgnoredDuringExecution.
These are often called "hard" and "soft" rules. requiredDuringSchedulingIgnoredDuringExecution means the pod can only be scheduled onto nodes that satisfy the rule (just like nodeSelector). The preferred variant is not mandatory: the scheduler tries to honor it, but the pod may still run elsewhere. The IgnoredDuringExecution part behaves like nodeSelector does today: if a node's labels change at runtime so that the affinity rule is no longer satisfied, the pod keeps running on that node. There is a plan to add requiredDuringSchedulingRequiredDuringExecution in the future, which would evict the pod as soon as the labels stop matching.
pods/pod-with-node-affinity.yaml
apiVersion: v1
kind: Pod
metadata:
  name: with-node-affinity
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/e2e-az-name
            operator: In
            values:
            - e2e-az1
            - e2e-az2
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 1 #(1-100)
        preference:
          matchExpressions:
          - key: another-node-label-key
            operator: In #(NotIn, Exists, DoesNotExist, Gt, Lt)
            values:
            - another-node-label-value
  containers:
  - name: with-node-affinity
    image: k8s.gcr.io/pause:2.0
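This manifest only schedules onto nodes that carry the kubernetes.io/e2e-az-name label. A minimal sketch of trying it out, assuming a hypothetical node named node1:

# Label a node so the required nodeAffinity term can match it
kubectl label nodes node1 kubernetes.io/e2e-az-name=e2e-az1

# Create the pod and check which node it was scheduled to
kubectl apply -f pods/pod-with-node-affinity.yaml
kubectl get pod with-node-affinity -o wide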
If you specify both nodeSelector and an affinity rule, the pod is only scheduled onto nodes that satisfy both, as in the sketch below.
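A minimal sketch of a pod spec that uses both mechanisms at once; the pod name and the disktype=ssd nodeSelector label are assumptions for illustration:

apiVersion: v1
kind: Pod
metadata:
  name: selector-plus-affinity      # hypothetical name for illustration
spec:
  nodeSelector:
    disktype: ssd                   # assumed node label; must match in addition to the affinity rule
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/e2e-az-name
            operator: In
            values:
            - e2e-az1
  containers:
  - name: pause
    image: k8s.gcr.io/pause:2.0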
The weight in preferredDuringSchedulingIgnoredDuringExecution ranges from 1 to 100. For each node that satisfies a preferred term, the scheduler adds that term's weight to the node's overall score (alongside other factors such as resource requests); the node with the highest total score is preferred for the pod.
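For example, a sketch of a fragment that would sit under spec.affinity.nodeAffinity, where nodes matching the heavier term score higher; both label keys are assumptions for illustration:

preferredDuringSchedulingIgnoredDuringExecution:
- weight: 80                        # strong preference: nodes with this (assumed) label gain 80 points
  preference:
    matchExpressions:
    - key: disktype
      operator: In
      values:
      - ssd
- weight: 20                        # weak preference: nodes in this (assumed) zone gain 20 points
  preference:
    matchExpressions:
    - key: failure-domain.beta.kubernetes.io/zone
      operator: In
      values:
      - az-1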
So what does AntiAffinity do?
It is exclusionary in nature. For example, in production we often run a message queue such as rabbitmq. For high availability it is usually deployed as an HA cluster, and without constraints the scheduler may place two rabbitmq pods on the same node, which both wastes resources and increases risk. With antiAffinity we can force the pods onto different nodes. The topology domain is chosen via topologyKey, for example:
- kubernetes.io/hostname: distinguishes topology by hostname, i.e. per node
- failure-domain.beta.kubernetes.io/zone
- failure-domain.beta.kubernetes.io/region
In the example below, the nginx-pod-affinity Deployment gives each pod the label app=store, and the anti-affinity rule against that label ensures each replica is placed on a different node.
pods/pod-with-node-antiaffinity.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-pod-affinity
spec:
  selector:
    matchLabels:
      app: store
  replicas: 3
  template:
    metadata:
      labels:
        app: store
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - store
            topologyKey: "kubernetes.io/hostname"
      containers:
      - name: nginx
        image: nginx
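After applying the Deployment, you can check that the anti-affinity rule took effect and each replica landed on a different node:

kubectl apply -f pods/pod-with-node-antiaffinity.yaml
kubectl get pods -l app=store -o wide   # the NODE column should show three different nodes

Because this is a hard (required) rule, if the cluster has fewer than three schedulable nodes the extra replicas will stay Pending.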
Taints and Tolerations
Add a taint to a node, for example:
kubectl taint nodes node1 key=value:NoSchedule
Adding a taint means the node will no longer schedule any pods unless a pod has a matching Toleration.
Removing a taint works like removing a label: append a - at the end.
kubectl taint nodes node1 key:NoSchedule-
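To check which taints a node currently carries (node1 follows the example above):

kubectl describe node node1 | grep Taints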
The manifest below shows that although node1 carries the taint, the pod declares a matching toleration, so it can still be scheduled onto node1.
pods/pod-with-toleration.yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    env: test
spec:
  containers:
  - name: nginx
    image: nginx
    imagePullPolicy: IfNotPresent
  tolerations:
  - key: "key"
    operator: "Equal"
    value: "value"
    effect: "NoSchedule"
Each taint has a key and value as its label (the value may be empty), plus an effect that describes what the taint does. The following three effects are currently supported:
- NoSchedule: Kubernetes will not schedule the Pod onto a Node with this taint.
- PreferNoSchedule: Kubernetes will try to avoid scheduling the Pod onto a Node with this taint.
- NoExecute: Kubernetes will not schedule the Pod onto a Node with this taint, and Pods already running on the Node will be evicted.
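For example, applying a NoExecute taint to node1 (reusing the key=value pair from above) immediately evicts any pods already on the node that do not tolerate it:

kubectl taint nodes node1 key=value:NoExecute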
Toleration:
- key, value and effect must match the taint set on the Node
- when operator is Exists, the value is ignored
- tolerationSeconds describes how long the Pod may keep running on the Node before being evicted (see the sketch below)
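A minimal sketch of tolerationSeconds, assuming a NoExecute taint with the same key/value as above; the pod stays on the node for 3600 seconds after the taint is added and is then evicted:

tolerations:
- key: "key"
  operator: "Equal"
  value: "value"
  effect: "NoExecute"
  tolerationSeconds: 3600   # keep running for one hour after the taint appears, then evict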
Here are two special cases of tolerations on a Pod:
Example 1: when key is not specified, the toleration matches all taint keys:
tolerations:
- operator: "Exists"
Example 2: when effect is not specified, the toleration matches all effects for the given key:
tolerations:
- key: "key"
  operator: "Exists"