Today, while checking the servers in the k8s cluster, I noticed something that didn't look quite right: resource usage across the machines was rather unbalanced.
First, top the nodes:
➜ ~ kubectl top no
NAME         CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
10.2.3.199   190m         2%     3209Mi          20%
10.2.3.200   109m         1%     2083Mi          13%
10.2.3.201   106m         1%     2126Mi          13%
10.2.3.203   338m         4%     9644Mi          61%
10.2.3.204   333m         4%     9805Mi          62%
10.2.3.205   306m         3%     3475Mi          10%
10.2.3.206   1222m        15%    10155Mi         64%
10.2.3.207   285m         3%     9395Mi          59%
10.2.3.208   903m         11%    10317Mi         65%
10.2.3.209   533m         6%     10683Mi         33%
10.2.3.210   1251m        15%    8091Mi          25%
10.2.3.211   84m          4%     1995Mi          54%
10.2.3.212   102m         5%     2013Mi          54%
10.2.3.213   115m         5%     1737Mi          47%
10.2.3.214   97m          4%     1973Mi          53%
10.2.3.215   83m          4%     1891Mi          51%
10.2.3.216   91m          4%     1932Mi          52%
Take a look at the role distribution:
➜ ~ kubectl get no
NAME         STATUS   ROLES          AGE    VERSION
10.2.3.199   Ready    worker         58m    v1.14.3
10.2.3.200   Ready    worker         52m    v1.14.3
10.2.3.201   Ready    worker         48m    v1.14.3
10.2.3.203   Ready    worker         2d6h   v1.14.3
10.2.3.204   Ready    worker         2d6h   v1.14.3
10.2.3.205   Ready    worker         2d6h   v1.14.3
10.2.3.206   Ready    worker         2d6h   v1.14.3
10.2.3.207   Ready    worker         2d6h   v1.14.3
10.2.3.208   Ready    worker         2d6h   v1.14.3
10.2.3.209   Ready    worker         2d6h   v1.14.3
10.2.3.210   Ready    worker         2d8h   v1.14.3
10.2.3.211   Ready    controlplane   2d7h   v1.14.3
10.2.3.212   Ready    controlplane   2d8h   v1.14.3
10.2.3.213   Ready    controlplane   2d8h   v1.14.3
10.2.3.214   Ready    etcd           2d8h   v1.14.3
10.2.3.215   Ready    etcd           2d8h   v1.14.3
10.2.3.216   Ready    etcd           2d8h   v1.14.3
Notice that the memory on 10.2.3.205 is practically idle: only a tiny fraction is in use even though the machine has a full 32G. Isn't that slacking off, hogging the seat without doing the work? That won't do. First, label the machines that have 32G of memory:
➜ ~ kubectl label no 10.2.3.205 mem=32
➜ ~ kubectl label no 10.2.3.209 mem=32
➜ ~ kubectl label no 10.2.3.210 mem=32
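As an aside, if a node ever gets the wrong label, kubectl can remove it (key name followed by a dash) or replace it in place with --overwrite; both are standard kubectl flags:
➜ ~ kubectl label no 10.2.3.205 mem-
➜ ~ kubectl label no 10.2.3.205 mem=32 --overwrite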
Check and confirm that the labels have been applied:
➜ ~ kubectl get no -l mem
NAME STATUS ROLES AGE VERSION
10.2.3.205 Ready worker 2d6h v1.14.3
10.2.3.209 Ready worker 2d6h v1.14.3
10.2.3.210 Ready worker 2d8h v1.14.3
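To also see the label's value rather than just which nodes carry it, the label can be printed as an extra column with -L (another standard kubectl flag); this should add a MEM column showing 32 for each of the three nodes:
➜ ~ kubectl get no -l mem -L mem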
Next, use a node affinity preference so that pods are preferentially scheduled onto the nodes labeled mem=32. Generate a pod manifest with a dry run, then add the nodeAffinity block to it:
➜ ~ kubectl run busybox --image busybox --restart Never --dry-run -oyaml > busybox.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: busybox
  name: busybox
spec:
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 50
        preference:
          matchExpressions:
          - key: mem
            operator: In
            values:
            - 32
  containers:
  - image: busybox
    name: busybox
    resources: {}
  dnsPolicy: ClusterFirst
  restartPolicy: Never
status: {}
Deploy it:
➜ ~ kubectl apply -f busybox.yaml
Error from server (BadRequest): error when creating "busybox.yaml": Pod in version "v1" cannot be handled as a Pod: v1.Pod.Spec: v1.PodSpec.Affinity: v1.Affinity.NodeAffinity: v1.NodeAffinity.PreferredDuringSchedulingIgnoredDuringExecution: []v1.PreferredSchedulingTerm: v1.PreferredSchedulingTerm.Preference: v1.NodeSelectorTerm.MatchExpressions: []v1.NodeSelectorRequirement: v1.NodeSelectorRequirement.Values: []string: ReadString: expects " or n, but found 3, error found in #10 byte of ...|values":[32]}]},"wei|..., bigger context ...|essions":[{"key":"mem","operator":"In","values":[32]}]},"weight":50}]}},"containers":[{"image":"busy|...
An error? The message is not particularly obvious. After searching the community, it turns out others have indeed hit this problem:
The label values must be strings. In yaml, that means all numeric values must be quoted.
So the values in a matchExpressions clause must be strings: if a value is purely numeric (and I happened to set the label value to 32), it has to be quoted. Tripped up by a detail once again.
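For reference, the only change needed in the manifest is quoting the value so YAML parses it as a string:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 50
        preference:
          matchExpressions:
          - key: mem
            operator: In
            values:
            - "32"   # quoted, so it is sent as a string rather than a number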
After fixing that, deploy again:
➜ ~ kubectl apply -f busybox.yaml
pod/busybox created
➜ ~ kubectl get po busybox -o wide
NAME      READY   STATUS      RESTARTS   AGE   IP           NODE         NOMINATED NODE   READINESS GATES
busybox   0/1     Completed   0          10s   10.2.67.84   10.2.3.210   <none>           <none>
The output shows that the pod did land on 10.2.3.210, one of the big-memory machines.
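A single pod landing there could in principle still be luck, since a preferred affinity is only a soft preference. A quick way to watch the preference at work is to run several replicas with the same affinity and see where they end up; a minimal sketch (the deployment name, replica count and sleep command are arbitrary choices of mine):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: busybox-prefer-mem32        # hypothetical name for this experiment
spec:
  replicas: 6
  selector:
    matchLabels:
      run: busybox-prefer-mem32
  template:
    metadata:
      labels:
        run: busybox-prefer-mem32
    spec:
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 50
            preference:
              matchExpressions:
              - key: mem
                operator: In
                values:
                - "32"
      containers:
      - name: busybox
        image: busybox
        command: ["sleep", "3600"]  # keep the pods running so their placement stays visible
Most of the replicas should then show up on the mem=32 nodes in the output of:
➜ ~ kubectl get po -l run=busybox-prefer-mem32 -o wide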
Let's review the meaning of the fields once more:
➜ ~ kubectl explain pod.spec.affinity.nodeAffinity.preferredDuringSchedulingIgnoredDuringExecution
KIND: Pod
VERSION: v1
RESOURCE: preferredDuringSchedulingIgnoredDuringExecution <[]Object>
DESCRIPTION:
The scheduler will prefer to schedule pods to nodes that satisfy the
affinity expressions specified by this field, but it may choose a node that
violates one or more of the expressions. The node that is most preferred is
the one with the greatest sum of weights, i.e. for each node that meets all
of the scheduling requirements (resource request, requiredDuringScheduling
affinity expressions, etc.), compute a sum by iterating through the
elements of this field and adding "weight" to the sum if the node matches
the corresponding matchExpressions; the node(s) with the highest sum are
the most preferred.
An empty preferred scheduling term matches all objects with implicit weight
0 (i.e. it's a no-op). A null preferred scheduling term matches no objects
(i.e. is also a no-op).
FIELDS:
   preference   <Object> -required-
     A node selector term, associated with the corresponding weight.

   weight   <integer> -required-
     Weight associated with matching the corresponding nodeSelectorTerm, in the
     range 1-100.
The larger the weight, the more strongly the corresponding preference counts. Before this, I had simply assumed that the k8s scheduler filters and scores nodes purely by how much spare capacity they have; that turns out not to be the whole story, so it's time to properly dig into the scheduler's policies.
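As a concrete illustration of the weight summation described in the field documentation above, a pod can carry several preferred terms, and the scheduler adds up the weights of the terms each candidate node matches (the disk=ssd label here is hypothetical, just to show a second term):
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 50                  # a node labeled mem=32 contributes 50 to the score
        preference:
          matchExpressions:
          - key: mem
            operator: In
            values:
            - "32"
      - weight: 10                  # a node labeled disk=ssd (hypothetical) contributes another 10
        preference:
          matchExpressions:
          - key: disk
            operator: In
            values:
            - ssd
A node carrying both labels would score 60 from these terms, a node with only mem=32 would score 50, and the scheduler's other priority functions are still added on top, which is why this remains a preference rather than a guarantee.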