Stepping in a "Pit" with k8s Affinity Deployment

Author: wu_sphinx | Published 2019-08-02 20:08

    While looking over the k8s cluster's servers today, I noticed something not quite right: resource usage did not look balanced across the machines.
    Top the nodes:

    ➜  ~ kubectl  top no
    
    NAME          CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
    10.2.3.199   190m         2%     3209Mi          20%
    10.2.3.200   109m         1%     2083Mi          13%
    10.2.3.201   106m         1%     2126Mi          13%
    10.2.3.203   338m         4%     9644Mi          61%
    10.2.3.204   333m         4%     9805Mi          62%
    10.2.3.205   306m         3%     3475Mi          10%
    10.2.3.206   1222m        15%    10155Mi         64%
    10.2.3.207   285m         3%     9395Mi          59%
    10.2.3.208   903m         11%    10317Mi         65%
    10.2.3.209   533m         6%     10683Mi         33%
    10.2.3.210   1251m        15%    8091Mi          25%
    10.2.3.211   84m          4%     1995Mi          54%
    10.2.3.212   102m         5%     2013Mi          54%
    10.2.3.213   115m         5%     1737Mi          47%
    10.2.3.214   97m          4%     1973Mi          53%
    10.2.3.215   83m          4%     1891Mi          51%
    10.2.3.216   91m          4%     1932Mi          52%
    

    Check the node roles:

    ➜  ~ kubectl get no
    NAME          STATUS   ROLES          AGE    VERSION
    10.2.3.199   Ready    worker         58m    v1.14.3
    10.2.3.200   Ready    worker         52m    v1.14.3
    10.2.3.201   Ready    worker         48m    v1.14.3
    10.2.3.203   Ready    worker         2d6h   v1.14.3
    10.2.3.204   Ready    worker         2d6h   v1.14.3
    10.2.3.205   Ready    worker         2d6h   v1.14.3
    10.2.3.206   Ready    worker         2d6h   v1.14.3
    10.2.3.207   Ready    worker         2d6h   v1.14.3
    10.2.3.208   Ready    worker         2d6h   v1.14.3
    10.2.3.209   Ready    worker         2d6h   v1.14.3
    10.2.3.210   Ready    worker         2d8h   v1.14.3
    10.2.3.211   Ready    controlplane   2d7h   v1.14.3
    10.2.3.212   Ready    controlplane   2d8h   v1.14.3
    10.2.3.213   Ready    controlplane   2d8h   v1.14.3
    10.2.3.214   Ready    etcd           2d8h   v1.14.3
    10.2.3.215   Ready    etcd           2d8h   v1.14.3
    10.2.3.216   Ready    etcd           2d8h   v1.14.3
    

    Notice that 10.2.3.205's memory is seriously idle: only a sliver is in use even though the machine has a full 32G. Isn't that slacking off, hogging the spot and doing nothing?
    That won't do. First, label the 32G machines:

    ➜  ~ kubectl label no 10.2.3.205 mem=32
    ➜  ~ kubectl label no 10.2.3.209 mem=32
    ➜  ~ kubectl label no 10.2.3.210 mem=32
    

    Check and confirm the labels have been applied:

    ➜  ~ kubectl get no -l mem
    NAME          STATUS   ROLES    AGE    VERSION
    10.2.3.205   Ready    worker   2d6h   v1.14.3
    10.2.3.209   Ready    worker   2d6h   v1.14.3
    10.2.3.210   Ready    worker   2d8h   v1.14.3
    
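    As an aside, -l mem matches any node that carries the mem key at all; to filter on the exact value too, an equality selector works:

    ➜  ~ kubectl get no -l mem=32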

    Next, apply a node affinity policy so that pods are preferentially scheduled onto the nodes labeled mem=32. Generate a skeleton manifest with a dry run, then add the affinity stanza to it by hand:

    ➜  ~ kubectl run busybox --image busybox --restart Never --dry-run -oyaml > busybox.yaml
    
    apiVersion: v1
    kind: Pod
    metadata:
      creationTimestamp: null
      labels:
        run: busybox
      name: busybox
    spec:
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 50
            preference:
              matchExpressions:
              - key: mem
                operator: In
                values:
                - 32
      containers:
      - image: busybox
        name: busybox
        resources: {}
      dnsPolicy: ClusterFirst
      restartPolicy: Never
    status: {}
    
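    As an aside, preferredDuringSchedulingIgnoredDuringExecution is only a soft preference: the scheduler may still pick a node that matches nothing. If a hard constraint were wanted instead, the stanza would use requiredDuringSchedulingIgnoredDuringExecution; a minimal sketch (not used in this post), requiring only that the mem key exists:

    affinity:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: mem
              operator: Exists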

    Deploy it:

    ➜  ~ kubectl  apply -f busybox.yaml
    Error from server (BadRequest): error when creating "busybox.yaml": Pod in version "v1" cannot be handled as a Pod: v1.Pod.Spec: v1.PodSpec.Affinity: v1.Affinity.NodeAffinity: v1.NodeAffinity.PreferredDuringSchedulingIgnoredDuringExecution: []v1.PreferredSchedulingTerm: v1.PreferredSchedulingTerm.Preference: v1.NodeSelectorTerm.MatchExpressions: []v1.NodeSelectorRequirement: v1.NodeSelectorRequirement.Values: []string: ReadString: expects " or n, but found 3, error found in #10 byte of ...|values":[32]}]},"wei|..., bigger context ...|essions":[{"key":"mem","operator":"In","values":[32]}]},"weight":50}]}},"containers":[{"image":"busy|...
    

    An error? The message is not especially obvious, but a search of the community turned up people who have hit the same problem:

    The label values must be strings. In yaml, that means all numeric values must be quoted.

    So the values must be strings: in the YAML manifest, a purely numeric value (and I happened to pick 32 as the label value) has to be quoted, or it is parsed as a number and rejected. Tripped up by a detail once again.
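
    Concretely, the fix in busybox.yaml is just quoting the value:

              matchExpressions:
              - key: mem
                operator: In
                values:
                - "32"

    With that change, deploy again: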

    ➜  ~ kubectl apply -f busybox.yaml
    pod/busybox created
    ➜  ~ kubectl get po busybox -o wide
    NAME      READY   STATUS      RESTARTS   AGE   IP           NODE          NOMINATED NODE   READINESS GATES
    busybox   0/1     Completed   0          10s   10.2.67.84   10.2.3.210   <none>           <none>
    

    The output shows the pod really did land on one of the large-memory machines. (The Completed status is just busybox exiting immediately; the NODE column is what matters here.)
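
    The one-off busybox pod proves the preference works; for the actual rebalancing, the same stanza would sit under a workload's pod template. A minimal sketch of a Deployment (the nginx name and image here are just placeholders):

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: nginx
    spec:
      replicas: 3
      selector:
        matchLabels:
          app: nginx
      template:
        metadata:
          labels:
            app: nginx
        spec:
          affinity:
            nodeAffinity:
              preferredDuringSchedulingIgnoredDuringExecution:
              - weight: 50
                preference:
                  matchExpressions:
                  - key: mem
                    operator: In
                    values:
                    - "32"
          containers:
          - image: nginx
            name: nginx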

    Now review what these fields actually mean:

    ➜  ~ kubectl explain pod.spec.affinity.nodeAffinity.preferredDuringSchedulingIgnoredDuringExecution
    KIND:     Pod
    VERSION:  v1
    
    RESOURCE: preferredDuringSchedulingIgnoredDuringExecution <[]Object>
    
    DESCRIPTION:
         The scheduler will prefer to schedule pods to nodes that satisfy the
         affinity expressions specified by this field, but it may choose a node that
         violates one or more of the expressions. The node that is most preferred is
         the one with the greatest sum of weights, i.e. for each node that meets all
         of the scheduling requirements (resource request, requiredDuringScheduling
         affinity expressions, etc.), compute a sum by iterating through the
         elements of this field and adding "weight" to the sum if the node matches
         the corresponding matchExpressions; the node(s) with the highest sum are
         the most preferred.
    
         An empty preferred scheduling term matches all objects with implicit weight
         0 (i.e. it's a no-op). A null preferred scheduling term matches no objects
         (i.e. is also a no-op).
    
    FIELDS:
       preference   <Object> -required-
         A node selector term, associated with the corresponding weight.
    
       weight   <integer> -required-
         Weight associated with matching the corresponding nodeSelectorTerm, in the
         range 1-100.
    

    The larger the weight, the higher the priority of that preference. Until now I had simply taken it for granted that k8s scheduling filters and scores nodes by how much spare resource they have; the facts show otherwise, so it is time to properly dig into the k8s scheduler's policies.
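
    To make the weight arithmetic concrete: with several preferred terms, a node's score is the sum of the weights of the terms it matches. In the illustrative stanza below (disk=ssd is a made-up second label), a node matching both terms scores 50+20=70 and beats a node matching only one:

    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 50
      preference:
        matchExpressions:
        - key: mem
          operator: In
          values:
          - "32"
    - weight: 20
      preference:
        matchExpressions:
        - key: disk
          operator: In
          values:
          - ssd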
