Configuration Notes for Deploying K8S Services on Alibaba Cloud

Author: 天草二十六_简村人 | Published: 2022-09-29 09:18
[Figure: 阿里云产品.png (Alibaba Cloud products)]

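1. Graceful shutdown

1.1 The pod startup probe (startupProbe)

The startup probe indicates whether the application inside the container has started. If a startup probe is configured, all other probes are disabled until it succeeds. If the startup probe fails, the kubelet kills the container, and the container is then subject to its restart policy. If a container does not define a startup probe, the default state is Success.

# In this example, success is based on an HTTP GET to /mgm/health on port 9036
# (sent to the Pod IP by default, i.e. http://<pod-ip>:9036/mgm/health).
# A response status code greater than or equal to 200 and less than 400 counts as success.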
          startupProbe:
            failureThreshold: 22
            httpGet:
              path: /mgm/health
              port: 9036
              scheme: HTTP
            initialDelaySeconds: 25
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 5

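Probe configuration fields

  • initialDelaySeconds: number of seconds to wait after the container has started before the probes are initiated. Defaults to 0. Minimum value is 0.
  • periodSeconds: how often (in seconds) to perform the probe. Defaults to 10. Minimum value is 1.
  • timeoutSeconds: number of seconds after which a probe attempt times out. Defaults to 1. Minimum value is 1.
  • successThreshold: minimum number of consecutive successes for the probe to be considered successful after having failed. Defaults to 1. Must be 1 for liveness and startup probes. Minimum value is 1.
  • failureThreshold: number of consecutive probe failures Kubernetes tolerates before giving up. For a liveness probe, giving up means restarting the container. For a readiness probe, giving up means the Pod is marked Unready. Defaults to 3. Minimum value is 1.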
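
As a rough worked example of how these fields interact, the startup probe shown earlier gives the application about initialDelaySeconds + failureThreshold × periodSeconds = 25 + 22 × 10 = 245 seconds to come up before the kubelet kills the container. The fragment below is only a sketch: it restates that arithmetic as comments and pairs the startup probe with a liveness probe. The livenessProbe values are illustrative assumptions and are not part of the original deployment.

```yaml
# Sketch only: container-level fragment, to be placed under spec.containers[*].
startupProbe:
  httpGet:
    path: /mgm/health
    port: 9036
  initialDelaySeconds: 25   # wait 25 s before the first probe
  periodSeconds: 10         # then probe every 10 s
  failureThreshold: 22      # give up after 22 consecutive failures
  timeoutSeconds: 5
  # Approximate startup budget: 25 + 22 * 10 = 245 s
livenessProbe:              # assumed values, for illustration only
  httpGet:
    path: /mgm/health
    port: 9036
  periodSeconds: 10
  failureThreshold: 3       # restart after roughly 3 * 10 = 30 s of failures
  successThreshold: 1       # must be 1 for liveness probes
```

Because the liveness probe stays disabled until the startup probe succeeds, a slow-starting service can keep a tight liveness threshold without being restarted during startup.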

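HTTP probes accept additional fields under httpGet:

  • host: host name to connect to. Defaults to the Pod IP. Alternatively, set a "Host" header in httpHeaders.
  • scheme: scheme used to connect to the host (HTTP or HTTPS). Defaults to HTTP.
  • path: path to request on the HTTP server.
  • httpHeaders: custom headers to set in the request. HTTP allows repeated headers.
  • port: name or number of the container port to access. A number must be in the range 1 to 65535.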
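
To illustrate the optional httpGet fields, here is a sketch of a readiness probe that addresses the container port by name and sends one custom header. The port name mgm-http and the header X-Probe-Source are hypothetical, chosen only for this example; the path and port number are taken from the article.

```yaml
# Sketch only: container-level fragment; the port name and header are hypothetical.
ports:
  - name: mgm-http             # hypothetical port name
    containerPort: 9036
    protocol: TCP
readinessProbe:
  httpGet:
    path: /mgm/health
    port: mgm-http             # refers to the named port above instead of a number
    scheme: HTTP
    httpHeaders:
      - name: X-Probe-Source   # hypothetical custom header
        value: kubelet
  periodSeconds: 5
```

Referring to the port by name means the probe does not need to change if the port number is changed later.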

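1.2 Defining hook functions in lifecycle

Hook functions let a container react to events in its own lifecycle and run user-specified code when the corresponding moment arrives. Kubernetes provides two hooks for the main container: one after it starts and one before it stops.

  • postStart: runs after the container is created; if it fails, the container is killed and restarted according to its restart policy.
  • preStop: runs before the container is terminated. The container is terminated once the hook completes, and deletion of the container is blocked until the hook finishes or the termination grace period expires.

Hook handlers support the following three ways of defining an action:

Exec: run a command once inside the container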

  lifecycle:
    postStart:
      exec:
        command:
          - cat
          - /tmp/healthy

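TCPSocket: try to connect to the specified socket on the current container. Note that the Kubernetes API reference marks tcpSocket as deprecated and unsupported for lifecycle handlers, so prefer exec or httpGet for hooks.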

  lifecycle:
    postStart:
      tcpSocket:
        port: 8080 

HttpGet: send an HTTP request to a URL on the current container

  lifecycle:
    postStart:
      httpGet:
        path: # URI path
        port:
        host:
        scheme: HTTP  # supported protocols: HTTP or HTTPS

The preStop hook

          lifecycle:
            preStop:
              exec:
                command:
                  - /bin/sh
                  - '-c'
                  - >-
                    wget http://127.0.0.1:54199/offline 2>/tmp/null;sleep 45 &&
                    /opt/xxx/wrong-answer-service/bin/do_stop.sh
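
The hook above first calls the service's /offline endpoint, then sleeps 45 seconds, then runs the stop script. All of that time counts against the pod's termination grace period, so the hook has to fit inside terminationGracePeriodSeconds; the complete manifest in this article sets it to 120 seconds. The fragment below is a minimal sketch of that pairing. The -q -O /dev/null flags are an assumption swapped in for illustration (the original redirects stderr to /tmp/null, which creates a regular file rather than discarding output); the image and paths are the article's own placeholders.

```yaml
# Sketch only: pod template fragment showing how the two settings relate.
spec:
  terminationGracePeriodSeconds: 120   # must exceed the preStop duration (45 s sleep + stop script)
  containers:
    - name: wrong-answer-service
      image: xxx-harbor-registry.cn-hangzhou.cr.aliyuncs.com/xxx-zty/wrong-answer-service:1.0.16
      lifecycle:
        preStop:
          exec:
            command:
              - /bin/sh
              - '-c'
              - >-
                wget -q -O /dev/null http://127.0.0.1:54199/offline;
                sleep 45 &&
                /opt/xxx/wrong-answer-service/bin/do_stop.sh
```

If the hook is still running when the grace period expires, the kubelet kills the container anyway, so the stop script may never get to run.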

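2. ARMS data collection

Do not set these options under "Global configuration" (全局配置). The version verified here is arms-bootstrap-1.7.0-SNAPSHOT.jar.

[Screenshot: image.png]

2.1 Sampling rate

[Screenshot: image.png]

2.2 Excluding some interfaces from collection

Recommended interfaces to ignore (append these to the defaults): //mgm/health,//mgm/promethues

[Screenshot: image.png]

Complete YAML example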

apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: '3'
  creationTimestamp: '2022-09-28T05:24:37Z'
  generation: 3
  labels:
    app: wrong-answer-service
    group: xxx
  managedFields:
    - apiVersion: apps/v1
      fieldsType: FieldsV1
      fieldsV1:
        'f:metadata':
          'f:labels':
            .: {}
            'f:app': {}
        'f:spec':
          'f:progressDeadlineSeconds': {}
          'f:replicas': {}
          'f:revisionHistoryLimit': {}
          'f:selector': {}
          'f:strategy':
            'f:rollingUpdate':
              .: {}
              'f:maxSurge': {}
              'f:maxUnavailable': {}
            'f:type': {}
          'f:template':
            'f:metadata':
              'f:labels':
                .: {}
                'f:app': {}
                'f:armsPilotAutoEnable': {}
                'f:armsPilotCreateAppName': {}
            'f:spec':
              'f:containers':
                'k:{"name":"wrong-answer-service"}':
                  .: {}
                  'f:env':
                    .: {}
                    'k:{"name":"TZ"}':
                      .: {}
                      'f:name': {}
                      'f:value': {}
                    'k:{"name":"aliyun_logs_wrong-answer-service"}':
                      .: {}
                      'f:name': {}
                      'f:value': {}
                  'f:image': {}
                  'f:imagePullPolicy': {}
                  'f:lifecycle':
                    .: {}
                    'f:preStop':
                      .: {}
                      'f:exec':
                        .: {}
                        'f:command': {}
                  'f:name': {}
                  'f:ports':
                    .: {}
                    'k:{"containerPort":9036,"protocol":"TCP"}':
                      .: {}
                      'f:containerPort': {}
                      'f:protocol': {}
                  'f:readinessProbe':
                    .: {}
                    'f:failureThreshold': {}
                    'f:httpGet':
                      .: {}
                      'f:path': {}
                      'f:port': {}
                      'f:scheme': {}
                    'f:initialDelaySeconds': {}
                    'f:periodSeconds': {}
                    'f:successThreshold': {}
                    'f:timeoutSeconds': {}
                  'f:resources':
                    .: {}
                    'f:limits':
                      .: {}
                      'f:cpu': {}
                      'f:memory': {}
                    'f:requests':
                      .: {}
                      'f:cpu': {}
                      'f:memory': {}
                  'f:startupProbe':
                    .: {}
                    'f:failureThreshold': {}
                    'f:httpGet':
                      .: {}
                      'f:path': {}
                      'f:port': {}
                      'f:scheme': {}
                    'f:initialDelaySeconds': {}
                    'f:periodSeconds': {}
                    'f:successThreshold': {}
                    'f:timeoutSeconds': {}
                  'f:terminationMessagePath': {}
                  'f:terminationMessagePolicy': {}
                  'f:volumeMounts':
                    .: {}
                    'k:{"mountPath":"/etc/localtime"}':
                      .: {}
                      'f:mountPath': {}
                      'f:name': {}
                    'k:{"mountPath":"/opt/xxx/logs/xxljob-log/"}':
                      .: {}
                      'f:mountPath': {}
                      'f:name': {}
                      'f:subPath': {}
                    'k:{"mountPath":"/opt/xxx/wrong-answer-service/resources/"}':
                      .: {}
                      'f:mountPath': {}
                      'f:name': {}
                      'f:subPath': {}
              'f:dnsPolicy': {}
              'f:nodeSelector': {}
              'f:restartPolicy': {}
              'f:schedulerName': {}
              'f:securityContext': {}
              'f:terminationGracePeriodSeconds': {}
              'f:volumes':
                .: {}
                'k:{"name":"volume-localtime"}':
                  .: {}
                  'f:hostPath':
                    .: {}
                    'f:path': {}
                    'f:type': {}
                  'f:name': {}
                'k:{"name":"volume-resources"}':
                  .: {}
                  'f:name': {}
                  'f:persistentVolumeClaim':
                    .: {}
                    'f:claimName': {}
                'k:{"name":"volume-xxljob"}':
                  .: {}
                  'f:name': {}
                  'f:persistentVolumeClaim':
                    .: {}
                    'f:claimName': {}
      manager: python-requests
      operation: Update
      time: '2022-09-28T05:24:37Z'
    - apiVersion: apps/v1
      fieldsType: FieldsV1
      fieldsV1:
        'f:metadata':
          'f:labels':
            'f:group': {}
        'f:spec':
          'f:template':
            'f:metadata':
              'f:annotations':
                .: {}
                'f:redeploy-timestamp': {}
      manager: ACK-Console Apache-HttpClient
      operation: Update
      time: '2022-09-28T07:10:46Z'
    - apiVersion: apps/v1
      fieldsType: FieldsV1
      fieldsV1:
        'f:metadata':
          'f:annotations':
            .: {}
            'f:deployment.kubernetes.io/revision': {}
        'f:status':
          'f:availableReplicas': {}
          'f:conditions':
            .: {}
            'k:{"type":"Available"}':
              .: {}
              'f:lastTransitionTime': {}
              'f:lastUpdateTime': {}
              'f:message': {}
              'f:reason': {}
              'f:status': {}
              'f:type': {}
            'k:{"type":"Progressing"}':
              .: {}
              'f:lastTransitionTime': {}
              'f:lastUpdateTime': {}
              'f:message': {}
              'f:reason': {}
              'f:status': {}
              'f:type': {}
          'f:observedGeneration': {}
          'f:readyReplicas': {}
          'f:replicas': {}
          'f:updatedReplicas': {}
      manager: kube-controller-manager
      operation: Update
      subresource: status
      time: '2022-09-28T07:30:46Z'
  name: wrong-answer-service
  namespace: java-service
  resourceVersion: '28026688'
  uid: 2f898c1b-b5c1-4c31-a1ab-f5e27788a433
spec:
  progressDeadlineSeconds: 600
  replicas: 2
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: wrong-answer-service
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      annotations:
        redeploy-timestamp: '1664350112420'
      labels:
        app: wrong-answer-service
        armsPilotAutoEnable: 'on'
        armsPilotCreateAppName: wrong-answer-service
    spec:
      containers:
# Environment variables, in particular the time zone
        - env:
            - name: aliyun_logs_wrong-answer-service
              value: stdout
            - name: TZ
              value: Asia/Shanghai
# Docker image
          image: >-
            xxx-harbor-registry.cn-hangzhou.cr.aliyuncs.com/xxx-zty/wrong-answer-service:1.0.16
          imagePullPolicy: Always
          lifecycle:
            preStop:
              exec:
                command:
                  - /bin/sh
                  - '-c'
                  - >-
                    wget http://127.0.0.1:54199/offline 2>/tmp/null;sleep 45 &&
                    /opt/xxx/wrong-answer-service/bin/do_stop.sh
          name: wrong-answer-service
          ports:
            - containerPort: 9036
              protocol: TCP
          readinessProbe:
            failureThreshold: 3
            httpGet:
              path: /mgm/health
              port: 9036
              scheme: HTTP
            initialDelaySeconds: 1
            periodSeconds: 5
            successThreshold: 1
            timeoutSeconds: 3
          resources:
            limits:
              cpu: '2'
              memory: 2Gi
            requests:
              cpu: 250m
              memory: 1717986918400m
          startupProbe:
            failureThreshold: 22
            httpGet:
              path: /mgm/health
              port: 9036
              scheme: HTTP
            initialDelaySeconds: 25
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 5
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
# Volume mounts
          volumeMounts:
            - mountPath: /etc/localtime
              name: volume-localtime
            - mountPath: /opt/xxx/logs/xxljob-log/
              name: volume-xxljob
              subPath: wrong-answer-service
            - mountPath: /opt/xxx/wrong-answer-service/resources/
              name: volume-resources
              subPath: wrong-answer-service
      dnsPolicy: ClusterFirst
      nodeSelector:
        pod: normal
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 120
      volumes:
        - hostPath:
            path: /etc/localtime
            type: ''
          name: volume-localtime
        - name: volume-xxljob
          persistentVolumeClaim:
            claimName: xxljob
        - name: volume-resources
          persistentVolumeClaim:
            claimName: resources
status:
  availableReplicas: 2
  conditions:
    - lastTransitionTime: '2022-09-28T05:30:49Z'
      lastUpdateTime: '2022-09-28T05:30:49Z'
      message: Deployment has minimum availability.
      reason: MinimumReplicasAvailable
      status: 'True'
      type: Available
    - lastTransitionTime: '2022-09-28T05:24:37Z'
      lastUpdateTime: '2022-09-28T07:30:46Z'
      message: >-
        ReplicaSet "wrong-answer-service-7f7957f69d" has successfully
        progressed.
      reason: NewReplicaSetAvailable
      status: 'True'
      type: Progressing
  observedGeneration: 3
  readyReplicas: 2
  replicas: 2
  updatedReplicas: 2
