Today's technical practice share covers the open-source tool KEDA. KEDA is a lightweight, simple, yet powerful utility for event-driven autoscaling of Kubernetes resource objects. It supports not only the CPU/memory-based and Cron-scheduled HPA approaches covered in earlier posts, but also a wide range of event-driven scalers, such as message-queue length events from MQ and Kafka, Redis, URL metrics, Prometheus value thresholds, and more. The current KEDA release (v2.7) ships with 53 built-in event-source Scalers. This article focuses on the scaler that operations engineers probably use most, the "prometheus" event source, and uses it to drive the scaling of Job instances, targeting the following scenario:
A single instance of the business application handles one request (a long-running task)
Multiple instances of the application can be started on demand
Each instance exits automatically after finishing its work
1. KEDA Installation
1.1 Installation (YAML manifest)
wget https://github.com/kedacore/keda/releases/download/v2.7.1/keda-2.7.1.yaml
# Note: if GitHub/ghcr.io is hard to reach from inside China, replace the image registry in the manifest with a domestic mirror,
# e.g. replace ghcr.io with ghcr.nju.edu.cn
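One way to perform the replacement in the downloaded manifest (a minimal sketch, assuming GNU sed; adjust the in-place flag for BSD/macOS sed):
sed -i 's/ghcr\.io/ghcr.nju.edu.cn/g' keda-2.7.1.yaml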
# Installation output
[keda /]# kubectl apply -f keda-2.7.1.yaml
namespace/keda created
customresourcedefinition.apiextensions.k8s.io/clustertriggerauthentications.keda.sh created
customresourcedefinition.apiextensions.k8s.io/scaledjobs.keda.sh created
customresourcedefinition.apiextensions.k8s.io/scaledobjects.keda.sh created
customresourcedefinition.apiextensions.k8s.io/triggerauthentications.keda.sh created
serviceaccount/keda-operator created
clusterrole.rbac.authorization.k8s.io/keda-external-metrics-reader created
clusterrole.rbac.authorization.k8s.io/keda-operator created
rolebinding.rbac.authorization.k8s.io/keda-auth-reader created
clusterrolebinding.rbac.authorization.k8s.io/keda-hpa-controller-external-metrics created
clusterrolebinding.rbac.authorization.k8s.io/keda-operator created
clusterrolebinding.rbac.authorization.k8s.io/keda-system-auth-delegator created
service/keda-metrics-apiserver created
deployment.apps/keda-metrics-apiserver created
deployment.apps/keda-operator created
apiservice.apiregistration.k8s.io/v1beta1.external.metrics.k8s.io configured
1.2 Verifying the installation
Check the installed resource objects:
[keda /]# kubectl get all -n keda
NAME READY STATUS RESTARTS AGE
pod/keda-metrics-apiserver-5ff7b56d-lgwrs 0/1 ContainerCreating 0 26s
pod/keda-operator-65df59d669-r5qct 0/1 ContainerCreating 0 26s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/keda-metrics-apiserver ClusterIP 10.109.61.104 <none> 443/TCP,80/TCP 26s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/keda-metrics-apiserver 0/1 1 0 26s
deployment.apps/keda-operator 0/1 1 0 26s
NAME DESIRED CURRENT READY AGE
replicaset.apps/keda-metrics-apiserver-5ff7b56d 1 1 0 26s
replicaset.apps/keda-operator-65df59d669 1 1 0 26s
[keda /]# kubectl api-resources | grep scale
scaledjobs sj keda.sh/v1alpha1 true ScaledJob
scaledobjects so keda.sh/v1alpha1 true ScaledObject
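Beyond listing the resources, it can help to wait for the two deployments to become Available and to confirm the external metrics APIService registered by the manifest. A minimal check, using only the resource names created above:
kubectl -n keda wait --for=condition=Available deployment/keda-operator deployment/keda-metrics-apiserver --timeout=120s
kubectl get apiservice v1beta1.external.metrics.k8s.io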
2. ScaledJob
2.1 Prometheus Query Result Trigger
Create and apply a ScaledJob object in the K8s cluster:
# scaledjob-demo.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledJob
metadata:
  name: scaled-job-prometheus
spec:
  jobTargetRef:
    parallelism: 2                 # number of Pods a single Job runs in parallel
    # completions: 4
    # activeDeadlineSeconds: 600   # maximum running time (deadline) of a Job
    backoffLimit: 6
    template:
      spec:
        restartPolicy: Never
        containers:
        - image: alpine
          name: demo-job-scale
          command: ["/bin/sh"]
          args:
          - -c
          - echo "job doing..."; sleep 120
  pollingInterval: 30
  successfulJobsHistoryLimit: 5
  failedJobsHistoryLimit: 5
  maxReplicaCount: 5
  rolloutStrategy: gradual
  scalingStrategy:
    strategy: "custom"
    customScalingQueueLengthDeduction: 1
    customScalingRunningJobPercentage: "0.5"
  triggers:
  - type: prometheus
    metadata:
      serverAddress: http://x.x.x.x:xxxx   # Prometheus server address
      metricName: prometheus_http_requests_total
      query: round(sum(rate(prometheus_http_requests_total{container="prometheus",handler="/api/v1/query",code="200"}[2m]))*100)
      threshold: '60'
# kubectl apply -f scaledjob-demo.yaml
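Before relying on the trigger, it may be worth running the same PromQL query by hand to see the value KEDA will compare against the threshold of 60. A sketch using curl against the same placeholder Prometheus address (the /api/v1/query endpoint accepts form-encoded parameters):
curl -s 'http://x.x.x.x:xxxx/api/v1/query' --data-urlencode 'query=round(sum(rate(prometheus_http_requests_total{container="prometheus",handler="/api/v1/query",code="200"}[2m]))*100)'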
2.2 ScaledJob Configuration Parameters
jobTargetRef
The definition of the native K8s Job object. Jobs can run work in three forms: non-parallel Jobs, parallel Jobs with a fixed completion count, and parallel Jobs with a work queue. See the Kubernetes Jobs documentation for the full parameter list.
pollingInterval: 30 # Optional. Default: 30 seconds
The polling interval for each trigger; by default every trigger is checked every 30 seconds.
successfulJobsHistoryLimit: 5 # Optional. Default: 100
failedJobsHistoryLimit: 5 # Optional. Default: 100.
How many successfully completed and failed Job records to keep in history.
envSourceContainerName: {container-name}
The name of the container from which environment variable values are read; if not specified, it defaults to the first container of the Job target, i.e. .spec.jobTargetRef.template.spec.containers[0].
maxReplicaCount: 5 # Optional. Default: 100
The maximum number of Jobs that can be scaled out. The number of Jobs (and hence Pods) actually created is derived from the Target Average Value (how much a single Job consumes) and the Running Job Count (the number of currently running Jobs).
Note (observed in testing): maxReplicaCount is a count of Jobs, and it is also the maximum number of Jobs that can be created within a single polling interval.
rolloutStrategy: gradual # Optional. Default: default | gradual
The rollout strategy used when an existing ScaledJob configuration is updated: "default" terminates the existing Jobs and recreates them; "gradual" only creates new Jobs and leaves the existing ones untouched.
scalingStrategy: strategy: "default" # default | custom | accurate
- How the three scaling strategies are calculated (a worked example is given at the end of this section):
default strategy: maxScale - runningJobCount
custom strategy: min(maxScale - int64(s.CustomScalingQueueLengthDeduction) - int64(float64(runningJobCount)*(*s.CustomScalingRunningJobPercentage)), maxReplicaCount)
customScalingQueueLengthDeduction: 1
customScalingRunningJobPercentage: "0.5"
accurate strategy: if (maxScale + runningJobCount) > maxReplicaCount { return maxReplicaCount - runningJobCount } return maxScale - pendingJobCount
Note (observed in testing): with the default strategy the minimum number of running Jobs is 1, while the custom strategy allows it to drop to 0.
- Terms used in the formulas:
"maxScale": the queue length divided (rounded up) by the target average value, capped at the maxReplicaCount setting. Formula: maxValue = min(scaledJob.MaxReplicaCount(), divideWithCeil(queueLength, targetAverageValue))
"runningJobCount": the number of Jobs that are running and not yet complete
"pendingJobCount": the number of Jobs in the Pending state
scalingStrategy: multipleScalersCalculation: "max" # max (default) | min | avg | sum
When multiple triggers are defined, this selects how their values are combined: Max (default) / Min / Avg / Sum.
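A hypothetical worked example of the three strategies, using this article's settings (targetAverageValue = threshold = 60, maxReplicaCount = 5, customScalingQueueLengthDeduction = 1, customScalingRunningJobPercentage = 0.5) and assuming the query returns a queue length of 180 while 2 Jobs are running and 0 are pending:
maxScale = min(5, ceil(180 / 60)) = 3
default strategy:  3 - 2 = 1 new Job
custom strategy:   min(3 - 1 - int(2 * 0.5), 5) = min(1, 5) = 1 new Job
accurate strategy: (3 + 2) is not greater than 5, so maxScale - pendingJobCount = 3 - 0 = 3 new Jobs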
2.3 Testing and Verification
hey -z 60s -c 10 'http://x.x.x.x:xxxx/api/v1/query?query=prometheus_http_requests_total{container="prometheus"}'
Note: the load test hits the same endpoint that the ScaledJob's Prometheus query measures; once the query result reaches the threshold of 60, KEDA creates K8s Job objects whose Pods run the task.
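While the load test runs, an optional way to follow the scaler's decisions is to tail the keda-operator logs (standard kubectl, no extra assumptions):
kubectl -n keda logs -f deploy/keda-operator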
# Check the ScaledJob resource object
[keda /]# kubectl get sj
NAME MAX TRIGGERS AUTHENTICATION READY ACTIVE AGE
scaled-job-prometheus 5 prometheus True True 14s
# Pod status for the Jobs created once the load test crosses the threshold
[keda /]# kubectl get pod
NAME READY STATUS RESTARTS AGE
scaled-job-prometheus-29798-7xn8g 0/1 ContainerCreating 0 17s
scaled-job-prometheus-29798-ghf2z 1/1 Running 0 17s
scaled-job-prometheus-2zcvv-44xkg 0/1 ContainerCreating 0 77s
scaled-job-prometheus-2zcvv-6fvqz 1/1 Running 0 77s
scaled-job-prometheus-6dl26-jkl8m 1/1 Running 0 77s
scaled-job-prometheus-6dl26-vmrd8 1/1 Running 0 77s
scaled-job-prometheus-9w2qf-rtlqg 1/1 Running 0 77s
scaled-job-prometheus-9w2qf-zrc4p 1/1 Running 0 77s
scaled-job-prometheus-dwrp4-h7qvn 1/1 Running 0 77s
scaled-job-prometheus-dwrp4-rbzn7 1/1 Running 0 77s
scaled-job-prometheus-ng6nj-6pmvk 1/1 Running 0 47s
scaled-job-prometheus-ng6nj-gkp8d 1/1 Running 0 47s
scaled-job-prometheus-qb7bk-p9699 0/1 ContainerCreating 0 47s
scaled-job-prometheus-qb7bk-zmlb7 1/1 Running 0 47s
# Pod status after the Jobs created during the load test have completed
[keda /]# kubectl get pod
NAME READY STATUS RESTARTS AGE
scaled-job-prometheus-29798-7xn8g 0/1 Completed 0 3m8s
scaled-job-prometheus-29798-ghf2z 0/1 Completed 0 3m8s
scaled-job-prometheus-2zcvv-44xkg 0/1 Completed 0 4m8s
scaled-job-prometheus-2zcvv-6fvqz 0/1 Completed 0 4m8s
scaled-job-prometheus-6dl26-jkl8m 0/1 Completed 0 4m8s
scaled-job-prometheus-6dl26-vmrd8 0/1 Completed 0 4m8s
scaled-job-prometheus-lf27p-m7v2x 0/1 Completed 0 2m38s
scaled-job-prometheus-lf27p-r9lpb 0/1 Completed 0 2m38s
scaled-job-prometheus-qb7bk-p9699 0/1 Completed 0 3m38s
scaled-job-prometheus-qb7bk-zmlb7 0/1 Completed 0 3m38s
# Continuously watching the K8s Jobs during the load test
[keda /]# kubectl get job -w
scaled-job-prometheus-9w2qf 0/1 of 2 0s # pod created
scaled-job-prometheus-2zcvv 0/1 of 2 0s
scaled-job-prometheus-9w2qf 0/1 of 2 0s
scaled-job-prometheus-dwrp4 0/1 of 2 0s
scaled-job-prometheus-2zcvv 0/1 of 2 0s
scaled-job-prometheus-6dl26 0/1 of 2 0s
scaled-job-prometheus-dwrp4 0/1 of 2 0s
scaled-job-prometheus-6dl26 0/1 of 2 0s
scaled-job-prometheus-9w2qf 0/1 of 2 0s 0s
scaled-job-prometheus-dwrp4 0/1 of 2 0s 0s
scaled-job-prometheus-6dl26 0/1 of 2 0s 0s
scaled-job-prometheus-2zcvv 0/1 of 2 0s 0s
scaled-job-prometheus-ng6nj 0/1 of 2 0s
scaled-job-prometheus-ng6nj 0/1 of 2 0s
scaled-job-prometheus-qb7bk 0/1 of 2 0s
scaled-job-prometheus-qb7bk 0/1 of 2 0s
scaled-job-prometheus-ng6nj 0/1 of 2 0s 0s
scaled-job-prometheus-qb7bk 0/1 of 2 0s 0s
scaled-job-prometheus-29798 0/1 of 2 0s
scaled-job-prometheus-29798 0/1 of 2 0s
scaled-job-prometheus-29798 0/1 of 2 0s 0s
scaled-job-prometheus-lf27p 0/1 of 2 0s
scaled-job-prometheus-lf27p 0/1 of 2 0s
scaled-job-prometheus-lf27p 0/1 of 2 0s 0s
scaled-job-prometheus-dwrp4 1/1 of 2 2m17s 2m17s # first pod finished
scaled-job-prometheus-dwrp4 1/1 of 2 2m17s 2m17s
scaled-job-prometheus-9w2qf 1/1 of 2 2m18s 2m18s
scaled-job-prometheus-9w2qf 1/1 of 2 2m18s 2m18s
scaled-job-prometheus-2zcvv 1/1 of 2 2m18s 2m18s
scaled-job-prometheus-2zcvv 1/1 of 2 2m18s 2m18s
scaled-job-prometheus-dwrp4 2/1 of 2 2m33s 2m33s # second pod finished
scaled-job-prometheus-dwrp4 2/1 of 2 2m33s 2m33s
scaled-job-prometheus-6dl26 1/1 of 2 2m34s 2m34s
scaled-job-prometheus-6dl26 1/1 of 2 2m34s 2m34s
scaled-job-prometheus-qb7bk 1/1 of 2 2m17s 2m17s
scaled-job-prometheus-qb7bk 1/1 of 2 2m17s 2m17s
scaled-job-prometheus-9w2qf 2/1 of 2 2m49s 2m49s
scaled-job-prometheus-9w2qf 2/1 of 2 2m49s 2m49s
scaled-job-prometheus-ng6nj 1/1 of 2 2m19s 2m19s
scaled-job-prometheus-ng6nj 1/1 of 2 2m19s 2m19s
scaled-job-prometheus-ng6nj 2/1 of 2 2m32s 2m32s
scaled-job-prometheus-ng6nj 2/1 of 2 2m32s 2m32s
scaled-job-prometheus-6dl26 2/1 of 2 3m4s 3m4s
scaled-job-prometheus-6dl26 2/1 of 2 3m4s 3m4s
scaled-job-prometheus-29798 1/1 of 2 2m17s 2m17s
scaled-job-prometheus-29798 1/1 of 2 2m17s 2m17s
scaled-job-prometheus-2zcvv 2/1 of 2 3m19s 3m19s
scaled-job-prometheus-2zcvv 2/1 of 2 3m19s 3m19s
scaled-job-prometheus-qb7bk 2/1 of 2 2m50s 2m50s
scaled-job-prometheus-qb7bk 2/1 of 2 2m50s 2m50s
scaled-job-prometheus-dwrp4 2/1 of 2 2m33s 3m30s
scaled-job-prometheus-29798 2/1 of 2 2m36s 2m36s
scaled-job-prometheus-29798 2/1 of 2 2m36s 2m36s
scaled-job-prometheus-lf27p 1/1 of 2 2m17s 2m17s
scaled-job-prometheus-lf27p 1/1 of 2 2m17s 2m17s
scaled-job-prometheus-lf27p 2/1 of 2 2m21s 2m21s
scaled-job-prometheus-lf27p 2/1 of 2 2m21s 2m21s
scaled-job-prometheus-9w2qf 2/1 of 2 2m49s 4m
scaled-job-prometheus-ng6nj 2/1 of 2 2m32s 3m30s
~~FINISH ~~