[K8s Picks] How to Troubleshoot Image and YAML Deployment Problems


Author: 熊本极客 | Published: 2022-02-14 16:26

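    Step 1: Check the pod events

    # If the events show a clear cause (for example, insufficient resources), the problem is located and the remaining steps can be skipped.
    # Method 1
    $ kubectl describe pod deployment-flink-jobmanager-f766989b9-dth5v -n iot
    # Method 2
    $ kubectl get events -A | grep deployment-flink-jobmanager-f766989b9-dth5v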
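
When a namespace produces many events, the Warning events for the pod are usually the ones worth reading first. A minimal sketch of such a filter follows. The function name `warning_events` is hypothetical, and the sketch assumes the default table output of `kubectl get events`, where the event type is its own whitespace-separated column and the object is printed as `pod/<name>`.

```shell
#!/bin/sh
# Hypothetical helper: keep only Warning events that belong to one pod.
# Reads the table printed by `kubectl get events` on stdin.
# Assumption: the type column contains the word "Warning" surrounded by
# whitespace, and the object column reads "pod/<name>" followed by a space.
warning_events() {
    awk -v pod="$1" '
        index($0, "pod/" pod " ") && index($0, " Warning ") { print }
    '
}

# Usage (needs cluster access):
#   kubectl get events -n iot | warning_events deployment-flink-jobmanager-f766989b9-dth5v
```

Matching on `pod/<name>` followed by a space avoids picking up other pods whose names merely start with the same prefix.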
    

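    Step 2: Check the container logs with kubectl logs

    # If the logs show a clear cause (for example, the startup script exited abnormally), the problem is located and the remaining steps can be skipped.
    $ kubectl logs deployment-flink-jobmanager-f766989b9-dth5v -n iot
    Error from server (BadRequest): container "jobmanager" in pod "deployment-flink-jobmanager-f766989b9-dth5v" is waiting to start: PodInitializing
    # The container has not started, so there are no logs yet. Check the kubelet log next.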
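
The error text already names the container that is blocked and the reason it is waiting. As a small illustration, the sketch below pulls both fields out of the message. `waiting_reason` is a hypothetical name, and the pattern assumes the message wording shown above.

```shell
#!/bin/sh
# Hypothetical helper: print "<container> <reason>" from the message that
# `kubectl logs` returns when a container has not started yet.
# Assumption: the message has the form
#   container "<name>" in pod "<pod>" is waiting to start: <reason>
waiting_reason() {
    sed -n 's/.*container "\([^"]*\)" in pod "[^"]*" is waiting to start: \(.*\)$/\1 \2/p'
}

# Usage (needs cluster access); the message arrives on stderr:
#   kubectl logs <pod> -n iot 2>&1 | waiting_reason
```

A reason of `PodInitializing` means the pod's init containers have not finished, so their logs are the ones to read, for example with `kubectl logs <pod> -n iot -c <init-container>`. Adding `--previous` shows the output of the last terminated attempt.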
    

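    Step 3: Log in to the node that hosts the container and check the kubelet log with journalctl

    # This log contains no explicit error. It only reports that StartContainer failed and the container is in CrashLoopBackOff, which suggests the deployment yaml is fine. The next step is to use docker run to investigate the image.
    $ journalctl -u kubelet | grep deployment-flink-jobmanager-f766989b9-dth5v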
    Feb 11 02:08:43 iota-node-3 kubelet[57638]: I0211 02:08:43.239559   57638 reconciler.go:224] operationExecutor.VerifyControllerAttachedVolume started for volume "flink-jobmanager-log-dir" (UniqueName: "kubernetes.io/host-path/db4c58cb-2cec-4c23-b7e9-bd9adfded8b1-flink-jobmanager-log-dir") pod "deployment-flink-jobmanager-f766989b9-dth5v" (UID: "db4c58cb-2cec-4c23-b7e9-bd9adfded8b1")
    Feb 11 02:08:43 iota-node-3 kubelet[57638]: I0211 02:08:43.239637   57638 reconciler.go:224] operationExecutor.VerifyControllerAttachedVolume started for volume "flink-tools-dir" (UniqueName: "kubernetes.io/host-path/db4c58cb-2cec-4c23-b7e9-bd9adfded8b1-flink-tools-dir") pod "deployment-flink-jobmanager-f766989b9-dth5v" (UID: "db4c58cb-2cec-4c23-b7e9-bd9adfded8b1")
    Feb 11 02:08:43 iota-node-3 kubelet[57638]: I0211 02:08:43.239753   57638 reconciler.go:224] operationExecutor.VerifyControllerAttachedVolume started for volume "flink-config-volume" (UniqueName: "kubernetes.io/configmap/db4c58cb-2cec-4c23-b7e9-bd9adfded8b1-flink-config-volume") pod "deployment-flink-jobmanager-f766989b9-dth5v" (UID: "db4c58cb-2cec-4c23-b7e9-bd9adfded8b1")
    Feb 11 02:08:43 iota-node-3 kubelet[57638]: I0211 02:08:43.239801   57638 reconciler.go:224] operationExecutor.VerifyControllerAttachedVolume started for volume "serviceaccount-iota-token-crpqb" (UniqueName: "kubernetes.io/secret/db4c58cb-2cec-4c23-b7e9-bd9adfded8b1-serviceaccount-iota-token-crpqb") pod "deployment-flink-jobmanager-f766989b9-dth5v" (UID: "db4c58cb-2cec-4c23-b7e9-bd9adfded8b1")
    Feb 11 02:09:16 iota-node-3 kubelet[57638]: E0211 02:09:16.177804   57638 pod_workers.go:191] Error syncing pod db4c58cb-2cec-4c23-b7e9-bd9adfded8b1 ("deployment-flink-jobmanager-f766989b9-dth5v_iot(db4c58cb-2cec-4c23-b7e9-bd9adfded8b1)"), skipping: failed to "StartContainer" for "jobmanager" with CrashLoopBackOff: "back-off 10s restarting failed container=jobmanager pod=deployment-flink-jobmanager-f766989b9-dth5v_iot(db4c58cb-2cec-4c23-b7e9-bd9adfded8b1)"
    
    # When there is an explicit error, the log looks like the following, where a container from the yaml's initContainers fails:
    Feb 14 08:10:48 iota-node-3 kubelet[57638]: I0214 08:10:48.685536   57638 reconciler.go:224] operationExecutor.VerifyControllerAttachedVolume started for volume "flink-jobmanager-log-dir" (UniqueName: "kubernetes.io/host-path/ec0429a1-5774-466b-a5b0-699ca9353b61-flink-jobmanager-log-dir") pod "deployment-flink-jobmanager-689bc4b45f-tpvjb" (UID: "ec0429a1-5774-466b-a5b0-699ca9353b61")
    Feb 14 08:10:48 iota-node-3 kubelet[57638]: I0214 08:10:48.685574   57638 reconciler.go:224] operationExecutor.VerifyControllerAttachedVolume started for volume "flink-config-volume" (UniqueName: "kubernetes.io/configmap/ec0429a1-5774-466b-a5b0-699ca9353b61-flink-config-volume") pod "deployment-flink-jobmanager-689bc4b45f-tpvjb" (UID: "ec0429a1-5774-466b-a5b0-699ca9353b61")
    Feb 14 08:10:48 iota-node-3 kubelet[57638]: I0214 08:10:48.685698   57638 reconciler.go:224] operationExecutor.VerifyControllerAttachedVolume started for volume "serviceaccount-iota-token-crpqb" (UniqueName: "kubernetes.io/secret/ec0429a1-5774-466b-a5b0-699ca9353b61-serviceaccount-iota-token-crpqb") pod "deployment-flink-jobmanager-689bc4b45f-tpvjb" (UID: "ec0429a1-5774-466b-a5b0-699ca9353b61")
    Feb 14 08:10:50 iota-node-3 kubelet[57638]: E0214 08:10:50.660663   57638 kuberuntime_container.go:706] failed to remove pod init container "init-flink-home-dir": rpc error: code = Unknown desc = failed to remove container "e594274677f05e878b9133c4ead97d6eb91f240616540f9732d158c07a648cc7": Error response from daemon: removal of container e594274677f05e878b9133c4ead97d6eb91f240616540f9732d158c07a648cc7 is already in progress; Skipping pod "deployment-flink-jobmanager-689bc4b45f-tpvjb_iot(ec0429a1-5774-466b-a5b0-699ca9353b61)"
    Feb 14 08:10:50 iota-node-3 kubelet[57638]: E0214 08:10:50.660941   57638 pod_workers.go:191] Error syncing pod ec0429a1-5774-466b-a5b0-699ca9353b61 ("deployment-flink-jobmanager-689bc4b45f-tpvjb_iot(ec0429a1-5774-466b-a5b0-699ca9353b61)"), skipping: failed to "StartContainer" for "init-flink-home-dir" with CrashLoopBackOff: "back-off 10s restarting failed container=init-flink-home-dir pod=deployment-flink-jobmanager-689bc4b45f-tpvjb_iot(ec0429a1-5774-466b-a5b0-699ca9353b61)"
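
Both excerpts end with the same kind of line, and the container it names is what separates the two cases: an init container defined in the yaml versus the main container built from the image. A rough sketch for pulling those names out of the kubelet log follows. `crashloop_containers` is a hypothetical name, and the pattern assumes the message format of the kubelet version shown above.

```shell
#!/bin/sh
# Hypothetical helper: list the containers that kubelet reports as
# CrashLoopBackOff, one name per line, without duplicates.
# Assumption: the relevant lines look like
#   ... failed to "StartContainer" for "<name>" with CrashLoopBackOff: ...
crashloop_containers() {
    sed -n 's/.*failed to "StartContainer" for "\([^"]*\)" with CrashLoopBackOff.*/\1/p' | sort -u
}

# Usage on the node:
#   journalctl -u kubelet | grep <pod> | crashloop_containers
```

If the name printed is one of the yaml's initContainers, start with the yaml; if it is the main container, continue with the image.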
    
    

    Step 4: Log in to the node that hosts the container and run the image with docker run

    $ docker run -it 192.168.0.60:5000/test/flink:2022.0211.1104.57 bash
    bash-5.0$ cd /opt/test/flink/
    bash-5.0$ ls -l
    total 48
    drwxrwxrwx 1  flink flink  4096 Feb 11 06:49 bin
    drwxrwxrwx 1  flink flink  4096 Jun 15  2021 conf
    drwxrwxrwx 1  flink flink  4096 Jun 15  2021 examples
    drwxrwxrwx 1  flink flink  4096 Feb 11 06:49 lib
    -rwxrwxrwx 1  flink flink 11558 Apr 29  2021 LICENSE
    drwxrwxrwx 1  flink flink  4096 Apr 29  2021 log
    drwxrwxrwx 1  flink flink  4096 Jun 15  2021 opt
    drwxrwxrwx 1  flink flink  4096 Jun 15  2021 plugins
    -rwxrwxrwx 1  flink flink  1341 Apr 29  2021 README.txt
    drwxr--r-- 1  flink flink 4096 Feb 11 06:49 scripts
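    # Note: the image's Dockerfile sets USER paas, but inside the container the directories are owned by flink, and scripts (drwxr--r--) grants no execute permission to other users. The image is therefore at fault, and the directory ownership or permissions must be fixed in the Dockerfile.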
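
The listing can also be checked mechanically. As a sketch, the filter below reads `ls -l` output and prints the directories whose mode denies execute (traverse) permission to users other than the owner and group, which is what a `USER` that does not own the files would run into. The name `untraversable_dirs` is hypothetical, and the sketch assumes that user is neither the owner nor a member of the owning group.

```shell
#!/bin/sh
# Hypothetical helper: from `ls -l` output on stdin, print directories that
# "other" users cannot enter. In a mode string such as drwxr--r--, the 10th
# character is the execute bit for others ("x", or "t" with the sticky bit).
# Assumes directory names without spaces, since only the last field is printed.
untraversable_dirs() {
    awk '$1 ~ /^d/ && substr($1, 10, 1) !~ /[xt]/ { print $NF }'
}

# Usage on the node (runs ls as the image's default user):
#   docker run --rm --entrypoint ls <image> -l /opt/test/flink | untraversable_dirs
```

For the listing above, the filter prints `scripts`.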
    

    Article link: https://www.haomeiwen.com/subject/dgzclrtx.html