Preface
After a power outage in the server room, the servers rebooted and kubelet failed to start.
Symptoms
Nov 10 07:03:46 k8smaster kubelet[16847]: I1110 07:03:46.014426 16847 server.go:410] Version: v1.16.2
Nov 10 07:03:46 k8smaster kubelet[16847]: I1110 07:03:46.014605 16847 plugins.go:100] No cloud provider specified.
Nov 10 07:03:46 k8smaster kubelet[16847]: I1110 07:03:46.014619 16847 server.go:773] Client rotation is on, will bootstrap in background
Nov 10 07:03:46 k8smaster kubelet[16847]: I1110 07:03:46.016286 16847 certificate_store.go:129] Loading cert/key pair from "/var/lib/kubelet/pki/kubelet-client-current.pem".
Nov 10 07:03:46 k8smaster kubelet[16847]: I1110 07:03:46.084908 16847 server.go:636] --cgroups-per-qos enabled, but --cgroup-root was not specified. defaulting to /
Nov 10 07:03:46 k8smaster kubelet[16847]: I1110 07:03:46.085161 16847 container_manager_linux.go:265] container manager verified user specified cgroup-root exists: []
Nov 10 07:03:46 k8smaster kubelet[16847]: I1110 07:03:46.085186 16847 container_manager_linux.go:270] Creating Container Manager object based on Node Config: {RuntimeCgroupsName: SystemCgroupsName: KubeletCgroupsName: ContainerRuntime:docker CgroupsPerQOS:true CgroupRoo
Nov 10 07:03:46 k8smaster kubelet[16847]: I1110 07:03:46.085257 16847 fake_topology_manager.go:29] [fake topologymanager] NewFakeManager
Nov 10 07:03:46 k8smaster kubelet[16847]: I1110 07:03:46.085262 16847 container_manager_linux.go:305] Creating device plugin manager: true
Nov 10 07:03:46 k8smaster kubelet[16847]: I1110 07:03:46.085281 16847 fake_topology_manager.go:39] [fake topologymanager] AddHintProvider HintProvider: &{kubelet.sock /var/lib/kubelet/device-plugins/ map[] {0 0} <nil> {{} [0 0 0]} 0x1b6baf0 0x799f318 0x1b6c4f0 map[] ma
Nov 10 07:03:46 k8smaster kubelet[16847]: I1110 07:03:46.085315 16847 state_mem.go:36] [cpumanager] initializing new in-memory state store
Nov 10 07:03:46 k8smaster kubelet[16847]: I1110 07:03:46.085389 16847 state_mem.go:84] [cpumanager] updated default cpuset: ""
Nov 10 07:03:46 k8smaster kubelet[16847]: I1110 07:03:46.085398 16847 state_mem.go:92] [cpumanager] updated cpuset assignments: "map[]"
Nov 10 07:03:46 k8smaster kubelet[16847]: I1110 07:03:46.085406 16847 fake_topology_manager.go:39] [fake topologymanager] AddHintProvider HintProvider: &{{0 0} 0x799f318 10000000000 0xc000051860 <nil> <nil> <nil> <nil> map[memory:{{104857600 0} {<nil>} BinarySI}]}
Nov 10 07:03:46 k8smaster kubelet[16847]: I1110 07:03:46.085452 16847 kubelet.go:287] Adding pod path: /etc/kubernetes/manifests
Nov 10 07:03:46 k8smaster kubelet[16847]: I1110 07:03:46.085471 16847 kubelet.go:312] Watching apiserver
Nov 10 07:03:46 k8smaster kubelet[16847]: E1110 07:03:46.088435 16847 reflector.go:123] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:46: Failed to list *v1.Pod: Get https://192.168.100.10:6443/api/v1/pods?fieldSelector=spec.nodeName%3Dk8smaster&limit=500&resourceVe
Nov 10 07:03:46 k8smaster kubelet[16847]: E1110 07:03:46.088584 16847 reflector.go:123] k8s.io/kubernetes/pkg/kubelet/kubelet.go:450: Failed to list *v1.Service: Get https://192.168.100.10:6443/api/v1/services?limit=500&resourceVersion=0: dial tcp 192.168.100.10:6443: c
Nov 10 07:03:46 k8smaster kubelet[16847]: E1110 07:03:46.088712 16847 reflector.go:123] k8s.io/kubernetes/pkg/kubelet/kubelet.go:459: Failed to list *v1.Node: Get https://192.168.100.10:6443/api/v1/nodes?fieldSelector=metadata.name%3Dk8smaster&limit=500&resourceVersion=
Nov 10 07:03:46 k8smaster kubelet[16847]: I1110 07:03:46.091834 16847 client.go:75] Connecting to docker on unix:///var/run/docker.sock
Nov 10 07:03:46 k8smaster kubelet[16847]: I1110 07:03:46.091850 16847 client.go:104] Start docker client with request timeout=2m0s
Nov 10 07:03:46 k8smaster kubelet[16847]: W1110 07:03:46.099637 16847 docker_service.go:563] Hairpin mode set to "promiscuous-bridge" but kubenet is not enabled, falling back to "hairpin-veth"
Nov 10 07:03:46 k8smaster kubelet[16847]: I1110 07:03:46.099659 16847 docker_service.go:240] Hairpin mode set to "hairpin-veth"
Nov 10 07:03:46 k8smaster kubelet[16847]: I1110 07:03:46.110517 16847 docker_service.go:255] Docker cri networking managed by cni
Nov 10 07:03:46 k8smaster kubelet[16847]: I1110 07:03:46.117302 16847 docker_service.go:260] Docker Info: &{ID:PL4E:TW6I:WRG2:VGKQ:STTZ:US6O:7CMH:DY7B:OI5W:QQSB:GQGP:7PAQ Containers:19 ContainersRunning:0 ContainersPaused:0 ContainersStopped:19 Images:32 Driver:overlay2
Nov 10 07:03:46 k8smaster kubelet[16847]: F1110 07:03:46.117368 16847 server.go:271] failed to run Kubelet: failed to create kubelet: misconfiguration: kubelet cgroup driver: "$DOCKER_CGROUPS" is different from docker cgroup driver: "systemd"
Nov 10 07:03:46 k8smaster systemd[1]: kubelet.service: main process exited, code=exited, status=255/n/a
Nov 10 07:03:46 k8smaster systemd[1]: Unit kubelet.service entered failed state.
Nov 10 07:03:46 k8smaster systemd[1]: kubelet.service failed.
Locating the error
server.go:271] failed to run Kubelet: failed to create kubelet: misconfiguration: kubelet cgroup driver: "$DOCKER_CGROUPS" is different from docker cgroup driver: "systemd"
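The fatal line can be picked out of a long log mechanically. Kubelet log lines start with a severity letter (I, W, E or F) followed by the month and day, so filtering on `F` isolates the line that ended the process. The snippet below is a minimal sketch that runs against two lines copied from the log above; the `journalctl` command in the comment is an assumption about how the log was collected on the host.

```shell
#!/bin/sh
# Two sample lines copied from the log above (one info, one fatal).
log='Nov 10 07:03:46 k8smaster kubelet[16847]: I1110 07:03:46.085471 16847 kubelet.go:312] Watching apiserver
Nov 10 07:03:46 k8smaster kubelet[16847]: F1110 07:03:46.117368 16847 server.go:271] failed to run Kubelet: failed to create kubelet: misconfiguration: kubelet cgroup driver: "$DOCKER_CGROUPS" is different from docker cgroup driver: "systemd"'

# Keep only fatal entries: "kubelet[PID]: F" followed by the 4-digit MMDD stamp.
fatal=$(printf '%s\n' "$log" | grep -E 'kubelet\[[0-9]+\]: F[0-9]{4} ')
printf '%s\n' "$fatal"

# On a real host (assumed), the same filter could be fed from the journal:
#   journalctl -u kubelet --no-pager | grep -E 'kubelet\[[0-9]+\]: F[0-9]{4} '
```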
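Analysis
- Getting DOCKER_CGROUPS
# Normally DOCKER_CGROUPS resolves to either cgroupfs or systemd
# Here, however, the command returns "Driver:" instead of a driver name, which is what causes the failure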
[root@k8smaster systemd]# docker info | grep 'Cgroup' | cut -d' ' -f3
Driver:
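A likely explanation for the `Driver:` result is that the `docker info` line is indented by one space, which shifts every space-delimited field by one. The sketch below reproduces this with a sample line; the exact text of the line is an assumption (it is not shown in the post), chosen to be consistent with `-f3` returning `Driver:` and `-f4` returning the driver name. Splitting on `": "` with `awk` does not depend on the indentation.

```shell
#!/bin/sh
# Assumed sample of the relevant `docker info` line, with one leading space.
line=' Cgroup Driver: cgroupfs'

# With a leading space, field 1 is empty, so the fields shift by one.
f3=$(printf '%s\n' "$line" | cut -d' ' -f3)
f4=$(printf '%s\n' "$line" | cut -d' ' -f4)

# Splitting on ": " is independent of how far the line is indented.
by_awk=$(printf '%s\n' "$line" | awk -F': ' '/Cgroup Driver/ {print $2}')

echo "cut -f3: $f3"
echo "cut -f4: $f4"
echo "awk:     $by_awk"

# On a real host, Docker can also print the field directly:
#   docker info --format '{{.CgroupDriver}}'
```

Either the `awk` form or the `--format` form avoids counting fields by position, so the result does not change if the indentation of `docker info` changes again.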
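Solution
# Set the kubelet cgroup driver directly instead of relying on DOCKER_CGROUPS
vim /etc/sysconfig/kubelet
# Do not derive DOCKER_CGROUPS dynamically here; hard-code cgroupfs, the value returned by docker info | grep 'Cgroup' | cut -d' ' -f4
# The value must match the driver Docker actually reports (the fatal log line above shows "systemd" on the Docker side)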
KUBELET_EXTRA_ARGS="--cgroup-driver=cgroupfs --pod-infra-container-image=registry.cn-hangzhou.aliyuncs.com/google_containers/pause-amd64:3.1"
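Before restarting kubelet, it may be worth confirming that the configured driver and Docker's driver agree, since a mismatch produces the same fatal error. The following is a minimal sketch: `kubelet_driver` is a hypothetical helper, the args string is copied from the line above, and `docker_driver` is a hard-coded sample value standing in for what Docker reports on the host.

```shell
#!/bin/sh
# Hypothetical helper: extract the --cgroup-driver value from a kubelet args string.
kubelet_driver() {
    printf '%s\n' "$1" | sed -n 's/.*--cgroup-driver=\([^ "]*\).*/\1/p'
}

# Args string as set in /etc/sysconfig/kubelet above.
args='--cgroup-driver=cgroupfs --pod-infra-container-image=registry.cn-hangzhou.aliyuncs.com/google_containers/pause-amd64:3.1'
configured=$(kubelet_driver "$args")

# Assumed sample value; on a real host this would come from:
#   docker info --format '{{.CgroupDriver}}'
docker_driver='cgroupfs'

if [ "$configured" = "$docker_driver" ]; then
    result='match'
else
    result='mismatch'
fi
echo "kubelet=$configured docker=$docker_driver -> $result"

# If the drivers match, restart the service (requires systemd on the host):
#   systemctl daemon-reload && systemctl restart kubelet
```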