systemd stops reading and proces

Author: yangqing | Published: 2024-02-28 16:37

Environment

  • CentOS 7
  • OpenShift Container Platform 3.x
  • systemd

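Issue

  • Pods started on the node remain in the ContainerCreating state until they are manually deleted or the host is rebooted.
  • Pods remain in the Terminating state until the OpenShift node is manually restarted.
  • Deployments run into many problems because most deploy pods have been in the ContainerCreating state for more than a day.
  • Starting a container with docker fails with the error /usr/bin/docker-current: Error response from daemon: containerd: container did not start before the specified timeout
  • The journal contains a large number of the following messages, and no new containers can be started with docker on this particular node.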
Jun 24 10:10:26 node123 crond[111309]: pam_systemd(crond:session): Failed to create session: Connection timed out
Jun 24 10:10:26 node123 systemd-logind[10714]: Failed to start user slice user-0.slice, ignoring: Connection timed out ((null))
Jun 24 10:10:51 node123 systemd-logind[10714]: Failed to start session scope session-13692.scope: Connection timed out

  • systemd reports an unusual error message, after which various DBus-related operations fail:
Jun 04 14:03:35 node123 systemd[1]: Failed to propagate agent release message: Operation not supported

  • Containers are stuck in the ContainerCreating state, and the following message appears in the journal:
Jan 01 01:01:01 hostname atomic-openshift-node: I0530 01:01:01.145075   91428 server.go:470] type: 'Warning' reason: 'FailedCreatePodSandBox' Failed create pod sandbox: rpc error: code = Unknown desc = failed to start sandbox container for pod "pod-name": Error response from daemon: oci runtime error: The maximum number of active connections for UID 0 has been reached

Resolution

Update the systemd package to one of the fixed builds: systemd-219-62.el7_6.9 or systemd-219-67.
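
As a rough aid for checking a node against the two builds named above, the following sketch compares an installed release string with them. It rests on assumptions that do not come from the article: that the package version is 219, that any build numbered 67 or higher carries the fix, and that within build 62 only z-stream releases at or after el7_6.9 do. Builds 63 through 66 are treated as unfixed, which may be too strict. The function names are hypothetical.

```python
import re


def parse_release(release):
    """Split an RPM release such as '62.el7_6.9' into (build, z-stream tuple)."""
    match = re.match(r"(\d+)\.el7(?:_(\d+)\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognised release: {release!r}")
    build = int(match.group(1))
    # '67.el7' has no z-stream suffix, so the tuple is empty in that case.
    zstream = tuple(int(g) for g in match.group(2, 3) if g is not None)
    return build, zstream


def has_fix(release):
    """Guess whether a systemd-219 release contains the fix (assumption-based)."""
    build, zstream = parse_release(release)
    if build >= 67:
        return True
    if build == 62:
        # Assumed: the fix arrived in the 62.el7_6.9 z-stream build.
        return zstream >= (6, 9)
    return False
```

The release string can be obtained on the node with rpm -q --queryformat '%{RELEASE}\n' systemd and passed to has_fix. The package changelog remains the authoritative source for whether a given build contains the fix.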

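Root Cause

When a large number of units are created and deleted within a given period of time, sd_bus->cookie overflows. From that point on the org.freedesktop.systemd1 D-Bus service stops responding entirely, because systemd can no longer seal messages of type dbus1.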
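
The failure mode can be pictured with a small model. This is a simplified sketch, not systemd's code. It assumes that the limit involved is the unsigned 32-bit message serial defined by the D-Bus wire format, and that sealing is refused once the connection's cookie counter no longer fits in that range, which would be consistent with the "Operation not supported" text in the log above. FakeBus, seal_dbus1 and messages_until_stuck are hypothetical names.

```python
import errno

# D-Bus wire-format message serials are unsigned 32-bit values.
DBUS1_MAX_SERIAL = 0xFFFFFFFF


class FakeBus:
    """Hypothetical stand-in for a bus connection with a growing cookie counter."""

    def __init__(self, cookie=0):
        self.cookie = cookie

    def seal_dbus1(self):
        """Assign the next cookie to a message, or fail if it does not fit."""
        next_cookie = self.cookie + 1
        if next_cookie > DBUS1_MAX_SERIAL:
            # Assumed behaviour: sealing is refused instead of wrapping around,
            # so every later message on this connection fails the same way.
            raise OSError(errno.EOPNOTSUPP, "Operation not supported")
        self.cookie = next_cookie
        return next_cookie


def messages_until_stuck(bus):
    """Number of messages that can still be sealed before all sends fail."""
    return DBUS1_MAX_SERIAL - bus.cookie
```

In this model the counter never resets while the process lives, which matches the observation that the node recovers only after a reboot.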

Diagnostic Steps

  • Check journalctl to confirm whether the following messages appeared when the problem started:
Jun 24 10:10:26 node123 crond[111309]: pam_systemd(crond:session): Failed to create session: Connection timed out
Jun 24 10:10:26 node123 systemd-logind[10714]: Failed to start user slice user-0.slice, ignoring: Connection timed out ((null))
Jun 24 10:10:51 node123 systemd-logind[10714]: Failed to start session scope session-13692.scope: Connection timed out

and/or

Jun 04 14:03:35 node123 systemd[1]: Failed to propagate agent release message: Operation not supported

and/or

Jan 01 01:01:01 hostname atomic-openshift-node: I0530 01:01:01.145075   91428 server.go:470] type: 'Warning' reason: 'FailedCreatePodSandBox' Failed create pod sandbox: rpc error: code = Unknown desc = failed to start sandbox container for pod "pod-name": Error response from daemon: oci runtime error: The maximum number of active connections for UID 0 has been reached
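
The check above can be automated with a short script. The sketch below counts journal lines that match the three message families shown in this article. The regular expressions are derived from the excerpts above and may need adjusting if the wording differs on another system. The signature names and the scan_journal function are hypothetical.

```python
import re

# Patterns derived from the journal excerpts quoted in this article.
SIGNATURES = {
    "logind_timeout": re.compile(
        r"Failed to (create session|start user slice|start session scope)"
        r".*Connection timed out"
    ),
    "agent_release": re.compile(
        r"Failed to propagate agent release message: Operation not supported"
    ),
    "max_connections": re.compile(
        r"The maximum number of active connections for UID \d+ has been reached"
    ),
}


def scan_journal(lines):
    """Count how many lines match each known signature."""
    counts = {name: 0 for name in SIGNATURES}
    for line in lines:
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[name] += 1
    return counts
```

One way to use it is to save the journal with journalctl --no-pager > journal.txt and then call scan_journal(open("journal.txt")). Non-zero counts only suggest that the node has hit this problem; they do not prove it.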


    Article link: https://www.haomeiwen.com/subject/uvkfzdtx.html