引言
大概是在2016/10前后,我们部门使用docker一段时间后偶尔会出现docker exec ... 无法进入容器的问题,环境为centos7.2、docker1.12.6,docker存储引擎为devicemapper,经过排查发现容器对应的文件系统已经umount,且发现开发同学用了大量的docker restart ... 操作。于是产生docker restart导致容器文件系统umount的疑问,后面对docker restart、start、stop三个命令与容器文件系统关系做了研究,以下是研究的记录。
问题描述
通过docker run启动一个容器后,docker会同时挂载该容器的内存文件系统与容器的根文件系统(rootfs),比如
/dev/dm-1 10G 147M 9.9G 2% /var/lib/docker/devicemapper/mnt/9bf11bfa99d87e874df9cff2720fdfe03ffe9e9fbb95fabdbf1386f844cfc077
shm 64M 0 64M 0% /var/lib/docker/containers/e536db1a4695ca6c8c41523e736fc92f39840839d5628b3f722fedc1db795436/shm
若容器的根文件系统(rootfs)umount,执行 docker exec -it xxx /bin/bash or /bin/sh会触发异常:
rpc error: code = 13 desc = invalid header field value "oci runtime error: exec failed: container_linux.go:247: starting container process caused \"process_linux.go:75: starting setns process caused \\\"fork/exec /proc/self/exe: no such file or directory\\\"\"\n"
同时执行 docker restart xxx会触发异常:
Error response from daemon: Cannot restart container anyrobot-etl: invalid header field value "oci runtime error: rootfs (\"/var/lib/docker/devicemapper/mnt/e512b093e57e257ed7bf5cc451c24e08e7ae4823e46657af1e160c2a029357e2/rootfs\") does not exist\n"
查看containerRestart函数执行的流程
containerRestart函数中,首先会执行daemon.Mount()函数,进入 daemon.Mount()函数 此时容器的根文件系统(rootfs)并没有正常挂载,所以这里container.RWLayer.Mount(container.GetMountLabel())会出现错误err,即 err != nil ,因此Mount()函数返回 err,并终止往下进行。返回到containerRestart()函数这里,err = daemon.Mount(container),err != nil 满足条件 ,因此 defer daemon.Unmount(container)最后不会被执行。接着往下走,当前的容器是处于运行状态的,因此会进入 if container.IsRunning()的判断,判断域内一直往下执行,执行中会调用 daemon.containerStop(container, seconds)函数,停止该容器。最后 if err != nil 满足条件,则返回 return err,containerRestart函数执行结束。 回到ContainerRestart()函数中,daemon.containerRestart()函数返回的err满足条件 err != nil ,所以这里会有 "Cannot restart container xxx"异常信息报出。查看containerStart函数执行的流程
containerRestart() 函数中,调用了daemon的 conditionalMountOnStart()函数,进入该函数 函数内部调用了daemon的Mount()函数,由此可知,containerStart()函数是显示的挂载容器的根文件系统(rootfs)与容器的。查看containerStop函数的源码
研究containerStop函数,并没有发现哪里调用了Umount卸载文件系统的函数。当容器停止后,容器与容器的根文件系统(rootfs)卸载的情况,这里猜测可能的原因是文件系统是随着容器的进程被停止而卸载的。查看docker debug日志信息
分别查看docker restart、start、stop三个命令的debug信息,这里的实践环境为:centos7.2、docker1.12.6、存储引擎(storage-driver):devicemapper、镜像:nginx:1.12
- 创建nginx容器
docker run -itd --name nginx nginx:1.12
nginx容器对应的根文件系统与内存文件系统mount信息分别为:
/dev/dm-1 10G 147M 9.9G 2% /var/lib/docker/devicemapper/mnt/86f8de453061f09db6eebebacb4a439bfa2f5afa3d202b98f55887c13bdfa0da
shm 64M 0 64M 0% /var/lib/docker/containers/6d3286762568ae0535908de12382f2789a77b8e4faa1fd76a30e981f00f4d34f/shm
nginx容器对应的块设备为dm-1
- docker stop nginx,系统日志输出:
Nov 2 16:09:14 node dockerd: time="2016-11-02T16:09:14.846401331+08:00" level=debug msg="Calling POST /v1.24/containers/nginx/stop?t=10"
Nov 2 16:09:14 node dockerd: time="2016-11-02T16:09:14.846470393+08:00" level=debug msg="Sending 15 to 6d3286762568ae0535908de12382f2789a77b8e4faa1fd76a30e981f00f4d34f"
Nov 2 16:09:14 node dockerd: time="2016-11-02T16:09:14.853533994+08:00" level=debug msg="containerd: process exited" id=6d3286762568ae0535908de12382f2789a77b8e4faa1fd76a30e981f00f4d34f pid=init status=0 systemPid=121294
Nov 2 16:09:14 node dockerd: time="2016-11-02T16:09:14.861900485+08:00" level=debug msg="libcontainerd: received containerd event: &types.Event{Type:\"exit\", Id:\"6d3286762568ae0535908de12382f2789a77b8e4faa1fd76a30e981f00f4d34f\", Status:0x0, Pid:\"init\", Timestamp:(*timestamp.Timestamp)(0xc820ef9d30)}"
Nov 2 16:09:14 node dockerd: time="2016-11-02T16:09:14.862344401+08:00" level=debug msg="Revoking external connectivity on endpoint nginx (7b98d117889d1c01a21a7ca674e64a785b792bec135f66c6fb365168b193b3e6)"
Nov 2 16:09:14 node dockerd: time="2016-11-02T16:09:14.862964952+08:00" level=warning msg="failed to close stdin: rpc error: code = 2 desc = containerd: container not found"
Nov 2 16:09:14 node kernel: docker0: port 1(vethe871091) entered disabled state
Nov 2 16:09:14 node kernel: docker0: port 1(vethe871091) entered disabled state
Nov 2 16:09:14 node kernel: device vethe871091 left promiscuous mode
Nov 2 16:09:14 node kernel: docker0: port 1(vethe871091) entered disabled state
Nov 2 16:09:14 node dockerd: time="2016-11-02T16:09:14.885060892+08:00" level=debug msg="Releasing addresses for endpoint nginx's interface on network bridge"
Nov 2 16:09:14 node dockerd: time="2016-11-02T16:09:14.885104751+08:00" level=debug msg="ReleaseAddress(LocalDefault/172.17.0.0/16, 172.17.0.2)"
Nov 2 16:09:14 node dockerd: time="2016-11-02T16:09:14.888964154+08:00" level=debug msg="devmapper: UnmountDevice(hash=86f8de453061f09db6eebebacb4a439bfa2f5afa3d202b98f55887c13bdfa0da)"
Nov 2 16:09:14 node dockerd: time="2016-11-02T16:09:14.888997975+08:00" level=debug msg="devmapper: Unmount(/var/lib/docker/devicemapper/mnt/86f8de453061f09db6eebebacb4a439bfa2f5afa3d202b98f55887c13bdfa0da)"
Nov 2 16:09:14 node kernel: XFS (dm-1): Unmounting Filesystem
Nov 2 16:09:14 node dockerd: time="2016-11-02T16:09:14.911053057+08:00" level=debug msg="devmapper: Unmount done"
Nov 2 16:09:14 node dockerd: time="2016-11-02T16:09:14.911106735+08:00" level=debug msg="devmapper: deactivateDevice(86f8de453061f09db6eebebacb4a439bfa2f5afa3d202b98f55887c13bdfa0da)"
Nov 2 16:09:14 node dockerd: time="2016-11-02T16:09:14.911192040+08:00" level=debug msg="devmapper: removeDevice START(docker-202:1-1589930-86f8de453061f09db6eebebacb4a439bfa2f5afa3d202b98f55887c13bdfa0da)"
Nov 2 16:09:14 node dockerd: time="2016-11-02T16:09:14.919197718+08:00" level=debug msg="devmapper: removeDevice END(docker-202:1-1589930-86f8de453061f09db6eebebacb4a439bfa2f5afa3d202b98f55887c13bdfa0da)"
Nov 2 16:09:14 node dockerd: time="2016-11-02T16:09:14.919246950+08:00" level=debug msg="devmapper: deactivateDevice END(86f8de453061f09db6eebebacb4a439bfa2f5afa3d202b98f55887c13bdfa0da)"
Nov 2 16:09:14 node dockerd: time="2016-11-02T16:09:14.919259532+08:00" level=debug msg="devmapper: UnmountDevice(hash=86f8de453061f09db6eebebacb4a439bfa2f5afa3d202b98f55887c13bdfa0da) END"
Nov 2 16:09:17 node ntpd[473]: Deleting interface #42 vethe871091, fe80::3cb6:deff:fe0e:9bda#123, interface stats: received=0, sent=0, dropped=0, active_time=220 secs
有文件系统umount的操作信息
- docker start nginx,系统日志输出:
Nov 2 16:12:18 node dockerd: time="2016-11-02T16:12:18.559834721+08:00" level=debug msg="Calling POST /v1.24/containers/nginx/start"
Nov 2 16:12:18 node dockerd: time="2016-11-02T16:12:18.560447373+08:00" level=debug msg="devmapper: activateDeviceIfNeeded(86f8de453061f09db6eebebacb4a439bfa2f5afa3d202b98f55887c13bdfa0da)"
Nov 2 16:12:18 node kernel: XFS (dm-1): Mounting V5 Filesystem
Nov 2 16:12:18 node kernel: XFS (dm-1): Ending clean mount
Nov 2 16:12:18 node dockerd: time="2016-11-02T16:12:18.590744411+08:00" level=debug msg="container mounted via layerStore: /var/lib/docker/devicemapper/mnt/86f8de453061f09db6eebebacb4a439bfa2f5afa3d202b98f55887c13bdfa0da/rootfs"
Nov 2 16:12:18 node dockerd: time="2016-11-02T16:12:18.590914625+08:00" level=debug msg="Assigning addresses for endpoint nginx's interface on network bridge"
Nov 2 16:12:18 node dockerd: time="2016-11-02T16:12:18.590938496+08:00" level=debug msg="RequestAddress(LocalDefault/172.17.0.0/16, <nil>, map[])"
Nov 2 16:12:18 node kernel: device veth1a44a92 entered promiscuous mode
Nov 2 16:12:18 node kernel: IPv6: ADDRCONF(NETDEV_UP): veth1a44a92: link is not ready
Nov 2 16:12:18 node kernel: docker0: port 1(veth1a44a92) entered forwarding state
Nov 2 16:12:18 node kernel: docker0: port 1(veth1a44a92) entered forwarding state
Nov 2 16:12:18 node kernel: docker0: port 1(veth1a44a92) entered disabled state
Nov 2 16:12:18 node dockerd: time="2016-11-02T16:12:18.601330774+08:00" level=debug msg="Assigning addresses for endpoint nginx's interface on network bridge"
Nov 2 16:12:18 node dockerd: time="2016-11-02T16:12:18.608483995+08:00" level=debug msg="Programming external connectivity on endpoint nginx (d5ceb98ca54865a44156310cf4e3f7f3df2c5a74646598552cbd4f78e8ff2b4e)"
Nov 2 16:12:18 node kernel: IPv6: ADDRCONF(NETDEV_CHANGE): veth1a44a92: link becomes ready
Nov 2 16:12:18 node kernel: docker0: port 1(veth1a44a92) entered forwarding state
Nov 2 16:12:18 node kernel: docker0: port 1(veth1a44a92) entered forwarding state
Nov 2 16:12:18 node dockerd: time="2016-11-02T16:12:18.669534147+08:00" level=debug msg="sandbox set key processing took 7.06337ms for container 6d3286762568ae0535908de12382f2789a77b8e4faa1fd76a30e981f00f4d34f"
Nov 2 16:12:18 node dockerd: time="2016-11-02T16:12:18.726420146+08:00" level=debug msg="libcontainerd: received containerd event: &types.Event{Type:\"start-container\", Id:\"6d3286762568ae0535908de12382f2789a77b8e4faa1fd76a30e981f00f4d34f\", Status:0x0, Pid:\"\", Timestamp:(*timestamp.Timestamp)(0xc820f34350)}"
Nov 2 16:12:18 node dockerd: time="2016-11-02T16:12:18.729815704+08:00" level=debug msg="libcontainerd: event unhandled: type:\"start-container\" id:\"6d3286762568ae0535908de12382f2789a77b8e4faa1fd76a30e981f00f4d34f\" timestamp:<seconds:1546416738 nanos:724341305 > "
Nov 2 16:12:21 node ntpd[473]: Listen normally on 43 veth1a44a92 fe80::c8c8:65ff:fe20:7b01 UDP 123
Nov 2 16:12:33 node kernel: docker0: port 1(veth1a44a92) entered forwarding state
有文件系统的mount操作信息
- docker restart nginx,系统日志输出:
Nov 2 16:13:50 node dockerd: time="2016-11-02T16:13:50.463384463+08:00" level=debug msg="Calling POST /v1.24/containers/nginx/restart?t=10"
Nov 2 16:13:50 node dockerd: time="2016-11-02T16:13:50.463476842+08:00" level=debug msg="container mounted via layerStore: /var/lib/docker/devicemapper/mnt/86f8de453061f09db6eebebacb4a439bfa2f5afa3d202b98f55887c13bdfa0da/rootfs"
Nov 2 16:13:50 node dockerd: time="2016-11-02T16:13:50.463550127+08:00" level=debug msg="Sending 15 to 6d3286762568ae0535908de12382f2789a77b8e4faa1fd76a30e981f00f4d34f"
Nov 2 16:13:50 node dockerd: time="2016-11-02T16:13:50.470408535+08:00" level=debug msg="containerd: process exited" id=6d3286762568ae0535908de12382f2789a77b8e4faa1fd76a30e981f00f4d34f pid=init status=0 systemPid=122202
Nov 2 16:13:50 node dockerd: time="2016-11-02T16:13:50.479775109+08:00" level=debug msg="libcontainerd: received containerd event: &types.Event{Type:\"exit\", Id:\"6d3286762568ae0535908de12382f2789a77b8e4faa1fd76a30e981f00f4d34f\", Status:0x0, Pid:\"init\", Timestamp:(*timestamp.Timestamp)(0xc820f34d60)}"
Nov 2 16:13:50 node dockerd: time="2016-11-02T16:13:50.480243393+08:00" level=debug msg="Revoking external connectivity on endpoint nginx (d5ceb98ca54865a44156310cf4e3f7f3df2c5a74646598552cbd4f78e8ff2b4e)"
Nov 2 16:13:50 node dockerd: time="2016-11-02T16:13:50.480852093+08:00" level=warning msg="failed to close stdin: rpc error: code = 2 desc = containerd: container not found"
Nov 2 16:13:50 node kernel: docker0: port 1(veth1a44a92) entered disabled state
Nov 2 16:13:50 node kernel: docker0: port 1(veth1a44a92) entered disabled state
Nov 2 16:13:50 node kernel: device veth1a44a92 left promiscuous mode
Nov 2 16:13:50 node kernel: docker0: port 1(veth1a44a92) entered disabled state
Nov 2 16:13:50 node dockerd: time="2016-11-02T16:13:50.499758603+08:00" level=debug msg="Releasing addresses for endpoint nginx's interface on network bridge"
Nov 2 16:13:50 node dockerd: time="2016-11-02T16:13:50.499800548+08:00" level=debug msg="ReleaseAddress(LocalDefault/172.17.0.0/16, 172.17.0.2)"
Nov 2 16:13:50 node kernel: device veth4fb623e entered promiscuous mode
Nov 2 16:13:50 node kernel: IPv6: ADDRCONF(NETDEV_UP): veth4fb623e: link is not ready
Nov 2 16:13:50 node kernel: docker0: port 1(veth4fb623e) entered forwarding state
Nov 2 16:13:50 node kernel: docker0: port 1(veth4fb623e) entered forwarding state
Nov 2 16:13:50 node dockerd: time="2016-11-02T16:13:50.508096957+08:00" level=debug msg="container mounted via layerStore: /var/lib/docker/devicemapper/mnt/86f8de453061f09db6eebebacb4a439bfa2f5afa3d202b98f55887c13bdfa0da/rootfs"
Nov 2 16:13:50 node dockerd: time="2016-11-02T16:13:50.508248137+08:00" level=debug msg="Assigning addresses for endpoint nginx's interface on network bridge"
Nov 2 16:13:50 node dockerd: time="2016-11-02T16:13:50.508274138+08:00" level=debug msg="RequestAddress(LocalDefault/172.17.0.0/16, <nil>, map[])"
Nov 2 16:13:50 node dockerd: time="2016-11-02T16:13:50.518255720+08:00" level=debug msg="Assigning addresses for endpoint nginx's interface on network bridge"
Nov 2 16:13:50 node dockerd: time="2016-11-02T16:13:50.525383372+08:00" level=debug msg="Programming external connectivity on endpoint nginx (fb933f68c93990950cdec7d41a001568e5342a7d935f17f48fd07b077d9a91ad)"
Nov 2 16:13:50 node kernel: IPv6: ADDRCONF(NETDEV_CHANGE): veth4fb623e: link becomes ready
Nov 2 16:13:50 node dockerd: time="2016-11-02T16:13:50.585250711+08:00" level=debug msg="sandbox set key processing took 5.83763ms for container 6d3286762568ae0535908de12382f2789a77b8e4faa1fd76a30e981f00f4d34f"
Nov 2 16:13:50 node dockerd: time="2016-11-02T16:13:50.635658245+08:00" level=debug msg="libcontainerd: received containerd event: &types.Event{Type:\"start-container\", Id:\"6d3286762568ae0535908de12382f2789a77b8e4faa1fd76a30e981f00f4d34f\", Status:0x0, Pid:\"\", Timestamp:(*timestamp.Timestamp)(0xc820eef780)}"
Nov 2 16:13:50 node dockerd: time="2016-11-02T16:13:50.638449684+08:00" level=debug msg="libcontainerd: event unhandled: type:\"start-container\" id:\"6d3286762568ae0535908de12382f2789a77b8e4faa1fd76a30e981f00f4d34f\" timestamp:<seconds:1546416830 nanos:634697526 > "
Nov 2 16:13:53 node ntpd[473]: Listen normally on 44 veth4fb623e fe80::982c:ecff:fe44:47f2 UDP 123
Nov 2 16:13:53 node ntpd[473]: Deleting interface #43 veth1a44a92, fe80::c8c8:65ff:fe20:7b01#123, interface stats: received=0, sent=0, dropped=0, active_time=92 secs
并没有文件系统的umount与mount的操作信息。
通过上面的日志输出可以了解到
- docker start ... 包含容器文件系统挂载的操作
- docker stop ... 包含容器文件系统卸载的操作
- docker restart ... 不包含容器文件系统的卸载与挂载操作,本质上docker restart不涉及文件系统的操作,因此restart命令并不是stop与start两个命令的顺序叠加。
结论
分析发现,docker restart命令并不是stop、start两个命令的顺序叠加,docker restart操作并不涉及容器文件系统的处理,开始怀疑的由于docker restart操作导致容器的文件系统处于umount状态此处没有找到证据,但为了保证容器的根文件系统与内存系统mount的正确性,推荐对一个容器的重启使用docker stop xxx 然后 docker start xxx,而非docker restart xxx。
网友评论