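Problem description
Today, creating a cache partition failed in the test environment. The log shows that ceph-disk zap /dev/sdx hung and was killed after it timed out. The relevant log entry: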
time=2020-02-27T10:08:25+08:00 level=warning module=utils/process.go:123 topic=kernel.external.process msg="Process was killed after 2m0.000139012s: /usr/sbin/ceph-disk [ceph-disk zap /dev/sdg]
out:
err: 1+0 records in
1+0 records out
4194304 bytes (4.2 MB) copied, 0.00448586 s, 935 MB/s
"
Analysis
Looking at the corresponding processes, several sgdisk processes are still around:
[root@sds2 ~]# ps -ef | grep zap
root 4085 1 0 11:10 ? 00:00:00 /usr/sbin/sgdisk --zap-all -- /dev/sdg
root 23181 1 0 10:06 ? 00:00:00 /usr/sbin/sgdisk --zap-all -- /dev/sdg
root 40867 1 0 Feb26 ? 00:00:00 /usr/sbin/sgdisk --zap-all -- /dev/sdg
root 41064 1 0 Feb26 ? 00:00:00 /usr/sbin/sgdisk --zap-all -- /dev/sdi
root 42785 1 0 Feb26 ? 00:00:00 /usr/sbin/sgdisk --zap-all -- /dev/sdg
root 48840 32585 0 16:24 pts/1 00:00:00 grep --color=auto zap
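The kernel stack of one of these processes shows that it is hung in call_rwsem_down_read_failed, reached from sys_sync via iterate_supers; in other words, it is waiting for a read-write semaphore while performing a sync. For background, see the article "读写信号量与实时进程阻塞挂死问题" (on read-write semaphores and real-time processes blocking and hanging).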
[root@sds2 ~]# cat /proc/4085/stack
[<ffffffff81331ad8>] call_rwsem_down_read_failed+0x18/0x30
[<ffffffff81204e8a>] iterate_supers+0xaa/0x120
[<ffffffff81233614>] sys_sync+0x44/0xb0
[<ffffffff816b4fc9>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff
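When several processes are hung, it can help to reduce each stack to its list of symbols so the blocking frame stands out. The following is a minimal sketch, not a definitive tool: the helper names stack_symbols and read_stack are hypothetical, the regular expression assumes the "[<address>] symbol+0xoff/0xsize" layout shown above, and reading /proc/<pid>/stack normally requires root.

```python
import re

# Matches lines such as "[<ffffffff81204e8a>] iterate_supers+0xaa/0x120".
# Lines without a symbol (for example the trailing 0xffffffffffffffff) are skipped.
FRAME_RE = re.compile(
    r"^\[<[0-9a-fA-F]+>\]\s+([A-Za-z_][\w.]*)\+0x[0-9a-fA-F]+/0x[0-9a-fA-F]+"
)


def stack_symbols(stack_text):
    """Return the symbol names in a /proc/<pid>/stack dump, innermost first."""
    symbols = []
    for line in stack_text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            symbols.append(match.group(1))
    return symbols


def read_stack(pid):
    """Read the kernel stack of a process (hypothetical helper, needs root)."""
    with open(f"/proc/{pid}/stack") as f:
        return f.read()


# The stack of PID 4085 exactly as shown above.
SAMPLE = """\
[<ffffffff81331ad8>] call_rwsem_down_read_failed+0x18/0x30
[<ffffffff81204e8a>] iterate_supers+0xaa/0x120
[<ffffffff81233614>] sys_sync+0x44/0xb0
[<ffffffff816b4fc9>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff
"""

if __name__ == "__main__":
    frames = stack_symbols(SAMPLE)
    print("blocked in:", frames[0])
    print("call chain:", " <- ".join(frames))
```

The first symbol is the frame the process is sleeping in; the remaining ones show how it got there, here a sync system call walking the mounted superblocks.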
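Next, top shows the process in state D, which stands for uninterruptible sleep. Linux processes have two sleep states. The first is interruptible sleep: a process in this state can be woken by sending it a signal. For example, sending HUP to the nginx master process makes nginx reload its configuration file without restarting. The second is uninterruptible sleep: a process in this state does not respond to any signal, so it cannot be killed with kill, whether plain "kill", "kill -9" or "kill -15". A signal sent to it simply stays pending until the process leaves the D state.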
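Why does a process end up in uninterruptible sleep? Usually because it is waiting for I/O: disk I/O, network I/O, or I/O on some other peripheral. If the I/O it is waiting for gets no response for a long time, the process becomes visible in ps output in state D. That usually means something is wrong on the I/O path: the device itself may have failed, or a mounted remote file system may have become unreachable.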
[root@sds2 ~]# top -p 4085
top - 16:27:32 up 16 days, 20:22, 3 users, load average: 7.24, 7.25, 7.26
Tasks: 1 total, 0 running, 1 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.2 us, 0.1 sy, 0.0 ni, 99.4 id, 0.3 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 65758080 total, 37593416 free, 5325808 used, 22838856 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 53136852 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
4085 root 20 0 53296 2112 1736 D 0.0 0.0 0:00.08 sgdisk
(ENV) [root@ceph-2 ~]# ps -axf | grep etcd
7123 pts/1 S+ 0:00 \_ grep --color=auto etcd
17158 ? Ssl 462:16 /opt/sds/bin/etcd --config-file /opt/sds/etcd/etcd.conf
17227 ? Ssl 97:00 /opt/sds/bin/etcd --config-file /opt/sds/etcd/etcd-proxy.conf
The following is taken from the ps manual page.
- This ps works by reading the virtual files in /proc.
- Processes marked <defunct> are dead processes (so-called "zombies") that remain because their parent has not destroyed them properly. These processes will be destroyed by init(8) if the parent process exits.
PROCESS STATE CODES
Here are the different values that the s, stat and state output specifiers (header "STAT" or "S") will display to
describe the state of a process:
D uninterruptible sleep (usually IO)
R running or runnable (on run queue)
S interruptible sleep (waiting for an event to complete)
T stopped by job control signal
t stopped by debugger during the tracing
W paging (not valid since the 2.6.xx kernel)
X dead (should never be seen)
Z defunct ("zombie") process, terminated but not reaped by its parent
For BSD formats and when the stat keyword is used, additional characters may be displayed:
< high-priority (not nice to other users)
N low-priority (nice to other users)
L has pages locked into memory (for real-time and custom IO)
s is a session leader
l is multi-threaded (using CLONE_THREAD, like NPTL pthreads do)
+ is in the foreground process group
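Since ps reads these state codes from /proc, the same information can be collected directly. The sketch below lists processes in a given state (D by default) by parsing /proc/<pid>/stat. It is only an illustration under the assumption of a Linux /proc layout; the function names parse_stat_line and processes_in_state are made up for this example.

```python
import os


def parse_stat_line(line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    left = line.index("(")
    right = line.rindex(")")
    pid = int(line[:left].strip())
    comm = line[left + 1:right]
    state = line[right + 1:].split()[0]
    return pid, comm, state


def processes_in_state(wanted="D", proc_root="/proc"):
    """Return a sorted list of (pid, comm) for processes in the wanted state."""
    found = []
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return found  # no /proc available on this system
    for entry in entries:
        if not entry.isdigit():
            continue
        stat_path = os.path.join(proc_root, entry, "stat")
        try:
            with open(stat_path, errors="replace") as f:
                pid, comm, state = parse_stat_line(f.read())
        except (OSError, ValueError, IndexError):
            continue  # the process exited while we were scanning
        if state == wanted:
            found.append((pid, comm))
    return sorted(found)


if __name__ == "__main__":
    for pid, comm in processes_in_state("D"):
        print(pid, comm)
```

On the host described here, a scan like this would be expected to report the stuck sgdisk processes, since top shows them in state D.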
As for the kill command mentioned earlier, kill -l lists the available signals.
[root@sds2 ~]# kill -l
1) SIGHUP 2) SIGINT 3) SIGQUIT 4) SIGILL 5) SIGTRAP
6) SIGABRT 7) SIGBUS 8) SIGFPE 9) SIGKILL 10) SIGUSR1
11) SIGSEGV 12) SIGUSR2 13) SIGPIPE 14) SIGALRM 15) SIGTERM
16) SIGSTKFLT 17) SIGCHLD 18) SIGCONT 19) SIGSTOP 20) SIGTSTP
21) SIGTTIN 22) SIGTTOU 23) SIGURG 24) SIGXCPU 25) SIGXFSZ
26) SIGVTALRM 27) SIGPROF 28) SIGWINCH 29) SIGIO 30) SIGPWR
31) SIGSYS 34) SIGRTMIN 35) SIGRTMIN+1 36) SIGRTMIN+2 37) SIGRTMIN+3
38) SIGRTMIN+4 39) SIGRTMIN+5 40) SIGRTMIN+6 41) SIGRTMIN+7 42) SIGRTMIN+8
43) SIGRTMIN+9 44) SIGRTMIN+10 45) SIGRTMIN+11 46) SIGRTMIN+12 47) SIGRTMIN+13
48) SIGRTMIN+14 49) SIGRTMIN+15 50) SIGRTMAX-14 51) SIGRTMAX-13 52) SIGRTMAX-12
53) SIGRTMAX-11 54) SIGRTMAX-10 55) SIGRTMAX-9 56) SIGRTMAX-8 57) SIGRTMAX-7
58) SIGRTMAX-6 59) SIGRTMAX-5 60) SIGRTMAX-4 61) SIGRTMAX-3 62) SIGRTMAX-2
63) SIGRTMAX-1 64) SIGRTMAX
Of the signals above, 18 (SIGCONT), 19 (SIGSTOP) and 20 (SIGTSTP) deserve a mention.
kill -SIGSTOP [pid]
kill -SIGCONT [pid]
On SIGSTOP:
When SIGSTOP is sent to a process, the usual behaviour is to pause that process in its current state. The process will only resume execution if it is sent the SIGCONT signal. SIGSTOP and SIGCONT are used for job control in the Unix shell, among other purposes. SIGSTOP cannot be caught or ignored.
On SIGCONT:
When SIGSTOP or SIGTSTP is sent to a process, the usual behaviour is to pause that process in its current state. The process will only resume execution if it is sent the SIGCONT signal. SIGSTOP and SIGCONT are used for job control in the Unix shell, among other purposes.
In short, SIGSTOP tells a process to "hold on" and cannot be caught or ignored, whereas SIGTSTP can be caught or ignored. SIGCONT tells a process to "pick up where you left off".
- A job running in the foreground can be stopped by typing the suspend character (Ctrl-Z). This sends the "terminal stop" signal (SIGTSTP) to the process group. By default, SIGTSTP causes processes receiving it to stop, and control is returned to the shell. However, a process can register a signal handler for or ignore SIGTSTP. A process can also be paused with the "stop" signal (SIGSTOP), which cannot be caught or ignored.
- A job running in the foreground can be interrupted by typing the interruption character (Ctrl-C). This sends the "interrupt" signal (SIGINT), which defaults to terminating the process, though it can be overridden.
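The stop and continue behaviour described above can be observed from a parent process. The following is a minimal sketch rather than a reference implementation: it assumes a POSIX system, the function names can_install_handler and stop_and_continue are hypothetical, and the child is just a Python interpreter that sleeps.

```python
import os
import signal
import subprocess
import sys


def can_install_handler(signum):
    """Return True if a handler can be registered for signum."""
    try:
        previous = signal.signal(signum, lambda sig, frame: None)
    except OSError:
        return False  # the kernel refuses, as it does for SIGSTOP and SIGKILL
    if previous is not None:
        signal.signal(signum, previous)  # restore the earlier disposition
    return True


def stop_and_continue():
    """Stop a child with SIGSTOP, resume it with SIGCONT, report both events."""
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(30)"])
    try:
        os.kill(child.pid, signal.SIGSTOP)
        # WUNTRACED makes waitpid return when the child stops.
        _, status = os.waitpid(child.pid, os.WUNTRACED)
        stopped = os.WIFSTOPPED(status) and os.WSTOPSIG(status) == signal.SIGSTOP

        os.kill(child.pid, signal.SIGCONT)
        # WCONTINUED makes waitpid return when the child resumes.
        _, status = os.waitpid(child.pid, os.WCONTINUED)
        continued = os.WIFCONTINUED(status)
        return stopped, continued
    finally:
        child.kill()
        child.wait()


if __name__ == "__main__":
    print("SIGTSTP handler allowed:", can_install_handler(signal.SIGTSTP))
    print("SIGSTOP handler allowed:", can_install_handler(signal.SIGSTOP))
    print("stopped, continued:", stop_and_continue())
```

Between the two kill calls the child would appear in ps with state T. Note that none of this helps with a process in state D: the signal is only acted on once the process leaves uninterruptible sleep.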
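Another point worth noting is kill -0 <pid>: no signal is sent and only error checking is performed, so it can be used to test whether a process ID or process group ID exists. I came across the same technique earlier in keepalived's startup path.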
Jan 8 12:14:36 ceph-2 Keepalived[9288]: Opening file '/opt/sds/keepalived/sds-keepalived-10.252.90.77-8/keepalived.conf'.
Jan 8 12:14:36 ceph-2 Keepalived[9288]: Remove a zombie pid file /opt/sds/keepalived/sds-keepalived-10.252.90.77-8/keepalived.pid
Jan 8 12:14:36 ceph-2 Keepalived[9288]: Remove a zombie pid file /opt/sds/keepalived/sds-keepalived-10.252.90.77-8/vrrp.pid
Jan 8 12:14:36 ceph-2 Keepalived[9289]: Starting VRRP child process, pid=9290
Jan 8 12:14:36 ceph-2 Keepalived_vrrp[9290]: Registering Kernel netlink reflector
Jan 8 12:14:36 ceph-2 Keepalived_vrrp[9290]: Registering Kernel netlink command channel
Jan 8 12:14:36 ceph-2 Keepalived_vrrp[9290]: Registering gratuitous ARP shared channel
Jan 8 12:14:36 ceph-2 Keepalived_vrrp[9290]: Opening file '/opt/sds/keepalived/sds-keepalived-10.252.90.77-8/keepalived.conf'.
Jan 8 12:14:36 ceph-2 Keepalived_vrrp[9290]: WARNING - default user 'keepalived_script' for script execution does not exist - please create.
Jan 8 12:14:36 ceph-2 Keepalived_vrrp[9290]: (sds-keepalived-10.252.90.77-8): Cannot start in MASTER state if not address owner
Jan 8 12:14:36 ceph-2 Keepalived_vrrp[9290]: (sds-keepalived-10.252.90.77-8): Unable to set no_accept mode since iptables chain name unset
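The log shows that keepalived still starts normally after some process ID has been written into its pid files: it reports the pid files as zombie pid files, removes them and carries on. The source code shows that the pid files are checked at startup.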
		/* Check if keepalived is already running */
		if (keepalived_running(daemon_mode)) {
			log_message(LOG_INFO, "daemon is already running");
			report_stopped = false;
			goto end;
		}
	}
/* Return parent process daemon state */
bool
keepalived_running(unsigned long mode)
{
	if (process_running(main_pidfile))
		return true;
#ifdef _WITH_VRRP_
	if (__test_bit(DAEMON_VRRP, &mode) && process_running(vrrp_pidfile))
		return true;
#endif
#ifdef _WITH_LVS_
	if (__test_bit(DAEMON_CHECKERS, &mode) && process_running(checkers_pidfile))
		return true;
#endif
#ifdef _WITH_BFD_
	if (__test_bit(DAEMON_BFD, &mode) && process_running(bfd_pidfile))
		return true;
#endif
	return false;
}
static int
process_running(const char *pid_file)
{
	FILE *pidfile = fopen(pid_file, "r");
	pid_t pid = 0;
	int ret;

	/* No pidfile */
	if (!pidfile)
		return 0;

	ret = fscanf(pidfile, "%d", &pid);
	fclose(pidfile);
	if (ret != 1) {
		log_message(LOG_INFO, "Error reading pid file %s", pid_file);
		pid = 0;
		pidfile_rm(pid_file);
	}

	/* What should we return - we don't know if it is running or not. */
	if (!pid)
		return 1;

	/* If no process is attached to pidfile, remove it */
	if (kill(pid, 0)) {
		log_message(LOG_INFO, "Remove a zombie pid file %s", pid_file);
		pidfile_rm(pid_file);
		return 0;
	}

	return 1;
}
The man 2 kill manual page says:
#include <signal.h>
int kill(pid_t pid, int sig);
If sig is 0, then no signal is sent, but error checking is still performed; this can be used to check for the existence
of a process ID or process group ID.
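The same existence check is available from Python through os.kill, which wraps kill(2). The sketch below is an illustration, not keepalived's logic: pid_exists and pidfile_is_live are hypothetical names, and unlike the C code above it treats an unreadable pid file as "not running" and distinguishes "no such process" from "permission denied".

```python
import os


def pid_exists(pid):
    """Return True if a process with this PID exists, using signal 0."""
    if pid <= 0:
        # 0 and negative values address process groups, not a single PID.
        raise ValueError("pid must be a positive integer")
    try:
        os.kill(pid, 0)  # no signal is delivered, only error checking
    except ProcessLookupError:
        return False  # ESRCH: no such process
    except PermissionError:
        return True  # EPERM: the process exists but belongs to another user
    return True


def pidfile_is_live(path):
    """Return True if the pid file names a process that currently exists."""
    try:
        with open(path) as f:
            pid = int(f.read().split()[0])
    except (OSError, ValueError, IndexError):
        return False  # missing, unreadable or malformed pid file
    return pid > 0 and pid_exists(pid)


if __name__ == "__main__":
    print("this process exists:", pid_exists(os.getpid()))
```

One difference is visible in the C excerpt: process_running treats any non-zero return from kill(pid, 0) as "no process attached", so a failure with EPERM would also lead to the pid file being removed.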