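Reproducing the problem
After deploying Jenkins on the server, I created a new job and clicked Build. After a long wait, the browser reported that the connection had been lost. When I checked the server, the Jenkins process was gone. The Jenkins log (/var/log/jenkins/jenkins) contained only the startup messages and no related error.
Checking the service status: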
[root@iZm5eiqt11htvcfb521kerZ ~]# service jenkins status
● jenkins.service - LSB: Jenkins Automation Server
Loaded: loaded (/etc/rc.d/init.d/jenkins; bad; vendor preset: disabled)
Active: active (exited) since Sun 2020-01-12 13:56:05 CST; 10min ago
Docs: man:systemd-sysv-generator(8)
Process: 13301 ExecStop=/etc/rc.d/init.d/jenkins stop (code=exited, status=0/SUCCESS)
Process: 13324 ExecStart=/etc/rc.d/init.d/jenkins start (code=exited, status=0/SUCCESS)
Jan 12 13:56:03 iZm5eiqt11htvcfb521kerZ systemd[1]: Starting LSB: Jenkins Automation Server...
Jan 12 13:56:03 iZm5eiqt11htvcfb521kerZ runuser[13329]: pam_unix(runuser:session): session opened for user jenkins by (uid=0)
Jan 12 13:56:05 iZm5eiqt11htvcfb521kerZ systemd[1]: Started LSB: Jenkins Automation Server.
Jan 12 13:56:05 iZm5eiqt11htvcfb521kerZ jenkins[13324]: Starting Jenkins [ OK ]
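The status output reports "active (exited)", which does not necessarily tell us whether the Jenkins Java process is still alive. A minimal sketch of a direct check is to scan /proc for a matching command line. The helper name find_processes and the keyword "jenkins.war" are assumptions for illustration; the keyword has to match however Jenkins was actually launched on the machine.

```python
import os


def find_processes(keyword, proc_root="/proc"):
    """Return sorted (pid, command line) pairs whose command line contains keyword."""
    matches = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():  # only numeric directories describe processes
            continue
        try:
            with open(os.path.join(proc_root, entry, "cmdline"), "rb") as handle:
                raw = handle.read()
        except OSError:  # the process exited meanwhile or is not readable
            continue
        # Arguments in /proc/<pid>/cmdline are separated by NUL bytes.
        cmdline = raw.replace(b"\0", b" ").decode(errors="replace").strip()
        if keyword in cmdline:
            matches.append((int(entry), cmdline))
    return sorted(matches)


# Example call on a Linux host (the keyword is an assumption):
# print(find_processes("jenkins.war") or "no Jenkins process found")
```

An empty result while the service still shows as active would match the symptom described above.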
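Hypothesis: memory
My first guess was a memory problem: Jenkins might be running out of memory at runtime, so I made the following change.
In the Jenkins configuration file (/etc/sysconfig/jenkins), I modified this property:
Before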
JENKINS_JAVA_OPTIONS="-Djava.awt.headless=true"
After (sets the maximum heap size at startup)
JENKINS_JAVA_OPTIONS="-Djava.awt.headless=true -Xmx512m"
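To put the 512 MB value in perspective, here is a small arithmetic sketch. It assumes the machine's total memory is the 1014756 KiB that top reports in this post; the function name heap_share is made up for illustration. Note that -Xmx only caps the Java heap, so the JVM as a whole can use more than this figure.

```python
def heap_share(xmx_mib, total_kib):
    """Fraction of physical RAM that the -Xmx heap cap alone may claim."""
    return xmx_mib * 1024 / total_kib


TOTAL_KIB = 1014756  # "KiB Mem ... total" from the top output in this post
XMX_MIB = 512        # the value passed as -Xmx512m

share = heap_share(XMX_MIB, TOTAL_KIB)
print(f"-Xmx{XMX_MIB}m allows the heap to take up to {share:.1%} of physical RAM")
```

On a 1 GB host, a heap cap of this size already leaves less than half of the RAM for everything else running on the machine.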
The change had no effect. However, checking the Linux server's resources with the top command showed:
top - 17:25:23 up 13 days, 19:39, 2 users, load average: 23.79, 18.77, 11.13
Tasks: 82 total, 1 running, 81 sleeping, 0 stopped, 0 zombie
%Cpu(s): 4.8 us, 6.5 sy, 0.0 ni, 0.0 id, 88.3 wa, 0.0 hi, 0.4 si, 0.0 st
KiB Mem : 1014756 total, 67476 free, 881612 used, 65668 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 28472 avail Mem
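Memory usage on the machine is quite high: of roughly 1 GB of RAM, only 28472 KiB is still available, and there is no swap. So the host itself may be short of resources, which would bring the services down directly.
I investigated in that direction:
1. Reboot Linux, then check resource usage with top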
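Reading these numbers by eye gets tedious across many snapshots. As a rough sketch, the two memory lines of top can be parsed like this; the function name parse_top_memory and the key names are assumptions, and the regular expression only targets the "KiB Mem" / "KiB Swap" layout shown in this post.

```python
import re


def parse_top_memory(text):
    """Extract the KiB figures from top's 'KiB Mem' and 'KiB Swap' lines."""
    stats = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("KiB Mem"):
            prefix = "mem"
        elif line.startswith("KiB Swap"):
            prefix = "swap"
        else:
            continue
        pairs = re.findall(r"(\d+)\s+(total|free|used|buff/cache|avail)", line)
        for value, label in pairs:
            # "avail Mem" is printed on the swap line but describes memory.
            key = "mem_avail" if label == "avail" else f"{prefix}_{label}"
            stats[key] = int(value)
    return stats


# The first snapshot from this post.
SNAPSHOT = """\
KiB Mem : 1014756 total, 67476 free, 881612 used, 65668 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 28472 avail Mem
"""

stats = parse_top_memory(SNAPSHOT)
avail_pct = 100 * stats["mem_avail"] / stats["mem_total"]
print(f"available: {stats['mem_avail']} KiB ({avail_pct:.1f}% of RAM), "
      f"swap: {stats['swap_total']} KiB")
```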
top - 17:29:53 up 3 min, 1 user, load average: 0.12, 0.35, 0.18
Tasks: 72 total, 1 running, 71 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.7 us, 0.7 sy, 0.0 ni, 98.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 1014756 total, 361840 free, 308000 used, 344916 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 569284 avail Mem
2. Start the Jenkins service and check server resources with top
top - 17:31:40 up 5 min, 2 users, load average: 0.02, 0.25, 0.16
Tasks: 70 total, 1 running, 69 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.7 us, 0.7 sy, 0.0 ni, 98.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 1014756 total, 359520 free, 309916 used, 345320 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 567288 avail Mem
Little changed.
3. Log in to Jenkins from the browser, build the job, and check server resources with top
// Resource usage at peak
top - 17:34:31 up 8 min, 2 users, load average: 1.36, 0.53, 0.27
Tasks: 72 total, 2 running, 70 sleeping, 0 stopped, 0 zombie
%Cpu(s): 81.1 us, 9.0 sy, 0.0 ni, 0.0 id, 10.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 1014756 total, 69860 free, 627700 used, 317196 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 245852 avail Mem
// Resource usage after the build finished
top - 17:34:58 up 8 min, 2 users, load average: 1.11, 0.56, 0.29
Tasks: 70 total, 1 running, 69 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.7 us, 1.0 sy, 0.0 ni, 98.0 id, 0.3 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 1014756 total, 292200 free, 412184 used, 310372 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 461012 avail Mem
At this point the build and deployment succeed.
4. Start Docker on the server and check server resources with top
top - 17:41:11 up 15 min, 2 users, load average: 0.23, 0.39, 0.29
Tasks: 72 total, 1 running, 71 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.7 us, 0.7 sy, 0.0 ni, 98.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 1014756 total, 149468 free, 442228 used, 423060 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 430680 avail Mem
5. Start MySQL in Docker and check server resources with top
top - 17:42:47 up 16 min, 2 users, load average: 0.12, 0.31, 0.27
Tasks: 75 total, 1 running, 74 sleeping, 0 stopped, 0 zombie
%Cpu(s): 1.0 us, 0.7 sy, 0.0 ni, 98.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 1014756 total, 68940 free, 635304 used, 310512 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 236412 avail Mem
6. Build with Jenkins again and check server resources with top
top - 17:45:33 up 19 min, 2 users, load average: 2.48, 1.05, 0.54
Tasks: 77 total, 2 running, 75 sleeping, 0 stopped, 0 zombie
%Cpu(s): 96.3 us, 2.7 sy, 0.0 ni, 0.0 id, 1.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 1014756 total, 61928 free, 867412 used, 85416 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 32856 avail Mem
The build still succeeds; the output above shows memory usage at its peak.
7. Start the Tomcat service deployed on this server
top - 17:51:27 up 25 min, 2 users, load average: 15.04, 5.85, 2.39
Tasks: 78 total, 5 running, 73 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.9 us, 5.4 sy, 0.0 ni, 0.0 id, 93.4 wa, 0.0 hi, 0.2 si, 0.0 st
KiB Mem : 1014756 total, 72560 free, 891320 used, 50876 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 26200 avail Mem
After memory usage stayed at this level for a while, Jenkins and another Tomcat service deployed on this server went down at the same time.
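The steps above can be lined up side by side. The sketch below collects the "avail Mem" value from each top snapshot in this post and flags the steps where available memory drops below a threshold. The step labels and the 5% threshold are my own choices for illustration, not values from any tool.

```python
TOTAL_KIB = 1014756  # total memory reported by top in every snapshot

# (step, "avail Mem" in KiB), copied from the top outputs in this post.
SNAPSHOTS = [
    ("after reboot", 569284),
    ("Jenkins started", 567288),
    ("first build, peak", 245852),
    ("first build, finished", 461012),
    ("Docker started", 430680),
    ("MySQL started", 236412),
    ("second build, peak", 32856),
    ("Tomcat started", 26200),
]


def percent_available(avail_kib, total_kib=TOTAL_KIB):
    """Available memory as a percentage of total memory."""
    return 100 * avail_kib / total_kib


def below_threshold(snapshots, threshold_pct):
    """Labels of the steps whose available memory is under threshold_pct."""
    return [label for label, avail in snapshots
            if percent_available(avail) < threshold_pct]


for label, avail in SNAPSHOTS:
    print(f"{label:<24}{avail:>8} KiB {percent_available(avail):5.1f}%")

# The 5% threshold is an arbitrary assumption for this illustration.
print("critical steps:", below_threshold(SNAPSHOTS, 5))
```

Laid out this way, the second build and the Tomcat start are the only two steps where available memory falls to a few percent of the total.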
Root cause
The machine does not have enough memory, which causes the services deployed on it to go down.
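One way to double-check this conclusion, which the investigation above did not do, is to look for traces of the kernel's OOM killer in the kernel log. The sketch below assumes the OOM killer was involved (an assumption, not something confirmed here) and simply filters log lines for the phrases the kernel typically uses. The name find_oom_lines is hypothetical, and the lines would come from something like the output of dmesg or journalctl -k.

```python
import re

# Phrases that typically appear in kernel messages when the OOM killer acts.
OOM_PATTERN = re.compile(r"out of memory|oom-killer|killed process", re.IGNORECASE)


def find_oom_lines(log_lines):
    """Return the log lines that look like OOM-killer activity."""
    return [line.rstrip("\n") for line in log_lines if OOM_PATTERN.search(line)]


# Example use on a saved kernel log (the file name is an assumption):
# with open("kernel.log", encoding="utf-8", errors="replace") as handle:
#     for hit in find_oom_lines(handle):
#         print(hit)
```

If such lines name the java processes around the time of the crash, that would also explain why the Jenkins log itself shows no error: a process killed by the kernel gets no chance to log anything.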