Troubleshooting notes from maintaining Zabbix 4.4.7 (continuously updated)

Author: 苏水的北 | Published: 2020-05-14 09:24

I. The Zabbix service fails to start
1. Problem: the service went down and zabbix-server could not be started again:

[root@zabbix /var/log/zabbix]#systemctl status zabbix-server.service 
● zabbix-server.service - Zabbix Server
   Loaded: loaded (/usr/lib/systemd/system/zabbix-server.service; enabled; vendor preset: disabled)
   Active: activating (auto-restart) (Result: exit-code) since Wed 2020-05-13 08:11:36 CST; 764ms ago
  Process: 15248 ExecStop=/bin/kill -SIGTERM $MAINPID (code=exited, status=1/FAILURE)
  Process: 15241 ExecStart=/usr/sbin/zabbix_server -c $CONFFILE (code=exited, status=0/SUCCESS)
 Main PID: 15243 (code=exited, status=0/SUCCESS)

May 13 08:11:36 zabbix kill[15248]: -s, --signal <sig>     send specified signal
May 13 08:11:36 zabbix kill[15248]: -q, --queue <sig>      use sigqueue(2) rather than kill(2)
May 13 08:11:36 zabbix kill[15248]: -p, --pid              print pids without signaling them
May 13 08:11:36 zabbix kill[15248]: -l, --list [=<signal>] list signal names, or convert one to a name
May 13 08:11:36 zabbix kill[15248]: -L, --table            list signal names and numbers
May 13 08:11:36 zabbix kill[15248]: -h, --help     display this help and exit
May 13 08:11:36 zabbix kill[15248]: -V, --version  output version information and exit
May 13 08:11:36 zabbix kill[15248]: For more details see kill(1).
May 13 08:11:36 zabbix systemd[1]: Unit zabbix-server.service entered failed state.
May 13 08:11:36 zabbix systemd[1]: zabbix-server.service failed.

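2. The kill usage text in the status output only shows that ExecStop ran without a PID, most likely because the server process had already exited; it does not reveal the cause. The first thing to check is the Zabbix server log: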

[root@zabbix /var/log/zabbix]#more zabbix_server.log
15846:20200513:081745.479 Starting Zabbix Server. Zabbix 4.4.7 (revision 77fb8c7ee0).
15846:20200513:081745.479 ****** Enabled features ******
15846:20200513:081745.479 SNMP monitoring:           YES
15846:20200513:081745.479 IPMI monitoring:           YES
15846:20200513:081745.479 Web monitoring:            YES
15846:20200513:081745.479 VMware monitoring:         YES
15846:20200513:081745.479 SMTP authentication:       YES
15846:20200513:081745.479 ODBC:                      YES
15846:20200513:081745.479 SSH support:               YES
15846:20200513:081745.480 IPv6 support:              YES
15846:20200513:081745.480 TLS support:               YES
15846:20200513:081745.480 ******************************
15846:20200513:081745.480 using configuration file: /etc/zabbix/zabbix_server.conf
15846:20200513:081745.487 current database version (mandatory/optional): 04040000/04040002
15846:20200513:081745.487 required mandatory version: 04040000
15846:20200513:081745.501 server #0 started [main process]
15848:20200513:081745.502 server #1 started [configuration syncer #1]
15848:20200513:081746.189 __mem_malloc: skipped 0 asked 24 skip_min 18446744073709551615 skip_max 0
15848:20200513:081746.189 [file:dbconfig.c,line:94] __zbx_mem_realloc(): out of memory (requested 16 bytes)
15848:20200513:081746.189 [file:dbconfig.c,line:94] __zbx_mem_realloc(): please increase CacheSize configuration parameter
15848:20200513:081746.189 === memory statistics for configuration cache ===

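Note: the log shows that the configuration cache ran out of memory and asks for a larger CacheSize, so the next step is to check the main Zabbix configuration file (/etc/zabbix/zabbix_server.conf, as named in the log).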
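On a long log, the same hint can be pulled out with grep instead of paging through the file. The following is a minimal sketch under stated assumptions: find_cache_errors is a hypothetical helper name, and the sample file only holds two lines copied from the log above so that the snippet runs anywhere.

```shell
#!/bin/sh
# find_cache_errors is a hypothetical helper, not a Zabbix tool.
# It prints log lines that report an exhausted shared-memory cache.
find_cache_errors() {
    # $1: path to a Zabbix server log file
    grep -E 'out of memory|please increase [A-Za-z]+ configuration parameter' "$1"
}

# Demo on a sample file with two lines copied from the log in this post.
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
15848:20200513:081745.502 server #1 started [configuration syncer #1]
15848:20200513:081746.189 [file:dbconfig.c,line:94] __zbx_mem_realloc(): please increase CacheSize configuration parameter
EOF
cache_errors=$(find_cache_errors "$sample_log")
printf '%s\n' "$cache_errors"
rm -f "$sample_log"
```

On the server itself the call would be find_cache_errors /var/log/zabbix/zabbix_server.log.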
3. Check the main Zabbix configuration file:

403 ### Option: CacheSize
404 #       Size of configuration cache, in bytes.
405 #       Shared memory size for storing host, item and trigger data.
406 #
407 # Mandatory: no
408 # Range: 128K-8G
409 # Default:
410 #CacheSize=8M

On line 410, remove the leading # from #CacheSize=8M and raise the value; here I changed the default 8M to 2048M.
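The same edit can be scripted rather than made by hand. This is only a sketch, not the method used in this post: set_conf_param is a hypothetical helper, and the demo runs against a throwaway file shaped like the excerpt above instead of the live configuration.

```shell
#!/bin/sh
# set_conf_param is a hypothetical helper, not part of Zabbix.
# It sets "Name=value" in a zabbix_server.conf-style file and keeps a .bak copy.
set_conf_param() {
    conf=$1; name=$2; value=$3
    if grep -Eq "^$name=" "$conf"; then
        # An active line already exists: rewrite it in place.
        sed -i.bak -E "s|^$name=.*|$name=$value|" "$conf"
    else
        # Only the commented default exists: uncomment it with the new value.
        sed -i.bak -E "s|^#[[:space:]]*$name=.*|$name=$value|" "$conf"
    fi
}

# Demo on a temporary file (assumption: same layout as the excerpt above).
demo_conf=$(mktemp)
cat > "$demo_conf" <<'EOF'
### Option: CacheSize
# Range: 128K-8G
#CacheSize=8M
EOF
set_conf_param "$demo_conf" CacheSize 2048M
new_line=$(grep '^CacheSize=' "$demo_conf")
echo "$new_line"
rm -f "$demo_conf" "$demo_conf.bak"
```

Applied to the real file, this would be run as root against /etc/zabbix/zabbix_server.conf, followed by the service restart.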
4. After the change, restart the Zabbix service:

[root@zabbix /var/log/zabbix]#systemctl restart zabbix-server.service 
[root@zabbix ~]#systemctl status zabbix-server.service 
● zabbix-server.service - Zabbix Server
   Loaded: loaded (/usr/lib/systemd/system/zabbix-server.service; enabled; vendor preset: disabled)
   Active: active (running) since Wed 2020-05-13 08:44:40 CST; 3h 12min ago
  Process: 18301 ExecStop=/bin/kill -SIGTERM $MAINPID (code=exited, status=0/SUCCESS)
  Process: 18317 ExecStart=/usr/sbin/zabbix_server -c $CONFFILE (code=exited, status=0/SUCCESS)
 Main PID: 18319 (zabbix_server)
   CGroup: /system.slice/zabbix-server.service
           ├─18319 /usr/sbin/zabbix_server -c /etc/zabbix/zabbix_server.conf
           ├─18321 /usr/sbin/zabbix_server: configuration syncer [synced configuration in 0.678445 sec, idle 60 sec]
           ├─18324 /usr/sbin/zabbix_server: housekeeper [deleted 815888 hist/trends, 0 items/triggers, 0 events, 0 sessions, 0 ala...
           ├─18325 /usr/sbin/zabbix_server: timer #1 [updated 0 hosts, suppressed 0 events in 0.000707 sec, idle 59 sec]
           ├─18326 /usr/sbin/zabbix_server: http poller #1 [got 0 values in 0.000843 sec, idle 5 sec]
           ├─18327 /usr/sbin/zabbix_server: discoverer #1 [processed 0 rules in 0.000659 sec, idle 60 sec]
           ├─18328 /usr/sbin/zabbix_server: history syncer #1 [processed 0 values, 0 triggers in 0.000043 sec, idle 1 sec]
           ├─18329 /usr/sbin/zabbix_server: history syncer #2 [processed 0 values, 0 triggers in 0.000027 sec, idle 1 sec]
           ├─18330 /usr/sbin/zabbix_server: history syncer #3 [processed 128 values, 101 triggers in 0.087992 sec, idle 1 sec]
           ├─18331 /usr/sbin/zabbix_server: history syncer #4 [processed 104 values, 96 triggers in 0.017477 sec, idle 1 sec]
           ├─18332 /usr/sbin/zabbix_server: escalator #1 [processed 0 escalations in 0.000794 sec, idle 3 sec]
           ├─18333 /usr/sbin/zabbix_server: proxy poller #1 [exchanged data with 0 proxies in 0.000068 sec, idle 5 sec]
           ├─18334 /usr/sbin/zabbix_server: self-monitoring [processed data in 0.000025 sec, idle 1 sec]
           ├─18335 /usr/sbin/zabbix_server: task manager [processed 0 task(s) in 0.000544 sec, idle 5 sec]
           ├─18337 /usr/sbin/zabbix_server: poller #1 [got 30 values in 0.037370 sec, idle 1 sec]
           ├─18339 /usr/sbin/zabbix_server: poller #2 [got 23 values in 0.026674 sec, idle 1 sec]
           ├─18340 /usr/sbin/zabbix_server: poller #3 [got 16 values in 0.036619 sec, idle 1 sec]
           ├─18341 /usr/sbin/zabbix_server: poller #4 [got 15 values in 0.016117 sec, idle 1 sec]
           ├─18342 /usr/sbin/zabbix_server: poller #5 [got 17 values in 0.044779 sec, idle 1 sec]
           ├─18343 /usr/sbin/zabbix_server: unreachable poller #1 [got 1 values in 3.009875 sec, getting values]
           ├─18345 /usr/sbin/zabbix_server: trapper #1 [processed data in 0.002796 sec, waiting for connection]
           ├─18346 /usr/sbin/zabbix_server: trapper #2 [processed data in 0.002698 sec, waiting for connection]
           ├─18347 /usr/sbin/zabbix_server: trapper #3 [processed data in 0.000471 sec, waiting for connection]
           ├─18348 /usr/sbin/zabbix_server: trapper #4 [processed data in 0.002340 sec, waiting for connection]
           ├─18349 /usr/sbin/zabbix_server: trapper #5 [processed data in 0.003454 sec, waiting for connection]
           ├─18350 /usr/sbin/zabbix_server: icmp pinger #1 [pinging hosts]
           ├─18351 /usr/sbin/zabbix_server: alert manager #1 [sent 0, failed 0 alerts, idle 5.005276 sec during 5.005400 sec]
           ├─18352 /usr/sbin/zabbix_server: alerter #1 started
           ├─18353 /usr/sbin/zabbix_server: alerter #2 started
           ├─18354 /usr/sbin/zabbix_server: alerter #3 started
           ├─18355 /usr/sbin/zabbix_server: preprocessing manager #1 [queued 5, processed 532 values, idle 5.043614 sec during 5.0...
           ├─18356 /usr/sbin/zabbix_server: preprocessing worker #1 started
           ├─18357 /usr/sbin/zabbix_server: preprocessing worker #2 started
           ├─18358 /usr/sbin/zabbix_server: preprocessing worker #3 started
           ├─18359 /usr/sbin/zabbix_server: lld manager #1 [processed 1 LLD rules during 5.501500 sec]
           ├─18360 /usr/sbin/zabbix_server: lld worker #1 [processed 1 LLD rules, idle 5.488052 sec during 5.502938 sec]
           ├─18361 /usr/sbin/zabbix_server: lld worker #2 [processed 1 LLD rules, idle 13.106991 sec during 13.126135 sec]
           ├─18362 /usr/sbin/zabbix_server: alert syncer [queued 0 alerts(s), flushed 0 result(s) in 0.001128 sec, idle 1 sec]
           ├─31987 sh -c /usr/local/fping/sbin/fping -C3 -i0 2>&1 </tmp/zabbix_server_18350.pinger;
           └─31988 /usr/local/fping/sbin/fping -C3 -i0

May 13 08:44:40 zabbix systemd[1]: Starting Zabbix Server...
May 13 08:44:40 zabbix systemd[1]: zabbix-server.service: Supervising process 18319 which is not our child. We'll most like... exits.
May 13 08:44:40 zabbix systemd[1]: Started Zabbix Server.
Hint: Some lines were ellipsized, use -l to show in full.
[root@zabbix ~]#netstat -lntup
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name    
tcp        0      0 0.0.0.0:1556            0.0.0.0:*               LISTEN      8363/pbx_exchange   
tcp        0      0 127.0.0.1:1557          0.0.0.0:*               LISTEN      8363/pbx_exchange   
tcp        0      0 0.0.0.0:13782           0.0.0.0:*               LISTEN      9702/bpcd           
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      5633/sshd           
tcp        0      0 0.0.0.0:13724           0.0.0.0:*               LISTEN      9697/vnetd          
tcp        0      0 127.0.0.1:40509         0.0.0.0:*               LISTEN      8363/pbx_exchange   
tcp        0      0 0.0.0.0:10050           0.0.0.0:*               LISTEN      5674/zabbix_agentd  
tcp        0      0 0.0.0.0:10051           0.0.0.0:*               LISTEN      18319/zabbix_server 
tcp6       0      0 :::3306                 :::*                    LISTEN      18050/mysqld        
tcp6       0      0 :::80                   :::*                    LISTEN      5635/httpd          
tcp6       0      0 :::1556                 :::*                    LISTEN      8363/pbx_exchange   
tcp6       0      0 :::22                   :::*                    LISTEN      5633/sshd           
tcp6       0      0 :::10050                :::*                    LISTEN      5674/zabbix_agentd  
tcp6       0      0 :::10051                :::*                    LISTEN      18319/zabbix_server 

After the restart, the Zabbix service started successfully and zabbix_server is listening on port 10051.
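The netstat check can be reduced to a yes/no answer, which is convenient in a script. A minimal sketch: is_listening is a hypothetical helper that reads netstat -lnt style output on stdin, and the two sample lines are copied from the listing above.

```shell
#!/bin/sh
# is_listening is a hypothetical helper, not a Zabbix or net-tools command.
is_listening() {
    # $1: TCP port; exit status 0 if a socket is in LISTEN state on that port.
    # Column 4 is the local address and column 6 the state in netstat -lnt output.
    awk -v port="$1" '
        $6 == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
        END { exit !found }
    '
}

# Two lines copied from the netstat output in this post.
sample='tcp        0      0 0.0.0.0:10050           0.0.0.0:*               LISTEN      5674/zabbix_agentd
tcp        0      0 0.0.0.0:10051           0.0.0.0:*               LISTEN      18319/zabbix_server'

if printf '%s\n' "$sample" | is_listening 10051; then
    echo "port 10051: listening"
else
    echo "port 10051: not listening"
fi
```

On a live host the input would come from the command itself, for example netstat -lnt | is_listening 10051.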
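II. Fault: Zabbix value cache working in low memory mode
1. Problem: the "Zabbix value cache working in low memory mode" error is reported. The relevant setting is ValueCacheSize in the server configuration file: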

### Option: ValueCacheSize
#       Size of history value cache, in bytes.
#       Shared memory size for caching item history data requests.
#       Setting to 0 disables value cache.
#
# Mandatory: no
# Range: 0,128K-64G
# Default:
# ValueCacheSize=8M
ValueCacheSize=1024M

I increased ValueCacheSize from the default 8M to 1024M.
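Before restarting, it can help to confirm that the new value falls inside the range given in the config comment (0, or 128K-64G). The sketch below is illustrative only: to_bytes and value_cache_size_ok are hypothetical helpers, and the conversion assumes binary multiples (1K = 1024 bytes).

```shell
#!/bin/sh
# to_bytes and value_cache_size_ok are hypothetical helpers, not Zabbix tools.
to_bytes() {
    # Convert a size such as 128K, 1024M or 64G to bytes.
    # Assumption: suffixes are binary multiples (1K = 1024 bytes).
    num=${1%[KMG]}
    case $1 in
        *K) echo $((num * 1024)) ;;
        *M) echo $((num * 1024 * 1024)) ;;
        *G) echo $((num * 1024 * 1024 * 1024)) ;;
        *)  echo "$num" ;;
    esac
}

value_cache_size_ok() {
    # Range from the config comment: 0 (cache disabled) or 128K-64G.
    bytes=$(to_bytes "$1")
    [ "$bytes" -eq 0 ] && return 0
    [ "$bytes" -ge "$(to_bytes 128K)" ] && [ "$bytes" -le "$(to_bytes 64G)" ]
}

for size in 8M 1024M 64K; do
    if value_cache_size_ok "$size"; then
        echo "$size = $(to_bytes "$size") bytes: within range"
    else
        echo "$size = $(to_bytes "$size") bytes: out of range"
    fi
done
```

The range check says nothing about whether the host has enough free memory for a cache of that size; that still has to be judged separately.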
2. Restart the service:

[root@zabbix ~]#systemctl restart zabbix-server.service 
[root@zabbix ~]#systemctl status  zabbix-server.service 
● zabbix-server.service - Zabbix Server
  Loaded: loaded (/usr/lib/systemd/system/zabbix-server.service; enabled; vendor preset: disabled)
  Active: active (running) since Sun 2015-04-19 19:20:24 CST; 15s ago
 Process: 7034 ExecStop=/bin/kill -SIGTERM $MAINPID (code=exited, status=0/SUCCESS)
 Process: 7057 ExecStart=/usr/sbin/zabbix_server -c $CONFFILE (code=exited, status=0/SUCCESS)
Main PID: 7059 (zabbix_server)
  CGroup: /system.slice/zabbix-server.service
          ├─7059 /usr/sbin/zabbix_server -c /etc/zabbix/zabbix_server.conf

Article link: https://www.haomeiwen.com/subject/woeznhtx.html