实例基础信息:
- 个人测试环境
- 数据库版本:MariaDB10.0.27
- 内存信息
root@sql-master:~# free -m
total used free shared buffers cached
Mem: 975 885 89 0 4 31
-/+ buffers/cache: 849 125
Swap: 1906 1222 684
(root@127.0.0.1 3306) [test]>select count(*) from util_koala_message;
+----------+
| count(*) |
+----------+
| 9212182 |
+----------+
1 row in set (3.95 sec)
异常现象
(root@127.0.0.1 3306) [test]>select * from util_koala_message;
ERROR 2013 (HY000): Lost connection to MySQL server during query
(root@127.0.0.1 3306) [test]>
异常解决过程
- 根据mysql客户端的错误信息ERROR 2013 (HY000): Lost connection to MySQL server during query
1)初步以为是connect-timeout造成的,查看了一下当前的值,如下所示是20秒,上面SQL运行时间也很巧将近20秒
2)将connect-timeout设置为50
3)修改后继续运行SQL,仍然在将近20秒的时间客户端丢失连接,异常重现
4)再次查看connect-timeout竟然变回20,因为配置文件中设置的是20,看来是数据库重启了:
1)查看connect-timeout
(root@127.0.0.1 3306) [(none)]>show global variables like "%timeout%";
+-----------------------------+----------+
| Variable_name | Value |
+-----------------------------+----------+
| connect_timeout | 20 |
| deadlock_timeout_long | 50000000 |
| deadlock_timeout_short | 10000 |
| delayed_insert_timeout | 300 |
| innodb_flush_log_at_timeout | 1 |
| innodb_lock_wait_timeout | 50 |
| innodb_rollback_on_timeout | ON |
| interactive_timeout | 28800 |
| lock_wait_timeout | 31536000 |
| net_read_timeout | 30 |
| net_write_timeout | 60 |
| slave_net_timeout | 3600 |
| thread_pool_idle_timeout | 60 |
| wait_timeout | 300 |
+-----------------------------+----------+
2)修改并查看connect-timeout
(root@127.0.0.1 3306) [(none)]>set global connect_timeout=50;
Query OK, 0 rows affected (0.00 sec)
(root@127.0.0.1 3306) [(none)]>show global variables like "%timeout%";
+-----------------------------+----------+
| Variable_name | Value |
+-----------------------------+----------+
| connect_timeout | 50 |
| deadlock_timeout_long | 50000000 |
| deadlock_timeout_short | 10000 |
| delayed_insert_timeout | 300 |
| innodb_flush_log_at_timeout | 1 |
| innodb_lock_wait_timeout | 50 |
| innodb_rollback_on_timeout | ON |
| interactive_timeout | 28800 |
| lock_wait_timeout | 31536000 |
| net_read_timeout | 30 |
| net_write_timeout | 60 |
| slave_net_timeout | 3600 |
| thread_pool_idle_timeout | 60 |
| wait_timeout | 300 |
+-----------------------------+----------+
3)再次查询,异常重现
(root@127.0.0.1 3306) [test]>select * from util_koala_message;
ERROR 2013 (HY000): Lost connection to MySQL server during query
4)再次查看connect-timeout竟然自动变回20
(root@127.0.0.1 3306) [test]>show variables like "%timeout%";
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
Connection id: 3
Current database: test
+-----------------------------+----------+
| Variable_name | Value |
+-----------------------------+----------+
| connect_timeout | 20 |
| deadlock_timeout_long | 50000000 |
| deadlock_timeout_short | 10000 |
| delayed_insert_timeout | 300 |
| innodb_flush_log_at_timeout | 1 |
| innodb_lock_wait_timeout | 50 |
| innodb_rollback_on_timeout | ON |
| interactive_timeout | 28800 |
| lock_wait_timeout | 31536000 |
| net_read_timeout | 30 |
| net_write_timeout | 60 |
| slave_net_timeout | 3600 |
| thread_pool_idle_timeout | 60 |
| wait_timeout | 28800 |
+-----------------------------+----------+
异常重现
(root@127.0.0.1 3306) [test]>select * from util_koala_message;
ERROR 2013 (HY000): Lost connection to MySQL server during query
(root@127.0.0.1 3306) [test]>
root@sql-master:~# free -m
total used free shared buffers cached
Mem: 975 885 89 0 4 31
-/+ buffers/cache: 849 125
Swap: 1906 1222 684
省略中间一部分信息......
root@sql-master:~# free -m
total used free shared buffers cached
Mem: 975 915 59 0 0 8
-/+ buffers/cache: 905 69
Swap: 1906 1842 64
root@sql-master:~# free -m
total used free shared buffers cached
Mem: 975 912 62 0 1 16
-/+ buffers/cache: 894 80
Swap: 1906 1893 13
root@sql-master:~# free -m
total used free shared buffers cached
Mem: 975 912 62 0 0 2
-/+ buffers/cache: 910 65
Swap: 1906 1906 0
root@sql-master:~# dmesg
......
[1698793.922672] Out of memory: Kill process 103623 (mysqld) score 294 or sacrifice child
[1698793.922755] Killed process 103623 (mysqld) total-vm:1344348kB, anon-rss:166972kB, file-rss:0kB
root@sql-master:~# tail -n 30 /var/log/syslog
Feb 1 08:54:02 sql-master kernel: [1698793.922672] Out of memory: Kill process 103623 (mysqld) score 294 or sacrifice child
Feb 1 08:54:02 sql-master kernel: [1698793.922755] Killed process 103623 (mysqld) total-vm:1344348kB, anon-rss:166972kB, file-rss:0kB
结论
- 这个查询导致数据库OOM,服务重新启动
- 当Swap使用完毕后触发OOM
- 本服务是二进制安装当mysql,通过mysqld_safe启动,OOM后数据库会自动重新启动
网友评论