美文网首页
mysql全表查询导致数据库OOM

mysql全表查询导致数据库OOM

作者: GAOCHAO_DBA | 来源:发表于2018-02-01 09:21 被阅读0次

    实例基础信息:

    • 个人测试环境
    • 数据库版本:MariaDB10.0.27
    • 内存信息
    root@sql-master:~# free -m
                 total       used       free     shared    buffers     cached
    Mem:           975        885         89          0          4         31
    -/+ buffers/cache:        849        125
    Swap:         1906       1222        684
    
    (root@127.0.0.1 3306) [test]>select count(*) from util_koala_message;
    +----------+
    | count(*) |
    +----------+
    |  9212182 |
    +----------+
    1 row in set (3.95 sec) 
    

    异常现象

    (root@127.0.0.1 3306) [test]>select * from util_koala_message;
    ERROR 2013 (HY000): Lost connection to MySQL server during query 
    (root@127.0.0.1 3306) [test]>
    

    异常解决过程

    • 根据mysql客户端的错误信息ERROR 2013 (HY000): Lost connection to MySQL server during query
      1)初步以为是connect-timeout造成的,查看了一下当前的值,如下所示是20秒,上面SQL运行时间也很巧将近20秒
      2)将connect-timeout设置为50
      3)修改后继续运行SQL,仍然在将近20秒的时间客户端丢失连接,异常重现
      4)再次查看connect-timeout竟然变回20,因为配置文件中设置的是20,看来是数据库重启了:
    1)查看connect-timeout
    (root@127.0.0.1 3306) [(none)]>show global variables like "%timeout%";
    +-----------------------------+----------+
    | Variable_name               | Value    |
    +-----------------------------+----------+
    | connect_timeout             | 20       |
    | deadlock_timeout_long       | 50000000 |
    | deadlock_timeout_short      | 10000    |
    | delayed_insert_timeout      | 300      |
    | innodb_flush_log_at_timeout | 1        |
    | innodb_lock_wait_timeout    | 50       |
    | innodb_rollback_on_timeout  | ON       |
    | interactive_timeout         | 28800    |
    | lock_wait_timeout           | 31536000 |
    | net_read_timeout            | 30       |
    | net_write_timeout           | 60       |
    | slave_net_timeout           | 3600     |
    | thread_pool_idle_timeout    | 60       |
    | wait_timeout                | 300      |
    +-----------------------------+----------+  
    
    2)修改并查看connect-timeout
    (root@127.0.0.1 3306) [(none)]>set global connect_timeout=50;
    Query OK, 0 rows affected (0.00 sec)
    (root@127.0.0.1 3306) [(none)]>show global variables like "%timeout%";
    +-----------------------------+----------+
    | Variable_name               | Value    |
    +-----------------------------+----------+
    | connect_timeout             | 50       |
    | deadlock_timeout_long       | 50000000 |
    | deadlock_timeout_short      | 10000    |
    | delayed_insert_timeout      | 300      |
    | innodb_flush_log_at_timeout | 1        |
    | innodb_lock_wait_timeout    | 50       |
    | innodb_rollback_on_timeout  | ON       |
    | interactive_timeout         | 28800    |
    | lock_wait_timeout           | 31536000 |
    | net_read_timeout            | 30       |
    | net_write_timeout           | 60       |
    | slave_net_timeout           | 3600     |
    | thread_pool_idle_timeout    | 60       |
    | wait_timeout                | 300      |
    +-----------------------------+----------+ 
    
    3)再次查询,异常重现
    (root@127.0.0.1 3306) [test]>select * from util_koala_message;
    ERROR 2013 (HY000): Lost connection to MySQL server during query
    
    4)再次查看connect-timeout竟然自动变回20
    (root@127.0.0.1 3306) [test]>show variables like "%timeout%";
    ERROR 2006 (HY000): MySQL server has gone away
    No connection. Trying to reconnect...
    Connection id:    3
    Current database: test
    
    +-----------------------------+----------+
    | Variable_name               | Value    |
    +-----------------------------+----------+
    | connect_timeout             | 20       |
    | deadlock_timeout_long       | 50000000 |
    | deadlock_timeout_short      | 10000    |
    | delayed_insert_timeout      | 300      |
    | innodb_flush_log_at_timeout | 1        |
    | innodb_lock_wait_timeout    | 50       |
    | innodb_rollback_on_timeout  | ON       |
    | interactive_timeout         | 28800    |
    | lock_wait_timeout           | 31536000 |
    | net_read_timeout            | 30       |
    | net_write_timeout           | 60       |
    | slave_net_timeout           | 3600     |
    | thread_pool_idle_timeout    | 60       |
    | wait_timeout                | 28800    |
    +-----------------------------+----------+ 
    

    异常重现

    • mysql客户端
    (root@127.0.0.1 3306) [test]>select * from util_koala_message;
    ERROR 2013 (HY000): Lost connection to MySQL server during query
    (root@127.0.0.1 3306) [test]> 
    
    • shell客户端查看内存
    root@sql-master:~# free -m
                 total       used       free     shared    buffers     cached
    Mem:           975        885         89          0          4         31
    -/+ buffers/cache:        849        125
    Swap:         1906       1222        684 
    
    省略中间一部分信息......
    
    root@sql-master:~# free -m
                 total       used       free     shared    buffers     cached
    Mem:           975        915         59          0          0          8
    -/+ buffers/cache:        905         69
    Swap:         1906       1842         64
    root@sql-master:~# free -m
                 total       used       free     shared    buffers     cached
    Mem:           975        912         62          0          1         16
    -/+ buffers/cache:        894         80
    Swap:         1906       1893         13
    root@sql-master:~# free -m
                 total       used       free     shared    buffers     cached
    Mem:           975        912         62          0          0          2
    -/+ buffers/cache:        910         65
    Swap:         1906       1906          0
    
    • shell查看日志
    root@sql-master:~# dmesg 
    ......
    [1698793.922672] Out of memory: Kill process 103623 (mysqld) score 294 or sacrifice child
    [1698793.922755] Killed process 103623 (mysqld) total-vm:1344348kB, anon-rss:166972kB, file-rss:0kB 
    root@sql-master:~# tail -n 30 /var/log/syslog
    Feb  1 08:54:02 sql-master kernel: [1698793.922672] Out of memory: Kill process 103623 (mysqld) score 294 or sacrifice child
    Feb  1 08:54:02 sql-master kernel: [1698793.922755] Killed process 103623 (mysqld) total-vm:1344348kB, anon-rss:166972kB, file-rss:0kB 
    

    结论

    • 这个查询导致数据库OOM,服务重新启动
    • 当Swap使用完毕后触发OOM
    • 本服务是二进制安装当mysql,通过mysqld_safe启动,OOM后数据库会自动重新启动

    相关文章

      网友评论

          本文标题:mysql全表查询导致数据库OOM

          本文链接:https://www.haomeiwen.com/subject/esirzxtx.html