美文网首页
记mysql的一次crash故障

记mysql的一次crash故障

作者: 天草二十六_简村人 | 来源:发表于2019-09-27 17:02 被阅读0次

一、背景

线上接口报错且响应慢,使用pinpoint进行跟踪,后发现是连接数据库的时候出现超时。
有个背景是:该库,是php和java共用,我这方为java后端开发,且没有发布数模变更或程序包。
最后根据mysql日志,初步判断是php操作数据库不当导致。

这个故障持续约一刻钟到20分钟,dba也没有做任何操作,后面自己就恢复了。

PS:仅记录本次线上事故,仅供参考。

二、apm监控

接口请求错误数和快慢情况.png 选中错误接口.png image.png

超时现象比较严重。


image.png
Could not open JPA EntityManager for transaction; nested exception is org.hibernate.TransactionException: JDBC begin transaction failed:

Communications link failure The last packet successfully received from the server was 264 milliseconds ago. The last packet sent successfully to the server was 0 milliseconds ago.

jvm堆正常,线程和响应时间出现比较大的波动:


image.png

总结:java程序连接mysql数据库超时,所以下文排查mysql故障情况。

三、mysql日志

mysql是一主一从的部署方式,下面分别贴出从库和主库的日志信息。


从库报错.png

主库:

2019-09-27 14:11:36 0x7f23876bf700  InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.3.9/storage/innobase/rem/rem0rec.cc line 1878
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to https://jira.mariadb.org/
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: https://mariadb.com/kb/en/library/xtradbinnodb-recovery-modes/
InnoDB: about forcing recovery.
190927 14:11:36 [ERROR] mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.

To report this bug, see https://mariadb.com/kb/en/reporting-bugs

We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed, 
something is definitely wrong and this may fail.

Server version: 10.3.9-MariaDB-log
key_buffer_size=134217728
read_buffer_size=131072
max_used_connections=582
max_threads=2050
thread_count=277
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 4637654 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x7f23830764c8
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7f23876bed70 thread_stack 0x49000
/usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x7f24b694d66e]
/usr/sbin/mysqld(handle_fatal_signal+0x357)[0x7f24b63ddf37]
sigaction.c:0(__restore_rt)[0x7f24b582d100]
:0(__GI_raise)[0x7f24b3d3d5f7]
:0(__GI_abort)[0x7f24b3d3ece8]
ut/ut0rbt.cc:221(rbt_tree_add_child)[0x7f24b6127e04]
rem/rem0rec.cc:1878(rec_copy_prefix_to_buf(unsigned char const*, dict_index_t const*, unsigned long, unsigned char**, unsigned long*))[0x7f24b6658193]
btr/btr0pcur.cc:174(btr_pcur_store_position(btr_pcur_t*, mtr_t*))[0x7f24b6738a98]
row/row0sel.cc:5597(row_search_mvcc(unsigned char*, page_cur_mode_t, row_prebuilt_t*, unsigned long, unsigned long))[0x7f24b669dcc6]
handler/ha_innodb.cc:9707(ha_innobase::general_fetch(unsigned char*, unsigned int, unsigned int))[0x7f24b65c6eb9]
sql/handler.cc:2893(handler::ha_index_prev(unsigned char*))[0x7f24b63e3a0f]
sql/sql_select.cc:20331(join_read_prev)[0x7f24b624b3cc]
sql/sql_select.cc:19304(sub_select(JOIN*, st_join_table*, bool))[0x7f24b624072f]
sql/sql_select.cc:18824(do_select)[0x7f24b6261d42]
sql/sql_select.cc:3823(JOIN::exec())[0x7f24b6261f93]
sql/sql_select.cc:4229(mysql_select(THD*, TABLE_LIST*, unsigned int, List<Item>&, Item*, unsigned int, st_order*, st_order*, Item*, st_order*, unsigned long long, select_result*, st_select_lex_unit*, st_select_lex*))[0x7f24b62620ea]
sql/sql_select.cc:382(handle_select(THD*, LEX*, select_result*, unsigned long))[0x7f24b6262bfc]
sql/sql_parse.cc:6547(execute_sqlcom_select)[0x7f24b611819f]
sql/sql_parse.cc:3768(mysql_execute_command(THD*))[0x7f24b620efbd]
sql/sql_parse.cc:8089(mysql_parse(THD*, char*, unsigned int, Parser_state*, bool, bool))[0x7f24b62117cb]
sql/sql_parse.cc:1850(dispatch_command(enum_server_command, THD*, char*, unsigned int, bool, bool))[0x7f24b6214315]
sql/sql_parse.cc:1397(do_command(THD*))[0x7f24b621517e]
sql/sql_connect.cc:1402(do_handle_one_connection(CONNECT*))[0x7f24b62e550a]
sql/sql_connect.cc:1310(handle_one_connection)[0x7f24b62e562d]
pthread_create.c:0(start_thread)[0x7f24b5825dc5]
/lib64/libc.so.6(clone+0x6d)[0x7f24b3dfe21d]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x7f2372185bb0): 
// sql语句的字段和表名这里就以示例表达。关键在于后面的Like'%9个空格%'
SELECT xxx FROM user
WHERE (`xx`=0) AND (`xxx` LIKE '%         %') ORDER BY `xxxx` DESC LIMIT 15
Connection ID (thread ID): 93462600
Status: NOT_KILLED

Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on,split_materialized=on

The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
2019-09-27 14:11:42 0 [Note] InnoDB: Using Linux native AIO
2019-09-27 14:11:42 0 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2019-09-27 14:11:42 0 [Note] InnoDB: Uses event mutexes
2019-09-27 14:11:42 0 [Note] InnoDB: Compressed tables use zlib 1.2.7
2019-09-27 14:11:42 0 [Note] InnoDB: Number of pools: 1
2019-09-27 14:11:42 0 [Note] InnoDB: Using SSE2 crc32 instructions
2019-09-27 14:11:42 0 [Note] InnoDB: Initializing buffer pool, total size = 128M, instances = 1, chunk size = 128M
2019-09-27 14:11:42 0 [Note] InnoDB: Completed initialization of buffer pool
2019-09-27 14:11:42 0 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
2019-09-27 14:11:42 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=182619431787
2019-09-27 14:11:42 0 [Note] InnoDB: Transaction 459527061 was in the XA prepared state.
2019-09-27 14:11:42 0 [Note] InnoDB: 1 transaction(s) which must be rolled back or cleaned up in total 0 row operations to undo
2019-09-27 14:11:42 0 [Note] InnoDB: Trx id counter is 459527062
2019-09-27 14:11:42 0 [Note] InnoDB: Starting final batch to recover 61 pages from redo log.
2019-09-27 14:11:42 0 [Note] InnoDB: Last binlog file './mysql-bin.000142', position 418564782
2019-09-27 14:11:43 0 [Note] InnoDB: 128 out of 128 rollback segments are active.
2019-09-27 14:11:43 0 [Note] InnoDB: Starting in background the rollback of recovered transactions
2019-09-27 14:11:43 0 [Note] InnoDB: Rollback of non-prepared transactions completed
2019-09-27 14:11:43 0 [Note] InnoDB: Removed temporary tablespace data file: "ibtmp1"
2019-09-27 14:11:43 0 [Note] InnoDB: Creating shared tablespace for temporary tables
2019-09-27 14:11:43 0 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
2019-09-27 14:11:43 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB.
2019-09-27 14:11:43 0 [Note] InnoDB: Waiting for purge to start
2019-09-27 14:11:43 0 [Note] InnoDB: 10.3.9 started; log sequence number 182619436152; transaction id 459527064
2019-09-27 14:11:43 0 [Note] InnoDB: Loading buffer pool(s) from /data/mysql/ib_buffer_pool
2019-09-27 14:11:43 0 [Note] Plugin 'FEEDBACK' is disabled.
2019-09-27 14:11:43 0 [Note] Recovering after a crash using mysql-bin
2019-09-27 14:11:43 0 [Note] InnoDB: Buffer pool(s) load completed at 190927 14:11:43
2019-09-27 14:11:44 0 [Note] Starting crash recovery...
2019-09-27 14:11:44 0 [Note] InnoDB: Starting recovery for XA transactions...
2019-09-27 14:11:44 0 [Note] InnoDB: Transaction 459527061 in prepared state after recovery
2019-09-27 14:11:44 0 [Note] InnoDB: Transaction contains changes to 1 rows
2019-09-27 14:11:44 0 [Note] InnoDB: 1 transactions in prepared state after recovery
2019-09-27 14:11:44 0 [Note] Found 1 prepared transaction(s) in InnoDB
2019-09-27 14:11:44 0 [Note] Crash recovery finished.
2019-09-27 14:11:44 0 [Note] Server socket created on IP: '0.0.0.0'.
2019-09-27 14:11:44 0 [Warning] 'user' entry 'XHre&698@host' ignored in --skip-name-resolve mode.
2019-09-27 14:11:44 0 [Warning] 'proxies_priv' entry '@% root@xh-mysql-master1' ignored in --skip-name-resolve mode.
2019-09-27 14:11:44 7 [Note] Event Scheduler: scheduler thread started with id 7
2019-09-27 14:11:44 0 [Note] /usr/sbin/mysqld: ready for connections.
Version: '10.3.9-MariaDB-log'  socket: '/data/mysql/mysql.sock'  port: 3306  MariaDB Server

四、从mysql的日志可以看出,执行sql都是同一个表,且都使用了Like模糊匹配,会不会是由于这条sql语句引起的呢?

FROM user WHERE (xx=0) AND (xxx LIKE '% %') ORDER BY xxxx DESC LIMIT 15

明眼看,顶多是个慢查询语句,没有用上索引。
1、模糊匹配有数据的情况,一切正常~


image.png

2、但是,当模糊匹配不到数据的情况,就出现bug。


image.png

并且这个sql会堵塞其他的sql查询。

3、如果我们换其他的小表。(其实我们user表也才10几万数据),还是写LIKE '% %' ,顶多是没查询出来数据。并不会影响其他。

五、mysql字符集

如果你在使用MySQL或MariaDB,不要用“utf8”编码,改用“utf8mb4”。这里( https://mathiasbynens.be/notes/mysql-utf8mb4#utf8-to-utf8mb4 )提供了一个指南用于将现有数据库的字符编码从“utf8”转成“utf8mb4”。

相关文章

  • 记mysql的一次crash故障

    一、背景 线上接口报错且响应慢,使用pinpoint进行跟踪,后发现是连接数据库的时候出现超时。有个背景是:该库,...

  • 记一次MySQL Crash处理

    在MySQL 5.7的高并发场景下,有可能会触发MySQL5.7的Bug,导致宕机,临时的紧急修复方法可针对实例进...

  • Hyperledger Fabric 1.4 特性调研之Raft

    Raft是一种crash fault tolerant (CFT,崩溃故障容错)的共识排序算法。如果有节点故障掉线...

  • gitlab 500 no space left on devi

    gitlab故障处理一: gitlab nginx页面500 记一次gitlab故障处理之no space lef...

  • 一次CPU高负载问题排查

    记一次DRBD故障后发现的CPU高负载 故障发现时间: 2018-01-29 11:50IPCDN的DRBD资源发...

  • 记一次Crash分析

    在这里主要分析使用UIWebView导致产生crash的原因。 从我们项目的crash日志收集平台中,有诸多如下的...

  • 记一次Crash分析

    在这里主要分析使用UIWebView导致产生crash的原因。 从我们项目的crash日志收集平台中,有诸多如下的...

  • 6.MySQL日志系统

    Error log MySQL Server的启动、停止、crash、recovery、错误信息。--log-wa...

  • 记一次Mysql故障排查(lock wait timeout e

    最近接了一个锅,进入新公司接手了一个进入交付阶段的项目.在codereview的时候发现很多问题,然后开始修复bu...

  • 记一次安装mysql过程

    记一次安装mysql过程 更新 2019年6月10日10:57:03忘记了mysql的密码,重新安装了mysql-...

网友评论

      本文标题:记mysql的一次crash故障

      本文链接:https://www.haomeiwen.com/subject/rurauctx.html