美文网首页
记一次Mysql死锁问题排查流程

记一次Mysql死锁问题排查流程

作者: 陈盼_750f | 来源:发表于2019-10-05 15:20 被阅读0次

    第一次在实际生产环境遇到死锁问题,从开始的懵逼状到找到并解决该问题,经历了无数次的百度。。。

    1. 项目用的MySQL数据库引擎是InnoDB,数据库的行锁、表锁是通过InnoDB使用表的索引来实现的。那么就先查询一下InnoDB的状态:
    show engine innodb status;
    

    只截取有用信息:

    ------------------------
    LATEST DETECTED DEADLOCK
    ------------------------
    2019-08-21 12:08:13 7fb595c29700
    *** (1) TRANSACTION:
    TRANSACTION 63912726, ACTIVE 0 sec updating or deleting
    mysql tables in use 1, locked 1
    LOCK WAIT 3 lock struct(s), heap size 1184, 2 row lock(s), undo log entries 1
    MySQL thread id 1587325, OS thread handle 0x7fb55ee77700, query id 87233990 210.22.120.186 root updating
    UPDATE t_app_info  SET ssn='520224727',
        create_time='2011-08-20 17:54:56.0',
        update_time='2011-08-21 12:08:14.468',
        user_id=8,
        job_status='DONE',
        rerun=0,
        merchant_id='asd',
        loan_id=502,
        `name`='autotest_0815100043938',
        phone_number='1234567890',
        `source`=0  WHERE id=1163751259396098
    *** (1) WAITING FOR THIS LOCK TO BE GRANTED:
    RECORD LOCKS space id 1673 page no 63 n bits 552 index `t_app_info_job_status_index` of table `test`.`t_app_info` trx id 63912726 lock_mode X locks gap before rec insert intention waiting
    Record lock, heap no 167 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
     0: len 3; hex 50414e; asc PAN;;
     1: len 8; hex 901e52e11885c002; asc   R     ;;
    
    *** (2) TRANSACTION:
    TRANSACTION 63912725, ACTIVE 0 sec starting index read
    mysql tables in use 1, locked 1
    3 lock struct(s), heap size 1184, 2 row lock(s)
    MySQL thread id 1587327, OS thread handle 0x7fb595c29700, query id 87233988 210.22.120.186 root Creating sort index
    select * from t_app_info where job_status = 'PAN' and source in (0,1) order by create_time asc limit 1 for update
    *** (2) HOLDS THE LOCK(S):
    RECORD LOCKS space id 1673 page no 63 n bits 552 index `t_app_info_job_status_index` of table `test`.`t_app_info` trx id 63912725 lock_mode X
    Record lock, heap no 167 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
     0: len 3; hex 50414e; asc PAN;;
     1: len 8; hex 901e52e11885c002; asc   R     ;;
    
    *** (2) WAITING FOR THIS LOCK TO BE GRANTED:
    RECORD LOCKS space id 1673 page no 70 n bits 176 index `PRIMARY` of table `test`.`t_app_info` trx id 63912725 lock_mode X locks rec but not gap waiting
    Record lock, heap no 28 PHYSICAL RECORD: n_fields 17; compact format; info bits 0
     0: len 8; hex 901e52e11885c002; asc   R     ;;
     1: len 6; hex 000003ceb3a3; asc       ;;
     2: len 7; hex 330000801d021e; asc 3      ;;
     3: len 18; hex 363533313237313938363037333034383637; asc 653127198607304867;;
     4: len 5; hex 99a3dc9e76; asc     v;;
     5: len 8; hex 80000000000006f2; asc         ;;
     6: len 5; hex 99a3dc9e76; asc     v;;
     7: len 3; hex 50414e; asc PAN;;
     8: len 1; hex 80; asc  ;;
     9: SQL NULL;
     10: len 20; hex 416d62696c447569742d47504d65726368616e74; asc AmbilDuit-GPMerchant;;
     11: len 8; hex 8000000000001426; asc        &;;
     12: len 13; hex 4275646920536574696177616e; asc Budi Setiawan;;
     13: len 13; hex 30383133323035313037323233; asc 0813205107223;;
     14: SQL NULL;
     15: SQL NULL;
     16: len 4; hex 80000002; asc     ;;
    
    *** WE ROLL BACK TRANSACTION (2)
    
    1. 分析死锁日志信息:
      在日志中可以找到最近一次死锁的日志信息:
    ------------------------
    LATEST DETECTED DEADLOCK
    ------------------------
    

    然后在该信息里面可以看到有这两个标题,标题下的信息就是死锁产生的原因

    HOLDS THE LOCK(S):  当前事务持有的锁信息
    WAITING FOR THIS LOCK TO BE GRANTED:  当前事务尝试获取的锁信息
    

    该表有两个索引,一个是主键id自带的索引,还有一个是job_status字段上添加的t_app_info_job_status_index(普通索引)。

    • 根据日志信息可以看到第一条sql是根据主键id更新该条记录,并且需要更新字段job_status。该sql已经获取到主键索引的锁,尝试获取job_status字段索引的锁。
    • 第二条sql是根据job_status字段查询符合条件的第一条语句并且使用for update配合job_status字段的索引实现行锁。该sql已经获取到t_app_info_job_status_index索引,尝试获取到主键索引的锁。

    由此可以分析出死锁产生的原因。

    然后百度了很多大佬的文章看了下,InnoDB在获取到非主键索引的锁后会继续再获取到主键索引的锁,这也解释了为什么第二条sql会有这样的行为。

    1. 解决这个问题:
      根据产生的原因,使用for update来操作行锁时,最简单的办法是直接使用主键索引。但是因为功能的需求是获取明细状态job_status为PAN的第一条,因此sql修改为如下来满足只根据主键索引来进行行锁:
    select * from t_app_info where id = (select id from t_app_info where job_status = 'PAN' and source in (0,1) order by create_time asc limit 1) and job_status = 'PAN' for update;
    

    以上就是我这次遇到死锁的排查过程和思考过程,如有遗漏或错误的地方,欢迎指正出来。谢谢!

    相关文章

      网友评论

          本文标题:记一次Mysql死锁问题排查流程

          本文链接:https://www.haomeiwen.com/subject/qzoepctx.html