美文网首页技术干货
记一次MySQL死锁排查

记一次MySQL死锁排查

作者: BrightLoong | 来源:发表于2019-02-13 17:40 被阅读0次
    MySQL

    一. 环境

    1. 表结构

    涉及到以下两张表:

    • BIND_RECORD
    CREATE TABLE `BIND_RECORD` (
      `id` bigint(20) NOT NULL AUTO_INCREMENT,
      `bind_uid` bigint(20) NOT NULL COMMENT '被绑uid',
      `bind_by_uid` bigint(20) NOT NULL COMMENT '渠道uid',
      `bind_type` int(2) NOT NULL COMMENT '绑定规则类型,1-长期,2-连续订阅,3-短期',
      `bind_rule_ref` bigint(20) NOT NULL COMMENT '绑定规则id',
      `start_time` datetime NOT NULL COMMENT '有效起始时间',
      `end_time` datetime DEFAULT NULL COMMENT '有效终止时间',
      `extra` varchar(1024) DEFAULT NULL COMMENT '额外信息,json字段',
      `created_time` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT '创建时间',
      `last_updated_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT '修改时间',
      PRIMARY KEY (`id`),
      UNIQUE KEY `uq_bind_uid_type` (`bind_uid`,`bind_type`),
      KEY `idx_bind_uid` (`bind_uid`)
    ) ENGINE=InnoDB  DEFAULT CHARSET=utf8mb4 COMMENT='绑定记录表';
    
    • BIND_RECORD_HISTORY
    CREATE TABLE `BIND_RECORD_HISTORY` (
      `id` bigint(20) NOT NULL AUTO_INCREMENT,
      `bind_uid` bigint(20) NOT NULL COMMENT '渠道uid',
      `bind_by_uid` bigint(20) NOT NULL COMMENT '被绑定方uid',
      `bind_type` int(2) NOT NULL COMMENT '绑定规则类型,1-长期,2-连续订阅,3-短期',
      `bind_rule_ref` bigint(20) NOT NULL COMMENT '绑定规则id',
      `start_time` datetime NOT NULL COMMENT '有效起始时间',
      `end_time` datetime DEFAULT NULL COMMENT '有效终止时间',
      `created_time` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT '创建时间',
      `last_updated_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT '修改时间',
      `extra` varchar(1024) DEFAULT NULL COMMENT '额外信息,json字段',
      PRIMARY KEY (`id`),
      KEY `idx_bind_uid` (`bind_uid`)
    ) ENGINE=InnoDB AUTO_INCREMENT=245 DEFAULT CHARSET=utf8mb4 COMMENT='绑定记录历史表,删除的绑定规则会移到历史表中';
    

    表结构如上,使用InnoDB

    插入一条测试数据:

    INSERT INTO `xmds_bind`.`BIND_RECORD`(`id`, `bind_uid`, `bind_by_uid`, `bind_type`, `bind_rule_ref`, `start_time`, `end_time`, `extra`, `created_time`, `last_updated_time`) VALUES (244, 57431, 205449, 1, 75, '2018-12-07 12:43:59', '2018-12-01 12:43:59', NULL, '2018-11-29 10:55:30', '2019-02-13 16:35:40');
    

    2. 数据库环境

    查询数据的事务隔离级别如下:

    mysql> SELECT @@tx_isolation;
    +-----------------+
    | @@tx_isolation  |
    +-----------------+
    | REPEATABLE-READ |
    +-----------------+
    1 row in set (0.00 sec)
    

    REPEATABLE-READ:可重复读

    二. 死锁分析

    1. 死锁日志

    在服务运行中在日志中出现了ERROR信息:

    Deadlock found when trying to get lock; try restarting transaction
    

    由信息可以明确的知道是死锁造成的,使用下面的命令查看MySQL的日志。

    show engine innodb status
    

    找到LATEST DETECTED DEADLOCK 死锁信息如下:

    ------------------------
    LATEST DETECTED DEADLOCK
    ------------------------
    2018-12-07 12:44:00 7faa671c7700
    *** (1) TRANSACTION:
    TRANSACTION 478255595954, ACTIVE 0 sec starting index read
    mysql tables in use 1, locked 1
    LOCK WAIT 2 lock struct(s), heap size 360, 1 row lock(s)
    MySQL thread id 95407803, OS thread handle 0x7facb920e700, query id 2341515722 172.31.42.155 naliworld updating
    UPDATE BIND_RECORD  SET id = id,bind_uid = 57431,bind_by_uid = 205449,bind_type = 3,bind_rule_ref = 75,start_time = '2018-12-07 12:43:59',end_time = '2018-12-10 12:43:59' WHERE  id = 244
    *** (1) WAITING FOR THIS LOCK TO BE GRANTED:
    RECORD LOCKS space id 129635 page no 3 n bits 152 index `PRIMARY` of table `xmds-bind`.`BIND_RECORD` trx id 478255595954 lock_mode X locks rec but not gap waiting
    Record lock, heap no 40 PHYSICAL RECORD: n_fields 12; compact format; info bits 0
     0: len 8; hex 80000000000000f4; asc         ;;
     1: len 6; hex 006f5a414156; asc  oZAAV;;
     2: len 7; hex 1d0000048618fd; asc        ;;
     3: len 8; hex 800000000000e057; asc        W;;
     4: len 8; hex 8000000000032289; asc       " ;;
     5: len 4; hex 80000003; asc     ;;
     6: len 8; hex 800000000000004b; asc        K;;
     7: len 5; hex 99a18ecafb; asc      ;;
     8: len 5; hex 99a194cafb; asc      ;;
     9: SQL NULL;
     10: len 5; hex 99a18ecaf6; asc      ;;
     11: len 4; hex 5c09fa8f; asc \   ;;
    
    *** (2) TRANSACTION:
    TRANSACTION 478255595950, ACTIVE 0 sec fetching rows
    mysql tables in use 1, locked 1
    5 lock struct(s), heap size 1184, 96 row lock(s)
    MySQL thread id 95335471, OS thread handle 0x7faa671c7700, query id 2341515729 172.31.42.155 naliworld updating
    DELETE FROM BIND_RECORD  WHERE       (  end_time is not null
                
            
              
                  and end_time < '2018-12-07 12:44:00' )
    *** (2) HOLDS THE LOCK(S):
    RECORD LOCKS space id 129635 page no 3 n bits 152 index `PRIMARY` of table `xmds-bind`.`BIND_RECORD` trx id 478255595950 lock mode S
    Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
     0: len 8; hex 73757072656d756d; asc supremum;;
    
    Record lock, heap no 23 PHYSICAL RECORD: n_fields 12; compact format; info bits 0
     0: len 8; hex 8000000000000027; asc        ';;
     1: len 6; hex 006f53755353; asc  oSuSS;;
     2: len 7; hex 5900000542059a; asc Y   B  ;;
     3: len 8; hex 8000000000032083; asc         ;;
     4: len 8; hex 80000000000318fe; asc         ;;
     5: len 4; hex 80000001; asc     ;;
     6: len 8; hex 8000000000000028; asc        (;;
     7: len 5; hex 99a15b1de1; asc   [  ;;
     8: SQL NULL;
     9: len 2; hex 7b7d; asc {};;
     10: len 5; hex 99a15b1de1; asc   [  ;;
     11: len 4; hex 5beaa899; asc [   ;;
    
    ......中间省略了和上面几乎一样的一堆信息,因为是批量删除
    
    *** (2) WAITING FOR THIS LOCK TO BE GRANTED:
    RECORD LOCKS space id 129635 page no 3 n bits 152 index `PRIMARY` of table `xmds-bind`.`BIND_RECORD` trx id 478255595950 lock_mode X waiting
    Record lock, heap no 40 PHYSICAL RECORD: n_fields 12; compact format; info bits 0
     0: len 8; hex 80000000000000f4; asc         ;;
     1: len 6; hex 006f5a414156; asc  oZAAV;;
     2: len 7; hex 1d0000048618fd; asc        ;;
     3: len 8; hex 800000000000e057; asc        W;;
     4: len 8; hex 8000000000032289; asc       " ;;
     5: len 4; hex 80000003; asc     ;;
     6: len 8; hex 800000000000004b; asc        K;;
     7: len 5; hex 99a18ecafb; asc      ;;
     8: len 5; hex 99a194cafb; asc      ;;
     9: SQL NULL;
     10: len 5; hex 99a18ecaf6; asc      ;;
     11: len 4; hex 5c09fa8f; asc \   ;;
    
    *** WE ROLL BACK TRANSACTION (1)
    

    分析一下上面的日志:

    • 事务1(TRANSACTION),执行的语句为UPDATE BIND_RECORD SET id = id,bind_uid = 57431,bind_by_uid = 205449,bind_type = 3,bind_rule_ref = 75,start_time = '2018-12-07 12:43:59',end_time = '2018-12-10 12:43:59' WHERE id = 244 ;等待X锁(WAITING FOR THIS LOCK TO BE GRANTED:lock_mode X locks rec but not gap waiting),其中 but not gap waiting表示非间隙锁。
    • 事务2,执行的语句为DELETE FROM BIND_RECORD WHERE ( end_time is not nulland end_time < '2018-12-07 12:44:00' ) ;持有S锁(HOLDS THE LOCK(S)中所描述的lock mode S),等待X锁(WAITING FOR THIS LOCK TO BE GRANTED:lock_mode X waiting)

    通过上面的描述,可以大概分析出死锁产生的原因

    • 事务2当前持有了lock mode S(为什么会持有S锁还要看事务2中具体执行了哪些SQL)
    • 事务1执行UPDATE操作,UPDATE操作需要X锁(lock_mode X locks rec but not gap,和 S锁冲突),当前锁不能获取,因为它和事务2中的持有S锁冲突,所以事务1的锁请求被加入一个锁请求队列,事务1被阻塞,等待事务2释放S锁
    • 事务2执行DELETE操作,DELETE操作需要X锁(lock_mode X),所以现在情况就是,事务2等待X锁并持有S锁,但是由于事务1已经请求了一个X锁,所以不能获取到,而事务1又在等待事务2释放S锁,这样一来就出现了死锁。

    2. 如何知道SQL的锁信息

    上面提到了事务2首先持有了S锁,这个S锁是怎么持有的,怎么知道每个SQL对应的锁的信息呢。

    模拟前先设置数据库事务隔离级别,关闭自动提交:

    SET GLOBAL TRANSACTION ISOLATION LEVEL REPEATABLE READ;  //隔离级别设置为REPEATABLE READ
    SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ; //保险起见GLOBAL和SESSION都设置一下
    show variables like 'autocommit'; //查看是否开启了自动提交
    set autocommit = 0;
    

    在MySQL官网上,找到了如何打开InnoDB锁监控,具体参考https://dev.mysql.com/doc/refman/8.0/en/innodb-enabling-monitors.html

    SET GLOBAL innodb_status_output=ON; 
    SET GLOBAL innodb_status_output_locks=ON;
    

    使用上面提到的语句来试一试:

    在MySQL的终端依次输入下面的语句:

    mysql>start transaction;
    
    mysql>UPDATE BIND_RECORD  SET id = id,bind_uid = 57431,bind_by_uid = 205449,bind_type = 3,bind_rule_ref = 75,start_time = '2018-12-07 12:43:59',end_time = '2018-12-10 12:43:59' WHERE  id = 244;
    
    mysql>SHOW ENGINE INNODB STATUS;
    
    mysql>commit;
    

    在输入 SHOW ENGINE INNODB STATUS; 会打印出当前事务的锁信息如下:

    ------------
    TRANSACTIONS
    ------------
    Trx id counter 144145
    Purge done for trx's n:o < 144142 undo n:o < 0 state: running but idle
    History list length 3
    LIST OF TRANSACTIONS FOR EACH SESSION:
    ---TRANSACTION 281479443287616, not started
    0 lock struct(s), heap size 1136, 0 row lock(s)
    ---TRANSACTION 281479443286712, not started
    0 lock struct(s), heap size 1136, 0 row lock(s)
    ---TRANSACTION 281479443285808, not started
    0 lock struct(s), heap size 1136, 0 row lock(s)
    ---TRANSACTION 144144, ACTIVE 4 sec
    2 lock struct(s), heap size 1136, 1 row lock(s)
    MySQL thread id 6, OS thread handle 123145535860736, query id 106 localhost root starting
    SHOW ENGINE INNODB STATUS
    TABLE LOCK table `xmds_bind`.`bind_record` trx id 144144 lock mode IX
    RECORD LOCKS space id 616 page no 6 n bits 296 index PRIMARY of table `xmds_bind`.`bind_record` trx id 144144 lock_mode X locks rec but not gap
    Record lock, heap no 225 PHYSICAL RECORD: n_fields 12; compact format; info bits 0
     0: len 8; hex 80000000000000f4; asc         ;;
     1: len 6; hex 00000002330c; asc     3 ;;
     2: len 7; hex 2a000001e70693; asc *      ;;
     3: len 8; hex 800000000000e057; asc        W;;
     4: len 8; hex 8000000000032289; asc       " ;;
     5: len 4; hex 80000003; asc     ;;
     6: len 8; hex 800000000000004b; asc        K;;
     7: len 5; hex 99a18ecafb; asc      ;;
     8: len 5; hex 99a194cafb; asc      ;;
     9: SQL NULL;
     10: len 5; hex 99a17aadde; asc   z  ;;
     11: len 4; hex 5c63d6dc; asc \c  ;;
    

    从输出中 RECORD LOCKS space id 616 page no 6 n bits 296 index PRIMARY of tablexmds_bind.bind_recordtrx id 144144 lock_mode X locks rec but not gap 可以看出,需要一个X锁(lock_mode X locks rec but not gap)和我们上面的分析相符。

    3. 本地模拟死锁

    通过上面的方式,再分析代码,发现了事务2中有一条语句会持有S锁

    insert into BIND_RECORD_HISTORY(id, bind_uid, bind_by_uid, bind_type, bind_rule_ref,start_time, end_time, extra) select id, bind_uid, bind_by_uid, bind_type, bind_rule_ref,start_time, end_time, extra from BIND_RECORD where end_time > '2018-12-09';
    

    持有锁信息如下:

    ------------
    TRANSACTIONS
    ------------
    Trx id counter 144148
    Purge done for trx's n:o < 144147 undo n:o < 0 state: running but idle
    History list length 4
    LIST OF TRANSACTIONS FOR EACH SESSION:
    ---TRANSACTION 281479443287616, not started
    0 lock struct(s), heap size 1136, 0 row lock(s)
    ---TRANSACTION 281479443286712, not started
    0 lock struct(s), heap size 1136, 0 row lock(s)
    ---TRANSACTION 281479443285808, not started
    0 lock struct(s), heap size 1136, 0 row lock(s)
    ---TRANSACTION 144147, ACTIVE 6 sec
    91 lock struct(s), heap size 24784, 19451 row lock(s), undo log entries 1
    MySQL thread id 6, OS thread handle 123145535860736, query id 169 localhost root starting
    SHOW ENGINE INNODB STATUS
    TABLE LOCK table `xmds_bind`.`bind_record` trx id 144147 lock mode IS
    RECORD LOCKS space id 616 page no 6 n bits 296 index PRIMARY of table `xmds_bind`.`bind_record` trx id 144147 lock mode S
    Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
     0: len 8; hex 73757072656d756d; asc supremum;;
    

    下面来我们打开两个MySQL终端来模拟下,是否会出现我们预期中的死锁:

    事务1 事务2
    start transaction;
    insert into BIND_RECORD_HISTORY(id, bind_uid, bind_by_uid, bind_type, bind_rule_ref,start_time, end_time, extra) select id, bind_uid, bind_by_uid, bind_type, bind_rule_ref,start_time, end_time, extra from BIND_RECORD where end_time < '2018-12-07 12:44:00';
    start transaction;
    UPDATE BIND_RECORD SET id = id,bind_uid = 57431,bind_by_uid = 205449,bind_type = 3,bind_rule_ref = 75,start_time = '2018-12-07 12:43:59',end_time = '2018-12-10 12:43:59' WHERE id = 244;
    DELETE FROM BIND_RECORD WHERE ( end_time is not null and end_time < '2018-12-07 12:44:00' );

    按照上面的执行顺序,结果如下,报出了死锁错误:

    ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
    
    LOCK

    在知道了死锁原因之后要修改就比较简单了。

    三. 官网上的一个例子

    最后在官网上找到了类似的一个死锁例子,地址如下:

    https://dev.mysql.com/doc/refman/8.0/en/innodb-deadlock-example.html

    相关文章

      网友评论

        本文标题:记一次MySQL死锁排查

        本文链接:https://www.haomeiwen.com/subject/ekgdeqtx.html