死锁日志
*************************** 1. row ***************************
Type: InnoDB
Name:
Status:
=====================================
2023-06-13 14:11:44 0x7f1febad8700 INNODB MONITOR OUTPUT
=====================================
Per second averages calculated from the last 25 seconds
-----------------
BACKGROUND THREAD
-----------------
srv_master_thread loops: 27984349 srv_active, 0 srv_shutdown, 27539402 srv_idle
srv_master_thread log flush and writes: 55523751
----------
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 23061421
OS WAIT ARRAY INFO: signal count 67625116
RW-shared spins 0, rounds 90130459, OS waits 21606013
RW-excl spins 0, rounds 168864564, OS waits 391828
RW-sx spins 10886910, rounds 129682783, OS waits 864374
Spin rounds per wait: 90130459.00 RW-shared, 168864564.00 RW-excl, 11.91 RW-sx
------------------------
LATEST DETECTED DEADLOCK
------------------------
2023-06-03 18:20:06 0x7f1fc46e5700
*** (1) TRANSACTION:
TRANSACTION 225089659, ACTIVE 1 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 320 lock struct(s), heap size 41168, 4 row lock(s)
MySQL thread id 5619472, OS thread handle 139774417090304, query id 1142417448 172.16.10.121 application updating
UPDATE `saas_application`.`tenant_account_statistics`
SET settlement = settlement - 198.00,
to_account = to_account + 198.00,
withdrawable = withdrawable + 198.00
WHERE shop_id = 1531550363716599810
AND belong_id = 1501384359112056834
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 494 page no 5 n bits 136 index index_shopid_belongid of table `saas_application`.`tenant_account_statistics` trx id 225089659 lock_mode X locks rec but not gap waiting
Record lock, heap no 40 PHYSICAL RECORD: n_fields 3; compact format; info bits 0
0: len 8; hex 154128f0f8f2c002; asc A( ;;
1: len 8; hex 14d5fd1eb433a002; asc 3 ;;
2: len 4; hex 00000031; asc 1;;
*** (2) TRANSACTION:
TRANSACTION 225089658, ACTIVE 1 sec fetching rows, thread declared inside InnoDB 676
mysql tables in use 1, locked 1
323 lock struct(s), heap size 41168, 5 row lock(s), undo log entries 2
MySQL thread id 5619462, OS thread handle 139774416279296, query id 1142417475 172.16.10.121 application updating
UPDATE `saas_application`.`tenant_account_statistics_details`
SET status = 1,
actual_settlement_time = NOW()
WHERE order_no = 'O2305101612286510'
and `type` = 0
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 494 page no 5 n bits 136 index index_shopid_belongid of table `saas_application`.`tenant_account_statistics` trx id 225089658 lock_mode X locks rec but not gap
Record lock, heap no 40 PHYSICAL RECORD: n_fields 3; compact format; info bits 0
0: len 8; hex 154128f0f8f2c002; asc A( ;;
1: len 8; hex 14d5fd1eb433a002; asc 3 ;;
2: len 4; hex 00000031; asc 1;;
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 727 page no 473 n bits 152 index PRIMARY of table `saas_application`.`tenant_account_statistics_details` trx id 225089658 lock_mode X locks rec but not gap waiting
Record lock, heap no 33 PHYSICAL RECORD: n_fields 24; compact format; info bits 0
0: len 4; hex 80004bbc; asc K ;;
1: len 6; hex 00000cd4ab0b; asc ;;
2: len 7; hex f6000001ca0110; asc ;;
3: len 4; hex 8000aa75; asc u;;
4: len 17; hex 4f32333035313031363131343938333437; asc O2305101611498347;;
5: len 4; hex 80000000; asc ;;
6: SQL NULL;
7: len 8; hex 94d5fd1eb433a002; asc 3 ;;
8: len 8; hex 9520fd639933b002; asc c 3 ;;
9: len 18; hex e5a4a9e9a290e59586e59f8ee5b9b3e58fb0; asc ;;
10: len 8; hex 954128f0f8f2c002; asc A( ;;
11: len 30; hex e5a4a9e9a290e88cb6e59381e799bde88dafe99b86e59ba2e58685e8b4ad; asc ; (total 33 bytes);
12: len 8; hex 94d5fd1eb433a002; asc 3 ;;
13: len 6; hex e680bbe5ba97; asc ;;
14: len 1; hex 80; asc ;;
15: len 4; hex 80000000; asc ;;
16: len 1; hex 80; asc ;;
17: len 5; hex 800000c600; asc ;;
18: len 5; hex 99b0150304; asc ;;
19: SQL NULL;
20: SQL NULL;
21: SQL NULL;
22: len 20; hex 4f32333035313030383131353139363031313233; asc O2305100811519601123;;
23: len 21; hex 423233303531303136313135323038333032313030; asc B23051016115208302100;;
根据sql找到代码
image.png分析对应的死锁日志
事务1死锁
image.png从这段分析出,这是一个事务,事务id = 225089659 ,
死锁的语句是:涉及的表是 tenant_account_statistics 这张表
UPDATE `saas_application`.`tenant_account_statistics`
SET settlement = settlement - 198.00,
to_account = to_account + 198.00,
withdrawable = withdrawable + 198.00
WHERE shop_id = 1531550363716599810
AND belong_id = 1501384359112056834
涉及到 lock_mode X locks 是排他锁,行锁的一种。对应锁的索引是 index index_shopid_belongid 这是一个联合索引,shop_id 和 belong_id 的联合索引,
分析
1.此时该事务, 在尝试获取表 tenant_account_statistics 的 index index_shopid_belongid 对应非聚合索引的行锁。
事务2死锁
image.png从这段分析出,这是一个事务,事务id = 225089658 ,
死锁的语句是:涉及的表是 tenant_account_statistics_details 这张表
UPDATE `saas_application`.`tenant_account_statistics_details`
SET status = 1,
actual_settlement_time = NOW()
WHERE order_no = 'O2305101612286510'
and `type` = 0
涉及到 lock_mode X locks 是排他锁,行锁的一种。对应锁的索引是 index index_shopid_belongid 这是一个联合索引,shop_id 和 belong_id 的联合索引。
分析
1.这个事务操作获取到了表 tenant_account_statistics 的 index index_shopid_belongid 索引对应的行锁。
2.这个事务操作此时等待表 tenant_account_statistics_details 的 primary 主键索引
分析代码
image.pngsettlement(jsons)代码如下
image.png
死锁分析
1.首先查看系统的数据库,mysql的隔离级别是read-commited ,读已提交,所以所有的查询都不会加锁。直接排除代码里面大事务里面的查询,造成的行锁问题。
2.那么只有update造成的行锁互相竞争,我们将事务中update操作的sql都查看一遍。发现就是更新两张表,一张是 tenant_account_statistics_details ,一张表是 tenant_account_statistics 。
3.分析死锁产生的流程:先看下面的执行sql
## SQL的执行顺序,也就是加锁的顺序
## ===============================================111111
UPDATE tenant_account_statistics_details
SET order_time = #{orderTime},
settlement_time = #{settlementTime}
WHERE order_no = #{orderNo}
AND type = 0
AND status = 0
## ================================================222222
UPDATE `saas_application`.`tenant_account_statistics`
SET settlement = settlement - #{amount},
to_account = to_account + #{amount},
withdrawable = withdrawable + #{amount}
WHERE shop_id = #{shopId}
AND belong_id = #{belongId}
## =================================================333333
UPDATE `saas_application`.`tenant_account_statistics_details`
SET status = 1,
actual_settlement_time = NOW()
WHERE order_no = #{orderNo}
and `type` = #{type}
image.png
事务在执行的时候,会同时去抢夺 tenant_account_statistics_details 和 tenant_account_statistics 两张表对应记录的行锁,并且持有的锁只有在事务提交或回滚后才会释放。
事务1和事务2同时执行,并且操作两张表的同一条记录的时候,极端的情况是:
事务1获得了 tenant_account_statistics_details 表的对应记录的行锁,等待 tenant_account_statistics 表的行锁。
事务2获得了 tenant_account_statistics 表的对应记录的行锁,等待 tenant_account_statistics_details 表对应的行锁。
此时形成了 两个事务相互等待对方的锁释放,无法回滚,无法提交,造成了死锁,数据库在超时后,会回滚undolog中操作影响小的事务。
死锁解决
死锁的原因找到了,然后就是解决思路:
1.首先避免同一时间操作对应表的对应的同一条记录。业务的逻辑无法避免这块,除非重新设计。
2.其次如何让事物操作所消耗的时间尽可能的短,这样就能减少两个事务操作同一条记录的冲突行为。排查一下sql,以及对应表的索引,发现
UPDATE tenant_account_statistics_details
SET order_time = #{orderTime},
settlement_time = #{settlementTime}
WHERE order_no = #{orderNo}
AND type = 0
AND status = 0
这个sql语句中根据 order_no去做更新操作,而order_no不是索引,所以会引起全表扫描,从而增加了这行sql的执行时间,从而也加大了整个事务的执行时间,从而提高了两个事务同时操作两张表对应的同一条数据概率,造成死锁的几率也大大提高了。
于是我给 tenant_account_statistics_details 表添加了 order_no 索引,再本地做了并发是500的压测,发现不再出现死锁现象。从而解决了业务上死锁造成的数据错误问题。
3.最后,我们还可以在这个大事务上加一个分布式锁,让所有的事务串行化执行,这样肯定也不会造成死锁,但是会减少接口的并发吞吐量。
网友评论