A short write-up of two problems I recently ran into and have been thinking about. There have of course been many more issues than these, and my understanding is limited.
1. Summary of writeset-related parameters
Does MGR require binlog_transaction_dependency_tracking to be set to WRITESET? Let's first summarize what the relevant parameters mean and how they work.
- transaction_write_set_extraction: controls whether writesets are generated and which hash algorithm is used. Generating a writeset has a cost, since it is built in a loop over every modified row and every column; see the checks inside the binlog_log_row function (enabled by default in 8.0, disabled by default in 5.7).
Once generated, a writeset usually serves two purposes:
- When binlog_transaction_dependency_tracking on the source is set to WRITESET or WRITESET_SESSION, it is used to lower the last_committed values generated by order_commit, which improves parallel apply on the replica.
- Under MGR the writeset plays a more important role: conflict certification. It is wrapped into a Transaction_context_log_event, so MGR cannot work without writesets, and therefore transaction_write_set_extraction must be set.
So does MGR have to set binlog_transaction_dependency_tracking to WRITESET? Both testing and theory say no. InnoDB Cluster, however, makes it mandatory (its checks fail otherwise), mainly so that the MGR applier channel applies binlogs more efficiently; it is not a prerequisite for starting MGR itself. A quick check is shown below.
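As a quick, hedged illustration of the two variables discussed above (assuming MySQL 8.0, where transaction_write_set_extraction already defaults to XXHASH64; on 5.7 it defaults to OFF and must be set explicitly for MGR):
-- check whether writesets are generated and which dependency tracking mode is active
SHOW VARIABLES LIKE 'transaction_write_set_extraction';        -- MGR requires XXHASH64, not OFF
SHOW VARIABLES LIKE 'binlog_transaction_dependency_tracking';  -- COMMIT_ORDER / WRITESET / WRITESET_SESSION
-- optional: use writesets for dependency tracking to raise applier parallelism;
-- MGR itself also starts and certifies fine with the default COMMIT_ORDER
SET GLOBAL binlog_transaction_dependency_tracking = 'WRITESET';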
2. A single transaction's binlog larger than 1G cannot be restored with mysqlbinlog | mysql
This is a problem I ran into recently while restoring data; it can also be reproduced on 8.0.28, as follows.
set global max_allowed_packet=1024*1024*1024; (this is already the maximum value)
mysql> show variables like '%max_allowed_packet%';
+----------------------------+------------+
| Variable_name              | Value      |
+----------------------------+------------+
| max_allowed_packet         | 1073741824 |
| mysqlx_max_allowed_packet  | 67108864   |
| replica_max_allowed_packet | 1073741824 |
| slave_max_allowed_packet   | 1073741824 |
+----------------------------+------------+
1、create table test(id int,name text);
2、insert into test values(1,repeat('a',2000));
3、Repeat the insert into ... select statement until the table's data file is larger than 2G:
mysql> insert into test select * from test;
Records: 524288 Duplicates: 0 Warnings: 0
4、flush binary logs;
5、delete from test;
Check that the new binlog is larger than 1G, for example binlog.000012.
6、mysqlbinlog binlog.000012 | mysql
ERROR 1153 (08S01) at line 297718: Got a packet bigger than 'max_allowed_packet' bytes
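An optional, hedged way to confirm what is in the oversized file (binlog.000012 is the file from the reproduction above): list its first events on the server and check that the GTID/BEGIN events are followed by one long run of Delete_rows events belonging to a single transaction.
-- only the first events are listed; dumping all events of a 1G+ file would be slow
SHOW BINLOG EVENTS IN 'binlog.000012' LIMIT 20;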
The rough description is below; I am simply pasting the bug report I submitted. The MySQL team does not consider this a bug, but I think it is a significant problem. What follows is the bug report, which also includes some of my analysis.
When we use mysqlbinlog | mysql to recover data, and the binlog events of a single transaction are larger than 1G (the maximum allowed value of max_allowed_packet is 1G), we get an error:
-rw-r----- 1 mysql mysql 1.2G May 24 16:10 binlog.000012
mysqlbinlog binlog.000012 | mysql -f sbtest
error:
ERROR 1153 (08S01) at line 297718: Got a packet bigger than 'max_allowed_packet' bytes
In batch mode the mysql client accumulates all of a transaction's binlog events in its batch buffer and then sends them to mysqld; mysqld stores the whole transaction in net->buff.
When that exceeds the maximum value of max_allowed_packet (1G), we get this error.
Is this a bug? Sometimes mysqlbinlog | mysql is the only way to recover data.
We changed the upper limit of max_allowed_packet to 8G and that works, but I assume there is some reason for capping this parameter at 1G.
In master-slave replication each event is sent as one network packet; see the event reading code used by the dump thread (5.7):
Log_event::read_log_event():
  // a single event must not exceed max_allowed_packet
  // (or opt_binlog_rows_event_max_size + MAX_LOG_EVENT_HEADER, whichever is larger)
  if (data_len < LOG_EVENT_MINIMAL_HEADER_LEN ||
      data_len > max(current_thd->variables.max_allowed_packet,
                     opt_binlog_rows_event_max_size + MAX_LOG_EVENT_HEADER))
So a single event larger than 1G is a rare situation.
But with mysqlbinlog | mysql, all events of one transaction arrive at mysqld as a single network packet. Unlike master-slave replication, a packet larger than the 1G upper limit of max_allowed_packet is very common here.
I think the "can change it in realtime when needed" behavior corresponds to the net_realloc function (5.7):
net_read_packet()
  -> net_realloc()                          // grows net->buff as the packet is read
       if (length >= net->max_packet_size)  // rejects the packet once the cap is reached
But net->max_packet_size is capped at 1024 * 1024 * 1024, so the upper limit of this "realtime change" is 1G.
static Sys_var_ulong Sys_max_allowed_packet(
"max_allowed_packet",
"Max packet length to send to or receive from the server",
SESSION_VAR(max_allowed_packet), CMD_LINE(REQUIRED_ARG),
VALID_RANGE(1024, 1024 * 1024 * 1024), DEFAULT(4096 * 1024),
BLOCK_SIZE(1024), NO_MUTEX_GUARD, NOT_IN_BINLOG,
ON_CHECK(check_max_allowed_packet));
In my case I changed VALID_RANGE(1024, 1024 * 1024 * 1024) to VALID_RANGE(1024, 8l * 1024 * 1024 * 1024), which raises the upper limit to 8G; after set global max_allowed_packet=8*1024*1024*1024 the restore works fine. We also added some log output in the net_realloc function, like:
2022-05-25T13:44:21.773569Z 5 [Note] net packet size 16777215(init value is 16M)
2022-05-25T13:44:21.916363Z 5 [Note] net packet size 33554430
...
2022-05-25T13:46:20.902361Z 5 [Note] net packet size 1006632900
2022-05-25T13:46:21.286737Z 5 [Note] net packet size 1023410115
2022-05-25T13:46:21.743261Z 5 [Note] net packet size 1040187330
2022-05-25T13:46:22.187788Z 5 [Note] net packet size 1056964545
2022-05-25T13:46:22.630998Z 5 [Note] net packet size 1073741760(max value is 1G)
2022-05-25T13:46:22.728740Z 5 [Note] Aborted connection 5 to db: 'unconnected' user: 'dbatest' host: '192.168.1.66' (Got a packet bigger than 'max_allowed_packet' bytes)
This confirms that the upper limit of the "realtime change" is 1G; since the events of this transaction add up to more than 1G, the error occurs.
Note: after later discussing this with a friend (轻松的鱼), besides modifying the code, it may also be possible to copy the binlog to a replica, rename it as a relay log, and replay it directly with the SQL thread. The steps are as follows (a couple of verification statements follow the list):
- Copy the binlog to the replica's tmp directory, rename it as a relay log, and rebuild the relay log index; mind file ownership and permissions.
- On the replica run reset slave all; and rm -rf relaylog* to remove the existing relay information from the relay log directory.
- cp -a /tmp/relaylog.* ./ to copy the files into the relay log directory; again mind ownership and permissions.
- Set up the channel with the following statement. No connection to a source is needed; the pos is the position in the binlog from which recovery should start, which of course comes from the backup file.
CHANGE MASTER TO RELAY_LOG_FILE='relaylog.000020',RELAY_LOG_POS=1301, MASTER_HOST='nohost';
An extra relay log will appear at this point; that is fine. The starting pos 1301 is the position of the GTID event that begins a transaction.
- Specify the position to recover up to; either a pos or a GTID works:
start slave sql_thread until RELAY_LOG_FILE = 'relaylog.000022', RELAY_LOG_POS = 901;
start slave sql_thread until SQL_BEFORE_GTIDS = '00320cc8-39f9-11ec-b5ba-000c2929706d:46';
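A couple of hedged verification statements on the replica (the file name, position, and GTID below are the illustrative values from the steps above):
-- confirm that pos 1301 really is the GTID event opening the first transaction to replay
SHOW RELAYLOG EVENTS IN 'relaylog.000020' FROM 1301 LIMIT 5;
-- after START SLAVE SQL_THREAD ... UNTIL, watch the SQL thread until it stops at the target
SHOW SLAVE STATUS\G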