VIP Failover and I/O Thread Disconnection: Things to Watch For

Author: 重庆八怪 | Published 2019-08-06 22:07

    Consider the following question:

     
    A----B
     \  /
     VIP
      | 
      C
    

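    In this topology, A and B switch roles and the VIP moves to the new master. Does the replica C run into trouble? The conclusion is that position-based replication (POS) cannot be relied on, whereas GTID-based replication works as long as the A/B switchover itself loses no transactions. The reasoning follows: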


    When the I/O thread loses its connection to the master's IP, it enters the reconnect flow. That flow goes through the following logic:

    event_len= read_event(mysql, mi, &suppress_warnings); // returns the length of the event read
    

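    On failure, event_len is set to packet_error, and the actual error number is then fetched with mysql_errno, as follows: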

    if (event_len == packet_error)
          {
            uint mysql_error_number= mysql_errno(mysql);
            switch (mysql_error_number) {
            case CR_NET_PACKET_TOO_LARGE:
              sql_print_error("\
    Log entry on master is longer than slave_max_allowed_packet (%lu) on \
    slave. If the entry is correct, restart the server with a higher value of \
    slave_max_allowed_packet",
                             slave_max_allowed_packet);
              mi->report(ERROR_LEVEL, ER_NET_PACKET_TOO_LARGE,
                         "%s", "Got a packet bigger than 'slave_max_allowed_packet' bytes");
              goto err;
            case ER_MASTER_FATAL_ERROR_READING_BINLOG:
              mi->report(ERROR_LEVEL, ER_MASTER_FATAL_ERROR_READING_BINLOG,
                         ER(ER_MASTER_FATAL_ERROR_READING_BINLOG),
                         mysql_error_number, mysql_error(mysql));
              goto err;
            case ER_OUT_OF_RESOURCES:
              sql_print_error("\
    Stopping slave I/O thread due to out-of-memory error from master");
              mi->report(ERROR_LEVEL, ER_OUT_OF_RESOURCES,
                         "%s", ER(ER_OUT_OF_RESOURCES));
              goto err;
            }
            if (try_to_reconnect(thd, mysql, mi, &retry_count, suppress_warnings,
                                 reconnect_messages[SLAVE_RECON_ACT_EVENT]))
              goto err;
            goto connected;
          } 
    
    

    Some of the errors above cannot be retried; see the code for details. If the reconnect succeeds, execution continues with goto connected;
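The branch structure of the excerpt can be modelled in a few lines. This is a minimal sketch, not the server's code: `ReadError`, `NextStep`, `classify` and `IoThreadModel` are hypothetical names introduced for illustration, and only the three non-retryable cases shown in the excerpt are modelled.

```cpp
#include <cassert>

// Hypothetical, simplified error categories. The comments name the
// corresponding cases in the excerpt above.
enum class ReadError {
    PacketTooLarge,     // CR_NET_PACKET_TOO_LARGE
    MasterFatalBinlog,  // ER_MASTER_FATAL_ERROR_READING_BINLOG
    OutOfResources,     // ER_OUT_OF_RESOURCES
    ConnectionLost      // any other failure, e.g. the master's IP went away
};

enum class NextStep { Stop, Reconnect };

// The three listed errors jump to err; everything else tries to reconnect.
NextStep classify(ReadError e) {
    switch (e) {
    case ReadError::PacketTooLarge:
    case ReadError::MasterFatalBinlog:
    case ReadError::OutOfResources:
        return NextStep::Stop;
    case ReadError::ConnectionLost:
        return NextStep::Reconnect;
    }
    return NextStep::Stop;
}

// Toy model of the I/O thread: counts how often the dump request is re-sent.
struct IoThreadModel {
    int dump_requests = 0;

    // Returns true if the thread keeps running after the read error.
    bool on_read_error(ReadError e, bool reconnect_ok) {
        if (classify(e) == NextStep::Stop) return false;  // goto err
        if (!reconnect_ok) return false;                  // reconnect failed: goto err
        ++dump_requests;                                  // goto connected: dump is requested again
        return true;
    }
};
```

The point of the model is the last branch: a successful reconnect does not resume the old stream, it issues a fresh dump request.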

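    From there the whole connection flow runs again. Most importantly, both GTID mode and position mode go through the DUMP thread's locating step once more. In GTID mode, the master searches its binlogs again and locates the starting point from the GTIDs.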
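A rough way to picture GTID locating is as a set difference: the replica reports which GTIDs it already has, and the master sends whatever is missing from its own binlogs. The sketch below assumes a flat list of transactions and string GTIDs; `Txn`, `missing_txns` and the `uuid_a:N` values are hypothetical and do not reflect the server's data structures.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// One transaction in a (hypothetical) master binlog.
struct Txn {
    std::string gtid;           // e.g. "uuid_a:3" (made-up value)
    unsigned long long offset;  // byte offset in the binlog, unused in GTID mode
};

// Returns, in binlog order, the GTIDs the master still has to send:
// every transaction whose GTID is not in the replica's set.
std::vector<std::string> missing_txns(const std::vector<Txn>& master_binlog,
                                      const std::set<std::string>& replica_gtids) {
    std::vector<std::string> out;
    for (const auto& t : master_binlog) {
        if (replica_gtids.count(t.gtid) == 0) out.push_back(t.gtid);
    }
    return out;
}
```

Because the lookup never consults `offset`, the result depends only on which transactions the master's binlogs contain, not on where they sit in the file.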

    So we can confirm that, for a topology like this:

     
    A----B
     \  /
     VIP
      | 
      C
    

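    once the VIP switch completes, C has no problem in GTID mode, provided the A/B switchover is lossless. Position mode does not work, because the binlog positions on A and on B are not guaranteed to match. This is something to keep in mind.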
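The position problem can be shown with a toy example. It assumes A and B hold the same four transactions at different byte offsets. All names, GTIDs and offsets are made up, and the model is deliberately simplified: a real master asked for an offset that falls in the middle of an event would more likely report an error than silently skip ahead.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One transaction in a (hypothetical) binlog.
struct Txn {
    std::string gtid;          // made-up GTID such as "uuid_a:2"
    unsigned long long start;  // byte offset where the transaction begins
};

// Position-based resume in this toy model: the first transaction that starts
// at or after pos. Returns an empty string if there is none.
std::string resume_by_position(const std::vector<Txn>& binlog, unsigned long long pos) {
    for (const auto& t : binlog) {
        if (t.start >= pos) return t.gtid;
    }
    return "";
}

// Same transactions on both masters, but different event sizes and offsets.
std::vector<Txn> binlog_a() {
    return {{"uuid_a:1", 154}, {"uuid_a:2", 420}, {"uuid_a:3", 686}, {"uuid_a:4", 952}};
}
std::vector<Txn> binlog_b() {
    return {{"uuid_a:1", 154}, {"uuid_a:2", 354}, {"uuid_a:3", 554}, {"uuid_a:4", 754}};
}
```

Suppose C has applied the first two transactions from A, so its next position is 686. On A that offset is the start of `uuid_a:3`. On B it lies inside `uuid_a:3`, and the model resumes at `uuid_a:4`, so one transaction is lost.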

    This is easy to verify: I only need to take the master's IP down and bring it back up a little later. The log looks like this:

    2019-08-06T21:30:29.723923+08:00 4 [ERROR] Slave I/O for channel '': error reconnecting to master 'repl@192.168.99.101:3340' - retry-time: 60  retries: 1, Error_code: 2003
    

    We set breakpoints as follows.

    On the replica, the breakpoint is on the request_dump function, and it is hit like this:

    (gdb) bt
    #0  request_dump (thd=0x7ffe800009a0, mysql=0x7ffe8000e670, mi=0x7ffe7c0223b0, suppress_warnings=0x7fffec0c5d8b)
        at /root/mysqlall/percona-server-locks-detail-5.7.22/sql/rpl_slave.cc:4363
    #1  0x00000000018beee1 in handle_slave_io (arg=0x7ffe7c0223b0) at /root/mysqlall/percona-server-locks-detail-5.7.22/sql/rpl_slave.cc:5768
    #2  0x0000000001945620 in pfs_spawn_thread (arg=0x7ffe7c033f90) at /root/mysqlall/percona-server-locks-detail-5.7.22/storage/perfschema/pfs.cc:2190
    #3  0x00007ffff7bc6aa1 in start_thread () from /lib64/libpthread.so.0
    #4  0x00007ffff6719bcd in clone () from /lib64/libc.so.6
    

    On the master, the breakpoint is on com_binlog_dump_gtid:

    (gdb) bt
    #0  com_binlog_dump_gtid (thd=0x7fffe800edc0, packet=0x7fffe80068c1 "", packet_length=43) at /mysqldata/percona-server-locks-detail-5.7.22/sql/rpl_master.cc:356
    #1  0x00000000015c769b in dispatch_command (thd=0x7fffe800edc0, com_data=0x7fffec58bd70, command=COM_BINLOG_DUMP_GTID)
        at /mysqldata/percona-server-locks-detail-5.7.22/sql/sql_parse.cc:1705
    #2  0x00000000015c58ff in do_command (thd=0x7fffe800edc0) at /mysqldata/percona-server-locks-detail-5.7.22/sql/sql_parse.cc:1021
    #3  0x000000000170e578 in handle_connection (arg=0x6660220) at /mysqldata/percona-server-locks-detail-5.7.22/sql/conn_handler/connection_handler_per_thread.cc:312
    #4  0x0000000001945538 in pfs_spawn_thread (arg=0x665f200) at /mysqldata/percona-server-locks-detail-5.7.22/storage/perfschema/pfs.cc:2190
    #5  0x00007ffff7bcfaa1 in start_thread () from /lib64/libpthread.so.0
    #6  0x00007ffff6b37c4d in clone () from /lib64/libc.so.6
    
    

    What this demonstrates is that even when the I/O thread merely reconnects to the master, the GTID locating step runs again.
