美文网首页
MySQL:5.7.25 truncate table导致的系统

MySQL:5.7.25 truncate table导致的系统

作者: 重庆八怪 | 来源:发表于2022-02-25 20:59 被阅读0次

    仅记录一个故障的分析过程,如果有误请见谅。


    一、问题来源

    最近一个朋友在做大表truncte的时候出现了堵塞。SEMAPHORES几乎都是DICT_SYS


    image.png

    对于show processlist显示如下,并且killed标记已经不做响应了。


    image.png

    从时间来看我们发现truncate命令执行的时间最长,那么可能是它造成的问题,但是为什么会造成问题呢。结合pstack简单分析一下

    二、查看pstack栈分析

    这里先贴出堵塞源头的truncate语句的栈,如下:

    Thread 178 (Thread 0x7f5ed0e8d700 (LWP 135097)):
    #0  mach_read_from_1 (b=<optimized out>) at /export/home/pb2/build/sb_0-32013917-1545390211.74/mysql-5.7.25/storage/innobase/rem/rem0rec.cc:561
    #1  rec_get_bit_field_1 (rec=0x7f5eab8394df "931706384825733120", shift=0, mask=7, offs=3) at /export/home/pb2/build/sb_0-32013917-1545390211.74/mysql-5.7.25/storage/innobase/include/rem0rec.ic:168
    #2  rec_get_status (rec=0x7f5eab8394df "931706384825733120") at /export/home/pb2/build/sb_0-32013917-1545390211.74/mysql-5.7.25/storage/innobase/include/rem0rec.ic:472
    #3  rec_get_offsets_func (rec=0x7f5eab8394df "931706384825733120", index=0x7f5d9f04ad88, offsets=0x7f5d71627d08, n_fields=1, heap=0x7f5ed0e88ba8) at /export/home/pb2/build/sb_0-32013917-1545390211.74/mysql-5.7.25/storage/innobase/rem/rem0rec.cc:561
    #4  0x000000000111895f in btr_search_drop_page_hash_index (block=0x7f5ea7e48570) at /export/home/pb2/build/sb_0-32013917-1545390211.74/mysql-5.7.25/storage/innobase/btr/btr0sea.cc:1278
    #5  0x000000000111b0e9 in btr_search_drop_page_hash_when_freed (page_id=..., page_size=...) at /export/home/pb2/build/sb_0-32013917-1545390211.74/mysql-5.7.25/storage/innobase/btr/btr0sea.cc:1387
    #6  0x000000000119f026 in fseg_free_extent (seg_inode=<optimized out>, space=116, page_size=..., page=192256, ahi=<optimized out>, mtr=<optimized out>) at /export/home/pb2/build/sb_0-32013917-1545390211.74/mysql-5.7.25/storage/innobase/fsp/fsp0fsp.cc:3798
    #7  0x00000000011a317b in fseg_free_step (header=<optimized out>, ahi=true, mtr=0x7f5ed0e893c0) at /export/home/pb2/build/sb_0-32013917-1545390211.74/mysql-5.7.25/storage/innobase/fsp/fsp0fsp.cc:3890
    #8  0x00000000010f45bd in btr_free_but_not_root (block=0x7f5e9f76e3e8, log_mode=MTR_LOG_NO_REDO) at /export/home/pb2/build/sb_0-32013917-1545390211.74/mysql-5.7.25/storage/innobase/btr/btr0btr.cc:1152
    #9  0x00000000010f48ec in btr_free_if_exists (page_id=..., page_size=..., index_id=153, mtr=0x7f5ed0e89a00) at /export/home/pb2/build/sb_0-32013917-1545390211.74/mysql-5.7.25/storage/innobase/btr/btr0btr.cc:1200
    #10 0x000000000114f86d in dict_drop_index_tree (rec=<optimized out>, pcur=<optimized out>, mtr=0x7f5ed0e89a00) at /export/home/pb2/build/sb_0-32013917-1545390211.74/mysql-5.7.25/storage/innobase/dict/dict0crea.cc:1160
    #11 0x000000000107ab0c in DropIndex::operator() (this=0x7f5ed0e8b210, mtr=0x7f5ed0e89a00, pcur=0x7f5ed0e89ed0) at /export/home/pb2/build/sb_0-32013917-1545390211.74/mysql-5.7.25/storage/innobase/row/row0trunc.cc:947
    #12 0x0000000001083d2a in for_each<DropIndex> (callback=..., this=<optimized out>) at /export/home/pb2/build/sb_0-32013917-1545390211.74/mysql-5.7.25/storage/innobase/row/row0trunc.cc:102
    #13 SysIndexIterator::for_each<DropIndex> (this=<optimized out>, callback=...) at /export/home/pb2/build/sb_0-32013917-1545390211.74/mysql-5.7.25/storage/innobase/row/row0trunc.cc:160
    #14 0x000000000107fd2d in row_truncate_table_for_mysql (table=<optimized out>, trx=0x7f5ec0c00ad0) at /export/home/pb2/build/sb_0-32013917-1545390211.74/mysql-5.7.25/storage/innobase/row/row0trunc.cc:2022
    
    

    实际上我们的truncate有很多步骤,其中步骤9为

    Step-9: Drop all indexes (this include freeing of the pages
        associated with them).
    

    这一步中包含3个代价较高的部分,从DEBUG DICT_SYC的加锁和释放来看,5.7.22版本如下:

    • 持有DICT_SYS维护AHI,并且持有较早在row_truncate_table_for_mysql函数的1844行持有,释放在2071行的下层函数row_mysql_unlock_data_dictionary中,随后进行了内存维护。
    • 不持有DICT_SYS维护innodb buffer主要接口为buf_LRU_remove_pages。
    • 不持有DICT_SYS,unlink文件。

    如上故障的栈正是在进行AHI的维护,并且在dict_drop_index_tree中开头就对ut_ad(mutex_own(&dict_sys->mutex))进行了断言,也就是说这里肯定是在持有dict_sys锁的情况下进行的,那么我们来看一下到底发生了什么,我贴出大概的执行的步骤如下:

    循环每一个index
    btr_free_but_not_root
      释放所有的非root节点
      循环每个extent进行释放
      ->fseg_free_step
        主要释放一个extent
        是否开启了AHI
         循环extent
         ->fseg_free_extent        
            是否开启了AHI,如果是,循环每一个page
           ->btr_search_drop_page_hash_when_freed
             对单个的page进行处理,
             首先调用buf_page_get_gen,标记为BUF_PEEK_IF_IN_POOL,看page是否在pool中
             /*!< get if in pool, do not make the block young in the LRU list */
             如果block不在pool中直接返回,不做处理。
             如果block存在与pool中,则
               ->btr_search_drop_page_hash_index
                ->循环block中的每行记录 (这个步骤最慢)
                  ->计算每个字段的偏移量
                  ->循环每个字段 
                并且然后flod放入flod数组中。每行都包含这个值         
                ->根据构建的flod值drop AHI信息
                  ha_remove_all_nodes_to_page                
      每次extent的删除进行mtr 提交
    
    

    稍微总结一下如下,下面做这些的目的是精准的清理掉AHI中的信息:

    循环每一个index
       循环每个extent
          AHI 如果开启,则
          循环每一个page
             如果page在pool中(注意这里)
             循环每一行
                循环每一个字段
    

    我们发现这实际上是一个5层循环,代价最高的落在最后的循环每个page中每一行的每个字段上,如果page存在于这正是这个栈带给我们的信息。
    这些步骤需要持有DICT_SYS这样一个全局字典结构的保护锁进行,而在很多地方比如如下的函数中都会持有这个锁进行互斥保护:

    - >ha_innobase::get_foreign_key_list:获取外键列表
    - >ha_innobase::open:每当table cache不在的时候都会调用它取建立。
    

    这2个函数也是本例中等待的会话,并且在进行mutex的带的时候并不会被kill标记唤醒,因此kill是不能生效的。我在查询BUG的发现也有不少人遇到类似的问题如下:
    https://bugs.mysql.com/bug.php?id=91977
    但是bug的状态一直没有变为closed,在这个BUG的最后也有人问出问题:

    image.png
    因此不确认是否已经改进了。

    三、5.7中drop和truncate在内存维护上的区别

    其次我们来也简单讨论下truncate在innodb buffer维护上的代价,对于5.7来讲这个地方和drop是有区别的,函数接口 buf_LRU_flush_or_remove_pages 用于确认是否维护 LRU list,其中有三种类型:

    /** Algorithm to remove the pages for a tablespace from the buffer pool.
    See buf_LRU_flush_or_remove_pages(). */
    enum buf_remove_t {
        BUF_REMOVE_ALL_NO_WRITE,    /*!< Remove all pages from the buffer
                        pool, don't write or sync to disk */  
        BUF_REMOVE_FLUSH_NO_WRITE,  /*!< Remove only, from the flush list,
                        don't write or sync to disk */
        BUF_REMOVE_FLUSH_WRITE      /*!< Flush dirty pages to disk only
                        don't remove from the buffer pool */
    };
    
    • drop为:BUF_REMOVE_FLUSH_NO_WRITE,需要维护flush list,不回写数据
    • trunacte为:BUF_REMOVE_ALL_NO_WRITE,需要维护flush list和lru list,不回写数据

    因此5.7最好使用drop+create代替trucnate。

    四、测试和建议

    随后我对一个稍微大的表truncate,测试发现代价如我们描述:


    image.png

    正如我们描述的,我这里实际上没有建立AHI,但是全表扫描了一次,那么大量的page加载到了innodb buffer。这里btr_search_drop_page_hash_when_freed实际就是AHI维护的上层接口,如果有大量的AHI的存在,代价确实客观,并且这个是持有DICT_SYS的操作。

    对于AHI的维护方面直到8.0.21当时我看一个增加分区慢的case的时候依旧存在一些问题提交的BUG如下:
    https://bugs.mysql.com/bug.php?id=101900

    因此建议如下:

    • 对于大表的truncate操作,尽量保存在innodb buffer中没有相关page了,自然的淘汰的page也会单页维护AHI。如果page在innodb buffer中不存在,自然不会进入AHI维护。
    • 对于大表的truncate操作,5.7还是建议用drop+create来代替。我也见过因为维护innodb buffer带来的故障。
    • drop可以先link一个硬链接
    • 低峰期进行DDL,任何DDL都是,避免未知的影响,影响太大了。

    其他:断点

    我主要使用了如下断点

    • 线程条件断点:this==0x3424938 为DICT_SYS全局地址
    • 函数断点:unlink为文件释放的linux接口
    • 函数断点:btr_free_but_not_root为AHI维护的上层接口
    • 函数断点:buf_LRU_remove_all_pages为内存维护接口
    20      breakpoint     keep y   <MULTIPLE>          thread 42
            stop only if this==0x3424938
            stop only in thread 42
            breakpoint already hit 29 times
    20.1                        y     0x0000000001998de0 in PolicyMutex<TTASEventMutex<GenericPolicy> >::enter(unsigned int, unsigned int, char const*, unsigned int) 
                                                       at /opt/percona-server-locks-detail-5.7.22/storage/innobase/include/ib0mutex.h:973
    20.2                        y     0x000000000199906a in PolicyMutex<TTASEventMutex<BlockMutexPolicy> >::enter(unsigned int, unsigned int, char const*, unsigned int) 
                                                       at /opt/percona-server-locks-detail-5.7.22/storage/innobase/include/ib0mutex.h:973
    20.3                        y     0x0000000001a52966 in PolicyMutex<OSTrackMutex<GenericPolicy> >::enter(unsigned int, unsigned int, char const*, unsigned int) 
                                                       at /opt/percona-server-locks-detail-5.7.22/storage/innobase/include/ib0mutex.h:973
    21      breakpoint     keep y   <MULTIPLE>          thread 42
            stop only if this==0x3424938
            stop only in thread 42
            breakpoint already hit 29 times
    21.1                        y     0x0000000001998e83 in PolicyMutex<TTASEventMutex<GenericPolicy> >::exit() 
                                                       at /opt/percona-server-locks-detail-5.7.22/storage/innobase/include/ib0mutex.h:947
    21.2                        y     0x000000000199910d in PolicyMutex<TTASEventMutex<BlockMutexPolicy> >::exit() 
                                                       at /opt/percona-server-locks-detail-5.7.22/storage/innobase/include/ib0mutex.h:947
    21.3                        y     0x0000000001a52859 in PolicyMutex<OSTrackMutex<GenericPolicy> >::exit() at /opt/percona-server-locks-detail-5.7.22/storage/innobase/include/ib0mutex.h:947
    22      breakpoint     keep y   0x00007ffff6667140 <unlink>
            breakpoint already hit 4 times
    23      breakpoint     keep y   0x0000000001bd815e in btr_free_but_not_root(buf_block_t*, mtr_log_t) at /opt/percona-server-locks-detail-5.7.22/storage/innobase/btr/btr0btr.cc:1158
            breakpoint already hit 2 times
    24      breakpoint     keep y   0x00007ffff6667140 <unlink>
            breakpoint already hit 2 times
    25      breakpoint     keep y   0x0000000001c49722 in buf_LRU_remove_all_pages(buf_pool_t*, ulint) at /opt/percona-server-locks-detail-5.7.22/storage/innobase/buf/buf0lru.cc:786
            breakpoint already hit 8 times
    

    5.7 drop栈

    #0  btr_free_but_not_root (block=0x7fffb43622c0, log_mode=MTR_LOG_ALL) at /opt/percona-server-locks-detail-5.7.22/storage/innobase/btr/btr0btr.cc:1158
    #1  0x0000000001bd8516 in btr_free_if_exists (page_id=..., page_size=..., index_id=7175, mtr=0x7fffec04b450) at /opt/percona-server-locks-detail-5.7.22/storage/innobase/btr/btr0btr.cc:1239
    #2  0x0000000001c5f83e in dict_drop_index_tree (rec=0x7fffc6383cbb "", pcur=0x7fff84b0e4b0, mtr=0x7fffec04b450)
        at /opt/percona-server-locks-detail-5.7.22/storage/innobase/dict/dict0crea.cc:1162
    #3  0x0000000001b429ad in row_upd_clust_step (node=0x7fff84b0ddb0, thr=0x7fff84b11750) at /opt/percona-server-locks-detail-5.7.22/storage/innobase/row/row0upd.cc:2883
    #4  0x0000000001b43080 in row_upd (node=0x7fff84b0ddb0, thr=0x7fff84b11750) at /opt/percona-server-locks-detail-5.7.22/storage/innobase/row/row0upd.cc:3042
    #5  0x0000000001b4355f in row_upd_step (thr=0x7fff84b11750) at /opt/percona-server-locks-detail-5.7.22/storage/innobase/row/row0upd.cc:3188
    #6  0x0000000001a890b1 in que_thr_step (thr=0x7fff84b11750) at /opt/percona-server-locks-detail-5.7.22/storage/innobase/que/que0que.cc:1045
    #7  0x0000000001a8938d in que_run_threads_low (thr=0x7fff84b11750) at /opt/percona-server-locks-detail-5.7.22/storage/innobase/que/que0que.cc:1125
    #8  0x0000000001a89543 in que_run_threads (thr=0x7fff84b11750) at /opt/percona-server-locks-detail-5.7.22/storage/innobase/que/que0que.cc:1165
    #9  0x0000000001a897f9 in que_eval_sql (info=0x7fff8403d680, 
        sql=0x7fff8403b960 "PROCEDURE DROP_TABLE_PROC () IS\nsys_foreign_id CHAR;\ntable_id CHAR;\nindex_id CHAR;\nforeign_id CHAR;\nspace_id INT;\nfound INT;\nDECLARE CURSOR cur_fk IS\nSELECT ID FROM SYS_FOREIGN\nWHERE FOR_NAME = :table"..., reserve_dict_mutex=0, trx=0x7fffeca7ab60) at /opt/percona-server-locks-detail-5.7.22/storage/innobase/que/que0que.cc:1242
    #10 0x0000000001ae5f83 in row_drop_table_for_mysql (name=0x7fffec04c840 "test/test11", trx=0x7fffeca7ab60, drop_db=false, nonatomic=true, handler=0x0)
        at /opt/percona-server-locks-detail-5.7.22/storage/innobase/row/row0mysql.cc:5207
    #11 0x0000000001984580 in ha_innobase::delete_table (this=0x7fff84006db0, name=0x7fffec04df70 "./test/test11")
        at /opt/percona-server-locks-detail-5.7.22/storage/innobase/handler/ha_innodb.cc:13545
    #12 0x0000000000f5d1ae in handler::ha_delete_table (this=0x7fff84006db0, name=0x7fffec04df70 "./test/test11") at /opt/percona-server-locks-detail-5.7.22/sql/handler.cc:5240
    #13 0x0000000000f566b7 in ha_delete_table (thd=0x7fff84000c00, table_type=0x2e64aa0, path=0x7fffec04df70 "./test/test11", db=0x7fff84006bb8 "test", alias=0x7fff840065e8 "test11", 
        generate_warning=true) at /opt/percona-server-locks-detail-5.7.22/sql/handler.cc:2763
    #14 0x00000000015f6bab in mysql_rm_table_no_locks (thd=0x7fff84000c00, tables=0x7fff84006630, if_exists=false, drop_temporary=false, drop_view=false, dont_log_query=false)
        at /opt/percona-server-locks-detail-5.7.22/sql/sql_table.cc:2680
    #15 0x00000000015f59b4 in mysql_rm_table (thd=0x7fff84000c00, tables=0x7fff84006630, if_exists=0 '\000', drop_temporary=0 '\000')
        at /opt/percona-server-locks-detail-5.7.22/sql/sql_table.cc:2221
    
    由于delete了SYS_INDEXES而调用了dict_drop_index_tree 
    
    sql +=  "DECLARE CURSOR cur_idx IS\n"
                "SELECT ID FROM SYS_INDEXES\n"
                "WHERE TABLE_ID = table_id\n"
                "LOCK IN SHARE MODE;\n";
                
    sql +=  "found := 1;\n"
                "OPEN cur_idx;\n"
                "WHILE found = 1 LOOP\n"
                "       FETCH cur_idx INTO index_id;\n"
                "       IF (SQL % NOTFOUND) THEN\n"
                "               found := 0;\n"
                "       ELSE\n"
                "               DELETE FROM SYS_FIELDS\n"
                "               WHERE INDEX_ID = index_id;\n"
                "               DELETE FROM SYS_INDEXES\n"
                "               WHERE ID = index_id\n"
                "               AND TABLE_ID = table_id;\n"
                "       END IF;\n"
                "END LOOP;\n"
                "CLOSE cur_idx;\n";
    
    

    8.0 为BUF_REMOVE_NONE

    #0  Fil_shard::space_delete (this=0x7fffe02e7be0, space_id=1448, buf_remove=BUF_REMOVE_NONE) at /newdata/mysql-8.0.23/storage/innobase/fil/fil0fil.cc:4512
    #1  0x000000000543c023 in fil_delete_tablespace (space_id=1448, buf_remove=BUF_REMOVE_NONE) at /newdata/mysql-8.0.23/storage/innobase/fil/fil0fil.cc:4665
    #2  0x000000000514733b in row_drop_tablespace (space_id=1448, filepath=0xaa81ec0 "./t10/#sql-ib2899-698767554.ibd") at /newdata/mysql-8.0.23/storage/innobase/row/row0mysql.cc:3772
    #3  0x00000000050527a9 in Log_DDL::replay_delete_space_log (this=0x7fffe031c790, space_id=1448, file_path=0xaa81ec0 "./t10/#sql-ib2899-698767554.ibd")
        at /newdata/mysql-8.0.23/storage/innobase/log/log0ddl.cc:1703
    #4  0x00000000050521b3 in Log_DDL::replay (this=0x7fffe031c790, record=...) at /newdata/mysql-8.0.23/storage/innobase/log/log0ddl.cc:1590
    #5  0x0000000005051dc2 in Log_DDL::replay_by_thread_id (this=0x7fffe031c790, thread_id=7) at /newdata/mysql-8.0.23/storage/innobase/log/log0ddl.cc:1520
    #6  0x00000000050536bb in Log_DDL::post_ddl (this=0x7fffe031c790, thd=0xabe73e0) at /newdata/mysql-8.0.23/storage/innobase/log/log0ddl.cc:1924
    #7  0x0000000004f18da4 in innobase_post_ddl (thd=0xabe73e0) at /newdata/mysql-8.0.23/storage/innobase/handler/ha_innodb.cc:4682
    #8  0x0000000003eaedc6 in Sql_cmd_truncate_table::cleanup_base (this=0xabbd9c8, thd=0xabe73e0, hton=0xaa39a20) at /newdata/mysql-8.0.23/sql/sql_truncate.cc:401
    #9  0x0000000003eaf09c in Sql_cmd_truncate_table::<lambda()>::operator()(void) const (__closure=0x7fffe47cc188) at /newdata/mysql-8.0.23/sql/sql_truncate.cc:494
    #10 0x0000000003eb0196 in Scope_guard<Sql_cmd_truncate_table::truncate_base(THD*, TABLE_LIST*)::<lambda()> >::~Scope_guard(void) (this=0x7fffe47cc180, __in_chrg=<optimized out>)
        at /newdata/mysql-8.0.23/include/scope_guard.h:41
    #11 0x0000000003eaf81f in Sql_cmd_truncate_table::truncate_base (this=0xabbd9c8, thd=0xabe73e0, table_ref=0xabbda00) at /newdata/mysql-8.0.23/sql/sql_truncate.cc:606
    #12 0x0000000003eb00e6 in Sql_cmd_truncate_table::execute (this=0xabbd9c8, thd=0xabe73e0) at /newdata/mysql-8.0.23/sql/sql_truncate.cc:746
    

    相关文章

      网友评论

          本文标题:MySQL:5.7.25 truncate table导致的系统

          本文链接:https://www.haomeiwen.com/subject/lxxfrrtx.html