探秘OPTIMIZER_TRACE

作者: 秋水畏寒 | 来源:发表于2020-06-06 16:32 被阅读0次

    mysql版本

    5.7


    参考

    本文转自手把手教你认识OPTIMIZER_TRACE,综合博主自身理解,有所修改及重新排版,转载请注明上述链接

    mysql官方实例


    1. 背景

    众所周知,当遇到慢查询sql的时候,需要使用explain关键字来查看该sql的执行计划,从中找出优化点,但是,有时候,光靠explain是不足以得出全面的优化结论的,比如说,执行以下语句:

    explain select * from product0 as t1 where t1.name = '产品名称';
    

    结果:


    explain结果

    不免会有这样的疑问:

    • 已经为name字段创建了索引,可是为什么查询语句没有使用索引,而是走了全表扫描呢?
    • 如果存在多个possible keys,那么mysql又是如何抉择出合适的key的呢?

    这一切,OPTIMIZER_TRACE都将会为我们作出解答。


    2. 定义

    OPTIMIZER_TRACE,见名知义,优化器跟踪,也就是跟踪语句的解析优化过程,并将跟踪到的信息记录到INFORMATION_SCHEMA.OPTIMIZER_TRACE表中,该功能默认是不开启的,并且默认只记录最后一次结果到INFORMATION_SCHEMA.OPTIMIZER_TRACE表中。

    执行以下语句查看相关变量:

    show variables like '%optimizer_trace%';
    

    结果:


    optimizer_trace相关变量

    可以看到相关变量如下:

    • optimizer_trace
      • enabled:启用/禁用optimizer_trace功能
      • one_line:执行结果中trace字段的展示方式,on代表单行展示,off代表以json格式展示
    • optimizer_trace_features:控制执行结果中trace字段需要打印的json项和不需要打印的json项,一般全部开启
    • optimizer_trace_limitoptimizer_trace_offset:这俩个参数类似limit offset,row_count,默认值分别是1-1,代表在INFORMATION_SCHEMA.OPTIMIZER_TRACE表中只存储最近最后执行的一行结果数据
    • optimizer_trace_max_mem_sizeoptimizer_trace内存大小,如果跟踪信息超过这个大小,信息会被截断

    3. 适用语句

    • SELECT/INSERT/REPLACE/UPDATE/DELETE
    • EXPLAIN
    • SET
    • DO
    • DECLARE/CASE/IF/RETURN
    • CALL

    4. Demo

    4.1 数据准备

    CREATE TABLE t1 (
    pk INT, col_int_key INT,
    col_varchar_key VARCHAR(1), col_varchar_nokey VARCHAR(1)
    );
    INSERT INTO t1 VALUES
    (10,7,'v','v'),(11,0,'s','s'),(12,9,'l','l'),(13,3,'y','y'),(14,4,'c','c'),
    (15,2,'i','i'),(16,5,'h','h'),(17,3,'q','q'),(18,1,'a','a'),(19,3,'v','v'),
    (20,6,'u','u'),(21,7,'s','s'),(22,5,'y','y'),(23,1,'z','z'),(24,204,'h','h'),
    (25,224,'p','p'),(26,9,'e','e'),(27,5,'i','i'),(28,0,'y','y'),(29,3,'w','w');
    CREATE TABLE t2 (
    pk INT, col_int_key INT,
    col_varchar_key VARCHAR(1), col_varchar_nokey VARCHAR(1),
    PRIMARY KEY (pk)
    );
    INSERT INTO t2 VALUES
    (1,4,'b','b'),(2,8,'y','y'),(3,0,'p','p'),(4,0,'f','f'),(5,0,'p','p'),
    (6,7,'d','d'),(7,7,'f','f'),(8,5,'j','j'),(9,3,'e','e'),(10,188,'u','u'),
    (11,4,'v','v'),(12,9,'u','u'),(13,6,'i','i'),(14,1,'x','x'),(15,5,'l','l'),
    (16,6,'q','q'),(17,2,'n','n'),(18,4,'r','r'),(19,231,'c','c'),(20,4,'h','h'),
    (21,3,'k','k'),(22,3,'t','t'),(23,7,'t','t'),(24,6,'k','k'),(25,7,'g','g'),
    (26,9,'z','z'),(27,4,'n','n'),(28,4,'j','j'),(29,2,'l','l'),(30,1,'d','d'),
    (31,2,'t','t'),(32,194,'y','y'),(33,2,'i','i'),(34,3,'j','j'),(35,8,'r','r'),
    (36,4,'b','b'),(37,9,'o','o'),(38,4,'k','k'),(39,5,'a','a'),(40,5,'f','f'),
    (41,9,'t','t'),(42,3,'c','c'),(43,8,'c','c'),(44,0,'r','r'),(45,98,'k','k'),
    (46,3,'l','l'),(47,1,'o','o'),(48,0,'t','t'),(49,189,'v','v'),(50,8,'x','x'),
    (51,3,'j','j'),(52,3,'x','x'),(53,9,'k','k'),(54,6,'o','o'),(55,8,'z','z'),
    (56,3,'n','n'),(57,9,'c','c'),(58,5,'d','d'),(59,9,'s','s'),(60,2,'j','j'),
    (61,2,'w','w'),(62,5,'f','f'),(63,8,'p','p'),(64,6,'o','o'),(65,9,'f','f'),
    (66,0,'x','x'),(67,3,'q','q'),(68,6,'g','g'),(69,5,'x','x'),(70,8,'p','p'),
    (71,2,'q','q'),(72,120,'q','q'),(73,25,'v','v'),(74,1,'g','g'),(75,3,'l','l'),
    (76,1,'w','w'),(77,3,'h','h'),(78,153,'c','c'),(79,5,'o','o'),(80,9,'o','o'),
    (81,1,'v','v'),(82,8,'y','y'),(83,7,'d','d'),(84,6,'p','p'),(85,2,'z','z'),
    (86,4,'t','t'),(87,7,'b','b'),(88,3,'y','y'),(89,8,'k','k'),(90,4,'c','c'),
    (91,6,'z','z'),(92,1,'t','t'),(93,7,'o','o'),(94,1,'u','u'),(95,0,'t','t'),
    (96,2,'k','k'),(97,7,'u','u'),(98,2,'b','b'),(99,1,'m','m'),(100,5,'o','o');
    

    4.2 执行

    set optimizer_trace = 'enabled=on';
    SELECT SUM(alias2.col_varchar_nokey) , alias2.pk AS field2 FROM t1 AS alias1
    STRAIGHT_JOIN t2 AS alias2 ON alias2.pk = alias1.col_int_key WHERE alias1.pk
    GROUP BY field2 ORDER BY alias1.col_int_key,alias2.pk ;
    select * from information_schema.OPTIMIZER_TRACE;
    set optimizer_trace = 'enabled=off';
    

    执行结果:


    执行结果

    5. 解析

    各个字段解析如下:

    • QUERY:跟踪的SQL语句
    • TRACE:跟踪信息(JSON格式)
    • MISSING_BYTES_BEYOND_MAX_MEM_SIZE:跟踪信息过长时,被截断的跟踪信息的字节数
    • INSUFFICIENT_PRIVILEGES:执行跟踪语句的用户是否有查看对象的权限。当不具有权限时,该列信息为1TRACE字段为空。一般出现在调用带有SQL SECURITY DEFINER的视图或者是存储过程的情况下

    其中,TRACE详细内容如下:

    {
        "steps" : [
            {
                "join_preparation" : {
                    "select#" : 1,
                    "steps" : [
                        {
                            "expanded_query" : "/* select#1 */ select sum(`t2`.`col_varchar_nokey`) AS `SUM(t2.col_varchar_nokey)`,`t2`.`pk` AS `field2` from (`t2` straight_join `t1` on((`t2`.`pk` = `t1`.`col_int_key`))) where (`t2`.`pk` between 10 and 100) group by `field2` order by `t1`.`col_int_key`,`t2`.`pk` limit 0,1000"
                        },
                        {
                            "transformations_to_nested_joins" : {
                                "expanded_query" : "/* select#1 */ select sum(`t2`.`col_varchar_nokey`) AS `SUM(t2.col_varchar_nokey)`,`t2`.`pk` AS `field2` from `t2` straight_join `t1` where ((`t2`.`pk` between 10 and 100) and (`t2`.`pk` = `t1`.`col_int_key`)) group by `field2` order by `t1`.`col_int_key`,`t2`.`pk` limit 0,1000",
                                "transformations" : [
                                    "JOIN_condition_to_WHERE",
                                    "parenthesis_removal"
                                ]
                            }
                        },
                        {
                            "functional_dependencies_of_GROUP_columns" : {
                                "all_columns_of_table_map_bits" : [
                                    0
                                ],
                                "columns" : [
                                    "training.t2.pk",
                                    "training.t1.col_int_key"
                                ]
                            }
                        }
                    ]
                }
            },
            {
                "join_optimization" : {
                    "select#" : 1,
                    "steps" : [
                        {
                            "condition_processing" : {
                                "condition" : "WHERE",
                                "original_condition" : "((`t2`.`pk` between 10 and 100) and (`t2`.`pk` = `t1`.`col_int_key`))",
                                "steps" : [
                                    {
                                        "resulting_condition" : "((`t2`.`pk` between 10 and 100) and multiple equal(`t2`.`pk`, `t1`.`col_int_key`))",
                                        "transformation" : "equality_propagation"
                                    },
                                    {
                                        "resulting_condition" : "((`t2`.`pk` between 10 and 100) and multiple equal(`t2`.`pk`, `t1`.`col_int_key`))",
                                        "transformation" : "constant_propagation"
                                    },
                                    {
                                        "resulting_condition" : "((`t2`.`pk` between 10 and 100) and multiple equal(`t2`.`pk`, `t1`.`col_int_key`))",
                                        "transformation" : "trivial_condition_removal"
                                    }
                                ]
                            }
                        },
                        {
                            "substitute_generated_columns" : {                      }
                        },
                        {
                            "table_dependencies" : [
                                {
                                    "depends_on_map_bits" : [                               ],
                                    "map_bit" : 0,
                                    "row_may_be_null" : false,
                                    "table" : "`t2`"
                                },
                                {
                                    "depends_on_map_bits" : [
                                        0
                                    ],
                                    "map_bit" : 1,
                                    "row_may_be_null" : false,
                                    "table" : "`t1`"
                                }
                            ]
                        },
                        {
                            "ref_optimizer_key_uses" : [
                                {
                                    "equals" : "`t1`.`col_int_key`",
                                    "field" : "pk",
                                    "null_rejecting" : true,
                                    "table" : "`t2`"
                                }
                            ]
                        },
                        {
                            "rows_estimation" : [
                                {
                                    "range_analysis" : {
                                        "analyzing_range_alternatives" : {
                                            "analyzing_roworder_intersect" : {
                                                "cause" : "too_few_roworder_scans",
                                                "usable" : false
                                            },
                                            "range_scan_alternatives" : [
                                                {
                                                    "chosen" : true,
                                                    "cost" : 19.291000,
                                                    "index" : "PRIMARY",
                                                    "index_dives_for_eq_ranges" : true,
                                                    "index_only" : false,
                                                    "ranges" : [
                                                        "10 <= pk <= 100"
                                                    ],
                                                    "rowid_ordered" : true,
                                                    "rows" : 91,
                                                    "using_mrr" : false
                                                }
                                            ]
                                        },
                                        "chosen_range_access_summary" : {
                                            "chosen" : true,
                                            "cost_for_plan" : 19.291000,
                                            "range_access_plan" : {
                                                "index" : "PRIMARY",
                                                "ranges" : [
                                                    "10 <= pk <= 100"
                                                ],
                                                "rows" : 91,
                                                "type" : "range_scan"
                                            },
                                            "rows_for_plan" : 91
                                        },
                                        "group_index_range" : {
                                            "cause" : "not_single_table",
                                            "chosen" : false
                                        },
                                        "potential_range_indexes" : [
                                            {
                                                "index" : "PRIMARY",
                                                "key_parts" : [
                                                    "pk"
                                                ],
                                                "usable" : true
                                            }
                                        ],
                                        "setup_range_conditions" : [                                    ],
                                        "table_scan" : {
                                            "cost" : 23.100000,
                                            "rows" : 100
                                        }
                                    },
                                    "table" : "`t2`"
                                },
                                {
                                    "table" : "`t1`",
                                    "table_scan" : {
                                        "cost" : 1,
                                        "rows" : 20
                                    }
                                }
                            ]
                        },
                        {
                            "considered_execution_plans" : [
                                {
                                    "best_access_path" : {
                                        "considered_access_paths" : [
                                            {
                                                "access_type" : "ref",
                                                "chosen" : false,
                                                "index" : "PRIMARY",
                                                "usable" : false
                                            },
                                            {
                                                "access_type" : "range",
                                                "chosen" : true,
                                                "cost" : 37.491000,
                                                "range_details" : {
                                                    "used_index" : "PRIMARY"
                                                },
                                                "resulting_rows" : 91,
                                                "rows_to_scan" : 91
                                            }
                                        ]
                                    },
                                    "condition_filtering_pct" : 100,
                                    "cost_for_plan" : 37.491000,
                                    "plan_prefix" : [                               ],
                                    "rest_of_plan" : [
                                        {
                                            "best_access_path" : {
                                                "considered_access_paths" : [
                                                    {
                                                        "access_type" : "scan",
                                                        "buffers_needed" : 1,
                                                        "chosen" : true,
                                                        "cost" : 365,
                                                        "resulting_rows" : 20,
                                                        "rows_to_scan" : 20,
                                                        "using_join_cache" : true
                                                    }
                                                ]
                                            },
                                            "chosen" : true,
                                            "condition_filtering_pct" : 10,
                                            "cost_for_plan" : 402.490000,
                                            "plan_prefix" : [
                                                "`t2`"
                                            ],
                                            "rows_for_plan" : 182,
                                            "table" : "`t1`"
                                        }
                                    ],
                                    "rows_for_plan" : 91,
                                    "table" : "`t2`"
                                }
                            ]
                        },
                        {
                            "attaching_conditions_to_tables" : {
                                "attached_conditions_computation" : [                           ],
                                "attached_conditions_summary" : [
                                    {
                                        "attached" : "(`t2`.`pk` between 10 and 100)",
                                        "table" : "`t2`"
                                    },
                                    {
                                        "attached" : "(`t1`.`col_int_key` = `t2`.`pk`)",
                                        "table" : "`t1`"
                                    }
                                ],
                                "original_condition" : "((`t1`.`col_int_key` = `t2`.`pk`) and (`t2`.`pk` between 10 and 100))"
                            }
                        },
                        {
                            "clause_processing" : {
                                "clause" : "ORDER BY",
                                "items" : [
                                    {
                                        "item" : "`t1`.`col_int_key`"
                                    },
                                    {
                                        "item" : "`t2`.`pk`"
                                    }
                                ],
                                "original_clause" : "`t1`.`col_int_key`,`t2`.`pk`",
                                "resulting_clause" : "`t1`.`col_int_key`,`t2`.`pk`",
                                "resulting_clause_is_simple" : false
                            }
                        },
                        {
                            "clause_processing" : {
                                "clause" : "GROUP BY",
                                "items" : [
                                    {
                                        "item" : "`t2`.`pk`"
                                    }
                                ],
                                "original_clause" : "`field2`",
                                "resulting_clause" : "`field2`",
                                "resulting_clause_is_simple" : true
                            }
                        },
                        {
                            "refine_plan" : [
                                {
                                    "table" : "`t2`"
                                },
                                {
                                    "table" : "`t1`"
                                }
                            ]
                        }
                    ]
                }
            },
            {
                "join_execution" : {
                    "select#" : 1,
                    "steps" : [
                        {
                            "creating_tmp_table" : {
                                "tmp_table_info" : {
                                    "key_length" : 4,
                                    "location" : "memory (heap)",
                                    "row_length" : 18,
                                    "row_limit_estimate" : 932067,
                                    "table" : "intermediate_tmp_table",
                                    "unique_constraint" : false
                                }
                            }
                        },
                        {
                            "filesort_execution" : [                        ],
                            "filesort_information" : [
                                {
                                    "direction" : "asc",
                                    "field" : "col_int_key",
                                    "table" : "intermediate_tmp_table"
                                },
                                {
                                    "direction" : "asc",
                                    "field" : "pk",
                                    "table" : "intermediate_tmp_table"
                                }
                            ],
                            "filesort_priority_queue_optimization" : {
                                "cause" : "quicksort_is_cheaper",
                                "chosen" : false,
                                "limit" : 1000,
                                "memory_available" : 262144,
                                "row_size" : 17,
                                "rows_estimate" : 10
                            },
                            "filesort_summary" : {
                                "examined_rows" : 0,
                                "number_of_tmp_files" : 0,
                                "rows" : 0,
                                "sort_buffer_size" : 376,
                                "sort_mode" : "<sort_key, rowid>"
                            }
                        }
                    ]
                }
            }
        ]
    }
    

    对上述json串进行格式化,可以明显看到以下三个子节点:

    • join_preparation:准备阶段
    • join_optimization:优化阶段
    • join_execution:执行阶段
    主结构

    5.1 join_preparation

    准备阶段
    展开join_preparation,可以看到主结构主要有俩个:

    • select#:代表跟踪分析的是第几条语句,如果使用union,那么这里就会有俩条
    • steps:对应语句的执行步骤
      join_preparation主结构
    union下join_preparation主结构

    5.1.1 expanded_query

    {
        "expanded_query":"/* select#1 */ select sum(`t2`.`col_varchar_nokey`) AS `SUM(t2.col_varchar_nokey)`,`t2`.`pk` AS `field2` from (`t2` straight_join `t1` on((`t2`.`pk` = `t1`.`col_int_key`))) where (`t2`.`pk` between 10 and 100) group by `field2` order by `t1`.`col_int_key`,`t2`.`pk` limit 0,1000"
    }
    

    可以看到,steps中的expanded_query内容是原本的select *语句转化为具体库名、字段名、列名等后的结果

    limit语句应该是数据库可视化软件默认加上的,这里不去追究


    5.1.2 transformations_to_nested_joins

    {
        "transformations_to_nested_joins":{
            "expanded_query":"/* select#1 */ select sum(`t2`.`col_varchar_nokey`) AS `SUM(t2.col_varchar_nokey)`,`t2`.`pk` AS `field2` from `t2` straight_join `t1` where ((`t2`.`pk` between 10 and 100) and (`t2`.`pk` = `t1`.`col_int_key`)) group by `field2` order by `t1`.`col_int_key`,`t2`.`pk` limit 0,1000",
            "transformations":[
                "JOIN_condition_to_WHERE",
                "parenthesis_removal"
            ]
        }
    }
    

    从名字和expanded_query内容来看,是将on子句转化为where子句


    5.2 join_optimization

    优化阶段
    优化内容,相对内容多且复杂,主要分为以下几个部分:

    • condition_processing:对where子句和having子句的处理
    • substitute_generated_columns:替换虚拟生成列
    • table_dependencies:梳理表之间的依赖关系
    • ref_optimizer_key_uses: 如果优化器认为查询可以使用ref的话,在这里列出可以使用的索引
    • rows_estimation:估算表行数和扫描的代价。如果查询中存在range扫描的话,对range扫描进行计划分析及代价估算
    • considered_execution_plans:对比各可行计划的代价,选择相对最优的执行计划
    • attaching_conditions_to_tables:添加附加条件,使得条件尽可能筛选单表数据
    • clause_processing:对DISTINCTGROUP BYORDER BY等语句进行优化
    • refine_plan:优化后的执行计划

    5.2.1 condition_processing

    {
        "condition_processing":{
            "condition":"WHERE",
            "original_condition":"((`t2`.`pk` between 10 and 100) and (`t2`.`pk` = `t1`.`col_int_key`))",
            "steps":[
                {
                    "resulting_condition":"((`t2`.`pk` between 10 and 100) and multiple equal(`t2`.`pk`, `t1`.`col_int_key`))",
                    "transformation":"equality_propagation"
                },
                {
                    "resulting_condition":"((`t2`.`pk` between 10 and 100) and multiple equal(`t2`.`pk`, `t1`.`col_int_key`))",
                    "transformation":"constant_propagation"
                },
                {
                    "resulting_condition":"((`t2`.`pk` between 10 and 100) and multiple equal(`t2`.`pk`, `t1`.`col_int_key`))",
                    "transformation":"trivial_condition_removal"
                }
            ]
        }
    }
    

    对条件语句wherehaving的处理:

    • condition:条件语句类型,可以是where或者having
    • original_condition:原始条件查询语句
    • steps:处理步骤
      • resulting_condition:转换之后的语句
      • transformation:转换类型
        • equality_propagation:等值条件句转换
        • constant_propagation:常量条件句转换
        • trivial_condition_removal:无效条件移除的转换

    demo中的where字句只命中了equality_propagation一种情况(虽然也有constant_propagationtrivial_condition_removal的部分,但是可以看到resulting_condition并没有发生变化,也就是相当于只命中了equality_propagation一种情况),可以通过以下语句命中全部三种情况:

    select * from t1 join t2 on t1.pk=t2.pk+1 where t2.pk = 5 and 1 =1 ;
    
    {
        "condition_processing":{
            "condition":"WHERE",
            "original_condition":"((`t2`.`pk` = 5) and (1 = 1) and (`t1`.`pk` = (`t2`.`pk` + 1)))",
            "steps":[
                {
                    "resulting_condition":"((1 = 1) and (`t1`.`pk` = (5 + 1)) and multiple equal(5, `t2`.`pk`))",
                    "transformation":"equality_propagation"
                },
                {
                    "resulting_condition":"((1 = 1) and (`t1`.`pk` = 6) and multiple equal(5, `t2`.`pk`))",
                    "transformation":"constant_propagation"
                },
                {
                    "resulting_condition":"((`t1`.`pk` = 6) and multiple equal(5, `t2`.`pk`))",
                    "transformation":"trivial_condition_removal"
                }
            ]
        }
    }
    

    分析如下:

    • equality_propagation:将原始语句中的t2.pk = 5 转化为multiple equal(5, t2.pk);之后t1.pk = (t2.pk + 1)结合前面的t2.pk = 5t2.pk做了替换,转化成了t1.pk = (5 + 1),其余保持不变
    • constant_propagation:将上一步中的结果做常量计算,t1.pk = (5 + 1)变成t1.pk = 6
    • trivial_condition_removal:移除1 = 1这个恒为true的无效条件

    5.2.2 table_dependencies

    {
        "table_dependencies":[
            {
                "depends_on_map_bits":[
    
                ],
                "map_bit":0,
                "row_may_be_null":false,
                "table":"`t2`"
            },
            {
                "depends_on_map_bits":[
                    0
                ],
                "map_bit":1,
                "row_may_be_null":false,
                "table":"`t1`"
            }
        ]
    }
    

    在这个结点中我们可以看到表之间的依赖关系:

    • table:表名及其别名
    • row_may_be_nulljoin之后的列是否可能为null,注意,不是表中列属性是否设置为null,比如说左连接,那么后一张表中有些列可能为null,此时row_may_be_null显示为true
    • map_bit:表序号
    • depends_on_map_bits:依赖的表的map_bit

    5.2.3 ref_optimizer_key_uses

    {
        "ref_optimizer_key_uses":[
            {
                "equals":"`t1`.`col_int_key`",
                "field":"pk",
                "null_rejecting":true,
                "table":"`t2`"
            }
        ]
    }
    

    该节点列出了所有可用的ref类型的索引,如果使用了组合索引的多个部分,则会列出多个结构体。单个结构体会列出单表ref使用的索引及其对应值。


    5.2.4 rows_estimation

    {
        "rows_estimation":[
            {
                "range_analysis":{
                    "analyzing_range_alternatives":{
                        "analyzing_roworder_intersect":{
                            "cause":"too_few_roworder_scans",
                            "usable":false
                        },
                        "range_scan_alternatives":[
                            {
                                "chosen":true,
                                "cost":19.291,
                                "index":"PRIMARY",
                                "index_dives_for_eq_ranges":true,
                                "index_only":false,
                                "ranges":[
                                    "10 <= pk <= 100"
                                ],
                                "rowid_ordered":true,
                                "rows":91,
                                "using_mrr":false
                            }
                        ]
                    },
                    "chosen_range_access_summary":{
                        "chosen":true,
                        "cost_for_plan":19.291,
                        "range_access_plan":{
                            "index":"PRIMARY",
                            "ranges":[
                                "10 <= pk <= 100"
                            ],
                            "rows":91,
                            "type":"range_scan"
                        },
                        "rows_for_plan":91
                    },
                    "group_index_range":{
                        "cause":"not_single_table",
                        "chosen":false
                    },
                    "potential_range_indexes":[
                        {
                            "index":"PRIMARY",
                            "key_parts":[
                                "pk"
                            ],
                            "usable":true
                        }
                    ],
                    "setup_range_conditions":[
    
                    ],
                    "table_scan":{
                        "cost":23.1,
                        "rows":100
                    }
                },
                "table":"`t2`"
            },
            {
                "table":"`t1`",
                "table_scan":{
                    "cost":1,
                    "rows":20
                }
            }
        ]
    }
    

    rows_estimation展示估算的表扫描行数及其代价,从结构体中可以看到,是以表为单位展示对应结果的

    首先看一下t1t1表由于没有索引,所以走的是全表扫描,结构体也比较简单,只有俩个结点:

    {
        "table":"`t1`",
        "table_scan":{
            "cost":1,
            "rows":20
        }
    }
    
    • table:表名及其别名
    • table_scan
      • rows:扫描行数
      • cost:代价

    接下来看一下t2,t2相对复杂很多,主要看一下range_analysis,分为以下几个部分:

    • analyzing_range_alternatives
    • chosen_range_access_summary
    • group_index_range
    • potential_range_indexes
    • setup_range_conditions
    • table_scan

    5.2.4.1 analyzing_range_alternatives

    {
        "analyzing_roworder_intersect":{
            "cause":"too_few_roworder_scans",
            "usable":false
        },
        "range_scan_alternatives":[
            {
                "chosen":true,
                "cost":19.291,
                "index":"PRIMARY",
                "index_dives_for_eq_ranges":true,
                "index_only":false,
                "ranges":[
                    "10 <= pk <= 100"
                ],
                "rowid_ordered":true,
                "rows":91,
                "using_mrr":false
            }
        ]
    }
    

    分析可选方案的代价,包括

    • analyzing_roworder_intersectindex merge分析
    • range_scan_alternativesrange扫描分析

    俩个阶段,分别针对不同的情况进行执行代价的分析,从中选择出更优的执行计划。

    5.2.4.1.1 analyzing_roworder_intersect

    由于示例没有使用index merge,所以在这一段仅仅给出了不使用index merge的原因。如果是语句可以使用index_merge的情况,在该阶段会分析使用index_merge过程中消耗的代价(index_scan_costdisk_sweep_cost等),并汇总merge的代价确认是否选择使用index_merge以及对应使用的索引。

    5.2.4.1.2 range_scan_alternatives

    range扫描分析针对所有可用于range扫描的索引进行了代价分析,并根据分析结果确认选择使用的索引,包含如下字段:

    • choosen:是否使用该索引
    • cost:代价
    • index:索引名称
    • index_dives_for_eq_ranges:是否使用了index dive。这个值会被参数eq_range_index_dive_limit设定值影响
    • index_only:是否是覆盖索引
    • rangesrange扫描的条件句范围
    • rowid_ordered:扫描的结果集是否根据PK排序
    • rows:扫描行数
    • using_mrr:是否使用mrr

    5.2.4.2 chosen_range_access_summary

    {
        "chosen":true,
        "cost_for_plan":19.291,
        "range_access_plan":{
            "index":"PRIMARY",
            "ranges":[
                "10 <= pk <= 100"
            ],
            "rows":91,
            "type":"range_scan"
        },
        "rows_for_plan":91
    }
    

    在前一个步骤中分析了各类索引使用的方法及代价,得出了一定的中间结果之后,在summary阶段汇总前一阶段的中间结果确认最后的方案

    • range_access_planrange扫描最终选择的执行计划。在该结构体中会给出执行计划的type,使用的索引以及扫描行数。如果range_access_plan.typeindex_roworder_intersect(即index merge)的话,在该结构体下还会列intersect_of结构体给出index merge的具体信息。
    • rows_for_plan:该执行计划扫描行数
    • cost_for_plan:该执行计划代价
    • chosen:是否选择该执行计划

    5.2.4.3 group_index_range

    {
        "cause":"not_single_table",
        "chosen":false
    }
    

    评估在使用了GROUP BY或者是DISTINCT的时候是否有适合的索引可用:

    • 当语句中没有GROUP BY或者是DISTINCT的时候,该结构体下显示"chosen":"false"以及"cause":"'not_group_by_or_distinct"
    • 如果语句中在多表关联时使用了GROUP BY或DISTINCT时,在该结构体下显示"chosen":"false"以及"cause":"not_single_table"
    • 其他情况下会去尝试分析可用的索引(potential_group_range_indexes)并且计算对应的扫描行数及其所需代价

    5.2.4.4 potential_range_indexes

    [
        {
            "index":"PRIMARY",
            "key_parts":[
                "pk"
            ],
            "usable":true
        }
    ]
    

    列出表中所有的索引,并分析其是否可用,结点释义如下:

    • index:索引名称
    • key_parts:建立该索引的字段
    • usable:是否可用

    5.2.4.5 table_scan

    同前面t1的释义,全表扫描的行数及代价


    5.2.5 considered_execution_plans

    [
        {
            "best_access_path":{
                "considered_access_paths":[
                    {
                        "access_type":"ref",
                        "chosen":false,
                        "index":"PRIMARY",
                        "usable":false
                    },
                    {
                        "access_type":"range",
                        "chosen":true,
                        "cost":37.491,
                        "range_details":{
                            "used_index":"PRIMARY"
                        },
                        "resulting_rows":91,
                        "rows_to_scan":91
                    }
                ]
            },
            "condition_filtering_pct":100,
            "cost_for_plan":37.491,
            "plan_prefix":[
    
            ],
            "rest_of_plan":[
                {
                    "best_access_path":{
                        "considered_access_paths":[
                            {
                                "access_type":"scan",
                                "buffers_needed":1,
                                "chosen":true,
                                "cost":365,
                                "resulting_rows":20,
                                "rows_to_scan":20,
                                "using_join_cache":true
                            }
                        ]
                    },
                    "chosen":true,
                    "condition_filtering_pct":10,
                    "cost_for_plan":402.49,
                    "plan_prefix":[
                        "`t2`"
                    ],
                    "rows_for_plan":182,
                    "table":"`t1`"
                }
            ],
            "rows_for_plan":91,
            "table":"`t2`"
        }
    ]
    

    负责对比各可行计划的代价,选择相对最优的执行计划,由于我们使用了straight join,强制决定了连接顺序,由t2表驱动t1表,所以considered_execution_plans下面只有一个元素,也就代表只有一个可行的执行计划,若是使用join,那么此时就会有俩个元素,由优化器帮我们决定使用哪一个执行计划:

    {
        "considered_execution_plans":[
            {
                "best_access_path":{
                    "considered_access_paths":[
                        {
                            "access_type":"scan",
                            "chosen":true,
                            "cost":5,
                            "resulting_rows":20,
                            "rows_to_scan":20
                        }
                    ]
                },
                "condition_filtering_pct":100,
                "cost_for_plan":5,
                "plan_prefix":[
    
                ],
                "rest_of_plan":[
                    {
                        "best_access_path":{
                            "considered_access_paths":[
                                {
                                    "access_type":"eq_ref",
                                    "cause":"clustered_pk_chosen_by_heuristics",
                                    "chosen":true,
                                    "cost":24,
                                    "index":"PRIMARY",
                                    "rows":1
                                },
                                {
                                    "access_type":"range",
                                    "cause":"heuristic_index_cheaper",
                                    "chosen":false,
                                    "range_details":{
                                        "used_index":"PRIMARY"
                                    }
                                }
                            ]
                        },
                        "chosen":true,
                        "condition_filtering_pct":100,
                        "cost_for_plan":29,
                        "plan_prefix":[
                            "`t1`"
                        ],
                        "rows_for_plan":20,
                        "table":"`t2`"
                    }
                ],
                "rows_for_plan":20,
                "table":"`t1`"
            },
            {
                "best_access_path":{
                    "considered_access_paths":[
                        {
                            "access_type":"ref",
                            "chosen":false,
                            "index":"PRIMARY",
                            "usable":false
                        },
                        {
                            "access_type":"range",
                            "chosen":true,
                            "cost":37.491,
                            "range_details":{
                                "used_index":"PRIMARY"
                            },
                            "resulting_rows":91,
                            "rows_to_scan":91
                        }
                    ]
                },
                "condition_filtering_pct":100,
                "cost_for_plan":37.491,
                "plan_prefix":[
    
                ],
                "pruned_by_cost":true,
                "rows_for_plan":91,
                "table":"`t2`"
            }
        ]
    }
    

    字段释义:

    • best_access_path:当前最优的执行顺序信息结果集,可以看到主要内容是considered_access_paths。根据索引的使用与否和具体的使用方法可能会产生多个considered_access_paths
      • access_type:参照explaintype,不同的type会在considered_access_paths下展示不同的字段,详见5.2.5.1
    • condition_filtering_pct:类似于explain中的filtered列,这是一个估算值
    • rows_for_plan:该执行计划最终的扫描行数,这里的行数其实也是估算值,是由considered_access_pathsresulting_rows相乘之后再乘以condition_filtering_pct获得
    • cost_for_plan:该执行计划的执行代价,由considered_access_pathscost相加而得
    • plan_prefix:前置表名称,也就是驱动表的名称
    • table:分析的表对象名称及其别名
    • rest_of_plan:下一张表的计划,理解为类似链表的next,可以看到,其下的结点与第一层结构是类似的

    5.2.5.1 access_type

    • 使用索引的情况(t2 join t1)
    [
        {
            "access_type":"ref",
            "chosen":false,
            "index":"PRIMARY",
            "usable":false
        },
        {
            "access_type":"range",
            "chosen":true,
            "cost":37.491,
            "range_details":{
                "used_index":"PRIMARY"
            },
            "resulting_rows":91,
            "rows_to_scan":91
        }
    ]
    

    可以看到,这时候有多条数据,第一条数据使用ref类型的索引,但是该索引不可用("useable":false),所以不被选择("chosen":false);第二条数据使用了range类型的索引,使用了名称为PRIMARY的索引,并且展示了扫描的行数,扫描结果行数以及代价,最后的是在选择比较的结果中是否选择了该方式(chosen

    • 使用扫描的情况(t1 join t2)
    {
        "access_type":"scan",
        "chosen":true,
        "cost":5,
        "resulting_rows":20,
        "rows_to_scan":20
    }
    

    该结构体下列出了该表的扫描行数(rows_to_scan),由于这里没有列出index所以能分析出这里的access_type中的scan在这里指的是全表扫描。


    5.2.6 attaching_conditions_to_tables

    {
        "attaching_conditions_to_tables":{
            "attached_conditions_computation":[
    
            ],
            "attached_conditions_summary":[
                {
                    "attached":"(`t2`.`pk` between 10 and 100)",
                    "table":"`t2`"
                },
                {
                    "attached":"(`t1`.`col_int_key` = `t2`.`pk`)",
                    "table":"`t1`"
                }
            ],
            "original_condition":"((`t1`.`col_int_key` = `t2`.`pk`) and (`t2`.`pk` between 10 and 100))"
        }
    }
    

    基于considered_execution_plans中已选执行计划改造原有的where条件句并针对表的增加适当地附加条件便于单表数据的筛选。这部分条件的增改主要是为了便于ICP,但是ICP是否开启并不影响该部分的构造

    • original_condition:在准备阶段原始SQL语句以及considered_execution_plans中使用的索引基础上,改写语句,尽可能将原有语句中不能使用索引的条件句绑定到单表中对单表进行数据筛选
    • attached_conditions_computation:使用启发式算法计算已使用的索引,如果已使用的索引的访问类型是ref的话,计算使用range方式访问是否能使用组合索引中更多的索引列,如果可以的话,用range的方式替换ref的访问方式
    • attached_conditions_summary:针对上述的附加之后的情况汇总
      • attached:附加的条件或者是原语句中能直接下推给单表筛选的条件
      • table:对象表及其别名

    5.2.7 clause_processing

    {
        "clause_processing":{
            "clause":"ORDER BY",
            "items":[
                {
                    "item":"`t1`.`col_int_key`"
                },
                {
                    "item":"`t2`.`pk`"
                }
            ],
            "original_clause":"`t1`.`col_int_key`,`t2`.`pk`",
            "resulting_clause":"`t1`.`col_int_key`,`t2`.`pk`",
            "resulting_clause_is_simple":false
        }
    }
    

    主要是对DISTINCTGROUP BYORDER BY等语句进行优化

    • clause:优化语句关键字(DISTINCTGROUP BYORDER BY
    • original_clause:优化对象的原始语句
    • items:original_clause中包含的对象
      • item:对象名
      • eq_ref_to_preceding_items:与前置表关联的是否是唯一索引。在这个示例中,并没有这样的情况所以没有列出。如果语句稍微改写一下,将原始查询语句改写为t1 STRAIGHT_JOIN t2,让t1去驱动t2表,就发现在分析items[1].item = t2.pk时出现了该字段。这是由于t1t2通过t2的主键pk列进行关联,这就意味着,t1中的一行数据最多只能在t2中关联出一列,所以在后续优化的结果语句中order byt2.pk列被优化掉了,因为这一列已经确认唯一不需要再进行排序:
    {
        "clause_processing":{
            "clause":"ORDER BY",
            "items":[
                {
                    "item":"`t1`.`col_int_key`"
                },
                {
                    "eq_ref_to_preceding_items":true,
                    "item":"`t2`.`pk`"
                }
            ],
            "original_clause":"`t1`.`col_int_key`,`t2`.`pk`",
            "resulting_clause":"`t1`.`col_int_key`",
            "resulting_clause_is_simple":true
        }
    }
    
    • resulting_clause_is_simple:优化后的结果语句是否是简单语句
    • resulting_clause:优化后的结果语句

    那么显然,这一块我们需要重点关注的就是original_clauseresulting_clause,通过对比结果,再去查看细节,探究mysql在对应从句上是做了哪些优化


    5.2.8 refine_plan

    {
        "refine_plan":[
            {
                "table":"`t2`"
            },
            {
                "table":"`t1`"
            }
        ]
    }
    

    该阶段展示的是改善之后的执行计划,如图所示只展示了对应的表对象没有其他的字段的话,说明在之前已经确定的执行计划中没有需要再优化的地方,可直接应用

    • table:涉及的表名及其别名
    • pushed_index_condition:可使用到ICP的条件句
    • table_condition_attached:在attaching_conditions_to_tables阶段添加了附加条件的条件语句
    • access_type:优化后的索引访问类型

    5.3 join_execution

    执行阶段
    join_execution阶段,如果分析的语句是explain的话,该阶段的代码是空结构体,只有在真正执行语句之后,该阶段的代码才会有具体的步骤过程

    • creating_tmp_table:创建临时表
    • filesort_information:文件排序信息
    • filesort_priority_queue_optimization:文件排序-优先队列优化
    • filesort_execution:文件排序执行
    • filesort_summary:文件排序汇总信息

    5.3.1 creating_tmp_table

    {
        "creating_tmp_table":{
            "tmp_table_info":{
                "key_length":4,
                "location":"memory (heap)",
                "row_length":18,
                "row_limit_estimate":932067,
                "table":"intermediate_tmp_table",
                "unique_constraint":false
            }
        }
    }
    
    • converting_tmp_table_to_ondisk:将临时表落地到磁盘中,如果临时表的大小超过了设置的max_heap_table_size或者是tmp_table_size参数的话,那么将会出现该节点(本示例中并没有出现)
    • tmp_table_info:临时表信息
      • table:临时表的名称
      • row_length:临时表的单行长度
      • key_length:临时表索引长度
      • unique_constraint:是否有使用唯一约束
      • location:表存储位置,比如内存表memory (heap),或者是转换到磁盘的物理表disk (InnoDB)
      • row_limit_estimate:该临时表中能存储的最大行数

    5.3.2 filesort

    {
        "filesort_execution":[
    
        ],
        "filesort_information":[
            {
                "direction":"asc",
                "field":"col_int_key",
                "table":"intermediate_tmp_table"
            },
            {
                "direction":"asc",
                "field":"pk",
                "table":"intermediate_tmp_table"
            }
        ],
        "filesort_priority_queue_optimization":{
            "cause":"quicksort_is_cheaper",
            "chosen":false,
            "limit":1000,
            "memory_available":262144,
            "row_size":17,
            "rows_estimate":10
        },
        "filesort_summary":{
            "examined_rows":0,
            "number_of_tmp_files":0,
            "rows":0,
            "sort_buffer_size":376,
            "sort_mode":"<sort_key, rowid>"
        }
    }
    

    在创建临时表之后如果临时表不需要转换为磁盘表的话,即开始对文件排序进行处理

    • filesort_information:文件排序信息(如果有多列,filesort_information下会有多个结构体)
      • direction:排序列是升序还是降序
      • table:排序的表对象名
      • field:排序列
    • filesort_priority_queue_optimization:优先队列优化排序,一般在使用limit子句的时候会使用优先队列
      • usable:是否有使用
      • cause:没有使用的原因
      • limit:限制查询的行数
      • memory_available:可用内存大小
      • row_size:单行大小
      • rows_estimate:估算的行数
    • filesort_execution:执行文件排序
    • filesort_summary:文件排序汇总信息
      • rows:预计扫描行数
      • examined_rows:参与排序的行数
      • number_of_tmp_files:使用临时文件的个数,这个值为0代表全部使用sort_buffer内存排序,否则表示使用了磁盘文件排序
      • sort_buffer_size:使用的sort_buffer的大小
      • sort_mode:排序方式

    6. 总结

    在整个optimizer_trace中我们重点其实就是在跟踪记录TRACE的JSON树,我们通过这棵树中的内容可以具体去分析优化器究竟做了什么事情,进行了哪些选择,是基于什么原因做的选择,选择的结果及依据。这一系列都可以辅助验证我们的一些观点及优化,更好的帮助我们对我们的数据库的实例进行调整。

    相关文章

      网友评论

        本文标题:探秘OPTIMIZER_TRACE

        本文链接:https://www.haomeiwen.com/subject/wsprzhtx.html