参考资料
派生表
优化器可以使用两种策略来处理派生表引用(这也适用于视图引用):
- 将派生表合并到外部查询中(5.7引入的优化策略 derived_merge);
- 将派生表物化为内部临时表,再用于外部查询。
什么是派生表?形如以下子查询结果作为表对象的就是派生表:
SELECT * FROM (SELECT * FROM t1) AS derived_t1;
SELECT * FROM t1 JOIN (SELECT t2.f1 FROM t2) AS derived_t2 ON t1.f2=derived_t2.f1WHERE t1.f1 > 0;
通过 derived_merge 策略,上面两个派生子查询执行时等同:
SELECT * FROM t1;
SELECT t1.*, t2.f1 FROM t1 JOIN t2 ON t1.f2=t2.f1 WHERE t1.f1 > 0;
限制
当子查询包含以下操作时,derived_merge 策略失效:
- Aggregate functions (SUM(), MIN(), MAX(), COUNT(), and so forth)
- DISTINCT
- GROUP BY
- HAVING
- LIMIT
- UNION or UNION ALL
- Subqueries in the select list
- Assignments to user variables
- Refererences only to literal values (in this case, there is no underlying table)
如果 derived_merge 策略失效,将按以下策略执行查询:
- 在需要派生表之前,优化器会推迟派生表的物化,这能提高性能。举例:表 t1 与派生表进行联接,如果处理 t1 表时返回结果为空,则不需要进行下一步联接,这是可以完全跳过派生表的物化;
- 查询执行期间,优化器会给物化派生表添加索引,提升效率。
对于第二点,可以看这个例子:
mysql> explain select * from t1 join (select distinct a from t2) as derived_t2 on t1.a=derived_t2.a;
+----+-------------+------------+------------+-------+---------------+-------------+---------+----------------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------+------------+-------+---------------+-------------+---------+----------------+------+----------+-------------+
| 1 | PRIMARY | t1 | NULL | ALL | a | NULL | NULL | NULL | 100 | 100.00 | Using where |
| 1 | PRIMARY | <derived2> | NULL | ref | <auto_key0> | <auto_key0> | 5 | join_test.t1.a | 10 | 100.00 | Using index |
| 2 | DERIVED | t2 | NULL | index | a | a | 5 | NULL | 1000 | 100.00 | Using index |
+----+-------------+------------+------------+-------+---------------+-------------+---------+----------------+------+----------+-------------+
可以看到派生表确实是走索引的。不过也不是所有情况下都会给派生表添加索引,官档上原文:
The optimizer constructs an index over column f1 from derived_t2 if doing so would enable use of ref access for the lowest cost execution plan.
网友评论