索引及执行计划管理

作者: 酷酷的伟 | 来源:发表于2019-08-11 16:09 被阅读0次

索引的作用

类似于一本书的目录，起到优化查询的功能

索引类型（笔试）

BTREE索引*****

RTREE索引

HASH索引

全文索引
image

BTREE的细分离（算法）

B-TREE

B +树

B *树

Btree索引的功能分类

聚集索引（集群索引）*****

辅助索引（二级索引）

B树是如何构建的

辅助索引
alter table t1 add index idx_name(name);
1.将name列的所有值取出来，进行自动排序
2.将排完序的值均匀的落到16KB叶子节点数据页中，并将索引键值所对应的数据行的几句索引列值
3.向上生成枝节点和根节点

聚集索引（IOT）
1.默认是按照主键生成聚集索引，没有主键，存储引擎会使用唯一键。
如果都没有，会自动生成隐藏的聚集索引
2.数据在存储时，就会按照聚集索引打的顺序存储到磁盘的数据页
3.由于本身书籍就是有序的，所以聚集索引构建时，不需要进行排序
4.聚集索引直接将磁盘的数据页，作为叶子节点
5.枝节点和根节点只会调取下层节点主键的最小值

image

辅助索引和聚集索引区别

(1) 辅助索引：
叶子节点只保存主键值+索引键值的有序存储
对索引键值会知道排序
需要手工创建
辅助索引可以有多个
任何列都可以创建
(2)
数据存储时，就是按照聚集索引顺序进行有序存储
叶子节点保存的是整个有序的数据行
叶子节点不需要单独生成

辅助索引细分

单列辅助索引

select * from t1 where name=''

联合索引

select * from t1 where a and b and c

唯一索引

索引树的高度(越低越好)

1. 表的数据量级大

分区表

分库分表(分布式架构)

2. 列的长度

尽可能选择列值短的创建索引

采用前缀索引

3. 数据类型选择

varchar 和 char

enum

索引管理

压力测试准备

create database test charset utf8mb4;
use test;
create table t100w (id int,num int,k1 char(2),k2 char(4),dt timestamp);
delimiter //
create  procedure rand_data(in num int)
begin
declare str char(62) default 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789';
declare str2 char(2);
declare str4 char(4);
declare i int default 0;
while i<num do
set str2=concat(substring(str,1+floor(rand()*61),1),substring(str,1+floor(rand()*61),1));
set str4=concat(substring(str,1+floor(rand()*61),2),substring(str,1+floor(rand()*61),2));
set i=i+1;
insert into t100w values (i,floor(rand()*num),str2,str4,now());
end while;
end;
//
delimiter ;

插入100w条数据：

call rand_data(1000000);

commit;

MySQL [test]>select count(*) from t100w;
+----------+
| count(*) |
+----------+
|  1000000 |
+----------+
1 row in set (0.47 sec)

索引命令操作

查询索引

use school
show tables;
desc student;

Key : PRI（主键），UNI（唯一索引），MUL（辅助索引）

show index from student\G

image

创建单列索引

alter table student add index idx_name(sname);

image

创建联合索引

alter table student add index idx_sname_sage_ssex(sname,sage,ssex);

idx_a_b_c -----> idx_a   idx_a_b   idx_a_b_c

创建前缀索引

alter table student add index idx(sname(5));

alter table student add telnum char(11) not null;

image

创建唯一索引

mysql[school]>alter table student add telnum char(11) not null ;
mysql[school]>alter table student add unique index idx_tel(telnum);

# 判断某个列有或没有重复值
MySQL [world]>select name,count(id) from city group by name having count(id)>1 order by count(id) desc ;
+---------------+-----------+
| name          | count(id) |
+---------------+-----------+
| San José      |         4 |
| Córdoba       |         3 |
| Springfield   |         3 |
...
.....
| Brest         |         2 |
| Manchester    |         2 |
| Santa Maria   |         2 |
| Colombo       |         2 |
| San Carlos    |         2 |

| Jinzhou       |         2 |
+---------------+-----------+
67 rows in set (0.00 sec)

MySQL [world]>select count(*) from city;
+----------+
| count(*) |
+----------+
|     4079 |
+----------+
1 row in set (0.00 sec)

MySQL [world]>select count(distinct name) from city;
+----------------------+
| count(distinct name) |
+----------------------+
|                 3998 |
+----------------------+
1 row in set (0.00 sec)

删除索引

mysql[school]>alter table student drop index idx;
mysql[school]>alter table student drop index idx_name;
mysql[school]>alter table student drop index idx_sname_sage_ssex;

mysqlslap --defaults-file=/etc/my.cnf --concurrency=100 --iterations=1 --create-schema='test' --query="select * from test.t100w where k2='VWtu'" engine=innodb --number-of-queries=2000 -uroot -p123456 -verbose

优化前
image

use test;
show tables;
desc t100w;
alter table t100w add index idx_k2(k2);  #优化

image

优化后
image

explain (desc) ☆☆☆☆☆

explain select * from test.t100w where k2='VWtu';
or
desc select * from test.t100w where k2='VWtu';

作用 : 抓取优化器优化过的执行计划

执行计划的分析

MySQL [test]>explain select * from test.t100w where k2='VWtu';
+----+-------------+-------+------------+------+---------------+--------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key    | key_len | ref   | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+--------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | t100w | NULL       | ref  | idx_k2        | idx_k2 | 17      | const | 1120 |   100.00 | NULL  |
+----+-------------+-------+------------+------+---------------+--------+---------+-------+------+----------+-------+

重点！！！

table：以上SQL语句设计到的表☆☆☆
type：查询的类型(全表扫描(ALL)，索引扫描，查不到数据(NULL))☆☆☆☆☆
possible_keys：可能会用到的索引☆☆☆
key：使用到的索引☆☆☆☆
key_len：索引的覆盖长度☆☆☆☆☆
Extra：额外的信息☆☆☆☆
image

type 详细说明☆☆☆☆☆

ALL：全表扫描,不走索引

(1) 查询条件，没建索引
>use test;
>MySQL [test]>explain  select * from test.t100w where k2='VWtu';
+----+-------------+-------+------------+------+---------------+--------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key    | key_len | ref   | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+--------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | t100w | NULL       | ref  | idx_k2        | idx_k2 | 17      | const | 1120 |   100.00 | NULL  |
+----+-------------+-------+------------+------+---------------+--------+---------+-------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)

MySQL [test]>desc t100w;
+-------+-----------+------+-----+-------------------+-----------------------------+
| Field | Type      | Null | Key | Default           | Extra                       |
+-------+-----------+------+-----+-------------------+-----------------------------+
| id    | int(11)   | YES  |     | NULL              |                             |
| num   | int(11)   | YES  |     | NULL              |                             |
| k1    | char(2)   | YES  |     | NULL              |                             |
| k2    | char(4)   | YES  | MUL | NULL              |                             |
| dt    | timestamp | NO   |     | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP |
+-------+-----------+------+-----+-------------------+-----------------------------+
5 rows in set (0.00 sec)

(2) 有索引不走
desc select * from t100w where k2 != 'asdf';
desc select * from t100w where k2 like '%aa%';
desc select * from t100w where k2 not in ('asda','asas');
desc select * from t100w;

注意: !=和not in 如果是主键列,是走range
mysql[test]>desc select * from world.city where id != 10;
mysql[test]>desc select * from world.city where id not in (10,20);

index 全索引扫描

MySQL [test]>desc select k2 from t100w;

image

=====从range开始，我们才认为索引是有价值的======

range 索引范围查询

所有索引：
>, <, >=, <=, like, between and
mysql[world]>desc select * from city where id<10;
mysql[world]>desc select * from city where countrycode like 'CH%

in () or 
oldguo[world]>desc select * from city where countrycode in ('CHN','USA');

聚集索引：
!= not in
mysql[test]>desc select * from world.city where id != 10;
mysql[test]>desc select * from world.city where id not in (10,20);
========================================
说明：
B+tree 索引能额外优化到:> ,<, >=, <= ,like , between and
in 和 or 享受不到b+tree额外的优化效果的,所以我一般情况会将in , or 进行改性
mysql[world]>desc select * from city where countrycode = 'CHN' union all  select * from city where countrycode = 'USA';

ref 辅助索引等值查询

MySQL [world]>desc select * from city where countrycode='CHN';

image

eq_ref 多表连接查询中, 非驱动表的on的调剂列是主键唯一键

MySQL [world]>desc select a.name,b.name from city as a join country as b on a.countrycode=b.code where a.population<100;

const(system) 主键或唯一键的等值

desc select * from city where id=100000000000;

image

possible_keys：可能会用到的索引☆ ☆ ☆

NULL：没有和查询条件匹配的所有条目
有值：有和查询条件匹配的索引条目，但是没走，大部分原因是语句查询方式不符合索引应用条件

10.4 key：使用到的索引☆☆☆☆

最终使用的索引，可以帮助我们判断是否走了合适的索引

10.5 key_len：索引的覆盖长度☆☆☆☆☆

在联合索引应用的判断时，会经常看

单列索引:

字符集	字节个数	not null	没有指定not null
utf8	3个字节
int	4个字节	4	4+1
tinyint	1个字节	1	2
utf8mb4	一个字符最大是4个字节
char(2)		2*4	2*4+1
varchar(2)		2*4+2	2*4+2+1

说明：

有非空约束时，key_length就是最大字节长度

在没有非空约束时：字符最大长度+1

varchar类型，需要额外在最大字符长度+2(存储字符长度的最长度占位)

create table t1 (
id int ,
num int not null, 
k1 char(2),
k2 char(2) not null,
k3 varchar(2),
k4 varchar(2) not null)charset utf8mb4;

联合搜索优化


MySQL [world]>desc t1;
+-------+------------+------+-----+---------+-------+
| Field | Type       | Null | Key | Default | Extra |
+-------+------------+------+-----+---------+-------+
| id    | int(11)    | YES  |     | NULL    |       |
| num   | int(11)    | NO   |     | NULL    |       |
| k1    | char(2)    | YES  |     | NULL    |       |
| k2    | char(2)    | NO   |     | NULL    |       |
| k3    | varchar(2) | YES  |     | NULL    |       |
| k4    | varchar(2) | NO   |     | NULL    |       |
+-------+------------+------+-----+---------+-------+

====================================
alter table t1 add index idx(id,num,k1,k2,k3,k4);
select 5+4+9+8+11+10:  \\ 47

(1)最完美的查询情况

idx(id,num,k1,k2,k3,k4)
desc select * from t1 where id=1 and num=1 and k1='a' and k2'a' and k3='a' and k4='a';
desc select * from t1 where num=1 and id=1 and k1='a' and k2'a' and k3='a' and k4='a';

结论：
当查询条件中，包含了索引列中所有的列条件，并且都是等值的查询，那么无关排列顺序，都可以走全联合索引优化；
原因是优化器会自动调整顺序，达到最佳的优化效果
所以，我们重点需要关注的是联合索引建立的顺序,从左到右，唯一值越多的列放在最左边

(2) 查询条件中，哪些因素会key_len长度

#安装索引的建立顺序，在查询条件中，少了任意一个中间列，后续列都无法走索引
desc select * from t1 where id=1 and k1='a' and k2'a' and k3='a' and k4='a';

#在条件查询中间,出现不等值查询时,从不等值列开始,所有列都无法使用联合索引 (暂存)
oldguo[world]>desc select * from t1 where  id=1 and num>10 and k1='a' and k2='a' and k3='a' and k4='a';

优化方法:

将不等值列放在最后.
oldguo[world]>alter table t1 add index idx1(id,k1,k2,k3,k4,num);
oldguo[world]>alter table t1 drop index idx;

(3) 如果有多子句的条件查询(必须是联合索引)
按照子句的执行顺序,建立联合索引.

Extra：额外的信息 ☆☆☆☆

Using filesort :  原因是 在 group  by ,order by,distinct等.
一般优化的方法是,和where条件的列建立联合索引

小结

索引的类型(算法)

BTree
HASH
RTREE

BTREE

B-tree
B+TREE
B*TREE

B+Tree有什么优势

在叶子节点增加了相邻叶子节点双向指针
为了优化范围查询

辅助索引和聚集索引区别

叶子节点 ☆☆☆☆☆

辅助索引是手工创建的，

聚集索引是自动跟主键生成

聚集索引只有一个

辅助索引可以有多个

辅助索引细分

单列
联合
唯一
前缀

索引树高度

数据量
数据类型
数据列的长度
聚集索引列的长度

索引基本管理

show index from t1;
desc t1;
alter table t1 add idnex 索引名(列名)
alter table t1 drop index 索引名

explain

作用:获取优化器选择的执行计划

分析执行计划

(1) 查询条件没有建索引

use test
desc select * from t100w where id=10;

(2) 有索引不走

desc select * from t100w  where 1=1;
desc select * from t100w where k2 !='aaaa';
desc select * from t100w where k2 like '%aa%';
desc select * from t100w where k2 not in ('aaaa','bbbb');

index:全索引扫描 
    desc select k2 from t100w;

Range: 索引范围扫描
  >, <, >=, <=, like, between and
  or in 
  desc select * from world.city where countrycode in ('CHN','USA');
改写:
  desc select * from world.city where countrycode='CHN' union all
  select * from world.city where countrycode='USA';     

ref : 辅助索引等值查询 
       oldguo[test]>desc select * from world.city where countrycode='CHN';

eq_ref : 多表连接查询,非驱动表on的条件列是主键或者唯一键 
       desc select city.name,country.name from city left join country on city.countrycode=country.code where city.population<100;   

const(system) : 主键或者唯一键的等值查询
    oldguo[world]>desc select * from city where id=10; 

NULL:略

key_len:  utf8mb4 
                   not null       没设置
    int            4              +1
    varchar(10)    4*10+2         +1
    char(10)       4*10           +1

====================================

联合索引准备

create table t1(a int not null ,b char(10) not null ,c varchar(10) not null )charset utf8mb4;
oldguo[test]>desc t1;
+-------+-------------+------+-----+---------+-------+
| Field | Type        | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+-------+
| a     | int(11)     | NO   |     | NULL    |       |
| b     | char(10)    | NO   |     | NULL    |       |
| c     | varchar(10) | NO   |     | NULL    |       |
+-------+-------------+------+-----+---------+-------+

idx(a,b,c) =  4 + 40 + 42 = 86

(1) 最理想的 
desc select * from t1 where a=1 and b='a' and c='a';
desc select * from t1 where b='1' and a=1 and c='a';
desc select * from t1 where c='1' and a=1 and b='a';
desc select * from t1 where c='1' and b='a' and a=1;
desc select * from t1 where a=1 and c='a' and b='a';
desc select * from t1 where b='1' and c='a' and a=1;

唯一值多的放在左边

(2) 部分索引
desc select * from t1 where a=1 and c='a';
oldguo[test]>desc select * from t1 where a=1 and b like 'a%' and c='a';

(3) where  + order by
Extra:filesort  ===> 排序不走索引,走的额外排序
oldguo[test]>desc select * from world.city where countrycode='CHN' order by population;

扩展:
oldguo[test]>desc format=json select * from t1 where a=1 and c='a' and b='a';

建立索引的原则(DBA规范)

image

建表时一定要有主键,一般是个无关列

选择唯一性索引

唯一性索引的值是唯一的，可以更快速的通过该索引来确定某条记录。
例如，学生表中学号是具有唯一性的字段。为该字段建立唯一性索引可以很快的确定某个学生的信息。
如果使用姓名的话，可能存在同名现象，从而降低查询速度。

优化方案:

(1) 如果非得使用重复值较多的列作为查询条件(例如:男女),可以将表逻辑拆分
(2) 可以将此列和其他的查询类,做联和索引
select count(*) from world.city;
select count(distinct countrycode) from world.city;
select count(distinct countrycode,population ) from world.city;

为经常需要where 、ORDER BY、GROUP BY,join on等操作的字段，排序操作会浪费很多时间

如果为其建立索引，优化查询
注：如果经常作为条件的列，重复值特别多，可以建立联合索引。

尽量使用前缀来索引

如果索引字段的值很长，最好使用值的前缀来索引。

限制索引的数目

索引的数目不是越多越好。

可能会产生的问题:

(1) 每个索引都需要占用磁盘空间，索引越多，需要的磁盘空间就越大。
(2) 修改表时，对索引的重构和更新很麻烦。越多的索引，会使更新表变得很浪费时间。
(3) 优化器的负担会很重,有可能会影响到优化器的选择.
percona-toolkit中有个工具,专门分析索引是否有用

删除不再使用或者很少使用的索引(percona toolkit)

pt-duplicate-key-checker
表中的数据被大量更新，或者数据的使用方式被改变后，原有的一些索引可能不再需要。数据库管理
员应当定期找出这些索引，将它们删除，从而减少索引对更新操作的影响。

大表加索引,要在业务不繁忙期间操作

https://www.cnblogs.com/TeyGao/p/7160421.html

尽量少在经常更新值的列上建索引

建索引原则

(1) 必须要有主键,如果没有可以做为主键条件的列,创建无关列
(2) 经常做为where条件列 order by group by join on, distinct 的条件(业务:产品功能+用户行为)
(3) 最好使用唯一值多的列作为索引,如果索引列重复值较多,可以考虑使用联合索引
(4) 列值长度较长的索引列,我们建议使用前缀索引.
(5) 降低索引条目,一方面不要创建没用索引,不常使用的索引清理,percona toolkit(xxxxx)
(6) 索引维护要避开业务繁忙期

面试题：有索引，为什么查询效率还是很低？

1.有索引不走
2.联合索引没有完全覆盖
3.索引失效或同级信息不真实

4.索引重复值太多

我一般遇到这样的问题，排查和解决的方法有这几种方法

1.先看看这个语句的执行计划
table 
type 
key_len   判断联合索引覆盖长度
extra     额外排序
rows  重复值 ----->
2.没索引建索引，改语句

建索引规范:

主键

where group by order by distinct join on
联合最左侧,唯一值
尽量使用前缀索引
索引条目
频繁更新的列,不适合做索引列,
避开业务繁忙期,pt-tools

不走索引:

本来全表扫描

25%

计算或函数

隐式转换

like %xx%

!= ,not in

索引失效,统计信息不真实

索引及执行计划管理

索引的作用

索引类型（笔试）

BTREE的细分离（算法）

Btree索引的功能分类

B树是如何构建的

辅助索引和聚集索引区别

辅助索引细分

单列辅助索引

联合索引

唯一索引

索引树的高度(越低越好)

索引管理

压力测试准备

索引命令操作

explain (desc) ☆☆☆☆☆

执行计划的分析

重点！！！

type 详细说明☆☆☆☆☆

ALL：全表扫描,不走索引

index 全索引扫描

range 索引范围查询

ref 辅助索引等值查询

eq_ref 多表连接查询中, 非驱动表的on的调剂列是主键唯一键

const(system) 主键或唯一键的等值

possible_keys：可能会用到的索引☆ ☆ ☆

10.4 key：使用到的索引☆☆☆☆

10.5 key_len：索引的覆盖长度☆☆☆☆☆

Extra：额外的信息 ☆☆☆☆

小结

索引的类型(算法)

BTREE

B+Tree有什么优势

辅助索引和聚集索引区别

辅助索引细分

索引树高度

索引基本管理

explain

作用:获取优化器选择的执行计划

分析执行计划

联合索引准备

建立索引的原则(DBA规范)

建表时一定要有主键,一般是个无关列

选择唯一性索引

优化方案:

为经常需要where 、ORDER BY、GROUP BY,join on等操作的字段，排序操作会浪费很多时间

尽量使用前缀来索引

限制索引的数目

可能会产生的问题:

删除不再使用或者很少使用的索引(percona toolkit)

大表加索引,要在业务不繁忙期间操作

尽量少在经常更新值的列上建索引

建索引原则

面试题：有索引，为什么查询效率还是很低？

我一般遇到这样的问题，排查和解决的方法有这几种方法

建索引规范:

不走索引:

相关文章

网友评论

延伸阅读

深度阅读

栏目导航

热点阅读