关于MySQL前置索引

作者: AQ王浩 | 来源:发表于2015-07-21 08:16 被阅读1025次

    一、前置索引的好处

    使用前置索引的好处

    Normally, the entire value of a column is used to build the index - this is fine for short data types (integers and the like) but can result in a lot of data in the index for longer data types (CHAR and VARCHAR, for example). Using an index prefix allows you to make a trade off between the space required for the index and the cardinality of the index.

    二、前置索引的坑

    第一种情况

    • 索引的长度和对应字段中的长度一致(name 与 idx_of_name)
    | areas | CREATE TABLE `areas` (
      `id` int(11) NOT NULL AUTO_INCREMENT,
      `areaid` varchar(20) NOT NULL,
      `name` varchar(50) NOT NULL,
      `cityid` varchar(20) NOT NULL,
      PRIMARY KEY (`id`),
      KEY `idx_of_name` (`name`)
    ) ENGINE=InnoDB AUTO_INCREMENT=3145 DEFAULT CHARSET=utf8 |
    +-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    1 row in set (0.00 sec)
    
    mysql> explain select id, name from areas order by name;
    +----+-------------+-------+-------+---------------+-------------+---------+------+------+-------------+
    | id | select_type | table | type  | possible_keys | key         | key_len | ref  | rows | Extra       |
    +----+-------------+-------+-------+---------------+-------------+---------+------+------+-------------+
    |  1 | SIMPLE      | areas | index | NULL          | idx_of_name | 152     | NULL | 3144 | Using index |
    +----+-------------+-------+-------+---------------+-------------+---------+------+------+-------------+
    1 row in set (0.00 sec)
    
    mysql> explain select * from areas where name = '龙潭区';
    +----+-------------+-------+------+---------------+-------------+---------+-------+------+-----------------------+
    | id | select_type | table | type | possible_keys | key         | key_len | ref   | rows | Extra                 |
    +----+-------------+-------+------+---------------+-------------+---------+-------+------+-----------------------+
    |  1 | SIMPLE      | areas | ref  | idx_of_name   | idx_of_name | 152     | const |    1 | Using index condition |
    +----+-------------+-------+------+---------------+-------------+---------+-------+------+-----------------------+
    1 row in set (0.01 sec)
    
    mysql>  explain select id, name from areas group by name;
    +----+-------------+-------+-------+---------------+-------------+---------+------+------+-------------+
    | id | select_type | table | type  | possible_keys | key         | key_len | ref  | rows | Extra       |
    +----+-------------+-------+-------+---------------+-------------+---------+------+------+-------------+
    |  1 | SIMPLE      | areas | index | idx_of_name   | idx_of_name | 152     | NULL | 3144 | Using index |
    +----+-------------+-------+-------+---------------+-------------+---------+------+------+-------------+
    1 row in set (0.01 sec)
    
    

    第二种情况:

    • 索引的长度和对应字段中的长度不一致(name 与 idx_of_name)
    | areas | CREATE TABLE `areas` (
      `id` int(11) NOT NULL AUTO_INCREMENT,
      `areaid` varchar(20) NOT NULL,
      `name` varchar(50) NOT NULL,
      `cityid` varchar(20) NOT NULL,
      PRIMARY KEY (`id`),
      KEY `idx_of_name` (`name`(6))
    ) ENGINE=InnoDB AUTO_INCREMENT=3145 DEFAULT CHARSET=utf8 |
    +-------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    1 row in set (0.00 sec)
    
    mysql> explain select id, name from areas order by name;
    +----+-------------+-------+------+---------------+------+---------+------+------+----------------+
    | id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra          |
    +----+-------------+-------+------+---------------+------+---------+------+------+----------------+
    |  1 | SIMPLE      | areas | ALL  | NULL          | NULL | NULL    | NULL | 3144 | Using filesort |
    +----+-------------+-------+------+---------------+------+---------+------+------+----------------+
    1 row in set (0.01 sec)
    
    mysql> explain select * from areas where name = '龙潭区';
    +----+-------------+-------+------+---------------+-------------+---------+-------+------+-------------+
    | id | select_type | table | type | possible_keys | key         | key_len | ref   | rows | Extra       |
    +----+-------------+-------+------+---------------+-------------+---------+-------+------+-------------+
    |  1 | SIMPLE      | areas | ref  | idx_of_name   | idx_of_name | 20      | const |    1 | Using where |
    +----+-------------+-------+------+---------------+-------------+---------+-------+------+-------------+
    1 row in set (0.01 sec)
    
    mysql> explain select id, name from areas group by name;
    +----+-------------+-------+------+---------------+------+---------+------+------+---------------------------------+
    | id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                           |
    +----+-------------+-------+------+---------------+------+---------+------+------+---------------------------------+
    |  1 | SIMPLE      | areas | ALL  | NULL          | NULL | NULL    | NULL | 3144 | Using temporary; Using filesort |
    +----+-------------+-------+------+---------------+------+---------+------+------+---------------------------------+
    1 row in set (0.01 sec)
    

    虽然在查询上用到了索引,但是在排序时,索引不生效。

    使用索引就一定能加快查询效率么?不一定。上面的例子就告诉我们,前置索引
    并不是一个万能药,它的确可以帮助我们在一个过长的字段中建立索引。但同时也会导致排序(order by, group by)查询都无法使用前置索引。

    三、如何计算出前置索引的最佳长度

    关于如何计算出,最恰当的设置索引的长度,总体思想:

    # 全列选择性
    SELECT COUNT(DISTINCT column_name) / COUNT(*) FROM table_name;
    
    # 测试某一长度前缀的选择性
    SELECT COUNT(DISTINCT LEFT(column_name, prefix_length)) / COUNT(*) FROM table_name;
    
    

    当前置的选择性越接近全列的选择性的时候,索引效果越好。

    通常可以索引开始的几个字符,而不是全部值,以节约空间并取得好的性能。这使索引需要的空间变小,但是也会降低选择性。索引选择性不重复的索引值和表中所有行的比值。高选择性的索引有好处,因为它使MySQL在查找匹配的时候可以过滤更多的行。唯一索引的选择率为1,为最佳值

    具体操作如下:

    CREATE TABLE `areas` (
      `id` int(11) NOT NULL AUTO_INCREMENT,
      `areaid` varchar(20) NOT NULL,
      `name` varchar(50) NOT NULL,
      `cityid` varchar(20) NOT NULL,
      PRIMARY KEY (`id`)
    ) ENGINE=InnoDB AUTO_INCREMENT=3145 DEFAULT CHARSET=utf8 |
    +-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    1 row in set (0.00 sec)
    
    mysql> select count(distinct(name))/count(id) from areas;
    +---------------------------------+
    | count(distinct(name))/count(id) |
    +---------------------------------+
    |                          0.8954 |
    +---------------------------------+
    1 row in set (0.01 sec)
    
    mysql> select count(distinct left(name, 2))/count(id) from areas;
    +-----------------------------------------+
    | count(distinct left(name, 2))/count(id) |
    +-----------------------------------------+
    |                                  0.8648 |
    +-----------------------------------------+
    1 row in set (0.00 sec)
    
    mysql> select count(distinct left(name, 3))/count(id) from areas;
    +-----------------------------------------+
    | count(distinct left(name, 3))/count(id) |
    +-----------------------------------------+
    |                                  0.8909 |
    +-----------------------------------------+
    1 row in set (0.01 sec)
    
    mysql> select count(distinct left(name, 5))/count(id) from areas;
    +-----------------------------------------+
    | count(distinct left(name, 5))/count(id) |
    +-----------------------------------------+
    |                                  0.8941 |
    +-----------------------------------------+
    1 row in set (0.01 sec)
    
    mysql> select count(distinct left(name, 6))/count(id) from areas;
    +-----------------------------------------+
    | count(distinct left(name, 6))/count(id) |
    +-----------------------------------------+
    |                                  0.8954 |
    +-----------------------------------------+
    1 row in set (0.00 sec)
    
    前置索引为6时,选择性和列值选择性相同。那么就设定该索引的长度为6位。
    
    mysql> select max(length(name)) from areas;
    +-------------------+
    | max(length(name)) |
    +-------------------+
    |                45 |
    +-------------------+
    1 row in set (0.11 sec)
    
    mysql> select * from areas where length(name) = 45;
    +------+--------+-----------------------------------------------+--------+
    | id   | areaid | name                                          | cityid |
    +------+--------+-----------------------------------------------+--------+
    | 2624 | 530925 | 双江拉祜族佤族布朗族傣族自治县                | 530900 |
    | 2965 | 622927 | 积石山保安族东乡族撒拉族自治县                | 622900 |
    +------+--------+-----------------------------------------------+--------+
    2 rows in set (0.01 sec)
    
    
    

    四、索引的最大长度

    mysql> select 255 * 3 from dual;
    +---------+
    | 255 * 3 |
    +---------+
    |     765 |
    +---------+
    1 row in set (0.00 sec)
    
    mysql> show create table areas;
    
    CREATE TABLE `areas` (
      `id` int(11) NOT NULL AUTO_INCREMENT,
      `areaid` varchar(20) NOT NULL,
      `name` varchar(255) NOT NULL,
      `cityid` varchar(20) NOT NULL,
      PRIMARY KEY (`id`),
      KEY `idx_of_name` (`name`)
    ) ENGINE=InnoDB AUTO_INCREMENT=3145 DEFAULT CHARSET=utf8 |
    
    mysql> alter table areas change `name` `name` varchar(256) not null;
    ERROR 1071 (42000): Specified key was too long; max key length is 767 bytes
    mysql> alter table areas drop index `idx_of_name`;
    Query OK, 0 rows affected (0.03 sec)
    Records: 0  Duplicates: 0  Warnings: 0
    
    mysql> alter table areas change `name` `name` varchar(256) not null;
    Query OK, 3144 rows affected (0.06 sec)
    Records: 3144  Duplicates: 0  Warnings: 0
    
    mysql> show create table areas;
    
    CREATE TABLE `areas` (
      `id` int(11) NOT NULL AUTO_INCREMENT,
      `areaid` varchar(20) NOT NULL,
      `name` varchar(256) NOT NULL,
      `cityid` varchar(20) NOT NULL,
      PRIMARY KEY (`id`)
    ) ENGINE=InnoDB AUTO_INCREMENT=3145 DEFAULT CHARSET=utf8
    
    mysql> alter table areas add index `idx_of_name` (name);
    ERROR 1071 (42000): Specified key was too long; max key length is 767 bytes
    
    
    

    添加索引时,如果不指定索引的长度,MySQL默认会选择该字段的长度作为索引长度。其实这里

    alter table areas add index `idx_of_name` (name)
    等同于
    alter table areas add index `idx_of_name` (name(256))
    

    MySQL 索引最大 bytes 为 767, 255 * 3 < 767 但是 256 * 3 > 767 所以,varchar(255)时候,创建默认索引可以成功,但是varchar(256)时,创建默认索引就不成功。

    并且由上述列子可知,索引和字段相互影响,当索引设置为 idx_of_name (name(255))时,要改变字段的长度,也会提示 Specified key was too long 错误。

    总结

    根据具体的业务需求来选择索引,并不是索引都会加快查询速度。

    参考

    错误使用MySQL前缀索引导致的慢查询
    MySQL 前缀索引
    mysql前缀索引
    mysql省市区邮政编码和区号级联sql文件

    相关文章

      网友评论

        本文标题:关于MySQL前置索引

        本文链接:https://www.haomeiwen.com/subject/qdmpqttx.html