A Small Pitfall I Hit with GROUP BY

Author: richardlee | Published 2021-04-25 22:48

    First, suppose we have a table, t_event_flow, with the fields shown below.

    [image: fields of the t_event_flow table]

    Now I want the latest status for each event_id, so without much thought I wrote the following SQL:

    mysql> select event_id, status, max(id) from t_event_flow group by event_id;
    ERROR 1055 (42000): Expression #2 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'testdb.t_event_flow.status' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by
    

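    The query failed. The error says that expression #2 in the SELECT list is not in the GROUP BY clause and contains the nonaggregated column status, which is not functionally dependent on the columns in the GROUP BY clause. That is incompatible with sql_mode=only_full_group_by.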

    Let's check the current SQL mode first:

    [image: the current sql_mode setting]
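
    The screenshot is not available as text, so the exact statement the author ran is unknown. As a sketch, the current mode can be inspected through the sql_mode system variable, at either session or global scope:

    ```sql
    -- SQL mode in effect for the current connection
    SELECT @@SESSION.sql_mode;

    -- Server-wide default that new connections start with
    SELECT @@GLOBAL.sql_mode;
    ```

    If ONLY_FULL_GROUP_BY appears in the session value, the check that produced ERROR 1055 is active for this connection.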

    The official MySQL documentation explains:

    SQL-92 and earlier does not permit queries for which the select list, HAVING condition, or ORDER BY list refer to nonaggregated columns that are not named in the GROUP BY clause.
    MySQL 5.7.5 and later implements detection of functional dependence. If the ONLY_FULL_GROUP_BY SQL mode is enabled (which it is by default), MySQL rejects queries for which the select list, HAVING condition, or ORDER BY list refer to nonaggregated columns that are neither named in the GROUP BY clause nor are functionally dependent on them. (Before 5.7.5, MySQL does not detect functional dependency and ONLY_FULL_GROUP_BY is not enabled by default. For a description of pre-5.7.5 behavior, see the MySQL 5.6 Reference Manual.)

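    In other words, SQL-92 and earlier standards do not allow the select list, the HAVING condition, or the ORDER BY list to refer to nonaggregated columns that are not named in the GROUP BY clause. MySQL 5.7.5 and later detects functional dependence. When the ONLY_FULL_GROUP_BY SQL mode is enabled (the default), MySQL rejects any query whose select list, HAVING condition, or ORDER BY list refers to a nonaggregated column that is neither named in the GROUP BY clause nor functionally dependent on the GROUP BY columns.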
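
    To make "functionally dependent" concrete, here is a minimal sketch. It assumes that id is the primary key of t_event_flow and that event_id is not a key; the table definition itself is not shown in the article.

    ```sql
    -- Assumption: id is the primary key of t_event_flow.
    -- Every other column is determined by id, so this is accepted
    -- even with ONLY_FULL_GROUP_BY enabled (MySQL 5.7.5 and later).
    SELECT id, event_id, status
    FROM t_event_flow
    GROUP BY id;

    -- Assumption: event_id is not unique, so one event_id can map to
    -- several status values. status is not functionally dependent on
    -- event_id, and this shape of query is rejected with ERROR 1055:
    -- SELECT event_id, status FROM t_event_flow GROUP BY event_id;
    ```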

    So the nonaggregated column status cannot appear in the statement above. It should be changed to:

    mysql> select event_id, max(id) from t_event_flow group by event_id;
    +----------+---------+
    | event_id | max(id) |
    +----------+---------+
    |    10000 |       4 |
    |    10001 |       7 |
    |    10002 |      11 |
    |    10003 |      15 |
    |    10004 |      18 |
    |    10005 |      21 |
    +----------+---------+
    6 rows in set (0.00 sec)
    

    Then look up the corresponding status by the primary key id.
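
    The two steps can also be combined into one statement. The following is a sketch, not the author's query: it assumes, as the article does, that a larger id means a newer row, and the aliases f, latest, and max_id are made up for illustration.

    ```sql
    -- Step 1 (derived table): the largest id per event_id.
    -- Step 2 (join): fetch the full row for each of those ids.
    SELECT f.event_id, f.status, f.id
    FROM t_event_flow AS f
    JOIN (
        SELECT event_id, MAX(id) AS max_id
        FROM t_event_flow
        GROUP BY event_id
    ) AS latest
      ON f.id = latest.max_id
    ORDER BY f.event_id;
    ```

    Because status is read from a single, specific row rather than from a group, the query is valid with ONLY_FULL_GROUP_BY enabled and the result is deterministic.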

    What happens if ONLY_FULL_GROUP_BY is disabled? The official documentation says:

    If ONLY_FULL_GROUP_BY is disabled, a MySQL extension to the standard SQL use of GROUP BY permits the select list, HAVING condition, or ORDER BY list to refer to nonaggregated columns even if the columns are not functionally dependent on GROUP BY columns. This causes MySQL to accept the preceding query. In this case, the server is free to choose any value from each group, so unless they are the same, the values chosen are nondeterministic, which is probably not what you want.

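    That is, once the mode is disabled, MySQL accepts the query above. In that case, however, the server is free to choose any value from each group, so the result is nondeterministic and may well not be the value you want. Let's try it.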

    First, disable ONLY_FULL_GROUP_BY:

    mysql> SET @@sql_mode = sys.list_drop(@@sql_mode, 'ONLY_FULL_GROUP_BY');
    Query OK, 0 rows affected (0.00 sec)
    

    Then run the original query again:

    mysql> select event_id, status, max(id) from t_event_flow group by event_id;
    +----------+--------+---------+
    | event_id | status | max(id) |
    +----------+--------+---------+
    |    10000 |      1 |       4 |
    |    10001 |      1 |       7 |
    |    10002 |      1 |      11 |
    |    10003 |      1 |      15 |
    |    10004 |      1 |      18 |
    |    10005 |      1 |      21 |
    +----------+--------+---------+
    6 rows in set (0.00 sec)
    

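    The error is gone, but the status returned is not the latest value we wanted. In this run each group appears to have returned the value from its first row, which, according to the documentation quoted above, MySQL does not guarantee.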
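
    After an experiment like this, the mode can be switched back on for the session. If an arbitrary value per group really is acceptable, ANY_VALUE() states that intent explicitly while leaving the mode enabled. This is a sketch; the aliases some_status and max_id are illustrative.

    ```sql
    -- Re-enable the check for the current session.
    SET @@sql_mode = sys.list_add(@@sql_mode, 'ONLY_FULL_GROUP_BY');

    -- ANY_VALUE() tells MySQL that any status from the group will do.
    -- The value is still nondeterministic; it is NOT the latest status.
    SELECT event_id,
           ANY_VALUE(status) AS some_status,
           MAX(id)           AS max_id
    FROM t_event_flow
    GROUP BY event_id;
    ```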

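    In summary, when using GROUP BY, pay attention to the current SQL mode and avoid referencing nonaggregated columns that are not in the GROUP BY clause. Otherwise you get either an error or an unreliable value.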

    Article link: https://www.haomeiwen.com/subject/adnjrltx.html