美文网首页数据库
根据两张表的多个维度,出一个差异表

根据两张表的多个维度,出一个差异表

作者: 墨色尘埃 | 来源:发表于2020-06-24 22:22 被阅读0次

如下需求

A 表202006_ims_innercg: 有contacts_no, goods_sn, size, outin_num
B 表202006_india_qingguan_data: 有contacts_no, goods_sn, size, y_num

两张表的contacts_no, goods_sn,size三个字段并不能唯一确定一条数据,现在想根据这 3个维度汇总数量,出个差异表union_middle
差异表union_middle有字段contacts_no, goods_sn, size, numA ,numB, diff(numA-numB)

360截图16290610606175.png 360截图1648030486121134.png 360截图1628072286114133.png

解决步骤:

1.创建差异表union_middle
2.将A表、B表contacts_no, goods_sn,size字段的数据全部插入到union_middle中,根据此步的sql,执行之后union_middle表中的数据是不重复的
3.对union_middlenumA字段进行更新
此步陷阱:因为union_middle表中的数据已经是去重后的,所以更新次字段的时候,数据来源要正确
4.对union_middlenumB字段进行更新,同步骤3
5.最后计算union_middle表的diff字段,并将numAnumB字段为null的值置为0
6.数据验证,分别统计出A表outin_num字段的和、B表y_num字段的和、差异表numA字段和numB字段的和,看结果是否相等。相等则正确,不相等则不正确。

  • 创建差异表union_middle
CREATE TABLE `union_middle` (
  `contacts_no` varchar(255) DEFAULT NULL,
  `goods_sn` varchar(255) DEFAULT NULL,
  `size` varchar(255) DEFAULT NULL,
  `numA` int(255) DEFAULT NULL,
  `numB` int(255) DEFAULT NULL,
  `diff` int(255) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

  • 将两张表的contacts_no, goods_sn,size数据插入到union_middle中
INSERT INTO union_middle (contacts_no, goods_sn, size) SELECT
    a.contacts_no,
    a.goods_sn,
    a.size
FROM
    202006_ims_innercg a
UNION
    SELECT
        b.contacts_no,
        b.goods_sn,
        b.size
    FROM
        202006_india_qingguan_data b;
  • 更新union_middle表的numA字段
-- 添加202006_ims_innercg表索引
DROP INDEX idx_contacts_goods_size ON 202006_ims_innercg;

ALTER TABLE 202006_ims_innercg ADD INDEX idx_contacts_goods_size (contacts_no, goods_sn, size);

-- 添加union_middle表索引
DROP INDEX idx_contacts_goods_size ON union_middle;

ALTER TABLE union_middle ADD INDEX idx_contacts_goods_size (contacts_no, goods_sn, size);

UPDATE union_middle a,
 (
    SELECT
        contacts_no,
        goods_sn,
        size,
        COUNT(0) AS count,
        SUM(outin_num) AS numA
    FROM
        202006_ims_innercg
    GROUP BY
        contacts_no,
        goods_sn,
        size
) c
SET a.numA = c.numA
WHERE
    a.contacts_no = c.contacts_no
AND a.goods_sn = c.goods_sn
AND a.size = c.size;


  • 更新union_middle表的numB字段
-- 添加202006_india_qingguan_data表索引
DROP INDEX idx_contacts_goods_size ON 202006_india_qingguan_data;

ALTER TABLE 202006_india_qingguan_data ADD INDEX idx_contacts_goods_size (contacts_no, goods_sn, size);

UPDATE union_middle a,
 (
    SELECT
        contacts_no,
        goods_sn,
        size,
        COUNT(0) AS count,
        SUM(y_num) AS numB
    FROM
        202006_india_qingguan_data
    GROUP BY
        contacts_no,
        goods_sn,
        size
) c
SET a.numB = c.numB
WHERE
    a.contacts_no = c.contacts_no
AND a.goods_sn = c.goods_sn
AND a.size = c.size;

  • 最后计算union_middle表的diff字段,并将numA和numB字段为null的值置为0
UPDATE union_middle c
SET c.diff = IFNULL(c.numA, 0) - IFNULL(c.numB, 0),
 c.numA = IFNULL(c.numA, 0),
 c.numB = IFNULL(c.numB, 0)
  • 数据验证
-- A表中outin_num字段的和
SELECT sum(outin_num) from 202006_ims_innercg;
-- B表中y_num字段的和
SELECT sum(y_num) from 202006_india_qingguan_data;
-- 差异表中numA和numB的和
SELECT sum(numA),SUM(numB)from union_middle;

相关文章

网友评论

    本文标题:根据两张表的多个维度,出一个差异表

    本文链接:https://www.haomeiwen.com/subject/uvprfktx.html