MySQL group_concat of repeated keys and count of repetition of multiple columns in 1 query (Query Optimization)

Posted: 2016-06-02 23:04:37

Question:

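This question is about query optimization, to avoid making multiple database calls from PHP.

Here is the scenario: I have two tables. One holds information and can be called the reference table; the other is the data table. The fields key1 and key2 are common to both tables, and the tables can be joined on them.

I don't know whether the query can be made simpler than what I am doing now. What I want to achieve is as follows:

I want to find the distinct key1, key2, info1, info2 from the main_info table wherever the serial value is less than 10 and key1, key2 match in both tables, then group them by info1, info2, and while grouping count the repeated key1, key2 pairs for each repeated info1, info2 pair and group_concat those keys.

Contents of table main_info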

MariaDB [demos]> select * from main_info;
+------+------+-------+-------+----------+
| key1 | key2 | info1 | info2 | date     |
+------+------+-------+-------+----------+
|    1 |    1 |    15 |    90 | 20120501 |
|    1 |    2 |    14 |    92 | 20120601 |
|    1 |    3 |    15 |    82 | 20120801 |
|    1 |    4 |    15 |    82 | 20120801 |
|    1 |    5 |    15 |    82 | 20120802 |
|    2 |    1 |    17 |    90 | 20130302 |
|    2 |    2 |    17 |    90 | 20130302 |
|    2 |    3 |    17 |    90 | 20130302 |
|    2 |    4 |    16 |    88 | 20130601 |
+------+------+-------+-------+----------+
9 rows in set (0.00 sec) 

Contents of table product1

MariaDB [demos]> select * from product1;
+------+------+--------+--------------+
| key1 | key2 | serial | product_data |
+------+------+--------+--------------+
|    1 |    1 |      0 | NaN          |
|    1 |    1 |      1 | NaN          |
|    1 |    1 |      2 | NaN          |
|    1 |    1 |      3 | NaN          |
|    1 |    2 |      0 | 12.556       |
|    1 |    2 |      1 | 13.335       |
|    1 |    3 |      1 | NaN          |
|    1 |    3 |      2 | 13.556       |
|    1 |    3 |      3 | 14.556       |
|    1 |    4 |      3 | NaN          |
|    1 |    5 |      3 | NaN          |
|    2 |    1 |      0 | 12.556       |
|    2 |    1 |      1 | 13.553       |
|    2 |    1 |      2 | NaN          |
|    2 |    2 |     12 | 129          |
|    2 |    3 |     22 | NaN          |
+------+------+--------+--------------+
16 rows in set (0.00 sec)

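From PHP, I group on the info1, info2 fields of main_info once for each field of the product table (in the current context, serial and product_data of product1), running the queries one after another. Here, as you can see, I run the query twice.

For the field serial - first query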

MariaDB [demos]> select * , count(*) as serial_count,GROUP_CONCAT(key1,' ',key2) as serial_ids from 
    -> (
    -> SELECT distinct 
    -> if(b.serial  < 10,a.key1,null) AS `key1`,
    -> if(b.serial  < 10,a.key2,null) AS `key2`,
    -> if(b.serial  < 10,a.info1,null) AS `info1`, 
    ->         if(b.serial  < 10,a.info2,null) AS `info2`
    -> FROM main_info a inner join product1 b on  a.key1 = b.key1 AND a.key2= b.key2
    -> ) as sub group by info1,info2
    -> ;
+------+------+-------+-------+--------------+-------------+
| key1 | key2 | info1 | info2 | serial_count | serial_ids  |
+------+------+-------+-------+--------------+-------------+
| NULL | NULL |  NULL |  NULL |            1 | NULL        |
|    1 |    2 |    14 |    92 |            1 | 1 2         |
|    1 |    3 |    15 |    82 |            3 | 1 3,1 4,1 5 |
|    1 |    1 |    15 |    90 |            1 | 1 1         |
|    2 |    1 |    17 |    90 |            1 | 2 1         |
+------+------+-------+-------+--------------+-------------+
5 rows in set (0.00 sec)
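
The all-NULL row in this result appears to come from joined rows whose serial is 10 or more: the IF() expressions turn all four of their columns into NULL, and DISTINCT then collapses them into a single row. A minimal check, assuming the sample tables shown above, lists the joined rows that fail the serial < 10 test:

```sql
-- Joined rows that fail the serial < 10 test. In the first query each of
-- these becomes (NULL, NULL, NULL, NULL) before DISTINCT is applied.
SELECT a.key1, a.key2, a.info1, a.info2, b.serial
FROM main_info a
INNER JOIN product1 b ON a.key1 = b.key1 AND a.key2 = b.key2
WHERE b.serial >= 10;
```

With the sample data this should list the two rows with serial 12 and 22 (keys 2 2 and 2 3). serial_count is still 1 for the NULL group because the count is taken after DISTINCT has merged them, and serial_ids is NULL because GROUP_CONCAT skips NULL values.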

For the field product_data - second query

MariaDB [demos]> select * , count(*) as product_data_count,GROUP_CONCAT(key1,' ',key2) as product_data_ids from 
    -> (
    -> SELECT distinct 
    -> if(b.product_data IS NOT NULL,a.key1,null) AS `key1`,
    -> if(b.product_data IS NOT NULL,a.key2,null) AS `key2`,
    -> if(b.product_data IS NOT NULL,a.info1,null) AS `info1`, 
    ->         if(b.product_data IS NOT NULL,a.info2,null) AS `info2`
    -> FROM main_info a inner join product1 b on  a.key1 = b.key1 AND a.key2= b.key2
    -> ) as sub group by info1,info2
    -> ;
+------+------+-------+-------+--------------------+------------------+
| key1 | key2 | info1 | info2 | product_data_count | product_data_ids |
+------+------+-------+-------+--------------------+------------------+
|    1 |    2 |    14 |    92 |                  1 | 1 2              |
|    1 |    3 |    15 |    82 |                  3 | 1 3,1 4,1 5      |
|    1 |    1 |    15 |    90 |                  1 | 1 1              |
|    2 |    2 |    17 |    90 |                  3 | 2 2,2 3,2 1      |
+------+------+-------+-------+--------------------+------------------+
4 rows in set (0.01 sec)
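
One detail worth checking in this second query: product_data is a varchar column, and the 'NaN' values in the sample data are ordinary strings rather than NULL, so product_data IS NOT NULL holds for every row shown. A small sketch, assuming the sample data above, that compares the two notions of "missing":

```sql
SELECT
  COUNT(*)                  AS total_rows,      -- all rows in product1
  COUNT(product_data)       AS non_null_rows,   -- COUNT(col) skips NULLs
  SUM(product_data = 'NaN') AS nan_string_rows  -- rows holding the text 'NaN'
FROM product1;
```

If 'NaN' is meant to count as missing, the condition would presumably need to be something like product_data IS NOT NULL AND product_data <> 'NaN'. That is an assumption about intent, not something the question states.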

I want to get output like this from a single query, grouped by info1, info2

+------+------+-------+-------+--------------+-------------+--------------------+------------------+
| key1 | key2 | info1 | info2 | serial_count | serial_ids  | product_data_count | product_data_ids |
+------+------+-------+-------+--------------+-------------+--------------------+------------------+
| NULL | NULL |  NULL |  NULL |            1 | NULL        |               NULL | NULL             |
|    1 |    2 |    14 |    92 |            1 | 1 2         |                  1 | 1 2              |
|    1 |    3 |    15 |    82 |            3 | 1 3,1 4,1 5 |                  3 | 1 3,1 4,1 5      |
|    1 |    1 |    15 |    90 |            1 | 1 1         |                  1 | 1 1              |
|    2 |    1 |    17 |    90 |            1 | 2 1         |                  3 | 2 2,2 3,2 1      |
+------+------+-------+-------+--------------+-------------+--------------------+------------------+

Below is the table structure

DROP TABLE IF EXISTS `main_info`;
CREATE TABLE `main_info` (
  `key1` int(11) NOT NULL,
  `key2` int(11) NOT NULL,
  `info1` int(11) NOT NULL,
  `info2` int(11) NOT NULL,
  `date` int(11) NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1;


LOCK TABLES `main_info` WRITE;
INSERT INTO `main_info` VALUES (1,1,15,90,20120501),(1,2,14,92,20120601),(1,3,15,82,20120801),(1,4,15,82,20120801),(1,5,15,82,20120802),(2,1,17,90,20130302),(2,2,17,90,20130302),(2,3,17,90,20130302),(2,4,16,88,20130601);
UNLOCK TABLES;


DROP TABLE IF EXISTS `product1`;
CREATE TABLE `product1` (
  `key1` int(11) NOT NULL,
  `key2` int(11) NOT NULL,
  `serial` int(11) NOT NULL,
  `product_data` varchar(1000) DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1;


LOCK TABLES `product1` WRITE;
INSERT INTO `product1` VALUES (1,1,0,'NaN'),(1,1,1,'NaN'),(1,1,2,'NaN'),(1,1,3,'NaN'),(1,2,0,'12.556'),(1,2,1,'13.335'),(1,3,1,'NaN'),(1,3,2,'13.556'),(1,3,3,'14.556'),(1,4,3,'NaN'),(1,5,3,'NaN'),(2,1,0,'12.556'),(2,1,1,'13.553'),(2,1,2,'NaN'),(2,2,12,'129'),(2,3,22,'NaN');
UNLOCK TABLES;

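Could someone please help me get this result in a single query?

Comments:

I can't see where the row with NULL values comes from.

@Gordon Linoff I think it is because the serial values 22 and 12 do not evaluate to true but still match the main_info table. I really don't know why this happens, but if I run the query above I get NULL.

Those rules in the query make it hard to understand what your query is really trying to do. So this is more than just a simple join.

I want to find the distinct key1, key2, info1, info2 from the main_info table wherever the serial value is less than 10 and key1, key2 match in both tables, then group them by info1, info2, and while grouping count the repeated key1, key2 pairs for each repeated info1, info2 pair and concatenate those keys.

@user3637224 Please respond to my answer; I believe it at least gives the correct output.

Answer 1: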

How about combining your two queries with a JOIN?

SQL:

 SELECT
    tbl1.key1, tbl1.key2, tbl1.info1, tbl1.info2, tbl1.serial_count, tbl1.serial_ids,
    tbl2.product_data_count, tbl2.product_data_ids
 FROM 
 (
select * , count(*) as serial_count,GROUP_CONCAT(key1,' ',key2) as serial_ids from 
 (
 SELECT distinct 
 if(b.serial  < 10,a.key1,null) AS `key1`,
 if(b.serial  < 10,a.key2,null) AS `key2`,
 if(b.serial  < 10,a.info1,null) AS `info1`, 
         if(b.serial  < 10,a.info2,null) AS `info2`
 FROM main_info a inner join product1 b on  a.key1 = b.key1 AND a.key2= b.key2
 ) as sub group by info1,info2
 ) tbl1
 LEFT OUTER JOIN 
 (
select * , count(*) as product_data_count,GROUP_CONCAT(key1,' ',key2) as product_data_ids from 
 (
 SELECT distinct 
 if(b.product_data IS NOT NULL,a.key1,null) AS `key1`,
 if(b.product_data IS NOT NULL,a.key2,null) AS `key2`,
 if(b.product_data IS NOT NULL,a.info1,null) AS `info1`, 
         if(b.product_data IS NOT NULL,a.info2,null) AS `info2`
 FROM main_info a inner join product1 b on  a.key1 = b.key1 AND a.key2= b.key2
 ) as sub group by info1,info2
 ) tbl2
 ON tbl1.info1 = tbl2.info1 AND tbl1.info2 = tbl2.info2
 ORDER BY 3,4
 ;

Output:

mysql>  SELECT
    -> tbl1.key1, tbl1.key2, tbl1.info1, tbl1.info2, tbl1.serial_count, tbl1.serial_ids,
    -> tbl2.product_data_count, tbl2.product_data_ids
    ->  FROM
    ->  (
    -> select * , count(*) as serial_count,GROUP_CONCAT(key1,' ',key2) as serial_ids from
    ->  (
    ->  SELECT distinct
    ->  if(b.serial  < 10,a.key1,null) AS `key1`,
    ->  if(b.serial  < 10,a.key2,null) AS `key2`,
    ->  if(b.serial  < 10,a.info1,null) AS `info1`,
    ->          if(b.serial  < 10,a.info2,null) AS `info2`
    ->  FROM main_info a inner join product1 b on  a.key1 = b.key1 AND a.key2= b.key2
    ->  ) as sub group by info1,info2
    ->  ) tbl1
    ->  LEFT OUTER JOIN
    ->  (
    -> select * , count(*) as product_data_count,GROUP_CONCAT(key1,' ',key2) as product_data_ids from
    ->  (
    ->  SELECT distinct
    ->  if(b.product_data IS NOT NULL,a.key1,null) AS `key1`,
    ->  if(b.product_data IS NOT NULL,a.key2,null) AS `key2`,
    ->  if(b.product_data IS NOT NULL,a.info1,null) AS `info1`,
    ->          if(b.product_data IS NOT NULL,a.info2,null) AS `info2`
    ->  FROM main_info a inner join product1 b on  a.key1 = b.key1 AND a.key2= b.key2
    ->  ) as sub group by info1,info2
    ->  ) tbl2
    ->  ON tbl1.info1 = tbl2.info1 AND tbl1.info2 = tbl2.info2
    ->  ORDER BY 3,4
    ->  ;
+------+------+-------+-------+--------------+-------------+--------------------+------------------+
| key1 | key2 | info1 | info2 | serial_count | serial_ids  | product_data_count | product_data_ids |
+------+------+-------+-------+--------------+-------------+--------------------+------------------+
| NULL | NULL |  NULL |  NULL |            1 | NULL        |               NULL | NULL             |
|    1 |    2 |    14 |    92 |            1 | 1 2         |                  1 | 1 2              |
|    1 |    3 |    15 |    82 |            3 | 1 3,1 4,1 5 |                  3 | 1 3,1 4,1 5      |
|    1 |    1 |    15 |    90 |            1 | 1 1         |                  1 | 1 1              |
|    2 |    1 |    17 |    90 |            1 | 2 1         |                  3 | 2 2,2 3,2 1      |
+------+------+-------+-------+--------------+-------------+--------------------+------------------+
5 rows in set (0.01 sec)

mysql>  select version();
+-----------------+
| version()       |
+-----------------+
| 10.1.10-MariaDB |
+-----------------+
1 row in set (0.00 sec)
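
Two properties of this join are worth keeping in mind. MySQL and MariaDB have no FULL OUTER JOIN, so a group that exists only in the second derived table is dropped by a LEFT OUTER JOIN. Also, NULL = NULL is not true, so a NULL group on one side never matches a NULL group on the other. A common workaround is to UNION a LEFT JOIN with a RIGHT JOIN and compare with the NULL-safe operator <=>. The sketch below uses toy derived tables l and r with made-up values as hypothetical stand-ins for tbl1 and tbl2; it illustrates the pattern only and is not a drop-in replacement for the query above.

```sql
-- NULL = NULL yields NULL (not true); NULL <=> NULL yields 1.
SELECT NULL = NULL AS plain_eq, NULL <=> NULL AS null_safe_eq;

-- Emulated full outer join on a group column g (toy data, hypothetical values).
SELECT l.g, l.n AS left_n, r.n AS right_n
FROM (SELECT 1 AS g, 10 AS n UNION ALL SELECT 2, 20) l
LEFT JOIN (SELECT 2 AS g, 200 AS n UNION ALL SELECT 3, 300) r ON l.g <=> r.g
UNION
SELECT r.g, l.n, r.n
FROM (SELECT 1 AS g, 10 AS n UNION ALL SELECT 2, 20) l
RIGHT JOIN (SELECT 2 AS g, 200 AS n UNION ALL SELECT 3, 300) r ON l.g <=> r.g;
```

The second statement keeps group 1 (present only on the left), group 2 (present on both sides), and group 3 (present only on the right). UNION rather than UNION ALL removes the duplicate copy of the rows that both halves return.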

Comments:

Plus 1, but this method fails when, for example, tbl1 returns 5 rows and tbl2 should return 6: the output shows only 5 rows. You can truncate both tables and insert this data: @ 987654323@ INSERT INTO product1 VALUES (1,1,0,'NaN'),(1,1,1,'NaN'),(1,1,2,'NaN'),(1,1,3,'NaN'),(1,2,0,'12.556'),(1,2,1,'13.335'),(1,3,1,'NaN'),(1,3,2,'13.556'),(1,3,3,'14.556'),(1,4,3,'NaN'),(1,5,3,'NaN'),(2,1,11,'12.556'),(2,1,11,'13.553'),(2,1,11,'NaN'),(2,2,11,'129'),(2,3,11,'NaN'),(2,4,9,NULL),(2,4,19,'11');

You could try changing the LEFT OUTER JOIN to an OUTER JOIN.

The outer join does not give the expected result.

Answer 2:

Try this

SELECT 
     key1, key2, info1, info2, 
     SUM(Scount) AS serial_count, GROUP_CONCAT(Skey1, ' ', Skey2) AS serial_ids,
     SUM(Pcount) AS product_data_count, GROUP_CONCAT(Pkey1, ' ', Pkey2) AS product_data_ids 
FROM 
(

   SELECT DISTINCT 
     IF(b.serial  < 10 OR b.product_data IS NOT NULL,a.key1, NULL) AS `key1`,
     IF(b.serial  < 10 OR b.product_data IS NOT NULL,a.key2, NULL) AS `key2`,
     IF(b.serial  < 10 OR b.product_data IS NOT NULL,a.info1, NULL) AS `info1`, 
     IF(b.serial  < 10 OR b.product_data IS NOT NULL,a.info2, NULL) AS `info2`,
     IF(b.serial  < 10,a.key1, NULL) AS `Skey1`,
     IF(b.serial  < 10,a.key2, NULL) AS `Skey2`,
     IF(b.product_data IS NOT NULL,a.key1, NULL) AS `Pkey1`,
     IF(b.product_data IS NOT NULL,a.key2, NULL) AS `Pkey2`,
     IF(b.serial < 10, 1, NULL) AS `Scount`,
     IF(b.product_data IS NOT NULL, 1, NULL) AS `Pcount`
   FROM main_info a INNER JOIN product1 b ON  a.key1 = b.key1 AND a.key2= b.key2

   UNION ALL

   SELECT DISTINCT
     NULL AS `key1`,
     NULL AS `key2`,
     NULL AS `info1`,
     NULL AS `info2`,
     NULL AS `Skey1`,
     NULL AS `Skey2`,
     NULL AS `Pkey1`,
     NULL AS `Pkey2`,
     IF(serial > 9, 1, NULL) AS `Scount`,
     IF(product_data IS NULL, 1, NULL) AS `Pcount`
   FROM product1 WHERE serial > 9 xor product_data IS NULL

) AS sub GROUP BY info1,info2

Result (data from the question)

+------+------+-------+-------+--------------+-------------+--------------------+------------------+
| key1 | key2 | info1 | info2 | serial_count | serial_ids  | product_data_count | product_data_ids |
+------+------+-------+-------+--------------+-------------+--------------------+------------------+
| NULL | NULL | NULL  | NULL  | 1            | NULL        | NULL               | NULL             |
+------+------+-------+-------+--------------+-------------+--------------------+------------------+
| 1    | 2    | 14    | 92    | 1            | 1 2         | 1                  | 1 2              |
+------+------+-------+-------+--------------+-------------+--------------------+------------------+
| 1    | 3    | 15    | 82    | 3            | 1 3,1 4,1 5 | 3                  | 1 3,1 4,1 5      |
+------+------+-------+-------+--------------+-------------+--------------------+------------------+
| 1    | 1    | 15    | 90    | 1            | 1 1         | 1                  | 1 1              |
+------+------+-------+-------+--------------+-------------+--------------------+------------------+

Result (data from the comment)

+------+------+-------+-------+--------------+-------------+--------------------+------------------+
| key1 | key2 | info1 | info2 | serial_count | serial_ids  | product_data_count | product_data_ids |
+------+------+-------+-------+--------------+-------------+--------------------+------------------+
| NULL | NULL | NULL  | NULL  | 1            | NULL        | 1                  | NULL             |
+------+------+-------+-------+--------------+-------------+--------------------+------------------+
| 1    | 2    | 14    | 92    | 1            | 1 2         | 1                  | 1 2              |
+------+------+-------+-------+--------------+-------------+--------------------+------------------+
| 1    | 3    | 15    | 82    | 3            | 1 3,1 4,1 5 | 3                  | 1 3,1 4,1 5      |
+------+------+-------+-------+--------------+-------------+--------------------+------------------+
| 1    | 1    | 15    | 90    | 1            | 1 1         | 1                  | 1 1              |
+------+------+-------+-------+--------------+-------------+--------------------+------------------+
| 2    | 4    | 16    | 88    | 1            | 2 4         | 1                  | 2 4              |
+------+------+-------+-------+--------------+-------------+--------------------+------------------+
| 2    | 1    | 17    | 90    | NULL         | NULL        | 3                  | 2 1,2 2,2 3      |
+------+------+-------+-------+--------------+-------------+--------------------+------------------+

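Note:

There are things about the basic logic behind the question that I don't fully understand, so I answered mainly based on the expected result. For example, if the grouping fields (info1, info2) are NULL, the other results will always be NULL, except that serial_count and product_data_count can be 1 or NULL. Is that really what you want? Note that this answer uses another subquery with UNION ALL to satisfy that.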
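
To see which rows the second branch of the UNION ALL picks up, it can help to evaluate the two conditions and their XOR side by side. This is only an inspection sketch against the sample product1 table; the column aliases are made up for illustration, and it does not claim to capture the answer author's reasoning:

```sql
SELECT
  key1, key2, serial, product_data,
  serial > 9                            AS serial_fails,  -- 1 when serial is 10 or more
  product_data IS NULL                  AS data_missing,  -- 1 when product_data is NULL
  (serial > 9 XOR product_data IS NULL) AS exactly_one    -- 1 when exactly one of the two holds
FROM product1
ORDER BY key1, key2, serial;
```

XOR is true only when exactly one condition holds, so a row that fails both tests (serial of 10 or more and a NULL product_data) is excluded from that branch, as is a row that passes both.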

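Comments:

Can you explain it? I did not follow the logic.

Why did you use XOR there?

Got it, man. It works great, thanks for helping me.

Answer 3:

From your quote, it looks to me like you want to do something like this (SQLfiddle):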

SELECT
  m.info1,
  m.info2,
  COUNT(DISTINCT CONCAT(m.key1, ' ', m.key2)) key_count,
  GROUP_CONCAT(DISTINCT CONCAT(m.key1, ' ', m.key2) ORDER BY m.key1, m.key2) key_pairs,
  COUNT(DISTINCT p.serial) serial_count,
  GROUP_CONCAT(DISTINCT p.serial ORDER BY p.serial) serials,
  COUNT(DISTINCT p.product_data) data_count,
  GROUP_CONCAT(DISTINCT p.product_data ORDER BY p.product_data) product_data
FROM
  main_info m INNER JOIN
  product1 p ON p.key1 = m.key1 AND p.key2 = m.key2
WHERE
  p.serial < 10
GROUP BY
  m.info1,
  m.info2

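Counting the distinct values and listing them, is that right? You cannot group by only info1, info2 and also have plain columns for key1 or key2 in the result (something like min(key1) or max(key2) would work, though). I adjusted for this in the query above. Although it differs considerably from your result, it may be what you actually need, possibly with some changes.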
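
Another way to get both counts in one pass is conditional aggregation: the per-column test goes inside the aggregate instead of the WHERE clause, so each aggregate only sees the key pairs that pass its own test. The following is a sketch under that assumption, not the query from this answer, and it does not reproduce the all-NULL row from the expected output; key pairs that fail a test simply do not contribute to that column.

```sql
SELECT
  a.info1,
  a.info2,
  -- key pairs with at least one product1 row where serial < 10
  COUNT(DISTINCT IF(b.serial < 10, CONCAT(a.key1, ' ', a.key2), NULL)) AS serial_count,
  GROUP_CONCAT(DISTINCT IF(b.serial < 10, CONCAT(a.key1, ' ', a.key2), NULL)) AS serial_ids,
  -- key pairs with at least one product1 row where product_data is not NULL
  COUNT(DISTINCT IF(b.product_data IS NOT NULL, CONCAT(a.key1, ' ', a.key2), NULL)) AS product_data_count,
  GROUP_CONCAT(DISTINCT IF(b.product_data IS NOT NULL, CONCAT(a.key1, ' ', a.key2), NULL)) AS product_data_ids
FROM main_info a
INNER JOIN product1 b ON a.key1 = b.key1 AND a.key2 = b.key2
GROUP BY a.info1, a.info2;
```

COUNT(DISTINCT ...) and GROUP_CONCAT both skip NULL, which is what makes the IF(..., NULL) trick work. With the question's sample data, the (17, 90) group should come out with serial_count 1 and product_data_count 3, matching the last row of the expected output.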

Comments:

Sorry, this does not give the expected result.
