Avoid duplicate entries using full JOIN with SUM and GROUP BY

【Posted】: 2018-11-25 18:25:42 【Question】:

I am using HSQLDB as my database, and in the following scenario I need to avoid duplicate entries when joining two tables.

Table 1

HMEXPENSE
+--------+---------------+-------------+
| USERID | EXPENSEAMOUNT | EXPENSEDATE |
+--------+---------------+-------------+
|      a |      100      | 2018-10-10  |
|      a |      200      | 2018-10-11  |
|      a |      100      | 2018-10-11  |
|      a |      200      | 2018-10-13  |
+--------+---------------+-------------+

Table 2

HMINCOME
+--------+---------------+-------------+
| USERID | INCOMEAMOUNT  | INCOMEDATE  |
+--------+---------------+-------------+
|      a |      200      | 2018-10-10  |
|      a |      100      | 2018-10-11  |
|      a |      200      | 2018-10-11  |
|      a |      100      | 2018-10-12  |
+--------+---------------+-------------+

The current query, which gives me duplicate entries, is:

SELECT e.expenseDate ,i.incomeDate , SUM(e.expenseAmount), SUM(i.incomeAmount)
FROM HMINCOME i FULL JOIN HMEXPENSE e on i.incomeDate = e.expenseDate 
GROUP BY i.incomeDate,e.expenseDate, i.incomeAmount, e.expenseAmount

Output

+-------------+------------+-------+-------+
| EXPENSEDATE | INCOMEDATE |   C3  |   C4  |
+-------------+------------+-------+-------+
|  2018-10-10 | 2018-10-10 | 100.0 | 200.0 |
|  2018-10-11 | 2018-10-11 | 200.0 | 100.0 |
|  2018-10-11 | 2018-10-11 | 100.0 | 100.0 |
|  2018-10-11 | 2018-10-11 | 200.0 | 200.0 |
|  2018-10-11 | 2018-10-11 | 100.0 | 200.0 |
|   <null>    | 2018-10-12 | <null>| 100.0 |
|  2018-10-13 |   <null>   | 200.0 | <null>|
+-------------+------------+-------+-------+

If I instead group only by the two date columns, which is what my actual scenario needs, the query and its output are:

SELECT e.expenseDate, i.incomeDate , SUM(e.expenseAmount),SUM(i.incomeAmount)
FROM HMINCOME i FULL JOIN HMEXPENSE e on i.incomeDate = e.expenseDate 
GROUP BY i.incomeDate,e.expenseDate

Output

+-------------+------------+-------+-------+
| EXPENSEDATE | INCOMEDATE |   C3  |   C4  |
+-------------+------------+-------+-------+
|  2018-10-10 | 2018-10-10 | 100.0 | 200.0 |
|  2018-10-11 | 2018-10-11 | 600.0 | 600.0 |
|   <null>    | 2018-10-12 | <null>| 100.0 |
|  2018-10-13 |   <null>   | 200.0 | <null>|
+-------------+------------+-------+-------+

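The requirement is to get the sum of the amounts for each day, with null entries for dates that do not exist in the other table.

The expected output is: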

+-------------+------------+-------+-------+
| EXPENSEDATE | INCOMEDATE |   C3  |   C4  |
+-------------+------------+-------+-------+
|  2018-10-10 | 2018-10-10 | 100.0 | 200.0 |
|  2018-10-11 | 2018-10-11 | 300.0 | 300.0 |
|   <null>    | 2018-10-12 | <null>| 100.0 |
|  2018-10-13 |   <null>   | 200.0 | <null>|
+-------------+------------+-------+-------+

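Because of the duplicate entries, the C3 and C4 column values are not calculated correctly.

Please help.

【Comments】:

Please don't spam tags for other RDBMS products. Use only the tags that actually apply.

You generally GROUP BY the columns you select, except those that are arguments to set functions. I.e., try GROUP BY i.incomeDate, e.expenseDate

【Answer 1】: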

One way to solve this is to use union all and group by:

select dte, sum(incomeamount) as incomeamount, sum(expenseamount) as expenseamount
from ((select incomedate as dte, incomeamount, 0 as expenseamount
       from hmincome
      ) union all
      (select expensedate, 0, expenseAmount
       from hmexpense
      )
     ) ie
group by dte
order by dte;
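
As a quick check of the query above, here is a minimal sketch that runs it against the question's sample data. It uses Python's built-in sqlite3 module as a stand-in for HSQLDB, which is an assumption made only so the example runs anywhere; the helper name `daily_totals` and its `placeholder` parameter are hypothetical. It also suggests that the 0 placeholders produce 0 rather than null for dates missing from one table, while NULL placeholders reproduce the nulls in the expected output (on HSQLDB the NULL may need a CAST to the amount type).

```python
import sqlite3

# In-memory stand-in for the HSQLDB tables from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hmexpense (userid TEXT, expenseamount REAL, expensedate TEXT);
CREATE TABLE hmincome  (userid TEXT, incomeamount  REAL, incomedate  TEXT);
INSERT INTO hmexpense VALUES
  ('a', 100, '2018-10-10'), ('a', 200, '2018-10-11'),
  ('a', 100, '2018-10-11'), ('a', 200, '2018-10-13');
INSERT INTO hmincome VALUES
  ('a', 200, '2018-10-10'), ('a', 100, '2018-10-11'),
  ('a', 200, '2018-10-11'), ('a', 100, '2018-10-12');
""")

def daily_totals(conn, placeholder):
    """Stack both tables with UNION ALL, then sum per date.

    placeholder is the literal used for the "other" amount column:
    "0" as in the answer, or "NULL". It is spliced into the SQL text,
    so only trusted literals should be passed.
    """
    sql = f"""
        SELECT dte, SUM(incomeamount), SUM(expenseamount)
        FROM (SELECT incomedate AS dte, incomeamount, {placeholder} AS expenseamount
              FROM hmincome
              UNION ALL
              SELECT expensedate, {placeholder}, expenseamount
              FROM hmexpense) ie
        GROUP BY dte
        ORDER BY dte
    """
    return conn.execute(sql).fetchall()

with_zero = daily_totals(conn, "0")     # missing side sums to 0
with_null = daily_totals(conn, "NULL")  # missing side sums to NULL (None)

for row in with_null:
    print(row)
```

Because no join is involved, rows are never paired up, so each amount is counted exactly once regardless of how many rows share a date.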

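【Comments】:

Thanks. This worked for me. Removed the "ie" from the answer.

【Answer 2】:

The problem here is that your tables have multiple rows for the same date, so the join pairs every income row with every expense row on that date. We therefore need to aggregate each table in a subquery first, and then use those aggregated results in the FULL JOIN.

Try: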

SELECT 
  e.expenseDate,
  i.incomeDate, 
  e.sumExpenseAmount, 
  i.sumIncomeAmount
FROM 
(SELECT incomeDate, SUM(incomeAmount) sumIncomeAmount
 FROM HMINCOME
 GROUP BY incomeDate) i
FULL JOIN 
(SELECT expenseDate, SUM(expenseAmount) sumExpenseAmount
 FROM HMEXPENSE
 GROUP BY expenseDate) e
  ON i.incomeDate = e.expenseDate 
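
To illustrate why joining before aggregating inflates the sums, the sketch below looks only at 2018-10-11, the date with two rows in each table. It again uses Python's built-in sqlite3 module as a stand-in for HSQLDB (an assumption), and a plain inner JOIN, which is enough here because only a date present in both tables is examined. The variable names `joined` and `pre_aggregated` are hypothetical.

```python
import sqlite3

# In-memory stand-in for the HSQLDB tables from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hmexpense (userid TEXT, expenseamount REAL, expensedate TEXT);
CREATE TABLE hmincome  (userid TEXT, incomeamount  REAL, incomedate  TEXT);
INSERT INTO hmexpense VALUES
  ('a', 100, '2018-10-10'), ('a', 200, '2018-10-11'),
  ('a', 100, '2018-10-11'), ('a', 200, '2018-10-13');
INSERT INTO hmincome VALUES
  ('a', 200, '2018-10-10'), ('a', 100, '2018-10-11'),
  ('a', 200, '2018-10-11'), ('a', 100, '2018-10-12');
""")

# Join the raw rows first: 2 income rows x 2 expense rows = 4 joined rows,
# so every amount is counted twice and 300 becomes 600.
joined = conn.execute("""
    SELECT COUNT(*), SUM(e.expenseamount), SUM(i.incomeamount)
    FROM hmincome i
    JOIN hmexpense e ON i.incomedate = e.expensedate
    WHERE i.incomedate = '2018-10-11'
""").fetchone()

# Aggregate each table to one row per date first, then join:
# exactly one row per side, so nothing is double counted.
pre_aggregated = conn.execute("""
    SELECT e.total, i.total
    FROM (SELECT incomedate, SUM(incomeamount) AS total
          FROM hmincome GROUP BY incomedate) i
    JOIN (SELECT expensedate, SUM(expenseamount) AS total
          FROM hmexpense GROUP BY expensedate) e
      ON i.incomedate = e.expensedate
    WHERE i.incomedate = '2018-10-11'
""").fetchone()

print(joined)          # (4, 600.0, 600.0)
print(pre_aggregated)  # (300.0, 300.0)
```

The 600.0 values match the incorrect output in the question, and the 300.0 values match the expected output.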

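【Comments】:

The sum columns in the subqueries need column aliases.

That outer GROUP BY is not needed.

Syntax supported by HSQLDB

【Answer 3】:

Thanks for the answers. Both of the posted answers worked for me.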
