MySQL group by date and count including missing dates
【Posted】: 2014-11-06 09:51:56
【Question】:
Previously I was running the following query to get daily counts from the reports table.
SELECT COUNT(*) AS count_all, tracked_on
FROM `reports`
WHERE (domain_id = 939 AND tracked_on >= '2014-01-01' AND tracked_on <= '2014-12-31')
GROUP BY tracked_on
ORDER BY tracked_on ASC;
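Obviously, this does not give me a count of 0 for the dates that are missing.
I then found an optimum solution for generating a series of dates within a given range. The next challenge is joining that series to my reports table and grouping the counts by date.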
select count(*), all_dates.Date as the_date, domain_id
from (
select curdate() - INTERVAL (a.a + (10 * b.a) + (100 * c.a)) DAY as Date
from (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as a
cross join (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as b
cross join (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as c
) all_dates
inner JOIN reports r
on all_dates.Date >= '2014-01-01'
and all_dates.Date <= '2014-12-31'
where all_dates.Date between '2014-01-01' and '2014-12-31' AND domain_id = 939 GROUP BY the_date order by the_date ASC ;
The result I get is:
count(*) the_date domain_id
46 2014-01-01 939
46 2014-01-02 939
46 2014-01-03 939
46 2014-01-04 939
46 2014-01-05 939
46 2014-01-06 939
46 2014-01-07 939
46 2014-01-08 939
46 2014-01-09 939
46 2014-01-10 939
46 2014-01-11 939
46 2014-01-12 939
46 2014-01-13 939
46 2014-01-14 939
...
whereas I would like the missing dates filled in with 0, something like:
count(*) the_date domain_id
12 2014-01-01 939
23 2014-01-02 939
46 2014-01-03 939
0 2014-01-04 939
0 2014-01-05 939
99 2014-01-06 939
1 2014-01-07 939
5 2014-01-08 939
...
Another attempt I made was:
select count(*), all_dates.Date as the_date, domain_id
from (
select curdate() - INTERVAL (a.a + (10 * b.a) + (100 * c.a)) DAY as Date
from (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as a
cross join (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as b
cross join (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as c
) all_dates
inner JOIN reports r
on all_dates.Date = r.tracked_on
where all_dates.Date between '2014-01-01' and '2014-12-31' AND domain_id = 939 GROUP BY the_date order by the_date ASC ;
Result:
count(*) the_date domain_id
38 2014-09-03 939
8 2014-09-04 939
Minimal data for the queries above: http://sqlfiddle.com/#!2/dee3e/6
【Comments】:
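If you like, consider following this simple two-step course of action: 1. If you have not already done so, provide proper DDL (and/or an sqlfiddle) so that we can more easily replicate the problem. 2. If you have not already done so, provide a desired result set that corresponds to the information provided in step 1.
Sure, sqlfiddle.com/#!2/dee3e/6 is a minimal table with rows.
It is only a suggestion; you do not have to follow it.
【Answer 1】:
You need an OUTER JOIN to get every day between the start and the end, because an INNER JOIN restricts the output to the dates that actually join (i.e. only the dates present in the reports table).
In addition, when you use an OUTER JOIN you must take care that conditions in the where clause do not cause an implicit inner join. For example, AND domain_id = 1 used in the where clause suppresses every row that does not meet that condition, whereas used as a join condition it only restricts the rows taken from the reports table.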
SELECT
COUNT(r.domain_id)
, all_dates.Date AS the_date
, domain_id
FROM (
SELECT DATE_ADD(curdate(), INTERVAL 2 MONTH) - INTERVAL (a.a + (10 * b.a) ) DAY as Date
FROM (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as a
CROSS JOIN (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as b
) all_dates
LEFT OUTER JOIN reports r
ON all_dates.Date = r.tracked_on
AND domain_id = 1
WHERE all_dates.Date BETWEEN '2014-09-01' AND '2014-09-30'
GROUP BY
the_date
ORDER BY
the_date ASC;
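I also changed the all_dates derived table: it uses DATE_ADD() to push the starting point into the future, and I reduced its size. Both are options that you can adjust as needed.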
Demo at SQLfiddle
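For contrast, here is a sketch of the pitfall described above, with the domain filter left in the WHERE clause. The three-day calendar is a hypothetical stand-in for all_dates, used only to keep the example short; the reports table and its tracked_on and domain_id columns are taken from the question.

```sql
-- Pitfall: the filter on the outer-joined table sits in WHERE.
SELECT d.the_date, COUNT(r.domain_id) AS count_all
FROM (
    -- hypothetical three-day calendar standing in for all_dates
    SELECT DATE('2014-09-03') AS the_date
    UNION ALL SELECT DATE('2014-09-04')
    UNION ALL SELECT DATE('2014-09-05')
) AS d
LEFT JOIN reports AS r
       ON r.tracked_on = d.the_date
WHERE r.domain_id = 1   -- NULL on days without a report, so those rows are filtered out
GROUP BY d.the_date
ORDER BY d.the_date;
```

On a day with no matching report, r.domain_id is NULL, so the WHERE predicate removes that row and the result behaves like an INNER JOIN again. Moving the predicate into the ON clause, as in the answer's query, keeps such days with a count of 0.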
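To get domain_id on every row (as shown in your question) you need something like the following. Note that you could use the MySQL-specific IFNULL(), but I have used COALESCE(), which is more generic SQL. However, the use of the @parameter shown here is MySQL-specific anyway.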
SET @domain := 1;
SELECT
COUNT(r.domain_id)
, all_dates.Date AS the_date
, coalesce(domain_id,@domain) AS domain_id
FROM (
SELECT DATE_ADD(curdate(), INTERVAL 2 month) - INTERVAL (a.a + (10 * b.a) ) DAY as Date
FROM (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as a
CROSS JOIN (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as b
) all_dates
LEFT JOIN reports r
ON all_dates.Date = r.tracked_on
AND domain_id = @domain
WHERE all_dates.Date BETWEEN '2014-09-01' AND '2014-09-30'
GROUP BY
the_date
ORDER BY
the_date ASC;
See this at SQLfiddle
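One caveat, offered as an assumption about newer servers rather than something stated in the answer: with ONLY_FULL_GROUP_BY enabled (part of the default sql_mode since MySQL 5.7.5), selecting the non-aggregated domain_id next to GROUP BY the_date is likely to be rejected. A minimal sketch that sidesteps this by returning the variable itself, with the count_all alias added for readability:

```sql
SET @domain := 1;

SELECT
      COUNT(r.domain_id) AS count_all   -- counts matched report rows only; 0 on empty days
    , all_dates.Date     AS the_date
    , @domain            AS domain_id   -- same value on every row, not a table column
FROM (
    -- same 100-day series as in the answer: starts two months ahead of today and goes back
    SELECT DATE_ADD(CURDATE(), INTERVAL 2 MONTH) - INTERVAL (a.a + (10 * b.a)) DAY AS Date
    FROM (SELECT 0 AS a UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4
          UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS a
    CROSS JOIN (SELECT 0 AS a UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4
          UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS b
) all_dates
LEFT JOIN reports r
       ON all_dates.Date = r.tracked_on
      AND r.domain_id = @domain         -- filter stays in ON so empty days survive
WHERE all_dates.Date BETWEEN '2014-09-01' AND '2014-09-30'
GROUP BY all_dates.Date
ORDER BY all_dates.Date ASC;
```

Because the series is still relative to CURDATE(), the September 2014 range only returns rows when the generated window overlaps it; adjust the offset or the range to suit your data.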
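【Comments】:
Awesome! It just worked :) Is it possible to also have the days grouped by week and by month? Just as we have all the "days", could I have all the weeks of a year?
Glad this answered your question. Please take a second to click the tick mark; that indicates the answer has been accepted. Larger units of time such as weeks, months, and years can be used, but we usually work with date ranges (from/to date pairs).
【Answer 2】:
The all_dates subquery only looks back from the current day (curdate()). If you want to include future dates, change the first line of the subquery to: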
select '2015-01-01' - INTERVAL (a.a + (10 * b.a) + (100 * c.a)) DAY as Date
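On MySQL 8.0 or later, which postdates this thread, a recursive common table expression is another way to build a series anchored to fixed dates rather than to curdate(). The following is a minimal sketch, assuming tracked_on is a DATE column and reusing the domain and date range from the question; the names the_date and count_all are chosen here for illustration.

```sql
-- Requires MySQL 8.0+ (recursive CTEs are not available in 5.x).
WITH RECURSIVE all_dates (the_date) AS (
    SELECT DATE('2014-01-01')             -- first day of the range
    UNION ALL
    SELECT the_date + INTERVAL 1 DAY      -- step forward one day at a time
    FROM all_dates
    WHERE the_date < '2014-12-31'         -- stop once the last day has been produced
)
SELECT COUNT(r.domain_id) AS count_all, d.the_date
FROM all_dates AS d
LEFT JOIN reports AS r
       ON r.tracked_on = d.the_date
      AND r.domain_id = 939               -- filter in ON so days without reports still appear
GROUP BY d.the_date
ORDER BY d.the_date;
```

The server variable cte_max_recursion_depth defaults to 1000, which covers a one-year range; a much longer range would need that limit raised for the session.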
【Comments】: