How to get missing values in date range?
【Posted】: 2014-09-16 17:08:44
【Question】: I have a table with the following structure (the CREATE TABLE script is given at the end of the question):
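I am trying to get grouped values between two dates. The problem is that I also want rows returned for dates that have no matching records. For example, I have
the range WHERE m.date BETWEEN "2014-09-02" AND "2014-09-10"
but there is no row in the table for, say, 2014-09-06, so the result for that date should be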
2014-09-06| 0 | 0 | 0 | 0
How can I do this? (With an SQLite database, if possible.)
This is the query I am using:
SELECT substr(m.date, 1, 10) as my_date, COUNT(m.ID) AS 'NUMBER_OF_ALL_CALLS',
(SELECT COUNT(*) FROM dialed_calls subq WHERE subq.call_result = 'DONE'
AND substr(m.date, 1, 10) = substr(subq.DATE, 1, 10)) as 'RESULT_DONE',
(SELECT COUNT(*) FROM dialed_calls subq WHERE subq.call_result = 'NOT_INTERESTED'
AND substr(m.date, 1, 10) = substr(subq.DATE, 1, 10)) as 'RESULT_NOT_INTERESTED',
(SELECT COUNT(*) FROM dialed_calls subq WHERE subq.call_result = 'NO_APPOINTMENT'
AND substr(m.date, 1, 10) = substr(subq.DATE, 1, 10)) as 'RESULT_NO_APP'
FROM dialed_calls m
WHERE m.date BETWEEN "2014-09-02" AND "2014-09-05"
GROUP BY my_date
Many thanks for any help.
Table structure:
BEGIN TRANSACTION;
CREATE TABLE dialed_calls(Id integer PRIMARY KEY,
'date' datetime,
'called_number' VARCHAR(45),
'call_result' VARCHAR(45),
'call_duration' INT,
'synced' BOOL);
/* Create few records in this table */
INSERT INTO dialed_calls VALUES(1,'2014-09-02 15:54:34+0200',
'800123456', 'NOT_INTERESTED', 10, 0);
INSERT INTO dialed_calls VALUES(2,'2014-09-02 15:56:30+0200',
'800123456', 'NO_APPOINTMENT', 10, 0);
INSERT INTO dialed_calls VALUES(3,'2014-09-02 16:01:49+0200',
'800123456', 'DONE', 9, 0);
INSERT INTO dialed_calls VALUES(4,'2014-09-02 16:03:03+0200',
'800123456', 'NO_APPOINTMENT', 69, 0);
INSERT INTO dialed_calls VALUES(5,'2014-09-02 18:09:34+0200',
'800123456', 'NO_APPOINTMENT', 3, 0);
INSERT INTO dialed_calls VALUES(6,'2014-09-02 18:54:02+0200',
'123456789', 'NO_APPOINTMENT', 89, 0);
INSERT INTO dialed_calls VALUES(7,'2014-09-02 18:55:25+0200',
'123456789', 'NOT_INTERESTED', 89, 0);
INSERT INTO dialed_calls VALUES(8,'2014-09-03 18:36:58+0200',
'123456789', 'DONE', 185, 0);
INSERT INTO dialed_calls VALUES(9,'2014-09-04 18:36:58+0200',
'123456789', 'DONE', 185, 0);
INSERT INTO dialed_calls VALUES(10,'2014-09-05 18:36:58+0200',
'123456789', 'DONE', 185, 0);
COMMIT;
【Answer 1】: Try this:
SELECT
d.date AS DATE,
IFNULL(NUMBER_OF_ALL_CALLS, 0) AS NUMBER_OF_ALL_CALLS,
IFNULL(RESULT_DONE, 0) AS RESULT_DONE,
IFNULL(RESULT_NOT_INTERESTED, 0) AS RESULT_NOT_INTERESTED,
IFNULL(RESULT_NO_APP, 0) AS RESULT_NO_APP
FROM
(SELECT DATE('1970-01-01', '+' || (t4.i*10000 + t3.i*1000 + t2.i*100 + t1.i*10 + t0.i) || ' days') date FROM
(SELECT 0 i UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) t0,
(SELECT 0 i UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) t1,
(SELECT 0 i UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) t2,
(SELECT 0 i UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) t3,
(SELECT 0 i UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) t4) d
LEFT JOIN
(
SELECT substr(m.date, 1, 10) as my_date, COUNT(m.ID) AS 'NUMBER_OF_ALL_CALLS',
(SELECT COUNT(*) FROM dialed_calls subq WHERE subq.call_result = 'DONE'
AND substr(m.date, 1, 10) = substr(subq.DATE, 1, 10)) as 'RESULT_DONE',
(SELECT COUNT(*) FROM dialed_calls subq WHERE subq.call_result = 'NOT_INTERESTED'
AND substr(m.date, 1, 10) = substr(subq.DATE, 1, 10)) as 'RESULT_NOT_INTERESTED',
(SELECT COUNT(*) FROM dialed_calls subq WHERE subq.call_result = 'NO_APPOINTMENT'
AND substr(m.date, 1, 10) = substr(subq.DATE, 1, 10)) as 'RESULT_NO_APP'
FROM dialed_calls m
GROUP BY my_date
) t
ON d.date = t.my_date
WHERE d.date BETWEEN '2014-09-02' AND '2014-09-10'
ORDER BY d.date;
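The query above first generates a list of calendar dates, restricted to the requested range by the outer WHERE clause, and then left-joins the aggregated values from your table onto those dates, so days without any calls still appear, with zeros.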
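As a rough illustration of the same idea, the sketch below generates the date series with a recursive common table expression instead of the cross-joined digit tables used in the answer. This is a swapped-in technique, not the answer's own method, and it assumes SQLite 3.8.3 or later. The table is reduced to three columns, only a few of the sample rows are loaded, and only the total per day is counted; the names `days`, `first_day` and `last_day` are made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified version of the question's table (assumption: only these columns matter here).
conn.execute(
    "CREATE TABLE dialed_calls (Id INTEGER PRIMARY KEY, date TEXT, call_result TEXT)"
)
conn.executemany(
    "INSERT INTO dialed_calls (date, call_result) VALUES (?, ?)",
    [
        ("2014-09-02 15:54:34+0200", "NOT_INTERESTED"),
        ("2014-09-02 16:01:49+0200", "DONE"),
        ("2014-09-03 18:36:58+0200", "DONE"),
        ("2014-09-05 18:36:58+0200", "DONE"),
    ],
)

# The recursive CTE yields one row per day from :first_day to :last_day inclusive.
# COUNT(m.Id) ignores the NULLs produced by the LEFT JOIN, so empty days get 0.
DATE_SERIES_QUERY = """
WITH RECURSIVE days(day) AS (
    SELECT :first_day
    UNION ALL
    SELECT DATE(day, '+1 day') FROM days WHERE day < :last_day
)
SELECT d.day, COUNT(m.Id) AS number_of_all_calls
FROM days AS d
LEFT JOIN dialed_calls AS m ON substr(m.date, 1, 10) = d.day
GROUP BY d.day
ORDER BY d.day
"""

rows = conn.execute(
    DATE_SERIES_QUERY, {"first_day": "2014-09-02", "last_day": "2014-09-10"}
).fetchall()
for day, number_of_all_calls in rows:
    print(day, number_of_all_calls)
```

Because the series contains only the requested days, the engine never has to build and then discard the 100,000 dates starting at 1970-01-01.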
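【Comments】:
Thank you very much, but the query is very slow (I need it for a mobile app). Is there any way to speed it up? For example, by changing 1970-01-01 to a date closer to today?
1970-01-01 is the start date. If you want the series to start from a more recent date, you can change that value. You can also optimize the query for a better response time.
What would you recommend to optimize this query?
For example, instead of the inner query you could write a query like this:
SELECT substr(m.date, 1, 10) as my_date, COUNT(m.ID) AS 'NUMBER_OF_ALL_CALLS', call_result AS 'CALL_RESULT' FROM dialed_calls m WHERE m.date BETWEEN "2014-09-02" AND "2014-09-05" GROUP BY my_date, call_result;
However, this query returns the call_result values vertically, that is, one row per date and call result. You can pivot the data later in the application.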
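One possible way to do that pivot in application code is sketched below in Python. The reduced table, the sample rows, and the names `RESULT_TYPES` and `per_day` are assumptions made for this illustration; a mobile app would do the equivalent in its own language.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dialed_calls (Id INTEGER PRIMARY KEY, date TEXT, call_result TEXT)"
)
conn.executemany(
    "INSERT INTO dialed_calls (date, call_result) VALUES (?, ?)",
    [
        ("2014-09-02 15:54:34+0200", "NOT_INTERESTED"),
        ("2014-09-02 15:56:30+0200", "NO_APPOINTMENT"),
        ("2014-09-02 16:01:49+0200", "DONE"),
        ("2014-09-02 16:03:03+0200", "NO_APPOINTMENT"),
        ("2014-09-03 18:36:58+0200", "DONE"),
    ],
)

# "Vertical" result: one row per (date, call_result) pair.
vertical_rows = conn.execute(
    """
    SELECT substr(date, 1, 10) AS my_date, call_result, COUNT(*) AS n
    FROM dialed_calls
    GROUP BY my_date, call_result
    """
).fetchall()

# Pivot into one dict per day, with every known result type defaulting to 0.
RESULT_TYPES = ("DONE", "NOT_INTERESTED", "NO_APPOINTMENT")
per_day = defaultdict(lambda: dict.fromkeys(RESULT_TYPES, 0))
for my_date, call_result, n in vertical_rows:
    per_day[my_date][call_result] = n

for my_date in sorted(per_day):
    counts = per_day[my_date]
    print(my_date, sum(counts.values()), counts)
```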
Or you can try this query:
SELECT substr(m.date, 1, 10) as my_date, SUM(1) AS 'NUMBER_OF_ALL_CALLS', SUM(CASE WHEN call_result = 'DONE' THEN 1 ELSE 0 END) AS 'RESULT_DONE', SUM(CASE WHEN call_result = 'NOT_INTERESTED' THEN 1 ELSE 0 END) AS 'RESULT_NOT_INTERESTED', SUM(CASE WHEN call_result = 'NO_APPOINTMENT' THEN 1 ELSE 0 END) AS 'RESULT_NO_APP' FROM dialed_calls m WHERE m.date BETWEEN "2014-09-02" AND "2014-09-05" GROUP BY my_date
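To check that the single-pass SUM(CASE ...) form is a drop-in replacement for the correlated subqueries, the sketch below runs both against the sample rows from the question and compares the results. The table is reduced to three columns and the WHERE clause is left out of both queries so that the comparison covers every stored day; both simplifications are assumptions of this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dialed_calls (Id INTEGER PRIMARY KEY, date TEXT, call_result TEXT)"
)
conn.executemany(
    "INSERT INTO dialed_calls (date, call_result) VALUES (?, ?)",
    [
        ("2014-09-02 15:54:34+0200", "NOT_INTERESTED"),
        ("2014-09-02 15:56:30+0200", "NO_APPOINTMENT"),
        ("2014-09-02 16:01:49+0200", "DONE"),
        ("2014-09-02 16:03:03+0200", "NO_APPOINTMENT"),
        ("2014-09-02 18:09:34+0200", "NO_APPOINTMENT"),
        ("2014-09-02 18:54:02+0200", "NO_APPOINTMENT"),
        ("2014-09-02 18:55:25+0200", "NOT_INTERESTED"),
        ("2014-09-03 18:36:58+0200", "DONE"),
        ("2014-09-04 18:36:58+0200", "DONE"),
        ("2014-09-05 18:36:58+0200", "DONE"),
    ],
)

# Original style: one correlated subquery per result type, each re-scanning the table.
CORRELATED = """
SELECT substr(m.date, 1, 10) AS my_date, COUNT(m.Id),
  (SELECT COUNT(*) FROM dialed_calls s WHERE s.call_result = 'DONE'
     AND substr(s.date, 1, 10) = substr(m.date, 1, 10)),
  (SELECT COUNT(*) FROM dialed_calls s WHERE s.call_result = 'NOT_INTERESTED'
     AND substr(s.date, 1, 10) = substr(m.date, 1, 10)),
  (SELECT COUNT(*) FROM dialed_calls s WHERE s.call_result = 'NO_APPOINTMENT'
     AND substr(s.date, 1, 10) = substr(m.date, 1, 10))
FROM dialed_calls m
GROUP BY my_date
ORDER BY my_date
"""

# Suggested style: conditional aggregation in a single pass over the table.
CONDITIONAL = """
SELECT substr(m.date, 1, 10) AS my_date, SUM(1),
  SUM(CASE WHEN call_result = 'DONE' THEN 1 ELSE 0 END),
  SUM(CASE WHEN call_result = 'NOT_INTERESTED' THEN 1 ELSE 0 END),
  SUM(CASE WHEN call_result = 'NO_APPOINTMENT' THEN 1 ELSE 0 END)
FROM dialed_calls m
GROUP BY my_date
ORDER BY my_date
"""

correlated_rows = conn.execute(CORRELATED).fetchall()
conditional_rows = conn.execute(CONDITIONAL).fetchall()
print(conditional_rows[0])
print(correlated_rows == conditional_rows)
```

One difference to keep in mind: in the original query the subqueries are not restricted by the outer WHERE clause, whereas the conditional sums only see rows that pass it. With bounds that fall on day boundaries the two forms agree.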
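【Answer 2】:
This is a good use case for joining to a calendar table.
http://web.archive.org/web/20070611150639/http://sqlserver2000.databases.aspfaq.com/why-should-i-consider-using-an-auxiliary-calendar-table.html Note that this link is about SQL Server, but you can adapt it to SQLite.
You can do your calculation and then RIGHT JOIN the result to the calendar table (or, equivalently, LEFT JOIN starting from the calendar table), so that dates without data show up with NULL values. Or you can COALESCE() the NULLs into something more meaningful, such as 0.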
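A minimal sketch of this approach, assuming a persistent table named `calendar` with a single `day` column (both names are made up here) that is filled once, in this case for the year 2014. The join starts from the calendar side with LEFT JOIN, since SQLite releases before 3.39 do not support RIGHT JOIN. The `dialed_calls` table is reduced to three columns and only some sample rows are loaded.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dialed_calls (Id INTEGER PRIMARY KEY, date TEXT, call_result TEXT)"
)
conn.executemany(
    "INSERT INTO dialed_calls (date, call_result) VALUES (?, ?)",
    [
        ("2014-09-02 15:54:34+0200", "NOT_INTERESTED"),
        ("2014-09-02 15:56:30+0200", "NO_APPOINTMENT"),
        ("2014-09-02 16:01:49+0200", "DONE"),
        ("2014-09-05 18:36:58+0200", "DONE"),
    ],
)

# Materialize the calendar once: one 'YYYY-MM-DD' row per day of 2014.
conn.execute("CREATE TABLE calendar (day TEXT PRIMARY KEY)")
first = date(2014, 1, 1)
conn.executemany(
    "INSERT INTO calendar (day) VALUES (?)",
    [((first + timedelta(days=i)).isoformat(),) for i in range(365)],
)

# Calendar on the left, aggregated calls on the right; COALESCE turns NULL into 0.
CALENDAR_QUERY = """
SELECT c.day,
       COALESCE(t.total, 0),
       COALESCE(t.done, 0),
       COALESCE(t.not_interested, 0),
       COALESCE(t.no_app, 0)
FROM calendar AS c
LEFT JOIN (
    SELECT substr(date, 1, 10) AS my_date,
           COUNT(*) AS total,
           SUM(CASE WHEN call_result = 'DONE' THEN 1 ELSE 0 END) AS done,
           SUM(CASE WHEN call_result = 'NOT_INTERESTED' THEN 1 ELSE 0 END) AS not_interested,
           SUM(CASE WHEN call_result = 'NO_APPOINTMENT' THEN 1 ELSE 0 END) AS no_app
    FROM dialed_calls
    GROUP BY my_date
) AS t ON t.my_date = c.day
WHERE c.day BETWEEN ? AND ?
ORDER BY c.day
"""

report = conn.execute(CALENDAR_QUERY, ("2014-09-02", "2014-09-10")).fetchall()
for row in report:
    print(row)
```

Since `calendar.day` holds date-only strings, the BETWEEN filter on it is inclusive on both ends.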
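【Comments】:
Hi, thanks, but this seems like a rather complicated solution. I am looking for something that can be done with a simple SQL query.
It makes the query simpler! If a JOIN is too hard, I suggest taking the time to learn SQL properly. @oardic's solution basically builds that calendar table every time you run the query. Why not just materialize it?
【Answer 3】: To get the in-between dates in the query result, you need a table that contains every date in the range. In some RDBMSs you can populate a temporary table to join against. You need to compare only the date part (without the time).
Be careful with your interval: the second date is treated as having a time of 00:00:00, which may not be what you intend.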
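The boundary issue can be reproduced with two of the question's rows. In the sketch below (reduced table, made-up variable names), the values are stored as text with a time part, so a row from the evening of 2014-09-05 sorts after the bare string '2014-09-05' and falls outside the BETWEEN range. Comparing only the date part, or using a half-open range that ends at the next day, includes it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dialed_calls (Id INTEGER PRIMARY KEY, date TEXT)")
conn.executemany(
    "INSERT INTO dialed_calls (date) VALUES (?)",
    [("2014-09-04 18:36:58+0200",), ("2014-09-05 18:36:58+0200",)],
)

def count(where_clause):
    """Count rows matching the given WHERE clause (fixed literals only)."""
    sql = "SELECT COUNT(*) FROM dialed_calls WHERE " + where_clause
    return conn.execute(sql).fetchone()[0]

# Upper bound behaves like midnight at the start of 2014-09-05: that day's call is missed.
naive = count("date BETWEEN '2014-09-02' AND '2014-09-05'")

# Compare only the date part, as the answer suggests.
by_day = count("substr(date, 1, 10) BETWEEN '2014-09-02' AND '2014-09-05'")

# Or use a half-open range that ends at the start of the following day.
half_open = count("date >= '2014-09-02' AND date < '2014-09-06'")

print(naive, by_day, half_open)
```

The half-open form compares the stored column directly rather than wrapping it in substr(), so an index on that column could in principle be used for the range.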