How to get missing values in a date range?

【Title】: How to get missing values in date range? 【Posted】: 2014-09-16 17:08:44 【Question】:

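I have a table with the following structure (the CREATE TABLE statement is given at the end of the question):

I am trying to get grouped values between two dates. The problem is that I also want rows returned for dates that have no matching records. For example, I have the range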
WHERE m.date BETWEEN "2014-09-02" AND "2014-09-10"

but there are no rows in the table for the date 2014-09-06, for example, so the result for that day should be

 2014-09-06| 0 | 0 | 0 | 0

How can I do this? (If possible, with an SQLite database.)

This is the query I am using:

  SELECT substr(m.date, 1, 10) as my_date, COUNT(m.ID) AS 'NUMBER_OF_ALL_CALLS',
    (SELECT COUNT(*) FROM dialed_calls subq WHERE subq.call_result = 'DONE'
    AND substr(m.date, 1, 10) = substr(subq.DATE, 1, 10)) as 'RESULT_DONE',
    (SELECT COUNT(*) FROM dialed_calls subq WHERE subq.call_result = 'NOT_INTERESTED' 
    AND substr(m.date, 1, 10) = substr(subq.DATE, 1, 10)) as 'RESULT_NOT_INTERESTED',
    (SELECT COUNT(*) FROM dialed_calls subq WHERE subq.call_result = 'NO_APPOINTMENT'
    AND substr(m.date, 1, 10) = substr(subq.DATE, 1, 10)) as 'RESULT_NO_APP'
    FROM dialed_calls m
    WHERE m.date BETWEEN "2014-09-02" AND "2014-09-05"
    GROUP BY my_date

Thank you very much for your help.

Table structure:

BEGIN TRANSACTION;

CREATE TABLE dialed_calls(Id integer PRIMARY KEY,
'date' datetime,
'called_number' VARCHAR(45),
'call_result' VARCHAR(45),
'call_duration' INT,
'synced' BOOL);

/* Create few records in this table */
INSERT INTO dialed_calls VALUES(1,'2014-09-02 15:54:34+0200',
'800123456', 'NOT_INTERESTED', 10, 0);
INSERT INTO dialed_calls VALUES(2,'2014-09-02 15:56:30+0200',
'800123456', 'NO_APPOINTMENT', 10, 0);
INSERT INTO dialed_calls VALUES(3,'2014-09-02 16:01:49+0200',
'800123456', 'DONE', 9, 0);
INSERT INTO dialed_calls VALUES(4,'2014-09-02 16:03:03+0200',
'800123456', 'NO_APPOINTMENT', 69, 0);
INSERT INTO dialed_calls VALUES(5,'2014-09-02 18:09:34+0200',
'800123456', 'NO_APPOINTMENT', 3, 0);
INSERT INTO dialed_calls VALUES(6,'2014-09-02 18:54:02+0200',
'123456789', 'NO_APPOINTMENT', 89, 0);
INSERT INTO dialed_calls VALUES(7,'2014-09-02 18:55:25+0200',
'123456789', 'NOT_INTERESTED', 89, 0);
INSERT INTO dialed_calls VALUES(8,'2014-09-03 18:36:58+0200',
'123456789', 'DONE', 185, 0);
INSERT INTO dialed_calls VALUES(9,'2014-09-04 18:36:58+0200',
'123456789', 'DONE', 185, 0);
INSERT INTO dialed_calls VALUES(10,'2014-09-05 18:36:58+0200',
'123456789', 'DONE', 185, 0);
COMMIT;

【Comments】:

【Answer 1】:

Try this:

SELECT 
  d.date AS DATE, 
  IFNULL(NUMBER_OF_ALL_CALLS, 0) AS NUMBER_OF_ALL_CALLS, 
  IFNULL(RESULT_DONE, 0) AS RESULT_DONE, 
  IFNULL(RESULT_NOT_INTERESTED, 0) AS RESULT_NOT_INTERESTED, 
  IFNULL(RESULT_NO_APP, 0) AS RESULT_NO_APP
FROM 
 (SELECT DATE('1970-01-01', '+' || (t4.i*10000 + t3.i*1000 + t2.i*100 + t1.i*10 + t0.i) || ' days') date FROM
 (SELECT 0 i UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) t0,
 (SELECT 0 i UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) t1,
 (SELECT 0 i UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) t2,
 (SELECT 0 i UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) t3,
 (SELECT 0 i UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) t4) d
LEFT JOIN 
(
    SELECT substr(m.date, 1, 10) as my_date, COUNT(m.ID) AS 'NUMBER_OF_ALL_CALLS',
    (SELECT COUNT(*) FROM dialed_calls subq WHERE subq.call_result = 'DONE'
    AND substr(m.date, 1, 10) = substr(subq.DATE, 1, 10)) as 'RESULT_DONE',
    (SELECT COUNT(*) FROM dialed_calls subq WHERE subq.call_result = 'NOT_INTERESTED' 
    AND substr(m.date, 1, 10) = substr(subq.DATE, 1, 10)) as 'RESULT_NOT_INTERESTED',
    (SELECT COUNT(*) FROM dialed_calls subq WHERE subq.call_result = 'NO_APPOINTMENT'
    AND substr(m.date, 1, 10) = substr(subq.DATE, 1, 10)) as 'RESULT_NO_APP'
    FROM dialed_calls m
    GROUP BY my_date
) t
ON d.date = t.my_date
WHERE d.date BETWEEN '2014-09-02' AND '2014-09-10'
ORDER BY d.date;

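The query above first generates a sequence of dates (up to 100,000 days counted from 1970-01-01), restricts it to the requested date range, and then left-joins those dates with the aggregated values from your table. Dates without matching rows come back as NULL, which IFNULL turns into 0.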
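A possible way to avoid generating 100,000 dates and filtering them afterwards is a recursive common table expression that produces only the days in the requested range. The sketch below is an illustration, not the answerer's query. It assumes SQLite 3.8.3 or later (where WITH RECURSIVE is available), drives the SQL from Python's sqlite3 module only to make it runnable, and uses the hypothetical parameter names :first_day and :last_day. It also counts each result with SUM(CASE ...) instead of correlated subqueries.

```python
import sqlite3

# In-memory database loaded with the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dialed_calls(Id INTEGER PRIMARY KEY, "date" datetime,
    called_number VARCHAR(45), call_result VARCHAR(45),
    call_duration INT, synced BOOL);
INSERT INTO dialed_calls VALUES
 (1, '2014-09-02 15:54:34+0200', '800123456', 'NOT_INTERESTED', 10, 0),
 (2, '2014-09-02 15:56:30+0200', '800123456', 'NO_APPOINTMENT', 10, 0),
 (3, '2014-09-02 16:01:49+0200', '800123456', 'DONE', 9, 0),
 (4, '2014-09-02 16:03:03+0200', '800123456', 'NO_APPOINTMENT', 69, 0),
 (5, '2014-09-02 18:09:34+0200', '800123456', 'NO_APPOINTMENT', 3, 0),
 (6, '2014-09-02 18:54:02+0200', '123456789', 'NO_APPOINTMENT', 89, 0),
 (7, '2014-09-02 18:55:25+0200', '123456789', 'NOT_INTERESTED', 89, 0),
 (8, '2014-09-03 18:36:58+0200', '123456789', 'DONE', 185, 0),
 (9, '2014-09-04 18:36:58+0200', '123456789', 'DONE', 185, 0),
 (10, '2014-09-05 18:36:58+0200', '123456789', 'DONE', 185, 0);
""")

# :first_day and :last_day are hypothetical parameter names for this sketch.
QUERY = """
WITH RECURSIVE days(day) AS (
    SELECT :first_day                       -- first day of the range
    UNION ALL
    SELECT date(day, '+1 day') FROM days    -- add one day per step
    WHERE day < :last_day                   -- stop at the last day
)
SELECT d.day,
       COUNT(m.Id) AS NUMBER_OF_ALL_CALLS,
       SUM(CASE WHEN m.call_result = 'DONE' THEN 1 ELSE 0 END)
           AS RESULT_DONE,
       SUM(CASE WHEN m.call_result = 'NOT_INTERESTED' THEN 1 ELSE 0 END)
           AS RESULT_NOT_INTERESTED,
       SUM(CASE WHEN m.call_result = 'NO_APPOINTMENT' THEN 1 ELSE 0 END)
           AS RESULT_NO_APP
FROM days d
LEFT JOIN dialed_calls m ON substr(m.date, 1, 10) = d.day
GROUP BY d.day
ORDER BY d.day
"""

rows = conn.execute(
    QUERY, {"first_day": "2014-09-02", "last_day": "2014-09-10"}
).fetchall()
for row in rows:
    print(row)
```

Days without calls still produce one joined row whose dialed_calls columns are NULL, so COUNT(m.Id) and each SUM(CASE ...) evaluate to 0 without an extra IFNULL. The join on substr(m.date, 1, 10) cannot use an ordinary index on the date column, so this is not tuned for large tables.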

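【Comments】:

Thanks a lot, but the query is very slow (I need it for a mobile app). Is there any way to speed it up? For example, by changing 1970-01-01 to a date closer to today?

1970-01-01 is the start date. If you want the sequence to start from a more recent date, you can change that value. You can also optimize the query for a better response time.

What would you recommend to optimize this query?

For example, instead of the inner query you could write a query like this: SELECT substr(m.date, 1, 10) as my_date, COUNT(m.ID) AS 'NUMBER_OF_ALL_CALLS', call_result AS 'CALL_RESULT' FROM dialed_calls m WHERE m.date BETWEEN "2014-09-02" AND "2014-09-05" GROUP BY my_date, call_result; However, this query returns the call_result values vertically (one row per date and result). You can pivot the data in the application afterwards.

Or you can try this query: SELECT substr(m.date, 1, 10) as my_date, SUM(1) AS 'NUMBER_OF_ALL_CALLS', SUM(CASE WHEN call_result = 'DONE' THEN 1 ELSE 0 END) AS 'RESULT_DONE', SUM(CASE WHEN call_result = 'NOT_INTERESTED' THEN 1 ELSE 0 END) AS 'RESULT_NOT_INTERESTED', SUM(CASE WHEN call_result = 'NO_APPOINTMENT' THEN 1 ELSE 0 END) AS 'RESULT_NO_APP' FROM dialed_calls m WHERE m.date BETWEEN "2014-09-02" AND "2014-09-05" GROUP BY my_date

【Answer 2】: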

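This is a good case for joining against a calendar table.

http://web.archive.org/web/20070611150639/http://sqlserver2000.databases.aspfaq.com/why-should-i-consider-using-an-auxiliary-calendar-table.html Note that this is a SQL Server link, but you can adapt it to SQLite.

You can do the calculation and then right-join the result to the calendar table, so that dates without data show up with NULL values. Or you can use COALESCE() to turn the NULLs into something more meaningful, such as 0.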
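A minimal sketch of the calendar-table idea in SQLite, run from Python's sqlite3 module so it is self-contained. The table name calendar, its column day, and the September 2014 fill range are assumptions made for illustration, and only three sample rows are loaded. Because older SQLite versions have no RIGHT JOIN (support arrived in 3.39.0, as far as I know), the sketch puts the calendar on the left side of a LEFT JOIN, which gives the same result.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dialed_calls(Id INTEGER PRIMARY KEY, "date" datetime,
    call_result VARCHAR(45));
INSERT INTO dialed_calls VALUES
 (1, '2014-09-02 15:54:34+0200', 'NOT_INTERESTED'),
 (2, '2014-09-02 16:01:49+0200', 'DONE'),
 (3, '2014-09-05 18:36:58+0200', 'DONE');
-- Hypothetical calendar table: one row per day, filled once and reused.
CREATE TABLE calendar(day TEXT PRIMARY KEY);
""")

# Populate the calendar with every day of September 2014 (assumed range).
first, last = date(2014, 9, 1), date(2014, 9, 30)
conn.executemany(
    "INSERT INTO calendar(day) VALUES (?)",
    [((first + timedelta(days=n)).isoformat(),)
     for n in range((last - first).days + 1)],
)

# Aggregate per day first, then join the result to the calendar.
rows = conn.execute("""
SELECT c.day, COALESCE(t.calls, 0) AS NUMBER_OF_ALL_CALLS
FROM calendar c
LEFT JOIN (
    SELECT substr(m.date, 1, 10) AS my_date, COUNT(*) AS calls
    FROM dialed_calls m
    GROUP BY my_date
) t ON t.my_date = c.day
WHERE c.day BETWEEN ? AND ?
ORDER BY c.day
""", ("2014-09-02", "2014-09-10")).fetchall()

for row in rows:
    print(row)
```

Since calendar.day holds plain YYYY-MM-DD strings, the BETWEEN filter on it includes both end dates, and the primary key gives the range filter an index to work with.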

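【Comments】:

Hi, thanks, but this seems like a rather complicated solution. I'm looking for something I can use with a simple SQL query.

It makes the query simpler! If a JOIN is too hard, I'd suggest taking the time to learn SQL properly.

@oardic's solution basically creates that calendar table every time you run the query. Why not just materialize it?

【Answer 3】: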

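To get the in-between dates in the query result, you need a table that contains every date in the range. In some RDBMSs you can populate a temporary table to join against. You will need to compare only the date part (without the time).

Be careful with your interval: the second date is taken as time 00:00:00, so rows later on that day are excluded, which may not be what you mean.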
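The interval pitfall can be reproduced with two of the sample rows. This is a small sketch using Python's sqlite3 module; the half-open comparison with date(..., '+1 day') is one possible fix and an assumption of this example, not something prescribed in the thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE dialed_calls(Id INTEGER PRIMARY KEY, "date" TEXT)')
conn.executemany(
    "INSERT INTO dialed_calls VALUES (?, ?)",
    [(9, "2014-09-04 18:36:58+0200"), (10, "2014-09-05 18:36:58+0200")],
)

# The stored values are text, so '2014-09-05 18:36:58+0200' sorts after
# the bare '2014-09-05' and BETWEEN drops the call made on the last day.
between = conn.execute("""
    SELECT m.Id FROM dialed_calls m
    WHERE m.date BETWEEN '2014-09-04' AND '2014-09-05'
    ORDER BY m.Id
""").fetchall()

# Half-open range: from the first day up to, but excluding, the day
# after the last one, so the whole last day is covered.
half_open = conn.execute("""
    SELECT m.Id FROM dialed_calls m
    WHERE m.date >= '2014-09-04'
      AND m.date < date('2014-09-05', '+1 day')
    ORDER BY m.Id
""").fetchall()

print(between)
print(half_open)
```

Comparing on substr(m.date, 1, 10) instead, as the queries above do, also keeps the last day, at the cost of not being able to use a plain index on the date column.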

【Comments】:
