SQL Counts over time interval

【Posted】2020-08-26 16:12:45 【Question】:

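How can I turn this code into a single SQL query, so that I am not running a separate query for every interval?

For each hour in the last 12 hours, count the number of total, success, retry, and failed records... It seems simple, but I'm lost and my SQL knowledge is limited.

I tried GROUP BY, but that only gave me results for a few of the hours.

P.S. This will be a very large table: it currently has 300K rows for testing, but 500K to 1M rows are expected to be inserted per day. Rows are purged after a lifetime of 14 or 30 days.

Using MySQL 8.

Output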

Results over Time
Current Time: 1598457808
Interval: 1598461200
Interval             Total  Success  Retry  Failed
2020-08-26 12:00:00      0        0      0       0
2020-08-26 11:00:00      0        0      0       0
2020-08-26 10:00:00    104       77     22       5
2020-08-26 09:00:00    887      567    224      96
2020-08-26 08:00:00   1895     1274    408     213
2020-08-26 07:00:00      0        0      0       0
2020-08-26 06:00:00      0        0      0       0
2020-08-26 05:00:00      0        0      0       0
2020-08-26 04:00:00      0        0      0       0
2020-08-26 03:00:00      0        0      0       0
2020-08-26 02:00:00      0        0      0       0
2020-08-26 01:00:00      0        0      0       0
2020-08-26 00:00:00      0        0      0       0

Code

function results_per_interval($lrg_pdo) {
    $now = time();
    print "Current Time: $now<br>";

    $interval_count = 12;   // number of hourly buckets to step back through
    $duration = 3600;       // bucket width in seconds

    // Round up to the next full hour, then walk backwards one bucket at a time.
    $modulus = $now % $duration;
    $start_interval = $now + ($duration - $modulus);
    $end_interval  = $start_interval - ($duration * $interval_count);

    $interval = $start_interval;

    print "Interval: $interval<br>";
    print "<table border=1>";
    web_table_header([
        "Interval",
        "Total",
        "Success",
        "Retry",
        "Failed",
    ]);

    // One query (with four subqueries) per bucket: this is the part to replace.
    while ($interval >= $end_interval) {
        $i_start = date('Y-m-d H:i:00', $interval);
        $i_end = date('Y-m-d H:i:00', $interval + $duration);
        $sql = "
            SELECT 
            (SELECT count(*) FROM results WHERE timestamp_created between '$i_start' and '$i_end') as count,
            (SELECT count(*) FROM results WHERE timestamp_created between '$i_start' and '$i_end' and status='success') as success_count,
            (SELECT count(*) FROM results WHERE timestamp_created between '$i_start' and '$i_end' and status='processing') as retry_count,
            (SELECT count(*) FROM results WHERE timestamp_created between '$i_start' and '$i_end' and status='failed') as failed_count";

        $r = $lrg_pdo->query($sql)->fetch(PDO::FETCH_OBJ);
        web_table_row([
            $i_start,
            $r->count,
            $r->success_count,
            $r->retry_count,
            $r->failed_count,
        ]);

        $interval = $interval - $duration;
    }

    print "</table>";
}
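
The interval arithmetic in the function above can be checked by hand against the sample output. The following is a minimal sketch in Python, used here only because it is quick to run; `hour_buckets` is a hypothetical helper name, not part of the original code. It reproduces the rounding (`$now + ($duration - $modulus)`) and the inclusive `while` condition.

```python
def hour_buckets(now, duration=3600, interval_count=12):
    """Mirror the PHP loop: start at the next full hour and step back.

    The PHP condition is `$interval >= $end_interval`, so both ends are
    included and interval_count + 1 bucket starts are produced.
    """
    start = now + (duration - now % duration)   # round up to the next boundary
    end = start - duration * interval_count
    return list(range(start, end - 1, -duration))


buckets = hour_buckets(1598457808)   # "Current Time" from the sample output
print(buckets[0])                    # 1598461200, the "Interval" line in the output
print(len(buckets))                  # 13, one per row in the output table
```

This explains two details of the output: the first row is the hour that has not finished yet, and 13 rows are printed although `$interval_count` is 12.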

【Answer 1】:

You can use a recursive CTE to generate the hourly timestamps:

with recursive dates as (
      select current_date + interval hour(now()) hour as dt, 1 as n
      union all
      select dt - interval 1 hour, n + 1
      from dates
      where n < 12
     )
select *
from dates;

Then you can put this into a query:

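with recursive dates as (
      -- start of the current hour, then step back one hour at a time
      select current_date + interval hour(now()) hour as dt, 1 as n
      union all
      select dt - interval 1 hour, n + 1
      from dates
      where n < 12
     )
select d.dt,
       -- count a column of r, not *, so hours with no rows report 0 instead of 1
       count(r.timestamp_created) as num,
       -- sum() over an empty hour is NULL, hence coalesce
       coalesce(sum(r.status = 'success'), 0) as num_success,
       coalesce(sum(r.status = 'processing'), 0) as num_processing,
       coalesce(sum(r.status = 'failed'), 0) as num_failed
from dates d left join
     results r
     -- timestamp_created is the column name used in the question
     on r.timestamp_created >= d.dt and
        r.timestamp_created < d.dt + interval 1 hour
group by d.dt
order by d.dt desc;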
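
To try the single-query approach without a MySQL server, here is a minimal sketch that runs the same idea against an in-memory SQLite database from Python. Several things in it are assumptions for illustration only: the SQLite dialect (`strftime` and `datetime` in place of MySQL's `interval` arithmetic), the two-column `results` table with `timestamp_created` stored as text, the fixed `now` argument, and the helper name `hourly_counts`. It counts `r.timestamp_created` rather than `*`, because with a left join `count(*)` reports 1 for an hour that has no rows.

```python
import sqlite3

SQL = """
WITH RECURSIVE dates(dt, n) AS (
    -- truncate "now" to the start of its hour, then step back hour by hour
    SELECT strftime('%Y-%m-%d %H:00:00', :now), 1
    UNION ALL
    SELECT datetime(dt, '-1 hour'), n + 1 FROM dates WHERE n < :hours
)
SELECT d.dt,
       COUNT(r.timestamp_created),
       COALESCE(SUM(r.status = 'success'), 0),
       COALESCE(SUM(r.status = 'processing'), 0),
       COALESCE(SUM(r.status = 'failed'), 0)
FROM dates d
LEFT JOIN results r
       ON r.timestamp_created >= d.dt
      AND r.timestamp_created < datetime(d.dt, '+1 hour')
GROUP BY d.dt
ORDER BY d.dt DESC
"""


def hourly_counts(conn, now, hours=12):
    """Return (bucket_start, total, success, retry, failed) for each hour."""
    return conn.execute(SQL, {"now": now, "hours": hours}).fetchall()


# Made-up sample rows, only to exercise the query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (timestamp_created TEXT, status TEXT)")
conn.executemany("INSERT INTO results VALUES (?, ?)", [
    ("2020-08-26 10:05:00", "success"),
    ("2020-08-26 10:40:00", "failed"),
    ("2020-08-26 09:15:00", "processing"),
    ("2020-08-26 09:59:59", "success"),
    ("2020-08-25 08:00:00", "success"),   # older than the 12-hour window
])

for row in hourly_counts(conn, "2020-08-26 10:30:00"):
    print(row)
```

The half-open range (`>=` the bucket start, `<` the next bucket start) places a row that falls exactly on the hour in one bucket only, whereas `between` is inclusive at both ends and would count such a row in two adjacent intervals.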
