SQL Count project items where project status history is within date range
[Posted]: 2014-04-10 01:14:18
[Question]: I have 4 tables:
projects: id, title, current_status_id
statuses: id, label
status_history: project_id, status_id, created_at
messages: id, project_id, body, created_at
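When a project changes status in the application (for example, from "lead" to "active" to "complete"), a status_history row is inserted. Note that the created_at column is a timestamp recording when the change happened. Between status changes, activity takes place on the project and messages are created. For example, a project is initialized in the "lead" status, some messages are created while the project is in "lead", the project changes to "active", some messages are created while it is in that status, and so on.
I want to create a query that shows: the date, the number of messages created in "lead" projects, the number of messages created in "active" projects, and the number of messages created in projects with any other status. Can this be done in one query? I'm using PostgreSQL.
Here is some pseudocode that hopefully clarifies what I'm looking for.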
* Start at the earliest date
* Find all projects whose status was 'lead' on that date
* Count the number of created messages from these projects with that date
* Find all projects whose status was 'active' on that date
* Count the number of created messages from these projects with that date
* Find all projects whose status was anything else on that date
* Count the number of created messages from these projects with that date
* ... some projects change status, some stay the same, business happens ...
* Go to next date
* Find all projects whose status was 'lead' on that date
* Count the number of created messages from these projects with that date
* Find all projects whose status was 'active' on that date
* Count the number of created messages from these projects with that date
* Find all projects whose status was anything else on that date
* Count the number of created messages from these projects with that date
* ... some projects change status, some stay the same, business happens ...
* keep doing this until the present
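Although projects do have a current_status_id column, it holds the current status, which is not necessarily the status the project had last month. A project's status does not change every day, so a status_history row is not created for every project every day.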
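The pseudocode above can be sketched roughly as follows. This is only an illustration, not taken from any of the answers: it uses Python's built-in sqlite3 module so that it runs without a server, the sample rows are made up, and it assumes the status that matters is the one in effect at the moment each message was created (the latest status_history row at or before the message's created_at). The SQL itself is kept plain, so it should carry over to PostgreSQL with little or no change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE statuses (id INTEGER PRIMARY KEY, label TEXT);
CREATE TABLE status_history (project_id INTEGER, status_id INTEGER, created_at TEXT);
CREATE TABLE messages (id INTEGER PRIMARY KEY, project_id INTEGER, body TEXT, created_at TEXT);

INSERT INTO statuses VALUES (1, 'lead'), (2, 'active'), (3, 'complete');
-- Hypothetical sample data: project 1 moves through three statuses, project 2 stays lead.
INSERT INTO status_history VALUES
  (1, 1, '2014-04-01 09:00:00'),
  (1, 2, '2014-04-02 12:00:00'),
  (1, 3, '2014-04-03 08:00:00'),
  (2, 1, '2014-04-01 10:00:00');
INSERT INTO messages VALUES
  (1, 1, 'a', '2014-04-01 10:00:00'),
  (2, 1, 'b', '2014-04-02 09:00:00'),
  (3, 1, 'c', '2014-04-02 13:00:00'),
  (4, 2, 'd', '2014-04-02 14:00:00'),
  (5, 1, 'e', '2014-04-03 09:00:00');
""")

# For each message, join the single status_history row that was in effect
# when the message was created, then pivot the counts per calendar date.
QUERY = """
SELECT date(m.created_at) AS dt,
       SUM(CASE WHEN s.label = 'lead' THEN 1 ELSE 0 END) AS num_lead,
       SUM(CASE WHEN s.label = 'active' THEN 1 ELSE 0 END) AS num_active,
       SUM(CASE WHEN s.label NOT IN ('lead', 'active') THEN 1 ELSE 0 END) AS num_else
FROM messages m
JOIN status_history sh
  ON sh.project_id = m.project_id
 AND sh.created_at = (SELECT MAX(x.created_at)
                      FROM status_history x
                      WHERE x.project_id = m.project_id
                        AND x.created_at <= m.created_at)
JOIN statuses s ON s.id = sh.status_id
GROUP BY date(m.created_at)
ORDER BY dt
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

Because of the inner join, a message created before its project's first status_history row would be left out of the counts; whether that can happen depends on how the application inserts the initial status.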
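[Comments]:
Such a query can be created if there are linking columns to join the tables containing dates, messages, projects, and statuses.
For a given date, you want the number of newly created projects in each of these categories, right? Not the ones that may have changed to one of these statuses on the given date?
Hi Brian, for a given date I want to count the number of messages in any project that had a particular status. A project's status changes over time, so some messages may have been created while the project was a lead, some while the project was active, and so on.
[Answer 1]: You are looking for a query like this... This is MSSQL, but I assume PostgreSQL is very similar, or you can easily find the correct syntax online.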
-- Note: this joins each message to every status_history row of its project.
SELECT count(*) AS message_count, messages.created_at, statuses.label
FROM messages
JOIN projects ON projects.id = messages.project_id
JOIN status_history ON projects.id = status_history.project_id
JOIN statuses ON statuses.id = status_history.status_id
GROUP BY messages.created_at, statuses.label
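[Comments]:
[Answer 2]: Try the following.
Replace 'lead' and 'active' with the status IDs of those two statuses.
Note that the first field selected converts your created_at timestamp to a date value (dropping the time).
The counts show the number of newly created projects with those statuses. They do not include projects that already existed but changed to those statuses on the given date. This is done through the NOT EXISTS subquery.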
select date(created_at) as dt
, sum(case when sh.status_id = 'lead' then 1 else 0 end) as num_lead
, sum(case when sh.status_id = 'active' then 1 else 0 end) as num_active
, sum(case when sh.status_id not in ('lead','active') then 1 else 0 end) as num_else
from status_history sh
where not exists
( select 1
from status_history x
where x.project_id = sh.project_id
and x.created_at < sh.created_at )
group by date(created_at)
order by 1
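To see what the NOT EXISTS subquery in this answer keeps, here is a small sketch using Python's built-in sqlite3 module. The rows and the status ids (1 for lead, 2 for active) are assumptions made for illustration. The filter keeps only the earliest status_history row of each project, which is why later status changes are not counted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE status_history (project_id INTEGER, status_id INTEGER, created_at TEXT);
-- Hypothetical data; status ids 1 = lead and 2 = active are assumptions.
INSERT INTO status_history VALUES
  (1, 1, '2014-04-01 09:00:00'),
  (1, 2, '2014-04-02 12:00:00'),
  (2, 2, '2014-04-02 10:00:00');
""")

# Keep a row only if no earlier row exists for the same project,
# i.e. the row that records the project's initial status.
first_rows = conn.execute("""
SELECT sh.project_id, sh.status_id, date(sh.created_at) AS dt
FROM status_history sh
WHERE NOT EXISTS (SELECT 1
                  FROM status_history x
                  WHERE x.project_id = sh.project_id
                    AND x.created_at < sh.created_at)
ORDER BY sh.project_id
""").fetchall()

for row in first_rows:
    print(row)
```

Project 1's change to status 2 on 2014-04-02 is filtered out, so only one row per project survives.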
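[Comments]:
Hi Brian, thanks for the tip, but I'm actually trying to count the number of newly created messages.
@JustinM That is what the query does. It filters out the ones that already existed via the NOT EXISTS subquery.
[Answer 3]: How about: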
SELECT to_char(tmp.date, 'YYYY-MM-DD') AS date,
       -- COUNT(boolean) counts every non-NULL row, so sum the per-status message counts instead
       SUM(CASE WHEN tmp.status = 'lead' THEN tmp.messages ELSE 0 END) AS num_lead,
       SUM(CASE WHEN tmp.status = 'active' THEN tmp.messages ELSE 0 END) AS num_active
FROM
(
SELECT date(m.created_at) AS date, COUNT(m.id) AS messages, s.label AS status FROM messages AS m
INNER JOIN projects AS p ON p.id = m.project_id
INNER JOIN statuses AS s ON s.id = p.current_status_id
GROUP BY date(m.created_at), s.id, s.label
) AS tmp
GROUP BY tmp.date;
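The grouping should be 100% correct (it groups by both id and label because it is not clear whether each id corresponds to exactly one text representation; label is not the primary key!).
The temporary table contains all the "messages per date and project_status_label" relations; the outer select only changes the dimension.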
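The "change of dimension" in the outer select is a pivot done with conditional aggregation. A minimal sketch with Python's built-in sqlite3 module follows; the per_status table and its rows are hypothetical stand-ins for the derived table. The last column is included to show that COUNT over a comparison counts every non-NULL row rather than only the matching ones, which holds in SQLite and in PostgreSQL alike.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical long-form input: one row per (date, status) with a message count.
CREATE TABLE per_status (dt TEXT, status TEXT, messages INTEGER);
INSERT INTO per_status VALUES
  ('2014-04-01', 'lead', 3),
  ('2014-04-01', 'active', 1),
  ('2014-04-02', 'active', 4);
""")

# Pivot to one row per date with one column per status.
wide = conn.execute("""
SELECT dt,
       SUM(CASE WHEN status = 'lead' THEN messages ELSE 0 END) AS num_lead,
       SUM(CASE WHEN status = 'active' THEN messages ELSE 0 END) AS num_active,
       COUNT(status = 'lead') AS count_of_comparison
FROM per_status
GROUP BY dt
ORDER BY dt
""").fetchall()

for row in wide:
    print(row)
```

On 2014-04-01 the comparison column is 2 even though only one row has the 'lead' status, because both rows yield a non-NULL true or false value.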
[Comments]: