How to select only the most recent timestamp?
【Posted】: 2021-10-02 20:52:38
【Question】: If I perform an inner join across multiple tables, how can I make sure the result set contains only the most recent timestamp? For example:
SELECT
e.customer_id AS customer_id,
e.event_id AS event_id,
-- MOST RECENT TIMESTAMP from car.updated_on, motorcycle.updated_on, or walk.updated_on
FROM
event_table AS e
INNER JOIN car AS c ON e.customer_id = c.customer_id
INNER JOIN motorcycle AS m ON e.customer_id = m.customer_id
INNER JOIN walk AS w ON e.customer_id = w.customer_id
WHERE
e.event_id = c.event_id
AND e.event_id = m.event_id
AND e.event_id = w.event_id
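I have a table that records every event that occurs. For each customer, I only want to pull the most recent timestamp across all three event types (car, motorcycle, or walk), regardless of which type it came from.
Sample data:
Event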
customer_id | event_id |
---|---|
1 | 100 |
2 | 101 |
3 | 102 |
4 | 103 |
5 | 104 |
6 | 105 |
7 | 106 |
8 | 107 |
9 | 108 |
10 | 109 |
Car
customer_id | event_id | car_id | updated_on |
---|---|---|---|
1 | 100 | 1 | 2021-07-23 10:09:05 |
2 | 101 | 1 | 2021-07-23 10:09:05 |
3 | 102 | 1 | 2021-07-23 10:09:05 |
4 | 103 | 1 | 2021-07-23 10:09:05 |
5 | 104 | 1 | 2021-07-23 10:09:05 |
6 | 105 | 1 | 2021-07-23 10:09:05 |
7 | 106 | 1 | 2021-07-23 10:09:05 |
8 | 107 | 1 | 2021-07-23 10:09:05 |
9 | 108 | 1 | 2021-07-23 10:09:05 |
10 | 109 | 1 | 2021-07-23 10:09:05 |
Motorcycle
customer_id | event_id | motorcycle_id | updated_on |
---|---|---|---|
1 | 100 | 1 | 2021-07-23 10:09:00 |
2 | 101 | 1 | 2021-07-23 10:09:00 |
3 | 102 | 1 | 2021-07-23 10:09:00 |
4 | 103 | 1 | 2021-07-23 10:09:00 |
5 | 104 | 1 | 2021-07-23 10:09:10 |
6 | 105 | 1 | 2021-07-23 10:09:10 |
7 | 106 | 1 | 2021-07-23 10:09:00 |
8 | 107 | 1 | 2021-07-23 10:09:00 |
Walk
customer_id | event_id | walk_id | updated_on |
---|---|---|---|
1 | 100 | 1 | 2021-07-23 10:09:00 |
2 | 101 | 1 | 2021-07-23 10:09:00 |
3 | 102 | 1 | 2021-07-23 10:09:00 |
4 | 103 | 1 | 2021-07-23 10:09:00 |
5 | 104 | 1 | 2021-07-23 10:09:00 |
6 | 105 | 1 | 2021-07-23 10:09:00 |
7 | 106 | 1 | 2021-07-23 10:09:00 |
8 | 107 | 1 | 2021-07-23 10:09:15 |
9 | 108 | 1 | 2021-07-23 10:09:15 |
Desired result:
customer_id | event_id | updated_on | comment |
---|---|---|---|
1 | 100 | 2021-07-23 10:09:05 | TS from car |
2 | 101 | 2021-07-23 10:09:05 | TS from car |
3 | 102 | 2021-07-23 10:09:05 | TS from car |
4 | 103 | 2021-07-23 10:09:05 | TS from car |
5 | 104 | 2021-07-23 10:09:10 | TS from motorcycle |
6 | 105 | 2021-07-23 10:09:10 | TS from motorcycle |
7 | 106 | 2021-07-23 10:09:15 | TS from walk |
8 | 107 | 2021-07-23 10:09:15 | TS from walk |
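I don't need the comment column in the final result set; I added it only for explanation. In fact, I don't care what the event is. I only care about the INNER JOIN across the four tables, so there should be at most 8 records, and I only want the latest (highest) timestamp value. customer_id and event_id need to match across all the INNER JOINs.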
For example, customer_id = 1 and event_id = 100 exist in all 4 tables. That pair has three values of updated_on (from car, motorcycle, and walk respectively). I want MAX(2021-07-23 10:09:05, 2021-07-23 10:09:00, 2021-07-23 10:09:00), that is, MAX(car.updated_on, motorcycle.updated_on, walk.updated_on).
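Any help would be greatly appreciated, thanks.
Edit: I got the desired result with two queries. I would like to optimize this into a single query.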
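- Get only the UNIQUE records shared by the three tables and store them in another table called event_joined. This table is completely overwritten each time, not just appended to.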
SELECT
e.customer_id AS customer_id,
e.event_id AS event_id
FROM
event_table AS e
INNER JOIN car AS c ON e.customer_id = c.customer_id
INNER JOIN motorcycle AS m ON e.customer_id = m.customer_id
INNER JOIN walk AS w ON e.customer_id = w.customer_id
WHERE
e.event_id = c.event_id
AND e.event_id = m.event_id
AND e.event_id = w.event_id
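- Before doing the UNION, we know that all three tables will contribute the same number of rows, because we previously joined them to keep only matching records. Now we simply GROUP BY and take the MAX (most recent) timestamp.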
SELECT event_temp.customer_id, event_temp.event_id, MAX(event_temp.updated_on) AS updated_on
FROM (
SELECT c.customer_id, c.event_id, c.updated_on FROM car AS c INNER JOIN event_joined AS ej ON ej.customer_id = c.customer_id AND ej.event_id = c.event_id
UNION ALL
SELECT m.customer_id, m.event_id, m.updated_on FROM motorcycle AS m INNER JOIN event_joined AS ej ON ej.customer_id = m.customer_id AND ej.event_id = m.event_id
UNION ALL
SELECT w.customer_id, w.event_id, w.updated_on FROM walk AS w INNER JOIN event_joined AS ej ON ej.customer_id = w.customer_id AND ej.event_id = w.event_id
) AS event_temp
GROUP BY event_temp.customer_id, event_temp.event_id;
Is there a way to optimize this into a single query? Thanks.
【Comments】:
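Provide sample data, the desired result, and an appropriate database tag.
Sorry to bother you, any idea what I can use to format the tables? It looks fine in the preview, but when I press save it doesn't render the tables correctly.
I updated it; hopefully the screenshot is OK. I couldn't figure out the markdown. Looks like a Stack bug :(
Why do the car, motorcycle, and walk tables have a customer ID when the event table they link to already has one? Your table structure doesn't make sense. Also, how relevant is this messy table structure to your actual question? Try to reduce it to the bare minimum.
Unfortunate technical limitations of a legacy system that I can't upgrade; otherwise I would normalize the whole mess. In practice, I know I need to do 4 INNER JOINs, and that's fine. I just need to get the most recent TIMESTAMP from the other 3 tables.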
【Answer 1】:
You can do this in a single simple query with CROSS APPLY, as shown below:
SELECT
e.customer_id AS customer_id,
e.event_id AS event_id,
max(t.updated_On)
FROM
event_table AS e
INNER JOIN car AS c ON e.customer_id = c.customer_id and e.event_id = c.event_id
INNER JOIN motorcycle AS m ON e.customer_id = m.customer_id and e.event_id = m.event_id
INNER JOIN walk AS w ON e.customer_id = w.customer_id and e.event_id = w.event_id
CROSS APPLY (values (c.updated_On),(m.updated_On),(w.updated_On)) as t(updated_On)
GROUP BY e.customer_id,
e.event_id
Sample data and working solution
declare @event table(cust_id int, event_id int)
declare @car table(cust_id int, event_id int, updated_on datetime)
declare @walk table(cust_id int, event_id int, updated_on datetime)
insert into @event values (1, 100)
insert into @car values (1,100, '2020-01-01')
insert into @walk values(1,100, '2020-02-01')
SELECT
e.cust_id AS customer_id,
e.event_id AS event_id,
max(t.updatedON) as recent_timestamp
FROM
@event AS e
INNER JOIN @car AS c ON e.cust_id = c.cust_id and e.event_id = c.event_id
INNER JOIN @walk AS w ON e.cust_id = w.cust_id and e.event_id = w.event_id
CROSS APPLY (values(c.updated_On),(w.updated_on)) as t(updatedOn)
group by e.cust_id, e.event_id
customer_id | event_id | recent_timestamp |
---|---|---|
1 | 100 | 2020-02-01 00:00:00.000 |
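As a hedged alternative that stays closer to the two-step approach in the question, the intermediate event_joined table can probably be folded into one statement. This is only a sketch: the derived-table alias u and the src column are illustrative names that do not appear in the original, and the literal 3 assumes exactly three source tables. The HAVING clause keeps only the (customer_id, event_id) pairs that appear in car, motorcycle, and walk, which plays the role of the inner joins.

```sql
-- Sketch: single-query version of the UNION ALL + GROUP BY idea.
-- "src" and "u" are hypothetical names introduced for illustration.
SELECT
    u.customer_id,
    u.event_id,
    MAX(u.updated_on) AS updated_on          -- most recent timestamp across all sources
FROM (
    SELECT customer_id, event_id, updated_on, 'car' AS src FROM car
    UNION ALL
    SELECT customer_id, event_id, updated_on, 'motorcycle' FROM motorcycle
    UNION ALL
    SELECT customer_id, event_id, updated_on, 'walk' FROM walk
) AS u
INNER JOIN event_table AS e
    ON e.customer_id = u.customer_id
   AND e.event_id = u.event_id               -- pair must exist in event_table
GROUP BY u.customer_id, u.event_id
HAVING COUNT(DISTINCT u.src) = 3;            -- pair must exist in all three source tables
```

Tagging each branch with its source is what lets the HAVING clause replace the separate join step, so no staging table has to be overwritten on each run.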
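【Discussion】:
Hi, thanks for the reply. For values(c.updated_On the error message I get is "The table has a multi-part specification. This is not allowed." There are no further details when I try to run the query.
@Jordan, ideally it should work. I tested it and it runs fine.
I confirmed that it does work on regular SQL Server 2016; unfortunately, the software I'm using restricts certain features for some strange reason. Your solution did give me an idea of how to do it without CROSS APPLY, thank you.
@Jordan, glad it helped.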
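For environments that reject CROSS APPLY, as reported in the discussion above, one possible workaround is a plain CASE expression over the three joined columns. The thread does not show what the asker finally used, so this is only a sketch under two assumptions: updated_on is never NULL, and each table holds at most one row per (customer_id, event_id) pair.

```sql
-- Sketch: pick the greatest of three timestamps without CROSS APPLY.
-- Assumes updated_on is NOT NULL and one row per (customer_id, event_id) in each table.
SELECT
    e.customer_id,
    e.event_id,
    CASE
        WHEN c.updated_on >= m.updated_on
         AND c.updated_on >= w.updated_on THEN c.updated_on  -- car is the latest
        WHEN m.updated_on >= w.updated_on THEN m.updated_on  -- otherwise motorcycle vs. walk
        ELSE w.updated_on
    END AS updated_on
FROM event_table AS e
INNER JOIN car AS c
    ON e.customer_id = c.customer_id AND e.event_id = c.event_id
INNER JOIN motorcycle AS m
    ON e.customer_id = m.customer_id AND e.event_id = m.event_id
INNER JOIN walk AS w
    ON e.customer_id = w.customer_id AND e.event_id = w.event_id;
```

If any table can hold several rows for the same pair, the joins would multiply rows; wrapping the CASE expression in MAX(...) and adding GROUP BY e.customer_id, e.event_id should collapse them again.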