Postgres distinct union only for specific columns
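【Posted】: 2018-09-28 18:29:20
【Question】: I have two sets of data, one of which is generated on the fly. If I leave out the state column, the query works perfectly, because that column does not really exist in the generated set. My question is how to make the UNION ignore that column when it combines the two data sets (as written, the state values differ, so it behaves the same as UNION ALL). I prefer the first table: any row in the second data set should be dropped if it already exists in the first.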
SELECT event_id, start_at, state
FROM event_logs
WHERE start_at BETWEEN current_date AND current_date + interval '3 weeks'
UNION
SELECT id event_id,
GENERATE_SERIES(date_trunc('week', current_date)::date + (extract(isodow from start_at)::int - 1) + start_at::time, current_date + interval '3 weeks', '1 week'::INTERVAL) AS start_at,
'draft' AS state
FROM events
Update: I also tried:
WITH future_logs AS (
SELECT id event_id,
GENERATE_SERIES(date_trunc('week', current_date)::date + (extract(isodow from start_at)::int - 1) + start_at::time, current_date + interval '3 weeks', '1 week'::INTERVAL) AS start_at,
'draft' AS state
FROM events)
SELECT future_logs.event_id, future_logs.start_at, future_logs.state
FROM future_logs
LEFT JOIN event_logs ON future_logs.event_id = event_logs.event_id AND future_logs.start_at = event_logs.start_at
WHERE event_logs.start_at BETWEEN current_date AND current_date + interval '3 weeks'
But this returns far too few rows: 77 versus the roughly 1000 expected.
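A likely cause of the shortfall is the WHERE clause on event_logs.start_at. When the LEFT JOIN finds no match, the event_logs columns are NULL, the BETWEEN test is not true, and the row is discarded, so the query behaves like an inner join and keeps only the generated rows that already have a log entry. The usual anti-join form tests for the missing match instead. The sketch below uses made-up inline data in place of the real tables (the sample rows and the short aliases f and l are assumptions for illustration only):

```sql
-- Toy stand-ins for event_logs and the generated future_logs set.
WITH event_logs (event_id, start_at, state) AS (
    VALUES (1, timestamp '2018-10-01 09:00', 'confirmed')
),
future_logs (event_id, start_at, state) AS (
    VALUES (1, timestamp '2018-10-01 09:00', 'draft'),
           (1, timestamp '2018-10-08 09:00', 'draft'),
           (2, timestamp '2018-10-02 14:00', 'draft')
)
SELECT f.event_id, f.start_at, f.state
FROM future_logs f
LEFT JOIN event_logs l
       ON l.event_id = f.event_id
      AND l.start_at = f.start_at
WHERE l.event_id IS NULL          -- keep only generated rows with no logged counterpart
ORDER BY f.event_id, f.start_at;
```

With this sample data the draft for event 1 on 2018-10-01 is dropped because a logged row exists, and the other two drafts remain. Any date-range filter on event_logs would belong in the ON clause rather than in WHERE.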
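【Comments】:
Convert the second part of the UNION into a calendar table (or view, or CTE) and LEFT JOIN the event_logs table to it. (Or: use UNION ALL and add a WHERE NOT EXISTS clause to the second part.)
@wildplasser Tried that... it doesn't seem to work as expected.
【Answer 1】: Just add NOT EXISTS() to the second leg; you can then use UNION ALL to avoid the sort/merge step.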
SELECT event_id, start_at, state
FROM event_logs
WHERE start_at BETWEEN current_date AND current_date + interval '3 weeks'
UNION ALL
SELECT id AS event_id
, generate_series(date_trunc('week', current_date)::date + (extract(isodow from start_at)::int - 1) + start_at::time
, current_date + interval '3 weeks'
, '1 week'::INTERVAL) AS start_at
, 'draft' AS state
FROM events ev
WHERE NOT EXISTS ( SELECT *
FROM event_logs nx
WHERE nx.event_id = ev.id
AND nx.start_at BETWEEN current_date AND current_date + interval '3 weeks' )
;
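Note that the subquery above matches on event_id only, so an event with any logged row in the window loses all of its generated rows. If the goal is to drop only the generated occurrences that are already logged, one option is to wrap the generate_series call in a subquery so that its output can be referenced in the NOT EXISTS. This is a sketch that assumes the tables and columns from the question; the alias g is introduced here for illustration:

```sql
SELECT event_id, start_at, state
FROM event_logs
WHERE start_at BETWEEN current_date AND current_date + interval '3 weeks'
UNION ALL
SELECT g.event_id, g.start_at, 'draft' AS state
FROM (
    -- Expand each event into its weekly occurrences first,
    -- so the generated start_at is an ordinary column of g.
    SELECT id AS event_id,
           generate_series(date_trunc('week', current_date)::date + (extract(isodow from start_at)::int - 1) + start_at::time,
                           current_date + interval '3 weeks',
                           '1 week'::interval) AS start_at
    FROM events
) g
WHERE NOT EXISTS (
    SELECT 1
    FROM event_logs nx
    WHERE nx.event_id = g.event_id
      AND nx.start_at = g.start_at   -- match the individual occurrence, not just the event
);
```

The equality on start_at only works if the logged and generated timestamps line up exactly; if they can differ by seconds or time zone, a comparison on a truncated value would be needed instead.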
【Comments】:
And this WHERE NOT EXISTS clause removes rows from the second part of the UNION (not the first), right? If so, that's perfect!
【Answer 2】:
select DISTINCT ON (date_day) date_day, state from(
SELECT day::date as date_day, null as state
FROM generate_series(now()- interval '2 week'
, now()
, interval '1 day') day
UNION ALL
select distinct
date_trunc('day', des.updated_at) as date_day,
max(des.state) over (partition by date_trunc('day', des.updated_at)) as state
from device_event as des where des.id = 49 and des.updated_at >= now() - interval '2 week'
) dba order by 1, 2 -- NULLs sort last in ascending order, so a real state wins over the NULL placeholder
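The answer gives no explanation, so here is how the technique seems to map onto the question: DISTINCT ON keeps the first row of each group in ORDER BY order, so ranking the preferred source first lets its rows override the generated ones. A minimal self-contained sketch with made-up inline data (the source_rank column and the sample rows are assumptions, not part of the original query):

```sql
SELECT DISTINCT ON (event_id, start_at)
       event_id, start_at, state
FROM (
    VALUES (1, timestamp '2018-10-01 09:00', 'confirmed', 1),  -- from the preferred table
           (1, timestamp '2018-10-01 09:00', 'draft',     2),  -- generated duplicate
           (1, timestamp '2018-10-08 09:00', 'draft',     2)   -- generated, no counterpart
) AS t (event_id, start_at, state, source_rank)
-- The leading ORDER BY expressions must match the DISTINCT ON list;
-- source_rank then decides which row of each group is kept.
ORDER BY event_id, start_at, source_rank;
```

For this data the result is the confirmed row for 2018-10-01 and the draft row for 2018-10-08.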
【Comments】:
【Answer 3】: I would add another column, taborder, to your UNION query to give the rows a simple ordering, and then use the window function row_number() over(...) as follows:
SELECT
event_id,
start_at,
state
FROM (
SELECT
event_id,
start_at,
state,
row_number() OVER (PARTITION BY event_id, start_at ORDER BY taborder) AS rownum
FROM (
SELECT
event_id,
start_at,
state,
1 AS taborder
FROM original_table
UNION
SELECT
event_id,
start_at,
state,
2 AS taborder
FROM draft_table
) src0
) src1
WHERE rownum = 1
ORDER BY 1, 2, 3
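In the query above, original_table and draft_table are placeholders. Applied to the tables from the question it might look like the sketch below, which assumes the events and event_logs definitions shown there; the aliases u and ranked are introduced here for illustration. UNION ALL is used because the taborder column already makes rows from the two legs distinct, and the row_number() filter performs the deduplication:

```sql
SELECT event_id, start_at, state
FROM (
    SELECT u.event_id, u.start_at, u.state,
           row_number() OVER (PARTITION BY u.event_id, u.start_at
                              ORDER BY u.taborder) AS rownum
    FROM (
        -- Preferred source: real log rows rank first.
        SELECT event_id, start_at, state, 1 AS taborder
        FROM event_logs
        WHERE start_at BETWEEN current_date AND current_date + interval '3 weeks'
        UNION ALL
        -- Generated weekly occurrences rank second.
        SELECT id,
               generate_series(date_trunc('week', current_date)::date + (extract(isodow from start_at)::int - 1) + start_at::time,
                               current_date + interval '3 weeks',
                               '1 week'::interval),
               'draft',
               2
        FROM events
    ) u
) ranked
WHERE rownum = 1          -- one row per (event_id, start_at), log rows win
ORDER BY event_id, start_at;
```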
【Comments】: