Convert keep dense_rank from Oracle query into postgres
【Posted】: 2015-06-27 08:04:39
【Question】:
I am trying to convert the following Oracle query to Postgres:
select
    this_.GLOBAL_TRANSACTION_ID as y0_,
    this_.BUSINESS_IDENTIFIER as y1_,
    this_.ENVIRONMENT as y2_,
    count(*) as y3_,
    this_.HOST_NAME as y4_,
    listagg(process, ', ') within group (order by date_time) as process,
    min(this_.DATE_TIME) as y6_,
    max(this_.DATE_TIME) as y7_,
    max(status) keep (dense_rank last order by
        date_time,
        decode(status,
            'COMPLETED', 'd',
            'FAILED', 'c',
            'TERMINATED', 'b',
            'STARTED', 'a',
            'z')) as status
from
    ACTIVITY_MONITOR_TRANSACTION this_
where
    this_.DATE_TIME between ? and ?
    and 1=1
group by
    this_.GLOBAL_TRANSACTION_ID,
    this_.BUSINESS_IDENTIFIER,
    this_.ENVIRONMENT,
    this_.HOST_NAME,
    global_transaction_id,
    business_identifier,
    global_transaction_id,
    business_identifier
order by
    y7_ asc
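The problem is that I do not know how to convert this block:
    max(status) keep (dense_rank last order by
        date_time,
        decode(status,
            'COMPLETED', 'd',
            'FAILED', 'c',
            'TERMINATED', 'b',
            'STARTED', 'a',
            'z')) as status
The purpose of this block is to get the latest status and, when several rows share exactly the same time (which can happen!), to pick the status according to the priority order above.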
This is an example of data:
ID        DATE_TIME            GLOBAL_TRANSACTION_ID    STATUS
===================================================================
54938456  2015-04-20 09:39:27  8d276718-eca7-4fd0-a266  STARTED
54938505  2015-04-20 09:39:27  8d276718-eca7-4fd0-a266  COMPLETED
54938507  2015-04-20 09:39:27  8d276718-eca7-4fd0-a266  FAILED
54938507  2015-04-20 09:38:25  8d276718-eca7-4fd0-a266  FAILED
The status should be "COMPLETED", so my query should return the following:
GLOBAL_TRANSACTION_ID COUNT (...) STATUS
=====================================================
8d276718-eca7-4fd0-a266 4 (...) COMPLETED
I have tried splitting the query into two:
select
    this_.GLOBAL_TRANSACTION_ID as y0_,
    this_.BUSINESS_IDENTIFIER as y1_,
    this_.ENVIRONMENT as y2_,
    count(*) as y3_,
    this_.HOST_NAME as y4_,
    array_to_string(array_agg(distinct process), ',') as process,
    min(this_.DATE_TIME) as y6_,
    max(this_.DATE_TIME) as y7_,
    max(this_.STATUS) as y8_
from
    ACTIVITY_MONITOR_TRANSACTION this_
where
    this_.DATE_TIME between ? and ?
group by
    this_.GLOBAL_TRANSACTION_ID,
    this_.BUSINESS_IDENTIFIER,
    this_.ENVIRONMENT,
    this_.HOST_NAME,
    global_transaction_id,
    business_identifier
order by
    y7_ desc limit ?
and then:
select
    status
from
    activity_monitor_transaction
where
    GLOBAL_TRANSACTION_ID=?
order by
    date_time DESC,
    CASE status
        WHEN 'COMPLETED' THEN 'd'
        WHEN 'FAILED' THEN 'c'
        WHEN 'TERMINATED' THEN 'b'
        WHEN 'STARTED' THEN 'a'
        ELSE 'z'
    END DESC LIMIT 1
But this gives me performance problems, because I have to run the second query once for every row returned by the first.
Here is the table script for Postgres:
CREATE TABLE activity_monitor_transaction
(
id numeric(11,0) NOT NULL,
date_time timestamp(6) without time zone NOT NULL,
global_transaction_id character varying(40) NOT NULL,
repost_flag character(1) NOT NULL DEFAULT 'N'::bpchar,
environment character varying(20),
transaction_mode character varying(20),
status character varying(20),
step character varying(80),
event character varying(20),
event_code character varying(20),
event_subcode character varying(20),
summary character varying(200),
business_identifier character varying(80),
alternate_business_identifier character varying(80),
domain character varying(20),
process character varying(80),
service_name character varying(80),
service_version character varying(20),
detail text,
app_name character varying(80),
app_user character varying(20),
host_name character varying(80),
thread_name character varying(200),
CONSTRAINT activity_monitor_transact_pk PRIMARY KEY (id)
USING INDEX TABLESPACE actmon_data
)
And some data:
insert into ACTIVITY_MONITOR_TRANSACTION values
(54938456,'2015-04-20 09:39:27','8d276718-eca7-4fd0-a266-d465181f911a','N','Perf','','STARTED','servicereq.p2p.rso.blaze.dedup.in.channel','PROCESS','','','','3100729','51174628','ERP','servicereq-p2p-rso-blaze','servicereq-p2p-rso-blaze','1.0.0-SNAPSHOT','','servicereq-p2p-rso-blaze','CIC','intintprf20','SimpleAsyncTaskExecutor-88177');
insert into ACTIVITY_MONITOR_TRANSACTION values
(54938505,'2015-04-20 09:45:27','8d276718-eca7-4fd0-a266-d465181f911a','N','Perf','','COMPLETED','servicereq.p2p.rso.blaze.service.out.channel','PROCESS','','','','3100729','51174628','ERP','servicereq-p2p-rso-blaze','servicereq-p2p-rso-blaze','1.0.0-SNAPSHOT','','servicereq-p2p-rso-blaze','CIC','intintprf20','SimpleAsyncTaskExecutor-88177');
insert into ACTIVITY_MONITOR_TRANSACTION values
(54938507,'2015-04-20 09:45:27','8d276718-eca7-4fd0-a266-d465181f911a','N','Perf','','FAILED','inputChannel','PROCESS','','','','3100729','','ERP','servicereq-p2p-rso-blaze','servicereq-p2p-rso-blaze','1.0.0-SNAPSHOT','','servicereq-p2p-rso-blaze','CIC','intintprf20','SimpleAsyncTaskExecutor-88177');
Is there a way to emulate the keep dense_rank block in Postgres so that only one query is needed?
【Comments】:
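I believe people who want to help you will probably need your table definition and data, i.e. the table creation script and insert statements. I upvoted your question only because you pointed out the part you need help with. Now please post the required details as I suggested.
Added, hope that helps!
No. Sorry, but that does not help. Just imagine: if I gave you the same thing, how would you create a table from it? You would have to reverse-engineer it and write your own create and insert statements. Why not provide the create and insert statements? I have done my best to help you with how to ask, and I hope you get the solution you want.
Have you tried something like first_value(status) OVER (partition by global_transaction_id order by date_time, CASE ... END)?
Thanks, Igor. It does not work, because I would then need to group by date_time. I want the rows grouped by the other columns, so that rows with different date_times fall into the same group and the "count" shows the number of grouped rows. See the example above (I have edited it).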
【Answer 1】:
You can use PostgreSQL window functions:
-- this only adds columns to the rows of activity_monitor_transaction,
-- so we are still free to group by date_time or status afterwards
SELECT
    first_value(status) OVER w AS global_transaction_status,
    count(*) OVER w AS global_transaction_count,
    activity_monitor_transaction.*
FROM
    activity_monitor_transaction
WINDOW w AS (
    PARTITION BY global_transaction_id
    ORDER BY date_time DESC, id DESC
    ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
)
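The window query above breaks ties on id rather than on the status priority from the question, and it still returns one row per record. One possible way to stay in a single grouped query is an ordered array_agg: sort each group so the wanted row comes first, then take element 1. The following is a sketch under assumptions, not a tested drop-in replacement: $1 and $2 stand in for the two "?" bind parameters, the duplicated GROUP BY columns are collapsed, and the final status DESC NULLS LAST key is my addition to mimic what max(status) would do among rows that are still tied.

```sql
-- Sketch: emulate Oracle's
--   max(status) KEEP (DENSE_RANK LAST ORDER BY date_time, decode(...))
-- with an ordered aggregate. $1/$2 are placeholder bind parameters.
SELECT
    global_transaction_id AS y0_,
    business_identifier   AS y1_,
    environment           AS y2_,
    count(*)              AS y3_,
    host_name             AS y4_,
    -- PostgreSQL counterpart of listagg(...) WITHIN GROUP (ORDER BY date_time)
    string_agg(process, ', ' ORDER BY date_time) AS process,
    min(date_time)        AS y6_,
    max(date_time)        AS y7_,
    -- Latest date_time first, then highest priority letter, then largest status
    (array_agg(status ORDER BY
        date_time DESC,
        CASE status
            WHEN 'COMPLETED'  THEN 'd'
            WHEN 'FAILED'     THEN 'c'
            WHEN 'TERMINATED' THEN 'b'
            WHEN 'STARTED'    THEN 'a'
            ELSE 'z'
        END DESC,
        status DESC NULLS LAST
    ))[1]                 AS status
FROM activity_monitor_transaction
WHERE date_time BETWEEN $1 AND $2
GROUP BY
    global_transaction_id,
    business_identifier,
    environment,
    host_name
ORDER BY y7_ ASC;
```

As in the Oracle decode, any status outside the four named values maps to 'z' and therefore outranks 'COMPLETED' when the timestamps are equal. If that is not intended, the ELSE branch is the place to change it.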
【Comments】:
In this case, though, you would also need some way to de-dup, because a window is not the same as group by: it returns one row for every record in the activity table rather than a single row per group.
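One way to do the de-duplication this comment asks for is PostgreSQL's DISTINCT ON, which keeps only the first row of each group according to the ORDER BY. The sketch below assumes, like the window in the answer, that rows are grouped by global_transaction_id alone. The column aliases are reused from the answer and the tie-break CASE is taken from the question. Window functions are evaluated before DISTINCT ON is applied, so the count still covers every row of the transaction.

```sql
-- Sketch: one row per global_transaction_id, keeping the row with the latest
-- date_time and, on equal timestamps, the highest status priority.
SELECT DISTINCT ON (global_transaction_id)
    global_transaction_id,
    count(*) OVER (PARTITION BY global_transaction_id) AS global_transaction_count,
    status AS global_transaction_status
FROM activity_monitor_transaction
ORDER BY
    global_transaction_id,      -- must lead with the DISTINCT ON expression
    date_time DESC,
    CASE status
        WHEN 'COMPLETED'  THEN 'd'
        WHEN 'FAILED'     THEN 'c'
        WHEN 'TERMINATED' THEN 'b'
        WHEN 'STARTED'    THEN 'a'
        ELSE 'z'
    END DESC;
```

DISTINCT ON is a PostgreSQL extension rather than standard SQL, so this form would not port back to Oracle.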