Rows to Columns with LEAD/LAG function
【Posted】: 2014-12-13 20:42:48
【Question】: I would like help getting a specific result with Oracle 11gR2.
First, I need to start from the table "RAW_DATA", which is laid out like this:
CREATE TABLE RAW_DATA
AS
SELECT 'MTL' AS EMH_CED,'ATW 25-55' AS EMH_ID,to_date('2014-12-03 17:17:10','yyyy-mm-dd hh24:mi:ss') AS EMH_DATE_HEURE,'AM' AS EMH_TYPE_MESURE,'A' AS EMH_PHASE,75 AS EMH_MESURE FROM dual union ALL
SELECT 'MTL','ATW 25-55',to_date('2014-12-03 17:17:10','yyyy-mm-dd hh24:mi:ss'),'AM','B',100 FROM dual union ALL
SELECT 'MTL','ATW 25-55',to_date('2014-12-03 17:17:10','yyyy-mm-dd hh24:mi:ss'),'AM','C',98 FROM dual union ALL
SELECT 'MTL','ATW 25-55',to_date('2014-12-03 17:17:29','yyyy-mm-dd hh24:mi:ss'),'AM','A',75 FROM dual union ALL
SELECT 'MTL','ATW 25-55',to_date('2014-12-03 17:17:29','yyyy-mm-dd hh24:mi:ss'),'AM','B',100 FROM dual union ALL
SELECT 'MTL','ATW 25-55',to_date('2014-12-03 17:17:29','yyyy-mm-dd hh24:mi:ss'),'AM','C',98 FROM dual union ALL
SELECT 'MTL','ATW 25-55',to_date('2014-12-03 17:17:57','yyyy-mm-dd hh24:mi:ss'),'AM','A',84 FROM dual union ALL
SELECT 'MTL','ATW 25-55',to_date('2014-12-03 17:17:57','yyyy-mm-dd hh24:mi:ss'),'AM','B',100 FROM dual union ALL
SELECT 'MTL','ATW 25-55',to_date('2014-12-03 17:17:57','yyyy-mm-dd hh24:mi:ss'),'AM','C',98 FROM dual union ALL
SELECT 'MTL','ATW 25-55',to_date('2014-12-04 00:00:00','yyyy-mm-dd hh24:mi:ss'),'AM','B',91 FROM dual union ALL
SELECT 'MTL','ATW 25-55',to_date('2014-12-04 00:00:00','yyyy-mm-dd hh24:mi:ss'),'AM','C',89 FROM dual union ALL
SELECT 'MTL','ATW 25-55',to_date('2014-12-04 15:06:07','yyyy-mm-dd hh24:mi:ss'),'AM','A',0 FROM dual union ALL
SELECT 'MTL','ATW 25-55',to_date('2014-12-04 15:06:07','yyyy-mm-dd hh24:mi:ss'),'AM','B',0 FROM dual union ALL
SELECT 'MTL','ATW 25-55',to_date('2014-12-04 15:06:07','yyyy-mm-dd hh24:mi:ss'),'AM','C',0 FROM dual union ALL
SELECT 'MTL','ATW 25-55',to_date('2014-12-04 16:22:37','yyyy-mm-dd hh24:mi:ss'),'AM','A',23 FROM dual union ALL
SELECT 'MTL','ATW 25-55',to_date('2014-12-04 16:22:37','yyyy-mm-dd hh24:mi:ss'),'AM','B',24 FROM dual union ALL
SELECT 'MTL','ATW 25-55',to_date('2014-12-04 16:22:37','yyyy-mm-dd hh24:mi:ss'),'AM','C',24 FROM dual union ALL
SELECT 'MTL','ATW 25-55',to_date('2014-12-04 16:27:36','yyyy-mm-dd hh24:mi:ss'),'AM','A',34 FROM dual union ALL
SELECT 'MTL','ATW 25-55',to_date('2014-12-04 16:27:43','yyyy-mm-dd hh24:mi:ss'),'AM','B',40 FROM dual union ALL
SELECT 'MTL','ATW 25-55',to_date('2014-12-04 16:27:43','yyyy-mm-dd hh24:mi:ss'),'AM','C',39 FROM dual union ALL
SELECT 'MTL','ATW 25-55',to_date('2014-12-04 16:28:12','yyyy-mm-dd hh24:mi:ss'),'AM','A',51 FROM dual union ALL
SELECT 'MTL','ATW 25-55',to_date('2014-12-04 16:28:12','yyyy-mm-dd hh24:mi:ss'),'AM','B',58 FROM dual union ALL
SELECT 'MTL','ATW 25-55',to_date('2014-12-04 16:28:12','yyyy-mm-dd hh24:mi:ss'),'AM','C',57 FROM dual union ALL
SELECT 'MTL','ATW 25-55',to_date('2014-12-04 16:40:33','yyyy-mm-dd hh24:mi:ss'),'AM','B',80 FROM dual union ALL
SELECT 'MTL','ATW 25-55',to_date('2014-12-04 16:40:33','yyyy-mm-dd hh24:mi:ss'),'AM','C',78 FROM dual union ALL
SELECT 'MTL','ATW 25-55',to_date('2014-12-04 16:41:02','yyyy-mm-dd hh24:mi:ss'),'AM','A',73 FROM dual union ALL
SELECT 'MTL','ATW 25-55',to_date('2014-12-04 16:47:10','yyyy-mm-dd hh24:mi:ss'),'AM','A',83 FROM dual union ALL
SELECT 'MTL','ATW 25-55',to_date('2014-12-04 16:55:39','yyyy-mm-dd hh24:mi:ss'),'AM','B',98 FROM dual union ALL
SELECT 'MTL','ATW 25-55',to_date('2014-12-04 16:27:59','yyyy-mm-dd hh24:mi:ss'),'AM','A',0 FROM dual union ALL
SELECT 'MTL','ATW 25-55',to_date('2014-12-04 16:27:59','yyyy-mm-dd hh24:mi:ss'),'AM','B',0 FROM dual union ALL
SELECT 'MTL','ATW 25-55',to_date('2014-12-04 16:27:59','yyyy-mm-dd hh24:mi:ss'),'AM','C',0 FROM dual union ALL
SELECT 'MTL','ATW 25-55',to_date('2014-12-04 16:56:37','yyyy-mm-dd hh24:mi:ss'),'AM','C',96 FROM dual;
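The final result I am looking for is as follows:
I need the "EMH_MESURE" for each value of "EMH_PHASE" ("A", "B" and "C") on a single data row. That result needs to be stored in three new columns named "MESURE_A", "MESURE_B" and "MESURE_C".
After that, I need the data rows just before and just after each zero crossing (a zero crossing is a row where MESURE_A=MESURE_B=MESURE_C=0, with "RAW_DATA" ordered by "EMH_DATE_HEURE"). I also need the data row corresponding to the zero crossing itself. In my context there can be several zero crossings. So, based on the "RAW_DATA" table, the result I want is the following: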
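| EMH_CED | EMH_ID | EMH_DATE_HEURE | EMH_TYPE_MESURE | MESURE_A | MESURE_B | MESURE_C |
|---------|-----------|---------------------|-----------------|----------|----------|----------|
| MTL | ATW 25-55 | 2014-12-04 00:00:00 | AM | 84 | 91 | 89 |
| MTL | ATW 25-55 | 2014-12-04 15:06:07 | AM | 0 | 0 | 0 |
| MTL | ATW 25-55 | 2014-12-04 16:22:37 | AM | 23 | 24 | 24 |
| MTL | ATW 25-55 | 2014-12-04 16:27:43 | AM | 34 | 40 | 39 |
| MTL | ATW 25-55 | 2014-12-04 16:27:59 | AM | 0 | 0 | 0 |
| MTL | ATW 25-55 | 2014-12-04 16:28:12 | AM | 51 | 58 | 57 |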
So I started by converting the "EMH_PHASE" column from "RAW_DATA" into 3 separate columns ("MESURE_A", "MESURE_B" and "MESURE_C") with the following code.
WITH ROWS_TO_COLUMNS AS(
SELECT EMH_CED
,EMH_ID
,EMH_DATE_HEURE
,EMH_TYPE_MESURE
, MAX(decode(EMH_PHASE,'A', EMH_MESURE, null)) AS MESURE_A
, MAX(decode(EMH_PHASE,'B', EMH_MESURE, null)) AS MESURE_B
, MAX(decode(EMH_PHASE,'C', EMH_MESURE, null)) AS MESURE_C
FROM RAW_DATA
GROUP BY EMH_CED, EMH_ID, EMH_DATE_HEURE, EMH_TYPE_MESURE
)
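So far it seems to do what I want, but I get some unwanted NULL values: a timestamp that has no reading for a given phase gets NULL in that phase's column.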
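To make the source of those NULLs concrete, here is a minimal plain-Python model of the MAX(DECODE(...)) ... GROUP BY pivot. It is only a sketch: the `pivot` helper and the five-row sample (taken from RAW_DATA) are illustrative, and it assumes a single EMH_CED / EMH_ID / EMH_TYPE_MESURE combination, so rows are grouped by timestamp alone.

```python
from collections import defaultdict

# Sample (timestamp, phase, value) triples copied from RAW_DATA.
rows = [
    ("2014-12-03 17:17:57", "A", 84),
    ("2014-12-03 17:17:57", "B", 100),
    ("2014-12-03 17:17:57", "C", 98),
    ("2014-12-04 00:00:00", "B", 91),
    ("2014-12-04 00:00:00", "C", 89),
]

def pivot(rows):
    """Model of MAX(DECODE(phase, 'X', value)) grouped by timestamp."""
    grouped = defaultdict(lambda: {"A": None, "B": None, "C": None})
    for ts, phase, value in rows:
        current = grouped[ts][phase]
        # MAX ignores NULLs, so the first value seen simply replaces None.
        grouped[ts][phase] = value if current is None else max(current, value)
    return [(ts, g["A"], g["B"], g["C"]) for ts, g in sorted(grouped.items())]

pivoted = pivot(rows)
for row in pivoted:
    print(row)
```

The second timestamp has no phase "A" reading, so its MESURE_A slot stays empty (None here, NULL in the query).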
Then I fill each NULL with the value that precedes it, using the code below:
,NULLS_FILLED AS(
SELECT EMH_CED, EMH_ID, EMH_DATE_HEURE
,FIRST_VALUE(MESURE_A) IGNORE NULLS
OVER (PARTITION BY EMH_CED, EMH_ID ORDER BY EMH_CED, EMH_ID, EMH_DATE_HEURE DESC
RANGE BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) AS MESURE_A
,FIRST_VALUE(MESURE_B) IGNORE NULLS
OVER (PARTITION BY EMH_CED, EMH_ID ORDER BY EMH_CED, EMH_ID, EMH_DATE_HEURE DESC
RANGE BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) AS MESURE_B
,FIRST_VALUE(MESURE_C) IGNORE NULLS
OVER (PARTITION BY EMH_CED, EMH_ID ORDER BY EMH_CED, EMH_ID, EMH_DATE_HEURE DESC
RANGE BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) AS MESURE_C
FROM ROWS_TO_COLUMNS
ORDER BY EMH_DATE_HEURE
)
The result after that operation is exactly what I was looking for at this stage.
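The window used above (ORDER BY ... DESC with a frame from CURRENT ROW to UNBOUNDED FOLLOWING) looks from the current row back towards the earliest row, so FIRST_VALUE ... IGNORE NULLS returns the most recent non-NULL value at or before the current timestamp. A rough Python equivalent, assuming one partition and rows already sorted by time (the `fill_forward` name is made up for this sketch):

```python
def fill_forward(pivoted):
    """Carry the last known A/B/C value forward over None gaps."""
    last = [None, None, None]
    filled = []
    for ts, *measures in pivoted:  # rows must be sorted by timestamp
        # Keep the new value when present, otherwise reuse the previous one.
        last = [m if m is not None else prev for m, prev in zip(measures, last)]
        filled.append((ts, *last))
    return filled

# Pivoted rows as they come out of ROWS_TO_COLUMNS for part of 2014-12-04.
pivoted = [
    ("2014-12-04 16:22:37", 23, 24, 24),
    ("2014-12-04 16:27:36", 34, None, None),
    ("2014-12-04 16:27:43", None, 40, 39),
    ("2014-12-04 16:27:59", 0, 0, 0),
]

filled = fill_forward(pivoted)
for row in filled:
    print(row)
```

The 16:27:43 row ends up as 34 / 40 / 39, which is the row expected in the final result.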
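The next step is where I need help. I only want the rows that immediately precede (LAG) and follow (LEAD) a row where MESURE_A=MESURE_B=MESURE_C=0, and I need that zero row displayed as well.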
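Right now I only get the row before each zero crossing, plus the last row of the table, which I don't even want. I still need to find a way to get the 2 kinds of rows I'm missing (the zero rows and the rows after them) while getting rid of the one I don't want.
So far I've tried different approaches without any good results. Any help?
Here is the rest of my code, which needs to be adjusted to get the desired result: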
,RN_DATA AS(
SELECT NULLS_FILLED.*, row_number() over (order by EMH_CED, EMH_ID, EMH_DATE_HEURE) AS rn
FROM NULLS_FILLED
)
,DATA_GROUPED AS (
SELECT RN_DATA.*, rownum - rn AS grp
FROM RN_DATA
WHERE MESURE_A>0 AND MESURE_B>0 AND MESURE_C>0
)
SELECT max(EMH_CED) keep (dense_rank first ORDER BY EMH_CED, EMH_ID, EMH_DATE_HEURE DESC) AS EMH_CED
,max(EMH_ID) keep (dense_rank first ORDER BY EMH_CED, EMH_ID, EMH_DATE_HEURE DESC) AS EMH_ID
,max(EMH_DATE_HEURE) keep (dense_rank first ORDER BY EMH_CED, EMH_ID, EMH_DATE_HEURE DESC) AS EMH_DATE_HEURE
,max(MESURE_A) keep (dense_rank first ORDER BY EMH_CED, EMH_ID, EMH_DATE_HEURE DESC) AS MESURE_A
,max(MESURE_B) keep (dense_rank first ORDER BY EMH_CED, EMH_ID, EMH_DATE_HEURE DESC) AS MESURE_B
,max(MESURE_C) keep (dense_rank first ORDER BY EMH_CED, EMH_ID, EMH_DATE_HEURE DESC) AS MESURE_C
,max(rn) keep (dense_rank first ORDER BY EMH_CED, EMH_ID, EMH_DATE_HEURE DESC) AS rn
FROM DATA_GROUPED
GROUP BY grp
ORDER BY rn
;
Feel free to test my code on SQL Fiddle: http://sqlfiddle.com/#!4/e6b2e0/4/0
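One way to read the query in the question: it numbers every row, drops the rows that are not strictly positive, groups the survivors by rownum - rn (which is constant within a run of consecutive surviving rows) and keeps the latest row of each group. The Python sketch below models that under some assumptions: a single partition, rows delivered in rn order, and a truncated sample of the filled data. `last_row_per_island` is an illustrative name, not something from the original code.

```python
from itertools import groupby

# Filled rows (timestamp, MESURE_A, MESURE_B, MESURE_C), sorted by time.
filled = [
    ("2014-12-04 00:00:00", 84, 91, 89),
    ("2014-12-04 15:06:07", 0, 0, 0),
    ("2014-12-04 16:22:37", 23, 24, 24),
    ("2014-12-04 16:27:36", 34, 24, 24),
    ("2014-12-04 16:27:43", 34, 40, 39),
    ("2014-12-04 16:27:59", 0, 0, 0),
    ("2014-12-04 16:28:12", 51, 58, 57),
    ("2014-12-04 16:40:33", 51, 80, 78),
]

def last_row_per_island(filled):
    # RN_DATA: number every row, zero rows included.
    numbered = list(enumerate(filled, start=1))
    # DATA_GROUPED: keep only rows where all three measures are > 0.
    positive = [(rn, row) for rn, row in numbered
                if all(m is not None and m > 0 for m in row[1:])]
    # rownum - rn stays constant inside a run of consecutive kept rows.
    keyed = [(rownum - rn, row)
             for rownum, (rn, row) in enumerate(positive, start=1)]
    # GROUP BY grp, keeping the latest row of each group.
    return [list(group)[-1][1]
            for _, group in groupby(keyed, key=lambda pair: pair[0])]

kept = last_row_per_island(filled)
for row in kept:
    print(row)
```

On this sample the survivors are the row before each zero crossing and the final row of the sample, which matches the behaviour described in the question: the zero rows are filtered out up front, and nothing selects the row after them.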
【Comments】:
This is a novel, not a question.
【Answer 1】: I don't see why you are using the ROW_NUMBER, FIRST and LAST functions. You only need the LEAD and LAG functions.
WITH rows_to_columns AS
(
    -- Pivot: one row per timestamp, one column per phase
    SELECT emh_ced,
           emh_id,
           emh_date_heure,
           emh_type_mesure,
           MAX(CASE emh_phase WHEN 'A' THEN emh_mesure END) AS mesure_a,
           MAX(CASE emh_phase WHEN 'B' THEN emh_mesure END) AS mesure_b,
           MAX(CASE emh_phase WHEN 'C' THEN emh_mesure END) AS mesure_c
    FROM   raw_data
    GROUP  BY emh_ced, emh_id, emh_date_heure, emh_type_mesure
),
nulls_filled AS
(
    SELECT emh_ced,
           emh_id,
           emh_date_heure,
           emh_type_mesure,
           -- Fill each NULL with the most recent earlier non-NULL value
           FIRST_VALUE(mesure_a) IGNORE NULLS OVER (PARTITION BY emh_ced, emh_id ORDER BY emh_ced, emh_id, emh_date_heure DESC RANGE BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) AS mesure_a,
           FIRST_VALUE(mesure_b) IGNORE NULLS OVER (PARTITION BY emh_ced, emh_id ORDER BY emh_ced, emh_id, emh_date_heure DESC RANGE BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) AS mesure_b,
           FIRST_VALUE(mesure_c) IGNORE NULLS OVER (PARTITION BY emh_ced, emh_id ORDER BY emh_ced, emh_id, emh_date_heure DESC RANGE BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) AS mesure_c,
           -- Values of the next row in time. Inside LEAD/LAG, mesure_x refers to the
           -- pivoted (unfilled) column of rows_to_columns, not to the filled alias above.
           LEAD(mesure_a, 1) OVER (PARTITION BY emh_ced, emh_id ORDER BY emh_ced, emh_id, emh_date_heure) AS lead_a,
           LEAD(mesure_b, 1) OVER (PARTITION BY emh_ced, emh_id ORDER BY emh_ced, emh_id, emh_date_heure) AS lead_b,
           LEAD(mesure_c, 1) OVER (PARTITION BY emh_ced, emh_id ORDER BY emh_ced, emh_id, emh_date_heure) AS lead_c,
           -- Values of the previous row in time
           LAG(mesure_a, 1) OVER (PARTITION BY emh_ced, emh_id ORDER BY emh_ced, emh_id, emh_date_heure) AS lag_a,
           LAG(mesure_b, 1) OVER (PARTITION BY emh_ced, emh_id ORDER BY emh_ced, emh_id, emh_date_heure) AS lag_b,
           LAG(mesure_c, 1) OVER (PARTITION BY emh_ced, emh_id ORDER BY emh_ced, emh_id, emh_date_heure) AS lag_c
    FROM   rows_to_columns
)
SELECT emh_ced,
       emh_id,
       emh_date_heure,
       emh_type_mesure,
       mesure_a,
       mesure_b,
       mesure_c
FROM   nulls_filled
WHERE  (mesure_a = 0 AND mesure_b = 0 AND mesure_c = 0)   -- the zero crossing itself
   OR  (lead_a = 0 AND lead_b = 0 AND lead_c = 0)         -- the row just before it
   OR  (lag_a = 0 AND lag_b = 0 AND lag_c = 0)            -- the row just after it
ORDER  BY emh_date_heure;
Output:
| EMH_CED | EMH_ID | EMH_DATE_HEURE | EMH_TYPE_MESURE | MESURE_A | MESURE_B | MESURE_C |
|---------|-----------|---------------------------------|-----------------|----------|----------|----------|
| MTL | ATW 25-55 | December, 04 2014 00:00:00+0000 | AM | 84 | 91 | 89 |
| MTL | ATW 25-55 | December, 04 2014 15:06:07+0000 | AM | 0 | 0 | 0 |
| MTL | ATW 25-55 | December, 04 2014 16:22:37+0000 | AM | 23 | 24 | 24 |
| MTL | ATW 25-55 | December, 04 2014 16:27:43+0000 | AM | 34 | 40 | 39 |
| MTL | ATW 25-55 | December, 04 2014 16:27:59+0000 | AM | 0 | 0 | 0 |
| MTL | ATW 25-55 | December, 04 2014 16:28:12+0000 | AM | 51 | 58 | 57 |
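The selection rule in the final WHERE clause can be sketched in plain Python as a neighbour check: keep a row if it is all-zero, or if the row before it (LAG) or after it (LEAD) is all-zero. This sketch makes two assumptions that differ from the query: it works on a single partition, and it inspects the already filled rows rather than the raw pivoted columns. The `around_zero` name and the sample are illustrative only.

```python
# Filled rows (timestamp, MESURE_A, MESURE_B, MESURE_C), sorted by time.
filled = [
    ("2014-12-04 00:00:00", 84, 91, 89),
    ("2014-12-04 15:06:07", 0, 0, 0),
    ("2014-12-04 16:22:37", 23, 24, 24),
    ("2014-12-04 16:27:36", 34, 24, 24),
    ("2014-12-04 16:27:43", 34, 40, 39),
    ("2014-12-04 16:27:59", 0, 0, 0),
    ("2014-12-04 16:28:12", 51, 58, 57),
    ("2014-12-04 16:40:33", 51, 80, 78),
]

def around_zero(filled):
    """Keep all-zero rows plus the rows immediately before and after them."""
    is_zero = [all(m == 0 for m in row[1:]) for row in filled]
    keep = []
    for i, row in enumerate(filled):
        prev_zero = i > 0 and is_zero[i - 1]                 # LAG(...) = 0
        next_zero = i + 1 < len(filled) and is_zero[i + 1]   # LEAD(...) = 0
        if is_zero[i] or prev_zero or next_zero:
            keep.append(row)
    return keep

selected = around_zero(filled)
for row in selected:
    print(row)
```

On this sample the six selected rows carry the same timestamps and values as the output table above.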
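【Comments】:
Thanks for the quick reply! I'm not really an expert in SQL queries, so until now I have always used the ROW_NUMBER, FIRST and LAST functions for this kind of processing, and I never really tried the other functions. I also wouldn't have thought of using new columns to store the results of the LEAD / LAG functions. Your solution is very simple and does exactly what I need, thank you very much!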