Rows to Columns with LEAD/LAG function

Posted: 2014-12-13 20:42:48

Question:

I would like some help getting a specific result with Oracle 11gR2.

To begin with, I start from a table "RAW_DATA" laid out like this:

CREATE TABLE RAW_DATA
AS
SELECT 'MTL' AS EMH_CED,'ATW 25-55' AS EMH_ID,to_date('2014-12-03 17:17:10','yyyy-mm-dd hh24:mi:ss') AS EMH_DATE_HEURE,'AM' AS EMH_TYPE_MESURE,'A' AS EMH_PHASE,75 AS EMH_MESURE FROM dual union ALL
SELECT 'MTL','ATW 25-55',to_date('2014-12-03 17:17:10','yyyy-mm-dd hh24:mi:ss'),'AM','B',100 FROM dual union ALL 
SELECT 'MTL','ATW 25-55',to_date('2014-12-03 17:17:10','yyyy-mm-dd hh24:mi:ss'),'AM','C',98 FROM dual union ALL 
SELECT 'MTL','ATW 25-55',to_date('2014-12-03 17:17:29','yyyy-mm-dd hh24:mi:ss'),'AM','A',75 FROM dual union ALL 
SELECT 'MTL','ATW 25-55',to_date('2014-12-03 17:17:29','yyyy-mm-dd hh24:mi:ss'),'AM','B',100 FROM dual union ALL 
SELECT 'MTL','ATW 25-55',to_date('2014-12-03 17:17:29','yyyy-mm-dd hh24:mi:ss'),'AM','C',98 FROM dual union ALL 
SELECT 'MTL','ATW 25-55',to_date('2014-12-03 17:17:57','yyyy-mm-dd hh24:mi:ss'),'AM','A',84 FROM dual union ALL 
SELECT 'MTL','ATW 25-55',to_date('2014-12-03 17:17:57','yyyy-mm-dd hh24:mi:ss'),'AM','B',100 FROM dual union ALL 
SELECT 'MTL','ATW 25-55',to_date('2014-12-03 17:17:57','yyyy-mm-dd hh24:mi:ss'),'AM','C',98 FROM dual union ALL 
SELECT 'MTL','ATW 25-55',to_date('2014-12-04 00:00:00','yyyy-mm-dd hh24:mi:ss'),'AM','B',91 FROM dual union ALL 
SELECT 'MTL','ATW 25-55',to_date('2014-12-04 00:00:00','yyyy-mm-dd hh24:mi:ss'),'AM','C',89 FROM dual union ALL 
SELECT 'MTL','ATW 25-55',to_date('2014-12-04 15:06:07','yyyy-mm-dd hh24:mi:ss'),'AM','A',0 FROM dual union ALL 
SELECT 'MTL','ATW 25-55',to_date('2014-12-04 15:06:07','yyyy-mm-dd hh24:mi:ss'),'AM','B',0 FROM dual union ALL 
SELECT 'MTL','ATW 25-55',to_date('2014-12-04 15:06:07','yyyy-mm-dd hh24:mi:ss'),'AM','C',0 FROM dual union ALL 
SELECT 'MTL','ATW 25-55',to_date('2014-12-04 16:22:37','yyyy-mm-dd hh24:mi:ss'),'AM','A',23 FROM dual union ALL 
SELECT 'MTL','ATW 25-55',to_date('2014-12-04 16:22:37','yyyy-mm-dd hh24:mi:ss'),'AM','B',24 FROM dual union ALL 
SELECT 'MTL','ATW 25-55',to_date('2014-12-04 16:22:37','yyyy-mm-dd hh24:mi:ss'),'AM','C',24 FROM dual union ALL
SELECT 'MTL','ATW 25-55',to_date('2014-12-04 16:27:36','yyyy-mm-dd hh24:mi:ss'),'AM','A',34 FROM dual union ALL 
SELECT 'MTL','ATW 25-55',to_date('2014-12-04 16:27:43','yyyy-mm-dd hh24:mi:ss'),'AM','B',40 FROM dual union ALL 
SELECT 'MTL','ATW 25-55',to_date('2014-12-04 16:27:43','yyyy-mm-dd hh24:mi:ss'),'AM','C',39 FROM dual union ALL
SELECT 'MTL','ATW 25-55',to_date('2014-12-04 16:28:12','yyyy-mm-dd hh24:mi:ss'),'AM','A',51 FROM dual union ALL 
SELECT 'MTL','ATW 25-55',to_date('2014-12-04 16:28:12','yyyy-mm-dd hh24:mi:ss'),'AM','B',58 FROM dual union ALL 
SELECT 'MTL','ATW 25-55',to_date('2014-12-04 16:28:12','yyyy-mm-dd hh24:mi:ss'),'AM','C',57 FROM dual union ALL
SELECT 'MTL','ATW 25-55',to_date('2014-12-04 16:40:33','yyyy-mm-dd hh24:mi:ss'),'AM','B',80 FROM dual union ALL 
SELECT 'MTL','ATW 25-55',to_date('2014-12-04 16:40:33','yyyy-mm-dd hh24:mi:ss'),'AM','C',78 FROM dual union ALL
SELECT 'MTL','ATW 25-55',to_date('2014-12-04 16:41:02','yyyy-mm-dd hh24:mi:ss'),'AM','A',73 FROM dual union ALL 
SELECT 'MTL','ATW 25-55',to_date('2014-12-04 16:47:10','yyyy-mm-dd hh24:mi:ss'),'AM','A',83 FROM dual union ALL 
SELECT 'MTL','ATW 25-55',to_date('2014-12-04 16:55:39','yyyy-mm-dd hh24:mi:ss'),'AM','B',98 FROM dual union ALL 
SELECT 'MTL','ATW 25-55',to_date('2014-12-04 16:27:59','yyyy-mm-dd hh24:mi:ss'),'AM','A',0 FROM dual union ALL 
SELECT 'MTL','ATW 25-55',to_date('2014-12-04 16:27:59','yyyy-mm-dd hh24:mi:ss'),'AM','B',0 FROM dual union ALL 
SELECT 'MTL','ATW 25-55',to_date('2014-12-04 16:27:59','yyyy-mm-dd hh24:mi:ss'),'AM','C',0 FROM dual union ALL
SELECT 'MTL','ATW 25-55',to_date('2014-12-04 16:56:37','yyyy-mm-dd hh24:mi:ss'),'AM','C',96 FROM dual;

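The final result I am looking for is as follows:

I need the "EMH_MESURE" value for each value of "EMH_PHASE" ("A", "B" and "C") on a single data row. The result needs to be stored in three new columns named "MESURE_A", "MESURE_B" and "MESURE_C".

After that, I need the data rows immediately before and after each zero crossing (that is, a row where MESURE_A=MESURE_B=MESURE_C=0, with "RAW_DATA" ordered by "EMH_DATE_HEURE"). I also need the data row corresponding to the zero crossing itself. In my context there may be several zero crossings. Based on the "RAW_DATA" table, the result I want is the following: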

EMH_CED  EMH_ID      EMH_DATE_HEURE      EMH_TYPE_MESURE MESURE_A  MESURE_B  MESURE_C
MTL      ATW 25-55   2014-12-04 00:00:00       AM              84        91        89
MTL      ATW 25-55   2014-12-04 15:06:07       AM               0         0         0
MTL      ATW 25-55   2014-12-04 16:22:37       AM              23        24        24
MTL      ATW 25-55   2014-12-04 16:27:43       AM              34        40        39
MTL      ATW 25-55   2014-12-04 16:27:59       AM               0         0         0
MTL      ATW 25-55   2014-12-04 16:28:12       AM              51        58        57

So I first used the following code to turn the "EMH_PHASE" column of "RAW_DATA" into 3 separate columns ("MESURE_A", "MESURE_B" and "MESURE_C").

WITH ROWS_TO_COLUMNS AS(
  SELECT EMH_CED
    ,EMH_ID
    ,EMH_DATE_HEURE
    ,EMH_TYPE_MESURE
   , MAX(decode(EMH_PHASE,'A', EMH_MESURE, null)) AS MESURE_A
   , MAX(decode(EMH_PHASE,'B', EMH_MESURE, null)) AS MESURE_B
   , MAX(decode(EMH_PHASE,'C', EMH_MESURE, null)) AS MESURE_C
FROM RAW_DATA
GROUP BY EMH_CED, EMH_ID, EMH_DATE_HEURE, EMH_TYPE_MESURE
)
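
As a quick check outside Oracle, the same conditional-aggregation pivot can be reproduced with Python's sqlite3 module. This is only a sketch: SQLite stands in for Oracle 11gR2, the five sample rows are copied from RAW_DATA, and the timestamps are stored as ISO text (an assumption made for SQLite, which has no DATE type). It also shows where the nulls come from: a timestamp at which a phase was not recorded has nothing to aggregate for that phase.

```python
import sqlite3

# A few rows copied from RAW_DATA; timestamps kept as ISO strings for SQLite.
ROWS = [
    ("MTL", "ATW 25-55", "2014-12-03 17:17:57", "AM", "A", 84),
    ("MTL", "ATW 25-55", "2014-12-03 17:17:57", "AM", "B", 100),
    ("MTL", "ATW 25-55", "2014-12-03 17:17:57", "AM", "C", 98),
    ("MTL", "ATW 25-55", "2014-12-04 00:00:00", "AM", "B", 91),
    ("MTL", "ATW 25-55", "2014-12-04 00:00:00", "AM", "C", 89),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE raw_data (emh_ced TEXT, emh_id TEXT, emh_date_heure TEXT, "
    "emh_type_mesure TEXT, emh_phase TEXT, emh_mesure INTEGER)"
)
conn.executemany("INSERT INTO raw_data VALUES (?, ?, ?, ?, ?, ?)", ROWS)

# One output row per timestamp; MAX() skips the NULLs produced by CASE
# for the phases that do not match.
pivoted = conn.execute(
    """
    SELECT emh_date_heure,
           MAX(CASE emh_phase WHEN 'A' THEN emh_mesure END) AS mesure_a,
           MAX(CASE emh_phase WHEN 'B' THEN emh_mesure END) AS mesure_b,
           MAX(CASE emh_phase WHEN 'C' THEN emh_mesure END) AS mesure_c
    FROM raw_data
    GROUP BY emh_ced, emh_id, emh_date_heure, emh_type_mesure
    ORDER BY emh_date_heure
    """
).fetchall()

for row in pivoted:
    print(row)
```

The second row comes back with None for mesure_a because phase A has no reading at 2014-12-04 00:00:00 in the source data.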

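So far it seems to do what I want, but I get some unwanted null values.

Then, with the code below, I fill each null with the last non-null value that precedes it: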

NULLS_FILLED AS(
  SELECT EMH_CED, EMH_ID, EMH_DATE_HEURE
   ,FIRST_VALUE(MESURE_A) IGNORE NULLS
       OVER (PARTITION BY EMH_CED, EMH_ID ORDER BY EMH_CED, EMH_ID, EMH_DATE_HEURE DESC
         RANGE BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) AS MESURE_A
   ,FIRST_VALUE(MESURE_B) IGNORE NULLS
       OVER (PARTITION BY EMH_CED, EMH_ID ORDER BY EMH_CED, EMH_ID, EMH_DATE_HEURE DESC
         RANGE BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) AS MESURE_B
   ,FIRST_VALUE(MESURE_C) IGNORE NULLS
       OVER (PARTITION BY EMH_CED, EMH_ID ORDER BY EMH_CED, EMH_ID, EMH_DATE_HEURE DESC
         RANGE BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) AS MESURE_C
FROM ROWS_TO_COLUMNS
ORDER BY EMH_DATE_HEURE
)
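
The fill step can be sketched in plain Python to make the intended behaviour explicit. This is an illustration and not the Oracle implementation: the helper name forward_fill is hypothetical, the input is assumed to be a single (EMH_CED, EMH_ID) partition already sorted by EMH_DATE_HEURE ascending, and the tuples are the pivoted (MESURE_A, MESURE_B, MESURE_C) values for six consecutive timestamps from the question's data.

```python
def forward_fill(rows):
    """Replace each None with the most recent non-None value in the same column.

    rows: list of equal-length tuples, ordered oldest first.
    """
    last_seen = [None] * len(rows[0]) if rows else []
    filled = []
    for row in rows:
        # Keep the current value when present, otherwise carry the previous one.
        last_seen = [v if v is not None else prev for v, prev in zip(row, last_seen)]
        filled.append(tuple(last_seen))
    return filled


# Pivoted (mesure_a, mesure_b, mesure_c) values, oldest first.
pivoted = [
    (84, 100, 98),     # 2014-12-03 17:17:57
    (None, 91, 89),    # 2014-12-04 00:00:00
    (0, 0, 0),         # 2014-12-04 15:06:07
    (23, 24, 24),      # 2014-12-04 16:22:37
    (34, None, None),  # 2014-12-04 16:27:36
    (None, 40, 39),    # 2014-12-04 16:27:43
]

filled = forward_fill(pivoted)
for row in filled:
    print(row)
```

In the query above, the descending ORDER BY combined with the frame CURRENT ROW AND UNBOUNDED FOLLOWING looks at the current row and every earlier one, which is the same "last known value" rule. In Oracle this should also be expressible as LAST_VALUE(mesure_a IGNORE NULLS) OVER (PARTITION BY emh_ced, emh_id ORDER BY emh_date_heure), relying on the default window frame.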

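The result after that step is the one I was looking for at that point.

The next step is where I need help. I only want the LEADing and LAGging rows around each row where MESURE_A=MESURE_B=MESURE_C=0 (I need that row displayed too).

Right now I can only get the LAGging rows, plus the last row of the table, which I don't even want. I still need to find a way to get the 2 rows I am missing while getting rid of the one I don't want.

So far I have tried different approaches without any good result. Any help?

Here is the rest of my code, which needs adjusting to get the desired result: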

,RN_DATA AS(
   SELECT NULLS_FILLED.*, row_number() over (order by EMH_CED, EMH_ID, EMH_DATE_HEURE) AS rn
FROM NULLS_FILLED
)

,DATA_GROUPED AS (
   SELECT RN_DATA.*, rownum - rn AS grp
FROM RN_DATA
WHERE MESURE_A>0 AND MESURE_B>0 AND MESURE_C>0
)

SELECT max(EMH_CED) keep (dense_rank first ORDER BY EMH_CED, EMH_ID, EMH_DATE_HEURE DESC) AS EMH_CED
  ,max(EMH_ID) keep (dense_rank first ORDER BY EMH_CED, EMH_ID, EMH_DATE_HEURE DESC) AS EMH_ID
  ,max(EMH_DATE_HEURE) keep (dense_rank first ORDER BY EMH_CED, EMH_ID, EMH_DATE_HEURE DESC) AS EMH_DATE_HEURE
  ,max(MESURE_A) keep (dense_rank first ORDER BY EMH_CED, EMH_ID, EMH_DATE_HEURE DESC) AS MESURE_A
  ,max(MESURE_B) keep (dense_rank first ORDER BY EMH_CED, EMH_ID, EMH_DATE_HEURE DESC) AS MESURE_B
  ,max(MESURE_C) keep (dense_rank first ORDER BY EMH_CED, EMH_ID, EMH_DATE_HEURE DESC) AS MESURE_C
  ,max(rn) keep (dense_rank first ORDER BY EMH_CED, EMH_ID, EMH_DATE_HEURE DESC) AS rn
FROM DATA_GROUPED
GROUP BY grp
ORDER BY rn
;

Feel free to test my code on SQL Fiddle: http://sqlfiddle.com/#!4/e6b2e0/4/0

Comments:

This is a novel, not a question.

Answer 1:

I don't see why you are using the ROW_NUMBER, FIRST and LAST functions. You only need the LEAD and LAG functions.

WITH rows_to_columns AS
(
         SELECT   emh_ced,
                  emh_id,
                  emh_date_heure,
                  emh_type_mesure,
                  Max (
                  CASE emh_phase
                           WHEN 'A' THEN emh_mesure
                  END) AS mesure_a,
                  Max (
                  CASE emh_phase
                           WHEN 'B' THEN emh_mesure
                  END) AS mesure_b,
                  Max (
                  CASE emh_phase
                           WHEN 'C' THEN emh_mesure
                  END) AS mesure_c
         FROM     raw_data
         GROUP BY emh_ced,
                  emh_id,
                  emh_date_heure,
                  emh_type_mesure), nulls_filled AS
(
         SELECT   emh_ced,
                  emh_id,
                  emh_date_heure,
                  emh_type_mesure,
                  first_value(mesure_a) ignore nulls over (PARTITION BY emh_ced, emh_id ORDER BY emh_ced, emh_id, emh_date_heure DESC RANGE BETWEEN CURRENT ROW AND unbounded following) AS mesure_a,
                  first_value(mesure_b) ignore nulls over (PARTITION BY emh_ced, emh_id ORDER BY emh_ced, emh_id, emh_date_heure DESC RANGE BETWEEN CURRENT ROW AND unbounded following) AS mesure_b,
                  first_value(mesure_c) ignore nulls over (PARTITION BY emh_ced, emh_id ORDER BY emh_ced, emh_id, emh_date_heure DESC RANGE BETWEEN CURRENT ROW AND unbounded following) AS mesure_c,
                  lead(mesure_a, 1) over (PARTITION BY emh_ced, emh_id ORDER BY emh_ced, emh_id, emh_date_heure) AS lead_a,
                  lead(mesure_b, 1) over (PARTITION BY emh_ced, emh_id ORDER BY emh_ced, emh_id, emh_date_heure) AS lead_b,
                  lead(mesure_c, 1) over (PARTITION BY emh_ced, emh_id ORDER BY emh_ced, emh_id, emh_date_heure) AS lead_c,
                  lag(mesure_a, 1) over (PARTITION BY emh_ced, emh_id ORDER BY emh_ced, emh_id, emh_date_heure) AS lag_a,
                  lag(mesure_b, 1) over (PARTITION BY emh_ced, emh_id ORDER BY emh_ced, emh_id, emh_date_heure) AS lag_b,
                  lag(mesure_c, 1) over (PARTITION BY emh_ced, emh_id ORDER BY emh_ced, emh_id, emh_date_heure) AS lag_c
         FROM     rows_to_columns)
SELECT   emh_ced,
         emh_id,
         emh_date_heure,
         emh_type_mesure,
         mesure_a,
         mesure_b,
         mesure_c
FROM     nulls_filled
WHERE    (
                  mesure_a = 0
         AND      mesure_b = 0
         AND      mesure_c = 0)
OR       (
                  lead_a = 0
         AND      lead_b = 0
         AND      lead_c = 0)
OR       (
                  lag_a = 0
         AND      lag_b = 0
         AND      lag_c = 0)
ORDER BY 3;

Output:

| EMH_CED |    EMH_ID |                  EMH_DATE_HEURE | EMH_TYPE_MESURE | MESURE_A | MESURE_B | MESURE_C |
|---------|-----------|---------------------------------|-----------------|----------|----------|----------|
|     MTL | ATW 25-55 | December, 04 2014 00:00:00+0000 |              AM |       84 |       91 |       89 |
|     MTL | ATW 25-55 | December, 04 2014 15:06:07+0000 |              AM |        0 |        0 |        0 |
|     MTL | ATW 25-55 | December, 04 2014 16:22:37+0000 |              AM |       23 |       24 |       24 |
|     MTL | ATW 25-55 | December, 04 2014 16:27:43+0000 |              AM |       34 |       40 |       39 |
|     MTL | ATW 25-55 | December, 04 2014 16:27:59+0000 |              AM |        0 |        0 |        0 |
|     MTL | ATW 25-55 | December, 04 2014 16:28:12+0000 |              AM |       51 |       58 |       57 |
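
The neighbour filter can be tried in isolation with a small SQLite session driven from Python. This is a hedged sketch and not the Oracle query above: it assumes SQLite 3.25 or later (needed for window functions), it starts from rows that are already pivoted and null-filled (a subset of the question's data, typed in by hand), and the names flagged, neighbours, is_zero, prev_is_zero and next_is_zero are made up for the illustration.

```python
import sqlite3

# Pivoted, null-filled rows (subset of the question's data), ISO timestamps.
FILLED = [
    ("2014-12-03 17:17:57", 84, 100, 98),
    ("2014-12-04 00:00:00", 84, 91, 89),
    ("2014-12-04 15:06:07", 0, 0, 0),
    ("2014-12-04 16:22:37", 23, 24, 24),
    ("2014-12-04 16:27:36", 34, 24, 24),
    ("2014-12-04 16:27:43", 34, 40, 39),
    ("2014-12-04 16:27:59", 0, 0, 0),
    ("2014-12-04 16:28:12", 51, 58, 57),
    ("2014-12-04 16:40:33", 51, 80, 78),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE nulls_filled (emh_date_heure TEXT, "
    "mesure_a INTEGER, mesure_b INTEGER, mesure_c INTEGER)"
)
conn.executemany("INSERT INTO nulls_filled VALUES (?, ?, ?, ?)", FILLED)

QUERY = """
WITH flagged AS (
    -- SQLite evaluates the comparison to 1 or 0.
    SELECT emh_date_heure, mesure_a, mesure_b, mesure_c,
           (mesure_a = 0 AND mesure_b = 0 AND mesure_c = 0) AS is_zero
    FROM nulls_filled
),
neighbours AS (
    -- Look one row back and one row ahead; default 0 at the edges.
    SELECT flagged.*,
           LAG(is_zero, 1, 0)  OVER (ORDER BY emh_date_heure) AS prev_is_zero,
           LEAD(is_zero, 1, 0) OVER (ORDER BY emh_date_heure) AS next_is_zero
    FROM flagged
)
SELECT emh_date_heure, mesure_a, mesure_b, mesure_c
FROM neighbours
WHERE is_zero OR prev_is_zero OR next_is_zero
ORDER BY emh_date_heure
"""

kept = conn.execute(QUERY).fetchall()
for row in kept:
    print(row)
```

Two differences from the answer's query are worth noting. Oracle 11gR2 cannot select a boolean expression as a column, so the flag would need a CASE expression there. Also, in the answer the LEAD and LAG calls read mesure_a, mesure_b and mesure_c from rows_to_columns, that is, the values before null filling, whereas this sketch compares filled values; for the data in the question both give the same six rows, because every zero-crossing timestamp has all three phases recorded.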

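Comments:

Thanks for the quick reply! I'm not really an expert in SQL queries, so until now I have always used the ROW_NUMBER, FIRST and LAST functions for this kind of processing, and I never really tried other functions... I also would not have thought of using new columns to store the results of LEAD / LAG. Your solution is very simple and does exactly what I want, thank you very much!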
