I'm trying to use Snowflake to get the last value of each group in "FIELD_NAME", based on the timestamp "FIELD_TIME".


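I want one column for each field (e.g. Number of Products, Date of SKU Live, and so on), and each column should contain only the last value for that day, as in the table below: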

ISSUE_ID  ISSUE_NAME  FIELD_TIME  Number of Products  Number of SKU live  Date of SKU Live  Number of SKU Not Created  Work_In_Progress_Date  Pending_Date  Status   Resolution
19229     X1          2021-08-01  55                  21                  2021-08-01        34                                                               PENDING  Null
19229     X1          2021-08-08                                                                                                                            PENDING  Null
19229     X1          2021-08-12  55                  24                  2021-08-01        31                         2021-08-12             2021-08-12    PENDING  Null

I tried last_value(FIELD_VALUE) over (partition by FIELD_NAME, ISSUE_ID order by field_time), but it gave me 23 rows with duplicated values instead of 3. I also tried lag(), with no luck.


    select
           t.ISSUE_ID, t.issue_name,
    --        t.created_date,
           t.field_time::date as field_time,
           max(case when field_name = 'Number of Products' then field_value end) as Number_of_Products,
           max(case when field_name = 'Number of SKU live' then field_value end) as Number_of_SKU_Live,
           max(case when field_name = 'Number of SKU not created' then field_value end) as Number_of_SKU_Not_Created,
           max(case when field_name = 'Date of SKU live' then field_value end) as Date_of_SKU_Live,
           max(case when field_value = '10020' then date(t.field_time) end) as Work_In_Progress_Date,
           max(case when field_value = '10010' then date(t.field_time) end) as Pending_Date,
           t.status, t.resolution
    from
    (select fh.ISSUE_ID,
           date(i.created_date) as created_date,
           fh.TIME as field_time,
           f.name as field_name,
           fh.value as field_value,
           i.issue_name, i.status, i.resolution
    from ISSUE_FIELD_HISTORY fh
             left join JIRA.FIELD f on fh.FIELD_ID = f.ID and f._FIVETRAN_DELETED = 0
             left join (select i0.created as created_date,r.name as resolution, i0.id, i0.key as issue_name, i0.status as status_id, s.name as status
                        from JIRA.issue i0
                                 left join JIRA.status s on i0.status = s.ID
                                 left join JIRA.RESOLUTION r on i0.RESOLUTION = r.ID
                 where i0._FIVETRAN_DELETED = 0
                 and i0.key like 'PIM%')
                 i on i.id = fh.ISSUE_ID
    where fh.ISSUE_ID in (select ID from ISSUE where PROJECT = 10041)
    and fh.FIELD_ID in ('customfield_10067', 'customfield_10063', 'customfield_10066', 'customfield_10068', 'status', 'resolution')
    -- and issue_name = 'PIM-11'
    qualify row_number() over (partition by issue_id, field_time::date, field_name order by field_time desc) = 1
    order by field_time) t
    group by issue_id, issue_name,created_date, field_time::date, status, resolution


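ISSUE_FIELD_HISTORY: the main table, which links the field-related columns.
FIELD: helper table used to get the field name associated with a field ID.
ISSUE: helper table used to get the issue name, ID and status ID for each issue.
STATUS: used to get the status name for the status ID in the ISSUE table.
RESOLUTION: used to get the resolution name.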


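Your sample data and your desired result have nothing to do with each other, and your query references a bunch of tables and columns that aren't defined.
I've added a description for each table used.

Answer 1: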

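I can't reproduce your query because it uses too many tables whose definitions you haven't given. However, based on your requirements, here is a simplified example using PIVOT that gives the expected result: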

with cte as (
select 19229 as issue_id, 'X1' as issue_name, '2021-08-01 09:04:35'::timestamp_ntz as field_time, 'Status' as field_name, 10020 as field_value, 'Work In Progress' as status union all
select 19229 as issue_id, 'X1' as issue_name, '2021-08-01 09:04:35'::timestamp_ntz as field_time, 'Number of Products' as field_name, 55 as field_value, 'PENDING' as status union all
select 19229 as issue_id, 'X1' as issue_name, '2021-08-01 09:04:35'::timestamp_ntz as field_time, 'Number of SKU live' as field_name, 21 as field_value, 'PENDING' as status union all
select 19229 as issue_id, 'X1' as issue_name, '2021-08-08 06:19:05'::timestamp_ntz as field_time, 'Status' as field_name, 10010 as field_value, 'PENDING' as status
),
t as (
select ISSUE_ID,
           issue_name,
           field_time::date as field_date,
           -- field_name and field_value are needed by the PIVOT below
           field_name,
           field_value,
           last_value(status) over (partition by issue_id, field_time::date order by field_time desc) as last_status
    from cte
    -- keep only the latest row per issue, day and field
    qualify row_number() over (partition by issue_id, field_time::date, field_name order by field_time desc) = 1
)
select * from t
pivot(max(field_value) for field_name in ('Status', 'Number of Products', 'Number of SKU live')) as p
order by field_date

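You have to use last_value(status) in the query because the status can change during the day and may differ for each field_value. Here is the result of the query, which gives one record per day with the latest value of each field: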

ISSUE_ID    ISSUE_NAME  FIELD_DATE  LAST_STATUS         'Status'    'Number of Products'    'Number of SKU live'
19229       X1          2021-08-01  Work In Progress    10020       55                      21
19229       X1          2021-08-08  PENDING             10010       


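The code works and gives me all the values for each group in "field_value", but I can't get the two columns "Work_In_Progress_Date" and "Pending_Date". I want the final table to include when the issue was "Work_In_progress" and when it moved to "Pending". Thanks.
Your requirement was "I want one column for each field (e.g. Number of Products, Date of SKU Live, and so on), and each column should contain only the last value for that day", so I understood that you want only one record per day. If you want all the statuses, just change this line in the query: "last_value(status) over (partition by issue_id, field_time::date order by field_time desc) as last_status" to "status".
If you scroll to the right in the desired result, there are two columns, "Work_In_Progress_Date" and "Pending_Date", that should each have a value for the day, namely the field_date on which the status value is 10010 or 10020. Those are the two I can't get as dates. Thanks for your patience and help.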
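
One possible way to get the two date columns is conditional aggregation instead of PIVOT. The following is only a sketch on the simplified sample data from the answer, not a tested replacement for the full Jira query. It assumes that status changes arrive as rows with field_name = 'Status' and that 10020 and 10010 are the IDs for "Work In Progress" and "Pending", as the thread suggests; the CTE name flagged and the column rn are made up for illustration.

```sql
with cte as (
    select 19229 as issue_id, 'X1' as issue_name, '2021-08-01 09:04:35'::timestamp_ntz as field_time, 'Status' as field_name, 10020 as field_value union all
    select 19229, 'X1', '2021-08-01 09:04:35'::timestamp_ntz, 'Number of Products', 55 union all
    select 19229, 'X1', '2021-08-01 09:04:35'::timestamp_ntz, 'Number of SKU live', 21 union all
    select 19229, 'X1', '2021-08-08 06:19:05'::timestamp_ntz, 'Status', 10010
),
flagged as (
    -- rn = 1 marks the latest row per issue, day and field;
    -- the other rows are kept so that every status change of the day stays visible
    select issue_id,
           issue_name,
           field_time::date as field_date,
           field_name,
           field_value,
           row_number() over (partition by issue_id, field_time::date, field_name
                              order by field_time desc) as rn
    from cte
)
select issue_id,
       issue_name,
       field_date,
       -- last value of the day for the ordinary fields
       max(case when rn = 1 and field_name = 'Number of Products' then field_value end) as number_of_products,
       max(case when rn = 1 and field_name = 'Number of SKU live' then field_value end) as number_of_sku_live,
       -- the day itself, if the status was set to the given ID at any time that day
       -- (assumption: 10020 = Work In Progress, 10010 = Pending)
       max(case when field_name = 'Status' and field_value = 10020 then field_date end) as work_in_progress_date,
       max(case when field_name = 'Status' and field_value = 10010 then field_date end) as pending_date
from flagged
group by issue_id, issue_name, field_date
order by field_date;
```

The status conditions deliberately skip the rn = 1 filter: if an issue moves to Work In Progress and then to Pending on the same day, both date columns are filled for that day, which seems to be what the 2021-08-12 row of the desired result shows. Checking field_name = 'Status' as well as the ID avoids matching an unrelated field whose value happens to be 10020 or 10010.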


