在一行 BigQuery 中获取今年的销售额和去年的销售额(对于同一周数)

Posted

技术标签:

【中文标题】在一行 BigQuery 中获取今年的销售额和去年的销售额(对于同一周数)【英文标题】:Getiing This Year sales and Last Year Sales (for the same week number) in one row BigQuery 【发布时间】:2020-02-25 04:35:52 【问题描述】:

在 BigQuery 中,我试图在一行中创建一个包含 Sales TY(今年财务)和 Sales LY 的视图。按照下面的示例,销售 LY 度量应针对相同的相应周数 TY。

我正在使用以下代码:

SUM(table_1.sales) OVER (PARTITION BY table_1.client ORDER BY table_1.fw_end_date DESC ROWS BETWEEN 53 FOLLOWING AND 53 FOLLOWING ) as SALES_LY

问题是有时缺少周数,这意味着窗口函数不会反映它,因为它计算的是行数而不是实际周数。

在下面的示例中,缺少本财政年度的第 19 周和第 20 周,因此 sales_ly 度量的相应结果是错误的。

是否可以通过添加/加入具有 NULL 值的缺失日期的附加行来修复?

row_number  client  fy  wofy    fw_end_date year     sales          sales_LY                
1           1111    2020    34  2020-02-23  TY       74.97971177    75.63215281 
2           1111    2020    33  2020-02-16  TY       42.42109122    68.14894689
3           1111    2020    32  2020-02-09  TY       19.85037174    87.12654065
4           1111    2020    31  2020-02-02  TY       16.56226835    3.122137476
5           1111    2020    30  2020-01-26  TY       1.800185225    10.74736963
6           1111    2020    29  2020-01-19  TY       24.75012318    38.63631908
7           1111    2020    28  2020-01-12  TY       25.25663409    1.387588554
8           1111    2020    27  2020-01-05  TY       72.26309414    0.873563634
9           1111    2020    26  2019-12-29  TY       48.42015566    81.57363107
10          1111    2020    25  2019-12-22  TY       29.94857681    41.80248501
11          1111    2020    24  2019-12-15  TY       60.84110104    26.66495665
12          1111    2020    23  2019-12-08  TY       6.810985936    48.21324987
13          1111    2020    22  2019-12-01  TY       55.25000762    8.681731452
14          1111    2020    21  2019-11-24  TY       7.314435129    61.50365765
15          1111    2020    18  2019-11-03  TY       28.86674749    84.68936948 --wrong     
16          1111    2020    17  2019-10-27  TY       93.06304298            
17          1111    2020    16  2019-10-20  TY       31.18234699            
18          1111    2020    15  2019-10-13  TY       56.40700057            
19          1111    2020    14  2019-10-06  TY       70.7995385         
20          1111    2020    13  2019-09-29  TY       88.80009525            
21          1111    2020    12  2019-09-22  TY       75.12011037            
22          1111    2020    11  2019-09-15  TY       28.54977137            
23          1111    2020    10  2019-09-08  TY       38.05238915            
24          1111    2020    9   2019-09-01  TY       7.256419393            
25          1111    2020    8   2019-08-25  TY       32.81145188            
26          1111    2020    7   2019-08-18  TY       32.04938194            
27          1111    2020    6   2019-08-11  TY       53.890913          
28          1111    2020    5   2019-08-04  TY       17.4527262         
29          1111    2020    4   2019-07-28  TY       66.08187866            
30          1111    2020    3   2019-07-21  TY       9.331124689            
31          1111    2020    2   2019-07-14  TY       61.35972079            
32          1111    2020    1   2019-07-07  TY       68.02729471            
33          1111    2019    52  2019-06-23  LY       65.09319706            
34          1111    2019    51  2019-06-16  LY       46.3647103         
35          1111    2019    50  2019-06-09  LY       45.04742519            
36          1111    2019    49  2019-06-02  LY       10.72003618            
37          1111    2019    48  2019-05-26  LY       69.73143446            
38          1111    2019    47  2019-05-19  LY       8.3106988          
39          1111    2019    46  2019-05-12  LY       18.53940931            
40          1111    2019    45  2019-05-05  LY       78.27473501            
41          1111    2019    44  2019-04-28  LY       4.799286544            
42          1111    2019    43  2019-04-21  LY       94.58933139            
43          1111    2019    42  2019-04-14  LY       27.06059414            
44          1111    2019    41  2019-04-07  LY       53.42375151            
45          1111    2019    40  2019-03-31  LY       22.92189479            
46          1111    2019    39  2019-03-24  LY       95.3449253         
47          1111    2019    38  2019-03-17  LY       62.34159147            
48          1111    2019    37  2019-03-10  LY       73.99278829            
49          1111    2019    36  2019-03-03  LY       39.517017          
50          1111    2019    35  2019-02-24  LY       50.46146309            
51          1111    2019    34  2019-02-17  LY       75.63215281            
52          1111    2019    33  2019-02-10  LY       68.14894689            
53          1111    2019    32  2019-02-03  LY       87.12654065            
54          1111    2019    31  2019-01-27  LY       3.122137476            
55          1111    2019    30  2019-01-20  LY       10.74736963            
56          1111    2019    29  2019-01-13  LY       38.63631908            
57          1111    2019    28  2019-01-06  LY       1.387588554            
58          1111    2019    27  2018-12-30  LY       0.873563634            
59          1111    2019    26  2018-12-23  LY       81.57363107            
60          1111    2019    25  2018-12-16  LY       41.80248501            
61          1111    2019    24  2018-12-09  LY       26.66495665            
62          1111    2019    23  2018-12-02  LY       48.21324987            
63          1111    2019    22  2018-11-25  LY       8.681731452            
64          1111    2019    21  2018-11-18  LY       61.50365765            
65          1111    2019    20  2018-11-11  LY       84.68936948            

【问题讨论】:

【参考方案1】:

不要跳过 53 行,而是去搜索确切的 1 年前日期:

WITH data AS (
  SELECT * 
  FROM `fh-bigquery.weather_gsod.all` 
  WHERE date BETWEEN '2018-12-01' AND '2020-02-24'
  AND name LIKE 'SAN FRANCISCO INTERNATIONAL A'
), main_query AS (
  SELECT name, date, temp
    , ARRAY_AGG(STRUCT(date, temp)) OVER(PARTITION BY name ORDER BY date ROWS BETWEEN 366 PRECEDING AND 310 PRECEDING ) over_array
  FROM data a 
)

SELECT * EXCEPT(over_array)
  , (SELECT temp FROM UNNEST(over_array) WHERE date=DATE_SUB(a.date, INTERVAL 1 year)) prev_year
FROM main_query a
ORDER BY name, date DESC 

我用几天来做这个 - 你可以用几周来做同样的事情。


解决了数周:

WITH data AS (
  SELECT ROUND(AVG(temp),1) temp, DATE_TRUNC(date, week) week, name 
  FROM `fh-bigquery.weather_gsod.all` 
  WHERE date BETWEEN '2018-12-01' AND '2020-02-24'
  AND name LIKE 'SAN FRANCISCO INTERNATIONAL A'
  GROUP BY name, week
), main_query AS (
  SELECT name, week, temp
    , ARRAY_AGG(STRUCT(week, temp)) 
      OVER(PARTITION BY name ORDER BY week 
      ROWS BETWEEN 53 PRECEDING AND 40 PRECEDING ) over_array
  FROM data a 
)

SELECT * EXCEPT(over_array)
  , (SELECT temp FROM UNNEST(over_array) 
     WHERE EXTRACT(YEAR FROM week)+1=EXTRACT(YEAR FROM a.week)
     AND EXTRACT(WEEK FROM week)=EXTRACT(WEEK FROM a.week)) prev_year
FROM main_query a
ORDER BY name, week DESC 

【讨论】:

您好 Felipe,感谢您的回答。我已经尝试过您的代码,它为 prev_year 提供了 NULL。我想这是因为我使用的是财政周结束日期而不是日历日。例如,对于 fy_end_date 2020-02-21,去年的对应周是 2019-02-17(就财务周数而言,它们都是第 34 周),而 BigQuery 会搜索 2019-02-21 周而不是“ t 存在于数据集中,因为它不是周末结束日期。可以修吗?

以上是关于在一行 BigQuery 中获取今年的销售额和去年的销售额(对于同一周数)的主要内容,如果未能解决你的问题,请参考以下文章

在 SQL 中根据日期动态聚合列

电商公布的销售额都增长了,不过电商行业其实在去年已衰退了

如何在 bigquery 中使用 rowid 按日期获取数据集的第一个值,并将给定日期的所有其他值设为 0

45分钟,411个中小品牌天猫双11实现新跨越

sql server 中取出今年和去年每个月的数据

你好,去年你提问的 “如何实现ABAP ALV多表头” 是怎么解决的