BigQuery | Assigning numbers to rows based on running sum


【Title】: BigQuery | Assigning numbers to rows based on running sum 【Posted】: 2021-06-04 22:09:30 【Question】:

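I have a table that lists my volunteers, with a column for the number of postcards each of them has decided to write. I am trying to write a query that returns one row per requested postcard, so that every postcard is assigned to a volunteer.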

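For example, if Volunteer A requested 3 postcards, Volunteer B requested 1 postcard, and Volunteer C requested 2 postcards, I would like my query to return something like this: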

rowNumber  volunteer    postcardsAssigned
1          Volunteer A  1
2          Volunteer A  2
3          Volunteer A  3
4          Volunteer B  4
5          Volunteer C  5
6          Volunteer C  6

Attached below is an image showing my current query output and what I am trying to make it look like.

With postcards_vols as (SELECT What_s_your_name_ as Name
, What_s_your_email_ as Email
, What_s_your_phone_number_ as Phonenumber 
, Mailing_Street_Address as StreetAddress
, Mailing_City as City
, Mailing_Zip_code as Zip 
, Mailing_State as State
, How_many_cards_would_you_like_us_to_send_ as Postcards_Requested
, ROW_NUMBER() OVER() AS Request_Number 
, Submitted_At as date_requested 
FROM volunteer_program.postcard_volunteers)

SELECT * 
, SUM (Postcards_requested) OVER (PARTITION BY request_number, date_requested ORDER BY date_requested DESC) AS addresses_assigned,
FROM 
postcards_vols

[Current Query Output and Sample Desired Query Output][1]


  [1]: https://i.stack.imgur.com/dGssK.png

* Names and addresses shown in the picture are fictitious

【Comments】:

【Answer 1】:

You can use an array to generate the extra rows you need, and then unnest it:

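-- generate_array(1, n) builds [1, ..., n]; the cross join with unnest
-- repeats each volunteer's row once per requested postcard.
select pv.*, num,
       -- running postcard number across all volunteers
       row_number() over (order by pv.Name, num) as postcardsassigned
from `postcards_vols` pv cross join 
     unnest(generate_array(1, pv.Postcards_Requested)) num;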

num enumerates the rows generated for each volunteer, starting at 1.
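To see why the array trick produces the desired numbering, here is a minimal Python model of the same logic. It is only a sketch and not BigQuery code: `expand_requests` and `Assignment` are hypothetical names, and it assumes the array length is the number of postcards each volunteer requested. The list comprehension stands in for the cross join with the unnested array, and sorting followed by `enumerate` stands in for `row_number() over (order by ...)`.

```python
from typing import NamedTuple


class Assignment(NamedTuple):
    postcards_assigned: int  # running number across all volunteers
    volunteer: str
    num: int                 # per-volunteer enumeration, starts at 1


def expand_requests(requests: list[tuple[str, int]]) -> list[Assignment]:
    """Return one row per requested postcard, numbered across all volunteers."""
    # Models the cross join with unnest(generate_array(1, n)):
    # each volunteer is repeated once per requested postcard.
    expanded = [
        (volunteer, num)
        for volunteer, requested in requests
        for num in range(1, requested + 1)
    ]
    # Models row_number() over (order by volunteer, num):
    # sort the rows, then number them starting at 1.
    expanded.sort()
    return [
        Assignment(i, volunteer, num)
        for i, (volunteer, num) in enumerate(expanded, start=1)
    ]


rows = expand_requests([("Volunteer A", 3), ("Volunteer B", 1), ("Volunteer C", 2)])
for row in rows:
    print(row.postcards_assigned, row.volunteer, row.num)
```

With the sample data from the question this yields six rows, numbered 1 through 6, with Volunteer A holding the first three. A volunteer who requested 0 postcards produces no rows at all, which matches how a cross join against an empty array behaves.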

【Comments】:

【Answer 2】:

Thanks! I hadn't used a cross join in a while. This seems to do the trick:

with postcards_vols as (SELECT What_s_your_name_  as Name
, What_s_your_email_ as Email
, What_s_your_phone_number_ as Phonenumber 
, Mailing_Street_Address as StreetAddress
, Mailing_City as City
, Mailing_Zip_code as Zip 
, Mailing_State as State
, How_many_cards_would_you_like_us_to_send_ as Postcards_Requested
, ROW_NUMBER() OVER() AS Request_Number 
, Submitted_At as date_requested 
FROM `noble-hangar-313121.volunteer_program.postcard_volunteers`),

distribution AS (
SELECT pv.*, num,
       row_number() OVER (ORDER BY name, num) AS postcardsassigned
FROM `postcards_vols` pv CROSS JOIN
     unnest(generate_array(1, pv.Postcards_Requested)) num
     )
     
 SELECT request_number as packet_number
, Name as volunteer_name
, Email as volunteer_email
, Phonenumber as volunteer_phone
, StreetAddress as volunteer_streetAddress
, City as volunteer_city
, Zip as volunteer_zip
, State as volunteer_state
, Address as postcard_address
, city_State as postcard_citystate
, zipcode as postcard_zipcode 


 FROM
 distribution d
 LEFT JOIN 
 field_report.postcard_address_distribution
 ON number = postcardsassigned

     GROUP BY request_number, 1, 2,3,4,5,6,7, 8, 9, 10, 11
     ORDER BY packet_number
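The final step in the query above matches each running postcard number to a numbered address with a LEFT JOIN. As a rough illustration of what that join does, here is a small Python sketch. The function name `left_join_addresses` and the sample addresses are hypothetical, and the sketch assumes every address number appears only once in the address table.

```python
from typing import Optional


def left_join_addresses(
    postcard_numbers: list[int],
    addresses: dict[int, str],
) -> list[tuple[int, Optional[str]]]:
    """Pair each postcard number with its address; None plays the role of SQL NULL."""
    # Every postcard number is kept, as with a LEFT JOIN; numbers without
    # a matching address get None instead of being dropped.
    return [(n, addresses.get(n)) for n in postcard_numbers]


# Hypothetical sample data: six postcards requested, only four addresses available.
addresses = {1: "Address 1", 2: "Address 2", 3: "Address 3", 4: "Address 4"}
matched = left_join_addresses([1, 2, 3, 4, 5, 6], addresses)
for number, address in matched:
    print(number, address)
```

The point of the sketch is the unmatched case: if volunteers request more postcards than there are addresses, the extra rows still appear, with empty address columns. An inner join would silently drop them instead.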

【Comments】:
