Google BigQuery SQL: Calculate users that come from other shops
【Posted】: 2021-03-29 13:07:40
【Question】: I need to count, per shop, the unique users whose first visit was at a different shop. I have two tables.
Visits:
ShopID UserID
10 1001
11 1002
12 1001
13 1002
14 1001
15 1003
16 1005
17 1002
18 1003
10 1005
11 1003
12 1002
13 1005
and First Visit:
UserID First ShopID
1001 10
1002 13
1003 18
1005 16
The required output is:
ShopID Total Users from other shops
10 0
11 2
12 2
13 1
14 1
15 1
16 0
17 1
18 0
I can calculate this for a single ShopID, but not dynamically for every ShopID:
SELECT
  shopid,
  COUNT(DISTINCT UserID) AS TOTAL_USERS
FROM project.dataset.table_visits
WHERE shopid = 12
  AND UserID IN (
    SELECT UserID
    FROM project.dataset.table_first_visit
    WHERE shopid <> 12
  )
GROUP BY shopid
How can I do this dynamically for every ShopID?
【Answer 1】: Try this:
with visits as (
  select 10 as shopid, 1001 as userid union all
  select 11, 1002 union all
  select 12, 1001 union all
  select 13, 1002 union all
  select 14, 1001 union all
  select 15, 1003 union all
  select 16, 1005 union all
  select 17, 1002 union all
  select 18, 1003 union all
  select 10, 1005 union all
  select 11, 1003 union all
  select 12, 1002 union all
  select 13, 1005
),
first_visit as (
  select 1001 as userid, 10 as first_shopid union all
  select 1002, 13 union all
  select 1003, 18 union all
  select 1005, 16
)
select
  shopid,
  count(distinct if(shopid != first_shopid, userid, null)) as users_from_other_shop
from visits join first_visit using(userid)
group by shopid
order by shopid
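The query works because `if(shopid != first_shopid, userid, null)` returns NULL for visits to the user's own first shop, and `count(distinct ...)` ignores NULLs. As a rough sketch, the same logic pointed at the tables named in the question might look like the following. The column names `shopid`, `userid` and `first_shopid` are assumptions carried over from the sample CTEs above; the real tables may name them differently.

```sql
-- Sketch only: same logic as the sample-data query, reading from the question's tables.
-- Assumed columns: table_visits(shopid, userid), table_first_visit(userid, first_shopid).
select
  v.shopid,
  -- NULL for visits to the user's first shop; count(distinct ...) skips NULLs
  count(distinct if(v.shopid != fv.first_shopid, v.userid, null)) as users_from_other_shop
from `project.dataset.table_visits` as v
join `project.dataset.table_first_visit` as fv
  on fv.userid = v.userid
group by v.shopid
order by v.shopid
```

Because this is an inner join, a user with no row in the first-visit table drops out of the result entirely. A shop visited only by users who started there still appears, with a count of 0.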
【Answer 2】: Hmm... I think you want a left join and aggregation:
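select v.shopid,
       count(*) as total_visits,
       count(distinct v.userId) as total_users,
       -- unmatched rows are visits to a shop other than the user's first shop
       count(distinct case when fv.userId is null then v.userId end) as total_users_from_other_shops
from `project.dataset.table_visits` v left join
     `project.dataset.table_first_visit` fv
     -- match on the shop as well as the user; first_shopid is an assumed column name
     on fv.userId = v.userId and fv.first_shopid = v.shopid
group by v.shopid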
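For the `fv.userId is null` test to single out users from other shops, the join has to match on the shop as well as the user. If it matches on the user alone, every user with a first-visit record finds a match and the null case never occurs. A minimal sketch on the question's sample data, listing the unmatched (shop, user) pairs before any aggregation, is shown below. The CTE names and the `first_shopid` column are illustrative assumptions.

```sql
-- Sketch: list the (shop, user) pairs the left join leaves unmatched,
-- i.e. visits to a shop that is not the user's first shop.
-- Sample rows are copied from the question; names are illustrative.
with visits as (
  select * from unnest([
    struct(10 as shopid, 1001 as userid),
    (11, 1002), (12, 1001), (13, 1002), (14, 1001), (15, 1003), (16, 1005),
    (17, 1002), (18, 1003), (10, 1005), (11, 1003), (12, 1002), (13, 1005)
  ])
),
first_visit as (
  select * from unnest([
    struct(1001 as userid, 10 as first_shopid),
    (1002, 13), (1003, 18), (1005, 16)
  ])
)
select distinct v.shopid, v.userid
from visits as v
left join first_visit as fv
  on fv.userid = v.userid
 and fv.first_shopid = v.shopid
where fv.userid is null  -- no first-visit row for this exact (user, shop) pair
order by v.shopid, v.userid
```

Inspecting the pairs this way makes it easier to check individual shops by hand. Note that the `where` filter drops shops with no such users, which is why the aggregated query counts with a `case` expression rather than filtering.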
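【Answer 3】: Consider the join-free solution below (I would expect it to be more efficient in execution time, and an order of magnitude more efficient in slot consumption):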
select shopid, sum(flag) users_from_other_shop
from (
  select distinct shopid, userid, 1 flag
  from `project.dataset.table_visits`
  union all
  select distinct first_shopid, userid, -1
  from `project.dataset.table_first_visit`
)
group by shopid
The query can be applied directly to the sample data in your question.
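The idea is that each distinct (shop, user) visit contributes +1 and each (first shop, user) pair contributes -1, so a user cancels out only at their own first shop. This relies on every first-visit pair also appearing in the visits table, which holds for the sample data. A self-contained sketch for trying it, with the question's rows inlined (the CTE names and the `first_shopid` column are assumptions):

```sql
-- Sketch: the union-all approach run against the question's sample rows.
with visits as (
  select * from unnest([
    struct(10 as shopid, 1001 as userid),
    (11, 1002), (12, 1001), (13, 1002), (14, 1001), (15, 1003), (16, 1005),
    (17, 1002), (18, 1003), (10, 1005), (11, 1003), (12, 1002), (13, 1005)
  ])
),
first_visit as (
  select * from unnest([
    struct(1001 as userid, 10 as first_shopid),
    (1002, 13), (1003, 18), (1005, 16)
  ])
)
select shopid, sum(flag) as users_from_other_shop
from (
  -- +1 for every distinct user seen at a shop
  select distinct shopid, userid, 1 as flag
  from visits
  union all
  -- -1 for the user at their own first shop
  select distinct first_shopid, userid, -1
  from first_visit
)
group by shopid
order by shopid
```

Working through the sample rows by hand, shops 11 and 12 come out at 2, shops 16 and 18 at 0, and all other shops at 1. That includes shop 10, where user 1005 (first seen at shop 16) counts as coming from another shop, whereas the expected table in the question lists 0 for that shop.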