Select distinct users grouped by time range - Google BigQuery SQL

【Posted】: 2019-09-20 04:51:10 【Question】:

Select distinct users group by time range

How can I do what the question linked above asks, in Google BigQuery's SQL dialect?

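Update with details:

I have a table with the following columns

 |day| user_id  

For a given date, I want to count the number of distinct user_id values:

    on that day
    in the week up to that date (week to date)
    in the month up to that date (month to date)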

Sample input table:

 | day        | user_id    
 | 2013-01-01 | 1          
 | 2013-01-03 | 3          
 | 2013-01-06 | 4          
 | 2013-01-07 | 4          

Expected output:

 | day        | time_series | cnt |
 | 2013-01-01 | D           | 1   |
 | 2013-01-01 | W           | 1   |
 | 2013-01-01 | M           | 1   |
 | 2013-01-03 | D           | 1   |
 | 2013-01-03 | W           | 2   |
 | 2013-01-03 | M           | 2   |
 | 2013-01-06 | D           | 1   |
 | 2013-01-06 | W           | 1   |
 | 2013-01-06 | M           | 3   |
 | 2013-01-07 | D           | 1   |
 | 2013-01-07 | W           | 1   |
 | 2013-01-07 | M           | 3   |

P.S. The similar question asks about PostgreSQL, but I need a version for BigQuery.

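【Comments】:

So is it bigquery or sql-server? Totally different things!

@MikhailBerlyant Right.. my bad, I meant BigQuery's version of SQL*

I would also recommend stating your specific question in the post itself rather than referencing someone else's question.

I have the same problem, and I am using Google BigQuery too..

【Answer 1】: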

Below is for BigQuery Standard SQL

Option #1

#standardSQL
-- Sample data standing in for the real table
WITH `project.dataset.table` AS (
  SELECT DATE '2013-01-01' day, 1 user_id UNION ALL
  SELECT '2013-01-03', 3 UNION ALL
  SELECT '2013-01-06', 4 UNION ALL
  SELECT '2013-01-07', 4 
)
-- D: distinct users on each day
SELECT day, 'D' series, COUNT(DISTINCT user_id) users 
FROM `project.dataset.table` GROUP BY day 
UNION ALL
-- W: the window collects every user_id from the start of the week up to the current day,
-- then the distinct count is taken over that array
SELECT DISTINCT day, 'W', (SELECT COUNT(DISTINCT id) FROM UNNEST(users) id) 
FROM (
  SELECT day,  ARRAY_AGG(user_id) OVER(PARTITION BY DATE_TRUNC(day, WEEK) ORDER BY day) users
  FROM `project.dataset.table`
)
UNION ALL
-- M: same approach, with the window partitioned by month
SELECT DISTINCT day, 'M', (SELECT COUNT(DISTINCT id) FROM UNNEST(users) id)
FROM (
  SELECT day,  ARRAY_AGG(user_id) OVER(PARTITION BY DATE_TRUNC(day, MONTH) ORDER BY day) users
  FROM `project.dataset.table`
)
-- Order the three series as D, W, M within each day
ORDER BY day, CASE series WHEN 'D' THEN 1 WHEN 'W' THEN 2 ELSE 3 END

Result

Row day         series  users    
1   2013-01-01  D       1    
2   2013-01-01  W       1    
3   2013-01-01  M       1    
4   2013-01-03  D       1    
5   2013-01-03  W       2    
6   2013-01-03  M       2    
7   2013-01-06  D       1    
8   2013-01-06  W       1    
9   2013-01-06  M       3    
10  2013-01-07  D       1    
11  2013-01-07  W       1    
12  2013-01-07  M       3    
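
A note on the 'W' rows: `DATE_TRUNC(day, WEEK)` starts weeks on Sunday, which is why 2013-01-06 (a Sunday) opens a new week in the result above and its week-to-date count drops back to 1. If your weeks should start on Monday, `WEEK(MONDAY)` or `ISOWEEK` can be used as the date part instead. The following is only an illustrative sketch for comparing the week boundaries on the sample dates. The column aliases are made up for the example.

```sql
#standardSQL
-- Compare where each sample date's week starts under different date parts.
SELECT
  day,
  FORMAT_DATE('%A', day) AS weekday,                      -- name of the weekday
  DATE_TRUNC(day, WEEK) AS week_start_sunday,             -- default: weeks begin on Sunday
  DATE_TRUNC(day, WEEK(MONDAY)) AS week_start_monday,     -- weeks begin on Monday
  DATE_TRUNC(day, ISOWEEK) AS iso_week_start              -- ISO 8601 weeks, also Monday-based
FROM UNNEST([
  DATE '2013-01-01', DATE '2013-01-03',
  DATE '2013-01-06', DATE '2013-01-07'
]) AS day
ORDER BY day
```

With Monday-based weeks, 2013-01-06 falls in the same week as 2013-01-01 and 2013-01-03, so its 'W' count would be 3 rather than the 1 shown in the expected output.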

Option #2 - based on the version above, but combining the three queries into one

#standardSQL
SELECT DISTINCT day, d_users,
  (SELECT COUNT(DISTINCT id) FROM UNNEST(w_users) id) w_users,
  (SELECT COUNT(DISTINCT id) FROM UNNEST(m_users) id) m_users
FROM (
  SELECT day,  
    COUNT(DISTINCT user_id) OVER(PARTITION BY day) d_users,
    ARRAY_AGG(user_id) OVER(PARTITION BY DATE_TRUNC(day, WEEK) ORDER BY day) w_users,
    ARRAY_AGG(user_id) OVER(PARTITION BY DATE_TRUNC(day, MONTH) ORDER BY day) m_users
  FROM `project.dataset.table`
)
ORDER BY day  

If applied to the same data, the result is

Row day         d_users w_users m_users  
1   2013-01-01  1       1       1    
2   2013-01-03  1       2       2    
3   2013-01-06  1       1       3    
4   2013-01-07  1       1       3      
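
If you would rather not build arrays in a window, the same wide result can probably also be expressed with a self-join, where "to date" becomes an explicit date range and each count is a conditional `COUNT(DISTINCT ...)`. This is only a sketch under the same sample data. The CTE name `sample_data` and the aliases `d` and `t` are my own assumptions, and I have not compared its performance with the window version (the inequality join may be expensive on large tables).

```sql
#standardSQL
WITH sample_data AS (
  SELECT DATE '2013-01-01' AS day, 1 AS user_id UNION ALL
  SELECT DATE '2013-01-03', 3 UNION ALL
  SELECT DATE '2013-01-06', 4 UNION ALL
  SELECT DATE '2013-01-07', 4
)
SELECT
  d.day,
  -- users seen on the day itself
  COUNT(DISTINCT IF(t.day = d.day, t.user_id, NULL)) AS d_users,
  -- users seen from the start of the (Sunday-based) week up to the day
  COUNT(DISTINCT IF(t.day >= DATE_TRUNC(d.day, WEEK), t.user_id, NULL)) AS w_users,
  -- users seen from the start of the month up to the day
  COUNT(DISTINCT IF(t.day >= DATE_TRUNC(d.day, MONTH), t.user_id, NULL)) AS m_users
FROM (SELECT DISTINCT day FROM sample_data) AS d
JOIN sample_data AS t
  -- keep rows from the earlier of week start and month start, up to the day,
  -- because a week can begin in the previous month
  ON t.day <= d.day
  AND t.day >= LEAST(DATE_TRUNC(d.day, WEEK), DATE_TRUNC(d.day, MONTH))
GROUP BY d.day
ORDER BY d.day
```

The lower bound uses `LEAST` so that the week-to-date count still includes days from the previous month when a week straddles a month boundary, which matches how the window version partitions by week.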

Option #3 - if for some reason you need to unpivot/flatten the result of Option #2

#standardSQL
SELECT day, series, users
FROM (
  SELECT DISTINCT day, d_users,
    (SELECT COUNT(DISTINCT id) FROM UNNEST(w_users) id) w_users,
    (SELECT COUNT(DISTINCT id) FROM UNNEST(m_users) id) m_users
  FROM (
    SELECT day,  
      COUNT(DISTINCT user_id) OVER(PARTITION BY day) d_users,
      ARRAY_AGG(user_id) OVER(PARTITION BY DATE_TRUNC(day, WEEK) ORDER BY day) w_users,
      ARRAY_AGG(user_id) OVER(PARTITION BY DATE_TRUNC(day, MONTH) ORDER BY day) m_users
    FROM `project.dataset.table`
  )
), UNNEST([STRUCT('D' AS series, d_users AS users), ('W', w_users), ('M', m_users)]) 
ORDER BY day, CASE series WHEN 'D' THEN 1 WHEN 'W' THEN 2 ELSE 3 END

which gives the same result as Option #1
