Number of users that came back within 3 days after playing at least three sessions?

【Posted】: 2018-03-06 17:21:34

【Question】:

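I have data containing user, event date, and session. I want to identify users who had at least 3 sessions and then came back for a new session within 3 days.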

    user  eventdate   session
    A     2018-02-05  1
    A     2018-02-05  2
    A     2018-02-06  3
    A     2018-02-10  4

The output should list the users who completed 3 sessions and then returned for a fourth session within 3 days.

I tried the following query, but it does not give me the answer I need.

 SELECT distinct user, MIN(eventdate) startdate, MAX(eventdate) enddate
FROM (SELECT user, eventdate
      FROM (SELECT user, eventdate
              FROM tablename
             where datediff(startdate,enddate)<=3
             ORDER BY user, eventdate) where sessions>=3) t
 GROUP BY user
 ORDER BY user, startdate;

I know the query has many problems, but I just can't figure out how to move forward. Any suggestions?

【Comments】:

What is the expected output? In the case above, all of A's rows would appear. Does your query even compile?

【Answer 1】:

The following is for BigQuery Standard SQL:

#standardSQL
SELECT *
FROM (
  SELECT 
    user, eventdate, sessions_in_a_day, 
    SUM(sessions_in_a_day) OVER(PARTITION BY user ORDER BY eventdate ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) total_sessions_before, 
    DATE_DIFF(eventdate, LAG(eventdate) OVER(PARTITION BY user ORDER BY eventdate), DAY) delay
  FROM (
    SELECT user, eventdate, COUNT(1) sessions_in_a_day 
    FROM t
    GROUP BY user, eventdate
  )
)
WHERE total_sessions_before >= 3
AND delay <= 3
-- ORDER BY user, eventdate

You can test and play with the above using dummy data:

#standardSQL
WITH t AS (
  SELECT 'A' user, DATE '2018-02-05' eventdate, 1 session UNION ALL
  SELECT 'A', DATE '2018-02-05', 2 UNION ALL
  SELECT 'A', DATE '2018-02-06', 3 UNION ALL
  SELECT 'A', DATE '2018-02-06', 4 UNION ALL
  SELECT 'A', DATE '2018-02-09', 5 UNION ALL
  SELECT 'A', DATE '2018-02-09', 6 UNION ALL
  SELECT 'A', DATE '2018-02-10', 7 UNION ALL 
  SELECT 'A', DATE '2018-02-13', 8 
)
SELECT *
FROM (
  SELECT 
    user, eventdate, sessions_in_a_day, 
    SUM(sessions_in_a_day) OVER(PARTITION BY user ORDER BY eventdate ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) total_sessions_before, 
    DATE_DIFF(eventdate, LAG(eventdate) OVER(PARTITION BY user ORDER BY eventdate), DAY) delay
  FROM (
    SELECT user, eventdate, COUNT(1) sessions_in_a_day 
    FROM t
    GROUP BY user, eventdate
  )
)
WHERE total_sessions_before >= 3
AND delay <= 3
ORDER BY user, eventdate  

The result is:

Row  user  eventdate   sessions_in_a_day  total_sessions_before  delay
1    A     2018-02-09  2                  4                      3
2    A     2018-02-10  1                  6                      1
3    A     2018-02-13  1                  7                      3

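With the WHERE clause you can "tune" the query to whatever case you need. The example above shows only those users who had at least 3 sessions before reaching their next session within the following 3 days. If you are only interested in those who had exactly 3 sessions and then reached a fourth, you can add the corresponding filter.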
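To sanity-check the window logic outside BigQuery, here is a minimal pure-Python sketch of the same computation, run against the dummy data from the query above. The function name `returning_rows` and its parameters `min_sessions_before`, `max_delay_days`, and `exact` are illustrative assumptions, not part of the original answer; `exact=True` is one possible reading of the "exactly 3 sessions" filter.

```python
from collections import Counter
from datetime import date


def returning_rows(events, min_sessions_before=3, max_delay_days=3, exact=False):
    """Mirror the query: one output row per (user, active day) that passes the filter.

    events: iterable of (user, eventdate, session) tuples.
    Returns (user, eventdate, sessions_in_a_day, total_sessions_before, delay) tuples.
    """
    # Inner query: COUNT(1) ... GROUP BY user, eventdate
    per_day = Counter((user, day) for user, day, _session in events)

    rows = []
    prev_user = prev_day = None
    running_total = 0
    for (user, day), sessions_in_a_day in sorted(per_day.items()):
        if user != prev_user:
            # New partition: no earlier sessions and no previous day yet.
            prev_user, prev_day, running_total = user, None, 0
        if prev_day is not None:  # the first day per user is NULL in SQL and drops out
            delay = (day - prev_day).days
            if exact:
                enough = running_total == min_sessions_before
            else:
                enough = running_total >= min_sessions_before
            if enough and delay <= max_delay_days:
                rows.append((user, day, sessions_in_a_day, running_total, delay))
        running_total += sessions_in_a_day
        prev_day = day
    return rows


# Same dummy data as the WITH t AS (...) block.
events = [
    ("A", date(2018, 2, 5), 1), ("A", date(2018, 2, 5), 2),
    ("A", date(2018, 2, 6), 3), ("A", date(2018, 2, 6), 4),
    ("A", date(2018, 2, 9), 5), ("A", date(2018, 2, 9), 6),
    ("A", date(2018, 2, 10), 7), ("A", date(2018, 2, 13), 8),
]

rows = returning_rows(events)
exact_rows = returning_rows(events, exact=True)
user_count = len({row[0] for row in rows})  # distinct users, as the title asks
```

Like the query, this sketch works at day granularity: several sessions on the same day collapse into one row, so a fourth session played on the same day as the third does not register as a return.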

【Discussion】:

【Answer 2】:
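-- Users with at least 3 session rows.
-- Note: depending on the SQL dialect, the column name "user" may need quoting.
WITH Sess AS
(
  SELECT user
  FROM tablename
  GROUP BY user
  HAVING COUNT(session) >= 3
)

-- Join on user (not on session), then keep users whose first and last
-- event dates differ but are no more than 3 days apart.
SELECT t.user
FROM tablename t
JOIN Sess ON t.user = Sess.user
GROUP BY t.user
HAVING DATEDIFF(day, MIN(t.eventdate), MAX(t.eventdate)) <= 3
   AND MIN(t.eventdate) <> MAX(t.eventdate)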
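
The query above appears intended to select users with at least 3 session rows whose first and last event dates differ but lie within 3 days of each other. A minimal pure-Python sketch of that reading follows; the function name `span_based_users`, its parameters, and user `B` are hypothetical and only serve as illustration.

```python
from datetime import date


def span_based_users(events, min_sessions=3, max_span_days=3):
    """Users with >= min_sessions rows whose first and last event dates
    differ but are at most max_span_days apart."""
    stats = {}  # user -> [session_count, first_day, last_day]
    for user, day, _session in events:
        entry = stats.setdefault(user, [0, day, day])
        entry[0] += 1
        entry[1] = min(entry[1], day)
        entry[2] = max(entry[2], day)
    return sorted(
        user
        for user, (count, first, last) in stats.items()
        if count >= min_sessions
        and first != last
        and (last - first).days <= max_span_days
    )


events = [
    # User A: the sample rows from the question (4 sessions over 5 days).
    ("A", date(2018, 2, 5), 1), ("A", date(2018, 2, 5), 2),
    ("A", date(2018, 2, 6), 3), ("A", date(2018, 2, 10), 4),
    # User B (hypothetical): 3 sessions over 2 days, no fourth session.
    ("B", date(2018, 2, 5), 1), ("B", date(2018, 2, 5), 2),
    ("B", date(2018, 2, 7), 3),
]

result = span_based_users(events)
```

Under this reading, `B` qualifies even without a fourth session, because the check looks at the span of all of a user's activity rather than at the gap before the session that follows the third one. Whether that matches the requirement depends on how "came back" is defined.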

【Discussion】:
