Fix the MAU problem while calculating DAU and MAU on Amazon Redshift
Posted: 2018-12-18 18:40:34
Question: I am using the following query to calculate MAU and DAU, based on this post:
WITH dau AS
(
  SELECT TRUNC(created_at) AS created_at,
         COUNT(DISTINCT member_id) AS dau
  FROM table ds
  WHERE ds.created_at BETWEEN '2018-09-03' AND '2018-09-08'
  GROUP BY TRUNC(created_at)
)
SELECT created_at,
       dau,
       (SELECT COUNT(DISTINCT member_id)
        FROM table ds
        WHERE ds.created_at BETWEEN created_at - 29*INTERVAL '1 day' AND created_at) AS mau
FROM dau
ORDER BY created_at
I ran this query and got the following results:
2018-09-03 12844 3976132
2018-09-04 54236 3976132
2018-09-05 58631 3976132
2018-09-06 59786 3976132
2018-09-07 52317 3976132
2018-09-08 4 3976132
As you can see, the MAU column contains the same value in every row. How can I fix this? Any pointers would be helpful.
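Answer 1: You should qualify the column names with the table alias. Inside the subquery, an unqualified created_at resolves to ds.created_at rather than to the outer dau.created_at, so the BETWEEN condition holds for every row and the subquery counts every member in the table: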
WITH dau AS
(
  SELECT TRUNC(created_at) AS created_at,
         COUNT(DISTINCT member_id) AS dau
  FROM table ds
  WHERE ds.created_at BETWEEN '2018-09-03' AND '2018-09-08'
  GROUP BY TRUNC(created_at)
)
SELECT created_at,
       dau,
       (SELECT COUNT(DISTINCT member_id)
        FROM table ds
        WHERE ds.created_at
              BETWEEN dau.created_at - 29*INTERVAL '1 day' AND dau.created_at) AS mau  -- here
FROM dau
ORDER BY created_at
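To see the scoping rule in isolation, here is a minimal sketch that reproduces the problem outside Redshift. It uses Python's built-in sqlite3 module as a stand-in, so the table name events, the sample rows, and SQLite's date(x, '-29 days') (in place of Redshift's INTERVAL arithmetic) are all assumptions made for illustration, not part of the original query.

```python
import sqlite3

# Hypothetical miniature of the events table (names and rows are illustrative).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (member_id INTEGER, created_at TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [
        (1, "2018-08-01"),  # older than 30 days for both report days
        (4, "2018-08-20"),  # inside the 30-day window of both report days
        (2, "2018-09-03"),
        (3, "2018-09-04"),
        (2, "2018-09-04"),
    ],
)

# Shared CTE: one row per day with its distinct-member count.
DAU_CTE = """
WITH dau AS (
    SELECT created_at, COUNT(DISTINCT member_id) AS dau
    FROM events
    WHERE created_at BETWEEN '2018-09-03' AND '2018-09-04'
    GROUP BY created_at
)
"""

# Unqualified created_at inside the subquery binds to ds.created_at,
# so the BETWEEN condition is true for every row of events.
unqualified = conn.execute(DAU_CTE + """
SELECT created_at, dau,
       (SELECT COUNT(DISTINCT member_id)
        FROM events ds
        WHERE ds.created_at
              BETWEEN date(created_at, '-29 days') AND created_at) AS mau
FROM dau
ORDER BY created_at
""").fetchall()

# Qualifying with dau. correlates the subquery with the outer row.
qualified = conn.execute(DAU_CTE + """
SELECT created_at, dau,
       (SELECT COUNT(DISTINCT member_id)
        FROM events ds
        WHERE ds.created_at
              BETWEEN date(dau.created_at, '-29 days') AND dau.created_at) AS mau
FROM dau
ORDER BY created_at
""").fetchall()

print(unqualified)
print(qualified)
```

With these sample rows, the unqualified version reports mau = 4 (every member in the table) on both days, while the qualified version reports 2 and 3.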
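Or, with an aggregate FILTER clause (untested; as written, the filter condition compares each row's date with itself, so it is always true and mau will equal dau):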
SELECT TRUNC(created_at) AS created_at,
       COUNT(DISTINCT member_id) AS dau,
       COUNT(DISTINCT member_id)
         FILTER(WHERE TRUNC(created_at)>=TRUNC(created_at)-29*INTERVAL '1 day') AS mau
FROM table ds
WHERE ds.created_at BETWEEN '2018-09-03' AND '2018-09-08'
GROUP BY TRUNC(created_at)
ORDER BY created_at
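A rolling 30-day count needs rows from before each reported day, which a single GROUP BY over the date-restricted rows never sees. One possible alternative without a correlated subquery is to join each reported day to the events in its window. The sketch below is only an illustration under stated assumptions: it runs on Python's built-in sqlite3 rather than Redshift, and the events table, the sample rows, and SQLite's date(x, '-29 days') are hypothetical stand-ins.

```python
import sqlite3

# Hypothetical miniature of the events table (names and rows are illustrative).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (member_id INTEGER, created_at TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [
        (1, "2018-08-01"),
        (4, "2018-08-20"),
        (2, "2018-09-03"),
        (3, "2018-09-04"),
        (2, "2018-09-04"),
    ],
)

rows = conn.execute("""
SELECT d.day,
       -- members active on the day itself (CASE yields NULL otherwise,
       -- and COUNT ignores NULLs)
       COUNT(DISTINCT CASE WHEN e.created_at = d.day
                           THEN e.member_id END) AS dau,
       -- members active anywhere in the 30-day window ending on that day
       COUNT(DISTINCT e.member_id) AS mau
FROM (SELECT DISTINCT created_at AS day
      FROM events
      WHERE created_at BETWEEN '2018-09-03' AND '2018-09-04') AS d
JOIN events AS e
  ON e.created_at BETWEEN date(d.day, '-29 days') AND d.day
GROUP BY d.day
ORDER BY d.day
""").fetchall()

print(rows)
```

The date range is applied only to the list of reported days, not to the joined events, so earlier activity still contributes to each day's mau.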
Comments:
That was a really silly mistake on my part. Thank you very much for pointing it out :).
@Patthebug Great. Please also check whether the FILTER clause returns the same result.