Query to find the number of paying customers and churned customers?
Posted: 2015-11-23 17:24:35
Question: I have a table, paid_users, that looks like this:
http://sqlfiddle.com/#!15/d25ba
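I am trying to determine the number of new paying customers and the number of churned customers, each grouped by month-year. Essentially, there are payors and users. A payor is the person who pays for a particular user. If there is no payment_stop_date, the payor is still paying for that user. payment_stop_date indicates if and when the payor stopped paying for the user.
I'd like to find out how many paying customers there are. The result of the query should be: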
Month-Year | New Paying Customers | Churned Paying Customers
------------------------------------------------------------
11-2014 | 1 |
12-2014 | | 1
01-2015 | 1 |
04-2015 | |
06-2015 | 2 |
07-2015 | 1 |
10-2015 | | 1
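Take payor_id 3453: she started paying for user_id 3182 in 11-2014, so she is counted in the 11-2014 group. However, she stopped paying for both of her users in 12-2014, so she is counted as churned in 12-2014. A payor is considered a churned paying customer once they stop paying us entirely (i.e., they could have been paying for one person and then cancelled, or, as in this case, payor_id 3453 was paying for 2 users and then cancelled both). payor_id 3453 then started paying for user_id 4716 in 01-2015, so she is counted again in the 01-2015 group.
I'm having a hard time writing a query for this because it is not simply a count of distinct payor_ids: payor_id 3453 is counted as a new paying customer twice.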
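Answer 1: Not sure I understood correctly: for each month, you want to know how many customers started paying for their first user and how many stopped paying for their last user?
The solution looks rather complicated, but maybe the problem just isn't that simple.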
-- months: every month since 2014-06-01, paired with every paid_users row
with months as
(
    select * from
        generate_series('2014-06-01', now() at time zone 'utc', interval '1 month') as month
        cross join paid_users
)
-- sums: per payor and month, the users started, the users stopped, and the
-- running number of users the payor is paying for
, sums as
(
    select month, payor_id, joiners, leavers,
           sum(net) over (partition by payor_id order by month) as sum
    from
    (
        select month, payor_id, joiners, leavers,
               coalesce(joiners, 0) - coalesce(leavers, 0) as net
        from
        (
            -- users each payor started paying for in this month
            select payor_id, month, count(*) as joiners
            from months
            where payment_start_date >= month
              and payment_start_date < month + interval '1 month'
            group by month, payor_id
        ) as t
        full join
        (
            -- users each payor stopped paying for in this month
            select payor_id, month, count(*) as leavers
            from months
            where payment_stop_date >= month
              and payment_stop_date < month + interval '1 month'
            group by month, payor_id
        ) as u
        using (month, payor_id)
    ) as v
)
select * from sums
order by payor_id, month
The above should give you, for each payor, the running total of users they are paying for:
month | payor_id | joiners | leavers | sum
---------------------+----------+---------+---------+-----
2014-06-01 00:00:00 | 1725 | 1 | | 1
2014-06-01 00:00:00 | 1929 | 1 | | 1
2015-10-01 00:00:00 | 1929 | | 1 | 0
2014-06-01 00:00:00 | 1986 | 1 | | 1
2014-11-01 00:00:00 | 3453 | 2 | | 2
2014-12-01 00:00:00 | 3453 | | 2 | 0
2015-01-01 00:00:00 | 3453 | 1 | | 1
2015-03-01 00:00:00 | 3453 | 1 | | 2
2015-04-01 00:00:00 | 3453 | 2 | 1 | 3
2015-05-01 00:00:00 | 3453 | | 1 | 2
2015-06-01 00:00:00 | 3453 | | 1 | 1
2015-10-01 00:00:00 | 3453 | 1 | | 2
2015-07-01 00:00:00 | 6499 | 1 | | 1
2015-08-01 00:00:00 | 6499 | 3 | | 4
2015-10-01 00:00:00 | 6499 | | 1 | 3
2015-11-01 00:00:00 | 6499 | | 1 | 2
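To make the running total easier to follow, here is a minimal Python sketch of what the window function does for a single payor. The joiners and leavers figures are copied from the rows for payor_id 3453 above; the names events_3453 and running_active_users are only illustrative and are not part of the query.

```python
from itertools import accumulate

# (month, joiners, leavers) for payor_id 3453, copied from the result above.
# Blank cells in the result are written as 0 here.
events_3453 = [
    ("2014-11", 2, 0),
    ("2014-12", 0, 2),
    ("2015-01", 1, 0),
    ("2015-03", 1, 0),
    ("2015-04", 2, 1),
    ("2015-05", 0, 1),
    ("2015-06", 0, 1),
    ("2015-10", 1, 0),
]


def running_active_users(events):
    """Return [(month, active_users)], the cumulative sum of joiners - leavers."""
    ordered = sorted(events)  # 'YYYY-MM' strings sort chronologically
    nets = [joiners - leavers for _, joiners, leavers in ordered]
    return [(month, total) for (month, _, _), total in zip(ordered, accumulate(nets))]


running = running_active_users(events_3453)
for month, total in running:
    print(month, total)
```

The totals match the sum column for payor 3453: the count falls to 0 in 2014-12 and rises above 0 again in 2015-01.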
So a new customer is one whose total goes from 0 to a non-zero value, and a churned customer is one whose total drops to 0?
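-- Run this with the months and sums CTEs from the first query,
-- in place of its final "select * from sums".
select month, new, churned from
(
    (
        -- churned: the payor's running total dropped to 0 in this month
        select month, count(*) as churned
        from sums
        where sum = 0
        group by month
    ) as l
    full join
    (
        -- new: the running total was 0 (or had no earlier row) and is now positive
        select month, count(*) as new
        from (
            select month, payor_id, sum,
                   coalesce(lag(sum) over (partition by payor_id order by month), 0) as prev_sum
            from sums
            order by payor_id, month
        ) as t
        where prev_sum = 0 and sum > 0
        group by month
    ) as r
    using (month)
)
order by month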
Output:
month | new | churned
---------------------+-----+---------
2014-06-01 00:00:00 | 3 |
2014-11-01 00:00:00 | 1 |
2014-12-01 00:00:00 | | 1
2015-01-01 00:00:00 | 1 |
2015-07-01 00:00:00 | 1 |
2015-10-01 00:00:00 | | 1
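As a cross-check outside the database, the same transition rule can be sketched in a few lines of Python. This is only an illustration: the rows are copied from the per-payor running totals shown earlier, and new_and_churned is a made-up helper name, not something the query depends on.

```python
from collections import Counter
from itertools import groupby

# (payor_id, month, running_sum) rows copied from the per-payor result above.
sums = [
    (1725, "2014-06", 1),
    (1929, "2014-06", 1), (1929, "2015-10", 0),
    (1986, "2014-06", 1),
    (3453, "2014-11", 2), (3453, "2014-12", 0), (3453, "2015-01", 1),
    (3453, "2015-03", 2), (3453, "2015-04", 3), (3453, "2015-05", 2),
    (3453, "2015-06", 1), (3453, "2015-10", 2),
    (6499, "2015-07", 1), (6499, "2015-08", 4), (6499, "2015-10", 3),
    (6499, "2015-11", 2),
]


def new_and_churned(rows):
    """Count, per month, payors going 0 -> positive (new) and payors reaching 0 (churned)."""
    new, churned = Counter(), Counter()
    # Sorting orders rows by payor_id, then month, so each group is chronological.
    for _, payor_rows in groupby(sorted(rows), key=lambda row: row[0]):
        prev = 0  # plays the role of coalesce(lag(sum) ..., 0)
        for _, month, total in payor_rows:
            if prev == 0 and total > 0:
                new[month] += 1
            if total == 0:
                churned[month] += 1
            prev = total
    return new, churned


new, churned = new_and_churned(sums)
for month in sorted(set(new) | set(churned)):
    print(month, new.get(month, ""), churned.get(month, ""))
```

With these rows the counts agree with the output above, including payor 3453 being counted as new in both 2014-11 and 2015-01.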
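Hope this helps. If anyone knows a simpler way, I'd be glad to hear it.
Comments:
This seems very reasonable, and you definitely explained the idea behind the problem better than I did. Thank you so much for your help! The numbers look right and the logic makes sense :)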