Postgres - Unique values for id column using CTE, Joins alongside GROUP BY
Posted: 2021-10-07 15:23:36
Question: I have a table, referrals:
id | user_id_owner | firstname | is_active | user_type | referred_at
----+---------------+-----------+-----------+-----------+-------------
3 | 2 | c | t | agent | 3
5 | 3 | e | f | customer | 5
4 | 1 | d | t | agent | 4
2 | 1 | b | f | agent | 2
1 | 1 | a | t | agent | 1
and another table, activations:
id | user_id_owner | referral_id | amount_earned | activated_at | app_id
----+---------------+-------------+---------------+--------------+--------
2 | 2 | 3 | 3.0 | 3 | a
4 | 1 | 1 | 6.0 | 5 | b
5 | 4 | 4 | 3.0 | 6 | c
1 | 1 | 2 | 2.0 | 2 | b
3 | 1 | 2 | 5.0 | 4 | b
6 | 1 | 2 | 7.0 | 8 | a
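I am trying to generate another table from these two tables that has only unique values of referrals.id and returns the count for each app as one of its columns, named best_selling_app_count.
This is the query I ran: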
with agents as (
    -- every referral owned by user 1, paired with its activations (if any)
    select
        referrals.id,
        referral_id,
        amount_earned,
        referred_at,
        activated_at,
        activations.app_id
    from referrals
    left outer join activations
        on (referrals.id = activations.referral_id)
    where referrals.user_id_owner = 1
),
distinct_referrals_by_id as (
    -- one row per referral
    select
        id,
        count(referral_id) as activations_count,
        sum(coalesce(amount_earned, 0)) as amount_earned,
        referred_at,
        max(activated_at) as last_activated_at
    from agents
    group by id, referred_at
),
distinct_referrals_by_app_id as (
    -- one row per (referral, app) pair, so a referral with two apps appears twice
    select
        id,
        app_id as best_selling_app,
        count(app_id) as best_selling_app_count
    from agents
    group by id, app_id
)
select *, dense_rank() over (order by best_selling_app_count desc) best_selling_app_rank
from distinct_referrals_by_id
inner join distinct_referrals_by_app_id
    on (distinct_referrals_by_id.id = distinct_referrals_by_app_id.id);
This is the result I got:
id | activations_count | amount_earned | referred_at | last_activated_at | id | best_selling_app | best_selling_app_count | best_selling_app_rank
----+-------------------+---------------+-------------+-------------------+----+------------------+------------------------+-----------------------
2 | 3 | 14.0 | 2 | 8 | 2 | b | 2 | 1
1 | 1 | 6.0 | 1 | 5 | 1 | b | 1 | 2
2 | 3 | 14.0 | 2 | 8 | 2 | a | 1 | 2
4 | 1 | 3.0 | 4 | 6 | 4 | c | 1 | 2
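The problem with this result is that id 2 appears twice. I need only unique values in the id column.
I tried a workaround using distinct on, which gave the desired result, but I am worried that the query results may not be reliable and consistent.
Here is the workaround query: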
with agents as (
    -- every referral owned by user 1, paired with its activations (if any)
    select
        referrals.id,
        referral_id,
        amount_earned,
        referred_at,
        activated_at,
        activations.app_id
    from referrals
    left outer join activations
        on (referrals.id = activations.referral_id)
    where referrals.user_id_owner = 1
),
distinct_referrals_by_id as (
    -- one row per referral
    select
        id,
        count(referral_id) as activations_count,
        sum(coalesce(amount_earned, 0)) as amount_earned,
        referred_at,
        max(activated_at) as last_activated_at
    from agents
    group by id, referred_at
),
distinct_referrals_by_app_id as (
    -- keep only the first (id, app_id) row per id, i.e. the app with the highest count
    select
        distinct on (id) id,
        app_id as best_selling_app,
        count(app_id) as best_selling_app_count
    from agents
    group by id, app_id
    order by id, best_selling_app_count desc
)
select *, dense_rank() over (order by best_selling_app_count desc) best_selling_app_rank
from distinct_referrals_by_id
inner join distinct_referrals_by_app_id
    on (distinct_referrals_by_id.id = distinct_referrals_by_app_id.id);
I would appreciate advice on how best to achieve this.
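Answer 1:
"I am trying to generate another table from these two tables that has only unique values of referrals.id and returns the count for each app as one of its columns, named best_selling_app_count."
Your question is really complicated, with a very intimidating SQL query. However, the sentence quoted above looks like the actual question. If so, you can use: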
select r.*,
       a.app_id as most_common_app_id,
       a.cnt as most_common_app_id_count
from referrals r
left join (
    select distinct on (a.referral_id)
           a.referral_id, a.app_id, count(*) as cnt
    from activations a
    group by a.referral_id, a.app_id
    order by a.referral_id, count(*) desc
) a on a.referral_id = r.id;
You have not explained the other columns in the result set.
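If the remaining columns are the per-referral aggregates from the question's distinct_referrals_by_id CTE, the two pieces could be combined roughly as follows. This is only a sketch under that assumption: the CTE names per_referral and top_app are made up for illustration, and the extra app_id tie-breaker in the ORDER BY is an assumption about which app should win when two apps have the same count. DISTINCT ON keeps the first row of each group in ORDER BY order, so the result is deterministic only when the ORDER BY fully orders the rows within each referral_id, which is what the tie-breaker is for.

```sql
-- Sketch: one row per referral owned by user 1, with its aggregates
-- and its most frequently activated app.
with per_referral as (
    -- aggregates per referral; the left join keeps referrals without activations
    select r.id,
           r.referred_at,
           count(a.id)                       as activations_count,
           coalesce(sum(a.amount_earned), 0) as amount_earned,
           max(a.activated_at)               as last_activated_at
    from referrals r
    left join activations a on a.referral_id = r.id
    where r.user_id_owner = 1
    group by r.id, r.referred_at
),
top_app as (
    -- DISTINCT ON keeps the first row per referral_id in ORDER BY order
    select distinct on (a.referral_id)
           a.referral_id,
           a.app_id as best_selling_app,
           count(*) as best_selling_app_count
    from activations a
    group by a.referral_id, a.app_id
    -- highest count first; app_id breaks ties (assumed rule, pick your own)
    order by a.referral_id, count(*) desc, a.app_id
)
select p.*,
       t.best_selling_app,
       t.best_selling_app_count,
       dense_rank() over (order by t.best_selling_app_count desc nulls last)
           as best_selling_app_rank
from per_referral p
left join top_app t on t.referral_id = p.id
order by p.id;
```

On the sample data in the question, this should return ids 1, 2 and 4 exactly once each, with app b (count 2) chosen for id 2. Referrals that have no activations are kept by the left join with a null best_selling_app and are ranked last because of nulls last.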