SQL subquery join with GROUP BY
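【Title】: sql subquery join group by 【Posted】: 2018-10-15 10:38:43 【Question】: I am trying to get a list of our users from our database, along with the number of people in the same cohort as each of them. Here a cohort is defined as the people who attended the same medical school at the same time.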
medical_school_id is stored in the doctor_record table.
graduation_dt is also stored in the doctor_record table.
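I have managed to write this query using a subquery that runs a SELECT statement to count the other people for each row, but it takes a very long time. My logic tells me I should run a simple GROUP BY query once and then somehow join it back on medical_school_id.
The GROUP BY query is as follows: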
select count(ca.id) , cdr.medical_school_id, cdr.graduation_dt
from account ca
LEFT JOIN doctor cd on ca.id = cd.account_id
LEFT JOIN doctor_record cdr on cd.gmc_number = cdr.gmc_number
GROUP BY cdr.medical_school_id, cdr.graduation_dt
The long SELECT query is:
select a.id, a.email , dr.medical_school_id,
(select count(ba.id) from account ba
LEFT JOIN doctor bd on ba.id = bd.account_id
LEFT JOIN doctor_record bdr on bd.gmc_number = bdr.gmc_number
WHERE bdr.medical_school_id = dr.medical_school_id AND bdr.graduation_dt = dr.graduation_dt) AS med_count
from account a
LEFT JOIN doctor d on a.id = d.account_id
LEFT JOIN doctor_record dr on d.gmc_number = dr.gmc_number
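If you could point me in the right direction, that would be great.
【Comments】:
Sample data and desired results would really help. 【Answer 1】: I think you just need window functions: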
select a.id, a.email, dr.medical_school_id, dr.graduation_dt,
count(*) over (partition by dr.medical_school_id, dr.graduation_dt) as cohort_size
from account a left join
doctor d
on a.id = d.account_id left join
doctor_record dr
on d.gmc_number = dr.gmc_number;
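The window-function query above can be tried on a few invented rows. The following is only a sketch: the table and column names come from the question, but the column types and all sample rows are assumptions made for illustration. It also shows one caveat. Accounts without a matching doctor_record all fall into a single partition whose keys are NULL, so count(*) counts them together, whereas counting a column from doctor_record yields 0 for them.

```sql
-- Assumed schema: only the column names are taken from the question.
CREATE TABLE account (id INTEGER PRIMARY KEY, email VARCHAR(100));
CREATE TABLE doctor (account_id INTEGER, gmc_number INTEGER);
CREATE TABLE doctor_record (gmc_number INTEGER, medical_school_id INTEGER, graduation_dt DATE);

-- Invented sample rows: account 4 has no doctor row at all.
INSERT INTO account VALUES (1, 'a@example.com'), (2, 'b@example.com'),
                           (3, 'c@example.com'), (4, 'd@example.com');
INSERT INTO doctor VALUES (1, 1001), (2, 1002), (3, 1003);
INSERT INTO doctor_record VALUES (1001, 10, '2010-07-01'),
                                 (1002, 10, '2010-07-01'),
                                 (1003, 20, '2011-07-01');

SELECT a.id, a.email, dr.medical_school_id, dr.graduation_dt,
       -- counts every row in the partition, including the NULL-key partition
       COUNT(*) OVER (PARTITION BY dr.medical_school_id, dr.graduation_dt) AS cohort_rows,
       -- counts only rows that actually matched a doctor_record
       COUNT(dr.gmc_number) OVER (PARTITION BY dr.medical_school_id, dr.graduation_dt) AS cohort_size
FROM account a
LEFT JOIN doctor d ON a.id = d.account_id
LEFT JOIN doctor_record dr ON d.gmc_number = dr.gmc_number
ORDER BY a.id;
```

With these rows, accounts 1 and 2 share a cohort of size 2 and account 3 has a cohort of size 1. For account 4, cohort_rows is 1 and cohort_size is 0, which is usually the more meaningful number for a user with no medical school on record.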
【Comments】:
@Charlie . . . You should accept the answer that you think best answers your question. 【Answer 2】: Reusing the same code for the grouping:
SELECT * FROM (
(
-- one row per account, exposing the cohort keys for the join below
SELECT acc.[id]
, acc.[email]
, doc_rec.[medical_school_id]
, doc_rec.[graduation_dt]
FROM
account acc
LEFT JOIN
doctor doc
ON
acc.id = doc.account_id
LEFT JOIN
doctor_record doc_rec
ON
doc.gmc_number = doc_rec.gmc_number
) label
LEFT JOIN
(
-- one row per (medical school, graduation date) with its head count
SELECT count(acco.id) AS cohort_size
, doc_reco.medical_school_id
, doc_reco.graduation_dt
FROM
account acco
LEFT JOIN
doctor doct
ON
acco.id = doct.account_id
LEFT JOIN
doctor_record doc_reco
ON
doct.gmc_number = doc_reco.gmc_number
GROUP BY
doc_reco.medical_school_id,
doc_reco.graduation_dt
) cohort
ON
cohort.[medical_school_id]=label.[medical_school_id]
AND
cohort.[graduation_dt]=label.[graduation_dt]
)
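The same "group once, then join back" idea can also be written with common table expressions, so the three-table join is spelled out only once. This is a sketch under the assumption that the database supports WITH and that the account, doctor, and doctor_record tables look as described in the question; the names base, cohort, and med_count are chosen here for illustration.

```sql
WITH base AS (
    -- one row per account with its cohort keys (NULL when no record exists)
    SELECT a.id, a.email, dr.medical_school_id, dr.graduation_dt
    FROM account a
    LEFT JOIN doctor d ON a.id = d.account_id
    LEFT JOIN doctor_record dr ON d.gmc_number = dr.gmc_number
),
cohort AS (
    -- one row per (medical school, graduation date) with its head count
    SELECT medical_school_id, graduation_dt, COUNT(*) AS med_count
    FROM base
    WHERE medical_school_id IS NOT NULL
      AND graduation_dt IS NOT NULL
    GROUP BY medical_school_id, graduation_dt
)
SELECT b.id, b.email, b.medical_school_id, b.graduation_dt, c.med_count
FROM base b
LEFT JOIN cohort c
       ON c.medical_school_id = b.medical_school_id
      AND c.graduation_dt = b.graduation_dt;
```

Accounts without a medical school or graduation date get a NULL med_count, because an equality comparison against NULL never matches in the join.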
【Comments】:
【Answer 3】: How about something like this?
select a.doctor_id
, count(*) - 1
from doctor_record a
left join doctor_record b on a.medical_school_id = b.medical_school_id
and a.graduation_dt = b.graduation_dt
group by a.doctor_id
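Subtract 1 from the count so that you do not include the doctor themselves in the "others in the same cohort" figure.
I have defined "same cohort" as "same medical school and same graduation date".
It is not clear to me what the GMC number is or how it is relevant. Does it have anything to do with the cohort?
【Comments】: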
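The question's tables are keyed on gmc_number and do not show a doctor_id column, so a version of the self-join that fits the schema as posted might look like the sketch below. It assumes gmc_number identifies exactly one row in doctor_record; if that does not hold, the counts would be inflated.

```sql
SELECT a.gmc_number,
       -- every row with non-NULL cohort keys matches itself once,
       -- so subtracting 1 leaves the other members of the cohort
       COUNT(*) - 1 AS others_in_cohort
FROM doctor_record a
LEFT JOIN doctor_record b
       ON a.medical_school_id = b.medical_school_id
      AND a.graduation_dt = b.graduation_dt
GROUP BY a.gmc_number;
```

A row whose medical_school_id or graduation_dt is NULL matches nothing, still produces one row from the LEFT JOIN, and therefore reports 0 others.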