TSQL: Count on outer joined tables is producing incorrect results
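[Posted]: 2015-04-02 03:32:30
[Question]: I am building a report on SQL Server 2008. I have joined one table to five other tables with LEFT OUTER JOIN. When I count rows from those other tables, I get incorrect numbers. I know why, but I am not sure how to fix it.
The query tracks admissions candidates for a school. As candidates move through the process, they are flagged at each major stage. What I need is a count of how many people passed a given stage in a given period (year and month). For the most part it works. However, if a candidate passed any stage during the period, that candidate is also "counted" in the earlier stages, even when those stages happened before the query period. A good example is AD_35: one particular academic program should show one person, but the output shows 2. When I query the AD_35 table on its own, I get the correct information. So I know the problem lies in the outer joins, but I am not sure how to fix it (I have tried various criteria in the subqueries that produce my named outputs). This should be easy for someone... Thanks in advance; the code is below. :Year and :Month are user inputs and will be filled with numeric values (e.g. 2015 1).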
CW
SELECT DISTINCT
ad_candidacy.prog_cde,
ad_candidacy.stageyr,
ad_candidacy.stagemo,
Count (case when (ad_02.stageyr in (:Year, :Year -1, :Year-2) and ad_02.stagemo <= :month) then 1 else null end) as Inquiry,
Count (case when (ad_05.stageyr in (:Year, :Year -1, :Year-2) and ad_05.stagemo <= :month) then 1 else null end) as Applied,
Count (case when (ad_35.stageyr in (:Year, :Year -1, :Year-2) and ad_35.stagemo <= :month and ad_35.id_num = ad_candidacy.id_num and ad_35.stageyr = ad_candidacy.stageyr and ad_35.stagemo=ad_candidacy.stagemo) then 1 else null end) as Accepted,
Count (case when (ad_50.stageyr in (:Year, :Year -1, :Year-2) and ad_50.stagemo <= :month) then 1 else null end) as Matriculated,
Count (case when (ad_enroll.stageyr in (:Year, :Year -1, :Year-2) and ad_enroll.stagemo <= :month) then 1 else null end) as Enrolled,
ad_candidacy.stagemo_long
FROM
ad_candidacy
LEFT OUTER JOIN
ad_02 ON ad_candidacy.id_num = ad_02.id_num
LEFT OUTER JOIN
ad_05 ON ad_candidacy.id_num = ad_05.id_num
LEFT OUTER JOIN
ad_35 ON ad_candidacy.id_num = ad_35.id_num
LEFT OUTER JOIN
ad_enroll ON ad_candidacy.id_num = ad_enroll.id_num
LEFT OUTER JOIN
ad_50 ON ad_candidacy.id_num = ad_50.id_num
WHERE
(ad_candidacy.stageyr in (:Year, :Year -1, :Year-2) )
AND ( ad_candidacy.stagemo <= :Month )
GROUP BY
ad_candidacy.prog_cde,
ad_candidacy.stageyr,
ad_candidacy.stagemo,
ad_candidacy.stagemo_long
ORDER BY
ad_candidacy.stageyr ASC
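[Comments]:
Are you saying that if a candidate is counted as "Accepted", they are also counted as "Applied" even if they applied more than two years ago? I don't see how that could happen. Or, if you mean a candidate who applied last year and was accepted this year, then it makes sense that they are counted twice.
I mean the second case. Why does that make sense? (Yes, I need to go to school here, ha!) The candidate was accepted in December 2014 and enrolled in January 2015. If I query the AD_35 (Accepted) table, I see the December date. If I query it restricted to January 2015, the candidate does not appear. But when I run the query above with the joins, one candidate shows as both accepted and enrolled in January 2015. I thought the subqueries in the Count (case ... statements would take care of that, but they do not.
[Answer 1]: When you join several tables, you have to think about the join conditions. The second table may contain several rows for the same row of the first table. To make sure you do not end up with duplicates, you can aggregate the second table in a subquery before joining it to the first table.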
SELECT a.Name,
b.Total
FROM table1 AS a
-- Aggregate table2 to one row per table1Id before joining, so the join cannot multiply rows
LEFT OUTER JOIN (SELECT table1Id, Total = COUNT(some_measure) FROM table2 GROUP BY table1Id) AS b ON a.table1Id = b.table1Id
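To make the row multiplication concrete, here is a minimal sketch that can be pasted into a query window as a single batch. The table variables @candidate, @stage_a and @stage_b and their contents are hypothetical and are not the asker's schema; they only illustrate how joining two child tables on the same key multiplies rows, and how pre-aggregating each child table avoids that.

```sql
-- Hypothetical data for illustration only (not the asker's tables).
DECLARE @candidate TABLE (id_num INT PRIMARY KEY);
DECLARE @stage_a   TABLE (id_num INT, stageyr INT);
DECLARE @stage_b   TABLE (id_num INT, stageyr INT);

INSERT INTO @candidate VALUES (1);
INSERT INTO @stage_a VALUES (1, 2014), (1, 2015);             -- 2 rows for candidate 1
INSERT INTO @stage_b VALUES (1, 2015), (1, 2015), (1, 2015);  -- 3 rows for candidate 1

-- Direct joins: every stage_a row pairs with every stage_b row,
-- so 2 x 3 = 6 joined rows exist and both counts come back as 6.
SELECT c.id_num,
       COUNT(a.id_num) AS a_rows,
       COUNT(b.id_num) AS b_rows
FROM @candidate AS c
LEFT OUTER JOIN @stage_a AS a ON a.id_num = c.id_num
LEFT OUTER JOIN @stage_b AS b ON b.id_num = c.id_num
GROUP BY c.id_num;

-- Pre-aggregated joins: each derived table has at most one row per id_num,
-- so the joins cannot multiply rows and the counts are 2 and 3.
SELECT c.id_num,
       ISNULL(a.a_rows, 0) AS a_rows,
       ISNULL(b.b_rows, 0) AS b_rows
FROM @candidate AS c
LEFT OUTER JOIN (SELECT id_num, COUNT(*) AS a_rows
                 FROM @stage_a GROUP BY id_num) AS a ON a.id_num = c.id_num
LEFT OUTER JOIN (SELECT id_num, COUNT(*) AS b_rows
                 FROM @stage_b GROUP BY id_num) AS b ON b.id_num = c.id_num;
```

Any period filter (year, month) belongs inside each derived table, so that rows outside the period are discarded before the join rather than after it.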
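[Comments]:
Agreed on thinking about the join conditions. Querying the second table on its own gives me the correct information. However, your code example gave me some extra material for reworking the OUTER JOIN logic, and I am working on a variation of it now.
[Answer 2]: Ako's answer pointed me in the right direction. I was already using subqueries, but his example led to the correct output. Below is the working version of the code. Thanks!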
SELECT DISTINCT
ad_candidacy.prog_cde,
ad_candidacy.stageyr,
ad_candidacy.stagemo,
ad_candidacy.StageMo_Long,
COUNT (case when (Inquiry IS NOT NULL) then 1 else null end) as Inquiry,
COUNT (case when (Applied IS NOT NULL) then 1 else null end) as Applied,
count (case when (Accepted is not null) then 1 else null end) as Accepted,
COUNT (case when (Matriculated IS NOT NULL) then 1 else null end) as Matriculated,
count (case when (Enrolled is not null) then 1 else null end) as Enrolled
FROM
ad_candidacy
LEFT OUTER JOIN
(select id_num, Inquiry = COUNT (id_num) from ad_02 where stageyr in (:year, :year-1, :year-2) and StageMo <= :month group by id_num) as ad_02 on ad_candidacy.id_num = ad_02.id_num
LEFT OUTER JOIN
(select id_num, Accepted = COUNT (id_num) from ad_35 where stageyr in (:year, :year-1, :year-2) and StageMo <= :month group by id_num) as ad_35 on ad_candidacy.id_num = ad_35.id_num
LEFT OUTER JOIN
(select id_num, Applied = COUNT (id_num) from ad_05 where stageyr in (:year, :year-1, :year-2) and StageMo <= :month group by id_num) as ad_05 on ad_candidacy.id_num = ad_05.id_num
LEFT OUTER JOIN
(select id_num, Matriculated = COUNT (id_num) from ad_50 where stageyr in (:year, :year-1, :year-2) and StageMo <= :month group by id_num) as ad_50 on ad_candidacy.id_num = ad_50.id_num
LEFT OUTER JOIN
(select id_num, Enrolled = COUNT (id_num) from ad_enroll where stageyr in (:year, :year-1, :year-2) and StageMo <= :month group by id_num) as ad_enroll on ad_candidacy.id_num = ad_enroll.id_num
WHERE
(ad_candidacy.stageyr in (:year, :year-1, :year-2) )
AND ( ad_candidacy.stagemo <= :month )
GROUP BY
ad_candidacy.prog_cde,
ad_candidacy.stageyr,
ad_candidacy.stagemo,
ad_candidacy.stagemo_long
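A possible simplification, offered as a sketch rather than as part of the accepted fix: COUNT(expression) ignores NULLs, so COUNT (case when (X IS NOT NULL) then 1 else null end) returns the same number as COUNT(X). The table variable @t and its columns below are hypothetical and exist only to show the equivalence.

```sql
-- Hypothetical data: flag is NULL where the outer join found no match.
DECLARE @t TABLE (grp CHAR(1), flag INT NULL);
INSERT INTO @t VALUES ('A', 3), ('A', NULL), ('B', NULL);

-- Both expressions count only the rows where flag is not NULL,
-- so the two columns always agree within each group.
SELECT grp,
       COUNT(CASE WHEN flag IS NOT NULL THEN 1 ELSE NULL END) AS verbose_count,
       COUNT(flag) AS short_count
FROM @t
GROUP BY grp;
```

Either form counts candidates that have at least one qualifying stage row, not the number of stage rows, because each derived table in the query above returns at most one row per id_num.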