有比 UNION ALL 更简单的方法吗?
Posted
技术标签:
【中文标题】有比 UNION ALL 更简单的方法吗?【英文标题】:is there an easier way than doing UNION ALL? 【发布时间】:2012-11-21 22:06:18 【问题描述】:我有一个相当大的查询,它从某个日期范围内选择数据。
我想根据年份对数据进行分组。
现在我有:
;with mappingTable as
(
select b.CLIENT_ID,f.patient_id,f.received_date,f.SPECIMEN_ID
from F_ACCESSION_DAILY f
join
(
select f.client_id,a.patient_id
from
(
SELECT
max(received_date) AS received_date
, patient_id AS Patient_ID
FROM F_ACCESSION_DAILY
group by PATIENT_ID
) a
join F_ACCESSION_DAILY f
on f.PATIENT_ID=a.Patient_ID
and f.RECEIVED_DATE=a.received_date
) b
on b.Patient_ID=f.PATIENT_ID
where f.RECEIVED_DATE between '20100101' and '20110101' - as you can see this data is only for 2010
) ---there's much more to this query
如您所见,此数据仅适用于 2010 年
但我想在所有年份都使用union all
:2008、2009、2010、2011、2012...
如何在不重写查询 5 次并在所有查询之间执行 union all
的情况下做到这一点?
这是整个怪物查询:
;with mappingTable as
(
select b.CLIENT_ID,f.patient_id,f.received_date,f.SPECIMEN_ID
from F_ACCESSION_DAILY f
join
(
select f.client_id,a.patient_id
from
(
SELECT
max(received_date) AS received_date
, patient_id AS Patient_ID
FROM F_ACCESSION_DAILY
group by PATIENT_ID
) a
join F_ACCESSION_DAILY f
on f.PATIENT_ID=a.Patient_ID
and f.RECEIVED_DATE=a.received_date
) b
on b.Patient_ID=f.PATIENT_ID
where f.RECEIVED_DATE between '20100101' and '20110101'
)
,
counted as(
SELECT
PATIENT_ID,
client_id,
--case when COUNT(*)>=12 then 12 else COUNT(*) end TimesTested,
COUNT(*) as timestested,
case when count(*)>1 then
(datediff(day,MIN(received_date),max(received_date)))
/(COUNT(*)-1)
else
0
end as testfreq
FROM mappingTable
GROUP BY
client_id,
patient_id
),counted2 as (
SELECT
client_id,
TimesTested,
CAST(COUNT(*) AS varchar(30)) AS count,
CAST(AVG(testfreq) as varchar(30)) as TestFreq,
CAST(STDEV(TestFreq) as varchar(30)) Stdv
FROM counted
GROUP BY
client_id,
TimesTested
),
counted3 as
(
SELECT
patient_id,
client_id,
TimesTested,
CAST(COUNT(*) AS varchar(30)) AS count,
AVG(testfreq) as TestFreq
FROM counted
GROUP BY
client_id,
TimesTested,
PATIENT_ID
)
,
CountOver12 as
( select client_id,count(*) count
from counted
where timestested>12
group by CLIENT_ID)
,
TotalAvgTestFreq as (
select client_id, SUM(testfreq) / count(testfreq) TotalAvgTestFreq
from counted3
group by client_id
)
,
unpivoted AS (
SELECT
client_id,
ColumnName + CAST(TimesTested AS varchar(10)) AS ColumnName,
ColumnValue
FROM counted2
UNPIVOT (
ColumnValue FOR ColumnName IN (count, TestFreq,stdv)
) u
),
pivoted AS (
SELECT
client_id clientid,
count1, TestFreq1,stdv1,
count2, TestFreq2,stdv2,
count3, TestFreq3,stdv3,
count4, TestFreq4,stdv4,
count5, TestFreq5,stdv5,
count6, TestFreq6,stdv6,
count7, TestFreq7,stdv7,
count8, TestFreq8,stdv8,
count9, TestFreq9,stdv9,
count10, TestFreq10,stdv10,
count11, TestFreq11,stdv11,
count12, TestFreq12,stdv12
FROM unpivoted
PIVOT (
MAX(ColumnValue) FOR ColumnName IN (
count1,TestFreq1,stdv1,
count2,TestFreq2,stdv2,
count3,TestFreq3,stdv3,
count4,TestFreq4,stdv4,
count5,TestFreq5,stdv5,
count6,TestFreq6,stdv6,
count7,TestFreq7,stdv7,
count8, TestFreq8, stdv8,
count9, TestFreq9, stdv9,
count10, TestFreq10,stdv10,
count11, TestFreq11,stdv11,
count12, TestFreq12,stdv12
)
) p
)
,
PatientStats as
(
select
distinct
f.client_id
,datediff(MM,c.mlis_date_established,GETDATE()) MonthsCustomer
,count(distinct patient_id) TotalPatients
,count(distinct specimen_id) TotalSpecimens
from mappingTable f
left join D_CLIENT c
on f.CLIENT_ID=c.CLIENT_ID
group by f.client_id,c.mlis_date_established
)
,
Over12
as
(
select client_id,SUM(c) sumcount from
(
select CLIENT_ID,COUNT(*) c
from mappingTable
group by CLIENT_ID,PATIENT_ID
having count(*)>12
) a
group by client_id
),
GetMedian as(
select client_id, avg(timestested) median_testfreq
from
(
select client_id,
timestested,
rn=row_number() over (partition by CLIENT_ID
order by timestested),
c=count(timestested) over (partition by CLIENT_ID)
from counted3
where timestested>1
) g
where rn in (round(c/2,0),c/2+1)
group by client_id
)
, final as (
SELECT p.*,pivoted.*,c12.count [Count13+],Over12.sumcount SumofGreaterThan12,t.TotalAvgTestFreq,median.median_testfreq
FROM pivoted
left join PatientStats p
on p.CLIENT_ID=pivoted.CLIENTID
left join Over12
on over12.CLIENT_ID=p.CLIENT_ID
left join TotalAvgTestFreq t
on t.CLIENT_ID=p.CLIENT_ID
left join CountOver12 C12
on c12.CLIENT_ID=p.CLIENT_ID
left join GetMedian median
on median.CLIENT_ID=p.CLIENT_ID
where p.CLIENT_ID not in (select CLIENTid from SalesDWH..TestPractices)
)
/* Get the data into a temp table */
SELECT * INTO #TempTable
FROM final
/* Drop the cloumns that are not needed */
ALTER TABLE #TempTable
DROP COLUMN clientid
/* Get results and drop temp table */
SELECT * FROM #TempTable
DROP TABLE #TempTable
【问题讨论】:
没有看到任何数据等我只能问你为什么将记录限制在一个小数据集?如果你只是说你想要f.RECEIVED_DATE >= '20080101'
的所有数据,那么你将得到所有数据,而不必UNION ALL
。
@bluefeet 我想按年份对数据进行分组。如果有不清楚的地方请告诉我!
@aptem 所以最终结果使用GROUP BY DatePart(year, RECEIVED_DATE)
@bluefeet 我有一种感觉,因为枢轴,我无法做到这一点,对吧?我不认为它会给出逻辑上正确的结果
正如我所说的没有看到全貌(数据等),这是猜测。您必须进行一些测试。
【参考方案1】:
创建表函数:
CREATE FUNCTION fn_get_data_by_date_range
(
@start AS VARCHAR(8),
@end AS VARCHAR(8)
)
RETURNS TABLE
AS
RETURN
(
...your query...
WHERE f.RECEIVED_DATE BETWEEN @start AND @end
)
然后你可以 UNION 对该函数的不同调用:
SELECT * FROM fn_get_data_by_date_range('20100101','20110101')
UNION ALL
SELECT * FROM fn_get_data_by_date_range('20090101','20100101')
UNION ALL
....
UNION ALL
SELECT * FROM fn_get_data_by_date_range('20070101','20080101')
【讨论】:
非常感谢这是一个绝妙的答案!以上是关于有比 UNION ALL 更简单的方法吗?的主要内容,如果未能解决你的问题,请参考以下文章
SQL Server-聚焦UNIOL ALL/UNION查询