Is there an easier way than doing UNION ALL?

Posted: 2012-11-21 22:06:18

Question:

I have a fairly large query that selects data from a certain date range.

I want to group the data by year.

Right now I have:

;with mappingTable as
(
select b.CLIENT_ID,f.patient_id,f.received_date,f.SPECIMEN_ID
from F_ACCESSION_DAILY f
join
(
select f.client_id,a.patient_id
from
(
SELECT
       max(received_date) AS received_date
     , patient_id         AS Patient_ID
FROM F_ACCESSION_DAILY
group by PATIENT_ID
) a
join F_ACCESSION_DAILY f
on f.PATIENT_ID=a.Patient_ID
and f.RECEIVED_DATE=a.received_date
) b
on b.Patient_ID=f.PATIENT_ID
where f.RECEIVED_DATE between '20100101' and '20110101' -- as you can see, this data is only for 2010
) ---there's much more to this query

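As you can see, this data is only for 2010.

But I want to do a union all across all the years: 2008, 2009, 2010, 2011, 2012...

How can I do this without rewriting the query 5 times and doing a union all between all of them?

Here is the entire monster query: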

;with mappingTable as
(
select b.CLIENT_ID,f.patient_id,f.received_date,f.SPECIMEN_ID
from F_ACCESSION_DAILY f
join
(
select f.client_id,a.patient_id
from
(
SELECT
       max(received_date) AS received_date
     , patient_id         AS Patient_ID
FROM F_ACCESSION_DAILY
group by PATIENT_ID
) a
join F_ACCESSION_DAILY f
on f.PATIENT_ID=a.Patient_ID
and f.RECEIVED_DATE=a.received_date
) b
on b.Patient_ID=f.PATIENT_ID
where f.RECEIVED_DATE between '20100101' and '20110101'
)
,
  counted as(
SELECT
PATIENT_ID,
    client_id,
    --case when COUNT(*)>=12 then 12 else COUNT(*) end  TimesTested, 
    COUNT(*) as timestested,
    case when count(*)>1 then
    (datediff(day,MIN(received_date),max(received_date)))
  /(COUNT(*)-1) 
 else
0
end as testfreq
  FROM mappingTable
  GROUP BY
    client_id,
    patient_id    

),counted2 as (
  SELECT
    client_id,
   TimesTested,
    CAST(COUNT(*) AS varchar(30)) AS count,
    CAST(AVG(testfreq) as varchar(30)) as TestFreq,
    CAST(STDEV(TestFreq) as varchar(30)) Stdv
  FROM counted
  GROUP BY
    client_id,
    TimesTested
    ),
    counted3 as
     (
  SELECT
  patient_id,
    client_id,
   TimesTested,
    CAST(COUNT(*) AS varchar(30)) AS count,
    AVG(testfreq) as TestFreq
  FROM counted
  GROUP BY
    client_id,
    TimesTested,
    PATIENT_ID
    )
    ,
    CountOver12 as
    (    select client_id,count(*) count
    from counted
    where timestested>12
    group by CLIENT_ID)
    ,
    TotalAvgTestFreq as (

    select client_id, SUM(testfreq) / count(testfreq) TotalAvgTestFreq
    from counted3
    group by client_id
    )



    ,
unpivoted AS (
  SELECT
    client_id,
    ColumnName + CAST(TimesTested AS varchar(10)) AS ColumnName,
    ColumnValue
  FROM counted2
  UNPIVOT (
    ColumnValue FOR ColumnName IN (count, TestFreq,stdv)
  ) u
),
pivoted AS (
  SELECT
    client_id clientid,
    count1, TestFreq1,stdv1,
    count2, TestFreq2,stdv2,
    count3, TestFreq3,stdv3,
    count4, TestFreq4,stdv4,
    count5, TestFreq5,stdv5,
    count6, TestFreq6,stdv6,
    count7, TestFreq7,stdv7,
    count8, TestFreq8,stdv8,
    count9, TestFreq9,stdv9,
    count10, TestFreq10,stdv10,
    count11, TestFreq11,stdv11,
    count12, TestFreq12,stdv12
  FROM unpivoted
  PIVOT (
    MAX(ColumnValue) FOR ColumnName IN (
      count1,TestFreq1,stdv1,
      count2,TestFreq2,stdv2,
      count3,TestFreq3,stdv3,
      count4,TestFreq4,stdv4,
      count5,TestFreq5,stdv5,
      count6,TestFreq6,stdv6,
      count7,TestFreq7,stdv7,
    count8, TestFreq8,   stdv8,
    count9, TestFreq9,   stdv9,
    count10, TestFreq10,stdv10,
    count11, TestFreq11,stdv11,
    count12, TestFreq12,stdv12
    )
  ) p
)
,
PatientStats as
(
select 
 distinct
f.client_id
,datediff(MM,c.mlis_date_established,GETDATE()) MonthsCustomer
,count(distinct patient_id) TotalPatients
,count(distinct specimen_id) TotalSpecimens
from mappingTable f
left join D_CLIENT c
on f.CLIENT_ID=c.CLIENT_ID
group by f.client_id,c.mlis_date_established
)
,
Over12
as
(
select client_id,SUM(c) sumcount from
(
select CLIENT_ID,COUNT(*) c
from mappingTable
group by CLIENT_ID,PATIENT_ID
having count(*)>12
) a
group by client_id
),
GetMedian as(

    select client_id, avg(timestested) median_testfreq
from
(
    select client_id,
           timestested,
           rn=row_number() over (partition by CLIENT_ID
                                 order by timestested),
           c=count(timestested) over (partition by CLIENT_ID)
    from counted3
    where timestested>1
) g
where rn in (round(c/2,0),c/2+1)
group by client_id
   )
, final as (
SELECT p.*,pivoted.*,c12.count [Count13+],Over12.sumcount SumofGreaterThan12,t.TotalAvgTestFreq,median.median_testfreq
FROM pivoted
left join PatientStats p
on p.CLIENT_ID=pivoted.CLIENTID
left join Over12
on over12.CLIENT_ID=p.CLIENT_ID
left join TotalAvgTestFreq t
on t.CLIENT_ID=p.CLIENT_ID
left join CountOver12 C12
on c12.CLIENT_ID=p.CLIENT_ID
left join GetMedian median
on median.CLIENT_ID=p.CLIENT_ID
where p.CLIENT_ID not in (select CLIENTid from SalesDWH..TestPractices)
)
/* Get the data into a temp table */
SELECT * INTO #TempTable
FROM final
/* Drop the columns that are not needed */
ALTER TABLE #TempTable
DROP COLUMN clientid
/* Get results and drop temp table */
SELECT * FROM #TempTable
DROP TABLE #TempTable

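Comments:

Without seeing any data etc., I can only ask: why are you restricting the records to a small data set? If you just say you want all data with f.RECEIVED_DATE >= '20080101', then you will get all the data without having to UNION ALL.

@bluefeet I want to group the data by year. Please let me know if anything is unclear!

@aptem So use GROUP BY DatePart(year, RECEIVED_DATE) in the final result.

@bluefeet I have a feeling that I can't do that because of the pivot, right? I don't think it would give logically correct results.

As I said, without seeing the whole picture (data etc.), this is a guess. You will have to do some testing.

Answer 1: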

Create a table-valued function:

CREATE FUNCTION fn_get_data_by_date_range
(
   @start AS VARCHAR(8),
   @end AS VARCHAR(8)
)
RETURNS TABLE
AS 
RETURN 
(
   ...your query...
   WHERE f.RECEIVED_DATE BETWEEN @start AND @end
)

Then you can UNION ALL the different calls to that function:

SELECT * FROM fn_get_data_by_date_range('20100101','20110101')
UNION ALL
SELECT * FROM fn_get_data_by_date_range('20090101','20100101')
UNION ALL
....
UNION ALL
SELECT * FROM fn_get_data_by_date_range('20070101','20080101')
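
If writing one SELECT per year still feels repetitive, the calls can in principle be driven from a list of date ranges with CROSS APPLY. The following is only a sketch: it assumes the function above has been created exactly as shown, and the derived table y with its columns yr, range_start and range_end is a hypothetical helper, not part of the original answer. The table value constructor requires SQL Server 2008 or later.

```sql
-- Sketch: call the function once per row of a hypothetical list of year ranges.
-- y, yr, range_start and range_end are illustrative names.
SELECT y.yr, d.*
FROM (VALUES
        (2008, '20080101', '20090101'),
        (2009, '20090101', '20100101'),
        (2010, '20100101', '20110101'),
        (2011, '20110101', '20120101'),
        (2012, '20120101', '20130101')
     ) AS y (yr, range_start, range_end)
CROSS APPLY fn_get_data_by_date_range(y.range_start, y.range_end) AS d;
```

Note that BETWEEN is inclusive at both ends, so with ranges written this way a row stamped exactly at midnight on January 1 would fall into two adjacent years. If that matters, the function's filter could use >= @start AND < @end instead.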

Comments:

Thank you very much, this is a brilliant answer!
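
For comparison, the GROUP BY DatePart(year, RECEIVED_DATE) idea raised in the comments on the question could look roughly like the sketch below. It is a simplification under stated assumptions: the join that resolves the client from each patient's latest record is left out, received_year is a hypothetical column name, and every later step of the full query (counting, pivoting, joining) would also need to carry the year through, which is the concern the asker raises about the pivot.

```sql
-- Sketch: widen the date filter and carry the year as a grouping column.
-- received_year is an illustrative name; the latest-client join is omitted.
;WITH mappingTable AS
(
    SELECT f.CLIENT_ID,
           f.PATIENT_ID,
           f.RECEIVED_DATE,
           f.SPECIMEN_ID,
           DATEPART(year, f.RECEIVED_DATE) AS received_year
    FROM F_ACCESSION_DAILY AS f
    WHERE f.RECEIVED_DATE >= '20080101'
      AND f.RECEIVED_DATE <  '20130101'
)
SELECT received_year,
       CLIENT_ID,
       PATIENT_ID,
       COUNT(*) AS timestested
FROM mappingTable
GROUP BY received_year, CLIENT_ID, PATIENT_ID;
```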
