Pivot and Cross Apply
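This question revolves around one theme but has two scenarios.
I have a set of values extracted from OData. It has a column containing variable names, which I would like to pivot and join together.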
create table xmpltbl
( [Location] nvarchar(max),
[Site] nvarchar(max),
[Variable] nvarchar(max),
[Period] datetimeoffset(3),
[StringValue] nvarchar(max),
[NumericValue] decimal(10,2)
);
INSERT INTO xmpltbl
(
[Location],
[Site],
[Variable],
[Period],
[StringValue],
[NumericValue]
)
VALUES
('UK','London','Customer1','2019-01-01 00:28:53.897','Company A',NULL),
('UK','London','Product1','2019-01-01 00:28:53.897', 'Sand' ,NULL),
('UK','London','Division1','2019-01-01 00:28:53.897','Supplies',NULL),
('UK','London','Expense1','2019-01-01 00:28:53.897',NULL,150),
('UK','London','Customer2','2019-01-01 00:28:53.897','CompanyB',NULL),
('UK','London','Product2','2019-01-01 00:28:53.897','Bricks',NULL),
('UK','London','Division2','2019-01-01 00:28:53.897','Building Materials',NULL),
('UK','London','Expense2','2019-01-01 00:28:53.897',NULL,300),
('France','Paris','Customer3','2020-01-01 00:28:53.897','Company C',NULL),
('France','Paris','Product3','2020-01-01 00:28:53.897','Cement',NULL),
('France','Paris','Division3','2019-01-01 00:28:53.897','Supplies',NULL),
('France','Paris','Expense3','2019-01-01 00:28:53.897',NULL,75);
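I need the variables that share the same number to end up on the same row, with their values alongside. Ideally I would like to do this in SSIS, since that is what I use to extract the data.
I would like it to look like this: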
Location  Site    Period  Customer  Product  Division            Total
UK        London  2019    CompanyA  Sand     Supplies            150
UK        London  2019    CompanyB  Bricks   Building Materials  300
France    Paris   2020    CompanyC  Cement   Supplies            75
There is also some data that does not follow the pattern
Customer1 + Product1, Division1, Expense1
and instead needs
Customer1 + Product10, Division10, Expense10
Customer1 + Product11, Division11, Expense11
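I considered using a dynamic pivot, because I need to do this for around 60 of these variables. However, it is the join that I cannot get to work. I tried a CROSS APPLY, but even when I put the data into a temp table it does not return any values.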
DECLARE @cols  AS NVARCHAR(MAX),
        @query AS NVARCHAR(MAX);
SET @cols = STUFF((SELECT ',' + QUOTENAME(Variable)
                   FROM xmpltbl
                   GROUP BY Variable
                   ORDER BY Variable
                   FOR XML PATH(''), TYPE
                  ).value('.', 'NVARCHAR(MAX)')
                  , 1, 1, '');
SET @query = 'SELECT Location, Site, NumericValue, Period, ' + @cols + ' from
(
    select Location
         , Site
         , Variable
         , NumericValue
         , Period
         , StringValue
    from xmpltbl
) x
pivot
(
    max(StringValue)
    for Variable in (' + @cols + ')
) p ';
EXECUTE (@query);
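Answer
I don't know whether this is the best way to do this in SQL, but the following solution produces the expected result:
Importing into a temp table
I used the following query to import the data into a temp table: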
create table #xmpltbl
( [Location] nvarchar(max),
[Site] nvarchar(max),
[Variable] nvarchar(max),
[Period] datetimeoffset(3),
[StringValue] nvarchar(max),
[NumericValue] decimal(10,2)
);
INSERT INTO #xmpltbl
(
[Location],
[Site],
[Variable],
[Period],
[StringValue],
[NumericValue]
)
VALUES
('UK','London','Customer1','2019-01-01 00:28:53.897','Company A',NULL),
('UK','London','Product1','2019-01-01 00:28:53.897', 'Sand' ,NULL),
('UK','London','Division1','2019-01-01 00:28:53.897','Supplies',NULL),
('UK','London','Expense1','2019-01-01 00:28:53.897',NULL,150),
('UK','London','Customer2','2019-01-01 00:28:53.897','CompanyB',NULL),
('UK','London','Product2','2019-01-01 00:28:53.897','Bricks',NULL),
('UK','London','Division2','2019-01-01 00:28:53.897','Building Materials',NULL),
('UK','London','Expense2','2019-01-01 00:28:53.897',NULL,300),
('France','Paris','Customer3','2020-01-01 00:28:53.897','Company C',NULL),
('France','Paris','Product3','2020-01-01 00:28:53.897','Cement',NULL),
('France','Paris','Division3','2019-01-01 00:28:53.897','Supplies',NULL),
('France','Paris','Expense3','2019-01-01 00:28:53.897',NULL,75);
Using common table expressions to get the desired output
I used common table expressions (CTEs) to build the query:
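-- Assumes every output row is made of four consecutive input rows in the order
-- Customer, Product, Division, Expense. Neither ORDER BY below guarantees that
-- order for rows that tie, so treat the grouping as fragile.
WITH CTE_1 AS (   -- grpno: one number per block of four rows
    SELECT *, (ROW_NUMBER() OVER (ORDER BY [Location], [Site]) - 1) / 4 AS grpno
    FROM #xmpltbl),
CTE_2 AS (        -- rn: position of the row inside its block
    SELECT *, ROW_NUMBER() OVER (PARTITION BY grpno ORDER BY grpno) AS rn
    FROM CTE_1),
CTE_3 AS (        -- flag the Product row (rn = 2) and the Division row (rn = 3)
    SELECT *, CASE WHEN rn = 2 THEN 1 ELSE 0 END AS Product,
              CASE WHEN rn = 3 THEN 1 ELSE 0 END AS Supplies
    FROM CTE_2)
SELECT DISTINCT [Location], [Site], YEAR([Period]) AS [Period],
    FIRST_VALUE(StringValue) OVER (PARTITION BY grpno ORDER BY rn) AS [Customer],
    FIRST_VALUE(StringValue) OVER (PARTITION BY grpno ORDER BY Product DESC) AS [Product],
    FIRST_VALUE(StringValue) OVER (PARTITION BY grpno ORDER BY Supplies DESC) AS [Division],
    MAX([NumericValue]) OVER (PARTITION BY grpno) AS [Total]
FROM CTE_3;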
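The grouping above depends on the order in which the rows happen to be numbered. As an alternative sketch, the trailing number can be parsed out of [Variable] and used as the grouping key, with conditional aggregation doing the pivot. This is not part of the original answer. It assumes every variable is a name followed directly by a number (for example Customer1), that the temp table #xmpltbl from above exists, and that the year should come from the Customer row. VarName, VarNo and DigitPos are illustrative names, and TRY_CAST requires SQL Server 2012 or later.

```sql
-- Sketch only: split 'Customer1' into ('Customer', 1), then pivot per number.
WITH parsed AS (
    SELECT  t.[Location], t.[Site], t.[Period], t.[StringValue], t.[NumericValue],
            CASE WHEN d.DigitPos > 1
                 THEN LEFT(t.[Variable], d.DigitPos - 1) END          AS VarName,
            TRY_CAST(SUBSTRING(t.[Variable], d.DigitPos, 10) AS int)  AS VarNo
    FROM    #xmpltbl AS t
    -- CROSS APPLY computes the position of the first digit once per row
    CROSS APPLY (SELECT PATINDEX('%[0-9]%', t.[Variable]) AS DigitPos) AS d
    WHERE   d.DigitPos > 1
)
SELECT  [Location], [Site],
        -- Assumption: take the year from the Customer row of each group
        YEAR(MAX(CASE WHEN VarName = 'Customer' THEN [Period] END)) AS [Period],
        MAX(CASE WHEN VarName = 'Customer' THEN [StringValue]  END) AS [Customer],
        MAX(CASE WHEN VarName = 'Product'  THEN [StringValue]  END) AS [Product],
        MAX(CASE WHEN VarName = 'Division' THEN [StringValue]  END) AS [Division],
        MAX(CASE WHEN VarName = 'Expense'  THEN [NumericValue] END) AS [Total]
FROM    parsed
WHERE   VarNo IS NOT NULL          -- skip names whose suffix is not a number
GROUP BY [Location], [Site], VarNo
ORDER BY VarNo;
```

Because only the four name parts (Customer, Product, Division, Expense) become columns, this shape may not need dynamic SQL even with around 60 numbered variables. Note that in the sample rows the France group mixes 2019 and 2020 periods, which is why the sketch picks the period from a single row instead of grouping by year.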
Note: the solution based on FIRST_VALUE() only works on SQL Server 2012 or later, because FIRST_VALUE() is a window function that was added in SQL Server 2012.
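For the second scenario in the question, where Customer1 has to be combined with Product10, Division10, Expense10 and again with the 11 series, the numbers no longer line up, so some explicit rule is needed. The question does not say what that rule is. One possible sketch is a small mapping table that is joined in. The table #varmap and its columns are hypothetical, its two rows are taken from the example in the question, and the same name-plus-number assumption about [Variable] applies.

```sql
-- Hypothetical mapping: which detail numbers belong to which customer number.
CREATE TABLE #varmap (CustomerNo int NOT NULL, DetailNo int NOT NULL);
INSERT INTO #varmap (CustomerNo, DetailNo)
VALUES (1, 10), (1, 11);   -- Customer1 + the 10 series, Customer1 + the 11 series

WITH parsed AS (
    SELECT  t.[Location], t.[Site], t.[Period], t.[StringValue], t.[NumericValue],
            CASE WHEN d.DigitPos > 1
                 THEN LEFT(t.[Variable], d.DigitPos - 1) END          AS VarName,
            TRY_CAST(SUBSTRING(t.[Variable], d.DigitPos, 10) AS int)  AS VarNo
    FROM    #xmpltbl AS t
    CROSS APPLY (SELECT PATINDEX('%[0-9]%', t.[Variable]) AS DigitPos) AS d
    WHERE   d.DigitPos > 1
),
details AS (   -- one row per detail number: Product, Division, Expense
    SELECT  [Location], [Site], VarNo,
            MAX(CASE WHEN VarName = 'Product'  THEN [StringValue]  END) AS [Product],
            MAX(CASE WHEN VarName = 'Division' THEN [StringValue]  END) AS [Division],
            MAX(CASE WHEN VarName = 'Expense'  THEN [NumericValue] END) AS [Total]
    FROM    parsed
    WHERE   VarName IN ('Product', 'Division', 'Expense')
    GROUP BY [Location], [Site], VarNo
)
SELECT  c.[Location], c.[Site], YEAR(c.[Period]) AS [Period],
        c.[StringValue] AS [Customer], dt.[Product], dt.[Division], dt.[Total]
FROM    parsed  AS c
JOIN    #varmap AS m  ON m.CustomerNo = c.VarNo
JOIN    details AS dt ON dt.VarNo      = m.DetailNo
                     AND dt.[Location] = c.[Location]
                     AND dt.[Site]     = c.[Site]
WHERE   c.VarName = 'Customer';
```

The sample rows in this post contain no Product10 or Product11 variables, so against them this query returns no rows. It is only meant to show the shape of the join once such rows and the real mapping rule are available.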