Pivot and Cross Apply

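This question revolves around a single theme but covers two scenarios.

I have a set of values extracted from OData. One column holds variable names, and I want to pivot it so that related values are joined together on the same row.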

create table xmpltbl
(   [Location]  nvarchar(max),
    [Site]      nvarchar(max),  
    [Variable]  nvarchar(max),  
    [Period]        datetimeoffset(3),  
    [StringValue]   nvarchar(max),
    [NumericValue] decimal(10,2)
);

INSERT INTO xmpltbl
(
    [Location],     
    [Site], 
    [Variable], 
    [Period],   
    [StringValue],
    [NumericValue]
)

VALUES 

('UK','London','Customer1','2019-01-01 00:28:53.897','Company A',NULL),
('UK','London','Product1','2019-01-01 00:28:53.897', 'Sand' ,NULL),
('UK','London','Division1','2019-01-01 00:28:53.897','Supplies',NULL),
('UK','London','Expense1','2019-01-01 00:28:53.897',NULL,150),
('UK','London','Customer2','2019-01-01 00:28:53.897','CompanyB',NULL),
('UK','London','Product2','2019-01-01 00:28:53.897','Bricks',NULL),
('UK','London','Division2','2019-01-01 00:28:53.897','Building Materials',NULL),
('UK','London','Expense2','2019-01-01 00:28:53.897',NULL,300),
('France','Paris','Customer3','2020-01-01 00:28:53.897','Company C',NULL),
('France','Paris','Product3','2020-01-01 00:28:53.897','Cement',NULL),
('France','Paris','Division3','2019-01-01 00:28:53.897','Supplies',NULL),
('France','Paris','Expense3','2019-01-01 00:28:53.897',NULL,75);

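I need the variables that share the same number to appear on the same row, with their values alongside. Ideally I would do this in SSIS, since that is what I use to extract the data.

I would like the result to look like this: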

Location  Site    Period  Customer  Product  Division            Total
UK        London  2019    CompanyA  Sand     Supplies            150
UK        London  2019    CompanyB  Bricks   Building Materials  300
France    Paris   2020    CompanyC  Cement   Supplies            75

Some of the data does not follow this pattern

Customer1 + Product1, Division1, Expense1

and instead needs

Customer1 + Product10, Division10, Expense10

Customer1 + Product11, Division11, Expense11

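I considered a dynamic pivot because I need to handle about 60 of these variables. The part I cannot get working is the joining. I also tried a CROSS APPLY, but it returns no values, even when I load the data into a temp table first.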

DECLARE  @cols AS NVARCHAR(MAX),
         @query  AS NVARCHAR(MAX);

SET @cols = STUFF((SELECT ',' + QUOTENAME(Variable) 
            FROM xmpltbl
            GROUP BY Variable
            ORDER BY Variable
            FOR XML PATH(''), TYPE
            ).value('.', 'NVARCHAR(MAX)') 
        ,1,1,'')

set @query = 'SELECT Location, Site, NumericValue, Period, ' + @cols + ' from 
            (
                select Location
                    , Site
                    , Variable
                    , NumericValue
                    , Period
                    , StringValue


                from xmpltbl
           ) x
            pivot 
            (
                 max(StringValue)
                for Variable in (' + @cols + ')
            ) p '

execute (@query);

Answer

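I don't know whether this is the best way to do this in SQL, but the following solution produces the expected result:

Import into a temp table

I used the following query to load the data into a temp table: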

create table #xmpltbl
(   [Location]  nvarchar(max),
    [Site]      nvarchar(max),  
    [Variable]  nvarchar(max),  
    [Period]        datetimeoffset(3),  
    [StringValue]   nvarchar(max),
    [NumericValue] decimal(10,2)
);

INSERT INTO #xmpltbl
(
    [Location],     
    [Site], 
    [Variable], 
    [Period],   
    [StringValue],
    [NumericValue]
)

VALUES 

('UK','London','Customer1','2019-01-01 00:28:53.897','Company A',NULL),
('UK','London','Product1','2019-01-01 00:28:53.897', 'Sand' ,NULL),
('UK','London','Division1','2019-01-01 00:28:53.897','Supplies',NULL),
('UK','London','Expense1','2019-01-01 00:28:53.897',NULL,150),
('UK','London','Customer2','2019-01-01 00:28:53.897','CompanyB',NULL),
('UK','London','Product2','2019-01-01 00:28:53.897','Bricks',NULL),
('UK','London','Division2','2019-01-01 00:28:53.897','Building Materials',NULL),
('UK','London','Expense2','2019-01-01 00:28:53.897',NULL,300),
('France','Paris','Customer3','2020-01-01 00:28:53.897','Company C',NULL),
('France','Paris','Product3','2020-01-01 00:28:53.897','Cement',NULL),
('France','Paris','Division3','2019-01-01 00:28:53.897','Supplies',NULL),
('France','Paris','Expense3','2019-01-01 00:28:53.897',NULL,75);

Use common table expressions to get the desired output

I used common table expressions (CTEs) to build the query:

-- Note: this relies on each entity's four rows being adjacent and ordered
-- Customer, Product, Division, Expense; the ORDER BY clauses below contain
-- ties, so SQL Server does not guarantee that order.
WITH CTE_1 AS (SELECT *, (ROW_NUMBER() OVER(ORDER BY [Location],
               [Site] ) - 1) / 4 as grpno FROM #xmpltbl),  -- every 4 rows form one group
     CTE_2 AS (SELECT * , ROW_NUMBER() OVER(PARTITION BY grpno ORDER BY grpno) rn
               FROM CTE_1),                                -- position of the row inside its group
     CTE_3 AS (SELECT *, case when rn = 2 Then 1 else 0 end as Product, case when rn = 3 Then 1 else 0 end as Supplies
               FROM CTE_2)                                 -- flag the 2nd (Product) and 3rd (Division) rows
SELECT DISTINCT [Location], [Site], Year([Period]) as [Period],
                FIRST_VALUE(StringValue) OVER(PARTITION BY grpno ORDER BY rn) as [Customer] ,
                FIRST_VALUE(StringValue) OVER(PARTITION BY grpno ORDER BY Product DESC) as [Product] ,
                FIRST_VALUE(StringValue) OVER(PARTITION BY grpno ORDER BY Supplies DESC) as [Supplies] ,
                MAX([NumericValue]) OVER(PARTITION BY grpno) as [Total]
from CTE_3
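
As an alternative sketch (not part of the original answer), the rows can be grouped by the numeric suffix of [Variable] instead of by row position, which also copes with suffixes such as 10 or 11. It assumes every variable is a name followed only by digits, and that the four names are Customer, Product, Division and Expense. The aliases DigitPos, VarName and VarNo are made up for illustration.

```sql
-- Sketch only: assumes [Variable] is always <name><digits>, e.g. 'Product10'.
SELECT  t.[Location],
        t.[Site],
        MAX(YEAR(t.[Period]))                                             AS [Period],
        MAX(CASE WHEN v.VarName = 'Customer' THEN t.[StringValue]  END)   AS [Customer],
        MAX(CASE WHEN v.VarName = 'Product'  THEN t.[StringValue]  END)   AS [Product],
        MAX(CASE WHEN v.VarName = 'Division' THEN t.[StringValue]  END)   AS [Division],
        MAX(CASE WHEN v.VarName = 'Expense'  THEN t.[NumericValue] END)   AS [Total]
FROM    #xmpltbl AS t
        -- Position of the first digit; NULL when the variable has no digit at all.
        CROSS APPLY (SELECT NULLIF(PATINDEX('%[0-9]%', t.[Variable]), 0) AS DigitPos) AS p
        -- Split 'Product10' into the name 'Product' and the number 10.
        CROSS APPLY (SELECT LEFT(t.[Variable], p.DigitPos - 1)                     AS VarName,
                            CAST(SUBSTRING(t.[Variable], p.DigitPos, 10) AS int)   AS VarNo) AS v
WHERE   v.VarNo IS NOT NULL              -- skip variables without a numeric suffix
GROUP BY t.[Location], t.[Site], v.VarNo -- one output row per numbered group
ORDER BY v.VarNo;
```

Because the numbered variables become rows rather than columns, this form needs no dynamic SQL even with around 60 variables. On the sample data it returns one row for each of the suffixes 1, 2 and 3, and MAX(YEAR([Period])) picks 2020 for the Paris group, whose rows carry both 2019 and 2020.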

Output

[Image: screenshot of the query result]


Note: this solution only works on SQL Server 2012 or later, because it uses the FIRST_VALUE() window function, which was added in SQL Server 2012.
