mssql subquery aggregate - sum wrong

【Posted】2017-01-03 13:13:25 【Question】:

I am trying to pull some data in T-SQL (SQL Server 2014), using a subquery to get sums from tables that are related by foreign key.

The structure is as follows:

CREATE TABLE [AggregateData](
    [Id] [bigint] IDENTITY(1,1) NOT NULL,
    [Aggregate_UUID] [uniqueidentifier] NOT NULL,
    [DataDate] [date] NOT NULL,
    [SizeAvailable] [bigint] NOT NULL,
    [SizeTotal] [bigint] NOT NULL,
    [SizeUsed] [bigint] NOT NULL,
    [PercentageUsed] [int] NOT NULL
)

CREATE TABLE [Aggregate](
    [Id] [bigint] IDENTITY(1,1) NOT NULL,
    [UUID] [uniqueidentifier] NOT NULL,
    [Name] [nvarchar](255) NOT NULL,
    [Cluster_UUID] [uniqueidentifier] NOT NULL,
    [DiskTypeID] [int] NOT NULL
)

CREATE TABLE [DiskType](
    [Id] [int] IDENTITY(1,1) NOT NULL,
    [TypeName] [nvarchar](255) NULL
)

CREATE TABLE [Volume](
    [Id] [bigint] IDENTITY(1,1) NOT NULL,
    [UUID] [uniqueidentifier] NOT NULL,
    [Name] [nvarchar](255) NOT NULL,
    [Aggregate_UUID] [uniqueidentifier] NOT NULL,
    [ServiceClassID] [int] NULL,
    [ProtocolID] [int] NOT NULL,
    [EnvironmentID] [int] NOT NULL
)

CREATE TABLE [VolumeData](
    [Id] [bigint] IDENTITY(1,1) NOT NULL,
    [Volume_UUID] [uniqueidentifier] NOT NULL,
    [DataDate] [date] NOT NULL,
    [SizeAvailable] [bigint] NOT NULL,
    [SizeTotal] [bigint] NOT NULL,
    [SizeUsed] [bigint] NOT NULL,
    [PercentageUsed] [int] NOT NULL
)

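In the end I need to get the following data:

DataDate, DiskType, AggregateSizes (Avail, Used, Total), Aggregated Volume Sizes (Avail, Used, Total of Volumes in that Aggregate)

I was thinking of using subqueries, but when I try to get the values for one specific aggregate only (for testing, since that is easier for me to check), I get wrong values from the subquery.

Here is what I tried: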

SELECT
  AggregateData.DataDate,
  AggregateData.SizeTotal AS AggregateSizeTotal,
  (SELECT
    SUM(VolumeData.SizeTotal)
  FROM VolumeData
  LEFT JOIN Volume
    ON VolumeData.Volume_UUID = Volume.UUID
  WHERE Aggregate_UUID = Volume.Aggregate_UUID
  AND VolumeData.DataDate = AggregateData.DataDate)
  VolumeSizeTotal

FROM AggregateData

WHERE AggregateData.Aggregate_UUID = 'C58D0098-D1A4-4EE9-A0E9-7DE3EEB6275C'
ORDER BY AggregateData.DataDate

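But it seems I am not getting the right value for the subquery sum. The sum from my subquery is far too high, so I assume my WHERE clause is incorrect (or the whole setup ;) ...).

So, question 1: are subqueries the way to go, or should I take a different approach? And if (question 1 == true), what is wrong with my subquery?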

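【Discussion on the question】:

【Answer 1】:

You need to qualify all column names. I recommend using table abbreviations. The problem is Aggregate_UUID = v.Aggregate_UUID. The first column also comes from v, because an unqualified name is resolved against the subquery's own tables first, so this is (essentially) a no-op.

Presumably, you want this to be correlated with the outer query: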

SELECT ad.DataDate, ad.SizeTotal AS AggregateSizeTotal,
       (SELECT SUM(vd.SizeTotal)
        FROM VolumeData vd LEFT JOIN
             Volume v
             ON vd.Volume_UUID = v.UUID
        WHERE ad.Aggregate_UUID = v.Aggregate_UUID AND
              ad.DataDate = vd.DataDate
       ) VolumeSizeTotal
FROM AggregateData ad
WHERE ad.Aggregate_UUID = 'C58D0098-D1A4-4EE9-A0E9-7DE3EEB6275C'
ORDER BY ad.DataDate
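
The name-resolution point above can be reproduced on a tiny scale. The following is only a sketch: the table variables @Parent and @Child, the column names GroupKey and Amount, and the sample values are hypothetical and not part of the original schema. It puts the unqualified predicate and the correlated predicate side by side.

```sql
-- Hypothetical miniature tables; names and values are for illustration only.
DECLARE @Parent TABLE (GroupKey int NOT NULL);
DECLARE @Child  TABLE (GroupKey int NOT NULL, Amount int NOT NULL);

INSERT INTO @Parent (GroupKey) VALUES (1), (2);
INSERT INTO @Child (GroupKey, Amount) VALUES (1, 10), (1, 20), (2, 5);

SELECT
    p.GroupKey,
    -- Unqualified GroupKey binds to c.GroupKey (the innermost scope),
    -- so the predicate compares the column with itself and filters nothing.
    (SELECT SUM(c.Amount)
     FROM @Child c
     WHERE GroupKey = c.GroupKey) AS UnqualifiedSum,
    -- Qualified with the outer alias, the subquery is correlated per outer row.
    (SELECT SUM(c.Amount)
     FROM @Child c
     WHERE p.GroupKey = c.GroupKey) AS CorrelatedSum
FROM @Parent p
ORDER BY p.GroupKey;
```

With these sample rows, UnqualifiedSum is the total over every child row for each parent (the "too high" symptom from the question), while CorrelatedSum is the per-parent sum.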

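【Discussion】:

Yes, you are right. With that change the subquery in the SELECT looks good. Now, for all the other columns I need, I would have to add another subquery to the SELECT for each one, because as far as I know a subquery used in the SELECT can only return 1 value. That would be a bit cumbersome. So while I think my initial approach of using a subquery in the SELECT works, it is not ideal. I will try using a subquery in the join instead. That seems a better way to get what I need.

@Marc . . . If you have another question involving multiple columns, you should ask it as a question rather than in a comment.

You are right. So I will ask another question for the second part. Thanks a lot.

【Answer 2】: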

You can do this with a JOIN instead of a correlated subquery (which can have O(n^2) performance):

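SELECT
  t1.DataDate,
  t1.SizeTotal AS AggregateSizeTotal,
  t2.total AS VolumeSizeTotal
FROM AggregateData t1
LEFT JOIN (
  -- Pre-aggregate the volume sizes once per aggregate and date.
  SELECT
    Volume.Aggregate_UUID, VolumeData.DataDate, SUM(VolumeData.SizeTotal) AS total
  FROM VolumeData
  INNER JOIN Volume
    ON VolumeData.Volume_UUID = Volume.UUID
  GROUP BY Volume.Aggregate_UUID, VolumeData.DataDate) t2
  -- Match on both keys so sums from other aggregates are not picked up.
  ON t1.Aggregate_UUID = t2.Aggregate_UUID
  AND t1.DataDate = t2.DataDate
WHERE t1.Aggregate_UUID = 'C58D0098-D1A4-4EE9-A0E9-7DE3EEB6275C';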

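【Discussion】:

【Answer 3】:

To return the final result you are after, I would use something like the following:

"Now in the end I need to get the following data: DataDate, DiskType, AggregateSizes (Avail, Used, Total), Aggregated Volume Sizes (Sum of Avail, Used, Total of Volumes in that Aggregate)"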

select 
      AggregateUuid          = a.uuid
    , DiskType               = dt.TypeName
    , DataDate               = ad.DataDate
    , AggregateSizeAvailable = ad.SizeAvailable
    , AggregateSizeUsed      = ad.SizeUsed
    , AggregateSizeTotal     = ad.SizeTotal
    , VolumeSizeAvailable    = sum(vd.SizeAvailable)
    , VolumeSizeUsed         = sum(vd.SizeUsed)
    , VolumeSizeTotal        = sum(vd.SizeTotal)
  from [Aggregate] a
      inner join DiskType      dt  on dt.Id             = a.DiskTypeId
      inner join AggregateData ad  on ad.Aggregate_uuid = a.uuid
      left  join Volume         v  on  v.Aggregate_uuid = a.uuid 
      left  join VolumeData     vd on vd.Volume_uuid    = v.uuid
                                 and vd.DataDate       = ad.DataDate
  where a.uuid = 'C58D0098-D1A4-4ee9-A0E9-7de3eeb6275C'
  group by 
      a.uuid
    , dt.TypeName
    , ad.DataDate
    , ad.SizeAvailable
    , ad.SizeUsed
    , ad.SizeTotal
  order by a.uuid, ad.DataDate;

Test setup: http://rextester.com/HZZHLI45077

create table DiskType(
    Id int identity(1,1) not null
  , TypeName nvarchar(255) null
);
set identity_insert DiskType on;
insert into DiskType (Id, TypeName) values 
  (1,'Type1'), (2,'Type2');
set identity_insert DiskType off;

create table [Aggregate](
    Id bigint identity(1,1) not null
  , uuid uniqueidentifier not null
  , Name nvarchar(255) not null
  , Cluster_uuid uniqueidentifier not null
  , DiskTypeid int not null 
);

insert into [Aggregate] (uuid, name, cluster_uuid, disktypeid) 
            select 'C58D0098-D1A4-4EE9-A0E9-7DE3EEB6275C', 'ex', newid(), 1;

create table AggregateData(
    Id bigint identity(1,1) not null
  , Aggregate_uuid uniqueidentifier not null
  , DataDate date not null
  , SizeAvailable bigint not null
  , SizeTotal bigint not null
  , SizeUsed bigint not null
  , PercentageUsed int not null
);

insert into AggregateData 
            select 'C58D0098-D1A4-4EE9-A0E9-7DE3EEB6275C', '20170101', 12,100,87,87
  union all select 'C58D0098-D1A4-4EE9-A0E9-7DE3EEB6275C', '20170102', 9,100,90,90
  union all select 'C58D0098-D1A4-4EE9-A0E9-7DE3EEB6275C', '20170103', 6,100,93,93;

create table Volume(
    Id bigint identity(1,1) not null
  , uuid uniqueidentifier not null
  , Name nvarchar(255) not null
  , Aggregate_uuid uniqueidentifier not null
  , ServiceClassid int null
  , Protocolid int not null
  , Environmentid int not null
);
insert into Volume 
            select '00000000-0000-0000-0000-000000000001', 'v1'
                , 'C58D0098-D1A4-4EE9-A0E9-7DE3EEB6275C', null, 1, 1
  union all select '00000000-0000-0000-0000-000000000002', 'v2'
                , 'C58D0098-D1A4-4EE9-A0E9-7DE3EEB6275C', null, 1, 1
  union all select '00000000-0000-0000-0000-000000000003', 'v3'
                , 'C58D0098-D1A4-4EE9-A0E9-7DE3EEB6275C', null, 1, 1;

create table VolumeData(
    Id bigint identity(1,1) not null
  , Volume_uuid uniqueidentifier not null
  , DataDate date not null
  , SizeAvailable bigint not null
  , SizeTotal bigint not null
  , SizeUsed bigint not null
  , PercentageUsed int not null
);

insert into VolumeData 
            select '00000000-0000-0000-0000-000000000001', '20170101', 4,33,29,88
  union all select '00000000-0000-0000-0000-000000000002', '20170101', 4,33,29,88
  union all select '00000000-0000-0000-0000-000000000003', '20170101', 4,34,29,87
  union all select '00000000-0000-0000-0000-000000000001', '20170102', 3,33,30,91
  union all select '00000000-0000-0000-0000-000000000002', '20170102', 3,33,30,91
  union all select '00000000-0000-0000-0000-000000000003', '20170102', 3,34,30,90
  union all select '00000000-0000-0000-0000-000000000001', '20170103', 2,33,31,94
  union all select '00000000-0000-0000-0000-000000000002', '20170103', 2,33,31,94
  union all select '00000000-0000-0000-0000-000000000003', '20170103', 2,34,31,93

go
/* -------------------------------------------------------- */

select 
      AggregateUuid          = a.uuid
    , DiskType               = dt.TypeName
    , DataDate               = convert(varchar(10),ad.DataDate,121)
    , AggregateSizeAvailable = ad.SizeAvailable
    , AggregateSizeUsed      = ad.SizeUsed
    , AggregateSizeTotal     = ad.SizeTotal
    , VolumeSizeAvailable    = sum(vd.SizeAvailable)
    , VolumeSizeUsed         = sum(vd.SizeUsed)
    , VolumeSizeTotal        = sum(vd.SizeTotal)
  from [Aggregate] a
      inner join DiskType      dt  on dt.Id             = a.DiskTypeId
      inner join AggregateData ad  on ad.Aggregate_uuid = a.uuid
      left  join Volume         v  on  v.Aggregate_uuid = a.uuid 
      left  join VolumeData     vd on vd.Volume_uuid    = v.uuid
                                 and vd.DataDate       = ad.DataDate
  where a.uuid = 'C58D0098-D1A4-4ee9-A0E9-7de3eeb6275C'
  group by 
      a.uuid
    , dt.TypeName
    , ad.DataDate
    , ad.SizeAvailable
    , ad.SizeUsed
    , ad.SizeTotal
  order by a.uuid, ad.DataDate;

Results:

+--------------------------------------+----------+------------+------------------------+-------------------+--------------------+---------------------+----------------+-----------------+
|            AggregateUuid             | DiskType |  DataDate  | AggregateSizeAvailable | AggregateSizeUsed | AggregateSizeTotal | VolumeSizeAvailable | VolumeSizeUsed | VolumeSizeTotal |
+--------------------------------------+----------+------------+------------------------+-------------------+--------------------+---------------------+----------------+-----------------+
| c58d0098-d1a4-4ee9-a0e9-7de3eeb6275c | Type1    | 2017-01-01 |                     12 |                87 |                100 |                  12 |             87 |             100 |
| c58d0098-d1a4-4ee9-a0e9-7de3eeb6275c | Type1    | 2017-01-02 |                      9 |                90 |                100 |                   9 |             90 |             100 |
| c58d0098-d1a4-4ee9-a0e9-7de3eeb6275c | Type1    | 2017-01-03 |                      6 |                93 |                100 |                   6 |             93 |             100 |
+--------------------------------------+----------+------------+------------------------+-------------------+--------------------+---------------------+----------------+-----------------+
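
As a possible alternative to grouping by every aggregate column, the three volume sums can also be fetched in one correlated step with OUTER APPLY, which addresses the earlier concern that a subquery in the SELECT list can return only one value. This is a sketch, not a tested replacement: it assumes the tables and sample rows from the test setup above, and the alias vs is introduced here purely for illustration.

```sql
-- Assumes the tables and sample rows from the test setup above.
SELECT
      a.uuid           AS AggregateUuid
    , dt.TypeName      AS DiskType
    , ad.DataDate
    , ad.SizeAvailable AS AggregateSizeAvailable
    , ad.SizeUsed      AS AggregateSizeUsed
    , ad.SizeTotal     AS AggregateSizeTotal
    , vs.VolumeSizeAvailable
    , vs.VolumeSizeUsed
    , vs.VolumeSizeTotal
FROM [Aggregate] a
INNER JOIN DiskType dt      ON dt.Id = a.DiskTypeid
INNER JOIN AggregateData ad ON ad.Aggregate_uuid = a.uuid
-- The applied subquery is evaluated per outer row and may return several columns.
OUTER APPLY (
    SELECT
          SUM(vd.SizeAvailable) AS VolumeSizeAvailable
        , SUM(vd.SizeUsed)      AS VolumeSizeUsed
        , SUM(vd.SizeTotal)     AS VolumeSizeTotal
    FROM Volume v
    INNER JOIN VolumeData vd ON vd.Volume_uuid = v.uuid
    WHERE v.Aggregate_uuid = a.uuid      -- correlate on the aggregate
      AND vd.DataDate      = ad.DataDate -- and on the snapshot date
) vs
WHERE a.uuid = 'C58D0098-D1A4-4EE9-A0E9-7DE3EEB6275C'
ORDER BY a.uuid, ad.DataDate;
```

Because the applied subquery aggregates without a GROUP BY, it yields exactly one row per AggregateData row, with NULL sums for an aggregate that has no volumes on that date. On the sample data it should produce the same three rows as the result table above.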

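【Discussion】:

This gives one entry per volume, but I need 1 entry per aggregate, with the volume size columns being the sum over all volumes in that particular aggregate. Sorry for not making that clear in my original post.

@Marc The updated answer removes the volume name; this should return what you are looking for now.

This still gets me multiple rows, one for each volume with the same aggregate, because there is no grouping by "Volume".

Since we are summing all of the volume data, you would not want to group by volume. Is Aggregate (uuid) unique, or is Aggregate (uuid, disktype) unique? Is AggregateData (uuid, datadate) unique, or do rows for the same date have different SizeAvailable, SizeUsed, or SizeTotal values?

Aggregate(uuid) is unique, but so is Aggregate(uuid, disktype). Each aggregate has only 1 disk type. In AggregateData there is one (and only one) row per aggregate per day. VolumeData has one row per volume per day. Each aggregate can have zero (unlikely but possible) to many volumes (the Volume row carries the AggregateUUID).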
