mssql subquery aggregate - sum wrong
Posted: 2017-01-03 13:13:25
Question: I am trying to fetch some data in T-SQL (MSSQL 2014), where I use subqueries to get sums from some foreign-key data tables.
The structure is as follows:
TABLE [AggregateData](
[Id] [bigint] IDENTITY(1,1) NOT NULL,
[Aggregate_UUID] [uniqueidentifier] NOT NULL,
[DataDate] [date] NOT NULL,
[SizeAvailable] [bigint] NOT NULL,
[SizeTotal] [bigint] NOT NULL,
[SizeUsed] [bigint] NOT NULL,
[PercentageUsed] [int] NOT NULL
)
TABLE [Aggregate](
[Id] [bigint] IDENTITY(1,1) NOT NULL,
[UUID] [uniqueidentifier] NOT NULL,
[Name] [nvarchar](255) NOT NULL,
[Cluster_UUID] [uniqueidentifier] NOT NULL,
[DiskTypeID] [int] NOT NULL
)
TABLE [DiskType](
[Id] [int] IDENTITY(1,1) NOT NULL,
[TypeName] [nvarchar](255) NULL
)
TABLE [Volume](
[Id] [bigint] IDENTITY(1,1) NOT NULL,
[UUID] [uniqueidentifier] NOT NULL,
[Name] [nvarchar](255) NOT NULL,
[Aggregate_UUID] [uniqueidentifier] NOT NULL,
[ServiceClassID] [int] NULL,
[ProtocolID] [int] NOT NULL,
[EnvironmentID] [int] NOT NULL
)
TABLE [VolumeData](
[Id] [bigint] IDENTITY(1,1) NOT NULL,
[Volume_UUID] [uniqueidentifier] NOT NULL,
[DataDate] [date] NOT NULL,
[SizeAvailable] [bigint] NOT NULL,
[SizeTotal] [bigint] NOT NULL,
[SizeUsed] [bigint] NOT NULL,
[PercentageUsed] [int] NOT NULL
)
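In the end I need to get the following data:
DataDate, DiskType, AggregateSizes (Avail, Used, Total), Aggregated Volume Sizes (Avail, Used, Total of the Volumes in that Aggregate)
I was thinking of using subqueries, but when I try to get the values for just one specific aggregate (for testing, since that is easier for me to check), I get wrong values from the subquery.
Here is what I tried: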
SELECT
AggregateData.DataDate,
AggregateData.SizeTotal AS AggregateSizeTotal,
(SELECT
SUM(VolumeData.SizeTotal)
FROM VolumeData
LEFT JOIN Volume
ON VolumeData.Volume_UUID = Volume.UUID
WHERE Aggregate_UUID = Volume.Aggregate_UUID
AND VolumeData.DataDate = AggregateData.DataDate)
VolumeSizeTotal
FROM AggregateData
WHERE AggregateData.Aggregate_UUID = 'C58D0098-D1A4-4EE9-A0E9-7DE3EEB6275C'
ORDER BY AggregateData.DataDate
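But it seems I am not getting the correct values for the subquery sum. The sum is too high, so I assume my WHERE clause is incorrect (or the whole setup ;) ...)
So, question 1: are subqueries the way to go, or should I do this differently? If (question 1 == true), what is wrong with my subquery?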
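Answer 1: You need to qualify all column names. I recommend using table abbreviations. The problem is Aggregate_UUID = v.Aggregate_UUID. The first column comes from v, so this compares the column with itself and is (essentially) a no-op.
Presumably, you want this to correlate with the outer query: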
SELECT ad.DataDate, ad.SizeTotal AS AggregateSizeTotal,
(SELECT SUM(vd.SizeTotal)
FROM VolumeData vd LEFT JOIN
Volume v
ON vd.Volume_UUID = v.UUID
WHERE ad.Aggregate_UUID = v.Aggregate_UUID AND
ad.DataDate = vd.DataDate
) VolumeSizeTotal
FROM AggregateData ad
WHERE ad.Aggregate_UUID = 'C58D0098-D1A4-4EE9-A0E9-7DE3EEB6275C'
ORDER BY ad.DataDate
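The name-resolution issue described in this answer can be reproduced on a tiny data set. The following is only a sketch under stated assumptions: the table variables, the int keys used in place of uniqueidentifier values, and the sample numbers are all hypothetical and chosen for illustration. It shows the unqualified column binding to the inner table, so the sum covers every aggregate, while the qualified version correlates with the outer row.

```sql
-- Hypothetical miniature data: two aggregates with one volume each, one day.
-- int keys stand in for the uniqueidentifier columns of the real schema.
DECLARE @AggregateData TABLE (Aggregate_UUID int, DataDate date, SizeTotal bigint);
DECLARE @Volume        TABLE (UUID int, Aggregate_UUID int);
DECLARE @VolumeData    TABLE (Volume_UUID int, DataDate date, SizeTotal bigint);

INSERT INTO @AggregateData VALUES (1, '20170101', 100), (2, '20170101', 200);
INSERT INTO @Volume        VALUES (10, 1), (20, 2);
INSERT INTO @VolumeData    VALUES (10, '20170101', 40), (20, '20170101', 70);

SELECT ad.Aggregate_UUID,
       -- Unqualified: Aggregate_UUID is found in the inner scope (table v),
       -- so the predicate compares v.Aggregate_UUID with itself.
       (SELECT SUM(vd.SizeTotal)
        FROM @VolumeData vd
        JOIN @Volume v ON vd.Volume_UUID = v.UUID
        WHERE Aggregate_UUID = v.Aggregate_UUID
          AND vd.DataDate = ad.DataDate) AS UnqualifiedSum,
       -- Qualified: ad.Aggregate_UUID refers to the outer row.
       (SELECT SUM(vd.SizeTotal)
        FROM @VolumeData vd
        JOIN @Volume v ON vd.Volume_UUID = v.UUID
        WHERE ad.Aggregate_UUID = v.Aggregate_UUID
          AND vd.DataDate = ad.DataDate) AS QualifiedSum
FROM @AggregateData ad
WHERE ad.Aggregate_UUID = 1;
-- Returns one row: UnqualifiedSum = 110 (both volumes), QualifiedSum = 40.
```

Run as a single batch, since table variables only live for the duration of the batch that declares them.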
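Comments:
Yes, you are right. With that change the subquery in the SELECT looks good. Now, for every other column I need, I would have to add another subquery to the SELECT, because as far as I know a subquery used there can return only one value. That gets a bit cumbersome. So while my initial approach of using subqueries in the SELECT works, it is not ideal. I will try a subquery in a join instead; that seems a better fit for what I need.
@Marc: If you have another question involving multiple columns, you should ask it as a question rather than in a comment.
You are right. I will ask a separate question for the second part. Thanks a lot.
Answer 2: You can do this with a JOIN to a pre-aggregated derived table instead of a correlated subquery (which can have O(n^2) performance):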
SELECT
t1.DataDate,
t1.SizeTotal AS AggregateSizeTotal,
t2.total AS VolumeSizeTotal
FROM AggregateData t1
LEFT JOIN (
-- Sum the volume sizes once per aggregate and day
SELECT v.Aggregate_UUID, vd.DataDate, SUM(vd.SizeTotal) AS total
FROM VolumeData vd
INNER JOIN Volume v
ON vd.Volume_UUID = v.UUID
GROUP BY v.Aggregate_UUID, vd.DataDate
) t2
-- Match on both the aggregate and the date so sums from other aggregates are excluded
ON t1.Aggregate_UUID = t2.Aggregate_UUID
AND t1.DataDate = t2.DataDate
WHERE t1.Aggregate_UUID = 'C58D0098-D1A4-4EE9-A0E9-7DE3EEB6275C'
ORDER BY t1.DataDate;
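For the multi-column case raised in the comments above (one scalar subquery per column gets cumbersome), SQL Server's OUTER APPLY is one possible alternative, because a single applied subquery can return several sums at once. This is only a sketch under stated assumptions: the table variables, int keys, and sample values are hypothetical stand-ins for the real schema, and none of the posters proposed this form.

```sql
-- Hypothetical miniature data; int keys replace the uniqueidentifier columns.
DECLARE @AggregateData TABLE (Aggregate_UUID int, DataDate date, SizeTotal bigint);
DECLARE @Volume        TABLE (UUID int, Aggregate_UUID int);
DECLARE @VolumeData    TABLE (Volume_UUID int, DataDate date,
                              SizeAvailable bigint, SizeTotal bigint, SizeUsed bigint);

INSERT INTO @AggregateData VALUES (1, '20170101', 100), (1, '20170102', 100);
INSERT INTO @Volume        VALUES (10, 1), (11, 1);
INSERT INTO @VolumeData    VALUES (10, '20170101', 4, 50, 46),
                                  (11, '20170101', 8, 50, 42),
                                  (10, '20170102', 3, 50, 47),
                                  (11, '20170102', 6, 50, 44);

SELECT ad.DataDate,
       ad.SizeTotal AS AggregateSizeTotal,
       vs.VolumeSizeAvailable,
       vs.VolumeSizeUsed,
       vs.VolumeSizeTotal
FROM @AggregateData ad
OUTER APPLY (
    -- Evaluated per outer row; returns all three sums in one pass.
    -- An aggregate without volumes still yields one row, with NULL sums.
    SELECT SUM(vd.SizeAvailable) AS VolumeSizeAvailable,
           SUM(vd.SizeUsed)      AS VolumeSizeUsed,
           SUM(vd.SizeTotal)     AS VolumeSizeTotal
    FROM @VolumeData vd
    JOIN @Volume v ON vd.Volume_UUID = v.UUID
    WHERE v.Aggregate_UUID = ad.Aggregate_UUID
      AND vd.DataDate = ad.DataDate
) vs
WHERE ad.Aggregate_UUID = 1
ORDER BY ad.DataDate;
```

Every column inside the applied subquery is qualified, which avoids the scoping mistake from the question.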
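Answer 3: The question states: "Now in the end I need to get the following data: DataDate, DiskType, AggregateSizes (Avail, Used, Total), Aggregated Volume Sizes (Sum of Avail, Used, Total of Volumes in that Aggregate)".
To write a query that returns that end result, I would use something like the following: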
select
AggregateUuid = a.uuid
, DiskType = dt.TypeName
, DataDate = ad.DataDate
, AggregateSizeAvailable = ad.SizeAvailable
, AggregateSizeUsed = ad.SizeUsed
, AggregateSizeTotal = ad.SizeTotal
, VolumeSizeAvailable = sum(vd.SizeAvailable)
, VolumeSizeUsed = sum(vd.SizeUsed)
, VolumeSizeTotal = sum(vd.SizeTotal)
from [Aggregate] a
inner join DiskType dt on dt.Id = a.DiskTypeId
inner join AggregateData ad on ad.Aggregate_uuid = a.uuid
left join Volume v on v.Aggregate_uuid = a.uuid
left join VolumeData vd on vd.Volume_uuid = v.uuid
and vd.DataDate = ad.DataDate
where a.uuid = 'C58D0098-D1A4-4ee9-A0E9-7de3eeb6275C'
group by
a.uuid
, dt.TypeName
, ad.DataDate
, ad.SizeAvailable
, ad.SizeUsed
, ad.SizeTotal
order by a.uuid, ad.DataDate;
Test setup: http://rextester.com/HZZHLI45077
create table DiskType(
Id int identity(1,1) not null
, TypeName nvarchar(255) null
);
set identity_insert DiskType on;
insert into DiskType (Id, TypeName) values
(1,'Type1'), (2,'Type2');
set identity_insert DiskType off;
create table [Aggregate](
Id bigint identity(1,1) not null
, uuid uniqueidentifier not null
, Name nvarchar(255) not null
, Cluster_uuid uniqueidentifier not null
, DiskTypeid int not null
);
insert into [Aggregate] (uuid, name, cluster_uuid, disktypeid)
select 'C58D0098-D1A4-4EE9-A0E9-7DE3EEB6275C', 'ex', newid(), 1;
create table AggregateData(
Id bigint identity(1,1) not null
, Aggregate_uuid uniqueidentifier not null
, DataDate date not null
, SizeAvailable bigint not null
, SizeTotal bigint not null
, SizeUsed bigint not null
, PercentageUsed int not null
);
insert into AggregateData
select 'C58D0098-D1A4-4EE9-A0E9-7DE3EEB6275C', '20170101', 12,100,87,87
union all select 'C58D0098-D1A4-4EE9-A0E9-7DE3EEB6275C', '20170102', 9,100,90,90
union all select 'C58D0098-D1A4-4EE9-A0E9-7DE3EEB6275C', '20170103', 6,100,93,93;
create table Volume(
Id bigint identity(1,1) not null
, uuid uniqueidentifier not null
, Name nvarchar(255) not null
, Aggregate_uuid uniqueidentifier not null
, ServiceClassid int null
, Protocolid int not null
, Environmentid int not null
);
insert into Volume
select '00000000-0000-0000-0000-000000000001', 'v1'
, 'C58D0098-D1A4-4EE9-A0E9-7DE3EEB6275C', null, 1, 1
union all select '00000000-0000-0000-0000-000000000002', 'v2'
, 'C58D0098-D1A4-4EE9-A0E9-7DE3EEB6275C', null, 1, 1
union all select '00000000-0000-0000-0000-000000000003', 'v3'
, 'C58D0098-D1A4-4EE9-A0E9-7DE3EEB6275C', null, 1, 1;
create table VolumeData(
Id bigint identity(1,1) not null
, Volume_uuid uniqueidentifier not null
, DataDate date not null
, SizeAvailable bigint not null
, SizeTotal bigint not null
, SizeUsed bigint not null
, PercentageUsed int not null
);
insert into VolumeData
select '00000000-0000-0000-0000-000000000001', '20170101', 4,33,29,88
union all select '00000000-0000-0000-0000-000000000002', '20170101', 4,33,29,88
union all select '00000000-0000-0000-0000-000000000003', '20170101', 4,34,29,87
union all select '00000000-0000-0000-0000-000000000001', '20170102', 3,33,30,91
union all select '00000000-0000-0000-0000-000000000002', '20170102', 3,33,30,91
union all select '00000000-0000-0000-0000-000000000003', '20170102', 3,34,30,90
union all select '00000000-0000-0000-0000-000000000001', '20170103', 2,33,31,94
union all select '00000000-0000-0000-0000-000000000002', '20170103', 2,33,31,94
union all select '00000000-0000-0000-0000-000000000003', '20170103', 2,34,31,93
go
/* -------------------------------------------------------- */
select
AggregateUuid = a.uuid
, DiskType = dt.TypeName
, DataDate = convert(varchar(10),ad.DataDate,121)
, AggregateSizeAvailable = ad.SizeAvailable
, AggregateSizeUsed = ad.SizeUsed
, AggregateSizeTotal = ad.SizeTotal
, VolumeSizeAvailable = sum(vd.SizeAvailable)
, VolumeSizeUsed = sum(vd.SizeUsed)
, VolumeSizeTotal = sum(vd.SizeTotal)
from [Aggregate] a
inner join DiskType dt on dt.Id = a.DiskTypeId
inner join AggregateData ad on ad.Aggregate_uuid = a.uuid
left join Volume v on v.Aggregate_uuid = a.uuid
left join VolumeData vd on vd.Volume_uuid = v.uuid
and vd.DataDate = ad.DataDate
where a.uuid = 'C58D0098-D1A4-4ee9-A0E9-7de3eeb6275C'
group by
a.uuid
, dt.TypeName
, ad.DataDate
, ad.SizeAvailable
, ad.SizeUsed
, ad.SizeTotal
order by a.uuid, ad.DataDate;
Result:
+--------------------------------------+----------+------------+------------------------+-------------------+--------------------+---------------------+----------------+-----------------+
| AggregateUuid | DiskType | DataDate | AggregateSizeAvailable | AggregateSizeUsed | AggregateSizeTotal | VolumeSizeAvailable | VolumeSizeUsed | VolumeSizeTotal |
+--------------------------------------+----------+------------+------------------------+-------------------+--------------------+---------------------+----------------+-----------------+
| c58d0098-d1a4-4ee9-a0e9-7de3eeb6275c | Type1 | 2017-01-01 | 12 | 87 | 100 | 12 | 87 | 100 |
| c58d0098-d1a4-4ee9-a0e9-7de3eeb6275c | Type1 | 2017-01-02 | 9 | 90 | 100 | 9 | 90 | 100 |
| c58d0098-d1a4-4ee9-a0e9-7de3eeb6275c | Type1 | 2017-01-03 | 6 | 93 | 100 | 6 | 93 | 100 |
+--------------------------------------+----------+------------+------------------------+-------------------+--------------------+---------------------+----------------+-----------------+
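Comments:
This gives one row per volume, but I need one row per aggregate, with the volume size columns being the sum over all volumes in that particular aggregate. Sorry for not making that clear in my original post.
@Marc The updated answer removes the volume name; it should now return what you are looking for.
This still gives me multiple rows, with the same aggregate repeated for every volume, because there is no grouping by "volume".
Since we are summing all of the volume data, you would not want to group by volume. Is Aggregate (uuid) unique on its own, or is Aggregate (uuid, disktype) unique? Is AggregateData (uuid, datadate) unique, or do rows for the same date have different SizeAvailable, SizeUsed, or SizeTotal values?
Aggregate (uuid) is unique, and so is Aggregate (uuid, disktype): each aggregate has only one disk type. In AggregateData there is one (and only one) row per aggregate per day. VolumeData has one row per volume per day. Each aggregate can have zero (unlikely but possible) to many volumes (the Volume table carries the Aggregate_UUID).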
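The uniqueness questions asked in these comments can be checked directly with a GROUP BY ... HAVING query. The following is a minimal sketch, assuming a hypothetical table variable with int keys and one deliberately duplicated row; against the real schema the same pattern would run on AggregateData (Aggregate_UUID, DataDate).

```sql
-- Hypothetical sample with a deliberate duplicate for (1, 2017-01-02).
DECLARE @AggregateData TABLE (Aggregate_UUID int, DataDate date, SizeTotal bigint);

INSERT INTO @AggregateData VALUES
    (1, '20170101', 100),
    (1, '20170102', 100),
    (1, '20170102', 100);

-- List every (aggregate, day) key that occurs more than once.
SELECT Aggregate_UUID, DataDate, COUNT(*) AS RowsPerDay
FROM @AggregateData
GROUP BY Aggregate_UUID, DataDate
HAVING COUNT(*) > 1;
-- Returns one row: Aggregate_UUID = 1, DataDate = 2017-01-02, RowsPerDay = 3 - 1 = 2.
```

An empty result means the key is unique, in which case joining AggregateData to the volume tables cannot multiply the aggregate rows before the sums are taken.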