SQL join and conditional sum

I have two tables, set up as follows:

PMmx - a tabular version of an origin-destination matrix

Origin  Destination Trips
1           1        0.2
2           1        0.3
3           1        0.4
.           .         .
.           .         .
1         1101       0.6
2         1101       0.7
3         1101       0.8
.          .          .
.          .          .     
1101       1         0.2
1101       2         0.3
1101       3         0.4

ZE - a zone equivalence table

Precinct    Zone
1           1101
2           1102
3           1111

I want to select the rows of the PMmx table whose Origin or Destination matches the Zone column of the ZE table. For example:

Origin  Destination Trips
1         1101       0.6
2         1101       0.7
3         1101       0.8
.          .          .
.          .          .     
1101       1         0.2
1101       2         0.3
1101       3         0.4

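I also want to create a new column called Distribution that computes Trips/(Total Trips), where the total trips are summed over a specific zone number (by Origin or by Destination, depending on which column matches the Zone number in the equivalence table).

For example, for Origin 1, Destination 1101, I want the new Distribution value for that row to be 0.6/(0.6+0.7+0.8).

I tried the following code: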

SELECT 
      PMmx.Origin                  as Origin
     ,PMmx.Destination             as Destination
     ,PMmx.Trips/sum(PMmx.Trips) as 'Distribution'
FROM PMmx

inner join ZE on Origin=ZE.Zone or Destination=ZE.Zone 

Group by Origin, Destination, Trips

I am not sure this produces the correct result. Without the GROUP BY clause I get: Column '2DVISUM_2031PMmx_unpiv.Origin' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause. With the GROUP BY clause I get: Divide by zero error encountered.

The inner join should not leave any sums equal to zero, so I am not sure why I get this error.

Please help!

Edit: I now get duplicate rows with this query:

with cte as (
  select
     origin, destination, trips
  , SUM(Trips) over(partition by Pmx.Origin) sum_trips
  , trips / SUM(Trips) over(partition by Pmx.Origin) trips_div
  from Pmx
  inner join ZE on Pmx.Origin = ZE.Zone
  )
select
origin, destination, trips, sum_trips, trips_div
from cte
union all
select
destination, origin, trips, sum_trips, trips_div
from cte

Updated tables to show the error:

ZE:

Precinct    Zone    
1           1101    
2           1102    
3           1111    
4           1211

PMX:

Origin  Destination Trips
1           1       0.20
2           1       0.30
3           1       0.40
1          1101     0.60
2          1101     0.70
3          1101     0.80
1101        1       0.20
1101        2       0.30
1101        3       0.40
1101       1211     0.60
1211       1101     0.50    

The output contains duplicates with different trips values:

origin destination trips sum_trips trips_div

1101    1   0.20    1.50    0.13333333333333333333333333
1101    2   0.30    1.50    0.20000000000000000000000000
1101    3   0.40    1.50    0.26666666666666666666666666
1101  1211  0.60    1.50    0.40000000000000000000000000
1211  1101  0.50    0.50    1.00000000000000000000000000
1     1101  0.20    1.50    0.13333333333333333333333333
2     1101  0.30    1.50    0.20000000000000000000000000
3     1101  0.40    1.50    0.26666666666666666666666666
1211  1101  0.60    1.50    0.40000000000000000000000000
1101  1211  0.50    0.50    1.00000000000000000000000000

Edit 2: I want to build an 'if statement' with the following rules. If Pmx.Origin = ZE.Zone, then trips_div is trips/SUM(Trips) over(partition by Pmx.Origin), as above. If Pmx.Origin = ZE.Zone and Pmx.Destination = ZE.Zone, then trips_div should still be trips/SUM(Trips) over(partition by Pmx.Origin). If Pmx.Origin does not equal ZE.Zone and Pmx.Destination = ZE.Zone, then trips_div is trips/SUM(Trips) over(partition by Pmx.Destination). I have tried various CASE WHEN statements but cannot get it to work.

I want the output to be:

origin destination trips sum_trips trips_div

    1     1101  0.20    2.10    0.0952380952380952
    2     1101  0.30    2.10    0.1428571428571429
    3     1101  0.40    2.10    0.1904761904761905
    1101    1   0.20    1.50    0.1333333333333333
    1101    2   0.30    1.50    0.2000000000000000
    1101    3   0.40    1.50    0.2666666666666666
    1101  1211  0.60    1.50    0.4000000000000000
    1211  1101  0.50    0.50    1.0000000000000000
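Answer

If I understand your requirement, I think you can get the total with a slightly different approach: a windowed SUM() OVER (...), which makes the total available on every row of the source table. With this you don't need a GROUP BY clause.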

SELECT 
       PMmx.Origin                  as Origin
     , PMmx.Destination             as Destination
     , (PMmx.Trips/sum(PMmx.Trips) over(partition by Destination)) as 'Distribution'
FROM PMmx
inner join ZE on Origin=ZE.Zone or Destination=ZE.Zone 
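
A note on the Divide by zero error from the question, offered as a sketch rather than a confirmed diagnosis. With GROUP BY Origin, Destination, Trips, every row in a group carries the same Trips value, so SUM(Trips) is that value times the number of joined rows. A row with Trips = 0 then divides zero by zero, whatever the join does. The sample data shows no zero cells, so their presence in the full matrix is an assumption. Under that assumption, NULLIF turns a zero denominator into NULL instead of raising the error. The alias p and the EXISTS filter (destination match only) are illustrative choices, not part of the original answer.

```sql
-- Sketch: share of each row within its Destination total,
-- restricted to rows whose Destination is a zone listed in ZE.
SELECT
    p.Origin,
    p.Destination,
    -- NULLIF(x, 0) returns NULL when x = 0, so the division
    -- yields NULL instead of a divide-by-zero error.
    p.Trips / NULLIF(SUM(p.Trips) OVER (PARTITION BY p.Destination), 0) AS Distribution
FROM PMmx AS p
-- EXISTS filters without multiplying rows, unlike a join on an OR condition.
WHERE EXISTS (SELECT 1 FROM ZE WHERE ZE.Zone = p.Destination);
```

The WHERE clause is applied before the window function is evaluated, so the totals only include the rows that survive the filter. Rows whose destination total is zero come back with a NULL Distribution.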

SQL Fiddle

MS SQL Server 2014 Schema Setup:

CREATE TABLE Pmx
    ([Origin] int, [Destination] int, [Trips] decimal(12,2))
;

INSERT INTO Pmx
    ([Origin], [Destination], [Trips])
VALUES
    (1, 1, 0.2),
    (2, 1, 0.3),
    (3, 1, 0.4),
    (1, 1101, 0.6),
    (2, 1101, 0.7),
    (3, 1101, 0.8),
    (1101, 1, 0.2),
    (1101, 2, 0.3),
    (1101, 3, 0.4)
;


CREATE TABLE ZE
    ([Precinct] int, [Zone] int)
;

INSERT INTO ZE
    ([Precinct], [Zone])
VALUES
    (1, 1101),
    (2, 1102),
    (3, 1111)
;

Query 1:

with cte as (
  select
     origin, destination, trips
  , SUM(Trips) over(partition by Pmx.Origin) sum_trips
  , trips / SUM(Trips) over(partition by Pmx.Origin) trips_div
  from Pmx
  inner join ZE on Pmx.Origin = ZE.Zone
  )
select
origin, destination, trips, sum_trips, trips_div
from cte
union -- changed to union so duplication is avoided
select
destination, origin, trips, sum_trips, trips_div
from cte

Results

| origin | destination | trips | sum_trips |          trips_div |
|--------|-------------|-------|-----------|--------------------|
|   1101 |           1 |   0.2 |       0.9 | 0.2222222222222222 |
|   1101 |           2 |   0.3 |       0.9 | 0.3333333333333333 |
|   1101 |           3 |   0.4 |       0.9 | 0.4444444444444444 |
|      1 |        1101 |   0.2 |       0.9 | 0.2222222222222222 |
|      2 |        1101 |   0.3 |       0.9 | 0.3333333333333333 |
|      3 |        1101 |   0.4 |       0.9 | 0.4444444444444444 |

part 2

SQL Fiddle

MS SQL Server 2014 Schema Setup:

CREATE TABLE Pmx
    ([Origin] int, [Destination] int, [Trips] decimal(12,2))
;

INSERT INTO Pmx
    ([Origin], [Destination], [Trips])
VALUES
    (1, 1, 0.20),
    (2, 1, 0.30),
    (3, 1, 0.40),
    (1, 1101, 0.60),
    (2, 1101, 0.70),
    (3, 1101, 0.80),
    (1101, 1, 0.20),
    (1101, 2, 0.30),
    (1101, 3, 0.40),
    (1101, 1211, 0.60),
    (1211, 1101, 0.50)
;


CREATE TABLE ZE
    ([Precinct] int, [Zone] int)
;

INSERT INTO ZE
    ([Precinct], [Zone])
VALUES
    (1, 1101),
    (2, 1102),
    (3, 1111),
    (4, 1211)
;

Query 1:

with cte as (
  select
     origin, destination, trips
  , SUM(Trips) over(partition by Pmx.Origin) sum_trips
  , trips / SUM(Trips) over(partition by Pmx.Origin) trips_div
  from Pmx
  inner join ZE on Pmx.Origin = ZE.Zone
  )
select
origin, destination, trips, sum_trips, trips_div
from cte
union
select
destination, origin, trips, sum_trips, trips_div
from cte
order by 1,2,3,4

Results

| origin | destination | trips | sum_trips |           trips_div |
|--------|-------------|-------|-----------|---------------------|
|      1 |        1101 |   0.2 |       1.5 | 0.13333333333333333 |
|      2 |        1101 |   0.3 |       1.5 |                 0.2 |
|      3 |        1101 |   0.4 |       1.5 | 0.26666666666666666 |
|   1101 |           1 |   0.2 |       1.5 | 0.13333333333333333 |
|   1101 |           2 |   0.3 |       1.5 |                 0.2 |
|   1101 |           3 |   0.4 |       1.5 | 0.26666666666666666 |
|   1101 |        1211 |   0.5 |       0.5 |                   1 |
|   1101 |        1211 |   0.6 |       1.5 |                 0.4 |
|   1211 |        1101 |   0.5 |       0.5 |                   1 |
|   1211 |        1101 |   0.6 |       1.5 |                 0.4 |
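
The rules in Edit 2 of the question can also be written without a UNION, by classifying each row once and partitioning on that classification. This is a minimal sketch, not the answerer's query. It assumes ZE.Zone holds no duplicate values (otherwise the LEFT JOINs would multiply rows), and the names classified, zone_key, matched_on, zo and zd are hypothetical.

```sql
-- Sketch: attribute each row to its Origin zone when Origin is in ZE,
-- otherwise to its Destination zone, then sum within that attribution.
WITH classified AS (
    SELECT
        p.Origin,
        p.Destination,
        p.Trips,
        -- Origin takes precedence when both columns match a zone.
        CASE WHEN zo.Zone IS NOT NULL THEN p.Origin ELSE p.Destination END AS zone_key,
        CASE WHEN zo.Zone IS NOT NULL THEN 'O' ELSE 'D' END AS matched_on
    FROM Pmx AS p
    LEFT JOIN ZE AS zo ON zo.Zone = p.Origin
    LEFT JOIN ZE AS zd ON zd.Zone = p.Destination
    -- Keep only rows where at least one side is a listed zone.
    WHERE zo.Zone IS NOT NULL OR zd.Zone IS NOT NULL
)
SELECT
    Origin,
    Destination,
    Trips,
    SUM(Trips) OVER (PARTITION BY matched_on, zone_key) AS sum_trips,
    Trips / NULLIF(SUM(Trips) OVER (PARTITION BY matched_on, zone_key), 0) AS trips_div
FROM classified
ORDER BY Origin, Destination;
```

Partitioning on matched_on as well as zone_key keeps the origin-side and destination-side totals for the same zone apart. On the part 2 sample data this gives totals of 2.10 for rows ending in 1101, 1.50 for rows starting at 1101, and 0.50 for the row starting at 1211, with each source row returned once and no mirrored rows. The Trips values are taken straight from Pmx, so the rows ending in 1101 show 0.60, 0.70 and 0.80 rather than the 0.20, 0.30 and 0.40 listed in the question's desired output.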
