SQL Query from two tables with intersection and union but also with other unique properties

[Posted]: 2016-12-20 02:01:44

[Question]:

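I have two tables.

Table 1

Brand   | Price | Shape  | weight | Color  | URL
----------------------------------------------------------------
Philips | 13    | Square | 12lbs  | Blue   | example.com/123
Philips | 4     | Round  | 17 lbs | Yellow | example.com/1567

Table 2

Brand   | Price | Shape  | weight | Color  | URL
----------------------------------------------------------------
Philips | 12    | Square | 12lbs  | Blue   | example.com/456
Philips | 4     | Round  | 16 lbs | Yellow | example.com/17987
GE      | 4     | Square | 17 lbs | red    | example.com/17234234

I want to write a SQL query that selects the products from both tables, compares them to find the cheapest price, and returns all attributes together with the URL. I tried a join: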

select  
    case when a.price < b.price then A.price else B.price end as price,
    * 
from 
    Table1 A, table2 B   
where
    A.Brand = B.Brand 
    and A.Shape = B.Shape 
    and A.weight = B.weight 
    and A.color = B.color

But this returns duplicate results.

I also tried UNION and INTERSECT, but that does not give me the URL:

SELECT  
    Brand , Shape, weight, color, URL 
FROM 
    table1 
WHERE
    Price !='NULL' 
    AND BulbShape != 'null' 
    AND Wattage != 'null' 
    AND Lumens_Initial != 'null' 

UNION

SELECT 
    Brand, Shape, weight, color, URL 
FROM 
    table2  
WHERE 
    Price != 'NULL' 
    AND Shape != 'null' 
    AND weight != 'null' 
    AND color != 'null'

EXCEPT 

SELECT 
    Brand, Shape, weight, color, URL   
FROM 
    table1 
WHERE 
    Price != 'NULL' 
    AND Shape != 'null' 
    AND weight != 'null' 
    AND color != 'null'

INTERSECT 

SELECT 
    Brand, Shape, weight, color, URL 
FROM 
    table2 
WHERE
    Price != 'NULL' 
    AND Shape != 'null' 
    AND Wattage != 'null' 
    AND color != 'null'

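I don't have any primary key, because the data is simply collected from the web.

How can I write a query that returns unique rows, including all columns from the tables, with the lowest price?

The expected result should look something like this: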

Brand  | Price| Shape  |weight  |Color  |URL
--------------------------------------------------------------
Philips| 12   | Square | 12 lbs |Blue   |example.com/123
Philips| 4    | Round  | 17 lbs |Yellow |example.com/1567
Philips| 4    | Round  | 16 lbs |Yellow |example.com/17987
GE     | 4    | Square | 17 lbs |red    |example.com/17234234

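In the first row I only take the lowest price; the rest stays the same as in the first table. The second row has different attributes, so I get the rows from both tables. The last row exists only in the second table, so I get that row.

[Comments]:

Can you format your tables and queries so they are easier to read? Also, which database are you using? Did you add the mysql tag by mistake?

What is your expected result for that sample data?

Are you asking about UNION ALL?

No, I am asking how to compare specific columns of two tables and get the value of another column.

Bad habits to kick: using old-style JOINs. The old-style comma-separated list of tables was replaced by the proper ANSI JOIN syntax in the ANSI-92 SQL standard (more than 20 years ago), and its use is discouraged.

[Answer 1]:

This is a classic top-n-per-group problem.

Sample data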

DECLARE @table1 TABLE
(
    brand varchar(50),
    price int,
    shape varchar(50),
    weight varchar(50),
    color varchar(50),
    url varchar(100)
);

DECLARE @table2 TABLE
(
    brand varchar(50),
    price int,
    shape varchar(50),
    weight varchar(50),
    color varchar(50),
    url varchar(100)
);

INSERT INTO @table1 (brand,price,shape,weight,color,url) VALUES
('Philips', 13, 'Square', '12lbs', 'Blue', 'example.com/123'),
('Philips', 4, 'Round', '17lbs', 'Yellow', 'example.com/1567');

INSERT INTO @table2 (brand,price,shape,weight,color,url) VALUES
('Philips', 12, 'Square', '12lbs', 'Blue', 'example.com/456'),
('Philips', 4, 'Round', '16lbs', 'Yellow', 'example.com/17987'),
('GE', 4, 'Square', '17lbs', 'Red', 'example.com/17234234');

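Query

First, UNION ALL the two tables into CTE_Tables. Then use ROW_NUMBER to number the rows within each group, partitioning by all attributes (brand, shape, weight, color) and ordering by price (CTE_RN). Finally, select only the first row of each group.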

WITH
CTE_Tables
AS
(
    SELECT brand,price,shape,weight,color,url
    FROM @table1

    UNION ALL

    SELECT brand,price,shape,weight,color,url
    FROM @table2
)
,CTE_RN
AS
(
    SELECT brand,price,shape,weight,color,url
        ,ROW_NUMBER() OVER(
            PARTITION BY brand,shape,weight,color
            ORDER BY price) AS rn
    FROM CTE_Tables
)
SELECT brand,price,shape,weight,color,url
FROM CTE_RN
WHERE rn = 1
ORDER BY brand DESC,price DESC,shape DESC,weight DESC,color,url;

Result

+---------+-------+--------+--------+--------+----------------------+
|  brand  | price | shape  | weight | color  |         url          |
+---------+-------+--------+--------+--------+----------------------+
| Philips |    12 | Square | 12lbs  | Blue   | example.com/456      |
| Philips |     4 | Round  | 17lbs  | Yellow | example.com/1567     |
| Philips |     4 | Round  | 16lbs  | Yellow | example.com/17987    |
| GE      |     4 | Square | 17lbs  | Red    | example.com/17234234 |
+---------+-------+--------+--------+--------+----------------------+
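
The question's expected output pairs the lowest price with the URL from the first table, which the query above does not do (it returns the URL of the cheapest row). If that literal reading is what is wanted, one possible variation is sketched below. It is not part of the original answer: the src column and the min_price alias are names introduced here for illustration, and the sketch assumes that rows from the first table should win whenever the same product appears in both.

```sql
-- Same sample data as above, repeated so the batch is self-contained.
DECLARE @table1 TABLE
(brand varchar(50), price int, shape varchar(50),
 weight varchar(50), color varchar(50), url varchar(100));

DECLARE @table2 TABLE
(brand varchar(50), price int, shape varchar(50),
 weight varchar(50), color varchar(50), url varchar(100));

INSERT INTO @table1 (brand,price,shape,weight,color,url) VALUES
('Philips', 13, 'Square', '12lbs', 'Blue', 'example.com/123'),
('Philips', 4, 'Round', '17lbs', 'Yellow', 'example.com/1567');

INSERT INTO @table2 (brand,price,shape,weight,color,url) VALUES
('Philips', 12, 'Square', '12lbs', 'Blue', 'example.com/456'),
('Philips', 4, 'Round', '16lbs', 'Yellow', 'example.com/17987'),
('GE', 4, 'Square', '17lbs', 'Red', 'example.com/17234234');

WITH
CTE_Tables
AS
(
    -- src (hypothetical column) records which table a row came from
    SELECT 1 AS src,brand,price,shape,weight,color,url
    FROM @table1

    UNION ALL

    SELECT 2 AS src,brand,price,shape,weight,color,url
    FROM @table2
)
,CTE_RN
AS
(
    SELECT brand,shape,weight,color,url
        -- lowest price across both tables for this product
        ,MIN(price) OVER(
            PARTITION BY brand,shape,weight,color) AS min_price
        -- rows from table 1 are numbered first, so they win when present
        ,ROW_NUMBER() OVER(
            PARTITION BY brand,shape,weight,color
            ORDER BY src,price) AS rn
    FROM CTE_Tables
)
SELECT brand,min_price AS price,shape,weight,color,url
FROM CTE_RN
WHERE rn = 1
ORDER BY brand DESC,price DESC,shape DESC,weight DESC,color,url;
```

For the sample data this returns price 12 together with example.com/123 for the blue square Philips product. Note the trade-off: the returned URL may point to an offer that does not actually carry the returned price.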

[Comments]:

[Answer 2]:
CREATE Procedure joindemo
 as

CREATE TABLE #table1
(
    brand varchar(50),
    price int,
    shape varchar(50),
    weight varchar(50),
    color varchar(50),
    url varchar(100)
    )

CREATE TABLE #table2
(
    brand varchar(50),
    price int,
    shape varchar(50),
    weight varchar(50),
    color varchar(50),
    url varchar(100)
    )


INSERT INTO #table1 VALUES('Philips', 13, 'Square', '12lbs', 'Blue', 'example.com/123')
INSERT INTO #table1 VALUES('Philips', 4, 'Round', '17lbs', 'Yellow', 'example.com/1567')
INSERT INTO #table2 VALUES('Philips', 12, 'Square', '12lbs', 'Blue', 'example.com/456')
INSERT INTO #table2 VALUES('Philips', 4, 'Round', '16lbs', 'Yellow', 'example.com/17987')
INSERT INTO #table2 VALUES('GE', 4, 'Square', '17lbs', 'Red', 'example.com/17234234')

CREATE TABLE #jointable
(
    brand varchar(50),
    price int,
    shape varchar(50),
    weight varchar(50),
    color varchar(50),
    url varchar(100)
    )

INSERT INTO #jointable 
SELECT * FROM #table1
UNION 
SELECT * FROM #table2

SELECT 
j.brand, mp.minprice, j.shape, j.weight, j.color, j.url FROM
(SELECT brand, Min(price) as minprice, shape, weight, color  FROM
#jointable
GROUP BY brand, shape, weight, color) as mp
INNER JOIN #jointable j ON mp.brand = j.brand AND mp.minprice = j.price
AND mp.color = j.color AND mp.shape = j.shape and mp.weight = j.weight

DROP TABLE #table1
DROP TABLE #table2
DROP TABLE #jointable

--exec joindemo;

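Note that your expected output is wrong: the URL in the first row should be example.com/456. Also, you need to decide what to do when two rows have the same price. Since you did not specify, I cannot guess whether you want to show both or only one.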
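
The tie question can be made concrete with window functions. The following is a minimal sketch, assuming a hypothetical table variable @offers in which two offers for the same product share the lowest price; the url tie-breaker is an arbitrary choice made here for illustration, not something the question specifies.

```sql
-- Hypothetical data: two offers tie at the lowest price (12).
DECLARE @offers TABLE
(
    brand varchar(50),
    price int,
    shape varchar(50),
    weight varchar(50),
    color varchar(50),
    url varchar(100)
);

INSERT INTO @offers (brand,price,shape,weight,color,url) VALUES
('Philips', 12, 'Square', '12lbs', 'Blue', 'example.com/456'),
('Philips', 12, 'Square', '12lbs', 'Blue', 'example.com/123'),
('Philips', 13, 'Square', '12lbs', 'Blue', 'example.com/789');

WITH CTE_Ranked
AS
(
    SELECT brand,price,shape,weight,color,url
        -- ROW_NUMBER gives exactly one row number 1 per group;
        -- url is added to ORDER BY so the winner of a tie is deterministic
        ,ROW_NUMBER() OVER(
            PARTITION BY brand,shape,weight,color
            ORDER BY price,url) AS rn
        -- RANK gives every row tied on the lowest price the rank 1
        ,RANK() OVER(
            PARTITION BY brand,shape,weight,color
            ORDER BY price) AS rnk
    FROM @offers
)
SELECT brand,price,shape,weight,color,url,rn,rnk
FROM CTE_Ranked
WHERE rnk = 1   -- keeps all cheapest offers; use rn = 1 to keep only one
ORDER BY url;
```

Filtering on rnk = 1 keeps both offers priced at 12, while filtering on rn = 1 keeps only example.com/123. The MIN/GROUP BY join in the procedure above behaves like the rnk = 1 filter: it returns every row that matches the minimum price.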

[Comments]:
