SQL select rows in a JOIN with max value on a column
【Title】: SQL select rows in a JOIN with max value on a column
【Posted】: 2021-08-16 13:24:53
【Question】:
I am selecting values from 3 different tables to get an overview of some product orders.
Without MAX, there is no problem.
Here is the data I am working with:
-- limited to the first rows for the sake of the example
+------+---------------------+-------------------------------+-------+
| ID | post_date | order_item_name | price |
+------+---------------------+-------------------------------+-------+
| 2348 | 2019-01-23 18:47:34 | product A | 18.9 |
| 2348 | 2019-01-23 18:47:34 | Product B | 4.5 |
| 2348 | 2019-01-23 18:47:34 | Product C | 50.5 |
| 2349 | 2019-01-23 21:59:04 | Product E | 26.5 |
| 2352 | 2019-01-24 07:41:12 | Product C | 50.5 |
+------+---------------------+-------------------------------+-------+
These rows are returned by the following SQL query.
SELECT
    p.ID AS order_id,
    post_date,
    order_item_name,
    meta_value AS price
FROM wp_posts AS p
JOIN wp_woocommerce_order_items
    ON p.ID = order_id
JOIN wp_woocommerce_order_itemmeta
    ON wp_woocommerce_order_items.order_item_id = wp_woocommerce_order_itemmeta.order_item_id
WHERE
    post_type = 'shop_order'
    AND post_status = 'wc-completed'
    AND meta_key = '_line_subtotal';
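Now what I want is to get only the most expensive product from each order.
Obviously, simply using the MAX function with GROUP BY returns one row per order, but the product name does not match the price.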
SELECT
    p.ID AS order_id,
    post_date,
    order_item_name,
    MAX(meta_value) AS price
FROM wp_posts AS p
JOIN wp_woocommerce_order_items
    ON p.ID = order_id
JOIN wp_woocommerce_order_itemmeta
    ON wp_woocommerce_order_items.order_item_id = wp_woocommerce_order_itemmeta.order_item_id
WHERE
    post_type = 'shop_order'
    AND post_status = 'wc-completed'
    AND meta_key = '_line_subtotal'
GROUP BY order_id;
This returns the highest price, but the order_item_name column does not correspond to that price.
+----------+---------------------+-------------------------------+-------+
| order_id | post_date | order_item_name | price |
+----------+---------------------+-------------------------------+-------+
| 2348 | 2019-01-23 18:47:34 | Product A | 50.5 | -- should be product C
| 2349 | 2019-01-23 21:59:04 | Product B | 26.5 | -- product b is 4.5, so it's clearly not matching (same for the following results)
| 2352 | 2019-01-24 07:41:12 | Product A | 60.9 |
| 2354 | 2019-01-25 07:43:36 | Product C | 23.1 |
| 2355 | 2019-01-26 19:59:31 | Product D | 79.9 |
+----------+---------------------+-------------------------------+-------+
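I have managed to find examples for single-table queries, but I am stuck on this multi-join query.
【Comments】:
Please show us an example of your expected output, and preferably your attempt using GROUP BY and MAX().
Note that LIMIT without ORDER BY is fairly meaningless. See: Why should I provide an MCRE for what seems to me to be a very simple SQL query?
@Strawberry I only used LIMIT to keep things short here. I am working with the full data set.
@LaurentS. Edited as you suggested.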
【Answer 1】:
You can use row_number():
SELECT x.*
FROM (SELECT p.ID AS order_id, p.post_date, oi.order_item_name, oim.meta_value AS price,
             -- number the rows of each order, most expensive item first
             ROW_NUMBER() OVER (PARTITION BY p.ID ORDER BY (oim.meta_value + 0) DESC) AS seqnum
      FROM wp_posts p JOIN
           wp_woocommerce_order_items oi
           ON p.ID = oi.order_id JOIN
           wp_woocommerce_order_itemmeta oim
           ON oi.order_item_id = oim.order_item_id
      WHERE p.post_type = 'shop_order' AND
            p.post_status = 'wc-completed' AND
            oim.meta_key = '_line_subtotal'
     ) x
WHERE seqnum = 1;
Note: this assumes that meta_value is a string, so it has to be converted to a number for the ordering.
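To make the role of seqnum concrete, here is a minimal sketch in Python using the standard-library sqlite3 module instead of MySQL. The single items table and its rows are a hypothetical, simplified stand-in for the three joined WooCommerce tables, CAST(... AS REAL) stands in for the meta_value + 0 conversion, and the sketch assumes the bundled SQLite is version 3.25 or newer (required for window functions).

```python
import sqlite3

# Hypothetical, simplified stand-in for the joined WooCommerce tables.
rows = [
    (2348, "Product A", "18.9"),
    (2348, "Product B", "4.5"),
    (2348, "Product C", "50.5"),
    (2349, "Product E", "26.5"),
    (2352, "Product C", "50.5"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (order_id INTEGER, order_item_name TEXT, meta_value TEXT)"
)
conn.executemany("INSERT INTO items VALUES (?, ?, ?)", rows)

# seqnum restarts at 1 for every order_id (PARTITION BY); within an order,
# rows are numbered from the highest numeric price downwards (ORDER BY ... DESC).
numbered = conn.execute(
    """
    SELECT order_id, order_item_name, meta_value,
           ROW_NUMBER() OVER (
               PARTITION BY order_id
               ORDER BY CAST(meta_value AS REAL) DESC
           ) AS seqnum
    FROM items
    ORDER BY order_id, seqnum
    """
).fetchall()

for row in numbered:
    print(row)

# Keeping only seqnum = 1 leaves the most expensive item of each order.
top_per_order = [(o, name, price) for (o, name, price, seq) in numbered if seq == 1]

# Why the cast matters: compared as text, "4.5" sorts above "18.9".
as_text = [
    r[0]
    for r in conn.execute(
        "SELECT meta_value FROM items WHERE order_id = 2348 ORDER BY meta_value DESC"
    )
]
```

For order 2348 the numbered rows are Product C (seqnum 1), Product A (seqnum 2) and Product B (seqnum 3), so the outer filter on seqnum = 1 keeps Product C. If two items tie for the highest price, ROW_NUMBER still assigns 1 to only one of them, and which one is arbitrary unless the ORDER BY has a tiebreaker.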
【Comments】:
It does produce the desired output, but could you elaborate on what is happening with seqnum?
【Answer 2】:
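As expected, an aggregate function only processes a single column without regard to the other columns; aggregates are not filters.
This is why the MAX function returns the highest value found in the specified column, while the other columns are not the ones belonging to the row that holds that maximum (the same goes for the result of any other aggregate function).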
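A small sketch can illustrate this independence. It uses Python's standard-library sqlite3 module and a hypothetical one-table stand-in (items, with a numeric price column) instead of the real WooCommerce schema. Two aggregates are applied to the same group, and each one scans its own column on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical simplified table; the real data lives in three joined tables.
conn.execute(
    "CREATE TABLE items (order_id INTEGER, order_item_name TEXT, price REAL)"
)
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?)",
    [
        (2348, "Product A", 18.9),
        (2348, "Product B", 4.5),
        (2348, "Product C", 50.5),
    ],
)

# Each aggregate is computed over the whole group, column by column:
# MIN picks the alphabetically first name, MAX picks the highest price.
summary = conn.execute(
    "SELECT order_id, MIN(order_item_name), MAX(price) "
    "FROM items GROUP BY order_id"
).fetchone()

print(summary)  # (2348, 'Product A', 50.5)
```

The name comes from one row and the price from another, which is the same kind of mismatch as the "Product A | 50.5" row in the question.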
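To select the columns that match the maximum value, we can JOIN against a subquery that computes the maximum per group; in our case the join is on order_id and price.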
SELECT
    p.ID,
    p.post_date,
    oi.order_item_name,
    oim.meta_value
FROM wp_posts AS p
JOIN wp_woocommerce_order_items AS oi
    ON p.ID = oi.order_id
JOIN wp_woocommerce_order_itemmeta AS oim
    ON oi.order_item_id = oim.order_item_id
JOIN (
    -- highest line subtotal per order; + 0 converts the string meta_value
    -- to a number (assumption taken from the note in Answer 1)
    SELECT
        oi2.order_id,
        MAX(oim2.meta_value + 0) AS price
    FROM wp_woocommerce_order_items AS oi2
    JOIN wp_woocommerce_order_itemmeta AS oim2
        ON oi2.order_item_id = oim2.order_item_id
    WHERE oim2.meta_key = '_line_subtotal'
    GROUP BY oi2.order_id
) b
    ON p.ID = b.order_id AND oim.meta_value + 0 = b.price
WHERE
    p.post_type = 'shop_order'
    AND p.post_status = 'wc-completed'
    AND oim.meta_key = '_line_subtotal';
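The join-back pattern can be tried on a small data set. The sketch below uses Python's standard-library sqlite3 module with a hypothetical single items table and a numeric price column, so it is a simplified illustration rather than the WooCommerce schema. "Product F" is an invented row, added only to show what happens when two items tie for the highest price.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical simplified table standing in for the three joined tables.
conn.execute(
    "CREATE TABLE items (order_id INTEGER, order_item_name TEXT, price REAL)"
)
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?)",
    [
        (2348, "Product A", 18.9),
        (2348, "Product B", 4.5),
        (2348, "Product C", 50.5),
        (2349, "Product E", 26.5),
        (2349, "Product F", 26.5),  # invented tie for the highest price
    ],
)

# The subquery b holds one (order_id, max price) pair per order; joining
# back on both columns keeps only the rows that carry that maximum.
result = conn.execute(
    """
    SELECT i.order_id, i.order_item_name, i.price
    FROM items AS i
    JOIN (
        SELECT order_id, MAX(price) AS price
        FROM items
        GROUP BY order_id
    ) AS b
      ON i.order_id = b.order_id AND i.price = b.price
    ORDER BY i.order_id, i.order_item_name
    """
).fetchall()

for row in result:
    print(row)
```

Order 2348 yields only Product C, while order 2349 yields both Product E and Product F. Unlike a ROW_NUMBER filter, this approach returns every row that shares the maximum, which may or may not be what you want.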