Redshift Correlated Subquery error internal
Posted: 2020-03-07 07:50:26
Question:
I have a query in MySQL that counts the products from specific vendors, broken down by product_status such as Live, Pause, soldout, Partial-Soldout and so on. The query contains subqueries, but it runs perfectly in MySQL. On Redshift (PostgreSQL v8.x) it fails with the error: correlated subquery pattern is not supported due to internal error
Query (Postgres):
SELECT COUNT(CASE WHEN (vendor_id = 6 AND status = 1) THEN 1 ELSE NULL END) AS "vex",
COUNT(CASE WHEN (vendor_id = 6 AND status = 1 AND p.p_id IN (SELECT pov.p_id FROM product_option_value pov WHERE pov.p_id AND p.quantity != pov.quantity AND pov.quantity = 0 GROUP BY pov.p_id)) THEN 1 ELSE NULL END) AS "vex-Partial-Soldout",
COUNT(CASE WHEN (vendor_id = 6 AND status = 1 AND p.quantity = 0) THEN 1 ELSE NULL END) AS "vex-Soldout",
COUNT(CASE WHEN (vendor_id = 5 AND status = 1) THEN 1 ELSE NULL END) AS "vey-DXB",
COUNT(CASE WHEN (vendor_id = 5 AND status = 1 AND p.p_id IN (SELECT pov.p_id FROM product_option_value pov WHERE pov._id AND p.quantity != pov.quantity AND pov.quantity = 0 GROUP BY pov.p_id)) THEN 1 ELSE NULL END) AS "vey-Partial-Soldout",
COUNT(CASE WHEN (vendor_id = 5 AND status = 1 AND p.quantity = 0) THEN 1 ELSE NULL END) AS "vey-Soldout"
FROM product p
Table structure:
//Product p table
* p_id * model * vendor_id * status * Quantity *
* 1001 * HB1 * 1 * 1 * 10 *
* 1002 * HB2 * 6 * 1 * 17 *
* 1003 * HB3 * 5 * 1 * 19 *
* 1004 * HB4 * 2 * 1 * 3 *
* 1005 * HB5 * 1 * 1 * 8 *
* 1006 * HB6 * 6 * 1 * 55 *
* 1007 * HB7 * 3 * 1 * 32 *
* 1008 * HB8 * 5 * 1 * 6 *
* 1009 * HB9 * 5 * 1 * 10 *
//product_option_value pov table
* pov_id * p_id * opt_id * quantity *
* 1 * 1001 * 11 * 10 *
* 2 * 1002 * 11 * 17 *
* 3 * 1003 * 11 * 0 *
* 4 * 1004 * 11 * 3 *
* 5 * 1005 * 11 * 8 *
* 6 * 1006 * 11 * 0 *
* 7 * 1007 * 11 * 32 *
* 8 * 1008 * 11 * 6 *
* 9 * 1009 * 11 * 0 *
The GROUP BY is required in the subquery, so a LEFT JOIN does not solve the problem either.
Comments:
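***.com/help/minimal-reproducible-example Narrow the query down to a single column in the SELECT clause, then keep reducing the expression until you find the specific element that causes the problem. FYI, putting a SELECT statement inside the SELECT list is a very inefficient way to write a SQL query. It effectively requires running one query for every row of the outer query. You can probably rewrite it to use a LEFT OUTER JOIN and then test whether the joined column is NULL.
@JohnRotenstein, I am a big fan of yours and have followed many of your answers. I tried a LEFT JOIN, but the query needs to be grouped by product ID.
You can do a LEFT OUTER JOIN against a query (one that includes the GROUP BY).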
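The suggestion in the last comment can be sketched as follows. This is a minimal sketch, not a tested Redshift query: it runs the SQL in an in-memory SQLite database (an assumption made only so the example is runnable), loads the sample rows from the question, and joins product to a grouped derived table of product IDs that have at least one option with quantity 0. The alias z, the name JOIN_QUERY and the underscore column aliases are illustrative. The condition "p.quantity != pov.quantity AND pov.quantity = 0" is rewritten as "a zero-quantity option exists AND p.quantity <> 0", which means the derived table no longer refers to the outer row.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Redshift (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (p_id INT, model TEXT, vendor_id INT, status INT, quantity INT);
CREATE TABLE product_option_value (pov_id INT, p_id INT, opt_id INT, quantity INT);
-- Sample rows copied from the question.
INSERT INTO product VALUES
  (1001,'HB1',1,1,10), (1002,'HB2',6,1,17), (1003,'HB3',5,1,19),
  (1004,'HB4',2,1,3),  (1005,'HB5',1,1,8),  (1006,'HB6',6,1,55),
  (1007,'HB7',3,1,32), (1008,'HB8',5,1,6),  (1009,'HB9',5,1,10);
INSERT INTO product_option_value VALUES
  (1,1001,11,10), (2,1002,11,17), (3,1003,11,0),
  (4,1004,11,3),  (5,1005,11,8),  (6,1006,11,0),
  (7,1007,11,32), (8,1008,11,6),  (9,1009,11,0);
""")

# z holds one row per product that has at least one option with quantity 0.
# Because z is grouped by p_id, the LEFT OUTER JOIN cannot duplicate product rows.
JOIN_QUERY = """
SELECT
  COUNT(CASE WHEN p.vendor_id = 6 THEN 1 END) AS vex,
  COUNT(CASE WHEN p.vendor_id = 6 AND z.p_id IS NOT NULL
              AND p.quantity <> 0 THEN 1 END) AS vex_partial_soldout,
  COUNT(CASE WHEN p.vendor_id = 6 AND p.quantity = 0 THEN 1 END) AS vex_soldout,
  COUNT(CASE WHEN p.vendor_id = 5 THEN 1 END) AS vey,
  COUNT(CASE WHEN p.vendor_id = 5 AND z.p_id IS NOT NULL
              AND p.quantity <> 0 THEN 1 END) AS vey_partial_soldout,
  COUNT(CASE WHEN p.vendor_id = 5 AND p.quantity = 0 THEN 1 END) AS vey_soldout
FROM product p
LEFT OUTER JOIN (
    SELECT p_id
    FROM product_option_value
    WHERE quantity = 0
    GROUP BY p_id
) z ON z.p_id = p.p_id
WHERE p.status = 1
"""

counts = conn.execute(JOIN_QUERY).fetchone()
print(counts)  # (2, 1, 0, 3, 2, 0) for the sample data
```

With the sample data, vendor 6 has two live products, one of them partially sold out (1006), and vendor 5 has three, two of them partially sold out (1003 and 1009).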
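Answer 1:
The logic in the subqueries is hard to follow, so I am not 100% sure I have this right. For example, the sample data seems to have only one row per product, but I don't know whether that is really the case. I also don't know what opt_id is, which seems like it would be useful in a query that uses the pov table.
That said, you seem to need only a JOIN and a GROUP BY to get what you want, which gives a simpler query that should be faster in any database.
In the following query I have also split the two vendors onto separate rows. That at least helps get the logic right: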
SELECT p.vendor_id,
COUNT(DISTINCT p.p_id) AS num_products,
SUM(CASE WHEN p.quantity <> pov.quantity AND pov.quantity = 0 THEN 1 ELSE 0 END) as partial_soldout,
COUNT(DISTINCT CASE WHEN p.quantity = 0 THEN p.p_id END) as soldout
FROM product p LEFT JOIN
product_option_value pov
ON pov.p_id = p.p_id
WHERE p.vendor_id IN (5, 6) AND p.status = 1
GROUP BY p.vendor_id;
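As the answer notes, it is unclear whether a product can have more than one option row. The sketch below runs the query above in an in-memory SQLite database (an assumption, used only to make it runnable) on the question's sample data plus one hypothetical extra option row for product 1006 that is not in the original data. It compares the SUM-based count with a COUNT(DISTINCT ...) variant; the column names partial_rows and partial_products are illustrative.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Redshift (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (p_id INT, model TEXT, vendor_id INT, status INT, quantity INT);
CREATE TABLE product_option_value (pov_id INT, p_id INT, opt_id INT, quantity INT);
-- Sample rows copied from the question.
INSERT INTO product VALUES
  (1001,'HB1',1,1,10), (1002,'HB2',6,1,17), (1003,'HB3',5,1,19),
  (1004,'HB4',2,1,3),  (1005,'HB5',1,1,8),  (1006,'HB6',6,1,55),
  (1007,'HB7',3,1,32), (1008,'HB8',5,1,6),  (1009,'HB9',5,1,10);
INSERT INTO product_option_value VALUES
  (1,1001,11,10), (2,1002,11,17), (3,1003,11,0),
  (4,1004,11,3),  (5,1005,11,8),  (6,1006,11,0),
  (7,1007,11,32), (8,1008,11,6),  (9,1009,11,0);
-- HYPOTHETICAL row, not in the question: a second sold-out option for product 1006.
INSERT INTO product_option_value VALUES (10,1006,12,0);
""")

PER_VENDOR_QUERY = """
SELECT p.vendor_id,
       COUNT(DISTINCT p.p_id) AS num_products,
       -- Counts matching (product, option) pairs, so 1006 is counted twice here.
       SUM(CASE WHEN p.quantity <> pov.quantity AND pov.quantity = 0
                THEN 1 ELSE 0 END) AS partial_rows,
       -- Counts each partially sold-out product once.
       COUNT(DISTINCT CASE WHEN p.quantity <> pov.quantity AND pov.quantity = 0
                           THEN p.p_id END) AS partial_products,
       COUNT(DISTINCT CASE WHEN p.quantity = 0 THEN p.p_id END) AS soldout
FROM product p
LEFT JOIN product_option_value pov ON pov.p_id = p.p_id
WHERE p.vendor_id IN (5, 6) AND p.status = 1
GROUP BY p.vendor_id
ORDER BY p.vendor_id
"""

rows = conn.execute(PER_VENDOR_QUERY).fetchall()
for row in rows:
    print(row)
# (5, 3, 2, 2, 0)
# (6, 2, 2, 1, 0)
```

If every product really has exactly one option row, as in the posted sample, the two counts agree and the SUM form is sufficient.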
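Answer 2:
You have to remove the "pov.p_id AND" that follows the WHERE keyword in the two expressions that compute "vex-Partial-Soldout" and "vey-Partial-Soldout". The query then works on PostgreSQL 9.6, and its output matches the output of your sqlserver version.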
SELECT COUNT(CASE WHEN (vendor_id = 6 AND status = 1) THEN 1 ELSE NULL END) AS "vex",
COUNT(CASE WHEN (vendor_id = 6 AND status = 1 AND p.p_id IN (SELECT pov.p_id FROM product_option_value pov WHERE p.quantity != pov.quantity AND pov.quantity = 0 GROUP BY pov.p_id)) THEN 1 ELSE NULL END) AS "vex-Partial-Soldout",
COUNT(CASE WHEN (vendor_id = 6 AND status = 1 AND p.quantity = 0) THEN 1 ELSE NULL END) AS "vex-Soldout",
COUNT(CASE WHEN (vendor_id = 5 AND status = 1) THEN 1 ELSE NULL END) AS "vey-DXB",
COUNT(CASE WHEN (vendor_id = 5 AND status = 1 AND p.p_id IN (SELECT pov.p_id FROM product_option_value pov WHERE p.quantity != pov.quantity AND pov.quantity = 0 GROUP BY pov.p_id)) THEN 1 ELSE NULL END) AS "vey-Partial-Soldout",
COUNT(CASE WHEN (vendor_id = 5 AND status = 1 AND p.quantity = 0) THEN 1 ELSE NULL END) AS "vey-Soldout"
FROM product p
Note: I did not look closely at your logic. As a best practice, avoid writing SELECT statements inside your SELECT list.
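The subquery in this query still refers to the outer row through p.quantity. A minimal sketch of how that reference could be moved out of the subquery is shown below, again using an in-memory SQLite database as a stand-in (an assumption; this has not been run on Redshift). It compares the set of products selected by the correlated IN predicate with the set selected by an uncorrelated one on the question's sample data. The names CORRELATED and UNCORRELATED are illustrative.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Redshift (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (p_id INT, model TEXT, vendor_id INT, status INT, quantity INT);
CREATE TABLE product_option_value (pov_id INT, p_id INT, opt_id INT, quantity INT);
-- Sample rows copied from the question.
INSERT INTO product VALUES
  (1001,'HB1',1,1,10), (1002,'HB2',6,1,17), (1003,'HB3',5,1,19),
  (1004,'HB4',2,1,3),  (1005,'HB5',1,1,8),  (1006,'HB6',6,1,55),
  (1007,'HB7',3,1,32), (1008,'HB8',5,1,6),  (1009,'HB9',5,1,10);
INSERT INTO product_option_value VALUES
  (1,1001,11,10), (2,1002,11,17), (3,1003,11,0),
  (4,1004,11,3),  (5,1005,11,8),  (6,1006,11,0),
  (7,1007,11,32), (8,1008,11,6),  (9,1009,11,0);
""")

# Predicate as written in the answer: the subquery reads p.quantity from the outer row.
CORRELATED = """
SELECT p.p_id
FROM product p
WHERE p.status = 1
  AND p.p_id IN (SELECT pov.p_id
                 FROM product_option_value pov
                 WHERE p.quantity != pov.quantity AND pov.quantity = 0
                 GROUP BY pov.p_id)
ORDER BY p.p_id
"""

# Same filter with the outer reference hoisted: when pov.quantity = 0,
# "p.quantity != pov.quantity" reduces to "p.quantity <> 0".
UNCORRELATED = """
SELECT p.p_id
FROM product p
WHERE p.status = 1
  AND p.quantity <> 0
  AND p.p_id IN (SELECT pov.p_id
                 FROM product_option_value pov
                 WHERE pov.quantity = 0)
ORDER BY p.p_id
"""

correlated_ids = [r[0] for r in conn.execute(CORRELATED)]
uncorrelated_ids = [r[0] for r in conn.execute(UNCORRELATED)]
print(correlated_ids, uncorrelated_ids)  # [1003, 1006, 1009] [1003, 1006, 1009]
```

The uncorrelated predicate can be dropped into each CASE expression in place of the original IN (...) test, since its subquery no longer depends on the row being counted.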