Slow MySQL inner joins with multiple ORs

[Posted]: 2017-08-28 02:01:17 [Question]:

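I'm helping a friend with an e-commerce website. It lets users choose between different colours, styles, uses and types of the products he sells. Those choices add the following joins to the query: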

INNER JOIN tbl_coloursProducts col ON ( p.product_id = col.productID AND (col.colourID = 2 OR col.colourID = 3 OR col.colourID = 5 OR col.colourID = 8 OR col.colourID = 10)) 
INNER JOIN tbl_useProducts tbluse ON ( p.product_id = tbluse.productID AND (tbluse.useID = 15 OR tbluse.useID = 16 OR tbluse.useID = 17 OR tbluse.useID = 18)) 
INNER JOIN tbl_styleProducts style ON ( p.product_id = style.productID AND (style.styleID = 39 OR style.styleID = 44)) 
INNER JOIN tbl_typeProducts type ON ( p.product_id = type.productID AND (type.typeID = 46 OR type.typeID = 48 OR type.typeID = 50)) 

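The query loads quickly enough when only a few options are selected, but some users select many or all of the options, which makes the query run for more than 30 seconds and time out.

Is there a better way to optimize the query without changing the table structure?

Here is the full query: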

SELECT *, 
       p.product_id, 
       Coalesce((SELECT p2sp.price 
                 FROM   ab_product_specials p2sp 
                 WHERE  p2sp.product_id = p.product_id 
                        AND p2sp.customer_group_id = '1' 
                        AND ( ( p2sp.date_start = '0000-00-00' 
                                 OR p2sp.date_start < Now() ) 
                              AND ( p2sp.date_end = '0000-00-00' 
                                     OR p2sp.date_end > Now() ) ) 
                 ORDER  BY p2sp.priority ASC, 
                           p2sp.price ASC 
                 LIMIT  1), p.price) AS final_price, 
       pd.name                       AS name, 
       m.name                        AS manufacturer, 
       ss.name                       AS stock, 
       (SELECT Avg(r.rating) 
        FROM   ab_reviews r 
        WHERE  p.product_id = r.product_id 
        GROUP  BY r.product_id)      AS rating, 
       (SELECT Count(rw.review_id) 
        FROM   ab_reviews rw 
        WHERE  p.product_id = rw.product_id 
        GROUP  BY rw.product_id)     AS review 
FROM   ab_products p 
       LEFT JOIN ab_product_descriptions pd 
              ON ( p.product_id = pd.product_id 
                   AND pd.language_id = '1' ) 
       LEFT JOIN ab_products_to_stores p2s 
              ON ( p.product_id = p2s.product_id ) 
       LEFT JOIN ab_manufacturers m 
              ON ( p.manufacturer_id = m.manufacturer_id ) 
       LEFT JOIN ab_stock_statuses ss 
              ON ( p.stock_status_id = ss.stock_status_id 
                   AND ss.language_id = '1' ) 
       LEFT JOIN ab_products_to_categories p2c 
              ON ( p.product_id = p2c.product_id ) 
       INNER JOIN tbl_coloursproducts col 
               ON ( p.product_id = col.productid 
                    AND ( col.colourid = 2 
                           OR col.colourid = 3 
                           OR col.colourid = 5 
                           OR col.colourid = 8 
                           OR col.colourid = 10 ) ) 
       INNER JOIN tbl_useproducts tbluse 
               ON ( p.product_id = tbluse.productid 
                    AND ( tbluse.useid = 15 
                           OR tbluse.useid = 16 
                           OR tbluse.useid = 17 
                           OR tbluse.useid = 18 ) ) 
       INNER JOIN tbl_styleproducts style 
               ON ( p.product_id = style.productid 
                    AND ( style.styleid = 39 
                           OR style.styleid = 44 ) ) 
       INNER JOIN tbl_typeproducts type 
               ON ( p.product_id = type.productid 
                    AND ( type.typeid = 46 
                           OR type.typeid = 48 
                           OR type.typeid = 50 ) ) 
WHERE  p.status = '1' 
       AND p.date_available <= Now() 
       AND p2s.store_id = 0 
       AND p2c.category_id = 131 
GROUP  BY p.product_id 
ORDER  BY p.product_id DESC 
LIMIT  0, 8 

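Without the custom parts, the query runs fine.

[Question comments]:

Can you add indexes to the tables?

Please edit your question to include the EXPLAIN output for the large query and your schema with its indexes. I have a feeling this is page thrashing.

Added the full query. It is a standard AbanteCart query, but someone added the extra parts.

@Tom - I hope my edit didn't change your message. The query you pasted was all on one line and hard to read, so I applied automatic formatting to it. I think it makes the post clearer. Feel free to roll back the change if you prefer.

[Answer 1]: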

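Looking at that query, I'm not sure the ORs themselves are the problem (although you could make the code more compact by replacing each chain of ORs with an IN clause). Rather, I suspect that selecting more and more options results in more rows being returned, and that this causes problems with the subqueries in the SELECT clause.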
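One way to test that suspicion is to count how many joined rows each product produces before GROUP BY collapses them. The sketch below reuses the table and column names from the question. It assumes each link table holds one row per product and option pair, and it deliberately leaves out the other joins and filters. If joined_rows grows multiplicatively as more options are selected, row fan-out is the likely cause.

```sql
-- Diagnostic sketch: rows generated per product by the four filter joins.
-- Under the one-row-per-pair assumption, a product matching 5 colours,
-- 4 uses, 2 styles and 3 types yields 5 * 4 * 2 * 3 = 120 joined rows
-- before GROUP BY reduces them to one.
SELECT p.product_id,
       COUNT(*) AS joined_rows
FROM   ab_products p
       INNER JOIN tbl_coloursProducts col
               ON p.product_id = col.productID
                  AND col.colourID IN ( 2, 3, 5, 8, 10 )
       INNER JOIN tbl_useProducts tbluse
               ON p.product_id = tbluse.productID
                  AND tbluse.useID IN ( 15, 16, 17, 18 )
       INNER JOIN tbl_styleProducts style
               ON p.product_id = style.productID
                  AND style.styleID IN ( 39, 44 )
       INNER JOIN tbl_typeProducts type
               ON p.product_id = type.productID
                  AND type.typeID IN ( 46, 48, 50 )
GROUP  BY p.product_id
ORDER  BY joined_rows DESC
LIMIT  10;
```

Running it once with a short option list and once with every option selected shows how quickly the intermediate row count climbs.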

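Could you try the query with the subqueries removed from the SELECT clause and see what effect that has?

You can remove the subqueries fairly easily: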

SELECT *, 
       p.product_id, 
       Coalesce(sub1.price, p.price) AS final_price, 
       pd.name                       AS name, 
       m.name                        AS manufacturer, 
       ss.name                       AS stock, 
       sub0.rating, 
       sub0.review 
FROM   ab_products p 
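-- LEFT OUTER JOIN keeps products that have no reviews,
-- as the correlated subqueries in the original query did.
LEFT OUTER JOIN
(
    SELECT r.product_id,
            Avg(r.rating)  AS rating, 
            Count(r.review_id) AS review 
    FROM   ab_reviews r 
    GROUP  BY r.product_id
) sub0
ON p.product_id = sub0.product_id 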
LEFT OUTER JOIN
(
    SELECT p2sp.product_id,
            SUBSTRING_INDEX(GROUP_CONCAT(p2sp.price ORDER  BY p2sp.priority ASC, p2sp.price ASC ), ',', 1) AS price
    FROM   ab_product_specials p2sp 
    WHERE  p2sp.customer_group_id = '1' 
    AND ( p2sp.date_start = '0000-00-00' OR p2sp.date_start < NOW() ) 
    AND ( p2sp.date_end = '0000-00-00' OR p2sp.date_end > NOW() )
    GROUP BY p2sp.product_id
) sub1
ON p.product_id = sub1.product_id 
       LEFT JOIN ab_product_descriptions pd 
              ON ( p.product_id = pd.product_id 
                   AND pd.language_id = '1' ) 
       LEFT JOIN ab_products_to_stores p2s 
              ON ( p.product_id = p2s.product_id ) 
       LEFT JOIN ab_manufacturers m 
              ON ( p.manufacturer_id = m.manufacturer_id ) 
       LEFT JOIN ab_stock_statuses ss 
              ON ( p.stock_status_id = ss.stock_status_id 
                   AND ss.language_id = '1' ) 
       LEFT JOIN ab_products_to_categories p2c 
              ON ( p.product_id = p2c.product_id ) 
       INNER JOIN tbl_coloursproducts col 
               ON ( p.product_id = col.productid 
                    AND ( col.colourid = 2 
                           OR col.colourid = 3 
                           OR col.colourid = 5 
                           OR col.colourid = 8 
                           OR col.colourid = 10 ) ) 
       INNER JOIN tbl_useproducts tbluse 
               ON ( p.product_id = tbluse.productid 
                    AND ( tbluse.useid = 15 
                           OR tbluse.useid = 16 
                           OR tbluse.useid = 17 
                           OR tbluse.useid = 18 ) ) 
       INNER JOIN tbl_styleproducts style 
               ON ( p.product_id = style.productid 
                    AND ( style.styleid = 39 
                           OR style.styleid = 44 ) ) 
       INNER JOIN tbl_typeproducts type 
               ON ( p.product_id = type.productid 
                    AND ( type.typeid = 46 
                           OR type.typeid = 48 
                           OR type.typeid = 50 ) ) 
WHERE  p.status = '1' 
       AND p.date_available <= Now() 
       AND p2s.store_id = 0 
       AND p2c.category_id = 131 
GROUP  BY p.product_id 
ORDER  BY p.product_id DESC 
LIMIT  0, 8 

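As an aside, when you read from ab_product_specials you check whether date_start and date_end are 0000-00-00 (i.e. a date), but you also compare them with NOW(), which returns a date/time value. Are these columns DATE or DATETIME?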
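If the columns turn out to be DATE, the distinction matters at the boundaries: when MySQL compares a DATE with a DATETIME, the DATE is treated as midnight at the start of that day. A minimal sketch, assuming DATE columns (the question does not confirm this), first checks the declared types and then compares against CURDATE() so that both sides are dates. Whether a special should still apply on its end date is also an assumption here.

```sql
-- Check how the two columns are declared.
SHOW COLUMNS FROM ab_product_specials LIKE 'date_%';

-- Assuming DATE columns: CURDATE() keeps the comparison date-to-date.
-- Note the changed boundary: a special whose date_end is today still
-- matches here, whereas date_end > NOW() excludes it for most of that day.
SELECT p2sp.product_id,
       p2sp.price
FROM   ab_product_specials p2sp
WHERE  p2sp.customer_group_id = '1'
       AND ( p2sp.date_start = '0000-00-00'
              OR p2sp.date_start <= CURDATE() )
       AND ( p2sp.date_end = '0000-00-00'
              OR p2sp.date_end >= CURDATE() );
```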

[Comments]:

[Answer 2]:

My first thought was to use IN to make the query easier to read:

INNER JOIN tbl_coloursProducts col 
   ON p.product_id = col.productID AND col.colourID IN ( 2, 3, 5, 8, 10 )

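Then I thought: I wonder whether they are dynamically building SQL text to squirt into the database?! When the query keeps changing in this way, the optimizer is unlikely to do a good job of optimizing it.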

Consider a temporary table (pseudocode):

-- One time:
CREATE TABLE ScratchColours ( colourID INT NOT NULL UNIQUE );

-- For each query:
DELETE FROM ScratchColours;
INSERT INTO ScratchColours VALUES ( 2 ), ( 3 ), ( 5 ), ( 8 ), ( 10 );

Now your dynamic list of values simply becomes another join:

tbl_coloursProducts NATURAL JOIN ScratchColours

(You can use an inner join if you must!)

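Now, having a base table for each concurrent user is probably not a good way to scale the system. So consider how you could pass a bag of colourID values to database logic (e.g. a stored procedure), put them into a table (e.g. a temporary table), and then join from there to your base tables.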
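A minimal sketch of that idea, assuming MySQL's CREATE TEMPORARY TABLE, which is visible only to the session that creates it, so concurrent users do not see each other's rows. The table name ScratchColours is illustrative, and how the application passes the list in (a stored procedure parameter, separate INSERT statements, and so on) is left open.

```sql
-- Per-session scratch table; dropped automatically when the connection closes.
CREATE TEMPORARY TABLE IF NOT EXISTS ScratchColours (
    colourID INT NOT NULL PRIMARY KEY
);

-- Reset and load the options chosen for this request.
-- The DELETE matters when connections are pooled and the table survives
-- from an earlier request on the same session.
DELETE FROM ScratchColours;
INSERT INTO ScratchColours (colourID) VALUES (2), (3), (5), (8), (10);

-- The variable-length list of colours is now an ordinary join.
SELECT DISTINCT cp.productID
FROM   tbl_coloursProducts cp
       INNER JOIN ScratchColours sc
               ON sc.colourID = cp.colourID;
```

The same pattern would apply to the use, style and type lists, each with its own scratch table, so the text of the main query stays the same however many options are selected.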

[Comments]:
