MySQL index optimization with Subquery vs Left Joins
【Title】: MySQL index optimization with Subquery vs Left Joins 【Posted】: 2012-11-26 07:04:46 【Question】: I created 2 queries that I can use to perform the same function. Each has properties that I would like to combine into a single query, but I have been unable to.
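QUERY 1 - Gives me the results I want. Slow (~0.700 sec)
QUERY 2 - Gives me a lot of rows that I ignore and skip. Fast (~0.005 sec)
My goal is to modify QUERY 2 to drop all the null-price rows except 1 per item. I can't seem to do this without sacrificing performance. That is down to my lack of experience and understanding of index usage in MySQL.
QUERY 1
Uses a poorly designed subquery that prevents index use across tbl_sale (e), which contains 10k rows.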
SELECT b.id, b.sv, b.description, der.store_id, f.name, der.price
FROM tbl_watch AS a
LEFT JOIN tbl_item AS b ON a.item_id = b.id
LEFT JOIN (
SELECT c.store_id, d.flyer_id, e.item_id, e.price
FROM tbl_storewatch AS c, tbl_storeflyer AS d
FORCE INDEX ( storebeg_ndx ) , tbl_sale AS e
WHERE c.user_id = '$user_id'
AND (
d.store_id = c.store_id
AND d.date_beg = '20121206'
)
AND e.flyer_id = d.flyer_id
) AS der ON a.item_id = der.item_id
LEFT JOIN tbl_store as f ON der.store_id = f.id
WHERE a.user_id = '$user_id'
ORDER BY b.description ASC
Here is the EXPLAIN for QUERY 1:
id  select_type  table       type    possible_keys  key            key_len  ref           rows  Extra
1   PRIMARY      a           ref     user_item_ndx  user_item_ndx  4        const         30    Using index; Using temporary; Using filesort
1   PRIMARY      b           eq_ref  PRIMARY        PRIMARY        4        a.item_id     1
1   PRIMARY      <derived2>  ALL     NULL           NULL           NULL     NULL          300
1   PRIMARY      f           eq_ref  PRIMARY        PRIMARY        4        der.store_id  1
2   DERIVED      c           ref     user_ndx       user_ndx       4                      6
2   DERIVED      e           ALL     NULL           NULL           NULL     NULL          9473  Using join buffer
2   DERIVED      d           eq_ref  storebeg_ndx   storebeg_ndx   8        c.store_id    1     Using where
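The `ALL` row for `e`, with `NULL` in `possible_keys`, suggests that no index on tbl_sale can serve the join condition `e.flyer_id = d.flyer_id` on its own. One way to check that reading is sketched below. It assumes tbl_sale has no index whose first column is flyer_id, and the index name `flyeritem_ndx` is made up for illustration.

```sql
-- List the existing indexes on tbl_sale and check whether any of them
-- has flyer_id as its first column (Seq_in_index = 1).
SHOW INDEX FROM tbl_sale;

-- Hypothetical index, not part of the original schema: with flyer_id as the
-- leading column, the join on e.flyer_id = d.flyer_id can become an index
-- lookup instead of a scan over every row of tbl_sale.
ALTER TABLE tbl_sale ADD INDEX flyeritem_ndx (flyer_id, item_id);
```

Running EXPLAIN on QUERY 1 again afterwards would show whether the `e` row moves from `ALL` to `ref`.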
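QUERY 2
Uses all LEFT JOINs, which is very efficient (apart from the ORDER BY). Every join uses an index. This query returns every possible match for each item in tbl_watch. Here is the query: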
SELECT b.id, b.sv, b.description, c.store_id, f.name, e.price
FROM tbl_watch AS a
LEFT JOIN tbl_item AS b ON a.item_id = b.id
LEFT JOIN tbl_storewatch AS c ON c.user_id = '$user_id'
LEFT JOIN tbl_storeflyer AS d ON d.store_id = c.store_id
AND d.date_beg = '$s_date'
LEFT JOIN tbl_sale AS e ON e.item_id = a.item_id
AND e.flyer_id = d.flyer_id
LEFT JOIN tbl_store as f ON d.store_id = f.id
WHERE a.user_id = '$user_id'
ORDER BY b.description ASC
Here is the EXPLAIN for QUERY 2:
id  select_type  table  type    possible_keys          key            key_len  ref                   rows  Extra
1   SIMPLE       a      ref     user_item_ndx          user_item_ndx  4        const                 6     Using index; Using temporary; Using filesort
1   SIMPLE       b      eq_ref  PRIMARY                PRIMARY        4        a.item_id             1
1   SIMPLE       c      ref     user_ndx               user_ndx       4        const                 2
1   SIMPLE       d      eq_ref  storebeg_ndx,storendx  storebeg_ndx   8        c.store_id,const      1
1   SIMPLE       e      eq_ref  itemflyer_ndx          itemflyer_ndx  8        a.item_id,d.flyer_id  1
1   SIMPLE       f      eq_ref  PRIMARY                PRIMARY        4        d.store_id            1
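How can I modify QUERY 2 (the more efficient one) to give me just the rows I need, as QUERY 1 does?
Thanks, Mike
【Discussion】:
I'm not quite sure how the first query gives you what you want. A left join is not a left outer join (though it may be in MySQL, that is not SQL-compliant), and NULL values are not unique values. I don't have MySQL handy, but putting this into PostgreSQL did not give the results you describe. My answer is below...
【Answer 1】: I think this query will give you what you want: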
select a.id, a.sv, a.description, c.id, c.name, b.price
from
tbl_item a left outer join tbl_sale b on (a.id=b.item_id)
left outer join tbl_storeflyer d on (b.flyer_id=d.flyer_id and d.date_beg = '20120801')
left outer join tbl_store c on (d.store_id = c.id)
left outer join tbl_storewatch x on (c.id = x.store_id)
left outer join tbl_watch y on (a.id = y.item_id);
If NULLs are involved, you may still need some of these as left joins. Another approach is to use a UNION, which may be faster with MySQL:
select a.id, a.sv, a.description, c.id as store_id, c.name, b.price
from
tbl_item a,
tbl_sale b,
tbl_storeflyer d,
tbl_store c,
tbl_storewatch x,
tbl_watch y
where
a.id = b.item_id and
b.flyer_id = d.flyer_id and
d.store_id = c.id and
c.id = x.store_id and
a.id = y.item_id and
d.date_beg = '20120801'
union
select a.id, a.sv, a.description, null as store_id, null as name, null as price
from
tbl_item a
where
a.id not in (select b.item_id from tbl_sale b);
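You could write the second half of the UNION as a left outer join instead of the NOT IN subquery, depending on how your version of MySQL optimizes it.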
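A sketch of what that left outer join form of the second half might look like, using the same tables and aliases as the UNION above. It has not been run against the asker's data.

```sql
-- Anti-join: keep only the items that have no row at all in tbl_sale.
-- The LEFT OUTER JOIN produces NULLs in b.* for unmatched items,
-- and the WHERE clause keeps exactly those rows.
SELECT a.id, a.sv, a.description,
       NULL AS store_id, NULL AS name, NULL AS price
FROM tbl_item AS a
LEFT OUTER JOIN tbl_sale AS b ON b.item_id = a.id
WHERE b.item_id IS NULL;
```

One behavioral difference is worth knowing: if tbl_sale.item_id can contain NULL, the NOT IN version returns no rows at all, while this form still returns the unmatched items.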
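【Discussion】:
【Answer 2】: The subselect in QUERY 1 uses implicit inner joins, whereas QUERY 2 uses explicit joins that are all LEFT joins. As a result, QUERY 2 has no WHERE clause that excludes data. I would take LEFT out on a couple of lines (as marked) and see how that improves things: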
SELECT b.id, b.sv, b.description, c.store_id, f.name, e.price
FROM tbl_watch AS a
LEFT JOIN tbl_item AS b ON a.item_id = b.id
LEFT JOIN tbl_storewatch AS c ON c.user_id = '$user_id'
-- Left removed below
JOIN tbl_storeflyer AS d ON d.store_id = c.store_id
AND d.date_beg = '$s_date'
-- Left removed below
JOIN tbl_sale AS e ON e.item_id = a.item_id
AND e.flyer_id = d.flyer_id
LEFT JOIN tbl_store as f ON d.store_id = f.id
WHERE a.user_id = '$user_id'
ORDER BY b.description ASC
You could also consider taking the AND clauses out of the joins and moving them into the WHERE:
SELECT b.id, b.sv, b.description, c.store_id, f.name, e.price
FROM tbl_watch AS a
LEFT JOIN tbl_item AS b ON a.item_id = b.id
LEFT JOIN tbl_storewatch AS c ON c.user_id = '$user_id'
JOIN tbl_storeflyer AS d ON d.store_id = c.store_id
JOIN tbl_sale AS e ON e.item_id = a.item_id
LEFT JOIN tbl_store as f ON d.store_id = f.id
WHERE a.user_id = '$user_id'
AND d.date_beg = '$s_date'
AND e.flyer_id = d.flyer_id
ORDER BY b.description ASC
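For the inner joins above, a condition gives the same rows whether it sits in ON or in WHERE. For a LEFT JOIN it does not, which matters for the null-price rows the question wants to keep. A small sketch against tbl_item and tbl_sale (the flyer_id value 1 is made up for illustration):

```sql
-- Condition in ON: items with no sale in flyer 1 are kept,
-- with NULL in the price column.
SELECT i.id, s.price
FROM tbl_item AS i
LEFT JOIN tbl_sale AS s
       ON s.item_id = i.id
      AND s.flyer_id = 1;

-- Same condition in WHERE: for unmatched items s.flyer_id is NULL,
-- so the filter removes them and the LEFT JOIN acts like an inner join.
SELECT i.id, s.price
FROM tbl_item AS i
LEFT JOIN tbl_sale AS s
       ON s.item_id = i.id
WHERE s.flyer_id = 1;
```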
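Finally, date math is fairly intensive. In QUERY 2 the outer joins avoid much of it, but you may need it. I would try using a subquery to get the IDs and restrict by those: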
SELECT b.id, b.sv, b.description, c.store_id, f.name, e.price
FROM tbl_watch AS a
LEFT JOIN tbl_item AS b ON a.item_id = b.id
LEFT JOIN tbl_storewatch AS c ON c.user_id = '$user_id'
JOIN tbl_storeflyer AS d ON d.store_id = c.store_id
JOIN tbl_sale AS e ON e.item_id = a.item_id
LEFT JOIN tbl_store as f ON d.store_id = f.id
WHERE a.user_id = '$user_id'
AND e.flyer_id = d.flyer_id
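-- The subquery has to name the table itself (the outer alias d is not a table there);
-- flyer_id is assumed to be the column that identifies a flyer.
AND d.flyer_id IN (SELECT flyer_id FROM tbl_storeflyer WHERE date_beg = '$s_date')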
ORDER BY b.description ASC
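【Discussion】:
Thanks for the reply! These solutions do give rows for all items that have an active sale (e.item_id = a.item_id AND e.flyer_id = d.flyer_id), but I am also trying to include every item in tbl_watch (a) along with its (b) fields, even when it does not exist in tbl_sale (e). So I would end up with: id, sv, description, NULL, NULL, NULL. I only want 1 row with NULLs per item. I'm not sure how to do that.
To clarify, I want each item to fall into one of 3 cases: 1 - Item with a single price. 2 - Item with multiple prices. 3 - Item with no price. In case 3 I still want a row returned containing the item's id, sv, and description.
Not sure date_beg is an actual date field here; it looks like it is being used as some sort of char. I'm not sure whether date matching is slow in MySQL, but I would be surprised. Dates are usually stored internally as a long, and the only cost is converting the string to a long, so I don't believe it adds any overhead.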
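One possible way to cover all three cases from the clarification (single price, multiple prices, no price) is a nested join: the sale, flyer, and store-watch tables are inner-joined to each other first, and that group as a whole is then left-joined to the watch list. This is an untested sketch against the thread's tables, not something proposed in the thread. Whether MySQL picks the same indexes as in QUERY 2 would need checking with EXPLAIN.

```sql
SELECT b.id, b.sv, b.description, c.store_id, f.name, e.price
FROM tbl_watch AS a
LEFT JOIN tbl_item AS b ON b.id = a.item_id
-- The parenthesized group is resolved as a unit: a sale row only counts if it
-- belongs to a flyer starting on the given date at a store the user watches.
LEFT JOIN (
    tbl_sale AS e
    INNER JOIN tbl_storeflyer AS d
            ON d.flyer_id = e.flyer_id
           AND d.date_beg = '$s_date'
    INNER JOIN tbl_storewatch AS c
            ON c.store_id = d.store_id
           AND c.user_id = '$user_id'
) ON e.item_id = a.item_id
LEFT JOIN tbl_store AS f ON f.id = c.store_id
WHERE a.user_id = '$user_id'
ORDER BY b.description ASC;
```

A watched item with no qualifying sale matches nothing in the group, so it comes back once with NULL for store_id, name, and price. An item with one or more qualifying sales comes back once per sale, without extra NULL rows.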