MySql query taking a lot longer than it should
【Posted】: 2018-04-23 13:03:53
【Question】: For some reason, this query takes up to 5 minutes to execute. I have increased the join buffer to 1G and run EXPLAIN on the query (results here), and nothing seems to indicate why it takes this long. While the query runs, all 8 CPU cores sit at nearly 100% usage.
The engine is InnoDB.
All tables have a primary key index.
SELECT Concat(Concat(cust.first_name, ' '), cust.last_name) AS customerName,
TYPE.code AS transType,
ty1.nsfamount,
np.sumrebateamount,
trans.note_id AS note_id,
trans.createdate AS createdatestr,
n.totalamount,
n.currentfloat,
( ( n.costofborrowing * 100 ) / n.amounttolent ) AS fees,
n.amounttolent,
( 0 - ( trans.cashamount + trans.chequeamount
+ trans.debitamount
+ trans.preauthorizedamount ) ) AS paidamount,
sumpenaltyamount
FROM (SELECT *
FROM loan_transaction trans1
WHERE trans1.cashamount < 0
OR trans1.chequeamount < 0
OR trans1.debitamount < 0
OR trans1.preauthorizedamount < 0) trans
inner join customer cust
ON trans.customer_id = cust.customer_id
inner join (SELECT *
FROM lookuptransactiontypes ty
WHERE ty.code <> 'REB'
AND ty.code <> 'PN') TYPE
ON trans.transactiontype = TYPE.transactiontypesid
inner join note n
ON trans.note_id = n.note_id
inner join (SELECT note_id,
SUM(rebateamount) AS sumrebateamount
FROM note_payment np1
GROUP BY np1.note_id) np
ON trans.note_id = np.note_id
left join (SELECT note_id,
transactiontype,
( SUM(chequeamount) + SUM(cashamount)
+ SUM(debitamount) + SUM(preauthorizedamount) )AS
NSFamount
FROM (SELECT *
FROM loan_transaction trans4
WHERE trans4.cashamount > 0
OR trans4.chequeamount > 0
OR trans4.debitamount > 0
OR trans4.preauthorizedamount > 0)trans5
inner join (SELECT transactiontypesid
FROM lookuptransactiontypes ty2
WHERE ty2.code = 'NSF')type2
ON
trans5.transactiontype = type2.transactiontypesid
GROUP BY trans5.note_id) ty1
ON ty1.note_id = trans.refnum
left join (SELECT note_id AS noteid,
( SUM(tp.cashamount) + SUM(tp.chequeamount)
+ SUM(tp.debitamount)
+ SUM(tp.preauthorizedamount) ) AS sumpenaltyamount
FROM loan_transaction tp
inner join (SELECT transactiontypesid
FROM lookuptransactiontypes lp
WHERE lp.code = 'PN') lp
ON lp.transactiontypesid = tp.transactiontype
GROUP BY tp.note_id) p
ON p.noteid = trans.refnum
Latest EXPLAIN output:
id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra
1 | PRIMARY | <derived3> | ALL | | | | | 2241 |
1 | PRIMARY | <derived4> | ALL | | | | | 191441 | Using join buffer
1 | PRIMARY | n | eq_ref | PK_NOTE | PK_NOTE | 8 | np.note_id | 1 |
1 | PRIMARY | <derived2> | ALL | | | | | 274992 | Using where; Using join buffer
1 | PRIMARY | cust | eq_ref | PRIMARY_97 | PRIMARY_97 | 8 | trans.CUSTOMER_ID | 1 |
1 | PRIMARY | <derived5> | ALL | | | | | 2803 |
1 | PRIMARY | <derived8> | ALL | | | | | 14755 |
8 | DERIVED | <derived9> | ALL | | | | | 2 | Using temporary; Using filesort
8 | DERIVED | tp | ref | TRANSACTIONTYPE | TRANSACTIONTYPE | 9 | lp.transactionTypesID | 110 | Using where
9 | DERIVED | lp | ALL | | | | | 2206 | Using where
5 | DERIVED | <derived7> | ALL | | | | | 98 | Using temporary; Using filesort
5 | DERIVED | <derived6> | ALL | | | | | 314705 | Using where; Using join buffer
7 | DERIVED | ty2 | ALL | | | | | 2206 | Using where
6 | DERIVED | trans4 | ALL | | | | | 664587 | Using where
4 | DERIVED | np1 | index | | note_payment_idx_id_rebateamount | 16 | | 193366 | Using index
3 | DERIVED | ty | ALL | | | | | 2206 | Using where
2 | DERIVED | trans1 | ALL | | | | | 664587 | Using where
【Comments】:
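"Nothing seems to indicate why it takes this long" -> "Using temporary; Using filesort" on 207662 rows of table np1 is an indicator of one of the worst performance problems in MySQL. It relates to this query: SELECT note_id, SUM(rebateamount) AS sumrebateamount FROM note_payment np1 GROUP BY np1.note_id
It needs `index(note_id, rebateamount)`. In other words, you need to double-check which indexes each derived table requires.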
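For reference, the composite index suggested in the comment above could be created roughly as follows. This is a sketch rather than a tested migration: the index name is the one that appears in the EXPLAIN output in the question, and the column order (note_id first, then rebateamount) is what should let MySQL resolve both the GROUP BY and the SUM from the index alone.

```sql
-- Composite index covering the GROUP BY column and the summed column,
-- so the derived table np can be built from the index without touching rows.
ALTER TABLE note_payment
    ADD INDEX note_payment_idx_id_rebateamount (note_id, rebateamount);

-- Re-check the plan for the derived table on its own. With the index in
-- place, Extra is expected to show "Using index" rather than
-- "Using temporary; Using filesort".
EXPLAIN
SELECT note_id,
       SUM(rebateamount) AS sumrebateamount
FROM note_payment
GROUP BY note_id;
```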
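I just added the index, and EXPLAIN no longer shows a filesort over a large number of rows (EXPLAIN data: pastebin.com/q12uHKUE). Sadly, the query still takes a long time.
I've added the latest EXPLAIN to the question... Yes, without knowing the table structures and indexes it is hard to improve this any further...
Update: removing the GROUP BY speeds up the query by several orders of magnitude.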
What version of MySQL? Newer versions do a better job on JOIN ( SELECT ... ); you need that optimization!
【Answer 1】:
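To be honest, this query has a lot of problems. You can easily simplify it by following these rules:
You can concatenate several columns at once (e.g. CONCAT(column1, ' ', column2)).
There is no need for a subquery that is inner-joined on the same table (or used in the first FROM). Put the subquery's main table directly in your FROM and move the subquery's filters into the WHERE of the main query.
I am not sure about this, but all of your logic seems to be per note_id. If that is the case, move the GROUP BY note_id into the main query's GROUP BY: remove every subquery that does per-note_id processing, simply join the tables you need, and move their SUM() and other column selections into the main query's SELECT.
When you want two values based on the same table but with different filters, you do not need a subquery. For SUM(), say, you can write something like: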
SUM(IF(COLUMN1 = YOUR_FILTER1 OR COLUMN1 = YOUR_FILTER2, COLUMN1, 0)) as totalWithFILTER1andFILTER2 [...] GROUP BY note_id
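Last but not least, you (inner) join on a table filtered on TYPE.code being different from 'REB' and 'PN', but then you (left) join on a result set filtered on TYPE.code = 'PN'. That makes little sense; the LEFT JOIN will always yield NULL.
FYI, since what I said may seem vague, I started simplifying your query, but I stopped (without refactoring the 2 LEFT JOINs) because I do not know what you are trying to achieve. Here is the query (untested, though):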
SELECT
CONCAT(cust.first_name, ' ', cust.last_name) AS customerName,
TYPE.code AS transType,
ty1.nsfamount,
SUM(np.rebateamount) as sumrebateamount,
trans.note_id AS note_id,
trans.createdate AS createdatestr,
n.totalamount,
n.currentfloat,
((n.costofborrowing * 100) / n.amounttolent) AS fees,
n.amounttolent,
(0 - (trans.cashamount + trans.chequeamount
+ trans.debitamount
+ trans.preauthorizedamount)) AS paidamount,
sumpenaltyamount
FROM loan_transaction trans
INNER JOIN customer cust ON trans.customer_id = cust.customer_id
INNER JOIN lookuptransactiontypes TYPE ON trans.transactiontype = TYPE.transactiontypesid
INNER JOIN note n ON trans.note_id = n.note_id
INNER JOIN note_payment np ON trans.note_id = np.note_id
LEFT JOIN (SELECT
note_id,
transactiontype,
(SUM(chequeamount) + SUM(cashamount)
+ SUM(debitamount) + SUM(preauthorizedamount)) AS
NSFamount
FROM loan_transaction trans4
INNER JOIN lookuptransactiontypes type2 ON trans4.transactiontype = type2.transactiontypesid
WHERE (trans4.cashamount > 0
OR trans4.chequeamount > 0
OR trans4.debitamount > 0
OR trans4.preauthorizedamount > 0) AND type2.code = 'NSF'
GROUP BY trans4.note_id) ty1
ON ty1.note_id = trans.refnum
LEFT JOIN (SELECT
note_id AS noteid,
(SUM(tp.cashamount) + SUM(tp.chequeamount)
+ SUM(tp.debitamount)
+ SUM(tp.preauthorizedamount)) AS sumpenaltyamount
FROM loan_transaction tp
INNER JOIN (SELECT transactiontypesid
FROM lookuptransactiontypes lp
WHERE lp.code = 'PN') lp
ON lp.transactiontypesid = tp.transactiontype
GROUP BY tp.note_id) p
ON p.noteid = trans.refnum
WHERE
(trans.cashamount < 0
OR trans.chequeamount < 0
OR trans.debitamount < 0
OR trans.preauthorizedamount < 0)
AND TYPE.code <> 'REB'
AND TYPE.code <> 'PN'
GROUP BY trans.note_id;
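The conditional-aggregation rule from the list above could also be applied to the two LEFT JOIN subqueries, since both aggregate loan_transaction per note_id and differ only in the transaction type code. The following is a minimal, untested sketch. It assumes the four amount columns are NOT NULL (otherwise summing the row total is not equivalent to adding the four separate SUMs), and the alias lt is introduced here for illustration only.

```sql
-- One pass over loan_transaction instead of two derived tables.
SELECT lt.note_id,
       -- NSF total: only rows with at least one positive amount,
       -- mirroring the filter of the original ty1 subquery.
       SUM(IF(ty.code = 'NSF'
              AND (lt.cashamount > 0 OR lt.chequeamount > 0
                   OR lt.debitamount > 0 OR lt.preauthorizedamount > 0),
              lt.cashamount + lt.chequeamount
              + lt.debitamount + lt.preauthorizedamount,
              0)) AS nsfamount,
       -- Penalty total: every 'PN' row, as in the original p subquery.
       SUM(IF(ty.code = 'PN',
              lt.cashamount + lt.chequeamount
              + lt.debitamount + lt.preauthorizedamount,
              0)) AS sumpenaltyamount
FROM loan_transaction lt
INNER JOIN lookuptransactiontypes ty
        ON ty.transactiontypesid = lt.transactiontype
WHERE ty.code IN ('NSF', 'PN')
GROUP BY lt.note_id;
```

The result could then be left-joined once on note_id = trans.refnum in place of ty1 and p. One behavioural difference to keep in mind: a note that has rows of only one of the two types gets 0 for the other column here, where the two separate LEFT JOINs would have produced NULL.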
【Comments】:
【Answer 2】: I agree with @Aurelien's answer: why join a derived table when you can join the plain table and apply a filter? Why do this
-- this will force a full scan on customer table and ignores the filter
select whatever
from transactions inner join
(
select * from customer
) customer on transactions.customer_id = customer.customer_id
where customer.customer_id = 1;
when you can do this
select whatever
from transactions inner join customer on transactions.customer_id = customer.customer_id
where customer.customer_id = 1;
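In addition to @Aurelien's answer:
IMHO, the problem with your query is that you want the data for all customers, so however much you optimize it, you are still doing a full scan and it will not scale. Imagine having 100 million transactions a few years from now.
This might not be what you are looking for, but how about partitioning or paginating a report like this? Users usually do not need to see all customers at once, and you do not need to waste the resources.
The plan is to do the same work, but for only 50 customers at a time.
After you get rid of the unnecessary subqueries, as shown in @Aurelien's answer, change this part of his query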
FROM loan_transaction trans
INNER JOIN customer cust ON trans.customer_id = cust.customer_id
into this
FROM (SELECT * FROM customer LIMIT 50 OFFSET 0) cust
INNER JOIN loan_transaction trans ON trans.customer_id = cust.customer_id
Note that pagination with OFFSET does not scale, so if your customer table is large you may want to consider another type of pagination.
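One such alternative is keyset (seek) pagination: instead of skipping rows with OFFSET, remember the last customer_id of the previous page and start right after it. The following is a minimal sketch, not necessarily the approach the answer had in mind. It assumes customer_id is the primary key of customer with positive values; @last_customer_id is a hypothetical user variable, and the two selected columns are placeholders for the real select list.

```sql
-- Last customer_id already shown; 0 requests the first page.
SET @last_customer_id = 0;

SELECT cust.customer_id,
       trans.note_id
FROM (SELECT *
      FROM customer
      WHERE customer_id > @last_customer_id  -- seek past the previous page
      ORDER BY customer_id                   -- deterministic page boundaries
      LIMIT 50) cust
INNER JOIN loan_transaction trans ON trans.customer_id = cust.customer_id;
```

For the next page, set @last_customer_id to the largest customer_id returned. Each page is then a primary key range scan of at most 50 rows, regardless of how deep into the table it is.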
【Comments】: