Why does OR in subquery make query so much slower?
【Posted】: 2021-10-02 10:59:45
【Question】: I am using MySQL and have the following query that I am trying to improve:
SELECT
    *
FROM
    overpayments AS op
    JOIN payment_allocations AS overpayment_pa ON overpayment_pa.allocatable_id = op.id
        AND overpayment_pa.allocatable_type = 'Overpayment'
    JOIN (
        SELECT
            pa.payment_source_type,
            pa.payment_source_id,
            ft.conversion_rate
        FROM
            payment_allocations AS pa
            LEFT JOIN line_items AS li ON pa.payment_source_id = li.id
            LEFT JOIN credit_notes AS cn ON li.parent_document_id = cn.id
            LEFT JOIN financial_transactions AS ft ON (
                ft.commercial_document_id = pa.payment_source_id
                AND ft.commercial_document_type = pa.payment_source_type
            )
            OR (
                ft.commercial_document_id = cn.id
                AND ft.commercial_document_type = 'CreditNote'
            )
        WHERE
            pa.allocatable_type = 'Overpayment'
            AND pa.company_id = 14792
            AND ft.company_id = 14792
    ) AS op_bank_transaction_ft ON op_bank_transaction_ft.payment_source_id = overpayment_pa.payment_source_id
        AND op_bank_transaction_ft.payment_source_type = overpayment_pa.payment_source_type;
It takes 10 seconds to run. By removing the OR from the subquery and using COALESCE to get the result, I was able to bring it down to 0.047 seconds:
SELECT
    *
FROM
    overpayments AS op
    JOIN payment_allocations AS overpayment_pa ON overpayment_pa.allocatable_id = op.id
        AND overpayment_pa.allocatable_type = 'Overpayment'
    JOIN (
        SELECT
            pa.payment_source_type,
            pa.payment_source_id,
            coalesce(ft_one.conversion_rate, ft_two.conversion_rate)
        FROM
            payment_allocations AS pa
            LEFT JOIN line_items AS li ON pa.payment_source_id = li.id
            LEFT JOIN credit_notes AS cn ON li.parent_document_id = cn.id
            LEFT JOIN financial_transactions AS ft_one ON (
                ft_one.commercial_document_id = pa.payment_source_id
                AND ft_one.commercial_document_type = pa.payment_source_type
                AND ft_one.company_id = 14792
            )
            LEFT JOIN financial_transactions AS ft_two ON (
                ft_two.commercial_document_id = cn.id
                AND ft_two.commercial_document_type = 'CreditNote'
                AND ft_two.company_id = 14792
            )
        WHERE
            pa.allocatable_type = 'Overpayment'
            AND pa.company_id = 14792
    ) AS op_bank_transaction_ft ON op_bank_transaction_ft.payment_source_id = overpayment_pa.payment_source_id
        AND op_bank_transaction_ft.payment_source_type = overpayment_pa.payment_source_type;
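However, I don't really understand why this is the case. The original subquery ran quickly on its own and returned only 2 rows, so why does it slow the whole query down? EXPLAIN for the first query returns the following: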
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | SIMPLE | pa | | ref | index_payment_allocations_on_payment_source_id, index_payment_allocations_on_company_id | index_payment_allocations_on_company_id | 5 | const | 191 | 10.00 | Using where |
| 1 | SIMPLE | overpayment_pa | | ref | index_payment_allocations_on_payment_source_id, index_payment_allocations_on_allocatable_id | index_payment_allocations_on_payment_source_id | 5 | rails.pa.payment_source_id | 1 | 3.42 | Using where |
| 1 | SIMPLE | op | | eq_ref | PRIMARY | PRIMARY | 4 | rails.overpayment_pa.allocatable_id | 1 | 100.00 | |
| 1 | SIMPLE | li | | eq_ref | PRIMARY | PRIMARY | 4 | rails.pa.payment_source_id | 1 | 100.00 | |
| 1 | SIMPLE | cn | | eq_ref | PRIMARY | PRIMARY | 8 | rails.li.parent_document_id | 1 | 100.00 | Using where; Using index |
| 1 | SIMPLE | ft | | ALL | transactions_unique_by_commercial_doc | | | | 12587878 | 0.00 | Range checked for each record (index map: 0x2) |
For the second query I get the following:
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | SIMPLE | pa | | ref | index_payment_allocations_on_payment_source_id, index_payment_allocations_on_company_id | index_payment_allocations_on_company_id | 5 | const | 191 | 10.00 | Using where |
| 1 | SIMPLE | overpayment_pa | | ref | index_payment_allocations_on_payment_source_id, index_payment_allocations_on_allocatable_id | index_payment_allocations_on_payment_source_id | 5 | rails.pa.payment_source_id | 1 | 3.42 | Using where |
| 1 | SIMPLE | op | | eq_ref | PRIMARY | PRIMARY | 4 | rails.overpayment_pa.allocatable_id | 1 | 100.00 | |
| 1 | SIMPLE | ft_one | | ref | transactions_unique_by_commercial_doc, index_financial_transactions_on_company_id | transactions_unique_by_commercial_doc | 773 | rails.pa.payment_source_id, rails.pa.payment_source_type | 1 | 100.00 | Using where |
| 1 | SIMPLE | li | | eq_ref | PRIMARY | PRIMARY | 4 | rails.pa.payment_source_id | 1 | 100.00 | |
| 1 | SIMPLE | cn | | eq_ref | PRIMARY | PRIMARY | 8 | rails.li.parent_document_id | 1 | 100.00 | Using where; Using index |
| 1 | SIMPLE | ft_two | | ref | transactions_unique_by_commercial_doc, index_financial_transactions_on_company_id | transactions_unique_by_commercial_doc | 773 | rails.cn.id, const | 1 | 100.00 | Using where |
But I don't really know how to interpret these results.
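【Answer 1】: Look at the right-hand side of the last row of your first EXPLAIN. The join to `ft` has type `ALL` and no chosen key, so it does not use an index for that step and has to scan an estimated 12,587,878 rows. That is slow. Your second query uses an index at every step (`ft_one` and `ft_two` are both `ref` lookups with an estimate of 1 row each), so it is much faster.
If your second query produces the correct results, use it and don't look back. Congratulations! You have optimized your query.
OR operations, especially in ON clauses, are harder than usual for the query planner to satisfy, because they often mean it must take the union of two separate subqueries. It looks like the planner chose to brute-force it in your case. (Brute force === scanning many rows.)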
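The "union of two separate subqueries" idea can be written out by hand. What follows is only a sketch of the derived table from the question, not something the answer itself proposes: each branch of the OR becomes its own SELECT so that each join to `financial_transactions` is a plain equality lookup. It assumes inner joins are acceptable, which appears to hold here because the original `WHERE ft.company_id = 14792` already discards rows that have no matching `ft` row.

```sql
-- Branch 1: the transaction belongs directly to the payment source.
SELECT
    pa.payment_source_type,
    pa.payment_source_id,
    ft.conversion_rate
FROM payment_allocations AS pa
JOIN financial_transactions AS ft
    ON  ft.commercial_document_id   = pa.payment_source_id
    AND ft.commercial_document_type = pa.payment_source_type
WHERE pa.allocatable_type = 'Overpayment'
  AND pa.company_id = 14792
  AND ft.company_id = 14792

UNION  -- removes duplicate rows; UNION ALL would keep them

-- Branch 2: the transaction belongs to the credit note behind the line item.
SELECT
    pa.payment_source_type,
    pa.payment_source_id,
    ft.conversion_rate
FROM payment_allocations AS pa
JOIN line_items   AS li ON li.id = pa.payment_source_id
JOIN credit_notes AS cn ON cn.id = li.parent_document_id
JOIN financial_transactions AS ft
    ON  ft.commercial_document_id   = cn.id
    AND ft.commercial_document_type = 'CreditNote'
WHERE pa.allocatable_type = 'Overpayment'
  AND pa.company_id = 14792
  AND ft.company_id = 14792;
```

Duplicate handling is the part to check against real data: `UNION` collapses identical output rows, which the original OR join would have kept, while `UNION ALL` returns a transaction twice if it happens to match both branches.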
Without knowing your indexes, it is hard to help you further.
Read this to learn more: https://use-the-index-luke.com
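Since the answer points out that the indexes are unknown, one way to surface them is with MySQL's standard introspection statements. This is a small sketch using the table names from the question; its output could be added to the question for anyone trying to help further.

```sql
-- List the indexes defined on the tables involved in the slow join.
SHOW INDEX FROM financial_transactions;
SHOW INDEX FROM payment_allocations;

-- Full table definition, including column types and all keys.
SHOW CREATE TABLE financial_transactions;
```

The column order inside `transactions_unique_by_commercial_doc` is the detail to look for, since it determines which equality conditions that index can serve.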
【Answer 2】: These may speed up the second formulation further:
overpayment_pa:
INDEX(payment_source_id, payment_source_type, allocatable_type, allocatable_id)
pa: INDEX(allocatable_type, company_id, payment_source_id, payment_source_type)
financial_transactions:
INDEX(commercial_document_id, commercial_document_type, company_id, conversion_rate)
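The suggested indexes could be created roughly as follows. This is a sketch under two assumptions: `overpayment_pa` and `pa` are both aliases of `payment_allocations` (as in the question's query), so both of those indexes go on that one table, and the index names (`idx_...`) are invented for illustration.

```sql
-- Both alias-level suggestions apply to the same underlying table.
ALTER TABLE payment_allocations
    ADD INDEX idx_pa_source_alloc
        (payment_source_id, payment_source_type, allocatable_type, allocatable_id),
    ADD INDEX idx_pa_alloc_company
        (allocatable_type, company_id, payment_source_id, payment_source_type);

-- Covers the lookup columns plus conversion_rate, so the join can
-- potentially be answered from the index alone.
ALTER TABLE financial_transactions
    ADD INDEX idx_ft_doc_company_rate
        (commercial_document_id, commercial_document_type, company_id, conversion_rate);
```

With roughly 12.6 million rows estimated in `financial_transactions`, building the last index may take a while, so it is worth trying on a copy of the data first and re-running EXPLAIN to confirm the planner actually picks it.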