Postgres Table Slow Performance

Posted: 2018-05-01 12:51:04

Question:

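We have a Product table in a Postgres DB hosted on Heroku. The plan has 8 GB RAM, 250 GB disk space, and 1000 IOPS allowed. We have proper indexes on the columns.

Platform

PostgreSQL 9.5.12 on x86_64-pc-linux-gnu (Ubuntu 9.5.12-1.pgdg14.04+1), compiled by gcc (Ubuntu 4.8.4-2ubuntu1~14.04.4) 4.8.4, 64-bit

We are running a keyword search query on this table, which holds 2.8 million records. The search query is too slow: it takes about 50 seconds to return results.

Query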

SELECT
    P .sfid AS prodsfid,
    P .image_url__c image,
    P .productcode sku,
    P .Short_Description__c shortDesc,
    P . NAME pname,
    P .category__c,
    P .price__c price,
    P .description,
    P .vendor_name__c vname,
    P .vendor__c supSfid
FROM
    staging.product2 P
JOIN (
    SELECT
        p1.sfid
    FROM
        staging.product2 p1
    WHERE
        p1. NAME ILIKE '%s%'
    OR p1.productcode ILIKE '%s%'
) AS TEMP ON (P .sfid = TEMP .sfid)
WHERE
    P .status__c = 'Available'
AND LOWER (
    P .vendor_shipping_country__c
) = ANY (

    VALUES
        ('us'),
        ('usa'),
        ('united states'),
        ('united states of america')
)
AND P .vendor_catalog_tier__c = ANY (

    VALUES
        ('a1c37000000oljnAAA'),
        ('a1c37000000oljQAAQ'),
        ('a1c37000000oljQAAQ'),
        ('a1c37000000pT7IAAU'),
        ('a1c37000000omDjAAI'),
        ('a1c37000000oljMAAQ'),
        ('a1c37000000oljaAAA'),
        ('a1c37000000pT7SAAU'),
        ('a1c0R000000AFcVQAW'),
        ('a1c0R000000A1HAQA0'),
        ('a1c0R0000000OpWQAU'),
        ('a1c0R0000005TZMQA2'),
        ('a1c37000000oljdAAA'),
        ('a1c37000000ooTqAAI'),
        ('a1c37000000omLBAAY'),
        ('a1c0R0000005N8GQAU')
)

Here is the EXPLAIN plan:

Nested Loop  (cost=31.85..33886.54 rows=3681 width=750)
  ->  Hash Join  (cost=31.77..31433.07 rows=4415 width=750)
        Hash Cond: (lower((p.vendor_shipping_country__c)::text) = "*VALUES*".column1)
        ->  Nested Loop  (cost=31.73..31423.67 rows=8830 width=761)
              ->  HashAggregate  (cost=0.06..0.11 rows=16 width=32)
                    Group Key: "*VALUES*_1".column1
                    ->  Values Scan on "*VALUES*_1"  (cost=0.00..0.06 rows=16 width=32)
              ->  Bitmap Heap Scan on product2 p  (cost=31.66..1962.32 rows=552 width=780)
                    Recheck Cond: ((vendor_catalog_tier__c)::text = "*VALUES*_1".column1)
                    Filter: ((status__c)::text = 'Available'::text)
                    ->  Bitmap Index Scan on vendor_catalog_tier_prd_idx  (cost=0.00..31.64 rows=1016 width=0)
                          Index Cond: ((vendor_catalog_tier__c)::text = "*VALUES*_1".column1)
        ->  Hash  (cost=0.03..0.03 rows=4 width=32)
              ->  Unique  (cost=0.02..0.03 rows=4 width=32)
                    ->  Sort  (cost=0.02..0.02 rows=4 width=32)
                          Sort Key: "*VALUES*".column1
                          ->  Values Scan on "*VALUES*"  (cost=0.00..0.01 rows=4 width=32)
  ->  Index Scan using sfid_prd_idx on product2 p1  (cost=0.09..0.55 rows=1 width=19)
        Index Cond: ((sfid)::text = (p.sfid)::text)
        Filter: (((name)::text ~~* '%s%'::text) OR ((productcode)::text ~~* '%s%'::text))

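It returns about 140,576 records. By the way, we only need the first 5,000 records. Would adding a LIMIT help here?

Please let me know how to make it faster and what is causing the slowness.

EXPLAIN ANALYZE

@RaymondNijland Here is the EXPLAIN ANALYZE output: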

Nested Loop  (cost=31.83..33427.28 rows=4039 width=750) (actual time=1.903..4384.221 rows=140576 loops=1)
  ->  Hash Join  (cost=31.74..30971.32 rows=4369 width=750) (actual time=1.852..1094.964 rows=164353 loops=1)
        Hash Cond: (lower((p.vendor_shipping_country__c)::text) = "*VALUES*".column1)
        ->  Nested Loop  (cost=31.70..30962.02 rows=8738 width=761) (actual time=1.800..911.738 rows=164353 loops=1)
              ->  HashAggregate  (cost=0.06..0.11 rows=16 width=32) (actual time=0.012..0.019 rows=15 loops=1)
                    Group Key: "*VALUES*_1".column1
                    ->  Values Scan on "*VALUES*_1"  (cost=0.00..0.06 rows=16 width=32) (actual time=0.004..0.005 rows=16 loops=1)
              ->  Bitmap Heap Scan on product2 p  (cost=31.64..1933.48 rows=546 width=780) (actual time=26.004..57.290 rows=10957 loops=15)
                    Recheck Cond: ((vendor_catalog_tier__c)::text = "*VALUES*_1".column1)
                    Filter: ((status__c)::text = 'Available'::text)
                    Rows Removed by Filter: 645
                    Heap Blocks: exact=88436
                    ->  Bitmap Index Scan on vendor_catalog_tier_prd_idx  (cost=0.00..31.61 rows=1000 width=0) (actual time=24.811..24.811 rows=11601 loops=15)
                          Index Cond: ((vendor_catalog_tier__c)::text = "*VALUES*_1".column1)
        ->  Hash  (cost=0.03..0.03 rows=4 width=32) (actual time=0.032..0.032 rows=4 loops=1)
              Buckets: 1024  Batches: 1  Memory Usage: 9kB
              ->  Unique  (cost=0.02..0.03 rows=4 width=32) (actual time=0.026..0.027 rows=4 loops=1)
                    ->  Sort  (cost=0.02..0.02 rows=4 width=32) (actual time=0.026..0.026 rows=4 loops=1)
                          Sort Key: "*VALUES*".column1
                          Sort Method: quicksort  Memory: 25kB
                          ->  Values Scan on "*VALUES*"  (cost=0.00..0.01 rows=4 width=32) (actual time=0.001..0.002 rows=4 loops=1)
  ->  Index Scan using sfid_prd_idx on product2 p1  (cost=0.09..0.56 rows=1 width=19) (actual time=0.019..0.020 rows=1 loops=164353)
        Index Cond: ((sfid)::text = (p.sfid)::text)
        Filter: (((name)::text ~~* '%s%'::text) OR ((productcode)::text ~~* '%s%'::text))
        Rows Removed by Filter: 0
Planning time: 2.488 ms
Execution time: 4391.378 ms

Here is another version of the query, using ORDER BY, but it also seems slow (140 seconds):

SELECT
P .sfid AS prodsfid,
P .image_url__c image,
P .productcode sku,
P .Short_Description__c shortDesc,
P . NAME pname,
P .category__c,
P .price__c price,
P .description,
P .vendor_name__c vname,
P .vendor__c supSfid
FROM
staging.product2 P
WHERE
P .status__c = 'Available'
AND P .vendor_shipping_country__c IN (
'us',
'usa',
'united states',
'united states of america'
)
AND P .vendor_catalog_tier__c IN (
'a1c37000000omDQAAY',
'a1c37000000omDTAAY',
'a1c37000000omDXAAY',
'a1c37000000omDYAAY',
'a1c37000000omDZAAY',
'a1c37000000omDdAAI',
'a1c37000000omDfAAI',
'a1c37000000omDiAAI',
'a1c37000000oml6AAA',
'a1c37000000oljPAAQ',
'a1c37000000oljRAAQ',
'a1c37000000oljWAAQ',
'a1c37000000oljXAAQ',
'a1c37000000oljZAAQ',
'a1c37000000oljcAAA',
'a1c37000000oljdAAA',
'a1c37000000oljlAAA',
'a1c37000000oljoAAA',
'a1c37000000oljqAAA',
'a1c37000000olnvAAA',
'a1c37000000olnwAAA',
'a1c37000000olnxAAA',
'a1c37000000olnyAAA',
'a1c37000000olo0AAA',
'a1c37000000olo1AAA',
'a1c37000000olo4AAA',
'a1c37000000olo8AAA',
'a1c37000000olo9AAA',
'a1c37000000oloCAAQ',
'a1c37000000oloFAAQ',
'a1c37000000oloIAAQ',
'a1c37000000oloJAAQ',
'a1c37000000oloMAAQ',
'a1c37000000oloNAAQ',
'a1c37000000oloSAAQ',
'a1c37000000olodAAA',
'a1c37000000oloeAAA',
'a1c37000000olzCAAQ',
'a1c37000000om0xAAA',
'a1c37000000ooV1AAI',
'a1c37000000oog8AAA',
'a1c37000000oogDAAQ',
'a1c37000000oonzAAA',
'a1c37000000oluuAAA',
'a1c37000000pT7SAAU',
'a1c37000000oljnAAA',
'a1c37000000olumAAA',
'a1c37000000oljpAAA',
'a1c37000000pUm2AAE',
'a1c37000000olo3AAA',
'a1c37000000oo1MAAQ',
'a1c37000000oo1vAAA',
'a1c37000000pWxgAAE',
'a1c37000000pYJkAAM',
'a1c37000000omDjAAI',
'a1c37000000ooTgAAI',
'a1c37000000op2GAAQ',
'a1c37000000one0AAA',
'a1c37000000oljYAAQ',
'a1c37000000pUlxAAE',
'a1c37000000oo9SAAQ',
'a1c37000000pcIYAAY',
'a1c37000000pamtAAA',
'a1c37000000pd2QAAQ',
'a1c37000000pdCOAAY',
'a1c37000000OpPaAAK',
'a1c37000000OphZAAS',
'a1c37000000olNkAAI'

)
ORDER BY p.productcode asc
LIMIT 5000

Here is the EXPLAIN ANALYZE output for it:

Limit (cost=0.09..45271.54 rows=5000 width=750) (actual time=48593.355..86376.864 rows=5000 loops=1)
-> Index Scan using productcode_prd_idx on product2 p (cost=0.09..743031.39 rows=82064 width=750) (actual time=48593.353..86376.283 rows=5000 loops=1)
Filter: (((status__c)::text = 'Available'::text) AND ((vendor_shipping_country__c)::text = ANY ('us,usa,"united states","united states of america"'::text[])) AND ((vendor_catalog_tier__c)::text = ANY ('a1c37000000omDQAAY,a1c37000000omDTAAY,a1c37000000omDXAAY,a1c37000000omDYAAY,a1c37000000omDZAAY,a1c37000000omDdAAI,a1c37000000omDfAAI,a1c37000000omDiAAI,a1c37000000oml6AAA,a1c37000000oljPAAQ,a1c37000000oljRAAQ,a1c37000000oljWAAQ,a1c37000000oljXAAQ,a1c37000000oljZAAQ,a1c37000000oljcAAA,a1c37000000oljdAAA,a1c37000000oljlAAA,a1c37000000oljoAAA,a1c37000000oljqAAA,a1c37000000olnvAAA,a1c37000000olnwAAA,a1c37000000olnxAAA,a1c37000000olnyAAA,a1c37000000olo0AAA,a1c37000000olo1AAA,a1c37000000olo4AAA,a1c37000000olo8AAA,a1c37000000olo9AAA,a1c37000000oloCAAQ,a1c37000000oloFAAQ,a1c37000000oloIAAQ,a1c37000000oloJAAQ,a1c37000000oloMAAQ,a1c37000000oloNAAQ,a1c37000000oloSAAQ,a1c37000000olodAAA,a1c37000000oloeAAA,a1c37000000olzCAAQ,a1c37000000om0xAAA,a1c37000000ooV1AAI,a1c37000000oog8AAA,a1c37000000oogDAAQ,a1c37000000oonzAAA,a1c37000000oluuAAA,a1c37000000pT7SAAU,a1c37000000oljnAAA,a1c37000000olumAAA,a1c37000000oljpAAA,a1c37000000pUm2AAE,a1c37000000olo3AAA,a1c37000000oo1MAAQ,a1c37000000oo1vAAA,a1c37000000pWxgAAE,a1c37000000pYJkAAM,a1c37000000omDjAAI,a1c37000000ooTgAAI,a1c37000000op2GAAQ,a1c37000000one0AAA,a1c37000000oljYAAQ,a1c37000000pUlxAAE,a1c37000000oo9SAAQ,a1c37000000pcIYAAY,a1c37000000pamtAAA,a1c37000000pd2QAAQ,a1c37000000pdCOAAY,a1c37000000OpPaAAK,a1c37000000OphZAAS,a1c37000000olNkAAI'::text[])))
Rows Removed by Filter: 1707920
Planning time: 1.685 ms
Execution time: 86377.139 ms

Thanks,

Aslam Bari

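Comments:

Can you also show the EXPLAIN ANALYSE output for the query? And also the CREATE TABLE structure (***.com/questions/2593803/…).

Unrelated, but why are you putting a space between the table/alias name and the dot? Also, don't use uppercase identifiers like that. If you really want uppercase, you need to put it in quotes. Otherwise use lowercase.

@eurotrash I just copy-pasted the query from the Postgres client, and it added the spaces. As for the dot, that is actually a schema, "staging".

@RaymondNijland EXPLAIN ANALYZE added.

Answer 1: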

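You may want to consider using a GIN or GiST index on the staging.product2 table. A double-sided ILIKE (wildcards on both ends) is slow and hard to improve by much. I have seen GIN indexes speed up similar queries by 60-80%.

See this doc.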
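As a rough sketch of what that could look like, assuming the pg_trgm extension can be installed on this database: the index names and the search term 'shoe' below are made up for illustration, and the columns are the two searched with ILIKE in the question. Keep in mind that a trigram index can only narrow a search when the pattern contains extractable trigrams, so a one-letter pattern such as '%s%' would most likely degenerate to scanning the whole index.

```sql
-- Assumption: the pg_trgm extension is available on this server.
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- Hypothetical index names. CONCURRENTLY avoids blocking writes,
-- but it cannot be run inside a transaction block.
CREATE INDEX CONCURRENTLY product2_name_trgm_idx
    ON staging.product2 USING gin (name gin_trgm_ops);

CREATE INDEX CONCURRENTLY product2_productcode_trgm_idx
    ON staging.product2 USING gin (productcode gin_trgm_ops);

-- Check whether the planner picks the indexes up for a longer term.
-- 'shoe' is an illustrative search term, not taken from the question.
EXPLAIN (ANALYZE, BUFFERS)
SELECT sfid
FROM staging.product2
WHERE name ILIKE '%shoe%'
   OR productcode ILIKE '%shoe%';
```

If the plan still shows a sequential scan or a filter on another index, the trigram indexes are not being used for that query shape.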

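Comments:

I have already added both GIN and GiST indexes on the columns where I use ILIKE. The query is still slow.

Also, if I put &* in the search term, the query takes 100+ seconds.

Does table size matter? My product table is 11 GB and RAM is 8 GB. Does that matter?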
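One way to put numbers on the table-size question is to compare the on-disk size of the table and its indexes with the server's memory settings. This is only a sketch using built-in PostgreSQL size functions; whether the working set actually fits in cache depends on the workload, and on a managed service such as Heroku some of these settings may not be adjustable.

```sql
-- Size of the heap, of all indexes, and of both together.
SELECT pg_size_pretty(pg_table_size('staging.product2'))          AS table_size,
       pg_size_pretty(pg_indexes_size('staging.product2'))        AS indexes_size,
       pg_size_pretty(pg_total_relation_size('staging.product2')) AS total_size;

-- Memory-related settings to compare against.
SHOW shared_buffers;
SHOW effective_cache_size;
SHOW work_mem;
```

If the total size is well above available memory, queries that touch many heap blocks (the first plan reports Heap Blocks: exact=88436) are more likely to hit disk.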
