Query to find index bloat

【Title】: Query to find Index bloat 【Posted】: 2017-09-04 10:59:16 【Question】:

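I need a query that tells me whether the indexes on a table are bloated. I have seen some queries that compare the table size with the index size. If there is another way, please share the query.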

I am using Greenplum 4.3 (based on Postgres 8.2).

【Comments】:

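Google points to wiki.postgresql.org/wiki/Index_Maintenance#Index_Bloat

That query uses a lateral join, which was introduced in PostgreSQL 9.3. Unfortunately, I am on an older version of PostgreSQL.

Have you tried the query in the "Detecting Bloat" chapter of the Greenplum manual?

Yes, but it covers table bloat, not index bloat.

【Answer 1】: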

Bloat score query

The following SQL query examines every table in the schema named xml and reports the dead rows (tuples) that are wasting disk space.

SELECT schemaname || '.' || relname as tblnam,
    n_dead_tup,
    (n_dead_tup::float / n_live_tup::float) * 100 as pfrag
FROM pg_stat_user_tables
WHERE schemaname = 'xml'
  AND n_dead_tup > 0
  AND n_live_tup > 0
ORDER BY pfrag DESC;

If this query returns a high percentage (pfrag) of dead tuples, the space can be reclaimed with the VACUUM command.

A pfrag value of 7 is considered high.
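As a rough sketch of how the threshold could be applied, the query below wraps the same calculation in a subquery and keeps only tables at or above 7. Treating the cutoff as ">= 7" is my assumption, and `xml.some_table` is a hypothetical table name. The query also assumes that `pg_stat_user_tables` exposes `n_live_tup` and `n_dead_tup`; as far as I know those columns arrived in PostgreSQL 8.3, so it may not run as-is on an 8.2-based Greenplum 4.3.

```sql
-- Keep only tables in schema 'xml' whose dead-tuple percentage is at least 7.
-- Assumption: "high" is read as pfrag >= 7.
SELECT tblnam, n_dead_tup, pfrag
FROM (
    SELECT schemaname || '.' || relname AS tblnam,
           n_dead_tup,
           (n_dead_tup::float / n_live_tup::float) * 100 AS pfrag
    FROM pg_stat_user_tables
    WHERE schemaname = 'xml'
      AND n_dead_tup > 0
      AND n_live_tup > 0
) AS t
WHERE pfrag >= 7
ORDER BY pfrag DESC;

-- Reclaim dead-tuple space and refresh statistics for one of the tables
-- reported above (hypothetical name).
VACUUM ANALYZE xml.some_table;
```

Note that this measures dead tuples in the table itself, so it is only an indirect hint about the state of its indexes.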

From wiki.postgres.org:

SELECT
  current_database(), schemaname, tablename, /*reltuples::bigint, relpages::bigint, otta,*/
  ROUND((CASE WHEN otta=0 THEN 0.0 ELSE sml.relpages::float/otta END)::numeric,1) AS tbloat,
  CASE WHEN relpages < otta THEN 0 ELSE bs*(sml.relpages-otta)::BIGINT END AS wastedbytes,
  iname, /*ituples::bigint, ipages::bigint, iotta,*/
  ROUND((CASE WHEN iotta=0 OR ipages=0 THEN 0.0 ELSE ipages::float/iotta END)::numeric,1) AS ibloat,
  CASE WHEN ipages < iotta THEN 0 ELSE bs*(ipages-iotta) END AS wastedibytes
FROM (
  SELECT
    schemaname, tablename, cc.reltuples, cc.relpages, bs,
    CEIL((cc.reltuples*((datahdr+ma-
      (CASE WHEN datahdr%ma=0 THEN ma ELSE datahdr%ma END))+nullhdr2+4))/(bs-20::float)) AS otta,
    COALESCE(c2.relname,'?') AS iname, COALESCE(c2.reltuples,0) AS ituples, COALESCE(c2.relpages,0) AS ipages,
    COALESCE(CEIL((c2.reltuples*(datahdr-12))/(bs-20::float)),0) AS iotta -- very rough approximation, assumes all cols
  FROM (
    SELECT
      ma,bs,schemaname,tablename,
      (datawidth+(hdr+ma-(case when hdr%ma=0 THEN ma ELSE hdr%ma END)))::numeric AS datahdr,
      (maxfracsum*(nullhdr+ma-(case when nullhdr%ma=0 THEN ma ELSE nullhdr%ma END))) AS nullhdr2
    FROM (
      SELECT
        schemaname, tablename, hdr, ma, bs,
        SUM((1-null_frac)*avg_width) AS datawidth,
        MAX(null_frac) AS maxfracsum,
        hdr+(
          SELECT 1+count(*)/8
          FROM pg_stats s2
          WHERE null_frac<>0 AND s2.schemaname = s.schemaname AND s2.tablename = s.tablename
        ) AS nullhdr
      FROM pg_stats s, (
        SELECT
          (SELECT current_setting('block_size')::numeric) AS bs,
          CASE WHEN substring(v,12,3) IN ('8.0','8.1','8.2') THEN 27 ELSE 23 END AS hdr,
          CASE WHEN v ~ 'mingw32' THEN 8 ELSE 4 END AS ma
        FROM (SELECT version() AS v) AS foo
      ) AS constants
      GROUP BY 1,2,3,4,5
    ) AS foo
  ) AS rs
  JOIN pg_class cc ON cc.relname = rs.tablename
  JOIN pg_namespace nn ON cc.relnamespace = nn.oid AND nn.nspname = rs.schemaname AND nn.nspname <> 'information_schema'
  LEFT JOIN pg_index i ON indrelid = cc.oid
  LEFT JOIN pg_class c2 ON c2.oid = i.indexrelid
) AS sml
ORDER BY wastedbytes DESC;
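For the index side of the question, the columns to read in that result are `ibloat` and `wastedibytes`. The snippet below is only a worked example of the two formulas with made-up numbers: an index occupying 500 pages (`ipages`) when the estimate is 200 pages (`iotta`), with an assumed block size (`bs`) of 8192 bytes. The actual block size on a given system is whatever `current_setting('block_size')` returns.

```sql
-- Hypothetical inputs: ipages = 500, iotta = 200, bs = 8192.
SELECT ROUND((500::float / 200)::numeric, 1) AS ibloat,        -- 2.5
       8192 * (500 - 200)                    AS wastedibytes;  -- 2457600
```

An `ibloat` of 2.5 would mean the index takes about two and a half times the estimated space. The estimate itself is marked in the query as a very rough approximation, so the figure is best treated as a hint for where to look rather than an exact measurement.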
