Select query in Oracle taking a long time despite low cost and low cardinality

Posted: 2019-09-03 09:17:19

Question:

I am trying to run a select query that uses a regular expression to split the comma-separated values in a column into multiple rows:

select  INCIDENTID, trim(regexp_substr(CASUALFACTORS,'[^,]+', 1, level) ) value  
FROM PSA.HSERS_INCIDENT_PSA WHERE ACTIVE_FLAG='1'
AND ISDELETED = 'F'
connect by regexp_substr(CASUALFACTORS, '[^,]+', 1, level) is not null
order by INCIDENTID

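The explain plan shows a cost of 3206 and a cardinality of 1,08,849 for this query.

Even so, it takes a very long time to execute (it is still running after 30 minutes).

Please advise.

An earlier idea was to build this SQL with XMLTABLE, but its cost was even higher: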

SELECT INCIDENTID,
       COLUMN_VALUE AS CASUALFACTORS
FROM   (
         SELECT INCIDENTID,
                CASUALFACTORS AS STR 
         FROM   PSA.HSERS_INCIDENT_PSA
         WHERE  ACTIVE_FLAG='1'
         AND    ISDELETED = 'F' 
       ) T,
       XMLTABLE ( ('"' || REPLACE (str, ',', '","') || '"'))


Sample data -

      select incidentid, dbms_lob.substr(CASUALFACTORS,3000)
   from PSA.HSERS_INCIDENT_PSA WHERE ACTIVE_FLAG='1'
AND ISDELETED = 'F' and incidentid = 526849

526849  8,7,26

There are no duplicates for incidentid column in the table 

DDL for the table -

ALTER TABLE PSA.HSERS_INCIDENT_PSA
 DROP PRIMARY KEY CASCADE;

DROP TABLE PSA.HSERS_INCIDENT_PSA CASCADE CONSTRAINTS;

CREATE TABLE PSA.HSERS_INCIDENT_PSA
(
  INCIDENTID                      INTEGER,
  INCIDENTTYPE                    INTEGER,
  SUPPLEMENTALINCIDENTTYPE        INTEGER,
  PENDINGREVIEW                   VARCHAR2(1 BYTE),
  NEARMISSTYPE                    INTEGER,
  NEARMISSSUBCONTRACTOR           INTEGER,
  RELEASEDATE                     TIMESTAMP(6),
  NEARMISSIDENTIFICATION          INTEGER,
  NEARMISSCATEGORY                INTEGER,
  PROFITCENTER                    VARCHAR2(10 BYTE),
  INCIDENTNUMBER                  NUMBER(19),
  INCIDENTCODE                    VARCHAR2(15 BYTE),
  HIRNUMBER                       VARCHAR2(8 BYTE),
  REDBORDERALERTSENT              VARCHAR2(1 BYTE),
  TASKORDER                       VARCHAR2(3 BYTE),
  LOGCAPAREA                      INTEGER,
  TEAMCONNECTNUMBER               VARCHAR2(20 BYTE),
  INCIDENTDATE                    TIMESTAMP(6),
  INCIDENTTIME                    VARCHAR2(10 BYTE),
  REPORTDATE                      TIMESTAMP(6),
  REPORTTIME                      VARCHAR2(5 BYTE),
  INSERTDATETIME                  TIMESTAMP(6),
  REPORTEDBY                      VARCHAR2(40 BYTE),
  SUPERVISOR1                     VARCHAR2(40 BYTE),
  SUPERVISOR2                     VARCHAR2(40 BYTE),
  CLIENT                          INTEGER,
  PROJECTLOCATION                 INTEGER,
  INCIDENTAREA                    INTEGER,
  INCIDENTLOCATION                INTEGER,
  INCIDENTAREADESCRIPTION         VARCHAR2(80 BYTE),
  DRUGALCOHOLTEST                 INTEGER,
  NODRUGTESTRESPONSE              INTEGER,
  NODRUGTESTCOMMENTS              CLOB,
  FACTS                           CLOB,
  POTENTIALCONSEQUENCES           INTEGER,
  LIKELIHOODRATING                INTEGER,
  RISKASSESSMENTSEVERITY          INTEGER,
  COVEREDBYTSTI_JSA               VARCHAR2(1 BYTE),
  INJURINGEVENTDISCUSSEDTSTI_JSA  VARCHAR2(1 BYTE),
  EXTERNALASSESSMENTFINEASSESED   NUMBER(15,2),
  REGULATORYAGENCY                INTEGER,
  INSPECTIONNUMBER                VARCHAR2(12 BYTE),
  INSPECTIONRESULTS               INTEGER,
  INSPECTIONCLOSEDATE             TIMESTAMP(6),
  EXTERNALASSESSMENTSEVERITY      INTEGER,
  CORRECTIVEMEASURES              CLOB,
  WITNESSDETAILS                  CLOB,
  EXTERNALINVESTIGATORCOMMENTS    CLOB,
  CREATEDBY                       VARCHAR2(50 BYTE),
  CREATEDON                       TIMESTAMP(6),
  UPDATEDBY                       VARCHAR2(50 BYTE),
  UPDATEDON                       TIMESTAMP(6),
  ISDELETED                       VARCHAR2(1 BYTE),
  INCIDENTSTATUS                  VARCHAR2(50 BYTE),
  PARENTINCIDENTID                NUMBER(19),
  KEYSTOLIFE                      CLOB,
  OFFICEBEHAVIORS                 CLOB,
  CASUALFACTORS                   CLOB,
  STANDARDSVIOLATED               CLOB,
  SUBCONTRACTORADDRESS            CLOB,
  CONFIDENTIALFACTS               CLOB,
  LASTVISITEDTAB                  VARCHAR2(50 BYTE),
  LAST_RQST_ID                    VARCHAR2(60 BYTE),
  DATA_SYS_SK                     VARCHAR2(3 BYTE),
  CREATED_BY                      VARCHAR2(50 BYTE),
  CREATED_DDTM                    DATE,
  LAST_MODIFIED_BY                VARCHAR2(50 BYTE),
  LAST_MODIFIED_DDTM              DATE,
  ACTIVE_FLAG                     VARCHAR2(1 BYTE)
)
LOB (NODRUGTESTCOMMENTS) STORE AS (
  TABLESPACE  PSA_DATA
  ENABLE      STORAGE IN ROW
  CHUNK       16384
  RETENTION
  NOCACHE
  LOGGING
      STORAGE    (
                  INITIAL          80K
                  NEXT             1M
                  MINEXTENTS       1
                  MAXEXTENTS       UNLIMITED
                  PCTINCREASE      0
                  BUFFER_POOL      DEFAULT
                  FLASH_CACHE      DEFAULT
                  CELL_FLASH_CACHE DEFAULT
                 ))
LOB (FACTS) STORE AS (
  TABLESPACE  PSA_DATA
  ENABLE      STORAGE IN ROW
  CHUNK       16384
  RETENTION
  NOCACHE
  LOGGING
      STORAGE    (
                  INITIAL          80K
                  NEXT             1M
                  MINEXTENTS       1
                  MAXEXTENTS       UNLIMITED
                  PCTINCREASE      0
                  BUFFER_POOL      DEFAULT
                  FLASH_CACHE      DEFAULT
                  CELL_FLASH_CACHE DEFAULT
                 ))
LOB (CORRECTIVEMEASURES) STORE AS (
  TABLESPACE  PSA_DATA
  ENABLE      STORAGE IN ROW
  CHUNK       16384
  RETENTION
  NOCACHE
  LOGGING
      STORAGE    (
                  INITIAL          80K
                  NEXT             1M
                  MINEXTENTS       1
                  MAXEXTENTS       UNLIMITED
                  PCTINCREASE      0
                  BUFFER_POOL      DEFAULT
                  FLASH_CACHE      DEFAULT
                  CELL_FLASH_CACHE DEFAULT
                 ))
LOB (WITNESSDETAILS) STORE AS (
  TABLESPACE  PSA_DATA
  ENABLE      STORAGE IN ROW
  CHUNK       16384
  RETENTION
  NOCACHE
  LOGGING
      STORAGE    (
                  INITIAL          80K
                  NEXT             1M
                  MINEXTENTS       1
                  MAXEXTENTS       UNLIMITED
                  PCTINCREASE      0
                  BUFFER_POOL      DEFAULT
                  FLASH_CACHE      DEFAULT
                  CELL_FLASH_CACHE DEFAULT
                 ))
LOB (EXTERNALINVESTIGATORCOMMENTS) STORE AS (
  TABLESPACE  PSA_DATA
  ENABLE      STORAGE IN ROW
  CHUNK       16384
  RETENTION
  NOCACHE
  LOGGING
      STORAGE    (
                  INITIAL          80K
                  NEXT             1M
                  MINEXTENTS       1
                  MAXEXTENTS       UNLIMITED
                  PCTINCREASE      0
                  BUFFER_POOL      DEFAULT
                  FLASH_CACHE      DEFAULT
                  CELL_FLASH_CACHE DEFAULT
                 ))
LOB (KEYSTOLIFE) STORE AS (
  TABLESPACE  PSA_DATA
  ENABLE      STORAGE IN ROW
  CHUNK       16384
  RETENTION
  NOCACHE
  LOGGING
      STORAGE    (
                  INITIAL          80K
                  NEXT             1M
                  MINEXTENTS       1
                  MAXEXTENTS       UNLIMITED
                  PCTINCREASE      0
                  BUFFER_POOL      DEFAULT
                  FLASH_CACHE      DEFAULT
                  CELL_FLASH_CACHE DEFAULT
                 ))
LOB (OFFICEBEHAVIORS) STORE AS (
  TABLESPACE  PSA_DATA
  ENABLE      STORAGE IN ROW
  CHUNK       16384
  RETENTION
  NOCACHE
  LOGGING
      STORAGE    (
                  INITIAL          80K
                  NEXT             1M
                  MINEXTENTS       1
                  MAXEXTENTS       UNLIMITED
                  PCTINCREASE      0
                  BUFFER_POOL      DEFAULT
                  FLASH_CACHE      DEFAULT
                  CELL_FLASH_CACHE DEFAULT
                 ))
LOB (CASUALFACTORS) STORE AS (
  TABLESPACE  PSA_DATA
  ENABLE      STORAGE IN ROW
  CHUNK       16384
  RETENTION
  NOCACHE
  LOGGING
      STORAGE    (
                  INITIAL          80K
                  NEXT             1M
                  MINEXTENTS       1
                  MAXEXTENTS       UNLIMITED
                  PCTINCREASE      0
                  BUFFER_POOL      DEFAULT
                  FLASH_CACHE      DEFAULT
                  CELL_FLASH_CACHE DEFAULT
                 ))
LOB (STANDARDSVIOLATED) STORE AS (
  TABLESPACE  PSA_DATA
  ENABLE      STORAGE IN ROW
  CHUNK       16384
  RETENTION
  NOCACHE
  LOGGING
      STORAGE    (
                  INITIAL          80K
                  NEXT             1M
                  MINEXTENTS       1
                  MAXEXTENTS       UNLIMITED
                  PCTINCREASE      0
                  BUFFER_POOL      DEFAULT
                  FLASH_CACHE      DEFAULT
                  CELL_FLASH_CACHE DEFAULT
                 ))
LOB (SUBCONTRACTORADDRESS) STORE AS (
  TABLESPACE  PSA_DATA
  ENABLE      STORAGE IN ROW
  CHUNK       16384
  RETENTION
  NOCACHE
  LOGGING
      STORAGE    (
                  INITIAL          80K
                  NEXT             1M
                  MINEXTENTS       1
                  MAXEXTENTS       UNLIMITED
                  PCTINCREASE      0
                  BUFFER_POOL      DEFAULT
                  FLASH_CACHE      DEFAULT
                  CELL_FLASH_CACHE DEFAULT
                 ))
LOB (CONFIDENTIALFACTS) STORE AS (
  TABLESPACE  PSA_DATA
  ENABLE      STORAGE IN ROW
  CHUNK       16384
  RETENTION
  NOCACHE
  LOGGING
      STORAGE    (
                  INITIAL          80K
                  NEXT             1M
                  MINEXTENTS       1
                  MAXEXTENTS       UNLIMITED
                  PCTINCREASE      0
                  BUFFER_POOL      DEFAULT
                  FLASH_CACHE      DEFAULT
                  CELL_FLASH_CACHE DEFAULT
                 ))
TABLESPACE PSA_DATA
RESULT_CACHE (MODE DEFAULT)
PCTUSED    0
PCTFREE    10
INITRANS   1
MAXTRANS   255
STORAGE    (
            INITIAL          80K
            NEXT             1M
            MAXSIZE          UNLIMITED
            MINEXTENTS       1
            MAXEXTENTS       UNLIMITED
            PCTINCREASE      0
            BUFFER_POOL      DEFAULT
            FLASH_CACHE      DEFAULT
            CELL_FLASH_CACHE DEFAULT
           )
LOGGING 
NOCOMPRESS 
NOCACHE
NOPARALLEL
MONITORING;


--  There is no statement for index PSA.SYS_C002335882.
--  The object is created when the parent object is created.

CREATE OR REPLACE SYNONYM BI_REPORTS_USER.HSERS_INCIDENT_PSA FOR PSA.HSERS_INCIDENT_PSA;


ALTER TABLE PSA.HSERS_INCIDENT_PSA ADD (
  PRIMARY KEY
  (INCIDENTID, LAST_RQST_ID)
  USING INDEX
    TABLESPACE PSA_DATA
    PCTFREE    10
    INITRANS   2
    MAXTRANS   255
    STORAGE    (
                INITIAL          80K
                NEXT             1M
                MAXSIZE          UNLIMITED
                MINEXTENTS       1
                MAXEXTENTS       UNLIMITED
                PCTINCREASE      0
                BUFFER_POOL      DEFAULT
                FLASH_CACHE      DEFAULT
                CELL_FLASH_CACHE DEFAULT
               )
  ENABLE VALIDATE);


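Comments:

Please edit your question to add some sample data. The first query you gave does not correlate rows between the different levels of the hierarchy, so if there is more than one input row it will generate exponentially more duplicate rows as the hierarchy level increases. This is why it is important to know what your input looks like: a single row will "work", but multiple rows may create millions of duplicate rows that you are not expecting.

Oracle performance tuning is tricky because many factors are at play. Please read this answer in another SO thread, which explains the information we need in order to answer your question. It may also give you enough clues about tuning that you can solve it yourself.

Answer 1: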

You can use a recursive subquery factoring clause and simple string functions (rather than slow regular expressions):

Oracle Setup

CREATE TABLE HSERS_INCIDENT_PSA ( incidentid, casualfactors, active_flag, isdeleted ) AS
  SELECT 1, 'a,b,c,d,e,f', '1', 'F' FROM DUAL UNION ALL
  SELECT 2, 'g,h,i,j,k',   '1', 'F' FROM DUAL UNION ALL
  SELECT 3, 'l',           '1', 'F' FROM DUAL;

Query

WITH casualfactors_bounds ( incidentid, casualfactors, startidx, endidx ) AS (
  SELECT incidentid,
         casualfactors,
         1,
         INSTR( casualfactors, ',', 1 )
  FROM   HSERS_INCIDENT_PSA
  WHERE  ACTIVE_FLAG = '1'
  AND    ISDELETED   = 'F'
UNION ALL
  SELECT incidentid,
         casualfactors,
         endidx + 1,
         INSTR( casualfactors, ',', endidx + 1 )
  FROM   casualfactors_bounds
  WHERE  endidx > 0
)
SELECT incidentid,
       CASE
       WHEN endidx = 0
       THEN SUBSTR( casualfactors, startidx )
       ELSE SUBSTR( casualfactors, startidx, endidx - startidx )
       END AS casualfactor
FROM   casualfactors_bounds
ORDER BY incidentid, startidx

Output

INCIDENTID | CASUALFACTOR
---------: | :-----------
         1 | a
         1 | b
         1 | c
         1 | d
         1 | e
         1 | f
         2 | g
         2 | h
         2 | i
         2 | j
         2 | k
         3 | l
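
A quick way to check that a split produced the right number of rows is to count delimiters. The following is only a sketch on the same three sample values, inlined so that it runs on its own; `sample_rows` is a hypothetical name, and `REGEXP_COUNT` assumes Oracle 11g or later. Note that `REGEXP_COUNT` returns NULL for a NULL list, which `SUM` then ignores.

```sql
-- Hypothetical inline copy of the sample data from the setup above.
WITH sample_rows (incidentid, casualfactors) AS (
  SELECT 1, 'a,b,c,d,e,f' FROM DUAL UNION ALL
  SELECT 2, 'g,h,i,j,k'   FROM DUAL UNION ALL
  SELECT 3, 'l'           FROM DUAL
)
-- Each list holds (number of commas + 1) values.
SELECT SUM(REGEXP_COUNT(casualfactors, ',') + 1) AS expected_rows
FROM   sample_rows;
```

For the sample data this gives 6 + 5 + 1 = 12, the same as the number of rows in the output. Running the same count against the real table gives a target to compare with the row count of any splitting query.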

Explain Plan

Plan hash value: 2740663158

------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                 | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                          |                    |     6 |   276 |     7  (15)| 00:00:01 |
|   1 |  SORT ORDER BY                            |                    |     6 |   276 |     7  (15)| 00:00:01 |
|   2 |   VIEW                                    |                    |     6 |   276 |     6   (0)| 00:00:01 |
|   3 |    UNION ALL (RECURSIVE WITH) BREADTH FIRST|                    |       |       |            |          |
|*  4 |     TABLE ACCESS FULL                     | HSERS_INCIDENT_PSA |     3 |    78 |     3   (0)| 00:00:01 |
|*  5 |     RECURSIVE WITH PUMP                   |                    |       |       |            |          |
------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - filter("ACTIVE_FLAG"='1' AND "ISDELETED"='F')
   5 - filter("ENDIDX">0)

Note
-----
   - dynamic sampling used for this statement (level=2)

db<>fiddle here

Comments:

Thank you very much for your help! The query returned the rows within 10 seconds!

Answer 2:

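Does FROM PSA.HSERS_INCIDENT_PSA WHERE ACTIVE_FLAG='1' AND ISDELETED = 'F' return more than one row? If so, the CONNECT BY clause will generate a Cartesian product of all rows. That blows up the real row count far beyond the estimated cardinality and leads to extreme run times.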
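
The effect can be reproduced on a tiny data set. The sketch below is an illustration only: `sample_rows` and its two rows are made-up stand-ins for the real table, and the column-list form of `WITH` assumes Oracle 11gR2 or later. Because nothing ties a child row to its parent, every level-1 row is joined to every row that has a second token, so tracing it by hand gives 6 rows for what are only 4 tokens.

```sql
-- Hypothetical two-row stand-in for the real table.
WITH sample_rows (incidentid, casualfactors) AS (
  SELECT 1, 'a,b' FROM DUAL UNION ALL
  SELECT 2, 'c,d' FROM DUAL
)
SELECT incidentid,
       LEVEL AS lvl,
       REGEXP_SUBSTR(casualfactors, '[^,]+', 1, LEVEL) AS token
FROM   sample_rows
-- No PRIOR condition: a child at level n+1 may descend from ANY row at level n.
CONNECT BY REGEXP_SUBSTR(casualfactors, '[^,]+', 1, LEVEL) IS NOT NULL
ORDER BY incidentid, lvl;
```

With N qualifying rows that each hold k values, the row count grows roughly as N + N^2 + ... + N^k, which is why a query like this on a large table may effectively never finish even though the optimizer reports a low cost.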

You can avoid this with a simple trick:

select  INCIDENTID, trim(regexp_substr(hip.CASUALFACTORS,'[^,]+', 1, level) ) value  
FROM PSA.HSERS_INCIDENT_PSA hip
WHERE hip.ACTIVE_FLAG='1'
AND   hip.ISDELETED = 'F'
connect by regexp_substr(CASUALFACTORS, '[^,]+', 1, level) is not null
            and hip.rowid = prior hip.rowid
            and prior sys_guid() is not null
order by hip.INCIDENTID
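
To see what the two extra conditions do, here is the same idea on a made-up two-row data set. `sample_rows` is a hypothetical name, and the sketch uses `incidentid = PRIOR incidentid` instead of `rowid`, since an inline view over `DUAL` has no usable `rowid`. `PRIOR SYS_GUID() IS NOT NULL` is always true; it is there so that each generated row looks different to Oracle's cycle detection, which would otherwise raise ORA-01436 once a row is connected to itself.

```sql
-- Hypothetical two-row stand-in for the real table.
WITH sample_rows (incidentid, casualfactors) AS (
  SELECT 1, 'a,b' FROM DUAL UNION ALL
  SELECT 2, 'c,d' FROM DUAL
)
SELECT incidentid,
       LEVEL AS lvl,
       REGEXP_SUBSTR(casualfactors, '[^,]+', 1, LEVEL) AS token
FROM   sample_rows
CONNECT BY REGEXP_SUBSTR(casualfactors, '[^,]+', 1, LEVEL) IS NOT NULL
       AND incidentid = PRIOR incidentid   -- a child must come from its own parent row
       AND PRIOR SYS_GUID() IS NOT NULL    -- always true; avoids the CONNECT BY loop error
ORDER BY incidentid, lvl;
```

Each input row now expands only along its own list, so the result has one row per token (4 rows here) instead of growing with the number of input rows.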

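Comments:

Thanks! The query still takes a long time to return results (more than 15 minutes and still running). I think I am doing something wrong.

Answer 3: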

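If the database is generating a bad plan, it may be because the statistics on the table are stale. You can run SELECT LAST_ANALYZED FROM ALL_TABLES WHERE OWNER = 'PSA' AND TABLE_NAME = 'HSERS_INCIDENT_PSA' to see when statistics were last gathered on this table, but you might as well go ahead and run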

BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(OWNNAME => 'PSA',
                                TABNAME => 'HSERS_INCIDENT_PSA',
                                CASCADE => TRUE);
END;

and then re-evaluate the plan.
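
When re-evaluating, it helps to look at what the statement actually did rather than only the estimate. A minimal sketch, assuming the session can read the `V$` views that `DBMS_XPLAN.DISPLAY_CURSOR` relies on; the `ROWNUM <= 100` filter is an arbitrary cap added here only to keep the test run short:

```sql
-- Run the statement once with row-source statistics collection enabled.
SELECT /*+ GATHER_PLAN_STATISTICS */
       incidentid, casualfactors
FROM   PSA.HSERS_INCIDENT_PSA
WHERE  active_flag = '1'
AND    isdeleted   = 'F'
AND    ROWNUM     <= 100;

-- Show the plan of the last statement executed in this session,
-- with estimated (E-Rows) and actual (A-Rows) row counts side by side.
SELECT *
FROM   TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'ALLSTATS LAST'));
```

A large gap between E-Rows and A-Rows on a plan line points at the step where the optimizer's estimate went wrong, which a low cost figure alone will not reveal.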
