SELECT后面有自定义函数的优化方法

Posted robinson1988

tags:

篇首语:本文由小常识网(cha138.com)小编为大家整理,主要介绍了SELECT后面有自定义函数的优化方法相关的知识,希望对你有一定的参考价值。

又是EBS,一个SQL有700多行,要跑2个多小时,EBS开发请求帮忙优化
因为SQL太长,将其做了简化处理,简化后的SQL文本如下:

INSERT INTO nso_allocaton_fr_tmp
  (create_date,
   request_id,
   c1,
   c2,
   c3,
   c4,
   c5,
   c6,
   c7,
   c8,
   c9,
   c10,
   c11,
   c12,
   c13,
   c14,
   c15,
   c16,
   c17,
   c18,
   c19,
   c20,
   c21,
   c22,
   c23,
   c24,
   c25,
   c26,
   c27,
   c28,
   c29,
   c30,
   c31,
   c32,
   c33,
   c34,
   c35,
   c36,
   c37,
   c38,
   c39,
   c40,
   c41,
   c42,
   c43,
   c44,
   c45,
   c46,
   c47,
   c48,
   c49,
   c50,
   c51,
   c52,
   c53,
   c54,
   c55,
   c56,
   c57,
   c58,
   c59,
   c60,
   c61,
   c62,
   c63,
   c64,
   c65,
   c66,
   c67,
   c68,
   c70)
  (SELECT SYSDATE,
          fnd_global.conc_request_id,
          t.ou_name,
          t.ou_name,
          t.department_name,
          t.sales_person,
          t.project_number,
          t.item_category,
          (SELECT ffv.description
             FROM fnd_flex_value_sets ffvs, fnd_flex_values_vl ffv
            WHERE ffvs.flex_value_set_id = ffv.flex_value_set_id
              AND ffvs.flex_value_set_name = 'XY_COA_PD'
              AND ffv.flex_value = t.item_category
              AND ffv.enabled_flag = 'Y'
              AND nvl(ffv.start_date_active, trunc(SYSDATE)) <=
                  trunc(SYSDATE)
              AND nvl(ffv.end_date_active, trunc(SYSDATE)) >= trunc(SYSDATE)
              AND rownum = 1) item_category_desc,
          t.item_type,
          t.trade_mode,
          t.po_type,
          t.so_area,
          t.so_type,
          DECODE('NET',
                 'TOTAL',
                 DECODE(SUBSTR(t.segment3, 1, 8),
                        '60010118',
                        NULL,
                        t.so_quantity),
                 t.so_quantity),
          DECODE('NET',
                 'TOTAL',
                 DECODE(SUBSTR(t.segment3, 1, 8),
                        '60010118',
                        NULL,
                        t.internal_so_quantity),
                 t.internal_so_quantity),
          DECODE('NET',
                 'TOTAL',
                 DECODE(SUBSTR(t.segment3, 1, 8),
                        '60010118',
                        NULL,
                        t.actual_so_quantity),
                 t.actual_so_quantity),
          DECODE('NET',
                 'TOTAL',
                 DECODE(SUBSTR(t.segment3, 1, 8),
                        '60010118',
                        NULL,
                        t.so_revenue),
                 t.so_revenue),
          DECODE('NET',
                 'TOTAL',
                 DECODE(SUBSTR(t.segment3, 1, 8),
                        '60010118',
                        NULL,
                        t.internal_so_revenue),
                 t.internal_so_revenue),
          DECODE('NET',
                 'TOTAL',
                 DECODE(SUBSTR(t.segment3, 1, 8),
                        '60010118',
                        NULL,
                        t.actual_so_revenue),
                 t.actual_so_revenue),
          DECODE('NET',
                 'TOTAL',
                 DECODE(SUBSTR(t.segment3, 1, 8),
                        '64010102',
                        NULL,
                        t.so_cost),
                 t.so_cost),
          DECODE('NET',
                 'TOTAL',
                 DECODE(SUBSTR(t.segment3, 1, 8),
                        '64010102',
                        NULL,
                        t.internal_so_cost),
                 t.internal_so_cost),
          t.so_cost_diff,
          (decode(t.actual_so_revenue,
                  0,
                  NULL,
                  NULL,
                  NULL,
                  round(100 *
                        (nvl(t.so_cost_diff, 0) - nvl(t.expense_amount, 0)) /
                        t.actual_so_revenue,
                        2))),
          t.expense_amount,
          t.margin_amount,
          (decode(t.actual_so_revenue,
                  0,
                  NULL,
                  NULL,
                  NULL,
                  round(100 *
                        (nvl(t.so_cost_diff, 0) - nvl(t.expense_amount, 0)) /
                        t.actual_so_revenue,
                        2))),
          abs(decode(t.actual_so_revenue,
                     0,
                     NULL,
                     NULL,
                     NULL,
                     round(100 * t.expense_amount / t.actual_so_revenue, 2))),
          t.so_number,
          t.so_line_num,
          t.ar_trx_number,
          t.ar_trx_line_num,
          t.gl_je_name,
          t.gl_je_line_num,
          t.lot_number,
          t.inv_organization_name,
          t.inv_item_number,
          t.inv_item_description,
          t.plan_po_number,
          t.plan_po_line_number,
          t.po_number,
          t.po_line_number,
          t.ap_invoice_number,
          t.ap_invoice_line_num,
          t.ra_trx_type_name,
          t.so_party_name,
          t.po_vendor_name,
          t.so_name,
          t.po_name,
          to_char(t.trans_date, 'YYYY-MM-DD HH24:MI:SS'),
          t.po_type1,
          t.comments,
          t.sales_type,
          t.primary_uom_code,
          t.project_number_desc,
          t.sales_percentage_rate,
          t.summary,
          t.so_country_name,
          t.rec_method,
          t.blanket_number,
          t.source_type,
          nvl(t.so_cost, 0) - nvl(t.purchase_cost, 0) -
          nvl(t.sales_cost, 0),
          t.purchase_cost,
          t.sales_cost,
          t.innerordercode,
          t.parent_project_number,
          nar_ar14_analysis_extract_pkg.get_out_contract_number(t.plan_po_number,
                                                                t.project_number,
                                                                t.org_id)
         ,DECODE(SUBSTR(t.segment3, 1, 8),
                         '60010118',
                         DECODE((SELECT COUNT(*)
                                  FROM gl_je_headers    gjh,
                                       gl_je_sources_tl gs
                                 WHERE gjh.je_header_id = t.je_header_id
                                   AND gjh.je_source = gs.je_source_name
                                   AND gs.language = 'ZHS'
                                   AND gs.user_je_source_name IN
                                       ('人工', '自动复制')),
                                0,
                                NULL,
                                (SELECT TO_NUMBER(gjl.attribute5) *
                                        
                                        DECODE(nvl(nvl(gjl.accounted_dr,
                                                       gjl.entered_dr),
                                                   0),
                                               0,
                                               -1,
                                               1)
                                   FROM gl_je_lines gjl
                                  WHERE gjl.je_header_id = t.je_header_id
                                    AND gjl.je_line_num = t.gl_je_line_num)),
                         NULL),
            DECODE(SUBSTR(t.segment3, 1, 8),
                         '60010118',
                         DECODE((SELECT COUNT(*)
                                  FROM gl_je_headers    gjh,
                                       gl_je_sources_tl gs
                                 WHERE gjh.je_header_id = t.je_header_id
                                   AND gjh.je_source = gs.je_source_name
                                   AND gs.language = 'ZHS'
                                   AND gs.user_je_source_name IN
                                       ('人工', '自动复制')
                                ),
                                0,
                                NULL,
                                (SELECT nvl(gjl.accounted_cr, 0) -
                                        nvl(gjl.accounted_dr, 0)
                                   FROM gl_je_lines gjl
                                  WHERE gjl.je_header_id = t.je_header_id
                                    AND gjl.je_line_num = t.gl_je_line_num)),
                         NULL),
            DECODE(SUBSTR(t.segment3, 1, 8),
                         '64010102',
                         DECODE((SELECT COUNT(*)
                                  FROM gl_je_headers    gjh,
                                       gl_je_sources_tl gs
                                 WHERE gjh.je_header_id = t.je_header_id
                                   AND gjh.je_source = gs.je_source_name
                                   AND gs.language = 'ZHS'
                                   AND gs.user_je_source_name IN
                                       ('人工', '自动复制')),
                                0,
                                NULL,
                                (SELECT nvl(gjl.accounted_dr, 0) -
                                        nvl(gjl.accounted_cr, 0)
                                   FROM gl_je_lines gjl
                                  WHERE gjl.je_header_id = t.je_header_id
                                    AND gjl.je_line_num = t.gl_je_line_num)),
                         NULL), 'RST' FROM x t  ---x原本是一个超大的SELECT语句,它没有性能瓶颈,为了简化SQL,创建临时表x代替
   );

x一共有4768314行数据,没有过滤条件

SQL> select count(*) from x;
  COUNT(*)
----------
   4768314

SQL里面有一堆标量子查询和自定义函数,肯定是标量子查询或者自定义函数影响了SQL的性能
可以把SQL拆了,注释掉自定义函数或者注释掉标量子查询,再跑SQL看谁影响了SQL性能  
最终定位到是自定义函数拖慢了整个SQL

nar_ar14_analysis_extract_pkg.get_out_contract_number(t.plan_po_number,
                                                      t.project_number,
                                                      t.org_id)
自定义函数的代码如下:

FUNCTION get_out_contract_number(p_plan_po_number IN VARCHAR2
                                  ,p_project_number IN VARCHAR2
                                  ,p_org_id         IN NUMBER) RETURN VARCHAR2 IS
    l_return VARCHAR2(240);
  BEGIN
    IF p_plan_po_number IS NOT NULL THEN
      SELECT pha.attribute1
        INTO l_return
        FROM po_headers_all pha
       WHERE pha.type_lookup_code = 'PLANNED'
         AND pha.attribute_category = 'CONTRACT'
         AND pha.segment1 = p_plan_po_number
         AND pha.org_id = p_org_id
         AND pha.attribute1 IS NOT NULL
         AND ROWNUM = 1;
    ELSIF p_project_number IS NOT NULL THEN
      SELECT pha.attribute1
        INTO l_return
        FROM po_distributions_all pda
            ,po_headers_all       pha
            ,pa_projects_all      ppa
       WHERE pha.po_header_id = pda.po_header_id
         AND pda.project_id = ppa.project_id
         AND pha.type_lookup_code = 'PLANNED'
         AND pha.attribute_category = 'CONTRACT'
         AND ppa.segment1 = p_project_number
         AND rownum = 1;
    ELSE
      l_return := NULL;
    END IF;
    RETURN l_return;
  EXCEPTION
    WHEN OTHERS THEN
      RETURN NULL;
  END get_out_contract_number;

通过监控ASH,也能定位到是自定义函数拖慢了整个SQL

SQL> select sql_id, sql_child_number, event, top_level_sql_id, count(*)
  2    from gv$active_session_history
  3   where session_id = 937
  4     and sample_time >
  5         to_date('2022-12-01 14:18:00', 'yyyy-mm-dd hh24:mi:ss')
  6   group by sql_id, sql_child_number, event, top_level_sql_id
  7   order by 5 desc;
SQL_ID        SQL_CHILD_NUMBER EVENT                           TOP_LEVEL_SQL_ID   COUNT(*)
------------- ---------------- ------------------------------- ---------------- ----------
16j3dr5ws843j                5                                 1upgmu861mtaz          3110
1upgmu861mtaz                0                                 1upgmu861mtaz            55
                            -1                                 1upgmu861mtaz             3
1upgmu861mtaz                0 db file sequential read         1upgmu861mtaz             2
                             0                                                           1
264xbz7w614gx                5                                 1upgmu861mtaz             1

SQL> select sql_text from v$sql where sql_id='16j3dr5ws843j' and rownum=1;
SQL_TEXT
--------------------------------------------------------------------------------
SELECT PHA.ATTRIBUTE1 FROM PO_DISTRIBUTIONS_ALL PDA ,PO_HEADERS_ALL PHA ,PA_PROJ

SQL> select sql_text from v$sql where sql_id='1upgmu861mtaz' and rownum=1;
SQL_TEXT
--------------------------------------------------------------------------------
INSERT INTO nso_allocaton_fr_tmp   (create_date,    request_id,    c1,    c2,

定位到是因为自定义函数部分拖慢了性能就好办了

优化方法一:

将自定义函数的逻辑改写为标量子查询(为了方便读者阅读,进一步简化了SQL,省略了其他列)

SQL> INSERT INTO nso_allocaton_fr_tmp
  2    (c65)
  3    (SELECT
  4            case
  5              when plan_po_number IS NOT NULL THEN
  6             (SELECT pha.attribute1
  7               FROM po_headers_all pha
  8              WHERE pha.type_lookup_code = 'PLANNED'
  9                AND pha.attribute_category = 'CONTRACT'
 10                AND pha.segment1 = t.plan_po_number
 11                AND pha.org_id = t.org_id
 12                AND pha.attribute1 IS NOT NULL
 13          AND ROWNUM=1
 14                )
 15              when project_number IS NOT NULL THEN
 16             (SELECT pha.attribute1
 17               FROM po_distributions_all pda,
 18                    po_headers_all       pha,
 19                    pa_projects_all      ppa
 20              WHERE pha.po_header_id = pda.po_header_id
 21                AND pda.project_id = ppa.project_id
 22                AND pha.type_lookup_code = 'PLANNED'
 23                AND pha.attribute_category = 'CONTRACT'
 24                AND ppa.segment1 = t.project_number
 25          AND ROWNUM=1
 26                )
 27              else null end FROM x t
 28     );
 
4768314 rows inserted
Executed in 21.315 seconds

优化方法二:

不改写自定义函数逻辑,把对自定义函数的直接访问改成标量子查询+dual来访问

SQL> INSERT INTO nso_allocaton_fr_tmp
  2    (c65)
  3    (SELECT (select nar_ar14_analysis_extract_pkg.get_out_contract_number(t.plan_po_number,
  4                                                                          t.project_number,
  5                                                                          t.org_id)
  6               from dual)
  7       FROM x t);
  
4768314 rows inserted
Executed in 25.836 seconds

优化方法一跑了21秒,优化方法二跑了25秒,优化方法二多出来的4秒是SQL引擎和PLSQL引擎上下文切换引起的

读者可能会有疑问,为什么将SQL

INSERT INTO nso_allocaton_fr_tmp
  (c65)
  (SELECT nar_ar14_analysis_extract_pkg.get_out_contract_number(t.plan_po_number,
                                                                t.project_number,
                                                                t.org_id)
     FROM x t);

改成

INSERT INTO nso_allocaton_fr_tmp
  (c65)
  (SELECT (select nar_ar14_analysis_extract_pkg.get_out_contract_number(t.plan_po_number,
                                                                        t.project_number,
                                                                        t.org_id)
             from dual)
     FROM x t);

就能从2个多小时优化到几十秒呢?

那是因为没有改写之前,函数nar_ar14_analysis_extract_pkg.get_out_contract_number要被调用4768314次
而改写之后,函数nar_ar14_analysis_extract_pkg.get_out_contract_number只会被调用2485次(标量子查询特性)

SQL> select count(*)
  2    from ( 
  3           select t.plan_po_number, t.project_number, t.org_id
  4             from x t
  5            group by t.plan_po_number, t.project_number, t.org_id);
  
  COUNT(*)
----------
      2485
      
如果函数传参的列是主键,那么前面两种优化方法都不对,需要将函数的SQL改写为LEFT JOIN,这里就不做展示了

各位读者,如果以后遇到了自定义函数引起SQL慢,自定义函数是直接访问(没有通过select...dual访问)
自定义函数逻辑非常复杂(改写困难),传参列distinct值少,可以采用本案例第二种优化方法试一试      

以上是关于SELECT后面有自定义函数的优化方法的主要内容,如果未能解决你的问题,请参考以下文章

SELECT后面有自定义函数的优化方法

T-SQL编程 —— 用户自定义函数(标量函数)

Oracle优化笔记

ORACLE当中自定义函数性优化浅析

JVM逃逸分析

sqlserver 标量函数例子