转 怎样解读10046 trace (tkprof 的结果 )

Posted feiyun8616的作坊 (半个程序员and dba)

tags:

篇首语:本文由小常识网(cha138.com)小编为大家整理,主要介绍了转 怎样解读10046 trace (tkprof 的结果 )相关的知识,希望对你有一定的参考价值。

TKProf Structure

TKProf output for an individual cursor has the following structure:


SELECT NULL FROM DUAL

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.00       0.00          0          3          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.00       0.00          0          3          0           1

Misses in library cache during parse: 0
Optimizer goal: FIRST_ROWS
Parsing user id: 271

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  TABLE ACCESS FULL DUAL (cr=3 pr=0 pw=0 time=21 us cost=1 size=1 card=1)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                       2        0.00          0.00
  SQL*Net message from client                     2        0.00          0.00

Overall the structure is:

  • SQL Statement
  • Parse/Execute/Fetch statistics and timings
  • Library Cache information
  • Row source plan
  • Events waited for by the statement

Parse/Execute/Fetch statistics and timings

This section contains the bulk of the useful timing information for each statement. This can be used in conjunction with the ‘Row source plan‘ and ‘Events waited for by the statement‘ to give the full picture.

Columns in the Parse/Execute/Fetch table have the following meanings:

call Statistics for each cursor‘s activity are divided in to 3 areas: Parse/Execute/Fetch. A total is also calculated.
Parse statistics from parsing the cursor. This includes information for plan generation etc.
Execute statistics for the execution phase of a cursor
Fetch statistics for actually fetching the rows
count number of times each individual activity has been performed on this particular cursor
cpu cpu time used by this cursor
elapsed elapsed time for this cursor (includes the cpu time)
disk This indicates the number of blocks read from disk. Generally it would be preferable for blocks to be read from the buffer cache rather than disk.
query This column is incremented if a buffer is read in Consistent mode. A Consistent mode buffer is one that has been generated to give a consistent read snapshot for a long running transaction.
current This column is incremented if a buffer is found in the buffer cache that is new enough for the current transaction and is in current mode (and it is not a CR buffer). This applies to buffers that have been read in to the cache as well as buffers that already exist in the cache in current mode.
rows Rows retrieved by this step


Library Cache information

Tracing a statement records some information regarding library cache usage which is externalised by TKProf in this section. Most important here is "Misses in library cache during parse:" which shows whether or not a statement is being re-parsed. If a statement is being shared well then you should see a minimal number of misses here (1 or 0 preferably). If sharing is not occurring then high values in this field can indicate that.

Row source plan
This section displays the access path used at execution time for each statement along with timing and actual row counts returned by each step in the plan. This can be very useful for a number of reasons.

Row source plans are generated from STAT lines in the raw trace.
STAT lines are written to trace every now and then, but sometimes, if the cursor is not closed cleanly then STAT lines will not be recorded and then the row source plan will not be displayed. Setting SQL_TRACE to false DOES NOT close all cursors. Cursors are closed in SQL*Plus immediately after execution. The safest way to close all cursors is to cleanly exit the session in question. See:

Document 312368.1 Why Row Source Plans or Row Counts are Missing in TKPROF Output

For details of interpreting 10046 output see:

Document 39817.1 Interpreting Raw SQL_TRACE and DBMS_SUPPORT.START_TRACE output

Example:

Rows     Row Source Operation
-------  ---------------------------------------------------
 [A]  1  TABLE ACCESS FULL DUAL [B] (cr=3 [C] pr=0 [D] pw=0 [E] time=21 us [F] cost=7 [G] size=7 [H] card=1 [I])
  • Row count [A]- the row counts output in this section are the actual number of rows returned at each step in the query execution. These actual counts can be compared with the estimated cardinalities (row counts) from an optimizer explain plan. Any differences may indicate a statistical problem that may result in a poor plan choice. 
    See:
    Document 214106.1 Using TKProf to compare actual and predicted row counts
  • Row Source Operation [B] - Shows the operation executed at this step in the plan.
  • IO Stats - For each step in the plan, [C] is the consistent reads, [D] is the physical reads and [E] is the writes. These statistics can be useful in identifying steps that read or write a particularly large proportion of the overall data.
  • Timing - [F] shows the cumulative elapsed time for the step and the steps that preceded it in microseconds (μs: 1/1000000 of a second). This section is very useful when looking for the point in an access path that takes all the time. By looking for the point at where the majority of the time originates it is possible to narrow down a number of problems.
  • On later releases, row source trace has been enhanced to include some optimizer information. [G] is the estimated cost of the operation used by the optimizer for internal comparison, [H] is the estimated space usage of the operation in bytes and [I] is estimated cardinality (number of rows returned) of that particular operation.

Note that TKProf also contains an explain plan feature which allows users to generate the explain plan the statement in question would use if it were to be executed NOW as the specified user. When looking at previously executed statements it is advisable not to use this option and to rely on the Row Source plans generated from the STAT lines in the trace, as above. See the TKProf section in:

 Document 199081.1 Overview Reference for SQL_TRACE, TKProf and Explain Plan


Events waited for by the statement
This section displays all wait events that a statement has waited for during the tracing.  This section can be very useful when used in conjunction with the statistics and row source information for tracking down the causes of problems associated with long wait times. High numbers of waits or waits with a long total duration may be candidates for investigation dependent on the wait itself.

General Tips

If a system is performing sub-optimally then one potential way of diagnosing potential causes is to trace a typical user session and then use TKProf to format the output. The numerous sort options available can provide a useful way of organising the output by moving the ‘top‘ statement in a particular category to the top of the list. A list of the sort options can be accessed by simply typing ‘TKProf‘ at the command prompt.

A useful starting point is the ‘fchela‘ sort option which orders the output by elapsed time spent fetching. The resultant .prf file will display the most time consuming SQL statement at the start of the file.

For actions to deal with degraded query performance located in this manner see:

Document 742112.1 Support Action Plan for Query Performance Degradation

Another useful parameter is sys=yes/no. This can be used to prevent SQL statements run as user SYS from being displayed. This can make the output file much shorter and easier to manage.

Remember to always set the TIMED_STATISTICS parameter to TRUE when tracing sessions as otherwise no time based comparisons can be made.

Potential TKProf Usage Examples

Spotting Relatively High Resource Usage

update ...
where  ...

-----------------------------------------------------------------------
| call    | count | cpu | elapsed | disk |   query | current |   rows |
|---------|-------|-----|---------|------|---------|---------|--------|
| Parse   |     1 |   7 |     122 |    0 |       0 |       0 |      0 |
| Execute |     1 |  75 |     461 |    5 | [H] 297 |   [I] 3 | [J]  1 |
| Fetch   |     0 |   0 |       0 |    0 |       0 |       0 |      0 |
-----------------------------------------------------------------------


This statement is a single execute of an update.

[H] shows that this query is visiting 297 buffers to find the rows to update
[I] shows that only 3 buffer are visited performing the update
[J] shows that only 1 row is updated.

Reading 297 buffers to update 1 rows is a lot of work and would tend to indicate that the access path being used is not particularly efficient. Perhaps there is an index missing that would improve the access performance?

Spotting Over Parsing

select ...

-------------------------------------------------------------------------
| call    | count |     cpu | elapsed | disk |  query | current |  rows |
|---------|-------|---------|---------|------|--------|---------|-------|
| Parse   | [M] 2 | [N] 221 |     329 |    0 |     45 |       0 |     0 |
| Execute |     3 | [O]   9 | [P]  17 |    0 |      0 |       0 |     0 |
| Fetch   |     3 |       6 |       8 |    0 | [L]  4 |       0 | [K] 1 |
-------------------------------------------------------------------------

Misses in library cache during parse: 2 [Q]

Here we have a select that we suspect may be a candidate for over parsing.

[K] is shows that the query has returned 1 row.
[L] shows that 4 buffers were read to get this row back.

This is fine.

[M] show that the statement is parsed twice - this is not desirable especially as the parse cpu usage is a high [N] in comparison to the execute figures : [O] & [P] (ie the elapsed time for execute is 17 seconds but the statement spends over 300 seconds to determine the access path etc in the parse phase.

 [Q] shows that these parses are hard parses. If [Q] was 1 then the statement would have had 1 hard parse followed by a soft parse (which just looks up the already parsed detail in the library cache). See:

Document 32895.1 SQL Parsing Flow Diagram

for more details.

This is not a particularly bad example in terms of total counts since the query has only been executed a few times. However if this pattern is reproduced for each execution this could be a significant issue. Excessive parsing should be avoided as far as possible by ensuring that code is shared:

  • using bind variables
  • make shared pool large enough to hold query definitions in memory long enough to be reused.

See:

Document 62143.1 Understanding and Tuning the Shared Pool


Spotting Queries that Execute too frequently

The following query has a high elapsed time and is a candidate for investigation:

UPDATE ...
SET ...
WHERE COL = :bind1;
    
call     count       cpu    elapsed       disk      query    current      rows
------- ------  -------- ---------- ---------- ---------- ----------  --------
Parse        0      0.00       0.00          0          0          0         0
Execute 488719  66476.95   66557.80          1     488729    1970566    488719
Fetch        0      0.00       0.00          0          0          0         0
------- ------  -------- ---------- ---------- ---------- ----------  --------
total   488719  66476.95   66557.80          1     488729    1970566    488719


From the above, the update executes 488,719 times and takes in total ~ 65,000 seconds to do this. The majority of the time is spent on CPU.  A single row is updated per execution. For each row updated ~1 buffer is queried. ~2 million buffers are visited to perform the update.

On average the elapsed time is ~ 0.1 second per execution. A sub-second execution time would normally be acceptable for most queries, but if the query is not scaleable and is executed numerous times, then the time can quickly add up to a large number. 

It would appear that in this case the update may be part of a loop where individual values are passsed and 1 row is updated per value. This structure does not scale with large number of values meaning that it can become inefficient.

One potential solution is to try to ‘batch up‘  the updates so that multiple rows are updated within the same execution. As Oracle releases have progressed a number of optimizations and enhancements have been made to improve the handling of ‘batch‘ operations and to make them more efficient. In this way, code modifications to replace frequently executed relatively inefficient statements by more scaleable operations can have a significant impact.

Trace File Elapsed Time Total Differs From Total SQL Time

Sometimes you may see tkprof report that shows the elapsed time does not match the overall total for all recursive sql statements + overall total time for all non-recursive statements:

OVERALL TOTALS FOR ALL NON-RECURSIVE STATEMENTS

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           1
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2      0.00       0.00          0          0          0           1                                 <---------Total non-recursive SQL is 0 seconds

OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse      101      0.23       0.53          0          1         22           0
Execute 11115914    713.48     849.96        183     281305       1021         213
Fetch   11115842   1080.26    2981.83        643  160564747          0    11121030
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total   22231857   1793.97    3832.33        826  160846053       1043    11121243        <------------Total for all recursive SQL is 3,832 seconds

Trace file: support_ora_1111111_sample_.trc
Trace file compatibility: 11.1.0.7
Sort options: fchela  exeela  prsela  
       1  session in tracefile.
      47  user  SQL statements in trace file.
      24  internal SQL statements in trace file.
      71  SQL statements in trace file.
      42  unique SQL statements in trace file.
 52492692  lines in trace file.
   10538  elapsed seconds in trace file.           <--------------elapsed time in trace is 10,538 seconds

There appears to be 6706 missing in the trace file.

This is expected behavior if you start tracing a session that has already started.

To ensure that elapsed seconds matches the session total, you need to begin the tracing at the start of the session.

以上是关于转 怎样解读10046 trace (tkprof 的结果 )的主要内容,如果未能解决你的问题,请参考以下文章

Oracle 10046事件 介绍 ---tkprof

tkprof工具详解二

SQL 跟踪方法相关介绍

读懂执行计划

SQL_TRACE与tkprof分析

10046 trace文件分析