Oracle Performance Tuning Tools

Posted by 耀阳居士


Performance tools examine execution strategies and can be used for SQL tuning; they give you a good estimate of the resources used by your queries.

The common tuning tools are:

  • explain plan - lets you see the execution plan used by the query
  • autotrace - automatically produces an execution plan when you execute any SQL
  • SQL trace - traces the execution of SQL statements
  • TKPROF - formats trace files into a readable form
  • dbms_monitor - sets up end-to-end tracing
  • Statspack - performance diagnosis tool
  • AWR - Automatic Workload Repository

EXPLAIN Plan

The explain plan lets you see the execution plan the optimizer chose for your query. It is ideal when you are using hints, as you can see whether the hint is being used or not. The output of the explain plan goes into a table called plan_table; the explain plan output will tell you

  • The tables used in the query and the order in which they are accessed
  • The operations performed on the output of each step of the plan.
  • The specific access and join methods used for each table mentioned
  • The cost of each operation.

To create your own explain plan you must first run an Oracle script that creates the plan table in which the output of the explain plan is stored.

Setting up explain plan:

@$ORACLE_HOME/rdbms/admin/utlxplan.sql

Creating the EXPLAIN plan:

explain plan for
  select * from employees where lname = 'valle';

Display the explain plan:

select * from table(dbms_xplan.display);

Note: I have used the dbms_xplan package in the above statement.

When reading the plan there are some principles to consider:

  • Each step in the plan returns its output, in the form of a set of rows, to its parent step
  • Read the plan from the inside out, starting from the line that is indented the most
  • If two operations are at the same level of indentation, read the top one first
  • The numbering of the steps in the plan can be misleading; read the most indented operations first

Example:

select statement
  hash join
    nested loops
      table access full department
      index unique scan employee_pk
    index fast full scan dept_id_pk

  1. Oracle does a full table scan of the department table
  2. Oracle performs an index unique scan of the employees table using its primary key index (employee_pk)
  3. Oracle performs a nested loop operation to join the rows from steps 1 and 2
  4. Oracle performs an index fast full scan of dept_id_pk
  5. In the final step, Oracle performs a hash join of the set from step 3 and the rows resulting from step 4

Autotrace

The autotrace facility enables you to produce explain plans automatically when you execute a SQL statement. Make sure that the plan table has been created (see above regarding running the utlxplan.sql script).

Set privilege

grant plustrace to vallep;

Note: you can also grant to public

Turn off             set autotrace off
Turn on              set autotrace on

Note: "on" turns on both explain and statistics

Turn on explain      set autotrace {on|off|trace[only]} explain
Turn on statistics   set autotrace {on|off|trace[only]} statistics
Trace only           set autotrace traceonly

SQL Trace and TKPROF

SQL trace helps you trace the execution of SQL statements, and TKPROF formats the trace output into a readable form. SQL trace enables you to track the following for each SQL statement

  • CPU and elapsed times
  • Parsed and execution counts for each SQL statement
  • Number of physical and logical reads
  • Execution plan for all the SQL statements
  • Library cache hit ratios

The explain plan gives you important information about the access path the optimizer used; SQL trace gives you the breakdown of the resources used (CPU and I/O).

Collecting trace statistics imposes a performance penalty; you can control the collection of statistics with two parameters.

Turn on statistics collection:

alter system set statistics_level = typical;
alter system set statistics_level = all;

Turn off statistics collection:

alter system set statistics_level = basic;

Turn on timed statistics:

alter system set timed_statistics = true;

Note: when timed_statistics is set to true, timed statistics are collected even if statistics_level is set to basic (off).

You can turn on tracing at either the session or the instance level; remember that turning it on for the whole instance will use a lot of disk space and system resources.

Instance:

alter system set sql_trace = true;
alter system set sql_trace = false;

Session:

alter session set sql_trace = true;
alter session set sql_trace = false;

dbms_system.set_sql_trace_in_session(<sid>, <serial#>, true);
dbms_system.set_sql_trace_in_session(<sid>, <serial#>, false);

The trace will create a file in user_dump_dest, named in the format db_name_ora_nnnnn.trc; this file will generally be much larger than the other files in this area.

TKPROF uses the trace file along with the following parameters

FILENAME  input trace file
EXPLAIN   runs an explain plan for the SQL statements
RECORD    creates a SQL script with all the nonrecursive SQL statements
WAITS     records a summary of wait events
SORT      presents data sorted on one or more items
TABLE     the name of the table where TKPROF temporarily stores the execution plans
SYS       enables and disables listing of SQL statements issued by SYS
PRINT     lists only a specified number of SQL statements instead of all statements
INSERT    creates a script that stores the trace information in the database

 

TKPROF examples

tkprof finance_ora_16340.trc test.txt sys=no explain=y

Note: the output will be dumped into the test.txt file

End-to-End Tracing

Using the client_identifier attribute you can trace a user's session across multiple database sessions. You use the dbms_monitor package or OEM to set up the tracing. Tracing can be enabled using three attributes

  • Client Identifier
  • Service Name
  • Combination of service name, module name and action name.

Below is an example of how to use end-to-end tracing.

Set up the service name, module name, action name and the client id:

dbms_monitor.serv_mod_act_trace_enable (
  service_name => 'myservice',
  module_name  => 'batch_job',
  action_name  => 'batch_insert'
);

dbms_monitor.client_id_trace_enable (
  client_id => 'vallep'
);

Set the client identifier using a logon trigger:

create or replace trigger logon_trigger
  after logon on database
declare
  user_id varchar2(64);
begin
  select ora_login_user || ':' || sys_context('userenv','os_user')
    into user_id from dual;
  dbms_session.set_identifier(user_id);
end;
/

Obtain the sid and serial# and enable tracing for the session:

dbms_monitor.session_trace_enable (
  session_id => 111,
  serial_num => 23,
  waits      => true,
  binds      => false
);

Combine the multiple trace files into one file:

c:\> trcsess output="vallep.trc" service="myservice" module="batch_job" action="batch_insert"

Run TKPROF against the consolidated file:

c:\> tkprof vallep.trc output=vallep.rpt sort=(EXELA, PRSELA, FCHELA)

Note: there are many options for the sort parameter; please see the Oracle documentation for more information.

EXELA  - elapsed time executing
PRSELA - elapsed time parsing
FCHELA - elapsed time fetching

Inefficient SQL

You can use the v$sql view to find inefficient SQL code; the view gathers important information about the disk reads and memory reads for each SQL statement. It holds information on statements since startup, aging out older statements. The view will give you information on the following

  • rows_processed - total number of rows processed by the statement
  • sql_text - the SQL text of the statement (first 1,000 characters)
  • buffer_gets - total number of logical reads (high CPU use)
  • disk_reads - total number of disk reads (high I/O use)
  • sorts - number of sorts for the statement (high sort ratios)
  • cpu_time - total parse and execution time
  • elapsed_time - elapsed parse and execution time
  • parse_calls - combined soft and hard parse calls
  • executions - number of times the statement was executed
  • loads - number of times the statement was flushed out of the shared pool then reloaded
  • sharable_memory - total memory used by the shared cursor
  • persistent_memory - total persistent memory used by the cursor
  • runtime_memory - total runtime memory used by the cursor
High disk reads:

select sql_text, executions, buffer_gets, disk_reads from v$sql
 where buffer_gets > 100000 or disk_reads > 100000
 order by buffer_gets + 100 * disk_reads desc;

High disk reads with parse calls and rows processed:

select sql_text, rows_processed, buffer_gets, disk_reads, parse_calls from v$sql
 where buffer_gets > 100000 or disk_reads > 100000
 order by buffer_gets + 100 * disk_reads desc;

Top 5 by CPU time and elapsed time:

select sql_text, executions,
  round(elapsed_time/1000000, 2) elapsed_seconds,
  round(cpu_time/1000000, 2) cpu_secs
from (select * from v$sql order by elapsed_time desc)
where rownum < 6;

SQL Tuning Advisor

Once you have identified bad SQL, you can use the SQL Tuning Advisor to perform an in-depth analysis and come up with a better execution plan.

See the Advisors for more detailed information.

Statspack

Statspack is a diagnostic tool that captures and stores the V$ view information and allows you to generate reports at a later date. Although it has been replaced by AWR, many DBAs still use this tool. I will only give a brief overview, as you will more than likely start using AWR, now Oracle's preferred method of collecting statistics.

To install Statspack you simply run $ORACLE_HOME/rdbms/admin/spcreate.sql as sys with sysdba privilege; you can use spdrop.sql to remove it. The install script will ask you for three pieces of information:

  • The PERFSTAT user password
  • The default tablespace that you will use for the PERFSTAT schema
  • The default temporary tablespace that you will use for the PERFSTAT schema

Once the installation has finished check the "spcpkg.lis" file for any errors, the below commands can be run to obtain snapshots and generate reports.

Create snapshot:

exec statspack.snap

Run report:

@$ORACLE_HOME/rdbms/admin/spreport.sql

Note: when you run the report it will ask for two snapshot points to compare.

AWR

See AWR for more information on how to setup and how to run reports.

Reposted article: Tuning Java I/O Performance

This article is reposted from:

https://www.oracle.com/technical-resources/articles/javase/perftuning.html

By Glen McCluskey, March 1999

I/O Performance

This article discusses and illustrates a variety of techniques for improving Java I/O performance. Most of the techniques center around tuning disk file I/O, but some are applicable to network I/O and window output as well. The first set of techniques presented below cover low-level I/O issues, and then higher-level issues such as compression, formatting, and serialization are discussed. Note, however, the discussion does not cover application design issues, such as choice of search algorithms and data structures, nor does it discuss system-level issues such as file caching.

When discussing Java I/O, it's worth noting that the Java programming language assumes two distinct types of disk file organization. One is based on streams of bytes, the other on character sequences. In the Java language a character is represented using two bytes, not one byte as in other common languages such as C. Because of this, some translation is required to read characters from a file. This distinction is important in some contexts, as several of the examples will illustrate.
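To make the byte/character distinction concrete, here is a minimal standalone sketch (my own illustration, not from the original article) that reads the same two UTF-8 bytes first as a byte stream and then as a character stream:

```java
import java.io.*;

public class ByteVsChar {
    // Counts raw bytes in a byte stream.
    static int countBytes(InputStream is) throws IOException {
        int n = 0;
        while (is.read() != -1) n++;
        return n;
    }

    // Counts decoded characters; the Reader performs byte-to-char translation.
    static int countChars(Reader r) throws IOException {
        int n = 0;
        while (r.read() != -1) n++;
        return n;
    }

    public static void main(String[] args) throws IOException {
        // 'é' (U+00E9) is two bytes in UTF-8 but a single Java char.
        byte[] utf8 = {(byte) 0xC3, (byte) 0xA9};
        int bytes = countBytes(new ByteArrayInputStream(utf8));
        int chars = countChars(new InputStreamReader(new ByteArrayInputStream(utf8), "UTF-8"));
        System.out.println(bytes + " bytes, " + chars + " char");  // prints "2 bytes, 1 char"
    }
}
```

The Reader does the translation work, which is exactly the extra cost character I/O carries over byte I/O.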


Basic Rules for Speeding Up I/O

As a means of starting the discussion, here are some basic rules on how to speed up I/O:

  1. Avoid accessing the disk.
  2. Avoid accessing the underlying operating system.
  3. Avoid method calls.
  4. Avoid processing bytes and characters individually.

 

These rules obviously cannot be applied in a blanket way, because if that were the case, no I/O would ever get done! But to see how they can be applied, consider the following three-part example that counts the number of newline bytes ('\n') in a file.

Approach 1: Read Method

The first approach simply uses the read method on a FileInputStream:

import java.io.*;

public class intro1 {
    public static void main(String args[]) {
        if (args.length != 1) {
            System.err.println("missing filename");
            System.exit(1);
        }
        try {
            FileInputStream fis = new FileInputStream(args[0]);
            int cnt = 0;
            int b;
            while ((b = fis.read()) != -1) {
                if (b == '\n')
                    cnt++;
            }
            fis.close();
            System.out.println(cnt);
        } catch (IOException e) {
            System.err.println(e);
        }
    }
}

 

However, this approach triggers a lot of calls to the underlying runtime system: FileInputStream.read is a native method that returns the next byte of the file.

Approach 2: Using a Large Buffer

The second approach avoids the above problem, by using a large buffer:

import java.io.*;

public class intro2 {
    public static void main(String args[]) {
        if (args.length != 1) {
            System.err.println("missing filename");
            System.exit(1);
        }
        try {
            FileInputStream fis = new FileInputStream(args[0]);
            BufferedInputStream bis = new BufferedInputStream(fis);
            int cnt = 0;
            int b;
            while ((b = bis.read()) != -1) {
                if (b == '\n')
                    cnt++;
            }
            bis.close();
            System.out.println(cnt);
        } catch (IOException e) {
            System.err.println(e);
        }
    }
}

 

BufferedInputStream.read takes the next byte from the input buffer, and only rarely accesses the underlying system.

Approach 3: Direct Buffering

The third approach avoids BufferedInputStream and does buffering directly, thereby eliminating the read method calls:

import java.io.*;

public class intro3 {
    public static void main(String args[]) {
        if (args.length != 1) {
            System.err.println("missing filename");
            System.exit(1);
        }
        try {
            FileInputStream fis = new FileInputStream(args[0]);
            byte buf[] = new byte[2048];
            int cnt = 0;
            int n;
            while ((n = fis.read(buf)) != -1) {
                for (int i = 0; i < n; i++) {
                    if (buf[i] == '\n')
                        cnt++;
                }
            }
            fis.close();
            System.out.println(cnt);
        } catch (IOException e) {
            System.err.println(e);
        }
    }
}

 

For a 1 MB input file, the execution times in seconds of the programs are:

intro1    6.9
intro2    0.9
intro3    0.4

 

or about a 17 to 1 difference between the slowest and fastest.

This huge speedup doesn't necessarily prove that you should always emulate the third approach, in which you do your own buffering. Such an approach may be error-prone, especially in handling end-of-file events, if it is not carefully implemented. It may also be less readable than the alternatives. But it's useful to keep in mind where the time goes, and how it can be reclaimed when necessary.

Approach 2 is probably "right" for most applications.

Buffering

Approaches 2 and 3 use the technique of buffering, where large chunks of a file are read from disk, and then accessed a byte or character at a time. Buffering is a basic and important technique for speeding I/O, and several Java classes support buffering ( BufferedInputStream for bytes, BufferedReader for characters).

An obvious question is: Will making the buffer bigger make I/O go faster? Java buffers typically are by default 1024 or 2048 bytes long. A buffer larger than this may help speed I/O, but often by only a few percent, say 5 to 10%.
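BufferedInputStream lets you pass the buffer size explicitly, so the effect of a larger buffer is easy to experiment with. Below is a small sketch (my own illustration, using an in-memory stream so it is self-contained); the count is the same regardless of buffer size, only the speed differs:

```java
import java.io.*;

public class BufSize {
    // Counts newline bytes using a caller-chosen buffer size.
    static int countNewlines(InputStream in, int bufSize) throws IOException {
        BufferedInputStream bis = new BufferedInputStream(in, bufSize);
        int cnt = 0;
        int b;
        while ((b = bis.read()) != -1)
            if (b == '\n') cnt++;
        bis.close();
        return cnt;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "a\nb\nc\n".getBytes("US-ASCII");
        // Try the default-ish size and a much larger one.
        System.out.println(countNewlines(new ByteArrayInputStream(data), 1024));      // 3
        System.out.println(countNewlines(new ByteArrayInputStream(data), 64 * 1024)); // 3
    }
}
```

In a real measurement you would wrap each call around a large file and time it, as the article's tables do.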

Approach 4: Whole File

The extreme case of buffering would be to determine the length of a file in advance, and then read in the whole file:

import java.io.*;

public class readfile {
    public static void main(String args[]) {
        if (args.length != 1) {
            System.err.println("missing filename");
            System.exit(1);
        }
        try {
            int len = (int)(new File(args[0]).length());
            FileInputStream fis = new FileInputStream(args[0]);
            byte buf[] = new byte[len];
            fis.read(buf);
            fis.close();
            int cnt = 0;
            for (int i = 0; i < len; i++) {
                if (buf[i] == '\n')
                    cnt++;
            }
            System.out.println(cnt);
        } catch (IOException e) {
            System.err.println(e);
        }
    }
}

 

This approach is convenient, in that a file can be treated as an array of bytes. But there's an obvious problem of possibly not having enough memory to read in a very large file.

Another aspect of buffering concerns text output to a terminal window. By default, System.out (a PrintStream) is line buffered, meaning that the output buffer is flushed when a newline character is encountered. This is important for interactivity, where you'd like to have an input prompt displayed before actually entering any input.

Approach 5: Disabling Line Buffering

But line buffering can be disabled, as in this example:

import java.io.*;

public class bufout {
    public static void main(String args[]) {
        FileOutputStream fdout = new FileOutputStream(FileDescriptor.out);
        BufferedOutputStream bos = new BufferedOutputStream(fdout, 1024);
        PrintStream ps = new PrintStream(bos, false);
        System.setOut(ps);
        final int N = 100000;
        for (int i = 1; i <= N; i++)
            System.out.println(i);
        ps.close();
    }
}

 

This program writes the integers 1..100000 to the output, and runs about three times faster than the default equivalent that has line buffering enabled.

Buffering is also an important part of one of the examples presented below, where a buffer is used to speed up random file access.

Reading/Writing Text Files

Earlier the idea was mentioned that method call overhead can be significant when reading characters from a file. Another example of this can be found in a program that counts the number of lines in a text file:

import java.io.*;

public class line1 {
    public static void main(String args[]) {
        if (args.length != 1) {
            System.err.println("missing filename");
            System.exit(1);
        }
        try {
            FileInputStream fis = new FileInputStream(args[0]);
            BufferedInputStream bis = new BufferedInputStream(fis);
            DataInputStream dis = new DataInputStream(bis);
            int cnt = 0;
            while (dis.readLine() != null)
                cnt++;
            dis.close();
            System.out.println(cnt);
        } catch (IOException e) {
            System.err.println(e);
        }
    }
}

 

This program uses the old DataInputStream.readLine method, which is implemented using read method calls to obtain each character. A newer approach is to say:

import java.io.*;

public class line2 {
    public static void main(String args[]) {
        if (args.length != 1) {
            System.err.println("missing filename");
            System.exit(1);
        }
        try {
            FileReader fr = new FileReader(args[0]);
            BufferedReader br = new BufferedReader(fr);
            int cnt = 0;
            while (br.readLine() != null)
                cnt++;
            br.close();
            System.out.println(cnt);
        } catch (IOException e) {
            System.err.println(e);
        }
    }
}

 

This approach can be faster. For example, on a 6 MB text file with 200,000 lines, the second program is around 20% faster than the first.

But even if the second program isn't faster, there's an important issue to note. The first program evokes a deprecation warning from the Java 2 compiler, because DataInputStream.readLine is obsolete. It does not properly convert bytes to characters, and would not be an appropriate choice for manipulating text files containing anything other than ASCII text byte streams (recall that the Java language uses the Unicode character set, not ASCII).

This is where the distinction between byte streams and character streams noted earlier comes into play. A program such as:

import java.io.*;

public class conv1 {
    public static void main(String args[]) {
        try {
            FileOutputStream fos = new FileOutputStream("out1");
            PrintStream ps = new PrintStream(fos);
            ps.println("\uffff\u4321\u1234");
            ps.close();
        } catch (IOException e) {
            System.err.println(e);
        }
    }
}

 

writes an output file, but without preserving the Unicode characters that are actually output. The Reader/Writer I/O classes are character-based, and are designed to resolve this issue. OutputStreamWriter is where the encoding of characters to bytes is applied.

A program that uses PrintWriter to write out Unicode characters looks like this:

import java.io.*;

public class conv2 {
    public static void main(String args[]) {
        try {
            FileOutputStream fos = new FileOutputStream("out2");
            OutputStreamWriter osw = new OutputStreamWriter(fos, "UTF8");
            PrintWriter pw = new PrintWriter(osw);
            pw.println("\uffff\u4321\u1234");
            pw.close();
        } catch (IOException e) {
            System.err.println(e);
        }
    }
}

 

This program uses the UTF8 encoding, which has the property of encoding ASCII text as itself, and other characters as two or three bytes.
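You can verify that property directly by encoding sample strings; this small sketch (my own, not from the original article) reports the encoded byte lengths:

```java
public class Utf8Len {
    // Returns the number of bytes a string occupies in the UTF8 encoding.
    static int utf8Length(String s) throws java.io.UnsupportedEncodingException {
        return s.getBytes("UTF8").length;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(utf8Length("abc"));    // 3: ASCII characters encode as themselves
        System.out.println(utf8Length("\u00ff")); // 2: this character needs two bytes
        System.out.println(utf8Length("\u1234")); // 3: this character needs three bytes
    }
}
```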

Formatting Costs

Actually writing data to a file is only part of the cost of output. Another significant cost is data formatting. Consider a three-part example, one that writes out lines like:

The square of 5 is 25

Approach 1

The first approach is simply to write out a fixed string, to get an idea of the intrinsic I/O cost:

public class format1 {
    public static void main(String args[]) {
        final int COUNT = 25000;
        for (int i = 1; i <= COUNT; i++) {
            String s = "The square of 5 is 25\n";
            System.out.print(s);
        }
    }
}

 

Approach 2

The second approach employs simple formatting using "+":

public class format2 {
    public static void main(String args[]) {
        int n = 5;
        final int COUNT = 25000;
        for (int i = 1; i <= COUNT; i++) {
            String s = "The square of " + n + " is " + n * n + "\n";
            System.out.print(s);
        }
    }
}

 

Approach 3

The third approach uses the MessageFormat class from the java.text package:

import java.text.*;

public class format3 {
    public static void main(String args[]) {
        MessageFormat fmt = new MessageFormat("The square of {0} is {1}\n");
        Object values[] = new Object[2];
        int n = 5;
        values[0] = new Integer(n);
        values[1] = new Integer(n * n);
        final int COUNT = 25000;
        for (int i = 1; i <= COUNT; i++) {
            String s = fmt.format(values);
            System.out.print(s);
        }
    }
}

 

These programs produce identical output. The running times are:

format1   1.3
format2   1.8
format3   7.8

 

or about a 6 to 1 difference between the slowest and fastest. The third program would be even slower if the format had not been precompiled and the static convenience method had been used instead:

Approach 4

MessageFormat.format(String, Object[])

as in:

import java.text.*;

public class format4 {
    public static void main(String args[]) {
        String fmt = "The square of {0} is {1}\n";
        Object values[] = new Object[2];
        int n = 5;
        values[0] = new Integer(n);
        values[1] = new Integer(n * n);
        final int COUNT = 25000;
        for (int i = 1; i <= COUNT; i++) {
            String s = MessageFormat.format(fmt, values);
            System.out.print(s);
        }
    }
}

 

which takes 1/3 longer than the previous example.

The fact that approach 3 is quite a bit slower than approaches 1 and 2 doesn't mean that you shouldn't use it. But you need to be aware of the cost in time.

Message formats are quite important in internationalization contexts, and an application concerned about this issue might typically read the format from a resource bundle, and then use it.
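As a sketch of that pattern, the hypothetical example below keeps the format string in a resource bundle (an in-code ListResourceBundle stands in for a real .properties file; the bundle class and key name are invented for illustration) and precompiles the MessageFormat once:

```java
import java.text.MessageFormat;
import java.util.ListResourceBundle;
import java.util.ResourceBundle;

public class FormatBundle {
    // Hypothetical in-code bundle; in practice this would be a locale-specific
    // .properties file loaded with ResourceBundle.getBundle(...).
    public static class Messages extends ListResourceBundle {
        protected Object[][] getContents() {
            return new Object[][] {{"square.msg", "The square of {0} is {1}"}};
        }
    }

    static String squareMessage(int n) {
        ResourceBundle rb = new Messages();
        // Precompile the format once, then reuse it for every value.
        MessageFormat fmt = new MessageFormat(rb.getString("square.msg"));
        return fmt.format(new Object[] {Integer.valueOf(n), Integer.valueOf(n * n)});
    }

    public static void main(String[] args) {
        System.out.println(squareMessage(5));  // The square of 5 is 25
    }
}
```

Swapping the bundle for a different locale changes the message text without touching the formatting code.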

Random Access

RandomAccessFile is a Java class for doing random access I/O (at the byte level) on files. The class provides a seek method, similar to that found in C/C++, to move the file pointer to an arbitrary location, from which point bytes can then be read or written.

The seek method accesses the underlying runtime system, and as such, tends to be expensive. One cheaper alternative is to set up your own buffering on top of a RandomAccessFile, and implement a read method for bytes directly. The parameter to read is the byte offset >= 0 of the desired byte. An example of how this is done is:

import java.io.*;

public class ReadRandom {
    private static final int DEFAULT_BUFSIZE = 4096;
    private RandomAccessFile raf;
    private byte inbuf[];
    private long startpos = -1;
    private long endpos = -1;
    private int bufsize;

    public ReadRandom(String name) throws FileNotFoundException {
        this(name, DEFAULT_BUFSIZE);
    }

    public ReadRandom(String name, int b) throws FileNotFoundException {
        raf = new RandomAccessFile(name, "r");
        bufsize = b;
        inbuf = new byte[bufsize];
    }

    public int read(long pos) {
        if (pos < startpos || pos > endpos) {
            long blockstart = (pos / bufsize) * bufsize;
            int n;
            try {
                raf.seek(blockstart);
                n = raf.read(inbuf);
            } catch (IOException e) {
                return -1;
            }
            startpos = blockstart;
            endpos = blockstart + n - 1;
            if (pos < startpos || pos > endpos)
                return -1;
        }
        return inbuf[(int)(pos - startpos)] & 0xff;
    }

    public void close() throws IOException {
        raf.close();
    }

    public static void main(String args[]) {
        if (args.length != 1) {
            System.err.println("missing filename");
            System.exit(1);
        }
        try {
            ReadRandom rr = new ReadRandom(args[0]);
            long pos = 0;
            int c;
            byte buf[] = new byte[1];
            while ((c = rr.read(pos)) != -1) {
                pos++;
                buf[0] = (byte)c;
                System.out.write(buf, 0, 1);
            }
            rr.close();
        } catch (IOException e) {
            System.err.println(e);
        }
    }
}

 

The driver program simply reads the bytes in sequence and writes them out.

This technique is helpful if you have locality of access, where nearby bytes in the file are read at about the same time. For example, if you are implementing a binary search scheme on a sorted file, this approach might be useful. It's of less value if you are truly doing random access at arbitrary points in a large file.

Compression

Java provides classes for compressing and uncompressing byte streams. These are found in the java.util.zip package, and also serve as the basis for Jar files (a Jar file is a Zip file with an added manifest).

The following program takes a single input file, and writes a compressed output Zip file, with a single entry representing the input file:

import java.io.*;
import java.util.zip.*;

public class compress {
    public static void doit(String filein, String fileout) {
        FileInputStream fis = null;
        FileOutputStream fos = null;
        try {
            fis = new FileInputStream(filein);
            fos = new FileOutputStream(fileout);
            ZipOutputStream zos = new ZipOutputStream(fos);
            ZipEntry ze = new ZipEntry(filein);
            zos.putNextEntry(ze);
            final int BUFSIZ = 4096;
            byte inbuf[] = new byte[BUFSIZ];
            int n;
            while ((n = fis.read(inbuf)) != -1)
                zos.write(inbuf, 0, n);
            fis.close();
            fis = null;
            zos.close();
            fos = null;
        } catch (IOException e) {
            System.err.println(e);
        } finally {
            try {
                if (fis != null)
                    fis.close();
                if (fos != null)
                    fos.close();
            } catch (IOException e) {
            }
        }
    }

    public static void main(String args[]) {
        if (args.length != 2) {
            System.err.println("missing filenames");
            System.exit(1);
        }
        if (args[0].equals(args[1])) {
            System.err.println("filenames are identical");
            System.exit(1);
        }
        doit(args[0], args[1]);
    }
}

 

The next program reverses the process, taking an input Zip file that is assumed to have a single entry in it, and uncompresses that entry to the output file:

import java.io.*;
import java.util.zip.*;

public class uncompress {
    public static void doit(String filein, String fileout) {
        FileInputStream fis = null;
        FileOutputStream fos = null;
        try {
            fis = new FileInputStream(filein);
            fos = new FileOutputStream(fileout);
            ZipInputStream zis = new ZipInputStream(fis);
            ZipEntry ze = zis.getNextEntry();
            final int BUFSIZ = 4096;
            byte inbuf[] = new byte[BUFSIZ];
            int n;
            while ((n = zis.read(inbuf, 0, BUFSIZ)) != -1)
                fos.write(inbuf, 0, n);
            zis.close();
            fis = null;
            fos.close();
            fos = null;
        } catch (IOException e) {
            System.err.println(e);
        } finally {
            try {
                if (fis != null)
                    fis.close();
                if (fos != null)
                    fos.close();
            } catch (IOException e) {
            }
        }
    }

    public static void main(String args[]) {
        if (args.length != 2) {
            System.err.println("missing filenames");
            System.exit(1);
        }
        if (args[0].equals(args[1])) {
            System.err.println("filenames are identical");
            System.exit(1);
        }
        doit(args[0], args[1]);
    }
}

 

Whether compression helps or hurts I/O performance depends a lot on your local hardware setup; specifically the relative speeds of the processor and disk drives. Compression using Zip technology implies typically a 50% reduction in data size, but at the cost of some time to compress and decompress. An experiment with large (5 to 10 MB) compressed text files, using a 300-MHz Pentium PC with IDE disk drives, showed an elapsed time speedup of around 1/3 in reading compressed files from disk, over reading uncompressed ones.

An example of where compression is useful is in writing to very slow media such as floppy disks. A test using a fast processor (300 MHz Pentium) and a slow floppy (the conventional floppy drive found on PCs), showed that compressing a large text file and then writing to the floppy drive results in a speedup of around 50% over simply copying the file directly to the floppy drive.
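Whether compression will pay off for your data can be estimated without touching the disk at all; this standalone sketch (my own, using java.util.zip.Deflater) measures the compressed size of a buffer in memory:

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.Deflater;

public class Ratio {
    // Compresses a byte array in memory and returns the compressed size.
    static int compressedSize(byte[] data) {
        Deflater def = new Deflater();
        def.setInput(data);
        def.finish();
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        while (!def.finished()) {
            int n = def.deflate(chunk);
            bos.write(chunk, 0, n);
        }
        def.end();
        return bos.size();
    }

    public static void main(String[] args) {
        byte[] text = new byte[10000];           // highly repetitive data
        java.util.Arrays.fill(text, (byte) 'a'); // compresses far below 50%
        System.out.println(compressedSize(text) + " bytes compressed from " + text.length);
    }
}
```

Running this on a sample of your real data gives a quick read on whether the 50% typical reduction mentioned above applies to you.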

Caching

A detailed discussion of hardware caching is beyond the scope of this paper. But sometimes software caching can be used to speed up I/O. Consider a case where you want to read lines of a text file in random order. One way to do this is to read in all the lines, and store them in an ArrayList (a collection class similar to Vector):

import java.io.*;
import java.util.ArrayList;

public class LineCache {
    private ArrayList list = new ArrayList();

    public LineCache(String fn) throws IOException {
        FileReader fr = new FileReader(fn);
        BufferedReader br = new BufferedReader(fr);
        String ln;
        while ((ln = br.readLine()) != null)
            list.add(ln);
        br.close();
    }

    public String getLine(int n) {
        if (n < 0)
            throw new IllegalArgumentException();
        return (n < list.size() ? (String)list.get(n) : null);
    }

    public static void main(String args[]) {
        if (args.length != 1) {
            System.err.println("missing filename");
            System.exit(1);
        }
        try {
            LineCache lc = new LineCache(args[0]);
            int i = 0;
            String ln;
            while ((ln = lc.getLine(i++)) != null)
                System.out.println(ln);
        } catch (IOException e) {
            System.err.println(e);
        }
    }
}

 

The getLine method is then used to retrieve an arbitrary line. This technique is quite useful, but obviously uses a lot of memory for large files, and so has limitations. An alternative might be to simply remember the last 100 lines that were requested, and read from the disk for any other requests. This scheme works well if there is locality of access of the lines, but not so well if line requests are truly random.
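
The last-100-lines alternative can be sketched with a LinkedHashMap kept in access order, which makes a simple LRU (least recently used) cache. The class below is illustrative and not part of the original LineCache; in particular the disk-read slow path is stubbed out, where a real version would reopen the file and scan to the requested line:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LruLineCache {
    private final int capacity;
    private final LinkedHashMap<Integer, String> cache;

    public LruLineCache(int capacity) {
        this.capacity = capacity;
        // true = access order: get() moves an entry to the back, so the
        // entry at the front is always the least recently used one.
        this.cache = new LinkedHashMap<Integer, String>(16, 0.75f, true) {
            protected boolean removeEldestEntry(Map.Entry<Integer, String> eldest) {
                return size() > LruLineCache.this.capacity;
            }
        };
    }

    public String getLine(int n) {
        String ln = cache.get(Integer.valueOf(n));
        if (ln == null) {
            ln = readLineFromDisk(n);           // cache miss: slow path
            if (ln != null)
                cache.put(Integer.valueOf(n), ln);
        }
        return ln;
    }

    // Stub for the slow path; a real version would reopen the file
    // (or keep a RandomAccessFile) and scan to line n.
    protected String readLineFromDisk(int n) {
        return "line " + n;
    }

    public int size() {
        return cache.size();
    }

    public static void main(String args[]) {
        LruLineCache lc = new LruLineCache(100);
        System.out.println(lc.getLine(42));
    }
}
```

Because the map is in access order, reading lines sequentially or with locality keeps them cached, while truly random requests mostly fall through to disk, as noted above.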

Tokenization

Tokenization refers to the process of breaking byte or character sequences into logical chunks, for example words. Java offers a StreamTokenizer class that operates like this:


import java.io.*;

public class token1 {
    public static void main(String args[]) {
        if (args.length != 1) {
            System.err.println("missing filename");
            System.exit(1);
        }
        try {
            FileReader fr = new FileReader(args[0]);
            BufferedReader br = new BufferedReader(fr);
            StreamTokenizer st = new StreamTokenizer(br);
            st.resetSyntax();
            st.wordChars('a', 'z');
            int tok;
            while ((tok = st.nextToken()) != StreamTokenizer.TT_EOF) {
                if (tok == StreamTokenizer.TT_WORD)
                    ; // st.sval has token
            }
            br.close();
        }
        catch (IOException e) {
            System.err.println(e);
        }
    }
}

This example tokenizes in terms of lower-case words (letters a-z). If you implement the equivalent yourself, it might look like:


import java.io.*;

public class token2 {
    public static void main(String args[]) {
        if (args.length != 1) {
            System.err.println("missing filename");
            System.exit(1);
        }
        try {
            FileReader fr = new FileReader(args[0]);
            BufferedReader br = new BufferedReader(fr);
            int maxlen = 256;
            int currlen = 0;
            char wordbuf[] = new char[maxlen];
            int c;
            do {
                c = br.read();
                if (c >= 'a' && c <= 'z') {
                    if (currlen == maxlen) {
                        maxlen *= 1.5;
                        char xbuf[] = new char[maxlen];
                        System.arraycopy(wordbuf, 0, xbuf, 0, currlen);
                        wordbuf = xbuf;
                    }
                    wordbuf[currlen++] = (char)c;
                }
                else if (currlen > 0) {
                    String s = new String(wordbuf, 0, currlen);
                    // do something with s
                    currlen = 0;
                }
            } while (c != -1);
            br.close();
        }
        catch (IOException e) {
            System.err.println(e);
        }
    }
}

The second program runs about 20% faster than the first, at the expense of having to write some tricky low-level code.

StreamTokenizer is something of a hybrid class, in that it will read from character-based streams (like BufferedReader), but at the same time operates in terms of bytes, treating all characters with two-byte values (greater than 0xff) as though they were alphabetic characters.
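
This byte-oriented behavior is easy to observe: a character such as \u0100, which lies outside the 0x00-0xff range, is absorbed into the surrounding word token as if it were a letter. The small demonstration below is ours, not part of the original example:

```java
import java.io.*;

public class HybridDemo {
    public static void main(String args[]) throws IOException {
        // \u0100 is above 0xff, yet with the default syntax table
        // StreamTokenizer lumps it into the surrounding word.
        StreamTokenizer st = new StreamTokenizer(new StringReader("ab\u0100cd"));
        int tok = st.nextToken();
        if (tok == StreamTokenizer.TT_WORD)
            System.out.println("word token: " + st.sval);
    }
}
```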

Serialization

Serialization is used to convert arbitrary Java data structures into byte streams, using a standardized format. For example, the following program writes out an array of random integers:


import java.io.*;
import java.util.*;

public class serial1 {
    public static void main(String args[]) {
        ArrayList al = new ArrayList();
        Random rn = new Random();
        final int N = 100000;
        for (int i = 1; i <= N; i++)
            al.add(new Integer(rn.nextInt()));
        try {
            FileOutputStream fos = new FileOutputStream("test.ser");
            BufferedOutputStream bos = new BufferedOutputStream(fos);
            ObjectOutputStream oos = new ObjectOutputStream(bos);
            oos.writeObject(al);
            oos.close();
        }
        catch (Throwable e) {
            System.err.println(e);
        }
    }
}

and this program reads the array back in:


import java.io.*;
import java.util.*;

public class serial2 {
    public static void main(String args[]) {
        ArrayList al = null;
        try {
            FileInputStream fis = new FileInputStream("test.ser");
            BufferedInputStream bis = new BufferedInputStream(fis);
            ObjectInputStream ois = new ObjectInputStream(bis);
            al = (ArrayList)ois.readObject();
            ois.close();
        }
        catch (Throwable e) {
            System.err.println(e);
        }
    }
}

Note that we used buffering to speed the I/O operations.

Is there a faster way than serialization to write out large volumes of data, and then read it back? Probably not, except in special cases. For example, suppose that you decide to write out a 64-bit long integer as text instead of as a set of 8 bytes. The maximum length of a long integer as text is around 20 characters, or 2.5 times as long as the binary representation, so it seems unlikely that this format would be any faster. In some cases, however, such as bitmaps, a special format might be an improvement. Using your own scheme does work against the standard format offered by serialization, though, so doing so involves some tradeoffs.
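
The size comparison for a 64-bit long can be checked directly: DataOutputStream.writeLong always emits exactly 8 bytes, while the decimal text form of Long.MIN_VALUE is 20 characters (19 digits plus the sign). This small demonstration class is ours, for illustration only:

```java
import java.io.*;

public class LongSizeDemo {
    // Size in bytes of a long written in binary form.
    public static int binarySize(long v) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        DataOutputStream dos = new DataOutputStream(baos);
        dos.writeLong(v);            // always writes exactly 8 bytes
        dos.close();
        return baos.size();
    }

    // Size in characters of the same long written as decimal text.
    public static int textSize(long v) {
        return Long.toString(v).length();
    }

    public static void main(String args[]) throws IOException {
        long v = Long.MIN_VALUE;     // worst case for the text form
        System.out.println("binary: " + binarySize(v)
            + " bytes, text: " + textSize(v) + " chars");
    }
}
```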

Beyond the actual I/O and formatting costs of serialization (the same sort of low-level formatting performed by DataInputStream and DataOutputStream), there are other costs, for example, the need to create new objects when deserializing.

Note also that the methods of DataOutputStream can be used to develop semi-custom data formats, for example:


import java.io.*;
import java.util.*;

public class binary1 {
    public static void main(String args[]) {
        try {
            FileOutputStream fos = new FileOutputStream("outdata");
            BufferedOutputStream bos = new BufferedOutputStream(fos);
            DataOutputStream dos = new DataOutputStream(bos);
            Random rn = new Random();
            final int N = 10;
            dos.writeInt(N);
            for (int i = 1; i <= N; i++) {
                int r = rn.nextInt();
                System.out.println(r);
                dos.writeInt(r);
            }
            dos.close();
        }
        catch (IOException e) {
            System.err.println(e);
        }
    }
}

and:


import java.io.*;

public class binary2 {
    public static void main(String args[]) {
        try {
            FileInputStream fis = new FileInputStream("outdata");
            BufferedInputStream bis = new BufferedInputStream(fis);
            DataInputStream dis = new DataInputStream(bis);
            int N = dis.readInt();
            for (int i = 1; i <= N; i++) {
                int r = dis.readInt();
                System.out.println(r);
            }
            dis.close();
        }
        catch (IOException e) {
            System.err.println(e);
        }
    }
}

These programs write 10 integers to a file and then read them back.

Obtaining Information About Files

Our discussion so far has centered on input and output for individual files. But there's another aspect of speeding up I/O performance that relates to finding out properties of files. For example, consider a small program that prints the length of a file:


import java.io.*;

public class length1 {
    public static void main(String args[]) {
        if (args.length != 1) {
            System.err.println("missing filename");
            System.exit(1);
        }
        File f = new File(args[0]);
        long len = f.length();
        System.out.println(len);
    }
}

The Java runtime system itself cannot know the length of a file, and so must query the underlying operating system to obtain this information. This holds true for other file information, such as whether a file is a directory, the time it was last modified, and so on. The File class in the java.io package provides a set of methods to query this information. Such querying is in general expensive in terms of time, and should be used as little as possible.
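
One way to follow the "query as little as possible" advice is to take a snapshot of a file's metadata once and reuse the stored values instead of calling the File methods repeatedly. The holder class below is a sketch of ours; it is not part of java.io:

```java
import java.io.File;

// Snapshot of a file's metadata, taken once; reusing the stored values
// avoids repeated queries to the underlying operating system.
public class FileInfo {
    public final String path;
    public final boolean exists;
    public final boolean isDirectory;
    public final long length;
    public final long lastModified;

    public FileInfo(File f) {
        path = f.getPath();
        exists = f.exists();
        isDirectory = f.isDirectory();
        length = f.length();
        lastModified = f.lastModified();
    }

    public static void main(String args[]) {
        FileInfo fi = new FileInfo(new File(args.length == 1 ? args[0] : "."));
        System.out.println(fi.path + " dir=" + fi.isDirectory + " len=" + fi.length);
    }
}
```

The obvious tradeoff is staleness: the snapshot does not track changes made to the file after it is taken.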

A longer example of querying file information, one that recursively walks the file system roots to dump out a set of all the file pathnames on a system, looks like this:


import java.io.*;

public class roots {
    public static void visit(File f) {
        System.out.println(f);
    }

    public static void walk(File f) {
        visit(f);
        if (f.isDirectory()) {
            String list[] = f.list();
            for (int i = 0; i < list.length; i++)
                walk(new File(f, list[i]));
        }
    }

    public static void main(String args[]) {
        File list[] = File.listRoots();
        for (int i = 0; i < list.length; i++) {
            if (list[i].exists())
                walk(list[i]);
            else
                System.err.println("not accessible: " + list[i]);
        }
    }
}

This example uses File methods, such as isDirectory and exists, to navigate through the directory structure. Each file is queried exactly once as to its type (plain file or directory).
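
A variant of the walk method, sketched below on our own initiative, uses File.listFiles (available since JDK 1.2) to get File objects back directly instead of name strings, so no new File object needs to be constructed per entry; it counts the entries it visits, and the null check covers directories that cannot be read:

```java
import java.io.File;

public class Walk2 {
    // Recursively counts entries under f, including f itself.
    // listFiles() hands back File objects directly, avoiding the
    // per-entry "new File(f, name)" of the version above.
    public static int walk(File f) {
        int count = 1;
        if (f.isDirectory()) {
            File list[] = f.listFiles();
            if (list != null)           // null if the directory is unreadable
                for (int i = 0; i < list.length; i++)
                    count += walk(list[i]);
        }
        return count;
    }

    public static void main(String args[]) {
        File start = new File(args.length == 1 ? args[0] : ".");
        System.out.println(walk(start) + " entries under " + start);
    }
}
```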

Further Information

The article:

JDC Performance Tips and Firewall Tunneling Techniques

discusses some general ways of improving Java application performance. Some of the techniques it covers are similar to ones found above, while others tackle the problem at a lower level.

Don Knuth's book, The Art of Computer Programming, Volume 3, discusses external sorting and searching algorithms, such as the use of B-trees.

About the Author

Glen McCluskey has focused on programming languages since 1998. He consults in the areas of Java and C++ performance, testing, and technical documentation.
