sql tuning for oracle

Upload: agrawalamitkumar

Post on 06-Apr-2018

218 views

Category:

Documents


0 download

TRANSCRIPT

  • 8/3/2019 SQL Tuning for Oracle

    1/8

    SQL Tuning for Oracle

    White Paper by Marty Weber

    [email protected]

    Improving the performance of your SQL can be achieved using a number of techniques.

    You could re-word your SQL, create or change indexes, or use hints. Determining the

    combination of approaches which is likely to result in performance improvements

    requires an understanding of how Oracle processes SQL, how you can influence Oracle'sprocessing, and how to make the most of indexes.

    OptimizationOptimization is the process of choosing the most efficient way to execute a SQL

    statement. This is an important step in processing any Data Manipulation Language

    statement (SELECT, INSERT, UPDATE OR DELETE). There may be many different

    ways for Oracle to execute such a statement, for example, varying which tables orindexes are accessed in which order. The procedure used to execute a statement can

    greatly affect how quickly the statement executes. A part of Oracle called optimizerchooses the way that it believes to be most efficient. The Oracle optimizer has three

    primary modes of operation: RULE, COST or CHOOSE for the optimizer_mode

    parameter in your database's init.ora file. You can override the optimizer's default

    operations at the query and session level.

    Setting optimizer_mode to RULE invokes the rule-based optimizer (RBO). The basic

    approach of the rule-based optimizer is that for each table in the WHERE clause, every

    possible access path is considered and ranked. The access path with the lowest rank is

    selected. The remaining tables are then selected based on their rankings. The rule-basedoptimizer will almost always favor an access path involving an index to one involving a

    full table scan. This is because without table sizes, the index is the safer choice. Star

    queries are not recognized by the rule-based optimizer. Star queries are used to optimizea data warehousing design called star schema. The rule-based optimizer does not consider

    Bitmap indexes.

    http://home.earthlink.net/~mweber11/home.html
  • 8/3/2019 SQL Tuning for Oracle

    2/8

    Setting optimizer_mode to COST invokes the cost-based optimizer (CBO). Cost-based

    optimizer incorporates many features of the rule-based optimizer, but has the advantage

    of being able to take into account statistical information relating to volume anddistribution of data within tables and indexes. This optimizer can therefore distinguish

    between a two row table and a two million row table and may generate different

    execution plans for each. To use the cost-based optimizer, you should use the analyzecommand to analyze the tables involved. Create histograms for columns that have an

    uneven distribution and/or a small number of distinct values.

    Setting optimizer_mode to CHOOSE invokes the cost-based optimizer if the tables have

    been analyzed and the rule-based optimizer if the tables have not been analyzed. Oraclewarns against using CHOOSE.

    Analyze Command

    The ANALYZE command collects and stores table and index statistics which are

    essential for efficient operation of the cost-based optimizer. The purpose of the

    ANALYZE command is to perform one of the following functions on an index, table, orcluster:

    - To collect statistics about the object used by the optimizer and store them in the datadictionary

    - To delete statistics about the object from the data dictionary

    - To validate the structure of the object- To identify migrated and chained rows of the table or cluster

    Developer Note: The Analyze Command can slow processing time if the statistics are not

    kept current or if all the objects have not been analyzed. The Analyze command will need

    to be run regularly to maintain current statistics.

    Histograms

    The histograms are created as part of the ANALYZE command. You can choose to

    generate histograms for all columns in the table, for all indexed columns or for selected

    columns only. Normally, you would choose to create histograms only for columns wherethe data was not evenly distributed (skewed). The histogram stores information about the

    frequency of various column values which Oracle can use to decide whether or not to use

    the index. For example, a table with yes/no column that 90% of the rows are yes and only10% are no. The optimizer would use the index for queries on no, but not on queries on

    yes.

    Full Table Scan Versus IndexesThe two most commonly used methods to retrieve rows from a table are full table scan orindex lookup. With a full table scan, all rows from the table are read and comparedagainst the selection criteria. A full table scan can be faster than an index lookup if your

    SQL statement accesses a large percent of the table's data. Although it is not possible to

    generalize across all types of SQL statements, hardware and data distribution, a rule ofthumb can help a programmer decide whether to use a full table scan or an index lookup.

    One such rule of thumb is when accessing more than five percent of the table's data, full

  • 8/3/2019 SQL Tuning for Oracle

    3/8

    table scans may be fater than indexes. Based on your experience, you may find that

    percentage to be closer to 10% or even 25%.

    Full Table Scan

    The full table scan is the simplest way for Oracle to get your data. In a full table scan,

    every row of data in the table is read into memory. To perform a full table scan, Oraclereads all blocks (the basic unit of data storage) allocated to the table, starting with the

    first block and continuing until it reaches the high water mark. The high water mark is the"highest" block in the table that has ever held data. Think of it as the high tide mark.

    To improve performance of a full table scan, you can reduce the number of block reads

    required for the scan. If the table is smaller than it once was, you could lower the high

    water mark by rebuilding the table. Rebuilding the table establishes a new high watermark. Another way to improve a full table scan's performance is to squeeze more rows

    into each block by reducing PCTFREE and increasing PCTUSED.

    IndexesThe B-tree is the default Oracle index type. The B-tree has a hierarchical tree structure. A

    header block contains pointers to the appropriate branch block for any given range of key

    values. Branch blocks point to leaf blocks. The leaf block contains a list of key values

    and ROWIDs (row identifiers).

    Each row of a table has a ROWID. The ROWID is a code (a logical address) reflecting

    the physical location of the row. The fastest SELECT statements are those where the

    WHERE clause contains a condition based on the ROWID.

    The B-tree index provides flexible and efficient query performance. However,

    maintaining the B-tree with changing data can be expensive. To insert a row in theappropriate leaf block requires an index split, if there is no free space within the leaf

    block. Index splits are expensive operations. New blocks must be allocated and indexentries moved from one block to another. Index splits can be avoided by using artificial

    keys or reduced by increasing the amount of free space kept within the index. Use the

    PCTFREE clause of the CREATE INDEX statement to define free space. Also look at

    the REBUILD clause of the ALTER INDEX statement. This allows an index to be rebuiltusing the index itself as the source of data. Indexes should be in a separate table_space.

    It is important that all your indexes contribute to query performance, since these indexes

    will otherwise needlessly degrade Data Manipulation Language performance. Inserts and

    deletes will be significantly slower for tables with a large number of indexes. It may bepossible (with the appropriate scheduling) to drop the indexes make a large number of

    inserts/deletes and then rebuild the indexes. Dropping indexes will dramatically slow

    performance for other user transactions so this technique may not be an option.

    Bitmapped indexes were introduced in Oracle 7.3. Oracle creates a bitmap for each

    unique value of the column in question. Each bitmap contains a single bit (0 or 1) for

    every row in the table. A "1" indicates that the row has the value specified by the bitmap

  • 8/3/2019 SQL Tuning for Oracle

    4/8

    and a "0" indicates that it does not. Bitmaps suit columns with few distinct values and

    which are often queried together. Oracle is unable to lock a single bit, and consequently,

    locking for bitmapped indexes is at the block (or page) level.

    Even if there is an appropriate index available, the optimizer may not be able to take

    advantage of the access because of the wording of the SQL statement. Indexes aredisabled if a column is subjected to any modification. Two examples of modification are

    UPPER(column_name) or column_name +1. Some additional query types that preventindexes being used are:

    !=

    IS NULLNOT IN

    LIKE '%Smith' (with wildcard as the first letter)

    If the WHERE clause uses the column of only one index, Oracle performs a range scan

    on it to retrieve the ROWIDs of the selected rows, and then accesses the table by these

    ROWIDs. If the WHERE clause uses columns of more that five indexes, Oracle performsan index merge. When performing an index merge, Oracle retrieves rows from each

    index matching the appropriate condition. Rows, which are common to all "lists", arereturned.

    A composite index is an index that is made up of more than one column with a maximum

    of 16 columns. Using composite indexes for queries can be faster than single-column

    indexes. The WHERE clause must use all the columns of a composite index in equalityconditions combined with AND operators. If the composite index has three or more

    columns, the index may be used if the WHERE clause references the leading columns. If

    your query specifies values for only the nonleading columns of the index, then the index

    will not be used to resolve the query.

    HintsHints are instructions that you can include in your SQL statement to guide the optimizer.

    Using hints, you can specify join orders, type of access paths, indexes to be used, the

    optimization goal and other instructions. Optimizer hints appears as a comment followingthe first word of the SQL statement. A hint is different from a comment in that they are

    prefixed with a plus sign ("+"). Example follows:

    SELECT /*+ RULE*/

    Subqueries

    A subquery is a SQL statement that occurs within another SQL statement. A WHEREclause can contain a subquery SELECT. Subquery selects can be nested. You can nestsubqueries down more levels than you would ever need. Multiple sources recommend

    you not go down more than 16 levels.

    Correlated Subqueries

    A correlated subquery is one that the subquery refers to values in the parent query. A

    correlated subquery can return the same result as a join, but can be used where a join

  • 8/3/2019 SQL Tuning for Oracle

    5/8

    cannot, such as in an UPDATE, INSERT and DELETE statement. In a correlated

    subquery, the subquery executes repeatedly, once for each value of a candidate row

    selected by the main query. This is why a correlated update can take more processingtime.

    JoinThe join operation allows the results from two or more tables to be merged on some

    common column values. The inner join is the most common type of join operation. Rows

    are pulled from the tables based on some common ("key") value. Rows that have nomatch to the other table are not included. The outer join allows rows to be included even

    if they have no match in the other table. In Oracle, the outer join operator is "(+)". An

    anti-join is used to select all rows from a table that do not have a matching row in theother result set. You could create an anti_join using NOT EXISTS or NOT IN. In a self

    join, a table is joined to itself. A common reason for a self join is a recursive relationship.

    The self join is accomplished by using multiple aliases for the same table.

    Set OperationSQL has operations that allow result sets to be concatenated, subtracted or overlaid.These operators are UNION, MINUS, and INTERSECT. UNION returns the sum of two

    result sets without duplicates. MINUS returns all rows in the first result set which do not

    appear in the second result set. INTERSECT returns only the rows that appear in both

    result sets ignoring nulls. All set operations require that the component queries return thesame number of columns and those columns need to be compatible datatypes. The join

    operations described above do not have these columns and compatible datatype

    limitations.

    Bind Variables

    Variables can be specified as literals or as bind variables. Bind variables are fixedreferences to variables contained elsewhere in the programming language or development

    tool. Bind variables are recognizable because they are prefixed by a colon. If you are

    going to re-execute your SQL, but use different parameters, use bind variables to defineparameters. This will minimize parsing. Bitmap indexes do not work with bind variables.

    NullNull values are used to indicate that a data item is missing or undefined. The use of null

    values in relational databases can lead to unexpected results. NULL values extend the

    coding logic of TRUE/FALSE to TRUE/FALSE/UNKNOWN. Null values impact thequery results. Oracle recognizes null values as the highest values in a column, which

    means in an ORDER BY ascending clause, null values are placed at the bottom.Arithmetic operations can not be performed on fields that contain null values. The result

    of adding, subtracting, multiplying or dividing a null value and any number is a nullvalue. A field query using "!=" will not return records where the query field is null. When

    an indexed column is NULL, that row will not have an entry in the index. In other words,

    nulls are not indexed. When possible you may want to define columns as NOT NULL,for fields that will be indexed.

  • 8/3/2019 SQL Tuning for Oracle

    6/8

    Truncate Table versus DeleteThe TRUNCATE TABLE command allows all rows to be removed from a table with

    minimal overhead. TRUNCATE TABLE has no rollback segment and no commit.

    Emptying a table using the DELETE command results in higher overhead in rollbacksegments and redo log entries. Do not issue TRUNCATE TABLE unless you are sure

    you want all the table data gone.

    Execution PlanTo execute a Data Manipulation Language statement, Oracle may have to perform many

    steps. Each of these steps either physically retrieves rows of data from the database orprepares them in some way for the user issuing the statement. The combination of the

    steps Oracle uses to execute a statement is called an execution plan. Oracle provides tools

    that reveal the way in which your SQL is processing and the resources expended. These

    tools are the EXPLAIN PLAN, SQL_TRACE facility and the TKPROF utility.

    Explain Plan

    Explain Plan is used to determine the execution plan Oracle7 follows to execute aspecified SQL statement. This command inserts a row describing each step of the

    execution plan into a "plan table". If you are using cost-based optimization, thiscommand also determines the cost of the executing the statement. If you examine the

    plan, you can see how the server executes the statement.

    Example of Explain Plan follows:

    SQL> EXPLAIN PLAN SET STATEMENT_ID = 'MyInitialsTodaysDate' forSELECT * FROM

    The results of the explain plan go into the plan_table. To view the explain plan, use a

    variation of the following statement. The key to the statement below is the CONNECTBY and the STATEMENT_ID = 'MyInitialsTodaysDate'. The CONNECT BY gives youthe format and the STATEMENT_ID isolates the data to your Explain Plan. This format

    is the key to interpreting the execution plan. The more indented the access path, the

    earlier it is executed. If two statements are indented at the same level, the uppermoststatement is executed first.

    COL OPERATION FORMAT A30

    COL OPTIONS FORMAT A20

    COL OBJECT FORMAT A20

    SELECT LPAD(' ', 2*LEVEL) || OPERATION ||DECODE (ID, 0, ' COST = ' || POSITION) "OPERATION",

    OPTIONS "OPTIONS",

    OBJECT_NAME "OBJECT"FROM PLAN_TABLE WHERE STATEMENT_ID = 'MyInitialsTodaysDate'

    CONNECT BY PRIOR ID = PARENT_ID

    START WITH ID = 0/

  • 8/3/2019 SQL Tuning for Oracle

    7/8

    Because the procedure for running explain plan is awkward, many Oracle users write

    scripts to automate the process. In addition, there are a number of commercial and

    shareware products which can display explain plans and perform additional analysis ofyour SQL statement.

    SQL Trace FacilityOracle provides a facility for tracing SQL statement execution, which can provide all of

    the information provided by EXPLAIN PLAN together with details of CPU and I/O

    requirements and even the number of rows processed by each step in the execution plan.The ALTER SESSION SET SQL_TRACE TRUE statement allows SQL tracing to be

    initiated. The TKPROF command allows the trace files generated to be formatted in a

    meaningful way. The SQL trace facility generates the following statistics for eachstatement:

    - Parse, execute and fetch counts

    - CPU and elapse times- Physical reads and logical reads

    - Number of rows processed- Missed on the library cache

    To enable the SQL trace facility first set the following parameters:SQL> ALTER SESSION SET SQL_TRACE = TRUE;

    SQL>ALTER SESSION SET TIMED_STATISTICS=TRUE;

    This parameter enables the collection of timed statistics. The default is false. Setting it to

    true will reduce performance very slightly.

    SQL> MAX_DUMP_FILE_SIZE = N

    This parameter specifies the maximum size of trace files. The default is 500.

    SQL> USER_DUMP_FILE_SIZE = DIRECTORY_PATH

    This parameter specifies the destination for the trace file. The default value for thisparameter is the default destination for system dumps.

    Developer Note: SQL Trace increases system overhead. You should enable it only when

    tuning your SQL statements, and disable it when you are finished.

    Having enabled SQL_TRACE, your next challenge will be to find your trace file. Your

    trace file will be in the user_dump_file_size directory you specified. In UNIX and many

    other operating systems, the name will start with "ora' or "oracle_sid" and end with ".trc".In between will be the process identifier for the Oracle server process. The timestamp

    will help to identify your file. Once the trace file is found, TKPROF utility is used tocreate a usable file.

    TKPROF TRACEFILE OUTPUTFILE [SORT=OPTION] PRINT=N]

    [EXPLAIN=USERNAME/PASSWORD] [INSERT=FILENAME] [SYS=NO]

    [RECORD=FILENAME] [TABLE=SCHEMA.TABLENAME]

  • 8/3/2019 SQL Tuning for Oracle

    8/8

    TRACEFILE the name of the trace output file

    OUTPUTFILE the name of the formatted file

    SORT=OPTION the order in which to sort the statementsPRINT=N prints the first n statements

    EXPLAIN=USER/PASSWORD runs explain plan in the specified username

    INSERT=FILENAME generates INSERT statementsSYS=NO ignores recursive SQL statements run as user sys.

    RECORED=FILENAME records statement found in the trace file

    TABLE-SCHEMA.TABLENAME puts execution plan into specified table (default isplan_table)

    AutotraceSQL_TRACE and TKPROF are powerful tools, but they are not always easy to use. Each

    time you use SQL_TRACE you have to find the trace file then format and interpret the

    TKPROF output. Starting with SQL*Plus version 3.3 (Oracle 7.3) you have analternative. AUTOTRACE can generate execution plans and execution statistics for SQL

    statements executed from SQL*Plus. The set autotrace on command will cause eachquery, after being executed, to display both its execution path and high-level traceinformation about the processing involved to resolving the query. Unlike the explain plan

    command, the statement is actually run, even if you choose not to see the output. When

    you use set autotrace on, records are inserted into plan_table to show the order of

    operations executed.

    To enable AUTOTRACE set the following parameters:

    SQL> SET AUTOTRACE ON

    To skip seeing the output of the SQL statement use the following command:

    SQL>SET AUTOTRACE TRACE

    SummaryTuning SQL can improve the response time, throughput and scalability of an applicationand can help avoid costly hardware upgrades. SQL tuning is a very cost-effective way of

    improving system performance and can be done at any stage of a system's life cycle.

    Ideally, SQL should be tuned as it is written. It is hoped that the information presented

    here will help you understand how Oracle processes SQL, how you can influenceOracle's processing, and how you can make the most of indexes.

    Copyright 2001