oracle perf tune - intro

Upload: tssr2001

Post on 03-Jun-2018

225 views

Category:

Documents


0 download

TRANSCRIPT

  • 8/12/2019 Oracle Perf Tune - Intro

    1/16

    Oracle Data Architecture

    Introduction to Performance Tuning

    Performance tuning focuses primarily on writing efficient SQL, allocating appropriate computing

    Resources, and analyzing wait events and contention in the system. Unlike several other

    features of Oracle database management, performance tuning isnt a cut and

    dried subject with clear prescriptions and rules for every type of problem you may face. This is

    one area where your technical knowledge must be used together with constant experimentation

    and observation

    A Systematic Approach to Performance Tuning

    Its important to follow a systematic approach to tuning database performance. Performance

    problems commonly come to the fore only after a large number of users start working on a new

  • 8/12/2019 Oracle Perf Tune - Intro

    2/16

    production database. The system seems fine during development and rigorous testing, but it

    slows down to a crawl when it goes to production. This could be because the application isnt

    easily scalable for a number of reasons.

    The seeds of the future performance potential of your database are planted when you design

    your database. You need to know the nature of the applications the database is going to

    support.

    The more you understand your application, the better you can prepare for it by creating your

    database

    with the right configuration parameters. If major mistakes were made during the design stage

    and the database is already built, youre left with tuning application code on one hand and the

    database resources such as memory, CPU, and I/O on the other. Oracle suggests a specific

    design approach with the following steps:

    1. Design the application correctly.

    2. Tune the application SQL code.

    3. Tune memory.

    4. Tune I/O.

    5. Tune contention and other issues.

    Reactive & Reactive Performance Tuning

    Although the preceding performance tuning steps suggests that you can follow the sequence in

    an orderly fashion, the reality is completely different. Performance tuning is an iterative process,

    not a sequential one where you start at the top and end up with a fully tuned database as the

    product.

    As a DBA, you may be involved in a new project from the outset, when you have just the

    functional requirements. In this case, you have an opportunity to be involved in the tuning effort

    from the beginning stages of the application, a phase that is somewhat misleadingly dubbed

    proactive tuning by some. Alternatively, you may come in after the application has already been

    designed and implemented, and is in production. In this case, your performance efforts are

    categorized as reactive performance tuning. What you can do to improve the performance of the

  • 8/12/2019 Oracle Perf Tune - Intro

    3/16

    database depends on the stage at which you can have input, and on the nature of the

    application itself.

    Optimizing Oracle Query Processing

    When a user starts a data-retrieval operation, the users SQL statement goes through several

    sequential steps that together constitute query processing. One of the great benefits of using

    the SQL language is that it isnt a procedural language in which you have to specify the steps to

    be followed to achieve the statements goal. In other words, you dont have to state how to do

    something; rather, you just state what you need from the database.

    Query processing is the transformation of your SQL statement into an efficient execution plan

    to return the requested data from the database. Query optimization is the process of choosing

    the most efficient execution plan. The goal is to achieve the result with the least cost in terms of

    resource usage. Resources include the I/O and CPU usage on the server where your database

    is running. This also means that the goal is to reduce the total execution time of the query,

    which is simply the sum of the execution times of all the component operations of the query.

    This optimization of throughput may not be the same as minimizing response time. If you want

    to minimize the time it takes to get the first n rows of a query instead of the entire output of the

    query, the Optimizer may choose a different plan. If you choose to minimize the response time

    for all the query data, you may also choose to parallelize the operation.

    A users SQL statement goes through theparsing, optimizing, and execution stages. If the

    SQL statement is a query, data has to be retrieved, so theres an additional fetch stage before

    the SQL statement processing is complete.

    Parsing

    Parsing primarily consists of checking the syntax and semantics of the SQL statements. The

    end product of the parse stage of query compilation is the creation of theparse tree, which

    represents the querys structure.

    The SQL statement is decomposed into a relational algebra query thats analyzed to see

    Whether its syntactically correct. The query then undergoes semantic checking. Thedata

    dictionary is consulted to ensure that the tables and the individual columns that are referenced

  • 8/12/2019 Oracle Perf Tune - Intro

    4/16

    in the query do exist, as well as all the object privileges. In addition, the column types are

    checked to ensure that the data matches the column definitions. The statement is normalized so

    it can be processed more efficiently. The query is rejected if it is incorrectly formulated. Once

    the parse tree passes all the syntactic and semantic checks, its considered a valid parse tree,

    and its sent to the logical query plan generation stage. All these operations take place in the

    library cache portion of the SGA.

    Optimization

    During the optimization phase, Oracle uses its optimizerwhich is a cost-based optimizer

    (CBO)to choose the best access method for retrieving data for the tables and indexes

    referred to in the query. Using statistics that you provide and any hints specified in the SQL

    queries, the CBO produces an optimal execution plan for the SQL statement.

    Execution Plan Generation Phase

    During this phase, Oracle transforms the logical query plan into a physical query plan. The

    Optimizer may be faced with a choice of several algorithms to resolve a query. It needs to

    choose the most efficient algorithm to answer a query, and it needs to determine the most

    efficient way to implement the operations. In addition to deciding on the best operational steps,

    the Optimizer determines the order in which it will perform these steps. For example, the

    Optimizer may decide that a join between table A and table B is called for. It then needs to

    decide on the type of join and the order in which it will perform the table join.

    The physical query or execution plan takes into account the following factors:

    The various operations (for example, joins) to be performed during the query

    The order in which the operations are performed

    The algorithm to be used for performing each operation

    The best way to retrieve data from disk or memory

    The Optimizer may generate several valid physical query plans, all of which are potential

    execution plans. The Optimizer then chooses among them by estimating the cost of each

    possible physical plan based on the table and index statistics available to it, and selecting the

    plan with the lowest estimated cost. This evaluation of the possible physical query plans is

    called cost-based query optimization. The cost of executing a plan is directly proportional to the

    amount of resources such as I/O, memory, and CPU necessary to execute the proposed plan.

    The Optimizer passes this lowcost physical query plan to Oracles query execution engine.

  • 8/12/2019 Oracle Perf Tune - Intro

    5/16

    An Example of Cost Optimization

    Lets say you want to run the following query, which seeks to find all the supervisors who work

    in Dallas. The query looks like this:

    Now, you have several ways to arrive at the list of the supervisors. Lets consider three ways to

    arrive at this list, and compute the cost of accessing the results in each of the three ways.

    Make the following simplifying assumptions for the cost computations:

    You can only read and write data one row at a time (in the real world, you do I/O at the block

    level, not the row level).

    The database writes each intermediate step to disk (again, this may not be the case in the

    real world).

    No indexes are on the tables.

    The employee table has 2,000 rows.

    The dept table has 40 rows. The number of supervisors is also 40 (one for each department).

    Ten departments are in the city of Dallas.

    In the following sections, youll see three different queries that retrieve the same data, but tha t

    use different access methods. For each query, a crude cost is calculated, so you can compare

    how the three queries stack up in terms of resource cost. The first query uses a Cartesian join.

    Query 2: A Join of Two Tables

    The second query uses a join of the employee and dept tables. First, join the employee and

    dept tables on the dept_no column. From this join, select all rows where e.job=supervisor and

    city=Dallas.

    The following would be the total cost of performing the query:

    Joining the employee and dept tables first requires a read of all the rows in both tables:

    2,000 + 40 = 2,040

  • 8/12/2019 Oracle Perf Tune - Intro

    6/16

    Creating the join of the employee and dept tables: 2,000 writes

    Reading the join results costs: 2,000 reads

    Total I/O cost: 2,040 + 2,000 + 2,000 = 6,040

    Query 3: A Join of Reduced Relations

    The third query also uses a join of the employee and dept tables, but not all the rows in the two

    tablesonly selected rows from the two tables are joined. Heres how this query would proceed

    to retrieve the needed data. First, read the employee table to get all supervisor rows. Next, read

    the dept table to get all Dallas departments. Finally, join the rows you derived from the

    employee and the dept tables.

    The following would be the total cost of performing the query:

    Reading the employee table to get the supervisor rows: 2,000 reads

    Writing the supervisor rows derived in the previous step: 40 writes

    Reading the dept table to get all Dallas departments: 40 reads

    Writing the Dallas department rows derived from the previous step: 10 writes

    Joining the supervisor rows and department rows derived in the previous steps of this query

    execution: A total of 40 + 10 = 50 writes

    Reading the join result from the previous step: 50 reads

    Total I/O cost: 2,000 + 2(40) + 10 + 2(50) = 2,190

    This example, simplified as it may be, shows you that Cartesian products are more expensive

    than more restrictive joins. Even a selective join operation, the results show, is more expensive

    than a selection operation. Although a join operation is in query 3, itsa join of two reduced

    relations; the size of the join is much smaller than the join in query 2. Query optimization often

    involves early selection (picking only some rows) and projection (picking only some columns)

    operations to reduce the size of the resulting outputs or row sources.

    Query Execution

    During the final stage of query processing, the optimized query (the physical query plan that has

    been selected) is executed. If its a SELECT statement, the rows are returned to the user. If its

    an INSERT, UPDATE, or DELETE statement, the rows are modified. The SQL execution engine

    takes the execution plan provided by the optimization phase and executes it.

    Of the three steps involved in SQL statement processing, the optimization process is the

  • 8/12/2019 Oracle Perf Tune - Intro

    7/16

    crucial one because it determines the all-important question of how fast your data will be

    retrieved.

    Understanding how the Optimizer works is at the heart of query optimization. Its important to

    know what the common access methods, join methods, and join orders are in order to write

    efficient SQL. The next section presents a detailed discussion of the all-powerful Oracle CBO.

    Query Optimization and the Oracle

    Cost-Based Optimizer

    In most cases, you have multiple ways to execute a SQL query. You can get the same results

    from doing a full table scan or using an index. You can also retrieve the same data by accessing

    the tables and indexes in a different order. The job of the Optimizer is to find the optimal or best

    plan to execute your DML statements such as SELECT, INSERT, UPDATE, and DELETE.

    Oracle uses the CBO to help determine efficient methods to execute queries.

    The CBO uses statistics on tables and indexes, the order of tables and columns in the SQL

    statements, available indexes, and any user-supplied access hints to pick the most efficient way

    to access them. The most efficient way, according to the CBO, is the least costly access

    method, cost being defined in terms of the I/O and the CPU expended in retrieving the rows.

    Accessing the necessary rows means Oracle reads the database blocks on the file system into

    the buffer pool. The resulting I/O cost is the most expensive part of SQL statement execution

    because it involves reading from the disk. You can examine these access paths by using tools

    such as the EXPLAIN PLAN.

    Choosing Your Optimization Mode

    In older versions of Oracle, you had a choice between a rule-based and a cost-based optimizer.

    In a rule-based approach. All the access paths were assigned a rank, and the path with the

    lowest rank was chosen. The operations with a lower rank usually executed faster than those

    with a higher rank. For example, a query that uses the ROWID to search for a row has a cost of

    1. This is expected because identifying a row with the help of the ROWID, an Oracle pointer-like

    mechanism, is the fastest way to locate a row. On the other hand, a query that uses a full table

    scan has a cost of 19, the highest possible cost under rule-based optimization. The CBO

    method almost always performs better than the older rule-based approach because, among

    other things, it takes into account the latest statistics about the database objects.

    Providing Statistics to the Optimizer

  • 8/12/2019 Oracle Perf Tune - Intro

    8/16

    In Oracle Database 10g, the database itself automatically collects the necessary Optimizer

    statistics. Every night, the database schedules a statistics collection job during the maintenance

    window of the Oracle Scheduler. The maintenance window, by default, extends from 10 PM to 6

    AM on weekdays and all weekend as well. The job is named GATHER_STATS_JOB, and runs

    by default in every Oracle Database 10g database. You have the ability to disable the

    GATHER_STATS_JOB if you wish.

    You can get details about this default GATHER_STATS_JOB by querying the

    DBA_SCHEDULER_JOBS view.

    The GATHER_STATS_JOB collects statistics for all tables that either dont have Optimizer

    statistics, or have stale (outdated) statistics. Oracle considers an objects statistics stale if more

    than 10 percent of its data has changed since the last time it collected statistics for that object.

    By default, Oracle monitors all DML changes such as inserts, updates, and deletes made to all

    database objects. You can also view the information about these changes in the

    DBA_TAB_MODIFICATIONS view. Based on this default object monitoring, Oracle decides

    whether to collect new statistics for an object.

    To check that the GATHER_STATS_JOB is indeed collecting statistics on a regular basis, use

    the following:

    Note the following points about the preceding output:

  • 8/12/2019 Oracle Perf Tune - Intro

    9/16

    The job collects statistics during the maintenance window of the database, which is, by

    default, scheduled between 10 PM and 6 AM during weekdays and all weekend.

    The statistics are collected by the nightly GATHER_STATS_JOB run by the Scheduler.

    If a table is created that day, the job uses all the rows of the table the first time it collects

    statistics for the table.

    The sampling percentage varies from less than 1 percent to 100 percent.

    The size of the table and the percentage of the sample arent correlated.

    The job doesnt collect statistics for all the tables each day.

    If a tables data doesnt change after its created, the job never collects a second time.

    Oracle determines the sample size for each object based on its internal algorithms; there is no

    Standard sample size for all objects.

    What exactly the statistics does

    The number of rows in a table

    The number of rows per database block

    The average row length

    The total number of database blocks in a table

    The number of levels in each index

    The number of leaf blocks in each index

    The number of distinct values in each column of a table

    Data distribution histograms

    The number of distinct index keys

    Cardinality (the number of columns with similar values for each column)

    The minimum and maximum values for each column

    System statistics, which include I/O characteristics of your system; and CPU statistics, which

    include CPU speed and other related statistics

    The key to the CBOs capability to pick the best possible query plan is its capability to correctly

    estimate the cost of the individual operations of the query plan. These cost estimates are

    derived from the knowledge about the I/O, CPU, and memory resources needed to execute

    each operation based on the table and index statistics.

    The database stores the Optimizer statistics that it collects in its data dictionary. The DBA_

  • 8/12/2019 Oracle Perf Tune - Intro

    10/16

    TAB_STATISTICS table shows Optimizer statistics for all the tables in your database. You can

    also see column statistics by querying the DBA_TAB_COL_STATISTICS view

    As you can see, more than 22 million PERSON_ID numbers are in the PERSONNEL table.

    However, there are only 7,281 distinct last names and 1,729 distinct first names. Of course, the

    GENDER column has only two distinct values. The Optimizer takes into account these types of

    information regarding your table data, before deciding on the best plan of execution for a SQL

    statement that involves the tables columns.

    Tip Optimizer statistics include both object (table and index) statistics, as well as system

    statistics. Without

    accurate system statistics, the Optimizer cant come up with valid cost estimates to evaluate

    alternate execution plans.

    What Does the Optimizer Do?

    The CBO performs several intricate steps to arrive at the optimal execution plan for a users

    query. The original SQL statement is most likely transformed, and the CBO evaluates

    alternative access paths (for example, full-table or index-based scans). If table joins are

    necessary, the Optimizer evaluates all possible join methods and join orders. The Optimizer

    evaluates all the possibilities and arrives at the execution plan it deems the cheapest in terms of

    total cost, which includes both I/O and CPU resource usage cost.

    SQL Transformation

    Oracle hardly ever executes your query in its original form. If the CBO determines that a

    different SQL formulation will achieve the same results more efficiently, it transforms the

    statement before executing it. A good example is where you submit a query with an OR

  • 8/12/2019 Oracle Perf Tune - Intro

    11/16

    condition, and the CBO transforms it into a statement using UNION or UNION ALL. Or your

    statement may include an index, but the CBO might transform the statement so it can do a full

    table scan, which can be more efficient under some circumstances. In any case, its good to

    remember that the query a user wishes to be executed may not be executed in the same form

    by Oracle, but the querys results are still the same. Here are some common transformations

    performed by the Oracle CBO:

    Transform IN into OR statements.

    Transform OR into UNION or UNION ALL statements.

    Transform noncorrelated nested selects into more efficient joins.

    Transform outer joins into more efficient inner joins.

    Transform complex subqueries into joins, semi joins, and anti joins.

    Star transformation for data warehouse tables based on thestar schema.

    Transform BETWEEN to greater than or equal to and less than or equal to statements.

    Choosing the Access Path

    Oracle can often access the same data through different paths. For each query, the Optimizer

    evaluates all the available paths and picks the least expensive one in terms of resource usage.

    If joins are involved, then the join order and the join method are evaluated to finally arrive at the

    best execution plan. Youll take a brief look at the steps the Optimizer goes through before

    deciding on its choice of execution path.

    Full Table Scans

    Oracle scans the entire table during a full table scan. Oracle reads each block in the table

    sequentially, so the full table scan can be efficient if the MULTI_BLOCK_READ_COUNT

  • 8/12/2019 Oracle Perf Tune - Intro

    12/16

    initialization parameter is set high enough. However, for large tables, full table scans are

    inefficient in general.

    Table Access by ROWID

    Accessing a table by ROWID retrieves rows using unique ROWIDs. ROWIDs in Oracle specify

    the exact location in the data file and the data block where the row resides, so ROWID access is

    the fastest way to retrieve a row in Oracle. Often, Oracle obtains the ROWID through an index

    scan of the tables indexes. Using these ROWIDs, Oracle swiftly fetches the rows.

    Index Scans

    An index stores two things: the column value of the column on which the index is based and the

    ROWID of the rows in the table that contain that column value. An index scan retrieves data

    from an index using the values of the index columns. If the query requests only the indexed

    column values, Oracle will return those values. If the query requests other columns outside the

    indexed column, Oracle will use the ROWIDs to get the rows of the table.

    Choosing the Join Method

    When you need to access data thats in two or more tables, Oracle joins the tables based on a

    common column. However, there are several ways to join the row sets returned from the

    execution plan steps. For each statement, Oracle evaluates the best join method based on the

    statistics and the type of unique or primary keys on the tables. After Oracle has evaluated the

    join methods, the CBO picks the join method with the least cost.

    The following are the common join methods used by the CBO:

    Nested-loop join: A nested-loop join involves the designation of one table as the driving table

    (also called the outer table) in the join loop. The other table in the join is called the inner

    table. Oracle fetches all the rows of the inner table for every row in the driving table.

    Hash join: When you join two tables, Oracle uses the smaller table to build a hash table on

    the join key. Oracle then searches the larger table and returns the joined rows from the hash

    table.

    Sort-merge join: If both the tables in a join are sorted on the join key, the sorted lists are

    merged together.

  • 8/12/2019 Oracle Perf Tune - Intro

    13/16

    Choosing the Join Order

    Once the Optimizer chooses the join method, it determines the order in which the tables are

    joined. The goal of the Optimizer is always to join tables in such a way that the driving table

    eliminates the largest number of rows. A query with four tables has a maximum of 4 factorial, or

    24, possible ways in which the tables can be joined. Each such join order would lead to a

    number of different execution plans, based on the available indexes and the access methods.

    The search for an optimal join strategy could take a long time in a query with a large number of

    tables, so Oracle depends on an adaptive search strategy to limit the time it takes to find the

    best execution plan. An adaptive search strategy means that the time taken for optimization is

    always a small percentage of the total time that is taken for execution of the query itself.

    Drawbacks of the CBO

    The CBO is systematic, but the Optimizer is not guaranteed to follow the same plan in similar

    cases. However, the CBO isnt always perfect, and you need to watch out for the following:

    Application developers may know more than the CBO when it comes to choosing the best

    access path. Application developers know the needs of the users, of which the CBO is

    completely unaware. This could lead to a situation where the CBO may be optimizing

    throughput, when the users would rather have a quick set of results on their screen. By

    using hints such as FIRST_ROWS_n, you can overcome this drawback in the CBO.

    The CBO depends enormously on correct statistics gathering. If the statistics are absent or

    outdated, the Optimizer can make poor decisions.

    Providing Statistics to the CBO

    The CBO can follow optimal execution paths only if it has detailed knowledge of the database

    objects. In Oracle Database 10g, the recommended way to provide these statistics is by letting

    the database automatically collect statistics for you. This is known as the Automatic Optimizer

    Statistics Collection feature. You can also manually provide statistics to the

  • 8/12/2019 Oracle Perf Tune - Intro

    14/16

    Optimizer with the DBMS_STATS package.

    Using DBMS_STATS to Collect Statistics

    Although letting the database automatically collect Optimizer statistics is the recommended

    approach under Oracle Database 10g, you can still manually collect Optimizer statistics using

    the DBMS_STATS package.

    Collecting the Statistics

    The DBMS_STATS package has several procedures that let you collect data at different levels.

    The main data collection procedures for database table and index data are as follows:

  • 8/12/2019 Oracle Perf Tune - Intro

    15/16

    Frequency of Statistics Collection

    Theoretically, if your data is static, you may only need to collect statistics once. If your database

    performs only a small amount of DML activities, you may collect statistics at relatively longer

    intervals, say weekly or monthly. However, if your database objects go through constant changeon a daily basis, you need to schedule the statistics collection jobs much more frequently, say

    daily or even more often. One of the best Oracle Database 10g new features is that you can let

    the database decide the frequency of statistics collection. Remember that the database bases

    its statistics collection on whether the statistics are fresh or stale. Thus, you can relax and let

    the database be the arbiter of how often to collect statistics.

    What Happens When You Dont Have Statistics

    Youve seen how the Oracle database can automatically collect Optimizer statistics for you.

    Youvealso learned how to use the DBMS_STATS package to collect the statistics manually

    yourself. But what happens if you disable the automatic statistics collection process, or if you

    dont collect statisticsin a timely fashion? Even with automatic statistics collection, under which

    necessary statistics are collected on a nightly basis, you may have a situation where table data

    is altered after the statistics collection process is over. In situations such as this, Oracle usesdata, such as the number of blocks taken up by the table data and other ancillary information, to

    figure out the Optimizer execution plan. You can also use the initialization parameter

    OPTIMIZER_DYNAMIC_SAMPLING to let Oracle estimate Optimizer statistics on the fly, when

    no statistics exist for a table, or when the statistics exist but are too old or otherwise unreliable.

    Of course, sampling statistics dynamically would mean that the compile time for the SQL

  • 8/12/2019 Oracle Perf Tune - Intro

    16/16

    statement involved would be longer. Oracle smartly figures out if the increased compile time is

    worth it when it encounters objects without statistics. If its worth it,Oracle will sample a portion

    of the objects data blocks to estimate statistics. You need to set thevalue of the

    OPTIMIZER_DYNAMIC_SAMPLING initialization parameter to 2 or higher to enable dynamic

    sampling of all unanalyzed tables. Because the default for this parameter is 2, dynamic

    sampling is turned on by default in your database. Thus, you need not spend sleepless nights

    worrying about objects with missing or outdated statistics. In any case, if you adhere to Oracles

    recommendation and use the Automatic Optimizer Statistics Collection feature, the

    GATHER_STATS_JOB will automatically collect your databases statistics. The

    GATHER_STATS_JOB is created at database creation time and is managed by the Oracle

    Scheduler, which runs the job when the maintenance window is opened. By default, the

    maintenance window opens every night from 10 PM to 6 AM, and all day on weekends. Oracle

    will collect statistics for all objects that need them if you adopt the Automatic Optimizer Statistics

    Collection feature. The feature is turned on by default in a new Oracle 10g database

    or when you upgrade to the 10g release from an older version.