db2 sql tuning.ppt

Upload: suresh-vanamala

Post on 04-Jun-2018

264 views

Category:

Documents


1 download

TRANSCRIPT

  • 8/14/2019 DB2 SQL TUNING.ppt

    1/53

  • 8/14/2019 DB2 SQL TUNING.ppt

    2/53

    Topics of Discussion

    General Recommendation

    Predicates Evaluation

    Filter Factor

    Column Correlation

    Writing Subquery

    Special Techniques to influence access path Using DB2 EXPLAIN

    Different Access Types

  • 8/14/2019 DB2 SQL TUNING.ppt

    3/53

    Topics of Discussion

    Join Methods

    DB2 Data and Index page prefetch

    Sorting of Data and RIDs

    View Merge/Materialization

    Query Parallelism

    DB2 V5 Features

  • 8/14/2019 DB2 SQL TUNING.ppt

    4/53

    General Recommendation

    Make Sure

    Queries are as simple as possible

    Unused rows are not fetched. Filtering to be doneby DB2 not in the application program.

    Unused columns are not selected

    There is no unnecessary ORDER BY or GROUPBY Clause

    Use page level locking and try to minimize lock

    duration.

    Big tables should be handled with care.

  • 8/14/2019 DB2 SQL TUNING.ppt

    5/53

    General Recommendation

    Try to use indexable predicates wherever possible

    Do not code redundant predicates

    Make sure that declared length of host variable isnot greater than length attribute of data column.

    If there are efficient indexes available on the

    tables in the subquery, co-related subquery willperform better. Otherwise no co related subquery

    will perform better.

    If there are multiple subqueries, make sure that

    they are ordered in efficient manner.

  • 8/14/2019 DB2 SQL TUNING.ppt

    6/53

    Predicate

    Predicates are found on WHERE, ON and

    HAVING clause of the SQL. ON predicates are

    applied first, then WHERE predicates and afterdata access HAVING predicates are applied.

    Predicates on HAVING clause are not used when

    accessing data

    Predicate types have great impact on choosing

    DB2 access path.

    General Predicate Types

    - Subquery, Equal, Range, IN-List and NOT

  • 8/14/2019 DB2 SQL TUNING.ppt

    7/53

    Predicate

    Predicates are also categorized into Indexable and

    non-Indexable predicates.

    Predicates are classified into Stage 1 and Stage 2predicates depending on when they are used

    during query evaluation.

    For Outer join, predicates on ON clause are

    treated as stage 2 predicates. And most of the

    other predicates are applied after JOIN as stage 2

    predicates. Predicates on table expression can be

    evaluated before join as stage 1 predicate

  • 8/14/2019 DB2 SQL TUNING.ppt

    8/53

    Order of Predicate Evaluation

    Predicates are evaluated in following sequence :

    Indexable matching predicates are accessed first

    Next is Indexable non-matching index. Processingis called index screening

    All Indexable predicates are stage 1 predicates

    After data page access, other stage 1 predicates areapplied

    Finally stage 2 predicates are applied in returned

    data rows

  • 8/14/2019 DB2 SQL TUNING.ppt

    9/53

    Order of Predicate Evaluation

    Predicate evaluation within each stage :

    All equal predicates are applied first

    Range predicates are next

    All other predicates at the last

    After both of the sets of rule are applied,predicates are evaluated in the order, they appear

    in the query.

  • 8/14/2019 DB2 SQL TUNING.ppt

    10/53

    Filtering

    DB2 will apply the most restrictive predicate first

    to reduce the processing at next stage for stage 1

    predicates. It is better to code predicate with high filtering

    factor first

    DB2 evaluates the filter factor based on catalog

    information in SYSCOLUMNS and

    SYSCOLDIST tables.

  • 8/14/2019 DB2 SQL TUNING.ppt

    11/53

    Filtering

    If distribution of column value is not available in

    catalog table SYSIBM.SYSCOLDIST, DB2

    assumes normal distribution for filtering. Make sure the catalog tables are updated either

    manually or running RUNSTATS.

    If there is no information available from the

    catalog tables, DB2 assumes default filter factor.

    DB2 uses default filter factor for predicates in

    static SQLs using host variables.

  • 8/14/2019 DB2 SQL TUNING.ppt

    12/53

    Column Correlation

    Two columns of a table are said to be co-related if

    they are depended on each other.

    DB2 might not determine optimum access path,table order or join method when query uses highly

    correlated columns.

    Column correlation makes query cheaper than

    actually they are.

    Run RUNSTATS to update the catalog tables with

    correct correlation to help DB2 to find actual

    filtering factor.

  • 8/14/2019 DB2 SQL TUNING.ppt

    13/53

    Using host variable effectively

    If static SQL has host variables, DB2 might not

    select the optimum access path as it uses default

    filter factor for the predicates using host variable. There are two ways to change the access path for

    the query that contains host variables

    - Using REOPT(VARS) option to change the

    access path at run time.

    - Rewrite the SQL in a different way.

  • 8/14/2019 DB2 SQL TUNING.ppt

    14/53

    Writing Subquery

    Subquery could be correlated and non-correlated

    For Correlated subquery, for each row, returned

    from the outer query, the subquery is evaluated. For non-correlated subquery, Inner query is

    evaluated first and then the outer query

    DB2 sometime can transform the subquery to join

    and sometime application programmer has to do it

    for better performance

    Any subquery could be transformed to a join

  • 8/14/2019 DB2 SQL TUNING.ppt

    15/53

    Writing Subquery

    If you use columns from the both the tables, better

    to use JOIN

    Guidelines for Writing efficient subquery : If there are efficient indexes available on the tables in the

    subquery, then a correlated subquery is likely to be the

    most efficient kind of subquery.

    If there are no efficient indexes available on the tables inthe subquery, then a non-correlated subquery would likely

    perform better.

    If there are multiple subqueries in any parent query, make

    sure that the subqueries are ordered in the most efficientmanner.

  • 8/14/2019 DB2 SQL TUNING.ppt

    16/53

    Some Special Techniques

    OPTIMIZE OF n ROWS

    Reducing the number of matching columns for

    index scan Adding extra local predicates

    Changing inner join to outer join

    Updating Catalog Statistics

  • 8/14/2019 DB2 SQL TUNING.ppt

    17/53

    OPTIMIZE FOR n ROWS

    DB2 Chooses the access path that minimizes the

    response time for retrieving the first few rows

    Using OPTIMIZE FOR does not stop the useraccessing whole result set.

    This is not useful when DB2 has to gather whole

    result set before returning the first n rows.

  • 8/14/2019 DB2 SQL TUNING.ppt

    18/53

    Influencing access path

    DB2 evaluates the access path based on

    information available in catalog tables

    Wrong catalog information or unavailable cataloginformation may result in selection of wrong

    access path

    Wrong access path could be because of wrong

    index selection

    It also could be of index selection where

    tablespace scan is effective

  • 8/14/2019 DB2 SQL TUNING.ppt

    19/53

  • 8/14/2019 DB2 SQL TUNING.ppt

    20/53

    Other factors

    Adding extra predicate may influence in selection

    of join method

    If you have extra predicate, Nested loop join maybe selected as DB2 assumes that filter factor will

    be high. The proper type of predicate to add is

    WHERE T1.C1 = T1.C1

    Hybrid join is a costlier method. Outer join does

    not use hybrid join. So If hybrid join is used by

    DB2, convert inner join to outer join and add extra

    predicates to removes unneeded rows.

  • 8/14/2019 DB2 SQL TUNING.ppt

    21/53

    Updating catalog tables

    Access path based on catalog column values

    Catalog tables could be updated manually or

    running RUNSTATS with appropriate options Tables which are frequently changed, access

    method on them may suffer as statistics are not

    reflected

    Running RUNSTATS is a costly process. So

    catalog statistics manually should be updated.

    It is necessary to rebind the static SQLs after

    catalog statistics update

  • 8/14/2019 DB2 SQL TUNING.ppt

    22/53

    DB2 EXPLAIN AND TUNING

    EXPLAIN is a monitoring tool that produces

    information about a plan, package, or SQL

    statement when it is bound. The output appears ina user-supplied table called PLAN_TABLE

    It helps you to do the following

    Design databases, indexes, and application

    programs

    Determine when to rebind an application

    Determine the access path chosen for a query

  • 8/14/2019 DB2 SQL TUNING.ppt

    23/53

    DB2 EXPLAIN OUTPUT

    Explain output is stored in PLAN_TABLE

    Each plan is identified by APPLNAME column

    Each package is identified by PROGNAME,COLLID and VERSION

    In each package, you might have multiple SQLs

    and each is identified by QUERYNO

    For each query will be evaluated in multiple stages

    and each stage is identified by QBLOCKNO and

    PLANNO

  • 8/14/2019 DB2 SQL TUNING.ppt

    24/53

    Type of Access

    Tablespace Scan (ACCESSTYPE = R)

    Index scan

    Index scan can categorized into

    Index Only Access (INDEXONLY = Y)

    Multiple index Scan

    (ACCESSTYPE=M,MI,MU,MX) Matching index scan (MATCHCOLS > 0)

    Non-Matching index scan ( MATCHCOLS = 0)

    One fetch access (ACCESSTYPE= I1)

  • 8/14/2019 DB2 SQL TUNING.ppt

    25/53

    Tablespace scan (ACCESSTYPE=R)

    A matching index scan is not possible because an

    index is not available, or there are no predicates to

    match the index columns. high percentage of the rows in the table is

    returned. In this case an index is not really useful,

    because most rows need to be read anyway.

    The indexes that have matching predicates have

    low cluster ratios and are therefore efficient only

    for small amounts of data.

    Sequential prefetch is used (PREFETCH=S)

  • 8/14/2019 DB2 SQL TUNING.ppt

    26/53

    Using Index

    Index to be defined should be solely based on how

    does application fetch data

    Proper definition of index will avoid sort You need to trade cost of defining index and

    performance.

  • 8/14/2019 DB2 SQL TUNING.ppt

    27/53

    Matching Index Scan

    Match index scan provide filtering

    This is possible if predicates are specified on

    either the leading or all of the index key columns If degree of filtering is high. Matching index scan

    is efficient

    MATCHCOLS will provide the number of

    matching columns

    If there are more than one index, DB2 will use IX

    with most restrictive filtering for matching index

    scan

  • 8/14/2019 DB2 SQL TUNING.ppt

    28/53

    Non-Matching IX scan

    This is also called Index Screening

    DB2 select index screening when predicates are

    specified on index key columns but are not part ofthe matching columns

    Index screening predicates improve the index

    access by reducing the number of rows that

    qualify while searching the index

    MATCHCOLS = 0 and ACCESSTYPE = I

  • 8/14/2019 DB2 SQL TUNING.ppt

    29/53

    IN LIST Index Scan

    An IN-list index scan is a special case of the

    matching index scan, in which a single indexable

    IN predicate is used as a matching equal predicate. PLAN TABLE shows MATCHCOLS > 0 and

    ACCESSTYPE = N

  • 8/14/2019 DB2 SQL TUNING.ppt

    30/53

    Multiple Index Scan

    Multiple index access uses more than one index to

    access a table

    It is a good access path when No single indexprovides efficient access OR A combination of

    index accesses provides efficient access.

    LIST Sequential prefetch is used as RIDs are

    collected from each index scan

    ACCESSTYPE = M,MI,MU,MX and

    PREFETCH = L

    Same index also may be scanned more than ones

  • 8/14/2019 DB2 SQL TUNING.ppt

    31/53

    One Fetch Access

    One-fetch index access requires retrieving only

    one row. It is the best possible access path if

    available.One-fetch index access is a possible when :

    There is only one table in the query.

    The column function is either MIN or MAX and There is

    an ascending index column for MIN, and a descendingindex column for MAX.

    Either no predicate or all predicates are matching

    predicates for the index. And There is no GROUP BY.

  • 8/14/2019 DB2 SQL TUNING.ppt

    32/53

    Index Only access

    With index-only access, the access path does not

    require any data pages because the access

    information is available in the index Because the index is almost always smaller than

    the table itself, an index-only access path usually

    processes the data efficiently.

    ACCESSTYPE = I AND INDEXONLY = Y

  • 8/14/2019 DB2 SQL TUNING.ppt

    33/53

    JOIN

    A join operation retrieves rows from more than

    one table and combines them. The operation

    specifies at least two tables, but they need not bedistinct.

    Application joins are called inner join, left outer

    join, right outer join and full outer join

    DB2 internally uses three types of join method -Nested loop join, Merge Scan Join and Hybrid

    Join

    Hybrid join is not used for OUTER join.

  • 8/14/2019 DB2 SQL TUNING.ppt

    34/53

    Nested Loop Join (METHOD =1)

    Initially two tables are picked out and one is used

    as inner table and other one as composite table.

    For each row in outer table, inner table is scannedfor matching rows in inner(New) table. And the

    composite table is prepared. This composite table

    is used as outer table at the next stage.

    Process continues until and unless all the tableshave been selected.

    Composite table is sorted when order of join

    columns on both the table are not same.

  • 8/14/2019 DB2 SQL TUNING.ppt

    35/53

    Nested Loop Join (Method = 1)

    Nested loop join is efficient when

    Outer table is small. Predicates with small filter

    factor reduces no of qualifying rows in outer table. The number of data pages accessed in inner table

    is also small.

    Highly clustered index available on join columns

    of the inner table.

    This join method is efficient when filtering for

    both the tables(Outer and inner) is high.

  • 8/14/2019 DB2 SQL TUNING.ppt

    36/53

    Merge Scan Join (Method = 2)

    DB2 scans both the tables in order of join column

    If there is no efficient index to provide the order,

    DB2 might sort the either or both the tables. DB2 reads a row from the outer table and keep on

    reading the inner table as long as a match is there.

    When there is no match, DB2 reads another row

    from outer table.

    If outer table has a new value, DB2 searches ahead

    in the inner table.

  • 8/14/2019 DB2 SQL TUNING.ppt

    37/53

    Merge Scan Join (Method = 2)

    Merge scan is used when :

    Qualifying rows of inner and outer tables are large

    and join predicates also does not provide muchfiltering

    Tables are large and have no indexes with

    matching columns

  • 8/14/2019 DB2 SQL TUNING.ppt

    38/53

    Hybrid Join(Method=4)

    It is only used for Inner Join and requires an index

    on the join column of inner table.

    Join the outer table with RIDs from the index onthe inner table. Index of the inner table is scanned

    for each row in outer table.

    Sort the data on RID orders and retrieve the data

    from inner table using list prefetch

    Concatenates data from inner table to form the

    resultant table.

  • 8/14/2019 DB2 SQL TUNING.ppt

    39/53

    Hybrid Join(Method=4)

    Hybrid join is used often when a non-clustered

    index available on join column of the inner table

    and there are duplicate qualifying rows on outertable.

    Hybrid join handles are duplicates in the outer

    table as inner table is scanned only ones for each

    set of duplicate values. Prefetch method is LIST SEQUENTIAL

  • 8/14/2019 DB2 SQL TUNING.ppt

    40/53

  • 8/14/2019 DB2 SQL TUNING.ppt

    41/53

  • 8/14/2019 DB2 SQL TUNING.ppt

    42/53

    Sequential Detection

    If DB2 does not choose prefetch at bind time, it

    can sometimes do that at execution time. The

    method is called sequential detection. If a table is accessed repeatedly using the same

    statement (SQL in a do-while loop), the data or

    index leaf pages of the table can be accessed

    sequentially. DB2 can use this technique if it did not choose

    sequential prefetch at bind time because of an

    inaccurate estimate of the no of pages to be

  • 8/14/2019 DB2 SQL TUNING.ppt

    43/53

    Sorting of data

    Sort can happen on a new table or on the

    composite table

    Sort is required by ORDER BY or GROUP BYclause. (SORTC_GROUPBY/SORTC_ORDERBY = Y).

    Sort is required to remove duplicates while

    DISTINCT or UNION is used. (SORTC_UNIQ=Y)

    During Nested loop and Hybrid join, composite

    table is sorted and Merge scan join, both of the

    tables might be sorted to make join efficient.(SORTN_JOIN/SORTC_JOIN=Y)

  • 8/14/2019 DB2 SQL TUNING.ppt

    44/53

    Sorting of data

    Sort is need for subquery processing. Result of the

    subquery is sorted and put into the work file for

    later reference by parent query. DB2 sorts RIDs into ascending page number order

    in order to perform list prefetch. This sort is very

    fast and is done totally in memory

    If sort is required during CURSOR processing, itis done during OPEN CURSOR. Once cursor is

    closed and opened, sort is to be performed again.

  • 8/14/2019 DB2 SQL TUNING.ppt

    45/53

    View Merge

    If query is using view, view name ultimately will

    be resolved to table name. This process is called

    view merge. the statement that references the view is combined

    with the subselect that defined the view. This

    combination creates a logically equivalent

    statement. This equivalent statement is executedagainst the database

  • 8/14/2019 DB2 SQL TUNING.ppt

    46/53

    View Materialization

    Views can not be merged if view definition

    involves column functions. In that case view

    materialization is required. Done in two stages :

    1) The view's defining subselect is executed against

    the database and the results are placed in a

    temporary copy of a result table.

    2) The view's referencing statement is then executed

    against the temporary copy of the result table to

    obtain the intended result

  • 8/14/2019 DB2 SQL TUNING.ppt

    47/53

    Query Parallelism

    When DB2 plans to access data from a table or

    index in a partitioned table space, it can initiate

    multiple parallel operations to reduce the response

    time for data or processor-intensive queries.

    Two types of parallelism -

    1) Query I/O parallelism - Manages concurrent I/O

    request for a single query.

    2) Query CP parallelism - Enables multitasking

    within a single query. Query is broken into parts

    and processed.

  • 8/14/2019 DB2 SQL TUNING.ppt

    48/53

    Query Parallelism

    Parallel processing is enabled using

    DEGREE(ANY) on BIND and REBIND for static

    SET CURRENT DEGREE = ANY for dynamic The virtual buffer pool parallel sequential

    threshold (VPPSEQT) value must be large enough

    to provide adequate buffer pool space for parallel

    processing

    Degree = 1 disables parallel processing

  • 8/14/2019 DB2 SQL TUNING.ppt

    49/53

  • 8/14/2019 DB2 SQL TUNING.ppt

    50/53

    Tools

    Tools for Performance Analysis

    - DB2 PM

    - CANDLES OMEGAMON- SMF/RMF Data

    - DB2 TRACE

    Tools for Access Path Analysis- DB2 EXPLAIN

    - VISUAL EXPLAIN

  • 8/14/2019 DB2 SQL TUNING.ppt

    51/53

    Risks

    There is no GOLDEN RULE for DB2 SQL tuning

    Wrong Analysis of performance Data and access

    method information may led to more performanceoverhead

    While tuning SQL in test environment, the person

    should keep in mind that amount of data and DB2

    sub-system setup are not same.

    Person with good knowledge of DB2 should be

    involved with tuning activity.

  • 8/14/2019 DB2 SQL TUNING.ppt

    52/53

    BP Buffer PoolCP Central Processor

    CPC Central Processing CageDASD Direct Access Storage Device

    DB2 PM DB2 Performance Monitor

    DBD Database Descriptor DS Dataset

    IX Index

    LDS Linear VSAM DatasetRI Referential Integrity

    RID Row Identifier

    RMF Resource Monitoring facilitySMF System Monitoring Facility

    SQL Structured Query Language

    T TableTS Tablespace

    VSAM Virtual Storage Access Method

    GLOSSARY

  • 8/14/2019 DB2 SQL TUNING.ppt

    53/53