knowing when to rebuild indexes

Upload: tan-min-choon

Post on 08-Apr-2018

221 views

Category:

Documents


0 download

TRANSCRIPT

  • 8/7/2019 Knowing When to Rebuild Indexes

    1/10

    Knowing when to Rebuild IndexesBrian PeaslandDatabase AdministratorEROS Data Center3 February 2000AbstractIndexes provide a fast and efficient method of retrieving selected data from a table. By pointing to theblocks that contain the selected data, the entire table does not have to read in order to extract therequired information. Most indexes in Oracle databases are built using the B-tree data structure. Contraryto some widely accepted beliefs, indexes in Oracle are not self-balancing. After a table experiences alarge number of inserts, updates, and deletes, the index can become unbalanced and fragmented andcan hinder query performance. Knowing when to rebuild the indexes is a topic of some confusion. Thispaper hopes to shine some light on the subject.Where is the index now?In order to understand what we must do with the index, we must first get an idea of the current state ofthe index. This can be accomplished by using the ANALYZE INDEX VALIDATE STRUCTURE command.Normally, the ANALYZE INDEX command creates either computed or estimated statistics for the indexthat can be seen in the DBA_INDEXES view. Analyzing the index changes the optimizers executionplans for queries that potentially use that index. This action may produce unintentional side effects,especially if the index has not previously been analyzed. The VALIDATE STRUCTURE command can besafely executed without affecting the optimizer. The VALIDATE STRUCTURE command populates theSYS.INDEX_STATS table only. The SYS.INDEX_STATS table can be accessed with the public synonymINDEX_STATS. The INDEX_STATS table will only hold validation information for one index at a time.You will need to query this table before validating the structure of the next index.Below is an example of ANALYZE INDEX VALIDATE STRUCTURE and sample output from

    INDEX_STATS:

    SQLWKS> ANALYZE INDEX shopping_basket_pk VALIDATE STRUCTURE;Statement processed.SQLWKS> SELECTname,height,lf_rows,lf_blks,del_lf_rows,distinct_keys,used_space

    2> FROM INDEX_STATS;NAME HEIGHT LF_ROWS LF_BLKS DEL_LF_ROWDISTINCT_K USED_SPACE------------------------- --------- ---------- ---------- ---------- ---------- ----------SHOPPING_BASKET_PK 2 1 3 11 651 row selected.

    I have the information, now what?There are two rules of thumb to help determine if the index needs to be rebuilt. If it is determined that theindex needs to be rebuilt, this can easily be accomplished by the ALTER INDEX REBUILD command.Although not necessarily recommended, this command could be executed during normal operating hours.Rebuilding the index uses the existing index as a basis. The alternative is to drop and re-create the index.Creating an index uses the base table as its data source that needs to put a lock on the table. The indexis also unavailable during creation.

  • 8/7/2019 Knowing When to Rebuild Indexes

    2/10

    First rule of thumb is if the index has height greater than four, rebuild the index. For most indexes, theheight of the index will be quite low, i.e. one or two. I have seen an index on a 3 million-row table that hadheight three. An index with height greater than four may need to be rebuilt as this might indicate a skewedtree structure. This can lead to unnecessary database block reads of the index. It is helpful to know thedata structure for the table and index. Most times, the index height should be two or less, but there areexceptions.The second rule of thumb is that the deleted leaf rows should be less than 20% of the total number of leafrows. An excessive number of deleted leaf rows indicates that a high number of deletes or updates haveoccurred to the index column(s). The index should be rebuilt to better balance the tree. TheINDEX_STATS table can be queried to determine if there are excessive deleted leaf rows in relation tothe total number of leaf rows. Lets look at an example:SQLWKS> ANALYZE INDEX item_basket_pk VALIDATE STRUCTURE;Statement processed.SQLWKS> SELECT name,height,lf_rows,del_lf_rows,(del_lf_rows/lf_rows)*100 asratio

    2> FROM INDEX_STATS;NAME HEIGHT LF_ROWS DEL_LF_ROW RATIO

    ------------------------------ ---------- ---------- ---------- ----------ITEM_BASKET_PK 1 235 74 31.48936171 row selected.

    In this example, the ratio of deleted leaf rows to total leaf rows is clearly above 20%. This is a goodcandidate for rebuilding. Lets rebuild the index and examine the results.SQLWKS> ALTER INDEX item_basket_pk REBUILD;Statement processed.SQLWKS> ANALYZE INDEX item_basket_pk VALIDATE STRUCTURE;Statement processed.SQLWKS> SELECT name,height,lf_rows,del_lf_rows,(del_lf_rows/lf_rows)*100 asratio

    2> FROM INDEX_STATS;NAME HEIGHT LF_ROWS DEL_LF_ROW RATIO------------------------------ ---------- ---------- ---------- ----------

    ITEM_BASKET_PK 1 161 0 01 row selected.

    The index is rebuilt and validated once again. Examining the INDEX_STATS table shows that the 74deleted leaf rows were dropped from the index. Notice that the total number of leaf rows went from 235 to161, which is a difference of 74 leaf rows. This index should provide better performance for theapplication.A Case StudyTo illustrate the ideas in this paper, a test case was created and various queries were executed against alarge table before and after rebuilding the primary key index. It should be noted that this test case wascreated specifically for this paper and results in the field may vary.The test table only contains two columns, ID and NAME. The table is described below:REQ SQL> desc analyze;Name Null? Type------------------------------- -------- ----ID NUMBERNAME VARCHAR2(20)

  • 8/7/2019 Knowing When to Rebuild Indexes

    3/10

    The test table has a primary key, ANALYZE_PK on the ID field. The test table was populated with 1million rows of data. After populating the table, over 270,000 rows were selectively deleted from the table.This method ensured that the primary key index would have a sufficient number of deleted leaf rows.After populating and deleting rows from the table, the index was validated as below:REQ SQL> analyze index analyze_pk validate structure;

    Index analyzed.

    REQ SQL> select name,height,lf_rows,del_lf_rows,del_lf_rows/lf_rows fromindex_stats;

    NAME HEIGHT LF_ROWS DEL_LF_ROWSDEL_LF_ROWS/LF_ROWS------------------------------ ---------- ---------- ----------- -------------------ANALYZE_PK 3 990206277353 .280096263

    It is now clear that the deleted leaf rows comprise approximately 28% of the total leaf rows.Four different queries were executed against the table. All of these queries will utilize the primary keyindex as verified by the explain plan. These queries are executed against the table with the current index.The index is then rebuilt. These queries are re-executed and the results are compared. The four queriesare:1. select count(*) from analyze where id between 250000 and 600000;2. select count(*) from analyze where id in (50000,4321,698754);3. select count(*) from analyze where id > 500000;4. select count(*) from analyze where id = 4321;

    With the current index, these three queries are executed against the test table. SQL Trace is enabled andthe queries are shown in the following excerpt from TKPROF:********************************************************************************

    select count(*)fromanalyze where id between 250000 and 600000

    call count cpu elapsed disk query currentrows------- ------ -------- ---------- ---------- ---------- ---------- ----------Parse 1 0.00 0.00 0 0 00

    Execute 1 0.00 0.00 0 0 00Fetch 2 0.47 0.47 0 293 01------- ------ -------- ---------- ---------- ---------- ---------- ----------total 4 0.47 0.47 0 293 01

    Misses in library cache during parse: 0

  • 8/7/2019 Knowing When to Rebuild Indexes

    4/10

    Optimizer goal: CHOOSEParsing user id: 5 (SYSTEM)

    Rows Execution Plan------- ---------------------------------------------------

    0 SELECT STATEMENT GOAL: CHOOSE0 SORT (AGGREGATE)

    138614 INDEX (RANGE SCAN) OF 'ANALYZE_PK' (UNIQUE)

    ********************************************************************************

    select count(*)fromanalyze where id in (50000,4321,698754)

    call count cpu elapsed disk query currentrows------- ------ -------- ---------- ---------- ---------- ---------- -------

    ---Parse 1 0.00 0.01 0 0 00Execute 1 0.00 0.00 0 0 00Fetch 2 0.00 0.00 0 9 01------- ------ -------- ---------- ---------- ---------- ---------- ----------total 4 0.00 0.01 0 9 01

    Misses in library cache during parse: 0Optimizer goal: CHOOSEParsing user id: 5 (SYSTEM)

    Rows Execution Plan------- ---------------------------------------------------

    0 SELECT STATEMENT GOAL: CHOOSE0 SORT (AGGREGATE)0 CONCATENATION1 INDEX (UNIQUE SCAN) OF 'ANALYZE_PK' (UNIQUE)1 INDEX (UNIQUE SCAN) OF 'ANALYZE_PK' (UNIQUE)1 INDEX (UNIQUE SCAN) OF 'ANALYZE_PK' (UNIQUE)

    ********************************************************************************

    select count(*)fromanalyze where id > 500000

    call count cpu elapsed disk query currentrows------- ------ -------- ---------- ---------- ---------- ---------- ----------

  • 8/7/2019 Knowing When to Rebuild Indexes

    5/10

    Parse 1 0.00 0.00 0 0 00Execute 1 0.00 0.00 0 0 00Fetch 2 1.11 1.11 0 1611 01------- ------ -------- ---------- ---------- ---------- ---------- ----------total 4 1.11 1.11 0 1611 01

    Misses in library cache during parse: 1Optimizer goal: CHOOSEParsing user id: 5 (SYSTEM)

    Rows Execution Plan------- ---------------------------------------------------

    0 SELECT STATEMENT GOAL: CHOOSE0 SORT (AGGREGATE)

    445545 INDEX (RANGE SCAN) OF 'ANALYZE_PK' (UNIQUE)

    ********************************************************************************

    select count(*)fromanalyze where id = 4321

    call count cpu elapsed disk query currentrows------- ------ -------- ---------- ---------- ---------- ---------- ----------Parse 1 0.00 0.00 0 0 00Execute 1 0.00 0.00 0 0 00Fetch 2 0.00 0.00 0 3 01------- ------ -------- ---------- ---------- ---------- ---------- ----------total 4 0.00 0.00 0 3 01

    Misses in library cache during parse: 1Optimizer goal: CHOOSEParsing user id: 5 (SYSTEM)

    Rows Execution Plan------- ---------------------------------------------------

    0 SELECT STATEMENT GOAL: CHOOSE0 SORT (AGGREGATE)1 INDEX (UNIQUE SCAN) OF 'ANALYZE_PK' (UNIQUE)

    ********************************************************************************

  • 8/7/2019 Knowing When to Rebuild Indexes

    6/10

    After these results have been obtained, the index is rebuilt. The index is then validated and the results areshow below:REQ SQL> alter index analyze_pk rebuild;

    Index altered.

    REQ SQL> analyze index analyze_pk validate structure;

    Index analyzed.

    REQ SQL> select name,height,lf_rows,del_lf_rows,del_lf_rows/lf_rows fromindex_stats;

    NAME HEIGHT LF_ROWS DEL_LF_ROWSDEL_LF_ROWS/LF_ROWS------------------------------ ---------- ---------- ----------- -------------------ANALYZE_PK 3 712853 0

    0

    It is clear that the deleted leaf rows have been removed from the index. The four queries are run againstthe table once again. Again, SQL Trace is enabled and the queries are shown in the following excerptfrom TKPROF:********************************************************************************

    select count(*)fromanalyze where id between 250000 and 600000

    call count cpu elapsed disk query currentrows------- ------ -------- ---------- ---------- ---------- ---------- ----------Parse 1 0.00 0.00 0 0 00Execute 1 0.00 0.00 0 0 00Fetch 2 0.23 0.23 0 679 01------- ------ -------- ---------- ---------- ---------- ---------- ----------

    total 4 0.23 0.23 0 679 01

    Misses in library cache during parse: 0Optimizer goal: CHOOSEParsing user id: 5 (SYSTEM)

    Rows Execution Plan------- ---------------------------------------------------

    0 SELECT STATEMENT GOAL: CHOOSE

  • 8/7/2019 Knowing When to Rebuild Indexes

    7/10

    0 SORT (AGGREGATE)138614 INDEX (RANGE SCAN) OF 'ANALYZE_PK' (UNIQUE)

    ********************************************************************************

    select count(*)fromanalyze where id in (50000,4321,698754)

    call count cpu elapsed disk query currentrows------- ------ -------- ---------- ---------- ---------- ---------- ----------Parse 1 0.00 0.00 0 0 00Execute 1 0.00 0.00 0 0 00Fetch 2 0.00 0.00 0 9 0

    1------- ------ -------- ---------- ---------- ---------- ---------- ----------total 4 0.00 0.00 0 9 01

    Misses in library cache during parse: 0Optimizer goal: CHOOSEParsing user id: 5 (SYSTEM)

    Rows Execution Plan------- ---------------------------------------------------

    0 SELECT STATEMENT GOAL: CHOOSE0 SORT (AGGREGATE)0 CONCATENATION1 INDEX (UNIQUE SCAN) OF 'ANALYZE_PK' (UNIQUE)1 INDEX (UNIQUE SCAN) OF 'ANALYZE_PK' (UNIQUE)1 INDEX (UNIQUE SCAN) OF 'ANALYZE_PK' (UNIQUE)

    ********************************************************************************

    select count(*)fromanalyze where id > 500000

    call count cpu elapsed disk query currentrows------- ------ -------- ---------- ---------- ---------- ---------- ----------Parse 1 0.00 0.00 0 0 00Execute 1 0.00 0.00 0 0 00Fetch 2 0.96 0.96 0 933 01

  • 8/7/2019 Knowing When to Rebuild Indexes

    8/10

    ------- ------ -------- ---------- ---------- ---------- ---------- ----------total 4 0.96 0.96 0 933 01

    Misses in library cache during parse: 0Optimizer goal: CHOOSEParsing user id: 5 (SYSTEM)

    Rows Execution Plan------- ---------------------------------------------------

    0 SELECT STATEMENT GOAL: CHOOSE0 SORT (AGGREGATE)

    445545 INDEX (RANGE SCAN) OF 'ANALYZE_PK' (UNIQUE)

    ********************************************************************************

    select count(*)from

    analyze where id = 4321

    call count cpu elapsed disk query currentrows------- ------ -------- ---------- ---------- ---------- ---------- ----------Parse 1 0.00 0.00 0 0 00Execute 1 0.00 0.00 0 0 00Fetch 2 0.00 0.00 0 3 01------- ------ -------- ---------- ---------- ---------- ---------- ----------total 4 0.00 0.00 0 3 01

    Misses in library cache during parse: 0Optimizer goal: CHOOSEParsing user id: 5 (SYSTEM)

    Rows Execution Plan------- ---------------------------------------------------

    0 SELECT STATEMENT GOAL: CHOOSE0 SORT (AGGREGATE)1 INDEX (UNIQUE SCAN) OF 'ANALYZE_PK' (UNIQUE)

    The same queries have now been executed against the test table before and after rebuilding the index.The results are summarized in the following table:

    Before index rebuild After index rebuild % ChangeCPU Elapsed CPU Elapsed CPU Elapsed

  • 8/7/2019 Knowing When to Rebuild Indexes

    9/10

    Query 1 0.47 0.47 0.23 0.23 51% 51%Query 2 0.00 0.01 0.00 0.00 0% 0%Query 3 1.11 1.11 0.96 0.96 13.5% 13.5%Query 4 0.00 0.00 0.00 0.00 0% 0%The results in the above table show that queries 1 and 3 experienced up to 51% improvement inexecution time. It should be noted that the queries were executed at least once prior to tracing the queriesso that the data would reside in the buffer cache and first time executions of the queries would not skewthe results. This fact explains why the CPU and elapsed times for a query are nearly identical. The datawas read from the buffer cache. Queries 2 and 4 took minimal time to execute. These queries searchedfor specific values. Since the data was cached, the CPU and Elapsed times are minimal. Queries 1 and 3took significantly longer. These queries returned a range of values. This paper does not attempt togenerate any findings on the improvement of different types of queries when rebuilding an index. Rather itgives a broader statement that query execution times will improve after rebuilding substandard indexes.Script to find indexes to rebuildBelow is a sample script that can be run to determine which indexes need to be rebuilt. For those indexesthat need to be rebuilt, the ALTER INDEX REBUILD command is dynamically generated as output. Theuser can tailor the height and percentage of deleted leaf rows by altering the vMaxHeight and vMaxDelvariables. The output of this script can be spooled to a file. This file can then be run to rebuild the indexes. -- validate_idx.sql-- by Brian Peasland-- Database Administrator-- Raytheon / EROS Data Center-- 3 February 2000--

    -- This script will check indexes to find candidates for rebuilding.-- Run this script in SQL*Plus as a user with SELECT ANY TABLE-- privileges.---- This script can be used and modified without permission. Run this-- script at your own risk! The script author is not responsible for-- any problems that may arise from running this script.

    set serveroutput on size 100000

    DECLAREvOwner dba_indexes.owner%TYPE; /* Index Owner */vIdxName dba_indexes.index_name%TYPE; /* Index Name */

    vAnalyze VARCHAR2(100); /* String of Analyze Stmt */vCursor NUMBER; /* DBMS_SQL cursor */vNumRows INTEGER; /* DBMS_SQL return rows */vHeight index_stats.height%TYPE; /* Height of index tree */vLfRows index_stats.lf_rows%TYPE; /* Index Leaf Rows */vDLfRows index_stats.del_lf_rows%TYPE; /* Deleted Leaf Rows */vDLfPerc NUMBER; /* Del lf Percentage */vMaxHeight NUMBER; /* Max tree height */vMaxDel NUMBER; /* Max del lf percentage */CURSOR cGetIdx IS SELECT owner,index_name

  • 8/7/2019 Knowing When to Rebuild Indexes

    10/10

    FROM dba_indexes WHERE OWNER NOT LIKE 'SYS%';BEGIN

    /* Define maximums. This section can be customized. */vMaxHeight := 3;vMaxDel := 20;

    /* For every index, validate structure */OPEN cGetIdx;LOOP

    FETCH cGetIdx INTO vOwner,vIdxName;EXIT WHEN cGetIdx%NOTFOUND;/* Open DBMS_SQL cursor */vCursor := DBMS_SQL.OPEN_CURSOR;/* Set up dynamic string to validate structure */vAnalyze := 'ANALYZE INDEX ' || vOwner || '.' || vIdxName || ' VALIDATE

    STRUCTURE';DBMS_SQL.PARSE(vCursor,vAnalyze,DBMS_SQL.V7);vNumRows := DBMS_SQL.EXECUTE(vCursor);/* Close DBMS_SQL cursor */DBMS_SQL.CLOSE_CURSOR(vCursor);

    /* Does index need rebuilding? *//* If so, then generate command */SELECT height,lf_rows,del_lf_rows INTO vHeight,vLfRows,vDLfRows

    FROM INDEX_STATS;IF vDLfRows = 0 THEN /* handle case where div by zero */

    vDLfPerc := 0;ELSE

    vDLfPerc := (vDLfRows / vLfRows) * 100;END IF;IF (vHeight > vMaxHeight) OR (vDLfPerc > vMaxDel) THEN

    DBMS_OUTPUT.PUT_LINE('ALTER INDEX ' || vOwner || '.' || vIdxName || 'REBUILD;');

    END IF;

    END LOOP;CLOSE cGetIdx;

    END;/

    SummaryOne of the areas lacking in Oracle documentation pertains to rebuilding an index. Oracles ANALYZEINDEX VALIDATE STRUCTURE command provides a nice way to check an index to see if it is acandidate for rebuilding. This command does not affect the Oracle optimizers execution plan for queriesthat may use the index. The results in the INDEX_STATS are checked after issuing the VALIDATESTRUCTURE command. If an index has excessive height (greater than four) or a high number of deletedleaf rows (over 20% of the total), we rebuild the index.A test case was studied to show the potential for improvements after rebuilding an index. In one example,the query executed 51% faster after rebuilding the index. While this example is specific to the test case, itdoes show the potential for improvement after rebuilding an index.