Microsoft PowerPoint - STATENC.ppt [Read-Only]

Download Microsoft PowerPoint - STATENC.ppt [Read-Only]

Post on 27-Jan-2015

107 views

Category:

Documents

2 download

Embed Size (px)

DESCRIPTION

 

TRANSCRIPT

  • 1. State of North Carolina DB2 for z/OS Utilities Topics August 27th, 2008Craig Friske, IBM, friske@us.ibm.com1

2. Agenda Usability with Listdef and Templates COPY/RECOVER/QUIESCE/MODIFY REORG RTS RUNSTATS LOAD Parallel index build 2008 IBM Corporation We will cover the best practices, including keyword specifications recommended for those utilities that are most significant within the utilities suite because they are used often or may consume large amounts of resources. For several of these utilities, building indexes is an important part of processing, so we will cover some of the details of that process to see how is may affect space needs and performance. Sorting is a critical part of building indexes, and its used at various times for many for other phases of utilities as well. DFSORT is always used as of DB2 V8, so we will look at some of the characteristics and recommendations for DFSORT. This involves using dynamic allocation of sort work data sets and why you should let DB2 manage those data sets when possible. Along the way there will be a few samples to better understand the concepts.2 3. Wildcarding/Allocation Overview Improved utility processing for groups of objectsAvoid job stream changes when application changesSupports table spaces, index spaces, or both Generate object listsLISTDEF statement to generate a list of objects wildcarding (e.g. dbname*) explicit lists INCLUDE or EXCLUDE objects Dynamically allocate datasetsTEMPLATE statement to define datasets intelligent dataset sizing variables (Job, Utility, Object, Date, Time, ... etc.) Control StatementsOPTIONS PREVIEW, EVENT (halt, skip, RC) 2008 IBM Corporation 3 4. Wildcarding using LISTDEF List Example: Process objects x, y, and z in database DBA With V6 QUIESCE TABLESPACE DBA.X TABLESPACE DBA.YTABLESPACE DBA.Z COPY TABLESPACE DBA.X COPYDDN (C1,C2) RECOVERYDDN(C3,C4) TABLESPACE DBA.Y COPYDDN (..,..) RECOVERYDDN(..,..) TABLESPACE DBA.Z COPYDDN (..,..) RECOVERYDDN(..,..) With V7 LISTDEF X INCLUDE TABLESPACE DBA.*EXCLUDE TABLESPACE DBA.ATEMPLATE T1 DSNAME((&DB..&TS..&LOCREM.&PRIBAC..D&JDATE.) QUIESCE LIST XCOPY LIST X COPYDDN(T1,T1) RECOVERYDDN(T1,T1) 2008 IBM Corporation 4 5. Dynamic Allocation TemplatesTemplate example: Allocate image copy datasets (2 local, 2 remote)V6 - 4 DD statements for each object //C1 DD DSN=DBA.X.LP.D20020409.,UNIT=..,VOL=SER=..,DISP= //C2 DD DSN=DBA.X.LB.D20020409.,UNIT=..,VOL=SER=..,DISP= //C3 DD DSN=DBA.X.RP.D20020409.,UNIT=..,VOL=SER=..,DISP= //C4 DD DSN=DBA.X.RB.D20020409.,UNIT=..,VOL=SER=..,DISP= //SYSIN DD *V7 - a single template for local and remote//SYSIN DD *TEMPLATE A DSNAME(&DB..&TS..&LOCREM.&PRIBAC..D&DATE.)Generated datasets DSN=DBA.X.LP.D2002099 April 8th Local Primary for DBA.X DSN=DBA.X.LB.D2002099 DSN=DBA.X.RP.D2002099 DSN=DBA.X.RB.D2002099 2008 IBM Corporation5 6. Utilities supported Utility LISTDEF TEMPLATE CHECK DATANoYes CHECK INDEX Yes Yes CHECK LOB NoYes COPYYes Yes COPYTOCOPYYes Yes LOADNoYes MERGECOPY Yes Yes MODIFY RECOVERY Yes n/a MODIFY STATISTICS Yes n/a QUIESCE Yes n/a REBUILD INDEX Yes Yes RECOVER Yes n/a REORG INDEX Yes Yes REORG TABLESPACEYes Yes REPORTYes n/a RUNSTATSYes n/a UNLOADYes Yes 2008 IBM Corporation 6 7. Automatic Space ManagementSliding secondary allocation quantity sizeApplies to DB2 managed pagesets onlyTries to avoids VSAM maximum extent limit errorsCan reach maximum dataset size before running out of extentsUses cylinder allocation Default PRIQTY 1 cylinder for non-LOB tablespaces and indexes 10 cylinders for LOB tablespaces Improved SQL SELECT and INSERT performance 2x improvement relative to track allocationCan be used for New pagesets: No need for PRIQTY/SECQTY values Existing pagesets: SQL ALTER PRIQTY/SECQTY values to -1 plus schedule a REORG 2008 IBM Corporation7 8. COPY Best Practices COPY PARALLEL keyword provides parallelism for lists of objects (including partitions) CHECKPAGE YES (default in DB2 9, no longer sets copy pending, look for RC=8!) Maximize other utilities access to objects while copying a list with SHRLEVEL CHANGE and OPTIONS EVENT(ITEMERROR,SKIP) Keeps objects in the list in UTRW state *only* as each object is being copied instead of for the duration of the COPY utility UTRW utility allows read/write access by applications, but no access for exclusive utilities Incremental copy rule-of-thumb: Consider using incremental image copy if 5% in data sharing, >10% in non-data sharing REORGINSERTS (number of records inserted since the last Reorg)/TOTALROWS > 25% REORGDELETES (number of records deleted since the last Reorg)/TOTALROWS > 25% EXTENTS (number of extents) > 254 REORGDISORGLOB (number of LOBs inserted since the last Reorg that are not perfectly chunked)/TOTALROWS > 50% SPACE > 2 * (DATASIZE / 1024) (when free space is more than used space) REORGMASSDELETE > 0 (mass deletes on seg tsp and DROP on multi-table tsps) RUNSTATS PERCDROP > 10% SYSIBM.SYSLOBSTATS.ORGRATIO < 50% (changed to a value 0-100 in PQ96460 on V7/V8) (NEARINDREF + FARINDREF) / CARDF > 10% non-data-sharing, > 5% if data sharing FAROFFPOSF / CARDF > 10% Or, if index is a clustering index, CLUSTERRATIOF < 90% (irrelevant if predominantly randomaccess) Other Tsp is in REORP or adv reorg pending status (AREO*) as result of an ALTER TABLE stmnt Index on the tsp is in adv REBUILD pend state (ARBDP) as result an ALTER stmnt 2008 IBM Corporation25 26. Reorg table space V9 recommendations Consider running REORG TABLESPACE in the following situations: Real-time statistics (TABLESPACESTATS) REORGUNCLUSTINS (number of records inserted since the last Reorg that are not well- clustered)/TOTALROWS > 10% Irrelevant if predominantly random access REORGUNCLUSTINS is only an indication of the insert behavior and is correlated to the cluster ratioonly if there are no updates or deletes. To prevent DSNACCOR/X from triggering on these, identifysuch objects and put them in exception list (REORGNEARINDREF+REORGFARINDREF (number of overflow rows since the last Reorg))/TOTALROWS > 5% in data sharing, >10% in non-data sharing REORGINSERTS (number of records inserted since the last Reorg)/TOTALROWS > 25% REORGDELETES (number of records deleted since the last Reorg)/TOTALROWS > 25% EXTENTS (number of extents) > 254 REORGDISORGLOB (number of LOBs inserted since the last Reorg that are not perfectly chunked)/TOTALROWS > 50% SPACE > 2 * (DATASIZE / 1024) (when free space is more than used space) REORGMASSDELETE > 0 (mass deletes on seg tsp and DROP on multi-table tsps) RUNSTATS use RUNSTATS statistics as a trigger to consider running REORG Dont PERCDROP > 10% SYSIBM.SYSLOBSTATS.ORGRATIO < 50% (changed to a value 0-100 in PQ96460 on V7/V8) (NEARINDREF + FARINDREF) / CARDF > 10% non-data-sharing, > 5% if data sharing FAROFFPOSF / CARDF > 10% Or, if index is a clustering index, CLUSTERRATIOF < 90% (irrelevant if predominantly randomaccess) Other Tsp is in REORP or adv reorg pending status (AREO*) as result of an ALTER TABLE stmnt Index on the tsp is in adv REBUILD pend state (ARBDP) as result an ALTER stmnt 2008 IBM Corporation26 27. Reorg table space V9 recommendations Consider running REORG TABLESPACE in the following situations: Real-time statistics (TABLESPACESTATS) REORGUNCLUSTINS (number of records inserted since the last Reorg that arenot well-clustered)/TOTALROWS > 10% Irrelevant if predominantly random access REORGUNCLUSTINS is only an indication of the insert behavior and is correlated tothe cluster ratio only if there are no updates or deletes. To prevent DSNACCOR/Xfrom triggering on these, identify such objects and put them in exception list (REORGNEARINDREF+REORGFARINDREF (number of overflow rows sincethe last Reorg))/TOTALROWS > 5% in data sharing, >10% in non-data sharing REORGINSERTS (# of records inserted since the last Reorg)/TOTALROWS >25% REORGDELETES (# of records deleted since the last Reorg)/TOTALROWS >25% EXTENTS (number of extents) > 254 REORGDISORGLOB (number of LOBs inserted since the last Reorg that are notperfectly chunked)/TOTALROWS > 50% SPACE > 2 * (DATASIZE / 1024) (when free space is more than used space) REORGMASSDELETE > 0 (mass deletes on seg tsp and DROP on multi-tabletsps) Other Tsp is in REORP or adv reorg pending status (AREO*) as result of an ALTERTABLE stmnt Index on the tsp is in adv REBUILD pend state (ARBDP) as result an ALTERstmnt 2008 IBM Corporation 27 28. Reorg Index recommendationsConsider running REORG INDEX in the following cases: Real-time statistics (SYSINDEXSPACESTATS) REORGPSEUDODELETES (number of index entries pseudo-deletedsince the last Reorg)/TOTALENTRIES > 10% in non-data sharing, 5% ifdata sharing as pseudo-deleted entry can cause S-lock/unlock in Insertfor unique index REORGLEAFFAR (number of index leaf page splits since the last Reorgand the new leaf page far from the original leaf page)/NACTIVE > 10% REORGINSERTS ( number of index entries inserted since the lastReorg)/TOTALENTRIES > 25% REORGDELETES ( number of index entries inserted since the lastReorg)/TOTALENTRIES > 25% REORGAPPENDINSERT / TOTALENTRIES > 20% EXTENTS (number of extents) > 254 RUNSTATS LEAFFAR / NLEAF > 10% (NLEAF is a column in SYSIBM.SYSINDEXESand SYSIBM.SYSINDEXPART) PSEUDO_DEL_ENTRIES / CARDF > 10% for non-data sharing and > 5%for data sharing Other The index is in advisory REORG-pending status (AREO*) or advisory-REBUILD-pending status (ARBDP) as the result of an ALTER statement 2008 IBM Corporation 28 29. RUNSTATS Best PracticesRUNSTATS SHRLEVEL CHANGE for availability Collect only column stats on columns used in SQLpredicates Use the Statistics Advisor to detect which stats to collect SAMPLE reduces CPU time when gathering column stats KEYCARD provides valuable info for little processing cost(see next slide) 2008 IBM Corporation If not gathered with inline stats, RUNSTATS should be run with SHRLEVEL CHANGE for availability. For large amounts of data RUNSTATS TABLESPACE with column statistics is very CPU intensive. By specifying the SAMPLE keyword, CPU can be reduced while still gathering enough stats for the optimizer to choose a good access path. 29 30. RUNSTATS KEYCARD Collects all of the distinct values in all of the 1 to n key column combinations for the specified indexes. n is the number of columns in the index. For example, suppose that you have an index defined on three columns: A, B, and C. If you specify KEYCARD, RUNSTATS collects cardinality statistics for column A, column set A and B, and column set A, B, and C. So these are cardinality statisics across column sets... if we had a 3-column index that had these values:Col1 Col2 Col3ABCABDABEABEACAACAADABBB then these stats would be collected: Col1 cardinality = 2 Col1 and Col2 cardinality = 4 Col 1, Col2, and Col3 cardinality = 6 2008 IBM Corporation 30 31. When is RUNSTATS needed? When the data changes sufficiently to warrant new statistics REORG of table space or index (use inline stats!) LOAD REPLACE of table space (use inline stats!) After "significant" application changes for the tablespace or index Periodically (weekly, monthly) except for read onlydata? Application tracks updates with activity tables? After percentage of pages changed since lastRUNSTATS (RTS)? Understand implications for access paths! 2008 IBM Corporation 31 32. LOAD Best PracticesLOAD LOG NO reduces log volume; if REPLACE, then take inline copy KEEPDICTIONARY (track dictionary effectiveness with historystatistics PAGESAVE) - small performance impact if loading lots ofdata 254 partition limit for compressed table spaces can be lifted by DBA PK51853 shipped new ZPARM MAX_UTIL_PARTS (watch virtual storage) Load Partition Parallelism (V7) Not individual LOAD part level jobs Enable Parallel Access Volume (PAV) Index parallelism (SORTKEYS) Provide value for SORTKEYS when input is tape/PDS mbr or variable length SORTKEYS is the sum of ALL indexes (and foreign keys) on the table Remove SORTWKxx / UTPRINxx, and turn on UTSORTAL=YES 2008 IBM Corporation LOG NO is the best performance option as well as KEEPDICTIONARY. Dictionaries should not change much. Their effectiveness can be tracked via history statistics PAGESAVE.When loading data into compressed table spaces there is a limit of 254 partitions that can be processed. This limit was introduced because of the amount of virtual storage being used while building compression dictionaries for many partitions in parallel. With PK51853 (UK31488/UK31489) you can now set your own limit in ZPARM MAX_UTIL_PARTS. The ZPARM only changes the limit for the number of partitions, storage consumption was not changed with this PTF, so keep an eye on your virtual storage when loading table spaces with many compressed partitions.Prior to V7, jobs were separated by partition to improved elapsed time in loading large partitioned tables. When NPIs existed, these would be dropped and rebuilt in a separate step. However, now partitioned objects can be loaded in parallel by partition. If the table has NPIs, the keys are merged into a common sort so that KLOAD processing can be used on a single member, thus improving LOAD performance.With multiple indexes, SORTKEYS allows for the indexes to be built in parallel, thus improving elapsed time with some increase in CPU. Usually LOAD should be able to estimate the number of records contained in the input data set. For input on tape, in PDS members or when the input data set has variable length records with great variation of the record length this estimate is not possible. In those cases specify a SORTKEYS estimate which will be required to estimate the required sort work space. Remember that SORTKEYS is the sum for all keys/foreign keys on the table(s) not the number of input records. Foreign keys are only counted if they dont match an existing index or if there are more than 1 foreign key even if they match an existing index. So if youre loading into a table space with 1 table and 3 indexes then SORTKEYS needs to be set to three times the number of input records.For the SORT, although you can control the data sets used by specifying SORTWKxx, avoid doing that. By specifying SORTDEVT, DFSORT or DB2 will dynamically allocate what is needed. DFSORT cant span volumes with sort data sets, so if the amount of data is large, you may want so specify SORTNUM to ensure there are enough data sets available or turn on UTSORTAL=YES.32 33. LOAD Best Practices continued LOAD Inline COPY & Inline STATISTICS Use REUSE to logically reset and reuse DB2-managed data sets without deleting and redefining them (affects elapsed time) When using DISCARD, try to avoid having the input on tape Input is re-read to discard the errant records Avoid data conversion, use internal representation if possible Sort data in clustering order (unless data is randomly accessed via SQL) LOAD RESUME SHRLEVEL CHANGE instead of batch inserts LOAD REPLACE SHRLEVEL CHANGE can be achieved by loading into clone table a...

Recommended

View more >