GDT Tips and Tricks
GDT 2006 International User Conference: Evolving the Legacy – Revolutions
June 25 - 28, Palm Springs, California
Doug Evans
File Handling
  The Indexed File Structure
  The BINARY Tree
    What happens during a Read and Write operation?
    What happens to the Index?
    How do we obtain data by the Index?
  The Impact of Compressing your keys
  How is Data File Integrity maintained
  Causes and Response to File Corruption
  Enhancing Performance!
Question to Ponder?
  What level of data integrity do you require in your files?
    Maybe you want to flush every write operation to disk immediately?
    Maybe you want a reasonable level of integrity, where you let Micro Focus write the data as soon as it can, to protect against the application being killed, etc.?
    Maybe you are comfortable and have the luxury of just re-running your applications to recover from an untimely event?
Indexed File Structure
  The basics
    Your indexed file will have a primary key and possibly alternate keys that interface to your data.
    The data will include live records as well as deleted records.
Index Structure
  File Header Node
    Contains the file format, file organization, record type, record length and a pointer to the Free Space List.
  Free Space List 1 and Free Space List 2
    One contains a list of records that have been deleted. When adding a new record, instead of increasing the size of the file, the file handler goes to the free space list, finds the first deleted record and stores the new record on top of it.
    The other contains a list of nodes in the indexed portion of the file. If it needs to write a new node, instead of increasing the size of the index file, it will use the free space nodes it has.
  Key Definition Node
    Contains a list of the different indexes (Primary and Alternate keys). Contains the start position of the key, the length of the key, whether compression is used and a pointer to the Root Node of each key.
  Root Node Key 0, Root Node Key 1, Root Node Key 2
  EACH KEY HAS ITS OWN BINARY SEARCH TREE
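The reuse of deleted records through the free space list can be illustrated with a small sketch. This is an illustration only, not the Micro Focus file format: the class and field names (`KeyDefinition`, `IndexedFileSketch`, `free_records`) are hypothetical, and the data file is modelled as an in-memory list of record slots.

```python
from dataclasses import dataclass, field


@dataclass
class KeyDefinition:
    """One entry of the Key Definition Node (hypothetical field names)."""
    start: int          # start position of the key within the record
    length: int         # length of the key
    compressed: bool    # whether key compression is used
    root_node: int      # pointer to the Root Node of this key's tree


@dataclass
class IndexedFileSketch:
    keys: list                                          # list of KeyDefinition
    records: list = field(default_factory=list)         # data portion, one slot per record
    free_records: list = field(default_factory=list)    # free space list of deleted slots

    def delete(self, slot):
        # A deleted record stays in the file; its slot goes on the free space list.
        self.records[slot] = None
        self.free_records.append(slot)

    def add(self, record):
        # Reuse the first deleted slot if there is one, otherwise grow the file.
        if self.free_records:
            slot = self.free_records.pop(0)
            self.records[slot] = record
        else:
            slot = len(self.records)
            self.records.append(record)
        return slot


f = IndexedFileSketch(keys=[KeyDefinition(start=0, length=5, compressed=False, root_node=0)])
for rec in ("REC-A", "REC-B", "REC-C"):
    f.add(rec)
f.delete(1)
reused = f.add("REC-D")   # lands on top of the deleted record; the file does not grow
```

After the last call the file still has three slots, and the new record occupies the slot that was freed.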
Index Structure
  [Diagram: the FILE HEADER NODE leads to the KEY DEFINITION NODE, a FREE SPACE LIST (list of deleted records) and a FREE SPACE LIST (list of free space nodes). The tree for each key runs from its ROOT NODE (KEY 0, KEY 1, KEY 2) through LEVEL 1 NODEs and LEVEL 2 NODEs down to LEAF NODEs.]
  LEAF NODE - Pointer to record in data file
The Binary Tree (Read a Key)
  Example tree:
    Root node:      27,54,FF
    Level 1 nodes:  8,18,FF    36,45,FF   63,72,FF
    Level 2 nodes:  3,6,FF     12,15,FF   21,24,FF
    Leaf nodes:     10,11,12   13,14,15   16,17,18
  READ KEY START 13
    (1) The File Handler reads the header and finds the Key Definition Node for the key. Verification is done to make sure the Key Start Position and Length are correct. It then obtains the pointer to the Root Node.
    (2) At the Root Node (27,54,FF): IS 13 <= 27? Yes = exit left.
    (3) At node 8,18,FF: IS 13 <= 8? No. IS 13 <= 18? Yes = exit center.
    (4) The same test at node 12,15,FF (IS 13 <= 12? No. IS 13 <= 15? Yes) leads to leaf node 13,14,15, which holds key 13.
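The walk down the tree for key 13 can be sketched as follows. This is a minimal illustration, not the File Handler's code: the nested-list layout is hypothetical, the FF entry is assumed to mean "any higher key", and subtrees not drawn on the slide are left as `None`. Since each node on the slide holds three entries, the sketch treats the tree as multi-way.

```python
HIGH = float("inf")   # stands in for the FF entry (assumed to match any higher key)

# Non-leaf nodes are lists of (entry, child) pairs; leaf nodes are lists of keys.
leaf_10, leaf_13, leaf_16 = [10, 11, 12], [13, 14, 15], [16, 17, 18]
node_12_15 = [(12, leaf_10), (15, leaf_13), (HIGH, leaf_16)]
node_8_18 = [(8, None), (18, node_12_15), (HIGH, None)]
root = [(27, node_8_18), (54, None), (HIGH, None)]


def read_key(root_node, key):
    """Return (found, path), where path lists the entry chosen at each level."""
    node, path = root_node, []
    while node and isinstance(node[0], tuple):       # still on a non-leaf node
        for entry, child in node:
            if key <= entry:                          # "IS key <= entry?"
                path.append(entry)
                node = child
                break
    return (node is not None and key in node), path


found, path = read_key(root, 13)
```

Here `path` records the exits taken: 27 at the root, 18 at the next level, then 15, which lands on the leaf holding 13.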
The “Binary Chop”
  Node entries: 6 7 8 9 10 11 12 13 14 15 16
  In reality most nodes will be a lot larger than the 3 entries our demo shows and will have many more entries. Micro Focus handles this by doing a "Binary Chop".
    IS 13 >= 11? Yes
    IS 13 >= 14? No
  If 2 entries are left in the node, it will do a sequential walk through the node.
    IS IT 11? No
    IS IT 12? No
    ENTRY IS FOUND!
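A minimal sketch of the chop followed by the sequential walk is shown below. The function name is hypothetical, and the point at which the chop hands over to the walk is an assumption chosen so that the probes match the sequence on the slides; the File Handler's real cut-over rule is not given here.

```python
def binary_chop(entries, key):
    """Search a sorted node; return (index or None, chop probes, walk probes)."""
    lo, hi = 0, len(entries) - 1
    chopped, walked = [], []
    # Halve the candidate range while it still spans more than three entries.
    while hi - lo > 2:
        mid = (lo + hi + 1) // 2
        chopped.append(entries[mid])          # "IS key >= entries[mid]?"
        if key >= entries[mid]:
            lo = mid
        else:
            hi = mid - 1
    # Finish with a sequential walk through what is left.
    for i in range(lo, hi + 1):
        walked.append(entries[i])             # "IS IT entries[i]?"
        if entries[i] == key:
            return i, chopped, walked
    return None, chopped, walked


node = list(range(6, 17))                     # entries 6..16, as on the slide
result = binary_chop(node, 13)
```

For key 13 the chop probes 11 and then 14, and the walk checks 11, 12 and 13.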
The Binary Tree (Read a Key)
  What if you do another Random Read of another Key?
    It starts at the beginning Node (the Root Node) and works its way back down the chain, UNLESS the nodes from the previous READ are in cache, in which case it can read the nodes from cache.
  ALTERNATIVELY, if you are doing a sequential READ NEXT, it knows via the cache which Node the previous read used and starts from there (much quicker).
Key Compression
  Key Compression – to save space
  Types of Key Compression
    Duplicate Key Compression
      May be used when you have many keys that are the same.
      Shows the first instance of the key, while all other occurrences have a pointer to the node it should point to.
    Leading Character Compression
      If one record key contains AAAAA and the second record key contains AAAAB, then the second record key will only show "B"; the A's are compressed. The key does, however, contain the information required so that it can be decompressed.
    Trailing Space Compression
      Spaces at the end of the key are compressed. Again, information is maintained for decompression.
    Trailing Null Compression
      Nulls at the end of the key are compressed. Again, information is maintained for decompression.
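Leading character and trailing space compression can be sketched together as below. The tuple encoding is hypothetical and only illustrates the idea; the actual on-disk encoding used by the File Handler is not described in these slides.

```python
def compress(keys):
    """Encode each key as (shared leading chars, trailing spaces, remaining text)."""
    encoded, previous = [], ""
    for key in keys:
        body = key.rstrip(" ")
        trailing = len(key) - len(body)
        shared = 0
        while (shared < min(len(previous), len(body))
               and previous[shared] == body[shared]):
            shared += 1
        encoded.append((shared, trailing, body[shared:]))
        previous = body
    return encoded


def decompress(encoded):
    """Rebuild the keys; each one needs the key before it, so this is a sequential walk."""
    keys, previous = [], ""
    for shared, trailing, rest in encoded:
        body = previous[:shared] + rest
        keys.append(body + " " * trailing)
        previous = body
    return keys


keys = ["AAAAA", "AAAAB", "AB   "]
packed = compress(keys)
restored = decompress(packed)
```

The second key is stored as just "B" plus a count of four shared leading characters. Because every entry depends on the one before it and the entries vary in length, the node has to be decoded from the front.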
Key Compression
  What happens when you try to read with Key Compression?
    Keys are not fixed length (some are compressed more than others)
    So the keys need to be decompressed before they can be read and compared to the key being looked for
    The "Binary Chop" cannot happen
    MUST SEQUENTIALLY WALK THROUGH EVERY NODE!
Indexed Files (Writing Records)
  What is happening?
    Every index in the file needs to be updated (Primary and Alternate Keys)
  The basic process:
    The Header is updated – just to say we are in mid-update
    The Record is added to the Data file
    Indexes are updated – first the Primary, then the Alternate keys
    The Header is updated – to say that the action is completed
Indexed Files (Writing Records)
  INSERT KEY VALUE 11
  Before:
    Root node:      27,54,FF
    Level 1 nodes:  8,18,FF    36,45,FF   63,72,FF
    Level 2 nodes:  3,6,FF     12,15,FF   21,24,FF
    Leaf nodes:     10,12      13,14,15   16,17,18
  We have an opening in the Leaf Node for Key 11.
  After:
    Leaf nodes:     10,11,12   13,14,15   16,17,18
Indexed Files (Writing Records): “NODE SPLITTING”
  Done to make room to add the entry to the node.
  Must look at the preceding node (one level up) to verify that it also has room to add the entry.
Indexed Files (Writing Records): “NODE SPLITTING”
  INSERT KEY VALUE 11
  Before:
    Root node:      27,54,FF
    Level 1 nodes:  18,FF      36,45,FF   63,72,FF
    Level 2 nodes:  12,15,FF   21,24,FF
    Leaf nodes:     9,10,12    13,14,15   16,17,18
  If we try to insert 11 into the Leaf Node (9,10,12), we do not have any room. So we look at the previous node (12,15,FF), and it also does not have room for a new entry. So we go to the previous node (18,FF), and it does have room. So 2 node splits are needed!
  After (as shown on the slide):
    Root node:      27,54,FF
    Level 1 nodes:  12,18,FF   36,45,FF   63,72,FF
    Level 2 nodes:  10,FF      12,15,FF   21,24,FF
    Leaf nodes:     9,10       11,12      13,14,15   16,17,18
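The leaf-level part of this insert can be sketched as below. The function is hypothetical and handles a single leaf only: the capacity of three entries matches the demo nodes, and splitting the entries evenly is an assumption that happens to reproduce the slide (9,10 and 11,12, with 10 promoted to the node above).

```python
import bisect


def insert_into_leaf(leaf, key, capacity=3):
    """Insert key into a sorted leaf.

    Returns (left, right, promoted). When the leaf has room, right and promoted
    are None. When it is full, the entries are split in two and the highest key
    of the left half is promoted to the node one level up.
    """
    entries = list(leaf)
    bisect.insort(entries, key)
    if len(entries) <= capacity:
        return entries, None, None
    half = len(entries) // 2
    left, right = entries[:half], entries[half:]
    return left, right, left[-1]


with_room = insert_into_leaf([10, 12], 11)      # the opening already exists
with_split = insert_into_leaf([9, 10, 12], 11)  # the leaf is full and must split
```

If the node one level up is also full, the same step repeats there, which is why the example above needs two splits.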
How File Integrity is Maintained
  The File Header
    Static Information
      File Attributes
      Number of Keys
      Format and Organization of a file
    Dynamic Information
      Integrity indicator
      Modification Counter
      Logical EOF marker
How File Integrity is Maintained: The File Header
  Header fields highlighted on this slide:
    System Record and Header Length
    Last Update Date and Time
    File Creation Date and Time
    IDX File Format
    Minimum and Maximum Record Size
    Integrity Flag
How File Integrity is Maintained: The File Header
  Integrity Flag
    The File Handler uses this flag to maintain integrity
    2 Byte Field
    Value depends on the update being performed (type of operation)
    A non-zero value when the header is read indicates to the File Handler that an operation is not fully completed.
How File Integrity is Maintained: The File Header
  Modified Value Field
    Position 105-108
    4 byte field
    Used as an aid to performance
    If a process detects that this value has changed since its last read of the header, its cached nodes are invalid and it must read new nodes from the physically stored index structure.
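The way a change in this field invalidates cached nodes can be sketched as follows. `NodeCache` and its method are hypothetical names, and the header counter is passed in as a plain integer rather than read from bytes 105-108 of a real file.

```python
class NodeCache:
    """Per-process cache of index nodes, discarded when the header counter moves."""

    def __init__(self):
        self.seen_counter = None
        self.nodes = {}

    def get(self, node_id, header_counter, read_from_disk):
        # A changed counter means another process updated the file:
        # every cached node may be stale, so drop them all.
        if header_counter != self.seen_counter:
            self.nodes.clear()
            self.seen_counter = header_counter
        if node_id not in self.nodes:
            self.nodes[node_id] = read_from_disk(node_id)
        return self.nodes[node_id]


disk_reads = []


def read_from_disk(node_id):
    disk_reads.append(node_id)
    return f"node {node_id}"


cache = NodeCache()
cache.get(1, header_counter=7, read_from_disk=read_from_disk)   # first access: disk read
cache.get(1, header_counter=7, read_from_disk=read_from_disk)   # served from cache
cache.get(1, header_counter=8, read_from_disk=read_from_disk)   # counter changed: read again
```

Only the first and third calls touch the disk.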
Understanding the Write Operation
  The File Handler obtains a "Write Semaphore"
    To allow only 1 process to update at a time
    Control-Break is disabled
  The File Header is read
    To ensure that another process has not left the file in a corrupt state; it also checks the Modified Value flag
  The Integrity Flag is updated
    Update and write the File Header, basically stating that the process is performing a write operation
  Write DATA
    The record created is written to disk
  Write INDEX
    The index is created and written to disk
  INTEGRITY FLAG is reset and written back to disk, so that another process can update the file
  FILE HEADER is written
  SEMAPHORE is RELEASED
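The steps above can be sketched as a small in-memory model. Everything here is hypothetical (the class, the dictionary standing in for the header, the `crash_mid_update` switch); it only shows why a non-zero integrity flag reveals an update that never finished.

```python
import threading


class SimulatedCrash(Exception):
    """Stands in for the process dying part-way through an update."""


class WriteProtocolSketch:
    def __init__(self):
        self.write_semaphore = threading.Lock()
        self.header = {"integrity_flag": 0, "modified": 0}
        self.data = []      # data portion
        self.index = {}     # key -> position of the record in the data portion

    def write(self, key, record, crash_mid_update=False):
        with self.write_semaphore:                       # one updater at a time
            if self.header["integrity_flag"] != 0:       # header read and checked
                raise RuntimeError("an earlier update did not complete")
            self.header["integrity_flag"] = 1            # header says: mid-update
            self.data.append(record)                     # write DATA
            if crash_mid_update:
                raise SimulatedCrash()
            self.index[key] = len(self.data) - 1         # write INDEX
            self.header["modified"] += 1                 # other processes' caches are now stale
            self.header["integrity_flag"] = 0            # reset flag, header written


f = WriteProtocolSketch()
f.write("K1", "record 1")
header_after_good_write = dict(f.header)

try:
    f.write("K2", "record 2", crash_mid_update=True)
except SimulatedCrash:
    pass
flag_after_crash = f.header["integrity_flag"]

try:
    f.write("K3", "record 3")
    next_write_refused = False
except RuntimeError:
    next_write_refused = True
```

After the simulated crash the data portion holds a record that the index does not know about, and the flag left in the header is what lets the next process notice.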
Understanding the Write Operation
  Special Note
    When using a WRITE / OPEN EXCLUSIVE on a file, the indexes are CACHED until either the CACHE limit is reached or a CLOSE operation is done.
Possible Causes of Corruption of Indexed Files
  KILL -9 is used on Unix
    Need to use KILL -15
    With KILL -15, the RTS invokes the Micro Focus exit procedure, flushing the cached index nodes back to disk
  Copying open files on Unix
    Unix allows the copying of files that are opened Exclusive; at the time of copying, the cached index nodes may not have been flushed
  Network problems
  Machine rebooted or powered off while indexed files are open
  Actual error in the system itself
Fixing Corrupted Files
  REBUILD UTILITY
    Takes the attributes of the input file to produce the output file
    Requires Exclusive use of the file
  TO REORGANIZE A FILE
    USES THE INDEX TO READ THE DATA RECORDS
    REBUILD INFILE,OUTFILE
  TO FIX CORRUPTED INDEXES
    IGNORES INDEXES AND DIRECTLY READS RECORDS FROM THE DATA FILE
    NEW DATA STRUCTURE AND DATA FILE CREATED
    REBUILD INFILE,OUTFILE /d
  TO RUN REBUILD ON LIVE DATA
    REBUILD INFILE
Summary on File Integrity
  The MOST Integrity?
    WRITETHRU directive
      Can be used at compile time or as a tunable to the File Handler
      When OPEN is done on the file, specifies to the Operating System that any WRITES to the file are flushed immediately to disk.
      PERFORMANCE takes a NOSE DIVE immediately!
  Default Level of Integrity?
    Reasonable level of Integrity
    Micro Focus will write data as soon as possible
    Protects against the application being killed
    A couple of directives to look at:
      IDXDATBUF
        The IDXDATBUF option determines the size of the buffer used when accessing the data portion of a file with organization INDEXED. DEFINE NBBUF & BPB from JCL will override this setting if given.
      LOADONTOHEAP
        The LOADONTOHEAP option specifies whether the File Handler loads the file into memory before executing any I/O operations.
  Able to RERUN your applications
    May be recovery enough
File Handling Performance
  Getting better performance from your application by getting better performance from your file handler
  Advances in technology (CPU speed, the amount of memory addressable by a process, code generator optimization) have made it easier to push back thoughts of trying to improve performance
  Accessing your disk is still the slowest thing you can do on your machines today
File Handling Performance
  You can make certain aspects of the file handler perform better, but you need to be careful about how this can affect another application accessing the file in a different manner.
  Micro Focus provides an "All Round" solution to performance
    Giving you the ability to tune the file handler to what you need in your application's performance!
File Handling Performance
  The BIG question…what should you use?
    Understanding what to use is based on your understanding of the Binary Tree.
    There is 1 tree for every index of your data file
Tuning Your Files
  Access Permission
    Examine your data files on an individual, per-file basis
    Not every file needs complete access for everyone
  When Opening Files:
    Use Exclusive access where possible
    Otherwise allow Only Readers
    Only when absolutely necessary, give all others complete access to the file
  Micro Focus Timing (8 million records written on IDX 8 format file)
    Exclusive access    7 min 13 sec
    Allow Readers       25 min
    Allow all Others    25 min
Tuning Your Files
  Based on the findings below, you might be tempted to just choose Allow all Others when the choice is between that and Allow Readers, but the two timings are equal only because a single user was used in the test.
  Micro Focus Timing (8 million records written on IDX 8 format file)
    Exclusive access    7 min 13 sec
    Allow Readers       25 min
    Allow all Others    25 min
Tuning Your Files
  Write Allowing Readers
    When an update is done, it goes to disk to allow others to read
  Write Allowing Others
    Other updates by other processes may be done at the same time.
  When Writing, Nodes are Cached into memory
    With Exclusive use, it only has to check if nodes have been changed. Quick.
    When allowing others, it keeps reading nodes off disk as they are changed. A lot slower.
Tuning Your Files
  Micro Focus Timing (8 million records read on IDX 8 format file)
    Exclusive access    2 min 32 sec
    Allow Readers       2 min 32 sec
    Allow all Others    6 min 28 sec
      This allows applications to change the file. It has a lot more checks to do to see if the file has been changed each time.
File Handler Configuration Settings
  READSEMA
    Specifies whether or not the system attempts to gain a semaphore for shared files when operations are performed that do not modify the file (READ, START, etc.).
    You need to ensure that it is set to OFF (the default)
    You might think that this can cause dirty reads?
      No. When you read a record, it checks to see if the record has been changed; if yes, then it takes out a semaphore on that record.
    When set ON, it can degrade performance by 15%
File Handler Configuration Settings
  IGNORELOCK
    For when you are not concerned about a "dirty" read
    Not bothered that someone comes in and changes a record you have just read
    Can improve performance by 15%
    Take care: this is handled internally by GDT through the READLOCK=STAT directive.
KEYS
  The shorter the keys are in size, the better.
    Fit more in a node
    Quicker to traverse the Binary Tree
  Remove redundant keys
    Each key has a tree
    Needs to be updated for each insert and delete
  Micro Focus Timing, 8 million records Read
    2 alternate keys    6 min 28 sec
    3 alternate keys    8 min 40 sec
    4 alternate keys    53 min 08 sec! (we will talk about this in a couple of minutes)
Compression
  Data Compression
    Minimal Performance hit
    When reading a record, it will traverse the tree; every time it gets a record, it has to decompress the record before writing.
  Index Compression
    If you can get away with it, do not use it!
    Always a hit in performance – sometimes severe!
    File handler cannot Binary Chop the node when searching for the key.
      All keys are different sizes
      Cannot tell where the middle of the node is
Compression
  Micro Focus Timing, 8 million records
                        No data/key compression   Data compression   Data/Key compression
    Sequential Write    6 min 40 sec              8 min 14 sec       15 min 18 sec
    Random Read         6 min 28 sec              7 min 59 sec       15 min 36 sec
    Sequential Read     6 min 12 sec              7 min 53 sec       7 min 50 sec
  SEQUENTIAL READ – Consistent. Doing a READ NEXT, it will always know where the previous key is. Much different from random reads.
FINE TUNING
  Setting File Handler Configuration Options
    Set on a per-file basis
    There is no magic formula. You need to adjust to suit each application
    Can have both positive and negative impact on your application
  SET EXTFH=C:\EXTFH.CFG
    [XFH-DEFAULT]
    NODESIZE=4096
    [FILE1.DAT]
    NODESIZE=1024
    [FILE2.DAT]
    INDEXCOUNT=32
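To see which value a given file would end up with, the configuration above can be read with a short script. This is a sketch that uses Python's standard `configparser`, not the File Handler's own parser, and it assumes that a setting under a file's own tag takes precedence over the same setting under [XFH-DEFAULT]. The function name `effective_setting` is hypothetical.

```python
import configparser

EXTFH_CFG = """\
[XFH-DEFAULT]
NODESIZE=4096
[FILE1.DAT]
NODESIZE=1024
[FILE2.DAT]
INDEXCOUNT=32
"""


def effective_setting(cfg_text, filename, option):
    """Return the option for filename, falling back to [XFH-DEFAULT], else None."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str            # keep option names in upper case
    parser.read_string(cfg_text)
    for section in (filename, "XFH-DEFAULT"):
        if parser.has_option(section, option):
            return parser.get(section, option)
    return None


file1_nodesize = effective_setting(EXTFH_CFG, "FILE1.DAT", "NODESIZE")
file2_nodesize = effective_setting(EXTFH_CFG, "FILE2.DAT", "NODESIZE")
file1_indexcount = effective_setting(EXTFH_CFG, "FILE1.DAT", "INDEXCOUNT")
```

Under that assumption FILE1.DAT gets its own NODESIZE, FILE2.DAT inherits the default, and FILE1.DAT has no INDEXCOUNT entry in this file at all.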
INDEXCOUNT
  Specifies the number of index nodes to be cached for an indexed file, per process
  Default cache size is 16 nodes
    [XFH-DEFAULT]
    INDEXCOUNT=32
    [FILE1.DAT]
    INDEXCOUNT=16
  Take care: this is handled internally by GDT through the NBBUF & BPB directives.
INDEXCOUNT IN ACTION
  INDEXCOUNT = 4
  Index tree used in the example:
    Level 1:  1A
    Level 2:  2A 2B
    Level 3:  3A 3B 3C 3D
    Level 4:  4A 4B 4C 4D 4E 4F 4G 4H
  Reading down the tree, each node is fetched with a DISK I/O and goes TO CACHE: first 1A, then 2A, then 3A, then 4A. The cache now holds 1A, 2A, 3A and 4A.
  Now we need to read 4B
    4A is removed from the cache and 4B is added to the cache. The cache now holds 1A, 2A, 3A and 4B.
INDEXCOUNT IN ACTION
  Prioritization of Cache
    Works out which nodes are needed more often
    It tries to keep nodes higher up in the tree in cache the longest
  INDEXCOUNT with multiple indexes
    WITH 2 KEYS IT IS BETTER TO HAVE INDEXCOUNT GREATER THAN 4, OR YOU WILL HAVE LOADS OF DISK I/O SWAPPING THE NODES IN CACHE
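The 1A/2A/3A/4A example can be replayed with a small cache model. The class is hypothetical, and the eviction rule (drop the cached node from the deepest level first) is an assumption made to mirror "keep higher nodes the longest"; the File Handler's real prioritization is not spelled out here.

```python
class IndexNodeCache:
    """Cache of at most `indexcount` nodes; node names start with their tree level."""

    def __init__(self, indexcount):
        self.capacity = indexcount
        self.nodes = []
        self.disk_reads = 0

    def read(self, node):
        if node in self.nodes:
            return                                   # served from cache, no disk I/O
        self.disk_reads += 1                         # DISK I/O
        if len(self.nodes) >= self.capacity:
            # Evict the cached node that sits lowest in the tree.
            victim = max(self.nodes, key=lambda name: int(name[0]))
            self.nodes.remove(victim)
        self.nodes.append(node)                      # TO CACHE


cache = IndexNodeCache(indexcount=4)
for node in ("1A", "2A", "3A", "4A"):                # first read walks down to 4A
    cache.read(node)
for node in ("1A", "2A", "3A", "4B"):                # second read walks down to 4B
    cache.read(node)
```

The second read costs a single disk I/O, for 4B, because the three higher nodes are still cached. With two keys sharing the same four slots, the two trees would keep evicting each other's nodes.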
INDEXCOUNT
  When to use
    For writing where there are multiple keys
    For reading files randomly
    Reading files sequentially may degrade performance, as it keeps track of the nodes in cache
  Set INDEXCOUNT to 32 for most files
NODESIZE
  Specifies the size of the index nodes to use for an indexed file when it is created
    NODESIZE={512|1024|4096|16384}
  Default set by the file handler at creation, based on the key size
  NODESIZE can also be given in the BLF statement under GDTBATCH.
  If only reading files sequentially, you may want to think about increasing the node size. Why?
    On sequential read access, it always knows where the next record is, and once it is at the end of the node, it goes back to the top of the tree and traverses the tree again. On random reads, it does not know where the next record is, so it always goes to the top of the tree and binary chops its way down the tree. In this case NODESIZE has no effect.
NODESIZE
  When to use NODESIZE
    Reading records sequentially
    Avoid using with HIGH INDEXCOUNT
      Creates high memory usage
    If in doubt, let the file handler set the NODESIZE
LOADONTOHEAP
  Forces the file handler to load the entire file onto the memory heap
  All operations execute in memory – only writing back to disk when the file is closed
  Use with caution!
    Could lose your data
    [XFH-DEFAULT]
    [FILE1.DAT]
    LOADONTOHEAP=ON
  When to use
    Only in Exclusive Mode
    Only on small files
    Where you are batch processing and data is backed up
    Look for SPEED versus INTEGRITY
IDXDATBUF
  Determines the size of the buffer used when accessing the data portion of a file with organization indexed
  Default is 0
  Set to Disk / Page allocation size
  More suited to Batch Data
  Not applicable to single file format IDX 8
  May want to look at it if the file is too large for LOADONTOHEAP
  Sequential file version: SEQDATBUF
  Relative file version: RELDATBUF
Tuning Summary
  Consider tuning files on a file-by-file basis
  Think about permissions – exclusive use
  Avoid key compression
  Unless reading sequential data, set INDEXCOUNT=32 or calculate the optimum figure
  Sequential Access – look at NODESIZE
  For unloading or loading data on larger files, look at LOADONTOHEAP, IDXDATBUF or SEQDATBUF
Using REBUILD utility for better performance
  To remove Compression
  To remove Unused Keys
  To determine the number of Index Nodes to Cache
  To change the Index Node Size of your data
How do you know you have Compression on your file?
  REBUILD /N filename or GDTFI filename
  [Output screenshot; callouts: DATA COMPRESSION, KEY COMPRESSION]
Removing Key Compression
  So we can Binary Chop the Index Nodes
    rebuild infile,outfile /c:i0
      c = compression
      i0 = remove key compression
  So we don't have to decompress/compress
    rebuild infile,outfile /c:d0i0
      c = compression
      d0 = remove data compression
      i0 = remove key compression
Removing / Adding Keys
  The fewer keys you have, the faster the updates will be!
  You cannot remove the primary key!
    rebuild infile /k:r:22+10
      k = tells rebuild that we are dealing with keys
      r = remove
      22 = start of key
      10 = length of key
Removing / Adding Keys
  Removing a key that is defined in multiple areas of the file
    rebuild /k:r:22+10,58+2,100+8
  Adding a key with duplicates
    rebuild infile /k:a:1+130d
      a = adding a key
      d = key has duplicates
  Be careful when adding and removing keys
    You must change your SELECT statements and RECOMPILE, or you will end up with a MISMATCH between the attributes on your file and the attributes in your SELECT statements
How to determine the number of index nodes to cache?
  (Max tree depth x no. of keys) + no. of keys = INDEXCOUNT
    i.e. (3 x 4) + 4 = 16
  To improve performance when writing records and on random reads!
How to determine the number of index nodes to cache?
  Remember: 16 is the default INDEXCOUNT setting unless you specify differently in your EXTFH.CFG file
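The rule of thumb is easy to evaluate for a given file. In this sketch the function name and the default of 16 passed as a constant are the only additions; the arithmetic is the formula from the slide.

```python
DEFAULT_INDEXCOUNT = 16   # default cache size quoted in these slides


def suggested_indexcount(max_tree_depth, number_of_keys):
    """(max tree depth x number of keys) + number of keys."""
    return max_tree_depth * number_of_keys + number_of_keys


example = suggested_indexcount(max_tree_depth=3, number_of_keys=4)
deeper = suggested_indexcount(max_tree_depth=4, number_of_keys=5)
needs_override = deeper > DEFAULT_INDEXCOUNT
```

The slide's example (depth 3, 4 keys) comes out at exactly the default, while a file with 5 keys and a depth of 4 would call for an explicit INDEXCOUNT entry.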
How to change your Index Node Size for your files
  To improve sequential access – get more keys in a node
  NODESIZE is set automatically by the file handler via the following controls
    Key length < 51 bytes      NODESIZE = 512
    Key length 51 to 100       NODESIZE = 1024 (default)
    Key length 101 to 512      NODESIZE = 4096
    Key length 513 to 4080     NODESIZE = 16384
  Can make an entry in EXTFH.CFG: NODESIZE=4096
    Then rebuild infile,outfile will pick up the EXTFH.CFG setting, BUT if you choose a lower value than you should have, the file handler will pick the setting based on the controls above.
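The controls above translate directly into a lookup. This is a sketch with hypothetical function names; it assumes "key length" means the length of the key that drives the choice, and it models the "lower value than you should have" rule as falling back to the automatic size.

```python
def default_nodesize(key_length):
    """NODESIZE the file handler would pick for a given key length in bytes."""
    if key_length < 51:
        return 512
    if key_length <= 100:
        return 1024
    if key_length <= 512:
        return 4096
    if key_length <= 4080:
        return 16384
    raise ValueError("key length outside the ranges listed in the slides")


def effective_nodesize(key_length, configured=None):
    """Apply an EXTFH.CFG NODESIZE unless it is lower than the automatic value."""
    minimum = default_nodesize(key_length)
    if configured is None or configured < minimum:
        return minimum
    return configured


too_small = effective_nodesize(key_length=200, configured=1024)   # falls back to the automatic size
larger = effective_nodesize(key_length=40, configured=4096)       # configured value is used
```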
Operating System Tuning
  You may fully understand the "Binary Tree" and the operations of your application, and have applied all the necessary tunables.
  BUT the application(s) run no better, and maybe even worse. You are disappointed!
  Look at performance at a different level: the operating system
    Server Operating System
      Prioritization is more evenly distributed across all processes
    Windows Operating System
      Prioritization is given to the application running in the foreground
Operating System Tuning
  Server Operating System
    The amount of CACHE is based on the physical amount of memory available on the system
    With 6 – 12 GB of memory, this means you would have a large amount of cache
  Windows Operating System
    The amount of CACHE is normally around 10 MB. Once this is filled up, the operating system has to remove items from cache and insert newer instances. Big difference in performance.
    Changing the settings for CACHE in the System properties can improve performance dramatically. Be careful, this may create some side effects!