© ibm corporation 2005 informix user forum 2005 john f. miller iii explaining sqlexplain ®
TRANSCRIPT
© IBM Corporation 2005
Informix User Forum 2005
John F. Miller III
Explaining SQLEXPLAIN
®
© IBM Corporation 20052
Overview• How to view optimizer decision
– How to enable the optimizer explain file– What parts of the optimizer explain are always
enabled• Overview of operators• Different sections of the optimizer explain file• Controlling the Optimizer
© IBM Corporation 20053
How to Enable the Optimizer Explain
• Enabling and Disabling the Optimizer Explain file– Turning on in application– Dynamically enabling– Enabling statistics
© IBM Corporation 20054
Enable in the Application• Enable in the application
– SET EXPLAIN ON– SET EXPLAIN ON AVOID_EXECUTE– SET EXPLAIN OFF– SET EXPLAIN STATISTICS
• Location of the file– UNIX
• File name sqexplain.out• The current directory if you have a local client• Home directory on remote host if the client is remote
– Windows• %INFORMIXDIR%\sqexpln\{username}.out
© IBM Corporation 20055
Dynamically Enabling the Explain File
• Enable using onmode -Y– onmode –Y {session id} 1
– onmode –Y {session id} 0
• Can not turn on SQL statistics dynamically• Only queries optimized after explain has been enabled
will be displayed in the explain file.• Explain output displayed each time a query is
optimized
© IBM Corporation 20056
View Real Time Query Statistics
onstat –g pqs {sid}– Viewing current progress of query
– Real time data
– Always enabled
– Only while query is executing
Query Operators :Opaddr opname phase rows time in1 in2-----------------------------------------------------------------b444d60 scan done 52 0:0:0 0 0b45cd60 scan done 4 0:0:0 0 0b45e2f0 join done 92 0:0:0 b444d60 b45cd60b460d60 scan done 13 0:0:0 0 0b45e208 join done 795 0:0:0 b45e2f0 b460d60
© IBM Corporation 20057
View Real Time Query StatisticsQuery Operators :Opaddr opname phase rows time in1 in2-----------------------------------------------------------------b444d60 scan done 52 0:0:0 0 0b45cd60 scan done 4 0:0:0 0 0b45e2f0 join done 92 0:0:0 b444d60 b45cd60b460d60 scan done 13 0:0:0 0 0b45e208 join done 795 0:0:0 b45e2f0 b460d60
Scan Scan
ScanJoin
Join
© IBM Corporation 20058
Sysmaster Real Time Optimizer Information
• sysmaster:syssqexplain table–Optimizer plan information always present–Profile information about the query requires
•Environment variable SQLSTATS = 1
–Both a client and server environment variable
–Set before the server starts; all sessions track information
–Set before starting the client; only information for that client will be tracked
–Query Timing and Profile information•Environment variable SQLSTATS = 2
© IBM Corporation 20059
Sysmaster Syssqexplain Column Information
– Columns that are always enabled– sqx_sessionid sqx_sdbno sqx_iscurrent
sqx_conbno sqx_ismain sqx_selflagsqx_estcost sqx_estrows sqx_seqscan sqx_srtscan sqx_autoindex sqx_index
sqx_remsql sqx_mrgjoinsqx_dynhashjoin sqx_keyonly sqx_tempfile
sqx_tempview sqx_secthreads sqx_sqlstatement
– Columns which require SQLSTATS >= 1 – sqx_bufreads sqx_pagereads sqx_bufwrites
sqx_pagewrites sqx_totsorts sqx_dsksortssqx_sortspmax
– Columns which required SQLSTATS = 2 – sqx_executions sqx_cumtime
© IBM Corporation 200510
Major Operator Types• Scan Operators
– The actual operation of reading data.
• Join Operators– Used when a query involves more than one
table– Rows from one table are matched to rows
of the second table
• Filters– The search criteria of a query used to
determine which rows are to be selected by the query.
© IBM Corporation 200511
Scan Types
• Sequential Table Scan– Data rows are read in
sequential order as they exist in the table.
• Index Scan– Data in a table is accessed
through an index.
• ROWID Scan– The data rows physical
location is given and read directly
• Skip Duplicate Index Scan– Special type of index scan in
which duplicate rows are skipped
© IBM Corporation 200512
Scan Types (con’t)• First Row Scan
– Data is scanned until the first row satisfies the search condition is found, then the scan is terminated
• Auto Index Path– Add an index on the fly if one does not exist on a
required column
• Key-Only Scan– If all columns required by the query are available
in the index, then the data row is not retrieved.
© IBM Corporation 200513
Join Types• Nested Loop Join
– Scans the inner table once for every row in the outer table.
– The inner table can use either an index or table scan.
• Dynamic Hash Join– Scans the smaller table and builds a hash table by
applying a hash function on the join key. – The larger table is passed by the hash function to see
if a match occurs.
• Cartesian Product Join– All rows of the inner table with all rows of the outer
table.
© IBM Corporation 200514
Filters• Lower-index filter
– A single filter or set of filters which provide a lower bound or starting position for an index scan.
– Example: column >= 10
Start scanning the ascending index from value 10
• Upper-Index Filter– A single filter or set of filters which
provide an upper bound or stopping position for an index scan
© IBM Corporation 200515
Filters (continued)
• Key-First Filter– Index key filters are applied first to see if the
corresponding data row needs to be retrieved.
– Example: SELECT * FROM tab where c1>=1 and c2=2
If an index exists on c1 and c2 then the filter c2=2 will be applied to each index key first. Only if this is satisfied will the corresponding data row be fetched
– C1>=1 is used as a lower index filter to start the index scan
© IBM Corporation 200516
Five Sections of the SQEXPLAIN File
• Overview• Access Order• Detailed
Access Plan • UDR info• Optimizer
Statistics (optional)
QUERY-----SELECT tab1.c2 FROM tab1,tab2 WHERE tab1.c2 = tab2.c2 AND tab1.c2 MATCHES
"systable*"Estimated Cost: 321Estimated # of Rows Returned: 148
1) miller3.tab1: SEQUENTIAL SCAN
Filters: miller3.tab1.c2 MATCHES 'systable*'
2) miller3.tab2: INDEX PATH
(1) Index Keys: c2 (Key-Only) (Serial, fragments: ALL)
Lower Index Filter: miller3.tab2.c2 MATCHES 'systable*'
DYNAMIC HASH JOIN Dynamic Hash Filters: miller3.tab1.c2 =
miller3.tab2.c2
© IBM Corporation 200517
Overview Section• The SQL statement for the query• An estimate of the query cost in units the
optimizer uses to compare plans• An estimate for the number of rows that the
query is expected to return• Maximum number of threads used to execute the
query if PDQPRIORITY is set.• Any temporary files required for executing order
by and/or group by• Directives followed or ignored
© IBM Corporation 200518
Access Order
• The tables are listed in the order in which they are accessed. – Top of the list is the first table accessed
• For each table the following information is listed:– The accessed table’s name
– The access plan by which the server reads the table
• Sequential scan, Index path,…
• The active fragments to be scanned.
– For each pair of tables the join plan is listed• Nested loop join, dynamic hash join
© IBM Corporation 200519
Detailed Access Plan
• This section is repeated for each table in the Access Section.
• Depending on the type of access different information is displayed– Sequential Scans
• Filters to be applied
– Index and auto-index Scans• The index keys on which filters are applied• Key-Only, Aggregate and/or Key-First • Lower index filter• Upper index filter
• Key-First Filters
© IBM Corporation 200520
UDR Info
• Information about UDRs used in the query.– UDR Name
– UDR id• The UDR id is given because UDRs can have the
same name but take different arguments.
Example:UDRs in query:-------------- UDR id : 234 UDR name: x
© IBM Corporation 200521
Optimizer Statistics• A list of the tables and their internal names• For each operator
– Operator type– Internal table name– Rows produced by the operator– Optimizer predicted rows produce by the
operator– Rows scanned– Estimated cost of this operator
© IBM Corporation 200522
Example of Optimizer StatisticsQUERYSELECT tab1.c2 FROM tab1,tab2 WHERE tab1.c2 = tab2.c2 AND tab1.c2 MATCHES "systable*"Estimated Cost: 321Estimated # of Rows Returned: 148
1) miller3.tab1: SEQUENTIAL SCAN
Filters: miller3.tab1.c2 MATCHES 'systable*'
2) miller3.tab2: INDEX PATH
(1) Index Keys: c2 (Key-Only) (Serial, fragments: ALL) Lower Index Filter: miller3.tab2.c2 MATCHES
'systable*'
DYNAMIC HASH JOIN Dynamic Hash Filters: miller3.tab1.c2 = miller3.tab2.c2
© IBM Corporation 200523
Optimizer Statistics
Table map :----------------------------Internal name Table name----------------------------t1 tab1t2 tab2
type table rows_prod est_rows time rows_scan est_cost-------------------------------------------------------------------------------scan t1 92 92 0:0:0 5244 282
type table rows_prod est_rows time rows_scan est_cost-------------------------------------------------------------------------------scan t2 92 92 0:0:0 92 5
type time rows_bld rows_prb novrflo est_cost-------------------------------------------------------------------------------hjoin 8464 148 0:0:0 92 92 0 321
est_rowsrows_prod
Optimizer predictions
Actual Numbers
© IBM Corporation 200524
Examining the Running QueryNo Statistics VS Medium Statistics
No StatisticsQUERY:------select * from t1 where c1 > 20200
Estimated Cost: 20888Estimated # of Rows Returned: 6760
1) miller3.t1: SEQUENTIAL SCAN Filters: miller3.t1.c1 > 20200
No StatisticsQUERY:------select * from t1 where c1 > 20200
Estimated Cost: 20888Estimated # of Rows Returned: 6760
1) miller3.t1: SEQUENTIAL SCAN Filters: miller3.t1.c1 > 20200
Medium StatisticsQUERY:------select * from t1 where c1 > 20200
Estimated Cost: 21Estimated # of Rows Returned: 19
1) miller3.t1: INDEX PATH (1) Index Keys: c1 (Serial, fragments: ALL) Lower Index Filter: t1.c1 > 20250
Medium StatisticsQUERY:------select * from t1 where c1 > 20200
Estimated Cost: 21Estimated # of Rows Returned: 19
1) miller3.t1: INDEX PATH (1) Index Keys: c1 (Serial, fragments: ALL) Lower Index Filter: t1.c1 > 20250
Total Row ReturnedActual 30 estimated 6760 for no statistics 19 for medium
Access MethodChanged from Sequential Scan to Index Path
Estimated CostHuge drop in Estimated Cost
Overall performance improved The estimates were more accurate
The query plan changed
© IBM Corporation 200525
Examining the Running QueryMedium Statistics VS High Statistics
High StatisticsQUERY:------select * from t1 where c1 > 20200
Estimated Cost: 33Estimated # of Rows Returned: 30
1) miller3.t1: INDEX PATH (1) Index Keys: c1 Lower Index Filter: t1.c1 > 20250
High StatisticsQUERY:------select * from t1 where c1 > 20200
Estimated Cost: 33Estimated # of Rows Returned: 30
1) miller3.t1: INDEX PATH (1) Index Keys: c1 Lower Index Filter: t1.c1 > 20250
Total Row ReturnedActual 30, estimated 19 for medium statistics, 30 for high statisticsThe estimated from medium is fairly close
Access Method No change in Access MethodEstimated CostNo significant difference
Overall performance did not change The estimates were slightly more accurate
The query plan did not change
Medium StatisticsQUERY:------select * from t1 where c1 > 20200
Estimated Cost: 21Estimated # of Rows Returned: 19
1) miller3.t1: INDEX PATH (1) Index Keys: c1 Lower Index Filter: t1.c1 > 20250
Medium StatisticsQUERY:------select * from t1 where c1 > 20200
Estimated Cost: 21Estimated # of Rows Returned: 19
1) miller3.t1: INDEX PATH (1) Index Keys: c1 Lower Index Filter: t1.c1 > 20250
© IBM Corporation 200526
Controlling the Optimizer• External Directives
– Allows DBA to modify the query plan w/o changing the application
– Directives not required to be in the application
– To create external directives use the SAVE EXTERNAL DIRECTIVE statement
• SET OPTIMIZATION– LOW, HIGH,
– FIRST_ROWS, ALL_ROWS
• Update Statistics– Statistics and distributions
• Directives– Positive and negative directives
• Other command and/or environment variables– OPTCOMIND, PDQPRIORITY, IFX_DW_OPT
© IBM Corporation 200527
Other Information
• An Overview of the IBM Informix Dynamic Server Optimizer
– http://www.ibm.com/developerworks/db2/zones/informix/library/techarticle/0211desai/0211desai.html
• Understanding and Tuning Update Statistics – http://www-.ibm.com/developerworks/db2/zones/informix/library/techarticle/miller/0203m
iller.html
• Predicate Inference in Informix Dynamic Server – http://www.ibm.com/developerworks/db2/zones/informix/library/techarticle/0206goswami/
0206goswami.html
• IBM Informix Performance Manual• IBM Informix SQL Reference Manual
© IBM Corporation 200528
Questions