1 chapter 7 optimizing the optimizer. 2 the oracle optimizer is… about query optimization is a...

16
1 Chapter 7 Optimizing the Optimizer

Upload: lillian-leonard

Post on 01-Jan-2016

214 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: 1 Chapter 7 Optimizing the Optimizer. 2 The Oracle Optimizer is… About query optimization Is a sophisticated set of algorithms Choosing the fastest approach

1

Chapter 7Optimizing the Optimizer

Page 2: 1 Chapter 7 Optimizing the Optimizer. 2 The Oracle Optimizer is… About query optimization Is a sophisticated set of algorithms Choosing the fastest approach

2

The Oracle Optimizer is…

About query optimization Is a sophisticated set of algorithms Choosing the fastest approach (hopefully) Cost-based determinations, including

– Time estimates for retrieval– Estimated CPU time– Does not include network components

Page 3: 1 Chapter 7 Optimizing the Optimizer. 2 The Oracle Optimizer is… About query optimization Is a sophisticated set of algorithms Choosing the fastest approach

3

OPTIMIZER_GOAL parameter

Controls how the optimizer calculates cost ALL_ROWS

– minimizes cost of processing all rows– Favors full table scans

FIRST_ROWS_N – minimizes cost of processing first N rows– Favors index usage

RULE– Obsolete but can be invoked with RULE hint

Page 4: 1 Chapter 7 Optimizing the Optimizer. 2 The Oracle Optimizer is… About query optimization Is a sophisticated set of algorithms Choosing the fastest approach

4

Query Transformation

Oracle can transform a query into a logically equivalent statement

Types of transformations include:– View Merging – Taking view and merging into a SQL

statement– Star – Used for star schemas– Join elimination – eliminates unnecessary joins– Join factorization – used with UNION operations– Antijoin and semijoin – Converts MINUS and

INTERSECT

Page 5: 1 Chapter 7 Optimizing the Optimizer. 2 The Oracle Optimizer is… About query optimization Is a sophisticated set of algorithms Choosing the fastest approach

5

Basis of Cost Calculations

The amount of data accessed Estimated data in memory Estimate of physical IO operations Estimate of CPU-related operations

– Sorting– Hashing

Relative speed of IO and compute operations

Page 6: 1 Chapter 7 Optimizing the Optimizer. 2 The Oracle Optimizer is… About query optimization Is a sophisticated set of algorithms Choosing the fastest approach

6

Inputs into Cost Calculations

Object statistics – metadata about the tables Histograms – data distribution statistics Database parameters – data block, memory,

and other process configuration parameters System statistics – physical hardware

configuration Bind Variable Peeking - Using actual value of

variable to help determine execution plan

Page 7: 1 Chapter 7 Optimizing the Optimizer. 2 The Oracle Optimizer is… About query optimization Is a sophisticated set of algorithms Choosing the fastest approach

7

Database Parameters

Influences optimizer decisions Some key parameters affecting optimization:

– DB_FILE_MULTIBLOCK_READ_COUNT – how many blocks read in single IO operation

– OPTIMIZER_INDEX_CACHING & OPTIMIZER_INDEX_COST_ADJ – affects how optimizer thinks about indexes

– OPTIMIZER_FEATURES_ENABLE – tells optimizer what Oracle version to base decisions

– Parallel_threads_per_cpu – affects how prominent parallelism can be

– Many memory (SGA/PGA based) parameters– See pp. 193.194

Page 8: 1 Chapter 7 Optimizing the Optimizer. 2 The Oracle Optimizer is… About query optimization Is a sophisticated set of algorithms Choosing the fastest approach

8

System Statistics

View these in SYS.AUX_STATS$ view Statistics on physical system configuration Types include:

– Nonworkload statistics – collect in absence of database load / use

– Workload statistics – collect during typical database activity

Workload statistics more reflective of actual hardware configuration

Page 9: 1 Chapter 7 Optimizing the Optimizer. 2 The Oracle Optimizer is… About query optimization Is a sophisticated set of algorithms Choosing the fastest approach

9

Collecting Statistics

Automatic Statistics Gathering– Automated as of Oracle 10g– Many configurable parameters– Can configure within manually or within Enterprise Manager

Stale Statistics– Means Oracle keeps DML activity to determine when

statistics become stale– STATISTICS_LEVEL must be set to use– View information in ALL_TAB_MODIFICATIONS view– Can configure the threshold of what is regarded as “stale”

Page 10: 1 Chapter 7 Optimizing the Optimizer. 2 The Oracle Optimizer is… About query optimization Is a sophisticated set of algorithms Choosing the fastest approach

10

Collecting Statistics (cont.)

Manual Statistics Gathering– Usually necessary on some objects to keep current as is needed– Used to fine-tune statistics collection– Used to collect on SYSTEM objects not gather automatically– Used to create “extended statistics”

Dynamic Sampling– Done by Oracle on-the-fly at execution time– Done when certain objects have no statistics– Oracle may determine doing this may reduce execution time– Determined by OPTIMIZER_DYNAMIC_SAMPLING parameter

Page 11: 1 Chapter 7 Optimizing the Optimizer. 2 The Oracle Optimizer is… About query optimization Is a sophisticated set of algorithms Choosing the fastest approach

11

Using DBMS_STATS

Oracle provided PL/SQL Package Collects and manages optimizer statistics

– For a table– For a schema– For objects with “stale” statistics– Used to create histograms– Used to transfer statistics from another database– See pp. 200-202 for procedures and parameters

Page 12: 1 Chapter 7 Optimizing the Optimizer. 2 The Oracle Optimizer is… About query optimization Is a sophisticated set of algorithms Choosing the fastest approach

12

Using DBMS_STATS (cont.)

Defaults can be set via the following procedures– SET_GLOBAL_PREFS– SET_DATABASE_PREFS– SET_SCHEMA_PREFS– SET_TABLE_PREFS

Configurable parameters:– CASCADE– DEGREE– ESTIMATE_PERCENT– METHOD_OPT– GRANULARITY– See p. 202 for others

Page 13: 1 Chapter 7 Optimizing the Optimizer. 2 The Oracle Optimizer is… About query optimization Is a sophisticated set of algorithms Choosing the fastest approach

13

Histograms and METHOD_OPT

Histograms – Increase accuracy of of cost calculations– Increase overhead of collecting statistics– Worthwhile for

Columns with irregular data distribution Columns involved with WHERE or JOIN expressions

METHOD_OPT affects how histograms are created– Default is FOR ALL COLUMNS SIZE AUTO – leaves decision to

Oracle to choose columns on which to create histograms– Based on cardinality– Based on presence of filters dependent on the data– Set METHOD_OPT manually when appropriate if knowledge of

column data makeup is known

Page 14: 1 Chapter 7 Optimizing the Optimizer. 2 The Oracle Optimizer is… About query optimization Is a sophisticated set of algorithms Choosing the fastest approach

14

Extended Statistics

Stored in ALL_TAB_COL_STATISTICS Two types

– Multicolumn– Expression-based

Multicolumn Extended Statistics– Calculates selectivity of multicolumn expressions– Example

50% of customers are men 10% of customers are from Australia Optimizer conclusion: 5% of customers are Australian men

Page 15: 1 Chapter 7 Optimizing the Optimizer. 2 The Oracle Optimizer is… About query optimization Is a sophisticated set of algorithms Choosing the fastest approach

15

Extended Statistics

Expression Extended Statistics– Optimizer gathers statistics on an expression to help determine

query cost– Take the query:

SELECT COUNT(*), SUM(amount_sold)FROM SALESWHERE sale_category (amount_sold) = 1;

– By gathering statistics on the expression in above query, optimizer will likely more accurately determine cost of queryBEGIN DBMS_STATS.gather_table_stats(ownname=>USER,tabname=>’SALES’,method_opt=> ‘FOR ALL COLUMNS FOR

COLUMNS (sale_category(amount_sold))’);END

Page 16: 1 Chapter 7 Optimizing the Optimizer. 2 The Oracle Optimizer is… About query optimization Is a sophisticated set of algorithms Choosing the fastest approach

16

More on DBMS_STATS

Can perform other operations, such as– Locking statistics (LOCK_SCHEMA_STATS)– Collect system statistics (GATHER_SYSTEM_STATS)– Move statistics from one database to another

Popular to simulate production statistics in a development environment

– Manipulate object statistics Modifying statistics based on actual data Good if forecasting data changes