03 chapter 15 algorithms for query processing optimization
TRANSCRIPT
8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization
http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 1/35
ICS 424 - 01 (072) Query Processing andOptimization
1
Chapter 15
Algorithms for Query Processing
and Optimization
ICS 424 Advanced Database Systems
Dr. Muhammad Shafique
8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization
http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 2/35
ICS 424 - 01 (072) Query Processing andOptimization
2
Outline
• Introduction• Processing a query
• SQL queries and relational algebra
• Implementing basic query operations
• Heuristics-based query optimization
• Overview of query optimization in Oracle
8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization
http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 3/35
ICS 424 - 01 (072) Query Processing andOptimization
3
Material Covered from Chapter 15
• Pages 537, 538, 539• Section 15.1
• Section 15.2
• Section 15.6
• Section 15.7
• Section 15.9
8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization
http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 4/35
ICS 424 - 01 (072) Query Processing andOptimization
4
Introduction to Query Processing
• Query optimization
• The process of choosing a suitable execution strategy
for processing a query.
• Two internal representations of a query:
• Query Tree• Query Graph
8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization
http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 5/35
ICS 424 - 01 (072) Query Processing andOptimization
5
Background Review
• DDL compiler• DML compiler
• Runtime
database
processor• System catalog
8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization
http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 6/35
ICS 424 - 01 (072) Query Processing andOptimization
6
Processing a Query
• Tasks in processing a high-level query1. Scanner scans the query and identifies the language tokens
2. Parser checks syntax of the query
3. The query is validated by checking that all attribute names and
relation names are valid
4. An intermediate internal representation for the query is created(query tree or query graph)
5. Query execution strategy is developed
6. Query optimizer produces an execution plan
7. Code generator generates the object code
8. Runtime database processor executes the code
• Query processing and query optimization
8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization
http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 7/35
ICS 424 - 01 (072) Query Processing andOptimization
7
Processing a Query
• Typical steps in processing a high-level query
1. Query in a high-level query language like SQL
2. Scanning, parsing, and validation
3. Intermediate-form of query like query tree
4. Query optimizer5. Execution plan
6. Query code generator
7. Object-code for the query
8. Run-time database processor
9. Results of query
8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization
http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 8/35
ICS 424 - 01 (072) Query Processing andOptimization
8
8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization
http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 9/35
ICS 424 - 01 (072) Query Processing andOptimization
9
SQL Queries and Relational Algebra
• SQL query is translated into an equivalent extended
relational algebra expression --- represented as a query tree
• In order to transform a given query into a query tree, the
query is decomposed into query blocks
• Query block :
• The basic unit that can be translated into the algebraic operators andoptimized.
• A query block contains a single SELECT-FROM-WHERE
expression, as well as GROUP BY and HAVING clause if these are
part of the block.
• The query optimizer chooses an execution plan for each
block
8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization
http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 10/35
ICS 424 - 01 (072) Query Processing andOptimization
10
COMPANY Relational Database Schema (1)
8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization
http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 11/35
ICS 424 - 01 (072) Query Processing andOptimization
11
COMPANY Relational Database Schema (2)
8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization
http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 12/35
ICS 424 - 01 (072) Query Processing andOptimization
12
SQL Queries and Relational Algebra (1)
• Example
SELECT Lname, Fname
FROM EMPLOYEE
WHERE Salary > ( SELECT MAX(Salary)
FROM EMPLOYEE
WHERE Dno = 5 )
• Inner block and outer block
8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization
http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 13/35
ICS 424 - 01 (072) Query Processing andOptimization
13
Translating SQL Queries into Relational Algebra
SELECT LNAME, FNAME
FROM EMPLOYEE
WHERE SALARY > ( SELECT MAX (SALARY)
FROM EMPLOYEE
WHERE DNO = 5);
SELECT MAX (SALARY)
FROM EMPLOYEE
WHERE DNO = 5
SELECT LNAME, FNAME
FROM EMPLOYEE
WHERE SALARY > C
πLNAME, FNAME (σSALARY>C(EMPLOYEE)) ℱMAX SALARY (σDNO=5 (EMPLOYEE))
8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization
http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 14/35
ICS 424 - 01 (072) Query Processing andOptimization
14
SQL Queries and Relational Algebra (2)
• Uncorrelated nested queries Vs Correlated nested queries
• Example
Retrieve the name of each employee who works on all the projectscontrolled by department number 5.
SELECT FNAME, LNAMEFROM EMPLOYEEWHERE ( (SELECT PNO
FROM WORKS_ONWHERE SSN=ESSN)
CONTAINS(SELECT PNUMBERFROM PROJECTWHERE DNUM=5) )
8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization
http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 15/35
ICS 424 - 01 (072) Query Processing andOptimization
15
SQL Queries and Relational Algebra (3)
• Example
For every project located in ‘Stafford’, retrieve the project number,the controlling department number and the department manager’slast name, address and birthdate.
• SQL query:
SELECT P.NUMBER,P.DNUM,E.LNAME, E.ADDRESS, E.BDATE
FROM PROJECT AS P,DEPARTMENT AS D, EMPLOYEE AS E
WHERE P.DNUM=D.DNUMBER AND D.MGRSSN=E.SSN ANDP.PLOCATION=‘STAFFORD’;
• Relation algebra:
PNUMBER, DNUM, LNAME, ADDRESS, BDATE ((( PLOCATION=‘STAFFORD’(PROJECT))
DNUM=DNUMBER (DEPARTMENT)) MGRSSN=SSN (EMPLOYEE))
8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization
http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 16/35
ICS 424 - 01 (072) Query Processing andOptimization
16
SQL Queries and Relational Algebra (4)
8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization
http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 17/35
ICS 424 - 01 (072) Query Processing andOptimization
17
Implementing Basic Query Operations
• An RDBMS must provide implementation(s) for
all the required operations including relational
operators and more
• External sorting
• Sort-merge strategy• Sorting phase
• Number of file blocks (b)
• Number of available buffers (nB)
• Runs --- (b / nB)
• Merging phase --- passes
• Degree of merging --- the number of runs that are merged
together in each pass
8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization
http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 18/35
ICS 424 - 01 (072) Query Processing andOptimization
18
Algorithms for External Sorting (1)
• External sorting:
• Refers to sorting algorithms that are suitable for large files
of records stored on disk that do not fit entirely in main
memory, such as most database files.
• Sort-Merge strategy:• Starts by sorting small subfiles (runs) of the main file and
then merges the sorted runs, creating larger sorted subfiles
that are merged in turn.
8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization
http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 19/35
ICS 424 - 01 (072) Query Processing andOptimization
19
Algorithms
for
External
Sorting (2)
8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization
http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 20/35
ICS 424 - 01 (072) Query Processing andOptimization
20
Algorithms for External Sorting (3)
• Analysis
Number of file blocks = b
Number of initial runs = nR
Available buffer space = nB
Sorting phase: nR = (b/nB) Degree of merging: dM = Min (nB-1, nR );
Number of passes: nP = (logdM(nR ))
Number of block accesses: (2 * b) + (2 * b * (logdM(nR )))
• Example done in the class
8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization
http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 21/35
ICS 424 - 01 (072) Query Processing andOptimization
21
Implementing Basic Query Operations (cont.)
• Estimates of selectivity
• Selectivity is the ratio of the number of tuples that satisfy thecondition to the total number of tuples in the relation.
• SELECT ( ) operator implementation
1. Linear search
2. Binary search3. Using a primary index (or hash key)
4. Using primary index to retrieve multiple records
5. Using clustering index to retrieve multiple records
6. Using a secondary index on an equality comparison
7. Conjunctive selection using an individual index
8. Conjunctive selection using a composite index
9. Conjunctive selection by intersection of record pointers
8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization
http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 22/35
ICS 424 - 01 (072) Query Processing andOptimization
22
Implementing Basic Query Operations (cont.)
• JOIN operator implementation
1. Nested-loop join
2. Sort-merge join
3. Hash join
• Partition Hash join• Hybrid hash join
• PROJECT operator implementation
• Set operator implementation
• Implementing Aggregate operators/functions
• Implementing OUTER JOIN
8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization
http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 23/35
ICS 424 - 01 (072) Query Processing andOptimization 23
8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization
http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 24/35
ICS 424 - 01 (072) Query Processing andOptimization 24
Buffer Space and Join performance
In the nested-loop join, it makes a difference which file is chosen forthe outer loop and which for the inner loop. If EMPLOYEE is used for
the outer loop, each block of EMPLOYEE is read once, and the entireDEPARTMENT file (each of its blocks) is read once for each time weread in ( nB - 2) blocks of the EMPLOYEE file. We get the following:
Total number of blocks accessed for outer file = bE
Number of times ( nB - 2) blocks of outer file are loaded = bE/ nB – 2
Total number of blocks accessed for inner file = bD * bE/ nB – 2
Hence, we get the following total number of block accesses:
bE + ( bE/ nB – 2 * bD) = 2000 + ( (2000/5) * 10) = 6000 blocks
On the other hand, if we use the DEPARTMENT records in the outerloop, by symmetry we get the following total number of blockaccesses:
bD + ( bD/ nB – 2 * bE) = 10 + ((10/5) * 2000) = 4010 blocks
8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization
http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 25/35
ICS 424 - 01 (072) Query Processing andOptimization 25
Implementing Basic Query Operations (cont.)
• Combining operations using pipelining
• Temporary files based processing
• Pipelining or stream-based processing
• Example: consider the execution of the following query
list of attributes( ( c1(R) ( c2 (S))
8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization
http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 26/35
ICS 424 - 01 (072) Query Processing andOptimization 26
General Transformation Rules for
Relational Algebra Operations
1. Cascade of
: A conjunctive selection condition can be broken up into a cascade (that is, a sequence) ofindividual operations: C1 AND C2 AND ….AND Cn (R) ≡ C1 (C2( …(Cn(R))…)
2. Commutativity of : The operation is commutative:
C1(C2(R)) ≡ C2(C1(R))3. Cascade of : In a cascade (sequence) of operations, all
but the last one can be ignored
4. Commuting with : If the selection condition cinvolves only those attributes A1, ..., An in the projectionlist, the two operations can be commuted
• And more …
8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization
http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 27/35
ICS 424 - 01 (072) Query Processing andOptimization 27
Heuristic-Based Query Optimization
• Outline of heuristic algebraic optimization algorithm
1. Break up SELECT operations with conjunctive conditions into a
cascade of SELECT operations2. Using the commutativity of SELECT with other operations, moveeach SELECT operation as far down the query tree as is permitted
by the attributes involved in the select condition
3. Using commutativity and associativity of binary operations,rearrange the leaf nodes of the tree
4. Combine a CARTESIAN PRODUCT operation with asubsequent SELECT operation in the tree into a JOIN operation,if the condition represents a join condition
5. Using the cascading of PROJECT and the commuting ofPROJECT with other operations, break down and move lists of
projection attributes down the tree as far as possible by creatingnew PROJECT operations as needed
6. Identify sub-trees that represent groups of operations that can beexecuted by a single algorithm
8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization
http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 28/35
ICS 424 - 01 (072) Query Processing andOptimization 28
Heuristic-Based Query Optimization:
Example
• Query
"Find the last names of employees born after 1957
who work on a project named ‘Aquarius’."
• SQLSELECT LNAME
FROM EMPLOYEE, WORKS_ON, PROJECT
WHERE PNAME=‘Aquarius’ AND PNUMBER=PNO
AND ESSN=SSN AND BDATE.‘1957-12-31’;
8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization
http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 29/35
ICS 424 - 01 (072) Query Processing andOptimization 29
8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization
http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 30/35
ICS 424 - 01 (072) Query Processing andOptimization 30
8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization
http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 31/35
ICS 424 - 01 (072) Query Processing andOptimization 31
8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization
http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 32/35
ICS 424 - 01 (072) Query Processing andOptimization 32
8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization
http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 33/35
ICS 424 - 01 (072) Query Processing andOptimization 33
8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization
http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 34/35
ICS 424 - 01 (072) Query Processing andOptimization 34
Overview of Query Optimization in Oracle
• Rule-based query optimization: the optimizer chooses execution plans
based on heuristically ranked operations.• May be phased out
• Cost-based query optimization: the optimizer examines alternative access paths and operator algorithms and chooses the execution plan with lowestestimate cost.
• The query cost is calculated based on the estimated usage of resources such as
I/O, CPU and memory needed.
• Application developers could specify hints to the ORACLE queryoptimizer.
• application developer might know more information about the data.
• SELECT /*+ ...hint... */ [rest of query]
• SELECT /*+ index(t1 t1_abc) index(t2 t2_abc) */ COUNT(*)FROM t1, t2WHERE t1.col1 = t2.col1;
8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization
http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 35/35
ICS 424 01 (072) Query Processing andO ti i ti 35
Summary
• Background review
• Processing a query
• SQL queries and relational algebra
• Implementing basic query operations
• Heuristics-based query optimization
• Overview of query optimization in Oracle