banner and the sql select statement: part four (multiple connected select statements) mark holliday...
TRANSCRIPT
Banner and the SQL Select Statement: Part Four (Multiple Connected Select Statements)
Mark HollidayDepartment of Mathematics and
Computer ScienceWestern Carolina University
18 November 2005(updated: 18 November 2005)
Outline
The Goal The Concepts
A First Example Single Table Selects Joins Multiple Connected Select Statements
A First Example
Outline The Relational Model: Single Table Lab 1: TOAD, Schema Browser Some Structured Query Language (SQL)
Basics Lab 2: TOAD, SQL Editor
Single Table Selects
Outline WHERE clause: single condition, multiple
conditions Lab 3: Order By; Aggregate Functions Lab 4: Group By; Having Lab 5:
Joins
Outline Why multiple tables? Inner Joins Lab 6: Outer joins Lab 7:
Multiple Connected Select Statements
Outline Set Operators Lab 8: Subqueries
Use directly: FROM clause Use as a set: new operators Use as a single value: aggregate functions
Lab 9: A Query Development Methodology
SET Operators
Intuition: A SQL Select statement returns a
table A table is a set
we can put a set operator in between two select statements to create a new set (that is, table)
SET Operators (franz)
Types of Set Operators :
UNION – return all rows, exclusing duplicates
UNION ALL -- returns all rows, including duplicates
INTERSECT -- returns rows retrieved in both queries
MINUS -- returns remaining rows when results of second query are subtracted from the first query
Set Operators
Two restrictions of set operators are:
The two tables must contain the same number of columns.
All corresponding columns in the two tables need to be of the same data type.
Example Tables (franz)Example Tables (franz)
Relation (stvrelt) table (13 rows) Legacy (stvlgcy) table (10 rows)
UNION (franz)
The purpose of the SQL UNION command is to combine the results of two queries.
In this respect, UNION is similar to JOIN they are both used to combine related
information from multiple tables.
UNION (franz)
When using UNION, only distinct values are selected (similar to SELECT DISTINCT).
The syntax is as follows: [SQL Statement 1]UNION[SQL Statement 2]
UNION (franz)
SELECT stvrelt_code, stvrelt_desc varchar(1); varchar(30)
FROM stvrelt Relation table
UNIONSELECT stvlgcy_code, stvlgcy_desc
varchar(1); varchar(30)FROM stvlgcy
Legacy table
Note: The default resultant set from UNION is DISTINCT rows.
UNION ALL (franz)
The difference between UNION ALL and UNION is that, while UNION only selects distinct values, UNION ALL selects all values.
The syntax for UNION ALL is as follows: [SQL Statement 1]UNION ALL[SQL Statement 2]
UNION ALL (franz)
Here is our example using Banner tables:SELECT stvrelt_code, stvrelt_desc FROM stvreltUNION ALLSELECT stvlgcy_code, stvlgcy_desc FROM stvlgcy
INTERSECT (franz)
Similar to the UNION command, INTERSECT operates on two SQL statements.
The difference is that, while UNION essentially acts as an OR operator (value is selected if it appears in either the first or
the second statement), the INTERSECT command acts as an AND
operator (value is selected only if it appears in both
statements).
INTERSECT (franz)
The syntax is as follows:
[SQL Statement 1]INTERSECT[SQL Statement 2]
INTERSECT (franz)
SELECT stvrelt_code, stvrelt_desc FROM stvreltINTERSECTSELECT stvlgcy_code, stvlgcy_desc FROM stvlgcy
MINUS (franz)
The MINUS operates on two SQL statements.
It takes all the results from the first SQL statement, and then subtract out the ones that are present in the second SQL statement to get the final answer.
If the second SQL statement includes results not present in the first SQL statement, such results are ignored.
MINUS (franz)
The syntax is as follows:
[SQL Statement 1]MINUS[SQL Statement 2]
MINUS (franz)
SELECT stvrelt_code, stvrelt_desc FROM stvreltMINUSSELECT stvlgcy_code, stvlgcy_desc FROM stvlgcy
SET Operators (franz)
A note of special importance: When using the various SQL SET operators
that we have covered, it is especially important that you
understand your data!
In the last example, if we reversed the order of the MINUS
operator, we would have completely different
results.
SET Operators (franz)
SELECT stvlgcy_code, stvlgcy_descFROM stvlgcyMINUSSELECT stvrelt_code, stvrelt_descFROM stvrelt
SET Operators (franz)
would return FIVE rows. Starting from the
“stvlgcy” table, we would look at “stvrelt”.
Any records in “stvrelt” that were not in “stvlgcy” would be ignored.
The records in “stvrelt” that were the same as “stvlgcy” would be removed.
The items grayed out in the “stvlgcy” table would be selected.
Laboratory Eight
Objectives: Develop competence with set operators
Steps: First Query
Laboratory Eight
Problem: Find the phone numbers of the people who do not live in the area code 828 region.
Laboratory Eight
Answer:SELECT sprtele_phone_numberFROM sprteleMINUSSELECT sprtele_phone_numberFROM sprteleWHERE sprtele_area_code = 828
SubquerySubquery
Intuition: The Set Operators used so far (UNION, UNION ALL,
INTERSECT, MINUS) operate on output tables (i.e. sets) but only in between select statement
Question: Can we use set operations so as to operate on an output table within another select statement?
Answer: Yes! The select statement inside the outer select
statement is called a subquery or a nested query.
SubquerySubquery
Where in a select statement can we nest a subquery (i.e. where do we use a set)?
The FROM clause since the FROM clause lists tables just have one of those tables be
generated from the subquery
Suppose we want to select non-busineses from ‘spriden’, including Suppose we want to select non-busineses from ‘spriden’, including a count of addresses, where there is more than one address a count of addresses, where there is more than one address type for that pidm:type for that pidm:
Example of a subquery in the FROM clause:
SELECT spriden_id, spriden_last_name, spriden_first_name, p_cnt FROM spriden,
(SELECT spraddr_pidm, count(spraddr_atyp_code) p_cnt FROM spraddrGROUP BY spraddr_pidm)
WHERE spriden_pidm = spraddr_pidm and p_cnt > 1and spriden_entity_ind = 'P‘
FROM Clause Subquery (franz)
SubquerySubquery
Can we use a subquery anywhere else?
Yes, but we need some help. The conditions in WHERE, and HAVING
clauses we have seen all use comparison operators that work on single
values (=, <>, >, <, <=, >=) or multiple values only in restricted ways (e.g.
LIKE)
SubquerySubquery
We need operators to compare a value with a set of values the set of values will be the output table of the
subquery the resulting expressions can be conditions in
the WHERE and HAVING clauses
Introduce new operators that work with subqueries IN, NOT IN, ANY, ALL, EXISTS, NOT EXISTS
Special operators for subqueries (franz)Special operators for subqueries (franz)
IN/NOT IN -- Check to see if a value is in a specified list of values returned in the subquery.
ANY -- Compare a value with any value in a list.
ALL -- Compare a value with all values in a list.
EXISTS/NOT EXISTS -- Check for the existence of rows returned by a subquery.
IN (franz)IN (franz)
Check to see if a value is in a specified list of values returned in the subquery
SELECT product_id, nameFROM productsWHERE product_id IN
(SELECT product_id FROM product WHERE name LIKE ‘%e%’)
NOT IN (franz)NOT IN (franz)
Check to see if a value is not in a specified list of values returned in the subquery.
SELECT product_id, nameFROM productsWHERE product_id NOT IN
(SELECT product_id FROM purchases)
ANY (franz)ANY (franz) Compare a value with any value in a list. You have to place an “=, <>, >, <, <=, or >=“
operator before ANY in the query.
SELECT employee_id, last_nameFROM employeesWHERE salary < ANY
(SELECT low_salary FROM salary_grades)
ALL (franz)ALL (franz)
Compare a value with all values in a list. You have to place an “=, <>, >, <, <=, or
>=“ operator before ALL in the query.
SELECT employee_id, last_nameFROM employeesWHERE salary > ALL
(SELECT high_salary FROM salary_grades)
If the table variable declared in the outer query is used in the subquery
the subquery is said to be correlated (otherwise, it is uncorrelated)
In an uncorrelated subquery, the subquery is just evaluated once during the outer query.
Correlated Subquery
In a correlated subquery, the set of rows output from the subquery can vary for each value of the outer table variable.
the subquery is reevaluated for each value of the outer table variable.
EXISTS and NOT EXISTS tend to be used in correlated subqueries as in the examples following
Correlated Subquery
EXISTS (franz)EXISTS (franz)
Check for the existence of rows returned by a subquery.
SELECT employee_id, last_nameFROM employees outerWHERE EXISTS
(SELECT employee_id FROM employees inner WHERE inner.manager_id =
outer.employee_id)
‘EXISTS’ just checks for the existence of rows returned by the subquery, not the actual values.
To make your query run faster, you can just return a literal value.
EXISTS (franz)
A re-write of our previous example:
SELECT employee_id, last_nameFROM employees outerWHERE EXISTS
(SELECT 1 FROM employees inner WHERE inner.manager_id = outer.employee_id)
EXISTS (franz)
Retrieve products that have not been purchased:
SELECT product_id, nameFROM products outerWHERE NOT EXISTS
(SELECT 1 FROM purchases inner WHERE
inner.product_id = outer.product_id)
NOT EXISTS (franz)
A Multi-Condition WHERE Clause Subquery (franz)
The WHERE clause can have conditions besides the one The WHERE clause can have conditions besides the one using the subquery.using the subquery.
Suppose we want to select non-busineses from ‘spriden’ that Suppose we want to select non-busineses from ‘spriden’ that do NOT have a record in ‘spbpers’:do NOT have a record in ‘spbpers’:
SELECT spriden_id, spriden_last_name, spriden_first_name
FROM spriden WHERE
spriden_entity_ind = 'P'AND spriden_pidm not in
(SELECT spbpers_pidm FROM spbpers)
Single Value Subquery
Question: Can we do even more with subqueries?
More Specific Question: Can we use the output table generated by a subquery in conditions that use the single value comparison operators (=, >, …)?
Answer: Yes! (sometimes)
Some SQL select statements are guaranteed to return an output table that is a set with only one value.
Which ones? Those with an aggregate function in the SELECT clause.
Single Value Subquery (franz)Single Value Subquery (franz)
SELECT "column_name1" FROM "table_name" WHERE "column_name2"
[Comparison Operator] (SELECT AGGREGATE
FUNCTION("column_name1“) FROM "table_name"WHERE [Condition])
[Comparison Operator] can be =, >, <, >=, <=. or "LIKE."
This is not an example from Banner, but from a made-up table.
SELECT product_type_id, AVG(price)FROM productsGROUP BY product_type_idHAVING AVG(price) <
(SELECT MAX(AVG(price)) FROM products GROUP BY product_type_id)
The following data illustrates this subquery example...
Single Value Subquery and
HAVING Clause Subquery (franz)
Single Value SubquerySingle Value Subquery
The same table alias is not used in both the outer query and the subquery
=> the query is uncorrelated => the subquery only needs to be evaluated
once
Single Value Subquery (franz)Single Value Subquery (franz)
In the products table, there are multiple rows for each product_type_id.
Each row has a price, along with other information (which is not shown in our example).
Single Value Subquery (franz)Single Value Subquery (franz)
For each product_type_id in the products table: •we sum the prices (and divide by the total number of prices)
•then determine the average price for product_type_id.
•Each row has an average price, unique to the product_type_id [GROUP BY].
Single Value Subquery (franz)Single Value Subquery (franz)
SELECT product_type_id, AVG(price)FROM productsGROUP BY product_type_id
Among these average prices, we locate the maximum price out of all the averages – which is $26.22.
Single Value Subquery (franz)Single Value Subquery (franz)
From the product_type_id average prices, select the ones that are less than $26.22.
HAVING AVG(price) <(SELECT MAX(AVG(price)) FROM products GROUP BY product_type_id)
Single Value Subquery (franz)
Our original query (franz)Our original query (franz)
SELECT product_type_id, AVG(price)FROM productsGROUP BY product_type_idHAVING AVG(price) <
(SELECT MAX(AVG(price)) FROM products GROUP BY product_type_id)
Break it down into smaller steps. Then put the larger query together.Break it down into smaller steps. Then put the larger query together.
Laboratory Nine
Objectives: Develop competence with subqueries
Steps: First Query Second Query
Laboratory Nine
First QueryProblem: Find the phone numbers of the
people who do not live in the area code 828 region.
Requirement: Use an uncorrelated subquery.
Laboratory Nine
Solution:SELECT s1.sprtele_phone_numberFROM sprtele s1WHERE s1.sprtele_pidm NOT IN
(SELECT s2.sprtele_pidmFROM sprtele s2WHERE s2.sprtele_area_code = 828)
Laboratory Nine
Second QueryProblem: Find the phone numbers of the
people who do not live in the area code 828 region.
Requirement: Use an correlated subquery.
Laboratory Nine
Solution:SELECT s1.sprtele_phone_numberFROM sprtele s1WHERE NOT EXISTS
(SELECT s2.sprtele_pidmFROM sprtele s2WHERE s1.sprtele_pidm = s2.sprtele_pidm and s2.sprtele_area_code = 828)
A Query Development Methodology
Problem: queries can be complicated => easy to introduce bugs => difficult to find bugs
=> complexity of the select statement => large size of input table => limited number of instances of input table tested
Solution: be aware of types of bugs incremental development multiple small input data sets with correct
output known
A Query Development Methodology
Types of Bugs: Compile-time: Banner emits error
message about illegal syntax when query is submitted
Run-time: Banner emits error message when query is executing that an illegal operation occurred
Logic: No error message!
A Query Development Methodology
Logic Errors Your query has legal syntax and does
not cause any illegal operations => result table is generated
But your query is not doing what you think it is doing
Hard to detect (the result table may be the correct result table for this particular input table instance)
A Query Development Methodology
Incremental Development: Don’t try to come up with a complete solution
(select statement) all at once. Develop a select statement for a small part of the
query test that partial solution using all the input data
sets to make sure it works if it does not work, edit it, and try again
Extend the select statement to include more of the complete query and repeat
A Query Development Methodology
Multiple small input data sets
Develop instances of the input table that are small (so understandable) test all the cases for the possible real input
to check the tentative partial select statement is correct
usually artificial (made just for testing) to satisfy the previous constraints