lecture 8 alg
TRANSCRIPT
-
7/28/2019 Lecture 8 Alg
1/30
By relieving the brain of all unnecessary
work, a good notation sets it free to
concentrate on more advanced problems,
and, in effect, increases the mental power of
the race.
-- Alfred North Whitehead (1861 - 1947)
Relational Algebra
CS 186 Spring 2006, Lecture 8
R & G, Chapter 4
p
-
7/28/2019 Lecture 8 Alg
2/30
Administrivia
Midterm 1 Tues 2/21, in class Covers all material up to and including Th 2/16
Closed bookcan bring 1 8.5x11 sheet of notes
No calculators no cell phones
Midterm 2 Th 3/21, in class
Focuses on material after MT 1
Same rules as above.
Homework 1 2/14 Homework 2 Out after MT 1, Due 3/14
-
7/28/2019 Lecture 8 Alg
3/30
Relational Query Languages
Query languages:Allow manipulation and retrieval ofdata from a database.
Relational model supports simple, powerful QLs:
Strong formal foundation based on logic.Allows for much optimization.
Query Languages != programming languages!
QLs not expected to be Turing complete.
QLs not intended to be used for complex calculations.
QLs support easy, efficient access to large data sets.
-
7/28/2019 Lecture 8 Alg
4/30
Formal Relational Query Languages
Two mathematical Query Languages form thebasis for real languages (e.g. SQL), and forimplementation:
Relational Algebra: More operational, veryuseful for representing execution plans.
Relational Calculus: Lets users describe what
they want, rather than how to compute it.(Non-procedural, declarative.)
Understanding Algebra & Calculus is key to
understanding SQL, query processing!
-
7/28/2019 Lecture 8 Alg
5/30
Preliminaries
A query is applied to relation instances, and theresult of a query is also a relation instance.
Schemasof input relations for a query are fixed (butquery will run over any legal instance)
The schema for the resultof a given query is alsofixed. It is determined by the definitions of the querylanguage constructs.
Positional vs. named-field notation:
Positional notation easier for formal definitions,named-field notation more readable.
Both used in SQL
-
7/28/2019 Lecture 8 Alg
6/30
Relational Algebra: 5 Basic Operations
Selection ( ) Selects a subset ofrowsfromrelation (horizontal).
Projection ( ) Retains only wanted columnsfrom relation (vertical).
Cross-product (x) Allows us to combine tworelations.
Set-difference () Tuples in r1, but not in r2.
Union ( ) Tuples in r1 and/or in r2.
Since each operation returns a relation, operationscan
be composed! (Algebra is closed.)
p
-
7/28/2019 Lecture 8 Alg
7/30
Example Instances R1
S1
S2
Boats
bid bname color
101 Interlake blue102 Interlake red
103 Clipper green
104 Marine red
sid bid day
22 101 10/10/96
58 103 11/12/96
sid sname rating age
22 dustin 7 45.0
31 lubber 8 55.558 rusty 10 35.0
sid sname rating age28 yuppy 9 35.0
31 lubber 8 55.5
44 guppy 5 35.0
58 rusty 10 35.0
-
7/28/2019 Lecture 8 Alg
8/30
Projection
page S( )2 Examples: ;
Retains only attributes that are in the projectionlist.
Schemaof result:
exactly the fields in the projection list, with thesame names that they had in the input relation.
Projection operator has to eliminate duplicates(How do they arise? Why remove them?)
Note: real systems typically dont do duplicateelimination unless the user explicitly asks for it.(Why not?)
p
sname rating
S
,
( )2
-
7/28/2019 Lecture 8 Alg
9/30
Projection
)2(, Sratingsnamep
p
ageS( )2
S2
sid sname rating age
28 yuppy 9 35.0
31 lubber 8 55.5
44 guppy 5 35.0
58 rusty 10 35.0
sname rating
yuppy 9
lubber 8guppy 5rusty 10
age
35.055.5
-
7/28/2019 Lecture 8 Alg
10/30
Selection ()
rating S8 2( ) p sname rating rating S, ( ( )) 8 2
Selects rows that satisfy selection condition. Result is a relation.
Schemaof result is same as that of the input relation.
Do we need to do duplicate elimination?
sid sname rating age
28 yuppy 9 35.0
31 lubber 8 55.5
44 guppy 5 35.0
58 rusty 10 35.0
sname ratingyuppy 9
rusty 10
-
7/28/2019 Lecture 8 Alg
11/30
Union and Set-Difference
All of these operations take two input relations,which must beunion-compatible:
Same number of fields.
`Corresponding fields have the same type.
For which, if any, is duplicate elimination
required?
-
7/28/2019 Lecture 8 Alg
12/30
Union
S S1 2
S1
S2
sid sname rating age
22 dustin 7 45.0
31 lubber 8 55.5
58 rusty 10 35.0
sid sname rating age
28 yuppy 9 35.031 lubber 8 55.5
44 guppy 5 35.0
58 rusty 10 35.0
sid sname rating age22 dustin 7 45.031 lubber 8 55.558 rusty 10 35.0
44 guppy 5 35.028 yuppy 9 35.0
-
7/28/2019 Lecture 8 Alg
13/30
Set Difference
S1
S2
S S1 2
S2S1
sid sname rating age
22 dustin 7 45.0
31 lubber 8 55.5
58 rusty 10 35.0
sid sname rating age
28 yuppy 9 35.031 lubber 8 55.5
44 guppy 5 35.0
58 rusty 10 35.0
sid sname rating age22 dustin 7 45.0
sid sname rating age
28 yuppy 9 35.044 guppy 5 35.0
-
7/28/2019 Lecture 8 Alg
14/30
Cross-Product
S1 x R1: Each row of S1 paired with each row of R1.
Q: How many rows in the result?
Result schemahas one field per field of S1 and R1,with field names `inherited if possible.
May have a naming conflict: Both S1 and R1 havea field with the same name.
In this case, can use the renaming operator:
( ( , ), )C sid sid S R1 1 5 2 1 1
-
7/28/2019 Lecture 8 Alg
15/30
Cross Product Example
R1 S1
R1 X S1 =
sid sname rating age
22 dustin 7 45.0
31 lubber 8 55.5
58 rusty 10 35.0
sid bid day22 101 10/10/96
58 103 11/12/96
(sid) sname rating age (sid) bid day
22 dustin 7 45.0 22 101 10/10/
22 dustin 7 45.0 58 103 11/12/31 lubber 8 55.5 22 101 10/10/
31 lubber 8 55.5 58 103 11/12/
58 rusty 10 35.0 22 101 10/10/
58 rusty 10 35.0 58 103 11/12/
-
7/28/2019 Lecture 8 Alg
16/30
Compound Operator: Intersection
In addition to the 5 basic operators, there areseveral additional Compound Operators
These add no computational power to thelanguage, but are useful shorthands.
Can be expressed solely with the basic ops.
Intersection takes two input relations, whichmust beunion-compatible.
Q: How to express it using basic operators?
R S = R (R S)
-
7/28/2019 Lecture 8 Alg
17/30
Intersection
S1
S2
S S1 2
sid sname rating age
31 lubber 8 55.558 rusty 10 35.0
sid sname rating age
22 dustin 7 45.0
31 lubber 8 55.5
58 rusty 10 35.0
sid sname rating age
28 yuppy 9 35.031 lubber 8 55.5
44 guppy 5 35.0
58 rusty 10 35.0
-
7/28/2019 Lecture 8 Alg
18/30
Compound Operator: Join
Joins are compound operators involving cross product,
selection, and (sometimes) projection. Most common type of join is a natural join (often just called
join). R S conceptually is: Compute R X S
Select rows where attributes that appear in both relations have equal
values Project all unique atttributes and one copy of each of the common
ones.
Note: Usually done much more efficiently than this.
Useful for putting normalized relations back together.
-
7/28/2019 Lecture 8 Alg
19/30
Natural Join Example
R1 S1
R1 S1 =
sid sname rating age bid day22 dustin 7 45.0 101 10/10/9658 rusty 10 35.0 103 11/12/96
sid sname rating age
22 dustin 7 45.0
31 lubber 8 55.5
58 rusty 10 35.0
sid bid day22 101 10/10/96
58 103 11/12/96
-
7/28/2019 Lecture 8 Alg
20/30
Other Types of Joins
Condition Join (or theta-join):
Result schemasame as that of cross-product.
May have fewer tuples than cross-product.
Equi-Join: Special case: condition ccontains
only conjunction ofequalities.
R c S c R S ( )
-
7/28/2019 Lecture 8 Alg
21/30
Theta Join Examplesid sname rating age
22 dustin 7 45.0
31 lubber 8 55.5
58 rusty 10 35.0
sid bid day
22 101 10/10/96
58 103 11/12/96
R1 S1
S1S1.sidR1.sid
R1 =
(sid) sname rating age (sid) bid day22 dustin 7 45.0 58 103 11/12/9631 lubber 8 55.5 58 103 11/12/96
-
7/28/2019 Lecture 8 Alg
22/30
Compound Operator: Division
Useful for expressing for all queries like:Find sids of sailors who have reservedallboats.
For A/B attributes of B are subset of attrs of A.
May need to project to make this happen.
E.g., letAhave 2 fields, xand y; Bhave onlyfield y:
A/Bcontains allxtuples such that for everyytuple in B, there is anxytuple inA.
A B x y B( x,y A)
-
7/28/2019 Lecture 8 Alg
23/30
-
7/28/2019 Lecture 8 Alg
24/30
Expressing A/B Using Basic Operators
Division is not essential op; just a useful shorthand.
(Also true of joins, but joins are so common that systemsimplement joins specially.)
Idea: ForA/B, compute allxvalues that are not`disqualified by some yvalue in B.
xvalue is disqualifiedif by attaching yvalue from B, weobtain an xytuple that is not inA.
Disqualified x values: p px x A B A(( ( ) ) )
A/B: px A( ) Disqualified x values
-
7/28/2019 Lecture 8 Alg
25/30
ExamplesReserves
Sailors
Boats
sid bid day
22 101 10/10/96
58 103 11/12/96sid sname rating age
22 dustin 7 45.0
31 lubber 8 55.558 rusty 10 35.0
bid bname color101 Interlake Blue
102 Interlake Red
103 Clipper Green
104 Marine Red
-
7/28/2019 Lecture 8 Alg
26/30
Find names of sailors whove reserved boat #103
Solution 1: p sname bidserves Sailors(( Re ) )
103
Solution 2: p sname bid
serves Sailors( (Re ))103
-
7/28/2019 Lecture 8 Alg
27/30
Find names of sailors whove reserved a red boat
Information about boat color only available inBoats; so need an extra join:
p sname color red Boats serves Sailors(( ' ' ) Re )
A more efficient (???) solution:
psname
(psid
((pbid
(color'red'
Boats)) Res) Sailors)
A query optimizer can find this given the first solution!
-
7/28/2019 Lecture 8 Alg
28/30
Find sailors whove reserved a red or a green boat
Can identify all red or green boats, then findsailors whove reserved one of these boats:
( , (
' ' ' '
))Tempboats
color red color green
Boats
psname Tempboats serves Sailors( Re )
-
7/28/2019 Lecture 8 Alg
29/30
Find sailors whove reserved a red and a green boat
Previous approach wont work! Must identifysailors whove reserved red boats, sailors whovereserved green boats, then find the intersection(note that sidis a key for Sailors):
p ( , ((' '
) Re ))Tempredsid color red
Boats serves
psname Tempred Tempgreen Sailors(( ) )
p ( , (( ' ' ) Re ))Tempgreen sid color green Boats serves
-
7/28/2019 Lecture 8 Alg
30/30
Find the names of sailors whove reserved all boats
Uses division; schemas of the input relations to/ must be carefully chosen:
p p( , (,
Re ) / ( ))Tempsidssid bid
servesbid
Boats
psname Tempsids Sailors( )
To find sailors whove reserved all Interlake boats:
/ (' '
)p bid bname Interlake
Boats
.....