database design i
TRANSCRIPT
DATABASE DESIGN I - 1DL300
Fall 2010
An introductory course on database systems
2010-11-11 1Manivasakan Sabesan- UDBL - IT - UU
http://www.it.uu.se/edu/course/homepage/dbastekn/ht10/
Manivasakan SabesanUppsala Database Laboratory
Department of Information Technology, Uppsala University, Uppsala, Sweden
Introduction to Relational Algebra
Elmasri/Navathe ch 6Padron-McCarthy/Risch ch 10
Manivasakan Sabesan
2010-11-11 2Manivasakan Sabesan- UDBL - IT - UU
Department of Information TechnologyUppsala University, Uppsala, Sweden
Query languages
• Languages where users can express what information to retrieve from the database.
• Categories of query languages:– Procedural– Non-procedural (declarative)
2010-11-11 3Manivasakan Sabesan- UDBL - IT - UU
– Non-procedural (declarative)
• Formal (“pure”) languages:– Relational algebra– Relational calculus
• Tuple-relational calculus• Domain-relational calculus
– Formal languages form underlying basis of query languages that people use.
Relational algebra
• Relational algebra is a procedural language
• Operations in relational algebra takes two or more relations as arguments and return a new relation.
• Relational algebraic operations:– Operations from set theory:
2010-11-11 4Manivasakan Sabesan- UDBL - IT - UU
– Operations from set theory: • Union, Intersection, Difference, Cartesian product
– Operations specifically introduced for the relational data model:• Select, Project, Join
• It have been shown that the select, project, union, difference, andCartesian product operations form a complete set. That is any other relational algebra operation can be expressed in these.
Operations from set theory
• Relations are required to be union compatible to be able to take part in the union, intersection and difference operations.
• Two relations R1 and R2 is said to be union-compatible if:
R ⊆ D × D × ... × D and
2010-11-11 5Manivasakan Sabesan- UDBL - IT - UU
R1 ⊆ D1 × D2 × ... × Dn and R2 ⊆ D1 × D2 × ... × Dn
i.e. if they have the same degree and the same domains.
Union operation
• The union of two union-compatible relations R and S is the set of all tuples that either occur in R, S, or in both.
• Notation: R ∪ S
• Defined as: R ∪ S = {t | t ∈ R or t ∈ S}
• For example:
2010-11-11 6Manivasakan Sabesan- UDBL - IT - UU
• For example:
R S
A B
aab
121
U A B
ab
23
= A B
aabb
1213
Difference operation
• The difference between two union-compatible sets R and S is the set of all tuples that occur in R but not in S.
• Notation: R − S
• Defined as: R − S = {t | t ∈ R and t ∉ S}
• For example:
2010-11-11 7Manivasakan Sabesan- UDBL - IT - UU
• For example:
R S
A B
aab
121
− A B
ab
23
= A B
ab
11
Intersection
• The intersection of two union-compatible sets R and S, is the set of all tuples that occur in both R and S.
• Notation: R∩S
• Defined as: R ∩ S = {t | t ∈ R and t ∈ S}
• For example:
2010-11-11 8Manivasakan Sabesan- UDBL - IT - UU
• For example:
R S
A B
aab
121
∩ A B
ab
23
= A B
a 2
Cartesian product
• Let R and S be relations with k1 and k2 arities resp. The Cartesian product of R and S is the set of all possible k1+k2 tuples where the first k1 components constitute a tuple in R and the last k2 components a tuple in S.
• Notation: R × S
• Defined as: R × S = {t q | t ∈ R and q ∈ S}• Assume that attributes of r(R) and s(S) are disjoint. (i.e. R ) S = �). If attributes
of r(R) and s(S) are not disjoint, then renaming must be used.
2010-11-11 9Manivasakan Sabesan- UDBL - IT - UU
of r(R) and s(S) are not disjoint, then renaming must be used.
X =
Cartesian product example
A B
ab
12
× C D
abbc
5565
= A B
aaaa
1111
C D
abbc
5565
2010-11-11 10Manivasakan Sabesan- UDBL - IT - UU
c 5 abbbb
12222
cabbc
55565
Selection operation
• The selection operator, σ, selects a specific set of tuples from a relation according to a selection condition (or selection predicate) P.
• Notation: σ P(R)
• Defined as: σ P(R) = {t | t ∈ R and P(t) } (i.e. the set of tuples t in R that fulfills the condition P)
• Where P is a logical expression(*) consisting of terms connected by:
2010-11-11 11Manivasakan Sabesan- UDBL - IT - UU
• Where P is a logical expression(*) consisting of terms connected by: ∧ (and), ∨ (or), ¬ (not) and each term is one of: <attribute> op <attribute> or <constant>, where op is one of: =,≠, >, ≥, <, ≤
Example:σSALARY>30000(EMPLOYEE)
(*) a formula in propositional calculus
Selection example
= A B
aabb
abbb
C D
1524
7739
R
2010-11-11 12Manivasakan Sabesan- UDBL - IT - UU
b b 4 9
σA=B ∧ D > 5 (R) = A B
ab
ab
C D
14
79
Projection operation
• The projection operator, Π, picks out (or projects) listed columns from a relation and creates a new relation consisting of these columns.
• Notation: Π A1,A2,...,Ak(R), where A1, A2 are attribute names and
R is a relation name.• The result is a new relation of k columns.
2010-11-11 13Manivasakan Sabesan- UDBL - IT - UU
• The result is a new relation of k columns.
• Duplicate rows removed from result, since relations are sets.
Example: Π LNAME,FNAME,SALARY(EMPLOYEE)
Projection example
= A B
aabb
1234
C
1112
R
2010-11-11 14Manivasakan Sabesan- UDBL - IT - UU
b 4 2
Π A,C (R) = A C
aabb
1112
= A C
abb
112
Join operator
• The join operator, ⊗ (almost, correct ), creates a new relation by joining related tuples from two relations.
• Notation: R ⊗ C SC is the join condition which has the form Ar θ As , where θ is one of {=, <, >, ≤, ≥, ≠}. Several terms can be connected as C1
2010-11-11 15Manivasakan Sabesan- UDBL - IT - UU
one of {=, <, >, ≤, ≥, ≠}. Several terms can be connected as C1∧C2 ∧...Ck.
• A join operation with this kind of general join condition is called “Theta join”.
Example Theta join
θA≤F = A B
11
22
C D
33
27
A B
16
27
C
38
D E
27
33
F
45
R θA≤F S
E
33
F
45
R S
2010-11-11 16Manivasakan Sabesan- UDBL - IT - UU
1169
2277
3388
7777
69
77
88
77
38
59
3888
5999
Equijoin
• The same as join but it is required that attribute Ar and attribute Asshould have the same value.
• Notation: R ⊗C SC is the join condition which has the form Ar = As. Several terms can be connected as C ∧C ∧...C .
2010-11-11 17Manivasakan Sabesan- UDBL - IT - UU
be connected as C1 ∧C2 ∧...Ck.
Example Equijoin
⊗B=C = A B
aa
24
C D
24
dd
A B
aa
24
C D
24
dd
E
ee
R ⊗B=C S
E
ee
R S
2010-11-11 18Manivasakan Sabesan- UDBL - IT - UU
a 4 4 da 4 49
dd
ee
e
Natural join
• Natural join is equivalent with the application of join to R and S with the equality condition Ar = As (i.e. an equijoin) and then removing the redundant column As in the result.
• Notation: R *Ar,As SA ,A are attribute pairs that should fulfill the join condition which
2010-11-11 19Manivasakan Sabesan- UDBL - IT - UU
Ar,As are attribute pairs that should fulfill the join condition which has the form Ar = As. Several terms can be connected as C1 ∧C2∧...Ck.
Example Natural join
⊗B=C = A B
aa
24
D
dd
A B
aa
24
C D
24
dd
E
ee
R ∗B=C S
E
ee
R S
2010-11-11 20Manivasakan Sabesan- UDBL - IT - UU
a 4 da 4 49
dd
ee
e
Composition of operations
• Expressions can be built by composing multiple operations
• Example: σA=C (R × S)
A B
ab
12
× C D
ab
55
= A B
aa
11
C D
ab
55
R × S =
2010-11-11 21Manivasakan Sabesan- UDBL - IT - UU
σA=C (R × S) =
b 2 bbc
565
aaabbbb
1112222
bbcabbc
5655565
A B
abb
122
C D
abb
556
Assignment operation
• The assignment operation (← ) makes it possible to assign the result of an expression to a temporary relation variable.
• Example:
temp ← σdno = 5 (EMPLOYEE)
2010-11-11 22Manivasakan Sabesan- UDBL - IT - UU
dno = 5
result ← ∏fname,lname,salary (temp)
• The result to the right of the ← is assigned to the relation variable on the left of the ←.
• The variable may be used in subsequent expressions.
Renaming relations and attribute
• The assignment operation can also be used to rename relations and attributes.
• Example:
NEWEMP ← σ (EMPLOYEE)
2010-11-11 23Manivasakan Sabesan- UDBL - IT - UU
NEWEMP ← σdno = 5(EMPLOYEE)
R(FIRSTNAME,LASTNAME,SALARY) ←∏fname,lname,salary (NEWEMP)
Division operation
• Suited to queries that include the phrase “for all”.
• Let R and S be relations on schemas R and S respectively, where
R = (A1,...,Am ,B1,...,Bn)S = (B1,...,Bn)
2010-11-11 24Manivasakan Sabesan- UDBL - IT - UU
S = (B1,...,Bn)
• The result of R ÷ S is a relation on the schema R - S = (A1,...,Am)
R ÷ S = {t | t ∈ ΠR-S (R) ∀u ∈ S ∧ tu ∈ R}
Example of Division operation
÷ = A
ae
A B
aaa
123
B
12
R ÷ SR S
2010-11-11 25Manivasakan Sabesan- UDBL - IT - UU
abcddddee
311134612
Additional relational operations
• Outer join and outer union (presented together with SQL)
• Aggregation operations (presented together with SQL)
• Update operations (presented together with SQL)– (not part of pure query language)
2010-11-11 26Manivasakan Sabesan- UDBL - IT - UU
Aggregation operations
• Presented together with SQL later
• Examples of aggregation operations– avg
– min
– max
2010-11-11 27Manivasakan Sabesan- UDBL - IT - UU
– max
– sum
– count
Update operations
• Presented together with SQL later
• Operations for database updates are normally part of the DML– insert (of new tuples)
– update (of attribute values)
– delete (of tuples)
2010-11-11 28Manivasakan Sabesan- UDBL - IT - UU
– (of tuples)
• Can be expressed by means of the assignment operator
Outer join/union operation
• Extensions of the join/union operations that avoid loss of information.
• Computes the join/union and then adds tuples from one relation that do not match tuples in the other relation to the result of the join.
• Fills out with null values:– null signifies that the value is unknown or does not exist.
2010-11-11 29Manivasakan Sabesan- UDBL - IT - UU
– null signifies that the value is unknown or does not exist.
– All comparisons involving null are false by definition.
Example Outer join
• Relation loan
branch-name
DowntownRedwoodPerryridge
loan-number
1-170L-230L-260
amount
300040001700
2010-11-11 30Manivasakan Sabesan- UDBL - IT - UU
• Relation borrower
customer-name
JonesSmithHayes
loan-number
1-170L-230L-155
Example Outer join cont...
• loan ⊗ borrower (natural join)
branch-name
DowntownRedwood
loan-number
1-170L-230
amount
30004000
customer-name
JonesSmith
2010-11-11 31Manivasakan Sabesan- UDBL - IT - UU
• loan ⊗ left borrower (left outer join)
customer-name
JonesSmithnull
loan-number
1-170L-230null
branch-name
DowntownRedwoodPerryridge
loan-number
1-170L-230L-260
amount
300040001700
Example Outer join cont...
• loan ⊗right borrower (natural right outer join)
• loan ⊗ borrower (natural full outer join)
customer-name
JonesSmithHayes
branch-name
DowntownRedwood
null
loan-number
L-170L-230L-155
amount
30004000null
2010-11-11 32Manivasakan Sabesan- UDBL - IT - UU
• loan ⊗full borrower (natural full outer join)
customer-name
JonesSmithnull
Hayes
branch-name
DowntownRedwoodPerryridge
null
loan-number
L-170L-230L-260L-155
amount
300040001700null