database design i

32
DATABASE DESIGN I - 1DL300 Fall 2010 An introductory course on database systems 2010-11-11 1 Manivasakan Sabesan- UDBL - IT - UU http://www.it.uu.se/edu/course/homepage/dbastekn/ht10/ Manivasakan Sabesan Uppsala Database Laboratory Department of Information Technology, Uppsala University, Uppsala, Sweden

Upload: others

Post on 12-Sep-2021

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: DATABASE DESIGN I

DATABASE DESIGN I - 1DL300

Fall 2010

An introductory course on database systems

2010-11-11 1Manivasakan Sabesan- UDBL - IT - UU

http://www.it.uu.se/edu/course/homepage/dbastekn/ht10/

Manivasakan SabesanUppsala Database Laboratory

Department of Information Technology, Uppsala University, Uppsala, Sweden

Page 2: DATABASE DESIGN I

Introduction to Relational Algebra

Elmasri/Navathe ch 6Padron-McCarthy/Risch ch 10

Manivasakan Sabesan

2010-11-11 2Manivasakan Sabesan- UDBL - IT - UU

Department of Information TechnologyUppsala University, Uppsala, Sweden

Page 3: DATABASE DESIGN I

Query languages

• Languages where users can express what information to retrieve from the database.

• Categories of query languages:– Procedural– Non-procedural (declarative)

2010-11-11 3Manivasakan Sabesan- UDBL - IT - UU

– Non-procedural (declarative)

• Formal (“pure”) languages:– Relational algebra– Relational calculus

• Tuple-relational calculus• Domain-relational calculus

– Formal languages form underlying basis of query languages that people use.

Page 4: DATABASE DESIGN I

Relational algebra

• Relational algebra is a procedural language

• Operations in relational algebra takes two or more relations as arguments and return a new relation.

• Relational algebraic operations:– Operations from set theory:

2010-11-11 4Manivasakan Sabesan- UDBL - IT - UU

– Operations from set theory: • Union, Intersection, Difference, Cartesian product

– Operations specifically introduced for the relational data model:• Select, Project, Join

• It have been shown that the select, project, union, difference, andCartesian product operations form a complete set. That is any other relational algebra operation can be expressed in these.

Page 5: DATABASE DESIGN I

Operations from set theory

• Relations are required to be union compatible to be able to take part in the union, intersection and difference operations.

• Two relations R1 and R2 is said to be union-compatible if:

R ⊆ D × D × ... × D and

2010-11-11 5Manivasakan Sabesan- UDBL - IT - UU

R1 ⊆ D1 × D2 × ... × Dn and R2 ⊆ D1 × D2 × ... × Dn

i.e. if they have the same degree and the same domains.

Page 6: DATABASE DESIGN I

Union operation

• The union of two union-compatible relations R and S is the set of all tuples that either occur in R, S, or in both.

• Notation: R ∪ S

• Defined as: R ∪ S = {t | t ∈ R or t ∈ S}

• For example:

2010-11-11 6Manivasakan Sabesan- UDBL - IT - UU

• For example:

R S

A B

aab

121

U A B

ab

23

= A B

aabb

1213

Page 7: DATABASE DESIGN I

Difference operation

• The difference between two union-compatible sets R and S is the set of all tuples that occur in R but not in S.

• Notation: R − S

• Defined as: R − S = {t | t ∈ R and t ∉ S}

• For example:

2010-11-11 7Manivasakan Sabesan- UDBL - IT - UU

• For example:

R S

A B

aab

121

− A B

ab

23

= A B

ab

11

Page 8: DATABASE DESIGN I

Intersection

• The intersection of two union-compatible sets R and S, is the set of all tuples that occur in both R and S.

• Notation: R∩S

• Defined as: R ∩ S = {t | t ∈ R and t ∈ S}

• For example:

2010-11-11 8Manivasakan Sabesan- UDBL - IT - UU

• For example:

R S

A B

aab

121

∩ A B

ab

23

= A B

a 2

Page 9: DATABASE DESIGN I

Cartesian product

• Let R and S be relations with k1 and k2 arities resp. The Cartesian product of R and S is the set of all possible k1+k2 tuples where the first k1 components constitute a tuple in R and the last k2 components a tuple in S.

• Notation: R × S

• Defined as: R × S = {t q | t ∈ R and q ∈ S}• Assume that attributes of r(R) and s(S) are disjoint. (i.e. R ) S = �). If attributes

of r(R) and s(S) are not disjoint, then renaming must be used.

2010-11-11 9Manivasakan Sabesan- UDBL - IT - UU

of r(R) and s(S) are not disjoint, then renaming must be used.

X =

Page 10: DATABASE DESIGN I

Cartesian product example

A B

ab

12

× C D

abbc

5565

= A B

aaaa

1111

C D

abbc

5565

2010-11-11 10Manivasakan Sabesan- UDBL - IT - UU

c 5 abbbb

12222

cabbc

55565

Page 11: DATABASE DESIGN I

Selection operation

• The selection operator, σ, selects a specific set of tuples from a relation according to a selection condition (or selection predicate) P.

• Notation: σ P(R)

• Defined as: σ P(R) = {t | t ∈ R and P(t) } (i.e. the set of tuples t in R that fulfills the condition P)

• Where P is a logical expression(*) consisting of terms connected by:

2010-11-11 11Manivasakan Sabesan- UDBL - IT - UU

• Where P is a logical expression(*) consisting of terms connected by: ∧ (and), ∨ (or), ¬ (not) and each term is one of: <attribute> op <attribute> or <constant>, where op is one of: =,≠, >, ≥, <, ≤

Example:σSALARY>30000(EMPLOYEE)

(*) a formula in propositional calculus

Page 12: DATABASE DESIGN I

Selection example

= A B

aabb

abbb

C D

1524

7739

R

2010-11-11 12Manivasakan Sabesan- UDBL - IT - UU

b b 4 9

σA=B ∧ D > 5 (R) = A B

ab

ab

C D

14

79

Page 13: DATABASE DESIGN I

Projection operation

• The projection operator, Π, picks out (or projects) listed columns from a relation and creates a new relation consisting of these columns.

• Notation: Π A1,A2,...,Ak(R), where A1, A2 are attribute names and

R is a relation name.• The result is a new relation of k columns.

2010-11-11 13Manivasakan Sabesan- UDBL - IT - UU

• The result is a new relation of k columns.

• Duplicate rows removed from result, since relations are sets.

Example: Π LNAME,FNAME,SALARY(EMPLOYEE)

Page 14: DATABASE DESIGN I

Projection example

= A B

aabb

1234

C

1112

R

2010-11-11 14Manivasakan Sabesan- UDBL - IT - UU

b 4 2

Π A,C (R) = A C

aabb

1112

= A C

abb

112

Page 15: DATABASE DESIGN I

Join operator

• The join operator, ⊗ (almost, correct ), creates a new relation by joining related tuples from two relations.

• Notation: R ⊗ C SC is the join condition which has the form Ar θ As , where θ is one of {=, <, >, ≤, ≥, ≠}. Several terms can be connected as C1

2010-11-11 15Manivasakan Sabesan- UDBL - IT - UU

one of {=, <, >, ≤, ≥, ≠}. Several terms can be connected as C1∧C2 ∧...Ck.

• A join operation with this kind of general join condition is called “Theta join”.

Page 16: DATABASE DESIGN I

Example Theta join

θA≤F = A B

11

22

C D

33

27

A B

16

27

C

38

D E

27

33

F

45

R θA≤F S

E

33

F

45

R S

2010-11-11 16Manivasakan Sabesan- UDBL - IT - UU

1169

2277

3388

7777

69

77

88

77

38

59

3888

5999

Page 17: DATABASE DESIGN I

Equijoin

• The same as join but it is required that attribute Ar and attribute Asshould have the same value.

• Notation: R ⊗C SC is the join condition which has the form Ar = As. Several terms can be connected as C ∧C ∧...C .

2010-11-11 17Manivasakan Sabesan- UDBL - IT - UU

be connected as C1 ∧C2 ∧...Ck.

Page 18: DATABASE DESIGN I

Example Equijoin

⊗B=C = A B

aa

24

C D

24

dd

A B

aa

24

C D

24

dd

E

ee

R ⊗B=C S

E

ee

R S

2010-11-11 18Manivasakan Sabesan- UDBL - IT - UU

a 4 4 da 4 49

dd

ee

e

Page 19: DATABASE DESIGN I

Natural join

• Natural join is equivalent with the application of join to R and S with the equality condition Ar = As (i.e. an equijoin) and then removing the redundant column As in the result.

• Notation: R *Ar,As SA ,A are attribute pairs that should fulfill the join condition which

2010-11-11 19Manivasakan Sabesan- UDBL - IT - UU

Ar,As are attribute pairs that should fulfill the join condition which has the form Ar = As. Several terms can be connected as C1 ∧C2∧...Ck.

Page 20: DATABASE DESIGN I

Example Natural join

⊗B=C = A B

aa

24

D

dd

A B

aa

24

C D

24

dd

E

ee

R ∗B=C S

E

ee

R S

2010-11-11 20Manivasakan Sabesan- UDBL - IT - UU

a 4 da 4 49

dd

ee

e

Page 21: DATABASE DESIGN I

Composition of operations

• Expressions can be built by composing multiple operations

• Example: σA=C (R × S)

A B

ab

12

× C D

ab

55

= A B

aa

11

C D

ab

55

R × S =

2010-11-11 21Manivasakan Sabesan- UDBL - IT - UU

σA=C (R × S) =

b 2 bbc

565

aaabbbb

1112222

bbcabbc

5655565

A B

abb

122

C D

abb

556

Page 22: DATABASE DESIGN I

Assignment operation

• The assignment operation (← ) makes it possible to assign the result of an expression to a temporary relation variable.

• Example:

temp ← σdno = 5 (EMPLOYEE)

2010-11-11 22Manivasakan Sabesan- UDBL - IT - UU

dno = 5

result ← ∏fname,lname,salary (temp)

• The result to the right of the ← is assigned to the relation variable on the left of the ←.

• The variable may be used in subsequent expressions.

Page 23: DATABASE DESIGN I

Renaming relations and attribute

• The assignment operation can also be used to rename relations and attributes.

• Example:

NEWEMP ← σ (EMPLOYEE)

2010-11-11 23Manivasakan Sabesan- UDBL - IT - UU

NEWEMP ← σdno = 5(EMPLOYEE)

R(FIRSTNAME,LASTNAME,SALARY) ←∏fname,lname,salary (NEWEMP)

Page 24: DATABASE DESIGN I

Division operation

• Suited to queries that include the phrase “for all”.

• Let R and S be relations on schemas R and S respectively, where

R = (A1,...,Am ,B1,...,Bn)S = (B1,...,Bn)

2010-11-11 24Manivasakan Sabesan- UDBL - IT - UU

S = (B1,...,Bn)

• The result of R ÷ S is a relation on the schema R - S = (A1,...,Am)

R ÷ S = {t | t ∈ ΠR-S (R) ∀u ∈ S ∧ tu ∈ R}

Page 25: DATABASE DESIGN I

Example of Division operation

÷ = A

ae

A B

aaa

123

B

12

R ÷ SR S

2010-11-11 25Manivasakan Sabesan- UDBL - IT - UU

abcddddee

311134612

Page 26: DATABASE DESIGN I

Additional relational operations

• Outer join and outer union (presented together with SQL)

• Aggregation operations (presented together with SQL)

• Update operations (presented together with SQL)– (not part of pure query language)

2010-11-11 26Manivasakan Sabesan- UDBL - IT - UU

Page 27: DATABASE DESIGN I

Aggregation operations

• Presented together with SQL later

• Examples of aggregation operations– avg

– min

– max

2010-11-11 27Manivasakan Sabesan- UDBL - IT - UU

– max

– sum

– count

Page 28: DATABASE DESIGN I

Update operations

• Presented together with SQL later

• Operations for database updates are normally part of the DML– insert (of new tuples)

– update (of attribute values)

– delete (of tuples)

2010-11-11 28Manivasakan Sabesan- UDBL - IT - UU

– (of tuples)

• Can be expressed by means of the assignment operator

Page 29: DATABASE DESIGN I

Outer join/union operation

• Extensions of the join/union operations that avoid loss of information.

• Computes the join/union and then adds tuples from one relation that do not match tuples in the other relation to the result of the join.

• Fills out with null values:– null signifies that the value is unknown or does not exist.

2010-11-11 29Manivasakan Sabesan- UDBL - IT - UU

– null signifies that the value is unknown or does not exist.

– All comparisons involving null are false by definition.

Page 30: DATABASE DESIGN I

Example Outer join

• Relation loan

branch-name

DowntownRedwoodPerryridge

loan-number

1-170L-230L-260

amount

300040001700

2010-11-11 30Manivasakan Sabesan- UDBL - IT - UU

• Relation borrower

customer-name

JonesSmithHayes

loan-number

1-170L-230L-155

Page 31: DATABASE DESIGN I

Example Outer join cont...

• loan ⊗ borrower (natural join)

branch-name

DowntownRedwood

loan-number

1-170L-230

amount

30004000

customer-name

JonesSmith

2010-11-11 31Manivasakan Sabesan- UDBL - IT - UU

• loan ⊗ left borrower (left outer join)

customer-name

JonesSmithnull

loan-number

1-170L-230null

branch-name

DowntownRedwoodPerryridge

loan-number

1-170L-230L-260

amount

300040001700

Page 32: DATABASE DESIGN I

Example Outer join cont...

• loan ⊗right borrower (natural right outer join)

• loan ⊗ borrower (natural full outer join)

customer-name

JonesSmithHayes

branch-name

DowntownRedwood

null

loan-number

L-170L-230L-155

amount

30004000null

2010-11-11 32Manivasakan Sabesan- UDBL - IT - UU

• loan ⊗full borrower (natural full outer join)

customer-name

JonesSmithnull

Hayes

branch-name

DowntownRedwoodPerryridge

null

loan-number

L-170L-230L-260L-155

amount

300040001700null