spring 2003 ece569 lecture 02.1 ece 569 database system engineering spring 2003 yanyong zhang...

28
Spring 2003 ECE569 Lecture 02.1 ECE 569 Database System Engineering Spring 2003 Yanyong Zhang www.ece.rutgers.edu/~ yyzhang Course URL www.ece.rutgers.edu/~yyzhang/spring03

Post on 20-Dec-2015

213 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Spring 2003 ECE569 Lecture 02.1 ECE 569 Database System Engineering Spring 2003 Yanyong Zhang yyzhangyyzhang

Spring 2003ECE569 Lecture 02.1

ECE 569 Database System Engineering

Spring 2003

Yanyong Zhang www.ece.rutgers.edu/~yyzhang

Course URL www.ece.rutgers.edu/~yyzhang/spring03

Page 2: Spring 2003 ECE569 Lecture 02.1 ECE 569 Database System Engineering Spring 2003 Yanyong Zhang yyzhangyyzhang

Spring 2003ECE569 Lecture 02.2

Warm up discussion

Compare and contrast database and file system Similarity: the ability to manage persistent data

Difference: efficiency

- DBMS provides index, navigation among relations, easier data access via query language (data manipulation language), transaction management,

Page 3: Spring 2003 ECE569 Lecture 02.1 ECE 569 Database System Engineering Spring 2003 Yanyong Zhang yyzhangyyzhang

Spring 2003ECE569 Lecture 02.3

Today’s topic

Relational data model Different data models exist: relational, network,

hierarchical, object-oriented

Powerful, simple, declarative

Relational algebra

Page 4: Spring 2003 ECE569 Lecture 02.1 ECE 569 Database System Engineering Spring 2003 Yanyong Zhang yyzhangyyzhang

Spring 2003ECE569 Lecture 02.4

Set-theoretic notion of a relation

A domain D is a set of values colors = {red, blue, green}

age = set of positive integers less than 20

last_name = {a-zA-Z}+

A relation over domains D1,D2,…,Dn is a subset of D1D2D3…Dn.

Example R(colors, age) {(red,1), (blue,1),(green,1),

(red,2), (blue,2),(green,2), …}

Page 5: Spring 2003 ECE569 Lecture 02.1 ECE 569 Database System Engineering Spring 2003 Yanyong Zhang yyzhangyyzhang

Spring 2003ECE569 Lecture 02.5

Set-theoretic notion of a relation (cont)

We are only interested in finite relations.

The members of a relation are called tuples.

Each relation that is a subset of the product of k domains is k-degree.

A relation can be represented as a table, where each row is a tuple and each column corresponds to one domain.

PERSON (NAME, AGE)

Name Age

Zhang 27

Merril 2

Stewart 40

Page 6: Spring 2003 ECE569 Lecture 02.1 ECE 569 Database System Engineering Spring 2003 Yanyong Zhang yyzhangyyzhang

Spring 2003ECE569 Lecture 02.6

Set-theoretic notion of a relation (cont)

Because a relation is a set of tuples, the order of tuples in the table is insignificant (but the order of domains does matter).

Name Age

Zhang 27

Merril 2

Stewart 40

Name Age

Zhang 27

Stewart 40

Merril 2

Age Name

27 Zhang

2 Merill

40 Stewart

Page 7: Spring 2003 ECE569 Lecture 02.1 ECE 569 Database System Engineering Spring 2003 Yanyong Zhang yyzhangyyzhang

Spring 2003ECE569 Lecture 02.7

An alternative definition – set of mappings

A relation schema R = {A1,A2,…,An} (no order imposed on attributes)

A relation instance of R is a finite set of mappings {1, 2, …, m} where

2: R D

where D = null dom(A1) dom(A2)… dom(An).

Each tuple i maps each attribute in tuple i to a value.

Note that attributes must be given names that are unique in a relation schema.

Under this definition, all three instances above are equivalent.

Page 8: Spring 2003 ECE569 Lecture 02.1 ECE 569 Database System Engineering Spring 2003 Yanyong Zhang yyzhangyyzhang

Spring 2003ECE569 Lecture 02.8

Keys

A set S of attributes is a candidate key for R if No semantically correct instance of R can include two

tuples such that [s] = [s], and

No proper subset of S satisfies (a)

A set of attributes satisfying (a) but not necessarily (b) is a superkey.

A relation may have multiple candidate keys. One can be designated as the primary key.

Page 9: Spring 2003 ECE569 Lecture 02.1 ECE 569 Database System Engineering Spring 2003 Yanyong Zhang yyzhangyyzhang

Spring 2003ECE569 Lecture 02.9

Example

Schema for a relation of degree four patient(name,address,balance_due,room#)

An instance of patient with three tuples that satisfies the key constraint (name is a candidate key)

patient Name Address Balance_due Room #

Zhang 1964 highland dr $5,230 518

Stewart 2342 K St. $25 203

Brinkley 2342 K St. $10,200 203

No other candidate key is possible (assuming that this is a semantically correct instance of patient).

Page 10: Spring 2003 ECE569 Lecture 02.1 ECE 569 Database System Engineering Spring 2003 Yanyong Zhang yyzhangyyzhang

Spring 2003ECE569 Lecture 02.10

Entity-Relationship model

To describe the conceptual scheme.

An entity is a thing that exists and is distinguishable.

Entity sets consist of a group of all “similar” entities.

Each entity has certain attributes.

A relationship among entity sets is an ordered list of entity sets.

Page 11: Spring 2003 ECE569 Lecture 02.1 ECE 569 Database System Engineering Spring 2003 Yanyong Zhang yyzhangyyzhang

Spring 2003ECE569 Lecture 02.11

Example of entity-relationship model

Billed

Account Made To

Room#

Name

Address

Balance

Patient

Payment Amount

Date

Diagnosed

Disease NameTreatment

FromVital Sign

Pulse

Blood

Time

Page 12: Spring 2003 ECE569 Lecture 02.1 ECE 569 Database System Engineering Spring 2003 Yanyong Zhang yyzhangyyzhang

Spring 2003ECE569 Lecture 02.12

Example – Healthcare DB

patient(name, address, balance_due, room#)

payments(name, amount, date)

vital_signs(name, pulse, blood, date, time)

diagnosis(patient_name, disease_name)

disease(disease_name, treatment)

treats(patient, doctor)

Page 13: Spring 2003 ECE569 Lecture 02.1 ECE 569 Database System Engineering Spring 2003 Yanyong Zhang yyzhangyyzhang

Spring 2003ECE569 Lecture 02.13

Representing entity-relationship diagrams

An entity set E can be represented by a relation whose relation scheme consists of all the attributes of the entity set.

Each tuple of the relation represents one entity in the current instance of E.

A relationship R among E1, E2, …, Ek is represented by a relation whose relation scheme consists of the attributes in the keys for each of E1, E2, …, Ek.

Page 14: Spring 2003 ECE569 Lecture 02.1 ECE 569 Database System Engineering Spring 2003 Yanyong Zhang yyzhangyyzhang

Spring 2003ECE569 Lecture 02.14

Relational Algebra

A set of five basic operations that map one or more relations to a new relation.

Union: R S t tR or tS The set of tuples in R or S or both

The degrees of R and S must match

A B C

a b c

a d c

e a b

D E

d e

b c

F G H

d e b

a b c

a b b

d b bRS

T

a b c

a d c

e a b

d e b

a b b

d b b

R T

Page 15: Spring 2003 ECE569 Lecture 02.1 ECE 569 Database System Engineering Spring 2003 Yanyong Zhang yyzhangyyzhang

Spring 2003ECE569 Lecture 02.15

Relational Algebra (cont)

Difference: R - S t tR and tS The set of tuples in R, but not in S

The degrees of R and S must match

A B C

a b c

a d c

e a b

D E

d e

b c

F G H

d e b

a b c

a b b

d b bRS

T

a d c

e a b

R - T

Page 16: Spring 2003 ECE569 Lecture 02.1 ECE 569 Database System Engineering Spring 2003 Yanyong Zhang yyzhangyyzhang

Spring 2003ECE569 Lecture 02.16

Relational Algebra (cont)

Cartesian Product: R X S

(a1, a2, …, ar+s) (a1, a2, …, ar) R and (ar+1, ar+2, …, ar+s) S

R x S is of degree r+s where r=degree(R) and s=degree(s)

a tuple t is in R x S if its first r components match those of a tuple in R and its last s components from a tuple in S

A B C

a b c

a d c

e a b

D E

d e

b c

F G H

d e b

a b c

a b b

d b bRS

T

A B C D E

a b c d e

a b c b c

a d c d e

a d c b c

e a b d e

e a b b c

R x S

R.A R.B R.C S.D S.E

Page 17: Spring 2003 ECE569 Lecture 02.1 ECE 569 Database System Engineering Spring 2003 Yanyong Zhang yyzhangyyzhang

Spring 2003ECE569 Lecture 02.17

Relational Algebra (cont)

Projection: i1, i2, …, im (R) (ai1, ai2, …, aim) (a1, a2, …, an) R

i1, i2, …, im (R) is of degree m

A tuple t in the result is obtained by removing and/or rearranging the attributes of a tuple of R

A B C

a b c

a d c

e a b

D E

d e

b c

F G H

d e b

a b c

a b b

d b bRS

T

G F

e d

b a

b d

2,1 (T)

Page 18: Spring 2003 ECE569 Lecture 02.1 ECE 569 Database System Engineering Spring 2003 Yanyong Zhang yyzhangyyzhang

Spring 2003ECE569 Lecture 02.18

Relational Algebra (cont)

Selection: F(R) t t R and F(t)

F(R) is of the same degree as R

All tuples in R that satisfy predicate F are included in result

F is a formula involving

- Constants or components of the tuple – component i is denoted $I

- Comparison operators, <, =, >, , ,

- Logical operators and, or, not

F G H

d e b

a b c

a b b

d b b

T

F G H

d e b

a b b

d b b

$1=d or $2=$3 (T)

Page 19: Spring 2003 ECE569 Lecture 02.1 ECE 569 Database System Engineering Spring 2003 Yanyong Zhang yyzhangyyzhang

Spring 2003ECE569 Lecture 02.19

Examples

Find the names of all patients with a balance of more than $10,000.

Find the names of all patients that have a pulse less than 50 or have been diagnosed with hypertension.

Page 20: Spring 2003 ECE569 Lecture 02.1 ECE 569 Database System Engineering Spring 2003 Yanyong Zhang yyzhangyyzhang

Spring 2003ECE569 Lecture 02.20

Examples (cont)

Find all patients whose treatment includes the “application of leeches”

Print the names of all pairs of patients that occupy the same hospital room and have been diagnosed with the same disease.

Page 21: Spring 2003 ECE569 Lecture 02.1 ECE 569 Database System Engineering Spring 2003 Yanyong Zhang yyzhangyyzhang

Spring 2003ECE569 Lecture 02.21

Examples (cont)

Print the name of the patient with the highest pulse recorded today

Find all patients that had pulse measured today

Find the set of all ‘losers’, i.e., those whose pulse is less than or equal to that of some other patient

Subtract (b) from (a) and project out patient name

Page 22: Spring 2003 ECE569 Lecture 02.1 ECE 569 Database System Engineering Spring 2003 Yanyong Zhang yyzhangyyzhang

Spring 2003ECE569 Lecture 02.22

Examples (cont)

List all patients that suffer from at least one illness that no other patient suffers from

Page 23: Spring 2003 ECE569 Lecture 02.1 ECE 569 Database System Engineering Spring 2003 Yanyong Zhang yyzhangyyzhang

Spring 2003ECE569 Lecture 02.23

Additional Operations

Intersection: R S t tR and tS The set of tuples in both R and S

The degrees of R and S must match

R S = R – (R – S) SR

R – ( R – S)

R – S

Page 24: Spring 2003 ECE569 Lecture 02.1 ECE 569 Database System Engineering Spring 2003 Yanyong Zhang yyzhangyyzhang

Spring 2003ECE569 Lecture 02.24

Additional Operations (cont)

Theta-join: R ij S $i $r+j(R S) where r = degree(R) and <, =, >, , ,

When is ‘=‘ operation is called equijoin.

Find patient with highest pulse

Page 25: Spring 2003 ECE569 Lecture 02.1 ECE 569 Database System Engineering Spring 2003 Yanyong Zhang yyzhangyyzhang

Spring 2003ECE569 Lecture 02.25

Additional Operations (cont)

Natural Join: R S An equijoin for all attributes that R and S have in

common.

The shared columns originating in S are projected out

R S i1, i2, …, im R.A1 = S. A1 and … and R.Ak = S.Ak (R x S) where i1, i2, … im is the list of all attributes of R x S, in order, except the attributes S.A1, …, S.Ak.

Example: Names of patients being treated with leeches.

Page 26: Spring 2003 ECE569 Lecture 02.1 ECE 569 Database System Engineering Spring 2003 Yanyong Zhang yyzhangyyzhang

Spring 2003ECE569 Lecture 02.26

Sample Queries

Consider the following set of relations

frequents (drinker, bar)

serves (bar, beer)

likes (drinker, beer)

List the drinkers that frequent at least one bar that serves a beer they like

Construct the relation should_visit(drinker,bar) consisting of all tuples <d, b> where bar b serves a beer that drinker d likes.

Page 27: Spring 2003 ECE569 Lecture 02.1 ECE 569 Database System Engineering Spring 2003 Yanyong Zhang yyzhangyyzhang

Spring 2003ECE569 Lecture 02.27

Sample Queries (cont)

List the drinkers that frequent only bars that serve some beer they like. (Assume every drinker frequents at least one bar.)

Print the drinkers that frequent no bar that serves a beer that they like

Page 28: Spring 2003 ECE569 Lecture 02.1 ECE 569 Database System Engineering Spring 2003 Yanyong Zhang yyzhangyyzhang

Spring 2003ECE569 Lecture 02.28

Questions on Healthcare DB

Retrieve all pairs of patients that shared a room at the same time

List all patients that finished paying their bill before one of their roomates began paying theirs.

Print the names of all pairs of patients that occupy the same hospital room and have been diagnosed with the same disease.