spring 2003 ece569 lecture 02.1 ece 569 database system engineering spring 2003 yanyong zhang...
Post on 20-Dec-2015
213 views
TRANSCRIPT
Spring 2003ECE569 Lecture 02.1
ECE 569 Database System Engineering
Spring 2003
Yanyong Zhang www.ece.rutgers.edu/~yyzhang
Course URL www.ece.rutgers.edu/~yyzhang/spring03
Spring 2003ECE569 Lecture 02.2
Warm up discussion
Compare and contrast database and file system Similarity: the ability to manage persistent data
Difference: efficiency
- DBMS provides index, navigation among relations, easier data access via query language (data manipulation language), transaction management,
Spring 2003ECE569 Lecture 02.3
Today’s topic
Relational data model Different data models exist: relational, network,
hierarchical, object-oriented
Powerful, simple, declarative
Relational algebra
Spring 2003ECE569 Lecture 02.4
Set-theoretic notion of a relation
A domain D is a set of values colors = {red, blue, green}
age = set of positive integers less than 20
last_name = {a-zA-Z}+
A relation over domains D1,D2,…,Dn is a subset of D1D2D3…Dn.
Example R(colors, age) {(red,1), (blue,1),(green,1),
(red,2), (blue,2),(green,2), …}
Spring 2003ECE569 Lecture 02.5
Set-theoretic notion of a relation (cont)
We are only interested in finite relations.
The members of a relation are called tuples.
Each relation that is a subset of the product of k domains is k-degree.
A relation can be represented as a table, where each row is a tuple and each column corresponds to one domain.
PERSON (NAME, AGE)
Name Age
Zhang 27
Merril 2
Stewart 40
Spring 2003ECE569 Lecture 02.6
Set-theoretic notion of a relation (cont)
Because a relation is a set of tuples, the order of tuples in the table is insignificant (but the order of domains does matter).
Name Age
Zhang 27
Merril 2
Stewart 40
Name Age
Zhang 27
Stewart 40
Merril 2
Age Name
27 Zhang
2 Merill
40 Stewart
Spring 2003ECE569 Lecture 02.7
An alternative definition – set of mappings
A relation schema R = {A1,A2,…,An} (no order imposed on attributes)
A relation instance of R is a finite set of mappings {1, 2, …, m} where
2: R D
where D = null dom(A1) dom(A2)… dom(An).
Each tuple i maps each attribute in tuple i to a value.
Note that attributes must be given names that are unique in a relation schema.
Under this definition, all three instances above are equivalent.
Spring 2003ECE569 Lecture 02.8
Keys
A set S of attributes is a candidate key for R if No semantically correct instance of R can include two
tuples such that [s] = [s], and
No proper subset of S satisfies (a)
A set of attributes satisfying (a) but not necessarily (b) is a superkey.
A relation may have multiple candidate keys. One can be designated as the primary key.
Spring 2003ECE569 Lecture 02.9
Example
Schema for a relation of degree four patient(name,address,balance_due,room#)
An instance of patient with three tuples that satisfies the key constraint (name is a candidate key)
patient Name Address Balance_due Room #
Zhang 1964 highland dr $5,230 518
Stewart 2342 K St. $25 203
Brinkley 2342 K St. $10,200 203
No other candidate key is possible (assuming that this is a semantically correct instance of patient).
Spring 2003ECE569 Lecture 02.10
Entity-Relationship model
To describe the conceptual scheme.
An entity is a thing that exists and is distinguishable.
Entity sets consist of a group of all “similar” entities.
Each entity has certain attributes.
A relationship among entity sets is an ordered list of entity sets.
Spring 2003ECE569 Lecture 02.11
Example of entity-relationship model
Billed
Account Made To
Room#
Name
Address
Balance
Patient
Payment Amount
Date
Diagnosed
Disease NameTreatment
FromVital Sign
Pulse
Blood
Time
Spring 2003ECE569 Lecture 02.12
Example – Healthcare DB
patient(name, address, balance_due, room#)
payments(name, amount, date)
vital_signs(name, pulse, blood, date, time)
diagnosis(patient_name, disease_name)
disease(disease_name, treatment)
treats(patient, doctor)
Spring 2003ECE569 Lecture 02.13
Representing entity-relationship diagrams
An entity set E can be represented by a relation whose relation scheme consists of all the attributes of the entity set.
Each tuple of the relation represents one entity in the current instance of E.
A relationship R among E1, E2, …, Ek is represented by a relation whose relation scheme consists of the attributes in the keys for each of E1, E2, …, Ek.
Spring 2003ECE569 Lecture 02.14
Relational Algebra
A set of five basic operations that map one or more relations to a new relation.
Union: R S t tR or tS The set of tuples in R or S or both
The degrees of R and S must match
A B C
a b c
a d c
e a b
D E
d e
b c
F G H
d e b
a b c
a b b
d b bRS
T
a b c
a d c
e a b
d e b
a b b
d b b
R T
Spring 2003ECE569 Lecture 02.15
Relational Algebra (cont)
Difference: R - S t tR and tS The set of tuples in R, but not in S
The degrees of R and S must match
A B C
a b c
a d c
e a b
D E
d e
b c
F G H
d e b
a b c
a b b
d b bRS
T
a d c
e a b
R - T
Spring 2003ECE569 Lecture 02.16
Relational Algebra (cont)
Cartesian Product: R X S
(a1, a2, …, ar+s) (a1, a2, …, ar) R and (ar+1, ar+2, …, ar+s) S
R x S is of degree r+s where r=degree(R) and s=degree(s)
a tuple t is in R x S if its first r components match those of a tuple in R and its last s components from a tuple in S
A B C
a b c
a d c
e a b
D E
d e
b c
F G H
d e b
a b c
a b b
d b bRS
T
A B C D E
a b c d e
a b c b c
a d c d e
a d c b c
e a b d e
e a b b c
R x S
R.A R.B R.C S.D S.E
Spring 2003ECE569 Lecture 02.17
Relational Algebra (cont)
Projection: i1, i2, …, im (R) (ai1, ai2, …, aim) (a1, a2, …, an) R
i1, i2, …, im (R) is of degree m
A tuple t in the result is obtained by removing and/or rearranging the attributes of a tuple of R
A B C
a b c
a d c
e a b
D E
d e
b c
F G H
d e b
a b c
a b b
d b bRS
T
G F
e d
b a
b d
2,1 (T)
Spring 2003ECE569 Lecture 02.18
Relational Algebra (cont)
Selection: F(R) t t R and F(t)
F(R) is of the same degree as R
All tuples in R that satisfy predicate F are included in result
F is a formula involving
- Constants or components of the tuple – component i is denoted $I
- Comparison operators, <, =, >, , ,
- Logical operators and, or, not
F G H
d e b
a b c
a b b
d b b
T
F G H
d e b
a b b
d b b
$1=d or $2=$3 (T)
Spring 2003ECE569 Lecture 02.19
Examples
Find the names of all patients with a balance of more than $10,000.
Find the names of all patients that have a pulse less than 50 or have been diagnosed with hypertension.
Spring 2003ECE569 Lecture 02.20
Examples (cont)
Find all patients whose treatment includes the “application of leeches”
Print the names of all pairs of patients that occupy the same hospital room and have been diagnosed with the same disease.
Spring 2003ECE569 Lecture 02.21
Examples (cont)
Print the name of the patient with the highest pulse recorded today
Find all patients that had pulse measured today
Find the set of all ‘losers’, i.e., those whose pulse is less than or equal to that of some other patient
Subtract (b) from (a) and project out patient name
Spring 2003ECE569 Lecture 02.22
Examples (cont)
List all patients that suffer from at least one illness that no other patient suffers from
Spring 2003ECE569 Lecture 02.23
Additional Operations
Intersection: R S t tR and tS The set of tuples in both R and S
The degrees of R and S must match
R S = R – (R – S) SR
R – ( R – S)
R – S
Spring 2003ECE569 Lecture 02.24
Additional Operations (cont)
Theta-join: R ij S $i $r+j(R S) where r = degree(R) and <, =, >, , ,
When is ‘=‘ operation is called equijoin.
Find patient with highest pulse
Spring 2003ECE569 Lecture 02.25
Additional Operations (cont)
Natural Join: R S An equijoin for all attributes that R and S have in
common.
The shared columns originating in S are projected out
R S i1, i2, …, im R.A1 = S. A1 and … and R.Ak = S.Ak (R x S) where i1, i2, … im is the list of all attributes of R x S, in order, except the attributes S.A1, …, S.Ak.
Example: Names of patients being treated with leeches.
Spring 2003ECE569 Lecture 02.26
Sample Queries
Consider the following set of relations
frequents (drinker, bar)
serves (bar, beer)
likes (drinker, beer)
List the drinkers that frequent at least one bar that serves a beer they like
Construct the relation should_visit(drinker,bar) consisting of all tuples <d, b> where bar b serves a beer that drinker d likes.
Spring 2003ECE569 Lecture 02.27
Sample Queries (cont)
List the drinkers that frequent only bars that serve some beer they like. (Assume every drinker frequents at least one bar.)
Print the drinkers that frequent no bar that serves a beer that they like
Spring 2003ECE569 Lecture 02.28
Questions on Healthcare DB
Retrieve all pairs of patients that shared a room at the same time
List all patients that finished paying their bill before one of their roomates began paying theirs.
Print the names of all pairs of patients that occupy the same hospital room and have been diagnosed with the same disease.