Lecture 3
Relational Model:
Relational Algebra and
Relational Calculus
• T. Connolly, and C. Begg, “Database Systems: A Practical Approach to Design, Implementation, and Management”, 5th edition,
Addison-Wesley, 2009. ISBN: 0-321-60110-6, ISBN-13: 978-0-321-60110-0 (International Edition).
• T. Connolly, and C. Begg, “Database Systems: A Practical Approach to Design, Implementation, and Management”, 4th edition,
Addison-Wesley, 2004. ISBN: 0-321-21025-5.
• R. Elmasri and S. B. Navathe, “Fundamentals of Database Systems”, 5th ed., Pearson, 2007, ISBN: 0-321-41506-X.
Walailak University
ITM-661 ระบบฐานขอ้มลู (Database system)
Walailak - 2013
Objectives Overview of the relational model, e.g., how tables
represent data.
The connection between mathematical relations and
relations in the relational model.
Properties of database relations
How to identify candidate, primary, and foreign keys.
The meaning of entity integrity and referential integrity.
How to form queries in relational algebra.
How relational calculus queries are expressed.
The purpose and advantages of views in relational
systems.
2
An Example (Branch and Staff Relations)
3
• A relation is a table with
columns and rows. Only applies to logical structure of
the database, not the physical
structure.
• An attribute is a named
column of a relation.
• A domain is a set of
allowable values for one or
more attributes.
• A tuple is a row of a
relation.
• A degree is a number of
attributes in a relation.
• A cardinality is a number of
tuples in a relation.
• A relational database is a
collection of normalized
relations.
Terminology for Relational Model and Attribute Domain
4
Terminology
Attribute Domain
Mathematical Relations (I) Mathematical definition of relation
Consider two sets D1 = {2, 4} and D2 = {1, 3, 5}.
Cartesian product is D1D2, the set of all ordered pairs, 1st element is member of D1 and 2nd element is member of D2.
Alternative way is to find all combinations of elements with first from D1 and second from D2.
D1D2 = {(2, 1), (2, 3), (2, 5), (4, 1), (4, 3), (4, 5)}
Another example: D3 = {“Mr. X” , “Ms. Y”} and
D4 = { “M”, “F” } (i.e., M=Male, F=Female)
D3D4 = { (“Mr. X”, “M”), (“Mr. X”, “F”),
(“Ms. Y”, “M”), (“Ms. Y”, “F”) }
5
Mathematical Relations (II) Any subset of Cartesian product is a relation.
R = {(2, 1), (4, 1)}
May specify which pairs are in relation using some condition for selection. For example, the second element is 1 R = {(x, y) | x D1, y D2, and y = 1}
Using same sets, form another relation S, where first element is always twice the second. S = {(x, y) | x D1, y D2, and x = 2y}
Only one ordered pair in the Cartesian Product satisfies this condition. S = {(2, 1)}
6
Mathematical Relations (III) Consider three sets D1, D2, and D3 with Cartesian
Product D1D2D3. For example
D1 = {1, 3} , D2 = {2, 4} , D3 = {5, 6}
D1D2D3 = {(1,2,5), (1,2,6), (1,4,5), (1,4,6), (3,2,5),
(3,2,6), (3,4,5), (3,4,6)}
Any subset of these ordered triples is a relation. T = {(x, y, z) | x D1, y D2, z D3 and y = 2x
and z = 3y}
T = { (1, 2, 6) }
7
Mathematical Relations (IV)
8
To define a general relation on n domains…let D1, D2, . . ., Dn be n sets with Cartesian product defined as
D1 D2 . . . Dn = {(d1, d2, . . . , dn) | d1 D1, d2 D2, . . . , dn Dn} usually written as
In defining relations we specify the sets, or domains, from which we chose values.
i
n
i
D1
Database Relations and Their Properties Relation schema
Named relation defined by a set of attribute and domain name pairs.
Relational database schema Set of relation schemas, each with a distinct name.
Relation name is distinct from all other relations.
Each cell of relation contains exactly one atomic (single) value.
Each attribute has a distinct name.
Values of an attribute are all from the same domain.
Order of attributes has no significance.
Each tuple is distinct; there are no duplicate tuples.
Order of tuples has no significance, theoretically.
9
Relational Keys (I) Superkey
An attribute or a set of attributes that uniquely
identifies a tuple within a relation.
Candidate Key
A superkey (K) such that no proper subset is a
superkey within the relation.
In each tuple of R, the values of K uniquely
identify that tuple (uniqueness).
No proper subset of K has the uniqueness property
(irreducicility).
10
Relational Keys (II) Primary Key
Candidate key selected to identify tuples uniquely
within relation.
Alternate Keys
Candidate keys that are not selected to be the
primary key.
Foreign Key
An attribute or set of attributes within one relation
that matches candidate key of some (possibly
same) relation.
11
Relational Integrity (I) Null
Represents a value for an attribute that is currently
unknown or is not applicable for this tuple.
Deals with incomplete or exceptional data.
represents the absence of a value and is not the
same as zero or spaces, which are values.
12
Relational Integrity (II) Entity Integrity
In a base relation, no attribute of a primary key can be null.
Referential Integrity
If foreign key exists in a relation, either the foreign key value must match a candidate key value of some tuple in its home relation or foreign key value must be wholly null.
Enterprise Constraints
Additional rules specified by users or database administrators.
13
Relational Algebra and Calculus Relational Algebra
Unary Relational Operations
Relational Algebra Operations From Set Theory
Binary Relational Operations
Additional Relational Operations
Examples of Queries in Relational Algebra
Relational Calculus
Tuple Relational Calculus
Domain Relational Calculus
14
Relational Algebra and Calculus Relational algebra and relational calculus are formal
languages associated with the relational model.
Informally,
Relational algebra is a (high-level) procedural language and
Relational calculus a non-procedural language.
However, formally both are equivalent to one
another.
A language that produces a relation that can be
derived using relational calculus is relationally
complete.
15
Relational Algebra Relational algebra operations work on one or
more relations to define another relation
without changing the original relations.
Thus, both operands and results are relations,
so output from one operation can become input
to another operation.
This allows expressions to be nested, just as in
arithmetic. This property is called closure.
16
Relational Algebra (Overview) Relational Algebra consists of several groups of operations
Unary Relational Operations
SELECT (symbol: σ (sigma))
PROJECT (symbol: π (pi))
RENAME (symbol: ρ (rho))
Relational Algebra Operations From Set Theory
UNION ( ∪ ), INTERSECTION ( ∩ ), DIFFERENCE (or MINUS, – )
CARTESIAN PRODUCT ( x )
Binary Relational Operations
JOIN (several variations of JOIN exist)
DIVISION
Additional Relational Operations
OUTER JOINS, OUTER UNION
AGGREGATE FUNCTIONS These compute summary of information: for example, SUM, COUNT,
AVG, MIN, MAX
17
Relational Algebra There are 5 basic operations, in relational algebra, that
performs most of the data retrieval operations needed.
Selection
Projection
Cartesian Product
Union
Set Difference
Also operations that can be expressed by 5 basic operations.
Join
Intersection
Division
18
Relational Algebra Operations
19
Relational Algebra Operations
20
An Example (Home Rental Database)
21
An Example (Home Rental Database)
22
An Example (Home Rental Database)
23
Selection (or Restriction)
24
spredicate(R)
Selection operation works on a single relation R and defines a
relation that contains only those tuples (rows) of R that satisfy
the specified condition (predicate).
Ex.: List all staff with a salary greater than 10,000.
ssalary> 10000 (Staff)
Projection
25
Pcol1, . . . , coln(R)
Projection operation works on a single relation R and defines a
relation that contains a vertical subset of R, extracting the values of
specified attributes and eliminating duplicates.
Ex.: Produce a list of salaries for all staff, showing only the
StaffNo, fName, lName, and salary details.
PstaffNo, fName, lName, salary(Staff)
Cartesian Product R S
– The Cartesian product operation defines a relation that is the concatenation of every tuple of relation R with every tuple of relation S.
– Ex.: List the names and comments of all renters who have viewed a property.
26
(PclientNo, fName, lName(Client))(PclientNo, propertyNo,comment (Viewing))
Example - Cartesian Product and Selection
27
Use selection operation to extract those tuples
where Renter.Rno = Viewing.Rno.
sClient.clientNo=Viewing.clientNo
((PclientNo,fName,lName(Client))(PclientNo,propertyNo,comment(Viewing)))
Note that: Cartesian product and Selection can be reduced to a
single operation called a join.
Union
28
R S Union of two relations R and S defines a
relation that contains all the tuples of R, or
S, or both R and S, duplicate tuples being
eliminated.
R and S must be union-compatible.
If R and S have I and J tuples, respectively,
union is obtained by concatenating them into
one relation with a maximum of (I + J) tuples.
List all cities where there is either a branch
office or a property for rent.
Pcity(Branch) Pcity (PropertyForRent)
Set Difference
29
R – S Define a relation consisting of the tuples that are in relation
R, but not in S.
R and S must be union-compatible.
List all cities where there is a branch office but no
properties for rent.
Pcity (Branch) – Pcity (PropertyForRent)
Join Operations Join is a derivative of Cartesian product.
Equivalent to performing a selection, using the
join predicate as the selection formula, over
the Cartesian product of the two operand
relations.
One of the most difficult operations to
implement efficiently in a relational DBMS
and one of the reasons why RDBMSs have
intrinsic performance problems.
30
Join Operations There are various forms of join operation
Theta-join
Equi-join (a particular type of theta-join)
Natural join
Outer join
Semi-join
31
Theta-join (q-join) R F S
Defines a relation that contains tuples
satisfying the predicate F from the Cartesian
product of R and S.
The predicate F is of the form R.ai q S.bi
where q may be one of the comparison
operators (<, < =, >, > =, =, ~ =).
32
Theta-join (q-join) We can rewrite the theta-join in terms of the
basic Selection and Cartesian product
operations.
R F S = sF(R S)
Degree of a theta-join is sum of the degrees of
the operand relations R and S. If predicate F
contains only equality (=), the term equi-join is
used.
33
Example - Equi-join
34
List the names and comments of all clients
who have viewed a property for rent.
(PclientNo, fName, lName(Client)) Client.clientNo = Viewing.clientNo
(PclientNo, propertyNo, comment(Viewing))
Natural Join
35
R S Natural join is an equi-join of the two relations R and S over all
common attributes x. One occurrence of each common
attribute is eliminated from the result.
List the names and comments of all clients who have viewed a
property for rent.
(PclientNo, fName, lName(Client)) (PclientNo, propertyNo, comment(Viewing))
Outer Join Often in joining two relations, there is no
matching value in the join columns. To display
rows in the result that do not have matching
values in the join column, we use the outer
join.
R S
The (left) outer join is a join in which tuples from
R that do not have matching values in the common
columns of S are also included in the result
relation.
36
Example - Left Outer Join
37
Produce a status report on property viewings.
PpropertyNo, street, city(PropertyForRent) Viewing
Semi-join
38
R F S The semi-join operation defines a relation that contains the tuples
of R that participate in the join of R with S.
Can rewrite Semijoin using Projection and Join:
R F S = PA(R F S)
List complete details of all staff who work at the branch in
Partick.
Staff Staff.branchNo = Branch.branchNo and Branch.city = ‘Glasgow’ Branch
Intersection
39
R S
The intersection operation consists of the set
of all tuples that are in both R and S.
R and S must be union-compatible.
Expressed using basic operations
R S = R – (R – S)
List all cities where there is both a branch
office and at least one property for rent.
Pcity(Branch) Pcity(PropertyForRent)
Division R S
The division operation consists of the set of tuples
from R defined over the attributes C that match the
combination of every tuple in S.
Expressed using basic operations
T1 = PC(R)
T2 = PC(( ST1) – R)
T = T1 – T2
40
Example - Division
41
Identify all clients who have viewed all
properties with three rooms.
(PclientNo, propertyNo(Viewing))
(PpropertyNo(srooms = 3 (PropertyForRent)))
Relational Algebra - Aggregate Function
Aggregate Function Operation
MAX Salary (EMPLOYEE) retrieves the maximum salary
value from the EMPLOYEE relation
MIN Salary (EMPLOYEE) retrieves the minimum Salary
value from the EMPLOYEE relation
SUM Salary (EMPLOYEE) retrieves the sum of the Salary
from the EMPLOYEE relation
COUNT SSN, AVERAGE Salary (EMPLOYEE) computes the
count (number) of employees and their average salary
Note: count just counts the number of rows, without
removing duplicates
42
Relational Algebra - Aggregate Function
43
Group by Dno
Relational Calculus Relational calculus query specifies what is to be
retrieved rather than how to retrieve it.
No description of how to evaluate a query.
In first-order logic (or predicate calculus), predicate
is a truth-valued function with arguments.
When we substitute values for the arguments,
function yields an expression, called a proposition,
which can be either true or false.
When applied to databases, relational calculus is in
two forms: tuple-oriented and domain-oriented.
44
Relational Calculus If a predicate contains a variable, as in ‘x is
a member of staff’, there must be a range
for x.
When we substitute some values of this
range for x, the proposition may be true; for
other values, it may be false.
If P is a predicate, then we write the set of
all x such that P is true for x, as {x | P(x)}
Predicates can be connected using (AND), (OR),
and ~ (NOT) 45
Tuple-oriented Relational Calculus Interested in finding tuples for which a predicate is
true. Based on use of tuple variables.
Tuple variable is a variable that ‘ranges over’ a
named relation: i.e., variable whose only permitted
values are tuples of the relation.
Specify range of a tuple variable S as the Staff
relation as:
Staff(S)
To find set of all tuples S such that P(S) is true:
{S | P(S)}
46
Tuple-oriented Relational Calculus (Examples and quantifiers)
Examples of tuple-oriented relational calculus To find details of all staffs earning more than £10,000:
{S | Staff(S) S.salary > 10000}
To find a particular attribute, such as salary, write:
{S.salary | Staff(S) S.salary > 10000}
Can use two quantifiers to tell how many instances the predicate applies to: Existential quantifier $ (‘there exists’)
Universal quantifier " (‘for all’)
Tuple variables qualified by " or $ are called bound variables, otherwise called free variables.
47
Existential quantifier (Tuple-oriented Relational Calculus)
Existential quantifier used in formulae that must
be true for at least one instance, such as:
{ S | Staff(S) ($B)(Branch(B)
(B.branchNo=S.branchNo)
B.city = ‘London’) }
Means ‘There exists a Branch tuple with same
branchNo as the branchNo of the current Staff
tuple, S, and is located in London’.
48
Universal quantifier (Tuple-oriented Relational Calculus)
Universal quantifier is used in statements
about every instance, such as:
("B) (B.city ‘Paris’)
Means ‘For all Branch tuples, the address is
not in Paris’.
Can also use ~($B) (B.city=‘Paris’) which means ‘There are no branches with an address in Paris’.
49
Tuple-oriented Relational Calculus Formulae should be unambiguous and make sense.
A (well-formed) formula is made out of atoms:
R(Si), where Si is a tuple variable and R is a relation
Si.a1 q Sj.a2
Si.a1 q c
Can recursively build up formulae from atoms:
An atom is a formula
If F1 and F2 are formulae, so are their conjunction, F1 F2;
disjunction, F1 F2; and negation, ~F1
If F is a formula with free variable X, then ($X)(F) and
("X)(F) are also formulae.
50
{ S1.a1, S2.a2, …, Sn.an | F(S1, S2, …, Sn) } General form
q is one of comparison operators (<,>, so on)
c is a constant.
Tuple-oriented Relational Calculus (An example – I)
List the names of all managers who earn more
than £25,000.
{S.fName, S.lName | Staff(S)
S.position = ‘Manager’ S.salary > 25000}
List the staff who manage properties for rent in
Glasgow.
{S | Staff(S) ($P) (PropertyForRent(P)
(P.staffNo = S.staffNo) P.city = ‘Glasgow’)}
51
Tuple-oriented Relational Calculus (An example – II)
List the names of staff who currently do not
manage any properties.
{S.fName, S.lName | Staff(S) (~($P)
(PropertyForRent(P)(S.staffNo = P.staffNo)))}
Or
{S.fName, S.lName | Staff(S) (("P)
(~PropertyForRent(P) ~(S.staffNo = P.staffNo)))}
52
Tuple-oriented Relational Calculus (An example – III)
List the names of clients who have viewed a
property for rent in Glasgow.
{C.fName, C.lName | Client(C)
(($V)($P) (Viewing(V) PropertyForRent(P)
(C.clientNo = V.clientNo)
(V.propertyNo=P.propertyNo)
P.city=‘Glasgow’ ) ) }
53
Tuple-oriented Relational Calculus (An example – IV)
Expressions can generate an infinite set. For
example: {S | ~Staff(S)}
To avoid this, add restriction that all values in
result must be values in the domain of the
expression.
54
Domain Relational Calculus
Domain-oriented Relational Calculus
Uses variables that take values from domains
instead of tuples of relations.
If F(d1, d2, . . . , dn) stands for a formula
composed of atoms and d1, d2, . . . , dn
represent domain variables, then:
{d1, d2, . . . , dn | F(d1, d2, . . . , dn)}
is a general domain relational calculus expression.
55
Domain-oriented Relational Calculus (An example – I)
Find the names of all managers who earn more than £25,000.
{fN, lN | ($sN, posn, sex, DOB, sal, bN)
(Staff (sN, fN, lN, posn, sex, DOB, sal, bN)
posn = ‘Manager’ sal > 25000)}
List the staff who manage properties for rent in Glasgow.
{sN, fN, lN, posn, sex, DOB, sal, bN |
($sN1,cty)(Staff(sN,fN,lN,posn,sex,DOB,sal,bN)
PropertyForRent(pN, st, cty, pc, typ, rms, rnt, oN, sN1, bN1)
(sN=sN1) cty=‘Glasgow’)}
56
Domain-oriented Relational Calculus (An example – II) List the names of staff who currently do not manage any
properties for rent.
{fN, lN | ($sN)
(Staff(sN,fN,lN,posn,sex,DOB,sal,bN)
(~($sN1) (PropertyForRent(pN, st, cty, pc, typ,
rms, rnt, oN, sN1, bN1) (sN=sN1))))}
List the names of clients who have viewed a property for rent in Glasgow.
{fN, lN | ($cN, cN1, pN, pN1, cty)
(Client(cN, fN, lN,tel, pT, mR) Viewing(cN1, pN1, dt, cmt)
PropertyForRent(pN, st, cty, pc, typ, rms, rnt,oN, sN, bN)
(cN = cN1) (pN = pN1) cty = ‘Glasgow’)}
57
Domain-oriented Relational Calculus
When domain relational calculus is restricted
to safe expressions, it is equivalent to tuple
relational calculus restricted to safe
expressions, which is equivalent to relational
algebra.
This means every relational algebra expression
has an equivalent relational calculus
expression, and vice versa.
58
Other Languages Transform-oriented languages are non-procedural
languages that use relations to transform input data into outputs (e.g. SQL).
Graphical languages provide the user with a picture or illustration of the structure of the relation. The user fills in an example of what is wanted and the system returns the required data in that format (e.g QBE).
Fourth-generation languages (4GLs) can create a complete customized application using a limited set of commands in a user-friendly, often menu-driven environment.
Some systems accept a form of natural language, sometimes called a fifth-generation language (5GL). This development is still in its infancy.
59
Exercise The following tables form a part of a database held in a relational
DBMS: Hotel: (hotelNo, hotelName, hotelAddress)
Room: (roomNo, hotelNo, Type, Price)
Booking: (hotelNo, guestNo, dateFrom, dataTo, roomNo)
Guest: (guestNo, guestName, guestAddress)
Where the primary keys are underlined.
Generate the relational algebra, tuple-oriented and domain-oriented calculus for the following queries:
1. List all hotels.
2. List all single rooms with a price below 20 per night.
3. List the names and addresses of all guests.
4. List the price and type of all rooms at the Grosvenor Hotel.
5. List all guests currently staying at the Grovenor Hotel.
6. List the details of all rooms at the Grosvenor Hotel, including the name of the guest staying in the room, if the room is occupied.
7. List the guest details (guestNo, guestName, and guestAddress) of all guests staying at the Grosvenor Hotel.
60