Download - Revision for Mid 1
©Silberschatz, Korth and Sudarshan7.2Database System Concepts
Functional DependenciesFunctional Dependencies FDs defined over two sets of
attributes: X, Y R
Notation: X Y reads as “X determines Y”
If X Y, then all tuples that agree on X must also agree on Y
X Y Z
1 2 3
2 4 5
1 2 4
1 2 7
2 4 8
3 7 9
R
©Silberschatz, Korth and Sudarshan7.3Database System Concepts
X Y Z
1 2 3
2 4 5
1 2 4
1 2 7
2 4 8
3 7 9
X Y Z
Functional Dependencies Functional Dependencies (example)(example)
©Silberschatz, Korth and Sudarshan7.4Database System Concepts
Candidate KeysCandidate Keys
an attribute (or set of attributes) that uniquely identifies a row primary key is a special candidate key
values cannot be null
e.g. ENROLL (Student_ID, Name, Address, …)
PK = Student_ID candidate key = Name, Address
©Silberschatz, Korth and Sudarshan7.6Database System Concepts
2NF2NF
a relation is in second normal form if it is in first normal form AND every nonkey attribute is fully functionally dependant on the primary key
i.e. remove partial functional dependencies, so no nonkey attribute depends on just part of the key
©Silberschatz, Korth and Sudarshan7.7Database System Concepts
EMPLOYEE2 (Emp_ID, Course_Title, Name, Dept_Name, Salary, Date_Completed)
Emp_IDCourse_
TitleName
Dept_
Name Salary
Date_Comp.
not fully functionally dependant on the primary key
©Silberschatz, Korth and Sudarshan7.8Database System Concepts
Second Normal Form ( 2NF )Second Normal Form ( 2NF )
it is based on the concept of full functional dependency.
A functional dependency XY is a full functional full functional dependencydependency , for any attribute A X, {X - {A}} Y.
©Silberschatz, Korth and Sudarshan7.9Database System Concepts
R (A B C D)
1 1 2 3
2 1 3 2
3 1 2 3
1 2 1 3
A B C D
2 Candidate Keys
2NF 2NF (Example)(Example)
R with key{AB} is NOT 2NF
R with key{AC} is NOT 2NF
©Silberschatz, Korth and Sudarshan7.10Database System Concepts
Second Normal FormSecond Normal FormSecond Normal FormSecond Normal Form
Second normal form:
Let R’ be a relation, and let F be the set of governing FDs. An attribute belongs to R’ is prime if a key of R’ contains A. In other words, A is prime in R’ if there exists K<R’ such that (1) K->R’,
(2) for all B belongs to K, (K-B)->R’ not belongs to F+, and
(3) A belongs to K
©Silberschatz, Korth and Sudarshan7.12Database System Concepts
General Definitions of Second Normal FormGeneral Definitions of Second Normal Form
A relation schema R is in second normal form (2NF) if every nonprime attribute A in R is fully functionally dependent on every key of R.
©Silberschatz, Korth and Sudarshan7.13Database System Concepts
Third Normal FormThird Normal Form
The definition of 3NF is similar to that of BCNF, with the only difference being the third condition.
Recall that a key for a relation is a minimal set of attributes that uniquely determines all other attributes. A must be part of a key (any key, if there are several).
It is not enough for A to be part of a superkey, because this condition is satisfied by every attribute.
A relation R is in 3NF if,
for all X A that holds over R
A X ( i.e., X A is a trivial FD ), or
X is a superkey, or
A is part of some key for R
If R is in BCNF,obviously it is in3NF.
©Silberschatz, Korth and Sudarshan7.14Database System Concepts
Suppose that a dependency X A causes a violation of 3NF. There are two cases: X is a proper subset of some key K. Such a dependency is
sometimes called a partial dependency. In this case, we store (X,A) pairs redundantly.
X is not a proper subset of any key. Such a dependency is sometimes called a transitive dependency, because it means we have a chain of dependencies K XA.
©Silberschatz, Korth and Sudarshan7.15Database System Concepts
Key Attributes X Attributes A
Key Attributes AAttributes X
Key Attributes A Attributes X
Partial Dependencies
Transitive Dependencies
A not in a key
A not in a key
A in a key
©Silberschatz, Korth and Sudarshan7.16Database System Concepts
Motivation of 3NF By making an exception for certain dependencies involving key
attributes, we can ensure that every relation schema can be decomposed into a collection of 3NF relations using only decompositions.
Such a guarantee does not exist for BCNF relations.
It weaken the BCNF requirements just enough to make this guarantee possible.
Unlike BCNF, some redundancy is possible with 3NF. The problems associate with partial and transitive dependencies
persist if there is a nontrivial dependency XA and X is not a superkey, even if the relation is in 3NF because A is part of a key.
©Silberschatz, Korth and Sudarshan7.17Database System Concepts
Reserves
Assume: sid cardno (a sailor uses a unique credit card to pay for reservations). Reserves is not in 3NF
sid is not a key and cardno is not part of a key In fact, (sid, bid, day) is the only key. (sid, cardno) pairs are redundantly.
©Silberschatz, Korth and Sudarshan7.18Database System Concepts
Reserves
Assume: sid cardno, and cardno sid (we know that credit cards also uniquely identify the owner).
Reserves is in 3NF (cardno, sid, bid) is also a key for Reserves. sid cardno does not violate 3NF.
Lecture 12:Lecture 12:Further relational algebra, Further relational algebra,
further SQLfurther SQL
www.cl.cam.ac.uk/Teaching/current/Databases/
©Silberschatz, Korth and Sudarshan7.20Database System Concepts
Today’s lectureToday’s lecture
Where does SQL differ from relational model?
What are some other features of SQL?
How can we extend the relational algebra to match more closely SQL?
©Silberschatz, Korth and Sudarshan7.21Database System Concepts
Duplicate rowsDuplicate rows
Consider our relation instances from lecture 6, Reserves, Sailors and Boats
Consider
SELECT rating,age
FROM Sailors;
We get a relation that doesn’t satisfy our definition of a relation!
RECALL: We have the keyword DISTINCT to remove duplicates
©Silberschatz, Korth and Sudarshan7.22Database System Concepts
Multiset semanticsMultiset semantics
A relation in SQL is really a multiset or bag, rather than a set as in the relational model A multiset has no order (unlike a list), but allows duplicates
E.g. {1,2,1,3} is a bag
select, project and join work for bags as well as sets
Just work on a tuple-by-tuple basis
©Silberschatz, Korth and Sudarshan7.23Database System Concepts
Extended relational algebraExtended relational algebra
Add features needed for SQL
1. Bag semantics
2. Duplicate elimination operator,
3. Sorting operator,
4. Grouping and aggregation operator,
5. Outerjoin operators, oV, Vo, oVo
©Silberschatz, Korth and Sudarshan7.24Database System Concepts
Duplicate-elimination operatorDuplicate-elimination operator
(R) = relation R with any duplicated tuples removed
R= (R)=
This is used to model the DISTINCT feature of SQL
A B
1 2
3 4
1 2
A B
1 2
3 4
©Silberschatz, Korth and Sudarshan7.25Database System Concepts
SortingSorting
L1,… Ln(R) returns a list of tuples of R, ordered according to the attributes L1, …, Ln
Note: does not return a relation
R= B(R)= [(5,2),(1,3),(3,4)]
ORDER BY in SQL, e.g.
SELECT *
FROM Sailors
WHERE rating>7
ORDER BY age, sname;
A B
1 3
3 4
5 2
©Silberschatz, Korth and Sudarshan7.26Database System Concepts
Extended projectionExtended projection
SQL allows us to use arithmetic operators
SELECT age*5
FROM Sailors;
We extend the projection operator to allow the columns in the projection to be functions of one or more columns in the argument relation, e.g.
R= A+B,A,A(R)=
A B
1 2
3 4
A+B A.1 A.2
3 1 1
7 3 3
©Silberschatz, Korth and Sudarshan7.27Database System Concepts
ArithmeticArithmetic
Arithmetic (and other expressions) can not be used at the top level i.e. 2+2 is not a valid SQL query
How would you get SQL to compute 2+2?
©Silberschatz, Korth and Sudarshan7.28Database System Concepts
AggregationAggregation
SQL provides us with operations to summarise a column in some way, e.g.
SELECT COUNT(rating)
FROM Sailors;
SELECT COUNT(DISTINCT rating)
FROM Sailors;
SELECT COUNT(*)
FROM Sailors WHERE rating>7;
We also have SUM, AVG, MIN and MAX
©Silberschatz, Korth and Sudarshan7.29Database System Concepts
GroupingGrouping
These aggregation operators have been applied to all qualifying tuples. Sometimes we want to apply them to each of several groups of tuples, e.g. For each rating, find the average age of the sailors
For each rating, find the age of the youngest sailor
©Silberschatz, Korth and Sudarshan7.30Database System Concepts
GROUP BY in SQLGROUP BY in SQL
SELECT [DISTINCT] target-list
FROM relation-list
WHERE qualification
GROUP BY grouping-list;
The target-list contains
1. List of column names
2. Aggregate terms
NOTE: The variables in target-list must be contained in grouping-list
©Silberschatz, Korth and Sudarshan7.31Database System Concepts
GROUP BY cont.GROUP BY cont.
For each rating, find the average age of the sailors
SELECT rating,AVG(age)
FROM Sailors
GROUP BY rating;
For each rating find the age of the youngest sailor
SELECT rating,MIN(age)
FROM Sailors
GROUP BY rating;
©Silberschatz, Korth and Sudarshan7.32Database System Concepts
Grouping and aggregationGrouping and aggregation
L(R) where L is a list of elements that are either
Individual column names (“Grouping attributes”), or
Of the form (A), where is an aggregation operator (MIN, SUM, …) and A is the column it is applied to
For example,
rating,AVG(age)(Sailors)
©Silberschatz, Korth and Sudarshan7.33Database System Concepts
ExampleExample
Let R=
Compute beer,AVG(price)(R)
bar beer price
Anchor 6X 2.50
Anchor Adnam’s 2.40
Mill 6X 2.60
Mill Fosters 2.80
Eagle Fosters 2.90
©Silberschatz, Korth and Sudarshan7.34Database System Concepts
Example cont.Example cont.
1. Group according to the grouping attribute, beer:
2. Compute average of price within groups:
bar beer price
Anchor 6X 2.50
Mill 6X 2.60
Anchor Adnam’s 2.40
Mill Fosters 2.80
Eagle Fosters 2.90
beer price
6X 2.55
Adnam’s 2.40
Fosters 2.85
©Silberschatz, Korth and Sudarshan7.35Database System Concepts
NULL valuesNULL values
Sometimes field values are unknown (e.g. rating not known yet), or inapplicable (e.g. no spouse name)
SQL provides a special value, NULL, for both these situations
This complicates several issues Special operators needed to check for NULL
Is NULL>8? Is (NULL OR TRUE)=TRUE?
We need a three-valued logic
Need to carefully re-define semantics
©Silberschatz, Korth and Sudarshan7.36Database System Concepts
NULL valuesNULL values
Consider INSERT INTO Sailors (sid,sname)
VALUES (101,”Julia”);
SELECT * FROM Sailors;
SELECT rating FROM Sailors;
SELECT sname
FROM Sailors
WHERE rating>0;
©Silberschatz, Korth and Sudarshan7.37Database System Concepts
Entity integrity constraintEntity integrity constraint
An entity integrity constraint states that no primary key value can be NULL
©Silberschatz, Korth and Sudarshan7.38Database System Concepts
Outer joinOuter join
Note that with the usual join, a tuple that doesn’t ‘join’ with any from the other relation is removed from the resulting relation
Instead, we can ‘pad out’ the columns with NULLs
This operator is called an full outer join, written oVo
©Silberschatz, Korth and Sudarshan7.39Database System Concepts
Example of full outer joinExample of full outer join
Let R= Let S=
Then RVS =
But RoVoS =
A B
1 2
3 4
B C
4 5
6 7
A B C
3 4 5
A B C
1 2 NULL
3 4 5
NULL 6 7
©Silberschatz, Korth and Sudarshan7.40Database System Concepts
Outer joins in SQLOuter joins in SQL
SQL/92 has three variants: LEFT OUTER JOIN (algebra: oV)
RIGHT OUTER JOIN (algebra: Vo)
FULL OUTER JOIN (algebra: oVo)
For example:
SELECT * FROM Reserves r LEFT OUTER JOIN Sailors s ON r.sid=s.sid;
©Silberschatz, Korth and Sudarshan7.41Database System Concepts
ViewsViews
A view is a query with a name that can be used in further SELECT statements, e.g.
CREATE VIEW ExpertSailors(sid,sname,age)
AS SELECT sid,sname,age
FROM Sailors
WHERE rating>9;
Note that ExpertSailors is not a stored relation
(WARNING: mysql does not support views )
©Silberschatz, Korth and Sudarshan7.42Database System Concepts
Querying viewsQuerying views
So an example query
SELECT sname
FROM ExpertSailors
WHERE age>27;
is translated by the system to the following:
SELECT sname
FROM Sailors
WHERE rating>9 AND age>27;
©Silberschatz, Korth and Sudarshan7.43Database System Concepts
Relational AlgebraRelational Algebra
The Relational Algebra is used to define the ways in which relations (tables) can be operated to manipulate their data.
It is used as the basis of SQL for relational databases, and illustrates the basic operations required of any DML.
This Algebra is composed of Unary operations (involving a single table) and Binary operations (involving multiple tables).
©Silberschatz, Korth and Sudarshan7.44Database System Concepts
SQLSQL Structured Query Language (SQL)
Standardised by ANSI Supported by modern RDBMSs
Commands fall into three groups Data Definition Language (DLL)
Create tables, etc Data Manipulation Language (DML)
Retrieve and modify data Data Control Language
Control what users can do – grant and revoke privileges
©Silberschatz, Korth and Sudarshan7.45Database System Concepts
SelectionSelection The selection or operation selects rows from a table that satisfy a condition:
< condition > < tablename >
Example: course = ‘CM’ Students
Students
stud# name course
100 Fred PH stud# name course
200 Dave CM 200 Dave CM
300 Bob CM 300 Bob CM
©Silberschatz, Korth and Sudarshan7.46Database System Concepts
ProjectionProjection The projection or operation selects a list of columns from a table.
< column list > < tablename >
Example: stud#, name Students
Students
stud# name course stud# name
100 Fred PH 100 Fred
200 Dave CM 200 Dave
300 Bob CM 300 Bob
©Silberschatz, Korth and Sudarshan7.47Database System Concepts
Selection / ProjectionSelection / Projection
Selection and Projection are usually combined:
stud#, name ( course = ‘CM’ Students)
Students
stud# name course
100Fred PH stud# name
200Dave CM 200 Dave
300Bob CM 300 Bob
©Silberschatz, Korth and Sudarshan7.48Database System Concepts
Cartesian ProductCartesian Product
Concatenation of every row in the first relation (R) with every row in the second relation (S):
R X S
©Silberschatz, Korth and Sudarshan7.49Database System Concepts
Cartesian Product - ExampleCartesian Product - ExampleStudents Courses
stud# name course course# name
100 Fred PH PH Pharmacy
200 Dave CM CM Computing
300 Bob CM
Students X Courses =
stud# Students.name course course# Courses.name
100 Fred PH PH Pharmacy
100 Fred PH CM Computing
200 Dave CM PH Pharmacy
200 Dave CM CM Computing
300 Bob CM PH Pharmacy
300 Bob CM CM Computing
©Silberschatz, Korth and Sudarshan7.50Database System Concepts
Theta JoinTheta Join
A Cartesian product with a condition applied:
R ⋈ <condition> S
©Silberschatz, Korth and Sudarshan7.51Database System Concepts
Theta Join - ExampleTheta Join - ExampleStudents Courses
stud# name course course# name
100 Fred PH PH Pharmacy
200 Dave CM CM Computing
300 Bob CM
Students ⋈ stud# = 200 Courses
stud# Students.name course course# Courses.name
200 Dave CM PH Pharmacy
200 Dave CM CM Computing
©Silberschatz, Korth and Sudarshan7.52Database System Concepts
Inner Join (Equijoin)Inner Join (Equijoin)
A Theta join where the <condition> is the match (=) of the primary and foreign keys.
R ⋈ <R.primary_key = S.foreign_key> S
©Silberschatz, Korth and Sudarshan7.53Database System Concepts
Inner Join - ExampleInner Join - ExampleStudents Courses
stud# name course course# name
100 Fred PH PH Pharmacy
200 Dave CM CM Computing
300 Bob CM
Students ⋈ course = course# Courses
stud# Students.name course course# Courses.name
100 Fred PH PH Pharmacy
200 Dave CM CM Computing
300 Bob CM CM Computing
©Silberschatz, Korth and Sudarshan7.54Database System Concepts
Natural JoinNatural Join
Inner join produces redundant data (in the previous example: course and course#). To get rid of this duplication:
< stud#, Students.name, course, Courses.name >
(Students ⋈ <course = course#> Courses)
Or
R1= Students ⋈ <course = course#> Courses
R2= < stud#, Students.name, course, Courses.name > R1
The result is called the natural join of Students and Courses
©Silberschatz, Korth and Sudarshan7.55Database System Concepts
Natural Join - ExampleNatural Join - ExampleStudents Courses
stud# name course course# name
100 Fred PH PH Pharmacy
200 Dave CM CM Computing
300 Bob CM
R1= Students ⋈ <course = course#> Courses
R2= < stud#, Students.name, course, Courses.name > R1stud# Students.name course Courses.name
100 Fred PH Pharmacy
200 Dave CM Computing
300 Bob CM Computing
©Silberschatz, Korth and Sudarshan7.56Database System Concepts
Outer JoinsOuter Joins Inner join + rows of one table which do not satisfy the <condition>.
Left Outer Join: R <R.primary_key = S.foreign_key> SAll rows from R are retained and unmatched rows of S are
padded with NULL
Right Outer Join: R <R.primary_key = S.foreign_key> SAll rows from S are retained and unmatched rows of R are
padded with NULL
©Silberschatz, Korth and Sudarshan7.57Database System Concepts
Left Outer Join - ExampleLeft Outer Join - Example
Students Courses
stud# name course course# name
100 Fred PH PH Pharmacy
200 Dave CM CM Computing
400 Peter EN CH Chemistry
Students <course = course#> Courses
stud# Students.name course course# Courses.name
100 Fred PH PH Pharmacy
200 Dave CM CM Computing
400 Peter EN NULL NULL
©Silberschatz, Korth and Sudarshan7.58Database System Concepts
Right Outer Join - ExampleRight Outer Join - Example
Students Courses
stud# name course course# name
100 Fred PH PH Pharmacy
200 Dave CM CM Computing
400 Peter EN CH Chemistry
Students <course = course#> Courses
stud# Students.name course course# Courses.name
100 Fred PH PH Pharmacy
200 Dave CM CM Computing
NULL NULL NULL CH Chemistry
©Silberschatz, Korth and Sudarshan7.59Database System Concepts
Combination of Unary and Join OperationsCombination of Unary and Join Operations
Students Courses
stud# name address course course# name
100 Fred Aberdeen PH PH Pharmacy
200 Dave Dundee CM CM Computing
300 Bob Aberdeen CM
Show the names of students (from Aberdeen) and the names of their coursesShow the names of students (from Aberdeen) and the names of their courses
R1= Students ⋈ <course=course#> Courses
R2= <address=“Aberdeen”> R1
R3= <Students.name, Course.name> R2
Students.name Courses.nameFred PharmacyBob Computing
©Silberschatz, Korth and Sudarshan7.60Database System Concepts
UnionUnion
Takes the set of rows in each table and combines them, eliminating duplicates
Participating relations must be compatible, ie have the same number of columns, and the same column names, domains, and data types
R S R S
A Ba1 b1a2 b2
A Ba2 b2a3 b3
A Ba1 b1a2 b2a3 b3
©Silberschatz, Korth and Sudarshan7.61Database System Concepts
IntersectionIntersection
Takes the set of rows that are common to each relation Participating relations must be compatible
R S R S
A Ba1 b1a2 b2
A Ba2 b2a3 b3
A Ba2 b2
©Silberschatz, Korth and Sudarshan7.62Database System Concepts
DifferenceDifference
Takes the set of rows in the first relation but not the second Participating relations must be compatible
R S R - S
A Ba1 b1a2 b2
A Ba2 b2a3 b3
A Ba1 b1
©Silberschatz, Korth and Sudarshan7.63Database System Concepts
Exercise (May 2004 Exam)Exercise (May 2004 Exam)
Employee WorkLoad Projectempid name empid* projid* duration projid name
E100 Fred E100 P001 17 P001 DB
E200 Dave E200 P001 12 P002 Access
E300 Bob E300 P002 15 P003 SQL
E400 Peter
Determine the outcome of the following operations:
A natural join between Employee and WorkLoad
A left outer join between Employee and WorkLoad
A right outer join between WorkLoad and Project
©Silberschatz, Korth and Sudarshan7.64Database System Concepts
Unary OperationsUnary Operations
Selection
course = ‘Computing’ Students
In SQL:
Select *
From Students
Where course = ‘Computing’;
Projection
stud#, name Students
In SQL:
Select stud#, name
From Students;
Selection & Projection
stud#, name ( course = ‘Computing’ Students)
In SQL:
Select stud#, name
From students
Where course = ‘Computing’;
©Silberschatz, Korth and Sudarshan7.65Database System Concepts
Binary Operations/JoinsBinary Operations/Joins
Cartesian Product: Students X Courses
In SQL:
Select *
From Students, Courses;
Theta Join: Students ⋈ <stud# =200> Courses
In SQL:
Select *
From Students, Courses
Where stud# = 200;
©Silberschatz, Korth and Sudarshan7.66Database System Concepts
Binary Operations/JoinsBinary Operations/Joins
Inner Join (Equijoin): Students ⋈ <course=course#> Courses
In SQL:
Select *
From Students, Courses
Where course=course#;
Natural Join:
R1= Students ⋈ <course = course#> Courses
R2= < stud#, Students.name, course, Courses.name > R1
In SQL:
Select stud#, Students.name, course, Courses.name
From Students, Courses
Where course=course#;
©Silberschatz, Korth and Sudarshan7.67Database System Concepts
Outer JoinsOuter Joins
Left Outer Join
Students <course = course#> Courses
In SQL:
Select *
From Students, Courses
Where course = course#(+)
Right Outer Join
Students <course = course#> Courses
In SQL:
Select *
From Students, Courses
Where course(+) = course#
©Silberschatz, Korth and Sudarshan7.68Database System Concepts
Combination of Unary and Join OperationsCombination of Unary and Join Operations
R1= Students ⋈ <course=course#> Courses
R2= <address=“Aberdeen”> R1
R3= <Students.name, Course.name> R2
In SQL:
Select Students.name, Courses.name
From Students, Courses
Where course=course#
AND address=“Aberdeen”;
©Silberschatz, Korth and Sudarshan7.69Database System Concepts
Set OperationsSet Operations
Union: R S
In SQL:
Select * From R
Union
Select * From S;
Intersection: R S
In SQL:
Select * From R
Intersect
Select * From S;
Difference: R - S
In SQL:
Select * From R
Minus
Select * From S;
©Silberschatz, Korth and Sudarshan7.70Database System Concepts
SQL OperatorsSQL Operators
SELECT *
FROM Book
WHERE catno BETWEEN 200 AND 400;
SELECT *
FROM Product
WHERE prod_desc BETWEEN ‘C’ AND ‘S’;
SELECT *
FROM Book
WHERE catno NOT BETWEEN 200 AND 400;
©Silberschatz, Korth and Sudarshan7.71Database System Concepts
SQL OperatorsSQL OperatorsSELECT Catno
FROM Loan
WHERE Date-Returned IS NULL;
SELECT Catno
FROM Loan
WHERE Date-Returned IS NOT NULL;
©Silberschatz, Korth and Sudarshan7.72Database System Concepts
SQL OperatorsSQL OperatorsSELECT Name
FROM Member
WHERE memno IN (100, 200, 300, 400);
SELECT Name
FROM Member
WHERE memno NOT IN (100, 200, 300, 400);
©Silberschatz, Korth and Sudarshan7.73Database System Concepts
SQL OperatorsSQL OperatorsSELECT Name
FROM Member
WHERE address NOT LIKE ‘%Aberdeen%’;
SELECT Name
FROM Member
WHERE Name LIKE ‘_ES%’;
Note: In MS Access, use * and # instead of % and _