Relational Data ModelCh. 7.1 – 7.3
John Ortiz
Lecture 3 Relational Data Model 2
Why Study Relational Model?
Most widely used model. Vendors: IBM, Informix, Microsoft, Oracle,
Sybase, etc. “Legacy systems” in older models
E.G., IBM’s IMS (hierarchical model) Recent competitor: object-oriented model
ObjectStore, Versant, Ontos, O2 A synthesis emerging: object-relational
model Informix UDS, UniSQL, Oracle, DB2
Lecture 3 Relational Data Model 3
Anatomy of a Relation Each relation is a table with a name. An attribute is a column heading. The heading is the schema of the relation
Students(SSN, Name, Age, GPA)
SID Name Age GPA 1002 John Smith 20 3.2 1005 Mary Day 18 2.9 1020 Bill Lee 19 2.7
StudentsRelation name
Attribute name
Column
Tuple
Lecture 3 Relational Data Model 4
Domain of an Attribute
The domain of an attribute A, denoted by Dom(A), is the set of values that the attribute can take. A domain is usually represented by a
type. E.g., SID char(4) Name varchar(30) --- character
string of variable length up to 30 Age number --- a number
Lecture 3 Relational Data Model 5
Tuples
A tuple of a relation is a row in its table. If t is a tuple of a relation R and A is an
attribute of R, we use t[A] to represent the value of t under A in R.
Example: If t is the second tuple in Students t[Name] = ‘Mary Day’
t[Age] = 18,
t[Name, Age] = (‘Mary Day’, 18)
Lecture 3 Relational Data Model 6
Schema and Instance
A relation schema, denoted by R(A1, A2, …, An), consists of the relation name R and a list of attributes A1, …, An. R.A denotes attribute A of R. # of attributes = degree
A relation instance (state) of a relation schema R(A1, …, An), denoted by r(R), is a set of tuples in the table of R at some instance of time. # of tuples = cardinality
Lecture 3 Relational Data Model 7
Schema & Instance Update
The schema of a relation may change (e.g., adding, deleting, renaming attributes and deleting a table schema) but it is infrequent
The state of a relation may also change (e.g., inserting or deleting a tuple, and changing an attribute value in a tuple) & it is frequent
A schema may have different states at different times.
Lecture 3 Relational Data Model 8
Relational Database
A relational database schema is a set of relation schemas S={R1, …, Rm}.
A relational database is a set of relations DB(S)={r(R1), …, r(Rm)}.
A database state is a set of relation instances at some instance of time.
In addition, a relational database must satisfy a number of constraints (more to come later).
Lecture 3 Relational Data Model 9
A University Databasesee p. 204, Fig. 7.5
SID Name Major GPA 1002 J. Smith CS 3.2 1005 M. Day Math 2.9 1020 B. Lee EE 2.7
StudentsCno Sno Semester Prof CS374 001 F2000 Zhang CS455 002 S2000 Smith Math210 001 F1999 Brown
Sections
Cno Name Hour Dept CS374 Database 3 CS CS455 Network 3 CS CS100 Prog.Lang 4 CS Math210 Calculus 3 Math
Courses
Name Room Chair CS SB220 Hansen EE EB318 Johnson Math AB119 Miller
Departments
Lecture 3 Relational Data Model 10
Constraints of Relational DB
Relations must satisfy the following constraints. Domain (1NF) Constraint. Access-by-Content Constraint. Key (Unique Tuple) Constraint. Entity Integrity Constraint. Referential Integrity Constraint.
Integrity constraints are enforced by the RDBMS.
Lecture 3 Relational Data Model 11
Domain Constraint
Also known as the First Normal Form (1NF): Attributes can only take atomic values (I.e., set values are not allowed).
How to handle multivalued attributes? Use multiple tuples, one per value Use multiple columns, one per value Use separate tablesWhat problems does these solutions
have?
Lecture 3 Relational Data Model 12
Handle Multi-Valued Attributes
EID Name Age Dependents 1234 Bob 34 {Allen, Ann} 1357 Mary 23 {Kathy} 2468 Peter 54 {Mike, Sue, David}
Employees
EID Name Age Dependents 1234 Bob 34 Allen 1234 Bob 34 Ann 1357 Mary 23 Kathy 2468 Peter 54 Mike 2468 Peter 54 Sue 2468 Peter 54 David
Employees
Multiple Values:
Use Multiple Tuples:
Lecture 3 Relational Data Model 13
Handle Multi-Valued Attributes
EID Name Age Dep1 Dep2 Dep3 1234 Bob 34 Allen Ann 1357 Mary 23 Kathy 2468 Peter 54 Mike Sue David
Employees
EID Name Age 1234 Bob 34 1357 Mary 23 2468 Peter 54
EmployeesEID Name 1234 Allen 1234 Ann 1357 Kathy 2468 Mike 2468 Sue 2468 David
Dependents
Use Multiple Columns:
Use SeparateRelations:
Lecture 3 Relational Data Model 15
Access-by-Content Constraint
A tuple is retrieved only by values of its attributes, i.e., the order of tuples is not important.
This is because a relation is a set of tuples.
Although the order of tuples is insignificant for query formulation, it is significant for query evaluation.
Lecture 3 Relational Data Model 16
Superkey
A superkey of a relation is a set of attributes whose values uniquely identify the tuples of the relation.
Every relation has at least one superkey (default is all attributes together?).
Any superset of a superkey is a superkey. From a state of a relation, we may
determine that a set of attributes is not a superkey, but can not determine that a set of attributes is a superkey.
Lecture 3 Relational Data Model 17
Superkey Example Find all superkeys of the Students relation.
SID Name Major GPA 1002 J. Smith CS 3.2 1005 M. Day Math 2.9 1020 B. Lee EE 2.7
Students
A B C D A1 B2 C1 D2 A2 B2 C3 D2 A2 B1 C2 D1 A3 B3 C4 D1
R
With the only state of R, is A a superkey? What about {A, B}?
Lecture 3 Relational Data Model 18
Candidate Key
A candidate key of a relation is a set of attributes of the relation such that it is a superkey, and none of its proper subsets is a superkey.
Find all candidate keys in Students relation. Is it true that every relation has at least
candidate key? Why? Can candidate keys be found from a state? If AB is a candidate key of a relation, can A
also be a candidate key? What is ABC called?
Lecture 3 Relational Data Model 19
Primary Key
A primary key of a relation is a candidate key designated (with an underline) by a database designer. Often chosen at the time of schema
design, & once specified to DBMS, it cannot be changed.
Better be the smallest candidate key for improvement of both storage and query processing efficiencies.
What should be the primary key of Students?
Lecture 3 Relational Data Model 20
Key Constraint
Every relation must have a primary key. Why is key constraint needed?
Every tuple has a different primary key value.
Only the primary key values need to be checked for identifying duplicate when new tuples are inserted (index is often used).
Primary key values can be referenced from within other relations
Lecture 3 Relational Data Model 21
Entity Integrity Constraint
A null value is a special value that is unknown, yet to be assigned, or inapplicable.
Entity Integrity Constraint: No primary key values can be null.
Why?
Lecture 3 Relational Data Model 22
Foreign Key
A foreign key in relation R1 referencing relation R2 is a set of attributes FK of R1, such that, FK is compatible with a candidate (or
primary) key PK of R2 (with same number of attributes and compatible domains); and
for every tuple t1 in R1, either there exists a tuple t2 in R2 such that t1[FK] = t2[PK] or t1[FK] = null.
Foreign keys need to be explicitly defined.
Lecture 3 Relational Data Model 23
Foreign Key Example
EID Name Age DName 1234 Bob 34 Sales 1357 Mary 23 Service 2468 Peter 54 null
EmployeesName City Manager Sales Huston Bill Payroll Dallas Steve Service Chicago Tom
Departments
DName of Employees is a foreign key referencing Name of Departments
A foreign key may reference its own relation. Employee(EID, Name, Age, Dept, ManegerID)
Lecture 3 Relational Data Model 24
Referential Integrity Constraint
Referential Integrity Constraint: No relation can contain unmatched foreign key values.
Using foreign keys in a relation to reference primary keys of other relations is the only way in the relational data model to establish relationships among different relations.
Lecture 3 Relational Data Model 25
Update Operations Insert
Can violate any of the 4 previous constraints – what were they again?
1 solution: reject the insert Delete
Can only violate referential integrity – why?
3 solutions: reject deletion, propagate deletion, modify referencing attributes
Modify Can violate any of the 4 previous
constraints
Lecture 3 Relational Data Model 26
Relational Model: Summary
A tabular representation of data. Simple and intuitive, currently the most
widely used. Integrity constraints can be specified by the
DBA, based on application semantics. DBMS checks for violations. Two important ICs for primary and
foreign keys In addition, we always have domain
constraints.
Lecture 3 Relational Data Model 27
Relational Model: Summary
ICs are based upon the semantics of the real-world enterprise that is being described in the database relations.
We can check a database instance to see if an IC is violated, but we can never infer that an IC is true by looking at an instance.
Powerful and natural query languages exist.
Guidelines to translate ER to relational model (next class…)
Lecture 3 Relational Data Model 28
Look Ahead Next topic: Relational Algebra Read Textbook:
Chapter 7.4 – 7.6