relational database design unit ii 1. 2 advantages of using database systems centralized control of...
TRANSCRIPT
Relational Database Design UNIT II
1
2
Advantages of Using Database Systems
• Centralized control of a firm’s data
• Redundancy can be reduced (avoid keeping copies of the same data in many places)• Inconsistency can be avoided to some extent (updating occurs at one place instead of many places)
3
• The data can be shared
• Security restrictions can be applied (passwords)
• Integrity of data can be maintained (e.g., an instructor has to exist before being assigned to teach a class)
Advantages of Using Database Systems (cont’d)
Relational Data Model
4
5
RelationRelation• Definition: A relation is a named, two-dimensional table of data Definition: A relation is a named, two-dimensional table of data • Table consists of rows (records) and columns (attribute or field)Table consists of rows (records) and columns (attribute or field)• Requirements for a table to qualify as a relation:Requirements for a table to qualify as a relation:
– It must have a unique nameIt must have a unique name– Every attribute value must be atomic (not multivalued, not Every attribute value must be atomic (not multivalued, not
composite)composite)– Every row must be unique (can’t have two rows with exactly Every row must be unique (can’t have two rows with exactly
the same values for all their fields)the same values for all their fields)– Attributes (columns) in tables must have unique namesAttributes (columns) in tables must have unique names– The order of the columns must be irrelevantThe order of the columns must be irrelevant– The order of the rows must be irrelevantThe order of the rows must be irrelevant
6
The table representation of an entity set
An entity set is usually represented as a table in the database. The name of the table is the same name as the entity set. For example:
SSN Name Street City
586-41-4125 Brown 2404 Maile Honolulu
332-18-6752 Yang 809 Dole St. Honolulu
121-898-5569 Gregson 211 Main St. Chicago
Customer
7
The table representation of an entity set
In the table, columns are called attributes, rows are called tuples, observations, or records. Each row is an entity, and the table is an entity set.
CustomerSSN Name Street City
586-41-4125 Brown 2404 Maile Honolulu
332-18-6752 Yang 809 Dole St. Honolulu
121-898-5569 Gregson 211 Main St. Chicago
8
RelationRelation• Definition: A relation is a named, two-dimensional table of data Definition: A relation is a named, two-dimensional table of data • Table consists of rows (records) and columns (attribute or field)Table consists of rows (records) and columns (attribute or field)• Requirements for a table to qualify as a relation:Requirements for a table to qualify as a relation:
– It must have a unique nameIt must have a unique name– Every attribute value must be atomic (not multivalued, not composite)Every attribute value must be atomic (not multivalued, not composite)– Every row must be unique (can’t have two rows with exactly the same Every row must be unique (can’t have two rows with exactly the same
values for all their fields)values for all their fields)– Attributes (columns) in tables must have unique namesAttributes (columns) in tables must have unique names– The order of the columns must be irrelevantThe order of the columns must be irrelevant– The order of the rows must be irrelevantThe order of the rows must be irrelevant
9
KeysIt is important that any entity in an entity set be
uniquely identifiable. Practically, we use the values of certain attributes to uniquely identify an entity.
Keys can be Keys can be simplesimple (a single field) or (a single field) or compositecomposite (more than one field)(more than one field)
10
Customer-name
Customer-street
customer
SSN
Customer-city
In this particular case, the value of the combination of any attribute(s) with SSN can uniquely identify a particular customer.
11
KeysMeanwhile, it is also obvious that some of
these combinations may have redundant attributes.
Therefore, in theory, we systematically classify these combinations of attributes using three types of Keys: superkey, candidate key, and primary key.
12
Keys(1) Superkey: is a set of attributes whose
value can uniquely identify an entity in the entity set.
A superkey contains one or more attributes. Example:
The combination of “SSN” and “Name” is a superkey of the following entity set customer.
13
Because: The value of attributes “SSN” and “Name”, such as 558-36-1234 and Susan, can uniquely identify that particular customer in customer entity set, which is the pool of all customers.
Customer-name
Customer-street
customer
SSN
Customer-city
14
Notationally, we write, Superkey: (SSN, name)
Other superkeys for the entity set customer: Superkey: (SSN)Superkey: (SSN, Street)Superkey: (SSN, City)Superkey: (SSN, Name, Street)Superkey: (SSN, Name, City)Superkey: (SSN, Street, City)Superkey: (SSN, Name, Street, City)
15
Superkey is the broadest definition of unique identifiers of an entity in an entity set.
We are unsurprisingly very interested in the most economical combination(s) of attributes that can uniquely identify any particular entity.
Therefore, we introduce Candidate Key next.
16
(2) Candidate key: is a set of one or more attributes whose value can uniquely identify an entity in the entity set, and any attribute in the candidate key cannot be omitted without destroying the uniqueness property of the candidate key. (It is minimal superkey).
Example:(SSN, Name) is NOT a candidate key, because taking out “name” still leaves “SSN” which can uniquely identify an entity. “SSN” is a candidate key of customer.Candidate key could have more than one attributes.
17
Also, while most entity sets have only one candidate key, some entity sets could have more than one candidate key.
Example: Both “SSN” and “License #” are candidate keys of Driver entity set.
In building a database in a database software, the software will only allow to use one candidate key to be the unique identifier of an entity for an entity set.
18
(3) Primary Key: is the candidate key that is chosen by the database designer as the unique identifier of an entity. The database designer chooses only one candidate key as the primary key in building the system.
Example: “SSN” and “License #” are both candidate keys of Driver entity set. The database designer can choose either one as the primary key.
Foreign Key· A foreign key is an attribute or attribute combination
of one relation R2 whose values are required to match those of the primary key of relation R1 where R1 and R2 are not necessarily distinct. Note that a foreign key and the corresponding primary key should be defined on the same domain(s).
Emp#e1e2e3
enameredblue
brown
Deptd1d2d3
Worksfordeptd1
d2
DnamePayTaxArt
Employee Dept
Foreign key
20
Primary Key
Foreign Key (implements 1:N relationship between customer and order)
Combined, these are a composite primary key (uniquely identifies the order line)…individually they are foreign keys (implement M:N relationship between order and product)
Figure 5-3 Schema for four relations (Pine Valley Furniture Company)
Domain
• For each attribute, a set of permitted values, is called a the domain
21
Relational Data Structure
EMPLOYEEEmp# Name Emp Sex Mgr Emp#
E1 Jones Male E65
E6 Smith Male E28
E28 Jones Female -
Relation Attribute
Heading
Body
FemaleMale
Gender
Domain
The Domain
Emp#E1E2E3
NameRed
BrownBlack
Mgr#-
E1E1
E1, E2, E3,E4Red, BrownBlack, Blue
Attributes
Domains
Employee
Person Name
Emp#
25
Overall, superkey is the broadest unique identifier; candidate key is a subset of superkey; and primary key is a subset of candidate key.
In practice, we would first look for superkeys. Then we look for candidate keys based on experience and common sense. If there is only one candidate key, it naturally will be designated as the primary key. If we find more than one candidate key, then we can designate any one of them as primary key.
Integrity Constraints
• It ensures that changes made to the database by authorized users do not result in a loss of data consistency.
• It guards against accidental damage to the database
26
Domain Constraints
• They define valid values for attributes
• They are the most elementary form of integrity constraint.
• They test values inserted in the database, and test queries to ensure that the comparisons make sense.
28
Integrity ConstraintsIntegrity Constraints
– NOT NULL: specification prohibits the insertion NOT NULL: specification prohibits the insertion of a null value for the attribute. Any data base of a null value for the attribute. Any data base modification that would cause a null to be modification that would cause a null to be inserted in an attribute declared to be not null inserted in an attribute declared to be not null generates an error diagnosticgenerates an error diagnostic
– E.g. balance numeric(12,2) not nullE.g. balance numeric(12,2) not null
29
Integrity ConstraintsIntegrity Constraints
– Unique Constraints: specification says that Unique Constraints: specification says that attributes A1,A2… Aj, form a candidate key; attributes A1,A2… Aj, form a candidate key;
– no two tuples in the relation can be equal on all no two tuples in the relation can be equal on all the primary key attributes.the primary key attributes.
30
Integrity ConstraintsIntegrity Constraints
– The Check Clause:ensures that attribute values The Check Clause:ensures that attribute values satisfy specified conditions.satisfy specified conditions.
– E.g.E.g.– Create table studentCreate table student(name char(15) not null,(name char(15) not null,Stud_id char(10),Stud_id char(10),Degree_level char(15),Degree_level char(15),Primary key(stud_id),check (degree_level in(‘Bachelors’, ‘Masters’)))Primary key(stud_id),check (degree_level in(‘Bachelors’, ‘Masters’)))The check clause ensures that the degree_level must be one of The check clause ensures that the degree_level must be one of
‘Bachelors’ or’Masters’‘Bachelors’ or’Masters’
Referential Integrity• Ensures that a value that appears in one relation for a given set of
attributes also appears for a certain set of attribute in another relation.– If an account exists in the database with branch name “Perryridge”, then
the branch “Perryridge” must actually exist in the database.
Primary keys ofrespective relations
Foreign key
branch (branch-name, branch-city, asset ) Perryridge Brooklyn 500,000
account ( account-no, branch-name, balance ) A-123 Perryridge 5000
A set of attributes X in R is a foreign key if it is not a primary key of R butit is a primary key of some relation S.
Assertions
• An assertion is a predicate expressing a condition that we wish the database to always satisfy.
• Domain constraints and referential integrity constraints are special forms of assertions.
• But there are many constraints we cannot express by using only these special forms.
• EX1: Every loan has atleast one customer who maintains an account with a minimum balance of Rs 1000.00
Creating an Assertion
• When an assertion is created, the system will test it for validity.
• If the assertion is valid, then any future modification to the database is allowed only if it does not cause the assertion to be violated.
• But assertions can create a considerable amount of overhead, especially if complex assertions have been made.