rdbms_day2
DESCRIPTION
IgnoreTRANSCRIPT
1
RDBMS-Day2
Continuation of ER modeling conceptsLogical Database designNormalization
2
ER/CORP/CRS/DB07/003
Version No: 2.02Copyright © 2004,
Infosys Technologies Ltd
Recap of Day1 session
• RDBMS handles data in the form of relations, tuples and fields
• Keys identify tuples uniquely
• ER modeling is a diagrammatic representation of the conceptual design of a database
• ER diagrams consist of Entity types, relationship types and attributes
Since day 2 is a continuation of day1 content, this recap is done here to maintain the continuity
3
ER/CORP/CRS/DB07/003
Version No: 2.03Copyright © 2004,
Infosys Technologies Ltd
Relationship participation
departmentEmployee headof
1 1
partial Total
•All instances of the entity type Employee don’t participate in the relationship, Head-of.
•Every employee doesn’t head a department. So, employee entity type is said to partially participate in the relationship.
•But, every department would be headed by some employee.
•So, all instances of the entity type Department participate in this relationship. So, we say that it is total participation from the department side.
4
ER/CORP/CRS/DB07/003
Version No: 2.04Copyright © 2004,
Infosys Technologies Ltd
Attributes of a Relationship
Doctor
Medicine
PatientPrescription
dosage
Number of days
These attributes best describe the relationship rather than any individual entity
5
ER/CORP/CRS/DB07/003
Version No: 2.05Copyright © 2004,
Infosys Technologies Ltd
Weak entity
Employee
E#
has dependant
id
1 N
name
The dependant entity is represented by a double lined rectangle and the identifying relationship by a double lined diamond
The identifying relationship is the one which relates the weak entity with the strong entity on which it depends
6
ER/CORP/CRS/DB07/003
Version No: 2.06Copyright © 2004,
Infosys Technologies Ltd
Extended ER features (EER diagrams)
• Supertypes & Subtypes
• Sometimes different entity types are actually specializations of a more general entity type
– Example: Rose, jasmine, lotus, lily are all flowers
• Some attributes are common to all, others are specific to one entity type
• Represented by a generalization hierarchy
• subtypes may be disjoint or overlapping
The attributes that are common belong to the supertype and those that are specific belong to the particular subtype
7
ER/CORP/CRS/DB07/003
Version No: 2.07Copyright © 2004,
Infosys Technologies Ltd
Supertypes & Subtypes
Flower
Jasmine Rose Lotus
Jasmineattributes
Roseattributes
Lotusattributes
Colour
dUUU
Subset symbol
This depicts a disjoint subset: – e.g a particular flower that is a jasmine can only belong to the entity set jasmine and it can’t also belong to the set Rose
8
ER/CORP/CRS/DB07/003
Version No: 2.08Copyright © 2004,
Infosys Technologies Ltd
Overlapping Subtypes
Artist
Singer Dancer actor
id
Singerattributes
Dancerattributes
actorattributes
DOB
oU
UU
Gender
Address
The diagram depicts a overlapping subtype relationship– e.g a person who’s a singer could also be a dancer.
9
Case study
10
ER/CORP/CRS/DB07/003
Version No: 2.010Copyright © 2004,
Infosys Technologies Ltd
A banking scenario
• Banks have customers. • Customers are identified by name, custid, phone number and address.• Customers can have one or more accounts • Accounts are identified by an account number, account type (savings, current)
and a balance. • Customers can avail loans.• Loans are identified by loan id, loan type (car, home, personal)• and an amount.• Banks are identified by a name, code and the address of the main office.• Banks have branches.• Branches are identified by a branch number, branch name and an address.• Accounts and loans are related to the banks’ branches.• Create an ER diagram for a database to represent this application
Note: The solution that follows is just a sample and not the only solution
11
ER/CORP/CRS/DB07/003
Version No: 2.011Copyright © 2004,
Infosys Technologies Ltd
Solution Step 1: Identify the entities
• Bank• Branch• Customer• Account• Loan
Solution Step 2: Identify attributes of entities
• Bank Branch CustomerName Branch# NameCode Branch name CustidAddress Address Phone
Address
Account LoanAccount number Loan idAccount type Loan typeBalance Amount
12
ER/CORP/CRS/DB07/003
Version No: 2.012Copyright © 2004,
Infosys Technologies Ltd
Solution Step 3: Identify relationships between entities
• Bank has Branch• Branch maintains accounts• Branch offers loans• Account is held by customer• Loan is availed by customer
Solution Step 4: Analyze cardinality of relationships
•Bank has Branch : A bank has many branches->1:N
•Branch maintains accounts: One branch maintains many accounts-> 1:N
•Branch offers loans : One branch offers many loans -> 1:N
•Account is held by customer -> M:N
•Loan is availed by customer ->M:N
one account may be held by many customers (joint accounts) and one customer may hold many accounts->
One loan may be availed by many customers (joint holders) and one customer may avail many loans (car, housing etc)
13
ER/CORP/CRS/DB07/003
Version No: 2.013Copyright © 2004,
Infosys Technologies Ltd
Solution Step 5: Identify weak entities if any
Branch : Depends on strong entity Bank
Solution Step 6: Identify participation types
• Bank has Branch -> both total• Branch maintains accounts-> Branch :partial Account: Total• Branch offers loans -> Branch: partial Loan: Total• Account is held by customer-> both Total• Loan is availed by customer-> Loan : Total Customer: Partial
14
ER/CORP/CRS/DB07/003
Version No: 2.014Copyright © 2004,
Infosys Technologies Ltd
Represented diagrammatically
An ER diagram is typically read top to bottom and left to right
The cardinality, the participation type, the attribute type etc need to be expressed to the extent to which it is required for the given problem.
15
Logical database designConverting ER diagrams to relational schema
Logical database design
Process of converting the conceptual model into an equivalent representation in the implementation model (relational/hierarchic/network etc.)
We will focus on the relational model
Relational database design
Convert ER model into relational schema (a specification of the table definitions and their foreign key links)
There are well defined rules for this conversion
16
ER/CORP/CRS/DB07/003
Version No: 2.016Copyright © 2004,
Infosys Technologies Ltd
Converting Strong entity types
• Each entity type becomes a table
• Each single-valued attribute becomes a column
• Derived attributes are ignored
• Composite attributes are represented by components
• Multi-valued attributes are represented by a separate table
• The key attribute of the entiry type becomes the primary key of the tableback
17
ER/CORP/CRS/DB07/003
Version No: 2.017Copyright © 2004,
Infosys Technologies Ltd
Entity example
• Here address is a composite attribute
• Years of service is a derived attribute (can be calculated from date of joining and current date)
• Skill set is a multi-valued attribute
• The relational Schema
Employee (E#, Name, Door_No, Street, City, Pincode, Date_Of_Joining)
Emp_Skillset ( E#, Skillset)
As per the rules:
Derived attributes are ignored
Composite attributes are represented by components
Multi-valued attributes are represented by a separate table
18
ER/CORP/CRS/DB07/003
Version No: 2.018Copyright © 2004,
Infosys Technologies Ltd
Converting weak entity types
• Weak entity types are converted into a table of their own, with the primary key of the strong entity acting as a foreign key in the table
• This foreign key along with the key of the weak entity form the composite primary key of this table
The Relational Schema
Employee (E# ,…….)
Dependant (Employee, Dependant_ID, Name, Address)
Here dependant is a weak entity. Dependant doesn’t mean anything to the problem without the information on for which employee the person is a dependant.
19
ER/CORP/CRS/DB07/003
Version No: 2.019Copyright © 2004,
Infosys Technologies Ltd
Converting relationships
• The way relationships are represented depends on the cardinality and the degree of the relationship
• The possible cardinalities are:• 1:1, 1:M, N:M
• The degrees are:• Unary• Binary• Ternary …
20
ER/CORP/CRS/DB07/003
Version No: 2.020Copyright © 2004,
Infosys Technologies Ltd
Unary 1:1
• Employee( E#, Name,... Married_to)
•Consider employees who are also a couple
•The primary key field itself will become foreign key in the same table
21
ER/CORP/CRS/DB07/003
Version No: 2.021Copyright © 2004,
Infosys Technologies Ltd
Unary 1:N
• Employee ( E#, Name,…,Manager)
•The primary key field itself will become foreign key in the same table
•Same as unary 1:1
22
ER/CORP/CRS/DB07/003
Version No: 2.022Copyright © 2004,
Infosys Technologies Ltd
Unary M:N
• There will be two resulting tables. One to represent the entity and another to represent the M:N relationship as follows
• Employee ( E#, Name,…)
• Guaranty ( Guarantor, beneficiary)
Employee
Guarantor_of
M
N
23
ER/CORP/CRS/DB07/003
Version No: 2.023Copyright © 2004,
Infosys Technologies Ltd
Binary 1:1
Case 1:Combination of participation types
• The primary key of the partial participant will become the foreign key of the total participant
• Employee ( E#, Name,…)
• Department (Dept#, Name…,Head)
departmentEmployee headof
1 1
partial
Total
24
ER/CORP/CRS/DB07/003
Version No: 2.024Copyright © 2004,
Infosys Technologies Ltd
Binary 1:1
Case 2: Uniform participation types
• The primary key of either of the participants can become a foreign key in the other
• Employee (E#,name…)
• Chair ( item#, model, location, used_by)• (or)• Employee ( E#, Name….Sits_on)
• Chair (item#,….)
Employee CHAIRSits_on
25
ER/CORP/CRS/DB07/003
Version No: 2.025Copyright © 2004,
Infosys Technologies Ltd
Binary 1:N
• The primary key of the relation on the “1” side of the relationship becomes a foreign key in the relation on the “N” side
• Teacher (ID , Name, Telephone, ...)
• Subject (Code , Name, ..., Teacher)
Teacher teaches Subject1 N
26
ER/CORP/CRS/DB07/003
Version No: 2.026Copyright © 2004,
Infosys Technologies Ltd
Binary M:N
• A new table is created to represent the relationship
• Contains two foreign keys - one from each of the participants in the relationship
• The primary key of the new table is the combination of the two foreign keys
• Book (Acc#,Title…) Employee (E#,Name,…)
• Issue (Book#, Borrower#)
Book Borrowed by EmployeeM N
27
ER/CORP/CRS/DB07/003
Version No: 2.027Copyright © 2004,
Infosys Technologies Ltd
Ternary relationship
• Represented by a new table
• The new table contains three foreign keys -one from each of the participating Entities
• The primary key of the new table is the combination of all three foreign keys
• Prescription (Doctor#, Patient #, Medicine_Name)
28
Normalization
29
ER/CORP/CRS/DB07/003
Version No: 2.029Copyright © 2004,
Infosys Technologies Ltd
A well structured table
• Well-structured table - contains minimal redundancy and allows users to insert, modify,and delete the rows without errors or inconsistencies.
• The possible anomalies could be
• Insertion Anomalies• Deletion Anomalies• Modification Anomalies
• An Example table:
Project# Project_name Member# MemberName
MemberAddress
Insertion Anomalies
Experienced when we attempt to store a value for one field but cannot do so because the value of another field is unknown
e.g., cannot add a new employee information until he /she is assigned a project
Deletion Anomalies
Experienced when value of a field is unexpectedly removed when value for another field is deleted
e.g., cannot delete the project detail (project could be dropped) of the employee without deleting the only copy of the employee information
Modification AnomaliesModification Anomalies
Experienced when changes to multiple records of a table are needed to update a single value of a field
e.g., for changing an employee’s address, it has to be changed in every row corresponding to the project he/she would be a member of
30
ER/CORP/CRS/DB07/003
Version No: 2.030Copyright © 2004,
Infosys Technologies Ltd
Normalization
• The formal process that can be followed to achieve a good database design
• Also used to check that an existing design is of good quality
• The different stages of normalization are known as “normal forms”
To understand this, we need to understand the concept of functional dependency
31
ER/CORP/CRS/DB07/003
Version No: 2.031Copyright © 2004,
Infosys Technologies Ltd
Functional dependency
• An attribute Y of a relation schema R is functionally dependent on another attribute X of R if the value of the attribute X uniquely determines the value of the attribute Y x -> y
• We say here “x determines y” or “y is functionally dependent on x”• X→Y does not imply that Y→X
• If the value of an attribute “Marks” is known then the value of an attribute “Grade” is determined since Marks→ Grade
• Types of functional dependencies:
– Full dependency– Partial dependency– Transitive dependency
32
ER/CORP/CRS/DB07/003
Version No: 2.032Copyright © 2004,
Infosys Technologies Ltd
Full dependencies
An attribute B of a relation R is fully functionally dependent on attribute A of R if it is functionally dependent on A & not functionally dependent on any proper subset of A.
Report( S#,C#,Title,Lname,Room#,Marks)
S#, C# Marks
This implies that for a given pair of (S#,C#) values occurring in the relation Report there is exactly one value of Marks. ie Marks are dependent on S# & C# as a composite pair, but not on either individually
33
ER/CORP/CRS/DB07/003
Version No: 2.033Copyright © 2004,
Infosys Technologies Ltd
Partial dependencies
Report( S#,C#,Title,Lname,Room#,Marks)
C# Title
C# LName
An attribute B of a relation R is partially dependent on attribute A of R if it is functionally dependent on any proper subset of A.
The attributes Title, LName are said to be partially dependent on the key (S#, C#) since they are dependent only on C# and not on S#.
34
ER/CORP/CRS/DB07/003
Version No: 2.034Copyright © 2004,
Infosys Technologies Ltd
Transitive dependencies
Report( S#,C#,Title,Lname,Room#,Marks)
C# LName LName Room#
An attribute B of a relation R is transitively dependent on attribute A of R if it is functionally dependent on an attribute C Which in turn is functionally dependent on A or any proper subset of A.
The attribute Room# is said to be transitively dependent on the key C# since it is dependent on
LName which in turn is dependent on C#.
35
ER/CORP/CRS/DB07/003
Version No: 2.035Copyright © 2004,
Infosys Technologies Ltd
First normal form: 1NF
• A relation schema is in 1NF if all of its attributes are: • single-valued • restricted to assuming atomic values, • functionally dependent on the primary key
• 1NF implies:• Composite attributes are represented only by their component attributes
• Attributes cannot have multiple values
• Attributes cannot have complete tuples as values
In relational database design it is not practically possible to have a table which is not in 1NF.
36
ER/CORP/CRS/DB07/003
Version No: 2.036Copyright © 2004,
Infosys Technologies Ltd
Prime Vs Non-Prime Attributes
• An attribute of a relation R that belongs to any key of R is said to be a primeattribute and that which doesn’t is a non-prime attribute
– E.g Report( S#,C#,Title,Lname,Room#,Marks)
• S# is a prime attribute• C# is a prime attribute• Title is a non-prime attribute
37
ER/CORP/CRS/DB07/003
Version No: 2.037Copyright © 2004,
Infosys Technologies Ltd
Second normal form: 2NF
• A relation schema R is in 2NF if it is in 1NF and every non-prime attribute is fully functionally dependent on every key of R
Consider the relational schema:Empdetails( E#, Project#, Role, Number_Of_shares, Share_worth)
In this,E#, Project# -> Role
E# -> Number_Of_shares
Number_of_shares depends only on E# irrespective of the project currently working on. i.e. partial dependency
38
ER/CORP/CRS/DB07/003
Version No: 2.038Copyright © 2004,
Infosys Technologies Ltd
A typical snapshot may look like…
DVMLDV
MLDVML
Role
5,00,0007,50,0005,00,000
5,00,00010,00,0005,00,000
100150100
100200100
AbcPpqAbc
HjkEdfBnm
102119198
102127102
Share_worthNum_of_sharesProject#E#
The share worth is fixed for number of shares and is unnecessarily repeated
The stock detail is repeated for every project information of an employee. So if more shares are allotted to the employee, it needs to be updated on multiple rows
If an employee is not yet allotted to any project, the remaining information can’t be captured
39
ER/CORP/CRS/DB07/003
Version No: 2.039Copyright © 2004,
Infosys Technologies Ltd
After decomposing…
Empdetails( E#, Project#, Role, Loan_amount, Loan_type, Interest_rate)
Becomes
Emp_Project (E#, P#, Role)
Emp_Stock (E#, Num_of_Shares, Share_worth)
• This avoids the anomalies that were present in the original relation
• Still redundancy remains due to the transitive dependency
E#->Num_of_shares
Num_of_shares->Share_worth
40
ER/CORP/CRS/DB07/003
Version No: 2.040Copyright © 2004,
Infosys Technologies Ltd
Third normal form:3 NF
• Applying this, the relationEmp_Stock (E#, Num_of_Shares,Share_worth)
Will be decomposed into
Emp_Stock( E#, Num_of_Shares)StockWorth (Num_of_Shares,Share_worth)
A relation schema R is in 3NF if it is in 2NF and every non-prime attribute is non-transitively dependent on every key of R
41
ER/CORP/CRS/DB07/003
Version No: 2.041Copyright © 2004,
Infosys Technologies Ltd
Boyce-Codd normal form:BCNF
A relation R is in BCNF if, for every non-trivial functional dependency A->B in it, itis true that A is a superkey of R In other words, every determinant is a candidate key
•BCNF is a stronger form of 3NF
•3NF states that every non-prime attribute must be non-transitively dependent on every key
•BCNF states that every attribute (prime or non-prime ) must be non-transitively dependent on every key
42
ER/CORP/CRS/DB07/003
Version No: 2.042Copyright © 2004,
Infosys Technologies Ltd
An example
• Consider the relation:– Courses (Dept#, Course#, Lecturer#, Num_Students)
• Assumptions:– Each Department offers may courses– Course# is unique within a Department only*– Each Lecturer belongs to one Dept only– Each Lecturer may handle several courses within the dept.– A particular course offered by a department may be handled by a single
lecturer.
* The same course id may refer to a different course offered by a different department
43
ER/CORP/CRS/DB07/003
Version No: 2.043Copyright © 2004,
Infosys Technologies Ltd
The functional dependencies
• {Dept#,Course#}->Lecturer#
• {Dept#,Course#}-> Num-of_students
• {Lecturer#,Course#}->Num-of_students
• Lecturer# -> Course#
The candidate keys are:
• {Dept#,Course#}
• {Lecturer#,Course#}
44
ER/CORP/CRS/DB07/003
Version No: 2.044Copyright © 2004,
Infosys Technologies Ltd
A sample table
201542
…1219
L1L1L2
…L3L4
C1C2C3
…C5C6
D1D1D1
…D2D2
Num_studentsLecturer#Course#Dept#
45
ER/CORP/CRS/DB07/003
Version No: 2.045Copyright © 2004,
Infosys Technologies Ltd
Observations
• In the table, the only non-prime attribute is Num-of_students.
• It depends on every key of the table non-transitively
• So, it is in 3NF
• But, the fact that Lecturer L1 belongs to department D1 is repeated –redundancy
• Lecturer#->Dept#. In this, the attribute Dept# is only partially dependent on the key
The solution• Course_Offering(Lecturer#, Course#, Num-of-Students)
• Lecturer(Lecturer#, Dept#)
Every determinant is a candidate key. So, BCNF
46
ER/CORP/CRS/DB07/003
Version No: 2.046Copyright © 2004,
Infosys Technologies Ltd
Denormalization
• Denormalization can be described as a process for reducing the degree of normalization with the aim of improving query processing performance
• However, reducing the degree of normalization of a table may lead to inconsistencies and this option has to be dealt with after careful thinking
• The usefulness of Denormalization is as such debatable
Normalised tables
After Denormalization
Denormalization may not always give an optimal solution
47
ER/CORP/CRS/DB07/003
Version No: 2.047Copyright © 2004,
Infosys Technologies Ltd
Exercise- Recruitment • The HR dept of an organization is planning for a big recruitment drive.• They wish to organize the data required for the process, in a database. The data that
needs to be captured is as follows:
Applicant Interviewer QualificationsEnrollment id, Employee id Qualification(hsc, sslc, etc), Name, Name year of passing,Address, Extension number University/boardDate of birth, classGender, Telephone no Previous employment
Employer nameAddressTelephoneDesignationReason for leavingDate joinedDate leftLast drawn salary.
48
ER/CORP/CRS/DB07/003
Version No: 2.048Copyright © 2004,
Infosys Technologies Ltd
Functional dependencies
• Enroll#, -> Name,• Enroll ->Address,• Enroll ->DOB, • Enroll -> Gender,• Enroll -> Phone, • Enroll -> interviewer• Interviewer -> Int_Name (transitive dependency)• Interviewer -> Extension (transitive dependency)
Qualifications{Enroll#, qualification, year_of_passing } -> awarded_by{Enroll#, qualification, year_of_passing } -> class
Assumptions:A person may acquire the same qualification several times from the same
university (e.g M.A in english, M.A in history)Only one degree can be obtained in an year
49
ER/CORP/CRS/DB07/003
Version No: 2.049Copyright © 2004,
Infosys Technologies Ltd
Functional dependencies
EmploymentEnroll#, Employername,date_joined ->designationEnroll#, Employername, date_joined ->
reason_for_Leaving, Enroll#, Employername, date_joined -> date_leftEnroll#, Employername, date_joined -> last_slaryEmployername -> address (partial dependency)Employername -> telephone (partial dependency)
50
ER/CORP/CRS/DB07/003
Version No: 2.050Copyright © 2004,
Infosys Technologies Ltd
1NF
• Applicant ( Enroll#, Name, Address, DOB, Gender, Phone, interviewer, Int_Name, Extension)
• Qualifications ( Enroll#, qualification, year_of_passing, awarded by ,class)
• Employment ( Enroll#, Employername, date_joined, address, telephone, designation, reason_for_Leaving, date_left, last_slary)
51
ER/CORP/CRS/DB07/003
Version No: 2.051Copyright © 2004,
Infosys Technologies Ltd
2NF
• Applicant ( Enroll#, Name, Address, DOB, Gender, Phone, interviewer, Int_Name, Extension)
• Qualifications ( Enroll#, qualification, year_of_passing, awarded_by ,class)
• Employment ( Enroll#, Employername, date_joined, designation, reason_for_Leaving, date_left, last_slary )
• Employer ( Employername, Address, Phone)
Removal of partial dependencies
52
ER/CORP/CRS/DB07/003
Version No: 2.052Copyright © 2004,
Infosys Technologies Ltd
3NF
• Applicant ( Enroll#, Name, Address, DOB, Gender, Phone, interviewer)
Panel ( interviewer, Name, Extn)
Removal of transitive dependencies
53
Class work
54
ER/CORP/CRS/DB07/003
Version No: 2.054Copyright © 2004,
Infosys Technologies Ltd
Caterers’ association
• A caterer’s association in a city wants to build a database of all the caterers who are members of the association. Every type of item has got an item code for e.g idly- itemcode bf1, dosa bf2, aloo paratha- bf 3, south indian meals –ln1etc. Each caterer may have outlets spread over the city. It is not necessary that all caterers or all branches of a single caterer must provide all item types. Design the relational schema for the above requirement. Normalize the relations.
55
ER/CORP/CRS/DB07/003
Version No: 2.055Copyright © 2004,
Infosys Technologies Ltd
The data
• Unnormalized Data Items for Caterer’s AssociationCaterer name Membership id main location Branches 1..n
item ID 1...nitem Name 1...nlocation where available 1...nCaterer grade
56
ER/CORP/CRS/DB07/003
Version No: 2.056Copyright © 2004,
Infosys Technologies Ltd
Summary
• Entity types could participate in relationships fully or partially• Extended ER features represent generalization-specialization hierarchy etc• There are guidelines for converting entities, relationships , attributes into
corresponding equivalent in relational model• Functional dependencies could be full, partial, or transient• If a table contains atomic values then it is in 1NF• If a table is in 1NF and also free of partial dependencies, it is in 2NF• If a table is in 2NF and is free of transitive dependencies then it is in 3NF• If a table is in 3NF and every determinant is a candidate key, then it is in BCNF
57
ER/CORP/CRS/DB07/003
Version No: 2.057Copyright © 2004,
Infosys Technologies Ltd
Thank You!