dbms participant guide for campus

142

Upload: giri-saranu

Post on 13-Nov-2014

7.260 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: DBMS Participant Guide for Campus
Page 2: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 2

Page 3: DBMS Participant Guide for Campus

Introduction to

Database Management Systems

(DBMS)

Page 4: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 4

Database Management System (DBMS)

Definitions:

Data: Known facts that can be recorded and that have implicit meaning

Database: Collection of related data Ex. the names, telephone numbers and addresses

of all the people you know

Database Management System: A computerized record-keeping system

Page 5: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 5

DBMS (Contd.) Goals of a Database Management System:

To provide an efficient as well as convenient environment for accessing data in a database

Enforce information security: database security, concurrence control, crash recovery

It is a general purpose facility for: Defining database

Constructing database

Manipulating database

Page 6: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 6

Benefits of database approach

Redundancy can be reduced Inconsistency can be avoided Data can be shared Standards can be enforced Security restrictions can be applied Integrity can be maintained Data independence can be provided

Page 7: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 7

DBMS Functions

Data Definition Data Manipulation Data Security and Integrity Data Recovery and Concurrency Data Dictionary Performance

Page 8: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 8

Database System

Stored Data Defn. Stored Database

Software to access stored data

Software to process queries/programs

DBMS

Software

Application Programs/Queries

Users

DATABASE

SYSTEM

(META-DATA).

Page 9: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 9

Database System

user query Q1

Database scheme

Application program query

Q2

Query processor DDL compiler

Database manager

File manager

Physical database

Compiled query Q2 Database

description

Page 10: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 10

Categories of Data Categories of Data ModelsModels

ConceptuaConceptuall

PhysicalPhysical RepresentationRepresentationalal

Data Model

A set of concepts describing the structure of a database

By structure, we mean the data types, relationships, and constraints that should holds for the data

Page 11: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 11

Database Architecture

Internal level(storage view)

Conceptual level(community user view)

External level(individual user views)

Database

Page 12: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 12

An example of the three levels

SNo FName LName Age Salary

SNo FName LName Age Salary

SNo LName BranchNo

struct STAFF { int staffNo; int branchNo; char fName[15]; char lName[15]; struct date dateOfBirth; float salary; struct STAFF *next; /* pointer to next Staff record */};index staffNo; index branchNo; /* define indexes for staff */

BranchNo

Conceptual View

External View1

External View2

Internal View

Page 13: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 13

Schema Schema: Description of data in terms of a

data model Three-level DB Architecture defines

following schemas: External Schema (or sub-schema)

Written using external DDL

Conceptual Schema (or schema) Written using conceptual DDL

Internal Schema Written using internal DDL or storage structure

definition

Page 14: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 14

Data Independence

Change the schema at one level of a database system without a need to change the schema at the next higher level Logical data independence: Refers to the immunity

of the external schemas to changes in the conceptual schema e.g., add new record or field

Physical data independence: Refers to the immunity of the conceptual schema to changes in the internal schema e.g., adding new index should not void existing ones

Page 15: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 15

HIERARCHICAL

NETWORK

RELATIONAL

TABLEROW

COLUMN

VALUE

TYPES OF DATABASE MODELS

Page 16: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 16

DATA ANALYSIS

Entities - Attributes - Relationships - Integrity Rules

LOGICAL DESIGN

Tables - Columns - Primary Keys - Foreign Keys

PHYSICAL DESIGN

DDL for Tablespaces, Tables, Indexes

DATABASE DESIGN PHASES

Page 17: DBMS Participant Guide for Campus

Introduction to Relational Databases:

RDBMS

Page 18: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 18

Definition : RDBMS

It is a system in which, at a minimum :

The data is perceived by the user as tables

( and nothing but tables ); and

The operators at the user’s disposal - e.g., for

data retrieval - are operators that generate

new tables from old, and those include at least

SELECT, PROJECT, and JOIN.

Page 19: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 19

Features of an RDBMS

The ability to create multiple relations (tables) and enter data into them

An interactive query language Retrieval of information stored in more

than one table Provides a Catalog or Dictionary, which

itself consists of tables ( called system tables )

Page 20: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 20

Some Important Terms

Relation : a table

Tuple : a row in a table

Attribute : a Column in a table

Degree : number of attributes

Cardinality : number of tuples

Primary Key : a unique identifier for the table

Domain : a pool of values from which specific

attributes of specific relations draw their values

Page 21: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 21

Properties of Relations (Tables)

There are no duplicate rows (tuples)

Tuples are unordered, top to bottom

Attributes are unordered, left to right

All attribute values are atomic ( or scalar )

Relational databases do not allow

repeating groups

Page 22: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 22

Keys

Key

Super Key

Candidate Keys Primary Key

Alternate Key

Secondary Keys

Page 23: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 23

Keys and Referential Integrity

sid cid grade

53666 carnatic101 C

53688 reggae203 B

53650 topology112 A

53666 history105 B

sid name age

53666 Jones 18

53688 Smith 18

53650 Smith 19

gpa

3.4

3.2

3.8

login

Jones@cs

Smith@eecs

Smith@math

Enrolled Student

Primary keyForeign key referring tosid of STUDENT relation

Page 24: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 24

Page 25: DBMS Participant Guide for Campus

Relational Algebra

Page 26: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 26

Relational Query Languages

Query languages: Allow manipulation and retrieval of data from a database.

Relational model supports simple, powerful QLs: Strong formal foundation based on logic. Allows for much optimization.

Query Languages != programming languages!

Page 27: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 27

Example Instancessid bid

22 101

58 103

day

10/10/99

11/12/99

sid sname age

22 Deepa 45.0

31 Laxmi 55.5

58 Roopa 35.0

rating

7

8

10

sid sname age

28 Yamuna 35.0

31 Laxmi 55.5

44 Geeta 35.0

rating

9

8

5

58 Roopa 35.010

R1

S1

S2

Page 28: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 28

Relational Algebra

Basic operations: Selection ( ) Projection () Cross- product ( ) Set- difference ( –) Union ( )

Page 29: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 29

Projection

sname

Yamuna

Laxmi

Geeta

rating

9

8

5

Roopa 10

age

35.0

sname, rating(S2)

age(S2)55.5

Page 30: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 30

Selection

sid sname age

28 Yamuna 35.0

rating

9

58 Roopa 35.010

rating > 8(S2)

sname

Yamuna

rating

9

Roopa 10 sname, rating(S2) (rating > 8(S2))

Page 31: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 31

Union, Intersection, Set Difference

sid sname age

22 Deepa 45.0

31 Laxmi 55.5

58 Roopa 35.0

rating

7

8

10

44 Geeta 35.0

28 Yamuna 35.0

5

9

sid sname age

22 Deepa 45.0

rating

7

sid sname age

31 Laxmi 55.5

58 Roopa 35.0

rating

8

10

S1 S2

S1 S2

S1 S2

Page 32: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 32

Cross- Product

(sid) bid

22 101

58 103

day

10/10/99

11/12/99

(sid) sname age

22 Deepa 45.0

22 Deepa 45.0

31 Laxmi 55.5

rating

7

7

8

31 Laxmi 55.5

58 Roopa 35.0

58 Roopa 35.0

8

10

10

22 101 10/10/99

58 103 11/12/99

22 101 10/10/99

58 103 11/12/99

Page 33: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 33

Joins

Condition Join :

(sid) bid

58 103

58 103

day

11/12/99

11/12/99

(sid) sname age

22 Deepa 45.0

31 Laxmi 55.5

rating

7

8

Page 34: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 34

Equi-Join

bid

101

103

day

10/10/99

11/12/99

(sid) sname age

22 Deepa 45.0

58 Roopa 35.0

rating

7

10

Page 35: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 35

Division

sno pno

s1 p1

s1 p2

s1 p3

s1 p4

s2 p1

s2 p2

s3 p2

s4 p2

s4 p4

Apno

p2

pno

p1

p2

p4

pnop2p4

B1B2

B3snos1s2s3s4

sno

s1

s4

sno

s1

A/B1 A/B2 A/B3

•Not supported as a primitive operator, but useful for expressing queries like:

•Find sailors who have reserved all boats .

Page 36: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 36

Page 37: DBMS Participant Guide for Campus

Introduction to Query Optimization

Page 38: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 38

Processing A High-level Query

Query in a high level language

Intermediate form of query

Execution plan

Code to execute the query

SCANING, PARSING AND VALIDATING

QUERY OPTIMIZER

QUERY CODE GENERATOR

Result of query

RUNTIME DATABASE PROCESSOR

Typical steps when processing a high level query.

Page 39: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 39

Two Main Techniques for QueryOptimization

Heuristic Rules: A heuristic is a rule that works well in most of cases, but not always. General Idea: Many different relational algebra expressions (and thus query

trees) are equivalent. Transform the initial query tree of a query into an equivalent

final query tree that is efficient to execute.

Cost based query optimization Estimate the cost for each execution plan, and choose the

one with the lowest cost.

Can we get the best execution plan?

Page 40: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 40

Motivating Example

select *from R1, R2, R3where R1.r2no=R2.r2noand R2.r3no=R3.r3noand R1.a=5000

NLJ

SS(R2) SS(R3)

NLJ

SS(R1, “a=5000”)

Page 41: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 41

Alternative Plans 1(No Indexes)

select *from R1, R2, R3where R1.r2no=R2.r2noand R2.r3no=R3.r3noand R1.a=5000

NLJ

SS(R1, “a=5000”) SS(R2)

NLJ

SS(R3)

Page 42: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 42

Alternative Plans 2 (With Indexes)

select *from R1, R2, R3where R1.r2no=R2.r2noand R2.r3no=R3.r3noand R1.a=5000

NLJ

IS(R1, “a=5000”) SS(R2)

NLJ

SS(R3)

Page 43: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 43

Page 44: DBMS Participant Guide for Campus

Conceptual Design Using theEntity- Relationship Model

Page 45: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 45

Overview of Database Design

Conceptual design : (ER Model is used at this stage.)

Schema Refinement : (Normalization)

Physical Database Design and Tuning

Page 46: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 46

E R Modeling

Conceptual Schema Design Relational Calculus

- Formal Language for Relational D/B.

Relational Calculus

Predicate Calculus Domain Calculus

SQL / Tuple Based Query By Examples

Page 47: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 47

Design Phases…Requirements Collection

& Analysis

Data Requirements

Functional Requirements Conceptual Design

Logical Design

Physical Design

User Defined Operations Data Flow DiagramsSequence Diagrams, Scenarios

Entity Types, Constraints , RelationshipsNo Implementation Details.

Ensures Requirements Meets the Design

Data Model Mapping – Type of Database is identified

Internal Storage Structures / Access Path / File Organizations

Page 48: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 48

E-R Modeling

Entity is anything that exists and is distinguishable

Entity Set a group of similar entities

Attribute properties that describe an entity

Relationship an association between entities

Page 49: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 49

Notations

ENTITY TYPE ( REGULAR )

WEAK ENTITY TYPE

RELATIONSHIP TYPE

WEAK RELATIONSHIP TYPE

Page 50: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 50

CREATE TABLE Employees(ssn CHAR (11),name CHAR (20),lot INTEGER,PRIMARY KEY (ssn))

Employee

ssn name lotSSN NAME LOT

123- 22- 3666 Attishoo 48

231- 31- 5368 Smiley 22

131- 24- 3650 Smethurst 35

Entity

Entity Set

Attributes

Page 51: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 51

Types of Relationships

student ID cardIs issued

students courseenrols in

students teststake

1 1

1M

M M

1:1

1:M

M:M

Page 52: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 52

ER Model

Department

did dname budgetsincesince

Works_inEmployee

ssn name lot

Reports_To

supervisor Sub-ordinate

Page 53: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 53

CREATE TABLE Works_ In(ssn CHAR (11),did INTEGER,since DATE,PRIMARY KEY (ssn, did),FOREIGN KEY (ssn)REFERENCES Employees,FOREIGN KEY (did)REFERENCES Departments)

SSN DID SINCE

123-22-3666 51 1/1/91

123-22-3666 56 3/3/93

231-31-5368 51 2/2/92

ER Model (Contd.)

Works_ In

Page 54: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 54

ManagesDepartment

did dname budgetsince

Employee

ssn name lot

Key Constraints

Page 55: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 55

Key Constraints for Ternary Relationships

Department

did dnamesince

Works_inEmployee

ssn name lotbudget

Location

capacityaddress

Page 56: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 56

Participation Constraints

Department

did dname budgetsince

ManagesEmployee

ssn name lot

Works_in

since

Page 57: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 57

policyDependent

pnameagecost

Employee

ssn name lot

Weak Entities

Page 58: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 58

ISA (‘is a’) Hierarchies

Employee

ssn name lot

Hourly_Emp

Hrs_worked

Hrly_wages

Contract_Emp

contractidIsA

Page 59: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 59

Employee

ssn name lot

monitors

project

pid pbudget Started on

department

did dname budget

sponsors

until

Aggregation

Page 60: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 60

Works_ In does not allow an employee to work in a department for two or more periods (why?)

Entity vs. Attribute

Works_inDepartment

did dname budgetfrom

Employee

ssn name lot to

Page 61: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 61

Entity vs. Attribute (Contd.)

Works_inDepartment

did dname budget

from

Employee

ssn name lot

toDuration

Page 62: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 62

managesDepartment

did dname budgetsince

Employee

ssn name lot DB

DB - Dbudget

Entity vs. Relationship

Page 63: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 63

managesDepartment

did dname budget

since

Employee

ssn name lot

DBudgetMgr_apptAppt num

Entity vs. Relationship

Page 64: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 64

Dependent

pname age

cost

Employee

ssn name lot

covers

Policy

policyid

Binary vs. Ternary Relationships

Page 65: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 65

Dependent

pnameage

cost

Employee

ssn name lot

Beneficiary

Policypolicyid

Better Design

purchaser

Binary vs. Ternary Relationships

Page 66: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 66

• Some constraints cannot be captured in ER diagrams:

• Functional dependencies

• Inclusion dependencies

• General constraints

Constraints Beyond the ER Model

Page 67: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 67

E-R Diagram

DEPARTMENT

DEPT_EMP

EMPLOYEE

EMP_DEP

DEPENDENT

PROJ_WORK

PROJ_MGR

PROJECT

SUPPLIER

SUPP_PART_PROJ

PART

PART_STRUCTURE

SUPP_PART

MM

M

M

M

M

M

M

M M

M

M

1

1 1

Page 68: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 68

Example to Start with ….

An Example Database Application called COMPANY which serves to illustrate the ER Model concepts and their schema design.

The following are collection from the Client.

Page 69: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 69

Analysis…

Company :Organized into Departments, Each Department has a name, no and manager who manages the department. The Company keeps track of the date that employee managing the department. A Department may have a Several locations.

Page 70: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 70

Analysis…

Department :A Department controls a number of Projects each of which has a unique name , no and a single Location.

Employee :Name, Age, Gender, BirthDate, SSN, Address, Salary. An Employee is assigned to one department, may work on several projects which are not controlled by the department. Track of the number of hours per week is also controlled.

Page 71: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 71

Analysis….

Keep track of the dependents of each employee for insurance policies : We keep each dependant first name, gender, Date of birth and relationship to the employee.

Page 72: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 72

Now to our Company…

DEPARTMENT ( Name , Number , { Locations } , Manager, Start Date )

PROJECT( Name, Number, Location , Controlling Department )

EMPLOYEE(Name (Fname, Lname) , SSN , Gender, Address, Salary

Birthdate, Department , Supervisor , (Workson ( Project , Hrs))

DEPENDENT ( Employee, Name, Gender, Birthdate , Relationship )

Page 73: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 73

Example …

Manage: Department and Employee Partial Participation

Relation Attribute : StartDate. Works For:

Department and Employee Total Participation

Page 74: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 74

Example…

Control : Department , Project Partial Participation from Department Total Participation from Project Control Department is a RKA.

Supervisor : Employee, Employee Partial and Recursive

Page 75: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 75

Example …

Works – On : Project , Employee Total Participation Hours Worked is a RKA.

Dependants of: Employee , Dependant Dependant is a Weaker Dependant is Total , Employee is Partial.

Page 76: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 76

One Possible mapping of the Problem Statement

Works For Department

Name No Loc

Controls

Project

Name No Loc

WorksOn

manages

Sdate

Hours

Depend On

Name Sex Bdate

Relationship

Supervise

s

Employee Address

Fname

SexSSN

Name

Bdate

Sal

Lname

Dependent

Page 77: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 77

Page 78: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 78

Page 79: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 79

Page 80: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 80

Page 81: DBMS Participant Guide for Campus

Schema Refinement andNormalization

Page 82: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 82

Normalization and Normal Forms

Normalization: Decomposing a larger, complex table into several

smaller, simpler ones. Move from a lower normal form to a higher Normal

form. Normal Forms:

First Normal Form (1NF) Second Normal Form (2NF) Third Normal Form (3NF) *Higher Normal Forms (BCNF, 4NF, 5NF ....)

In practice, 3NF is often good enough.

Page 83: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 83

Why Normal Forms

The first question to ask is whether any refinement is needed!

If a relation is in a certain normal form (BCNF, 3NF etc.), it is known that certain kinds of problems are avoided/ minimized. This can be used to help us decide whether decomposing the relation will help.

Page 84: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 84

The Evils of Redundancy

Redundancy is at the root of several problems associated with relational schemas

More seriously, data redundancy causes several anomalies: insert, update, delete

Wastage of storage. Main refinement technique: decomposition

(replacing ABCD with, say, AB and BCD, or ACD and ABD).

Page 85: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 85

Refining an ER Diagram - Before

Department

did dname budgetsince

Works_inEmployee

ssn name lot

Page 86: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 86

Refining an ER Diagram - After

Works_in

since

Employee

ssn name

lot

Department

did dname budget

Page 87: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 87

First Normal Form

A table is in 1NF, if every row contains exactly one value for each attribute.

Disallow multivalued attributes, composite attributes and their combinations.

1NF states that : domains of attributes must include only atomic (simple,

indivisible) values and that value of any attribute in a tuple must be a single value from the domain of that attribute.

By definition, any relational table must be in 1NF.

Page 88: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 88

Functional Dependencies (FDs)

Provide a formal mechanism to express constraints between attributes

Given a relation R, attribute Y of R is functionally dependent on the attribute X of R if & only if each X-value in R has associated with it precisely one Y-value in R.

Page 89: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 89

Full Dependency

Concept of full functional dependency A FD x y is a full functional dependency if

removal of any attribute A from X means that the dependency does not hold any more.

Page 90: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 90

Partial Dependency

An F.D. x y is a partial dependency if

there is some attribute A X that can be removed from X and the dependency will still hold.

Page 91: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 91

Example: Constraints on Entity Set

123- 22- 3666 Attishoo231- 31- 5368131- 24- 3650434- 26- 3751612- 67- 4134

SmileySmethurstGulduMadayan

4822353535

88558

1010

77

10

4030303240

S N L R W H

58

710

R W123- 22- 3666 Attishoo231- 31- 5368131- 24- 3650434- 26- 3751612- 67- 4134

SmileySmethurstGulduMadayan

4822353535

S N L4030303240

H8R

85

58

Page 92: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 92

Second Normal Form (2NF)

A relation schema R is in 2NF if: it is in 1NF and

every non-prime attribute A in R is fully

functionally dependent on the primary key of

R.

2NF prohibits partial dependencies.

Page 93: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 93

2NF: An Example

Emp{Eno, Dept, ProjCode, Hours} Primary key: {Eno, ProjCode} {Eno} -> {Dept}, {Eno, ProjCode} -> {Hours}

Test of 2NF {Eno} -> {Dept}: partial dependency. Emp is in 1NF, but not in 2NF.

Decomposition: Emp {Eno, Dept} Proj {Eno, ProjCode, Hours}

Page 94: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 94

Transitive Dependency

An FD X Y in a relation schema R is a transitive dependency if there is a set of attributes Z that is not a

subset of any key of R, and

both X Z and Z Y hold.

Page 95: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 95

Third Normal Form

A relation schema R is in 3NF if It is in 2NF and

No nonprime attribute of R is transitively

dependent on the primary key.

3NF means that each non-key attribute value in any tuple is truly dependent on the Primary Key and not even partially on other attributes.

3NF prohibits transitive dependencies.

Page 96: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 96

3NF: An Example

Emp{Eno, Dept, Dept_Head} Primary key: {Eno} {Eno} -> {Dept}, {Dept} -> {Dept_Head}

Test of 3NF {Eno} -> {Dept} -> {Dept_Head}: Transitive

dependency. Emp is in 2NF, but not in 3NF.

Decomposition: Emp {Eno, Dept} Dept {Dept, Dept_Head}

Page 97: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 97

Boyce –Codd Normal Form

The intention of BCNF is that- 3NF does not satisfactorily handle the case of a relation processing two or more composite or overlapping candidate keys

Page 98: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 98

BCNF ( Boyce Codd Normal Form)BCNF ( Boyce Codd Normal Form)

A Relation is said to be in Boyce Codd Normal Form (BCNF) if and only if every determinant is a candidate key.

Page 99: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 99

Decomposition of a Relation Scheme

Suppose that relation R contains attributes A1 ... An. A decomposition of R consists of replacing R by two or more relations such that: Each new relation scheme contains a subset of

the attributes of R (and no attributes that do not appear in R), and

Every attribute of R appears as an attribute of one of the new relations.

Page 100: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 100

Page 101: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 101

Page 102: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 102

Page 103: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 103

Page 104: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 104

Page 105: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 105

Page 106: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 106

Page 107: DBMS Participant Guide for Campus

Transaction, Concurrency Control and Recovery

Page 108: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 108

Transaction

A sequence of many actions which are considered to be one atomic unit of work. Read, write, commit, abort

Governed by four ACID properties: Atomicity, Consistency, Isolation, Durability

Has a unique starting point, some actions and one end point

Page 109: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 109

The ACID Properties

A tomicity: All actions in the transaction happen, or none happen.

C onsistency: If each transaction is consistent, and the DB starts consistent, it ends up consistent.

I solation: Execution of one transaction is isolated from that of other transactions.

D urability: If a transaction commits, its effects persist.

Page 110: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 110

Automicity

All-or-nothing, no partial results. An event either happens and is committed or fails and is rolled back. e.g. in a money transfer, debit one account, credit the

other. Either both debiting and crediting operations succeed, or neither of them do.

Transaction failure is called Abort Commit and abort are irrevocable actions. There is no undo

for these actions. An Abort undoes operations that have already been

executed For database operations, restore the data’s previous

value from before the transaction (Rollback-it); a Rollback command will undo all actions taken since the last commit for that user.

But some real world operations are not undoable.Examples - transfer money, print ticket, fire missile

Page 111: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 111

Consistency

Every transaction should maintain DB consistency Referential integrity - e.g. each order references

an existing customer number and existing part numbers

The books balance (debits = credits, assets = liabilities)

Consistency preservation is a property of a transaction, not of the database mechanisms for controlling it (unlike the A, I, and D of ACID)

If each transaction maintains consistency, then a serial execution of transactions does also

Page 112: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 112

Isolation

Intuitively, the effect of a set of transactions should be the same as if they ran independently. Formally, an interleaved execution of transactions is

serializable if its effect is equivalent to a serial one. Implies a user view where the system runs each

user’s transaction stand-alone. Of course, transactions in fact run with lots of

concurrency, to use device parallelism – this will be covered later.

Transactions can use common data (shared data) They can use the same data processing

mechanisms (time sharing)

Page 113: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 113

Durability

When a transaction commits, its results will survive failures (e.g. of the application, OS, DB system … even of the disk).

Makes it possible for a transaction to be a legal contract.

Implementation is usually via a log DB system writes all transaction updates to a log file to commit, it adds a record “commit(Ti)” to the log when the commit record is on disk, the transaction is

committed. system waits for disk ack before acknowledging to

user

Page 114: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 114

Transaction processing

Can be automatic (controlled by the RDBMS) or programmatic (programmed using SQL or other supported programming languages, like PL/SQL)

Page 115: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 115

Why Have Concurrent Processes?

Better transaction throughput Improved response time Done via better utilization of resources:

While one processes is doing a disk read, another can be using the CPU or reading another disk.

Page 116: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 116

Typical situations requiring concurrency control

Exclusive access to an external device or shared service (e.g., managing printer queues)

Coordination of applications which process parallel data (e.g. parallel DB servers)

Disabling or enabling execution of the client programs in a specific moment (typically for database administration - e.g. database backups, enforcing resource occupation, etc.)

Detection of transaction ends when managing multiple sessions for connection to the database (client/server architectures, Web access)

Page 117: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 117

Problems with Concurrency (in absence of locking)

Lost Update problem - losing values due to intervention of write operation from other overlapping transactions

Temporary Update problem - discarding previous changes made by overlapping transaction after rollback

Incorrect Summary problem - overwriting of certain

values used for calculation by write operations from other transactions

Page 118: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 118

Lost Update Problem

Time

T0

Transaction A

Transaction B

Value

Start A 6

T1Read Value

(6)6

T2 Add 2 (6+2=8) Read Value(6)

6

T3 Write Value (8)

Add 3 (6+3=9)

8

T4 End A Write Value (9)

9

Start B

What should the final Order Value be? Which Update has been lost?

T5 End B9

Page 119: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 119

Temporary Update Problem

Time

T0

Transaction A Transaction B

Value

Start A 6

T1Read Value (6) 6

T2 Add 2 (8) 6

T3 Write Value (8)

8

T4 Failure: Rollback!

8 Read Value (8)

Start B

T5 Write Value (6) Add 3 (8+3=11)

6

Write Value (11)

T6 End A 11

What should the final Order Value be? Where is the temporary update?

T5 End B11

Page 120: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 120

Incorrect Summary Problem

Time

T0

Transaction A

Transaction BValues

T1

Read 1st Value (6)

63

T2

Add 2 (6+2=8)63

T3

Write 1st Value (8)

83

T4

83

T5

Add 2 (3+2 = 5)83

Write 2nd Value (5)

85

Read 2nd Value (3)

Read 1st Value (8)

Read 2nd Value (3)

Total Sum = 11

What should the total Order Value be? Which order was accumulated before update, and which after?

Page 121: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 121

3.1 Database State and Changes

D1, D2 - Logically consistent states of the database data

T - Transaction for changing the databaset1, t2 - Absolute time before and after the transaction

State D1 State D2

T

t1 t2

Page 122: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 122

active partially

committed committed

aborted terminated

BEGIN

READ , WRITE

END

ROLLBACKROLLBACK

COMMIT

3.2 Transaction State and Progress

A transaction reaches its commit point when all operations accessing the database are completed and the result has been recorded in the log. It then writes a [commit, <transaction-id>] and terminates.

When a system failure occurs, search the log file for entries[start, <transaction-id>]

and if there are no logged entries [commit, <transaction-id>]then undo all operations that have logged entries

[write, <transaction-id>, X, old_value, new_value]

Page 123: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 123

Schedules

T1T1 T2T2R(A)R(A)W(A)W(A)

R(B)R(B)W(B)W(B)

R(C)R(C)W(C)W(C)

• Schedule: Actions of transactions as seen by the DBMS

Page 124: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 124

Serializable Schedule

A schedule whose effect on the DB “state”

is the same as that of some serial

schedule

All serial schedules are serializable

But the reverse may not be true

Page 125: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 125

Serializability Violations

T1T1 T2T2R(A)R(A)W(A)W(A)

R(A)R(A)W(A)W(A)R(B)R(B)W(B)W(B)

commitcommitR(B)R(B)W(B)W(B)

commitcommit

Database is Database is inconsistent!inconsistent!

Transfer Transfer Rs.10,000 Rs.10,000 from A to Bfrom A to B

Add 6% Add 6% interest to interest to A & BA & B

Page 126: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 126

Cascading Aborts

T1T1 T2T2

R(A)R(A)

W(A)W(A)

R(A)R(A)

W(A)W(A)

abortabort

Page 127: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 127

Recoverable Schedules

T1T1 T2T2

R(A)R(A)

W(A)W(A)

R(A)R(A)

W(A)W(A)

commitcommit

abortabort

T1T1 T2T2

R(A)R(A)

W(A)W(A)

R(A)R(A)

W(A)W(A)

commitcommit

commitcommit

Unrecoverable Schedule Recoverable Schedule

Page 128: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 128

Locking

The concept of locking data items is one of the main techniques for controlling the concurrent execution of transactions.

A lock is a variable associated with a data item in the database. Generally there is a lock for each data item in the

database. A lock describes the status of the data item with respect to

possible operations that can be applied to that item used for synchronising the access by concurrent

transactions to the database items. A transaction locks an object before using it When an object is locked by another transaction, the

requesting transaction must wait

Page 129: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 129

Locking Granularity

A database item which can be locked could be a database record a field value of a database record the whole database

Trade-offs coarse granularity

the larger the data item size, the lower the degree of concurrency

fine granularity the smaller the data item size, the more locks to be

managed and stored, and the more lock/unlock operations needed.

Page 130: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 130

Locking: A Technique for Concurrency Control

---- SS XX

---- SS XX

Compatibility matrix for lock types X and S

S: Shared lockX: Exclusive lock-- No lock

•Locks are automatically obtained by DBMS.•Guarantees serializability!

Page 131: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 131

Two- Phase Locking (2PL)

Strict 2PL:– If T wants to read an object, first obtains an S lock.– If T wants to modify an object, first obtains X lock.– Hold all locks until end of transaction.– Guarantees serializability, and recoverable schedule, too!

also avoids WW problems!2PL:– Slight variant of strict 2PL– transactions can release locks before the end (commit or abort)

But after releasing any lock it can acquire no new locks– Guarantees serializability

Page 132: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 132

Handling a Lock Request

Lock Request (XID, OID, Mode)Lock Request (XID, OID, Mode)

Currently Locked?Currently Locked? Empty Wait Queue?Empty Wait Queue?

Currently X-locked?Currently X-locked?

Put on QueuePut on Queue

Grant LockGrant Lock

Mode==X Mode==S

No

No

No

Yes

Yes

Yes

Page 133: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 133

Page 134: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 134

Recovery

Occurs in case of transaction failures.

Database (DB) is restored to the most recent consistent state just before the time of failure.

To do this, the DB system needs information about changes applied by various transactions. It is the system log.

Page 135: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 135

Recovery: Motivation

T1T1

T2T2

T3T3

T4T4

T5T5

crashcrash

•Atomicity: Undoing actions of transaction that do not commit•Durability: Making sure all actions of committed transactions survive system crashes•The Recovery Manager guarantees Atomicity & Durability.

Page 136: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 136

Recovery Outline

Restore to most recent “consistent” state just before time of failure Use data in the log file

Catastrophic Failure Restore database from backup Replay transactions from log file

Database becomes inconsistent (non-catastrophic errors) Undo or Redo last transactions until consistent state

is restored

Page 137: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 137

Logging

Record REDO and UNDO information, for

every update, in a log.

– Sequential writes to log (put it on a separate

disk).

– Minimal info (diff) written to log, so multiple

updates fit in a single log page.

Page 138: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 138

Handling the Buffer Pool

DesiredDesired

TrivialTrivial

• When is buffer written back to disk?• Steal/No-steal

Can it be written before commit? (steal)Or does it have to wait till after commit? (no-steal)

• Force/No-forceIs it written “immediately” after commit? (force)Or can it remain in memory? (no-force)

NoStealNoSteal StealSteal

NoForceNoForce

ForceForce

Page 139: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 139

Write- Ahead Logging (WAL)

The Write- Ahead Logging Protocol: Must force the log record for an update before

the corresponding data page gets to disk.

Must write all log records for a transaction before commit .

What goes into log: BFIM needed for UNDO type algorithms

AFIM needed for REDO type algorithms

Page 140: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 140

Checkpoints in the System Log

Checkpoint record written in log when all updated DB buffers written out to disk

Any committed transaction occurring before checkpoint in log can be considered permanent (won’t have to be redone after crash)

Actions suspend execution of all transactions force-write all modified buffers to disk write checkpoint entry in log and force write log resume transactions

Fuzzy checkpointing resume transactions as soon as buffers written

Page 141: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 141

Page 142: DBMS Participant Guide for Campus

Talent Transformation Basic RDBMS ver 2.0 Page 142