databasesystemer e2002lene pries-hejemanaging organizational memory databasesystemer managing...

47
Lene Pries-Heje Managing Organizational memory Databasesystemer E2002 Databasesystemer Managing Organizational Memory

Post on 22-Dec-2015

225 views

Category:

Documents


0 download

TRANSCRIPT

Lene Pries-Heje Managing Organizational memory

Databasesystemer E2002

Databasesystemer

Managing Organizational Memory

Lene Pries-Heje Managing Organizational memory

Databasesystemer E2002

Learning objectives

• Kunne forklare begreberne transaktioner, concurrency control, deadlock and recovery.

Lene Pries-Heje Managing Organizational memory

Databasesystemer E2002

DBMS environments

• Distinctions– Single user versus multi-user– Multiprogramming– Multiprocessing

• The more complex the environment the more effort that must be expended on maintaining integrity

• The desired outcome is a consistent database for which all integrity constraints hold

Lene Pries-Heje Managing Organizational memory

Databasesystemer E2002

Management of organizational memories

Update

Query

Create

Ensuringconfidentiality

Maintainingquality

Protectingexistence

MaintainingIntegrity

MakingAvailable

Goals

Lene Pries-Heje Managing Organizational memory

Databasesystemer E2002

Strategies for data integrity• Protecting existence

– Preventative (isolation)– Remedial (database backup and recovery)

• Maintaining quality– Update authorization– Integrity constraints– Data validation– Concurrent update control

• Ensuring confidentiality– Data access control– Encryption

Lene Pries-Heje Managing Organizational memory

Databasesystemer E2002

Strategies for data integrity

• Legal– Privacy laws

• Administrative– Storing database backups in a locked vault

• Technical– Using the DBMS to enforce referential integrity

constraint

Type of constraint

Explanation Example

TYPE Validating a data item value against a specified data type.

Supplier number is numeric.

SIZE Defining and validating the minimum and maximum size of a data item.

Delivery number must be at least 3 digits and at most 5.

VALUES Providing a list of acceptable values for a data item.

Item colors must match the list provided.

RANGE Providing one or more ranges within which the data item must fall or must NOT fall.

Employee numbers must be in the range 1-100.

PATTERN Providing a pattern of allowable characters which define permissible formats for data values.

Department phone number must be of the form 542-nnnn (stands for exactly four decimal digits).

PROCEDURE

Providing a procedure to be invoked to validate data items.

A delivery must have valid itemname, department, and supplier values before it can be added to the database. (Tables are checked for valid entries.)

CONDITIONAL

Providing one or more conditions to apply against data values.

If item type is ‘Y’, then color is null.

NOT NULL(MANDATORY)

Indicating whether the data item value is mandatory (not null) or optional. The not null option is required for primary keys.

Employee number is mandatory.

UNIQUE Indicating whether stored values for this data item must be unique (unique compared to other values of the item within the same table or record type). The unique option is also required for identifiers.

Supplier number is unique.

Lene Pries-Heje Managing Organizational memory

Databasesystemer E2002

The concept of a transaction

• A user/user program access/updates data using a high-level query and update language supported by the DBMS.

• Execution of access/update requests (transactions) can be seen as a series of read and writes of database objects.

• You could also consider a transaction a unit of work.

Lene Pries-Heje Managing Organizational memory

Databasesystemer E2002

Four important properties • Atomic: either all actions are carried out or none

are.

• Consistency: A transaction run by itself with no concurrent execution of other transactions must preserve the consistency of the database.

• Isolation: Transactions are isolated, or protected, form the effects of other concurrent transactions.

• Durability: Once the user is informed that a transaction has been successfully completed, its effects should persist even if the system crashes.

Lene Pries-Heje Managing Organizational memory

Databasesystemer E2002

Concurrent update

• The lost data problemTime Action Database record

Part# QuantityP10 40

T1 User A receives paperworkfor a delivery of 80 units of P10

T2 User A reads P10 P10 40

T3 User B sells 20 units of P10

T4 User B reads P10 P10 40

T5 User A processes the delivery(40 + 80 = 120)

T6 User A updates the file P10 120

T7 User B processes the sales(40 - 20 = 20)

T8 User B updates the file P10 20

Lene Pries-Heje Managing Organizational memory

Databasesystemer E2002

Concurrent update

• Avoiding the lost data problemTime Action Database record

Part# QuantityP10 40

T1 User A receives paperworkfor a delivery of 80 units of P10

T2 User A reads P10 P10 40

T3 User B sells 20 units of P10

T4 User B attempts to read P10 denied P10 40

T5 User A processes the delivery(40 + 80 = 120)

T6 User A updates the file P10 120

T7 User B reads P10 P10 120

T8 User B processes the sales(120 - 20 = 100)

T9 User B updates the file P10 100

Lene Pries-Heje Managing Organizational memory

Databasesystemer E2002

Interleaved Execution of Transactions

• The DBMS interleaves the actions of different transactions to improve performance – system throughput and response time.

• Not all interleaving should be allowed.• Two actions on the same data object

conflict if at least one of the actions i a write.

Lene Pries-Heje Managing Organizational memory

Databasesystemer E2002

Serializability

• Executing transactions serially in different orders may produce different results.

• All results are presumed to be acceptable (consistent database state).

• If interleaved execution of transactions are used the DBMS makes no guarantees about which result will be the outcome.

• A serializable schedule is a schedule whose effect is identical to a serial schedule.

Lene Pries-Heje Managing Organizational memory

Databasesystemer E2002

Simple notation

T1 T2

R(A)W(A)

R(B)W(B)

R(C)W(C)

Transaction

Reading A

Commit

Writing A

Abort

Lene Pries-Heje Managing Organizational memory

Databasesystemer E2002

Reading uncommitted Data

T1 T2R(A)W(A)

R(A)W(A) R(B)W(B)Commit

R(B)W(B)Commit

T1 transfers 1000 kr. from A to B.

T2 increments both A and B by 10% interest.

Lene Pries-Heje Managing Organizational memory

Databasesystemer E2002

Unrepeatable Reads

T1 T2

R(A)

R(B)

W(A)

W(A)

T1 wants to increment A by 1.

T2 wants to decrement A by 1.

Lene Pries-Heje Managing Organizational memory

Databasesystemer E2002

Overwriting Uncommitted Data

T1 T2

W(A)

W(B)

The value of A and B must be the same.

T1 write 20.000 to A.

T2 write 30.000 to B.”A blind write”

Lene Pries-Heje Managing Organizational memory

Databasesystemer E2002

Aborted Transactions

• A recoverable schedule is a schedule in which transactions commit only after all transactions whose changes they read have committed.

• A schedule avoids cascading aborts if it only reads data written by already committed transactions.

• A strict schedule only reads or writes data written by already committed transactions.

Lene Pries-Heje Managing Organizational memory

Databasesystemer E2002

Unrecoverable Schedule

T1 T2R(A)W(A)

R(A)W(A) R(B)W(B)Commit

Abort

T1 transfers 1000 kr. from A to B.

T2 increments both A and B by 10% interest.

Lene Pries-Heje Managing Organizational memory

Databasesystemer E2002

Lock-based Concurrency Control• A DBMS must ensure that only serializable, recoverable

schedules are allowed, and that no result of committed transactions are lost undoing aborted transactions.

• A locking protocol is used to achieve this. It is a set of rules to be followed by transactions, to ensure that the net effect of interleaved transactions is identical to a serial order of the transactions.

• A lock manager is part of the DBMS and maintains a lock table with lock table entries, and a transaction table with pointers to a list of locks held by the transactions.

Lene Pries-Heje Managing Organizational memory

Databasesystemer E2002

Strict Two-Phase Locking

1. If a transaction wants to read/modify an object, it first requests a shared/exclusive lock on the object.

2. All locks held by a transaction are released when the transaction are completed.

Lene Pries-Heje Managing Organizational memory

Databasesystemer E2002

Serial/Interleaved Execution

T1 T2X(A)R(A)W(A)

X(A)R(A)W(A) X(B)R(B)W(B)Commit

Commit

T1 T2S(A)R(A)W(A)

S(A)R(A)X(B)R(B)W(B)Commit

Commit

X(C)R(C)W(C)

Lene Pries-Heje Managing Organizational memory

Databasesystemer E2002

Concurrent update• The deadly embrace

– User A’s update transaction locks record 1

– User B’s update transaction locks record 2

– User A attempts to read record 2 for update

– User B attempts to read record 1 for updateUpdate transaction

(User A)

Update transaction(User B)

Record 1

Record 2

Lock record 11

Lock record 22

Attempt to lock record 13

4 Attempt to lock record 2

Lene Pries-Heje Managing Organizational memory

Databasesystemer E2002

Deadlocks

• Deadlock Prevention– Requires users to lock in advance all records

they will require.

• Deadlock Detection– DBMS must periodically check for deadlocks. – A structure called a wait-for graph is used.

Lene Pries-Heje Managing Organizational memory

Databasesystemer E2002

Transaction failure and recovery

• Program error

• Action by the transaction manager

• Self-abort

• System failure

Lene Pries-Heje Managing Organizational memory

Databasesystemer E2002

Recovery strategies

• Switch to a duplicate database– RAID technology approach

• Backup recovery or rollback– Return to prior state by applying before-images

• Forward recovery or rollforward– Recreate by applying after-images to prior backup

• Reprocess transactions

Lene Pries-Heje Managing Organizational memory

Databasesystemer E2002

Crash Recovery• Analysis phase

– Determines: the point in the log to start the redo, pages in the buffer pool at the time of the crash and transactions active at the time of the crash.

• Redo phase– Repeats history by transforming the database into its state before

the crash.• Undo phase

– Undo transactions that are aborted since they were active at the time of the crash.

• Checkpoints– To be able to recover without reading the complete log,

periodically snapshots of the database is made.

Lene Pries-Heje Managing Organizational memory

Databasesystemer E2002

Potential backup procedures

Getrecord

Retrievedrecord

Processrecord

Updatedrecord

CPU Log updatetransaction

Log before imageof record

Log after imageof record

Outputmessage

Updatetransaction

8

9

4

5

2

1

7

Writeupdatedrecord

Obtainrecord

6

3

Periodicdatabasebackup

Lene Pries-Heje Managing Organizational memory

Databasesystemer E2002

Integrity constraints

Example Explanation

CREATE TABLE STOCK

(STKCODE CHAR(3) NOT NULL,

…,

NATCODE CHAR(3),

PRIMARY KEY(STKCODE),

FOREIGN KEY FKNATION(NATCODE)

REFERENCES NATION

ON DELETE RESRICT)

STKCODE must always be assigned a value of 3 or less alphanumeric characters. STKCODE must be unique because it is a primary key.NATCODE must be assigned a value of 3 or less alphanumeric characters. NATCODE must exist as the primary key of the table NATION.Do not allow the deletion of a row in NATION while there still exist rows in STOCK containing the corresponding value of NATCODE.

Lene Pries-Heje Managing Organizational memory

Databasesystemer E2002

A general model of data security

Identificationchecked

Authorizationchecked

Dataretrieved

Encryptionprocessing Database

User profilesand

authorizationtables

User

Userid

DBMS access denied

Identification data

User privilegesdata

DBMS access approved

Retrieval request

Request denied

Results of request

Request approved

Lene Pries-Heje Managing Organizational memory

Databasesystemer E2002

Authenticating mechanisms

• Information remembered by the person– Name– Account number– Password

• Object possessed by the person– Badge– Plastic card– Key

• Personal characteristic– Fingerprint– Signature– Voiceprint– Hand size

Lene Pries-Heje Managing Organizational memory

Databasesystemer E2002

Authorization tables

• Indicate authority of each user or groupSubject/Client Action Object Constraint

Accounting department Insert Supplier record None

Purchase department clerk Insert Supplier record If quantity < 200

Purchase department supervisor

Insert Delivery record If quantity ≥ 200

Production department Read Delivery record None

Todd Modify Item record Type and color only

Order processing program Modify Sale record None

Brier Delete Supplier record None

Lene Pries-Heje Managing Organizational memory

Databasesystemer E2002

SQL authorization

• Grant– Giving privileges to users

• Revoke– Removing privileges

Lene Pries-Heje Managing Organizational memory

Databasesystemer E2002

Firewall

• A device placed between an organization’s network and the Internet

• Monitors and controls traffic between the Internet and Intranet

• Approaches– Restrict packets to those with designated IP

addresses– Restrict access to applications

Lene Pries-Heje Managing Organizational memory

Databasesystemer E2002

Encryption

• Encryption is as old as writing• Sensitive information needs to be remain

secure• Critical to electronic commerce• Encryption hides the meaning of a message• Decryption reveals the meaning of an

encrypted message

Lene Pries-Heje Managing Organizational memory

Databasesystemer E2002

Public key encryption

DecryptEncrypt

Receiver’spublic key

Receiver’sprivate key

Sender Receiver

Lene Pries-Heje Managing Organizational memory

Databasesystemer E2002

Signing

• Message authentication

VerifySign

Sender’sprivate key

Sender’spublic key

Sender Receiver

Lene Pries-Heje Managing Organizational memory

Databasesystemer E2002

Monitoring activity

• Audit trail analysis– Time and date stamp all transactions

• Monitor a sequence of queries– Tracker queries

Lene Pries-Heje Managing Organizational memory

Databasesystemer E2002

Data administration is the common term applied to the task of managing

organizational memory.

Data Administration

Lene Pries-Heje Managing Organizational memory

Databasesystemer E2002

Data management problems

• Redundancy

• Inconsistent representations

• Multiple definitions of data items

• Essential data missing

• Inaccurate or incomplete data

• Uncaptured data

• Data that cannot be located

Lene Pries-Heje Managing Organizational memory

Databasesystemer E2002

Goals of data management

• Enable users to access the data they need in the most suitable format

• Maintain data integrity

Lene Pries-Heje Managing Organizational memory

Databasesystemer E2002

Management of the database environment

Data Administration(project level support)

Data Administration(system level support)

DataDictionary/DirectorySystem(DD/DS)

DatabaseManagement

System(DBMS)

User-Systeminterfaces

Multipledatabases

ExternalDBMS

Database development

cycle

Feasibility analysisDevelop a project implementation planDevelop data standardsEstablish data stewards

Project planning

1

Identify data requirements

Requirementsdefinition

2

Develop a data modelSpecify data integrity controlsSpecify test proceduresDesign

3

Map the data model to the DBMSEstablish and test data integrity controlsAccess control and securityIntegrity constraints and data validation rulesBackup and recovery proceduresCreate and load the test databaseTest database operation and integrity controls

Testing

4

Implement data integrity controlsCreate and load databaseTrain usersImplementation

5

Monitor database performanceTune and reorganize database as neededEnforce data standards and policiesSupport users: consulting, informing, and training

Use

6

Plan and implement growthImplement change control procedures

Evolution

7

Lene Pries-Heje Managing Organizational memory

Databasesystemer E2002

Development roles

Database Development Phase

Database Developer

Data Administrator

User

Project planning Does Consults Provides information

Requirements definition

Does Consults Provides requirements

Database design Does Consults

Data integrity

Validates data models

Database testing System and user testing

Consults

Data integrity

Does user testing

Database implementation

System related activities

Consults

Data integrity

Does user activities

Database usage Consults Data integrity monitoring

Uses

Database evolution Does Change control Provides additional requirements

Lene Pries-Heje Managing Organizational memory

Databasesystemer E2002

Data administration tools

Database development phase

Data Dictionary (DD) Database Management System (DBMS)

Performance monitoring

Case tools

1. Project planning Document Data mapDesign aid

Estimation tools

2. Requirements definition

Document Design aid DocumentDesign aid

3. Database design DocumentDesign aidData mapSchema generator

DocumentDesign aidData map

4.Database testing Data mapDesign aidSchema generator

Define, create, test, data integrity

Impact analysis Test data generatorDesign aid

5.Database implementation

DocumentChange control

Data integrityImplement Design

MonitorTune

6. Database use Document Data mapSchema generatorChange control

Provide tools for retrieval and updateEnforce integrity controls and procedures

Monitor Tune

7. Database evolution DocumentData mapChange control

Redefine Impact analysis

Lene Pries-Heje Managing Organizational memory

Databasesystemer E2002

Use of the data dictionary

• Documentation support

• Data maps

• Design aid

• Schema generation

• Change control

Lene Pries-Heje Managing Organizational memory

Databasesystemer E2002

Organizing the data administration function

• Creation of the function

• Selecting staff and assigning roles

• Locating the function