ece 569 database system engineeringfall 2004 ece 569 database system engineering fall 2004 yanyong...

26
ECE 569 Database System Engineering Fall 2004 ECE 569 Database System Engineering Fall 2004 Yanyong Zhang: www.ece.rutgers.edu/~ yyzhang Course URL: www.ece.rutgers.edu/~yyzhang/fall04

Post on 21-Dec-2015

216 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: ECE 569 Database System EngineeringFall 2004 ECE 569 Database System Engineering Fall 2004 Yanyong Zhang: yyzhangyyzhang

ECE 569 Database System Engineering Fall 2004

ECE 569 Database System Engineering

Fall 2004

Yanyong Zhang: www.ece.rutgers.edu/~yyzhang

Course URL: www.ece.rutgers.edu/~yyzhang/fall04

Page 2: ECE 569 Database System EngineeringFall 2004 ECE 569 Database System Engineering Fall 2004 Yanyong Zhang: yyzhangyyzhang

ECE 569 Database System Engineering Fall 2004

About the instructor (Yanyong Zhang)

Yanyong Office: Core 518 Office hours: TBD (Th 1-2:50??) Office number: 5-0608 Email: [email protected] URL: www.ece.rutgers.edu/~yyzhang Research interests:

● distributed computing● operating systems● sensor networks

Page 3: ECE 569 Database System EngineeringFall 2004 ECE 569 Database System Engineering Fall 2004 Yanyong Zhang: yyzhangyyzhang

ECE 569 Database System Engineering Fall 2004

Something about the background

What is database? a very large, integrated collection of data

Query

Transaction A group of queries which possess the ACID (atomic,

consistent, isolated, and durable) property

DBMS (DataBase Management System) a software package designed to store and manage

databases

Page 4: ECE 569 Database System EngineeringFall 2004 ECE 569 Database System Engineering Fall 2004 Yanyong Zhang: yyzhangyyzhang

ECE 569 Database System Engineering Fall 2004

Overview

User programs

Application programs / Queries

Software to process queries

Software to access stored data

Stored databasedefinition

Stored database

DBMS

DatabaseSystem

Page 5: ECE 569 Database System EngineeringFall 2004 ECE 569 Database System Engineering Fall 2004 Yanyong Zhang: yyzhangyyzhang

ECE 569 Database System Engineering Fall 2004

DBMS Overview

A database management system (DBMS) provides efficient access to large amounts of persistent data

Data models and query languages allow efficient access while hiding complexity from users

Efficient shared access requires concurrency. Transactions provide transparency to this concurrency. Application programs are easier to write.

In many cases the data is valuable. It must be protected from the effects of failure (resiliency) and sabotage (security).

Page 6: ECE 569 Database System EngineeringFall 2004 ECE 569 Database System Engineering Fall 2004 Yanyong Zhang: yyzhangyyzhang

ECE 569 Database System Engineering Fall 2004

Files vs. DBMS

Applications must stage large datasets between main memory and secondary storage (e.g., buffering, page-oriented access, 32-bit addressing)

Special code for different queries

Must protect data from inconsistency due to multiple concurrent users

Crash recovery

Security and access control

Page 7: ECE 569 Database System EngineeringFall 2004 ECE 569 Database System Engineering Fall 2004 Yanyong Zhang: yyzhangyyzhang

ECE 569 Database System Engineering Fall 2004

Why DBMS?

Data independence and efficient data access

Reduced application development time

Data integrity and security

Uniform data administration

Concurrent accesses, recovery from crashes

Page 8: ECE 569 Database System EngineeringFall 2004 ECE 569 Database System Engineering Fall 2004 Yanyong Zhang: yyzhangyyzhang

ECE 569 Database System Engineering Fall 2004

Data Models

A data model is a collection of concepts for describing data

A schema is a description of a particular collection of data, using a given data model.

The relational model of data is the most widely used model today

Main concept: relation, basically a table with rows and columns

Every relation has a schema, which describes the columns, or fields

Page 9: ECE 569 Database System EngineeringFall 2004 ECE 569 Database System Engineering Fall 2004 Yanyong Zhang: yyzhangyyzhang

ECE 569 Database System Engineering Fall 2004

Levels of Abstractions

Abstraction is used to hide complexity and allow for a separation of concerns (What vs. How).

Many views, single conceptual (logical) schema, and single physical schema

Views describe how users see the data

Conceptual schema defines logical structure

Physical schema describes the files and indexes used.

Physical Schema

Conceptual Schema

View 1 View 2 View 3 Specialized view of enterprise

Complete model of enterprise

Records, pointers, indices

Subschema definition language

Data definition language

Page 10: ECE 569 Database System EngineeringFall 2004 ECE 569 Database System Engineering Fall 2004 Yanyong Zhang: yyzhangyyzhang

ECE 569 Database System Engineering Fall 2004

Example

Sample applications Admit_patient

Make_diagnosis

Record_vital_signs

In relational data model we can express schema with following tables:

patient (name, address, balance_due, room#)

payments (name, amount, date)

vital_signs (name, pulse, bp, time)

diagnosis (patient_name, disease_name)

disease (disease_name, treatment)

Page 11: ECE 569 Database System EngineeringFall 2004 ECE 569 Database System Engineering Fall 2004 Yanyong Zhang: yyzhangyyzhang

ECE 569 Database System Engineering Fall 2004

Examples

Physical Level Specify indices, e.g.,

CREATE INDEX room_index ON patient(room#);

Specify performance related characteristics of relations

Conceptual Level Define tables, specifying data types for each attribute.

CR CREATE TABLE patient (

name char(30),

address char(100),

balance_due number(6,2),

room# integer,

PRIMARY KEY (name));

Page 12: ECE 569 Database System EngineeringFall 2004 ECE 569 Database System Engineering Fall 2004 Yanyong Zhang: yyzhangyyzhang

ECE 569 Database System Engineering Fall 2004

Examples – cont’d

External Level Define views for various purposes, e.g.,

CREATE VIEW doctor-view-diagnosis AS

SELECT name, room#, disease_name,treatment

FROM patient, diagnosis, diseases

WHERE name = patient_name AND

diagnosis.disease_name = disease.disease_name;

Page 13: ECE 569 Database System EngineeringFall 2004 ECE 569 Database System Engineering Fall 2004 Yanyong Zhang: yyzhangyyzhang

ECE 569 Database System Engineering Fall 2004

Data Independence

Applications insulated from how data is structured and stored

Logical data independence: protection from change in logical structure of data

Physical data independence: protection from changes in physical structure of data

Page 14: ECE 569 Database System EngineeringFall 2004 ECE 569 Database System Engineering Fall 2004 Yanyong Zhang: yyzhangyyzhang

ECE 569 Database System Engineering Fall 2004

Concurrency Control

Concurrent execution of user programs is essential for good DBMS performance

Why??

Interleaving actions of different user programs can lead to inconsistency: e.g., check is cleared while account balance is being computed

DBMS ensures such problems don’t arise: users can pretend they are using a single-user system

Page 15: ECE 569 Database System EngineeringFall 2004 ECE 569 Database System Engineering Fall 2004 Yanyong Zhang: yyzhangyyzhang

ECE 569 Database System Engineering Fall 2004

Transaction: An execution of a DB program

Key concept is transaction, which is an atomic sequence of database actions

Each transaction, executed completely, must leave the DB in a consistent state if DB is consistent when the transaction begins.

Users can specify some simple integrity constraints on the data, and DBMS will enforce them

DBMS doesn’t understand the semantics of the data

Ensuring that a transaction (run alone) preserves consistency is ultimately the user’s responsibility.

Page 16: ECE 569 Database System EngineeringFall 2004 ECE 569 Database System Engineering Fall 2004 Yanyong Zhang: yyzhangyyzhang

ECE 569 Database System Engineering Fall 2004

Scheduling concurrent transactions

DBMS ensures that execution of {T1, T2, …, Tn} is equivalent to some serial execution T1’…Tn’.

locking scheme

Two-phase locking

Page 17: ECE 569 Database System EngineeringFall 2004 ECE 569 Database System Engineering Fall 2004 Yanyong Zhang: yyzhangyyzhang

ECE 569 Database System Engineering Fall 2004

Ensuring atomicity

DBMS ensures atomicity (all-or-nothing property) even if system crashes in the middle of a Xact.

Idea: keep a log (history) of all actions carried out by the DBMS while executing a set of Xacts

Page 18: ECE 569 Database System EngineeringFall 2004 ECE 569 Database System Engineering Fall 2004 Yanyong Zhang: yyzhangyyzhang

ECE 569 Database System Engineering Fall 2004

Structure of a DBMS

A typical DBMS has a layered architecture

The figure does not show the concurrency control and recovery component

This is one of several possible architectures; each system has its own variations.

Query optimization and execution

Relational operators

Files and access methods

Buffer management

Disk space management

Page 19: ECE 569 Database System EngineeringFall 2004 ECE 569 Database System Engineering Fall 2004 Yanyong Zhang: yyzhangyyzhang

ECE 569 Database System Engineering Fall 2004

About the course

What will we focus on? Relational data model

Transaction processing

DBMS design

What will we not focus on? OO data model, etc

SQL programming

Goal Understand DBMS design issues

Develop background for research in database area

Page 20: ECE 569 Database System EngineeringFall 2004 ECE 569 Database System Engineering Fall 2004 Yanyong Zhang: yyzhangyyzhang

ECE 569 Database System Engineering Fall 2004

What should you’ve know

Data structure and algorithms

Operations system knowledge

C, Unix

Background in data model and query languages recommended

Page 21: ECE 569 Database System EngineeringFall 2004 ECE 569 Database System Engineering Fall 2004 Yanyong Zhang: yyzhangyyzhang

ECE 569 Database System Engineering Fall 2004

What will you encounter - topics

1. Relational Data Model (2-4)

2. DBMS Design / Implementation (5-11) a) File organization (5-6)

b) Access methods (7-9)

c) Query processing (10-11)

3. Transaction Processing a) Transaction Models (12-13)

b) Isolation (14-20)

c) Performance (21-22)

d) B-tree Synchronization (23-24)

e) Recovery (25-29)

Page 22: ECE 569 Database System EngineeringFall 2004 ECE 569 Database System Engineering Fall 2004 Yanyong Zhang: yyzhangyyzhang

ECE 569 Database System Engineering Fall 2004

What will you encounter - projects

Projects Develop a client/server relational DBMS

- Query processing / Physical data model / Data dictionary

- Concurrency control / Recovery

Work in groups of at most 4.

- You may choose groups but I must approve.

- At least three members of each group should be strong C programmers.

Projects are difficult and time-consuming.

- ~10K lines of codes

- Use threads and RPC

- Code is difficult to debug

Projects are interesting and rewarding.

Page 23: ECE 569 Database System EngineeringFall 2004 ECE 569 Database System Engineering Fall 2004 Yanyong Zhang: yyzhangyyzhang

ECE 569 Database System Engineering Fall 2004

Grading Policy

3 Homework assignments (15%)

Project (45%)

Two exams (20% each)

Course URL: www.ece.rutgers.edu/~yyzhang/fall04

Page 24: ECE 569 Database System EngineeringFall 2004 ECE 569 Database System Engineering Fall 2004 Yanyong Zhang: yyzhangyyzhang

ECE 569 Database System Engineering Fall 2004

Database Literature

Journals IEEE Transaction on Knowledge and Data Engineering

ACM Transactions on Database Systems

VLDB Journal

Conferences IEEE Data Engineering Conference

ACM SIGMDO

Very Large Database (VLDB)

Page 25: ECE 569 Database System EngineeringFall 2004 ECE 569 Database System Engineering Fall 2004 Yanyong Zhang: yyzhangyyzhang

ECE 569 Database System Engineering Fall 2004

Example – medical database Entities in database, the types and names of their

attributes, and relationships between entities.

Billed

Account Made To

Room#

Name

Address

Balance

Patient

Payment Amount

Date

Diagnosed

Disease NameTreatment

FromVital Sign

Pulse

Blood

Time

Page 26: ECE 569 Database System EngineeringFall 2004 ECE 569 Database System Engineering Fall 2004 Yanyong Zhang: yyzhangyyzhang

ECE 569 Database System Engineering Fall 2004

System Architecture

DDL CompilerQuery

Compiler

Runtime DB Processor

Execute

File Manager

Data Dictionary

Stored DB

Precompiler

DML Compiler

DML

Compiled Transaction

Host Compiler

Execute

Concurrency control and recovery

DDL Statements

Interactive Queries

Application Program

Parametric Users

DDL: Data Definition Language DML: Data Manipulation Language