computing & information sciences kansas state university cis 560: database system concepts...

32
Computing & Information Sciences Kansas State University CIS 560: Database System Concepts Lecture 01 of 42 Wednesday, 27 August 2008 William H. Hsu Department of Computing and Information Sciences, KSU KSOL course page: http://snipurl.com/va60 Course web site: http://www.kddresearch.org/Courses/Fall-2008/CIS560 Instructor home page: http:// www.cis.ksu.edu/~bhsu Reading for Next Class: Sections 2.1 – 2.3, Silberschatz et al., 5 th edition Syllabus and Introductory Handouts Database Architecture: lient-Server Models, Relational DB Definitio Wedneday, 27 Aug 2008

Upload: clara-lee-lambert

Post on 05-Jan-2016

220 views

Category:

Documents


4 download

TRANSCRIPT

Page 1: Computing & Information Sciences Kansas State University CIS 560: Database System Concepts Lecture 01 of 42 Wednesday, 27 August 2008 William H. Hsu Department

Computing & Information SciencesKansas State University

CIS 560: Database System Concepts

Lecture 01 of 42

Wednesday, 27 August 2008

William H. Hsu

Department of Computing and Information Sciences, KSU

KSOL course page: http://snipurl.com/va60

Course web site: http://www.kddresearch.org/Courses/Fall-2008/CIS560

Instructor home page: http://www.cis.ksu.edu/~bhsu

Reading for Next Class:

Sections 2.1 – 2.3, Silberschatz et al., 5th edition

Syllabus and Introductory Handouts

Database Architecture:Client-Server Models, Relational DB Definitions

Wedneday, 27 Aug 2008

Page 2: Computing & Information Sciences Kansas State University CIS 560: Database System Concepts Lecture 01 of 42 Wednesday, 27 August 2008 William H. Hsu Department

Computing & Information SciencesKansas State University

Monday, 25 Aug 2008CIS 560: Database System Concepts

Lecture Outline

Reading for Next Class: Sections 2.1 – 2.3, Silberschatz et al. 5e

Today and Friday: Basic Relational DB Principles Relations

Database definitionsRecords

Fields

Tables

Relations

Next Week: Relational Algebra and SQL Intro Relational operators: PROJECT, SELECT, JOIN

Variations on relational joins

Implementation

Examples and Exercises Look at Chapter 2 examples

Exercises: relational algebra formulas

Page 3: Computing & Information Sciences Kansas State University CIS 560: Database System Concepts Lecture 01 of 42 Wednesday, 27 August 2008 William H. Hsu Department

Computing & Information SciencesKansas State University

CIS 560: Database System Concepts

Course Administration

Official Course Page (KSOL): http://snipurl.com/va60

Class Web Page: http://www.kddresearch.org/Courses/Fall-2008/CIS560

Instructional E-Mail Addresses [email protected] (always use this to reach instructor)

[email protected] (this goes to everyone)

Instructor: William Hsu, Nichols 213 Office phone: +1 785 532 7905; home phone: +1 785 539 7180

IM: AIM/YIM/MSN hsuwh & rizanabsith, ICQ 28651394 & 191317559

Office hours: after class Mon/Wed/Fri; Tue; other times by appointment

Graduate Teaching Assistant: TBD Office location: Nichols 124

Office hours: to be announced on class web board

Grading Policy Hour exams: 15% each (in-class, closed-book); final (open-book): 30%

Machine problems, problem sets (8 of 10): 16%; term project: 17%

Class participation: 7% (3% attendance, 2% questions, 2% answers)

Page 4: Computing & Information Sciences Kansas State University CIS 560: Database System Concepts Lecture 01 of 42 Wednesday, 27 August 2008 William H. Hsu Department

Computing & Information SciencesKansas State University

CIS 560: Database System Concepts

Review: Relational Databases,Architectures, Platforms

This Course: Relational DBs, Queries, Indexing, Concurrency Tools: (My)SQL & ORACLE, PHP & JSP Database Management Systems (DBMS)

Storage and retrieval of information Accessed using queries

Data Manipulation Languages (DMLs) Data Description Languages (DDLs) Client-Server Architecture Relational Databases

Based on relations (tuples of entities) Entities: objects organized into sets Relationships

Links between entitiesCorrespond to mathematical relations

Page 5: Computing & Information Sciences Kansas State University CIS 560: Database System Concepts Lecture 01 of 42 Wednesday, 27 August 2008 William H. Hsu Department

Computing & Information SciencesKansas State University

CIS 560: Database System Concepts

Review:Overall System Structure

Review:Overall System Structure

Page 6: Computing & Information Sciences Kansas State University CIS 560: Database System Concepts Lecture 01 of 42 Wednesday, 27 August 2008 William H. Hsu Department

Computing & Information SciencesKansas State University

CIS 560: Database System Concepts

Review:Relations Defined

Review:Relations Defined

• A relation on sets S1, S2, …, Sn is a subset of S1 S2 … Sn

• It consists of those tuples (s1, s2, …, sn) that are related

• Finite example: set of pairs (a, b) in {1, 2, 3) such that a > b• Infinite example: >, set of pairs (a, b) of nonnegative integers

aka natural numbers such that a > b

1

2

3

A

1

2

3

B

Page 7: Computing & Information Sciences Kansas State University CIS 560: Database System Concepts Lecture 01 of 42 Wednesday, 27 August 2008 William H. Hsu Department

Computing & Information SciencesKansas State University

CIS 560: Database System Concepts

Review:Functions Defined

Review:Functions Defined

R Zf

4.3 4

DomainCo-domain

(range)

Pre-image of 4 Image of 4.3

f maps R to Z

f(4.3)

Adapted from slides © 2005 A. Bloomfield, University of VirginiaCS202 Discrete Mathematicshttp://www.cs.virginia.edu/~asb/teaching/cs202-spring07/

A function is a relation that takes an element from a set and maps it to a unique element in another set

Page 8: Computing & Information Sciences Kansas State University CIS 560: Database System Concepts Lecture 01 of 42 Wednesday, 27 August 2008 William H. Hsu Department

Computing & Information SciencesKansas State University

CIS 560: Database System Concepts

Review:Function Example

Review:Function Example

1

2

3

4

5

“a”

“bb“

“cccc”

“dd”

“e”

A string length function

A

B

C

D

F

Alice

Bob

Chris

Dave

Emma

A class grade function

Domain Co-domainA pre-image

of 1

The imageof A

Adapted from slides © 2005 A. Bloomfield, University of VirginiaCS202 Discrete Mathematicshttp://www.cs.virginia.edu/~asb/teaching/cs202-spring07/

Page 9: Computing & Information Sciences Kansas State University CIS 560: Database System Concepts Lecture 01 of 42 Wednesday, 27 August 2008 William H. Hsu Department

Computing & Information SciencesKansas State University

Monday, 25 Aug 2008CIS 560: Database System Concepts

Review:Functions and Non-Function Relations

Review:Functions and Non-Function Relations

1

2

3

4

5

“a”

“bb“

“cccc”

“dd”

“e”

Not a valid function!Also not a valid function!

1

2

3

4

5

a

e

i

o

u

Some function…

Range

Adapted from slides © 2005 A. Bloomfield, University of VirginiaCS202 Discrete Mathematicshttp://www.cs.virginia.edu/~asb/teaching/cs202-spring07/

Page 10: Computing & Information Sciences Kansas State University CIS 560: Database System Concepts Lecture 01 of 42 Wednesday, 27 August 2008 William H. Hsu Department

Computing & Information SciencesKansas State University

Monday, 25 Aug 2008CIS 560: Database System Concepts

Review:One-to-One (Injective) Functions

Review:One-to-One (Injective) Functions

1

2

3

4

5

a

e

i

o

A one-to-one function

1

2

3

4

5

a

e

i

o

A function that is not one-to-one

• A function is one-to-one if each element in the co-domain (range) has a unique pre-image

• aka into function

• Every element y that is mapped into has an inverse f-1(y)… but this is not necessarily every element in the co-domain

Adapted from slides © 2005 A. Bloomfield, University of VirginiaCS202 Discrete Mathematicshttp://www.cs.virginia.edu/~asb/teaching/cs202-spring07/

Page 11: Computing & Information Sciences Kansas State University CIS 560: Database System Concepts Lecture 01 of 42 Wednesday, 27 August 2008 William H. Hsu Department

Computing & Information SciencesKansas State University

11

More on one-to-oneMore on one-to-one

Injective is synonymous with one-to-one “A function is injective”

A function is an injection if it is one-to-one

Note that there can be un-used elements in the co-domain

1

2

3

4

5

a

e

i

o

A one-to-one functionAdapted from slides © 2005 A. Bloomfield, University of VirginiaCS202 Discrete Mathematicshttp://www.cs.virginia.edu/~asb/teaching/cs202-spring07/

Page 12: Computing & Information Sciences Kansas State University CIS 560: Database System Concepts Lecture 01 of 42 Wednesday, 27 August 2008 William H. Hsu Department

Computing & Information SciencesKansas State University

Monday, 25 Aug 2008CIS 560: Database System Concepts

Review:Onto (Surjective) Functions

Review:Onto (Surjective) Functions

1

2

3

4

5

a

e

i

o

A function that is not

onto

1

2

3

4

a

e

i

o

u

An onto function

• A function is onto if each element in the co-domain (range) is an image of some pre-image

• Every element y with a unique pre-image has an inverse f-1(y)… but this is not necessarily every element in the co-domain

Adapted from slides © 2005 A. Bloomfield, University of VirginiaCS202 Discrete Mathematicshttp://www.cs.virginia.edu/~asb/teaching/cs202-spring07/

Page 13: Computing & Information Sciences Kansas State University CIS 560: Database System Concepts Lecture 01 of 42 Wednesday, 27 August 2008 William H. Hsu Department

Computing & Information SciencesKansas State University

13

1

2

3

4

a

e

i

o

u

An onto function

More on ontoMore on onto

Surjective is synonymous with onto “A function is surjective”

A function is an surjection if it is onto

Note that there can be multiply used elements in the co-domain

Adapted from slides © 2005 A. Bloomfield, University of VirginiaCS202 Discrete Mathematicshttp://www.cs.virginia.edu/~asb/teaching/cs202-spring07/

Page 14: Computing & Information Sciences Kansas State University CIS 560: Database System Concepts Lecture 01 of 42 Wednesday, 27 August 2008 William H. Hsu Department

Computing & Information SciencesKansas State University

14

Onto vs. one-to-oneOnto vs. one-to-one Are the following functions onto, one-to-one, both, or neither?

1

2

3

4

a

b

c

1

2

3

a

b

c

d

1

2

3

4

a

b

c

d

1

2

3

4

a

b

c

d

1

2

3

4

a

b

c

1-to-1, not onto

Onto, not 1-to-1

Both 1-to-1 and onto Not a valid function

Neither 1-to-1 nor onto

Adapted from slides © 2005 A. Bloomfield, University of VirginiaCS202 Discrete Mathematicshttp://www.cs.virginia.edu/~asb/teaching/cs202-spring07/

Page 15: Computing & Information Sciences Kansas State University CIS 560: Database System Concepts Lecture 01 of 42 Wednesday, 27 August 2008 William H. Hsu Department

Computing & Information SciencesKansas State University

Monday, 25 Aug 2008CIS 560: Database System Concepts

Review:BijectionsReview:

Bijections

Consider a function that is both one-to-one and onto

Such a function is a one-to-one correspondence, or a bijection

aka permutation aka invertible function

1

2

3

4

a

b

c

d

Adapted from slides © 2005 A. Bloomfield, University of VirginiaCS202 Discrete Mathematicshttp://www.cs.virginia.edu/~asb/teaching/cs202-spring07/

Page 16: Computing & Information Sciences Kansas State University CIS 560: Database System Concepts Lecture 01 of 42 Wednesday, 27 August 2008 William H. Hsu Department

Computing & Information SciencesKansas State University

16

Inverse functionsInverse functions

R Rf

4.3 8.6

Let f(x) = 2*x

f-1

f(4.3)

f-1(8.6)

Then f-1(x) = x/2

Adapted from slides © 2005 A. Bloomfield, University of VirginiaCS202 Discrete Mathematicshttp://www.cs.virginia.edu/~asb/teaching/cs202-spring07/

Page 17: Computing & Information Sciences Kansas State University CIS 560: Database System Concepts Lecture 01 of 42 Wednesday, 27 August 2008 William H. Hsu Department

Computing & Information SciencesKansas State University

17

More on inverse functionsMore on inverse functions

Can we define the inverse of the following functions?

An inverse function can ONLY be defined on a bijection

1

2

3

4

a

b

c

1

2

3

a

b

c

d

What is f-1(2)?Not onto!

What is f-1(2)?Not 1-to-1!

Adapted from slides © 2005 A. Bloomfield, University of VirginiaCS202 Discrete Mathematicshttp://www.cs.virginia.edu/~asb/teaching/cs202-spring07/

Page 18: Computing & Information Sciences Kansas State University CIS 560: Database System Concepts Lecture 01 of 42 Wednesday, 27 August 2008 William H. Hsu Department

Computing & Information SciencesKansas State University

Wednesday, 24 Jan 2008CIS 560: Database System Concepts

Query ProcessingQuery Processing

1. Parsing and translation

2. Optimization

3. Evaluation

Page 19: Computing & Information Sciences Kansas State University CIS 560: Database System Concepts Lecture 01 of 42 Wednesday, 27 August 2008 William H. Hsu Department

Computing & Information SciencesKansas State University

Wednesday, 24 Jan 2008CIS 560: Database System Concepts

Query Processing (Cont.)Query Processing (Cont.)

Alternative ways of evaluating a given query Equivalent expressions Different algorithms for each operation

Cost difference between a good and a bad way of evaluating a query can be enormous

Need to estimate the cost of operations Depends critically on statistical information about relations which the

database must maintain Need to estimate statistics for intermediate results to compute cost of

complex expressions

Page 20: Computing & Information Sciences Kansas State University CIS 560: Database System Concepts Lecture 01 of 42 Wednesday, 27 August 2008 William H. Hsu Department

Computing & Information SciencesKansas State University

Wednesday, 24 Jan 2008CIS 560: Database System Concepts

Transaction ManagementTransaction Management

A transaction is a collection of operations that performs a single logical function in a database application

Transaction-management component ensures that the database remains in a consistent (correct) state despite system failures (e.g., power failures and operating system crashes) and transaction failures.

Concurrency-control manager controls the interaction among the concurrent transactions, to ensure the consistency of the database.

Page 21: Computing & Information Sciences Kansas State University CIS 560: Database System Concepts Lecture 01 of 42 Wednesday, 27 August 2008 William H. Hsu Department

Computing & Information SciencesKansas State University

Wednesday, 24 Jan 2008CIS 560: Database System Concepts

Database ArchitectureDatabase Architecture

The architecture of a database systems is greatly influenced by

the underlying computer system on which the database is running: Centralized Client-server Parallel (multi-processor) Distributed

Page 22: Computing & Information Sciences Kansas State University CIS 560: Database System Concepts Lecture 01 of 42 Wednesday, 27 August 2008 William H. Hsu Department

Computing & Information SciencesKansas State University

Wednesday, 24 Jan 2008CIS 560: Database System Concepts

History of Database SystemsHistory of Database Systems

1950s and early 1960s: Data processing using magnetic tapes for storage

Tapes provide only sequential access

Punched cards for input

Late 1960s and 1970s: Hard disks allow direct access to data Network and hierarchical data models in widespread use Ted Codd defines the relational data model

Would win the ACM Turing Award for this work IBM Research begins System R prototypeUC Berkeley begins Ingres prototype

High-performance (for the era) transaction processing

Page 23: Computing & Information Sciences Kansas State University CIS 560: Database System Concepts Lecture 01 of 42 Wednesday, 27 August 2008 William H. Hsu Department

Computing & Information SciencesKansas State University

Wednesday, 24 Jan 2008CIS 560: Database System Concepts

History (cont.)History (cont.)

1980s: Research relational prototypes evolve into commercial systems

SQL becomes industrial standard

Parallel and distributed database systems Object-oriented database systems

1990s: Large decision support and data-mining applications Large multi-terabyte data warehouses Emergence of Web commerce

2000s: XML and XQuery standards Automated database administration

Page 24: Computing & Information Sciences Kansas State University CIS 560: Database System Concepts Lecture 01 of 42 Wednesday, 27 August 2008 William H. Hsu Department

Computing & Information SciencesKansas State University

Wednesday, 24 Jan 2008CIS 560: Database System Concepts

Chapter 2: Relational ModelChapter 2: Relational Model

Structure of Relational Databases Fundamental Relational-Algebra-Operations Additional Relational-Algebra-Operations Extended Relational-Algebra-Operations Null Values Modification of the Database

Page 25: Computing & Information Sciences Kansas State University CIS 560: Database System Concepts Lecture 01 of 42 Wednesday, 27 August 2008 William H. Hsu Department

Computing & Information SciencesKansas State University

Wednesday, 24 Jan 2008CIS 560: Database System Concepts

Example of a RelationExample of a Relation

Page 26: Computing & Information Sciences Kansas State University CIS 560: Database System Concepts Lecture 01 of 42 Wednesday, 27 August 2008 William H. Hsu Department

Computing & Information SciencesKansas State University

CIS 560: Database System Concepts

Basic StructureBasic Structure Formally, given sets D1, D2, …. Dn a relation r is a subset of

D1 x D2 x … x Dn

Thus, a relation is a set of n-tuples (a1, a2, …, an) where each ai Di

Example: If

customer_name = {Jones, Smith, Curry, Lindsay}

customer_street = {Main, North, Park}

customer_city = {Harrison, Rye, Pittsfield}

Then r = { (Jones, Main, Harrison),

(Smith, North, Rye),

(Curry, North, Rye),

(Lindsay, Park, Pittsfield) }

is a relation over customer_name x customer_street x customer_city

Page 27: Computing & Information Sciences Kansas State University CIS 560: Database System Concepts Lecture 01 of 42 Wednesday, 27 August 2008 William H. Hsu Department

Computing & Information SciencesKansas State University

CIS 560: Database System Concepts

Attribute TypesAttribute Types

Each attribute of a relation has a name The set of allowed values for each attribute is called the domain

of the attribute Attribute values are (normally) required to be atomic; that is,

indivisible Note: multivalued attribute values are not atomic Note: composite attribute values are not atomic

The special value null is a member of every domain The null value causes complications in the definition of many

operations We shall ignore the effect of null values in our main presentation

and consider their effect later

Page 28: Computing & Information Sciences Kansas State University CIS 560: Database System Concepts Lecture 01 of 42 Wednesday, 27 August 2008 William H. Hsu Department

Computing & Information SciencesKansas State University

Wednesday, 24 Jan 2008CIS 560: Database System Concepts

Relation InstanceRelation Instance

The current values (relation instance) of a relation are specified by a table

An element t of r is a tuple, represented by a row in a table

JonesSmithCurryLindsay

customer_name

MainNorthNorthPark

customer_street

HarrisonRyeRyePittsfield

customer_city

customer

attributes(or columns)

tuples(or rows)

Page 29: Computing & Information Sciences Kansas State University CIS 560: Database System Concepts Lecture 01 of 42 Wednesday, 27 August 2008 William H. Hsu Department

Computing & Information SciencesKansas State University

CIS 560: Database System Concepts

Relations are UnorderedRelations are Unordered

Order of tuples is irrelevant (tuples may be stored in an arbitrary order)

Example: account relation with unordered tuples

Page 30: Computing & Information Sciences Kansas State University CIS 560: Database System Concepts Lecture 01 of 42 Wednesday, 27 August 2008 William H. Hsu Department

Computing & Information SciencesKansas State University

CIS 560: Database System Concepts

DatabaseDatabase

A database consists of multiple relations

Information about an enterprise is broken up into parts, with each relation storing one part of the information

account : stores information about accounts depositor : stores information about which customer owns which account customer : stores information about customers

Storing all information as a single relation such as bank(account_number, balance, customer_name, ..)results in

repetition of information (e.g., two customers own an account)

the need for null values (e.g., represent a customer without an account)

Normalization theory (Chapter 7) deals with how to design relational schemas

Page 31: Computing & Information Sciences Kansas State University CIS 560: Database System Concepts Lecture 01 of 42 Wednesday, 27 August 2008 William H. Hsu Department

Computing & Information SciencesKansas State University

CIS 560: Database System Concepts

SummarySummary

RDB Overview Mathematical Relations

Based on sets (of entities) Relation R: subset of Cartesian product of sets Each member of R: one tuple R specifies “which combinations are related”

Relational Algebra Based on mathematical relations

Page 32: Computing & Information Sciences Kansas State University CIS 560: Database System Concepts Lecture 01 of 42 Wednesday, 27 August 2008 William H. Hsu Department

Computing & Information SciencesKansas State University

CIS 560: Database System Concepts

TerminologyTerminology

Database Management Systems (DBMS) Data Manipulation Languages (DMLs) Data Description Languages (DDLs) Client-Server Architecture Relational Databases

Entity Relationship

Relations Subsets of Cartesian product of two or more sets Functions

Functions One-to-one (into function, injection) Onto (surjection) One-to-one & onto (bijection, permutation, invertible function)