CS370 Spring 2007 CS 370 Database Systems Lecture 1 Overview of Database Systems

Upload: coleen-marsh

Post on 04-Jan-2016


Page 1

CS370 Spring 2007

CS 370 Database Systems

Lecture 1 Overview of Database Systems

Page 2

Questions in your mind…

• What is in this subject?

• Why are we studying this subject?

• What shall we get from this subject?

Page 3

Introduction to Databases

• Objectives:
  – What is Data and Information.
  – The characteristics of file-based systems.
  – The problems with the file-based approach.
  – The meaning of the term database.
  – The meaning of the term Database Management System (DBMS).
  – The typical functions of a DBMS.
  – The major components of the DBMS environment.
  – The problems involved in the DBMS environment.
  – The history of the development of DBMS.
  – The advantages and disadvantages of DBMS.

Page 4

Important Terms to Remember

• Database: organized collection of logically related data

• Data: stored representations of meaningful objects and events
  – Structured: numbers, text, dates
  – Unstructured: images, video, documents

• Information: data processed to increase knowledge in the person using the data

• Metadata: data that describes the properties and context of user data
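To make the data/metadata distinction concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `customer` table and its columns are hypothetical, chosen only for illustration. The rows in the table are user data; the column names and declared types that the DBMS reports about the table are metadata.

```python
import sqlite3


def describe(conn: sqlite3.Connection, table: str) -> list:
    """Return (column name, declared type) pairs: metadata about a table."""
    # PRAGMA table_info yields rows of (cid, name, type, notnull, dflt_value, pk).
    rows = conn.execute(f"PRAGMA table_info({table})")
    return [(name, col_type) for _cid, name, col_type, *_rest in rows]


def demo() -> list:
    conn = sqlite3.connect(":memory:")
    # Hypothetical table, for illustration only.
    conn.execute("CREATE TABLE customer (name TEXT, age INTEGER)")
    conn.execute("INSERT INTO customer VALUES ('Ali Raza', 30)")  # user data
    return describe(conn, "customer")
```

Calling `demo()` returns the description of the table, not the stored row.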

Page 5

Data Management Example

• Suppose
  – You are a video store owner.
  – Customers rent video tape copies of movies.
  – Several copies of each movie.

• Needs
  – Which tapes has a customer rented?
  – Are any tapes overdue?
  – When will a tape become available?

Page 6

Solution: File-based System

• Edit rented.txt file

• Advantages
  – Text editors are easy to use
  – Simple to insert a record
  – Simple to delete a record
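The file-based approach can be sketched in a few lines of Python. The slides do not give the layout of rented.txt, so the one-record-per-line format `customer|title|due_date` and both function names are assumptions made for illustration.

```python
from pathlib import Path


def insert_record(path: Path, customer: str, title: str, due: str) -> None:
    """Append one rental record as a line of text (assumed layout)."""
    with path.open("a", encoding="utf-8") as f:
        f.write(f"{customer}|{title}|{due}\n")


def delete_record(path: Path, customer: str, title: str) -> None:
    """Rewrite the whole file, leaving out the matching rental record."""
    lines = path.read_text(encoding="utf-8").splitlines()
    kept = [ln for ln in lines if ln.split("|")[:2] != [customer, title]]
    path.write_text("".join(ln + "\n" for ln in kept), encoding="utf-8")
```

Inserting and deleting are indeed simple here, but note that the program itself has to know the file layout, and deleting means rewriting the entire file.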

Page 7

Complications: Queries

• A plain text file does not address these needs.
• Query: What movies has Ali Raza rented?
• Execute (not quite right): Search for ‘Ali Raza’.
• Query: Are any tapes overdue? Execute: ???
• Requirements
  – Robust, sophisticated query language
  – Clear separation between data organization (schema) and data

DBMS Concepts: DML, SQL
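As a rough illustration of what a query language buys, the two queries from this slide can be written in SQL and run with Python's built-in sqlite3 module. The `rental` table, its columns, and the sample rows are assumptions; the slides do not define a schema. The schema (CREATE TABLE) is stated once, separately from the data, and each query says what is wanted rather than how to search for it.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a hypothetical rental table with a few sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE rental (customer TEXT, title TEXT, due_date TEXT)"
    )
    conn.executemany(
        "INSERT INTO rental VALUES (?, ?, ?)",
        [
            ("Ali Raza", "Casablanca", "2007-01-10"),
            ("Ali Raza", "Vertigo", "2007-02-20"),
            ("Sarah", "Psycho", "2007-01-05"),
        ],
    )
    conn.commit()
    return conn


def movies_rented_by(conn: sqlite3.Connection, customer: str) -> list:
    """Query: what movies has this customer rented?"""
    rows = conn.execute(
        "SELECT title FROM rental WHERE customer = ? ORDER BY title",
        (customer,),
    )
    return [title for (title,) in rows]


def overdue(conn: sqlite3.Connection, today: str) -> list:
    """Query: are any tapes overdue? ISO dates compare correctly as text."""
    rows = conn.execute(
        "SELECT customer, title FROM rental "
        "WHERE due_date < ? ORDER BY due_date",
        (today,),
    )
    return rows.fetchall()
```

Unlike a text search for ‘Ali Raza’, the first query matches only the customer column, and the second has no practical equivalent in a text editor.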

Page 8

Complications: Multiple users

• Two clerks edit rented.txt file at the same time.
  – Ahmed starts to edit rented.txt, reads it into memory.
  – Sarah starts to edit rented.txt.
  – Ahmed adds a record.
  – Ahmed saves rented.txt to disk.
  – Sarah saves rented.txt to disk.

Ahmed’s added record disappears!

• Requirements
  – Must support multiple readers and writers.
  – Updates to data must (appear to) occur in serial order.

DBMS Concepts: Locks, Concurrency control
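The lost update on this slide can be replayed deterministically in Python. `RentedFile` is a hypothetical in-memory stand-in for rented.txt, and Sarah's own added record is an assumption made so that her save is visible. The second demo shows one simple remedy, a lock that makes each read-modify-write step indivisible; a real DBMS uses more elaborate concurrency control.

```python
import threading


class RentedFile:
    """In-memory stand-in for rented.txt (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self.records = []
        self.lock = threading.Lock()

    def read(self) -> list:
        return list(self.records)      # each clerk edits a private copy

    def save(self, records: list) -> None:
        self.records = list(records)   # saving overwrites the whole file

    def add_record_safely(self, record: str) -> None:
        # Holding the lock makes read-modify-write one indivisible step.
        with self.lock:
            copy = self.read()
            copy.append(record)
            self.save(copy)


def lost_update_demo() -> list:
    f = RentedFile()
    ahmed = f.read()                   # Ahmed reads the file into memory
    sarah = f.read()                   # Sarah starts editing the old contents
    ahmed.append("Ahmed's record")
    f.save(ahmed)                      # Ahmed saves
    sarah.append("Sarah's record")     # assumed edit by Sarah
    f.save(sarah)                      # Sarah's save overwrites Ahmed's
    return f.records


def locked_demo() -> list:
    f = RentedFile()
    names = ("Ahmed's record", "Sarah's record")
    threads = [
        threading.Thread(target=f.add_record_safely, args=(n,)) for n in names
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(f.records)
```

In `lost_update_demo()` only Sarah's record survives. In `locked_demo()` both records survive whichever thread runs first, because the updates appear to occur one after the other.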

Page 9

Complications: Crashes

• A crash during an update may lead to an inconsistent state.

• Somebody makes 250 of 500 edits to change records.

• Before he saves them, Windows crashes!

• Requirements
  – Must update on an all-or-none basis.
  – Implemented by commit, or rollback if necessary.

DBMS Concepts: Locks, Transactions, Commit, Rollback, Recovery
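The all-or-none requirement can be sketched with a transaction in Python's built-in sqlite3 module. The `tape` table, the function names, and the `crash_after` parameter are assumptions for illustration, and the raised exception only simulates a failure part-way through; after a real crash the DBMS's recovery mechanism does the equivalent rollback.

```python
import sqlite3


def build_tapes(n: int) -> sqlite3.Connection:
    """Create a hypothetical tape table with n rows, all marked 'old'."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tape (id INTEGER PRIMARY KEY, status TEXT)")
    conn.executemany(
        "INSERT INTO tape VALUES (?, 'old')", [(i,) for i in range(n)]
    )
    conn.commit()
    return conn


def apply_edits(conn, edits, crash_after=None) -> bool:
    """Apply every (tape_id, new_status) edit, or none of them."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            for i, (tape_id, status) in enumerate(edits):
                if crash_after is not None and i == crash_after:
                    raise RuntimeError("simulated crash")
                conn.execute(
                    "UPDATE tape SET status = ? WHERE id = ?",
                    (status, tape_id),
                )
    except RuntimeError:
        return False
    return True


def count_status(conn, status: str) -> int:
    row = conn.execute(
        "SELECT COUNT(*) FROM tape WHERE status = ?", (status,)
    ).fetchone()
    return row[0]
```

With 500 edits and a simulated failure after 250 of them, the rollback leaves no row changed; rerunning without the failure commits all 500.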

Page 10

[Figure: FILE-BASED SYSTEM. Application programs (tape rental check-in, new tape ordering, customer info) and the files they use in persistent storage (rented tape file, inventory master file, customer file).]

Page 11

Limitations of File-based Approach

• Separation and isolation of data
  – Each program maintains its own set of data. Users of one program may be unaware of potentially useful data held by other programs.

• Duplication of data
  – Same data is held by different programs. Wasted space and potentially different values and/or different formats for the same item.

• Data dependence
  – File structure is defined in the program code.

• Incompatible file formats
  – Programs are written in different languages, and so cannot easily access each other's files.

Page 12

Database Approach

• In the file-based approach, the definition of data was embedded in application programs, rather than being stored separately and independently.

• There was no control over access and manipulation of data beyond that imposed by application programs.

• Result: the database and the Database Management System (DBMS)

Page 13

Database Approach

• INFORMATION
  – Information can be defined as data that has been organized in such a way as to be useful for someone or some use, e.g. “A telephone directory is an information source”

• PROCESSING
  – The activity in which a computer converts data into information.

• TYPES OF PROCESSING
  – Sorting, Searching, Filtering and Aggregating

Page 14

Database Approach

• SORTING
  – Arranging data in an order that makes data items easier to find.

• SEARCHING
  – Finding a particular data item from among many (thousands, even millions).

• FILTERING
  – Selecting a smaller set of data items.

• AGGREGATING
  – Grouping, adding, counting, etc. of data items to produce a summary of the data.
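The four types of processing can each be shown in a line or two of Python. The rental records and the function names below are hypothetical sample data, used only to make the definitions concrete.

```python
from collections import Counter

# Hypothetical sample data for illustration.
RENTALS = [
    {"customer": "Ali Raza", "title": "Vertigo"},
    {"customer": "Sarah", "title": "Psycho"},
    {"customer": "Ali Raza", "title": "Casablanca"},
]


def sort_by_title(records: list) -> list:
    """Sorting: arrange records in title order."""
    return sorted(records, key=lambda r: r["title"])


def search(records: list, title: str):
    """Searching: find one particular record, or None if absent."""
    return next((r for r in records if r["title"] == title), None)


def filter_by_customer(records: list, customer: str) -> list:
    """Filtering: select the smaller set belonging to one customer."""
    return [r for r in records if r["customer"] == customer]


def count_per_customer(records: list) -> dict:
    """Aggregating: summarize the data as rentals per customer."""
    return dict(Counter(r["customer"] for r in records))
```

In a DBMS these correspond roughly to ORDER BY, a lookup by key, a WHERE clause, and GROUP BY with COUNT.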

Page 15

Sources of Data

• Data from orders placed by customers
• From public sources such as libraries
• Or, more recently, the Internet
• From commercial sources that provide specialised data such as mailing lists.
• CHARACTERISTICS OF USEFUL INFORMATION
  – It should be:
    • Up to date
    • On time
    • Relevant
    • Complete
    • Consistent
    • Presented in a usable way
    • Secured against unauthorised access

Page 16

What is a Database?

• A database is a well-organized collection of data that are related in a meaningful way, which can be accessed in different logical orders but are stored only once. The data in the database is therefore integrated, structured, and shared.

• The main features of data in a database therefore are:
  – It is well organized
  – It is related
  – It is accessible in different orders without great difficulty
  – It is stored only once

Page 17

Database Users?

• Database administrators:
  – Responsible for authorizing access to the database, coordinating and monitoring its use, acquiring software and hardware resources, controlling its use, and monitoring the efficiency of operations.

• Database Designers:
  – Responsible for defining the content, the structure, the constraints, and functions or transactions against the database. They must communicate with the end-users and understand their needs.

• End-users:
  – They use the data for queries and reports, and some of them actually update the database.

Page 18

Categories of End Users

• Casual:
  – Access the database occasionally, when needed.

• Naïve or Parametric:
  – They make up a large section of the end-user population. They use previously well-defined functions in the form of “canned transactions” against the database. Examples are bank tellers or reservation clerks who do this activity for an entire shift of operations.

• Sophisticated:
  – These include business analysts, scientists, engineers, and others thoroughly familiar with the system capabilities. Many use tools in the form of software packages that work closely with the stored database.

• Stand-alone:
  – Mostly maintain personal databases using ready-to-use packaged applications. An example is a tax program user who creates his or her own internal database.