CS370 Spring 2007: CS 370 Database Systems, Lecture 1: Overview of Database Systems

Post on 04-Jan-2016


CS370 Spring 2007

CS 370 Database Systems

Lecture 1: Overview of Database Systems

Questions in your mind…

• What is in this subject?

• Why are we studying this subject?

• What shall we get from this subject?

Introduction to Databases

• Objectives:
– What data and information are.
– The characteristics of file-based systems.
– The problems with the file-based approach.
– The meaning of the term database.
– The meaning of the term Database Management System (DBMS).
– The typical functions of a DBMS.
– The major components of the DBMS environment.
– The problems involved in the DBMS environment.
– The history of the development of DBMS.
– The advantages and disadvantages of DBMS.

Important Terms to Remember

• Database: organized collection of logically related data

• Data: stored representations of meaningful objects and events
– Structured: numbers, text, dates
– Unstructured: images, video, documents

• Information: data processed to increase knowledge in the person using the data

• Metadata: data that describes the properties and context of user data

Data Management Example

• Suppose
– You are a video store owner.
– Customers rent video tape copies of movies.
– There are several copies of each movie.

• Needs
– Which tapes has a customer rented?
– Are any tapes overdue?
– When will a tape become available?

Solution: File-based System

• Edit rented.txt file

• Advantages
– Text editors are easy to use
– Simple to insert a record
– Simple to delete a record

Complications: Queries

• The file-based solution does not address the needs.

• Query: What movies has Ali Raza rented?
– Execute (not quite right): Search for ‘Ali Raza’.

• Query: Are any tapes overdue?
– Execute: ???

• Requirements
– Robust, sophisticated query language
– Clear separation between data organization (schema) and data

DBMS Concepts: DML, SQL
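
The difference between searching text and querying structured data can be sketched as follows. This is a minimal illustration, not the lecture's own example: the layout of rented.txt, the sample rows, and the table and column names (`rentals`, `customer`, `movie`, `due_date`) are all assumptions. It uses Python's built-in `sqlite3` module as a stand-in DBMS.

```python
import sqlite3
from datetime import date

# Hypothetical contents of rented.txt: customer, movie, due date (assumed layout).
RENTED_TXT = """\
Ali Raza,The Matrix,2007-02-01
Sara Khan,The Life of Ali Raza,2007-02-10
Ali Raza Qureshi,Shrek,2007-02-15
"""

def text_search(text, needle):
    """Editor-style search: return every line that contains the needle."""
    return [line for line in text.splitlines() if needle in line]

def load_rentals(text):
    """Load the same records into an in-memory table with an explicit schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE rentals (customer TEXT, movie TEXT, due_date TEXT)")
    rows = [tuple(line.split(",")) for line in text.splitlines()]
    conn.executemany("INSERT INTO rentals VALUES (?, ?, ?)", rows)
    return conn

conn = load_rentals(RENTED_TXT)

# The text search matches all three lines, although only one is Ali Raza's rental.
naive_hits = text_search(RENTED_TXT, "Ali Raza")

# The query language can name the column it means.
movies = [m for (m,) in conn.execute(
    "SELECT movie FROM rentals WHERE customer = ?", ("Ali Raza",))]

# "Are any tapes overdue?" becomes a comparison on the due_date column.
today = date(2007, 2, 12).isoformat()  # fixed date so the example is repeatable
overdue = conn.execute(
    "SELECT customer, movie FROM rentals WHERE due_date < ? ORDER BY due_date",
    (today,)).fetchall()
```

Because the schema (which column is the customer, which is the due date) is kept separate from the data, the second query has an obvious formulation, whereas a text editor offers no way to express it.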

Complications: Multiple users

• Two clerks edit rented.txt file at the same time.
– Ahmed starts to edit rented.txt, reads it into memory.
– Sarah starts to edit rented.txt.
– Ahmed adds a record.
– Ahmed saves rented.txt to disk.
– Sarah saves rented.txt to disk.

• Ahmed’s added record disappears!

• Requirements
– Must support multiple readers and writers.
– Updates to data must (appear to) occur in serial

DBMS Concepts: Locks, Concurrency control
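
The lost update described above can be replayed step by step. The sketch below is an illustration under stated assumptions: it runs in a single process and simply reproduces the order of events, rather than using real concurrent editors, and the record contents and temporary file are made up. A DBMS would use locks or another concurrency-control scheme to guarantee the serial outcome shown at the end.

```python
import os
import tempfile

def read_records(path):
    """Read the whole file into memory, as a text editor would."""
    with open(path, encoding="utf-8") as f:
        return f.read().splitlines()

def save_records(path, records):
    """Overwrite the file with the in-memory copy."""
    with open(path, "w", encoding="utf-8") as f:
        f.write("".join(r + "\n" for r in records))

# A temporary stand-in for rented.txt with one existing record.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "rented.txt")
save_records(path, ["Ali Raza,The Matrix"])

ahmed_copy = read_records(path)          # Ahmed opens the file
sarah_copy = read_records(path)          # Sarah opens the same file
ahmed_copy.append("Sara Khan,Shrek")     # Ahmed adds a record
save_records(path, ahmed_copy)           # Ahmed saves
save_records(path, sarah_copy)           # Sarah saves her stale copy

after_interleaving = read_records(path)  # Ahmed's record is gone

# Serial execution: Sarah only starts after Ahmed has finished.
save_records(path, ["Ali Raza,The Matrix"])
ahmed_copy = read_records(path)
ahmed_copy.append("Sara Khan,Shrek")
save_records(path, ahmed_copy)
sarah_copy = read_records(path)          # Sarah now sees Ahmed's record
save_records(path, sarah_copy)

after_serial = read_records(path)        # both records are present
```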

Complications: Crashes

• A crash during an update may lead to an inconsistent state.

• Somebody makes 250 of 500 edits to change records.

• Before he saves the file, Windows crashes!

• Requirements
– Must update on an all-or-none basis.
– Implemented by commit, or by rollback if necessary.

DBMS Concepts: Locks, Transactions, Commit, Rollback, Recovery
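
The all-or-none requirement can be sketched with a transaction. This is a minimal illustration using Python's built-in `sqlite3` module; the `tapes` table, its columns, and the `SimulatedCrash` exception are assumptions introduced for the example. A real crash would not run an exception handler: the DBMS's recovery procedure would discard the uncommitted changes on restart. Here an exception stands in for the failure so that the effect of rollback is visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tapes (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany("INSERT INTO tapes VALUES (?, 'old')", [(i,) for i in range(500)])
conn.commit()

class SimulatedCrash(Exception):
    """Stands in for the machine failing part-way through the update."""

def update_all(conn, crash_after=None):
    """Change all 500 records as one transaction: all or none."""
    try:
        for tape_id in range(500):
            if tape_id == crash_after:
                raise SimulatedCrash
            conn.execute("UPDATE tapes SET status = 'new' WHERE id = ?", (tape_id,))
        conn.commit()      # make all 500 changes permanent together
    except SimulatedCrash:
        conn.rollback()    # undo the changes made so far

def count_new(conn):
    """Number of records that carry the updated status."""
    return conn.execute(
        "SELECT COUNT(*) FROM tapes WHERE status = 'new'").fetchone()[0]

update_all(conn, crash_after=250)
after_crash = count_new(conn)     # none of the 250 edits survive

update_all(conn)
after_success = count_new(conn)   # all 500 edits are applied
```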

[Figure: FILE-BASED SYSTEM. Applications: tape rental check-in, new tape ordering, customer info. Files in persistent storage: rented tape file, inventory master file, customer file.]

Limitations of File-based Approach

• Separation and isolation of data
– Each program maintains its own set of data. Users of one program may be unaware of potentially useful data held by other programs.

• Duplication of data
– The same data is held by different programs. This wastes space and can lead to different values and/or different formats for the same item.

• Data dependence
– File structure is defined in the program code.

• Incompatible file formats
– Programs are written in different languages, and so cannot easily access each other's files.

Database Approach

• In the file-based approach, the definition of data was embedded in application programs, rather than being stored separately and independently.

• There was no control over access and manipulation of data beyond that imposed by application programs.

• Result: the database and the Database Management System (DBMS)

Database Approach

• INFORMATION
– Information can be defined as data that has been organized in such a way as to be useful for someone or some use, e.g. “A telephone directory is an information source”.

• PROCESSING
– The activity in which a computer converts data into information.

• TYPES OF PROCESSING
– Sorting, searching, filtering, and aggregating

Database Approach

• SORTING
– Arranging data in an order that makes data items easier to find.

• SEARCHING
– Finding a particular data item from among many (thousands, even millions).

• FILTERING
– Selecting a smaller set of data items.

• AGGREGATING
– Grouping, adding, counting, etc. of data items to produce a summary of the data.
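
The four types of processing can be sketched on a small in-memory data set. The records below are made up for illustration, and plain Python is used here only to show the operations themselves; a DBMS would express the same operations through its query language.

```python
from collections import Counter

# Made-up rental records: (customer, movie, days_rented)
rentals = [
    ("Sara Khan", "Shrek", 3),
    ("Ali Raza", "The Matrix", 2),
    ("Ali Raza", "Shrek", 5),
    ("Ahmed", "Cars", 1),
]

# Sorting: order by customer name, then by movie.
by_customer = sorted(rentals, key=lambda r: (r[0], r[1]))

# Searching: find one particular item (the first rental of "Cars").
cars = next((r for r in rentals if r[1] == "Cars"), None)

# Filtering: select the smaller set of rentals longer than two days.
long_rentals = [r for r in rentals if r[2] > 2]

# Aggregating: count rentals per customer and total the rented days.
per_customer = Counter(r[0] for r in rentals)
total_days = sum(r[2] for r in rentals)
```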

Sources of Data

• Data from orders placed by customers

• From public sources such as libraries

• Or, more recently, the Internet

• From commercial sources that provide specialised data such as mailing lists

• CHARACTERISTICS OF USEFUL INFORMATION
– It should be:
  • Up to date
  • On time
  • Relevant
  • Complete
  • Consistent
  • Presented in a usable way
  • Secured against unauthorised access

What is a Database?

• A database is a well-organized collection of data that are related in a meaningful way, which can be accessed in different logical orders but are stored only once. The data in the database is therefore integrated, structured, and shared.

• The main features of data in a database therefore are:
– It is well organized
– It is related
– It is accessible in different orders without great difficulty
– It is stored only once
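
As a small illustration of "stored only once, accessible in different orders", the sketch below keeps a single copy of each row and retrieves it in two logical orders. The `movies` table and its sample rows are assumptions made for the example, and Python's built-in `sqlite3` module stands in for a DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (title TEXT, year INTEGER)")
conn.executemany("INSERT INTO movies VALUES (?, ?)",
                 [("Shrek", 2001), ("Cars", 2006), ("The Matrix", 1999)])

# Two logical orders over the same stored rows; no second copy is kept.
by_title = [t for (t,) in conn.execute("SELECT title FROM movies ORDER BY title")]
by_year = [t for (t,) in conn.execute("SELECT title FROM movies ORDER BY year")]

# The table still holds each movie exactly once.
stored_rows = conn.execute("SELECT COUNT(*) FROM movies").fetchone()[0]
```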

Database Users?

• Database administrators:
– Responsible for authorizing access to the database, coordinating and monitoring its use, acquiring software and hardware resources, controlling its use, and monitoring the efficiency of operations.

• Database designers:
– Responsible for defining the content, the structure, the constraints, and the functions or transactions against the database. They must communicate with the end-users and understand their needs.

• End-users:
– They use the data for queries and reports, and some of them actually update the database.

Categories of End Users

• Casual:
– Access the database occasionally, when needed.

• Naïve or Parametric:
– They make up a large section of the end-user population. They use previously well-defined functions in the form of “canned transactions” against the database. Examples are bank tellers or reservation clerks who do this activity for an entire shift of operations.

• Sophisticated:
– These include business analysts, scientists, engineers, and others thoroughly familiar with the system capabilities. Many use tools in the form of software packages that work closely with the stored database.

• Stand-alone:
– Mostly maintain personal databases using ready-to-use packaged applications. An example is a tax program user who creates his or her own internal database.
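
A "canned transaction" of the kind used by parametric end users can be sketched as a predefined function whose caller supplies only the parameters. The `accounts` table and the `deposit` function are hypothetical names introduced for this illustration; Python's built-in `sqlite3` module stands in for the DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES (1, 100)")
conn.commit()

def deposit(conn, account_id, amount):
    """Hypothetical canned transaction: the teller supplies only the parameters."""
    if amount <= 0:
        raise ValueError("amount must be positive")
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute("UPDATE accounts SET balance = balance + ? WHERE id = ?",
                     (amount, account_id))
    return conn.execute("SELECT balance FROM accounts WHERE id = ?",
                        (account_id,)).fetchone()[0]

new_balance = deposit(conn, 1, 50)
```

The teller never writes a query; the SQL is fixed in advance by the database designers, and only the account number and amount vary.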
