rugpijn (451 kb) - liberale mutualiteit

20
Part 2 Database Concepts

Upload: others

Post on 20-Feb-2022

4 views

Category:

Documents


0 download

TRANSCRIPT

Part 2

Database Concepts

Case Study 1

Small Bank Master Disk

NOWAcct File

CDFile

PassbookFile

LoansFile

Safe DepositFile

Credit/DebitPosting

InterestPosting

MonthlyStatements

Maintenance

InterestPosting

QuarterlyStatements

NotificationMaturity Maintenance

Credit/DebitPosting

InterestPosting

QuarterlyStatements Maintenance

CreditPosting

I.D. LatePayments

YearlySummary Maintenance

RenewalNotice

Maintenance

Ref: Jan Harrington, Relational Database Management for Microcomputers

Copyright © 1971-2004 Thomas P. Sturm Database Concepts Part 2, Page 2

Problems with a File System • redundant data • separate maintenance • varying formats/contents/standards • application program tied to data • little overall planning Problems with handling

• address change • new account/credit report • field length change

Copyright © 1971-2004 Thomas P. Sturm Database Concepts Part 2, Page 3

Database Approach

Small Bank Master Disk

Credit/DebitPosting

InterestPosting

MonthlyStatements Maintenance

InterestPosting

QuarterlyStatements

NotificationMaturity Maintenance

Credit/DebitPosting

InterestPosting

QuarterlyStatements Maintenance

CreditPosting

I.D. LatePayments

YearlySummary Maintenance

RenewalNotice Maintenance

Pooled, Shared Database

NOW Accounts Programs Certificates of Deposit Programs

Passbook Savings Programs Loans Programs

Safe-Deposit Box Programs

Ref: Jan Harrington, Relational Database Management for Microcomputers

Copyright © 1971-2004 Thomas P. Sturm Database Concepts Part 2, Page 4

Why Database? Advantages over Paper-based Methods

• Compactness • Speed • Less drudgery • Currency • Centralized control

Advantages of Centralized Control

(via a Database Administrator - DBA) • Redundancy can be reduced • Inconsistency can be avoided • The data can be shared • Standards can be enforced • Security restrictions can be applied • Integrity can be maintained • Conflicting requirements can be balanced

Ref: C. J. Date, An Introduction to Database Systems

Copyright © 1971-2004 Thomas P. Sturm Database Concepts Part 2, Page 5

Advantages Disadvantages

Economy of scale Size

Getting more information from same amount of data

Complexity

Data sharing Cost Balancing conflicting

requirements Requirements in conflict

Enforcing standards Having to adhere to standards

Controlled redundancy Higher impact of a failureSecurity Recovery more difficult

Consistency Integrity

Flexibility Responsiveness

Increased organizational productivity

Improved program maintenance

Data independence

Copyright © 1971-2004 Thomas P. Sturm Database Concepts Part 2, Page 6

Data Dependence Data Dependence:

Application dictates • organization in secondary storage • access methods

Knowledge of that organization and access

method • built into application logic • built into executable code

Impossible to change the storage structure or

access methods without affecting the application

Why Data Dependence is Undesirable:

Different applications will need different views of the data

The database administrator must have the freedom to change the storage structure or access methods

Copyright © 1971-2004 Thomas P. Sturm Database Concepts Part 2, Page 7

Stored Field, Record, File Stored Field:

• smallest named unit of stored data • database will contain many occurrences or instances • classified into various types

Stored Record:

• named collection of related stored fields • record occurrence consists of a group of related field

occurrences • also classified into various types by the variation in

fields contained in the record Stored File:

• named collection of all occurrences of one type of record

Copyright © 1971-2004 Thomas P. Sturm Database Concepts Part 2, Page 8

Data Independence The Immunity of Applications to Changes in

Storage Structure and Access Methods Must Protect an Application from Variations in:

• representation of numeric data (binary, BCD) • representation of character data (ASCII, EBCDIC) • units for numeric data (dollars, cents) • data coding (red, blue, green; 1, 2, 3; 01, 10, 11) • data materialization (for derived data) • structure of stored records (grouping fields into

records) • structure of stored files (single volume, volume set,

etc.)

Copyright © 1971-2004 Thomas P. Sturm Database Concepts Part 2, Page 9

Sample Database - Employees Overview of Content:

The database contains organization, budget, and scheduling information for a software group that is developing an academic information system

Entities:

Employees - who have • a name • a job title • a manager who, in turn, is an employee • a hire date • an hourly billing rate • (possibly) a dollar annual bonus amount • membership in a department, which in turn has a

name, location, and budget • a set of assigned tasks on projects

• each task by each employee on each project has a time estimate in hours

• each project has a name, description, budget, and due date

Copyright © 1971-2004 Thomas P. Sturm Database Concepts Part 2, Page 10

Sample Database - Departments and Projects

Entities (continued)

Departments - which have • a department number • a department name • a department location (room number) • an annual dollar budget • employees, who in turn have a name, job description,

manager, hire date, hourly rate annual bonus, and a set of assigned tasks (as described above)

Projects - which have

• a project name • a project description • a project budget • a project due date • a set of tasks, each of which is to be performed by one

or more employees (who in turn have a name, job description, manager, hire date, ...) with a time estimate for each employee for each task

Copyright © 1971-2004 Thomas P. Sturm Database Concepts Part 2, Page 11

Sample Database - Tasks Entities (continued)

Tasks - each of which have • the name of the employee working on the task (who in

turn has name, job description, ...) • the name of the project that the task is related to

(which in turn has name, description, ...) • the name of the task being performed • the time estimate (in hours) of how long an employee

will work on a particular type of task for a particular project

Copyright © 1971-2004 Thomas P. Sturm Database Concepts Part 2, Page 12

Sample Data (stated in relational form)

Employees - (Table name emp)

Ename Job Mgr Hired Rate Bonus DeptNo allen programmer barger 09-jun-1991 30.00 402 barger supervisor turner 23-jan-1993 65.00 550.00 402 jones programmer radl 20-feb-1991 35.00 401 king clerk barger 22-feb-1991 18.00 402 martin programmer barger 09-nov-1991 25.00 402 olson analyst radl 28-apr-1991 55.00 0.00 401 pearson programmer radl 01-may-1991 30.00 401 radl supervisor turner 03-dec-1992 65.00 600.00 401 rogers programmer barger 08-sep-1992 25.00 402 smith programmer barger 17-dec-1990 35.00 402 sturm clerk radl 23-sep-1992 18.00 401 thomas analyst barger 03-dec-1992 50.00 0.00 402 turner supervisor 02-mar-1991 75.00 1000.00 400 vogel consultant turner 17-nov-1991 80.00 400

Departments (Table name dept) DeptNo Dname Loc Dbudget 400 programming 200 150000.00 401 financial 200 275000.00 402 academic 100 390000.00 403 support 300 7000.00

Projects (Table name proj) Project_id Description Pbudget Due_date admit Admissions 15000.00 07-apr-1998 alumni Alumni development 7500.00 30-jan-1999 billing Student billing 11000.00 30-jan-1998 budget Budgeting 12500.00 12-mar-1998 payroll Payroll 9000.00 15-may-1998 records Students records 6000.00 11-feb-1998

Copyright © 1971-2004 Thomas P. Sturm Database Concepts Part 2, Page 13

Tasks (Table name task) Ename Project_id Tname Hours allen admit debug 25 allen admit implement 20 allen billing debug 30 allen billing implement 20 barger admit manage 15 barger alumni manage 10 barger billing manage 8 barger records manage 12 jones billing implement 35 jones budget implement 70 jones payroll debug 40 king admit clerical 25 king alumni clerical 9 king records clerical 15 martin admit implement 30 olson admit design 75 olson alumni design 40 olson billing design 20 olson records design 45 pearson budget debug 40 pearson budget implement 60 pearson payroll implement 80 radl billing design 15 radl billing manage 10 radl budget manage 15 radl payroll manage 20 rogers records debug 20 rogers records design 30 rogers records implement 45 smith alumni debug 30 smith alumni implement 90 smith billing implement 40 sturm billing clerical 38 sturm budget clerical 20 sturm budget debug 20 sturm payroll clerical 15 thomas alumni design 5 thomas billing design 45 thomas budget design 40 thomas payroll design 70 turner billing manage 12 turner budget design 45

Copyright © 1971-2004 Thomas P. Sturm Database Concepts Part 2, Page 14

Alternative Organization of Tasks Note:

The above information could have been organized by employee, project, or task as follows:

By Employee:

Allen: Admit Implement 20 Debug 25 Billing Implement 20 Debug 30 Barger: Admit Manage 15 Records Manage 12 Billing Manage 8 Alumni Manage 10 Jones: etc.

By Project:

Admit: Design Olson 75 Implement Allen 20 Martin 30 Debug Allen 25 Manage Barger 15 Clerical King 25 Records: etc.

Copyright © 1971-2004 Thomas P. Sturm Database Concepts Part 2, Page 15

Alternative Organization for Tasks By Task:

Design: Olson Admit 75 Records 45 Billing 20 Alumni 40 Rogers Records 30 Thomas Billing 45 Alumni 5 Payroll 70 Budget 40 Radl Billing 15 Turner Budget 45 Implement etc.

Copyright © 1971-2004 Thomas P. Sturm Database Concepts Part 2, Page 16

DBMS Definitions DBMS - Database Management System

(usually a GDBMS - Generalized DBMS) DDL - Data Definition Language

Language used to describe the logical database model or “schema”

DML - Data Manipulation Language Language used to perform database operations

such as queries and updates required for an application

PLI - Programming Language Interface An extension to a conventional 3GL (COBOL,

FORTRAN, C, etc.) that allows it to call the DBMS to perform operations (similar to the operations that might be performed in the DML) under control of a programming language

SCHEMA - The overall logical structure (data definition) or model of the stored database

SUB SCHEMA - The user's logical view or model of the database

Copyright © 1971-2004 Thomas P. Sturm Database Concepts Part 2, Page 17

DBMS Requirements • Data storage, retrieval, and update

(Get data in/Send data out) • Reporting • Ad hoc query handling • Data manipulation • Integrity services

(data type, legal values, format, constraints) • Transaction support • Concurrency control services

(database, file, table, record, field locking) • Recovery services

(backup, journaling, checkpointing) • Authorization services

(encryption, schemas, views, ACL's) • A user-accessible catalog

(data dictionary of files or tables, fields, etc.) • Support for data communications • Application development • Services to promote data independence

(not be aware of data structure, e.g., linked) • Suitable user/machine models

Copyright © 1971-2004 Thomas P. Sturm Database Concepts Part 2, Page 18

How Large is a Large Database?

Size Classifications: Trivial Training, Toys, Examples Extremely Small Tiny, In-memory Very Small Not worth an “industrial strength” database

engine Small Lower than average size Medium “Sweet spot” in the market – majority of

actual implementations Large Above average size Very Large Problematic size Extremely Large State-of-the-art large – only one or two

examples in existence – impractical for all but the world’s largest enterprises

Copyright © 1971-2004 Thomas P. Sturm Database Concepts Part 2, Page 19

Copyright © 1971-2004 Thomas P. Sturm Database Concepts Part 2, Page 20

Aggregate Database Sizes Over History

Year Size

1955 1970 1985 2000 2015

Trivial .2KB .8KB 4KB .02MB .1MBTiny 1KB .008MB 60KB .5MB 4MBV. Small 5KB .066MB .001GB 12MB 160MBSmall 25KB .6MB .013GB .3GB 7000MBMedium 130KB 5MB .2GB 7GB .3TBLarge 600KB 43MB 3GB 200GB 12TBV. Large 3300KB 400MB 42GB 5000GB 530TBEx. Large 17000KB 3000MB 620GB 117000GB 22000TB