lecture 1 dbms & more atif farid mohammad adjunct professor uncc
TRANSCRIPT
Lecture 1
DBMS & More
Atif Farid Mohammad
Adjunct ProfessorUNCC
Introductions
How many of you know?
PhP or ColdFusion or JavaScript
How many of you are?
Grads or Undergrads
What are your majors?
CS or SIS or Something Else
DBMS & More…
Textbook:
• Fundamentals of Database Systems, Elmarsi/Navathe, Benjamin, 2011 (Sixth Edition)
Workload:• (30 %) 3 In Class Quiz• (30%) Midterm exam • (40% ) One project, parts 1 & 2
Any Questions ?
Objectives
• Introduction• Example of database systems• Database characteristics• People associated with database• Advantage of Using a DBMS• Database implications• When Not to use a DBMS
8
Introduction• Databases:– Used to maintain information and to present
data to users
• Examples of Databases include:– Reservations systems (Hotel, Car, Airline)– Transactions processing systems (Online
Banking systems, local library)– Investigations systems(Scientific Database
systems)– Multimedia database systems– Geographic information systems (GIS)
9
Core Database Terminologies: 1
• Data? – Any information (most likely in electronic form)
worth preserving • E.g., names, addresses, grades, etc.
• Database? – A collection of related data describing the
activities of one or more organizations◦ Organized (or structured) for access and modification◦ Preserved over a long period
– E.g. University Database
10
Core Terminologies: 3
• Query?an operation that extracts specified data from the
databaseE.g.
get the list of all courses and grades taken by “Smith”
• Relation? an organization of data into a two-dimensional table,
rows (tuples) represent basic entities or facts of some sort columns (attributes or fields) represent properties of those entities.
• Schema?a description of the structure of the data in a
databasealso known as metadata
11
Database Def-1
A database is a shared collection of logically related data that is stored to meet the requirements of different users of an organization
A database is a self-describing collection of integrated records
Database Def-2
A database models a particular real world system in the computer in the
form of data
Database Def-3
The concept of a shared organizational database
Accounting
AccountsPayable
Accounts Receivable
Management
ControlPlanning
Manufacturing
ProductionScheduling
Marketing
Product Development
Sales
CorporateDatabase
A bit of History
Computer initially used for computational/ engineering purposes
Commercial applications introduced File Processing System
File Processing System
A collection of programs that perform services for the end-users such as production of reports
File Processing Systems
Program and Data Interdependence
RegistrationApplications
RegistrationDataFiles
Registration
ExaminationApplications
ExaminationDataFiles
Examination
LibraryApplications
LibraryDataFiles
Library
Library Exam Registration
Reg_Number Reg_Number Reg_Number
Name Name Name
Father Name Address Father Name
Books Issued Class Phone
Fine Semester Address
Grade Class
File Processing Systems
Duplication of Data
Vulnerable to InconsistencyIncompatible Formats
Database Approach
Advantages of Database Approach
RegistrationExamination
LibraryApplications
Library
ExaminationApplications
RegistrationApplications
Database Management
System
University StudentsDatabase
- Data Sharing - Data Independence
- Controlled Redundancy - Better Data Integrity
Matt Damon 23 55
Bruce Willis 24 55
Katie Holmes 20 40
Robert Langdon 19 20
Emp Name Age SalaryCompany: Super Soft Dept: Sales
Data & Information
Schema
Database Applications
Database Management System (DBMS)
Other Advantages
Data consistency
Better data security
Faster development of new
applications
Economy of scale
Better concurrency control
Better backup and recovery procedures
They also provide
Disadvantages
Higher costs
Conversion cost
More difficult recovery
Typical Components
DBMS
Database
“How” to get
Application Programs
“What” to get
End usersinteract
Application Programmers
develop
Database Designersdesign
Software
Users
Data
maintain
Database Administrators
Levels of Data
Real World Data
Data Occurrences
Metadata
Entity, Attributee.g. A student,A class name
Record type, Data item type e.g. Student record type
Student record, Data item occurrence e.g. ‘s001’, ’Amir’, ‘CS101’
name, age, qual, sal
Employee
EmpName textAge numberSal number
John Durso 23 55
Lisa Smith 24 55
Cindy Bates 20 40
Braden Sams 19 20
Levels of Data
Real-world data
Metadata
Data Occurrence
Database Users
Application ProgrammersEnd Users
– Naïve– Sophisticated
Roles in the Database Environment
• Data Administrator (DA)• Database Administrator (DBA)• Database Designers (Logical and
Physical)• Application Programmers• End Users (naive and sophisticated)
Functions of DBA
Schema definitionGranting data accessRoutine Maintenance
Backups Monitoring disk space Monitoring jobs running
Database Properties
• A database must– Represent some aspects of real world– Collection of data must be logically coherent
and meaningful– Database is designed, build, and populated with
data for specific purpose
34
Database Construction
• Database can be constructed– Manually
• E.g., a library card catalog
– Computerized system• A Specific set of applications• Database Mgt systems
35
Database Mgt system (DBMS)
• Database management system (DBMS)? – A collection of programs that enables users
• To define (specifying data type, etc)• To construct (storing)• To manipulate (reading, writing)• To share (simultaneous access)• To protect (security & privacy protection)• To modify (changing requirements)
• Database system= DBMS + Database
36
37
Example: University Database System
• A typical university database system maintaining information regarding– Students– Courses– Grades
38
Example: A University Database
39
Database Engineering
• Design of any DB application starts– Requirements definition and analysis– Conceptual Design
• Performed using Entity relationship Diagram
– Logical Design • Performed using Orcale / Mysquel, etc
– Physical Design (storing/accessing/indexing)
40
Main Characteristics of DB
• Self-Describing nature of a database system ( catalog =Data + metadata)
• Insulation between programs and data ( program-data-independence)
• Support of multiple views of the data (virtual data)
• Sharing of data and multi-user transaction processing (concurrency control)– Transaction and atomicity
• ACID41
People involved in Database
• Involved two types of people– people on the Scene– people behind the Scene
42
People on the Scene
– Database administrators (DBA)• Authorization, coordination, supervision of DB
– Database designers• Defining, building, maintaining, etc
– End Users• Casual, naive, sophisticated
– Software Engineer • system analysts and application programmers
43
People behind the Scene
• DBMS system designers and implementers• Tool developers• Operators and maintenance personnel
44
File System vs. Database Mgt Systems
• Suppose an organization needs to manage a large collection of data, say, 500 GB (i.e., 500X1024 MB)
• Data is supposed to be accessed concurrently by employees
• Changes made to the data must be applied consistently
• Access to the data must be restricted
45
File systems vs. A DBMS (cont’)
• Using file systems, the data is stored in OS files– Not enough MM– Difficulty to directly access data (with 32-bit ,
we can access 4GB)– Need to write special program to answer each
question– Duplicated efforts – inconsistency– Concurrent accesses– Security policies– … 46
Primary Advantages of Using a DBMS: 1
• Controlling Redundancy– Duplicate space and efforts– inconsistency
• Restricting Unauthorized Access– Security and authorization subsystem
• Providing persistent storage for program objects and data structure– Impedance mismatch problem (incompatibility between
PL and DBMS)• Deriving new information from existing ones (view)
47
Primary Advantages of Using a DBMS: 2
48
• Providing Multiple User Interfaces• Representing Complex Relationships among Data• Enforcing Integrity Constraints• Providing Backup and crash Recovery• Scheduling concurrent accesses to the data• Reducing application development time
Implication of the Database Approach
• Standards can be enforced (data and display format)
• Application development time can be reduced
• Flexibility and maintainability• Availability of up-to date information• Economics of Scale (i.e., reducing overall
costs of operations, sources and mgt)49
When Not to Use A DBMS
• When there is unnecessary overhead resulted from:– High initial up-front cost– Generality that a DBMS provides for defining
and processing data– Overhead for providing
• security• concurrency• recovery• integrity
50
When to Use File System
• A File system is desirable under these conditions:– The database and application are simple, well-
defined, and unchangeable– Single-user access to the data is required– performance
51
So, what is a Database?
• Collection of data central to some enterprise• Essential to operation of enterprise
– Contains the only record of enterprise activity• An asset in its own right
– Historical data can guide enterprise strategy– Of interest to other enterprises
• State of database mirrors state of enterprise– Database is persistent
52
What is a Database Management System?
• A Database Management System (DBMS) is a program that manages a database:– Supports a high-level access language (e.g.
SQL).– Application describes database accesses using
that language.– DBMS interprets statements of language to
perform requested database access.
53
What is a Transaction?
• When an event in the real world changes the state of the enterprise, a transaction is executed to cause the corresponding change in the database state– With an on-line database, the event causes the
transaction to be executed in real time• A transaction is an application program
with special properties - discussed later - to guarantee it maintains database correctness
54
What is a Transaction Processing System?
• Transaction execution is controlled by a TP monitor– Creates the abstraction of a transaction,
analogous to the way an operating system creates the abstraction of a process
– TP monitor and DBMS together guarantee the special properties of transactions
• A Transaction Processing System consists of TP monitor, databases, and transactions
55
Transaction Processing System
56
Transaction Processing Monitor
DBMS databasetran
sact
ions
DBMS database
System Requirements
• High Availability: on-line => must be operational while enterprise is functioning
• High Reliability: correctly tracks state, does not lose data, controlled concurrency
• High Throughput: many users => many transactions/sec
• Low Response Time: on-line => users are waiting
57
System Requirements (con’t)
• Long Lifetime: complex systems are not easily replaced– Must be designed so they can be easily
extended as the needs of the enterprise change
• Security: sensitive information must be carefully protected since system is accessible to many users– Authentication, authorization, encryption
58
Roles in Design, Implementation, and Maintenance of a TPS
• System Analyst - specifies system using input from customer; provides complete description of functionality from customer’s and user’s point of view
• Data Scientists: use all available data sources (internal and external) to analyze and gain insights to help decision makers.
• Database Designer - specifies structure of data that will be stored in database.
59
Roles in Design, Implementation and Maintenance of a TPS (con’t)• Application Programmer - implements
application programs (transactions) that access data and support enterprise rules
• Database Administrator - maintains database once system is operational: space allocation, performance optimization, database security
• System Administrator - maintains transaction processing system: monitors interconnection of HW and SW modules, deals with failures and congestion
60
OLTP vs. OLAP
• On-line Transaction Processing (OLTP) – Day-to-day handling of transactions that result
from enterprise operation – Maintains correspondence between database
state and enterprise state
• On-line Analytic Processing (OLAP) – Analysis of information in a database for the
purpose of making management decisions
61
OLAP
• Analyzes historical data (terabytes) using complex queries
• Due to volume of data and complexity of queries, OLAP often uses a data warehouse
• Data Warehouse - (offline) repository of historical data generated from OLTP or other sources
• Data Mining - use of warehouse data to discover relationships that might influence enterprise strategy
62
Examples - Supermarket
• OLTP– Event is 3 cans of soup and 1 box of crackers
bought; update database to reflect that event
• OLAP– Last winter in all stores in northeast, how many
customers bought soup and crackers together?
• Data Mining– Are there any interesting combinations of foods
that customers frequently bought together?
63
Big Data Era
• Every 2 days we create as much information as we did from the dawn of civilization up to 2003. (2010, Eric Shmidt, Google)
• Science: Astronomy, Physics, Bioinformatics, Neuroinfomatics, earth science, etc.
• Business: Automobile, Healthcare, Financial, infotaiment
• The automotive industry is projected to be the 2nd largest generator of data by 2015
64
Big Data Opportunities to Auto
• Google self-driving car (generates 1GB/s).• Recommendation/alert to customers.• Manufactures can better understand
customers and market trends.• Better driving behaviors, better cars, less
accidents, and the bottom line is happier customers.
65
Big Data Opportunities to Auto
• Google self-driving car (generates 1GB/s).• Recommendation/alert to customers.• Manufactures can better understand
customers and market trends.• Better driving behaviors, better cars, less
accidents, and the bottom line is happier customers.
66
Why is Database Important in the Future?
“In the future, … any given discipline advances is likely to depend on… database, workflow management, visualization, and cloud computing technologies. “
G. Bell, T. Hey, A. Szalay, “Beyond the Data Deluge,” Science, Vol. 323, no. 5919, pp. 1297-1298, 2009.
68
Turing Awardees in DB
Jim Gray (1998)
Edgar F. Codd (1981)
Charles Bachman (1973)
Charles Bachman
Developer of IDS: the first database system
Edgar. F. Codd
70Inventor of the Relational Model
Jim Gray
71
Founder of Transaction Processing
Questions ?