cs370 spring 2007 cs 370 database systems lecture 3 overview of database systems
TRANSCRIPT
CS370 Spring 2007
CS 370 Database SystemsCS 370 Database Systems
Lecture 3 Overview of Database
Systems
CS370 Spring 2007
A commonly used view of data approach is the three-level architecture suggested by ANSI/SPARC A commonly used view of data approach is the three-level architecture suggested by ANSI/SPARC
(American National Standards Institute/Standards Planning and Requirements Committee).(American National Standards Institute/Standards Planning and Requirements Committee).
View nView nView 2View 2View 1View 1
Conceptual Conceptual
SchemaSchema
Internal Internal
SchemaSchemaD
ata
base
Dat
a ba
se
External LevelExternal Level
Conceptual LevelConceptual Level
Internal LevelInternal Level
Physical data OrganizationPhysical data Organization
DBMS ArchitectureDBMS Architecture
CS370 Spring 2007
The external level is the view that the individual user of the database has. This
view is often a restricted view of the database and the same database may provide a
number of different views for different classes of users. In general, the end users
and even the applications programmers are only interested in a subset of the
database. For example, a department head may only be interested in the
departmental finances and student enrolments but not the library information. The
librarian would not be expected to have any interest in the information about
academic staff. The payroll office would have no interest in student enrolments
User’s view of the database. Describes that part of database that is relevant to a particular user.
External LevelExternal Level
CS370 Spring 2007
The conceptual view is the information model of the enterprise and contains the view of the whole enterprise without any concern for the physical implementation. This view is normally more stable than the other two views. In a database, it may be desirable to change the internal view to improve performance while there has been no change in the conceptual view of the database. The conceptual view is the overall community view of the database and it includes all the information that is going to be represented in the database. The conceptual view is defined by the conceptual schema, which includes definitions of each of the various types of data.
Community view of the database. Describe what data is stored in database and relationship.
Conceptual LevelConceptual Level
CS370 Spring 2007
Physical representation of the database on the computer. Describes how the data is stored in the database.
Internal LevelInternal Level
CS370 Spring 2007
External View (User 2)External View (User 1)1.1. EMPEMP
2.2. EMP #EMP #
3.3. SALSAL
1. EMP
2. EMP #
3. SAL
Conceptual Level
Employee
EMPLOYEE_NUMBER NUMBER(6)
DEPARTMENT_NUMBER NUMBER( 4)
SALARY NUMARIC(5)
Internal Level STORED_EMP LENGTH=18
PREFIX TYPE=BYTE(6), OFFSET=0
EMP# TYPE=BYTE(6), OFFSET=6,INDEX=EMX
DEPT# TYPE=BYTE(4), OFFSET=12
PAY TYPE=FULLWORD, OFFSET=16
CS370 Spring 2007
SNo FName LName Age Salary Staff_No L_Name BNo
Staff_No FName LName DOB Salary Branch_No
External View 1 External View 2
Conceptual Level
struct STAFF{int Staff_No;int Branch_No;char Fname [15];char Lname [15];struct date Date_of_Birth;float Salary;struct STAFF * next; /*Pointer to next Staff Record*/};index Staff_No; index Branch_No; /*Define indexes for staff*/
Internal Level
CS370 Spring 2007
UserUser UserUser
External ViewExternal View
UserUser UserUser
External ViewExternal View
UserUser UserUser
External ViewExternal View
Conceptual ViewConceptual View(Table,etc)(Table,etc)
Physical ViewPhysical View
Logical BarrierLogical Barrier
Physical BarrierPhysical Barrier
CS370 Spring 2007
Data IndependenceData Independence
• One big problem in application development is the separation of applications from data
• Do I have change my program when I …– replace my hard drive?– partition the data into two physical files (or merge two
physical files into one)?– store salary as floating point number instead of integer?– develop other applications that use the same set of data?– add more data fields to support other applications?– … …
CS370 Spring 2007
External/Conceptual External/Conceptual mappingmapping
Data Independence and the ANSI-SPARC Three-level Data Independence and the ANSI-SPARC Three-level ArchitectureArchitecture
External SchemaExternal Schema
Conceptual SchemaConceptual Schema
Internal SchemaInternal Schema
Conceptual/internal Conceptual/internal mappingmapping
External SchemaExternal Schema External SchemaExternal Schema
Logical data independenceLogical data independence
Physical data independencePhysical data independence
CS370 Spring 2007
Instances and SchemasInstances and Schemas
• Each level is defined by a schema, which defines the data at the corresponding level– A logical schema defines the logical structure of the
database (e.g., set of customers and accounts and the relationship between them)
– A physical schema defines the file formats and locations
• A database instance refers to the actual content of the database at a particular point in time. A database instance must conform to the corresponding schema
CS370 Spring 2007
Logical Data IndependenceLogical Data Independence
Refers to immunity of external schemas to change in Refers to immunity of external schemas to change in conceptual schema.conceptual schema.
Conceptual schema changes e.g. addition /removal of Conceptual schema changes e.g. addition /removal of entities.entities.
Should not require changes to external schema or Should not require changes to external schema or rewrites of application programs.rewrites of application programs.
Data IndependenceData Independence
CS370 Spring 2007
external
schema1
external
schema1
external
schema2
external
schema2
external
schema3
external
schema3
conceptual
schema
conceptual
schema
internal
schema
internal
schema
databasedatabase
Logical data Logical data independence is a independence is a
measure of how much measure of how much the conceptual schema the conceptual schema
can change without can change without affecting the affecting the
application programsapplication programs
• For Example, by removing For Example, by removing
record type or data item, record type or data item,
the external schemas that the external schemas that
refer only to the remaining refer only to the remaining
data should not be affected.data should not be affected.
• Only the view definitions and Only the view definitions and
mapping need to be changed mapping need to be changed
in DBMS.in DBMS.
Logical Data IndependenceLogical Data Independence
CS370 Spring 2007
Physical Data IndependencePhysical Data Independence
Refers to immunity of conceptual schemas to change Refers to immunity of conceptual schemas to change in internal schema.in internal schema.
Internal schema changes e.g. using different file Internal schema changes e.g. using different file organizations, storage structures/devices.organizations, storage structures/devices.
Should not require changes to Conceptual or external Should not require changes to Conceptual or external schema.schema.
Data IndependenceData Independence
CS370 Spring 2007
external
schema1
external
schema1
external
schema2
external
schema2
external
schema3
external
schema3
conceptual
schema
conceptual
schema
internal
schema
internal
schema
databasedatabase
Physical data Physical data independence is a independence is a
measure of how much measure of how much the internal schema can the internal schema can
change without change without affecting the affecting the
application programsapplication programs
• some physics files are some physics files are
reorganized.reorganized.
• For example, when create For example, when create
additional access structure to additional access structure to
improve the performance of improve the performance of
retrieval or update. If the same retrieval or update. If the same
data as before remains same in data as before remains same in
the database.the database.
• No need to change conceptual No need to change conceptual
schemaschema
Physical Data IndependencePhysical Data Independence
CS370 Spring 2007
• Here are some questions a user might ask:-– What is the name of the student whose ID = 1234?– How many students are enrolled in CS370?– Is any student with GPA less than 3.0 enrolled in CS370?
• Such questions involving the data stored in a DBMS are called queries
• A DBMS provides a specialized language, called the query language, in which queries can be imposed
• Relational algebra and relational calculus are the formal query languages based on mathematical logic and collection of operators.
• Query language is also a part of DML (which would be discussed later)
Queries in DBMSQueries in DBMS
CS370 Spring 2007
Transaction ManagementTransaction Management
•A transaction is a collection of operations that performs a single logical function in database application
time
Transaction 1
Transaction 2
Conflicting read/write
Transaction 1
CS370 Spring 2007
• Transaction-management component ensures that the database remains in a consistent (correct) state despite system failures (e.g. power failures and operating system crashes) and transaction failures.
• Concurrency-control manager controls the interaction among the concurrent transactions, to ensure the consistency of the database.
Transaction Management (cont.)Transaction Management (cont.)
CS370 Spring 2007
Storage ManagementStorage Management
• A storage manager is a program module that provides the interface between the low-level data stored in the database and the application programs and queries submitted to the system.
• The storage manager is responsible for the following tasks:– interaction with the file manager– efficient storing, retrieving, and updating of data.
CS370 Spring 2007
Overall System ArchitectureOverall System Architecture
CS370 Spring 2007
Two-tier architecture: E.g. client programs using ODBC/JDBC to communicate with a databaseThree-tier architecture: E.g. web-based applications, and applications built using “middleware”
Application ArchitecturesApplication Architectures
CS370 Spring 2007
Server hold the database and the DBMS.
Clint manage the user interface and runs applications.
LAN
Request for data Selected data returned
Databa
se
Server (With DBMS)
Client – ServerClient – Server
CS370 Spring 2007
Clint Server
Manages the user interface.
Accepts and checks syntax of user input.
Process application.
Generates database request and transmit to server.
Passes response back to user.
Accepts and processes database requests from clients.Checks authorizationEnsure integrity constraints not violated.Performs query/update processing and transmit response to client.Maintains system catalog.Provides concurrent database access.Provides recovery control.
Summary – Client Server functionsSummary – Client Server functions
CS370 Spring 2007
Two-tier ArchitectureTwo-tier Architecture
• User Interface Programs and Application Programs run on the client side
• Interface called ODBC (Open Database Connectivity) provides an Application program interface (API) allow client side programs to call the DBMS. Most DBMS vendors provide ODBC drivers.
• A client program may connect to several DBMS.• Other variations of clients are possible: e.g., in
some DBMS, more functionality is transferred to clients including data dictionary functions, optimization and recovery across multiple servers, etc. In such situations the server may be called the Data Server.
CS370 Spring 2007
Three-tier ArchitectureThree-tier Architecture
• Common for Web applications• Intermediate Layer called Application Server or
Web Server: • Stores the web connectivity software and the rules
and business logic (constraints) part of the application used to access the right amount of data from the database server
• Acts like a conduit for sending partially processed data between the database server and the client.
• Additional Features- Security: • Encrypt the data at the server before transmission• Decrypt data at the client
CS370 Spring 2007
Advantages include:
Wider access to existing database.
Increased performance.
Possible reduction in hardware cost.
Reduction in communication cost.
Increased consistency.
Client - ServerClient - Server
CS370 Spring 2007
Distributed database system consist of collecting of sites, connected together via some kind of communication network, in which.
Each site is a database system site in its own right.
The site is have agreed to work together so that a user at any
site can access data anywhere in the network exactly as
if the data were all stored at the user’s own site.
Distributed Database SystemsDistributed Database Systems
CS370 Spring 2007
Lahore
Shanghai Islamabad
Singapore
Communication network
Distributed Database SystemsDistributed Database Systems