fundamentals, design, and implementation, 9/e by david m. kroenke lecture 23: sharing enterprise...
TRANSCRIPT
Fundamentals, Design, and Implementation, 9/e
by David M. Kroenke
Lecture 23: Sharing Enterprise Data
Chapter 15
BSA206 Database Management Systems
Lecture 23 / Slide 2 Copyright © 2004
Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke
Database Processing Architectures
System architectures for enterprise database processing:– Teleprocessing system– Client-server system– File-sharing system– Distributed system
Lecture 23 / Slide 3 Copyright © 2004
Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke
Teleprocessing Systems
Classic architecture for multi-user database processing
Users operate dumb terminals or PC that emulate dumb terminals– User interface is usually simple and primitive
A single centralized computer processes communications control program, application programs, DBMS, and operating system
Lecture 23 / Slide 4 Copyright © 2004
Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke
Teleprocessing Systems
Lecture 23 / Slide 5 Copyright © 2004
Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke
Client-Server Systems
A client-server system consists of a network of computers connected via a LAN
Clients are personal computers used to process application programs
Servers are PCs or mainframes that stores DBMS and the data-management portion of the operating system
Lecture 23 / Slide 6 Copyright © 2004
Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke
Client-Server Systems
Lecture 23 / Slide 7 Copyright © 2004
Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke
File-Sharing Systems
This architecture was developed before the client-server architecture
File server and user computers are connected through LAN– File server provides access to files and other resources– User computers must contain a copy of DBMS and
application programs DBMS on user’s computer sends requests to the
data management portion of the operating system on the file server for file-level processing– This cause more traffic across LAN than the client-server
system
Lecture 23 / Slide 8 Copyright © 2004
Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke
File-Sharing Systems
Lecture 23 / Slide 9 Copyright © 2004
Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke
Distributed Database Systems
Distributed database systems use multiple computers to process the same database
Distributed processing: use multiple computers for applications or DBMS processing– E.g., file-sharing, client-server, and distributed
database system Distributed database processing: distribute
database to multiple computers – E.g., distributed database system
Lecture 23 / Slide 10 Copyright © 2004
Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke
Distributed Database Systems
Lecture 23 / Slide 11 Copyright © 2004
Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke
Database Partitioning
A vertical partition, or vertical fragment, refers to a table that is broken into two or more sets of columns
A horizontal partition, or horizontal fragment, refers to a table that is broken into two or more sets of rows
Mixed partition refers to a database broken into both horizontal and vertical partitions
Lecture 23 / Slide 12 Copyright © 2004
Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke
Types of Distributed Databases
Types of distributed database:– Nonpartitioned, nonreplicate– Partitioned, nonreplicated– Nonpartitioned, replicated– Partitioned, replicated
The greater the degree of partitioning and replication– The greater the flexibility, independence, and reliability– The greater the expense, control difficulty, and security
problems
Lecture 23 / Slide 13 Copyright © 2004
Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke
Types of Distributed Databases
Lecture 23 / Slide 14 Copyright © 2004
Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke
Comparing DB Distribution Alternatives
Lecture 23 / Slide 15 Copyright © 2004
Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke
Distributed Processing Techniques
Three types of distributed database processing Downloading of read-only data: only one computer
updates data, but multiple computers are sent copies to process
Updating by a designated computer: allows data update requests to originate on multiple computers, but to transmit those update requests to a designated computer for processing– Database at the designated computers must be
periodically synchronized
Lecture 23 / Slide 16 Copyright © 2004
Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke
Distributed Processing Techniques (cont.)
Updating by multiple computers: allows multiple updates on the same data at multiple sites– Three types of distributed update conflict can occur:
• Loss of uniqueness• Lost updates due to concurrent transactions• Updates of deleted data
Coordinating distributed atomic transactions is difficult and requires a two-phase commit
The OLE Distributed Transaction Server and Java Enterprise Beans are two technologies for dealing with these problems
Lecture 23 / Slide 17 Copyright © 2004
Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke
Downloading Data
Powerful personal computers enable user to download enterprise data for local processing
Users can query and report on downloaded data using DBMS products on their machines
Normally, users are not allowed to update and return data to prevent data integrity problems
A Web server can be used to publish downloaded data
Lecture 23 / Slide 18 Copyright © 2004
Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke
Potential Problems of Downloaded Databases
Coordination– Downloaded data must conform to database
constraints– Local updates must be coordinated with
downloads Consistency
– In general, downloaded data should not be updated
– Applications need features to prevent updating– Users should be made aware of possible
problems
Lecture 23 / Slide 19 Copyright © 2004
Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke
Potential Problems of Downloaded Databases (cont.)
Access Control – Data may be replicated on many computers– Procedures to control data access are more
complicated
Potential for Computer Crime– Illegal copying is difficult to prevent– Diskettes and illegal online access are easy to
conceal– Risk may prevent the development of
downloaded data applications
Lecture 23 / Slide 20 Copyright © 2004
Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke
Processing Downloaded Data with a Web Server
Lecture 23 / Slide 21 Copyright © 2004
Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke
OLAP
On Line Analytical Processing (OLAP) is a new way of presenting information
With it, data is viewed in cubes that have axes, dimensions, measures, slices, and levels
Cube refers to – Underlying semantic structure that is used to
interpret data– A particular materialization of data in such a
semantic structure
Lecture 23 / Slide 22 Copyright © 2004
Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke
Example: Relational Source Data
Lecture 23 / Slide 23 Copyright © 2004
Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke
Example: OLAP Cube
Lecture 23 / Slide 24 Copyright © 2004
Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke
OLAP Terminology
OLAP hypercube: means a data display with an unlimited number of axes
Lecture 23 / Slide 25 Copyright © 2004
Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke
OLAP Schema Structures
Star schema: every dimension table is adjacent to the table storing the measure values– These tables may or may not be normalized
Snowflake schema: there can be multilevel, normalized tables
In general, the star schema requires greater storage, but it is faster to process than the snowflake schema
Lecture 23 / Slide 26 Copyright © 2004
Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke
Example: Star Schema
Lecture 23 / Slide 27 Copyright © 2004
Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke
Example: Snowflake Schema
Lecture 23 / Slide 28 Copyright © 2004
Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke
OLAP Storage Alternatives
Three different means for storing OLAP data ROLAP (relational OLAP): relational DBMS with
extensions is sufficient to meet OLAP requirements
MOLAP (multidimensional OLAP): a specialized multidimensional processor is necessary to produce acceptable OLAP performance
HOLAP (hybrid OLAP): both DBMS products and specialized OLAP engines have a role and can be used to advantage
Lecture 23 / Slide 29 Copyright © 2004
Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke
Data Warehouse
A data warehouse is a store of enterprise data that is designed to facilitate management decision-making
Goal: to increase the value of the organization’s data asset
Role: to store extracts from operational data and make those extracts available to users in a useful format– Data can be extracts from databases, files,
images, recordings, photos, external data, etc.
Lecture 23 / Slide 30 Copyright © 2004
Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke
Data Warehouse
Lecture 23 / Slide 31 Copyright © 2004
Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke
Data Warehouse Components
Data extraction tools Extracted data Metadata of warehouse contents Warehouse DBMS(s) and OLAP servers Warehouse data management tools Data delivery programs End-user analysis tools User training courses and materials Warehouse consultants
Lecture 23 / Slide 32 Copyright © 2004
Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke
Data Warehouse Requirements
Queries and reports with variable structure User-specified data aggregation User-specified drill down Graphical outputs Integration with domain-specific programs
Lecture 23 / Slide 33 Copyright © 2004
Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke
Challenges for Data Warehouses
Inconsistent data– When data are integrated, inconsistencies can
develop due to timing and domain differences– Solution: create metadata to describe both
timing and domains of source data Tool Integration
– Because of the many tools required in a data warehouse, tools will have different user interfaces and inconsistent means of importing and exporting data, and it may be difficult to obtain technical support
Lecture 23 / Slide 34 Copyright © 2004
Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke
Challenges for Data Warehouses
Lack of tools for managing the data warehouse– The organization may have to develop its own
tools for managing non-relational data and for maintaining appropriate metadata. Such development is difficult and expensive
Ad hoc nature of requirements – Such requests are difficult to satisfy– Solution: create datamart, i.e,, limited-scope
data warehouses
Lecture 23 / Slide 35 Copyright © 2004
Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke
Data Marts
A data mart is a limited-scope data warehouse
A data mart is easier to manage than the enterprise data warehouse because – It has a much smaller domain– It can be restricted
• To a particular type of input data• To a particular business function• To a particular business unit or geographic area
Lecture 23 / Slide 36 Copyright © 2004
Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke
Enterprise Data Sharing Continuum
Lecture 23 / Slide 37 Copyright © 2004
Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke
Data Administration
Data are an important organizational asset that can support both operations and management decision making
The purpose of offices of data administration is to guard and protect the data and to ensure that they are used effectively
Lecture 23 / Slide 38 Copyright © 2004
Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke
Data Administration Challenges
Many types of data exist Basic categories of data are not obvious The same data can have many names The same data can have many
descriptions and formats Data are changed often concurrently Political-organizational issues complicate
operational issues
Lecture 23 / Slide 39 Copyright © 2004
Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke
Functions of Data Administration
Marketing– Communicate existence of data administration to
organization– Explain reason for existence of standards, policies, and
guidelines– Describe in a positive light the services provided
Data Standards– Establish standard means for describing data items.
Standards include name, definition, description, processing restrictions, etc.
– Establish data proponents
Lecture 23 / Slide 40 Copyright © 2004
Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke
Functions of Data Administration
Data Policies– Establish organization-wide data policy, e.g., security,
data proponency, and distribution
Forum for Data Conflict Resolution– Establish procedures for reporting conflicts– Provide means for hearing all perspectives and views– Have authority to make decision to resolve conflict
Return on Organization's Data Investment– Focus attention on value of data investment– Investigate new methodologies and technologies– Take proactive attitude toward information management