13 chapter 13 the data warehouse hachim haddouti

32
13 Chapter 13 The Data Warehouse Hachim Haddouti

Post on 21-Dec-2015

222 views

Category:

Documents


2 download

TRANSCRIPT

13

Chapter 13

The Data Warehouse

Hachim Haddouti

2

13

Hachim Haddouti and Rob & Coronel, Ch13

In this chapter, you will learn:

• How operational data and decision support differ• What a data warehouse is and how its data are

prepared• What star schemas are and how they are

constructed• ROLAP, MOLAP• What data mining is and what role it plays in

decision support

3

13

Hachim Haddouti and Rob & Coronel, Ch13

• External and internal forces require tactical and strategic decisions

• Search for competitive advantage• Business environments are dynamic• Decision-making cycle time is reduced • Different managers require different decision

support systems (DSS)

The Need for Data Analysis

4

13

Hachim Haddouti and Rob & Coronel, Ch13

• Decision Support– Is a methodology

– Extracts information from data

– Uses information as basis for decision making

Decision Support Systems

5

13

Hachim Haddouti and Rob & Coronel, Ch13

Decision Support Systems

• Decision support system (DSS) – Arrangement of computerized tools

– Used to assist managerial decision

– Extensive data “massaging” to produce information

– Used at all levels in organization

– Tailored to focus on specific areas and needs

– Interactive

– Provides ad hoc query tools

6

13

Hachim Haddouti and Rob & Coronel, Ch13

DSS Components

7

13

Hachim Haddouti and Rob & Coronel, Ch13

• Operational data – Relational, normalized database – Optimized to support transactions – Real time updates

• DSS – Snapshot of operational data– Summarized – Large amounts of data

• Data analyst viewpoint– Timespan– Granularity– Dimensionality

Operational vs. Decision Support Data

13

8 Hachim Haddouti and Rob & Coronel, Ch13

MIS (=Manage-ment Informa-tionssystem)

MAIS (=Marke-ting Informations-system)

60' 70' 80', Begin 90' Mid 90'

DSS (=DecisionSupport System)

EIS (=ExecutiveInformation System)

Data-Ware-housesystemEIS (=Enter-prise IntelligenceSystem)IDF (=Informa-tion Delivery Facility)InformationWarehouseEIS (=Enter-prise Information System)

Unchanged Vision: right informationto the right time and place

History

9

13

Hachim Haddouti and Rob & Coronel, Ch13

• Integrated– Centralized– Holds data retrieved from entire organization

• Subject-Oriented

– Optimized to give answers to diverse questions– Used by all functional areas

• Time Variant – Flow of data through time– Projected data

• Non-Volatile

– Data never removed– Always growing

Data Warehouse

10

13

Hachim Haddouti and Rob & Coronel, Ch13

Creating a Data Warehouse

11

13

Hachim Haddouti and Rob & Coronel, Ch13

?

Purchase Storage Personnel FinancialSales

Customer Supllier Market competition

Internal Information Sources

External information sources

Data Warehouse

Ana

lyze

s, T

rend

s

Data Warehouse Shape

12

13

Hachim Haddouti and Rob & Coronel, Ch13

• Single-subject data warehouse subset• Decision support to small group• Can be test for exploring potential benefits of

Data warehouses• Address local or departmental problems

Data Marts

13

13

Hachim Haddouti and Rob & Coronel, Ch13

1. Separated from operational environment2. Data are integrated3. Contains historical data over long time horizon4. Snapshot data captured at given time5. Subject-oriented data6. Mainly read-only data with periodic batch

updates from operational source, no online updates

7. Development life cycle differs from classical one, data driven not process driven

Twelve Data Warehouse Rules

14

13

Hachim Haddouti and Rob & Coronel, Ch13

8. Contains different levels of data detail – Current and old detail– Lightly and highly summarized

9. Characterized by read-only transactions to large data sets

10. Environment has system to trace data resources, transformation, and storage

11. Metadata critical components – Identify and define data elements– Provide the source, transformation, integration, storage,

usage, relationships, and history of data elements

12. Contains charge-back mechanism for usage– Enforces optimal use of data

Twelve Data Warehouse Rules (Con’t.)

15

13

Hachim Haddouti and Rob & Coronel, Ch13

• Advanced data analysis environment• Supports decision making, business modeling,

and operations research activities • Characteristics of OLAP

– Use multidimensional data analysis techniques

– Provide advanced database support

– Provide easy-to-use end-user interfaces

– Support client/server architecture

Online Analytical Processing (OLAP)

16

13

Hachim Haddouti and Rob & Coronel, Ch13

OLAP Client/Server Architecture

17

13

Hachim Haddouti and Rob & Coronel, Ch13

OLAP Server Arrangement

18

13

Hachim Haddouti and Rob & Coronel, Ch13

OLAP Server with Multidimensional Data Store Arrangement

19

13

Hachim Haddouti and Rob & Coronel, Ch13

OLAP Server with Local Mini-Data-Marts

20

13

Hachim Haddouti and Rob & Coronel, Ch13

• OLAP functionality • Uses relational DB query tools• Extensions to RDBMS

– Multidimensional data schema support

– Data access language and query performance optimized for multidimensional data

– Support for very large databases (VLDBs)

Relational OLAP (ROLAP)

21

13

Hachim Haddouti and Rob & Coronel, Ch13

Typical ROLAP Client/Server Architecture

22

13

Hachim Haddouti and Rob & Coronel, Ch13

• OLAP functionality to multidimensional databases (MDBMS)

• Stored data in multidimensional data cube• N-dimensional cubes called hypercubes• Cube cache memory speeds processing• Affected by how the database system

handles density of data cube called sparsity

Multidimensional OLAP (MOLAP)

23

13

Hachim Haddouti and Rob & Coronel, Ch13

MOLAP Client/Server Architecture

24

13

Hachim Haddouti and Rob & Coronel, Ch13

• Data-modeling technique • Maps multidimensional decision support into

relational database• Yield model for multidimensional data analysis while

preserving relational structure of operational DB• Four Components:

– Facts– Dimensions– Attributes– Attribute hierarchies

Star Schema

25

13

Hachim Haddouti and Rob & Coronel, Ch13

Simple Star Schema

Figure 13.12

26

13

Hachim Haddouti and Rob & Coronel, Ch13

Slice and Dice View of Sales

27

13

Hachim Haddouti and Rob & Coronel, Ch13

• Facts and dimensions represented by physical tables in data warehouse DB

• Fact table related to each dimension table (M:1)• Fact and dimension tables related by foreign keys • Subject to the primary/foreign key constraints

Star Schema Representation

28

13

Hachim Haddouti and Rob & Coronel, Ch13

Star Schema for Sales

29

13

Hachim Haddouti and Rob & Coronel, Ch13

• Seeks to discover unknown data characteristics • Automatically searches data for anomalies and

relationships• Data mining tools

– Analyze data

– Uncover problems or opportunities

– Form computer models based on findings

– Predict business behavior with models

– Require minimal end-user intervention

Data Mining

30

13

Hachim Haddouti and Rob & Coronel, Ch13

Extraction of Knowledge from Data

31

13

Hachim Haddouti and Rob & Coronel, Ch13

Example 1

32

13

Hachim Haddouti and Rob & Coronel, Ch13

Example 2