university of manitoba asper school of business 3500 dbms bob travica business analytics and...

22
University of Manitoba Asper School of Business 3500 DBMS Bob Travica Business Analytics and Decision Making OLTP, OLAP & SAP Chapter 9 & SAP Materials Updated 2015

Upload: shanon-willis

Post on 27-Dec-2015

217 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: University of Manitoba Asper School of Business 3500 DBMS Bob Travica Business Analytics and Decision Making OLTP, OLAP & SAP Chapter 9 & SAP Materials

University of ManitobaAsper School of Business

3500 DBMSBob Travica

Business Analytics and Decision Making

OLTP, OLAP & SAP

Chapter 9 & SAP Materials

Updated 2015

Page 2: University of Manitoba Asper School of Business 3500 DBMS Bob Travica Business Analytics and Decision Making OLTP, OLAP & SAP Chapter 9 & SAP Materials

DDBB

SSYYSSTTEEMMSS

2 of 20

OLTP vs. OLAP

Online Transaction Processing (OLTP) = relational database systems

Online Analytical Processing (OLAP)

Page 3: University of Manitoba Asper School of Business 3500 DBMS Bob Travica Business Analytics and Decision Making OLTP, OLAP & SAP Chapter 9 & SAP Materials

DDBB

SSYYSSTTEEMMSS

3 of 20

OLAP via Data Warehousing

Online TransactionProcessing (OLTP):Querying Databaseswith 3NF tables

Operations’data

Predefinedreports

Online Analytical Processing (OLAP);Data warehousing;Data Mining.Usually de-normalized data.

Periodicaltransfers

Interactivedata analysis

Flat files

MIS 3500

Page 4: University of Manitoba Asper School of Business 3500 DBMS Bob Travica Business Analytics and Decision Making OLTP, OLAP & SAP Chapter 9 & SAP Materials

DDBB

SSYYSSTTEEMMSS

OLTP & OLAP in Enterprise Systems

Enterprise Systems (Enterprise Resource Planning Systems) support both.

Example: An SAP-based system can be a TPS, MIS and DSS for the entire organization. DSS capability draws on data warehousing & cubing.

Process approach to organization with data flowing smoothly end-to-end. Process link up horizontally (department-to-department) and vertically (process-sub-process).

Business process* is for the most part the system process.

More...

4 of 20

Page 5: University of Manitoba Asper School of Business 3500 DBMS Bob Travica Business Analytics and Decision Making OLTP, OLAP & SAP Chapter 9 & SAP Materials

DDBB

SSYYSSTTEEMMSS

5 of 20

Date Warehousing Goals

Data warehouse (DW) yields multidimensional view of data (cubing):

Integrate data from different sources to get a larger picture of

business Data aggregations (summaries on different dimensions)

DW expands scope of data analysis: Statistical analysis (test hypotheses on relationships between

pieces of data) Discover new relationships by querying cubes or applying data

mining software.

Page 6: University of Manitoba Asper School of Business 3500 DBMS Bob Travica Business Analytics and Decision Making OLTP, OLAP & SAP Chapter 9 & SAP Materials

DDBB

SSYYSSTTEEMMSS

6 of 20

Extraction, Transformation, and Loading

Data warehouse:All data must be consistent.

Customers

Convert “Client” to “Customer”

Apply standard product numbers

Convert currencies

Fix region codes

Transaction data from diverse systems.

• Preparations performed on data – ETL process

Extract

Transform (Transport)

Load

Page 7: University of Manitoba Asper School of Business 3500 DBMS Bob Travica Business Analytics and Decision Making OLTP, OLAP & SAP Chapter 9 & SAP Materials

DDBB

SSYYSSTTEEMMSS

7 of 20

Three-Dimensional View of Data: Cube

Product

Sales atLocation

Sales D

ate

Logic similar tocrosstab query And pivot table.

• Created in a datawarehouse

P5

P1 P2P4P3

Page 8: University of Manitoba Asper School of Business 3500 DBMS Bob Travica Business Analytics and Decision Making OLTP, OLAP & SAP Chapter 9 & SAP Materials

DDBB

SSYYSSTTEEMMSS

8 of 20

Data Hierarchy

Year

Quarter

Month

Week

Day

LevelsRoll-upTo get higher-level totals

Drill-downTo get lower-level details

Page 9: University of Manitoba Asper School of Business 3500 DBMS Bob Travica Business Analytics and Decision Making OLTP, OLAP & SAP Chapter 9 & SAP Materials

DDBB

SSYYSSTTEEMMSS

LocationLocationID

Detail

9 of 20

Datawarehouse Tables: Star Design

Revenue=Price*Quantity

Fact Table *

SaleSaleDateQuantityDiscount

Dimension

Inputted from Product and Sale

Revenue broken down by product, sales location, and desired time period (time column/s – day of year, or evensmaller; basis for rollup).

ProductProductID

Price

Dimension

Dimension

Design is:- Hierarchical (dimension tables have no direct association)- De-normalized (fact table): Price & Quantity inputted to Fact table;

New keys usually used in the fact table (e.g., SaleTbl#-Row#).

SaleSaleDateQuantityDiscountStoreID

Calculatedfact

Page 10: University of Manitoba Asper School of Business 3500 DBMS Bob Travica Business Analytics and Decision Making OLTP, OLAP & SAP Chapter 9 & SAP Materials

DDBB

SSYYSSTTEEMMSS

10 of 20

Datawarehouse Tables: Snowflake Design

MerchTblRowSaleTblRowPriceQuantity

OLAPItems

ItemIDDescriptionPriceCategory

Product

SaleIDSaleDateCustomerID DiscountSalesTax

Sale

CustomerIDPhoneFirstNameLastNameAddressZipCodeCityID

Customer

CityIDZipCodeCityState

City

Design is:-Network-like (dimension tables can connect directly)-Still partly normalized (Sale-Customer-City)

Fact Table Dimension Tables

• Advantage: Design simplified in the Fact table part.

Page 11: University of Manitoba Asper School of Business 3500 DBMS Bob Travica Business Analytics and Decision Making OLTP, OLAP & SAP Chapter 9 & SAP Materials

DDBB

SSYYSSTTEEMMSS

SAP Datawarehouse

11 of 20More on SAP Datawarehouse

Datawarehouse Cube Details

Can also be Dimensions

Page 12: University of Manitoba Asper School of Business 3500 DBMS Bob Travica Business Analytics and Decision Making OLTP, OLAP & SAP Chapter 9 & SAP Materials

DDBB

SSYYSSTTEEMMSS

12 of 20

Multidimensional View of Data – Precursors to DW: Excel Pivot Table

Can place data in rows or columns.By grouping months, can instantly get quarterly or monthly totals.

Quarter MonthQuarter 1 Quarter 2 Quarter 3 Quarter 4 Grand Total

LastName EmployeeIDDataCarpenter 8 Sum of Animal 1,668.91 606.97 426.39 7.20 2,709.47

Sum of Merchandise 324.90 78.30 99.00 128.70 630.90Eaton 6 Sum of Animal 522.37 341.85 562.50 1,426.72

Sum of Merchandise 30.60 54.90 107.10 192.60Farris 7 Sum of Animal 5,043.36 1,059.70 796.47 6,899.53

Sum of Merchandise 826.92 188.10 306.00 1,321.02Gibson 2 Sum of Animal 4,983.51 1,549.83 2,556.10 9,089.44

Sum of Merchandise 668.25 238.50 450.90 1,357.65Hopkins 4 Sum of Animal 3,747.96 1,194.88 372.65 128.41 5,443.90

Sum of Merchandise 476.91 252.90 121.50 7.20 858.51James 5 Sum of Animal 3,282.77 2,373.08 437.88 150.11 6,243.84

Sum of Merchandise 505.89 693.45 99.00 99.00 1,397.34O'Connor 9 Sum of Animal 2,643.69 180.91 510.12 3,334.72

Sum of Merchandise 263.70 83.70 55.80 403.20Reasoner 3 Sum of Animal 4,577.43 625.74 589.68 2,500.24 8,293.09

Sum of Merchandise 762.30 89.10 116.80 396.90 1,365.10Reeves 1 Sum of Animal 1,120.93 1,120.93

Sum of Merchandise 263.88 263.88Shields 10 Sum of Animal 1,008.76 162.15 1,170.91

Sum of Merchandise 62.10 22.50 84.60Total Sum of Animal 28,599.69 7,591.11 2,840.72 6,701.03 45,732.55Total Sum of Merchandise 4,185.45 1,624.05 569.50 1,495.80 7,874.80

Time hierarchy“Dimensions”

“Facts”(Measures)

Page 13: University of Manitoba Asper School of Business 3500 DBMS Bob Travica Business Analytics and Decision Making OLTP, OLAP & SAP Chapter 9 & SAP Materials

DDBB

SSYYSSTTEEMMSS

13 of 20

Multidimensional View of Data: CUBE Option in SQL 99

Bird 1 135.00 0 0Bird 2 45.00 0 0…Bird (null) 32.00 0 0Bird (null) 607.50 1 0Cat 1 396.00 0 0Cat 2 113.85 0 0…Cat (null) 1293.30 1 0(null) 1 1358.8 0 1(null) 2 1508.94 0 1(null) 3 2362.68 0 1…(null) (null) 8451.79 1 1

Category Month Amount Gc Gm

SELECT Category, Month, Sum, GROUPING (Category) AS Gc, GROUPING (Month) AS Gm

FROM …GROUP BY CUBE (Category, Month...)

Page 14: University of Manitoba Asper School of Business 3500 DBMS Bob Travica Business Analytics and Decision Making OLTP, OLAP & SAP Chapter 9 & SAP Materials

DDBB

SSYYSSTTEEMMSS

14 of 20

GROUPING SETS: Hiding Details

Bird (null) 607.50Cat (null) 1293.30…(null) 1 1358.8(null) 2 1508.94(null) 3 2362.68…(null) (null) 8451.79

Category Month Amount

SELECT Category, Month, SumFROM …GROUP BY GROUPING SETS ( ROLLUP (Category),

ROLLUP (Month),( ) )

Page 15: University of Manitoba Asper School of Business 3500 DBMS Bob Travica Business Analytics and Decision Making OLTP, OLAP & SAP Chapter 9 & SAP Materials

DDBB

SSYYSSTTEEMMSS

15 of 20

SQL RANK FunctionsSELECT Employee, SalesValue RANK() OVER (ORDER BY SalesValue DESC) AS rankDENSE_RANK() OVER (ORDER BY SalesValue DESC) AS denseFROM SalesORDER BY SalesValue DESC, Employee;

Employee SalesValue rank dense

Jones 18,000 1 1

Smith 16,000 2 2

Black 16,000 2 2

White 14,000 4 3

DENSE_RANK does not skip numbers

• Therefore, advances in SQL motivate DBMS vendors to support OLAP and data warehousing.

Page 16: University of Manitoba Asper School of Business 3500 DBMS Bob Travica Business Analytics and Decision Making OLTP, OLAP & SAP Chapter 9 & SAP Materials

DDBB

SSYYSSTTEEMMSS

16 of 20

Broader Data Analysis withData Mining

Goal: To discover unknown relationships in the data that can be used to make better decisions. Exploratory analysis. A bottom-up approach that scans the data to find relationships Some statistical routines, but they are not sufficient

Statistics relies on averages

Sometimes the important data lies in more detailed pairs

Supervised by developer vs. unsupervised (self-organizing artificial neural networks)

Page 17: University of Manitoba Asper School of Business 3500 DBMS Bob Travica Business Analytics and Decision Making OLTP, OLAP & SAP Chapter 9 & SAP Materials

DDBB

SSYYSSTTEEMMSS

17 of 20

Common Techniques

1. Classification/Prediction

2. Association Rules/Market Basket Analysis

3. Clustering

Page 18: University of Manitoba Asper School of Business 3500 DBMS Bob Travica Business Analytics and Decision Making OLTP, OLAP & SAP Chapter 9 & SAP Materials

DDBB

SSYYSSTTEEMMSS

18 of 20

1. Classification(Prediction)

Purpose: “Classify” things that are causes and those that are effects.

Examples Which borrowers/loans are most likely to be successful?

Which customers are most likely to want a new item?

Which companies are likely to file bankruptcy?

Which workers are likely to quit in the next six months?

Which startup companies are likely to succeed?

Which tax returns are fraudulent?

Page 19: University of Manitoba Asper School of Business 3500 DBMS Bob Travica Business Analytics and Decision Making OLTP, OLAP & SAP Chapter 9 & SAP Materials

DDBB

SSYYSSTTEEMMSS

19 of 20

Classification Process

Clearly identify the outcome/dependent variable. Identify potential variables that might affect the outcome. Use sample data to test and validate the model. Regression/correlation analysis, decision tables and trees,

etc.

Income Credit History Job Stability Credit Success

50000 Good Good Yes

75000 Mixed Bad No

Page 20: University of Manitoba Asper School of Business 3500 DBMS Bob Travica Business Analytics and Decision Making OLTP, OLAP & SAP Chapter 9 & SAP Materials

DDBB

SSYYSSTTEEMMSS

20 of 20

2. Association/Market Basket

Purpose: Determine what events or items go together/co-occur.

Examples: What items are customers likely to buy together?

(Business use: Consider putting the two together to

increase cross-selling.)

Page 21: University of Manitoba Asper School of Business 3500 DBMS Bob Travica Business Analytics and Decision Making OLTP, OLAP & SAP Chapter 9 & SAP Materials

DDBB

SSYYSSTTEEMMSS

21 of 20

Association Challenges

If an item is rarely purchased, any other item bought with it seems important. So combine items into categories.

Some relationships are obvious. Burger and fries.

Some relationships are puzzling/meaningless. Hardware store found that toilet rings sell well only when a new

store first opened. But what does it mean?

Page 22: University of Manitoba Asper School of Business 3500 DBMS Bob Travica Business Analytics and Decision Making OLTP, OLAP & SAP Chapter 9 & SAP Materials

DDBB

SSYYSSTTEEMMSS

22 of 20

3. Cluster Analysis Purpose: Determine groups of people or some entities. Examples

Are there groups of customers? (If so, we could target them; market segmentation)

Do the locations for our stores have elements in common? (If so, we can search for similar clusters for new locations.)

Do employees have common characteristics? (If so, we can hire similar, or dissimilar, people.)

Small intra-cluster distance

Large inter-cluster distance