business intelligence reminds on data warehousing

17
BUSINESS INTELLIGENCE Reminds on Data Warehousing (details at the Decision Support Database course) Data Sciencce & Business Informatics Degree

Upload: others

Post on 16-May-2022

6 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: BUSINESS INTELLIGENCE Reminds on Data Warehousing

BUSINESS INTELLIGENCE

Reminds on Data Warehousing

(details at the Decision Support Database course)

Data Sciencce & Business Informatics Degree

Page 2: BUSINESS INTELLIGENCE Reminds on Data Warehousing

BI Architecture2

Laboratory of Data Science

Page 3: BUSINESS INTELLIGENCE Reminds on Data Warehousing

Star-schema datawarehouse

A fact table with star-schema dimension tables only

3

time_key

day

day_of_the_week

month

quarter

year

time

branch_key

street

city

province

country

branch

Sales Fact Table

time_key

item_key

employee_key

branch_key

units_sold

dollars_sold

dollars_costMeasures

item_key

item_name

brand

type

supplier_type

item

employee_key

employee_name

employee_type

employee

Laboratory of Data Science

Page 4: BUSINESS INTELLIGENCE Reminds on Data Warehousing

Snowflake-schema datawarehouse

A fact table with star-schema, snowflake and parent-child dimension tables

4

time_key

day

day_of_the_week

month

quarter

year

time

branch_key

street

city_key

branch

Sales Fact Table

time_key

item_key

employee_key

branch_key

units_sold

dollars_sold

dollars_costMeasures

item_key

item_name

brand

type

supplier_key

item

supplier_key

supplier_type

supplier

city_key

city

province

country

cityemp_key

emp_name

emp_boss_key

employees

Laboratory of Data Science

Page 5: BUSINESS INTELLIGENCE Reminds on Data Warehousing

Which DBMS technology for DW?

Storage technology

Architecture

5

Laboratory of Data Science

Page 6: BUSINESS INTELLIGENCE Reminds on Data Warehousing

Storage

RDBMS: record oriented structure

Laboratory of Data Science

6

Columnar: column oriented structure

Advantages:• Faster Scan

• Data Compression (e.g. State)

Page 7: BUSINESS INTELLIGENCE Reminds on Data Warehousing

Storage

Correlation value-based database

Data cells contain the index to an order-set value

In-memory database

Data is stored in compressed format in main memory

Extraction-based system

Storage of attribute extracted from continuous data

flows (eg., web traffic, sensors)

...

Laboratory of Data Science

7

Page 8: BUSINESS INTELLIGENCE Reminds on Data Warehousing

Architecture

Sequential

SQL query processing by a single

processor

Parallel

SQL query plan processing by a

multi-processor machine, with shared

memory

Distributed (Map-reduce)

SQL query processing distributed to

a set of independent machines

◼ Teradata SQL-MR, Hadoop HiveQL

Laboratory of Data Science

8

Page 9: BUSINESS INTELLIGENCE Reminds on Data Warehousing

BI Architecture10

Laboratory of Data Science

Page 10: BUSINESS INTELLIGENCE Reminds on Data Warehousing

K-dimensional cuboid11

Milk Bread … ... Pasta

Jan 13

… ...

Dec 13

Feb 13

Product.Category

Store.City

Time.Month

OrangeRome

PiseFlorence

An hyper-cube with K axes, with a level of some hierarchy at each axis. A

cell of the cuboid contains the values of metrics for the conditions given by

the cell coordinates.

Lucca

Laboratory of Data Science

Page 11: BUSINESS INTELLIGENCE Reminds on Data Warehousing

Cube navigation by different users12

product

Branch manager look at sales

of his/her stores

for any product and any period

Product managers look at sales

of some products

in any period and in any market

Finance manager look at sales

of a period compared to the previous period

for any product and any market

tim

e

Laboratory of Data Science

Page 12: BUSINESS INTELLIGENCE Reminds on Data Warehousing

Cuboids in SQL

13

SELECT L.city, I.brand, T.month, SUM(dollars_sold)FROM fact AS F, location AS L, time AS T, item AS IWHERE F.location_key = L.location_key AND

F.time_key = T.time_key AND F.item_key = I.item_key

GROUP BY L.city, I.brand, T.month

MeasureAggregate

Star-Join

Hierarchy levels

Order or

pivoting

Laboratory of Data Science

Page 13: BUSINESS INTELLIGENCE Reminds on Data Warehousing

Roll-up

How many cuboids?14

Product

Time

All

Time

Time

Product

All

All

Drill-Down

Roll-up

Drill-Down

Drill-Down

Roll-up

Laboratory of Data Science

Page 14: BUSINESS INTELLIGENCE Reminds on Data Warehousing

15

Data Cube (extended cube, hypercube)

Total annual sales

of TVs in U.S.A.Date

Cou

ntr

y

*

*TV

VCRPC

1Qtr 2Qtr 3Qtr 4Qtr

U.S.A

Canada

Mexico

*

Laboratory of Data Science

Page 15: BUSINESS INTELLIGENCE Reminds on Data Warehousing

Data cube in SQL Server16

SELECT L.city, I.brand, T.month, SUM(dollars_sold)FROM fact AS F, location AS L, time AS T, item AS IWHERE F.location_key = L.location_key AND

F.time_key = T.time_key AND F.item_key = I.item_key

GROUP BY CUBE(L.city, I.brand, T.month)

MeasureAggregate

Star-Join

Hierarchy levels

Order or

pivoting

GROUP BY ROLLUP(L.city, I.brand, T.month) - all initial subsequences of the group-by attributes

Laboratory of Data Science

Page 16: BUSINESS INTELLIGENCE Reminds on Data Warehousing

Slice and Dice17

Product

Time

Slice

Product

Time

Product =A,B,C

Laboratory of Data Science

Page 17: BUSINESS INTELLIGENCE Reminds on Data Warehousing

Slice in SQL Server18

SELECT L.city, I.brand, T.month, SUM(dollars_sold)FROM fact AS F, location AS L, time AS T, item AS IWHERE F.location_key = L.location_key AND

F.time_key = T.time_key AND F.item_key = I.item_key AND

T.year = 2016GROUP BY CUBE(L.city, I.brand, T.month)

MeasureAggregate

Star-Join

Hierarchy levels

Order or

pivoting

Slice

Laboratory of Data Science