![Page 1: Online Analytical Processing (OLAP) An Overview Kian Win Ong, Nicola Onose Mar 3 rd 2006](https://reader036.vdocuments.us/reader036/viewer/2022062409/5697c0251a28abf838cd5443/html5/thumbnails/1.jpg)
Online Analytical Processing (OLAP)
An Overview
Kian Win Ong, Nicola OnoseMar 3rd 2006
![Page 2: Online Analytical Processing (OLAP) An Overview Kian Win Ong, Nicola Onose Mar 3 rd 2006](https://reader036.vdocuments.us/reader036/viewer/2022062409/5697c0251a28abf838cd5443/html5/thumbnails/2.jpg)
Overview• Motivation
• Multi-Dimensional Data Model
• Research Areas
• Optimizations– Materializing multiple aggregates simultaneously– Materialization strategy
![Page 3: Online Analytical Processing (OLAP) An Overview Kian Win Ong, Nicola Onose Mar 3 rd 2006](https://reader036.vdocuments.us/reader036/viewer/2022062409/5697c0251a28abf838cd5443/html5/thumbnails/3.jpg)
Motivation• Aggregation, summarization and exploration• Of historical data• To help management make informed decisions
![Page 4: Online Analytical Processing (OLAP) An Overview Kian Win Ong, Nicola Onose Mar 3 rd 2006](https://reader036.vdocuments.us/reader036/viewer/2022062409/5697c0251a28abf838cd5443/html5/thumbnails/4.jpg)
Different Goal• Aggregation, summarization and exploration• Of historical data• To help management make informed decisions
Product Branch Time Price
Coke (0.5 gallon) Convoy Street 2006-03-01 09:00:01 $1.00
Pepsi (0.5 gallon) UTC 2006-03-01 09:00:01 $1.03
Coke (1 gallon) UTC 2006-03-01 09:00:02 $1.50
Altoids Costa Verde 2006-03-01 09:01:33 $0.30
...
• Find the total sales for each product and month• Find the percentage change in the total monthly
sales for each product
![Page 5: Online Analytical Processing (OLAP) An Overview Kian Win Ong, Nicola Onose Mar 3 rd 2006](https://reader036.vdocuments.us/reader036/viewer/2022062409/5697c0251a28abf838cd5443/html5/thumbnails/5.jpg)
Different Requirements
OLTP OLAP
Tasks Day to day operation High level decision support
Size of database Gigabytes Terabytes
Time span Recent, up-to-date Spanning over months / years
Size of working set Tens of records, accessed through primary keys
Consolidated data from multiple databases
Workload Structured / repetitive Ad-hoc, exploratory queries
Performance Transaction throughput Query latency
• OLTP – On-Line Transaction Processing• OLAP – On-Line Analytical Processing
![Page 6: Online Analytical Processing (OLAP) An Overview Kian Win Ong, Nicola Onose Mar 3 rd 2006](https://reader036.vdocuments.us/reader036/viewer/2022062409/5697c0251a28abf838cd5443/html5/thumbnails/6.jpg)
Overview• Motivation
• Multi-Dimensional Data Model
• Research Areas
• Optimizations– Materializing multiple aggregates simultaneously– Materialization strategy
![Page 7: Online Analytical Processing (OLAP) An Overview Kian Win Ong, Nicola Onose Mar 3 rd 2006](https://reader036.vdocuments.us/reader036/viewer/2022062409/5697c0251a28abf838cd5443/html5/thumbnails/7.jpg)
Query Language Extensions
• In the real world, data is stored in RDBs.
![Page 8: Online Analytical Processing (OLAP) An Overview Kian Win Ong, Nicola Onose Mar 3 rd 2006](https://reader036.vdocuments.us/reader036/viewer/2022062409/5697c0251a28abf838cd5443/html5/thumbnails/8.jpg)
Query Language Extensions
• In the real world, data is stored in RDBs.
• How to express N-dimensional problems using 2D tables?
![Page 9: Online Analytical Processing (OLAP) An Overview Kian Win Ong, Nicola Onose Mar 3 rd 2006](https://reader036.vdocuments.us/reader036/viewer/2022062409/5697c0251a28abf838cd5443/html5/thumbnails/9.jpg)
Query Language Extensions
• In the real world, data is stored in RDBs.
• How to express N-dimensional problems using 2D tables?
• Can we combine OLAP and SQL queries?
• Jim Gray et al:Data Cube: A Relational Aggregation Operator1997
![Page 10: Online Analytical Processing (OLAP) An Overview Kian Win Ong, Nicola Onose Mar 3 rd 2006](https://reader036.vdocuments.us/reader036/viewer/2022062409/5697c0251a28abf838cd5443/html5/thumbnails/10.jpg)
Query Language Extensions
1.histograms
Problems with GROUP BY
SELECT sales, prod_name, population FROM sales_history GROUP BY Population(City, State) as population
![Page 11: Online Analytical Processing (OLAP) An Overview Kian Win Ong, Nicola Onose Mar 3 rd 2006](https://reader036.vdocuments.us/reader036/viewer/2022062409/5697c0251a28abf838cd5443/html5/thumbnails/11.jpg)
Query Language Extensions
1.histograms
2.rollup/drilldown
Problems with GROUP BY
ProductCategory
Product Name
Month Sales Sales by Cat.,
by Name
Salesby Cat.
Drinks Coke Feb 30.3
Mar 93.9 124.2
Heineken Feb 34.8
Mar 123.8 158.6 282.8
non relational representation
![Page 12: Online Analytical Processing (OLAP) An Overview Kian Win Ong, Nicola Onose Mar 3 rd 2006](https://reader036.vdocuments.us/reader036/viewer/2022062409/5697c0251a28abf838cd5443/html5/thumbnails/12.jpg)
Query Language Extensions
1.histograms
2.rollup/drilldown
Problems with GROUP BY
Product Category
Product Name
Month Sales Sales by Cat.,
by Name
Salesby Cat.
Drinks Coke Feb 30.3 124.2 282.8
Drinks Coke Mar 93.9 124.2 282.8
Drinks Heineken Feb 34.8 158.6 282.8
Drinks Heineken Mar 123.8 158.6 282.8
relational, but the rollup is huge
![Page 13: Online Analytical Processing (OLAP) An Overview Kian Win Ong, Nicola Onose Mar 3 rd 2006](https://reader036.vdocuments.us/reader036/viewer/2022062409/5697c0251a28abf838cd5443/html5/thumbnails/13.jpg)
Query Language Extensions
1.histograms
2.rollup/drilldown
3.cross tabulations
Problems with GROUP BY
2-D aggregation is more compact and more natural:
Drinks Feb Mar Total
Coke 30.3 93.9 124.2
Heineken 34.8 123.8 158.6
Total 65.1 217.7 282.8
![Page 14: Online Analytical Processing (OLAP) An Overview Kian Win Ong, Nicola Onose Mar 3 rd 2006](https://reader036.vdocuments.us/reader036/viewer/2022062409/5697c0251a28abf838cd5443/html5/thumbnails/14.jpg)
Query Language ExtensionsReducing the number of attributes
Product Category
Product Name
Month Sales
Drinks Coke Feb 30.3
Drinks Coke Mar 93.9
Drinks Coke ALL 124.2
Drinks Heineken Feb 34.8
Drinks Heineken Mar 123.8
Drinks Heineken ALL 158.6
Drinks ALL ALL 282.8
Drinks ALL Feb 65.1
Drinks ALL Mar 217.7
![Page 15: Online Analytical Processing (OLAP) An Overview Kian Win Ong, Nicola Onose Mar 3 rd 2006](https://reader036.vdocuments.us/reader036/viewer/2022062409/5697c0251a28abf838cd5443/html5/thumbnails/15.jpg)
Query Language Extensions
• introduce a new value: “ALL”
Reducing the number of attributes
“ALL” = the set over which we aggregate
Drinks Feb Mar Total (ALL)
Coke 30.3 93.9 124.2
Heineken 34.8 123.8 158.6
Total (ALL) 65.1 217.7 282.8
![Page 16: Online Analytical Processing (OLAP) An Overview Kian Win Ong, Nicola Onose Mar 3 rd 2006](https://reader036.vdocuments.us/reader036/viewer/2022062409/5697c0251a28abf838cd5443/html5/thumbnails/16.jpg)
Query Language Extensions
• GROUP BY (1D)
General approach
Sales by Product Name
Feb Mar
Coke 30.3 93.9
Heineken 34.8 123.8
SUM 65.1 217.7
![Page 17: Online Analytical Processing (OLAP) An Overview Kian Win Ong, Nicola Onose Mar 3 rd 2006](https://reader036.vdocuments.us/reader036/viewer/2022062409/5697c0251a28abf838cd5443/html5/thumbnails/17.jpg)
Query Language Extensions
• GROUP BY (1D)
• Cross Tab (2D)
General approach
Drinks Feb Mar ALL
Coke 30.3 93.9 124.2
Heineken 34.8 123.8 158.6
ALL 65.1 217.7 282.8
Product Category
Product Name
Month Sales
Drinks Coke Feb 30.3
Drinks Coke Mar 93.9
Drinks Coke ALL 124.2
Drinks Heineken Feb 34.8
Drinks Heineken Mar 123.8
Drinks Heineken ALL 158.6
Drinks ALL Feb 65.1
Drinks ALL Mar 217.7
Drinks ALL ALL 282.8
the corresponding relation:
![Page 18: Online Analytical Processing (OLAP) An Overview Kian Win Ong, Nicola Onose Mar 3 rd 2006](https://reader036.vdocuments.us/reader036/viewer/2022062409/5697c0251a28abf838cd5443/html5/thumbnails/18.jpg)
Query Language Extensions
• GROUP BY (1D)
• Cross Tab (2D)
• Cube (3D)
General approachProduct
CategoryProduct Name
Month Sales
Drinks Coke Feb 30.3
Drinks Coke Mar 93.9
Drinks Coke ALL 124.2
… … … …
Snacks Doritos Feb 123.8
Snacks Doritos Mar 158.6
Snacks Doritos ALL 65.1
… … … …
ALL ALL ALL 964.0
By cat.andmonth
By cat. andname (does it make sense?)
By monthand name
![Page 19: Online Analytical Processing (OLAP) An Overview Kian Win Ong, Nicola Onose Mar 3 rd 2006](https://reader036.vdocuments.us/reader036/viewer/2022062409/5697c0251a28abf838cd5443/html5/thumbnails/19.jpg)
Query Language Extensions
• GROUP BY (1D)
• Cross Tab (2D)
• Cube (3D)
• Any hypercube can be represented as a relation!
General approach
![Page 20: Online Analytical Processing (OLAP) An Overview Kian Win Ong, Nicola Onose Mar 3 rd 2006](https://reader036.vdocuments.us/reader036/viewer/2022062409/5697c0251a28abf838cd5443/html5/thumbnails/20.jpg)
Query Language Extensions
• a CUBE relation, with aggregation function f(.)(x1, x2, …, xn-1, xn, f() )
……………………………
(x1, xn-1, …, xn, ALL, f() )
……………………………
(x1, x2, …, ALL, xn, f() )
……………………………
• after ROLLUP , reduce to a linear # of tuples(x1, x2, …, xn-1, xn, f() )
…………………………………
(x1, xn-1, …, xn, ALL, f() )
…………………………………
(x1, x2, …, ALL, ALL, f() )
…………………………………
(ALL, ALL, …, ALL, ALL, f() )
General approach
![Page 21: Online Analytical Processing (OLAP) An Overview Kian Win Ong, Nicola Onose Mar 3 rd 2006](https://reader036.vdocuments.us/reader036/viewer/2022062409/5697c0251a28abf838cd5443/html5/thumbnails/21.jpg)
Query Language ExtensionsThe new operators: CUBE, ROLLUP
SELECT prod_category, prod_name, month, SUM(sales) AS sales FROM sales_history GROUP BY CUBE prod_category, prod_name, month
Product Category
Product Name
Month Sales
Drinks Coke Feb 30.3
Drinks Coke Mar 93.9
Drinks Coke ALL 124.2
… … … …
Drinks ALL Feb 99.8
… … … …
ALL ALL ALL 964.0
Idea:Group by the CUBE list.Union the aggregates.Introduce the ALL values.
![Page 22: Online Analytical Processing (OLAP) An Overview Kian Win Ong, Nicola Onose Mar 3 rd 2006](https://reader036.vdocuments.us/reader036/viewer/2022062409/5697c0251a28abf838cd5443/html5/thumbnails/22.jpg)
Query Language ExtensionsThe new operators: CUBE, ROLLUP
SELECT prod_category, month, day, state, prod_name, SUM(sales) AS sales FROM sales_history GROUP BY prod_category ROLLUP month, day CUBE city, state
Product Category
Month Day State Product Name
Sales
Drinks Feb 26 CA Coke 12.3
Feb 26 CA Heineken 5.4
… … … … …
Feb 26 CA ALL 30.4
Feb 26 ALL Coke …
… … … …
Snacks Feb 26 CA Doritos 12.0
… … … …
![Page 23: Online Analytical Processing (OLAP) An Overview Kian Win Ong, Nicola Onose Mar 3 rd 2006](https://reader036.vdocuments.us/reader036/viewer/2022062409/5697c0251a28abf838cd5443/html5/thumbnails/23.jpg)
Overview• Motivation
• Multi-Dimensional Data Model
• Research Areas
• Optimizations– Materializing multiple aggregates simultaneously– Materialization strategy
![Page 24: Online Analytical Processing (OLAP) An Overview Kian Win Ong, Nicola Onose Mar 3 rd 2006](https://reader036.vdocuments.us/reader036/viewer/2022062409/5697c0251a28abf838cd5443/html5/thumbnails/24.jpg)
Research Areas• SQL language extensions
• Server architecture
• Parallel processing
• Index structures
• Materialized views
![Page 25: Online Analytical Processing (OLAP) An Overview Kian Win Ong, Nicola Onose Mar 3 rd 2006](https://reader036.vdocuments.us/reader036/viewer/2022062409/5697c0251a28abf838cd5443/html5/thumbnails/25.jpg)
Overview• Motivation
• Multi-Dimensional Data Model
• Research Areas
• Optimizations– Materializing multiple aggregates simultaneously– Materialization strategy
![Page 26: Online Analytical Processing (OLAP) An Overview Kian Win Ong, Nicola Onose Mar 3 rd 2006](https://reader036.vdocuments.us/reader036/viewer/2022062409/5697c0251a28abf838cd5443/html5/thumbnails/26.jpg)
Simultaneous
AggregatesMulti-Dimensional
• Optimization to calculate multiple aggregates simultaneously
• Useful for materialization of aggregate views
• Y. Zhao, P. Deshpande, J. NaughtonAn Array-Based Algorithm for Simultaneous Multidimensional AggregatesSIGMOD 1997
![Page 27: Online Analytical Processing (OLAP) An Overview Kian Win Ong, Nicola Onose Mar 3 rd 2006](https://reader036.vdocuments.us/reader036/viewer/2022062409/5697c0251a28abf838cd5443/html5/thumbnails/27.jpg)
Multiple Aggregates
Month /Product
Feb Mar Total
Altoids 36 131 167
Coke 37 138 175
Doritos 21 136 157
Heineken 44 110 154
Pepsi 31 122 153
Pringles 37 126 164
Total 206 764 970
Product City Month Sales
Coke San Diego Feb 06 12
Pepsi Los Angeles Feb 06 13
Doritos San Diego Mar 06 72
Altoids San Diego Mar 06 65
...
Aggregate on…
![Page 28: Online Analytical Processing (OLAP) An Overview Kian Win Ong, Nicola Onose Mar 3 rd 2006](https://reader036.vdocuments.us/reader036/viewer/2022062409/5697c0251a28abf838cd5443/html5/thumbnails/28.jpg)
Multiple Aggregates
City /Product
San Diego Los Angeles Total
Altoids 90 77 167
Coke 89 86 175
Doritos 74 83 157
Heineken 74 80 154
Pepsi 68 85 153
Pringles 73 90 164
Total 469 501 970
Month /City
Feb Mar Total
Los Angeles 112 358 469
San Diego 95 407 501
Total 206 764 970
Month /Product
Feb Mar Total
Altoids 36 131 167
Coke 37 138 175
Doritos 21 136 157
Heineken 44 110 154
Pepsi 31 122 153
Pringles 37 126 164
Total 206 764 970
Product City Month Sales
Coke San Diego Feb 06 12
Pepsi Los Angeles Feb 06 13
Doritos San Diego Mar 06 72
Altoids San Diego Mar 06 65
...
Aggregate on…
![Page 29: Online Analytical Processing (OLAP) An Overview Kian Win Ong, Nicola Onose Mar 3 rd 2006](https://reader036.vdocuments.us/reader036/viewer/2022062409/5697c0251a28abf838cd5443/html5/thumbnails/29.jpg)
Multiple Aggregates
Product City Month Sales
Coke San Diego Feb 06 12
Pepsi Los Angeles Feb 06 13
Doritos San Diego Mar 06 72
Altoids San Diego Mar 06 65
...
1. Sales by Product / City2. Sales by Product / Month3. Sales by Month / City4. Sales by Product5. Sales by City6. Sales by Month7. Sales (Total)
Is it possible to• make a single pass over the transactional table?• calculate multiple aggregates simultaneously?
Aggregate on…
![Page 30: Online Analytical Processing (OLAP) An Overview Kian Win Ong, Nicola Onose Mar 3 rd 2006](https://reader036.vdocuments.us/reader036/viewer/2022062409/5697c0251a28abf838cd5443/html5/thumbnails/30.jpg)
Chunking
1 2 3 4
5 6 7 8
9 10 11 12
13 14 15 16
64
20
36
42Dimension B
Dimension A
Dimension C
1
Product City Month Sales
Coke San Diego Feb 06 12
12
Array Chunk
Product
City
Month
Partition transactional data into array chunks
![Page 31: Online Analytical Processing (OLAP) An Overview Kian Win Ong, Nicola Onose Mar 3 rd 2006](https://reader036.vdocuments.us/reader036/viewer/2022062409/5697c0251a28abf838cd5443/html5/thumbnails/31.jpg)
Naïve Algorithm
1 2 3 4
5 6 7 8
9 10 11 12
13 14 15 16
64
20
36
42
Dimension A
Dimension C
Pivot on ABaggregate on all C
Dimension A
Dimension B
![Page 32: Online Analytical Processing (OLAP) An Overview Kian Win Ong, Nicola Onose Mar 3 rd 2006](https://reader036.vdocuments.us/reader036/viewer/2022062409/5697c0251a28abf838cd5443/html5/thumbnails/32.jpg)
Naïve Algorithm
1 2 3 4
5 6 7 8
9 10 11 12
13 14 15 16
64
20
36
42
Dimension A
Dimension C
Pivot on ABaggregate on all C
Pivot on ACaggregate on all B
Pivot on BCaggregate on all A
Dimension B
![Page 33: Online Analytical Processing (OLAP) An Overview Kian Win Ong, Nicola Onose Mar 3 rd 2006](https://reader036.vdocuments.us/reader036/viewer/2022062409/5697c0251a28abf838cd5443/html5/thumbnails/33.jpg)
Single Pass Algorithm
1 2 3 4
5 6 7 8
9 10 11 12
13 14 15 16
64
20
36
42
Dimension A
Dimension C
B
1 2 3 4
1 2 3 4
1 2 3 4
AB
AC
BC
Make a single pass over data
![Page 34: Online Analytical Processing (OLAP) An Overview Kian Win Ong, Nicola Onose Mar 3 rd 2006](https://reader036.vdocuments.us/reader036/viewer/2022062409/5697c0251a28abf838cd5443/html5/thumbnails/34.jpg)
Single Pass Algorithm
1 2 3 4
5 6 7 8
9 10 11 12
13 14 15 16
64
20
36
42
Dimension A
Dimension C
B
13
9 10 11 12
5 6 7 8
1 2 3 4
1 5 9 13
2 6 10 3 7 11 4 5 12
13
9 10 11 12
5 6 7 8
1 2 3 4
AB
AC
BC
Simultaneously maintain multiple aggregates
![Page 35: Online Analytical Processing (OLAP) An Overview Kian Win Ong, Nicola Onose Mar 3 rd 2006](https://reader036.vdocuments.us/reader036/viewer/2022062409/5697c0251a28abf838cd5443/html5/thumbnails/35.jpg)
Single Pass Algorithm
1 2 3 4
5 6 7 8
9 10 11 12
13 14 15 16
64
20
36
42
Dimension A
Dimension C
B
13
9 10 11 12
5 6 7 8
1 2 3 4
1 5 9 13
2 6 10 3 7 11 4 5 12
13
9 10 11 12
5 6 7 8
1 2 3 4
AB
AC
BC
Write out completed aggregates
![Page 36: Online Analytical Processing (OLAP) An Overview Kian Win Ong, Nicola Onose Mar 3 rd 2006](https://reader036.vdocuments.us/reader036/viewer/2022062409/5697c0251a28abf838cd5443/html5/thumbnails/36.jpg)
Single Pass Algorithm
1 2 3 4
5 6 7 8
9 10 11 12
13 14 15 16
64
20
36
42
Dimension A
Dimension C
B
13
9 10 11 12
5 6 7 8
1 2 3 4
1 5 9 13
2 6 10 3 7 11 4 5 12
13
AB
AC
BC
Only allocate memory that is necessary
![Page 37: Online Analytical Processing (OLAP) An Overview Kian Win Ong, Nicola Onose Mar 3 rd 2006](https://reader036.vdocuments.us/reader036/viewer/2022062409/5697c0251a28abf838cd5443/html5/thumbnails/37.jpg)
Single Pass Algorithm13
9 10 11 12
5 6 7 8
1 2 3 4
1 5 9 13
2 6 10 3 7 11 4 5 12
13
AB
AC
BC
Array Chunk
ABC4 x 4 x 4
AB16 x 4 x 4
AC4 x 4 x 4
BC4 x 4
A4 x 4
B4
C4
all1
Minimum memory spanning tree
![Page 38: Online Analytical Processing (OLAP) An Overview Kian Win Ong, Nicola Onose Mar 3 rd 2006](https://reader036.vdocuments.us/reader036/viewer/2022062409/5697c0251a28abf838cd5443/html5/thumbnails/38.jpg)
Multi Pass Algorithm
ABCD
ABC ABD ACD BCD
AB AC BC AD BD CD
A B C D
all
Recursively aggregate
![Page 39: Online Analytical Processing (OLAP) An Overview Kian Win Ong, Nicola Onose Mar 3 rd 2006](https://reader036.vdocuments.us/reader036/viewer/2022062409/5697c0251a28abf838cd5443/html5/thumbnails/39.jpg)
Overview• Motivation
• Multi-Dimensional Data Model
• Research Areas
• Optimizations– Materializing multiple aggregates simultaneously– Materialization strategy
![Page 40: Online Analytical Processing (OLAP) An Overview Kian Win Ong, Nicola Onose Mar 3 rd 2006](https://reader036.vdocuments.us/reader036/viewer/2022062409/5697c0251a28abf838cd5443/html5/thumbnails/40.jpg)
Implementing Data Cubes
• Biggest problem for data warehouses: the size
• Space / time trade-off:accelerate queries by materializing the cube
![Page 41: Online Analytical Processing (OLAP) An Overview Kian Win Ong, Nicola Onose Mar 3 rd 2006](https://reader036.vdocuments.us/reader036/viewer/2022062409/5697c0251a28abf838cd5443/html5/thumbnails/41.jpg)
Implementing Data Cubes
• Biggest problem for data warehouses: the size
• Space / time trade-off:accelerate queries by materializing the cube
• The size of the relations gets even bigger!
![Page 42: Online Analytical Processing (OLAP) An Overview Kian Win Ong, Nicola Onose Mar 3 rd 2006](https://reader036.vdocuments.us/reader036/viewer/2022062409/5697c0251a28abf838cd5443/html5/thumbnails/42.jpg)
Implementing Data Cubes
• Biggest problem for data warehouses: the size
• Space / time trade-off:accelerate queries by materializing the cube
• The size of the relations gets even bigger!
• M(ultidimensional)OLAP: good query performance, but bad scalability
• R(elational)OLAP: very scalable; query performance improved by materializing (partial) results
![Page 43: Online Analytical Processing (OLAP) An Overview Kian Win Ong, Nicola Onose Mar 3 rd 2006](https://reader036.vdocuments.us/reader036/viewer/2022062409/5697c0251a28abf838cd5443/html5/thumbnails/43.jpg)
Implementing Data Cubes
• V. Harinarayan, A. Rajaraman, J.D. Ullman:Implementing Data Cubes EfficientlySIGMOD 1996
Presents a materialization strategy for the cells of the cube.
![Page 44: Online Analytical Processing (OLAP) An Overview Kian Win Ong, Nicola Onose Mar 3 rd 2006](https://reader036.vdocuments.us/reader036/viewer/2022062409/5697c0251a28abf838cd5443/html5/thumbnails/44.jpg)
Implementing Data Cubes
Time Id
City Id
Product Id
Sales
Day
Month
WeekCity Id
City
State
Product Id
Name
Category
Week
Month
Year Year
Category Id
Category Name
![Page 45: Online Analytical Processing (OLAP) An Overview Kian Win Ong, Nicola Onose Mar 3 rd 2006](https://reader036.vdocuments.us/reader036/viewer/2022062409/5697c0251a28abf838cd5443/html5/thumbnails/45.jpg)
Implementing Data Cubes
• casted as particular case of the rewriting using views problem
• what cells to materialize what SQL views to materialize
![Page 46: Online Analytical Processing (OLAP) An Overview Kian Win Ong, Nicola Onose Mar 3 rd 2006](https://reader036.vdocuments.us/reader036/viewer/2022062409/5697c0251a28abf838cd5443/html5/thumbnails/46.jpg)
Implementing Data Cubes
• casted as particular case of the rewriting using views problem
• what cells to materialize what SQL views to materialize
p = productt = timec = city
• simple idea: Q1 depends on Q2 (Q1≤Q2) if Q1 can be fully answered using the results of Q2
ptc
pt tc pc
t p c
none
![Page 47: Online Analytical Processing (OLAP) An Overview Kian Win Ong, Nicola Onose Mar 3 rd 2006](https://reader036.vdocuments.us/reader036/viewer/2022062409/5697c0251a28abf838cd5443/html5/thumbnails/47.jpg)
Implementing Data Cubes
• but cube dimensions are usually hierarchical
product_name
product_category
none
day
week month
year
none
city
state
none
X X
• direct-product latticep = productt = timec = city
ptc
pt tc pc
ptspwc
pyc
pmc
ps
pcatt
… ……
…
…
![Page 48: Online Analytical Processing (OLAP) An Overview Kian Win Ong, Nicola Onose Mar 3 rd 2006](https://reader036.vdocuments.us/reader036/viewer/2022062409/5697c0251a28abf838cd5443/html5/thumbnails/48.jpg)
Implementing Data Cubes
• Def. cost of answering Q = # of rows in the table of ancestor(Q)
• It can be estimated w/o materializing the views
• Assume that all queries are identical to some view in the lattice
![Page 49: Online Analytical Processing (OLAP) An Overview Kian Win Ong, Nicola Onose Mar 3 rd 2006](https://reader036.vdocuments.us/reader036/viewer/2022062409/5697c0251a28abf838cd5443/html5/thumbnails/49.jpg)
Implementing Data Cubes
• For a set S and a view vB(v,S) = ∑w≤v, (w not in S) max{cost(w)-cost(v), 0}
• Greedy algorithm for selecting k views to materialize from the lattice:
1. S := {top view}
2. For i=1 to k, add v to S s.t. B(v,S) is maximized
• The greedy algorithm is an (e-1)/e ≈ 0.63 approx. of the optimum.
![Page 50: Online Analytical Processing (OLAP) An Overview Kian Win Ong, Nicola Onose Mar 3 rd 2006](https://reader036.vdocuments.us/reader036/viewer/2022062409/5697c0251a28abf838cd5443/html5/thumbnails/50.jpg)
Discussion
• Questions from the audience…