microsoft confidential. we look at the world... with our own eyes

Post on 31-Mar-2015

220 Views

Category:

Documents

3 Downloads

Preview:

Click to see full reader

TRANSCRIPT

Microsoft Confidential

Many-To-Many dimensions relationships

Yennifer SantosPremier Field EngineerMicrosoft Corporation

We look at the world... with our own eyes ...

And no matter what our perspective is...

We understand that people play the most important role

People interact, communicate and relate each other every day ...

They relate in a one to many relationships ...

On a one to one relationships ...

And in a many-to-many relationships ...

With all the pros and cons ... for each one of them

So for the SSAS …... Cube perspective of the world ......

The way relationships are build…is extremelly important

Specially when we are talking about M2M relationships

Typical scenario

M2M scenario concepts

14

Data measure group

M2M dimension

Intermediate measure group

Intermediate dimension

M2M scenario

What’s up with M2M relationships in SSAS?

16

Query

By M2M Dim

IN MEMORY INNER JOIN

What’s up with M2M relationships in SSAS?

As the data size of the records in the join increasesThe performance of the run-time join in SSAS suffers

17

IN MEMORY INNER JOIN

Query performance with M2M

Query performance in SSAS is intrinsically and linearly tied to the size of the data being joined between

Data andIntermediate measure groups

18

What to do to improve query performance?

There are 3 optimization techniques:Defining aggregationsPartitioningMatrix relationship

19

Defining aggregations optimization technique

What is an aggregation?Pre-calculated summary of data that SSAS uses to enhance query performanceMoves the calculation to the processing phaseSummarizes measures by a combination of dimension attributes

20

1st

Defining aggregations optimization technique

Aggregations Work by reducing the number of records that the storage engine needs to scan from disk in order to satisfy a queryProvides a significant benefit only if the size of the aggregation is significantly smaller than the size of the original table

21

Defining aggregations optimization technique

Where to define aggregations in a M2M scenario?

22

Defining aggregations optimization technique

Data measure group aggregations for M2M queries

Include in the aggregation the granularity attribute of all dimensions that join with the intermediate measure groupDo not include the attribute from the M2M dimension (occurs at query time as part of the run-time join)

23

X

Defining aggregations optimization technique

BenefitsThe benefit is directly related to the size of this aggregation compared to the size of the data measure groupAn aggregation whose size is >=1/3 the size of the facts themselves is not considered to be useful

24

Defining aggregations optimization technique

Intermediate measure group aggregations for M2M queries

Include in the aggregation the granularity attribute of the dimensions in the intermediate measure group that relates to the data measure group The attribute in the dimension that you wish to aggregate

25

Defining aggregations optimization technique

BenefitsThe benefit is directly related to the resulting size of the aggregation compared to the size of intermediate measure group

26

Partitioning optimization technique

PartitioningEnables SSAS to retrieve data from only a portion of data in a measure groupParallelize data retrieval when data must be retrieved from multiple partitions

27

2nd

Partitioning optimization technique

Where to define partitions in a M2M scenario?

28

Partitioning the data measure group

Data measure groupThe data measure group should generally be partitioned by the same dimension attribute members as the intermediate measure group

29

Partitioning the intermediate measure group

Intermediate measure groupYou must relate the partitioning dimensions to the intermediate measure Benefit

Reduces the data size used by the run-time join between intermediate and data measure groupsWhen the M2M query can be resolved from only a few (or one) partitions

30

Partitioning the intermediate measure group

If the M2M query must be resolved by retrieving data from many or all partitions

The technique provides little or no value and may even increase response times

31

Partitioning the intermediate measure group

Design paternThe intermediate measure group must

Be partitioned by one or more common dimensions used in queriesBe related to each dimension that is used for partitioning

The intermediate fact table must Contain the dimension surrogate key

That is used for the measure group partitioningTo relate with the dimension

32

Matrix relationship optimization technique

Increases M2M query performance by reducing the size of the run-time join

Uses a process of compression to eliminate unnecessary repetitiveness in the intermediate fact tableCollapses the size of the intermediate fact table

33

3rd

Matrix relationship optimization technique

Two phases:Compression and creation of the matrix keyImplementing the matrix optimization

34

Compression and creation of the matrix key

Creates a compressed intermediate fact table by taking the following steps:

Identify common dimension member combinations in the intermediate fact table Each set of common dimension member combinations is assigned a surrogate key (matrix key)Repeated combinations are eliminated

35

Compression and creation of the matrix key

36

Implementing the matrix optimization

Requires changes to aspects of the M2M relationship at the following levels:

The relational data warehouse implementation levelThe cube design levelThe ETL implementation level

37

Relational Datawarehouse

38

Relational Datawarehouse

Matrix dimension table creation

39

Relational Datawarehouse

Requires an intermediate fact table that relates the M2M dimensions to the base fact through the matrix key

Will have the matrix key column and one or more dimension key columns

40

Relational Datawarehouse

Add the Matrix Key column to the base fact table

41

Cube design

Works in the same way that any M2M dimension relationshipThe difference is

Shared dimension between the base measure group and the intermediate measure group is the matrix dimension, rather than the fact dimension

42

Cube design

In Data Source View, add two tables for

Matrix dimension New intermediate fact table

43

Cube design

Add a new SSAS database dimension based on the matrix dimension tableAdd this dimension to the cube that contains the base measure group

44

Cube design

Add a new SSAS measure group to act as the intermediate measure group for resolving the M2M relationships Remove the existing

Sales Reasons measure group

45

Cube design

Define the Dimension Usage to associate the dimension tables to the measure groups

Both the base and intermediate measure group must have a direct relationship to the matrix dimension

46

Cube design

The intermediate measure group also needs a direct relationship to the dimension or dimensions that participate in the M2M relationshipThe dimensions referenced in the second point must also be related to the base measure group

47

ETL implementation level

The ETL involvesUpdating the relational data warehouse tables with new matrix relationships Associating the fact tables to the matrix dimensions

48

Matrix relationship

The general guidelines are:Adding new key concatenation strings to the matrix dimension table for new records being added to the fact tableAdding the matrix dimension key associated with each transaction ID to the base fact tableAdding the key of the matrix dimension with each M2M dimension key record

49

Questions and Answers

50

top related