implementing changing dimensions

7
© 2013 IBM Corporation IBM Confidential Lesson 5.6: Implementing Changing Dimensions Fundamentals of Data Warehouse 1

Upload: john808

Post on 17-Feb-2016

218 views

Category:

Documents


0 download

DESCRIPTION

lesson

TRANSCRIPT

Page 1: Implementing Changing Dimensions

© 2013 IBM CorporationIBM Confidential

Lesson 5.6: Implementing Changing

Dimensions

Fundamentals of Data Warehouse

1

Page 2: Implementing Changing Dimensions

© 2013 IBM CorporationIBM Confidential

Implementing Changing Dimensions

� In a dimensional model, we are most interested in the various

forms of reporting and analyzing facts and measures. However,

facts and measures are relevant only when you define them in the

context of their dimensions. Dimensions and their attributes are

relatively constant, but they do change over time, e.g., for a

customer, the credit card type VISA gold is updaed to type VISA

platinum after a certain period of time based on credit history.

�The point here is, if such real life scenarios occur, how is your

dimensional model going to record these changes and what will be

the impact if your dimensional model is ill-prepared to

accommodate these changes.

�The term “changing dimensions” refers to the variation in

dimensional attributes over time.

2

Page 3: Implementing Changing Dimensions

© 2013 IBM CorporationIBM Confidential

Implementing Changing Dimensions (continued)

� Implementing Slowly Changing Dimensions

• Slowly changing dimensions (SCD) can be categorized into three types: changes that overwrite history (Type 1), those which preserve history (Type 2), and those which preserve a version of history (Type 3). The method you are going to follow to implement these types depends solely upon the user requirements and business needs.

• Type 1: Overwriting History

– With the Type 1 response, we merely overwrite the old attribute value in the dimension row, replacing it with the current value. In so doing, the attribute always reflects the most recent assignment.

– The Type 1 response is the simplest approach to dealing with dimension attribute changes. The advantage of Type 1 is that it is fast and easy. In the dimension table, we merely overwrite the preexisting value with the current assignment. The fact table is left untouched. The problem with a Type 1 response is that we lose all history of attribute changes.

– A Type 1 response obviously is appropriate if the attribute change is a correction. It also may be appropriate if there is no value in keeping the old description.

3

Page 4: Implementing Changing Dimensions

© 2013 IBM CorporationIBM Confidential

Implementing Changing Dimensions (continued)

� Implementing Slowly Changing Dimensions (continued)

• Type 2: Preserving History

– We have said that one of the primary goals of the data warehouse was to represent prior history correctly. A Type 2 response is the predominant technique for supporting this requirement when it comes to slowly changing dimensions.

– Of course, Type 2 requires the use of surrogate keys. With Type 2, we create a new dimension row with a new single-column primary key to uniquely identify the new dimension row. This single-column primary key establishes the linkage between the fact and dimension tables. There's no need to create a confusing secondary join based on effective or expiration dates, as we have pointed out.

– A Type 2 change writes a record with the new attribute information and preserves a record of the old dimensional data. These changes accurately partition history across time more efficiently than other types. However, they also add new records to the data warehouse environment thus significantly increasing the database size.

4

Page 5: Implementing Changing Dimensions

© 2013 IBM CorporationIBM Confidential

Implementing Changing Dimensions (continued)

� Implementing Slowly Changing Dimensions (continued)

• Type 3: Preserving a Version of History

– Instead of creating a new dimensional record to hold the attribute change,

Type 3 places a value for the change in the original dimensional record. You

can create multiple fields to hold distinct values for separate points in time.

– Essentially, with a type 3 response, we do not issue a new dimension row, but

rather we add a new column to capture the attribute change.

– Type 3 is appropriate when there's a strong need to support two views of the

world simultaneously. Some designers call this an alternate reality.

– This methods preserves the change but what would you do if tere were

multiple changes over a period of time? The side effects of Type 3 are

increased table size and increased complexity of the queries that analyze

historical values from the old data.

5

Page 6: Implementing Changing Dimensions

© 2013 IBM CorporationIBM Confidential

Implementing Changing Dimensions (continued)

� Implementing Slowly Changing Dimensions (continued)

• For example, let's assume that we work for an electronics retailer. The

procurement buyers are aligned along the same departmental lines as

the store, so the products being acquired roll up into departments. One

of the procured products is SmartKidz. The existing row in the product

dimension table for SmartKidz looks like the following:

• Suppose that a new merchandising person decides that IntelliKidz

should be moved from the Education software department to the

Strategy department on January 1, 2013, in an effort to boost sales.

6

Product Key Product Description Department SKU Number (Natural Key)

12345 SmartKidz Education ABC922-Z

Page 7: Implementing Changing Dimensions

© 2013 IBM CorporationIBM Confidential

Implementing Changing Dimensions (continued)

� Implementing Slowly Changing Dimensions (continued)

• Original:

• Type 1:

• Type 2:

• Type 3:

7

Product Key Product Description Department SKU Number (Natural Key)

12345 SmartKidz Education ABC922-Z

Product Key Product Description Department SKU Number (Natural Key)

12345 SmartKidz Strategy ABC922-Z

Product Key Product Description Department SKU Number (Natural Key)

12345 SmartKidz Education ABC922-Z

25984 SmartKidz Strategy ABC922-Z

Product Key Product Description Department Old Department SKU Number (Natural Key)

12345 SmartKidz Strategy Education ABC922-Z