dw in reatil sales

29
Click to edit Master subtitle style 3/15/12 Data Warehousing In Retail Sales

Upload: abhishek-chaurasia

Post on 06-Apr-2018

217 views

Category:

Documents


0 download

TRANSCRIPT

8/2/2019 DW in Reatil Sales

http://slidepdf.com/reader/full/dw-in-reatil-sales 1/29

Click to edit Master subtitle style

3/15/12

Data Warehousing InRetail Sales

8/2/2019 DW in Reatil Sales

http://slidepdf.com/reader/full/dw-in-reatil-sales 2/29

3/15/12

Agenda

Goals of Data Warehouse

Components of Data Warehouse

Dimensional Modeling

Case Study : Retail Business

Designing the Dimensional Model

Dimensional Table Attributes

o Date Dimension

o Product Dimension

o Store Dimension

8/2/2019 DW in Reatil Sales

http://slidepdf.com/reader/full/dw-in-reatil-sales 3/29

3/15/12

Goals of Data Warehouse

•  The data warehouse must make an organization’s

information easily accessible

•  The data warehouse must present the organization’s

information consistently

•  The data warehouse must be adaptive and resilient to

change

•  The data warehouse must be a secure bastion that

protects our information assets

•  The data warehouse must serve as the foundation for

improved decision making

8/2/2019 DW in Reatil Sales

http://slidepdf.com/reader/full/dw-in-reatil-sales 4/29

3/15/12

Components of DataWarehouse

8/2/2019 DW in Reatil Sales

http://slidepdf.com/reader/full/dw-in-reatil-sales 5/29

3/15/12

Dimensional Modeling

Fact Table

• Stores Business Performancemeasurement

Mostly numeric & additive• It expresses many to many relationship

between dimensions

Dimension Table• Discretely valued description that is more

or less constant and participates inconstraints

• It implements user interface to the DataWarehouse

Daily Sales FactTable

Date Key (FK)Product Key (FK)Store Key (FK)Quantity SoldDollar Sales Amount

Product DimensionTable

Product Key (PK)Product DescriptionSKU Number (Natural Key)Brand DescriptionCategory DescriptionDepartment DescriptionPackage Type DescriptionPackage Size... and many more

8/2/2019 DW in Reatil Sales

http://slidepdf.com/reader/full/dw-in-reatil-sales 6/29

3/15/12

Bringing together facts anddimensionsStar Join Schema

Simplicity and symmetry

Highly recognizable to business usersHigh performance benefits

8/2/2019 DW in Reatil Sales

http://slidepdf.com/reader/full/dw-in-reatil-sales 7/29

3/15/12

Example: A simple report

8/2/2019 DW in Reatil Sales

http://slidepdf.com/reader/full/dw-in-reatil-sales 8/29

3/15/12

Case Study: Retail Business

A large grocery chain

8/2/2019 DW in Reatil Sales

http://slidepdf.com/reader/full/dw-in-reatil-sales 9/29

3/15/12

Retail Business

100 grocery stores spread over five-statearea

Each store has a full complement of departments, including grocery, frozenfoods, dairy, meat and health/beauty aids

Each store has roughly 60,000 SKUs on itsshelves

About 55,000 of the SKUs come fromoutside manufacturers and have UniversalProduct Codes (UPCs) imprinted on theproduct package.

 The remaining 5,000 SKUs come from

8/2/2019 DW in Reatil Sales

http://slidepdf.com/reader/full/dw-in-reatil-sales 10/29

3/15/12

Retail Business

Data collection happens at

Cash Registers (POS systems)

Back door where vendors make deliveries

Key inputs to the dimensional modeling

8/2/2019 DW in Reatil Sales

http://slidepdf.com/reader/full/dw-in-reatil-sales 11/29

3/15/12

Designing the DimensionalModel Step 1. Select the Business Process

Aim: Management wants to better understand customer purchases ascaptured by the POS system

 The business process is POS retail sales

Step 2. Declare the Grain

Granularity, atomic data

Provides maximum flexibility

Can support all possibilities of user requests The most granular data is an individual line item on a POS transaction

Step 3. Choose the Dimensions

Date, product, and store dimensions

8/2/2019 DW in Reatil Sales

http://slidepdf.com/reader/full/dw-in-reatil-sales 12/29

3/15/12

Designing the DimensionalModelMeasured facts in the retail sales schema

8/2/2019 DW in Reatil Sales

http://slidepdf.com/reader/full/dw-in-reatil-sales 13/29

3/15/12

Dimensional Table Attributes

Date Dimension

Product Dimension

Store Dimension

Promotion Dimension

Promotion Coverage Factless Fact Table

Degenerate Transaction Number

8/2/2019 DW in Reatil Sales

http://slidepdf.com/reader/full/dw-in-reatil-sales 14/29

3/15/12

Date Dimension

It is present in every data mart as a datamart is a time series

Unlike other dimension table datedimension can be build in advance

Attributes of date dimension:

o Day Number

o Month Number

o Holiday Indicator

o Weekday Indicator

o Selling Season

Date Dimension

Date Key (PK)

DateFull Date DescriptionDay of WeekDay Number in Epoch

Week Number in EpochMonth Number in EpochDay Number inCalendar MonthDay Number in

Calendar Year

8/2/2019 DW in Reatil Sales

http://slidepdf.com/reader/full/dw-in-reatil-sales 15/29

3/15/12

Date Dimension contd..

Why an explicit date dimension table isneeded? As SQL query can directly constrainon fact table date key, if the date key in thefact table is a date-type field.

Usability: business user is not versed in SQL datesemantics, so he or she would be unable to directlyleverage inherent capabilities associated with a date

data type

SQL date functions do not support filtering byattributes such as weekdays versus weekends,holidays, fiscal periods, seasons, or major events

Presuming that the business needs to slice data by

8/2/2019 DW in Reatil Sales

http://slidepdf.com/reader/full/dw-in-reatil-sales 16/29

3/15/12

Product Dimension

 The product dimension describes every SKUin the grocery store.

The product dimension is almost alwayssourced from the operational productmaster file

Most retailers administer their productmaster files at headquarters and download

a subset of the file to each store’s POSsystem at frequent intervals

 The product master holds many descriptiveattributes of each SKU

 The merchandise hierarchy is an important

8/2/2019 DW in Reatil Sales

http://slidepdf.com/reader/full/dw-in-reatil-sales 17/29

3/15/12

Department Description Brand DescriptionSales DollarAmount

SalesQuantity

BakeryBakeryBakeryFrozen FoodsFrozen FoodsFrozen FoodsFrozen Foods

Frozen FoodsFrozen Foods

Baked WellFluffyLightQuickFreezeFreshlikeFrigidIcy

QuickFreezeFreshlike

$3,009$3,024$6,298$5,321$10,476$7,328$2,184

$6,467$10,476

1,1381,4762,4742,6405,2343,0921,437

3,1625,234

Product Dimension contd..

 There are attributes (Bottle, Bag, Box) inthe product dimension table which are notpart of the merchandise hierarchy, cancombine constraints with a constraint on a

merchandise hierarchy attribute The product dimension is one of the two or

three primary dimensions in nearly everydata mart

A robust and complete set of dimensionattributes translates into user capabilitiesfor robust and complete analysis

8/2/2019 DW in Reatil Sales

http://slidepdf.com/reader/full/dw-in-reatil-sales 18/29

3/15/12

Store Dimension

 The store dimensiondescribes every store in ourgrocery chain

 The store dimension is theprimary geographicdimension in our case study

Each store can be thought of 

as a location. As a result, wecan roll stores up to anygeographic attribute, such asZIP code, county

Store Dimension

Store Key (PK)Store NameStore Number (NaturalKey)Store Street AddressStore CityStore CountyStore StateStore Zip CodeStore ManagerStore DistrictStore RegionFloor Plan TypePhoto Processing TypeFinancial Service TypeSelling Square Footage Total Square FootageFirst Open DateLast Remodel Date

… and more

8/2/2019 DW in Reatil Sales

http://slidepdf.com/reader/full/dw-in-reatil-sales 19/29

3/15/12

Promotion Dimension

It describes the promotion conditions underwhich a product was sold

Causal dimension: Temporary pricereductions, end-aisle displays, newspaperads, and coupons

Factors on which Promotions are judged:

o Lift: Measured on the agreed baseline sales

o Whether transferred sales from regularlypriced products to temporarily reduced-pricedproducts

o

Cannibalization: Gain in sales of one productbut sales decrease in nearby products on the

8/2/2019 DW in Reatil Sales

http://slidepdf.com/reader/full/dw-in-reatil-sales 20/29

3/15/12

o Since the four causal mechanisms are highlycorrelated, the combined single dimension isnot much larger than any one of theseparated dimensions would be

o The combined single dimension can bebrowsed efficiently but it only shows thepossible combinations. Browsing in thedimension table does not reveal which stores

or products were affected by the promotion. This information is found in the fact table

 The tradeoffsin favor of keeping the

fourdimensionstogetherinclude thefollowing:

8/2/2019 DW in Reatil Sales

http://slidepdf.com/reader/full/dw-in-reatil-sales 21/29

3/15/12

Promotion Coverage FactlessFact TableIt is needed to find the products that were

on promotion but did not sell

We’d load one row in the fact table for eachproduct on promotion in a store each dayregardless of whether the product sold ornot.

It is a factless fact table as it has no

measurement metrics; it merely capturesthe relationship between the involved keys

 To determine what products where onpromotion but didn’t sell requires a two-

step process

8/2/2019 DW in Reatil Sales

http://slidepdf.com/reader/full/dw-in-reatil-sales 22/29

3/15/12

Degenerate TransactionNumber The POS transaction number is the key to

the transaction header record, containingall the information valid for the transactionas a whole, such as the transaction date

and store identifierIn dimensional model this interesting

header information is already extractedinto other dimensions

The POS transaction number is still usefulas it serves as the grouping key for pullingtogether all the products purchased in asingle transaction

8/2/2019 DW in Reatil Sales

http://slidepdf.com/reader/full/dw-in-reatil-sales 23/29

3/15/12

Retail Schema

A frequentshopperdimensiontable and add

another foreignkey in the facttable iscreated to see

exact purchaseof frequentshopper on aweekly basis

A frequent

8/2/2019 DW in Reatil Sales

http://slidepdf.com/reader/full/dw-in-reatil-sales 24/29

3/15/12

Retail Schema

Originalschemagracefullyextends to

accommodatethese newdimensionslargely

because wechose tomodel thePOS

transactiondata at its

8/2/2019 DW in Reatil Sales

http://slidepdf.com/reader/full/dw-in-reatil-sales 25/29

3/15/12

Dimension Normalization

Perceived benefits of DimensionNormalization

o This design saves space as we’re only storingcryptic codes

o The normalized design for the dimensiontables is easier to maintain

Snowflaking: Redundant attributes are

removed from the flat, denormalizeddimension table and placed in normalizedsecondary dimension tables

Reason for not adopting modelling:

o The multitude of snowflaked tables makes for

8/2/2019 DW in Reatil Sales

http://slidepdf.com/reader/full/dw-in-reatil-sales 26/29

3/15/12

Surrogate Key

Surrogate keys are integers that areassigned sequentially as needed topopulate a dimension

It is encouraged to use surrogate keys indimensional models rather than relying onoperational production codes

Reason to avoid natural keys based on the

operational code:o To avoid embedding intelligence in the data

warehouse keys because any assumptionsthat we make eventually may be invalidated

o Queries and data access applications should

8/2/2019 DW in Reatil Sales

http://slidepdf.com/reader/full/dw-in-reatil-sales 27/29

3/15/12

Market Basket Analysis

Market Basket Analysis is the notion of analyzing thecombination of products that sell together

It gives the retailer insights about how to merchandise variouscombinations of items

 The retail sales fact table cannot be used easily to perform MBA

as SQL was never designed to constrain and group across lineitem fact rows

Data mining tools and some OLAP products can assist withmarket basket analysis. However in the absence of these tools, amore direct approach is used

o  The market basket facttable is a periodicsnapshot representingthe pairs of productspurchased together

during a specified timeperiod

o  The basket count is asemiadditive fact

8/2/2019 DW in Reatil Sales

http://slidepdf.com/reader/full/dw-in-reatil-sales 28/29

3/15/12

 Thank

 You

8/2/2019 DW in Reatil Sales

http://slidepdf.com/reader/full/dw-in-reatil-sales 29/29

3/15/12

References

 The Data Warehouse ToolKit –Ralph Kimbal& Margy Ross

Wikepedia.org