dw in reatil sales
TRANSCRIPT
8/2/2019 DW in Reatil Sales
http://slidepdf.com/reader/full/dw-in-reatil-sales 1/29
Click to edit Master subtitle style
3/15/12
Data Warehousing InRetail Sales
8/2/2019 DW in Reatil Sales
http://slidepdf.com/reader/full/dw-in-reatil-sales 2/29
3/15/12
Agenda
Goals of Data Warehouse
Components of Data Warehouse
Dimensional Modeling
Case Study : Retail Business
Designing the Dimensional Model
Dimensional Table Attributes
o Date Dimension
o Product Dimension
o Store Dimension
8/2/2019 DW in Reatil Sales
http://slidepdf.com/reader/full/dw-in-reatil-sales 3/29
3/15/12
Goals of Data Warehouse
• The data warehouse must make an organization’s
information easily accessible
• The data warehouse must present the organization’s
information consistently
• The data warehouse must be adaptive and resilient to
change
• The data warehouse must be a secure bastion that
protects our information assets
• The data warehouse must serve as the foundation for
improved decision making
8/2/2019 DW in Reatil Sales
http://slidepdf.com/reader/full/dw-in-reatil-sales 4/29
3/15/12
Components of DataWarehouse
8/2/2019 DW in Reatil Sales
http://slidepdf.com/reader/full/dw-in-reatil-sales 5/29
3/15/12
Dimensional Modeling
Fact Table
• Stores Business Performancemeasurement
Mostly numeric & additive• It expresses many to many relationship
between dimensions
Dimension Table• Discretely valued description that is more
or less constant and participates inconstraints
• It implements user interface to the DataWarehouse
Daily Sales FactTable
Date Key (FK)Product Key (FK)Store Key (FK)Quantity SoldDollar Sales Amount
Product DimensionTable
Product Key (PK)Product DescriptionSKU Number (Natural Key)Brand DescriptionCategory DescriptionDepartment DescriptionPackage Type DescriptionPackage Size... and many more
8/2/2019 DW in Reatil Sales
http://slidepdf.com/reader/full/dw-in-reatil-sales 6/29
3/15/12
Bringing together facts anddimensionsStar Join Schema
Simplicity and symmetry
Highly recognizable to business usersHigh performance benefits
8/2/2019 DW in Reatil Sales
http://slidepdf.com/reader/full/dw-in-reatil-sales 7/29
3/15/12
Example: A simple report
8/2/2019 DW in Reatil Sales
http://slidepdf.com/reader/full/dw-in-reatil-sales 8/29
3/15/12
Case Study: Retail Business
A large grocery chain
8/2/2019 DW in Reatil Sales
http://slidepdf.com/reader/full/dw-in-reatil-sales 9/29
3/15/12
Retail Business
100 grocery stores spread over five-statearea
Each store has a full complement of departments, including grocery, frozenfoods, dairy, meat and health/beauty aids
Each store has roughly 60,000 SKUs on itsshelves
About 55,000 of the SKUs come fromoutside manufacturers and have UniversalProduct Codes (UPCs) imprinted on theproduct package.
The remaining 5,000 SKUs come from
8/2/2019 DW in Reatil Sales
http://slidepdf.com/reader/full/dw-in-reatil-sales 10/29
3/15/12
Retail Business
Data collection happens at
Cash Registers (POS systems)
Back door where vendors make deliveries
Key inputs to the dimensional modeling
8/2/2019 DW in Reatil Sales
http://slidepdf.com/reader/full/dw-in-reatil-sales 11/29
3/15/12
Designing the DimensionalModel Step 1. Select the Business Process
Aim: Management wants to better understand customer purchases ascaptured by the POS system
The business process is POS retail sales
Step 2. Declare the Grain
Granularity, atomic data
Provides maximum flexibility
Can support all possibilities of user requests The most granular data is an individual line item on a POS transaction
Step 3. Choose the Dimensions
Date, product, and store dimensions
8/2/2019 DW in Reatil Sales
http://slidepdf.com/reader/full/dw-in-reatil-sales 12/29
3/15/12
Designing the DimensionalModelMeasured facts in the retail sales schema
8/2/2019 DW in Reatil Sales
http://slidepdf.com/reader/full/dw-in-reatil-sales 13/29
3/15/12
Dimensional Table Attributes
Date Dimension
Product Dimension
Store Dimension
Promotion Dimension
Promotion Coverage Factless Fact Table
Degenerate Transaction Number
8/2/2019 DW in Reatil Sales
http://slidepdf.com/reader/full/dw-in-reatil-sales 14/29
3/15/12
Date Dimension
It is present in every data mart as a datamart is a time series
Unlike other dimension table datedimension can be build in advance
Attributes of date dimension:
o Day Number
o Month Number
o Holiday Indicator
o Weekday Indicator
o Selling Season
Date Dimension
Date Key (PK)
DateFull Date DescriptionDay of WeekDay Number in Epoch
Week Number in EpochMonth Number in EpochDay Number inCalendar MonthDay Number in
Calendar Year
8/2/2019 DW in Reatil Sales
http://slidepdf.com/reader/full/dw-in-reatil-sales 15/29
3/15/12
Date Dimension contd..
Why an explicit date dimension table isneeded? As SQL query can directly constrainon fact table date key, if the date key in thefact table is a date-type field.
Usability: business user is not versed in SQL datesemantics, so he or she would be unable to directlyleverage inherent capabilities associated with a date
data type
SQL date functions do not support filtering byattributes such as weekdays versus weekends,holidays, fiscal periods, seasons, or major events
Presuming that the business needs to slice data by
8/2/2019 DW in Reatil Sales
http://slidepdf.com/reader/full/dw-in-reatil-sales 16/29
3/15/12
Product Dimension
The product dimension describes every SKUin the grocery store.
The product dimension is almost alwayssourced from the operational productmaster file
Most retailers administer their productmaster files at headquarters and download
a subset of the file to each store’s POSsystem at frequent intervals
The product master holds many descriptiveattributes of each SKU
The merchandise hierarchy is an important
8/2/2019 DW in Reatil Sales
http://slidepdf.com/reader/full/dw-in-reatil-sales 17/29
3/15/12
Department Description Brand DescriptionSales DollarAmount
SalesQuantity
BakeryBakeryBakeryFrozen FoodsFrozen FoodsFrozen FoodsFrozen Foods
Frozen FoodsFrozen Foods
Baked WellFluffyLightQuickFreezeFreshlikeFrigidIcy
QuickFreezeFreshlike
$3,009$3,024$6,298$5,321$10,476$7,328$2,184
$6,467$10,476
1,1381,4762,4742,6405,2343,0921,437
3,1625,234
Product Dimension contd..
There are attributes (Bottle, Bag, Box) inthe product dimension table which are notpart of the merchandise hierarchy, cancombine constraints with a constraint on a
merchandise hierarchy attribute The product dimension is one of the two or
three primary dimensions in nearly everydata mart
A robust and complete set of dimensionattributes translates into user capabilitiesfor robust and complete analysis
8/2/2019 DW in Reatil Sales
http://slidepdf.com/reader/full/dw-in-reatil-sales 18/29
3/15/12
Store Dimension
The store dimensiondescribes every store in ourgrocery chain
The store dimension is theprimary geographicdimension in our case study
Each store can be thought of
as a location. As a result, wecan roll stores up to anygeographic attribute, such asZIP code, county
Store Dimension
Store Key (PK)Store NameStore Number (NaturalKey)Store Street AddressStore CityStore CountyStore StateStore Zip CodeStore ManagerStore DistrictStore RegionFloor Plan TypePhoto Processing TypeFinancial Service TypeSelling Square Footage Total Square FootageFirst Open DateLast Remodel Date
… and more
8/2/2019 DW in Reatil Sales
http://slidepdf.com/reader/full/dw-in-reatil-sales 19/29
3/15/12
Promotion Dimension
It describes the promotion conditions underwhich a product was sold
Causal dimension: Temporary pricereductions, end-aisle displays, newspaperads, and coupons
Factors on which Promotions are judged:
o Lift: Measured on the agreed baseline sales
o Whether transferred sales from regularlypriced products to temporarily reduced-pricedproducts
o
Cannibalization: Gain in sales of one productbut sales decrease in nearby products on the
8/2/2019 DW in Reatil Sales
http://slidepdf.com/reader/full/dw-in-reatil-sales 20/29
3/15/12
o Since the four causal mechanisms are highlycorrelated, the combined single dimension isnot much larger than any one of theseparated dimensions would be
o The combined single dimension can bebrowsed efficiently but it only shows thepossible combinations. Browsing in thedimension table does not reveal which stores
or products were affected by the promotion. This information is found in the fact table
The tradeoffsin favor of keeping the
fourdimensionstogetherinclude thefollowing:
8/2/2019 DW in Reatil Sales
http://slidepdf.com/reader/full/dw-in-reatil-sales 21/29
3/15/12
Promotion Coverage FactlessFact TableIt is needed to find the products that were
on promotion but did not sell
We’d load one row in the fact table for eachproduct on promotion in a store each dayregardless of whether the product sold ornot.
It is a factless fact table as it has no
measurement metrics; it merely capturesthe relationship between the involved keys
To determine what products where onpromotion but didn’t sell requires a two-
step process
8/2/2019 DW in Reatil Sales
http://slidepdf.com/reader/full/dw-in-reatil-sales 22/29
3/15/12
Degenerate TransactionNumber The POS transaction number is the key to
the transaction header record, containingall the information valid for the transactionas a whole, such as the transaction date
and store identifierIn dimensional model this interesting
header information is already extractedinto other dimensions
The POS transaction number is still usefulas it serves as the grouping key for pullingtogether all the products purchased in asingle transaction
8/2/2019 DW in Reatil Sales
http://slidepdf.com/reader/full/dw-in-reatil-sales 23/29
3/15/12
Retail Schema
A frequentshopperdimensiontable and add
another foreignkey in the facttable iscreated to see
exact purchaseof frequentshopper on aweekly basis
A frequent
8/2/2019 DW in Reatil Sales
http://slidepdf.com/reader/full/dw-in-reatil-sales 24/29
3/15/12
Retail Schema
Originalschemagracefullyextends to
accommodatethese newdimensionslargely
because wechose tomodel thePOS
transactiondata at its
8/2/2019 DW in Reatil Sales
http://slidepdf.com/reader/full/dw-in-reatil-sales 25/29
3/15/12
Dimension Normalization
Perceived benefits of DimensionNormalization
o This design saves space as we’re only storingcryptic codes
o The normalized design for the dimensiontables is easier to maintain
Snowflaking: Redundant attributes are
removed from the flat, denormalizeddimension table and placed in normalizedsecondary dimension tables
Reason for not adopting modelling:
o The multitude of snowflaked tables makes for
8/2/2019 DW in Reatil Sales
http://slidepdf.com/reader/full/dw-in-reatil-sales 26/29
3/15/12
Surrogate Key
Surrogate keys are integers that areassigned sequentially as needed topopulate a dimension
It is encouraged to use surrogate keys indimensional models rather than relying onoperational production codes
Reason to avoid natural keys based on the
operational code:o To avoid embedding intelligence in the data
warehouse keys because any assumptionsthat we make eventually may be invalidated
o Queries and data access applications should
8/2/2019 DW in Reatil Sales
http://slidepdf.com/reader/full/dw-in-reatil-sales 27/29
3/15/12
Market Basket Analysis
Market Basket Analysis is the notion of analyzing thecombination of products that sell together
It gives the retailer insights about how to merchandise variouscombinations of items
The retail sales fact table cannot be used easily to perform MBA
as SQL was never designed to constrain and group across lineitem fact rows
Data mining tools and some OLAP products can assist withmarket basket analysis. However in the absence of these tools, amore direct approach is used
o The market basket facttable is a periodicsnapshot representingthe pairs of productspurchased together
during a specified timeperiod
o The basket count is asemiadditive fact
8/2/2019 DW in Reatil Sales
http://slidepdf.com/reader/full/dw-in-reatil-sales 28/29
3/15/12
Thank
You