building a just in time data warehouse by dan morris and jason pohl

20
Building a Just-In-Time Data Warehouse Dan Morris (Viacom) Jason Pohl (Databricks) February 18, 2016

Upload: spark-summit

Post on 16-Apr-2017

1.464 views

Category:

Data & Analytics


3 download

TRANSCRIPT

Page 1: Building a Just in Time Data Warehouse by Dan Morris and Jason Pohl

Building a Just-In-Time Data WarehouseDan Morris (Viacom)Jason Pohl (Databricks)February 18, 2016

Page 2: Building a Just in Time Data Warehouse by Dan Morris and Jason Pohl

About Viacom

• Leading Global Entertainment Content Company• 23 Brands in 170+ Countries

2

Page 3: Building a Just in Time Data Warehouse by Dan Morris and Jason Pohl

Introductions

Dan Morris• Senior Director of Product Analytics• 12 Years @Viacom in a variety of roles• Intersection of Product and Data

3

Page 4: Building a Just in Time Data Warehouse by Dan Morris and Jason Pohl

About My Team

• Product Analytics team formed one year ago

• Our mission is to grow our global audience with the highestquality users possible

4

Page 5: Building a Just in Time Data Warehouse by Dan Morris and Jason Pohl

Key Areas of Focus

• Mobilize efforts using growth targets

• Uncover deep insights using churn and cohort analysis

• Treat all ideas as hypotheses and test them rigorously

5

Page 6: Building a Just in Time Data Warehouse by Dan Morris and Jason Pohl

Where Are We Today: App Platform

6

Make it extremely simple to build and deploy engagingapps around the globe

FEATURES

UI&ANIMATIONS

CONFIGURATIONSETTINGS

APPBINARY

Page 7: Building a Just in Time Data Warehouse by Dan Morris and Jason Pohl

Disciplined Product Dev Approach is Key

7

• 23 brands in 170+ countries

• Lots of market dynamics

• Many stakeholders

... Data is a must!

Page 8: Building a Just in Time Data Warehouse by Dan Morris and Jason Pohl

Sound Data Management is Required

8

9

11

13

14

16

18

Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec

Expected Data Volume Growth (TB)

2016

Page 9: Building a Just in Time Data Warehouse by Dan Morris and Jason Pohl

Our Data Infrastructure

9

S3 Spark + Databricks Redshift Tableau

Page 10: Building a Just in Time Data Warehouse by Dan Morris and Jason Pohl

Introducing the TV Land iOS App

10

Page 11: Building a Just in Time Data Warehouse by Dan Morris and Jason Pohl

Applying Product Analytics to TV Land

11

• Growth Targets

• Dashboards

• Deep Dive Analyses

• A/B Testing

Page 12: Building a Just in Time Data Warehouse by Dan Morris and Jason Pohl

Baselines Used to Set Growth Targets

12

Business ModelingETL

Data Volume• 30 sites/apps• 11 TB

Data Volume• 30 sites/apps• 1 TB

S3 Spark + Databricks Redshift Tableau

Page 13: Building a Just in Time Data Warehouse by Dan Morris and Jason Pohl

Growth Targets are Monitored via Dashboards

13

1/4/16 1/11/16 1/18/16 1/25/16

New UsersReturning Users

WeeklyRetentionbyCohort0 1 2 3 4

1/4/16 100% 53% 41% 33% 30%1/11/16 100% 58% 51% 42%1/18/16 100% 49% 38%1/25/16 100% 49%

AudienceGrowthbyCohort

Page 14: Building a Just in Time Data Warehouse by Dan Morris and Jason Pohl

Dashboards Spark Deep Dive Analyses

14

0%

25%

50%

75%

100%

0 1 2 3 4 5 6 7 8 9 10 11 12

Users will quickly churn if not activated within small window of time.

% o

f use

rs re

tain

ed

Page 15: Building a Just in Time Data Warehouse by Dan Morris and Jason Pohl

Deep Dive Analysis Requires Flexibility

15

Not NeededDeep Dive Analysis

S3 Spark + Databricks

• Define schema on read instead of write• Work through data quality issues just-in-time.• Tease out business question iterative and interactively.• Use programming language of your choice.

Redshift Tableau

Page 16: Building a Just in Time Data Warehouse by Dan Morris and Jason Pohl

Hypotheses Require A/B Testing

16

Statistical Analysis

Data Sets• Adobe Logs• Experiment Logs

S3 Spark + Databricks Tableau

Not Needed

Redshift Tableau

Page 17: Building a Just in Time Data Warehouse by Dan Morris and Jason Pohl

Summary of Our Setup

17

Just in Time Traditional

Primary Audience • Product Analysts • Product Team• Business

StakeholdersTasks • Exploratory

Analysis• A/B Testing

• Ad Hoc Queries• Dashboards

Tools • S3• Spark• Databricks

• Redshift• Tableau

Page 18: Building a Just in Time Data Warehouse by Dan Morris and Jason Pohl

Coming Soon…

18

• Go live with internal A/B testing platform

• Continue to evolve our setup

• Further scale model to support Product Analytics Pan-Viacom

Page 19: Building a Just in Time Data Warehouse by Dan Morris and Jason Pohl

Questions ?

19

Page 20: Building a Just in Time Data Warehouse by Dan Morris and Jason Pohl

Thank you.Other parting words or contact information go here.