big data bi simplified

Big Data BI Simplified Rohit Chatter [email protected] Twitter: @Rohitchatter

Upload: inmobi-technology

Post on 27-Jun-2015

Category:

Technology


DESCRIPTION

In today's world, Big Data is a buzzword, and enabling BI on it seems like a dream come true. Traditionally, BI systems have run on RDBMSs and used the star model to support data warehouse (DWH) queries. Imagine enabling the same for data held in Hadoop clusters alongside RDBMS data, lowering the barrier for business users to work with that data. The slides cover this theme.

TRANSCRIPT

Page 1: Big Data BI Simplified

Big Data BI Simplified

Rohit Chatter [email protected] Twitter: @Rohitchatter

Page 2: Big Data BI Simplified

Agenda

Who I am – Rohit Chatter

Big Data – Opportunities & Challenges

What’s Inside – 10,000 Feet

Use Cases

The Big Data Product

Big Data - Preview

Page 3: Big Data BI Simplified

Rohit Chatter was a Senior Architect at Yahoo! in the Advertiser and Data Platform group. He is now Principal Architect - Analytics at inMobi.

He is a thought leader specializing in designing solutions that involve huge amounts of data. He architected the Paid Search BI stack for the Microsoft-Yahoo alliance, which uses Hadoop, Hive, GraphDB & HBase.

He has deep knowledge and understanding of various usage models involving traditional databases and newer Big Data platforms to provide customer-centric and cost-effective solutions.

He has spent 17 years in the industry. Before joining inMobi, he worked for companies such as Yahoo!, Tivo, Alcatel Lucent and TCS. His recent projects include BI solutions for Paid Search Advertiser Analytics, Partner Analytics and Web Analytics.

Business Domain: Web Analytics, Search Advertising Analytics, Publisher Analytics

Technology: Hadoop, Hive, HBase, RDBMS, BI tools & technology, Data Modeling

[email protected]@TDWI Bangalore Chapter

Panel Member @ Hadoop The Fifth Elephant

Page 4: Big Data BI Simplified

Page 5: Big Data BI Simplified

Today’s Dynamic World

Page 6: Big Data BI Simplified

“Information is the oil of the 21st century,

...and analytics is the combustion engine.”

“Unfortunately, we spend 80% of the time collecting data and 20% analyzing it.”

“With increasing importance of precise and timely insights, analysts want to be able to deliver accurate data reports quickly.”

Page 7: Big Data BI Simplified

Big Data

Page 8: Big Data BI Simplified

Page 9: Big Data BI Simplified
Page 10: Big Data BI Simplified

Business Problems

Scale
• Data growth with time
• Granularity needed for right business decisions

Data Reach
• Ease of Data Access
• Distance between Data and Business
• One time reports for investigation or validation of analysis

Reprocessing
• Data reprocessing becomes a nightmare
• IT always in catch-up mode

Timely Insights
• Data acquisition to Insight – In Time

Low Flexibility for new Reports & Dashboards
• Add new dimension and metrics with complex business rules
• Modify reports
• New dashboards

Engineering Involvement
• Huge dependency on IT/BI team on a day to day basis

IT/BI Business

Page 11: Big Data BI Simplified

Page 12: Big Data BI Simplified

What should Big Data BI have?

BI Framework on Hadoop
• Custom Reports & Dashboard
• Canned & Schedule based reports
• Cubes (Yes!! On Hadoop)
• Pivot interface for Visualization & Dashboard

STAR Model on Hadoop
• Define Entities & Relationship
• Define complex metrics
• Define dimensions

Data to Analytics - Improved SLAs
• Significantly reduces time to analytics from the time data is acquired

Single Sign On

Analytics, Dashboards & Reports
• Business grouping of reports
• Report Designer
• Dashboard Builder
• Adhoc Analysis

Scalable & Pluggable architecture
• Any Source: HBase, Solr, Graphdb, Pig, Shark, Impala, Hive, Oracle, MySQL

Data Re-processing – Simplified
• All data processing happens on grid and stays on grid

Security
• Report & Data access are managed via roles
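To make the STAR model idea concrete, here is a minimal sketch in plain Python of a fact table joined to one dimension, with a derived metric computed from summed measures. The table names, columns and values are all hypothetical and are not taken from the slides; a real deployment would express the same join and roll-up in Hive or another engine on the grid.

```python
from collections import defaultdict

# Hypothetical dimension table: campaign_id -> descriptive attributes.
campaign_dim = {
    1: {"campaign": "Spring Sale", "country": "IN"},
    2: {"campaign": "Brand Push", "country": "US"},
}

# Hypothetical fact table: additive measures keyed to the dimension.
impression_facts = [
    {"campaign_id": 1, "impressions": 1000, "clicks": 50},
    {"campaign_id": 1, "impressions": 3000, "clicks": 90},
    {"campaign_id": 2, "impressions": 2000, "clicks": 20},
]


def rollup(facts, dim, dimension_name):
    """Join facts to the dimension and sum the base measures per dimension value."""
    totals = defaultdict(lambda: {"impressions": 0, "clicks": 0})
    for row in facts:
        key = dim[row["campaign_id"]][dimension_name]
        totals[key]["impressions"] += row["impressions"]
        totals[key]["clicks"] += row["clicks"]
    return dict(totals)


def ctr(measures):
    """Derived metric: click-through rate computed from the summed measures."""
    return measures["clicks"] / measures["impressions"]


by_campaign = rollup(impression_facts, campaign_dim, "campaign")
for name, measures in sorted(by_campaign.items()):
    print(name, measures["impressions"], measures["clicks"], round(ctr(measures), 4))
```

Computing the ratio after aggregation, rather than averaging per-row ratios, is what keeps a derived metric like this correct at any level of the dimension.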

Page 13: Big Data BI Simplified

What all it should do for you?

Page 14: Big Data BI Simplified

Simplify?

[Bar chart: BigData BI vs. Others, compared on Data to Insights, Data Accessibility, Self Serve and New Dashboard; scale 0 to 25; labelled "In Hours" and "Days"]

INGEST, DEFINE RELATIONSHIP, VISUALIZE

Page 15: Big Data BI Simplified

Page 16: Big Data BI Simplified

Stack

Page 17: Big Data BI Simplified

Page 18: Big Data BI Simplified

Where all BigData BI can help?

Media Industry
• Audience Engagement, User Value life cycle, User Behavior
• Ad Network – Campaign optimization, Better ROI, Brand Performance
• Exchange

E-commerce
• Recommendation engine
• Sentiment Analysis & Brand loyalty

Page 19: Big Data BI Simplified

CHURN PREDICTION FOR A TELECOM OPERATOR

► Dependent variable to define attritors: a customer was defined as an attritor if they had made fewer than 2 calls over a period of 3 months

► Logistic regression was used to develop a model equation to calculate an attrition propensity score for all customers

► Customer scores were used to rank customers into high, medium and low attritors.
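The three steps above can be sketched as follows. The attritor rule comes from the slide; the features, coefficients and band cut-offs are hypothetical placeholders, since the slides do not give the fitted model.

```python
import math


def is_attritor(calls_last_3_months):
    """Label from the slide: fewer than 2 calls over a 3-month period."""
    return calls_last_3_months < 2


# Hypothetical coefficients; a real model would estimate these from labelled data.
INTERCEPT = 1.0
COEFFICIENTS = {"calls_per_month": -0.30, "months_since_recharge": 0.50}


def attrition_score(customer):
    """Logistic model: 1 / (1 + exp(-(intercept + sum(coef * feature))))."""
    z = INTERCEPT + sum(coef * customer[name] for name, coef in COEFFICIENTS.items())
    return 1.0 / (1.0 + math.exp(-z))


def risk_band(score, low_cut=0.33, high_cut=0.66):
    """Rank a propensity score into low / medium / high (cut-offs are assumptions)."""
    if score >= high_cut:
        return "high"
    if score >= low_cut:
        return "medium"
    return "low"


# Hypothetical customers with the two illustrative features.
customers = {
    "A": {"calls_per_month": 20, "months_since_recharge": 0},
    "B": {"calls_per_month": 4, "months_since_recharge": 1},
    "C": {"calls_per_month": 0, "months_since_recharge": 3},
}
for customer_id, features in customers.items():
    score = attrition_score(features)
    print(customer_id, round(score, 3), risk_band(score))
```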

Based on the model, customers were proactively targeted with a marketing offer, which reduced attrition and resulted in $2.3 MM inc. volume

Observed value \ Predicted value   Likelihood for attrition   Likelihood for no attrition   Total
Customers on Attrition             8,422                      1,824                         10,246
Customers on No attrition          1,708                      14,012                        15,720
Total                              10,130                     15,836                        25,966

The statistical Model performed 2.67% better than random prediction
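As a reading aid, the counts in the table above can be turned into the usual classification metrics. This sketch only does arithmetic on the four published cells; treating "attrition" as the positive class is an assumption made here.

```python
# Counts from the table (observed rows, predicted columns).
true_pos = 8422    # observed attrition, predicted attrition
false_neg = 1824   # observed attrition, predicted no attrition
false_pos = 1708   # observed no attrition, predicted attrition
true_neg = 14012   # observed no attrition, predicted no attrition

total = true_pos + false_neg + false_pos + true_neg

# Share of all customers classified correctly.
accuracy = (true_pos + true_neg) / total
# Of the customers predicted to attrite, the share that actually did.
precision = true_pos / (true_pos + false_pos)
# Of the customers that actually attrited, the share the model flagged.
recall = true_pos / (true_pos + false_neg)

print("total customers:", total)
print("accuracy:", round(accuracy, 3))
print("precision:", round(precision, 3))
print("recall:", round(recall, 3))
```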

BUSINESS IMPACT

SOLUTION APPROACH

Identify the risky customers and develop focused strategies to retain them.

Page 20: Big Data BI Simplified

Customer Life Time Value

Segmentation:

The natural segmentation conducted through K-Means clustering showed 4 distinct segments:
S1 – Utility Customers
S2 – Premium and Loyal Customers
S3 – Premium and careful
S4 – Service shy

The final cluster consisted of low-value customers, though the number of customers in that segment was high.
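For readers unfamiliar with K-Means, the sketch below shows the basic assign-and-update loop (Lloyd's algorithm) on a toy data set. The customer features, the data points and the starting centroids are hypothetical; the slides do not say which variables were clustered, and a production run would use a library implementation on scaled features.

```python
def closest(point, centroids):
    """Index of the centroid nearest to point (squared Euclidean distance)."""
    distances = [
        sum((p - c) ** 2 for p, c in zip(point, centroid)) for centroid in centroids
    ]
    return distances.index(min(distances))


def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm: assign each point, then move centroids to cluster means."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for point in points:
            clusters[closest(point, centroids)].append(point)
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, [closest(point, centroids) for point in points]


# Hypothetical customers as (annual spend, visits per year).
points = [
    (100, 10), (110, 12),
    (900, 40), (920, 42),
    (880, 5), (900, 6),
    (20, 1), (30, 2),
]
# Hypothetical starting centroids, one per expected segment.
start = [(100, 10), (900, 40), (900, 5), (25, 1)]

centroids, labels = kmeans(points, start)
print(labels)
print(centroids)
```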

SOLUTION APPROACH

► Behavioral change among the customers falling in the two groups of interest represented over $12 million of revenue to be gained annually

► The CLTV value was compared with marketing investment per customer to assess the viability of customer acquisition. The organization was able to reduce marketing investment by 35% and increase revenue by 43%.

BUSINESS IMPACT

SAMPLE OUTPUT: LABOR REVENUE FROM 4 SEGMENTS

[Chart: Labor Revenue (0 to 2000) against Total Revenue Share (0% to 45%) for segments S1, S2, S3 and S4]

Customer Life Time Value

The NPV method was employed for calculating the Customer Life Time Value (CLTV).

A CLTV model was built for each segment, and the CLTV of each customer was calculated.

Based on the CLTV values, a further segmentation of customers was done as: High value, Moderate value and Low value.

SAMPLE OUTPUT: Top 10 customers of S2 Segment CLTV

Develop targeted marketing programs for high potential/high value clients
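The NPV-based CLTV calculation and the High / Moderate / Low banding described on this slide can be sketched as below. The discount rate, the per-period margins and the band thresholds are hypothetical; the slides do not state the values that were used.

```python
def cltv_npv(cash_flows, discount_rate):
    """Net present value of expected per-period margins, first period at t = 1."""
    return sum(
        cash_flow / (1 + discount_rate) ** period
        for period, cash_flow in enumerate(cash_flows, start=1)
    )


def value_band(cltv, moderate_cut, high_cut):
    """Band a CLTV figure as High, Moderate or Low value (cut-offs are assumptions)."""
    if cltv >= high_cut:
        return "High value"
    if cltv >= moderate_cut:
        return "Moderate value"
    return "Low value"


# Hypothetical expected yearly margins per customer.
expected_margins = {
    "customer_1": [110.0, 121.0],
    "customer_2": [20.0, 20.0, 20.0],
    "customer_3": [500.0, 500.0, 500.0],
}
DISCOUNT_RATE = 0.10  # assumed annual discount rate

for customer_id, margins in expected_margins.items():
    cltv = cltv_npv(margins, DISCOUNT_RATE)
    print(customer_id, round(cltv, 2), value_band(cltv, moderate_cut=100, high_cut=1000))
```

Comparing each customer's CLTV with the marketing spend needed to acquire or retain them is the viability check the business impact section refers to.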

Page 21: Big Data BI Simplified

Thank You [email protected]