NATC 2013 - Big Data Ecosystem at InMobi by Sharad Agarwal, InMobi

BIG DATA ECOSYSTEM AT INMOBI Sharad Agarwal Nasscom ATC 2013

Upload: nasscom

Post on 25-May-2015



TRANSCRIPT

Page 1: NATC 2013 - Big Data Ecosystem at InMobi by Sharad Agarwal, InMobi

BIG DATA ECOSYSTEM AT INMOBI

Sharad Agarwal
Nasscom ATC 2013

Page 2:

Technology and Product have led to InMobi being recognized by MIT as one of the Top 50 Disruptive Companies for 2013

Page 3:

InMobi Global Reach And Scale

Page 4:

Leveraging Data

Decision Making by Machines

Reports

Data Driven Systems

Data Driven Business Decisions

Increasing Value

Decision Making By Humans

Agile Reports & Analytics

Infrastructure Scaling

Data Sciences

Page 5:

Data Driven Decision Making

§ Campaign Delivery
§ Marketplace Health Optimization

§ Adoption Metrics
§ Product Performance Metrics and Debugging
§ Planning and Strategy – Demand, Supply and others

Business Metrics

§ New Product / Feature Ideas

Exploration of new opportunities

Page 6:

Data Sciences Driven Systems

§ Conversion Based Pricing
§ Engagement Based Pricing
§ Determining the value of Supply

Pricing

§ Prediction of Click Through Rates and Conversion Rates
§ Forecasting and Planning – Inventory / Burn
§ Risk Mitigation and Management – Overburn / Fraud

Prediction

§ App Recommendation Engine
§ Dynamic Personalization of Creatives
§ Bid Budget Recommendation

Recommendation

§ Audience Segment based Targeting
§ Geo and Hyper local Targeting
§ Contextual Targeting
§ Look Alike Modelling

Targeting

Page 7:

1. Access to Data

2. Ability to Process

3. Ability to Utilize

Page 8:

Data Flow

Data Systems

Reporting & Analytics

Feedback -> To power products

Ingest

Curate

Normalize

Store

Analyze

Data Ingestion

Data Consumption
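
The stage labels on this slide can be read as a linear pipeline. A minimal Python sketch, assuming the order Ingest, Curate, Normalize, Store, Analyze as the labels appear here; the record format, the filtering rule and every function name are hypothetical and are not InMobi's implementation:

```python
from collections import Counter

# Hypothetical raw log lines in the form "event_type,country,count".
RAW_LINES = [
    "click,IN,3",
    "CLICK, in ,2",
    "impression,US,10",
    "garbage-line",
    "heartbeat,US,1",
]

def ingest(lines):
    """Ingest: parse raw lines into records, dropping malformed ones."""
    for line in lines:
        parts = [p.strip() for p in line.split(",")]
        if len(parts) == 3 and parts[2].isdigit():
            yield {"type": parts[0], "country": parts[1], "count": int(parts[2])}

def curate(records, wanted=("click", "impression")):
    """Curate: keep only the event types downstream consumers care about."""
    for rec in records:
        if rec["type"].lower() in wanted:
            yield rec

def normalize(records):
    """Normalize: bring fields to one canonical spelling."""
    for rec in records:
        yield {**rec, "type": rec["type"].lower(), "country": rec["country"].upper()}

def store(records):
    """Store: materialize the stream (a list stands in for a real data store)."""
    return list(records)

def analyze(stored):
    """Analyze: total count per (event type, country)."""
    totals = Counter()
    for rec in stored:
        totals[(rec["type"], rec["country"])] += rec["count"]
    return dict(totals)

def run_pipeline(lines):
    """Chain the five stages in the assumed order."""
    return analyze(store(normalize(curate(ingest(lines)))))

if __name__ == "__main__":
    print(run_pipeline(RAW_LINES))
```

Each stage is a generator, so records stream through without being held in memory until the Store step.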

Page 9:

Design: Data Platform Goal

Commoditize Data Access And Processing

By Providing Rich Abstractions

Page 10:

InMobi Big Data Platforms

Signals

Actionable Insights

DATA INGESTION

CONDUIT + PINTAIL

DATA MGMT

FALCON

ANALYTICS

GRILL

SDK

APPLICATIONS

DATA INFRASTRUCTURE

DASHBOARD

Hosted/On-Premise

Cloud (Public/Private)

Server Infrastructure

STORM

Page 11:

Conduit + PinTail

Collect signals – streaming, batch, multi-site

At Scale

In Real Time

Page 12:

[Diagram: producers and consumers in three data centers (DC1, DC2, DC3); streams A and B, shown with partitions A_part1, A_part2, B_part1 and B_part3; legend: Control Flow, Data Flow]
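
The multi-site collection shown on this slide can be pictured as merging events for one stream from several data centers into a single time-ordered view. A minimal Python sketch under that assumption; the event tuples and the `merge_stream` helper are hypothetical and are not the Conduit or PinTail API:

```python
import heapq
from operator import itemgetter

# Hypothetical events for stream "A", one list per data center.
# Each event is (timestamp, data_center, payload); each list is sorted by timestamp.
dc1_events = [(1, "DC1", "a"), (4, "DC1", "d")]
dc2_events = [(2, "DC2", "b"), (5, "DC2", "e")]
dc3_events = [(3, "DC3", "c")]

def merge_stream(*partitions):
    """Lazily merge time-sorted partitions into one time-ordered stream."""
    return heapq.merge(*partitions, key=itemgetter(0))

merged = list(merge_stream(dc1_events, dc2_events, dc3_events))

if __name__ == "__main__":
    for timestamp, data_center, payload in merged:
        print(timestamp, data_center, payload)
```

`heapq.merge` reads from each input only as needed, so the same pattern works when the per-site inputs are iterators over files rather than in-memory lists.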

Page 13:

Apache Falcon

InMobi Incubated Its Hadoop Data Management Project in Apache

Page 14:

Apache Falcon

Page 15:

GRILL

Ad hoc Reporting on Logical Cube Abstraction Across Heterogeneous Storages
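
One way to picture a logical cube over heterogeneous storages: the caller names a measure and some dimensions, and a resolver picks a storage that holds all of them. A minimal Python sketch under that assumption; the `Storage` class, the cost-based choice and the sample rows are hypothetical and are not GRILL's API:

```python
from dataclasses import dataclass

@dataclass
class Storage:
    name: str
    columns: frozenset  # dimensions and measures this storage holds
    cost: int           # relative query cost, lower is cheaper (assumed)
    rows: list          # rows as dicts; stands in for a real backend

def pick_storage(storages, needed):
    """Return the cheapest storage that contains every needed column."""
    candidates = [s for s in storages if needed <= s.columns]
    if not candidates:
        raise LookupError(f"no storage covers columns {sorted(needed)}")
    return min(candidates, key=lambda s: s.cost)

def query_cube(storages, measure, group_by):
    """Sum `measure` grouped by `group_by`, hiding which storage is used."""
    storage = pick_storage(storages, frozenset([measure, *group_by]))
    totals = {}
    for row in storage.rows:
        key = tuple(row[d] for d in group_by)
        totals[key] = totals.get(key, 0) + row[measure]
    return storage.name, totals

# Hypothetical storages: detailed raw data and a cheaper per-country summary.
raw = Storage("raw", frozenset({"country", "os", "clicks"}), 10, [
    {"country": "IN", "os": "android", "clicks": 3},
    {"country": "IN", "os": "ios", "clicks": 2},
    {"country": "US", "os": "ios", "clicks": 4},
])
summary = Storage("summary", frozenset({"country", "clicks"}), 1, [
    {"country": "IN", "clicks": 5},
    {"country": "US", "clicks": 4},
])

if __name__ == "__main__":
    print(query_cube([raw, summary], "clicks", ["country"]))
    print(query_cube([raw, summary], "clicks", ["os"]))
```

The per-country query is answered from the summary, while the per-OS query falls back to the raw data, and the caller writes the same query in both cases.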

Page 16:

GRILL: Query on Cube using HQL

Page 17:

InMobi and Big Data – Metrics

1+ PB: Storage (Hadoop cluster)

175 K: Hadoop jobs per day

240 TB: Amount of data read / written by systems in a day

8 Bn: HBase read-write throughput per day

10 Bn: Raw events per day
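
To put the daily figures on this slide in per-second terms, they can be divided by the 86,400 seconds in a day. A short Python sketch, assuming the pairing of numbers to labels is 10 Bn raw events, 8 Bn HBase read-writes, 175 K Hadoop jobs and 240 TB read/written per day, that "TB" means decimal terabytes, and that load is spread evenly over the day (real traffic will peak higher):

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Daily totals as read from the slide (pairing of numbers to labels is assumed).
raw_events_per_day = 10e9       # "10 Bn" raw events
hbase_ops_per_day = 8e9         # "8 Bn" HBase read-writes
hadoop_jobs_per_day = 175_000   # "175 K" Hadoop jobs
bytes_moved_per_day = 240e12    # "240 TB", assuming 1 TB = 10**12 bytes

def per_second(daily_total):
    """Average rate per second for a total accumulated over one day."""
    return daily_total / SECONDS_PER_DAY

if __name__ == "__main__":
    print(f"raw events/s : {per_second(raw_events_per_day):,.0f}")
    print(f"HBase ops/s  : {per_second(hbase_ops_per_day):,.0f}")
    print(f"Hadoop jobs/s: {per_second(hadoop_jobs_per_day):.2f}")
    print(f"GB moved/s   : {per_second(bytes_moved_per_day) / 1e9:.2f}")
```

Under these assumptions the averages come to roughly 116 thousand raw events, 93 thousand HBase operations, 2 Hadoop jobs and 2.8 GB of data per second.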

Page 18:

Thank You

[email protected] @sharad_ag

Bangalore Hadoop Meetup