NATC 2013 - Big Data Ecosystem at InMobi by Sharad Agarwal, InMobi

Post on 25-May-2015


BIG DATA ECOSYSTEM AT INMOBI

Sharad Agarwal, Nasscom ATC 2013

Technology and Product have led to InMobi being recognized by MIT as one of the Top 50 Disruptive Companies for 2013

InMobi Global Reach And Scale

Leveraging Data

[Diagram labels: Reports; Agile Reports & Analytics; Data Driven Business Decisions; Data Driven Systems; Decision Making By Humans; Decision Making by Machines; Increasing Value; Infrastructure Scaling; Data Sciences]

Data Driven Decision Making

§ Campaign Delivery
§ Marketplace Health Optimization

Business Metrics
§ Adoption Metrics
§ Product Performance Metrics and Debugging
§ Planning and Strategy – Demand, Supply and others

Exploration of new opportunities
§ New Product / Feature Ideas

Data Sciences Driven Systems

Pricing
§ Conversion Based Pricing
§ Engagement Based Pricing
§ Determining the value of Supply

Prediction
§ Prediction of Click through Rates and Conversion Rates
§ Forecasting and Planning – Inventory / Burn
§ Risk Mitigation and Management – Overburn / Fraud

Recommendation
§ App Recommendation Engine
§ Dynamic Personalization of Creatives
§ Bid Budget Recommendation

Targeting
§ Audience Segment based Targeting
§ Geo and Hyper local Targeting
§ Contextual Targeting
§ Look Alike Modelling

1. Access to Data
2. Ability to Process
3. Ability to Utilize

Data Flow

[Diagram: Data Ingestion, Data Systems, Data Consumption. Stages: Ingest, Curate, Normalize, Store, Analyze. Outputs: Reporting & Analytics; Feedback -> To power products]
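The stages named on this slide (Ingest, Curate, Normalize, Store, Analyze) can be read as a chain of small steps. The following is a minimal sketch under assumptions, not InMobi's implementation: the `event_type,country` line format, the function names and the in-memory list used as storage are all hypothetical.

```python
from collections import Counter


def ingest(raw_lines):
    """Parse 'event_type,country' lines into records (hypothetical format)."""
    for line in raw_lines:
        parts = line.strip().split(",")
        yield {
            "event_type": parts[0],
            "country": parts[1] if len(parts) > 1 else "",
        }


def curate(records):
    """Drop records that lack an event type or a country."""
    return (r for r in records if r["event_type"] and r["country"])


def normalize(records):
    """Bring fields to one canonical form (lowercase type, uppercase country)."""
    for r in records:
        yield {
            "event_type": r["event_type"].lower(),
            "country": r["country"].upper(),
        }


def store(records, table):
    """Append records to an in-memory list standing in for real storage."""
    table.extend(records)
    return table


def analyze(table):
    """Count stored events per event type."""
    return Counter(r["event_type"] for r in table)


# Two of the five sample lines are malformed and are dropped by curate().
raw = ["Click,in", "impression,US", ",us", "click,us", "broken"]
table = store(normalize(curate(ingest(raw))), [])
report = analyze(table)
print(report)
```

Each stage only consumes the output of the previous one, so a stage can be replaced (for example, a real store instead of the list) without touching the others.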

Design: Data Platform Goal

Commoditize Data Access And Processing

By Providing Rich Abstractions

Signals, Actionable Insights

InMobi Big Data Platforms

DATA INGESTION: CONDUIT + PINTAIL
DATA MGMT: FALCON
ANALYTICS: GRILL

SDK
APPLICATIONS
DATA INFRASTRUCTURE
DASHBOARD
STORM

Hosted/On-Premise, Cloud (Public/Private), Server Infrastructure

Conduit + PinTail

Collect signals: streaming, batch, multi-site. At scale. In real time.

[Diagram: DC1, DC2 and DC3 Producers; DC1, DC2 and DC3 Consumers; labels A, B, A_part1, A_part2, B_part1, B_part3; legend: Control Flow, Data Flow]
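One way to read the diagram labels is that a stream such as A is produced in parts (A_part1, A_part2) in different data centers and presented to consumers as a single stream. That reading is an assumption, and the sketch below is not Conduit's or PinTail's API: the `Event` type and `merge_stream_parts` function are hypothetical names, and each part is assumed to be already ordered by timestamp.

```python
import heapq
from typing import Iterable, Iterator, NamedTuple


class Event(NamedTuple):
    timestamp: int    # event time, e.g. epoch seconds
    datacenter: str   # where the event was produced
    payload: str


def merge_stream_parts(*parts: Iterable[Event]) -> Iterator[Event]:
    """Lazily merge time-ordered parts into one time-ordered stream."""
    return heapq.merge(*parts, key=lambda event: event.timestamp)


# Hypothetical parts of stream A, one per data center.
a_part1 = [Event(1, "DC1", "x"), Event(4, "DC1", "y")]
a_part2 = [Event(2, "DC2", "z"), Event(3, "DC2", "w")]

merged = list(merge_stream_parts(a_part1, a_part2))
for event in merged:
    print(event)
```

`heapq.merge` keeps only one pending event per part in memory, which suits long-running streams better than concatenating and sorting.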

Apache Falcon

InMobi Incubated Its Hadoop Data Management Project in Apache


GRILL

Adhoc Reporting on Logical Cube Abstraction Across Heterogeneous Storages

GRILL: Query on Cube using HQL
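The idea of a logical cube over heterogeneous storages can be illustrated with a small sketch: a query names dimensions and measures, and a resolver picks a physical table that can answer it. This is not Grill's API or its actual resolution logic; `StorageTable`, `pick_table`, `to_sql`, the table names and the cost figures are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageTable:
    name: str           # hypothetical physical table name
    storage: str        # hypothetical storage label
    columns: frozenset  # columns physically available in this table
    cost: int           # relative scan cost, lower is cheaper


def pick_table(tables, requested_columns):
    """Return the cheapest table that has every requested column."""
    needed = set(requested_columns)
    candidates = [t for t in tables if needed <= t.columns]
    if not candidates:
        raise LookupError(f"no storage can answer columns {sorted(needed)}")
    return min(candidates, key=lambda t: t.cost)


def to_sql(table, dimensions, measures):
    """Render a simple aggregate query against the chosen table."""
    select = ", ".join(list(dimensions) + [f"SUM({m}) AS {m}" for m in measures])
    return f"SELECT {select} FROM {table.name} GROUP BY {', '.join(dimensions)}"


tables = [
    StorageTable("raw_events", "hadoop",
                 frozenset({"country", "device", "clicks", "impressions"}), 100),
    StorageTable("summary_country", "columnar",
                 frozenset({"country", "clicks"}), 1),
]

chosen = pick_table(tables, ["country", "clicks"])
query = to_sql(chosen, ["country"], ["clicks"])
print(query)
```

The caller only states what it wants from the cube; whether the answer comes from a raw table or a cheaper summary is decided by the resolver.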

InMobi and Big Data – Metrics

1+ PB: Storage, Hadoop cluster
175 K: Hadoop jobs per day
240 TB: Amount of data read / written by systems in a day
8 Bn: HBase read-write throughput per day
10 Bn: Raw events per day
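To put the daily figures in per-second terms, the arithmetic can be sketched as follows. The sketch assumes load is spread evenly over the day (real traffic is not), takes 1 TB as 10^12 bytes, and pairs 8 Bn with HBase operations and 10 Bn with raw events as they appear on the slide.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

# Daily figures as quoted on the metrics slide.
daily = {
    "raw_events": 10_000_000_000,   # 10 Bn raw events per day
    "hbase_ops": 8_000_000_000,     # 8 Bn HBase reads and writes per day
    "hadoop_jobs": 175_000,         # 175 K Hadoop jobs per day
    "bytes_moved": 240 * 10**12,    # 240 TB, assuming 1 TB = 10^12 bytes
}


def per_second(daily_total):
    """Average rate per second, assuming an even spread over the day."""
    return daily_total / SECONDS_PER_DAY


rates = {name: per_second(total) for name, total in daily.items()}

for name, rate in rates.items():
    print(f"{name}: {rate:,.1f} per second")
```

Under these assumptions the averages are roughly 116,000 raw events per second, about 2 Hadoop jobs per second and about 2.8 GB read or written per second; peak rates would be higher.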

Thank You

sharad@apache.org @sharad_ag

Bangalore Hadoop Meetup
