Big Data eXposed: From Big Data to Smart Data

© 2013 eXelate Inc. Confidential and Proprietary. #bdx2013. From Big Data to Smart Data: A journey into the eXelate cloud. Motty Cohen, Chief Architect, eXelate

Upload: motty-cohen

Post on 11-May-2015


DESCRIPTION

This is the deck I presented at the Big Data eXposed event, September 30, David InterContinental, Israel. In this session I’ll take the audience on a short trip through the eXelate cloud and present three big-data challenges and how we addressed them.

TRANSCRIPT

Page 1: Big Data eXposed: From Big Data to Smart Data

From Big Data to Smart Data

A journey into the eXelate cloud

Motty Cohen, Chief Architect, eXelate

Page 2

eXelate is the smart data company that powers smarter digital marketing decisions worldwide

Diagram labels: Advertiser 1st-Party Data, Data Providers, Offline Data, Online Data, Media Platforms, Modeling, Scoring, Segmentation, Analytics, Distribution, Marketing, Data Exchange Platform

Page 3

• Demographic
  • Age: 40-55
  • Urbanicity: Suburban
  • Income: High
  • Education: Graduate Plus
  • Employment: Management

• Interest
  • Sport
  • Travel
  • Wines
  • Gadgets

• Intent
  • Travel to Barcelona
  • 4-star resort

Smart Data: Accurate & actionable audience segmentation

Page 4

Our journey begins in the browser

The Internet

Page 5

Inside eXelate Cloud: Real-time Serving & Smart data delivery

Get Event Info

Add History Data

Apply Rules & Models

Sell to buyers

• 200ms
• 100+ platforms
• ~500K Rules
• ~20K Segments
• 5B Events/Day
• ~850M Unique Users
• 14TB Storage
• 27GB daily
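The four serving steps and the 200ms budget on this slide can be sketched as a small pipeline. This is only an illustration under stated assumptions, not eXelate's implementation: the `Event` type, the function names, the keyword-style rules, and the in-memory `store` dictionary are all hypothetical. The helper `average_events_per_second` simply turns the "5B Events/Day" figure into an average rate (real peaks would be higher).

```python
import time
from dataclasses import dataclass, field

SECONDS_PER_DAY = 24 * 60 * 60
BUDGET_SECONDS = 0.200  # "200ms" from the slide


def average_events_per_second(events_per_day: int) -> float:
    """Average rate implied by a daily event count."""
    return events_per_day / SECONDS_PER_DAY


@dataclass
class Event:
    # Hypothetical event shape, for illustration only.
    user_id: str
    url: str
    history: list = field(default_factory=list)
    segments: set = field(default_factory=set)


def get_event_info(raw: dict) -> Event:
    """Step 1: parse the incoming request."""
    return Event(user_id=raw["user_id"], url=raw["url"])


def add_history(event: Event, store: dict) -> Event:
    """Step 2: attach what is already known about the user."""
    event.history = store.get(event.user_id, [])
    return event


def apply_rules(event: Event, rules: dict) -> Event:
    """Step 3: a toy rule is 'keyword seen in URL or history -> segment'."""
    seen = [event.url, *event.history]
    for keyword, segment in rules.items():
        if any(keyword in item for item in seen):
            event.segments.add(segment)
    return event


def serve(raw: dict, store: dict, rules: dict, budget: float = BUDGET_SECONDS) -> list:
    """Step 4: hand segments to buyers only if the time budget was met."""
    start = time.monotonic()
    event = apply_rules(add_history(get_event_info(raw), store), rules)
    elapsed = time.monotonic() - start
    return sorted(event.segments) if elapsed <= budget else []
```

At 5B events per day the average works out to roughly 58K events per second, which is why each step has to stay cheap.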

Page 6

Challenges

Big Data

• Relevancy
• Access Time
• On-demand Analytics

Page 7

big data = noise

smart data = signal

Page 8

Challenge 1: Relevancy

Grabbing the relevant audience: on site, on time

Page 9

Generating Models

Diagram labels: Data Mining, Analytics, Create Models, Model (×3), eXtream, Netezza tables, Running Analytics on Amazon, Java Packages

Page 10

Real-time segmentation: Running rules and models

Diagram labels: Basic Rules, Association Rules, Analytic Models, Model (×3), Real-time scoring, Real-time learning, ~500K Rules, Complex Models

Can we run all these within the limited time frame?

Page 11

Continuous Incremental Segmentation

Diagram labels: Users Info, Serving Cluster, Segmentation Cluster, 0MQ
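One way to read this slide is that the serving cluster forwards user info to a separate segmentation cluster, which updates segments a little at a time instead of evaluating every rule inside the request. The sketch below illustrates that idea under several assumptions: an in-process `queue.Queue` stands in for 0MQ, a rule is modeled as a set of required attributes, and all names are hypothetical. The "incremental" part is that a new attribute triggers a re-check of only the rules that mention it.

```python
import queue
import threading
from collections import defaultdict


def build_rule_index(rules):
    """Map each attribute to the segments whose rule mentions it."""
    index = defaultdict(list)
    for segment, required in rules.items():
        for attr in required:
            index[attr].append(segment)
    return index


def segmentation_worker(inbox, rules, profiles, segments):
    """Consume (user_id, attribute) messages until a None sentinel arrives."""
    index = build_rule_index(rules)
    while True:
        message = inbox.get()
        if message is None:  # sentinel: shut down
            break
        user_id, attr = message
        profiles[user_id].add(attr)
        # Incremental step: re-check only rules touched by the new attribute.
        for segment in index.get(attr, []):
            if rules[segment] <= profiles[user_id]:
                segments[user_id].add(segment)


def run_demo():
    rules = {
        "barcelona_traveller": {"interest:travel", "intent:barcelona"},
        "wine_lover": {"interest:wines"},
    }
    inbox = queue.Queue()  # stand-in for the 0MQ link between clusters
    profiles = defaultdict(set)
    segments = defaultdict(set)
    worker = threading.Thread(
        target=segmentation_worker, args=(inbox, rules, profiles, segments)
    )
    worker.start()
    for message in [
        ("u1", "interest:travel"),
        ("u1", "intent:barcelona"),
        ("u2", "interest:wines"),
    ]:
        inbox.put(message)
    inbox.put(None)
    worker.join()
    return {user: sorted(found) for user, found in segments.items()}
```

With an attribute-to-rule index, the cost of each update depends on how many rules mention the new attribute rather than on the total rule count.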

Page 12

Challenge 2: Fast access to distributed big storage

Page 13

User Object

• User Info
  • Segments, delivery info, intermediate results
  • Object size: tens to hundreds of KB
  • ~850M unique users (UU)

• Access time
  • Read / write within a few ms

• Availability
  • For any machine in the cluster
  • For any cluster in every data center

Page 14

Aerospike: Frontend storage for fast access

Aerospike Cluster

Serving Cluster

XDR: Cross Data Center Replication

Optimized for SSD, Indexed in RAM

Smart Eviction Policy

Fast reads/writes: 500K+ TPS

Key-value NoSQL distributed DB

Page 15

Replicated storage across data centers

• US West (CA)
• US Central (TX)
• Europe (NL)
• US East (NY)

Aerospike XDR: Cross Datacenter Replication
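The general idea of replicating a key-value store across data centers can be illustrated with a toy in-memory model. This is not Aerospike's API and not how XDR is implemented: the `DataCenter` class, the explicit `ship` step, and the "highest version wins" conflict rule are all assumptions made for the sketch. It only shows that a local write returns immediately and reaches the other sites later.

```python
from collections import deque


class DataCenter:
    """Toy key-value site with an outbox of changes to replicate."""

    def __init__(self, name):
        self.name = name
        self.records = {}      # key -> (version, value)
        self.outbox = deque()  # changes not yet shipped to peers
        self.peers = []

    def put(self, key, value, version):
        """Local write; replication to peers is deferred to ship()."""
        self._apply(key, value, version)
        self.outbox.append((key, value, version))

    def get(self, key):
        entry = self.records.get(key)
        return entry[1] if entry else None

    def _apply(self, key, value, version):
        # Assumed conflict rule: keep the record with the highest version.
        current = self.records.get(key)
        if current is None or version > current[0]:
            self.records[key] = (version, value)

    def ship(self):
        """Drain the outbox to every peer (a real system does this in the background)."""
        while self.outbox:
            change = self.outbox.popleft()
            for peer in self.peers:
                peer._apply(*change)


def connect(centers):
    """Make every data center a peer of every other one."""
    for dc in centers:
        dc.peers = [other for other in centers if other is not dc]
```

For example, after `connect` on four `DataCenter` objects named for the sites on the slide, a `put` on one site is visible on the others only once `ship` has run.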

Page 16

Challenge 3: On demand analytics

Show me the data, Now!

Page 17

optiX: Interactive data analytics

On Demand Calculation

Page 18

optiX: Interactive data analytics

On Demand Calculation

Page 19

Elasticsearch: Using a search engine for counting

Diagram labels: Data Center, Netezza DWH, Aggregator, ES Cluster (30 Nodes), Reporter, S3, Loader, optiX, REST, FTP
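Why a search engine suits counting can be shown with a minimal inverted index: the work of grouping users by attribute is done once at load time, so a count query becomes a set intersection. This is a simplified stand-in, not Elasticsearch and not the optiX implementation; the `CountingIndex` class and the field names are hypothetical.

```python
from collections import defaultdict


class CountingIndex:
    """Toy inverted index: (field, value) -> set of user ids."""

    def __init__(self):
        self.postings = defaultdict(set)

    def load(self, user_id, attributes):
        """Loading phase: index every attribute of the user up front."""
        for field_name, value in attributes.items():
            self.postings[(field_name, value)].add(user_id)

    def count(self, **filters):
        """Query phase: intersect the posting sets for the requested filters."""
        sets = [self.postings.get(item, set()) for item in filters.items()]
        if not sets:
            return 0
        return len(set.intersection(*sets))


index = CountingIndex()
index.load("u1", {"income": "high", "interest": "travel"})
index.load("u2", {"income": "high", "interest": "wines"})
index.load("u3", {"income": "low", "interest": "travel"})
print(index.count(income="high", interest="travel"))
```

The index costs extra space and load-time processing, and in exchange the query never scans the raw user records.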

Page 20

What have we covered so far?

• Data relevancy
  • Real-time scoring
  • Parallel processing
  • Split processing over time

• Big data access time
  • Front-end, replicated Aerospike cluster

• On-demand analytics
  • Change your schema to optimize query time
  • Move processing from querying to loading phase
  • Trade-off: Space + Processing -> Performance

Page 21

Thank You

Questions?