big data overview

Post on 10-Sep-2014

Category:

Software

DESCRIPTION

An introduction to big data: what big data is, why we'd want it, how it applies to CSPs, and a short intro to Hadoop (some of the info is in the slide notes).

TRANSCRIPT

BIG DATA

Arnon Rotem-Gal-Oz
Director of Technology Research, Amdocs

The blind men and the elephant. Poem by John Godfrey Saxe. (Cartoon originally copyrighted by the authors; G. Renee Guzlas, artist.) http://www.nature.com/ki/journal/v62/n5/fig_tab/4493262f1.html

1880 US Census

Hollerith Tabulating Machine

Hollerith photos by Marcin Wichary: http://www.flickr.com/photos/mwichary/4358926764/in/photostream/

Source: Silicon Angle http://siliconangle.com/blog/2013/11/13/how-big-is-big-data-really/

Big data happens when the data you have to process is bigger than what you can process in the given time with current technologies

Myth: Big data = keep all data

Source: Big Data Public Private Forum : http://www.big-project.eu/sites/default/files/D2.2.1_First%20draft%20of%20Technical%20white%20papers_FINAL_v1.01_0.pdf

Some Telco Numbers

Source: Wikipedia http://upload.wikimedia.org/wikipedia/commons/5/50/Telephone_operators,_1952.jpg

So, what do we do with all this data?

Source: Wikipedia http://upload.wikimedia.org/wikipedia/commons/0/06/UPS_Truck.jpg

It’s the insights, stupid*

* With apologies to Bill Clinton

Source: Silicon Angle http://siliconangle.com/blog/2013/11/13/how-big-is-big-data-really/

Big data analytics is when sample = N

• Big data happens when the data you have to process is bigger than what you can process in the given time with current technologies

We need to watch out that analytics doesn't get too creepy

When people hear big data they think fast data

Source: Steve Jones, Capgemini http://www.no.capgemini.com/node/778541

(Simplified) network proactive care flow

Data: Subscribers, Account, Event Store

• Collect & Filter
• Correlate
• Identify & Predict Network Failures
• Identify impact on high-value Accounts
• Reimburse VIPs
• Prioritize technicians

Source: Silicon Angle http://siliconangle.com/blog/2013/11/13/how-big-is-big-data-really/

Big data is when we can handle data fast enough to make a difference

• Big data happens when the data you have to process is bigger than what you can process in the given time with current technologies

• Big data analytics is when sample = N

Technology space

The Elephant in the room

Hadoop Stack

• Map/Reduce
• HDFS
• HBase
• Pig
• Hive
• ZooKeeper
• Oozie
• Mahout
• Giraph

Schema on read
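Schema on read means the raw data is stored as it arrives and a structure is applied only when somebody reads it. The sketch below illustrates the idea in plain Python rather than in any Hadoop tool; the event fields, the two schemas, and the `read_with_schema` helper are all hypothetical names made up for illustration.

```python
import json

# Raw events are kept exactly as they arrived; nothing is validated on write.
# (Hypothetical sample records, not taken from the slides.)
RAW_EVENTS = [
    '{"subscriber": "A1", "bytes": "1200", "cell": "TLV-07"}',
    '{"subscriber": "B2", "bytes": "300"}',
    '{"subscriber": "A1", "bytes": "500", "roaming": true}',
]

# Two different readers can impose two different schemas on the same raw data.
USAGE_SCHEMA = {"subscriber": str, "bytes": int}
LOCATION_SCHEMA = {"subscriber": str, "cell": str}


def read_with_schema(raw_lines, schema):
    """Parse each raw line and project it onto `schema` at read time."""
    for line in raw_lines:
        record = json.loads(line)
        # Fields missing from the raw record become None;
        # fields that are present are cast to the type the schema asks for.
        yield {
            field: cast(record[field]) if field in record else None
            for field, cast in schema.items()
        }


usage = list(read_with_schema(RAW_EVENTS, USAGE_SCHEMA))
locations = list(read_with_schema(RAW_EVENTS, LOCATION_SCHEMA))
total_bytes = sum(row["bytes"] for row in usage)
```

The same three raw lines serve both a usage view and a location view, and a new view can be added later without rewriting what was stored.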

Move data to computation

Maybe we should rethink moving data to computation…

Source : http://my-inner-voice.blogspot.co.il/2012/06/haddop-101-paper-by-miha-ahronovitz-and.html

Map/reduce

Source: http://www.bodhtree.com/blog/2012/10/18/ever-wondered-what-happens-between-map-and-reduce/
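As a rough illustration of the map/reduce flow, here is a single-process word count in plain Python with the map, shuffle, and reduce steps kept separate. It is a sketch of the programming model only, not Hadoop's API; the function names and the sample input are assumptions chosen for the example.

```python
from collections import defaultdict


def map_phase(record):
    """Map: emit a (word, 1) pair for every word in one input record."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as happens between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse the list of values for one key into a single result."""
    return key, sum(values)


def map_reduce(records):
    mapped = (pair for record in records for pair in map_phase(record))
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = map_reduce(["big data", "fast data", "big insights"])
```

In a real cluster each map call can run on the node that holds its slice of the input, and the shuffle moves only the intermediate pairs across the network.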

Customer Segmentation

First name | Last name | ARPU | Age | Device | Country | …
Mr. | Smith | 100 | 22 | iPhone 5s, White | USA |
John | Doe | 87 | 42 | Samsung Galaxy S5, Gold | France |
Lady | In Red | 105 | 21 | Samsung Note 3, White | UK |

Uluru, Australia by Stuart Edwards (cc) http://en.wikipedia.org/wiki/Uluru#mediaviewer/File:Uluru_Panorama.jpg

K-Means

[Figure: scatter plot of ARPU against Age]

Source: http://pypr.sourceforge.net/kmeans.html

K=3

[Figure: ARPU against Age, before and after clustering with K=3]

Source: http://pypr.sourceforge.net/kmeans.html
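To make the K-Means segmentation concrete, the sketch below clusters (Age, ARPU) pairs with K=3 in plain Python. Three of the points come from the customer table; the other six are made-up values added so that each cluster has some members. The deterministic farthest-first seeding is a simplification chosen for reproducibility (K-Means is usually seeded randomly), and the features are left unscaled, which a real segmentation would probably not do.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first(points, k):
    """Deterministic seeding: start at the first point, then repeatedly
    take the point farthest from all centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        )
    return centroids


def k_means(points, k, max_iterations=100):
    centroids = farthest_first(points, k)
    labels = []
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for i in range(k):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return labels, centroids


# (age, ARPU) pairs. The first point of each group is from the table above;
# the rest are hypothetical.
subscribers = [
    (21, 105), (22, 100), (24, 98),
    (42, 87), (45, 80), (40, 85),
    (65, 30), (70, 25), (68, 35),
]
labels, centroids = k_means(subscribers, k=3)
```

Each entry in `labels` is the segment index for the subscriber at the same position, and `centroids` gives the average age and ARPU of each segment.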

New paradigms

• Map/Reduce
• HDFS
• HBase
• Pig
• Hive
• ZooKeeper
• Oozie
• Mahout
• Giraph

New Paradigms

• Map/Reduce
• HDFS
• HBase
• Pig
• Hive
• ZooKeeper
• Oozie
• Mahout
• YARN
• Giraph

New Paradigms

• Map/Reduce
• HDFS
• HBase
• Pig
• Hive
• ZooKeeper
• Oozie
• Mahout
• YARN
• Giraph
• Spark
• Storm
• Slider
• Flink
• Impala
• Tez
• Presto

Amdocs Analytics & Data Management Heritage

[Timeline: 2000, 2008, 2013, with rows for Acquisitions and Portfolio]

• Proactive Care
• TerraScale
• Network optimization
• Real time analytics platform
• Single product catalog
• BSS–OSS Integration
• CRM-Billing Integration
• aLDM logical data model
• Policy control
• 16 Analytics Patents

Other labels: OSS, Analytics Platform, Network Analytics, CRM

Information Security Level 2 – Sensitive. © 2014 – Proprietary and Confidential Information of Amdocs

Insight Platform

Touchpoints & Applications

• CRM
• Self Service
• E-Mail
• PCRF
• SMS
• Other
• Wi-Fi Offload
• Campaign Mng.

Operational Envelope & Platform Administration

• Security Management
• Configuration Management
• Services Inventory
• Performance Management
• Fault Management
• Logger

Processing

• Collect & Ingest
• Transform & Enrich
• Aggregate & Correlate
• Machine Learn & Score
• Drive Insight
• Close the Loop

Data

• Application-Ready Data and Analytics/ML Insights
• Entities and Profiles
• Detailed Data

Real-Time & Batch Connectors

• OSS
• Probes
• Social
• RAN
• Inventory
• Usage & Charging
• CRM

Analytical Application Framework

• Dashboards & Visualisation
• Decisioning Engine
• Dynamic Micro Segmentation

Application areas: Marketing, Network, Care, Operations

Source: Silicon Angle http://siliconangle.com/blog/2013/11/13/how-big-is-big-data-really/

• Big data happens when the data you have to process is bigger than what you can process in the given time with current technologies

• Big data analytics is when sample = N

• Big data is when we can handle data fast enough to make a difference

Additional takeaways

• CSPs have always been in the big data business – they just didn’t know it

• Big data is not a panacea

• Hadoop is shaping up as the big data OS
  – Though there are alternatives arriving from the cloud arena (Mesos, Kubernetes)

What we covered here is not even the tip of the iceberg

Source: wikimedia http://commons.wikimedia.org/wiki/File:Iceberg.jpg

Arnon Rotem-Gal-Oz
Director of Technology Research, Amdocs
arnonrot@amdocs.com / arnon@rgoarchitects.com
