Post on 16-Dec-2015


THE DATA LAKE DREAM

Edd Dumbill • @edd • edd@svds.com • svds.com/StrataNY2014

2 © 2014 SILICON VALLEY DATA SCIENCE LLC. ALL RIGHTS RESERVED.

WHAT IS A DATA LAKE?

A scalable, accessible repository of data (in its natural or processed state)

CONVENTIONAL DATA STRATEGY

“WHAT YOU DO TO DATA”

CLEAN • VALIDATE • CONTROL • PROTECT

MODERN DATA STRATEGY

“WHAT YOU DO WITH DATA”

TARGET VIP CUSTOMERS • ATTRACT NEW CUSTOMERS • AUTOMATE

TOWARDS THE “DATA LAKE” — Step 1

TOWARDS THE “DATA LAKE” — Step 2

TOWARDS THE “DATA LAKE” — Step 3

TOWARDS THE “DATA LAKE” — Step 4

UP vs. OUT — Enterprise Edition

Different use cases put different demands on the data infrastructure.

Increasing cost per unit of capability from scale-up architectures causes rationing of resources. Only the most valuable use cases are pursued.

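[Chart: cost in US dollars (vertical axis) against data resource usage (horizontal axis), with one curve for scale-up cost and one for scale-out cost, and use cases UC1 to UC5 marked along the usage axis]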
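The rationing argument on this slide can be sketched with a toy calculation. Everything below is an assumption made for illustration: the two cost formulas, the resource needs, and the dollar values are hypothetical and do not come from the talk. The sketch only shows the shape of the reasoning: when cost per unit rises with scale, fewer use cases clear the bar.

```python
# Hypothetical illustration: every number and formula here is an assumption,
# not a figure from the talk.

def scale_up_cost(units: float) -> float:
    """Assumed superlinear cost: each extra unit of capability costs more."""
    return 10.0 * units ** 2


def scale_out_cost(units: float) -> float:
    """Assumed linear cost: each extra unit of capability costs the same."""
    return 40.0 * units


# Hypothetical use cases: (name, resource units needed, business value in USD)
USE_CASES = [
    ("UC1", 2, 500.0),
    ("UC2", 4, 400.0),
    ("UC3", 6, 450.0),
    ("UC4", 8, 500.0),
    ("UC5", 10, 600.0),
]


def worthwhile(cost_fn):
    """Return the names of use cases whose value exceeds their cost."""
    return [name for name, units, value in USE_CASES if value > cost_fn(units)]


print("scale-up :", worthwhile(scale_up_cost))
print("scale-out:", worthwhile(scale_out_cost))
```

Under these made-up numbers, the scale-up curve prices out the two most resource-hungry use cases, while the scale-out curve leaves all five worth pursuing.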

THE DATA VALUE CHAIN

DRAW VALUE FROM YOUR STRATEGIC DATA ASSETS

Discover → Ingest → Process → Persist → Integrate → Analyze → Expose

BUILD FOR EXPERIMENTS

• Make it cheap
• Failure as a feature
• Ask good questions
• Make it quick
• Both learning and adaptation
• Enable the feedback loop
• Don’t break things
• Make operations a platform for innovation
• APIs, platforms, simulation

THE EXPERIMENTAL ENTERPRISE

We need to both support investigative work and build a solid layer for production.

Data science allows us to observe our experiments and respond to the changing environment.

The foundation of the experimental enterprise focuses on making infrastructure readily accessible.

Edd Dumbill • edd@svds.com • @edd

@SVDataScience • info@svds.com

Yes, we’re hiring!

Want these slides? Go to: svds.com/StrataNY2014