wake up and smell the data

Wake Up and Smell the Data February, 2013 Mark Madsen www.ThirdNature.net @markmadsen

Upload: mark-madsen

Post on 27-Jan-2015


Category:

Technology



DESCRIPTION

Big data is a big part of the disruption hitting this market, but not in the way most people think. It's not replacing the data warehouse, but it is changing the technology stack. It doesn't eliminate data management, but it does redefine enterprise data architecture. Big data is and isn't many things. It's important to understand which information uses are well supported and which have yet to be addressed. Otherwise you risk replacing one set of problems with another. Come to this session to hear some observations on what big data is, isn't and aspires to be. A video is available, starts at 1:03 into this Strata online event: http://www.youtube.com/watch?v=gLsHI1ZglKw

TRANSCRIPT

Page 1: Wake up and smell the data

Wake Up and Smell the Data

February, 2013

Mark Madsen www.ThirdNature.net @markmadsen

Page 2: Wake up and smell the data

Caveat

The focus of this talk is on information processing and delivery, leaving out many aspects of big data in the automation / execution sense.

Page 3: Wake up and smell the data

Big Data, Big Hype

$876 Gajillion (analyst estimates of the big data market)

Page 4: Wake up and smell the data

We’ve been here before

Bill Schmarzo, EMC

Page 5: Wake up and smell the data

Big Data, Big Nonsense

Big data is subjective, based on bigness at a point in time?

McKinsey focused on the least interesting aspect of big data.

Source: McKinsey

Page 6: Wake up and smell the data

Data volume is the oldest, easiest problem

Image courtesy of Teradata

Page 7: Wake up and smell the data

Technology Capability and Data Volume

Source: Noumenal, Inc.

Page 8: Wake up and smell the data

Origin of BI and data warehouse concepts

The general concept of a separate architecture for BI has been around longer, but this paper by Devlin and Murphy is the first formal data warehouse architecture and definition published.


“An architecture for a business and information system”, B. A. Devlin, P. T. Murphy, IBM Systems Journal, Vol. 27, No. 1 (1988)

Copyright Third Nature, Inc.

Page 9: Wake up and smell the data

Our ideas about information and how it’s used are outdated.

Page 10: Wake up and smell the data

Metadata catalog

Page 11: Wake up and smell the data

Report

Page 12: Wake up and smell the data

Report library

Page 13: Wake up and smell the data

BI is using broken metaphors

We think of BI as publishing, which it isn’t.

Page 14: Wake up and smell the data

When you first give people access to information that was unavailable…

OH GOD I can see into forever

Page 15: Wake up and smell the data

After a while the response is more measured

Page 16: Wake up and smell the data

User autonomy is a tradeoff

Autonomy is a tradeoff in most data warehouses: control at the expense of complexity.

Complexity for casual users can lead to messes.

So we err on the side of simplifying user access in three ways…

Page 17: Wake up and smell the data

Centralize: that solves all problems!

Creates bottlenecks

Causes scale problems

Enforces a single model

In some organizations and areas of business “data warehouse” is a bad word.

Page 18: Wake up and smell the data

Standardize: it’s simpler for everyone

Page 19: Wake up and smell the data

The “E” in EDW was a lie…

Page 20: Wake up and smell the data

Measurement started with the convenient data

The convenient data is transactional data.
▪ Goes in the DW and is used, even if it isn’t the right measurement.

The difficult and misleading data is declarative data.
▪ What people say and what they do require ground truth.

The inconvenient data is observational data.
▪ It’s not neat, clean, or designed into most systems of operation.

We need to build data systems that integrate all three.

Page 21: Wake up and smell the data

Value: There’s a pony in there somewhere

Page 22: Wake up and smell the data

Many current views miss the point

Using Big Data

Page 23: Wake up and smell the data

It’s not about “big”

Using Big Data

And “big” is often not as big as you think it is.

Page 24: Wake up and smell the data

It’s not really about data, either

Using Big Data

If there’s no process for applying information in a specific context then you are producing expensive trivia.

Page 25: Wake up and smell the data

Two keys to making big data worthwhile

Value: Goal → solution, not Solution → goal

Actionability: Simple “value” isn’t enough.

Information has to be actionable, somehow.

Page 26: Wake up and smell the data

Planning data strategy means understanding the context of data use so we can provide infrastructure

[Diagram: Monitor, Analyze Exceptions, Analyze Causes, Decide, Act; exits: No problem, No idea, Do nothing]

We need to focus on what people do with data as the primary task, not on the data or the technology.


Page 27: Wake up and smell the data

General model for organizational use of data

[Diagram: Collect new data, Monitor, Analyze Exceptions, Analyze Causes, Decide, Act; exits: No problem, No idea, Do nothing]

Act on the process (usually days/longer timeframe)

Act within the process (usually real-time to daily)

Page 28: Wake up and smell the data

You need to be able to support both paths

[Diagram: Collect new data, Monitor, Analyze Exceptions, Analyze Causes, Decide, Act]

Act on the process

Act within the process

Conventional BI

Causal analysis, i.e. “data science”

Page 29: Wake up and smell the data

How do you manage the business in today’s environment?

Our simplistic notions of BI with stable models, ordered data and predictability are being replaced by concepts from decision support and complex adaptive systems (CAS).

Simple
▪ Assumption: Order
▪ Cause and effect is repeatable & predictable
▪ Known
▪ Standard processes, clear metrics, best practice
▪ Sense, categorize, respond
▪ Reporting, dashboards

Complicated
▪ Assumption: Unorder
▪ Cause and effect is separated in time & space, repeatable, learnable
▪ Knowable
▪ Analytical techniques to determine options, effects
▪ Sense, analyze, respond
▪ Ad‐hoc, OLAP, exploration

Complex
▪ Assumption: Disorder
▪ Cause and effect is coherent in retrospect only, modelable but changing
▪ Unpredictable
▪ Experiment to create possible options
▪ Test, sense, respond
▪ Data science, causal analysis

Situational context governs data use

Page 30: Wake up and smell the data

BI/DW environment support varies for these contexts

Basic BI
▪ Assumption: Order
▪ Cause and effect is repeatable & predictable
▪ Known
▪ Standard processes, clear metrics, best practice
▪ Sense, categorize, respond
▪ Reporting, dashboards
▪ Handles this really well (most of the time).

Analysis
▪ Assumption: Unorder
▪ Cause and effect is separated in time & space, repeatable, learnable
▪ Knowable
▪ Analytical techniques to determine options, effects
▪ Sense, analyze, respond
▪ Ad‐hoc, OLAP, data discovery
▪ Handles this sort of ok, sometimes.

Data science, analytics
▪ Assumption: Disorder
▪ Cause and effect is coherent in retrospect only, modelable but changing
▪ Unpredictable
▪ Experiment to create possible options, test hypotheses
▪ Test, sense, respond
▪ Causal analysis, simulation
▪ This, not so much.


Page 31: Wake up and smell the data

TANSTAAFL

Technologies are not perfect replacements for one another.

When replacing the old with the new (or ignoring the new over the old) you always make tradeoffs, and usually you won’t see them for a long time.

Page 32: Wake up and smell the data

The usage models for conventional BI

[Diagram: Collect new data, Monitor, Analyze Exceptions, Analyze Causes, Decide, Act; exits: No problem, No idea, Do nothing]

Act on the process (usually days/longer timeframe)

Act within the process (usually real-time to daily)

This is what we’ve been doing with BI so far: static reporting, dashboards, ad-hoc query, OLAP

Page 33: Wake up and smell the data

The usage models for analytics and “big data” 

[Diagram: Collect new data, Monitor, Analyze Exceptions, Analyze Causes, Decide, Act; exits: No problem, No idea, Do nothing]

Act on the process (usually days/longer timeframe)

Act within the process (usually real-time to daily)

Analytics and big data are focused on new use cases: deeper analysis, causes, prediction, optimizing decisions

This isn’t ad-hoc, reporting, or OLAP.

Page 34: Wake up and smell the data

Analytics embiggens the data volume problem

Many of the processing problems are O(n²) or worse, so even moderate data volumes can be a problem for DB‐based platforms.
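
To make the quadratic-growth point concrete, here is a minimal sketch (not from the talk), assuming an analysis that compares every record with every other record, as naive similarity matching does. The function name and the row counts are hypothetical, chosen only for illustration.

```python
def pairwise_comparisons(n: int) -> int:
    """Number of unordered record pairs among n records: n * (n - 1) / 2."""
    return n * (n - 1) // 2


# Hypothetical row counts, used only to show how quickly the work grows.
for rows in (1_000, 100_000, 10_000_000):
    pairs = pairwise_comparisons(rows)
    print(f"{rows:>12,} rows -> {pairs:>22,} pairs")
```

In this sketch a 100x increase in rows gives roughly a 10,000x increase in comparisons, which is why data that is small by storage standards can still be expensive to analyze.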

Page 35: Wake up and smell the data

New and growing use cases drive the need to expand

The use cases are now interactive applications, lower latency data, complex analytics and discovery rather than reporting.

Page 36: Wake up and smell the data

Big Data Shift in a Nutshell

The old model for data
▪ Centralized publishing
▪ Read only
▪ Integrate before use
▪ Record only important data
▪ Retrieval‐focused
▪ Single method of access
▪ Human‐level latency

The new model for data
▪ Community creation
▪ Read‐write
▪ Integrate at time of use
▪ Record all the data
▪ Processing‐focused
▪ Multiple methods of access
▪ Machine‐level latency

It’s an architectural reconfiguration, just like web 2.0

Page 37: Wake up and smell the data

“The future, according to some scientists, will be exactly like the past, only far more expensive.” ~ John Sladek

Page 38: Wake up and smell the data

About the Presenter

Mark Madsen is president of Third Nature, a research and advisory firm focused on analytics, business intelligence and data management. Mark is an award‐winning author, architect and CTO whose work has been featured in numerous industry publications. Over the past ten years Mark received awards for his work from the American Productivity & Quality Center, TDWI, and the Smithsonian Institution. He is an international speaker and a contributor at Forbes Online and Information Management. For more information or to contact Mark, follow @markmadsen on Twitter or visit http://ThirdNature.net

Page 39: Wake up and smell the data

About Third Nature

Third Nature is a research and consulting firm focused on new and emerging technology and practices in analytics, business intelligence, and performance management. If your question is related to data, analytics, information strategy and technology infrastructure then you’re at the right place.

Our goal is to help companies take advantage of information-driven management practices and applications. We offer education, consulting and research services to support business and IT organizations as well as technology vendors.

We fill the gap between what the industry analyst firms cover and what IT needs. We specialize in product and technology analysis, so we look at emerging technologies and markets, evaluating technology and how it is applied rather than vendor market positions.

Page 40: Wake up and smell the data

CC Image Attributions

Thanks to the people who supplied the creative commons licensed images used in this presentation:

Outdated gumshoe.jpg – http://flickr.com/photos/olivander/372385317/
Card catalog – http://flickr.com/photos/deborahfitchett/2372385317/
book of hours manuscript2.jpg ‐ http://flickr.com/photos/jeffrey/89461374/
royal library san lorenzo.jpg ‐ http://flickr.com/photos/cuellar/370663920/
uniform_umbrellas.jpg ‐ http://www.flickr.com/photos/mortimer/221051561/
ponies in field.jpg ‐ http://www.flickr.com/photos/bulle_de/352732514/
caged_tower_melbourne.jpg ‐ http://www.flickr.com/photos/vermininc/2227512763