Assumptions about data and analysis: Briefing Room webcast slides

Upload: mark-madsen

Post on 24-Jan-2018

Category: Data & Analytics


TRANSCRIPT

Copyright Third Nature, Inc.

“Your assumptions are your windows on the world. Scrub them off every once in a while, or the light won't come in.”

– Isaac Asimov

Schema

The BI concept in the DW is simple: one place to funnel data, one direction of data flow, one model integrated prior to use.

Limited consideration for feedback loops and change

Processing only happens here

Carefully controlled access here

People have limited ability to create new information

Sources homogeneous and well understood

Assumes that you have requirements ahead of time; the data is already collected, stored, ready to use.

One way flow

Success breeds failure

Organizational use of BI matured over 25 years of data warehouse history.

BI enabled a shift in managing from the center of the organization to the edge, and that drives new requirements.

Needs have moved from basic access to more advanced use, and from the common data to specific, local ad-hoc needs.

This is what success looks like (with only a hammer)

The primary view of BI self-service is publishing data

The old problem was access; the new problem is analysis

What people do with data: not just read it

Explore and Understand

Inform and Explain

Convince and Decide

Deliver

Process

Collect

Questions that are not asked in BI

Columns: What data do I need? Rows: What data is available? Where is it?

            Known     Unknown
Known       Query     Browse
Unknown     Search    Explore

“No battle plan ever survives first contact with the enemy.”

– Helmuth von Moltke the Elder, talking about ETL specifications

Metadata is what you wished your data looked like.

Reality is not requirements = code

Reality is the data, not the metadata

Exploring data defines metadata
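
The gap between metadata and data on this slide can be illustrated with a small profiling sketch. It is not from the webcast: the function names, the declared schema, and the sample rows are all hypothetical, and the type rules are deliberately crude. The idea is only that looking at the values tells you what the columns really contain, which may differ from the specification.

```python
from collections import Counter

def infer_type(value: str) -> str:
    """Classify one raw string value as 'empty', 'int', 'float', or 'text'."""
    text = value.strip()
    if text == "":
        return "empty"
    for caster, name in ((int, "int"), (float, "float")):
        try:
            caster(text)
            return name
        except ValueError:
            continue
    return "text"

def profile(rows: list[dict[str, str]]) -> dict[str, Counter]:
    """Count the observed value types per column across all rows."""
    observed: dict[str, Counter] = {}
    for row in rows:
        for column, value in row.items():
            observed.setdefault(column, Counter())[infer_type(value)] += 1
    return observed

def mismatches(declared: dict[str, str], observed: dict[str, Counter]) -> dict[str, list[str]]:
    """For each declared column, list observed types that differ from the declared one."""
    report = {}
    for column, expected in declared.items():
        seen = observed.get(column, Counter())
        unexpected = sorted(t for t in seen if t != expected)
        if unexpected:
            report[column] = unexpected
    return report

# Hypothetical specification: what the data was expected to look like.
declared_schema = {"id": "int", "salary": "int"}

# Hypothetical sample of what actually arrived.
sample_rows = [
    {"id": "1", "salary": "150000"},
    {"id": "2", "salary": "120,000"},
    {"id": "3", "salary": ""},
]

observed_types = profile(sample_rows)
report = mismatches(declared_schema, observed_types)
print(report)
```

In this made-up sample the `salary` column is declared as an integer but also contains a formatted string and a blank, so the report flags it while `id` passes.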

Changing analytics design assumptions

Past assumptions

▪ Center of the org

▪ Global use

▪ Common data

▪ Value in what’s known, monitoring

▪ Data requirements found in advance

Present assumptions

▪ Edge of the org

▪ Local use

▪ Specific data

▪ Value in what’s unknown, discovery

▪ Data requirements found during process

"Always design a thing by considering it in its next larger context - a chair in a room, a room in a house, a house in an environment, an environment in a city plan." – Eliel Saarinen

IT reality is multiple data stores and systems

Separate, purpose-built databases and processing systems for different types of data and query / computing workloads, plus any access method, are the new norm for information delivery.

BI, Dashboards, analytics, apps

1 Marge Inovera $150,000 Statistician

2 Anita Bath $120,000 Sewer inspector

3 Ivan Awfulitch $160,000 Dermatologist

4 Nadia Geddit $36,000 DBA

Query processing

Databases Documents Flat Files Objects Streams ERP SaaS Applications

Source Environments

Data processing

Stream processing

An architectural history of BI tools

First there were files and reporting programs

We had cubes before we had RDBMSs!

Then we had hand-coded SQL, then QBE

Then semantic layers and SQL-generation

And now we’re back to files and cubes

But also new and improved:

Products that embed local and in-memory datastores inside the tools so they can deliver direct manipulation (WYSIWYG) UIs

BI server architecture shifts

The SQL-generating server model of BI scales extremely well but has poor user response time.

Solution 1: Pre-cache query results or prebuild datasets on the BI server (i.e., the old OLAP model). This has well-known problems.

Solution 2: Shove all the data into a BI server repository. Avoids subset problems. Adds potential scaling problems.
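
A minimal sketch of the first option, under stated assumptions: an in-memory SQLite database stands in for the warehouse, and the `sales` table, its rows, and the `PrebuiltDataset` class are all hypothetical. It shows a dataset prebuilt for one dimension answering from the BI server, and a question outside that subset falling back to the database.

```python
import sqlite3

ALLOWED_DIMENSIONS = ("region", "product")

def totals_from_warehouse(connection, dimension):
    """Run the aggregate query against the database for one allowed dimension."""
    if dimension not in ALLOWED_DIMENSIONS:
        raise ValueError(f"unsupported dimension: {dimension}")
    query = f"SELECT {dimension}, SUM(amount) FROM sales GROUP BY {dimension}"
    return dict(connection.execute(query).fetchall())

class PrebuiltDataset:
    """Hypothetical BI-server dataset: totals precomputed for a single dimension."""

    def __init__(self, connection, dimension):
        self.dimension = dimension
        self.totals = totals_from_warehouse(connection, dimension)

    def lookup(self, dimension):
        # Only questions inside the prebuilt subset can be answered locally.
        return self.totals if dimension == self.dimension else None

def answer(cache, connection, dimension):
    """Answer from the prebuilt dataset when possible, else query the database."""
    cached = cache.lookup(dimension)
    if cached is not None:
        return "cache", cached
    return "warehouse", totals_from_warehouse(connection, dimension)

# Stand-in for the warehouse, with made-up rows.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE sales (region TEXT, product TEXT, amount INTEGER)")
warehouse.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("east", "widget", 10), ("east", "gadget", 20), ("west", "widget", 5)],
)

cache = PrebuiltDataset(warehouse, "region")
print(answer(cache, warehouse, "region"))   # served from the prebuilt dataset
print(answer(cache, warehouse, "product"))  # outside the subset, goes to the database
```

Loading everything into the BI server, as in the second option, would remove the fallback path at the cost of holding all the data there.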

There is always a third way

The previous choices were driven by client-server thinking. We have a distributed (cloud) environment.

Possibilities:

Don’t force all the compute into the DB or server.

Don’t force all the compute to the client.

Data on demand: bring it to the analysis from where it is, or execute the analysis local to where the data is.
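
One way to picture the choice between moving the data and moving the analysis is a planner that compares how many bytes each option would ship. This is a sketch only: the `Dataset` and `Analysis` types, the location names, and the bytes-moved cost model are assumptions made for illustration, not something described in the webcast.

```python
from dataclasses import dataclass

@dataclass
class Dataset:
    name: str
    location: str      # where the data lives, e.g. "warehouse" or "client"
    size_bytes: int    # approximate size of the raw data

@dataclass
class Analysis:
    name: str
    result_bytes: int        # approximate size of the output
    runs_at: frozenset[str]  # locations able to execute this analysis

def plan(dataset: Dataset, analysis: Analysis, requester: str) -> str:
    """Return the execution site that moves the fewest bytes over the network."""
    bytes_moved = {}
    if requester in analysis.runs_at:
        # Bring the data to the analysis: ship raw data unless it is already local.
        bytes_moved[requester] = 0 if dataset.location == requester else dataset.size_bytes
    if dataset.location in analysis.runs_at and dataset.location != requester:
        # Run the analysis where the data is: ship only the result back.
        bytes_moved[dataset.location] = analysis.result_bytes
    if not bytes_moved:
        raise ValueError("no location can run this analysis")
    return min(bytes_moved, key=bytes_moved.get)

anywhere = frozenset({"warehouse", "client"})

big_scan = plan(
    Dataset("clickstream", "warehouse", 50_000_000_000),
    Analysis("daily totals", 2_000, anywhere),
    requester="client",
)
small_lookup = plan(
    Dataset("region codes", "warehouse", 1_000),
    Analysis("expanded report", 5_000, anywhere),
    requester="client",
)
print(big_scan, small_lookup)
```

With these made-up sizes the large scan runs next to the data and the small lookup runs on the client, so neither the server nor the client is forced to do all the compute.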

On to Q&A

With that as framing:

▪ How is analysis functionally different from “classic” BI?

▪ What technology capabilities are important in an analysis tool today?

▪ How does running in a cloud environment influence the internal architecture of the product?

About the Presenter

Mark Madsen is president of Third Nature, a technology research and consulting firm focused on business intelligence, data integration and data management. Mark is an award-winning author, architect and CTO whose work has been featured in numerous industry publications. Over the past ten years Mark received awards for his work from the American Productivity & Quality Center, TDWI, and the Smithsonian Institution. He is an international speaker, a contributor to Forbes Online and a member of the O’Reilly Strata program committee. For more information or to contact Mark, follow @markmadsen on Twitter or visit http://ThirdNature.net

About Third Nature

Third Nature is a research and consulting firm focused on new and emerging technology and practices in analytics, business intelligence, information strategy and data management. If your question is related to data, analytics, information strategy and technology infrastructure, then you’re at the right place.

Our goal is to help organizations solve problems using data. We offer education, consulting and research services to support business and IT organizations as well as technology vendors.

We fill the gap between what the industry analyst firms cover and what IT needs. We specialize in product and technology analysis, so we look at emerging technologies and markets, evaluating technology and how it is applied rather than vendor market positions.