Copyright © 2006, SAS Institute Inc. All rights reserved. Using Structured and Unstructured Data as part of an Analytical Process Managing Future Requirements Now

Upload: ashlynn-fowler

Post on 11-Jan-2016



Using Structured and Unstructured Data as part of an Analytical Process

Managing Future Requirements Now


Agenda

Why is this stuff important?

• Trends in analytics and analytical data

Why can’t I simply use the same approach I’m currently using?

• Analytics’ unique characteristics

So what’s the solution?

• Strategies to plan for and manage information growth for analytics


Key Messages

Analytics has been and will continue to be a competitive differentiator

Structured and unstructured data volumes are rapidly increasing

Traditional reporting-driven information management strategies are not always effective

Data quality is paramount to effective analytics

Execution times can be on the order of years, so you need to plan now to succeed in the future


Why is this stuff important?


We’re going through an information revolution …

Source: University of California, Berkeley

WWW: 170 Terabytes

Emails: 35,000,000,000/day (400,000 Terabytes/yr)

Telephone: 17.3 Exabytes/yr


And our information sources keep growing …

ID columns

Customer information

Purchases / Services

Time series

Demographic, financial profiling

Text-based customer interactions

Non-text-based customer interactions


Our customer knowledge keeps increasing ...

[Figure: terabytes of data (0 to 100) plotted against time (1960 to 2010). Labels: Customer Data Availability, Analytical Capacity, Execution Capacity, Knowledge Gap, Execution Gap.]


And we’re moving beyond reporting.

Source: InfoWorld

“Tedious data mining and static reports have had their day. The new business intelligence applies business analytics to fresh data and puts analysis in the hands of those who need it.”


And some companies are specifically competing on analytics…

“The idea of competing on analytics is not entirely new”

“What is new is the spreading of analytical competition from individual business units to an enterprise-wide perspective”

-- Thomas H. Davenport (author)

Source: Harvard Business Review (January 2006)


We’ve entered the “Era of Analytics”.

“Competing on Analytics” (Davenport & Harris)

Harvard Business School Press

Worldwide Release: March 6, 2007

“Previous bases for competition … have been eroded … That leaves three things as the basis for competition:

• Efficient & effective execution

• Smart decision making

• Ability to wring every last drop of value from business processes

… all of which can be gained through sophisticated use of analytics.”


Why can’t I use the same approach I’m currently using?


Analytics is …

Data-driven insight for better decisions.

A process encompassing a range of techniques dealing with the collection, classification, analysis, and interpretation of data to gain insight, reveal patterns, anomalies, key variables and relationships.


But, more importantly, what’s critical?

Sufficient historical data

Sufficient granularity

Clean, accurate data

Breadth and representativeness of data


What is a “model”?

An abstraction of reality

• Simplifies reality via assumptions

• Defines constraints and actors

• Narrows our focus by eliminating everything other than what we’re concerned about

Why do we use them?

• They help us gain insight into real-world processes / objects

• They give us something we can “play” with

• They’re cheaper than using the real things


No, really - what is a “model”?

What are some examples?

• The theory of relativity

• A 1:100 scale architectural model of a proposed building

• A “clay” of a car

• A catwalk / clothes model


So what does an analytical model look like?

Example specification: Risk of Default on a Loan

$y = \alpha + \beta_1 x_1 + \beta_2 x_2$

Example implementation: Risk of Default on a Loan

$\mathrm{CreditRisk} = 300 + (15 \times \mathrm{Income}) + (22 \times \mathrm{Age})$
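As a minimal sketch, the implementation line above can be read as a linear score. This assumes the operators between the terms are additions and that Income and Age are plain numeric inputs. The function name `credit_risk` and the sample applicant are hypothetical, and the slides do not state the units of either input.

```python
def credit_risk(income, age, intercept=300, income_weight=15, age_weight=22):
    """Linear score: intercept + income_weight * income + age_weight * age.

    The coefficients 300, 15 and 22 come from the slide; treating the
    terms as additive is an assumption.
    """
    return intercept + income_weight * income + age_weight * age


# Hypothetical applicant; the slide does not state the units of either input.
score = credit_risk(income=10, age=30)
print(score)
```

Keeping the coefficients as parameters makes the split between specification (the functional form) and implementation (the fitted numbers) explicit.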


Text Mining is no different.

[Figure: text mining process. Stages shown: reading the text files, text preprocessing, term weighting/rollup, dimension reduction (Singular Value Decomposition), document analysis.]
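The preprocessing and term-weighting stages named on this slide could be sketched roughly as follows. This is an illustration only, not the method used by any particular product: the stop-word list, the sample documents, the function names, and the choice of a plain tf × log(N/df) weighting are all assumptions. The dimension-reduction stage (Singular Value Decomposition) is left out because it would normally come from a linear algebra library.

```python
import math
import re
from collections import Counter

# Illustrative stop-word list (an assumption, not taken from the slides).
STOP_WORDS = {"the", "a", "an", "is", "was", "and", "to", "of", "my"}


def preprocess(text):
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def tfidf(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [preprocess(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents each term appears in.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weighted = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weighted.append({
            term: count * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weighted


# Hypothetical customer comments.
docs = [
    "The bill was wrong and the agent was rude",
    "My bill is wrong",
    "Great service and a helpful agent",
]
weights = tfidf(docs)
```

Terms that appear in few documents (such as "rude") receive a higher weight than terms shared across documents (such as "bill"), which is what lets later stages separate documents by topic.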


Analytics drives significant value …

[Figure: customer value plotted against time / insight across the customer lifecycle.]

Lifecycle labels: Acquisition / Activating; Welcome Prg.; Target/acquire prospect; Churn Prevention / Attrition; Cancellation; Up/X-sale; Service/advice; Customer development; Harvest; Win Back

Pro-activity based on “If” events: Lifetime; Usage/purchase; Behaviour; Critical

Analytical insight: Behaviour Scoring; Response rates; Entry Scoring; Contact Policy; Fraud Detection; Segmentation (Value / Needs); Tariff Plan Optimisation; X Sell / Up Sell; Credit / Collections; Churn Propensity; Churn Segmentation; Satisfaction score


But it requires data, which can take many forms …

[Figure: axes labeled Interactive to Analytical and Highly aggregated data to Highly disaggregated data.]


Different activities have different requirements …

[Figure: axes labeled Interactive to Analytical and Highly aggregated data to Highly disaggregated data, with two paths marked: the reporting path and the analytics path.]


And different approaches are there for good reason …

[Figure: axes labeled Interactive to Analytical and Highly aggregated data to Highly disaggregated data, with two paths marked: the reporting path and the analytics path.]


Reporting-driven data management processes aren’t always appropriate …

Traditional reporting processes support highly managed activities

Analytical processes are flexible and iteratively driven

Successful companies are managing the two processes differently

[Figure: two processes side by side. Structured Process: Source Systems, Data Integration, DW Storage, BI, with Metadata. Unstructured Process: Hand coded Extracts, Analytical Tools, Hypothesis, Interpret, Integrate.]


However, the two are closely aligned.

Hypothesis

Descriptive – Measures the past (What)

Inferential – Brings deep understanding of the past and is predictive of the future (Why)


So what’s the solution?


Predictive Analytics: A Summary

Going from seeing small bits to understanding the bigger picture

Integration

• Data: being able to link the unseen

• Models: provide the complete picture of the customer

• Technology: support the integration

• People: integrate all stakeholders of analytics into the business process


We’re using more and more data …

We used to work with a couple of dozen variables

Nowadays it is at least a couple of hundred

• Data from different sources

• Derived data (differences, ratios, trends etc.)

• Data from combined algorithms (market basket analysis combined with clustering combined with predictive modeling)

This can become thousands

• Pharma: micro-array data

• Interactions
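To make the growth in variable counts concrete, here is a small sketch of how derived data (a difference, a ratio, and a trend) might be computed from one raw series, and of how quickly pairwise interactions multiply. The function name `derive_features`, the sample spend figures, and the choice of a least-squares slope as the "trend" are assumptions for illustration; `math.comb` requires Python 3.8 or later.

```python
import math


def derive_features(series):
    """Derive a difference, a ratio and a linear trend from an ordered series.

    Expects at least two numeric observations, oldest first.
    """
    first, last = series[0], series[-1]
    n = len(series)
    mean_x = (n - 1) / 2
    mean_y = sum(series) / n
    # Least-squares slope of the value against its period index.
    slope = (
        sum((i - mean_x) * (y - mean_y) for i, y in enumerate(series))
        / sum((i - mean_x) ** 2 for i in range(n))
    )
    return {
        "difference": last - first,
        "ratio": last / first if first else None,
        "trend": slope,
    }


# Hypothetical monthly spend for one customer.
monthly_spend = [100.0, 110.0, 120.0, 130.0]
features = derive_features(monthly_spend)

# Number of distinct pairwise interactions among 200 input variables.
pairwise_interactions = math.comb(200, 2)
```

One raw series yields three derived variables here, and a couple of hundred inputs already imply 19,900 possible pairwise interactions, which is how variable counts reach the thousands.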


This has some major implications …

History is key
• To build a model, you need historical data
• This history must be collected over time
• To use analytics effectively now, you must have planned and executed up to two years ago
• Start collecting data now if you want to remain competitive

Data quality can be a showstopper
• All the data in the world is useless if it isn’t accurate
• Capturing and storing this data can be expensive if it isn’t useful
• Bad quality data can delay an analytics project by years
• Solve the data quality problem when you start, not afterwards
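A first check on both points, history and quality, could look like the sketch below. It is a minimal illustration under stated assumptions: the record layout, the field names, the function name `quality_report`, and the 24-month threshold (taken loosely from the "two years" remark above) are all hypothetical.

```python
from datetime import date


def quality_report(records, required_fields, min_history_months=24):
    """Summarise missing values and history length for a list of dict records."""
    total = len(records)
    # Share of records where each required field is absent or empty.
    missing = {
        field: sum(1 for r in records if r.get(field) in (None, "")) / total
        for field in required_fields
    }
    dates = [r["date"] for r in records]
    earliest, latest = min(dates), max(dates)
    # Whole calendar months between the earliest and latest record.
    months = (latest.year - earliest.year) * 12 + (latest.month - earliest.month)
    return {
        "missing_rate": missing,
        "history_months": months,
        "enough_history": months >= min_history_months,
    }


# Hypothetical customer records.
records = [
    {"date": date(2005, 1, 15), "customer_id": "C1", "income": 42000},
    {"date": date(2005, 7, 1), "customer_id": "C2", "income": None},
    {"date": date(2006, 6, 30), "customer_id": "", "income": 55000},
    {"date": date(2006, 12, 1), "customer_id": "C4", "income": 61000},
]
report = quality_report(records, ["customer_id", "income"])
```

Running a report like this when collection starts, rather than when modeling starts, is one way to surface quality problems before they delay a project.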


This has some major implications …

Granularity is essential
• Statistics works by extracting trends from large amounts of information
• Pre-summarised information is almost always useless
• The enterprise data warehouse may not be the best location for this data
• Don’t assume everything must be in a single data warehouse

It’s not just about data, it’s about the right data
• Knowing what data is important can be a challenge
• Requires a highly consultative approach with the business
• It helps to tie this back to strategic business drivers / the business model
• Understand the business and the problem, and also involve the right stakeholders
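The granularity point can be illustrated with a small sketch: two hypothetical customers whose pre-summarised averages are identical, even though their underlying transactions behave very differently. The customer labels and amounts are invented for illustration.

```python
from statistics import mean, pstdev

# Hypothetical transaction amounts for two customers.
transactions = {
    "A": [50, 50, 50, 50],
    "B": [5, 5, 5, 185],
}

# What a pre-summarised table would hold: one average per customer.
summary = {customer: mean(values) for customer, values in transactions.items()}

# What only the detailed data can reveal: the spread around that average.
spread = {customer: pstdev(values) for customer, values in transactions.items()}

print(summary)
print(spread)
```

Both averages come out the same, so a model fed only the summary cannot tell a steady spender from one with a single large purchase; the disaggregated rows are what carry that signal.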
