Having it All is not Having it All at All! Problem Formulation in the Face of Overwhelming Quantities of Data

Upload: caldwell-dunlap

Post on 03-Jan-2016


Page 1: Having it All  is not Having it All at All!

Having it All is not Having it All at All!

Problem Formulation in the Face of Overwhelming Quantities of Data

Page 2: Having it All  is not Having it All at All!

A journey of discovery… Where’s the fire?

START FROM THE BEGINNING -- “Before the beginning of great brilliance, there must be Chaos.” -- (I Ching)

“At the beginning of the 21st century, the population of the Earth [was] 6,300,000,000, who annually experience a reported 7,000,000–8,000,000 fires with 70,000–80,000 fire deaths and 500,000–800,000 fire injuries.”

Dr. Ing. Peter Wagner, 2006

Page 3: Having it All  is not Having it All at All!

Data everywhere.

Page 4: Having it All  is not Having it All at All!

Who knew?

Page 5: Having it All  is not Having it All at All!

Gone are the days when there was a single source of “truth”…

Baker Library

Entries in a book on Australia business owners

About a storekeeper in Halifax County, N.C. – June 1873:

“purchaser of stolen goods, a great scamp.”

Entry about one J. B. Alford, who sold groceries and liquors: June 1870

“This man is said to be in thriving circumstances. He has some Real & personal estate & I think it is safe to trust him.”

Entry on Hannah Griffith, a milliner in Springfield, Ill., in 1869:

“about to marry a fellow [of] no account.”

An entry two years later noted, with some relief, that the plan had fallen through.

Harvard R.G. Dun Credit Report Collection

“is not much of a businessman, but had some capital, it is said, advanced by his father, who is reputed well off” -- About J.D. Rockefeller – who turned out to be a good credit risk; 1863 was the year he set up a refinery that blossomed into Standard Oil.

Page 6: Having it All  is not Having it All at All!

Hold on… things are changing.


Page 7: Having it All  is not Having it All at All!

Framing our case for change…

The Operating Environment

• We all know that the world is changing
• We are aware that the rate of change is increasing at an unprecedented rate
• We see new types of data, technologies, and behaviors every day
• More and more, we are tasked with discerning the discoverable need from the articulated want

The Case for Change

• What has made us successful so far is insufficient
• We now have the ability to succeed… or fail, much faster
• The connectedness of information and the ways in which it is changing are impacting the risk and opportunity space in ways we are only beginning to understand

Page 8: Having it All  is not Having it All at All!

Sometimes, a picture is worth a thousand words.

Pope Benedict Inauguration

Pope Francis Inauguration

Lately, a thousand pictures are taken in the time it takes to speak a single word!

• What about the digital footprint of all of the smartphones?

• What about the social networks of the crowd?

• What about the metadata in the photos?

• What are the opportunity costs to other activities?

• The largest corpus of data preceded the event

• Most data created about the event had significant and asymmetric latency

• The rate of “data decay” attributable to the participants in the event is significant

Page 9: Having it All  is not Having it All at All!

Asking the right question

How deep would the ocean be if sponges didn’t live there?

What if the Hokey Pokey really is what it’s all about?

What if there were no hypothetical questions?

How many more of these silly questions till the next slide?

Page 10: Having it All  is not Having it All at All!

Questions about risk and opportunity are at the heart of our focus.

What other companies is this individual associated with?

How do I identify changes with my current contact relationships?

Who is the right decision maker at this company and how do I effectively reach them?

How can I de-dupe my current customer base at the contact level?

I need answers!

I need insights about a contact to help me target my messaging

What other risks should I know about before doing business with this small company?

Should I extend credit?

What about fraud?

What is the right credit limit?

What do my best customers look like?

Which customers should I call on next?

Which prospects are most promising?

Page 11: Having it All  is not Having it All at All!

It is extremely important to frame the question in the right context.

Page 12: Having it All  is not Having it All at All!

The right universe of data is often implied by the scope and context of the question.

Foundational Firmographic

• Business Name
• Telephone
• Address
• SIC
• Employee Size
• Sales Revenue
• Year Started
• Primary Contact
• Linkage

• Unit of Analysis: Set of matched results

• Response Variables

• CC = Confidence Code Attribution

• MG = MatchGrade Attribution

• WACC = Weighted Average Confidence Code
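
The slide names WACC as a response variable but does not say how it is weighted. As a minimal sketch, assuming every matched record counts equally (so each confidence code is weighted by the number of records that received it), the calculation could look like this. The function name, the sample codes, and the weighting rule are illustrative assumptions, not D&B's definition.

```python
from collections import Counter


def weighted_average_confidence_code(confidence_codes):
    """Return the count-weighted mean confidence code for a set of matched results.

    Assumption: the weight of each code is the number of records that received it.
    """
    if not confidence_codes:
        raise ValueError("need at least one matched result")
    counts = Counter(confidence_codes)  # confidence code -> number of records
    total_records = sum(counts.values())
    return sum(code * n for code, n in counts.items()) / total_records


# Hypothetical batch: one confidence code per matched record.
batch = [10, 10, 8, 8, 8, 4]
wacc = weighted_average_confidence_code(batch)
print(f"WACC = {wacc:.2f}")
```

Comparing this figure across rational subgroups (for example, by Confidence Code cluster) is one way to see whether a single market or file is pulling the average down.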

Rational Subgroups

• By Confidence Code Cluster

• By MatchGrade “cousin” cluster within Confidence Code

• Potential Explainable Factors:
   • Cleansing Process – things we do to the Korean text which may cause it to be ‘less matchable’
   • Candidate retrieval methods that we use
   • Evaluation & Decisioning – we may need to adjust our definition of A / B / F for Korea
   • Availability of AME-K data
   • Distribution bias in aggregate file behavior of scoring system
   • MatchGrade mappings

– Unknown or ignored, potentially explainable, causes of variation

• Unexplainable
   • Quality of customer input
   • Completeness of customer input
   • Emergence of new jargon/Acronyms
   • New Chinese Idioms
   • Statutory changes
   • Differences in privacy expectations
   • Differences in word order, sound, stroke weight

• Data in hand
• Discoverable data
• Computable data
• Extant, unavailable data (opportunity cost)
• Understanding of cause systems
• Relevant theory

D&B Proprietary information

Page 13: Having it All  is not Having it All at All!

Veracity: How do I adjudicate the truth when the malfeasants are learning so much faster?

Volume: How much data is “too much” to see the answer?

Velocity: Can the rate of change of data itself be part of the answer?

Variety: How can heterogeneous and unstructured data inform new ways of inquiry?

Leveraging the “V’s” to get to the best answer

Page 14: Having it All  is not Having it All at All!

A typical M&A takes 6-9 months from announcement to deal completion

• Some take longer, or may never close

• Regulatory requirements sometimes drive pre- and post- close changes over years

Family trees updated as the deal completes

• Average update within 10 days

• Linkage updates frequently precede official registry changes

• Updates include re-linking records, re-structuring tree levels, marking entities as out of business, and creating new entities

Announced restructuring and re-organizations often take 6 months to 2 years

A good example can be seen in tracking mergers, acquisitions, and divestitures.


Page 15: Having it All  is not Having it All at All!

Traditional analysis of this data can reveal interesting risks

CITGO PETROLEUM CORPORATION (Texas, USA)

PDV AMERICA, INC (Oklahoma, USA)

Propernyn B.V. (Netherlands)

3 additional subsidiary levels

National Government: Republic of Venezuela
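
The kind of traditional analysis this slide points to can be sketched as a walk up the family tree from an operating company to its ultimate parent. The example below is a minimal illustration, assuming linkage is available as child-to-parent pairs; the entity names are hypothetical placeholders rather than the tree on the slide, and `ownership_chain` is an illustrative helper, not a product API.

```python
def ownership_chain(entity, parent_of):
    """Follow parent links upward from `entity` and return the full chain."""
    chain = [entity]
    while chain[-1] in parent_of:
        parent = parent_of[chain[-1]]
        if parent in chain:  # guard against circular ownership records
            raise ValueError(f"cycle detected at {parent!r}")
        chain.append(parent)
    return chain


# Hypothetical linkage records: child -> immediate parent.
parent_of = {
    "Example Refining Corp (US)": "Example Holdings Inc (US)",
    "Example Holdings Inc (US)": "Example Holdings B.V. (NL)",
    "Example Holdings B.V. (NL)": "Example National Government",
}
# Hypothetical reference list of government entities.
GOVERNMENT_ENTITIES = {"Example National Government"}

chain = ownership_chain("Example Refining Corp (US)", parent_of)
ultimate_parent = chain[-1]
state_owned = ultimate_parent in GOVERNMENT_ENTITIES
print(" -> ".join(chain), "| state-owned:", state_owned)
```

The risk only becomes visible at the top of the chain, which is why the walk continues until no further parent is recorded instead of stopping at the first foreign holding company.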

Page 16: Having it All  is not Having it All at All!

Combining the articulated want (family tree) with the discoverable need (what’s really going on)…

Ceramics Inc: 50 Employees; Glass Mfr; Wichita, Kansas

Medi-Cell: 125 Employees; Lab Equip Mfr.; Abayance, FL

AdvDesigns AG: 30 Employees; R&D, Stem Cell Rsrch; Frankfurt, Germany

Mediquip: 1000 Employees

Monsanto: 500 member family tree; Largest Genetically modified food producer

Pending Decision: Underwrite Directors and Officers Policy

49% 30%

The story is true. The names have been changed to protect the innocent.

Page 17: Having it All  is not Having it All at All!

Language, identity, and intention can significantly impact the complexity of the situation.


株式会社カワサキモータースジャパン “Kabushikigaisha Kawasaki Mōtāsu Jyapan”

(aka Kawasaki Motors Japan)

한국가와사키 “Hanguggawasaki”

(aka Kawasaki Korea)

川崎重工咨询 “Chuanxi Zhonggong zuishin”

(aka Kawasaki Heavy Industries Consulting)

KAWASAKI KK (Local electricians in a suburb of Kawasaki)

川崎涂料有限公司 “Chuanxi chuliao Youxian Gonxi”

(aka Kawasaki Paint Co, Dongguan)

川崎重工業株式会社 “Kawasaki Jūkōgyō Kabushiki-gaisha”

(aka Kawasaki Heavy Industries)

“Ka-wa-sa-ki”: Kawasaki (idiom) - “river beside mountainous terrain”

Page 18: Having it All  is not Having it All at All!

Privacy and other statutory constraints

Multiple names

Digital natives vs. digital immigrants

Overlapping “identities”

People are strange…

Page 19: Having it All  is not Having it All at All!

As the boundary between people and small business becomes increasingly blurred, we continue to focus on the concept of People In The Context of Business

Cleanse, de-dupe, identity resolution and enrichment services for your contact data

Understand when people move from organization to organization

Sharpen the line between the individual and the business when engaging small businesses

Malfeasance and fraud are perpetrated by people, not by businesses. This solution reveals relationships that will help all of us more effectively identify potential for bad behavior.


THE CHALLENGE THE GOAL THE VALUE

#1 – the “John Smith” problem – multiple people with the same name

#2 – the “Ann Taylor” problem – data about businesses named after people

Caroline M Smith, 302 N Liberty St., Albion, IA (Addr. Type: Residential)

Carrie Smith, Meredith Corporation, 1716 Locust St., Des Moines, IA (Addr. Type: Commercial)

Caroline Smith, University of Iowa, 21 E Market St., Iowa City, IA (Addr. Type: Commercial)

#3 – the “Sybil” problem – one person with multiple personas or names

Carrie Smith, Tenderheart Daycare, 2635 Cleveland Dr., Adel, IA (Addr. Type: Commercial)

Many people connected to one business

Many businesses connected to one person

Businesses connected through people

People connected through associations with other people

A single view of customers and prospects, both in the context of entities and people will drive key actionable outcomes for your business.
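
The three naming problems above can be illustrated with a first-pass "blocking" step that groups contact records likely to need a closer look. This is a minimal sketch under stated assumptions: the nickname table, the choice of blocking key (normalized first name, last name, state), and the record layout are all hypothetical, and the step only proposes candidates. Whether a group is one person with several personas (the "Sybil" problem) or several people sharing a name (the "John Smith" problem) still has to be settled with further evidence.

```python
from collections import defaultdict

# Hypothetical nickname table; a real system would need a much larger, curated one.
NICKNAMES = {"carrie": "caroline"}


def blocking_key(record):
    """Build a coarse key: canonical first name, last name, and state."""
    first, *_middle, last = record["name"].lower().split()
    return (NICKNAMES.get(first, first), last, record["state"])


def candidate_groups(records):
    """Group records by blocking key and keep only groups needing review."""
    groups = defaultdict(list)
    for record in records:
        groups[blocking_key(record)].append(record)
    return {key: recs for key, recs in groups.items() if len(recs) > 1}


# The four contact records shown on the slide.
records = [
    {"name": "Caroline M Smith", "org": None, "city": "Albion", "state": "IA"},
    {"name": "Carrie Smith", "org": "Meredith Corporation", "city": "Des Moines", "state": "IA"},
    {"name": "Caroline Smith", "org": "University of Iowa", "city": "Iowa City", "state": "IA"},
    {"name": "Carrie Smith", "org": "Tenderheart Daycare", "city": "Adel", "state": "IA"},
]

groups = candidate_groups(records)
for key, recs in groups.items():
    print(key, "->", [r["org"] or "residential" for r in recs])
```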


Page 20: Having it All  is not Having it All at All!

Creating the foundation for People in the Context of Business.


• We will reach a point of inflection at which we have sufficient indicia (by quality and count) to say we can recognize a “soul”

• Dynamic clustering will allow us to adjust our opinion of existing indicia or an existing Soul as new Flexible Alternative Indicia are identified

Soul

Indicia

Dynamic Clustering
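
One way to picture the two bullets above is a cluster store that merges clusters whenever a new observation shares an indicium with them, and that treats a cluster as a recognizable "soul" once the summed quality of its indicia passes a threshold. This is only a sketch under stated assumptions: the class name, the quality weights, the threshold, and the rule "sum of weights" are hypothetical stand-ins for "sufficiency by quality and count", not the method described in the talk.

```python
class DynamicClusters:
    """Merge indicia into clusters as new observations arrive."""

    def __init__(self, threshold):
        self.threshold = threshold  # assumed sufficiency cut-off
        self.clusters = []          # each cluster: {indicium: quality weight}

    def observe(self, indicia):
        """Add an observation ({indicium: weight}) and return its merged cluster."""
        merged = dict(indicia)
        untouched = []
        for cluster in self.clusters:
            if cluster.keys() & indicia.keys():  # shares at least one indicium
                for key, weight in cluster.items():
                    merged[key] = max(merged.get(key, 0.0), weight)
            else:
                untouched.append(cluster)
        self.clusters = untouched + [merged]
        return merged

    def is_soul(self, cluster):
        """Assumed rule: enough total quality across distinct indicia."""
        return sum(cluster.values()) >= self.threshold


dc = DynamicClusters(threshold=2.0)
first = dc.observe({"email:c.smith@example.com": 0.9, "phone:555-0100": 0.6})
second = dc.observe({"addr:1 Main St": 0.5})
# A later observation links the phone and the address, bridging both clusters.
bridged = dc.observe({"phone:555-0100": 0.6, "addr:1 Main St": 0.5, "name:Carrie Smith": 0.3})
print(len(dc.clusters), dc.is_soul(first), dc.is_soul(bridged))
```

The third observation changes the opinion formed from the first two: evidence that was insufficient in two separate clusters becomes sufficient once a new indicium ties them together.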


Page 21: Having it All  is not Having it All at All!

I’ll bet you knew this was coming

Learning from the way things move, even if you don’t understand them fully… seriously?

How do you predict something that has no precedent?

Predictions, predictions…

Page 22: Having it All  is not Having it All at All!

Commercial signal and proxy are now added to existing predictive attributes to provide deeper insights and even more predictive analytics.

[Chart: Predictive Content (Low to High) from Traditional Business Data and Non-Traditional Insight, for businesses with Robust Predictive Data Available, Limited Data Available, and No Data Available]

Signal & proxy sources add significant decisioning content on small businesses with limited or no traditional predictive data footprint

Page 23: Having it All  is not Having it All at All!

‘Signals’ aggregated and analyzed over time, correlated with other data sources expose hard-to-find patterns.

BIG DISPARATE SOURCES OF DATA

SIGNAL EXTRACTION

ADVANCED ANALYTICS

PREDICTIVE MODEL GAINS

We’re harnessing the massive flow of data through our systems and distilling the signals that describe a company’s behavior.

This is helping to increase levels of precision in predictive models.
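
As a rough illustration of distilling a signal from a flow of activity, the sketch below flags weeks in which a company's inquiry count departs sharply from its own recent history. Everything specific here is an assumption made for the example: the weekly counts, the four-week window, the z-score threshold of 3, and the floor of 1.0 on the standard deviation (used so a perfectly flat history does not cause division by zero).

```python
from statistics import mean, pstdev


def flag_spikes(weekly_counts, window=4, z_threshold=3.0):
    """Return indexes of weeks whose count deviates sharply from the prior window."""
    flagged = []
    for i in range(window, len(weekly_counts)):
        history = weekly_counts[i - window:i]
        mu = mean(history)
        sigma = max(pstdev(history), 1.0)  # assumed floor for flat histories
        if abs(weekly_counts[i] - mu) / sigma >= z_threshold:
            flagged.append(i)
    return flagged


# Hypothetical weekly inquiry counts for one company.
inquiries = [10, 12, 11, 9, 10, 11, 40, 12]
print(flag_spikes(inquiries))
```

A flagged week is not a prediction by itself; it is a candidate signal that would then be correlated with other sources before it is allowed to influence a model.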

• Customer Cross-border Inquiries
• Customer Match Inquiries
• Global Trade Experiences
• Transactional WorldBase Updates
• Third Party Exchange
• Customer Portfolio Monitoring
• Intelligence Engine Traffic
• Phone and Email Connectivity Testing
• Call Center Activity
• Other Proprietary Sources


Page 24: Having it All  is not Having it All at All!

Extending the deployed capability to better understand malfeasance…

• Apply learning and integrate new targeted severe risk prevention and detection rules in data supply processes and platforms

Continuous Improvement

Data Collection & Input


Page 25: Having it All  is not Having it All at All!

Combining people, linkage, and daily signals to quickly recognize and analyze patterns and take action…

In the above use-case, with millions of payment experiences a week, we were able to quickly identify and analyze a suspicious pattern and take action, not only on all related cases but also on the three “ring leaders”.

“Ring Leaders”


Page 26: Having it All  is not Having it All at All!

Data sensing: Advanced analytics also play a significant role in acquiring new data sources.

Other Data

• Scale: Multi-national footprint?
• Depth: Comprehensive coverage across all verticals and sizes of business?
• Value: Positive correlation with trade or other predictors to serve as a proxy?

Page 27: Having it All  is not Having it All at All!

Some current efforts under way to utilize this hybrid capability…

Helping you gain visibility into your supplier’s suppliers, from tier 1 to tier N.

With this knowledge you can reduce the risk of being blind-sided by disruption(s) anywhere within your supplier network.

We use analytical methods to build an implied supply chain using our extensive knowledge of buyer-seller relationships.

TIER-N SUPPLY CHAIN RISK

LINKAGE DISCOVERY ENGINE

MATERIAL CHANGE

Tier 1, Tier 2, Tier …, Tier N

A (Buyer), B (Seller)

B (Buyer), C (Seller)
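
The buyer-seller chaining shown here (A buys from B, B buys from C) can be sketched as a breadth-first walk over known relationships, assigning each supplier the tier at which it is first reached. This is a minimal sketch, assuming relationships arrive as simple (buyer, seller) pairs; the function name and company D are hypothetical, and the analytical methods actually used to infer those pairs are not described on the slide.

```python
from collections import defaultdict, deque


def supplier_tiers(buyer, relationships):
    """Map each reachable supplier to its tier (1 = direct supplier of `buyer`)."""
    sellers_of = defaultdict(set)
    for b, s in relationships:
        sellers_of[b].add(s)

    tiers = {}
    seen = {buyer}
    queue = deque([(buyer, 0)])
    while queue:
        company, tier = queue.popleft()
        for seller in sorted(sellers_of[company]):
            if seller not in seen:  # keep the shallowest tier; ignore cycles
                seen.add(seller)
                tiers[seller] = tier + 1
                queue.append((seller, tier + 1))
    return tiers


# A buys from B and B buys from C, as on the slide; D is a hypothetical extra supplier.
relationships = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "D")]
tiers = supplier_tiers("A", relationships)
print(tiers)
```

Because D supplies A directly as well as through C, it is reported at tier 1; a disruption at C would still surface as tier-2 exposure for A through B.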

Providing you more linked families with a focus on small and medium businesses.

Gain a more comprehensive view of your multi-site business partners, revealing new opportunities and overall risk.

Innovative technology and analytics are efficiently guiding us to potential linkage relationships we had not previously seen.

Ultimate Parent

Headquarters

Branch

Branch

Parent Subsidiary

Helping you stay ahead by anticipating important changes before they occur.

Knowing which businesses are poised for growth, or which may be headed for elevated risk is valuable foresight.

Anticipatory analytics is helping us identify unique drivers, root causes, and sensitivities leading to material change.

Signals that predict a change…

…in traditional predictors…

…that predict business outcomes

Derive insights from signals over time

Pinpoint combinations with greatest predictive value


Page 28: Having it All  is not Having it All at All!

New Techniques to address Big Data

New approaches to Discovery, Curation, and Synthesis

Data sensing at the “Event Horizon”

We are increasingly faced with information that is rich, varied, and replete with opportunity – our focus is shifting from “hunting and gathering” to new challenges.

Page 29: Having it All  is not Having it All at All!

“And now we welcome the new year, full of things that have never been” – Rainer Maria Rilke