strata hadoop world 2015 context computing - jonas keynote - final

Post on 11-Apr-2017

TRANSCRIPT

© 2015 IBM Corporation

Context Computing
Strata + Hadoop World 2015

Jeff Jonas, IBM Fellow
Chief Scientist, Context Computing
http://www.twitter.com/jeffjonas
www.jeffjonas.typepad.com

Jeff Jonas
IBM Fellow
Chief Scientist, Context Computing

Founded Systems Research & Development (SRD) in 1985

Architected, designed, developed roughly 100 systems over the last three decades

– Financial services
– Defense, intelligence
– Manufacturing
– Humanitarian efforts

Acquired by IBM in 2005

Currently focused on Context Computing, Sensemaking and Privacy by Design

No Context

Newsletter Subscriber

Context

“Better understanding something by taking into account the things around it.”

I ducked as the bat flew my way.

Another exciting baseball game.

In Context

Social Media Influencer
Newsletter Subscriber
Loyalty Club Member
High Value Customer
Job Applicant
Watch Listed Party

Context Accumulating

Context Accumulation
Contextualized Observations
Observation (Any kind of data from any kind of sensor)

Context Informs Decisioning

Context Accumulation
Contextualized Observations
Observation In Context
Decisioning
Act
Data Finds Data
Relevance Finds You
Observation (Any kind of data from any kind of sensor)

The Puzzle Metaphor

Imagine an ever-growing pile of puzzle pieces of varying sizes, shapes, colors

What it represents is unknown – there is no picture on hand

Is it one puzzle, 15 puzzles, or 1,500 different puzzles?

Some pieces are duplicates, missing, incomplete or have errors

Some pieces may even be professionally fabricated lies

Until you take the pieces to the table, it is nearly impossible to assess the scene

Puzzling Images: Courtesy Ravensburger © 2011

270 pieces, 90%
200 pieces, 66%
150 pieces, 50%
6 pieces, 2%
30 pieces, 10% (duplicates)

First Discovery

More Data Finds Data

Duplicates in Front Of Your Eyes

First Duplicate Found Here

Incremental Context – Incremental Discovery

6:40pm START

22min “Hey, this one is a duplicate!”

35min “I think some pieces are missing.”

37min “Looks like a bunch of hillbillies on a porch.”

44min “Hillbillies, playing guitars, sitting on a porch, near a barber sign and a banjo!”

150 pieces, 50%

Incremental Context – Incremental Discovery

47min “We should take the sky and grass off the table.”

2hr “Let’s switch sides, and see if we can make sense of this from different perspectives.”

2hr10m “Wait, there are three … no, four puzzles.”

2hr18m “I think you threw in a few random pieces.”

How Context Accumulates

With each new observation one asserts: 1) unrelated; 2) related; or 3) connected

Must favor the false negative

New observations sometimes reverse earlier assertions

Some observations produce novel discovery

The emerging picture helps focus collection interests
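
The assertion step above can be sketched in a few lines. This is a minimal illustration, not the method used in the talk: the Jaccard similarity, the two thresholds, and the names `Entity` and `assert_observation` are all assumptions made for the example. The high connect threshold is one way to "favor the false negative": when evidence is weak, the observation is kept separate rather than merged.

```python
from dataclasses import dataclass, field

# Hypothetical thresholds, chosen only for illustration.
CONNECT_AT = 0.9   # strong evidence: treat as the same entity
RELATE_AT = 0.5    # weaker evidence: associate, but keep separate


@dataclass
class Entity:
    features: set = field(default_factory=set)
    related: set = field(default_factory=set)  # indices of related entities


def similarity(a: set, b: set) -> float:
    """Jaccard overlap of two feature sets (a stand-in for real matching)."""
    return len(a & b) / len(a | b) if a | b else 0.0


def assert_observation(entities: list, features: set) -> str:
    """Place one observation into context and return the assertion made."""
    best_id, best = None, 0.0
    for i, entity in enumerate(entities):
        score = similarity(entity.features, features)
        if score > best:
            best_id, best = i, score
    if best >= CONNECT_AT:
        # Connected: fold the observation into the existing entity.
        entities[best_id].features |= features
        return "connected"
    # Otherwise keep the observation as its own entity.
    entities.append(Entity(features=set(features)))
    if best >= RELATE_AT:
        new_id = len(entities) - 1
        entities[best_id].related.add(new_id)
        entities[new_id].related.add(best_id)
        return "related"
    return "unrelated"


entities = []
for obs in ({"a", "b", "c"}, {"a", "b", "c"}, {"a", "b", "d"}, {"x"}):
    print(sorted(obs), assert_observation(entities, obs))
```

Reversing an earlier assertion when later evidence arrives is not modeled here; a fuller sketch would re-score existing entities each time one of them gains features.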

Big Data [in context]. New Physics.

More data: better the predictions
– Lower false positives
– Lower false negatives

More data: bad data good
– Suddenly glad your data is not perfect

More data: less compute

Big Data

Pile of ______

Information In Context

One Essential Form of Context: “Entity Resolution”

Is it 5 people each with 1 account or is it 1 person with 5 accounts?

Is it 20 cases of SARS in 20 cities or one case reported 20 times?

If one cannot count, one cannot estimate vector or velocity (direction, speed).

Without vector and velocity, prediction is nearly impossible.

Who is Fang Wong?

Fang Wong: Top 100 Customer
F A Wong: Seattle, DOB: 6/12/82, Former Customer
@FangWong: 2.5M Followers
FangWong@Email.com: Newsletter Subscriber
Fang Wong, FangWong@Email.com: Marketing Department’s Prospect List

Resolving the Fang Wong

Fang Wong: Top 100 Customer, 2.5M Followers, Newsletter Subscriber
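
A toy sketch of how the five records from the "Who is Fang Wong?" slide might collapse into one entity. It assumes exact matching on normalized identifiers and uses union-find to group records that share any identifier. The matching rule is an assumption made for illustration (in particular, treating a social handle as equal to a name once case, spaces, and "@" are removed); the talk does not describe its matching logic, and real entity resolution needs far stronger evidence.

```python
from collections import defaultdict

# Records transcribed from the "Who is Fang Wong?" slide.
RECORDS = [
    {"name": "Fang Wong", "label": "Top 100 Customer"},
    {"name": "F A Wong", "label": "Former Customer"},
    {"handle": "@FangWong", "label": "2.5M Followers"},
    {"email": "FangWong@Email.com", "label": "Newsletter Subscriber"},
    {"name": "Fang Wong", "email": "FangWong@Email.com",
     "label": "Marketing Department's Prospect List"},
]


def keys(record: dict) -> set:
    """Normalized identifiers used for exact matching (an assumed rule)."""
    out = set()
    if "email" in record:
        out.add(("email", record["email"].lower()))
    # Assumption: a name and a handle match when they are equal after
    # lowercasing and removing spaces and a leading "@".
    for fld in ("name", "handle"):
        if fld in record:
            out.add(("id", record[fld].lower().replace(" ", "").lstrip("@")))
    return out


def resolve(records: list) -> list:
    """Group record indices that share any key, using union-find."""
    parent = list(range(len(records)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    first_seen = {}
    for i, rec in enumerate(records):
        for k in keys(rec):
            if k in first_seen:
                parent[find(i)] = find(first_seen[k])
            else:
                first_seen[k] = i

    groups = defaultdict(list)
    for i in range(len(records)):
        groups[find(i)].append(i)
    return sorted(groups.values())


for group in resolve(RECORDS):
    print([RECORDS[i]["label"] for i in group])
```

Under these rules the "F A Wong" record stays on its own: nothing links it by exact match, and leaving it unmerged is the conservative choice.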

Graphing the (resolved) Fang Wong

Bill Smith: Member of the Board
Employee
Customer
Customer
Fraudster
Fang Wong: Top 100 Customer, 2.5M Followers, Newsletter Subscriber

Contextualizing Sandy Maden

Bill Smith: Member of the Board
Sandy Maden: New Account
Employee
Lives With
Co-signer
Former Customer
Customer
Customer
Customer
Fraudster
Fang Wong: Top 100 Customer, 2.5M Followers, Newsletter Subscriber

“Entities”

Bill Smith: Member of the Board
Lives With
Co-signer
Sandy Maden: New Account
Former Customer
Employee
Customer
Customer
Customer
Fraudster
Fang Wong: Top 100 Customer, 2.5M Followers, Newsletter Subscriber

Company

Boat

Plane

Asteroid

Car

Asteroid Hunting

Single Detection

Image courtesy of: Eva Lilly, Institute of Astronomy, University of Hawaii

From Orphans to Orbits

Single Detections (trash)
Tracklette
Track
Orbit
Forecasting

Named entity: S100ZUtza

Single Detection (orphan)

Anticipation

http://www.space.com/7854-slam-asteroids-suspected-space-collision.html

"We have directly observed a collision between asteroids for the first time, instead of having to infer that they happened from million-year-old remains."

Colin Snodgrass
Planetary Scientist

Max Planck Institute for Solar System Research

Geospatial Context via “Space Time Boxes”

Detecting Colocation

[Figure: Space Time Boxes. Axis labels: TIME, SPACE. Sizes shown: 1 day, 1 hour, 0.05 AU, 0.005 AU. Caption: Determine encounter distance and time]
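
A minimal sketch of the colocation idea, assuming a space-time box is simply a quantized (x, y, z, t) cell: objects whose samples fall into the same cell become candidate pairs for an exact encounter check. The pairing of 0.05 AU with 1 day as one box size is an assumption read off the figure labels, and `box_key`, `colocation_candidates`, and the sample data are hypothetical.

```python
import math
from collections import defaultdict
from itertools import combinations

BOX_AU = 0.05    # assumed spatial edge length of a box, in AU
BOX_DAYS = 1.0   # assumed temporal length of a box, in days


def box_key(x: float, y: float, z: float, t: float) -> tuple:
    """Map a position (AU) and a time (days) to its space-time box."""
    return (math.floor(x / BOX_AU), math.floor(y / BOX_AU),
            math.floor(z / BOX_AU), math.floor(t / BOX_DAYS))


def colocation_candidates(samples: list) -> set:
    """Return pairs of object ids that ever share a space-time box.

    `samples` holds (object_id, x, y, z, t) tuples.
    """
    boxes = defaultdict(set)
    for obj, x, y, z, t in samples:
        boxes[box_key(x, y, z, t)].add(obj)
    pairs = set()
    for members in boxes.values():
        pairs.update(combinations(sorted(members), 2))
    return pairs


samples = [
    ("A", 1.01, 2.02, 0.01, 10.2),
    ("B", 1.03, 2.04, 0.02, 10.7),   # same box as A
    ("C", 1.03, 2.04, 0.02, 11.1),   # same place, but the next day
    ("D", 3.01, 0.51, 0.11, 10.5),   # same day, far away
]
print(colocation_candidates(samples))
```

Only objects that share a box are compared, so the expensive distance-and-time calculation runs on a short candidate list instead of every pair. Two objects that are close but sit on opposite sides of a box boundary would be missed by this sketch; checking neighboring boxes as well is one common way to handle that.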

Computing 600k Asteroid Interactions over 25 Years

4-5 orders of magnitude improvement

Method                     Initial Analysis       Adding 1 New Trajectory
Space-Time Box Method      2,880 CPU hours        15 CPU minutes
N-body Simulation Method   10,000,000 CPU hours   4,000 CPU hours

Asteroid vs. Asteroid Encounters

Encounter      Distance   Asteroid 1   Size        Asteroid 2   Size
May 1, 2032    299km      00A9170      2-4km       0008758      4-9km
Nov 24, 2016   449km      00P5634      1-2km       0055711      2-5km
Jan 11, 2018   449km      K08E88J      530-1200m   00N0062      2-4km

June 12th, 2015

Hi Jeff & the gang,

I have great news! On Tuesday I happened to observe a close encounter you guys predicted - one 1 km and the other one 2 km in diameter!

To my knowledge this is the first case ever of direct observation of a close encounter in the small main belt asteroids. The closest point of encounter unfortunately happened during bright daylight in Hawaii, so I missed it …

Cheers!
Eva

Image courtesy of: Eva Lilly, Institute of Astronomy, University of Hawaii

[Theatrical Pause]

Action

Red Analytics

Green Analytics

Blue Analytics

Observation Space

Old School: Isolated Analytics

Observation Space

Action

Information In Context

Next: General Purpose Context Computing

Data Finds Data

Relevance Finds You

Context Computing

Observation Space

Action

Information In Context

Data Finds Data

Relevance Finds You

Context Computing

Helping Focus Human Attention

General Purpose
• Marketing
• Customer service
• Fraud detection
• Asteroid hunting

Simultaneously!

Making Data Work: Recommendations

Widen the observation space

Accumulate context to improve understanding

Deliver significantly higher quality outcomes … everywhere
– Life sciences
– Financial services
– Public safety

Leverage Hadoop/Spark to accelerate innovation

More

Blog: www.jeffjonas.typepad.com

Email: JeffJonas@us.ibm.com

Next: San Francisco, Nov 10-12, @Datapalooza
