Monetizing the Lake Dan Kernaghan, Pitney Bowes

Post on 22-May-2020

TRANSCRIPT

Page 1: Dan Kernaghan, Pitney Bowes · By consolidating data and running real-time address validation, they gained a complete view of customers, enabling more effective marketing, accelerated

Monetizing the Lake

Dan Kernaghan, Pitney Bowes

Page 2

2 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

Hadoop is Lower Cost and more Scalable

[Chart: Cost Per Terabyte, axis from 0 to 14,000; series labels: HDP Oracle X Teradata Netezza]

Page 3

Cost Drivers – The Big Picture

Insights – Produce more valuable and more holistic insights

Security – Apply security policies in one place instead of repeating them in each silo

Collaborate – Curate feature vectors for our data scientists and promote collaboration

Time – Get models into production faster; human time is still the most costly

Storage – Store data in an accessible file system at the lowest cost

[Diagram labels: Storage, Collaborate, Security, Insights, Time-to-Market]

Page 4

Various Data Types

Structured (DB2, Oracle)

First_Name  SSN     Net_Worth
Joe         233-33  100,000
Mark        456-77  200,000

Time-Series (KDB)

[Chart: y-axis from 0 to 40; x-axis times 12:05, 12:08, 12:11, 12:14, 12:17, 12:20]

Unstructured (File System)

Best Buy released their earnings this quarter and beat analyst expectations. Earnings per share increased by 0.02

Page 5

HDP Stack – Attack the Data with the Right Tool

Page 6

Limitations of Building a Model on a Traditional Platform

If you need a lot of data to build a good model, what tools can you use?
– Data volumes can rule out desktop tools
– R and Eclipse are both limited to 8 GB of RAM on the desktop machine

Sampling?
– Each sample needs an even distribution of true and false positives, but that requires data munging, which brings us back to the question of which tools we can use.

Security concerns?
– Data is extracted from its secure resting place and pushed into other environments, often unsecured files or desktops where Matlab or R can be installed.

Collaboration
– Push processing to the data using modern distributed tooling.
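The sampling concern on this slide can be made concrete with a small sketch. The Python below draws a class-balanced sample so that each label value is equally represented. Reading the slide's "even distribution" as balanced class labels is an assumption, and the record layout, the `label` field, and the function name `balanced_sample` are hypothetical. On a cluster the same idea would be expressed with distributed tooling rather than in-memory lists.

```python
import random
from collections import defaultdict


def balanced_sample(records, label_key, per_class, seed=0):
    """Draw up to `per_class` records for each label value.

    `records` is a list of dicts; `label_key` names the field holding the
    class label. Both are hypothetical stand-ins for real training data.
    """
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    by_label = defaultdict(list)
    for rec in records:
        by_label[rec[label_key]].append(rec)

    sample = []
    for label in sorted(by_label):  # sorted for a deterministic order
        group = by_label[label]
        k = min(per_class, len(group))  # never ask for more than exist
        sample.extend(rng.sample(group, k))
    return sample


if __name__ == "__main__":
    # Synthetic, heavily imbalanced data: 950 negatives, 50 positives.
    data = [{"id": i, "label": i % 20 == 0} for i in range(1000)]
    picked = balanced_sample(data, "label", per_class=40)
    counts = defaultdict(int)
    for rec in picked:
        counts[rec["label"]] += 1
    print(dict(counts))
```

Note that even this toy version has to group every record by label before sampling, which is the data munging step the slide says desktop tools struggle with at volume.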

Page 7

Apache Zeppelin

Web-based Notebook for interactive analytics

Use Case
• Data exploration and discovery
• Visualization
• Interactive snippet-at-a-time experience
• “Modern Data Science Studio”

Features
• Ad-hoc experimentation
• Deeply integrated with Spark + Hadoop
• Supports multiple language backends
• Incubating at Apache

Page 8

Data Science Notebooks - Collaborate

Page 9

Pitney Bowes and Hortonworks

Spatially Enabling the Data Lake

Page 10

Pitney Bowes Data – Global Coverage

Local datasets for 240 countries

Global coverage built on a legacy of accuracy and precision

Recognized leader for LI data and capabilities.

AMER: 764 datasets
EMEA: 3,079 datasets
APAC: 719 datasets

Page 11

Pitney Bowes Data – Unparalleled Depth

Pitney Bowes | Partner Program Overview | February 14, 2017

Page 12

Risk of Relying Solely on Public Data

[Figure: property record annotated with the values /5, July 1997, $207,000, /2, /13, 75, /5, Unfinished]

Incorrect information for this property:
• Last sale date
• Last sale price
• # of bedrooms
• # of rooms
• Finished basement
• # of spaces (garage)
• Structure type
• Lot width
• Parcel boundary

Page 13

Easy to Deploy and Use

Pitney Bowes | May 2, 2017

Client Applications

Pitney Bowes Data Products

Big Data Ecosystem Tools

Spatial Visualization

Reporting

Analytics

Custom Applications

Distributed Cluster

HDFS

Hive

Reference Datasets

NoSQL Database Spark

Spectrum Data Quality for Big Data

Spectrum Addressing for Big Data

Spectrum Spatial for Big Data

Spectrum Geocoding for Big Data

Spectrum Routing for Big Data

Page 14

Enriching Data with a Location Stack

For a given location:
• POI (carries attributes)
• Retail (Business) Footprint poly
• Building Footprint
• Parcel (Lot)
• Isochrone (travel time)
• Demographics, lifestyle attributes, financial and consumer vitality, etc.
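One way to picture the enrichment step is a point-in-polygon test: given a coordinate, find which footprint or parcel polygons contain it and attach their attributes. The sketch below uses the ray-casting test in plain Python. The layer names, coordinates, and attributes are made up for illustration, and this is not the Spectrum API.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside `polygon`.

    `polygon` is a list of (x, y) vertices in order; the last vertex is
    implicitly joined back to the first. Points exactly on an edge are
    not handled specially.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray from (x, y) cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def enrich(point, layers):
    """Collect attributes of every layer polygon containing `point`."""
    x, y = point
    return {
        name: attrs
        for name, (polygon, attrs) in layers.items()
        if point_in_polygon(x, y, polygon)
    }


# Hypothetical layers in an arbitrary planar coordinate system.
LAYERS = {
    "parcel": ([(0, 0), (10, 0), (10, 10), (0, 10)], {"lot_id": "A-1"}),
    "building": ([(2, 2), (6, 2), (6, 6), (2, 6)], {"floors": 2}),
}

if __name__ == "__main__":
    print(enrich((3, 3), LAYERS))  # inside both polygons
    print(enrich((8, 8), LAYERS))  # inside the parcel only
```

A real location stack would work in geographic coordinates and use a spatial index instead of testing every polygon, but the shape of the result is the same: one location, several layers of attributes.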

Page 15

Hydrating the Spatial Data Lake

[Map images: Wild Fire Risk, Walkability Scores]

Property Data
• 180M+ Property Addresses
• Geocode
• Property Attributes

Risk Data
• Property Boundaries
• Distance to Water
• Flood Risk
• Wild Fire Risk

Market Data
• GeoDemographics
• Neighborhood Boundaries
• Zip Code Boundaries
• Points of Interest

Plus
• Transactions
• IoT Sensors
• Social Media
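Combining the three data families amounts to a keyed join. The sketch below merges property, risk, and market records on a shared identifier. The `property_id` key, the field names, and the sample values are invented for illustration, and a real data lake would run this join in Hive or Spark rather than over Python dicts.

```python
def hydrate(property_rows, risk_rows, market_rows, key="property_id"):
    """Left-join risk and market attributes onto property records.

    Each argument is a list of dicts sharing the field named by `key`
    (a hypothetical identifier). Properties with no match in a source
    simply keep the fields they already have.
    """
    risk_by_key = {row[key]: row for row in risk_rows}
    market_by_key = {row[key]: row for row in market_rows}

    combined = []
    for prop in property_rows:
        merged = dict(prop)  # copy so the inputs are not mutated
        merged.update(risk_by_key.get(prop[key], {}))
        merged.update(market_by_key.get(prop[key], {}))
        combined.append(merged)
    return combined


# Made-up sample rows, one list per data family.
PROPERTIES = [
    {"property_id": "P1", "address": "1 Main St"},
    {"property_id": "P2", "address": "2 Oak Ave"},
]
RISK = [{"property_id": "P1", "flood_risk": "low", "distance_to_water_m": 850}]
MARKET = [{"property_id": "P2", "neighborhood": "Riverside"}]

if __name__ == "__main__":
    for row in hydrate(PROPERTIES, RISK, MARKET):
        print(row)
```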

Page 16

Case studies: Drive superior business outcomes and gain a deeper understanding of customers.

Online Mortgage Loan Provider
By consolidating data and running real-time address validation, they gained a complete view of customers, enabling more effective marketing and accelerated mortgage origination, with loan processing in days, not weeks.

Financial service firm gains richer profiles
Restored missing address data through data standardization, data augmentation and geocoding. Enabled the firm to run targeted multichannel promotions via web and smartphone apps.

Global US-Based Wealth Management Organization
Increase customer lifetime value and provide an ideal customer experience by optimizing every contact with its mass-affluent customers, with a 35% increase in revenues and a 55% improvement in client satisfaction.
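To make the "data standardization" step in the financial services case concrete, here is a toy sketch that normalizes case, whitespace, punctuation, and a few street-type abbreviations so that the same address written two ways compares equal. The abbreviation table and the function name `standardize_address` are assumptions for illustration. Production address validation relies on reference data and is far more involved.

```python
import re

# Tiny, illustrative abbreviation table (not a complete postal standard).
STREET_TYPES = {
    "street": "ST",
    "st": "ST",
    "avenue": "AVE",
    "ave": "AVE",
    "road": "RD",
    "rd": "RD",
}


def standardize_address(raw):
    """Return a normalized form of a free-text street address."""
    # Drop punctuation, collapse whitespace, and split into tokens.
    tokens = re.sub(r"[^\w\s]", " ", raw).split()
    normalized = [STREET_TYPES.get(tok.lower(), tok.upper()) for tok in tokens]
    return " ".join(normalized)


if __name__ == "__main__":
    a = standardize_address("123  Main Street.")
    b = standardize_address("123 main st")
    print(a, "|", b, "|", a == b)
```

Once addresses share a normalized form, duplicate customer records can be matched and the survivors passed on for augmentation and geocoding.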

Pitney Bowes | Partner Program Overview | May 2, 2017

Page 17

Property Analytics Case Study

Large US Online Loan Provider

Business Challenge: Close loans more quickly and improve client experience while mitigating the lender’s risk

This lender, unlike most others, relies on wholesale funding to make its loans and uses online applications rather than a system of branches.

Close Loans Faster
The lender found that many specific requirements delayed loan funding and closure, causing clients to abandon the online process. Integration of Pitney Bowes data through the pb key enabled analysis of loan requests to provide an accurate qualification of the property for a loan, reducing abandonment rates and accelerating revenue.

Mitigating Risk
The accurate and complete attributes provided by the spatial data lake allowed the risks associated with a loan to be assessed correctly, enabling more accurate pricing and profitability.

Desired Outcomes
▪ Improved real-time and long-term decisions
▪ Access to accurate data for 180M properties in the US
▪ Sharing information with partners (e.g. Fannie Mae)
▪ Complete picture of property, risk and market

Benefits
▪ Accurate qualification of property for a particular loan type
▪ Faster loan processing and closure
▪ Improved risk assessment of loan to particular property

Page 18

Five reasons to modernize with Pitney Bowes Big Data SDKs

1. They’re easy
• Simple and intuitive user experience
• Program in SQL to run processes in the Hortonworks Spatial Data Lake

2. They’re powerful
• Take advantage of more data
• Answer questions that were too big before

3. They’re incredibly fast
• Process enormous amounts of data in a fraction of the time

4. They’re practical
• Avoid large capital outlays
• They’ll run in the cloud

5. They’re secure
• Extend and enforce your Hadoop permissions
• Easy to manage and configure
