TRANSCRIPT
#AnalyticsX
Copyright © 2016, SAS Institute Inc. All rights reserved.
The Value of Moving Streaming Analytics Outside the Data Center
Mark Lochbihler, Director, Partner Engineering, Hortonworks
1 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Disclaimer
This document may contain product features and technology directions that are under development, may be under development in the future or may ultimately not be developed.
Project capabilities are based on information that is publicly available within the Apache Software Foundation project websites ("Apache"). Progress of the project capabilities can be tracked from inception to release through Apache; however, technical feasibility, market demand, user feedback and the overarching Apache Software Foundation community development process can all affect timing and final delivery.
This document’s description of these features and technology directions does not represent a contractual commitment, promise or obligation from Hortonworks to deliver these features in any generally available product.
Product features and technology directions are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind.
Since this document contains an outline of general product development plans, customers should not rely upon it when making purchasing decisions.
Your Presenter
Mark Lochbihler
Hortonworks Partner Engineering
@MarkLochbihler
“26 years of experience in Computer Science, SAS and Data Platforms”
Today’s Agenda September 14th, 2016
• Hortonworks and SAS Partnership
  • Data Explosion, the Market and Joint Customer Stories
• Hortonworks Connected Data Platforms
  • Hortonworks Data Platform
  • Hortonworks Data Flow
• Hortonworks and SAS Integrations (High Level Overview)
• Focus on SAS ESP Integrations with HDP and HDF
  • SAS ESP running in HDP
  • SAS ESP running with HDF
• Edge Analytics Value Summary
Founded in 2011
Original 24 architects, developers, operators of Hadoop from Yahoo!
800+ EMPLOYEES
1500+ ECOSYSTEM PARTNERS
• 800+ customers (as of Jan 1st, 2016)
• Publicly traded on NASDAQ: HDP
• The Leader in Connected Data Platforms
Data in Motion - Hortonworks Data Flow
Data At Rest - Hortonworks Data Platform
Powering Modern Data Applications
• Leader in open-source community, focused on innovation to meet enterprise needs
• Unrivaled Hadoop support subscriptions
SAS + Hortonworks Global Alliance
Strategic alliance established October 2013
Dedicated Alliance Management
Tier-1 Hadoop Distribution Vendor for SAS
Joint R&D with YARN integration
Joint Product Roadmap
Both – Founding Members of ODPi and DGI
"The expanded integration of SAS with Hortonworks Data Platform provides a simple way for customers to broaden their analytic operations across new data sets that can drive smarter business decisions."
Shaun Connolly, VP of Corporate Strategy, Hortonworks
"Adopting YARN allows us to use the YARN infrastructure to set the boundaries for the processes needed to run SAS HPA products and SAS LASR Analytic Server based products. CPU and memory can be capped, facilitating a better sharing model for the cluster."
Paul Kent, Vice President, Big Data, SAS
EMBRACE AN OPEN APPROACH
MASTER THE VALUE OF DATA
EVERY BUSINESS IS A DATA BUSINESS
INTERNET OF ANYTHING
4 ZB DATA
44 ZB DATA TOMORROW
Data Powers Highway Safety
Tire Pressure
Server Log
Mobile
Sensor
Location
Precipitation
Social
Click-stream
Data Powers Better Health
Claims Codes
Server Logs
Mobile
Sensor
Wearable Devices
EMR Data
Medical Research
Click-stream
Data Powers Digital Security
Emails
Mobile
Sensor
Firewall Log
Virus Definitions
Social
Click-stream
Server Log
Customer Insights – SAS leveraging a centralized Big Data Lake
Telco / Media: Large multi-channel media provider
Why Hortonworks and SAS? Customer Insight
Challenge: Unable to analyze huge amounts of data to optimize and improve real-time customer insights
• Understand Audience: Having the largest volume of data sets and audience segments/profiles while leading the marketplace in privacy and governance.
• Find Audience: Being leaders in identifying and targeting audiences across channels, platforms and devices.
• Engage Audience: Driving engagement across platforms and formats.
• Measure Audience: Exceeding client expectations with transparent reporting and accurate attribution models.
Solution: Integration and analysis of all data collected across the organization
• Query ALL data in one location: a blend of online and offline data, subscription, ecommerce, loyalty programs, etc.
• Land massive click-stream log files: 100+ M records / day, 30 million unique IDs / month
• Use 100% of the data for analysis and visualization instead of smaller random samples (over sampling)
• Identified and modeled more than 600 relevant web characteristics out of a field of 75,000 with SAS
Unified Customer Record - 360° Customer View - to Improve Sales
Retail: Major home improvement retailer
Why Hortonworks and SAS? Single View
Challenge: Lack of a unified customer record across all channels clouded targeting for marketing campaigns
• No “golden record” for analytics on customer buying behavior across all channels
• Data repositories on web traffic, POS transactions and in-home services were in silos
• Data storage costs were increasing, without a corresponding increase in value
Solution: HDP data lake drives golden customer record, targeted marketing, and reduction in data storage expenses
• Golden record enables targeted, personalized marketing with higher success rates
• Data warehouse offload saved millions of dollars in recurring expense
• Price optimization versus competitors: several millions in top-line revenue growth
Improve Reimbursement - by Finding Errors in Claims
Insurance / Healthcare: Large US medical insurer
Why Hortonworks and SAS? Data Discovery
Challenge: Difficulty identifying coding errors among 300K daily claims
• Health insurer had goals of marrying electronic health records with claims data
• Data analysis is disjointed, making it difficult to identify coding errors
• Undiscovered errors may harm patient health and reduce reimbursement from government programs, costing many millions in missed payments
Solution: Using SAS and Hortonworks to improve reimbursement revenue and health outcomes
• HDP + SAS: Marrying and analyzing numerous pools of data stored in HDP (including gross margins, taxes, customer claims and policy premiums) to determine the company's potential exposure and manage its resources more effectively.
• Ability to crunch several terabytes of data, and then revise, recalculate and report on that data on a weekly basis.
Hortonworks Delivers Connected Data Platforms
• Data in Motion: Hortonworks DataFlow
• Data at Rest: Hortonworks Data Platform
Diagram labels: Internet of Anything, Perishable Insights, Historical Insights, Actionable Intelligence, Modern Data Applications
The Value of Modern Data Apps: Custom or Off the Shelf
• Real-Time Cyber Security: protects systems with superior threat detection
• Smart Manufacturing: dramatically improves yields by managing more variables in greater detail
• Connected, Autonomous Cars: drive themselves and improve road safety
• Future Farming: optimizing soil, seeds and equipment to measured conditions on each square foot
• Automatic Recommendation Engines: match products to preferences in milliseconds
Diagram labels: Data at Rest, Data in Motion, Actionable Intelligence, Modern Data Applications, Hortonworks DataFlow, Hortonworks Data Platform
Hortonworks Data Platform for Data at Rest: Powered by Open Enterprise Hadoop
Open
Interoperable
Ready
Central
Central Management of Data at Rest
YARN: DATA OPERATING SYSTEM
OPERATIONS, SECURITY, GOVERNANCE
STORAGE
Machine Learning, Batch, Streaming, Interactive, Search
• Centralized Platform for operations, governance and security
• Diverse Applications run simultaneously on a single cluster
• Maximum Data Ingest including existing and new sources, regardless of raw format
• Shared Big Data Assets across business groups, functions and users
Hortonworks DataFlow for Data in Motion: Powered by Apache NiFi
Secure
Real-time
Adaptive
Integrated
HDF Provides “Data Plan of Control” by Managing IoT Dataflows
Constrained
High-Latency
Localized Context
Hybrid – Cloud/On-Premise
Low-Latency
Global Context
Data source agnostic collection of data across heterogeneous environments
Without HDF
Collecting Source Data is complicated
HDF is a Dataflow Management Platform
With HDF
Hortonworks AND SAS Deliver Advanced Analytics Anywhere!
ESP
Grid Manager
LASR
HP Procedures
Code Accelerator
Data Quality Accelerator
EP
Data Management
Data Mining
Data Discovery
Advanced Analytics
Focus on SAS ESP Integrations
with HDP and HDF
The Value of Event Stream Processing
SAS ESP complements HDF and HDP by offering “Complex Event Processing” or CEP
Cyber Security - identify a malicious intrusion before or as it occurs
Fraud – analyze streaming transactions to determine which need immediate attention
Predictive Maintenance – predict outlier conditions from streaming machine and sensor data
Customer Experience and Marketing – use streaming data insights to personalize interactions
Stream Data Management – transform and clean data in motion, storing only what you need
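As a rough illustration of the kind of rule “Complex Event Processing” evaluates, the fraud use case above could be sketched as a sliding-window check in plain Python. This is not SAS ESP code and uses no SAS or HDF API; the function name `detect_bursts`, the `(timestamp, card_id)` event shape, the 60-second window and the threshold of 3 are all assumptions made for the example.

```python
from collections import defaultdict, deque

def detect_bursts(events, window_seconds=60, threshold=3):
    """Yield (card_id, timestamp) whenever a card reaches `threshold`
    transactions inside a sliding window of `window_seconds`."""
    recent = defaultdict(deque)  # card_id -> timestamps still inside the window
    for timestamp, card_id in events:  # events are assumed to arrive in time order
        window = recent[card_id]
        window.append(timestamp)
        # Drop timestamps that have fallen out of the window.
        while window and timestamp - window[0] > window_seconds:
            window.popleft()
        if len(window) >= threshold:
            yield card_id, timestamp

# Hypothetical stream of (timestamp_in_seconds, card_id) pairs.
stream = [(0, "A"), (10, "B"), (20, "A"), (45, "A"), (200, "B"), (300, "A")]
alerts = list(detect_bursts(stream))
```

Card "A" makes three transactions within 45 seconds, so it is the only card flagged; the decision is made as the third event arrives rather than after the data has been stored.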
Data Center Approach: SAS ESP processing is co-located within HDP
In this deployment model, SAS ESP provides “Complex Event Processing” at the point of data being ingested into Hadoop.
Diagram labels: Internet of Anything, ESP, HDP, Data at Rest, Storage, Groups 1 to 4
SAS ESP on HDP is YARN Ready
1) ESP Job Launcher
2) Request ESP: Memory / Core Requirements
3a) ESP Server
3b) ESP Server
3c) ESP Server
Diagram labels: ResourceManager, Scheduler, NodeManagers, AM 1 to AM 4, Containers 1.1 to 1.3, Container 3, Container 4
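To make step 2 of the diagram (“Request ESP: Memory / Core Requirements”) more concrete, here is a toy first-fit placement in plain Python. It is only a sketch of the idea that each ESP server declares memory and core needs and is placed where capacity remains; it is not YARN's scheduler and calls no YARN or SAS API. The node names, request names and capacities are hypothetical.

```python
def place_containers(nodes, requests):
    """First-fit placement of container requests onto nodes.

    nodes:    {node_name: {"memory_mb": int, "vcores": int}} free capacity
    requests: list of {"name": str, "memory_mb": int, "vcores": int}
    Returns {request_name: node_name, or None if nothing fits}.
    """
    free = {name: dict(cap) for name, cap in nodes.items()}  # do not mutate the input
    placement = {}
    for req in requests:
        placement[req["name"]] = None
        for node, cap in free.items():
            if cap["memory_mb"] >= req["memory_mb"] and cap["vcores"] >= req["vcores"]:
                # Reserve the capacity so later requests cannot claim it again.
                cap["memory_mb"] -= req["memory_mb"]
                cap["vcores"] -= req["vcores"]
                placement[req["name"]] = node
                break
    return placement

# Hypothetical cluster and ESP server requests.
nodes = {
    "node1": {"memory_mb": 8192, "vcores": 4},
    "node2": {"memory_mb": 4096, "vcores": 2},
}
requests = [
    {"name": "esp-server-a", "memory_mb": 6144, "vcores": 2},
    {"name": "esp-server-b", "memory_mb": 4096, "vcores": 2},
    {"name": "esp-server-c", "memory_mb": 4096, "vcores": 2},
]
placement = place_containers(nodes, requests)
```

The third request is left unplaced because no node has enough memory remaining, which is the effect of capping that lets ESP share a cluster with other workloads.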
Extending Streaming Analytics with SAS ESP and HDF
Execute SAS ESP Advanced Analytics as a part of any HDF workflow:
• As data moves between data centers
• As data moves from the Edge or Remote Access Points to a data center
• As data moves from a data center to the cloud
Diagram labels: ESP, HDF, Remote to Data Center, Between Data Centers, Between Data Centers & Cloud
SAS ESP can provide “Complex Event Processing” to any HDF workflow
SAS models that were built on historical data can be moved closer to the “Edge” of a modern data application.
Diagram labels: Internet of Anything, Data in Motion, ESP, Data at Rest, Storage, Groups 1 to 4
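One way to picture a model built on historical data being moved toward the edge: the fitting happens centrally, and only the resulting coefficients travel to where events arrive. The sketch below scores each event with a logistic model in plain Python. The coefficients, feature names and threshold are invented for illustration, and nothing here reflects how SAS ESP actually packages or deploys models.

```python
import math

# Hypothetical coefficients, standing in for a model fitted offline on historical data.
MODEL = {"intercept": -4.0, "weights": {"vibration": 0.8, "temperature": 0.05}}

def score(event, model=MODEL):
    """Return the model's probability of failure for one sensor event."""
    z = model["intercept"] + sum(
        weight * event[name] for name, weight in model["weights"].items()
    )
    return 1.0 / (1.0 + math.exp(-z))

def needs_attention(event, threshold=0.5):
    """Decide at the edge whether an event should raise an alert."""
    return score(event) >= threshold

# Two hypothetical sensor readings.
normal = {"vibration": 1.0, "temperature": 40.0}
failing = {"vibration": 5.0, "temperature": 60.0}
```

Scoring is a handful of multiplications per event, so it can run on modest hardware near the data source, while retraining stays with the historical data in the data center or cloud.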
Edge Analytics - On Premise and In the Cloud
It should also be noted that with SAS and HDP, organizations are executing Machine Learning and Deep Historical Closed Loop Analytics in the Cloud and Data Center as well.
Diagram labels: Internet of Anything, ESP, HDP, Cloud, On Premise, Data at Rest, Storage, Groups 1 to 4, Deep Historical Analysis, Machine Learning
SAS ESP NiFi Processors – “Drag and Drop” Integration
SAS ESP NiFi Processors enable seamless “Drag and Drop” integration within any HDF workflow.
SAS ESP NiFi Processors come with SAS ESP 4.1, which went GA in September 2016.
HDF and SAS ESP can extend outside the Data Center
Sources or “Things”: People, Planes, Cars, Machines, Buildings, ….
Sensors & Actuators
Edge Gateways with Data Aggregation and Filtering
Streaming Analytics and Computing
• Data Flow Management
• Simple Event Processing
• Complex Event Processing
Regional and Central Data Centers & the Cloud
• Event Stream Processing
• Scalable Storage
• Big Data Analytics
HDF and SAS ESP Extend Beyond Traditional Data Center Firewalls:
• HDF offers Data Plan of Control: Apache NiFi, MiNiFi and Client Libraries
• SAS ESP complements HDF by providing “Complex Event Processing” anywhere along the “Data Plan of Control”
Diagram labels: Sources, Regional and Core Infrastructures, MiNiFi, Client Libraries, ESP
Summary
Why Move Streaming Analytics Further out to the Edge?
Reacting immediately to an important “Event” closer to the “Edge” can yield significant positive results by:
• Increasing Customer Loyalty and Revenue
  • Example: Providing a personalized, appealing offer that generates a close.
• Reducing Operational Inefficiencies and Expenses
  • Examples:
    • Stopping fraud as it occurs
    • Catching an early warning which alerts for immediate maintenance
    • Only storing relevant events
    • Enriching data as it is ingested
HDF and SAS ESP are integrated to allow an organization to implement specific edge streaming use cases to improve bottom line results.
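Two of the examples above, only storing relevant events and enriching data as it is ingested, can be sketched together as a single pass over a stream. This is plain Python for illustration, not an HDF flow or a SAS ESP project; the field names, the `site_info` lookup table and the `min_reading` cutoff are assumptions.

```python
def process_at_edge(events, site_info, min_reading=50.0):
    """Drop readings below `min_reading` and enrich the rest with site metadata."""
    for event in events:
        if event["reading"] < min_reading:
            continue  # irrelevant event: never forwarded, never stored
        enriched = dict(event)  # copy so the caller's event is untouched
        enriched.update(site_info.get(event["device_id"], {}))
        yield enriched

# Hypothetical reference data and sensor events.
site_info = {"pump-7": {"site": "plant-east", "asset_type": "pump"}}
events = [
    {"device_id": "pump-7", "reading": 12.5},
    {"device_id": "pump-7", "reading": 71.0},
    {"device_id": "fan-2", "reading": 90.0},
]
forwarded = list(process_at_edge(events, site_info))
```

Only two of the three events leave the edge, and the one from a known device already carries its site context, so less data is moved and stored downstream.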
Thank You!
Mark Lochbihler
@MarkLochbihler
To learn more about our partnership, visit us at:
hortonworks.com/partner/sas/
sas.com/hortonworks