EXECUTIVE BRIEFING: The Hidden Data Scientists Lurking in your Company Jack Norris, SVP Data and Applications MapR Technologies


Post on 24-Feb-2021


TRANSCRIPT

Page 1: Hidden Data Scientists Lurking in your company-v2...Fraud detection Predictive Maintenance Anomaly Detection AI , ANALYTICS Data modernization Scale out data lakes Multi-temperature

EXECUTIVE BRIEFING: The Hidden Data Scientists Lurking in your Company

Jack Norris, SVP Data and Applications, MapR Technologies

Page 2

2 © 2019 MapR Technologies, Inc. // MapR Confidential

Putting Things into Perspective

[Chart: INVESTMENT IN NEXT-GEN VS. LEGACY TECHNOLOGIES FOR DATA, 2013 to 2020, in $ (millions). Series: legacy technology investment, next-gen technology investment, total $ growth of IT market.]

Source: IDC, Gartner; Analysis & Estimates: MapR. Next-gen consists of cloud, big data, software and hardware related expenses.

Page 3

McKinsey estimates AI techniques have the potential to create between $3.5T and $5.8T in value annually across nine business functions in 19 industries.

The Impact of AI

Forrester Research predicts that by 2020, businesses adopting Machine Learning, AI, and Deep Learning, the Internet of Things (IoT), and Big Data will take away more than $1.2 trillion from their less-informed peers.

Page 4

Demand for Data Scientists

29% annual increase in demand, 344% since 2013
- Indeed Job Stats

In the UK, 80% of companies are planning to hire data scientists
- MHR Analytics

Page 5

Deep Learning Algorithms

• Deep Neural Networks: providing lift for classification and forecasting models

• Convolutional Neural Networks: feature extraction and classification of images

• Recurrent Neural Networks: for sequences of events, language models, time series, etc.

Page 6

ML/AI: ALGORITHMS + DATA + DOMAIN EXPERIENCE

Page 7

Focus On Machine Learning Tools

Page 8

Curiosity

Page 9

The Importance of Data

Page 10

“90+% of Machine Learning Success is Data Logistics”

https://mapr.com/ebook/machine-learning-logistics

Page 11

More DATA beats complex algorithms

The Unreasonable Effectiveness of Data, published by Google

Page 12

More Data Allows You to Spot Infrequent Behaviors

[Chart: f plotted against time (t = years), from t to t+n]

With recent data you have limited historical data accumulated

®© 2014 MapR Technologies 12

Page 13

More Data Allows You to Spot Infrequent Behaviors

[Chart: f plotted against time (t = years), from t to t+n]

With big data, you can trace infrequent patterns through time that call out anomalies.
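The point about history length can be illustrated with a small sketch. It is not taken from the deck: the event log, the pattern names, and the thresholds `max_share` and `min_repeats` are all hypothetical. It only shows that a pattern recurring every few years is invisible as a pattern in a short window but stands out once enough history has accumulated.

```python
from collections import Counter

def recurring_rare_patterns(events, max_share=0.3, min_repeats=2):
    """Return patterns that repeat (>= min_repeats) yet stay rare (<= max_share of events).

    `events` is a list of (year, pattern) pairs; both thresholds are
    illustrative assumptions, not values from the presentation.
    """
    counts = Counter(pattern for _year, pattern in events)
    total = sum(counts.values())
    return {
        pattern: n
        for pattern, n in counts.items()
        if n >= min_repeats and n / total <= max_share
    }

# Hypothetical log: one observation per year, with a spike every fourth year.
history = [
    (year, "pressure_spike" if year % 4 == 0 else "normal")
    for year in range(2008, 2020)
]
# "Recent data only": the last four years of the same log.
recent = [event for event in history if event[0] >= 2016]

long_view = recurring_rare_patterns(history)   # the spike shows up as a recurring rare pattern
short_view = recurring_rare_patterns(recent)   # a single spike cannot be told apart from noise
```

With the full log the spike is reported as a rare but recurring pattern; with only the recent window nothing qualifies, because one occurrence is not enough to call it a pattern.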

Page 14

How to see beyond the obvious

Page 15

How to see beyond the obvious and react quickly

Page 16

The Importance of Data Logistics

“Machine learning offers a fantastically powerful toolkit for building complex systems quickly. … it is remarkably easy to incur massive ongoing maintenance costs at the system level when applying machine learning.”

Page 17

The Impact of Analytics

Page 18

Evolving Analytics at Scale

• Descriptive – What happened?

• Predictive – What will happen?

• Prescriptive – The best response to what is happening

Global Report on Technology and the Economy, “Flow and the Big Shift in Business Models”.

Descriptive Predictive Prescriptive

Information Advice

➢ Injecting analytics into operations

Page 19

Streaming, NoSQL, Analytics, Storage, Messaging, Processing Engines, Document Database

Digital Growth Requires Complex Requirements

CONTEXT SPEED

ACTION

Page 20

AMERICAN EXPRESS

Implemented big data machine learning use cases: fraud detection & prevention; new customer acquisition & recommendations for better customer experience.
Chao Yuan, SVP & Head of Decision Science

$1 Trillion Protected Annually from Fraud

< 2 ms It Takes To Make A Decision

Page 21

Machine Learning based predictive maintenance & data intensive applications to reduce operational expenses and increase uptime

200K+ Sensors

300 Billion+ Data Points Evaluated Daily

Page 22

Automation to Aid, Not Replace, the Data Scientist

Page 23

Analytics at Scale: Impact on Business Models

• Moving from upfront payments to usage models

• Paying for the actual value created opens new sources of competitive advantage

Global Report on Technology and the Economy, “Flow and the Big Shift in Business Models”.

Upfront purchase Usage Impact

Transaction Relationship

Page 24

Incitec Pivot Ltd
From Manufacturing Optimization to “Explosions as a Service”

Objectives:
• Improve explosive manufacturing quality and yield
• Offer new service to consumers to ensure successful blasts at mine sites

• Collecting data from PLCs and sensors to optimize operations and perform predictive maintenance

• Edge deployments at mine sites for analysis and control of explosions

Page 25

Training Resources

Data Engineering, Data Science

Core Platform Services
ENGAGEMENTS: Installation; Migrations; SLA Plans; Best Practices; Performance Tuning
SKILLS: IT / Infrastructure; Converged Platform; Linux; Networking; Data Center; Storage; Operations

Big Data Workflows
ENGAGEMENTS: Hive/Pig/Spark; Oozie/Sqoop; Flume; MapR-DB/HBase; Data Pipeline; MapR Streams
SKILLS: BI / DBA; BI / ETL / Reporting; Scripting / Java; Hadoop MR; Eco Projects (HBase, Hive, …)

Solution Design
ENGAGEMENTS: HBase/MapR-DB; Map/Reduce; Application Development; Integration Development
SKILLS: Java; Hadoop Developer; Architectural Design

Advanced Analytics
ENGAGEMENTS: Use case Discovery; Use case Modeling; POC; Workshops
SKILLS: Modeler / Analyst; PhD; Statistics/Math; MatLab / R / SAS; Scripting / Java; BI / ETL / Reporting

Page 26

Quick Start Solutions: Speeding Time-to-Value

• Data Warehouse Offload, Optimization and Analytics
• Oil and Gas – Tier 2 Historian
• Retail – Customer 360, Social Media Analysis, Recommendation Engine
• Time Series Analytics, NoSQL Webstore Applications
• Deep Learning on GPUs for Image Analytics
• Financial Services – Fraud Detection, Anti-Money Laundering
• Complex Event Processing with Drools / Stream Processing
• Self Service Data Exploration and BI Analytics on Hadoop

Solution Template, Deployment Architecture, Knowledge Transfer

Page 27

Data Scientist Access To Fast Moving Innovation & Agility

/f1

REQUIRED DATA

NFS File Access

CSI Container Access

Page 28

The Challenge for IT Organizations

Innovation, Flexibility

Democratization of Data

Control, Security

Protection, Availability

ML/AI data logistics

BI analytics

Data science platform

Recommendation engine

Customer Engagement

Fraud detection

Predictive Maintenance

Anomaly Detection

AI, ANALYTICS

Data modernization

Scale out data lakes

Multi-temperature data

Data catalogs

Inter-cloud data/app portability

Cloud bursting

Persistent app in containers

Secure data provisioning

IoT, EDGE, CLOUD & CONTAINERIZED

DATA LAYER IS THE LEVERAGE POINT

Page 29

Analytics at Scale: Importance of Location

• Edge – Single Location Processing

• On Premise – Centralized Processing

• Multi-Cloud – Fully Distributed

Global Report on Technology and the Economy, “Flow and the Big Shift in Business Models”.

Edge On Premise Multi-Cloud

Local Distributed Global

➢ Learning Globally, Acting Locally

Page 30

Customers Tell Us They Need A Global Data Fabric
NEED SCALABLE STORAGE PLATFORM AROUND THE GLOBE, AROUND THE CLOCK

HDFS, REST, POSIX
Geo-Replication
Many Small Sites, A Few Big Sites
Leverages Cloud and On-Premise
Globally Protected, Globally Accessible, Globally Managed
NFS
Data Protection
Self Healing: Assume Failures are Common
Glacier

Page 31

Top Issues for Cloud Today

https://www.logicmonitor.com/wp-content/uploads/2017/12/LogicMonitor-Cloud-2020-The-Future-of-the-Cloud.pdf

Page 32

Cross Cloud: Difficult to Establish Unified & Secure Data Access

[Diagram: an Application and an API Connector linked through separate APIs to Edge, Private Cloud, On Premise, and three Public Clouds]

Page 33

Cloud With A Data Platform: Portable, United Access with Security

[Diagram: an Application and an API Connector reach Edge, Private Cloud, On Premise, and three Public Clouds through Open APIs over a GLOBAL DATA MANAGEMENT layer]

• Unified Security Model
• Data access decoupled from physical storage location. Globally.
• Data made portable
• No lock-in to proprietary APIs
• Full openness

Silo Problem solved!

Page 34

Performs real-time analysis to optimize Oil and Gas drilling and production

Page 35

Connected Car: Driving the Edge Use Cases

Data Driver for:
• Vehicle Efficiency and Performance
• Personalized Experience
• Manufacturing Optimization
• Driver Assisted and Driverless Vehicles

By 2020, more than 250 million vehicles will be connected globally

- Gartner

Page 36

Requires Ability to Learn Globally and Act Locally

REQUIREMENTS
• Collect data in full fidelity
• Apply the latest, most accurate models – even in areas of low bandwidth, high latency, or space constraints
• Avoid the patchwork of silos that comes with historian systems

COORDINATED DATA FLOWS
• Pub-sub data streaming:
  • Data ingestion from IoT devices and
  • Edge to cloud data communication where streams are persisted for centralized analysis
• Extends the platform to the edge, allowing for applications and models to run at the edge (required for time-sensitive applications)

[Diagram: MAPR EDGE nodes, an External Application, and MAPR DATA PLATFORM clusters connected by Topics over pub-sub messaging]
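The coordinated data flow described on this slide can be sketched in a few lines. This is a toy in-memory stand-in, not the MapR API: the `Stream` class, the topic name, the pressure readings, and both model functions are hypothetical, chosen only to show an edge site acting on each reading immediately while the same readings are persisted for centralized analysis.

```python
from collections import defaultdict

class Stream:
    """Toy persisted pub-sub stream: messages are appended per topic and kept."""

    def __init__(self):
        self._topics = defaultdict(list)

    def publish(self, topic, message):
        self._topics[topic].append(message)

    def read(self, topic, offset=0):
        """Return messages from `offset` onward; each consumer tracks its own offset."""
        return self._topics[topic][offset:]

def edge_act_locally(reading, limit=100.0):
    """Hypothetical edge model: decide on the spot, with no round trip to the cloud."""
    return "shut_valve" if reading["pressure"] > limit else "ok"

def cloud_learn_globally(readings):
    """Hypothetical central analysis: derive a new limit from readings across all sites."""
    values = [r["pressure"] for r in readings]
    return sum(values) / len(values) * 1.5

stream = Stream()
readings = [
    {"site": "edge-1", "pressure": 90.0},
    {"site": "edge-2", "pressure": 120.0},
    {"site": "edge-1", "pressure": 60.0},
]

actions = []
for reading in readings:
    actions.append(edge_act_locally(reading))      # act locally, immediately
    stream.publish("sensor-readings", reading)     # persist for central analysis

# The central consumer replays the persisted stream and computes an updated limit,
# which could then be pushed back out to the edge models.
new_limit = cloud_learn_globally(stream.read("sensor-readings"))
```

Because the stream keeps every message, a consumer that joins later can replay from any offset, which is what makes centralized analysis independent of when the edge site produced the data.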

Page 37

How to Unleash your Lurking Data Scientists

Page 38

Machine Learning Logistics

Discrete Response System
Move to Support Multiple Models for accuracy, separation of concerns

Load Balancer Approach

With a load balancer, you can start and stop new models pretty easily, but you lack:
• Latency guarantees,
• Ability to compare models on identical inputs,
• Records of all of the requests with responses from all live models.
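The three gaps listed above suggest what an alternative has to provide. The following is a minimal sketch under stated assumptions, not the design shown on the later slides: every live model receives the identical request, every response is recorded, and a selection step returns the most preferred answer that arrived within a latency budget. The model names, thresholds, and simulated latencies are hypothetical.

```python
def run_models(request, models):
    """Send the identical request to every live model and record each response.

    Each model is a callable returning (output, seconds); the latency is
    simulated here so the example is deterministic.
    """
    records = []
    for name, model in models.items():
        output, seconds = model(request)
        records.append(
            {"request": request, "model": name, "output": output, "seconds": seconds}
        )
    return records

def select_response(records, preference, budget_seconds):
    """Return the record of the most preferred model that met the latency budget."""
    by_model = {record["model"]: record for record in records}
    for name in preference:
        record = by_model.get(name)
        if record is not None and record["seconds"] <= budget_seconds:
            return record
    return None  # no model answered in time

# Hypothetical models: a fast incumbent and a slower candidate.
models = {
    "champion": lambda score: (score > 0.5, 0.004),
    "challenger": lambda score: (score > 0.4, 0.020),
}

records = run_models(0.45, models)  # full log: all models, same input
chosen = select_response(records, ["challenger", "champion"], budget_seconds=0.010)
```

Here the challenger is preferred but misses the 10 ms budget, so the champion's answer is returned, while `records` still holds both responses to the same input for later comparison.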

Page 39

Stream-based Logistics

Page 40

Rendezvous Architecture

Page 41

Abstraction: a Short History of IT

[Timeline from 1970 to 2019+. Axis labels: LOCK-IN, SPECIALIZATION; FLEXIBILITY, FREEDOM; ABSTRACTION]

• MAINFRAMES: Blackbox
• CLIENT/SERVER: Specialized HW with open industry software standards (TCP/IP, X86, NFS)
• VIRTUAL MACHINES: Software used to abstract Hardware from OS. Freedom to run multiple OS on the same HW
• FUNCTION VIRTUALIZATION: Software replaces specialized HW
• CONTAINERS: Resources entirely managed in Software

Page 42

Data is a resource as well

Hardware

Dataware

Middleware

Software

Page 43

Turning Data into a Manageable Resource
All in Software

• Data Containerization
• Global Multi-Tenancy
• Data Portability
• Resource Isolation
• Workload independence
• Security
• Global Web-Scale Deployments
• Performance
• Universal Access

All Managed by Policies in One Layer

Page 44

Global Data Fabric Supports Existing and New Applications including AI

Page 45

The Solution for Organizations to Support Analytics

Innovation, Flexibility

Democratization of Data

Control, Security

Protection, Availability

ML/AI data logistics

BI analytics

Data science platform

Recommendation engine

Customer Engagement

Fraud detection

Predictive Maintenance

Anomaly Detection

AI, ANALYTICS

Data modernization

Scale out data lakes

Multi-temperature data

Data catalogs

Inter-cloud data/app portability

Cloud bursting

Persistent app in containers

Secure data provisioning

IoT, EDGE, CLOUD & CONTAINERIZED

DATAWARE FOR DATA-DRIVEN TRANSFORMATION

Page 46

Finding and Empowering Your Data Scientists

ML/AI: ALGORITHMS + DATA + DOMAIN EXPERIENCE

Page 47

Meetup Tonight

Online Evaluation of Machine Learning Models
Ted Dunning PhD, MapR CTO

The Microsoft Reactor London, 70 Wilson St, Finsbury, EC2A 2DB, United Kingdom

Thursday, May 2, 2019 from 7:00 PM to 8:30 PM (BST)

Page 48

Rate today’s session

Session page on conference website

Page 49

Questions?