Page 1

© 2015 The MathWorks, Inc.

Data Analytics with MATLAB

Dr. Jan Eggers

MathWorks

June 9, 2015

Page 2

[Figure: scatter-plot matrix of MPG, Acceleration, Displacement, Weight, and Horsepower]

From Data to Decisions & Design

Observation → Organization → Understanding → Decisions & Design

Physical, Sensors

Data → Information → Knowledge → Action

[Figure: active power (per-unit) versus time (secs), comparing the NN output with measured data]

Page 3

Key Takeaways

Access and preprocess large amounts of heterogeneous data

Develop advanced analytics with machine learning

Integrate analytics with your enterprise systems

Page 4

Agenda

Data: Volume, Variety, Velocity

Techniques: Advanced Statistics, Machine Learning

Goal: Predictive Modelling, Decision Making

Workflow: Access, Explore, Prototype, Scale, Share/Deploy

Page 5

Big Data Capabilities in MATLAB

Memory and Data Access

64-bit processors

Memory Mapped Variables

Disk Variables

Databases

Datastores

Platforms

Desktop (Multicore, GPU)

Clusters

Cloud Computing (MDCS on EC2)

Hadoop

Programming Constructs

Streaming

Block Processing

Parallel-for loops

GPU Arrays

SPMD and Distributed Arrays

MapReduce
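Of the programming constructs listed above, the parallel-for loop is probably the quickest to try out. The following is a minimal sketch rather than code from the presentation: the variable names and the toy computation are assumptions chosen for illustration. With Parallel Computing Toolbox installed, the iterations are distributed across workers; without it, `parfor` simply runs as an ordinary loop.

```matlab
% Toy workload (assumed for illustration): each iteration is independent,
% which is the requirement for using parfor.
n = 8;
results = zeros(1, n);

parfor i = 1:n
    % Sum of squares 1^2 + 2^2 + ... + i^2, computed independently per iteration
    results(i) = sum((1:i).^2);
end

disp(results)
```

Because no iteration reads a value written by another, the loop body needs no changes when moving from a serial `for` to `parfor`.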

Page 6

DataStore

datastore: import text files and collections of text files that don’t fit into memory

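% Each call below is an alternative way to point a datastore at data.
ds = datastore('file1.mat');                            % a single file
ds = datastore('*.csv');                                % all files matching a wildcard
ds = datastore('/shared/data_repository/');             % every file in a folder
ds = datastore('hdfs://myserver:7867/data/file1.txt');  % a file stored on HDFS
ds = datastore({'/shared01/','/shared02/'});            % several locations at once

% Read the data one chunk at a time until nothing is left.
while hasdata(ds)
    T = read(ds);   % T holds the next chunk
end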
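The read loop above becomes useful once each chunk feeds a running result. The sketch below is an illustration under stated assumptions, not code from the slides: it writes a small temporary CSV file (values borrowed from the flight table shown later in the deck, with the hypothetical column names `Flight` and `Distance`) so that it is self-contained, then accumulates a row count and a maximum without ever holding the whole file in memory.

```matlab
% Sample data written to a temporary CSV file; column names are assumed.
flight   = [1503; 540; 1920; 1840; 272; 784];
distance = [2356; 186; 1876; 568; 359; 176];
csvFile  = [tempname '.csv'];
writetable(table(flight, distance, 'VariableNames', {'Flight','Distance'}), csvFile);

ds = datastore(csvFile);
ds.SelectedVariableNames = {'Distance'};  % read only the column we need
ds.ReadSize = 2;                          % number of rows returned by each read

nRows   = 0;
maxDist = -Inf;
while hasdata(ds)
    T = read(ds);                              % next chunk, returned as a table
    nRows   = nRows + height(T);               % running row count
    maxDist = max(maxDist, max(T.Distance));   % running maximum
end
delete(csvFile);                               % clean up the temporary file
```

In practice `ReadSize` would be far larger; it is set to 2 here only so that the tiny file is read in several chunks.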

Page 7

MapReduce

[Figure: a datastore of flight records is read in three chunks of four rows; Map extracts the carrier and the last column from each chunk and emits key-value pairs, which are then passed to Reduce.]

Data Store

1503 UA LAX -5 -10 2356
540 PS BUR 13 5 186
1920 DL BOS 10 32 1876
1840 DL SFO 0 13 568

272 US BWI 4 -2 359
784 PS SEA 7 3 176
796 PS LAX -2 2 237
1525 UA SFO 3 -5 1867

632 US SJC 2 -4 245
1610 UA MIA 60 34 1365
2032 DL EWR 10 16 789
2134 DL DFW -2 6 914

Map output (carrier, value)

UA 2356, PS 186, PS 237, UA 1867, UA 1365, DL 1876, DL 914, US 359, US 245

maxDelay = mapreduce(ds, @maxDistMapper, @maxDistReducer);
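The slide names `maxDistMapper` and `maxDistReducer` but does not show their bodies. The sketch below is one plausible way to write them, not the presenter's implementation. The column names `Carrier` and `Distance`, the temporary CSV file, and the output variable names are assumptions; the carrier codes and values are taken from the slide's table. It is written as a script with local functions, which requires R2016b or later.

```matlab
% Carrier codes and last-column values from the slide's table; the column
% names Carrier and Distance are assumed for illustration.
carrier  = {'UA';'PS';'DL';'DL';'US';'PS';'PS';'UA';'US';'UA';'DL';'DL'};
distance = [2356;186;1876;568;359;176;237;1867;245;1365;789;914];
csvFile  = [tempname '.csv'];
writetable(table(carrier, distance, 'VariableNames', {'Carrier','Distance'}), csvFile);

ds = datastore(csvFile);
ds.ReadSize = 4;        % four rows per chunk, as in the slide's figure
mapreducer(0);          % run serially in the local MATLAB session

result = mapreduce(ds, @maxDistMapper, @maxDistReducer);
out = readall(result);  % table with variables Key and Value
delete(csvFile);
disp(out)

function maxDistMapper(data, ~, intermKVStore)
    % For one chunk, emit the largest Distance seen for each carrier.
    [carriers, ~, idx] = unique(data.Carrier);
    for k = 1:numel(carriers)
        add(intermKVStore, carriers{k}, max(data.Distance(idx == k)));
    end
end

function maxDistReducer(carrier, intermValIter, outKVStore)
    % Combine the per-chunk maxima for one carrier into a single maximum.
    best = -Inf;
    while hasnext(intermValIter)
        best = max(best, getnext(intermValIter));
    end
    add(outKVStore, carrier, best);
end
```

Taking the maximum inside the mapper keeps the intermediate data small, since each chunk contributes at most one value per carrier.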

Page 8

A Big Data Platform

[Figure: a Datastore backed by HDFS; the data is spread across several nodes, each running Map and Reduce tasks]

Page 9

Agenda

Data: Volume, Variety, Velocity

Techniques: Advanced Statistics, Machine Learning

Goal: Predictive Modelling, Decision Making

Workflow: Access, Explore, Prototype, Scale, Share/Deploy

Page 10

Machine Learning

Machine learning uses data and produces a program to perform a task

Standard Approach

Computer Program

Hand Written Program:

If X_acc > 0.5 then “SITTING”

If Y_acc < 4 and Z_acc > 5 then “STANDING”

Formula or Equation:

$Y_{activity} = \beta_1 X_{acc} + \beta_2 Y_{acc} + \beta_3 Z_{acc} + \dots$

Machine Learning Approach

Machine Learning

model = <MachineLearningAlgorithm>(sensor_data, activity)

model: Inputs → Outputs

Task: Human Activity Detection
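To make the contrast concrete, the sketch below fills the `<MachineLearningAlgorithm>` placeholder with a classification tree (`fitctree` from Statistics and Machine Learning Toolbox). This is one possible choice, not necessarily the algorithm used in the presentation, and the accelerometer readings are synthetic values invented so that they roughly match the hand-written rules on the slide.

```matlab
rng(0);   % reproducible synthetic data
n = 100;

% Synthetic [X_acc, Y_acc, Z_acc] readings (assumed values, not real sensor data)
sitting  = [0.8 + 0.05*randn(n,1), 0.1*randn(n,1),     0.1*randn(n,1)];
standing = [0.1*randn(n,1),        2 + 0.1*randn(n,1), 6 + 0.1*randn(n,1)];

sensor_data = [sitting; standing];
activity    = [repmat({'SITTING'}, n, 1); repmat({'STANDING'}, n, 1)];

% The algorithm derives the decision rules from the data instead of
% having them written by hand.
model = fitctree(sensor_data, activity);

% Apply the learned model to two new readings.
labels = predict(model, [0.8 0 0; 0 2 6]);
disp(labels)
```

Swapping `fitctree` for another fitting function changes the algorithm while the inputs-to-outputs pattern stays the same.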

Page 11

Machine Learning Techniques

Type of Learning and Categories of Algorithms

Supervised Learning: develop a predictive model based on both input and output data

– Classification

– Regression

Unsupervised Learning: group and interpret data based only on input data

– Clustering
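The unsupervised case can be sketched with k-means clustering, which receives only input data and no labels. The example is an illustration under assumptions: the two synthetic groups and the choice of `kmeans` (Statistics and Machine Learning Toolbox) are not taken from the slides.

```matlab
rng(1);   % reproducible synthetic data
n = 50;

% Two well-separated groups of 2-D points; no output labels are supplied.
X = [0.2*randn(n,2); 5 + 0.2*randn(n,2)];

% Ask for two clusters; several replicates guard against a poor random start.
idx = kmeans(X, 2, 'Replicates', 5);

% idx(i) is the cluster (1 or 2) assigned to row i of X.
fprintf('First group assigned to cluster %d, second group to cluster %d\n', ...
    idx(1), idx(end));
```

Which group receives index 1 and which receives index 2 is arbitrary; only the grouping itself is meaningful.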

Page 12

Apply Machine Learning techniques easily

Data:

3-axial Accelerometer data

3-axial Gyroscope data

Machine Learning

Page 13

Data Analytics Workflow

Work on your desktop

Start “simple”

Basic statistics

Explore data

Access, Explore, Share/Deploy

Start locally …

Page 14

Data Analytics Workflow

… prototype algorithms and then …

Access, Explore, Prototype, Share/Deploy

Work on your desktop

Interactive development

Advanced algorithms

Page 15

Data Analytics Workflow

Scale to a cluster

… scale up as needed

Access, Explore, Prototype, Scale, Share/Deploy

Parallel Computing Tools

Page 16

Agenda

Data: Volume, Variety, Velocity

Techniques: Advanced Statistics, Machine Learning

Goal: Predictive Modelling, Decision Making

Workflow: Access, Explore, Prototype, Scale, Share/Deploy

Page 17

A Primer on Deploying MATLAB Programs

Excel® add-ins

Desktop

MATLAB Production Server(s)

Web Server(s)

Web & Enterprise

• Royalty-free

• Encryption to protect intellectual property

Page 18

Benefits of Deploying MATLAB Code

Domain experts maintain ownership of ideas, algorithms, and applications

Flexibility to integrate with different programming languages

Implement a common algorithm on different platforms

Avoid time-consuming and error-prone re-coding

Easily adopt algorithm improvements throughout lifecycle

Page 19

Predictive Data Analytics – Load Demand Forecasting

Page 20

Big Data and Predictive Analytics at Shell

Page 21

STIWA Increases Total Production Output of Automation Machinery

Challenge

Apply sophisticated mathematical methods to optimize automation machinery and increase total production output

Solution

Use AMS ZPoint-CI to collect large production data sets in near real time and use MATLAB to analyze the data and identify optimal trajectories

Results

Total cycle time reduced by 30%

Large data sets analyzed in seconds

Deployment to multiple machines streamlined

“Our shopfloor management system AMS ZPoint-CI collects a huge amount of machine, process, and product data 24 hours a day. By analyzing this data immediately in MATLAB and AMS Analysis-CI we have achieved a tenfold increase in precision, a 30% reduction in total cycle time, and a significant increase in production output.”

Alexander Meisinger, STIWA

[Figure: STIWA’s shopfloor management system, based on MATLAB, AMS ZPoint-CI, and AMS Analysis-CI.]

Page 22

Key Takeaways

Access and preprocess large amounts of heterogeneous data

– Capabilities to deal with big data are available and evolve

– Tools to organize data and automate the process

Develop advanced analytics with machine learning

– Advanced statistical and machine learning methods to gain insights

– Apps to rapidly iterate through and assess different models

Integrate analytics with your enterprise systems

– Parallel Computing and MapReduce to scale up as needed

– Application deployment on every scale to make models available to others