1
© 2015 The MathWorks, Inc.
Scaling up MATLAB Analytics with Kafka and Cloud Services
Olof Larsson
2
Agenda
1. Access and Explore Data: Files, Databases, Sensors
2. Preprocess Data: Working with Messy Data, Data Reduction/Transformation, Feature Extraction
3. Develop Predictive Models: Model Creation (e.g. Machine Learning), Model Validation, Parameter Optimization
4. Integrate with Production Systems: Desktop Apps, Embedded Devices and Hardware, Enterprise Scale Systems, AWS Kinesis
5. Visualize Results: 3rd party dashboards, Web apps
3
The Need for Large-Scale Streaming
Predictive Maintenance: Increase Operational Efficiency, Reduce Unplanned Downtime
Jet engine: ~800 TB per day
Turbine: ~2 TB per day
Medical Devices: Patient Safety, Better Treatment Outcomes
Connected Cars: Safety, Maintenance, Advanced Driving Features
Car: ~25 GB per hour
More applications require near real-time analytics
4
Example Problem – How’s my driving?
▪ A group of MathWorks employees installed OBD dongles in their cars to monitor the on-board systems
▪ Data is streamed to the cloud, where it is aggregated and stored
▪ We would like to use this data to score the driving habits of participants
5
Example: Fleet Analytics with MATLAB
6
Fleet Analytics Architecture
[Architecture diagram. Development side: Algorithm Developers, MATLAB (Analytics Development), MATLAB Compiler SDK. Production System: MATLAB Production Server (MATLAB Analytics), Kafka Connector, Storage Layer, API Gateway, AWS Lambda. Inputs and outputs: Edge Devices, Business Systems, End Users, Business Decisions.]
7
The first step is to clean up the incoming data
Access and Explore Data
1
8
The Data: Timestamped messages with JSON encoding
{"vehicles_id": {"$oid":"55a3fd0069702d5b41000000"},
"time" : {"$date":"2015-07-13T18:01:35.000Z"},
"kc" : 1975.0, "kff1225" : 100.65293, "kff125a" : 110.36619, … }
{"vehicles_id": {"$oid":"55a3fe3569702d5c5c000020"},
"time" : {"$date":"2015-07-13T18:01:53.000Z"},
"kc" : 2000.0, "kff1225" : 109.65293, "kff125a" : 115.36619, … }
{"vehicles_id": {"$oid":"55a4193569702d115b000001"},
"time" : {"$date":"2015-07-12T19:04:04.000Z"},
"kc" : 2200.0, "kff1225" : 112.65293, "kff125a" : 112.36619, … }
Key
Values
Access and Explore Data
1
Timestamp
9
Access a Sample of Data
Raw Data
Timetable
Access and Explore Data
1
✓ Decode JSON data
✓ Create Timetable
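The two steps on this slide (decode the JSON, then build a time-indexed table) can be sketched as follows. This is a Python illustration using only the standard library, not the MATLAB code from the talk; the function names `decode_message` and `to_time_table` are hypothetical, and the sample messages keep only the fields that are visible on the previous slide.

```python
import json
from datetime import datetime, timezone

# Two sample messages in the shape shown on the slide (elided fields left out).
RAW_MESSAGES = [
    '{"vehicles_id": {"$oid": "55a3fe3569702d5c5c000020"}, '
    '"time": {"$date": "2015-07-13T18:01:53.000Z"}, '
    '"kc": 2000.0, "kff1225": 109.65293, "kff125a": 115.36619}',
    '{"vehicles_id": {"$oid": "55a3fd0069702d5b41000000"}, '
    '"time": {"$date": "2015-07-13T18:01:35.000Z"}, '
    '"kc": 1975.0, "kff1225": 100.65293, "kff125a": 110.36619}',
]


def decode_message(raw):
    """Flatten one JSON message into a row dict with a real timestamp."""
    doc = json.loads(raw)
    row = {
        "time": datetime.strptime(
            doc["time"]["$date"], "%Y-%m-%dT%H:%M:%S.%fZ"
        ).replace(tzinfo=timezone.utc),
        "vehicle": doc["vehicles_id"]["$oid"],
    }
    # Every remaining key is treated as a sensor reading.
    for key, value in doc.items():
        if key not in ("time", "vehicles_id"):
            row[key] = value
    return row


def to_time_table(raw_messages):
    """Decode all messages and order the rows by timestamp."""
    rows = (decode_message(m) for m in raw_messages)
    return sorted(rows, key=lambda r: r["time"])


table = to_time_table(RAW_MESSAGES)
for row in table:
    print(row["time"].isoformat(), row["vehicle"][:6], row["kc"])
```

Sorting by timestamp matters because messages are not guaranteed to arrive in time order.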
10
Develop a Preprocessing Function
✓ Clean up
✓ Enrich
✓ Restructure
Preprocess Data
2
Timetable
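A preprocessing function with the three stages named above might look like the sketch below. It is a Python illustration, not the function shown in the talk. The mapping of `kc`, `kff1225`, and `kff125a` to RPM, torque, and fuel flow is an assumption, and the `high_rpm` feature and its 2000 RPM threshold are hypothetical.

```python
from collections import defaultdict

# Assumed mapping from raw keys to readable names (not confirmed by the source).
COLUMN_NAMES = {"kc": "rpm", "kff1225": "torque", "kff125a": "fuel_flow"}


def preprocess(rows):
    """Clean up, enrich, and restructure decoded rows; returns {vehicle: [rows]}."""
    by_vehicle = defaultdict(list)
    for row in rows:
        # Clean up: skip rows with a missing sensor reading.
        if any(row.get(key) is None for key in COLUMN_NAMES):
            continue
        clean = {"time": row["time"], "vehicle": row["vehicle"]}
        for raw_key, name in COLUMN_NAMES.items():
            clean[name] = float(row[raw_key])
        # Enrich: add a derived feature (hypothetical rule).
        clean["high_rpm"] = clean["rpm"] > 2000.0
        # Restructure: group rows by vehicle.
        by_vehicle[clean["vehicle"]].append(clean)
    # Keep each vehicle's rows in time order.
    for vehicle_rows in by_vehicle.values():
        vehicle_rows.sort(key=lambda r: r["time"])
    return dict(by_vehicle)


# Illustrative rows; one reading is blanked on purpose to exercise the clean-up step.
sample = [
    {"time": "18:05:20", "vehicle": "55a3fd", "kc": 1980, "kff1225": 105, "kff125a": 105},
    {"time": "18:01:10", "vehicle": "55a3fd", "kc": 1975, "kff1225": 100, "kff125a": 110},
    {"time": "18:10:45", "vehicle": "55a3fd", "kc": 2100, "kff1225": None, "kff125a": 100},
    {"time": "18:10:30", "vehicle": "55a3fe", "kc": 2000, "kff1225": 109, "kff125a": 115},
]
result = preprocess(sample)
print({vehicle: len(rows) for vehicle, rows in result.items()})
```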
12
Develop a Predictive Model
[Fleet Analytics Architecture diagram, with MATLAB Distributed Computing Server added]
Develop Predictive
Models
3
13
Everything you need to develop a predictive model is found in MATLAB
Develop Predictive Models
3
Label Events
Represent Signals
Train Model
Validate Model
Scale Up
15
Develop a Predictive Model in MATLAB
Demo
Develop Predictive
Models
3
16
Integrate Analytics with Production Systems
Integrate with
Production
Systems
4
17
A Quick Intro to Stream Processing
▪ Batch processing applies computation to a finite historical data set
▪ Stream processing applies computation to an unbounded data set that is produced continuously
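The distinction can be made concrete with one statistic computed both ways. The sketch below is a Python illustration; `sensor_feed` is a hypothetical stand-in for a continuous source and the RPM values are sample numbers. The batch version needs the whole data set before it can answer, while the streaming version produces an updated answer after every value and never needs the full set.

```python
from typing import Iterable, Iterator, List


def batch_mean(values: List[float]) -> float:
    """Batch: the whole finite data set is available before computing."""
    return sum(values) / len(values)


def running_mean(values: Iterable[float]) -> Iterator[float]:
    """Stream: yield the mean of everything seen so far, one result per input."""
    total = 0.0
    count = 0
    for value in values:
        total += value
        count += 1
        yield total / count


def sensor_feed() -> Iterator[float]:
    """Stand-in for a continuous source; a real feed would not end."""
    for rpm in (1975.0, 2000.0, 2200.0):
        yield rpm


history = [1975.0, 2000.0, 2200.0]
print("batch:", batch_mean(history))
for mean_so_far in running_mean(sensor_feed()):
    print("stream:", mean_so_far)
```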
[Diagram, batch: Historical Data (Storage, Files) → Configure Resources → Schedule and Run Job → Output Data (Storage, Files); used for Reporting, Data Exploration, Training Models]
[Diagram, stream: Connected Devices → Continuous Data → Messaging Service → Stream Analytics f(x); used for Reporting, Real Time Decision Support, Dashboards, Alerts, Storage]
Integrate with
Production
Systems
4
18
Why stream processing?
[Chart: value of data to decision making (vertical axis) against time (horizontal axis: Real-Time, Seconds, Minutes, Hours, Days, Months), with regions labeled Preventive/Predictive, Actionable, Reactive, Historical]
▪ Time critical decisions: Edge Processing with MATLAB Coder
▪ Near real-time decisions: Stream Processing with MATLAB Production Server (Kinesis, Event Hub). Today’s example focuses here
▪ Big Data processing on historical data: MATLAB Distributed Computing Server, MATLAB Compiler
Integrate with
Production
Systems
4
19
Streaming data is treated as an unbounded Timetable
Integrate with Production Systems
4

Input Table

| Event Time | Vehicle | RPM | Torque | Fuel Flow |
|---|---|---|---|---|
| … | … | … | … | … |
| 18:01:10 | 55a3fd | 1975 | 100 | 110 |
| 18:10:30 | 55a3fe | 2000 | 109 | 115 |
| 18:05:20 | 55a3fd | 1980 | 105 | 105 |
| 18:10:45 | 55a3fd | 2100 | 110 | 100 |
| 18:30:10 | 55a419 | 2000 | 100 | 110 |
| 18:35:20 | 55a419 | 1960 | 103 | 105 |
| 18:20:40 | 55a3fe | 1970 | 112 | 104 |
| 18:39:30 | 55a419 | 2100 | 105 | 110 |
| 18:30:00 | 55a3fe | 1980 | 110 | 113 |
| 18:30:50 | 55a3fe | 2000 | 100 | 110 |
| … | … | … | … | … |

[Diagram: each time window of the input table is passed to a MATLAB Function together with its State]

Output Table

| Time window | Vehicle | Score |
|---|---|---|
| … | … | … |
| 18:00:00 to 18:10:00 | 55a3fd | … |
| | 55a3fe | … |
| | 55a419 | … |
| 18:10:00 to 18:20:00 | 55a3fd | … |
| | 55a3fe | … |
| | 55a419 | … |
| 18:20:00 to 18:30:00 | 55a3fd | … |
| | 55a3fe | … |
| | 55a419 | … |
| 18:30:00 to 18:40:00 | 55a3fd | … |
| | 55a3fe | … |
| | 55a419 | … |

Scores shown on the slide (row assignment not recoverable): 5, 7, 3, 9, 4, 5, 8
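The windowing on this slide can be sketched by assigning each event to a 10-minute window and aggregating per window and vehicle. The Python below is an illustration only: the rows are the ones listed in the input table, but the mean RPM is a hypothetical stand-in for the score, whose actual definition the slide does not give. The names `window_start` and `mean_rpm_per_window` are likewise made up for this sketch.

```python
from collections import defaultdict
from datetime import datetime, timedelta

WINDOW_MINUTES = 10  # window length used on the slide; must divide 60 here

# (event time, vehicle, RPM, torque, fuel flow), in arrival order.
EVENTS = [
    ("18:01:10", "55a3fd", 1975, 100, 110),
    ("18:10:30", "55a3fe", 2000, 109, 115),
    ("18:05:20", "55a3fd", 1980, 105, 105),
    ("18:10:45", "55a3fd", 2100, 110, 100),
    ("18:30:10", "55a419", 2000, 100, 110),
    ("18:35:20", "55a419", 1960, 103, 105),
    ("18:20:40", "55a3fe", 1970, 112, 104),
    ("18:39:30", "55a419", 2100, 105, 110),
    ("18:30:00", "55a3fe", 1980, 110, 113),
    ("18:30:50", "55a3fe", 2000, 100, 110),
]


def window_start(clock):
    """Round an HH:MM:SS time down to the start of its window."""
    t = datetime.strptime(clock, "%H:%M:%S")
    floored = t - timedelta(minutes=t.minute % WINDOW_MINUTES, seconds=t.second)
    return floored.strftime("%H:%M:%S")


def mean_rpm_per_window(events):
    """Group events by (window start, vehicle) and average the RPM."""
    groups = defaultdict(list)
    for clock, vehicle, rpm, _torque, _fuel_flow in events:
        groups[(window_start(clock), vehicle)].append(rpm)
    return {key: sum(v) / len(v) for key, v in sorted(groups.items())}


scores = mean_rpm_per_window(EVENTS)
for (start, vehicle), value in scores.items():
    print(start, vehicle, value)
```

Grouping by event time rather than arrival order is what lets out-of-order rows, such as the 18:05:20 event that arrives after 18:10:30, land in the correct window. On a truly unbounded stream a window's result would be emitted once the window closes rather than after all input has been read.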
20
Introducing MATLAB Production Server
[Platform diagram: Data → Analytics → Business System]
Data: Azure Blob, PI System, Databases, Cloud Storage, Cosmos DB, Streaming (Azure IoT Hub, AWS Kinesis)
Analytics: MATLAB Production Server (Request Broker)
Business System: Dashboards, Web, Custom Apps
Integrate with Production Systems
4
22
MATLAB Production Server is an application server that publishes MATLAB code as APIs
[Diagram. Analytics Development: MATLAB and MATLAB Compiler SDK (package code / test). Deploy: MATLAB Production Server with Request Broker and worker processes (scale and secure). Access: data sources. Integrate: Enterprise Application, Mobile / Web Application, 3rd party dashboard.]
Integrate with Production Systems
4
23
Connecting MATLAB Production Server to Kafka
▪ Kafka client for MATLAB Production Server feeds topics to functions deployed on the server
▪ Configurable batch of messages passed as a MATLAB Timetable
▪ Each consumer process feeds one topic to a specified function
▪ Drive everything from a simple config file
– No programming outside of MATLAB!
[Diagram: Publishers write to a Kafka Cluster holding Topic-0 and Topic-1, each with three partitions. One consumer process per topic, each with an async Java client, feeds MATLAB Production Server (Request Broker & Program Manager).]
Integrate with
Production
Systems
4
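The "one consumer per topic, configurable batch, driven by config" idea can be sketched without any Kafka dependency. Everything in the Python below is hypothetical: the configuration keys, the function names, and the in-memory message list standing in for a topic. It does not reproduce the real connector's file format or API.

```python
# Hypothetical configuration: one entry per consumer process, each binding
# one topic to one deployed function. Not the real connector's file format.
CONFIG = [
    {"topic": "Topic-0", "function": "score_drivers", "batch_size": 3},
    {"topic": "Topic-1", "function": "count_events", "batch_size": 2},
]


def score_drivers(batch):
    """Placeholder for a deployed analytics function."""
    return ("score_drivers", len(batch))


def count_events(batch):
    """Placeholder for a second deployed function."""
    return ("count_events", len(batch))


FUNCTIONS = {"score_drivers": score_drivers, "count_events": count_events}


def run_consumer(entry, messages):
    """Feed one topic's messages to its configured function in batches."""
    handler = FUNCTIONS[entry["function"]]
    size = entry["batch_size"]
    results = []
    batch = []
    for message in messages:
        batch.append(message)
        if len(batch) == size:
            results.append(handler(batch))
            batch = []
    if batch:
        # The demo input is finite, so flush the final partial batch.
        results.append(handler(batch))
    return results


print(run_consumer(CONFIG[0], list(range(7))))
print(run_consumer(CONFIG[1], ["a", "b", "c"]))
```

Keeping the topic-to-function binding in data rather than code is what allows the setup to be changed without touching the analytics function itself.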
24
Develop and Deploy a Stream Processing Function
Integrate with
Production
Systems
4
25
Develop a Stream Processing Function in MATLAB
Integrate with
Production
Systems
4
Process each window of data as it arrives
Current window of data to be processed
Previous state
Current score
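The callouts describe a function that takes the current window and the previous state and returns the current score. A minimal Python sketch of that shape follows; it is not the MATLAB function from the slide, and the scoring rule (share of events at or below 2000 RPM) and the state layout are hypothetical.

```python
def score_window(window_rows, previous_state):
    """Process one window of rows.

    previous_state maps vehicle -> {"events": int, "harsh": int}.
    Returns (scores for vehicles seen in this window, updated state).
    """
    # Copy the state so the caller's previous state is left untouched.
    state = {vehicle: dict(counts) for vehicle, counts in previous_state.items()}
    touched = set()
    for row in window_rows:
        counts = state.setdefault(row["vehicle"], {"events": 0, "harsh": 0})
        counts["events"] += 1
        # Hypothetical rule: RPM above 2000 counts as a harsh event.
        if row["rpm"] > 2000:
            counts["harsh"] += 1
        touched.add(row["vehicle"])
    scores = {v: 1.0 - state[v]["harsh"] / state[v]["events"] for v in touched}
    return scores, state


# Two consecutive windows; the state returned by the first call feeds the second.
window_1 = [{"vehicle": "55a3fd", "rpm": 1975}, {"vehicle": "55a3fd", "rpm": 1980}]
window_2 = [{"vehicle": "55a3fd", "rpm": 2100}, {"vehicle": "55a3fe", "rpm": 2000}]

state = {}
scores_1, state = score_window(window_1, state)
scores_2, state = score_window(window_2, state)
print(scores_1)
print(scores_2)
```

Passing state in and out explicitly keeps the function free of hidden globals, so the same function can be called on recorded windows during debugging and on live windows in production.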
26
Develop a Stream Processing Function in MATLAB
Integrate with
Production
Systems
4
Apply your pre-processing algorithm
27
Develop a Stream Processing Function in MATLAB
Integrate with
Production
Systems
4
Use the model you created with the Classification Learner App
28
Develop a Stream Processing Function in MATLAB
Integrate with
Production
Systems
4
Update the MongoDB database
• Count of events by type and location
• Results of driver scoring
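The two updates listed above amount to incrementing running counts and overwriting the latest score per driver. The Python sketch below uses in-memory structures as a stand-in for the database and calls no MongoDB API. The event types, locations, and field names are hypothetical.

```python
from collections import Counter


def update_event_counts(counts, events):
    """Add one window's events to running counts keyed by (type, location)."""
    counts.update((event["type"], event["location"]) for event in events)
    return counts


def update_driver_scores(scores, window_scores):
    """Store the most recent score for each vehicle seen in this window."""
    scores.update(window_scores)
    return scores


event_counts = Counter()   # stand-in for an event-count collection
driver_scores = {}         # stand-in for a driver-score collection

window_events = [
    {"type": "hard_brake", "location": "Natick"},
    {"type": "hard_brake", "location": "Natick"},
    {"type": "speeding", "location": "Boston"},
]
update_event_counts(event_counts, window_events)
update_event_counts(event_counts, [{"type": "speeding", "location": "Boston"}])
update_driver_scores(driver_scores, {"55a3fd": 7})
update_driver_scores(driver_scores, {"55a3fd": 5, "55a3fe": 9})
print(dict(event_counts))
print(driver_scores)
```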
29
Debug a Stream Processing Function in MATLAB
[Architecture diagram. Components shown: Algorithm Developers, MATLAB (Analytics Development), MATLAB Compiler SDK, Kafka Connector, Storage Layer, Edge Devices, Business Systems, End Users, Business Decisions, Production System.]
Integrate with
Production
Systems
4
30
Debug a Stream Processing Function in MATLAB
Integrate with
Production
Systems
4
31
Tie in your Dashboard Application
Integrate with
Production
Systems
4
32
Complete Your Application
Visualize Results
5
33
Scalable Analytics with Enterprise BI Tools
Visualize Results
5
TIBCO Spotfire
Tableau
35
Key Takeaways
➢ MATLAB connects directly to your data so you can quickly design and validate algorithms
➢ The MATLAB language and apps enable fast design iterations
➢ MATLAB Production Server enables easy integration of your MATLAB algorithms with enterprise production systems
➢ Together, these let you spend your time understanding the data and designing algorithms
36
Resources to learn and get started
▪ Data Analytics with MATLAB
▪ MATLAB Production Server
▪ MATLAB Compiler SDK
▪ Statistics and Machine Learning Toolbox
▪ Database Toolbox
▪ Mapping Toolbox
▪ MATLAB with TIBCO Spotfire
▪ MATLAB with Tableau
▪ MATLAB with MongoDB