leveraging data driven research through microsoft azure
TRANSCRIPT
![Page 1: Leveraging Data Driven Research Through Microsoft Azure](https://reader034.vdocuments.us/reader034/viewer/2022052318/5899c3c71a28ab45548b54dd/html5/thumbnails/1.jpg)
LEVERAGING DATA DRIVEN RESEARCH THROUGH MICROSOFT AZURE
Dr. Miguel Fierro
Data Scientist at Microsoft
@miguelgfierro
[email protected]
miguelgfierro.com
Plymouth University | Jan 27, 2017 | Plymouth, UK
![Page 2: Leveraging Data Driven Research Through Microsoft Azure](https://reader034.vdocuments.us/reader034/viewer/2022052318/5899c3c71a28ab45548b54dd/html5/thumbnails/2.jpg)
AZURE FOR RESEARCH AWARD
Plymouth University January 2017 - Dr. Miguel Fierro @miguelgfierro
Free Azure resources if awarded
Areas: data science, climate, health…
Example: the Alan Turing Institute received $5M
![Page 3: Leveraging Data Driven Research Through Microsoft Azure](https://reader034.vdocuments.us/reader034/viewer/2022052318/5899c3c71a28ab45548b54dd/html5/thumbnails/3.jpg)
OUTLINE
Spark and Hadoop with Azure
Azure ML Studio
Data Science Virtual Machine
![Page 4: Leveraging Data Driven Research Through Microsoft Azure](https://reader034.vdocuments.us/reader034/viewer/2022052318/5899c3c71a28ab45548b54dd/html5/thumbnails/4.jpg)
SPARK & HADOOP WITH AZURE
![Page 5: Leveraging Data Driven Research Through Microsoft Azure](https://reader034.vdocuments.us/reader034/viewer/2022052318/5899c3c71a28ab45548b54dd/html5/thumbnails/5.jpg)
WHAT IS HDINSIGHT
HDInsight: managed service
![Page 6: Leveraging Data Driven Research Through Microsoft Azure](https://reader034.vdocuments.us/reader034/viewer/2022052318/5899c3c71a28ab45548b54dd/html5/thumbnails/6.jpg)
MANAGER GUI: AMBARI
![Page 7: Leveraging Data Driven Research Through Microsoft Azure](https://reader034.vdocuments.us/reader034/viewer/2022052318/5899c3c71a28ab45548b54dd/html5/thumbnails/7.jpg)
APACHE HADOOP
Software for storing and analysing massive amounts (~TB) of structured and unstructured data
![Page 8: Leveraging Data Driven Research Through Microsoft Azure](https://reader034.vdocuments.us/reader034/viewer/2022052318/5899c3c71a28ab45548b54dd/html5/thumbnails/8.jpg)
APACHE SPARK
Framework that runs large-scale data analytics applications
PySpark, Spark (Scala), SparkR
Up to 100x faster than Hadoop MapReduce (in-memory processing)
![Page 9: Leveraging Data Driven Research Through Microsoft Azure](https://reader034.vdocuments.us/reader034/viewer/2022052318/5899c3c71a28ab45548b54dd/html5/thumbnails/9.jpg)
APACHE KAFKA
Stream processing for real-time apps
Publish-subscribe messaging system
Millions of messages per second
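A minimal sketch of the publish-subscribe idea, assuming a single in-memory topic log. `ToyBroker`, `publish` and `poll` are hypothetical names for illustration, not the Kafka client API; a real Kafka cluster partitions and replicates each topic across brokers.

```python
from collections import defaultdict


class ToyBroker:
    """In-memory stand-in for a publish-subscribe broker (not the Kafka API)."""

    def __init__(self):
        # Each topic is an append-only list of messages, like a one-partition log.
        self._logs = defaultdict(list)
        # Each (subscriber, topic) pair remembers the next offset it will read.
        self._offsets = defaultdict(int)

    def publish(self, topic, message):
        """Append a message to the topic log and return its offset."""
        self._logs[topic].append(message)
        return len(self._logs[topic]) - 1

    def poll(self, subscriber, topic):
        """Return messages this subscriber has not seen yet and advance its offset."""
        start = self._offsets[(subscriber, topic)]
        messages = self._logs[topic][start:]
        self._offsets[(subscriber, topic)] = len(self._logs[topic])
        return messages


broker = ToyBroker()
broker.publish("clicks", "page=/home")
broker.publish("clicks", "page=/docs")

first_batch = broker.poll("dashboard", "clicks")   # both messages so far
broker.publish("clicks", "page=/pricing")
second_batch = broker.poll("dashboard", "clicks")  # only the new message
archiver_batch = broker.poll("archiver", "clicks") # a new subscriber reads from offset 0
```

Publishers never address subscribers directly: each subscriber keeps its own offset into the shared log, so a slow or late consumer does not affect the others.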
![Page 10: Leveraging Data Driven Research Through Microsoft Azure](https://reader034.vdocuments.us/reader034/viewer/2022052318/5899c3c71a28ab45548b54dd/html5/thumbnails/10.jpg)
APACHE STORM
Distributed framework for real-time applications
ETL, continuous computation, online machine learning
Over a million operations per second per node
![Page 11: Leveraging Data Driven Research Through Microsoft Azure](https://reader034.vdocuments.us/reader034/viewer/2022052318/5899c3c71a28ab45548b54dd/html5/thumbnails/11.jpg)
APACHE HBASE
Non-relational database (NoSQL) for Big Data applications
Distributed, fault-tolerant and scalable
Built on top of HDFS (Hadoop Distributed File System)
![Page 12: Leveraging Data Driven Research Through Microsoft Azure](https://reader034.vdocuments.us/reader034/viewer/2022052318/5899c3c71a28ab45548b54dd/html5/thumbnails/12.jpg)
APACHE HIVE
SQL-like language to query data in Hadoop systems
Word count program
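The slide's word count program is not reproduced in this transcript. As a rough stand-in, the sketch below uses Python's built-in `sqlite3` rather than Hive to show the shape of a SQL word count: one row per word, then `GROUP BY` with `COUNT(*)`. The sample lines and the `words` table are made up for illustration. In HiveQL the tokenising step would typically live inside the query (for example with `explode(split(...))`); here it is done in Python because SQLite has no equivalent.

```python
import sqlite3

# Made-up input lines standing in for a text file stored in Hadoop.
lines = [
    "spark and hadoop with azure",
    "hadoop stores data",
    "spark processes data in memory",
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE words (word TEXT)")

# Tokenise in Python and load one row per word.
conn.executemany(
    "INSERT INTO words (word) VALUES (?)",
    [(word,) for line in lines for word in line.split()],
)

# The aggregation itself is plain SQL: group identical words and count them.
word_counts = conn.execute(
    "SELECT word, COUNT(*) AS n FROM words GROUP BY word ORDER BY n DESC, word"
).fetchall()
conn.close()

for word, n in word_counts:
    print(word, n)
```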
![Page 13: Leveraging Data Driven Research Through Microsoft Azure](https://reader034.vdocuments.us/reader034/viewer/2022052318/5899c3c71a28ab45548b54dd/html5/thumbnails/13.jpg)
EXAMPLE OF ARCHITECTURE
![Page 14: Leveraging Data Driven Research Through Microsoft Azure](https://reader034.vdocuments.us/reader034/viewer/2022052318/5899c3c71a28ab45548b54dd/html5/thumbnails/14.jpg)
DEMO: PYSPARK APPLICATION
Log analysis with PySpark
source: https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-apache-spark-custom-library-website-log-analysis
Predictive analysis on food inspection with PySpark
source: https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-apache-spark-machine-learning-mllib-ipython
![Page 15: Leveraging Data Driven Research Through Microsoft Azure](https://reader034.vdocuments.us/reader034/viewer/2022052318/5899c3c71a28ab45548b54dd/html5/thumbnails/15.jpg)
AZURE ML STUDIO
![Page 16: Leveraging Data Driven Research Through Microsoft Azure](https://reader034.vdocuments.us/reader034/viewer/2022052318/5899c3c71a28ab45548b54dd/html5/thumbnails/16.jpg)
WHAT IS AZURE ML STUDIO
GUI for Machine Learning
![Page 17: Leveraging Data Driven Research Through Microsoft Azure](https://reader034.vdocuments.us/reader034/viewer/2022052318/5899c3c71a28ab45548b54dd/html5/thumbnails/17.jpg)
DATA INPUT/OUTPUT
![Page 18: Leveraging Data Driven Research Through Microsoft Azure](https://reader034.vdocuments.us/reader034/viewer/2022052318/5899c3c71a28ab45548b54dd/html5/thumbnails/18.jpg)
DATA TRANSFORMATION
![Page 19: Leveraging Data Driven Research Through Microsoft Azure](https://reader034.vdocuments.us/reader034/viewer/2022052318/5899c3c71a28ab45548b54dd/html5/thumbnails/19.jpg)
DATA MANIPULATION
![Page 20: Leveraging Data Driven Research Through Microsoft Azure](https://reader034.vdocuments.us/reader034/viewer/2022052318/5899c3c71a28ab45548b54dd/html5/thumbnails/20.jpg)
FEATURE SELECTION
![Page 21: Leveraging Data Driven Research Through Microsoft Azure](https://reader034.vdocuments.us/reader034/viewer/2022052318/5899c3c71a28ab45548b54dd/html5/thumbnails/21.jpg)
CLASSIFICATION & REGRESSION
![Page 22: Leveraging Data Driven Research Through Microsoft Azure](https://reader034.vdocuments.us/reader034/viewer/2022052318/5899c3c71a28ab45548b54dd/html5/thumbnails/22.jpg)
TRAINING & SCORING
![Page 23: Leveraging Data Driven Research Through Microsoft Azure](https://reader034.vdocuments.us/reader034/viewer/2022052318/5899c3c71a28ab45548b54dd/html5/thumbnails/23.jpg)
PYTHON & R SCRIPTS
![Page 24: Leveraging Data Driven Research Through Microsoft Azure](https://reader034.vdocuments.us/reader034/viewer/2022052318/5899c3c71a28ab45548b54dd/html5/thumbnails/24.jpg)
AUTOMATIC API
![Page 25: Leveraging Data Driven Research Through Microsoft Azure](https://reader034.vdocuments.us/reader034/viewer/2022052318/5899c3c71a28ab45548b54dd/html5/thumbnails/25.jpg)
DEMO: CREDIT RISK ANOMALY DETECTION
source: https://gallery.cortanaintelligence.com/Experiment/1219e87f8fb84e88a2e1b54256808bb3
![Page 26: Leveraging Data Driven Research Through Microsoft Azure](https://reader034.vdocuments.us/reader034/viewer/2022052318/5899c3c71a28ab45548b54dd/html5/thumbnails/26.jpg)
DATA SCIENCE VIRTUAL MACHINE
![Page 27: Leveraging Data Driven Research Through Microsoft Azure](https://reader034.vdocuments.us/reader034/viewer/2022052318/5899c3c71a28ab45548b54dd/html5/thumbnails/27.jpg)
WHAT IS THE DSVM
Windows:
- Anaconda with Python and Jupyter notebooks
- Microsoft R Server
- Visual Studio
- SQL Server
- Azure SDK
- Deep learning: CNTK & MXNet
- Machine Learning: XGBoost
Linux:
- Anaconda with Python and Jupyter notebooks
- Microsoft R Server
- PyCharm
- Azure SDK
- Deep learning: CNTK & MXNet
- Machine Learning: XGBoost, Weka
![Page 28: Leveraging Data Driven Research Through Microsoft Azure](https://reader034.vdocuments.us/reader034/viewer/2022052318/5899c3c71a28ab45548b54dd/html5/thumbnails/28.jpg)
DEEP LEARNING DSVM
Libs:
- CNTK
- MXNet
- TensorFlow
- Keras
Examples:
- Digit recognition
- Image recognition
![Page 29: Leveraging Data Driven Research Through Microsoft Azure](https://reader034.vdocuments.us/reader034/viewer/2022052318/5899c3c71a28ab45548b54dd/html5/thumbnails/29.jpg)
NVIDIA TESLA K80
![Page 30: Leveraging Data Driven Research Through Microsoft Azure](https://reader034.vdocuments.us/reader034/viewer/2022052318/5899c3c71a28ab45548b54dd/html5/thumbnails/30.jpg)
AI LANDSCAPE: IMAGES
ImageNet (image recognition competition) top-5 error (%):
- AlexNet (2012): 15.4%
- VGG (2014): 7.3%
- Inception (2015): 6.7%
- ResNet (2015): 3.6%
- Inception-ResNet (2016): 3.1%
- Human: 5.1%
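Top-5 error counts a prediction as wrong only when the true class is missing from the model's five highest-scoring classes. A minimal sketch of the metric, assuming per-class scores are available as plain lists; the function name `top5_error` and the toy scores are made up for illustration.

```python
def top5_error(scores, labels, k=5):
    """Fraction of examples whose true label is not among the k highest scores."""
    misses = 0
    for class_scores, label in zip(scores, labels):
        # Indices of the k largest scores for this example.
        ranked = sorted(range(len(class_scores)), key=lambda i: class_scores[i], reverse=True)
        if label not in ranked[:k]:
            misses += 1
    return misses / len(labels)


# Two toy examples over seven classes (made-up scores).
scores = [
    [0.05, 0.10, 0.40, 0.20, 0.15, 0.07, 0.03],  # true class 5 is ranked 5th: a hit
    [0.30, 0.25, 0.20, 0.10, 0.08, 0.05, 0.02],  # true class 6 is ranked last: a miss
]
labels = [5, 6]

error = top5_error(scores, labels)
print(f"top-5 error: {error:.1%}")
```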
![Page 31: Leveraging Data Driven Research Through Microsoft Azure](https://reader034.vdocuments.us/reader034/viewer/2022052318/5899c3c71a28ab45548b54dd/html5/thumbnails/31.jpg)
AI LANDSCAPE: SPEECH
Microsoft Research reaches human parity in conversational speech recognition
source: http://blogs.microsoft.com/next/2016/10/18/historic-achievement-microsoft-researchers-reach-human-parity-conversational-speech-recognition
CNN (VGG, ResNet, LACE)
RNN (Bi-LSTM)
Multi-GPU and multi-server (1-bit Stochastic Gradient Descent)
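1-bit SGD cuts communication between GPUs and servers by sending one bit per gradient value and carrying the quantization error into the next step. The sketch below is a simplified illustration of that idea, not CNTK's implementation: the zero threshold, the use of bin means as reconstruction values, and the function name `quantize_1bit` are assumptions.

```python
def quantize_1bit(gradient, residual):
    """Quantize a gradient vector to one bit per value with error feedback.

    Returns (bits, low, high, new_residual). `bits`, `low` and `high` are what
    a worker would send; `new_residual` is the quantization error that gets
    added to the next gradient so that no information is permanently lost.
    """
    # Add the error left over from the previous step.
    corrected = [g + r for g, r in zip(gradient, residual)]
    # One bit per value: 1 for non-negative, 0 for negative.
    bits = [1 if v >= 0 else 0 for v in corrected]
    positives = [v for v in corrected if v >= 0]
    negatives = [v for v in corrected if v < 0]
    # Reconstruction values: the mean of each bin (assumed choice).
    high = sum(positives) / len(positives) if positives else 0.0
    low = sum(negatives) / len(negatives) if negatives else 0.0
    reconstructed = [high if b else low for b in bits]
    new_residual = [v - q for v, q in zip(corrected, reconstructed)]
    return bits, low, high, new_residual


gradient = [0.5, -0.25, 1.5, -0.75]
residual = [0.0, 0.0, 0.0, 0.0]
bits, low, high, residual = quantize_1bit(gradient, residual)
print(bits, low, high, residual)
```

The receiver rebuilds an approximate gradient from `bits`, `low` and `high`; the sender keeps `residual` locally and feeds it into the next call.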
![Page 32: Leveraging Data Driven Research Through Microsoft Azure](https://reader034.vdocuments.us/reader034/viewer/2022052318/5899c3c71a28ab45548b54dd/html5/thumbnails/32.jpg)
IMAGE CLASSIFICATION
source: https://blogs.technet.microsoft.com/machinelearning/2016/11/15/imagenet-deep-neural-network-training-using-microsoft-r-server-and-azure-gpu-vms/
![Page 33: Leveraging Data Driven Research Through Microsoft Azure](https://reader034.vdocuments.us/reader034/viewer/2022052318/5899c3c71a28ab45548b54dd/html5/thumbnails/33.jpg)
IMAGE CLASSIFICATION IMAGENET
source: https://blogs.technet.microsoft.com/machinelearning/2016/11/15/imagenet-deep-neural-network-training-using-microsoft-r-server-and-azure-gpu-vms/
Figure labels: real class, predicted class
![Page 34: Leveraging Data Driven Research Through Microsoft Azure](https://reader034.vdocuments.us/reader034/viewer/2022052318/5899c3c71a28ab45548b54dd/html5/thumbnails/34.jpg)
TEXT CLASSIFICATION
Train:
- Dataset
- Azure NC24 VM with 4 K80 GPUs
- .R
- model.params
Score:
- Azure Cloud Services
- Backend (.py): API, DNN
- Web app (.js, .html): input text
![Page 35: Leveraging Data Driven Research Through Microsoft Azure](https://reader034.vdocuments.us/reader034/viewer/2022052318/5899c3c71a28ab45548b54dd/html5/thumbnails/35.jpg)
DEMO: TEXT CLASSIFICATION WEB APP
![Page 36: Leveraging Data Driven Research Through Microsoft Azure](https://reader034.vdocuments.us/reader034/viewer/2022052318/5899c3c71a28ab45548b54dd/html5/thumbnails/36.jpg)
LEVERAGING DATA DRIVEN RESEARCH THROUGH MICROSOFT AZURE
Dr. Miguel Fierro
Data Scientist at Microsoft
@miguelgfierro
[email protected]
miguelgfierro.com
Plymouth University | Jan 27, 2017 | Plymouth, UK