A Deeper Dive into Apache MXNet - March 2017 AWS Online Tech Talks

© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Webinars Sunil Mallya Solutions Architect, Deep Learning A Deeper Dive into Apache MXNet on AWS

Upload: amazon-web-services

Post on 05-Apr-2017

Category:

Technology


TRANSCRIPT

Page 1: A Deeper Dive into Apache MXNet - March 2017 AWS Online Tech Talks

© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Webinars

Sunil Mallya

Solutions Architect, Deep Learning

A Deeper Dive into Apache MXNet on AWS

Page 2: A Deeper Dive into Apache MXNet - March 2017 AWS Online Tech Talks

Agenda

• Apache MXNet introduction
• Distributed Deep Learning with AWS CloudFormation
• Deep Learning motivation and basics
• MXNet programming model overview
• Train our first neural network using MXNet

Page 3: A Deeper Dive into Apache MXNet - March 2017 AWS Online Tech Talks

Deep Learning Applications

Significantly improve many applications on multiple domains: image understanding, speech recognition, natural language processing, autonomy

AI Customers on AWS

• Netflix – Recommendation Engine
• FINRA – Anomaly detection, Sequence matching
• TuSimple - Computer Vision for Autonomous Driving
• Pinterest - Image recognition search
• Mapillary - Computer vision for crowd sourced maps

Page 4: A Deeper Dive into Apache MXNet - March 2017 AWS Online Tech Talks

Democratizing Artificial Intelligence

• AI Services: Amazon Rekognition, Amazon Polly, Amazon Lex, more to come in 2017
• AI Platform: Amazon Machine Learning, Amazon Elastic MapReduce, Spark & SparkML, more to come in 2017
• AI Engines: Apache MXNet, TensorFlow, Caffe, Theano, Keras, Torch, CNTK
• Hardware: P2, ECS, Lambda, EMR/Spark, GreenGrass, FPGA, more to come in 2017

Page 5: A Deeper Dive into Apache MXNet - March 2017 AWS Online Tech Talks

Apache MXNet

• Programmable: Simple syntax, multiple languages
• Portable: Highly efficient models for mobile and IoT
• High Performance: Near linear scaling across hundreds of GPUs

88% efficiency on 256 GPUs

ResNet 1024 layer network is ~4GB

Page 6: A Deeper Dive into Apache MXNet - March 2017 AWS Online Tech Talks

Distributed Deep Learning

Page 7: A Deeper Dive into Apache MXNet - March 2017 AWS Online Tech Talks

Scaling with MXNet

[Figure: scaling curves for Inception v3, ResNet, and AlexNet against the ideal line, plotted over No. of GPUs (1, 2, 4, 8, 16, 32, 64, 128, 256); 88% efficiency]

• CloudFormation with Deep Learning AMI

• 16x P2.16xlarge, mounted on EFS

• Inception and ResNet: batch size 32; AlexNet: batch size 512

• ImageNet, 1.2M images, 1K classes

• 152-layer ResNet, 5.4d on 4x K80s (1.2h per epoch), 0.22 top-1 error
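
The figures on this slide can be sanity-checked with a little arithmetic. The sketch below assumes that "efficiency" means measured speedup divided by GPU count (the slide does not define it) and that each P2.16xlarge instance exposes 16 GPUs. The function names are illustrative only.

```python
def effective_speedup(num_gpus, efficiency):
    """Speedup over a single GPU, assuming efficiency = speedup / num_gpus."""
    return num_gpus * efficiency


def epochs_completed(total_days, hours_per_epoch):
    """Number of epochs that fit into a training run of the given length."""
    return total_days * 24 / hours_per_epoch


# 16 instances, assumed 16 GPUs each, gives the 256 GPUs on the chart's x-axis.
total_gpus = 16 * 16

# 88% efficiency on 256 GPUs works out to roughly a 225x speedup.
speedup = effective_speedup(total_gpus, 0.88)

# 5.4 days at 1.2 hours per epoch corresponds to roughly 108 epochs.
epochs = epochs_completed(5.4, 1.2)

print(total_gpus, round(speedup, 2), round(epochs))
```

Under these assumptions, the 152-layer ResNet run on the last bullet covers on the order of a hundred passes over ImageNet.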

Page 8: A Deeper Dive into Apache MXNet - March 2017 AWS Online Tech Talks

Distributed Training Setup with CloudFormation

https://github.com/awslabs/deeplearning-cfn

Page 9: A Deeper Dive into Apache MXNet - March 2017 AWS Online Tech Talks
Page 10: A Deeper Dive into Apache MXNet - March 2017 AWS Online Tech Talks

Deep Learning basics

Page 11: A Deeper Dive into Apache MXNet - March 2017 AWS Online Tech Talks

Biological Neuron

slide from http://cs231n.stanford.edu/

Page 12: A Deeper Dive into Apache MXNet - March 2017 AWS Online Tech Talks

Artificial Neuron

[Figure: artificial neuron combining inputs through synaptic weights into an output]

• Input: Vector of training data x

• Output: Linear function of inputs

• Nonlinearity: Transform output into desired range of values, e.g. for classification we need probabilities in [0, 1]

• Training: Learn the weights w and bias b
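
The four bullets above can be sketched as a single function. This is a minimal illustration, not code from the talk: the sigmoid is one common choice of nonlinearity (the slide does not name one), and the input, weight, and bias values are made up.

```python
import math


def sigmoid(z):
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(x, w, b):
    """Linear function of the inputs (w . x + b) followed by a nonlinearity."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return sigmoid(z)


# Hypothetical values chosen only for illustration.
x = [1.0, 0.0, 1.0]   # input vector
w = [0.4, 0.3, 0.2]   # weights (learned during training)
b = -0.1              # bias (learned during training)

p = neuron(x, w, b)   # a value in (0, 1) that can be read as a probability
print(p)
```

Training then amounts to searching for the `w` and `b` that make `p` match the labels across the training data.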

Page 13: A Deeper Dive into Apache MXNet - March 2017 AWS Online Tech Talks

Deep Neural Network

hidden layers

The optimal size of the hidden layer (number of neurons) is usually between the size of the input and size of the output layers

Input layer

output

Page 14: A Deeper Dive into Apache MXNet - March 2017 AWS Online Tech Talks

The “Learning” in Deep Learning

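[Figure: an input (e.g. 01011) and its label X are fed through a network with weights such as 0.4, 0.3, 0.2, 0.9, producing a prediction X1. When X1 != X, back propagation (gradient descent) adjusts each weight by ± 𝛿 (0.4 ± 𝛿, 0.3 ± 𝛿, ...) to give new weights.]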
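
The update loop pictured on this slide can be sketched for the simplest possible case. The example below is an assumption-laden illustration: it trains a single sigmoid neuron (not a deep network) on a toy logical-OR dataset, and the learning rate and epoch count are arbitrary. For one neuron with a cross-entropy loss, back propagation reduces to the gradient `prediction - label` used here.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, b, x):
    return sigmoid(w[0] * x[0] + w[1] * x[1] + b)


def train(samples, lr=0.5, epochs=2000):
    """Stochastic gradient descent on a single two-input neuron."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, label in samples:
            # Error signal: how far the prediction is from the label.
            err = predict(w, b, x) - label
            # Nudge each weight against its gradient (the "± delta" step).
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b


# Toy dataset (hypothetical): logical OR of two binary inputs.
data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]

w, b = train(data)
predictions = [round(predict(w, b, x)) for x, _ in data]
print(predictions)
```

A deep network applies the same idea layer by layer, propagating the error signal backwards to compute each layer's gradients.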

Page 15: A Deeper Dive into Apache MXNet - March 2017 AWS Online Tech Talks

Hidden Layer Visualization

Page 16: A Deeper Dive into Apache MXNet - March 2017 AWS Online Tech Talks

MXNet Programming Model

Page 17: A Deeper Dive into Apache MXNet - March 2017 AWS Online Tech Talks

Imperative Programming

import numpy as np
a = np.ones(10)
b = np.ones(10) * 2
c = b * a
d = c + 1

Easy to tweak with Python code

PROS

• Straightforward and flexible.
• Take advantage of language native features (loop, condition, debugger)
• E.g. Numpy, Matlab, Torch, …

CONS

• Hard to optimize
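
The flexibility listed under PROS comes from every intermediate value existing as soon as its line runs. The sketch below illustrates that with an ordinary `if` statement inspecting an intermediate result. It uses plain Python lists in place of NumPy arrays purely to stay dependency-free, and the branch condition is a made-up example.

```python
a = [1.0] * 10
b = [2.0] * 10

# Each statement executes immediately, so c holds real numbers right away.
c = [bi * ai for bi, ai in zip(b, a)]

# Native control flow can branch on the intermediate value, and a debugger
# could stop here and inspect c directly.
if max(c) > 1.0:
    d = [ci + 1 for ci in c]
else:
    d = c

print(d)
```

The cost, as the slide notes, is that the library only ever sees one operation at a time and so has little room to optimize across them.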

Page 18: A Deeper Dive into Apache MXNet - March 2017 AWS Online Tech Talks

Declarative Programming

A = Variable('A')
B = Variable('B')
C = B * A
D = C + 1
f = compile(D)
d = f(A=np.ones(10), B=np.ones(10)*2)

[Figure: computation graph in which A and B feed a multiply node (X), whose result feeds an add node (+) together with the constant 1]

C can share memory with D because C is deleted later

PROS

• More chances for optimization
• Cross different languages
• E.g. TensorFlow, Theano, Caffe

CONS

• Less flexible
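
The pseudocode on this slide can be made concrete with a toy graph builder. Everything below is a hypothetical sketch, not the API of MXNet or any other framework: `Node`, `Variable`, and `compile_graph` are invented names (`compile_graph` stands in for the slide's `compile` to avoid shadowing Python's built-in), and no optimization is performed.

```python
import operator


class Node:
    """One vertex of a computation graph; nothing is computed on creation."""

    def __init__(self, op=None, inputs=(), name=None, value=None):
        self.op, self.inputs, self.name, self.value = op, inputs, name, value

    def __mul__(self, other):
        return Node(operator.mul, (self, _wrap(other)))

    def __add__(self, other):
        return Node(operator.add, (self, _wrap(other)))


def _wrap(v):
    """Turn a plain constant into a graph node."""
    return v if isinstance(v, Node) else Node(value=v)


def Variable(name):
    """Placeholder whose value is supplied only when the graph runs."""
    return Node(name=name)


def compile_graph(output):
    """Return a function that evaluates the graph for given inputs."""
    def run(**feeds):
        def evaluate(node):
            if node.name is not None:      # placeholder: look up the fed value
                return feeds[node.name]
            if node.op is None:            # constant
                return node.value
            return node.op(*(evaluate(i) for i in node.inputs))
        return evaluate(output)
    return run


A = Variable('A')
B = Variable('B')
C = B * A          # builds a graph node, computes nothing yet
D = C + 1
f = compile_graph(D)
d = f(A=1.0, B=2.0)
print(d)
```

Because the whole graph is known before any data arrives, a real engine has the chance to plan ahead, for example by letting C and D share memory as the slide describes. The operators here are generic, so array types that support `*` and `+` could be fed in place of the floats.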

Page 19: A Deeper Dive into Apache MXNet - March 2017 AWS Online Tech Talks

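MXNet: Mixed programming paradigm

IMPERATIVE NDARRAY API

>>> import mxnet as mx
>>> a = mx.nd.zeros((100, 50))
>>> b = mx.nd.ones((100, 50))
>>> c = a + b
>>> c += 1
>>> print(c)

DECLARATIVE SYMBOLIC EXECUTOR

>>> import mxnet as mx
>>> net = mx.symbol.Variable('data')
>>> net = mx.symbol.FullyConnected(data=net, num_hidden=128)
>>> net = mx.symbol.SoftmaxOutput(data=net, name='softmax')
>>> texec = mx.module.Module(net)
>>> # a module must be bound to input shapes and initialized before it can run
>>> texec.bind(data_shapes=[('data', (100, 50))], label_shapes=[('softmax_label', (100,))])
>>> texec.init_params()
>>> labels = mx.nd.zeros((100,))  # placeholder labels for illustration
>>> texec.forward(mx.io.DataBatch(data=[c], label=[labels]))
>>> texec.backward()

NDArray can be set as input to the graph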

Page 20: A Deeper Dive into Apache MXNet - March 2017 AWS Online Tech Talks

Let's train our first model to classify handwritten digits

Page 21: A Deeper Dive into Apache MXNet - March 2017 AWS Online Tech Talks

MXNet Overview

• Founded by: U. Washington, Carnegie Mellon U. (~1.5 yrs old)
• Recently accepted to the Apache Incubator
• State of the Art Model Support: Convolutional Neural Networks (CNN), Long Short-Term Memory (LSTM)
• Scalable: Near-linear scaling equals fastest time to model
• Multi-language: Support for Scala, Python, R, etc. for legacy code leverage and easy integration with Spark
• Ecosystem: Vibrant community from Academia and Industry

Open Source Project on GitHub | Apache-2 Licensed

Page 22: A Deeper Dive into Apache MXNet - March 2017 AWS Online Tech Talks

Application Examples | Python notebooks

• https://github.com/dmlc/mxnet-notebooks
• Basic concepts
  • NDArray - multi-dimensional array computation
  • Symbol - symbolic expression for neural networks
  • Module - neural network training and inference
• Applications
  • MNIST: recognize handwritten digits
  • Check out the distributed training results
  • Predict with pre-trained models
  • LSTMs for sequence learning
  • Recommender systems
  • Train a state of the art Computer Vision model (CNN)
  • Lots more...

Page 23: A Deeper Dive into Apache MXNet - March 2017 AWS Online Tech Talks

Call to Action

MXNet Resources:
• MXNet Blog Post | AWS Endorsement
• Read up on MXNet and Learn More: mxnet.io
• MXNet GitHub Repo
• MXNet Recommender Systems Talk | Leo Dirac

Developer Resources:
• Deep Learning AMI | Amazon Linux
• Deep Learning AMI | Ubuntu – NEW!!!
• P2 Instance Information
• CloudFormation Template Instructions
• Deep Learning Benchmark
• MXNet on Lambda
• MXNet on ECS/Docker
• MXNet on Raspberry Pi | Wine Detector

Page 24: A Deeper Dive into Apache MXNet - March 2017 AWS Online Tech Talks

Thank You
