SINGA: Putting Deep Learning into the Hands of Multimedia Users

http://singa.apache.org/

Wei Wang, Gang Chen, Tien Tuan Anh Dinh, Jinyang Gao, Beng Chin Ooi, Kian-Lee Tan, and Sheng Wang



• Introduction
  • Multimedia data and application
• Motivations
  • Deep learning models and training, and design principles
• SINGA
  • Usability
  • Scalability
  • Implementation
• Experiment

Introduction

Multimedia Data

• Image/video
• Text
• Audio

Application areas: Social Media, E-commerce, Health-care

Example companies:

• Madbits (acquired by Twitter)
• Perceptio (acquired by Apple)
• LookFlow (acquired by Yahoo! Flickr)
• Deepomatic (e-commerce product search)
• Descartes Labs (satellite images)
• Clarifai (tagging)
• ParallelDots
• Semantria (NLP tasks, >10 languages)
• Idibon
• AlchemyAPI (acquired by IBM)
• VocalIQ (acquired by Apple)

Deep Learning has been noted for its effectiveness for multimedia applications!

Motivations

Model Categories

• Feedforward Models: CNN, MLP, Auto-encoder
  • Image/video classification
  • Krizhevsky, Sutskever, and Hinton, 2012; Szegedy et al., 2014; Simonyan and Zisserman, 2014a
• Energy models: DBN, RBM, DBM
  • Speech recognition
  • Dahl et al., 2012
• Recurrent Neural Networks: RNN, LSTM, GRU
  • Natural language processing
  • Mikolov et al., 2010; Cho et al., 2014

Design Goal I
Usability: easy to implement various models

Motivations: Training Process

• Training process
  • Update model parameters to minimize prediction error
• Training algorithm
  • Mini-batch Stochastic Gradient Descent (SGD)
• Training time
  • (time per SGD iteration) x (number of SGD iterations)
  • Long time to train large models over large datasets, e.g., 2 weeks for training OverFeat (Sermanet et al.) reported by Intel (https://software.intel.com/sites/default/files/managed/74/15/SPCS008.pdf).

[Figure panels: Back-propagation (BP), Contrastive Divergence (CD)]
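The training-time product above can be made concrete with a small sketch. The toy linear model, the function names, and the data layout below are illustrative assumptions and are not taken from SINGA; the point is only that one call to `SgdIteration` is the "time per SGD iteration" term and the loop count in `Train` is the "number of SGD iterations" term.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical toy model y = w * x + b; not part of SINGA.
struct LinearModel {
  double w = 0.0;
  double b = 0.0;
};

// One SGD iteration: average the squared-error gradient over the
// mini-batch [begin, end) and take one step of size lr.
void SgdIteration(LinearModel& m, const std::vector<double>& x,
                  const std::vector<double>& y, std::size_t begin,
                  std::size_t end, double lr) {
  double gw = 0.0, gb = 0.0;
  for (std::size_t i = begin; i < end; ++i) {
    const double err = m.w * x[i] + m.b - y[i];
    gw += err * x[i];
    gb += err;
  }
  const double n = static_cast<double>(end - begin);
  m.w -= lr * gw / n;
  m.b -= lr * gb / n;
}

// Runs num_iterations mini-batch iterations, cycling through the data.
// Total cost grows as (cost of one SgdIteration) x num_iterations.
void Train(LinearModel& m, const std::vector<double>& x,
           const std::vector<double>& y, std::size_t batch_size,
           std::size_t num_iterations, double lr) {
  assert(batch_size > 0 && !x.empty() && x.size() == y.size());
  for (std::size_t it = 0; it < num_iterations; ++it) {
    const std::size_t begin = (it * batch_size) % x.size();
    const std::size_t end = std::min(begin + batch_size, x.size());
    SgdIteration(m, x, y, begin, end, lr);
  }
}
```

Distributed training attacks one of the two factors: spreading a mini-batch over workers shortens each iteration, while running several model replicas reduces the iterations each machine must perform.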

Motivations: Distributed Training Frameworks

• Synchronous training (Google Sandblaster, Dean et al., 2012; Baidu AllReduce, Wu et al., 2015)
  • Reduce time per iteration
  • Scalable for single-node with multiple GPUs
  • Cannot scale to large cluster
• Asynchronous training (Google Downpour, Dean et al., 2012; Hogwild!, Recht et al., 2011)
  • Reduce number of iterations per machine
  • Scalable for big cluster with commodity machines (CPU)
  • Not stable
• Hybrid frameworks

Design Goal II
Scalability: not just flexible, but also efficient and adaptive to run different training frameworks
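The difference between the two training styles can be sketched on a one-parameter toy problem. Everything here is an assumption made for illustration (the quadratic per-worker loss, the function names, and the simplification that all asynchronous workers read the parameter before any update lands); it is not how Sandblaster, Downpour, or SINGA are implemented.

```cpp
#include <cassert>
#include <vector>

// Gradient of the toy per-worker loss 0.5 * (w - target)^2.
double Gradient(double w, double target) { return w - target; }

// Synchronous round: every worker computes its gradient at the same w,
// the gradients are averaged, and the parameter is updated once.
double SyncRound(double w, const std::vector<double>& targets, double lr) {
  assert(!targets.empty());
  double sum = 0.0;
  for (double t : targets) sum += Gradient(w, t);
  return w - lr * sum / static_cast<double>(targets.size());
}

// Asynchronous round (simplified): every worker read the same stale w,
// and the server applies each gradient as soon as it arrives.
double AsyncRound(double w, const std::vector<double>& targets, double lr) {
  const double stale_w = w;  // value seen by all workers before any update
  for (double t : targets) w -= lr * Gradient(stale_w, t);
  return w;
}
```

With two workers the asynchronous round applies two updates where the synchronous round applies one, which is why it needs fewer iterations per machine, but those updates are based on stale parameters, which is one source of the instability noted above.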

SINGA: A Distributed Deep Learning Platform

Usability: Abstraction

class Layer {
  vector<Blob> data, grad;   // features and their gradients
  vector<Param*> param;      // parameters of this layer
  ...
  void Setup(LayerProto& conf, vector<Layer*> src);
  void ComputeFeature(int flag, vector<Layer*> src);
  void ComputeGradient(int flag, vector<Layer*> src);
};
Driver::RegisterLayer<FooLayer>("Foo"); // register new layers

• Input layers load raw data (and label)
• Output layers output feature (and prediction results)
• Neuron layers transform features, e.g., convolution and pooling
• Loss layers measure training loss, e.g., cross-entropy loss
• Connection layers connect layers due to neural net partition

[Figure: abstraction stack of TrainOneBatch, NeuralNet, and Layer]
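To show how a user-defined layer might plug into this abstraction, here is a minimal, self-contained sketch. The simplified `Blob` alias, the reduced method signatures (no `flag`, no `Setup`), `InputLayer`, `ReluLayer`, and `LayerRegistry` are all hypothetical stand-ins written for this example; they mirror the shape of the slide's `Layer` class and `Driver::RegisterLayer` call but are not SINGA's actual API.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

using Blob = std::vector<float>;  // simplified stand-in for a feature blob

// Reduced version of the slide's Layer interface (assumption: one blob each).
class Layer {
 public:
  virtual ~Layer() = default;
  virtual void ComputeFeature(const std::vector<Layer*>& src) = 0;
  virtual void ComputeGradient(const std::vector<Layer*>& src) = 0;
  Blob data, grad;
};

// Input layer: only holds raw values.
class InputLayer : public Layer {
 public:
  explicit InputLayer(Blob values) {
    data = std::move(values);
    grad.assign(data.size(), 0.0f);
  }
  void ComputeFeature(const std::vector<Layer*>&) override {}
  void ComputeGradient(const std::vector<Layer*>&) override {}
};

// Neuron layer: element-wise max(x, 0).
class ReluLayer : public Layer {
 public:
  void ComputeFeature(const std::vector<Layer*>& src) override {
    const Blob& in = src.at(0)->data;
    data.resize(in.size());
    for (std::size_t i = 0; i < in.size(); ++i)
      data[i] = in[i] > 0.0f ? in[i] : 0.0f;
  }
  // grad holds dLoss/dOutput; write dLoss/dInput into the source layer.
  void ComputeGradient(const std::vector<Layer*>& src) override {
    Layer* below = src.at(0);
    assert(grad.size() == data.size());
    below->grad.resize(data.size());
    for (std::size_t i = 0; i < data.size(); ++i)
      below->grad[i] = data[i] > 0.0f ? grad[i] : 0.0f;
  }
};

// Name-to-factory map, in the spirit of Driver::RegisterLayer<T>("name").
class LayerRegistry {
 public:
  template <typename T>
  void Register(const std::string& type) {
    factories_[type] = [] { return std::unique_ptr<Layer>(new T()); };
  }
  std::unique_ptr<Layer> Create(const std::string& type) const {
    return factories_.at(type)();
  }

 private:
  std::map<std::string, std::function<std::unique_ptr<Layer>()>> factories_;
};
```

Registering by name lets a configuration file refer to a layer type as a string, so adding a new layer does not require changing the code that builds the network.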

Usability: Neural Net Representation

[Figure: a NeuralNet is built from Layers (Input, Hidden, Loss, labels). Panels: Feedforward models (e.g., CNN), RNN, RBM]

Usability: TrainOneBatch

[Figure: TrainOneBatch algorithms over a NeuralNet of Layers (Input, Hidden, Loss, labels). Back-propagation (BP): Feedforward models (e.g., CNN) and RNN. Contrastive Divergence (CD): RBM]

Just need to override the TrainOneBatch function to implement other algorithms!
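The override point can be illustrated with a small sketch in which layers only record the order they are visited. The `Phase` enum, the tracing `Layer`, and the `BPWorker` and `CDWorker` classes are assumptions written for this example rather than SINGA code, and the CD variant is a rough outline of the positive phase, negative phase, and gradient steps.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

enum Phase { kForward = 0, kPositive = 1, kNegative = 2 };

// Hypothetical layer that only records which method was called on it.
class Layer {
 public:
  Layer(std::string name, std::vector<std::string>* trace)
      : name_(std::move(name)), trace_(trace) {}
  void ComputeFeature(Phase phase) {
    static const char* kTag[] = {"F:", "P:", "N:"};
    trace_->push_back(kTag[phase] + name_);
  }
  void ComputeGradient() { trace_->push_back("G:" + name_); }

 private:
  std::string name_;
  std::vector<std::string>* trace_;
};

using NeuralNet = std::vector<Layer>;  // layers in topological order

class Worker {
 public:
  virtual ~Worker() = default;
  virtual void TrainOneBatch(NeuralNet& net) = 0;
};

// Back-propagation: forward through all layers, then backward in reverse.
class BPWorker : public Worker {
 public:
  void TrainOneBatch(NeuralNet& net) override {
    for (Layer& layer : net) layer.ComputeFeature(kForward);
    for (auto it = net.rbegin(); it != net.rend(); ++it) it->ComputeGradient();
  }
};

// Contrastive divergence (outline): positive phase, k negative (Gibbs)
// phases, then gradients; there is no backward sweep from a loss layer.
class CDWorker : public Worker {
 public:
  explicit CDWorker(int gibbs_steps) : gibbs_steps_(gibbs_steps) {}
  void TrainOneBatch(NeuralNet& net) override {
    for (Layer& layer : net) layer.ComputeFeature(kPositive);
    for (int k = 0; k < gibbs_steps_; ++k)
      for (Layer& layer : net) layer.ComputeFeature(kNegative);
    for (Layer& layer : net) layer.ComputeGradient();
  }

 private:
  int gibbs_steps_;
};
```

Both workers use the same Layer interface; only the order of the ComputeFeature and ComputeGradient calls changes, which is what makes a single overridable function sufficient.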

Scalability: Partitioning for Distributed Training

NeuralNet Partitioning:
1. Partition layers into different subsets.
2. Partition each single layer on batch dimension.
3. Partition each single layer on feature dimension.
4. Hybrid partitioning strategy of 1, 2 and 3.

[Figure: panels 1, 2, and 3 show the corresponding strategies with layers assigned to Worker 1 and Worker 2]

Users just need to CONFIGURE the partitioning scheme and SINGA takes care of the real work (e.g., slice and connect layers).
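Strategies 2 and 3 amount to slicing a layer's feature blob along one of its two dimensions. The sketch below shows that slicing on a plain matrix; `Matrix`, `SliceRange`, `PartitionDim`, and `PartitionBlob` are hypothetical names for this example and say nothing about how SINGA stores or slices blobs internally.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Matrix = std::vector<std::vector<float>>;  // indexed [sample][feature]

// Half-open range [begin, end) owned by part k when n items are split
// into `parts` contiguous, near-equal pieces.
std::pair<std::size_t, std::size_t> SliceRange(std::size_t n,
                                               std::size_t parts,
                                               std::size_t k) {
  assert(parts > 0 && k < parts);
  const std::size_t base = n / parts, extra = n % parts;
  const std::size_t begin = k * base + std::min(k, extra);
  const std::size_t end = begin + base + (k < extra ? 1 : 0);
  return {begin, end};
}

enum PartitionDim { kBatchDim = 0, kFeatureDim = 1 };

// Splits one layer's blob across workers along the chosen dimension.
std::vector<Matrix> PartitionBlob(const Matrix& blob, PartitionDim dim,
                                  std::size_t workers) {
  std::vector<Matrix> parts(workers);
  const std::size_t rows = blob.size();
  const std::size_t cols = rows == 0 ? 0 : blob[0].size();
  for (std::size_t k = 0; k < workers; ++k) {
    const auto r = SliceRange(dim == kBatchDim ? rows : cols, workers, k);
    if (dim == kBatchDim) {
      // Each worker gets whole samples (data parallelism).
      parts[k].assign(blob.begin() + r.first, blob.begin() + r.second);
    } else {
      // Each worker gets a column slice of every sample (model parallelism).
      for (const auto& row : blob)
        parts[k].emplace_back(row.begin() + r.first, row.begin() + r.second);
    }
  }
  return parts;
}
```

Batch partitioning keeps every worker's copy of the parameters full-sized but shrinks its share of the samples, while feature partitioning shrinks each worker's share of the layer itself and therefore needs connection layers to reassemble the slices.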

Scalability: Training Framework

Cluster Topology

[Figure: cluster topology. A server group (Server, Server, Server) holds the parameters; workers (Worker, Worker, Worker) hold the neural net. Legend: Worker, Server, Node, Group, Inter-node Communication]

Synchronous training cannot scale to large group size

Communication is the bottleneck!

[Figure: four configurations of the topology. Sync: (a) Sandblaster, (b) AllReduce. Async: (c) Downpour, (d) Distributed Hogwild]

SINGA is able to configure most known frameworks.
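One plausible way to read the four configurations is as choices of how many worker groups and server groups exist and whether servers share nodes with workers. The struct, its field names, and the mapping rules below are assumptions made for illustration; they are not SINGA's configuration schema and should be checked against the SINGA documentation before use.

```cpp
#include <cassert>
#include <string>

// Hypothetical description of a cluster topology.
struct Topology {
  int worker_groups;                    // replicas of the neural net
  int server_groups;                    // replicas of the parameters
  bool servers_colocated_with_workers;  // servers run on worker nodes
};

// Assumption: workers inside one group run synchronously, so a single
// worker group means fully synchronous training.
bool IsSynchronous(const Topology& t) { return t.worker_groups == 1; }

// Maps a topology to the framework it most resembles under the
// assumptions stated above.
std::string FrameworkName(const Topology& t) {
  assert(t.worker_groups >= 1 && t.server_groups >= 1);
  if (IsSynchronous(t))
    return t.servers_colocated_with_workers ? "AllReduce" : "Sandblaster";
  if (t.server_groups > 1 && t.servers_colocated_with_workers)
    return "Distributed Hogwild";
  return "Downpour";
}
```

Under this reading, a hybrid framework is simply several worker groups with more than one worker each: synchronous inside a group and asynchronous across groups.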

Implementation

[Figure: threads of a SINGA process]
• Main Thread: Driver::Train(), Stub::Run()
• Worker thread: While(not stop): Worker::TrainOneBatch()
• Server thread: While(not stop): Server::Update()
• Remote Nodes

SINGA Software Stack

[Figure: software stack. Labels: CNN, RBM, RNN; Driver; Worker, Stub, Server; HDFS, DiskFile; Mesos; Zookeeper; Docker; Ubuntu, CentOS, MacOS. Legend: Optional Component, SINGA Component]

Deep learning as a Service (DLaaS)

SINGA’s RAFIKI

[Figure: Rafiki architecture. Clients: third party APPs (Web app, Mobile, ...) through the API, and developers (Browser) through the GUI. Rafiki Server: routing (load balancing); user, job, model, and node management; database; file storage system (e.g. HDFS). Rafiki Agents, each with Timon (c++ wrapper) on top of SINGA. Components are connected by http requests]

1. To improve the Usability of SINGA;
2. To “level” the playing field by taking care of complex system plumbing work, its reliability, efficiency and scalability.

Comparison: Features of the Systems

Comparison with other open source projects

Category                         Feature                      SINGA        Caffe  CXXNET  cuda-convnet  H2O
Deep Learning Models             Feed-forward (CNN)           ✔            ✔      ✔       ✔             MLP
                                 Energy model (RBM)           ✔            x      x       x             x
                                 Recurrent networks (RNN)     ✔            ✔      x       x             x
Distributed Training Frameworks  Synchronous                  ✔            ✔      ✔       ✔             ✔
                                 Asynchronous                 ✔            ✔      x       x             x
                                 Hybrid                       ✔            x      x       x             x
Hardware                         CPU                          ✔            ✔      ✔       x             ✔
                                 GPU                          V0.2.0       ✔      ✔       ✔             x
Cloud Software                   HDFS                         ✔            x      x       x             ✔
                                 Resource management          ✔            x      x       x             ✔
                                 Virtualization               ✔            x      x       x             ✔
Binding                          Python (P), Matlab (M), R    ongoing (P)  P+M    P       P             P+R

Note on slide: MXNet on 28/09/15

Experiment --- Usability

• Used SINGA to train three known models and verify the results

1. Deep Auto-Encoders (RBM)
   Hinton, G. E. and Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, Vol. 313, no. 5786, pp. 504-507, 28 July 2006.

2. Deep Multi-Modal Neural Network (CNN, MLP)
   W. Wang, X. Yang, B. C. Ooi, D. Zhang, Y. Zhuang: Effective Deep Learning Based Multi-Modal Retrieval. VLDB Journal - Special issue of VLDB'14 best papers, 2015.
   W. Wang, B. C. Ooi, X. Yang, D. Zhang, Y. Zhuang: Effective Multi-Modal Retrieval based on Stacked Auto-Encoders. Int'l Conference on Very Large Data Bases (VLDB), 2014.

3. Recurrent neural network based language model
   Tomáš Mikolov, Martin Karafiát, Lukáš Burget, Jan Černocký, Sanjeev Khudanpur: Recurrent neural network based language model. INTERSPEECH 2010, Makuhari, Chiba, JP.

Experiment --- Efficiency and Scalability

Train DCNN over CIFAR10: https://code.google.com/p/cuda-convnet

• Single Node
  • 4 NUMA nodes (Intel Xeon 7540, 2.0GHz)
  • Each node has 6 cores, hyper-threading enabled
  • 500 GB memory
• Cluster
  • Quad-core Intel Xeon 3.1 GHz CPU and 8GB memory, 1Gbps switch
  • 32 nodes, 4 workers per node

Synchronous

[Figure: synchronous training results; plot label: Caffe, GTX 970]

Experiment --- Scalability

Train DCNN over CIFAR10: https://code.google.com/p/cuda-convnet

Asynchronous

[Figure: asynchronous training results. Panels: Single Node, Cluster. Legend: Caffe, SINGA]

Conclusions

• Programming Model, Abstraction, and System Architecture
  • Easy to implement different models
  • Flexible and efficient to run different frameworks
• Experiments
  • Train models from different categories
  • Scalability test for different training frameworks
• SINGA
  • Usable, extensible, efficient and scalable
  • Apache SINGA v0.1.0 has been released
  • V0.2.0 (with GPU-CPU, DLaaS, more features) out next month
  • Being used for healthcare analytics, product search, …
