SINGA: Putting Deep Learning into the Hands of Multimedia Users
http://singa.apache.org/
Wei Wang, Gang Chen, Tien Tuan Anh Dinh, Jinyang Gao, Beng Chin Ooi, Kian-Lee Tan, and Sheng Wang
• Introduction
  • Multimedia data and application
• Motivations
  • Deep learning models and training, and design principles
• SINGA
  • Usability
  • Scalability
  • Implementation
• Experiment
Introduction
Multimedia Data: image/video, text, audio
Application areas: Social Media, E-commerce, Health-care
Example companies:
• Image/video: Madbits (acquired by Twitter), Perceptio (acquired by Apple), LookFlow (acquired by Yahoo! Flickr), Deepomatic (e-commerce product search), Descartes Labs (satellite images), Clarifai (tagging)
• Text: ParallelDots, Semantria (NLP tasks in >10 languages), Idibon, AlchemyAPI (acquired by IBM)
• Audio: VocalIQ (acquired by Apple)
Deep Learning has been noted for its effectiveness for multimedia applications!
Motivations
Model Categories
• Feedforward Models: CNN, MLP, Auto-encoder (image/video classification)
Krizhevsky, Sutskever, and Hinton, 2012; Szegedy et al., 2014; Simonyan and Zisserman, 2014a
[Figure: CNN]
Motivations
Model Categories
• Energy models: DBN, RBM, DBM (speech recognition)
Dahl et al., 2012
[Figure: RBM, DBN]
Motivations
Model Categories
• Recurrent Neural Networks: RNN, LSTM, GRU (natural language processing)
Mikolov et al., 2010; Cho et al., 2014
Motivations
Model Categories
• Feedforward Models: CNN, MLP, Auto-encoder (image/video classification)
• Energy models: DBN, RBM, DBM (speech recognition)
• Recurrent Neural Networks: RNN, LSTM, GRU (natural language processing)
Design Goal I (Usability): easy to implement various models
Motivations: Training Process
• Training process
  • Update model parameters to minimize prediction error
• Training algorithm
  • Mini-batch Stochastic Gradient Descent (SGD)
• Training time
  • (time per SGD iteration) x (number of SGD iterations)
  • Long time to train large models over large datasets, e.g., 2 weeks for training OverFeat (Sermanet et al.) reported by Intel (https://software.intel.com/sites/default/files/managed/74/15/SPCS008.pdf).
[Figures: Back-propagation (BP); Contrastive Divergence (CD)]
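As a rough illustration of the mini-batch SGD loop named on this slide, the sketch below fits a one-parameter model y = w * x under squared error. The function name `TrainSGD`, the toy model, and the hyper-parameters are assumptions made for this example; none of it is SINGA code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical example: fit y = w * x by mini-batch SGD on squared error.
// Training time is (time per iteration) x (number of iterations), so both
// the batch size and the iteration count matter.
double TrainSGD(const std::vector<double>& xs, const std::vector<double>& ys,
                std::size_t batch_size, int iterations, double lr) {
  assert(xs.size() == ys.size() && !xs.empty() && batch_size > 0);
  double w = 0.0;
  std::size_t next = 0;  // index of the next sample to visit
  for (int it = 0; it < iterations; ++it) {
    double grad = 0.0;
    for (std::size_t b = 0; b < batch_size; ++b) {
      const double x = xs[next], y = ys[next];
      grad += 2.0 * (w * x - y) * x;  // d/dw of (w * x - y)^2
      next = (next + 1) % xs.size();  // cycle through the dataset
    }
    w -= lr * grad / static_cast<double>(batch_size);  // averaged gradient step
  }
  return w;
}
```

With data generated from y = 3x, a call such as `TrainSGD(xs, ys, 2, 200, 0.01)` should move `w` close to 3; larger models follow the same loop but spend far longer in each iteration.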
Motivations: Distributed Training Frameworks
• Synchronous training (Google Sandblaster, Dean et al., 2012; Baidu AllReduce, Wu et al., 2015)
  • Reduce time per iteration
  • Scalable for single-node with multiple GPUs
  • Cannot scale to a large cluster
• Asynchronous training (Google Downpour, Dean et al., 2012; Hogwild!, Recht et al., 2011)
  • Reduce number of iterations per machine
  • Scalable for big cluster with commodity machines (CPU)
  • Not stable
• Hybrid frameworks
Design Goal II (Scalability): not just flexible, but also efficient and adaptive to run different training frameworks
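The difference between the two update styles can be sketched with a single parameter. This is a serialized simulation under stated assumptions: each worker has loss 0.5 * (w - target)^2, and the names `SyncStep` and `AsyncStep` are hypothetical. It ignores network delay and gradient staleness, which are what make real asynchronous training less stable.

```cpp
#include <cassert>
#include <vector>

// Hypothetical sketch (not SINGA code): one parameter w, and each worker has
// loss 0.5 * (w - target)^2, so its gradient is (w - target).

// Synchronous step: every worker computes a gradient on the same w,
// the gradients are averaged, and w is updated once.
double SyncStep(double w, const std::vector<double>& targets, double lr) {
  assert(!targets.empty());
  double sum = 0.0;
  for (double t : targets) sum += (w - t);
  return w - lr * sum / static_cast<double>(targets.size());
}

// Asynchronous step: each worker reads the current w, computes its own
// gradient, and updates w immediately without waiting for the others,
// so later workers see the earlier workers' updates.
double AsyncStep(double w, const std::vector<double>& targets, double lr) {
  for (double t : targets) w -= lr * (w - t);
  return w;
}
```

Starting from the same `w`, the two functions generally return different values: the synchronous result does not depend on worker order, while the asynchronous one does.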
SINGA: A Distributed Deep Learning Platform
Usability: Abstraction
class Layer {
  vector<Blob> data, grad;
  vector<Param*> param;
  ...
  void Setup(LayerProto& conf, vector<Layer*> src);
  void ComputeFeature(int flag, vector<Layer*> src);
  void ComputeGradient(int flag, vector<Layer*> src);
};
Driver::RegisterLayer<FooLayer>("Foo"); // register new layers

Input layers load raw data (and label)
Output layers output feature (and prediction results)
Neuron layers transform features, e.g., convolution and pooling
Loss layers measure training loss, e.g., cross-entropy loss
Connection layers connect layers due to neural net partition
[Figure: abstraction hierarchy: TrainOneBatch, NeuralNet, Layer]
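To make the layer abstraction concrete, here is a minimal self-contained sketch of a user-defined neuron layer and a name-based registry. It only imitates the shape of the interface on this slide: `Blob`, `Param`, `LayerProto` and the `flag` argument are omitted, `FooLayer` is given an element-wise ReLU purely as an example, and `Registry`/`RegisterLayer` are hypothetical stand-ins for `Driver::RegisterLayer`.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Simplified stand-in for the slide's Layer class (assumption: plain float
// vectors instead of Blob, no Param, no Setup).
struct Layer {
  std::vector<float> data, grad;
  virtual ~Layer() = default;
  virtual void ComputeFeature(const std::vector<Layer*>& src) { (void)src; }
  virtual void ComputeGradient(const std::vector<Layer*>& src) { (void)src; }
};

// Hypothetical neuron layer: element-wise ReLU over its single source layer.
struct FooLayer : Layer {
  void ComputeFeature(const std::vector<Layer*>& src) override {
    assert(src.size() == 1);
    data.resize(src[0]->data.size());
    for (std::size_t i = 0; i < data.size(); ++i)
      data[i] = src[0]->data[i] > 0 ? src[0]->data[i] : 0.0f;
  }
  // Pass the gradient back to the source layer where the output was positive.
  void ComputeGradient(const std::vector<Layer*>& src) override {
    assert(src.size() == 1 && grad.size() == data.size());
    src[0]->grad.resize(data.size());
    for (std::size_t i = 0; i < data.size(); ++i)
      src[0]->grad[i] = data[i] > 0 ? grad[i] : 0.0f;
  }
};

// Hypothetical registry: maps a layer type name to a factory function.
using LayerFactory = std::function<std::unique_ptr<Layer>()>;
std::map<std::string, LayerFactory>& Registry() {
  static std::map<std::string, LayerFactory> registry;
  return registry;
}
template <typename T>
void RegisterLayer(const std::string& name) {
  Registry()[name] = [] { return std::unique_ptr<Layer>(new T()); };
}
```

After `RegisterLayer<FooLayer>("Foo")`, a configuration that names the type "Foo" could be turned into a layer object with `Registry().at("Foo")()`, which is the general idea behind registering new layers by name.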
Usability: Neural Net Representation
[Figure: neural nets represented as graphs of layers (Input, Hidden, Loss, labels) for feedforward models (e.g., CNN), RNN, and RBM; abstraction hierarchy: TrainOneBatch, NeuralNet, Layer]
Usability: TrainOneBatch
[Figure: Back-propagation (BP) for feedforward models (e.g., CNN) and RNN; Contrastive Divergence (CD) for RBM; layers shown: Input, Hidden, Loss, labels]
Just need to override the TrainOneBatch function to implement other algorithms!
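The override mechanism can be sketched as a virtual function with one subclass per training algorithm. Only the name `TrainOneBatch` comes from the slides; `NeuralNet`, `Worker`, `BPWorker` and `CDWorker` below are hypothetical stand-ins that record the order of steps instead of doing real computation.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in: a net is just its layer names in topological order.
struct NeuralNet {
  std::vector<std::string> layers;
};

struct Worker {
  std::vector<std::string> trace;  // records the visiting order, for illustration
  virtual ~Worker() = default;
  virtual void TrainOneBatch(const NeuralNet& net) = 0;
};

// Back-propagation: forward pass over all layers, then backward in reverse.
struct BPWorker : Worker {
  void TrainOneBatch(const NeuralNet& net) override {
    for (auto it = net.layers.begin(); it != net.layers.end(); ++it)
      trace.push_back("forward:" + *it);
    for (auto it = net.layers.rbegin(); it != net.layers.rend(); ++it)
      trace.push_back("backward:" + *it);
  }
};

// Contrastive divergence (CD-k) outline: a positive phase on the data,
// k Gibbs sampling steps for the negative phase, then the gradient.
struct CDWorker : Worker {
  int k = 1;
  void TrainOneBatch(const NeuralNet& net) override {
    (void)net;  // layer details are omitted in this outline
    trace.push_back("positive");
    for (int i = 0; i < k; ++i) trace.push_back("gibbs");
    trace.push_back("gradient");
  }
};
```

The training loop only calls `TrainOneBatch`, so swapping `BPWorker` for `CDWorker` changes the algorithm without touching the net or the loop.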
Scalability: Partitioning for Distributed Training
NeuralNet Partitioning:
1. Partition layers into different subsets.
2. Partition each single layer on the batch dimension.
3. Partition each single layer on the feature dimension.
4. Hybrid partitioning strategy of 1, 2 and 3.
[Figure: partitioning strategies 1, 2 and 3, each shown across Worker 1 and Worker 2]
Users just need to CONFIGURE the partitioning scheme and SINGA takes care of the real work (e.g., slice and connect layers)
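Strategies 2 and 3 amount to slicing a layer's feature matrix along different axes. The sketch below assumes a row-major matrix with one row per sample of the mini-batch; the `Matrix` alias and the functions `SliceBatch` and `SliceFeature` are hypothetical and only illustrate the slicing, not how SINGA connects the resulting sub-layers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<float>>;  // rows = batch, cols = features

// Strategy 2: split the rows (samples of the mini-batch) across workers.
std::vector<Matrix> SliceBatch(const Matrix& m, std::size_t workers) {
  assert(workers > 0);
  std::vector<Matrix> parts(workers);
  for (std::size_t r = 0; r < m.size(); ++r)
    parts[r * workers / m.size()].push_back(m[r]);  // contiguous row ranges
  return parts;
}

// Strategy 3: split the columns (feature dimension) across workers.
std::vector<Matrix> SliceFeature(const Matrix& m, std::size_t workers) {
  assert(workers > 0);
  std::vector<Matrix> parts(workers, Matrix(m.size()));
  for (std::size_t r = 0; r < m.size(); ++r) {
    const std::size_t cols = m[r].size();
    for (std::size_t c = 0; c < cols; ++c)
      parts[c * workers / cols][r].push_back(m[r][c]);  // contiguous column ranges
  }
  return parts;
}
```

Batch slicing gives every worker all features for a subset of the samples, while feature slicing gives every worker all samples but only a subset of the features; a hybrid scheme would apply different slicing to different layers.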
Scalability: Training Framework
Cluster Topology
[Figure: a server group (Server nodes holding the Parameters) and a worker group (Worker nodes holding the Neural Net). Legends: Worker, Server, Node, Group, Inter-node Communication]
Synchronous training cannot scale to large group size
Scalability: Training Framework
Cluster Topology
[Figure: cluster topology. Legends: Worker, Server, Node, Group, Inter-node Communication]
Communication is the bottleneck!
Scalability: Training Framework
Cluster Topology
[Figure: four cluster topologies. Sync: (a) Sandblaster, (b) AllReduce. Async: (c) Downpour, (d) Distributed Hogwild. Legends: Worker, Server, Node, Group, Inter-node Communication]
SINGA is able to configure most known frameworks.
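One way to read the four panels is as points in a small configuration space. The sketch below is an assumption-laden simplification: the `Topology` fields and the `Classify` rules are hypothetical and are not taken from SINGA's configuration format. It assumes that a single worker group means synchronous training, that several worker groups mean asynchronous training, and that the remaining distinction is how servers are arranged.

```cpp
#include <cassert>
#include <string>

// Hypothetical topology description; the field names are assumptions for
// this sketch and are not SINGA configuration keys.
struct Topology {
  int worker_groups;  // number of worker groups
  int server_groups;  // number of server groups
  bool co_located;    // true if each node runs both a worker and a server
};

std::string Classify(const Topology& t) {
  assert(t.worker_groups > 0 && t.server_groups > 0);
  if (t.worker_groups == 1)  // one group: workers synchronize every iteration
    return t.co_located ? "AllReduce (sync)" : "Sandblaster (sync)";
  // several groups: each group updates the parameters asynchronously
  return t.server_groups == 1 ? "Downpour (async)"
                              : "Distributed Hogwild (async)";
}
```

Under these assumptions, switching between frameworks is a matter of changing a few numbers in the cluster description rather than rewriting training code.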
Implementation
Main thread: Driver::Train(), Stub::Run()
Worker thread: while (not stop): Worker::TrainOneBatch()
Server thread: while (not stop): Server::Update()
Remote Nodes
SINGA Software Stack
[Figure: software stack. Shown: Driver, Worker, Stub, Server; CNN, RBM, RNN; HDFS, DiskFile, Mesos, Zookeeper, Docker; Ubuntu, CentOS, MacOS. Legend: Optional Component, SINGA Component]
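The worker and server loops can be sketched with two standard threads and a shared queue. This is a minimal illustration under stated assumptions, not SINGA's Stub or messaging code: the worker runs a fixed number of iterations and sends a constant gradient, the queue stands in for worker-to-server messages, and `RunTraining` is a hypothetical name.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <queue>
#include <thread>

// Hypothetical sketch: a worker thread produces gradients and a server
// thread applies them to a single parameter until told to stop.
double RunTraining(int iterations, double lr) {
  assert(iterations >= 0);
  std::mutex mu;
  std::queue<double> gradients;  // messages from worker to server
  std::atomic<bool> stop{false};
  double param = 0.0;            // touched only by the server thread while it runs

  std::thread server([&] {
    for (;;) {
      // Read the flag before the queue: the worker sets it only after its
      // last push, so "stopping and empty" means nothing is left to apply.
      const bool stopping = stop.load();
      double g = 0.0;
      bool has_grad = false;
      {
        std::lock_guard<std::mutex> lock(mu);
        if (!gradients.empty()) {
          g = gradients.front();
          gradients.pop();
          has_grad = true;
        }
      }
      if (has_grad) {
        param -= lr * g;  // plays the role of Server::Update()
      } else if (stopping) {
        break;
      } else {
        std::this_thread::yield();
      }
    }
  });

  std::thread worker([&] {
    for (int i = 0; i < iterations; ++i) {  // stand-in for Worker::TrainOneBatch()
      std::lock_guard<std::mutex> lock(mu);
      gradients.push(1.0);                  // constant gradient for illustration
    }
    stop.store(true);
  });

  worker.join();
  server.join();
  return param;
}
```

Depending on the toolchain, building this may require linking the threading library (for example `-pthread` with GCC or Clang).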
Deep Learning as a Service (DLaaS)
SINGA's RAFIKI
[Figure: DLaaS architecture, connected by http requests
 Clients: third party apps (web app, mobile, ...) via the API; developers (browser) via the GUI
 Rafiki Server: routing (load balancing); DataBase
 User, Job, Model, Node Management
 Rafiki Agents, each with Timon (C++ wrapper) on top of SINGA
 File Storage System (e.g. HDFS)]
1. To improve the usability of SINGA;
2. To "level" the playing field by taking care of complex system plumbing work, its reliability, efficiency and scalability.
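Comparison: Features of the Systems
Comparison with other open source projects
Category | Feature | SINGA | Caffe | CXXNET | cuda-convnet | H2O
Deep Learning Models | Feed-forward (CNN) | ✔ | ✔ | ✔ | ✔ | MLP
Deep Learning Models | Energy model (RBM) | ✔ | x | x | x | x
Deep Learning Models | Recurrent networks (RNN) | ✔ | ✔ | x | x | x
Distributed Training Frameworks | Synchronous | ✔ | ✔ | ✔ | ✔ | ✔
Distributed Training Frameworks | Asynchronous | ✔ | ✔ | x | x | x
Distributed Training Frameworks | Hybrid | ✔ | x | x | x | x
Hardware | CPU | ✔ | ✔ | ✔ | x | ✔
Hardware | GPU | V0.2.0 | ✔ | ✔ | ✔ | x
Cloud Software | HDFS | ✔ | x | x | x | ✔
Cloud Software | Resource management | ✔ | x | x | x | ✔
Cloud Software | Virtualization | ✔ | x | x | x | ✔
Binding | Python (P), Matlab (M), R | ongoing (P) | P+M | P | P | P+R
MXNet on 28/09/15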
Experiment --- Usability
Hinton, G. E. and Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, Vol. 313, no. 5786, pp. 504-507, 28 July 2006.
…
[Figure: Deep Auto-Encoders, RBM]
• Used SINGA to train three known models and verify the results
Experiment --- Usability
W. Wang, X. Yang, B. C. Ooi, D. Zhang, Y. Zhuang: Effective Deep Learning Based Multi-Modal Retrieval. VLDB Journal - Special issue of VLDB'14 best papers, 2015.
W. Wang, B. C. Ooi, X. Yang, D. Zhang, Y. Zhuang: Effective Multi-Modal Retrieval based on Stacked Auto-Encoders. Int'l Conference on Very Large Data Bases (VLDB), 2014.
[Figure: Deep Multi-Modal Neural Network (CNN, MLP)]
Experiment --- Usability
Mikolov Tomáš, Karafiát Martin, Burget Lukáš, Černocký Jan, Khudanpur Sanjeev: Recurrent neural network based language model, INTERSPEECH 2010, Makuhari, Chiba, JP
Experiment --- Efficiency and Scalability
Train DCNN over CIFAR10: https://code.google.com/p/cuda-convnet
Single Node
• 4 NUMA nodes (Intel Xeon 7540, 2.0GHz)
• Each node has 6 cores, hyper-threading enabled
• 500 GB memory
Cluster
• Quad-core Intel Xeon 3.1 GHz CPU and 8GB memory, 1Gbps switch
• 32 nodes, 4 workers per node
Synchronous
[Figure: synchronous training results; labels include Caffe, GTX 970]
Experiment --- Scalability
Train DCNN over CIFAR10: https://code.google.com/p/cuda-convnet
Asynchronous
[Figure: asynchronous training results on Single Node and Cluster; curves for Caffe and SINGA]
Conclusions
• Programming Model, Abstraction, and System Architecture
  • Easy to implement different models
  • Flexible and efficient to run different frameworks
• Experiments
  • Train models from different categories
  • Scalability test for different training frameworks
• SINGA
  • Usable, extensible, efficient and scalable
  • Apache SINGA v0.1.0 has been released
  • V0.2.0 (with GPU-CPU, DLaaS, more features) out next month
  • Being used for healthcare analytics, product search, …