Deep Learning Big Data Meetup @ Trondheim
TRANSCRIPT
Outline
• Intro to DL (A. Ng)
• Intro to Neural Nets
• Training NN
• Conv Nets
• Autoencoders
• Word Embeddings
• DL@TRD
• Bonus
AI was the weak link until Deep Learning matured
China's Search Giant Goes Deep
http://www.iro.umontreal.ca/~bengioy/dlbook/intro.html
Loose inspiration from the brain
Large Neural Nets perform better than small ones
Google Brain project – 1 billion connections – 1 week of YouTube watching.
From 16k CPUs to 3 GPUs; from 1M connections to 10B.
Applications of Deep Learning
Voice interfaces to assist computer-illiterate users
Image-search for impossible queries
Image queries to find things that are impossible to describe in words
“Whoever wins AI wins the Internet.” A. Ng.
Google, Facebook and other tech companies race to develop artificial intelligence
Perceptrons make decisions by weighing evidence
http://neuralnetworksanddeeplearning.com/chap1.html
Example: NAND gate
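The NAND example can be sketched in a few lines of plain Python: a perceptron fires when the weighted sum of its inputs plus a bias is positive. The weights −2, −2 and bias 3 follow Nielsen's chapter; the function names are mine.

```python
# A perceptron outputs 1 if the weighted sum of its inputs plus a bias
# is positive, and 0 otherwise.

def perceptron(inputs, weights, bias):
    s = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if s > 0 else 0

# With weights -2, -2 and bias 3, the perceptron computes NAND.
def nand(x1, x2):
    return perceptron([x1, x2], weights=[-2, -2], bias=3)

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "->", nand(x1, x2))
```

Since NAND is universal for boolean logic, a network of such perceptrons can in principle compute any logical function.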
Wiring several perceptrons for more abstract and complex decisions
A simple network to classify handwritten digits (MNIST)
Learning to solve a problem
Forward and Backward passes
http://caffe.berkeleyvision.org/tutorial/forward_backward.html
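A minimal numeric illustration of the two passes, assuming a single sigmoid neuron with squared-error loss (values are invented; frameworks like Caffe do the same thing layer by layer):

```python
import math

# Forward pass: compute the neuron's output and the loss.
# Backward pass: apply the chain rule to get dLoss/dw.

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(w, x, y):
    z = w * x
    a = sigmoid(z)
    loss = 0.5 * (a - y) ** 2
    return z, a, loss

def backward(w, x, y):
    _, a, _ = forward(w, x, y)
    dloss_da = a - y        # derivative of 0.5*(a-y)^2 w.r.t. a
    da_dz = a * (1 - a)     # sigmoid'(z)
    dz_dw = x
    return dloss_da * da_dz * dz_dw

w, x, y = 0.5, 1.5, 1.0
grad = backward(w, x, y)

# Sanity check against a numerical gradient:
eps = 1e-6
num = (forward(w + eps, x, y)[2] - forward(w - eps, x, y)[2]) / (2 * eps)
print(grad, num)
```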
The Unstable Gradient Problem
Why it is difficult to train an RNN
Why are deep neural networks hard to train?
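The instability is easy to see numerically: the gradient reaching an early layer is a product of per-layer factors w · σ′(z), and since σ′(z) ≤ 0.25 for the sigmoid, the product vanishes for small weights and explodes for large ones. A toy calculation (weight values chosen for illustration):

```python
# Gradient magnitude at depth k behaves like (w * sigmoid'(z))^k.
# sigmoid'(z) is at most 0.25, so with weight 1 the factor is <= 0.25
# per layer (vanishing); with weight 5 it is 1.25 per layer (exploding).

max_sigmoid_deriv = 0.25
for depth in (1, 5, 10, 20):
    small = (1.0 * max_sigmoid_deriv) ** depth   # weight = 1
    large = (5.0 * max_sigmoid_deriv) ** depth   # weight = 5
    print(depth, small, large)
```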
Practical advice for training neural networks (by Ilya Sutskever)
• Get good data
• Preprocessing
• Minibatches
• Gradient normalization
• Learning rate schedule
• Learning rate
• Weight Initialization
• Data augmentation
• Dropout
• Ensembling
A Brief Overview of Deep Learning
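As one concrete example of the "gradient normalization" item above, gradients are often clipped by their global norm; a minimal sketch (the threshold of 5.0 is an illustrative choice, not Sutskever's prescription):

```python
import math

# If the gradient's global L2 norm exceeds max_norm, rescale the whole
# gradient so its norm equals max_norm; otherwise leave it unchanged.
def clip_by_global_norm(grads, max_norm=5.0):
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm:
        return grads
    scale = max_norm / norm
    return [g * scale for g in grads]

print(clip_by_global_norm([3.0, 4.0]))    # norm 5, unchanged
print(clip_by_global_norm([30.0, 40.0]))  # norm 50, rescaled to norm 5
```

Clipping keeps the update direction while bounding the step size, which is what makes recurrent nets trainable despite occasional exploding gradients.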
Convolutional Neural Networks have been around for a while
Convolutions
Understanding Convolutions
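A convolution slides a small learned kernel across the image and takes a weighted sum at every position, with the same weights shared everywhere. A minimal pure-Python "valid" convolution with a hand-picked vertical-edge kernel (all values illustrative):

```python
# "Valid" 2D convolution: no padding, output shrinks by kernel size - 1.
def conv2d(image, kernel):
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(ih - kh + 1):
        row = []
        for j in range(iw - kw + 1):
            acc = 0.0
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        out.append(row)
    return out

# A 4x4 image that is dark on the left, bright on the right, and a
# kernel that responds where brightness increases left-to-right:
image = [[0, 0, 1, 1]] * 4
kernel = [[-1, 1], [-1, 1]]
print(conv2d(image, kernel))  # strongest response at the vertical edge
```

In a conv net the kernel entries are learned rather than hand-picked, and many kernels run in parallel to form feature maps.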
Convolutional Neural Network
Conv Nets: A Modular Perspective
Human-level control through deep reinforcement learning
Intriguing properties of Conv Nets
Intriguing properties of neural networks
Stacked Autoencoders
Reducing the Dimensionality of Data with Neural Networks
Stacked Autoencoders – Semantic Hashing
Semantic Hashing
Reducing the Dimensionality of Data with Neural Networks
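Semantic hashing can be sketched as: threshold the autoencoder's real-valued code units into bits, then retrieve similar items by Hamming distance on those short codes. The code values and document names below are invented for illustration:

```python
# Turn a real-valued autoencoder code into a binary code.
def to_bits(code, threshold=0.5):
    return tuple(1 if c > threshold else 0 for c in code)

# Number of differing bits between two codes.
def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

query = to_bits([0.011, 0.98, 0.2, 0.7])
database = {
    "doc_a": to_bits([0.1, 0.9, 0.3, 0.8]),
    "doc_b": to_bits([0.9, 0.1, 0.8, 0.2]),
}
best = min(database, key=lambda k: hamming(query, database[k]))
print(query, "->", best)
```

Because the codes are short bit strings, nearby items can be found by flipping a few bits of the query address, which makes retrieval extremely fast.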
Behavioral micro-segmentation (training set)
[Figure: stacked autoencoder on behavioral data – symmetric layer sizes 1008 / 275 / 150 / 8 / 150 / 275 / 1008; real-valued code activations (0.011 | 0.98 | 0.2 | … ) are thresholded into a short bit code (1 0 …)]
Word embeddings
Deep Learning, NLP, and Representations
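The idea can be sketched with toy vectors: related words sit near each other, and some relations become vector arithmetic (the classic king − man + woman ≈ queen). The 2-D vectors below are hand-made for illustration, not trained embeddings:

```python
import math

# Hand-made toy embeddings: first axis ~ "royalty", second axis ~ "male".
emb = {
    "king":  [0.9, 0.9],
    "queen": [0.9, 0.1],
    "man":   [0.1, 0.9],
    "woman": [0.1, 0.1],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# king - man + woman, then find the nearest word by cosine similarity:
target = [k - m + w for k, m, w in zip(emb["king"], emb["man"], emb["woman"])]
best = max((w for w in emb if w != "king"),
           key=lambda w: cosine(emb[w], target))
print(best)
```

Real embeddings live in hundreds of dimensions and are learned from raw text, but the arithmetic works the same way.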
Word embeddings and Shared representations
Deep Visual-Semantic Alignments for Generating Image Descriptions
Word embeddings and Recurrent Neural Nets
Word embeddings and Reversible Sentence Representation
Rick Rashid in Tianjin, October 25, 2012
Telenor Norway Network topology
Word embeddings applied to Network operations
Use cases:
• Predict failures of Network components.
• Predict congestion levels on Network links.
• Detect malfunctioning devices.
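One hypothetical way the "malfunctioning devices" case could look: treat each device's recent event codes as words, average their (assumed pre-trained) embeddings into a device vector, and flag devices far from the fleet centroid. Event names, embedding values, and the threshold below are all invented for illustration:

```python
import math

# Assumed to come from embeddings trained on network event logs.
event_emb = {
    "link_up":   [0.9, 0.1],
    "link_down": [0.1, 0.9],
    "heartbeat": [0.8, 0.2],
}

# Represent a device by the mean embedding of its recent events.
def device_vector(events):
    vecs = [event_emb[e] for e in events]
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(2)]

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

devices = {
    "router_1": ["heartbeat", "link_up", "heartbeat"],
    "router_2": ["heartbeat", "heartbeat", "link_up"],
    "router_3": ["link_down", "link_down", "link_down"],  # flapping link
}
vectors = {d: device_vector(es) for d, es in devices.items()}
centroid = [sum(v[i] for v in vectors.values()) / len(vectors)
            for i in range(2)]
flagged = [d for d, v in vectors.items() if dist(v, centroid) > 0.5]
print(flagged)
```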
DL@TRD - Motivations
Personal observations:
– DL is hot (hyped?)
– DL supremacy seems ineluctable
– DL can solve a whole bunch of problems
– DL is frontier technology (difficult)
– Little DL competence @ Telenor Research
Personal implications:
– Career development
– Network with partners to get momentum
– Great if this happens in Trondheim
DL@TRD - Vision
Establish a strong DL competence center in Trondheim
– A place where
• competence is gathered
• experiences are exchanged
• collaborations are fostered
– Benefits
• Share passion with others near you
• Get momentum for your work
• Funding (SFI, EU money)
– Ideally
• Collaborate across companies on problems
• Common publications
Next workshop: 27th March
DL@Telenor – Topics of Interest
NLP tasks
– Speech-to-Text
– Text-to-Speech
– Automatic summarization
– Sentiment analysis
Computer Vision
– Face detection
– Image recognition/classification
Telenor Applications
– New Digital Services
– Managing our Networks
– Understanding our Customers
Stuff we could discuss at DL@TRD
• Training Recurrent Neural Networks
• Long Short-Term Memory Networks
• Echo State Networks
• Neural Turing Machines
• Hopfield Nets
• Restricted Boltzmann Machines
• Deep Belief Networks
• Teacher – Student Nets
• Momentum
• Dropout
• Full Bayesian learning
• Hessian free optimization
• Stuff I don't know I don't know
Conclusion & Forecast
• DL techniques can be applied to all sorts of data:
– Could you apply some of these techniques to your data?
• DL models are better than humans at some tasks if fed with enough data & trained properly
• Within 5-10 years, “information work” tasks will be augmented or even fully automated
– See Peter Norvig's talk at InfoQ: Machine Learning for Programming
– Models can make decisions based on millions of records while removing human biases
Big data + Deep Learning = unemployment
– New policies and economic measures will be needed to manage the adverse effects of job computerization
– Schooling will need reforms: a shift from routine tasks to non-routine tasks