Deep Learning Big Data Meetup @ Trondheim
TRANSCRIPT
Outline
• Intro to DL (A. Ng)
• Intro to Neural Nets
• Training NN
• Conv Nets
• Autoencoders
• Word Embeddings
• DL@TRD
• Bonus
AI was the weak link until Deep Learning matured
China's Search Giant Goes Deep
http://www.iro.umontreal.ca/~bengioy/dlbook/intro.html
Loose inspiration from the brain
Large Neural Nets perform better than small ones
Google Brain project – 1 billion connections – 1 week of YouTube watching.
From 16k CPUs to 3 GPUs; from 1M connections to 10B.
Applications of Deep Learning
Voice interfaces to assist computer-illiterate users
Image-search for impossible queries
Image queries to find things that are impossible to describe in words
“Whoever wins AI wins the Internet.” A. Ng.
Google, Facebook and other tech companies race to develop artificial intelligence
Perceptrons make decisions by weighing evidence
http://neuralnetworksanddeeplearning.com/chap1.html
Example: NAND gate
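The NAND example can be sketched in a few lines of plain Python: a perceptron fires when the weighted sum of its inputs plus a bias is positive. The weights −2, −2 and bias 3 follow Nielsen's chapter; the function names are mine.

```python
# A perceptron outputs 1 if the weighted sum of its inputs plus a bias
# is positive, and 0 otherwise.

def perceptron(inputs, weights, bias):
    s = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if s > 0 else 0

# With weights -2, -2 and bias 3, the perceptron computes NAND.
def nand(x1, x2):
    return perceptron([x1, x2], weights=[-2, -2], bias=3)

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "->", nand(x1, x2))
```

Since NAND is universal for boolean logic, a network of such perceptrons can in principle compute any logical function.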
Wiring several perceptrons for more abstract and complex decisions
A simple network to classify handwritten digits (MNIST)
Learning to solve a problem
Forward and Backward passes
http://caffe.berkeleyvision.org/tutorial/forward_backward.html
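A minimal numeric illustration of the two passes, assuming a single sigmoid neuron with squared-error loss (values are invented; frameworks like Caffe do the same thing layer by layer):

```python
import math

# Forward pass: compute the neuron's output and the loss.
# Backward pass: apply the chain rule to get dLoss/dw.

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(w, x, y):
    z = w * x
    a = sigmoid(z)
    loss = 0.5 * (a - y) ** 2
    return z, a, loss

def backward(w, x, y):
    _, a, _ = forward(w, x, y)
    dloss_da = a - y        # derivative of 0.5*(a-y)^2 w.r.t. a
    da_dz = a * (1 - a)     # sigmoid'(z)
    dz_dw = x
    return dloss_da * da_dz * dz_dw

w, x, y = 0.5, 1.5, 1.0
grad = backward(w, x, y)

# Sanity check against a numerical gradient:
eps = 1e-6
num = (forward(w + eps, x, y)[2] - forward(w - eps, x, y)[2]) / (2 * eps)
print(grad, num)
```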
The Unstable Gradient Problem
Why it is difficult to train an RNN
Why are deep neural networks hard to train?
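The instability is easy to see numerically: the gradient reaching an early layer is a product of per-layer factors w · σ′(z), and since σ′(z) ≤ 0.25 for the sigmoid, the product vanishes for small weights and explodes for large ones. A toy calculation (weight values chosen for illustration):

```python
# Gradient magnitude at depth k behaves like (w * sigmoid'(z))^k.
# sigmoid'(z) is at most 0.25, so with weight 1 the factor is <= 0.25
# per layer (vanishing); with weight 5 it is 1.25 per layer (exploding).

max_sigmoid_deriv = 0.25
for depth in (1, 5, 10, 20):
    small = (1.0 * max_sigmoid_deriv) ** depth   # weight = 1
    large = (5.0 * max_sigmoid_deriv) ** depth   # weight = 5
    print(depth, small, large)
```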
Practical advice for training neural networks (by Ilya Sutskever)
• Get good data
• Preprocessing
• Minibatches
• Gradient normalization
• Learning rate schedule
• Learning rate
• Weight Initialization
• Data augmentation
• Dropout
• Ensembling
A Brief Overview of Deep Learning
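As one concrete example of the "gradient normalization" item above, gradients are often clipped by their global norm; a minimal sketch (the threshold of 5.0 is an illustrative choice, not Sutskever's prescription):

```python
import math

# If the gradient's global L2 norm exceeds max_norm, rescale the whole
# gradient so its norm equals max_norm; otherwise leave it unchanged.
def clip_by_global_norm(grads, max_norm=5.0):
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm:
        return grads
    scale = max_norm / norm
    return [g * scale for g in grads]

print(clip_by_global_norm([3.0, 4.0]))    # norm 5, unchanged
print(clip_by_global_norm([30.0, 40.0]))  # norm 50, rescaled to norm 5
```

Clipping keeps the update direction while bounding the step size, which is what makes recurrent nets trainable despite occasional exploding gradients.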
Convolutional Neural Networks have been around for a while
Convolutions
Understanding Convolutions
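A convolution slides a small learned kernel across the image and takes a weighted sum at every position, with the same weights shared everywhere. A minimal pure-Python "valid" convolution with a hand-picked vertical-edge kernel (all values illustrative):

```python
# "Valid" 2D convolution: no padding, output shrinks by kernel size - 1.
def conv2d(image, kernel):
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(ih - kh + 1):
        row = []
        for j in range(iw - kw + 1):
            acc = 0.0
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        out.append(row)
    return out

# A 4x4 image that is dark on the left, bright on the right, and a
# kernel that responds where brightness increases left-to-right:
image = [[0, 0, 1, 1]] * 4
kernel = [[-1, 1], [-1, 1]]
print(conv2d(image, kernel))  # strongest response at the vertical edge
```

In a conv net the kernel entries are learned rather than hand-picked, and many kernels run in parallel to form feature maps.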
Convolutional Neural Network
Conv Nets: A Modular Perspective
Human-level control through deep reinforcement learning
Intriguing properties of Conv Nets
Intriguing properties of neural networks
Stacked Autoencoders
Reducing the Dimensionality of Data with Neural Networks
Stacked Autoencoders – Semantic Hashing
Semantic Hashing
Reducing the Dimensionality of Data with Neural Networks
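Semantic hashing can be sketched as: threshold the autoencoder's real-valued code units into bits, then retrieve similar items by Hamming distance on those short codes. The code values and document names below are invented for illustration:

```python
# Turn a real-valued autoencoder code into a binary code.
def to_bits(code, threshold=0.5):
    return tuple(1 if c > threshold else 0 for c in code)

# Number of differing bits between two codes.
def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

query = to_bits([0.011, 0.98, 0.2, 0.7])
database = {
    "doc_a": to_bits([0.1, 0.9, 0.3, 0.8]),
    "doc_b": to_bits([0.9, 0.1, 0.8, 0.2]),
}
best = min(database, key=lambda k: hamming(query, database[k]))
print(query, "->", best)
```

Because the codes are short bit strings, nearby items can be found by flipping a few bits of the query address, which makes retrieval extremely fast.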
Behavioral micro-segmentation (training set)
[Figure: stacked autoencoder on behavioral data – symmetric layer sizes 1008 / 275 / 150 / 8 / 150 / 275 / 1008; real-valued code activations (0.011 | 0.98 | 0.2 | … ) are thresholded into a short bit code (1 0 …)]
Word embeddings
Deep Learning, NLP, and Representations
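The idea can be sketched with toy vectors: related words sit near each other, and some relations become vector arithmetic (the classic king − man + woman ≈ queen). The 2-D vectors below are hand-made for illustration, not trained embeddings:

```python
import math

# Hand-made toy embeddings: first axis ~ "royalty", second axis ~ "male".
emb = {
    "king":  [0.9, 0.9],
    "queen": [0.9, 0.1],
    "man":   [0.1, 0.9],
    "woman": [0.1, 0.1],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# king - man + woman, then find the nearest word by cosine similarity:
target = [k - m + w for k, m, w in zip(emb["king"], emb["man"], emb["woman"])]
best = max((w for w in emb if w != "king"),
           key=lambda w: cosine(emb[w], target))
print(best)
```

Real embeddings live in hundreds of dimensions and are learned from raw text, but the arithmetic works the same way.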
Word embeddings and Shared representations
Deep Visual-Semantic Alignments for Generating Image Descriptions
Word embeddings and Recurrent Neural Nets
Word embeddings and Reversible Sentence Representation
Rick Rashid in Tianjin, October 25, 2012
Telenor Norway Network topology
Word embeddings applied to Network operations
Use cases:
• Predict failures of Network components.
• Predict congestion levels on Network links.
• Detect malfunctioning devices.
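One hypothetical way the "malfunctioning devices" case could look: treat each device's recent event codes as words, average their (assumed pre-trained) embeddings into a device vector, and flag devices far from the fleet centroid. Event names, embedding values, and the threshold below are all invented for illustration:

```python
import math

# Assumed to come from embeddings trained on network event logs.
event_emb = {
    "link_up":   [0.9, 0.1],
    "link_down": [0.1, 0.9],
    "heartbeat": [0.8, 0.2],
}

# Represent a device by the mean embedding of its recent events.
def device_vector(events):
    vecs = [event_emb[e] for e in events]
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(2)]

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

devices = {
    "router_1": ["heartbeat", "link_up", "heartbeat"],
    "router_2": ["heartbeat", "heartbeat", "link_up"],
    "router_3": ["link_down", "link_down", "link_down"],  # flapping link
}
vectors = {d: device_vector(es) for d, es in devices.items()}
centroid = [sum(v[i] for v in vectors.values()) / len(vectors)
            for i in range(2)]
flagged = [d for d, v in vectors.items() if dist(v, centroid) > 0.5]
print(flagged)
```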
DL@TRD - Motivations
Personal observations:
– DL is hot (hyped?)
– DL supremacy seems ineluctable
– DL can solve a whole bunch of problems
– DL is frontier technology (difficult)
– Little DL competence @ Telenor Research
Personal implications:
– Career development
– Network with partners to get momentum
– Great if this happens in Trondheim
DL@TRD - Vision
Establish a strong DL competence center in Trondheim
– A place where
• competence is gathered
• experiences are exchanged
• collaborations are fostered
– Benefits
• Share passion with others near you
• Get momentum for your work
• Funding (SFI, EU money)
– Ideally
• Collaborate across companies on problems
• Common publications
Next workshop: 27th March
DL@Telenor – Topics of Interest
NLP tasks
– Speech-to-Text
– Text-to-Speech
– Automatic summarization
– Sentiment analysis
Computer Vision
– Face detection
– Image recognition/classification
Telenor Applications
– New Digital Services
– Managing our Networks
– Understanding our Customers
Stuff we could discuss at DL@TRD
• Training Recurrent Neural Networks
• Long Short-Term Memory Networks
• Echo State Networks
• Neural Turing Machines
• Hopfield Nets
• Restricted Boltzmann Machines
• Deep Belief Networks
• Teacher – Student Nets
• Momentum
• Dropout
• Full Bayesian learning
• Hessian free optimization
• Stuff I don't know I don't know
Conclusion & Forecast
• DL techniques can be applied to all sorts of data:
– Could you apply some of these techniques to your data?
• DL models are better than humans at some tasks if fed with enough data & trained properly
• Within 5-10 years, “information work” tasks will be augmented or even fully automated
– See Peter Norvig's talk at InfoQ: Machine Learning for Programming
– Models can make decisions based on millions of records while removing human biases
Big data + Deep Learning = unemployment
– New policies and economic measures will be needed to manage the adverse effects of job computerization
– Schooling will need reforms: a shift from routine tasks to non-routine tasks