Deep Learning Big Data Meetup @ Trondheim

Cyril Banino-Rokkones, Telenor Research

Posted 15-Jul-2015

TRANSCRIPT

Cyril Banino-Rokkones, Telenor Research


I know nothing about Deep Learning


Outline

• Intro to DL (A. Ng)

• Intro to Neural Nets

• Training NN

• Conv Nets

• Autoencoders

• Word Embeddings

• DL@TRD

• Bonus


AI was the weak link until Deep Learning matured


China's Search Giant Goes Deep



http://www.iro.umontreal.ca/~bengioy/dlbook/intro.html

Loose inspiration from the brain



Large Neural Nets perform better than small ones



Google Brain project: 1 billion connections, trained on a week of YouTube video.


From 16k CPUs to 3 GPUs; from 1M connections to 10B connections



Applications of Deep Learning



Voice interfaces to assist computer-illiterate users



Image-search for impossible queries





Image queries to find things that are impossible to describe in words




“Whoever wins AI wins the Internet.” A. Ng.

Google, Facebook and other tech companies race to develop artificial intelligence


Perceptrons make decisions by weighing evidence

http://neuralnetworksanddeeplearning.com/chap1.html

Example: NAND gate

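The NAND example above can be run directly: a perceptron fires when its weighted input plus bias is positive. The weights (-2, -2) and bias 3 follow the values used in Nielsen's chapter 1.

```python
# A perceptron computing NAND: output 1 unless both inputs are 1.
def perceptron(x1, x2, w1=-2, w2=-2, b=3):
    """Fire (1) if the weighted sum plus bias is positive, else 0."""
    return 1 if w1 * x1 + w2 * x2 + b > 0 else 0

# Print the NAND truth table.
for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "->", perceptron(x1, x2))
```

Because NAND is universal for logic, networks of such units can in principle compute any function, which motivates the "wiring several perceptrons" slide that follows.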

Wiring several perceptrons for more abstract and complex decisions


A simple network to classify handwritten digits (MNIST)

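A forward pass through such a classifier is just repeated weighted sums followed by a squashing function. A pure-Python sketch with made-up toy sizes (4 -> 3 -> 2) standing in for the MNIST network's 784 -> 30 -> 10:

```python
import math
import random

random.seed(0)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weights, biases):
    """One fully connected layer: weighted sum per neuron, then sigmoid."""
    return [sigmoid(sum(w * x for w, x in zip(ws, inputs)) + b)
            for ws, b in zip(weights, biases)]

# Toy sizes instead of MNIST's 784 inputs, 30 hidden, 10 outputs.
n_in, n_hidden, n_out = 4, 3, 2
W1 = [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
b1 = [random.uniform(-1, 1) for _ in range(n_hidden)]
W2 = [[random.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_out)]
b2 = [random.uniform(-1, 1) for _ in range(n_out)]

x = [0.0, 1.0, 0.5, 0.25]      # stand-in for pixel intensities in [0, 1]
hidden = layer(x, W1, b1)
output = layer(hidden, W2, b2)
print(output)                  # one activation in (0, 1) per output class
```

With random weights the outputs are meaningless; training (next section) adjusts the weights so the right output neuron fires for each digit.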


Training Neural Networks: Gradient descent


Learning to solve a problem


Forward and Backward passes

http://caffe.berkeleyvision.org/tutorial/forward_backward.html
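The two passes can be shown on the smallest possible case: one sigmoid neuron with a squared-error loss, hand-picked toy numbers, and a finite-difference check that the backward pass computes the right gradient.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# One sigmoid neuron, squared-error loss; toy values for illustration.
w, b, x, y = 0.6, 0.9, 1.0, 0.0

def loss(w, b):
    a = sigmoid(w * x + b)          # forward pass
    return 0.5 * (a - y) ** 2

# Backward pass via the chain rule: dL/dw = (a - y) * sigmoid'(z) * x.
a = sigmoid(w * x + b)
grad_w = (a - y) * a * (1 - a) * x
grad_b = (a - y) * a * (1 - a)

# Numerical check: backprop should match finite differences.
eps = 1e-6
num_w = (loss(w + eps, b) - loss(w - eps, b)) / (2 * eps)
print(grad_w, num_w)                # should agree to ~6 decimals
```

Frameworks like Caffe do exactly this, layer by layer, caching forward activations so the backward pass can reuse them.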

The Unstable Gradient Problem


Why it is difficult to train an RNN

Why are deep neural networks hard to train?
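The instability has a back-of-envelope explanation: backpropagating through a chain of sigmoid layers multiplies the gradient by a factor of roughly w * sigmoid'(z) per layer, and sigmoid' never exceeds 0.25, so with modest weights the gradient shrinks geometrically toward the early layers. A sketch (weights of 1 assumed for simplicity):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dsigmoid(z):
    s = sigmoid(z)
    return s * (1 - s)

# Gradient flowing back through a 10-layer chain of sigmoid units,
# taking the best case sigmoid'(0) = 0.25 at every layer.
w = 1.0
grad = 1.0
grads = []
for _ in range(10):
    grad *= w * dsigmoid(0.0)
    grads.append(grad)

print(grads[0], grads[-1])   # 0.25 at the last layer vs 0.25**10 ~ 1e-6 at the first
```

Large weights flip the problem into exploding gradients instead, which is the RNN difficulty the cited paper analyzes.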

Practical advice for training neural networks (by Ilya Sutskever)


• Get good data

• Preprocessing

• Minibatches

• Gradient normalization

• Learning rate schedule

• Learning rate

• Weight Initialization

• Data augmentation

• Dropout

• Ensembling

A Brief Overview of Deep Learning
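Three items from the list (minibatches, gradient clipping, a learning-rate schedule) can be seen together on a toy 1-D least-squares problem; the data, constants, and schedule below are made up purely for illustration.

```python
import random

random.seed(1)

# Toy data: y = 3x plus noise; we fit a single weight w by minibatch SGD.
data = [(i / 100.0, 3.0 * i / 100.0 + random.gauss(0, 0.1)) for i in range(100)]

w, lr0, clip = 0.0, 0.5, 1.0
for epoch in range(20):
    lr = lr0 / (1 + 0.1 * epoch)               # learning-rate schedule
    random.shuffle(data)
    for i in range(0, len(data), 10):          # minibatches of 10
        batch = data[i:i + 10]
        grad = sum(2 * (w * x - y) * x for x, y in batch) / len(batch)
        grad = max(-clip, min(clip, grad))     # gradient clipping
        w -= lr * grad

print(w)   # close to the true slope 3.0
```

Clipping tames the large early gradients, minibatches give cheap noisy gradient estimates, and the decaying rate lets the estimate settle.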


Convolutions

Understanding Convolutions
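The operation itself is simple: slide a small kernel over the image and take a weighted sum at each position. A minimal "valid" 2-D convolution, with a hand-made vertical-edge kernel on a made-up 4x4 image:

```python
def conv2d(image, kernel):
    """'Valid' 2-D convolution: weighted sum of each kernel-sized window."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + di][j + dj] * kernel[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]

# Tiny image: dark left half, bright right half; kernel detects vertical edges.
image = [[0, 0, 1, 1]] * 4
kernel = [[-1, 1],
          [-1, 1]]
print(conv2d(image, kernel))   # strong response only where the edge sits
```

A conv layer is just many such kernels with learned weights, applied at every position, so the same feature detector is reused across the whole image.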

Convolutional Neural Network

Conv Nets: A Modular Perspective

Intriguing properties of Conv Nets


Intriguing properties of neural networks


Stacked Autoencoders

Reducing the Dimensionality of Data with Neural Networks

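The idea in miniature: squeeze the input through a narrow bottleneck and train the network to reconstruct its own input. A single linear autoencoder layer (2 -> 1 -> 2) on made-up data lying on the line y = 2x; real stacked autoencoders are much deeper and nonlinear, but the training signal is the same reconstruction error.

```python
import random

random.seed(0)

# Data on the line y = 2x: one bottleneck unit suffices to encode each point.
data = [(x, 2 * x) for x in [random.uniform(-1, 1) for _ in range(200)]]

# Encoder weights (w1, w2) and decoder weights (v1, v2); no biases.
w1, w2, v1, v2 = 0.5, -0.3, 0.2, 0.8
lr = 0.05
for _ in range(300):
    for x, y in data:
        h = w1 * x + w2 * y              # encode to a single number
        rx, ry = v1 * h, v2 * h          # decode back to two numbers
        ex, ey = rx - x, ry - y          # reconstruction errors
        gh = ex * v1 + ey * v2           # gradient w.r.t. the code h
        v1 -= lr * ex * h                # gradients of squared error
        v2 -= lr * ey * h
        w1 -= lr * gh * x
        w2 -= lr * gh * y

x, y = 0.5, 1.0
h = w1 * x + w2 * y
print(v1 * h, v2 * h)   # reconstruction close to (0.5, 1.0)
```

A near-perfect reconstruction shows the 1-D code kept the information; the Hinton & Salakhutdinov paper does this with deep stacks and thousands of units.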

Behavioral micro-segmentation (training set)

[Figure: stacked autoencoder with layer sizes 1008-275-150-8-150-275-1008. The 8-unit bottleneck activations (e.g. 0.011 | 0.98 | 0.2 | ... | 0) are thresholded to a bit code (1 0 ...).]



Word embeddings

Deep Learning, NLP, and Representations
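The appeal of embeddings is that word relationships become vector arithmetic. Hand-made 4-dimensional vectors below, purely illustrative (real embeddings such as word2vec are learned from text and have hundreds of dimensions), showing the classic king - man + woman ~ queen analogy via cosine similarity:

```python
import math

# Hand-crafted toy vectors; dimensions loosely encode royalty/gender/etc.
vec = {
    "king":  [0.9, 0.8, 0.1, 0.3],
    "queen": [0.9, 0.1, 0.8, 0.3],
    "man":   [0.2, 0.9, 0.1, 0.1],
    "woman": [0.2, 0.1, 0.9, 0.1],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# king - man + woman: find the nearest word (excluding the query word).
target = [k - m + w for k, m, w in zip(vec["king"], vec["man"], vec["woman"])]
best = max((word for word in vec if word != "king"),
           key=lambda word: cosine(target, vec[word]))
print(best)
```

The same trick generalizes to the shared representations on the next slides: anything embedded in the same space (words, images, network events) can be compared by distance.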


Word embeddings and Shared representations

Deep Learning, NLP, and Representations; Deep Visual-Semantic Alignments for Generating Image Descriptions


Word embeddings and Recurrent Neural Nets

Deep Learning, NLP, and Representations

Telenor Norway Network topology

Word embeddings applied to Network operations

Use cases:

• Predict failures of Network components.

• Predict congestion levels on Network links.

• Detect malfunctioning devices.


DL@TRD - Motivations

Personal observations:

– DL is hot (hyped?)

– DL supremacy seems ineluctable

– DL can solve a whole bunch of problems

– DL is frontier technology (difficult)

– Little DL competence @ Telenor Research

Personal implications:

– Career development

– Network with partners to get momentum

– Great if this happens in Trondheim


DL@TRD - Vision

Establish a strong DL competence center in Trondheim

– A place where

• competence is gathered

• experiences are exchanged

• collaborations are fostered

– Benefits

• Share passion with others near you

• Get momentum for your work

• Funding (SFI, EU money)

– Ideally

• Collaborate across companies on problems

• Common publications

Next workshop: 27th March

DL@Telenor – Topics of Interest

NLP tasks

– Speech-to-Text

– Text-to-Speech

– Automatic summarization

– Sentiment analysis

Computer Vision

– Face detection

– Image recognition/classification

Telenor Applications

– New Digital Services

– Managing our Networks

– Understanding our Customers


Stuff we could discuss at DL@TRD

• Training Recurrent Neural Networks

• Long Short Term Memory Networks

• Echo State Networks

• Neural Turing Machines

• Hopfield Nets

• Restricted Boltzmann Machines

• Deep Belief Networks

• Teacher – Student Nets

• Momentum

• Dropout

• Full Bayesian learning

• Hessian-free optimization

• Stuff I don't know I don't know


Conclusion & Forecast


• DL techniques can be applied to all sorts of data:

– Could you apply some of these techniques to your data?

• DL models are better than humans at some tasks if fed with enough data & trained properly

• Within 5-10 years, “information work” tasks will be augmented or even fully automated

– See Peter Norvig's talk at InfoQ: Machine Learning for Programming

– Models can take decisions based on millions of records while removing human biases

• Big data + Deep Learning = unemployment

– New policies and economic measures will be needed to manage the adverse effects of job computerization

– Schooling will need reforms: a shift from routine tasks to non-routine tasks

Thank you


btw we're hiring…