
Machine Learning and Having It Deep and Structured

Introduction

Outline

Some tasks are very complex
• You know how to write programs.
• One day, you are asked to write a program for speech recognition.

[Figure: several waveforms of the phrase 你好 ("Hello"); find the common patterns among the waveforms]

It seems impossible to write a program for speech recognition.

You quickly get lost in the exceptions and special cases.

Let the machine learn by itself

[Figure: a large amount of audio data with transcriptions such as 你好 ("Hello"), 大家好 ("Hello everyone"), and 人帥真好 ("It's good to be handsome"); the machine outputs: You said "你好"]

You only have to write the program for learning; the machine learns how to do speech recognition by itself.

Learning ≈ Looking for a Function
• Speech recognition: f(audio) = "你好"
• Handwriting recognition: f(image) = "2"
• Weather forecast: f(weather today) = "sunny tomorrow"
• Playing video games: f(positions and number of enemies) = "jump"
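To make the function view concrete, here is a minimal sketch of each task as a function from an input domain to an output domain. The type signatures and names are made up for illustration; they do not come from the lecture.

```python
# Learning = looking for a function. Each task below is just a function from
# an input domain to an output domain; the bodies are left empty because
# "learning" is precisely the process of filling them in from data.

def speech_recognition(audio: list[float]) -> str:          # waveform -> "你好"
    ...

def handwriting_recognition(image: list[list[int]]) -> str:  # pixels -> "2"
    ...

def weather_forecast(weather_today: dict) -> str:            # observations -> "sunny tomorrow"
    ...

def game_policy(enemy_positions: list[tuple]) -> str:        # game state -> "jump"
    ...
```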

Types of Learning
• Supervised Learning
• Reinforcement Learning
• Unsupervised Learning

Supervised Learning

Training: pick the best function f*
• Training data: {(x¹, ŷ¹), (x², ŷ²), …}
  x: function input, ŷ: function output (label)
  e.g. x: audio of 你好 → ŷ: "你好"; x: an image of a handwritten digit → ŷ: "2" (label); x: audio of 大家好 → ŷ: "大家好"
• Model (hypothesis function set): {f₁, f₂, …}
• Pick the "best" function f* from the hypothesis set

Testing: y = f*(x)
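As a toy sketch of the three ingredients above (labeled training data, a hypothesis function set, and picking the best function), the following code enumerates a tiny made-up hypothesis set and keeps the function with the lowest training error:

```python
# A toy sketch of supervised learning: pick the best function f* from a
# hypothesis set using labeled training pairs. The hypothesis set and the
# data are invented for illustration.

training_data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]  # pairs (x, y_hat)

# Model: a (tiny) hypothesis function set {f1, f2, f3}
hypothesis_set = [
    lambda x: x + 1,      # f1
    lambda x: 2 * x + 1,  # f2
    lambda x: 3 * x,      # f3
]

def total_error(f):
    """Sum of squared errors of f on the training data."""
    return sum((f(x) - y_hat) ** 2 for x, y_hat in training_data)

# Training: pick the "best" function f* (lowest training error)
f_star = min(hypothesis_set, key=total_error)

# Testing: y = f*(x) on a new input
print(f_star(3.0))  # -> 7.0 (f2 fits the training data exactly)
```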

Reinforcement Learning

Training: pick the best function f*
• Training data: {x¹, x², …} (no labels; we only know how good f(x) is)
• Model (hypothesis function set): {f₁, f₂, …}

Example: Dialogue System
• x: "How are you?", f₁(x) = "Good bye" → Bad!
• x: "How are you?", f₂(x) = "Fine" → Good!

Testing: y' = f*(x'), e.g. x' = "hello" → Machine: y' = "hi"
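A toy sketch of this setup, with the two candidate dialogue functions from the slide: there are no labels, only a scalar reward telling us how good each reply was. The reward function here is a made-up stand-in for real user feedback.

```python
# Reinforcement-learning flavor: no labeled outputs, only a reward signal
# telling us how good f(x) was.

def f1(x):  # candidate dialogue function 1
    return "Good bye"

def f2(x):  # candidate dialogue function 2
    return "Fine"

def user_feedback(x, y):
    """Hypothetical reward: +1 if the user likes the reply, -1 otherwise."""
    return 1.0 if (x, y) == ("How are you?", "Fine") else -1.0

hypothesis_set = [f1, f2]
x = "How are you?"

# "Training": try each function, keep the one with the highest reward.
f_star = max(hypothesis_set, key=lambda f: user_feedback(x, f(x)))

print(f_star("How are you?"))  # -> "Fine"
```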

Unsupervised Learning

• Training data: {x¹, x², …} (no labels)
• e.g. lots of audio without text annotation

What can I do with these data?
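One common answer (my example, not the lecture's) is clustering: group the unlabeled examples so that similar ones end up together. A minimal sketch with scikit-learn's k-means on made-up 2-D points:

```python
# Clustering as one answer to "what can I do with unlabeled data?"
import numpy as np
from sklearn.cluster import KMeans

# Unlabeled training data {x1, x2, ...}
X = np.array([[0.1, 0.2], [0.0, 0.1], [5.0, 5.1], [5.2, 4.9]])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)  # e.g. [0 0 1 1]: two groups found without any labels
```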

Outline

Inspired by the human brain

Human Brains are Deep

A Neuron for Machine

Each neuron is a function: given inputs x₁, x₂, …, x_N, weights w₁, w₂, …, w_N, and bias b,

z = w₁x₁ + w₂x₂ + … + w_N x_N + b
a = σ(z) = 1 / (1 + e^(−z))

where σ is the activation function; here it is the sigmoid function.
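The two formulas above translate directly into code. A single artificial neuron:

```python
# A single artificial neuron, following the formulas above.
import math

def neuron(x, w, b):
    """Compute a = sigmoid(z), where z = w1*x1 + ... + wN*xN + b."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))  # sigmoid activation

print(neuron(x=[1.0, 2.0], w=[0.5, -0.3], b=0.1))  # one scalar output a
```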

Deep Learning

• Neural Network: cascading the neurons

[Figure: inputs x₁, x₂, …, x_N feed into hidden Layer 1, Layer 2, …, Layer L, producing outputs y₁, y₂, …, y_M]

The whole network is a function f: R^N → R^M.
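A sketch of "cascading the neurons": a forward pass through several fully connected sigmoid layers. The layer sizes and random weights are placeholders for illustration.

```python
# Forward pass of a small network: the whole thing is a function R^N -> R^M.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
N, M = 4, 2                  # input / output dimensions
layer_sizes = [N, 5, 5, M]   # input, two hidden layers, output

# One (W, b) pair per layer, randomly initialized
params = [(rng.normal(size=(m, n)), rng.normal(size=m))
          for n, m in zip(layer_sizes[:-1], layer_sizes[1:])]

def network(x):
    """f: R^N -> R^M, built by cascading sigmoid neurons layer by layer."""
    a = x
    for W, b in params:
        a = sigmoid(W @ a + b)  # each layer: a' = sigmoid(W a + b)
    return a

print(network(np.ones(N)))      # a vector in R^M
```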

Deep Learning

Universality Theorem: any continuous function f: R^N → R^M can be realized by a network with one hidden layer (given enough hidden neurons).

Reference: http://neuralnetworksanddeeplearning.com/chap4.html
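A small empirical illustration of the theorem (a sketch, not a proof): fit a one-hidden-layer network to a continuous function, here sin(x). The target function and hyperparameters are my choices, not the lecture's.

```python
# One hidden layer approximating a continuous function (here sin).
import numpy as np
from sklearn.neural_network import MLPRegressor

X = np.linspace(-3, 3, 500).reshape(-1, 1)
y = np.sin(X).ravel()

# One hidden layer; more hidden neurons -> better approximation.
net = MLPRegressor(hidden_layer_sizes=(100,), activation="tanh",
                   max_iter=5000, random_state=0).fit(X, y)

print(net.predict([[1.0]]), np.sin(1.0))  # the two values should be close
```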

Popular and Powerful

• Speech Recognition (TIMIT): deep neural networks on TIMIT usually used 4 to 8 layers.

Three misunderstandings about Deep Learning

1. Deep learning works because the model is more "complex"

The claim: deep is simply more complex, so deep works better simply because it uses more parameters.

[Figure: a shallow network and a deep network over the same inputs x₁, x₂, …, x_N]

Fat + Short vs. Thin + Tall

[Figure: a shallow network (one wide hidden layer) and a deep network (several narrow hidden layers) over the same inputs x₁, x₂, …, x_N]

With the same number of parameters, which one is better?

Deep Learning - Why? Toy Example

Target: a function f: R² → {0, 1}

[Figure: points (x, y) in the plane, each labeled 0 or 1]

Sample 100,000 points as training data.

Deep Learning - Why? Toy Example

• 1 hidden layer: 125 neurons / 500 neurons / 2500 neurons
• 3 hidden layers: how many neurons in each hidden layer?
  (a) 100~200  (b) 50~100  (c) 25~50  (d) fewer than 25
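A sketch of how such an experiment could be run: compare a wide one-hidden-layer network against a narrower three-hidden-layer network with roughly the same number of parameters. The target function, layer sizes, and use of scikit-learn are my assumptions, not the lecture's exact setup.

```python
# Shallow-wide vs. deep-narrow on 2-D data labeled {0, 1}.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(10_000, 2))                       # toy inputs
y = (np.sin(4 * X[:, 0]) * np.cos(4 * X[:, 1]) > 0).astype(int)  # f: R^2 -> {0,1}

# Roughly 2,000 parameters each: one wide layer vs. three narrow layers.
shallow = MLPClassifier(hidden_layer_sizes=(500,), max_iter=200, random_state=0)
deep = MLPClassifier(hidden_layer_sizes=(30, 30, 30), max_iter=200, random_state=0)

for name, net in [("1 hidden layer, 500 neurons", shallow),
                  ("3 hidden layers, 30 neurons each", deep)]:
    net.fit(X, y)
    print(name, "training accuracy:", net.score(X, y))
```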

Deep Learning - Why?

• Experiments on handwritten digit classification

Deeper: using fewer parameters to achieve the same performance.

Three misunderstandings about Deep Learning

2. When you are using deep learning, you need more training data.

Size of Training Data
• Different numbers of training examples: 100,000 / 50,000 / 20,000
• [Figure: results for a 1-hidden-layer network and a 3-hidden-layer network at each training-set size]

Size of Training Data

• Experiments on handwritten digit classification

Deeper: using less training data to achieve the same performance.

Three misunderstandings about Deep Learning

3. You can simply get the power of deep by cascading the neurons.

It is hard to get the full power of deep …

Q: Can I get all the power of deep from this course?
A: No. Researchers still do not fully understand the mysteries of deep learning.

Outline

In the real world ……

f: X → Y

X (input domain): sequence, graph structure, tree structure ……
Y (output domain): sequence, graph structure, tree structure ……

Retrieval

f: X → Y
X: a keyword, e.g. "Machine learning"
Y: a list of web pages (the search result)

Translation

f: X → Y
X: one kind of sequence, e.g. "Machine learning and having it deep and structured"
Y: another kind of sequence, e.g. "機器學習及其深層與結構化" (the course title in Chinese)

Speech Recognition

f: X → Y
X: one kind of sequence (audio)
Y: another kind of sequence, e.g. "大家好,歡迎大家來修機器學習及其深層與結構化" ("Hello everyone, welcome to Machine Learning and Having It Deep and Structured")

Speech Summarization

f: X → Y
X: recorded lectures
Y: a summary (select the most informative segments to form a compact version)

Object Detection

f: X → Y
X: an image
Y: object positions, e.g. bounding boxes labeled "Haruhi" and "Mikuru"

Image Segmentation

f: X → Y
X: an image
Y: the foreground region

Source of images: Nowozin, Sebastian, and Christoph H. Lampert. "Structured Learning and Prediction in Computer Vision." Foundations and Trends in Computer Graphics and Vision 6.3-4 (2011): p. 57.

Remote Image Ground Survey

f: X → Y

Source of images: Nowozin, Sebastian, and Christoph H. Lampert. "Structured Learning and Prediction in Computer Vision." Foundations and Trends in Computer Graphics and Vision 6.3-4 (2011): p. 146.

Pose Estimation

f: X → Y
X: an image
Y: a pose

Source of images: http://groups.inf.ed.ac.uk/calvin/Publications/eichner-techreport10.pdf

Structured Learning

• The tasks above were developed separately in the past.
• Recently, people have realized that there is a unified framework behind these approaches.
• Three steps: Evaluation, Inference, Learning (see the sketch below).
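A toy sketch of the three steps of this unified framework: Evaluation defines a score F(x, y) for how compatible an output y is with an input x; Inference finds y* = argmax over y of F(x, y); Learning picks F from training data. The features, candidate set, and data below are invented for illustration.

```python
# A toy instance of the structured-learning framework's three steps.

# Step 1 (Evaluation): F(x, y) scores how compatible a structured output y
# is with input x. Here F is a weighted sum of two toy features.
def F(x, y, w):
    similar_length = -abs(len(x) - len(y))                # feature 1
    shared_words = len(set(x.split()) & set(y.split()))   # feature 2
    return w[0] * similar_length + w[1] * shared_words

# Step 2 (Inference): y* = argmax_y F(x, y) over all candidate outputs.
def inference(x, candidates, w):
    return max(candidates, key=lambda y: F(x, y, w))

# Step 3 (Learning): choose F (here, its weights w) so that inference
# reproduces the labeled training pairs.
def learn(pairs, candidates):
    grid = [(0.0, 1.0), (1.0, 0.0)]  # tiny weight search
    return max(grid, key=lambda w: sum(
        inference(x, candidates, w) == y_hat for x, y_hat in pairs))

candidates = ["machine learning", "deep learning", "good bye"]
pairs = [("intro to machine learning", "machine learning"),
         ("why go deep", "deep learning")]
w = learn(pairs, candidates)
print(inference("learning to go deep", candidates, w))  # -> "deep learning"
```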

Concluding Remarks

Reference

• Deep Learning
  • Neural Networks and Deep Learning: http://neuralnetworksanddeeplearning.com/
  • For more information: http://deeplearning.net/
• Structured Learning
  • Structured Learning and Prediction in Computer Vision: http://www.nowozin.net/sebastian/papers/nowozin2011structured-tutorial.pdf
  • Linguistic Structure Prediction: http://www.cs.cmu.edu/afs/cs/Web/People/nasmith/LSP/PUBLISHED-frontmatter.pdf

Thank you!

Powerful

• Inspired by the human brain

[Figure: a deep network (inputs x₁, x₂, …, x_N through Layer 1, Layer 2, …, Layer L) alongside the human visual pathway: the retina feeds the visual cortex, which processes pixels → edges → primitive shapes → …, layer by layer]

Powerful

• Image Recognition

[Figure: visualizations of the features learned by the 1st, 2nd, and 3rd hidden layers of a convolutional network]

Reference: Zeiler, M. D., & Fergus, R. (2014). Visualizing and understanding convolutional networks. In Computer Vision - ECCV 2014 (pp. 818-833).
