
Machine Learning and Having It Deep and Structured

Introduction

Outline

Some tasks are very complex
• You know how to write programs.
• One day, you are asked to write a program for speech recognition.

[Figure: several waveforms of the phrase 你好 ("Hello"); find the common patterns among the waveforms]

It seems impossible to write a program for speech recognition.

You quickly get lost in the exceptions and special cases.

Let the machine learn by itself

[Figure: a large amount of audio data with transcriptions such as 你好 ("Hello"), 大家好 ("Hello everyone"), and 人帥真好 ("It's good to be handsome"); the machine outputs: You said "你好"]

You only have to write the program for learning; the machine learns how to do speech recognition by itself.

Learning ≈ Looking for a Function
• Speech recognition: f(audio) = "你好"
• Handwriting recognition: f(image) = "2"
• Weather forecast: f(weather today) = "sunny tomorrow"
• Playing video games: f(positions and number of enemies) = "jump"
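To make the function view concrete, here is a minimal sketch of each task as a function from an input domain to an output domain. The type signatures and names are made up for illustration; they do not come from the lecture.

```python
# Learning = looking for a function. Each task below is just a function from
# an input domain to an output domain; the bodies are left empty because
# "learning" is precisely the process of filling them in from data.

def speech_recognition(audio: list[float]) -> str:          # waveform -> "你好"
    ...

def handwriting_recognition(image: list[list[int]]) -> str:  # pixels -> "2"
    ...

def weather_forecast(weather_today: dict) -> str:            # observations -> "sunny tomorrow"
    ...

def game_policy(enemy_positions: list[tuple]) -> str:        # game state -> "jump"
    ...
```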

Types of Learning
• Supervised Learning
• Reinforcement Learning
• Unsupervised Learning

Supervised Learning

Training: pick the best function f*
• Training data: {(x¹, ŷ¹), (x², ŷ²), …}
  x: function input, ŷ: function output (label)
  e.g. x: audio of 你好 → ŷ: "你好"; x: an image of a handwritten digit → ŷ: "2" (label); x: audio of 大家好 → ŷ: "大家好"
• Model (hypothesis function set): {f₁, f₂, …}
• Pick the "best" function f* from the hypothesis set

Testing: y = f*(x)
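As a toy sketch of the three ingredients above (labeled training data, a hypothesis function set, and picking the best function), the following code enumerates a tiny made-up hypothesis set and keeps the function with the lowest training error:

```python
# A toy sketch of supervised learning: pick the best function f* from a
# hypothesis set using labeled training pairs. The hypothesis set and the
# data are invented for illustration.

training_data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]  # pairs (x, y_hat)

# Model: a (tiny) hypothesis function set {f1, f2, f3}
hypothesis_set = [
    lambda x: x + 1,      # f1
    lambda x: 2 * x + 1,  # f2
    lambda x: 3 * x,      # f3
]

def total_error(f):
    """Sum of squared errors of f on the training data."""
    return sum((f(x) - y_hat) ** 2 for x, y_hat in training_data)

# Training: pick the "best" function f* (lowest training error)
f_star = min(hypothesis_set, key=total_error)

# Testing: y = f*(x) on a new input
print(f_star(3.0))  # -> 7.0 (f2 fits the training data exactly)
```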

Reinforcement Learning

Training: pick the best function f*
• Training data: {x¹, x², …} (no labels; we only know how good f(x) is)
• Model (hypothesis function set): {f₁, f₂, …}

Example: Dialogue System
• x: "How are you?", f₁(x) = "Good bye" → Bad!
• x: "How are you?", f₂(x) = "Fine" → Good!

Testing: y' = f*(x'), e.g. x' = "hello" → Machine: y' = "hi"
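A toy sketch of this setup, with the two candidate dialogue functions from the slide: there are no labels, only a scalar reward telling us how good each reply was. The reward function here is a made-up stand-in for real user feedback.

```python
# Reinforcement-learning flavor: no labeled outputs, only a reward signal
# telling us how good f(x) was.

def f1(x):  # candidate dialogue function 1
    return "Good bye"

def f2(x):  # candidate dialogue function 2
    return "Fine"

def user_feedback(x, y):
    """Hypothetical reward: +1 if the user likes the reply, -1 otherwise."""
    return 1.0 if (x, y) == ("How are you?", "Fine") else -1.0

hypothesis_set = [f1, f2]
x = "How are you?"

# "Training": try each function, keep the one with the highest reward.
f_star = max(hypothesis_set, key=lambda f: user_feedback(x, f(x)))

print(f_star("How are you?"))  # -> "Fine"
```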

Unsupervised Learning

• Training data: {x¹, x², …} (no labels)
• e.g. lots of audio without text annotation

What can I do with these data?
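One common answer (my example, not the lecture's) is clustering: group the unlabeled examples so that similar ones end up together. A minimal sketch with scikit-learn's k-means on made-up 2-D points:

```python
# Clustering as one answer to "what can I do with unlabeled data?"
import numpy as np
from sklearn.cluster import KMeans

# Unlabeled training data {x1, x2, ...}
X = np.array([[0.1, 0.2], [0.0, 0.1], [5.0, 5.1], [5.2, 4.9]])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)  # e.g. [0 0 1 1]: two groups found without any labels
```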

Outline

Inspired by the human brain

Human Brains are Deep

A Neuron for Machine

Each neuron is a function: given inputs x₁, x₂, …, x_N, weights w₁, w₂, …, w_N, and bias b,

z = w₁x₁ + w₂x₂ + … + w_N x_N + b
a = σ(z) = 1 / (1 + e^(−z))

where σ is the activation function; here it is the sigmoid function.
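The two formulas above translate directly into code. A single artificial neuron:

```python
# A single artificial neuron, following the formulas above.
import math

def neuron(x, w, b):
    """Compute a = sigmoid(z), where z = w1*x1 + ... + wN*xN + b."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))  # sigmoid activation

print(neuron(x=[1.0, 2.0], w=[0.5, -0.3], b=0.1))  # one scalar output a
```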

Deep Learning

• Neural Network: cascading the neurons

[Figure: inputs x₁, x₂, …, x_N feed into hidden Layer 1, Layer 2, …, Layer L, producing outputs y₁, y₂, …, y_M]

The whole network is a function f: R^N → R^M.
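A sketch of "cascading the neurons": a forward pass through several fully connected sigmoid layers. The layer sizes and random weights are placeholders for illustration.

```python
# Forward pass of a small network: the whole thing is a function R^N -> R^M.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
N, M = 4, 2                  # input / output dimensions
layer_sizes = [N, 5, 5, M]   # input, two hidden layers, output

# One (W, b) pair per layer, randomly initialized
params = [(rng.normal(size=(m, n)), rng.normal(size=m))
          for n, m in zip(layer_sizes[:-1], layer_sizes[1:])]

def network(x):
    """f: R^N -> R^M, built by cascading sigmoid neurons layer by layer."""
    a = x
    for W, b in params:
        a = sigmoid(W @ a + b)  # each layer: a' = sigmoid(W a + b)
    return a

print(network(np.ones(N)))      # a vector in R^M
```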

Deep Learning

Universality Theorem: any continuous function f: R^N → R^M can be realized by a network with one hidden layer (given enough hidden neurons).

Reference: http://neuralnetworksanddeeplearning.com/chap4.html
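A small empirical illustration of the theorem (a sketch, not a proof): fit a one-hidden-layer network to a continuous function, here sin(x). The target function and hyperparameters are my choices, not the lecture's.

```python
# One hidden layer approximating a continuous function (here sin).
import numpy as np
from sklearn.neural_network import MLPRegressor

X = np.linspace(-3, 3, 500).reshape(-1, 1)
y = np.sin(X).ravel()

# One hidden layer; more hidden neurons -> better approximation.
net = MLPRegressor(hidden_layer_sizes=(100,), activation="tanh",
                   max_iter=5000, random_state=0).fit(X, y)

print(net.predict([[1.0]]), np.sin(1.0))  # the two values should be close
```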

Popular and Powerful

• Speech Recognition (TIMIT): deep neural networks on TIMIT usually used 4 to 8 layers.

Three misunderstandings about Deep Learning

1. Deep learning works because the model is more "complex"

The claim: deep is simply more complex, so deep works better simply because it uses more parameters.

[Figure: a shallow network and a deep network over the same inputs x₁, x₂, …, x_N]

Fat + Short vs. Thin + Tall

[Figure: a shallow network (one wide hidden layer) and a deep network (several narrow hidden layers) over the same inputs x₁, x₂, …, x_N]

With the same number of parameters, which one is better?

Deep Learning - Why? Toy Example

Target: a function f: R² → {0, 1}

[Figure: points (x, y) in the plane, each labeled 0 or 1]

Sample 100,000 points as training data.

Deep Learning - Why? Toy Example

• 1 hidden layer: 125 neurons / 500 neurons / 2500 neurons
• 3 hidden layers: how many neurons in each hidden layer?
  (a) 100~200  (b) 50~100  (c) 25~50  (d) fewer than 25
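A sketch of how such an experiment could be run: compare a wide one-hidden-layer network against a narrower three-hidden-layer network with roughly the same number of parameters. The target function, layer sizes, and use of scikit-learn are my assumptions, not the lecture's exact setup.

```python
# Shallow-wide vs. deep-narrow on 2-D data labeled {0, 1}.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(10_000, 2))                       # toy inputs
y = (np.sin(4 * X[:, 0]) * np.cos(4 * X[:, 1]) > 0).astype(int)  # f: R^2 -> {0,1}

# Roughly 2,000 parameters each: one wide layer vs. three narrow layers.
shallow = MLPClassifier(hidden_layer_sizes=(500,), max_iter=200, random_state=0)
deep = MLPClassifier(hidden_layer_sizes=(30, 30, 30), max_iter=200, random_state=0)

for name, net in [("1 hidden layer, 500 neurons", shallow),
                  ("3 hidden layers, 30 neurons each", deep)]:
    net.fit(X, y)
    print(name, "training accuracy:", net.score(X, y))
```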

Deep Learning - Why?

• Experiments on handwritten digit classification

Deeper: using fewer parameters to achieve the same performance.

Three misunderstandings about Deep Learning

2. When you are using deep learning, you need more training data.

Size of Training Data
• Different numbers of training examples: 100,000 / 50,000 / 20,000
• [Figure: results for a 1-hidden-layer network and a 3-hidden-layer network at each training-set size]

Size of Training Data

• Experiments on handwritten digit classification

Deeper: using less training data to achieve the same performance.

Three misunderstandings about Deep Learning

3. You can simply get the power of deep by cascading the neurons.

It is hard to get the full power of deep …

Q: Can I get all the power of deep from this course?
A: No. Researchers still do not fully understand the mysteries of deep learning.

Outline

In the real world ……

f: X → Y

X (input domain): sequence, graph structure, tree structure ……
Y (output domain): sequence, graph structure, tree structure ……

Retrieval

f: X → Y
X: a keyword, e.g. "Machine learning"
Y: a list of web pages (the search result)

Translation

f: X → Y
X: one kind of sequence, e.g. "Machine learning and having it deep and structured"
Y: another kind of sequence, e.g. "機器學習及其深層與結構化" (the course title in Chinese)

Speech Recognition

f: X → Y
X: one kind of sequence (audio)
Y: another kind of sequence, e.g. "大家好,歡迎大家來修機器學習及其深層與結構化" ("Hello everyone, welcome to Machine Learning and Having It Deep and Structured")

Speech Summarization

f: X → Y
X: recorded lectures
Y: a summary (select the most informative segments to form a compact version)

Object Detection

f: X → Y
X: an image
Y: object positions, e.g. bounding boxes labeled "Haruhi" and "Mikuru"

Image Segmentation

f: X → Y
X: an image
Y: the foreground region

Source of images: Nowozin, Sebastian, and Christoph H. Lampert. "Structured Learning and Prediction in Computer Vision." Foundations and Trends in Computer Graphics and Vision 6.3-4 (2011): p. 57.

Remote Image Ground Survey

f: X → Y

Source of images: Nowozin, Sebastian, and Christoph H. Lampert. "Structured Learning and Prediction in Computer Vision." Foundations and Trends in Computer Graphics and Vision 6.3-4 (2011): p. 146.

Pose Estimation

f: X → Y
X: an image
Y: a pose

Source of images: http://groups.inf.ed.ac.uk/calvin/Publications/eichner-techreport10.pdf

Structured Learning

• The tasks above were developed separately in the past.
• Recently, people have realized that there is a unified framework behind these approaches.
• Three steps: Evaluation, Inference, Learning (see the sketch below).
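A toy sketch of the three steps of this unified framework: Evaluation defines a score F(x, y) for how compatible an output y is with an input x; Inference finds y* = argmax over y of F(x, y); Learning picks F from training data. The features, candidate set, and data below are invented for illustration.

```python
# A toy instance of the structured-learning framework's three steps.

# Step 1 (Evaluation): F(x, y) scores how compatible a structured output y
# is with input x. Here F is a weighted sum of two toy features.
def F(x, y, w):
    similar_length = -abs(len(x) - len(y))                # feature 1
    shared_words = len(set(x.split()) & set(y.split()))   # feature 2
    return w[0] * similar_length + w[1] * shared_words

# Step 2 (Inference): y* = argmax_y F(x, y) over all candidate outputs.
def inference(x, candidates, w):
    return max(candidates, key=lambda y: F(x, y, w))

# Step 3 (Learning): choose F (here, its weights w) so that inference
# reproduces the labeled training pairs.
def learn(pairs, candidates):
    grid = [(0.0, 1.0), (1.0, 0.0)]  # tiny weight search
    return max(grid, key=lambda w: sum(
        inference(x, candidates, w) == y_hat for x, y_hat in pairs))

candidates = ["machine learning", "deep learning", "good bye"]
pairs = [("intro to machine learning", "machine learning"),
         ("why go deep", "deep learning")]
w = learn(pairs, candidates)
print(inference("learning to go deep", candidates, w))  # -> "deep learning"
```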

Concluding Remarks

Reference

• Deep Learning
  • Neural Networks and Deep Learning: http://neuralnetworksanddeeplearning.com/
  • For more information: http://deeplearning.net/
• Structured Learning
  • Structured Learning and Prediction in Computer Vision: http://www.nowozin.net/sebastian/papers/nowozin2011structured-tutorial.pdf
  • Linguistic Structure Prediction: http://www.cs.cmu.edu/afs/cs/Web/People/nasmith/LSP/PUBLISHED-frontmatter.pdf

Thank you!

Powerful

• Inspired by the human brain

[Figure: a deep network (inputs x₁, x₂, …, x_N through Layer 1, Layer 2, …, Layer L) alongside the human visual pathway: the retina feeds the visual cortex, which processes pixels → edges → primitive shapes → …, layer by layer]

Powerful

• Image Recognition

[Figure: visualizations of the features learned by the 1st, 2nd, and 3rd hidden layers of a convolutional network]

Reference: Zeiler, M. D., & Fergus, R. (2014). Visualizing and understanding convolutional networks. In Computer Vision - ECCV 2014 (pp. 818-833).
