advanced topics - university of notre damerjohns15/cse40647.sp14/www... · machine learning and ai...

Advanced Topics

Upload: dangthuan

Post on 11-Mar-2018


Page 1: Advanced Topics - University of Notre Damerjohns15/cse40647.sp14/www... · Machine Learning and AI via Brain simulations - Andrew Ng Parameter Server Model Workers Data ... –Amazon

Advanced Topics


Data Preprocessing

Advanced Topics

Last class

• Neural Networks


Popularity and Applications

• Major increase in popularity of Neural Networks

• Google developed a couple of efficient distributed methods that allow for the training of huge deep neural networks:

– Asynchronous distributed gradient descent

– A distributed implementation of L-BFGS

• These have recently been made available to the public


Large scale ANN

[Figure: a model and its training data]

Machine Learning and AI via Brain simulations - Andrew Ng


Popularity and Applications

[Figure: two distributed training architectures. Asynchronous distributed gradient descent: a parameter server, model workers, and data shards. L-BFGS: a parameter server, model workers, data, and a coordinator that exchanges small messages.]

• 20,000 cores in a single cluster

• up to 1 billion data items / mega-batch (in ~1 hour)
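
The parameter server, workers, and data shards named on this slide can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not Google's implementation: it runs in a single process, the "asynchrony" is simulated by letting each worker compute gradients from a stale copy of the parameters, and the names `ParameterServer`, `make_shards`, `train_async`, and `fetch_every` are hypothetical.

```python
import numpy as np

def make_shards(n_shards=4, n_per_shard=200, seed=0):
    """Build a synthetic linear-regression problem split into data shards."""
    rng = np.random.default_rng(seed)
    true_w = np.array([2.0, -3.0, 0.5])
    shards = []
    for _ in range(n_shards):
        X = rng.normal(size=(n_per_shard, true_w.size))
        y = X @ true_w + 0.01 * rng.normal(size=n_per_shard)
        shards.append((X, y))
    return shards, true_w

class ParameterServer:
    """Holds the shared parameters and applies gradients as they arrive."""
    def __init__(self, dim, lr):
        self.w = np.zeros(dim)
        self.lr = lr

    def pull(self):
        return self.w.copy()

    def push(self, grad):
        self.w -= self.lr * grad

def train_async(shards, steps=500, lr=0.02, fetch_every=3, batch=32, seed=1):
    """Simulate workers that push gradients computed from stale parameters."""
    rng = np.random.default_rng(seed)
    server = ParameterServer(shards[0][0].shape[1], lr)
    local_w = [server.pull() for _ in shards]   # each worker's local copy
    local_steps = [0] * len(shards)
    for _ in range(steps):
        k = int(rng.integers(len(shards)))      # some worker finishes next
        X, y = shards[k]
        if local_steps[k] % fetch_every == 0:   # refresh only occasionally
            local_w[k] = server.pull()
        idx = rng.integers(0, len(y), size=batch)
        Xb, yb = X[idx], y[idx]
        # Gradient of the mean squared error on this mini-batch,
        # evaluated at the worker's (possibly stale) parameters.
        grad = 2.0 * Xb.T @ (Xb @ local_w[k] - yb) / batch
        server.push(grad)                       # no waiting for other workers
        local_steps[k] += 1
    return server.w

shards, true_w = make_shards()
w = train_async(shards)
print(w, true_w)
```

The point of the design is that no worker ever waits for another: the server applies whatever gradient arrives, even if it was computed from slightly outdated parameters.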


Neural Networks


Popularity and Applications


Popularity and Applications


Learning from Unlabeled Data


[Figure: accuracy versus training set size (millions). Banko & Brill, 2001]

“It’s not who has the best algorithm that wins. It’s who has the most data.”


[Figure: course outline: Preliminaries, Data Understanding, Data Preprocessing, Data Modeling, Validation & Interpretation, Advanced Topics]


Feature Learning

A set of techniques in machine learning that learn a transformation of "raw" inputs to a representation that can be effectively exploited in a supervised learning task such as classification


Feature Learning

• Can be supervised or unsupervised

• Examples of algorithms:

– Autoencoders

– Matrix Factorization

– Restricted Boltzmann machine

– Clustering

– Neural Networks


Feature Learning


• Each of the k centroids learned by k-means can be used to represent a feature for supervised learning

• Feature j has value 1 if and only if the j-th centroid learned by k-means is the closest to the instance under consideration, and 0 otherwise
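
The centroid-based encoding described above can be sketched in a few lines. This is a minimal illustration, assuming Euclidean distance and a plain Lloyd-style k-means; the function names `kmeans` and `centroid_features` and the two-blob data are made up for the example.

```python
import numpy as np

def kmeans(X, k, n_iter=20, seed=0):
    """Plain k-means: alternate nearest-centroid assignment and mean update."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        dist = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        assign = dist.argmin(axis=1)
        for j in range(k):
            members = X[assign == j]
            if len(members) > 0:                # keep old centroid if cluster is empty
                centroids[j] = members.mean(axis=0)
    return centroids

def centroid_features(X, centroids):
    """Feature j is 1 iff centroid j is the closest one to the instance, else 0."""
    dist = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    nearest = dist.argmin(axis=1)
    features = np.zeros((len(X), len(centroids)), dtype=int)
    features[np.arange(len(X)), nearest] = 1
    return features

# Two well-separated synthetic blobs in 2-D.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=0.0, size=(50, 2)),
               rng.normal(loc=5.0, size=(50, 2))])
C = kmeans(X, k=2)
F = centroid_features(X, C)
print(F[:3])
```

The resulting matrix `F` has one column per centroid and exactly one 1 per row, so it can be passed to any supervised learner in place of the raw inputs.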


Deep Learning


• Machine learning algorithms that model high-level data abstractions using a series of transformations

• Based on feature learning

• The motivation: Some data representations make it easier to learn particular tasks (e.g., image classification)

• Ex.: Our assignment 2:

– It’s hard to give your computer an image and ask “what season does it represent?”, but if we can apply transformations that describe similar images with a simpler representation, we might accomplish that task

Andrew Ng


Self-Taught Learning (Unsupervised Feature Learning)


[Figure: labeled training images ("Motorcycles" and "Not motorcycles") together with unlabeled images; testing asks "What is this?"]


• Supervised learning: labeled examples only (cars, motorcycles)

• Semi-supervised learning: labeled examples (car, motorcycle) plus unlabeled cars & motorcycles

• Self-taught learning (unsupervised feature learning): labeled examples (cars, motorcycles) plus unlabeled random images


Self-Taught Learning

• One can always try to get more labeled data, but this can be expensive

– Amazon Mechanical Turk

– Expert feature engineering

• The promise of self-taught learning and unsupervised feature learning:

– If we can get our algorithms to learn from unlabeled data, then we can easily obtain and learn from massive amounts of it

• 1 instance of unlabeled data < 1 instance of labeled data

• Billions of instances of unlabeled data >> some labeled data


Self-Taught Learning

• The idea:

– Give the algorithm a large amount of unlabeled data

– The algorithm learns a feature representation of that data

– If the end goal is to perform classification, one can find a small set of labeled instances to probe the model and adapt it to the supervised task


Autoencoders

• An artificial neural network used for learning efficient codings

• Can be used for dimensionality reduction

• Similar to a Multilayer Perceptron with an input layer and one or more hidden layers

• The difference between autoencoders and MLPs:

– Autoencoders have the same number of inputs and outputs

– Instead of predicting y, autoencoders try to reconstruct x
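
One common way to write "reconstruct x" as a training objective is the average squared reconstruction error. The slides do not state a loss, so the squared-error form and the symbols below are assumptions: m is the number of training examples and \hat{x}^{(i)} is the network output for input x^{(i)}.

```latex
J = \frac{1}{m} \sum_{i=1}^{m} \frac{1}{2} \left\| \hat{x}^{(i)} - x^{(i)} \right\|^{2}
```

An MLP trained for prediction would use the same kind of sum with the target y^{(i)} in place of x^{(i)}.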


Autoencoders


If the hidden layers are narrower than the input layer, then the activations of the final hidden layer can be regarded as a compressed representation of the input
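
A small sketch of this compression effect, assuming one sigmoid hidden layer, a linear output layer, squared reconstruction error, and plain batch gradient descent (none of which are specified on the slides). The data are synthetic, and `train_autoencoder` and `encode` are hypothetical names.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_autoencoder(X, n_hidden, lr=0.01, epochs=3000, seed=0):
    """Train input -> hidden (sigmoid) -> output (linear) to reconstruct X."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(scale=0.1, size=(d, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(scale=0.1, size=(n_hidden, d))
    b2 = np.zeros(d)
    losses = []
    for _ in range(epochs):
        A = sigmoid(X @ W1 + b1)            # hidden activations (the code)
        err = A @ W2 + b2 - X               # reconstruction minus input
        losses.append(0.5 * np.mean(np.sum(err ** 2, axis=1)))
        G = err / n                         # gradient of the loss w.r.t. the output
        dZ = (G @ W2.T) * A * (1.0 - A)     # backpropagate through the sigmoid
        W2 -= lr * (A.T @ G)
        b2 -= lr * G.sum(axis=0)
        W1 -= lr * (X.T @ dZ)
        b1 -= lr * dZ.sum(axis=0)
    return (W1, b1, W2, b2), losses

def encode(X, params):
    """Return the hidden-layer activations, i.e. the compressed representation."""
    W1, b1, _, _ = params
    return sigmoid(X @ W1 + b1)

# Synthetic data: 8-dimensional inputs that lie close to a 2-dimensional subspace.
rng = np.random.default_rng(1)
Z = rng.normal(size=(200, 2))
X = Z @ rng.normal(scale=0.5, size=(2, 8)) + 0.01 * rng.normal(size=(200, 8))

params, losses = train_autoencoder(X, n_hidden=2)
codes = encode(X, params)
print(codes.shape)              # (200, 2)
print(losses[0], losses[-1])
```

Each 8-dimensional input is summarized by only 2 hidden activations, and the network has to reconstruct the input from those 2 numbers alone.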


Self-Taught Learning in Practice


• Suppose we have an unlabeled training set $\{x_u^{(1)}, x_u^{(2)}, \ldots, x_u^{(m_u)}\}$ with $m_u$ unlabeled instances

• Step 1: Train an autoencoder on this data


Self-Taught Learning in Practice


• After step 1, we will have learned all weight parameters for the network

• We can also visualize the algorithm for computing the features/activations $a$ as the following neural network:

[Figure: network that maps an input x to its activations a]


Self-Taught Learning in Practice


• Step 2:

– Training set: $\{(x_l^{(1)}, y^{(1)}), (x_l^{(2)}, y^{(2)}), \ldots, (x_l^{(m_l)}, y^{(m_l)})\}$ of $m_l$ labeled examples

– Feed training example $x_l^{(1)}$ to the autoencoder and obtain its corresponding vector of activations $a_l^{(1)}$

– Repeat that for all training examples


Self-Taught Learning in Practice


• Step 3:

– Replace the original feature with $a_l^{(1)}$

– The training set then becomes:

• $\{(a_l^{(1)}, y^{(1)}), (a_l^{(2)}, y^{(2)}), \ldots, (a_l^{(m_l)}, y^{(m_l)})\}$

• Step 4:

– Train a supervised algorithm using this new training set to obtain a function that makes predictions on the y values
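
Steps 1 to 4 can be put together in one small end-to-end sketch. Everything specific here is an assumption made for illustration: the synthetic data, the single sigmoid hidden layer with 3 units, the squared-error autoencoder loss, and logistic regression as the supervised algorithm in step 4. All function names are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_autoencoder(X, n_hidden, lr=0.01, epochs=3000, seed=0):
    """Step 1: fit an autoencoder on unlabeled data; keep only the encoder."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(scale=0.1, size=(d, n_hidden)); b1 = np.zeros(n_hidden)
    W2 = rng.normal(scale=0.1, size=(n_hidden, d)); b2 = np.zeros(d)
    for _ in range(epochs):
        A = sigmoid(X @ W1 + b1)
        G = (A @ W2 + b2 - X) / n           # gradient w.r.t. the reconstruction
        dZ = (G @ W2.T) * A * (1.0 - A)
        W2 -= lr * (A.T @ G); b2 -= lr * G.sum(axis=0)
        W1 -= lr * (X.T @ dZ); b1 -= lr * dZ.sum(axis=0)
    return W1, b1

def encode(X, W1, b1):
    """Steps 2-3: replace each labeled input x by its activations a."""
    return sigmoid(X @ W1 + b1)

def train_logistic(A, y, lr=0.5, epochs=2000):
    """Step 4: logistic regression trained on the (a, y) pairs."""
    w = np.zeros(A.shape[1]); b = 0.0
    for _ in range(epochs):
        p = sigmoid(A @ w + b)
        w -= lr * A.T @ (p - y) / len(y)
        b -= lr * np.mean(p - y)
    return w, b

def predict(A, w, b):
    return (sigmoid(A @ w + b) >= 0.5).astype(int)

# Synthetic data: 10-D inputs generated from 2 latent factors;
# the label depends on the sign of the first factor.
rng = np.random.default_rng(42)
M = rng.normal(scale=0.5, size=(2, 10))

def sample(n):
    Z = rng.normal(size=(n, 2))
    X = Z @ M + 0.01 * rng.normal(size=(n, 10))
    return X, (Z[:, 0] > 0).astype(int)

X_u, _ = sample(500)            # plenty of unlabeled data (labels discarded)
X_l, y_l = sample(40)           # only a few labeled examples

W1, b1 = train_autoencoder(X_u, n_hidden=3)
A_l = encode(X_l, W1, b1)
w, b = train_logistic(A_l, y_l)
acc = np.mean(predict(A_l, w, b) == y_l)
print(A_l.shape, acc)
```

Note that the labels are used only in the last step; the feature representation is learned entirely from the unlabeled set.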


An Example of Self-Taught Learning


• What Google did:

– Trained on 10 million images (YouTube)

– 1000 machines (16,000 cores) for 1 week.

– 1.15 billion parameters

– Test on novel images

Training set (YouTube) Test set (FITW + ImageNet)


Result Highlights


• The face neuron

[Figure: top stimuli from the test set; optimal stimulus by numerical optimization]
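
"Optimal stimulus by numerical optimization" can be read as: search for the input that maximizes a neuron's activation. The sketch below assumes one common formulation, gradient ascent on the input with the input constrained to unit norm, and uses a toy sigmoid neuron rather than the deep network from the talk. The names `neuron` and `optimal_stimulus` and the weights are made up for the example.

```python
import numpy as np

def neuron(x, w, b):
    """Toy neuron: sigmoid of a linear function of the input."""
    return 1.0 / (1.0 + np.exp(-(x @ w + b)))

def optimal_stimulus(w, b, steps=300, lr=0.5, seed=0):
    """Projected gradient ascent: maximize the activation subject to ||x|| = 1."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=w.shape)
    x /= np.linalg.norm(x)
    for _ in range(steps):
        a = neuron(x, w, b)
        grad = a * (1.0 - a) * w            # d(activation)/dx for this toy neuron
        x = x + lr * grad                   # ascent step on the input
        x /= np.linalg.norm(x)              # project back onto the unit sphere
    return x

w = np.array([1.0, -2.0, 0.5])
b = 0.0
x_star = optimal_stimulus(w, b)
print(x_star, w / np.linalg.norm(w))
```

For this toy neuron the answer is known in closed form (the unit vector along `w`), which makes the sketch easy to check. For a deep network the gradient with respect to the input would come from backpropagation instead.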


Result Highlights


• The face neuron

[Figure: histogram of feature value (horizontal axis) versus frequency (vertical axis), comparing faces with random distractors]


Result Highlights


• The cat neuron

Optimal stimulus by numerical optimization


Result Highlights


[Figure: best stimuli for Features 1 to 5, shown next to a diagram of one layer of the network: image size = 200, number of input channels = 3, RF size = 18, number of maps = 8, pooling size = 5, LCN size = 5, number of output channels = 8. The layer's output (an image with 8 channels, width W and height H) is the input to another layer above.]

Le, et al., Building high-level features using large-scale unsupervised learning. ICML 2012


[Figure: best stimuli for Features 6 to 9, shown next to the one-layer diagram: image size = 200, number of input channels = 3, RF size = 18, number of maps = 8, pooling size = 5, LCN size = 5, number of output channels = 8]

Le, et al., Building high-level features using large-scale unsupervised learning. ICML 2012


[Figure: best stimuli for Features 10 to 13, shown next to the one-layer diagram: image size = 200, number of input channels = 3, RF size = 18, number of maps = 8, pooling size = 5, LCN size = 5, number of output channels = 8]

Le, et al., Building high-level features using large-scale unsupervised learning. ICML 2012