what we did not cover - cs.duke.edu · what we did not cover 1 much more detail 2 statistical...

11
What We Did Not Cover COMPSCI 371D — Machine Learning COMPSCI 371D — Machine Learning What We Did Not Cover 1 / 11

Upload: others

Post on 18-Oct-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: What We Did Not Cover - cs.duke.edu · What We Did Not Cover 1 Much More Detail 2 Statistical Machine Learning 3 Other Supervised Techniques 4 Reducing the Burden of Labeling 5 Unsupervised

What We Did Not Cover

COMPSCI 371D — Machine Learning

COMPSCI 371D — Machine Learning What We Did Not Cover 1 / 11

Page 2: What We Did Not Cover - cs.duke.edu · What We Did Not Cover 1 Much More Detail 2 Statistical Machine Learning 3 Other Supervised Techniques 4 Reducing the Burden of Labeling 5 Unsupervised

What We Did Not Cover

1 Much More Detail

2 Statistical Machine Learning

3 Other Supervised Techniques

4 Reducing the Burden of Labeling

5 Unsupervised Methods

6 Addressing Multiple Learning Tasks Together

7 Prediction over Time

COMPSCI 371D — Machine Learning What We Did Not Cover 2 / 11

Page 3: What We Did Not Cover - cs.duke.edu · What We Did Not Cover 1 Much More Detail 2 Statistical Machine Learning 3 Other Supervised Techniques 4 Reducing the Burden of Labeling 5 Unsupervised

Much More Detail

Much More Detail

• Computationally efficient training algorithms:Optimization techniques• Deep learning architectures for special problems:

Image motion analysis, video analysis, ...

COMPSCI 371D — Machine Learning What We Did Not Cover 3 / 11

Page 4: What We Did Not Cover - cs.duke.edu · What We Did Not Cover 1 Much More Detail 2 Statistical Machine Learning 3 Other Supervised Techniques 4 Reducing the Burden of Labeling 5 Unsupervised

Much More Detail

Beyond Discriminative Neural Networks

• Abstraction for its own sake: Auto-encoders• A game-theoretical technique to draw from a distribution:

Generative Adversarial Networks

Which image is fake?

COMPSCI 371D — Machine Learning What We Did Not Cover 4 / 11

Page 5: What We Did Not Cover - cs.duke.edu · What We Did Not Cover 1 Much More Detail 2 Statistical Machine Learning 3 Other Supervised Techniques 4 Reducing the Burden of Labeling 5 Unsupervised

Much More Detail

Generative Adversarial Networks

Sampler

Generator

Discriminator

noise

Discriminator Loss

Generator Loss

real image

fake image

real-image database

random switch

• Discriminator guesses if input is real or fake• Discriminator loss penalizes wrong predictions• Generator loss penalizes correct predictions• After training keep only the generator

COMPSCI 371D — Machine Learning What We Did Not Cover 5 / 11

Page 6: What We Did Not Cover - cs.duke.edu · What We Did Not Cover 1 Much More Detail 2 Statistical Machine Learning 3 Other Supervised Techniques 4 Reducing the Burden of Labeling 5 Unsupervised

Statistical Machine Learning

Statistical Machine Learning

• How to measure the size of H: Vapnik-Chervonenkisdimension, Rademacher complexity• How large must T be to get an h that is within ε of a

performance target with probability greater than 1− δ:Probably Approximately Correct (PAC) learning• H is learnable if there exists a size of T that is large enough

for this goal to be achieved• Which Hs are learnable?• How large must S be to get a performance measure

accurate within ε: Concentration bounds, statisticalestimation theory, PAC-like techniques

COMPSCI 371D — Machine Learning What We Did Not Cover 6 / 11

Page 7: What We Did Not Cover - cs.duke.edu · What We Did Not Cover 1 Much More Detail 2 Statistical Machine Learning 3 Other Supervised Techniques 4 Reducing the Burden of Labeling 5 Unsupervised

Other Supervised Techniques

Other Supervised Techniques

• Support Vector Machines: How to maximize free spacearound decision boundariesMaximizes the margin, i.e., the distance between theboundary and the nearest training samples• Boosting: How to use many bad predictors to make one

good oneSimilar in principle to ensemble predictors, differentassumptions and techniques• Learning to rank

Example: Learning a better Google

COMPSCI 371D — Machine Learning What We Did Not Cover 7 / 11

Page 8: What We Did Not Cover - cs.duke.edu · What We Did Not Cover 1 Much More Detail 2 Statistical Machine Learning 3 Other Supervised Techniques 4 Reducing the Burden of Labeling 5 Unsupervised

Reducing the Burden of Labeling

Reducing the Burden of Labeling• Semi-supervised methods: Build models of the data x to

leverage sparse labels y• Domain adaptation: Train a classifier on source-domain

labeled data (x, y) and target-domain unlableled data x sothat is works well in the target domain

Vegas Building Detector

Vegas Image

Vegas Building Detector

Shanghai Image

Shanghai Building Detector

Shanghai Image

TRAIN

ADAPT

https://en.wikipedia.org

COMPSCI 371D — Machine Learning What We Did Not Cover 8 / 11

Page 9: What We Did Not Cover - cs.duke.edu · What We Did Not Cover 1 Much More Detail 2 Statistical Machine Learning 3 Other Supervised Techniques 4 Reducing the Burden of Labeling 5 Unsupervised

Unsupervised Methods

Unsupervised Methods• Dimensionality reduction:

Compressing X ⊆ Rd to X ′ ⊆ Rd ′with d ′ � d

• Principal or Independent Component Analysis (PCA, ICA)• Manifold learning, GANs

• Clustering:• K -means• Expectation-Maximization• Agglomerative methods• Splitting methods

COMPSCI 371D — Machine Learning What We Did Not Cover 9 / 11

Page 10: What We Did Not Cover - cs.duke.edu · What We Did Not Cover 1 Much More Detail 2 Statistical Machine Learning 3 Other Supervised Techniques 4 Reducing the Burden of Labeling 5 Unsupervised

Addressing Multiple Learning Tasks Together

Addressing Multiple Learning Tasks Together

• Multi-task learning: How to learn representations that arecommon to different but related prediction tasks

COMPSCI 371D — Machine Learning What We Did Not Cover 10 / 11

Page 11: What We Did Not Cover - cs.duke.edu · What We Did Not Cover 1 Much More Detail 2 Statistical Machine Learning 3 Other Supervised Techniques 4 Reducing the Burden of Labeling 5 Unsupervised

Prediction over Time

Prediction over Time• State-space methods

• Time series analysis• Stochastic state estimation• System identification

• Recurrent neural networks• Reinforcement learning: Actions over time

Learning policies underlying observed sequences

COMPSCI 371D — Machine Learning What We Did Not Cover 11 / 11