Olivier Bousquet, Pertinence - Machine Learning Thoughts, ml.typepad.com/talks/ob_marseille.pdf

Motivation How to do it? Challenges Applications

Machine Learning: Challenges and Applications

Olivier Bousquet, Pertinence

Olivier Bousquet, Pertinence Machine Learning: Challenges and Applications

Outline

1 Motivation

2 How to do it?

3 Challenges

4 Applications

1 Motivation

Spam Filtering

Example 1

Example 2

Spam Filtering Algorithm

List words that may indicate spam

Take into account possible variations

Count these words/variations in incoming messages

Choose a threshold above which the message is classified as spam

Constantly refine this algorithm as new spam gets ignored, or correct messages are rejected (update the list, change the thresholds...)
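
The hand-written procedure above can be sketched in a few lines of Python. This is only an illustration: the word list, the regular expressions used for "variations", and the threshold value are all hypothetical choices, not taken from the talk.

```python
import re

# Hypothetical word list. Each pattern tolerates simple obfuscations
# (digit or symbol substitutions), standing in for "possible variations".
SPAM_PATTERNS = [
    re.compile(r"v[i1!]agra", re.IGNORECASE),
    re.compile(r"fr[e3]{2}\b", re.IGNORECASE),
    re.compile(r"w[i1]nner", re.IGNORECASE),
    re.compile(r"l[o0]ttery", re.IGNORECASE),
]

THRESHOLD = 1  # assumed value; it would be re-tuned by hand over time


def spam_score(message):
    """Count occurrences of listed words (and their variations) in a message."""
    return sum(len(pattern.findall(message)) for pattern in SPAM_PATTERNS)


def is_spam(message, threshold=THRESHOLD):
    """Classify a message as spam when its score is above the threshold."""
    return spam_score(message) > threshold


print(is_spam("You are a W1NNER of the l0ttery, claim your FREE prize"))
print(is_spam("Lunch at noon? The meeting room is free."))
```

Every missed spam or rejected legitimate message means editing `SPAM_PATTERNS` or `THRESHOLD` by hand, which is the maintenance burden the last step describes.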

Manufacturing Process Control

High Pressure Die Casting

  Inject liquid aluminium into a die, adjust temperature at various positions
  Try to avoid bubbles

Approach

Build a simulation model

Use Navier-Stokes + phase transitions (solid/liquid) + air/liquid interface, complex geometry, heat transfer...

Requires huge computational resources (finite element methods)

No guarantee that the model is correct (what about impurities?)

Is there a better way?

Can we solve such problems in a simpler and more direct way?

Can we build systems that can solve a wide variety of such problems (without starting from scratch for each new similar problem)?

Idea: Use an empirical approach

2 How to do it?

Induction

Start from Data or Observations

Build a model which

  Agrees with the data
  Predicts unobserved data

Scientific Method: laws are induced from observations. A law is correct as long as no experiment contradicts it.

Ingredients 1

Data representation: choose a way to represent the objects to be classified
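
As one possible illustration of a data representation, a text message can be turned into a vector of word counts. The vocabulary below is hypothetical; a real system would build it from the data.

```python
from collections import Counter

# Hypothetical vocabulary, chosen only for this illustration.
VOCABULARY = ["free", "meeting", "prize", "report", "winner"]


def represent(message):
    """Map a message to a vector of word counts over VOCABULARY."""
    counts = Counter(message.lower().split())
    return [counts[word] for word in VOCABULARY]


print(represent("Free prize for the winner free"))
```

Each object is now a point in a space whose dimension equals the vocabulary size, which is the form a learning algorithm can work with.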

Example

Ingredients 2

Data representation: choose a way to represent the objects to be classified

Class of functions: choose a way to express the classification function

Example

Overfitting

Data is not always nicely separated

Underfitting

Overfitting

Reasonable model

Ingredients

Data representation: choose a way to represent the objects to be classified

Class of functions: choose a way to express the classification function

Preference: incorporate assumptions about the model regularity

Algorithm: set up the problem as an optimization problem
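
A minimal sketch of how the four ingredients might fit together, under assumptions that are mine rather than the talk's: objects are represented as feature vectors with a constant 1 appended, the function class is linear classifiers sign(w·x), the preference is a squared Euclidean penalty on w, and the algorithm is plain gradient descent on the logistic loss (a differentiable stand-in for counting mistakes).

```python
import math


def train(data, lam=0.01, lr=0.5, epochs=200):
    """Minimize average logistic loss + lam * ||w||^2 by gradient descent.

    data: list of (x, y) pairs, x a list of floats, y in {-1, +1}.
    """
    d = len(data[0][0])
    w = [0.0] * d
    for _ in range(epochs):
        grad = [2 * lam * wj for wj in w]  # gradient of the penalty term
        for x, y in data:
            margin = y * sum(wj * xj for wj, xj in zip(w, x))
            # derivative of log(1 + exp(-margin)) with respect to w . x
            coeff = -y / (1.0 + math.exp(margin))
            for j in range(d):
                grad[j] += coeff * x[j] / len(data)
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


def predict(w, x):
    """Linear classifier: the sign of w . x."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) >= 0 else -1


# Toy data: two features plus a constant 1 for the bias term.
data = [
    ([2.0, 0.0, 1.0], 1),
    ([3.0, 1.0, 1.0], 1),
    ([0.0, 2.0, 1.0], -1),
    ([1.0, 3.0, 1.0], -1),
]
w = train(data)
print([predict(w, x) for x, _ in data])
```

Changing any one ingredient (another representation, another penalty, another optimizer) leaves the overall recipe unchanged.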

Formalization

Example: statistical formalization (not the only one)

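Data: $(X_1, Y_1), \ldots, (X_n, Y_n)$

Function class: $\mathcal{F}$, $f : \mathcal{X} \to \mathcal{Y}$

Loss function: $\ell(f(x), y) = \mathbf{1}[f(x) \neq y]$

Goal: find a function $f \in \mathcal{F}$ such that $\mathbb{E}\,\ell(f(X), Y)$ is as small as possible, and $f$ is regular enough

Optimization problem

$$\min_{f \in \mathcal{F}} \sum_{i=1}^{n} \ell(f(X_i), Y_i) + \lambda \|f\|$$

Key question: convergence of $\mathbb{E}\,\ell(\hat{f}(X), Y)$ to $\min_f \mathbb{E}\,\ell(f(X), Y)$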
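
To make the optimization problem concrete, here is a small sketch that minimizes the penalized number of mistakes by exhaustive search. The function class (one-dimensional threshold rules) and the penalty λ·|t| standing in for the norm of f are hypothetical choices made only for this example.

```python
def zero_one_loss(prediction, label):
    """The loss 1[f(x) != y]."""
    return 1 if prediction != label else 0


def regularized_risk(threshold, data, lam):
    """Sum of 0-1 losses of f_t(x) = 1[x >= t], plus lam * |t| as a stand-in norm."""
    total = sum(zero_one_loss(1 if x >= threshold else 0, y) for x, y in data)
    return total + lam * abs(threshold)


def fit(data, candidates, lam=0.1):
    """Return the candidate threshold with the smallest regularized risk."""
    return min(candidates, key=lambda t: regularized_risk(t, data, lam))


data = [(0.5, 0), (1.0, 0), (1.5, 0), (2.0, 1), (2.5, 1), (3.0, 0), (3.5, 1)]
candidates = [0.5 * k for k in range(9)]  # thresholds 0.0, 0.5, ..., 4.0
best = fit(data, candidates)
print(best, regularized_risk(best, data, 0.1))
```

No threshold classifies this sample perfectly, so the minimizer trades one mistake against the penalty. Exhaustive search only works because the class here is tiny.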

3 Challenges

High Dimensionality

Complexity of the data increases: requires bigger description vectors

Gathering data becomes easier: more and more descriptors available

This increase in the dimensionality comes at a price

  Overfitting becomes the main issue (curse of dimensionality)
  But at the same time, interesting phenomena occur (blessing)

Curses

NP-completeness: optimization problems can be exponential in the dimension (e.g. search for the separating hyperplane with minimum number of mistakes)

Statistical issue: exponentially slow convergence

Theorem

Let $\mathcal{F}$ be the set of all Lipschitz functions on $[0, 1]^d$. For any estimator $\hat{f}$,

$$\sup_{f \in \mathcal{F}} \mathbb{E}\,(\hat{f}(x) - f(x))^2 \ge C n^{-\frac{2}{2+d}} \quad \text{as } n \to \infty$$
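
To see what this rate means in practice, one can solve $C n^{-2/(2+d)} \le \varepsilon$ for $n$, which gives $n \ge (C/\varepsilon)^{(2+d)/2}$. The sketch below evaluates that expression. The constant $C = 1$ is an assumption made for illustration only, since the theorem does not specify it.

```python
def min_samples(eps, d, C=1.0):
    """Smallest n with C * n**(-2 / (2 + d)) <= eps, i.e. (C / eps)**((2 + d) / 2)."""
    return (C / eps) ** ((2 + d) / 2)


for d in (1, 2, 10, 20):
    print(f"d = {d:2d}: n >= {min_samples(0.1, d):.3g}")
```

With ε = 0.1 and C = 1, the requirement grows from n ≥ 100 in dimension 2 to n ≥ 10^6 in dimension 10: the exponent, and so the number of digits in n, grows linearly with d.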

Blessings

Concentration-of-measure phenomenon

On the $n$-sphere in high dimension (large $n$), for the usual rotationally-invariant measure, most of the mass is concentrated near any $(n-1)$-sphere that is equatorial in it (analogous to a great circle on the Earth's surface, when $n = 2$). In other words, the 'poles' and their neighborhoods down to very small latitudes account for a tiny proportion of the 'area'.

Chernoff bounds: for i.i.d. bounded random variables

$$P\left[\frac{1}{n}\sum_{i=1}^{n} X_i - \mathbb{E}X \ge \varepsilon\right] \le e^{-nc\varepsilon^2}$$

General concentration inequality: under appropriate conditions on $f$ (e.g. Lipschitz)

$$P\left[f(X_1, \ldots, X_n) - \mathbb{E}f \ge \varepsilon\right] \le e^{-nc\varepsilon^2}$$
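
A quick simulation can make the exponential bound tangible. The sketch below uses fair coin flips as the bounded variables and compares the observed tail frequency with Hoeffding's inequality, which for variables in [0, 1] is the bound above with c = 2. The choice of coin flips, the number of trials, and the seed are arbitrary assumptions.

```python
import math
import random


def empirical_tail(n, eps, trials=1000, seed=0):
    """Estimate P[mean(X_1..X_n) - EX >= eps] for fair coin flips X_i in {0, 1}."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        mean = sum(rng.random() < 0.5 for _ in range(n)) / n
        if mean - 0.5 >= eps - 1e-12:  # small tolerance for rounding
            hits += 1
    return hits / trials


def hoeffding_bound(n, eps):
    """Hoeffding's inequality for variables in [0, 1]: exp(-2 * n * eps^2)."""
    return math.exp(-2 * n * eps ** 2)


for n in (10, 100, 1000):
    print(n, empirical_tail(n, 0.1), hoeffding_bound(n, 0.1))
```

Both columns shrink quickly as n grows, and the simulated frequency stays below the bound, which is not tight for small n.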

Blessings

Random Projections

Lemma (Johnson & Lindenstrauss)

Given a set $S$ of points in $\mathbb{R}^d$, if we perform a random orthogonal projection of those points on a subspace of dimension $m$, then $m = O(\gamma^{-2} \log |S|)$ is sufficient so that with high probability all pairwise distances are preserved up to a factor $1 \pm \gamma$

→ cheap and easy way to reduce the dimension, while preserving the geometry
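
The lemma can be tried out numerically. The sketch below does not use a random orthogonal projection: it substitutes a matrix of independent Gaussian entries scaled by 1/sqrt(m), a common variant that is easier to write down. The dimensions, the number of points, and the seeds are arbitrary assumptions.

```python
import math
import random


def random_projection(points, m, seed=0):
    """Map points from R^d to R^m with a Gaussian matrix scaled by 1/sqrt(m)."""
    rng = random.Random(seed)
    d = len(points[0])
    matrix = [[rng.gauss(0.0, 1.0) / math.sqrt(m) for _ in range(d)] for _ in range(m)]
    return [[sum(r * x for r, x in zip(row, p)) for row in matrix] for p in points]


def distance(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def distortion_range(points, projected):
    """Smallest and largest ratio of projected to original pairwise distance."""
    ratios = [
        distance(projected[i], projected[j]) / distance(points[i], points[j])
        for i in range(len(points))
        for j in range(i + 1, len(points))
    ]
    return min(ratios), max(ratios)


rng = random.Random(1)
points = [[rng.random() for _ in range(500)] for _ in range(10)]
projected = random_projection(points, m=150)
low, high = distortion_range(points, projected)
print(f"distance ratios between {low:.3f} and {high:.3f}")
```

The required m depends on the number of points and on γ, not on the original dimension d, which is what makes the technique attractive for very high-dimensional data.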

Connections

Theoretical

  Logic
  Information Theory / Compression
  Algorithmic Information Theory
  Mathematical Statistics

High dimensional phenomena

  Probability in Banach spaces
  Random graphs, graph theory
  Concentration

Algorithmics

  Optimization
  Approximate algorithms

Dealing with Dimensionality

$$\min_{f \in F} \sum_{i=1}^{n} \ell(f(X_i), Y_i) + \lambda \|f\|$$

Important questions

How to choose the norm?

What can be said about convergence?

How can this be implemented?
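
To make the regularized objective concrete, here is a small sketch under simplifying assumptions: F is the set of one-parameter linear functions f(x) = a·x, the loss is the squared loss, and the norm is |a|. The toy data, the grid search, and the function names are hypothetical; a real implementation would use a proper optimizer.

```python
def regularized_risk(a, data, lam):
    """Sum of squared losses of f(x) = a * x plus the penalty lam * |a|."""
    return sum((a * x - y) ** 2 for x, y in data) + lam * abs(a)


def minimize_on_grid(data, lam, lo=-3.0, hi=3.0, steps=6001):
    """Crude grid search over the slope a; adequate for a one-parameter sketch."""
    grid = [lo + (hi - lo) * k / (steps - 1) for k in range(steps)]
    return min(grid, key=lambda a: regularized_risk(a, data, lam))


# Toy data roughly following y = 2x (values are made up for illustration).
data = [(0.5, 1.1), (1.0, 1.9), (1.5, 3.2), (2.0, 3.9)]

for lam in (0.0, 5.0, 50.0):
    slope = minimize_on_grid(data, lam)
    print(f"lambda = {lam:5.1f} -> slope = {slope:.3f}")
```

As λ grows, the selected slope shrinks toward zero: the penalty trades data fit for a smaller norm.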


Choosing the norm

Typical approach: linear methods

$$f(x) = \sum_{i=1}^{d} \alpha_i x_i$$

L2 norm: by duality, everything can be expressed in terms of inner products

$$\sum_{i=1}^{d} x_i z_i$$

This allows generalization to reproducing kernel Hilbert spaces.

L1: ensures sparsity
only a few $\alpha_i$ will be non-zero
this makes it possible to deal with very high-dimensional representations
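
The contrast between the two norms can be sketched in the simplest setting, assuming an orthonormal design so that each coefficient is penalized independently of the others. In that case a squared L2 penalty rescales every least-squares coefficient, while an L1 penalty soft-thresholds it and sets small ones exactly to zero. The coefficient values and λ below are hypothetical.

```python
def ridge_shrink(b, lam):
    """Minimizer of 0.5 * (a - b)^2 + 0.5 * lam * a^2: uniform shrinkage."""
    return b / (1.0 + lam)


def soft_threshold(b, lam):
    """Minimizer of 0.5 * (a - b)^2 + lam * |a|: small coefficients become exactly zero."""
    if b > lam:
        return b - lam
    if b < -lam:
        return b + lam
    return 0.0


# Hypothetical least-squares coefficients for an orthonormal design.
b = [3.0, -0.2, 0.05, -1.5, 0.1, 0.0, 0.3, -0.4]
lam = 0.5

alpha_l2 = [ridge_shrink(b_i, lam) for b_i in b]
alpha_l1 = [soft_threshold(b_i, lam) for b_i in b]

print("non-zeros with L2 penalty:", sum(a != 0.0 for a in alpha_l2))
print("non-zeros with L1 penalty:", sum(a != 0.0 for a in alpha_l1))
```

Only the coefficients whose magnitude exceeds λ survive the L1 penalty, which is the sparsity effect described on the slide.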


L2: Kernels

Idea: replace inner products by positive definite kernels

Three reasons why this is interesting:

a good way to measure smoothness in high dimensions
makes it possible to deal with complex objects
an easy way to non-linearize

Also, interesting geometric properties (balls are effectively low-dimensional in L2(P))
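
A small sketch of why kernels are an easy way to non-linearize: for 2-D inputs, the degree-2 polynomial kernel ⟨x, z⟩² equals an ordinary inner product after an explicit quadratic feature map, so the kernel computes that inner product without building the features. The Gaussian kernel is included as a second standard positive definite kernel. Function names and the test points are assumptions for illustration.

```python
import math


def poly2_kernel(x, z):
    """Homogeneous degree-2 polynomial kernel k(x, z) = <x, z>^2."""
    return sum(a * b for a, b in zip(x, z)) ** 2


def poly2_features(x):
    """Explicit feature map for 2-D inputs: (x1^2, sqrt(2) * x1 * x2, x2^2)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)


def gaussian_kernel(x, z, sigma=1.0):
    """Gaussian (RBF) kernel; its feature space is infinite-dimensional."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


x, z = (1.0, 2.0), (3.0, -1.0)
k_direct = poly2_kernel(x, z)
k_explicit = sum(a * b for a, b in zip(poly2_features(x), poly2_features(z)))
print(k_direct, k_explicit, gaussian_kernel(x, z))
```

Any linear method written purely in terms of inner products can swap in such a kernel and become non-linear in the original inputs.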


L1: Boosting

Use sparsity to introduce many dimensions

Given a set H of interesting features, replace x by $(h(x))_{h \in H}$

Build a linear combination with small L1 norm

Effective algorithms (e.g. AdaBoost); geometry not significantly altered by replacing H by its convex hull
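
The following is a compact sketch of AdaBoost, assuming H is the set of decision stumps on a scalar input and using a small toy dataset that no single stump can classify correctly. The data, the threshold grid (which assumes integer-valued inputs), and the function names are hypothetical. The final classifier is a linear combination of a few features h(x) selected from H.

```python
import math


def stump_predict(x, threshold, sign):
    """Decision stump on a scalar input: predicts sign if x < threshold, else -sign."""
    return sign if x < threshold else -sign


def best_stump(xs, ys, weights, thresholds):
    """Return (weighted error, threshold, sign) of the stump with the smallest weighted error."""
    best = None
    for threshold in thresholds:
        for sign in (1, -1):
            err = sum(w for x, y, w in zip(xs, ys, weights)
                      if stump_predict(x, threshold, sign) != y)
            if best is None or err < best[0]:
                best = (err, threshold, sign)
    return best


def adaboost(xs, ys, rounds):
    """Run AdaBoost with stumps; returns a list of (alpha, threshold, sign)."""
    n = len(xs)
    weights = [1.0 / n] * n
    # Candidate split points; assumes integer-valued inputs.
    thresholds = [x - 0.5 for x in sorted(set(xs))] + [max(xs) + 0.5]
    ensemble = []
    for _ in range(rounds):
        err, threshold, sign = best_stump(xs, ys, weights, thresholds)
        err = min(max(err, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, threshold, sign))
        # Increase the weight of misclassified points, then renormalize.
        weights = [w * math.exp(-alpha * y * stump_predict(x, threshold, sign))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Sign of the weighted vote of the selected stumps."""
    score = sum(alpha * stump_predict(x, t, s) for alpha, t, s in ensemble)
    return 1 if score >= 0 else -1


xs = list(range(10))
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
ensemble = adaboost(xs, ys, rounds=3)
errors = sum(predict(ensemble, x) != y for x, y in zip(xs, ys))
print("stumps:", [(t, s) for _, t, s in ensemble], "training errors:", errors)
```

Each round picks the single most useful feature under the current weights, which is the "fast selection of the most important feature" that keeps the combination sparse.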


Convergence Properties

Concentration ensures

$$\sup_{f \in F} \frac{1}{n}\sum_{i=1}^{n} \ell(f(X_i), Y_i) - E\ell(f(X), Y) \approx E\left[\sup_{f \in F} \frac{1}{n}\sum_{i=1}^{n} \ell(f(X_i), Y_i) - E\ell(f(X), Y)\right]$$

Expectation term can be written as $E\left\|\sum X_i\right\|$

Key concept: geometry induced by F and the distribution; the "size" of F can be measured by metric entropy, Rademacher averages, majorizing measures, ...
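
One of the size measures mentioned, the Rademacher average, can be estimated by simulation. The sketch below assumes F is the class of threshold classifiers (values ±1, both polarities) on n distinct sorted points and estimates E_σ sup_f (1/n) Σ σ_i f(x_i) by Monte Carlo. It also prints Massart's finite-class bound sqrt(2 ln N / n) with N = 2(n + 1) for comparison. The class, the sample sizes, and the function name are assumptions for illustration.

```python
import math
import random


def empirical_rademacher(n, trials=500, seed=0):
    """Monte Carlo estimate of E_sigma[ sup_f (1/n) * sum_i sigma_i * f(x_i) ]
    for threshold classifiers with values +/-1 on n distinct sorted points."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        sigma = [rng.choice((-1, 1)) for _ in range(n)]
        full = sum(sigma)
        best = abs(full)  # constant classifiers (threshold outside the data)
        prefix = 0
        for s in sigma:
            prefix += s
            # f = +1 on the first k points and -1 on the rest, or the reverse.
            best = max(best, abs(2 * prefix - full))
        total += best / n
    return total / trials


for n in (20, 200):
    massart = math.sqrt(2.0 * math.log(2 * (n + 1)) / n)
    estimate = empirical_rademacher(n)
    print(f"n = {n:3d}: estimate = {estimate:.3f}, Massart bound = {massart:.3f}")
```

The estimate shrinks as n grows, reflecting that this simple class cannot correlate well with random signs on large samples.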


Computational Tricks

Typically use convex optimization (but the number of variables can be huge)

Kernels: fast way of computing inner products

Boosting: few relevant features, fast selection of the most important feature

Random Projections: cheap and easy way to reduce dimensionality


Outline

1 Motivation

2 How to do it?

3 Challenges

4 Applications


To be discussed next

A wide variety of applications

Future directions


Recommendation Engines


News Categorization


News


Autonomous Vehicles


Other Applications

Bioinformatics

Recognition problems (speech, handwritten text, images...)

Data Mining

Search Engines

...


Machine Learning Can Make You Rich

Mathematical Finance is the scientific topic of choice for getting a good job

Machine Learning is not too bad either these days!

Recruiting companies: big shots of the internet betting heavily on this technology

Google / Yahoo! / Microsoft
Also some start-ups: Pertinence

Money Prizes

DARPA Grand Challenge 2005: autonomous vehicle, $2M (http://www.grandchallenge.org)
Netflix Prize 2006: movie ratings, $1M (http://www.netflixprize.com)
Hutter Prize: compression of web documents, $50K (http://prize.hutter1.net/)


Future Directions

Practically

Massive databases: require specific algorithms
More and more structure: e.g. machine translation
Multimedia: e.g. video streams

Theoretically

Better connection between approximation/estimation/computation
Better understanding of model selection (e.g. cross-validation)
Formalization and understanding of feature selection
