Learning From Data: Lecture 4
Real Learning is Feasible
Real Learning vs. Verification
The Two Step Solution to Learning
Closer to Reality: Error and Noise
M. Magdon-Ismail, CSCI 4100/6100
recap: Verification
[Figure: a hypothesis h on the Age-Income plane with its out-of-sample error Eout(h); drawing the data set D gives the in-sample error Ein(h) = 2/9.]
Hoeffding: Eout(h) ≈ Ein(h) (with high probability):
$P[\,|E_{in}(h) - E_{out}(h)| > \epsilon\,] \le 2e^{-2\epsilon^2 N}$.
recap: Selection Bias Illustrated with Coins
Coin tossing example:
• If we toss one coin N times and get no heads, it's very surprising: $P = \frac{1}{2^N}$. We expect the coin is biased: P[heads] ≈ 0.
• Toss 70 coins and find one with no heads. Is that surprising? $P = 1 - \big(1 - \frac{1}{2^N}\big)^{70}$. Do we expect P[heads] ≈ 0 for the selected coin?
Similar to the “birthday problem”: among 30 people, two will likely share the same birthday.
• This is called selection bias. Selection bias is a very serious trap, for example in medical screening.
Search Causes Selection Bias
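
A quick Monte Carlo check of the two probabilities above (a minimal sketch; N = 10 tosses per coin and the trial count are illustrative choices, not from the slide):

    # Selection bias with coins: "no heads" is rare for one fair coin,
    # but common for the best of 70 coins.
    import random

    N, COINS, TRIALS = 10, 70, 20000

    def no_heads(n):
        # P[a fair coin shows no heads in n tosses] = 1/2^n
        return all(random.random() < 0.5 for _ in range(n))

    one = sum(no_heads(N) for _ in range(TRIALS)) / TRIALS
    many = sum(any(no_heads(N) for _ in range(COINS)) for _ in range(TRIALS)) / TRIALS

    print(f"one coin : simulated {one:.4f}, exact {0.5**N:.4f}")
    print(f"70 coins : simulated {many:.4f}, exact {1 - (1 - 0.5**N)**COINS:.4f}")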
Real Learning – Finite Learning Models
[Figure: hypotheses h1, h2, h3, ..., hM on the Age-Income plane, each with its out-of-sample error Eout(hm); drawing the data set D gives the in-sample errors Ein(h1) = 2/9, Ein(h2) = 0, Ein(h3) = 5/9, ..., Ein(hM) = 6/9.]
Pick the hypothesis with minimum Ein; will Eout be small?
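
A minimal sketch of this selection step (the threshold hypotheses and the tiny data set are hypothetical, only to make the finite model concrete):

    # Finite model: evaluate E_in on every h in H and return the minimizer g.
    def E_in(h, data):
        # fraction of training points the hypothesis misclassifies
        return sum(1 for x, y in data if h(x) != y) / len(data)

    # M = 11 one-dimensional threshold classifiers, a stand-in for a finite H
    H = [lambda x, t=t: +1 if x > t else -1 for t in range(-5, 6)]
    data = [(-3, -1), (-1, -1), (0, +1), (2, +1), (4, +1)]

    g = min(H, key=lambda h: E_in(h, data))  # pick the hypothesis with minimum E_in
    print("E_in(g) =", E_in(g, data))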
Hoeffding says that Ein(g) ≈ Eout(g) for Finite H
$P[\,|E_{in}(g) - E_{out}(g)| > \epsilon\,] \le 2|H|\,e^{-2\epsilon^2 N}$, for any $\epsilon > 0$.
$P[\,|E_{in}(g) - E_{out}(g)| \le \epsilon\,] \ge 1 - 2|H|\,e^{-2\epsilon^2 N}$, for any $\epsilon > 0$.
We don't care how g was obtained, as long as it is from H.
Some basic probability, for events A and B:
• Implication: if A ⇒ B (i.e., A ⊆ B), then P[A] ≤ P[B].
• Union bound: P[A or B] = P[A ∪ B] ≤ P[A] + P[B].
• Bayes' rule: $P[A|B] = \frac{P[B|A]\,P[A]}{P[B]}$.
Proof: Let $M = |H|$. The event "$|E_{in}(g) - E_{out}(g)| > \epsilon$" implies
"$|E_{in}(h_1) - E_{out}(h_1)| > \epsilon$" OR $\cdots$ OR "$|E_{in}(h_M) - E_{out}(h_M)| > \epsilon$".
So, by the implication and union bounds,
$P[\,|E_{in}(g) - E_{out}(g)| > \epsilon\,] \le P\big[\textstyle\bigcup_{m=1}^{M} |E_{in}(h_m) - E_{out}(h_m)| > \epsilon\big] \le \sum_{m=1}^{M} P[\,|E_{in}(h_m) - E_{out}(h_m)| > \epsilon\,] \le 2Me^{-2\epsilon^2 N}$.
(The last inequality holds because the Hoeffding bound applies to each individual summand, since each $h_m$ is fixed independently of the data.)
Interpreting the Hoeffding Bound for Finite |H|
$P[\,|E_{in}(g) - E_{out}(g)| > \epsilon\,] \le 2|H|\,e^{-2\epsilon^2 N}$, for any $\epsilon > 0$.
$P[\,|E_{in}(g) - E_{out}(g)| \le \epsilon\,] \ge 1 - 2|H|\,e^{-2\epsilon^2 N}$, for any $\epsilon > 0$.
Theorem. With probability at least $1 - \delta$,
$E_{out}(g) \le E_{in}(g) + \sqrt{\tfrac{1}{2N}\log\tfrac{2|H|}{\delta}}$.
We don't care how g was obtained, as long as g ∈ H.
Proof: Let $\delta = 2|H|e^{-2\epsilon^2 N}$. Then
$P[\,|E_{in}(g) - E_{out}(g)| \le \epsilon\,] \ge 1 - \delta$.
In words, with probability at least $1 - \delta$,
$|E_{in}(g) - E_{out}(g)| \le \epsilon$,
which implies
$E_{out}(g) \le E_{in}(g) + \epsilon$.
From the definition of $\delta$, solving for $\epsilon$ gives
$\epsilon = \sqrt{\tfrac{1}{2N}\log\tfrac{2|H|}{\delta}}$.
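
The theorem is a one-line computation. A minimal sketch (the values of N, |H|, and δ below are illustrative):

    import math

    def error_bar(N, H_size, delta):
        # epsilon = sqrt( (1/(2N)) * log(2|H|/delta) ); with probability >= 1 - delta,
        # E_out(g) <= E_in(g) + epsilon for any g from a finite H of size H_size
        return math.sqrt(math.log(2 * H_size / delta) / (2 * N))

    # e.g. N = 1000 examples, |H| = 100 hypotheses, 95% confidence (delta = 0.05)
    print(error_bar(1000, 100, 0.05))  # about 0.064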
Ein Reaches Outside to Eout when |H| is Small
$E_{out}(g) \le E_{in}(g) + \sqrt{\tfrac{1}{2N}\log\tfrac{2|H|}{\delta}}$.
If N ≫ ln |H|, then Eout(g) ≈ Ein(g).
• Does not depend on X, P(x), f, or how g is found.
• Only requires that P(x) generate the data points independently, and the test point as well.
What about Eout ≈ 0?
The 2 Step Approach to Getting Eout ≈ 0:
(1) Eout(g) ≈ Ein(g).
(2) Ein(g) ≈ 0.
Together, these ensure Eout ≈ 0.
How can we verify (1), since we do not know Eout? We can't; we must ensure it theoretically, via Hoeffding.
We can ensure (2) algorithmically (for example, with the PLA), modulo that we can guarantee (1).
There is a tradeoff:
• Small |H| ⇒ Ein ≈ Eout.
• Large |H| ⇒ Ein ≈ 0 is more likely.
[Figure: Error vs. |H|. The in-sample error decreases as |H| grows while the model-complexity penalty $\sqrt{\tfrac{1}{2N}\log\tfrac{2|H|}{\delta}}$ increases; the bound on the out-of-sample error is minimized at an intermediate size |H|*.]
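
The tradeoff can be seen numerically. In this sketch, the in-sample errors of the growing models are made-up numbers; only the complexity penalty comes from the theorem:

    import math

    N, delta = 1000, 0.05
    # hypothetical E_in values for nested models of increasing size |H|
    E_in = {2: 0.30, 8: 0.20, 32: 0.12, 128: 0.08, 512: 0.06, 2048: 0.058}

    def bound(H_size):
        # E_in plus the model-complexity penalty from the Hoeffding error bar
        return E_in[H_size] + math.sqrt(math.log(2 * H_size / delta) / (2 * N))

    for m in E_in:
        print(f"|H| = {m:4d}: bound on E_out = {bound(m):.3f}")
    print("best |H|* =", min(E_in, key=bound))  # the sweet spot in between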
Feasibility of Learning (Finite Models)
• No Free Lunch: we can't know anything outside D for sure.
• We can "learn" with high probability if D is generated i.i.d. from P(x): Eout ≈ Ein (Ein can reach outside the data set to Eout).
• We want Eout ≈ 0.
• The two step solution. We trade Eout ≈ 0 for two goals:
(i) Eout ≈ Ein;
(ii) Ein ≈ 0.
We know Ein, not Eout, but we can ensure (i) if |H| is small.
This is a big step!
• What about infinite H, like the perceptron?
“Complex” Target Functions are Harder to Learn
What happened to the “difficulty” (complexity) of f?
• Simple f ⇒ can use a small H to get Ein ≈ 0 (need smaller N).
• Complex f ⇒ need a large H to get Ein ≈ 0 (need larger N).
Revising the Learning Problem – Adding in Probability
[Figure: the learning setup with probability added. The unknown target function f : X → Y together with the unknown input distribution P(x) generates the training examples (x1, y1), (x2, y2), ..., (xN, yN), with yn = f(xn); the learning algorithm A uses the data to select a final hypothesis g from the hypothesis set H, with the goal g(x) ≈ f(x).]
Error and Noise
Error Measure: how to quantify that h ≈ f.
Noise: yn ≠ f(xn).
Finger Print Recognition
f(x) = +1: you; f(x) = −1: intruder.
Two types of error.
             f = +1          f = −1
h = +1       no error        false accept
h = −1       false reject    no error
In any application you need to think about how to penalize each type of error.
Supermarket (a false reject annoys a paying customer):

             f = +1    f = −1
h = +1         0          1
h = −1        10          0

CIA (a false accept lets an intruder in):

             f = +1    f = −1
h = +1         0        1000
h = −1         1          0
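
One way to fold such a cost matrix into the error computation; a minimal sketch (the weighted-average form and the names h and data are assumptions for illustration):

    def weighted_error(h, data, c_false_accept, c_false_reject):
        # average cost over the data, weighting each error type per the matrix
        cost = 0.0
        for x, y in data:
            if h(x) == +1 and y == -1:
                cost += c_false_accept   # intruder accepted
            elif h(x) == -1 and y == +1:
                cost += c_false_reject   # true person rejected
        return cost / len(data)

    # supermarket: weighted_error(h, data, 1, 10)
    # CIA:         weighted_error(h, data, 1000, 1)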
Take away: the error measure should be specified by the user.
If that is not possible, choose one that is:
– plausible (conceptually appealing), or
– friendly (practically appealing).
Almost All Error Measures are Pointwise
Compare h and f on individual points x using a pointwise error e(h(x), f(x)):
Binary error: $e(h(x), f(x)) = [\![\,h(x) \ne f(x)\,]\!]$ (classification).
Squared error: $e(h(x), f(x)) = (h(x) - f(x))^2$ (regression).
In-sample error:
$E_{in}(h) = \frac{1}{N}\sum_{n=1}^{N} e(h(x_n), f(x_n))$.
Out-of-sample error:
$E_{out}(h) = \mathbb{E}_x[\,e(h(x), f(x))\,]$.
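
These definitions transcribe directly into code. A minimal sketch (estimating Eout by averaging over a fresh sample from P(x) is an assumption about how one would approximate the expectation):

    def binary_error(hx, fx):
        return 1 if hx != fx else 0      # [[h(x) != f(x)]], for classification

    def squared_error(hx, fx):
        return (hx - fx) ** 2            # for regression

    def E_in(h, f, X, err):
        # in-sample error: average pointwise error over the N training inputs
        return sum(err(h(x), f(x)) for x in X) / len(X)

    def E_out_estimate(h, f, fresh_X, err):
        # Monte Carlo estimate of E_x[ e(h(x), f(x)) ] on points drawn from P(x)
        return sum(err(h(x), f(x)) for x in fresh_X) / len(fresh_X)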
Noisy Targets
age 32 years
gender male
salary 40,000
debt 26,000
years in job 1 year
years at home 3 years
. . . . . .
Approve for credit?
Consider two customers with the same credit data.
They can have different behaviors.
The target "function" is not a deterministic function but a stochastic one:
'f(x)' = P(y | x).
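
A minimal sketch of such a stochastic target (the logistic form chosen for P(y = +1 | x) is purely an illustrative assumption):

    import math, random

    def p_plus(x):
        # hypothetical target distribution: P(y = +1 | x), here a logistic in x
        return 1.0 / (1.0 + math.exp(-x))

    def sample_label(x):
        # two customers with identical x can still receive different labels y
        return +1 if random.random() < p_plus(x) else -1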
Learning Setup with Error Measure and Noisy Targets
[Figure: the final learning setup. The unknown target distribution P(y | x) (target function f plus noise) together with the unknown input distribution P(x) generates the training examples (x1, y1), (x2, y2), ..., (xN, yN), with yn ∼ P(y | xn); the learning algorithm A, guided by the error measure, selects a final hypothesis g from the hypothesis set H, with the goal g(x) ≈ f(x).]