
Page 1:

Machine Learning

Chapter 2. Concept Learning and The General-to-specific Ordering

Gun Ho Lee

Soongsil University, Seoul

Page 2:

Outline

- Learning from examples
- General-to-specific ordering over hypotheses
- Version spaces and the candidate elimination algorithm
- Picking new examples
- The need for inductive bias

Note: a simple approach that assumes no noise; it illustrates the key concepts.

Page 3:

A Concept

Examples of concepts: "birds", "car", "situations in which I should study more in order to pass the exam".

A concept is:
- some subset of objects or events defined over a larger set, or
- a boolean-valued function defined over this larger set.

The concept "birds" is the subset of animals that constitutes birds.

Page 4:

Adapted by Doug Downey from: Bryan Pardo, EECS 349 Fall 2007

A Concept

Let there be a set of objects, X.

X = { 백구, 야옹이, 도그, 강세이 }

A concept C is:
- a subset of X: C = dogs = { 백구, 도그, 강세이 }
- a function that returns 1 only for elements in the concept: C(백구) = 1, C(야옹이) = 0
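The two views of a concept, subset versus boolean-valued function, line up directly in code. A minimal Python sketch using the pet names above (the variable names are mine):

X = {"백구", "야옹이", "도그", "강세이"}   # the set of objects
dogs = {"백구", "도그", "강세이"}          # the concept as a subset of X

def C(x):
    # The same concept as a boolean-valued (0/1) function over X.
    return 1 if x in dogs else 0

assert C("백구") == 1 and C("야옹이") == 0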

Page 5:

Instance Representation

Represent an object (or instance) as an n-tuple of attributes.

Example: Days (6-tuples)

Instance  Sky    AirTemp  Humidity  Wind    Water  Forecast  EnjoySport
1         Sunny  Warm     Normal    Strong  Warm   Same      Yes
2         Sunny  Warm     High      Strong  Warm   Same      Yes
3         Rainy  Cold     High      Strong  Warm   Change    No
4         Sunny  Warm     High      Strong  Cool   Change    Yes

Page 6:

Concept Learning

Learning: inducing general functions from specific training examples.

Concept learning:
- Acquiring the definition of a general category given a sample of positive and negative training examples of the category.
- Inferring a boolean-valued function from training examples of its input and output.

Page 7:

A Concept Learning Task

Target concept EnjoySport: "days on which 박지성 enjoys water sport".

Hypothesis: a vector of 6 constraints, specifying the values of the six attributes Sky, AirTemp, Humidity, Wind, Water, Forecast.

For example, <?, Cold, High, ?, ?, ?> is the hypothesis that 박지성 enjoys water sports on days when AirTemp is Cold and Humidity is High, regardless of the other attributes.

Page 8:

Representing Hypotheses

Many possible representations. Here, h is a conjunction of constraints on attributes.

Each constraint can be:
- a specific value (e.g., Water = Warm)
- don't care (e.g., Water = ?)
- no value allowed (e.g., Water = Φ)

For example (Sky, AirTemp, Humidity, Wind, Water, Forecast):

<Sunny, ?, ?, Strong, ?, Same>
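Whether a hypothesis of this form covers an instance can be tested attribute by attribute. A minimal sketch, with "?" for don't-care and None standing in for Φ (this encoding is an assumption of the sketch):

def matches(h, x):
    # h covers x iff every constraint is non-Φ and is either "?" or equal
    # to the corresponding attribute value of x.
    return all(hc is not None and (hc == "?" or hc == xc)
               for hc, xc in zip(h, x))

h = ("Sunny", "?", "?", "Strong", "?", "Same")
x = ("Sunny", "Warm", "Normal", "Strong", "Warm", "Same")
print(matches(h, x))  # True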

Page 9:

Task: Learn a hypothesis from a dataset

Page 10:

Example Concept Function

"Days on which my friend 박지성 enjoys his favorite water sport"

Sky    Temp  Humid   Wind    Water  Forecast  C(x)
sunny  warm  normal  strong  warm   same      1
sunny  warm  high    strong  warm   same      1
rainy  cold  high    strong  warm   change    0
sunny  warm  high    strong  cool   change    1

(The attribute values are the INPUT; C(x) is the OUTPUT.)

Page 11:

The Learning Task

Given:
- Hypothesis space H: conjunctions of constraints on attributes, e.g. the conjunction of literals <Sunny, ?, ?, Strong, ?, Same>
- Target concept c, e.g. EnjoySport: X → {0, 1}
- Instances X: the set of items over which the concept is defined, e.g. days described by the attributes Sky, Temp, Humidity, Wind, Water, Forecast
- Training examples (positive/negative): <x, c(x)>
- Training set D: positive and negative examples of the target function: <x1, c(x1)>, ..., <xn, c(xn)>

Determine:
- A hypothesis h in H such that h(x) = c(x) for all x in X

Page 12:

Assumption 1

We will explore the space of all conjunctions. We assume the target concept falls within this space.

[Figure: the target concept c lies inside the hypothesis space H]

Page 13:

Assumption 2 (Inductive learning hypothesis)

A hypothesis close to the target concept c obtained after seeing many training examples will result in high accuracy on the set of unobserved examples.

[Figure: if hypothesis h is good on the training set D, it is also good on the complement set D']

Page 14:

Inductive Learning Hypothesis

The learning task is to determine an h identical to c over the entire set of instances X. But the only information about c is its value over D (the training set).

Inductive learning algorithms can at best guarantee that the induced h fits c over D.

Inductive learning hypothesis: any good hypothesis over a sufficiently large set of training examples will also approximate the target function well over unseen examples.

Page 15:

Concept Learning as Search

Search:
- Find a hypothesis that best fits the training examples
- Efficient search in the hypothesis space (finite/infinite)

Search space in EnjoySport (<Sky, AirTemp, Humidity, Wind, Water, Forecast>):
- Sky has 3 values (Sunny, Cloudy, and Rainy)
- Temp has 2 (Warm and Cold)
- Humidity has 2 (Normal and High)
- Wind has 2 (Strong and Weak)
- Water has 2 (Warm and Cool)
- Forecast has 2 (Same and Change)

So there are 3x2x2x2x2x2 = 96 distinct instances and 5x4x4x4x4x4 = 5120 syntactically distinct hypotheses within H (counting Φ and ? in addition to the attribute values), as checked below.
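These counts can be verified mechanically; a small sketch (the dictionary simply restates the attribute tally above):

from math import prod

values = {"Sky": 3, "AirTemp": 2, "Humidity": 2,
          "Wind": 2, "Water": 2, "Forecast": 2}

n_instances = prod(values.values())                 # 3*2*2*2*2*2
n_syntactic = prod(v + 2 for v in values.values())  # each attribute also allows ? and Φ

print(n_instances)  # 96
print(n_syntactic)  # 5120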

Page 16:

Adapted by Doug Downey from: Bryan Pardo, EECS 349 Fall 2007

Concept Generality

A concept P is more general than or equal to another concept Q iff the set of instances represented by P includes the set of instances represented by Q.

[Figure: concept-generality tree with Mammal at the top; Canine and Pig below it; Wolf and Dog below Canine; and the individuals White_fang, Lassie, Scooby_doo, Wilbur, Charlotte at the leaves]

Page 17:

Adapted by Doug Downey from: Bryan Pardo, EECS 349 Fall 2007

General to Specific Order

Consider two hypotheses:
- h1 = <Sunny, ?, ?, Strong, ?, ?>
- h2 = <Sunny, ?, ?, ?, ?, ?>

Definition: hj is more general than or equal to hk (hj ≥g hk) iff

(∀x ∈ X) [(hk(x) = 1) → (hj(x) = 1)]

This imposes a partial order on the hypothesis space.
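For conjunctive hypotheses the ordering can be tested syntactically, attribute by attribute. A minimal sketch (the function name and the "?"/None encoding are mine):

def more_general_or_equal(hj, hk):
    # hj >=_g hk iff every instance satisfying hk also satisfies hj.
    def covers(gc, sc):
        # Constraint gc admits everything sc admits if gc is "?",
        # if sc is None (Φ admits nothing), or if the two values agree.
        return gc == "?" or sc is None or gc == sc
    return all(covers(gc, sc) for gc, sc in zip(hj, hk))

h1 = ("Sunny", "?", "?", "Strong", "?", "?")
h2 = ("Sunny", "?", "?", "?", "?", "?")
print(more_general_or_equal(h2, h1))  # True: h2 is more general than h1
print(more_general_or_equal(h1, h2))  # False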

Page 18:

Instance, Hypotheses, and More-General-Than

The Most General Hypothesis: <?, ?, ?, ?, ?, ?>
The Most Specific Hypothesis: <Ø, Ø, Ø, Ø, Ø, Ø>

Adapted by Doug Downey from: Bryan Pardo, EECS 349 Fall 2007

x1 = <Sunny, Warm, High, Strong, Cool, Same>
x2 = <Sunny, Warm, High, Light, Warm, Same>

h1 = <Sunny, ?, ?, Strong, ?, ?>
h2 = <Sunny, ?, ?, ?, ?, ?>
h3 = <Sunny, ?, ?, ?, Cool, ?>

[Figure: instances x1, x2 beside hypotheses h1, h2, h3 arranged from specific to general; h2 ≥ h1 and h2 ≥ h3]

Page 19:

Find-S Algorithm (Finding a Maximally Specific Hypothesis)

1. Initialize h to the most specific hypothesis in H
2. For each positive training instance x:
   For each attribute constraint ai in h:
     if the constraint ai in h is satisfied by x, then do nothing;
     else replace ai in h by the next more general constraint that is satisfied by x
3. Output hypothesis h
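A runnable sketch of this loop for the EnjoySport representation, using the same "?"/None encoding as the matches() sketch (None stands in for Ø; D restates the training table from page 5):

def find_s(examples):
    h = [None] * len(examples[0][0])      # most specific hypothesis <Ø,...,Ø>
    for x, label in examples:
        if label != "Yes":                # Find-S ignores negative examples
            continue
        for i, (ai, xi) in enumerate(zip(h, x)):
            if ai is None:
                h[i] = xi                 # first positive example: take its value
            elif ai != xi:
                h[i] = "?"                # minimal generalization covering x
    return h

D = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), "Yes"),
    (("Sunny", "Warm", "High",   "Strong", "Warm", "Same"), "Yes"),
    (("Rainy", "Cold", "High",   "Strong", "Warm", "Change"), "No"),
    (("Sunny", "Warm", "High",   "Strong", "Cool", "Change"), "Yes"),
]
print(find_s(D))  # ['Sunny', 'Warm', '?', 'Strong', '?', '?']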

Page 20:

Hypothesis Space Search by Find-S

h0 = <Ø, Ø, Ø, Ø, Ø, Ø>

x1 = <Sunny, Warm, Normal, Strong, Warm, Same> +  →  h1 = <Sunny, Warm, Normal, Strong, Warm, Same>
x2 = <Sunny, Warm, High, Strong, Warm, Same> +    →  h2 = <Sunny, Warm, ?, Strong, Warm, Same>
x3 = <Rainy, Cold, High, Strong, Warm, Change> -  →  h3 = h2 (negative example: no change)
x4 = <Sunny, Warm, High, Strong, Cool, Change> +  →  h4 = <Sunny, Warm, ?, Strong, ?, ?>

(The search moves from the specific end of the hypothesis space toward the general end.)

Page 21:

Properties of Find-S

- Ignores every negative example (no revision to h is required in response to negative examples).
- Guaranteed to output the most specific hypothesis consistent with the positive training examples (for a conjunctive hypothesis space).
- The final h is also consistent with the negative examples, provided the target c is in H and there are no errors in D.

Page 22:

Weaknesses of Find-S

- Has the learner converged to the correct target concept? There is no way to know whether the solution is unique.
- Why prefer the most specific hypothesis? How about the most general hypothesis?
- Are the training examples consistent? Training sets containing errors or noise can severely mislead Find-S.
- What if there are several maximally specific consistent hypotheses? Find-S cannot backtrack to explore a different branch of the partial ordering.

Page 23:

Partial order of hypotheses I

Page 24:

Partial order of hypotheses II

Page 25:

Partial order of hypotheses III

Page 26:

The space of hypotheses

Page 27:

The space of hypotheses I

Page 28:

The space of hypotheses II

Page 29:

The space of hypotheses III

Page 30:

A hypothesis h is consistent with a set of training examples D of target concept c if and only if h(x) = c(x) for each training example <x, c(x)> in D.

Consistent(h, D) ≡ (∀<x, c(x)> ∈ D) h(x) = c(x)

Page 31:

Version Space

The version space, VS_{H,D}, with respect to hypothesis space H and training examples D, is the subset of hypotheses from H consistent with all training examples in D.

VS_{H,D} ≡ {h ∈ H | Consistent(h, D)}

Page 32:

Version Spaces

A hypothesis h is consistent with a set of training examples D of target concept c if and only if h(x) = c(x) for each training example <x, c(x)> in D.

Consistent(h, D) ≡ (∀<x, c(x)> ∈ D) h(x) = c(x)

The version space, VS_{H,D}, with respect to hypothesis space H and training examples D, is the subset of hypotheses from H consistent with all training examples in D.

VS_{H,D} ≡ {h ∈ H | Consistent(h, D)}

Page 33:

The List-Then-Eliminate Algorithm

1. VersionSpace ← a list containing every hypothesis in H
2. For each training example <x, c(x)>:
   remove from VersionSpace any hypothesis h for which h(x) ≠ c(x)
3. Output the list of hypotheses in VersionSpace
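For a hypothesis space as small as EnjoySport's, this can be run directly. A minimal sketch, reusing matches() and the training set D from the earlier sketches (hypotheses containing Φ are not enumerated, since they reject every instance):

from itertools import product

def list_then_eliminate(H, examples):
    version_space = list(H)               # start from every hypothesis in H
    for x, label in examples:
        version_space = [h for h in version_space
                         if ("Yes" if matches(h, x) else "No") == label]
    return version_space

attr_values = [("Sunny", "Cloudy", "Rainy"), ("Warm", "Cold"),
               ("Normal", "High"), ("Strong", "Weak"),
               ("Warm", "Cool"), ("Same", "Change")]
H = product(*[("?",) + v for v in attr_values])
print(len(list_then_eliminate(H, D)))     # 6 consistent hypotheses remain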

Page 34:

Drawbacks of List-Then-Eliminate

- The algorithm requires exhaustively enumerating all hypotheses in H: an unrealistic approach (full search)!
- If insufficient training data is available, the algorithm outputs a huge set of hypotheses consistent with the observed data: too little data produces far too many surviving hypotheses!

Page 35:

Adapted by Doug Downey from: Bryan Pardo, EECS 349 Fall 2007

Example Version Space

S: { <Sunny, Warm, ?, Strong, ?, ?> }

<Sunny, ?, ?, Strong, ?, ?>   <Sunny, Warm, ?, ?, ?, ?>   <?, Warm, ?, Strong, ?, ?>

G: { <Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?> }

x1 = <Sunny, Warm, Normal, Strong, Warm, Same> +
x2 = <Sunny, Warm, High, Strong, Warm, Same> +
x3 = <Rainy, Cold, High, Strong, Warm, Change> -
x4 = <Sunny, Warm, High, Strong, Cool, Change> +

Page 36:

Representing Version Spaces

The general boundary, G, of version space VS_{H,D} is the set of its maximally general members.

The specific boundary, S, of version space VS_{H,D} is the set of its maximally specific members.

Every member of the version space lies between these boundaries:

VS_{H,D} = {h ∈ H | (∃s ∈ S)(∃g ∈ G) (g ≥ h ≥ s)}

where x ≥ y means x is more general than or equal to y.
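The boundary sets make membership testing cheap. A minimal sketch, reusing more_general_or_equal() from the earlier sketch (S and G here are the final EnjoySport boundaries):

def in_version_space(h, S, G):
    # h is in the version space iff it lies between some s in S and some g in G.
    return any(any(more_general_or_equal(g, h) and more_general_or_equal(h, s)
                   for g in G) for s in S)

S = [("Sunny", "Warm", "?", "Strong", "?", "?")]
G = [("Sunny", "?", "?", "?", "?", "?"), ("?", "Warm", "?", "?", "?", "?")]
print(in_version_space(("Sunny", "?", "?", "Strong", "?", "?"), S, G))  # True
print(in_version_space(("?", "?", "?", "Strong", "?", "?"), S, G))      # False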

Page 37:

Relevant bounds

Page 38:

Basic Idea of Candidate Elimination Algorithm

1. Initialize G to the set of maximally general hypotheses in H
2. Initialize S to the set of maximally specific hypotheses in H
3. For each training example x:
   - If x is positive: generalize S if necessary
   - If x is negative: specialize G if necessary

Page 39:

Candidate Elimination Algorithm (1/2)

G ← maximally general hypotheses in H
S ← maximally specific hypotheses in H

For each training example d, do:
If d is a positive example:
- Remove from G any hypothesis that is inconsistent with d.
- For each hypothesis s in S that is inconsistent with d:
  • remove s from S;
  • add to S all minimal generalizations h of s such that (1) h is consistent with d, and (2) some member of G is more general than h;
  • remove from S any hypothesis that is more general than another hypothesis in S (this maintains the specific boundary).

[Figure: hypotheses inconsistent with d are removed from G; minimal generalizations are added to S]

Page 40:

Candidate Elimination Algorithm (2/2)

If d is a negative example:
- Remove from S any hypothesis that is inconsistent with d.
- For each hypothesis g in G that is inconsistent with d:
  • remove g from G;
  • add to G all minimal specializations h of g such that (1) h is consistent with d, and (2) some member of S is more specific than h;
  • remove from G any hypothesis that is less general than another hypothesis in G (this maintains the general boundary).

[Figure: hypotheses inconsistent with d are removed from S; minimal specializations are added to G]

A runnable sketch of both cases follows.
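A minimal sketch for the conjunctive EnjoySport representation, reusing matches(), more_general_or_equal(), D, and attr_values from the earlier sketches. For conjunctive H the S boundary stays a single hypothesis, which simplifies the positive case:

def min_generalize(s, x):
    # The unique minimal generalization of s that covers instance x.
    return tuple(xi if si is None else (si if si == xi else "?")
                 for si, xi in zip(s, x))

def min_specializations(g, x, domains):
    # All minimal specializations of g that exclude instance x:
    # constrain one "?" to some value different from x's.
    return [g[:i] + (v,) + g[i + 1:]
            for i, (gi, xi) in enumerate(zip(g, x)) if gi == "?"
            for v in domains[i] if v != xi]

def candidate_elimination(examples, domains):
    S = [tuple([None] * len(domains))]    # { <Ø,...,Ø> }
    G = [tuple(["?"] * len(domains))]     # { <?,...,?> }
    for x, label in examples:
        if label == "Yes":
            G = [g for g in G if matches(g, x)]
            S = [min_generalize(s, x) if not matches(s, x) else s for s in S]
            S = [s for s in S if any(more_general_or_equal(g, s) for g in G)]
        else:
            S = [s for s in S if not matches(s, x)]
            new_G = []
            for g in G:
                if matches(g, x):         # g wrongly covers the negative example
                    new_G += [h for h in min_specializations(g, x, domains)
                              if any(more_general_or_equal(h, s) for s in S)]
                else:
                    new_G.append(g)
            G = [g for g in new_G         # keep only maximally general members
                 if not any(h != g and more_general_or_equal(h, g) for h in new_G)]
    return S, G

S, G = candidate_elimination(D, attr_values)
print(S)  # [('Sunny', 'Warm', '?', 'Strong', '?', '?')]
print(G)  # [('Sunny', '?', '?', '?', '?', '?'), ('?', 'Warm', '?', '?', '?', '?')]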

Page 41:

Candidate-Elimination Algorithm

When does this halt?
- If S and G are both singleton sets, then:
  • if they are identical, output the value and halt;
  • if they are different, the training cases were inconsistent; output this and halt.
- Else continue accepting new training examples.

Page 42:

Adapted by Doug Downey from: Bryan Pardo, EECS 349 Fall 2007

Example: Candidate Elimination

- Instance space: integer points in the x,y plane with 0 ≤ x, y ≤ 10
- Hypothesis space: rectangles, i.e. hypotheses of the form a ≤ x ≤ b, c ≤ y ≤ d

[Figure: a rectangle hypothesis bounded by a and b on the x-axis and by c and d on the y-axis]

Page 43:

Example: Candidate Elimination

- examples = {}
- G = { (a, b, c, d) } = { (0, 10, 0, 10) }
- S = { ø }

[Figure: before any examples, G is the whole 10 x 10 domain and S is empty]

Page 44:

Example: Candidate Elimination

- examples = { ((3,4), +) }
- G = { (0, 10, 0, 10) }
- S = { (3, 3, 4, 4) }

[Figure: G is still the whole domain; S is the degenerate rectangle containing only the positive point (3,4)]
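For rectangle hypotheses the positive-example update of S has a simple closed form. A minimal sketch (the second positive point (5, 6) is hypothetical):

def covers(r, p):
    # Rectangle (a, b, c, d) covers point (x, y) iff a <= x <= b and c <= y <= d.
    (a, b, c, d), (x, y) = r, p
    return a <= x <= b and c <= y <= d

def grow_to_include(r, p):
    # Smallest rectangle containing both r and the positive point p.
    (a, b, c, d), (x, y) = r, p
    return (min(a, x), max(b, x), min(c, y), max(d, y))

S = [(3, 3, 4, 4)]            # after the first positive example (3, 4)
p = (5, 6)                    # a hypothetical second positive example
S = [s if covers(s, p) else grow_to_include(s, p) for s in S]
print(S)                      # [(3, 5, 4, 6)]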

Page 45:

Example Trace

First initialize the S and G sets:

S0: { <Ø, Ø, Ø, Ø, Ø, Ø> }
G0: { <?, ?, ?, ?, ?, ?> }

Page 46:

Given 1st Example

The first example is positive: <<Sunny, Warm, Normal, Strong, Warm, Same>, Yes>

S0: { <Ø, Ø, Ø, Ø, Ø, Ø> }
S1: { <Sunny, Warm, Normal, Strong, Warm, Same> }
G0, G1: { <?, ?, ?, ?, ?, ?> }

Positive-example step: every hypothesis in G is consistent with d, so nothing is removed from G; the hypothesis in S is inconsistent with d, so it is replaced by its minimal generalization h = <Sunny, Warm, Normal, Strong, Warm, Same>, which is consistent with d and lies below a member of G. Is (g ≥ h ≥ s)?

Page 47:

Given 2nd Example

The second example is positive: <<Sunny, Warm, High, Strong, Warm, Same>, Yes>

S1: { <Sunny, Warm, Normal, Strong, Warm, Same> }
S2: { <Sunny, Warm, ?, Strong, Warm, Same> }
G1, G2: { <?, ?, ?, ?, ?, ?> }

Positive-example step: S1's hypothesis is inconsistent with d, so it is replaced by its minimal generalization h = <Sunny, Warm, ?, Strong, Warm, Same>, which is consistent with d and lies below a member of G. Is (g ≥ h ≥ s)?

Page 48:

Given 3rd Example

The third example is negative: <<Rainy, Cold, High, Strong, Warm, Change>, No>

S2, S3: { <Sunny, Warm, ?, Strong, Warm, Same> }
G2: { <?, ?, ?, ?, ?, ?> }
G3: { <Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?>, <?, ?, ?, ?, ?, Same> }

Negative-example step: remove from S any hypothesis inconsistent with d; for each hypothesis g in G inconsistent with d, remove g from G and add all minimal specializations h of g such that (1) h is consistent with d and (2) some member of S is more specific than h; remove from G any hypothesis less general than another member of G (this maintains the general boundary). Is (g ≥ h ≥ s)?

Page 49:

Given 3rd Example

The third example is negative: <<Rainy, Cold, High, Strong, Warm, Change>, No>

S2, S3: { <Sunny, Warm, ?, Strong, Warm, Same> }
G2: { <?, ?, ?, ?, ?, ?> }

Candidate minimal specializations h: <Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?>, <?, ?, ?, ?, ?, Same>, <?, ?, Normal, ?, ?, ?>, <?, ?, ?, Strong, ?, ?>, <?, ?, ?, ?, Warm, ?>

G3: { <Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?>, <?, ?, ?, ?, ?, Same> }: why are these admissible, while <?, ?, Normal, ?, ?, ?>, <?, ?, ?, Strong, ?, ?>, <?, ?, ?, ?, Warm, ?> are not?

Page 50:

Given 3rd Example

If d is a negative example: remove from S any hypothesis that is inconsistent with d.

Here the hypothesis in S does not cover d, so it agrees with the negative label: there is no hypothesis to delete from S!!

Page 51:

Given 3rd Example

If d is a negative example, then for each hypothesis g in G that is inconsistent with d:
• remove g from G;
• add to G all minimal specializations h of g such that (1) h is consistent with d, and (2) some member of S is more specific than h.

Page 52:

Given 3rd Example

Why is <?, ?, Normal, ?, ?, ?> not included in G?

Training example d: <<Rainy, Cold, High, Strong, Warm, Change>, No>
S2, S3: { <Sunny, Warm, ?, Strong, Warm, Same> }

For h = <?, ?, Normal, ?, ?, ?> we have h(x) = no on d, matching the label no, so h is consistent with d.

But is <?, ?, Normal, ?, ?, ?> ≥ <Sunny, Warm, ?, Strong, Warm, Same>? No: the member of S admits days with Humidity = High, while h requires Normal, so h is not more general than s and the condition below fails.

VS_{H,D} = {h ∈ H | (∃s ∈ S)(∃g ∈ G) (g ≥ h ≥ s)}

where x ≥ y means x is more general than or equal to y.

Page 53:

Given 3rd Example

The third example is negative: <<Rainy, Cold, High, Strong, Warm, Change>, No>

S2, S3: { <Sunny, Warm, ?, Strong, Warm, Same> }
G2: { <?, ?, ?, ?, ?, ?> }
G3: { <Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?>, <?, ?, ?, ?, ?, Same> }

Negative-example step, as on page 48: each inconsistent g in G is replaced by its minimal specializations that are consistent with d and lie above some member of S, and G is kept maximally general.

Page 54:

Given 4th Example

The fourth example is positive: <<Sunny, Warm, High, Strong, Cool, Change>, Yes>

S3: { <Sunny, Warm, ?, Strong, Warm, Same> }
S4: { <Sunny, Warm, ?, Strong, ?, ?> }

G3: { <Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?>, <?, ?, ?, ?, ?, Same> }
G4: { <Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?> }

(<?, ?, ?, ?, ?, Same> is removed from G because it is inconsistent with this positive example.) Is (g ≥ h ≥ s)?

Page 55:

Candidate Elimination Algorithm: Example Trace

S0: { <Ø, Ø, Ø, Ø, Ø, Ø> }
G0: { <?, ?, ?, ?, ?, ?> }

d1: <Sunny, Warm, Normal, Strong, Warm, Same>, Yes
S1: { <Sunny, Warm, Normal, Strong, Warm, Same> }    G1 = G0

d2: <Sunny, Warm, High, Strong, Warm, Same>, Yes
S2: { <Sunny, Warm, ?, Strong, Warm, Same> }    G2 = G1

d3: <Rainy, Cold, High, Strong, Warm, Change>, No
S3 = S2    G3: { <Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?>, <?, ?, ?, ?, ?, Same> }

d4: <Sunny, Warm, High, Strong, Cool, Change>, Yes
S4: { <Sunny, Warm, ?, Strong, ?, ?> }    G4: { <Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?> }

Between S4 and G4 the version space also contains <Sunny, ?, ?, Strong, ?, ?>, <Sunny, Warm, ?, ?, ?, ?>, and <?, Warm, ?, Strong, ?, ?>.

Page 56:

The hypothesis space after four cases

Page 57:

The hypothesis space after four cases

Page 58:

Remarks on Candidate Elimination

- What training example should the learner request next?
- Will the CE algorithm converge to the correct hypothesis?
- How can partially learned concepts be used?

Page 59:

What Next Training Example?

Page 60:

Who Provides Examples?

Two methods:
- Fully supervised learning: an external teacher provides all training examples (input plus correct output).
- Learning by query: the learner generates instances (queries) by conducting experiments, then obtains the correct classification for each instance from an external oracle (nature or a teacher).

Negative training examples specialize G; positive ones generalize S.

Page 61:

When Does CE Converge?

Will the Candidate-Elimination algorithm converge to the correct hypothesis?

Prerequisites:
1. There are no errors in the training examples.
2. There exists a target hypothesis in H that correctly describes c(x).

If the S and G boundary sets converge to an empty set, there is no hypothesis in H consistent with the observed examples.

Page 62:

Can a partially learned classifier be used?

Page 63:

How to Use Partially Learned Concepts?

Suppose the learner is asked to classify the four new instances shown in the following table (votes are counted over the six version-space hypotheses; a voting sketch follows the table):

Instance  Sky    AirTemp  Humidity  Wind    Water  Forecast  EnjoySport  Votes (+/-)
A         Sunny  Warm     Normal    Strong  Cool   Change    +           6/0
B         Rainy  Cold     Normal    Light   Warm   Same      -           0/6
C         Sunny  Warm     Normal    Light   Warm   Same      ?           3/3
D         Sunny  Cold     Normal    Strong  Warm   Same      ?           2/4

S: { <Sunny, Warm, ?, Strong, ?, ?> }

<Sunny, ?, ?, Strong, ?, ?>   <Sunny, Warm, ?, ?, ?, ?>   <?, Warm, ?, Strong, ?, ?>

G: { <Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?> }
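The vote counts above can be reproduced by letting every member of the version space classify the instance. A minimal sketch reusing matches() and the six hypotheses just listed:

version_space = [
    ("Sunny", "Warm", "?", "Strong", "?", "?"),
    ("Sunny", "?", "?", "Strong", "?", "?"),
    ("Sunny", "Warm", "?", "?", "?", "?"),
    ("?", "Warm", "?", "Strong", "?", "?"),
    ("Sunny", "?", "?", "?", "?", "?"),
    ("?", "Warm", "?", "?", "?", "?"),
]

def classify(x):
    pos = sum(matches(h, x) for h in version_space)
    neg = len(version_space) - pos
    if neg == 0:
        return "+"
    if pos == 0:
        return "-"
    return "? (%d/%d)" % (pos, neg)       # the boundary members disagree

print(classify(("Sunny", "Warm", "Normal", "Strong", "Cool", "Change")))  # +
print(classify(("Sunny", "Cold", "Normal", "Strong", "Warm", "Same")))    # ? (2/4)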

Page 64:

Can a partially learned classifier be used?

Page 65:

A Biased Hypothesis Space

x1 = <Sunny, Warm, Normal, Strong, Cool, Change> +
x2 = <Cloudy, Warm, Normal, Strong, Cool, Change> +
x3 = <Rainy, Warm, Normal, Strong, Cool, Change> -

S2: { <?, Warm, Normal, Strong, Cool, Change> } is overly general!! It incorrectly covers x3.

S3: {} The third example x3 contradicts the already overly general specific boundary S2.

We have biased the learner to consider only conjunctive hypotheses!!

Page 66:

An Unbiased Learner

Idea: choose an H that expresses every teachable concept (i.e., H is the power set of X).

Consider H' = disjunctions, conjunctions, and negations over the previous H. E.g.:

<Sunny, Warm, Normal, ?, ?, ?> ∨ <?, ?, ?, ?, ?, Change>

What are S and G in this case?
S ←
G ←

For the target concept to be guaranteed to lie in H, H must be able to express every teachable concept!!

Page 67:

Unbiased Learner

Assume positive examples (x1, x2, x3) and negative examples (x4, x5):

S: { (x1 ∨ x2 ∨ x3) }
G: { ¬(x4 ∨ x5) }

How would we classify some new instance x6?

For any instance not in the training examples, half of the version space says + and the other half says -.

=> To learn the target concept, one would have to present every single instance in X as a training example (rote learning).

Page 68:

What Justifies this Inductive Leap?

New examples:

d1: <Sunny, Warm, Normal, Strong, Cool, Change>, Yes
d2: <Sunny, Warm, Normal, Light, Warm, Same>, No

S: { <Sunny, Warm, Normal, ?, ?, ?> + }

(Overly general hypothesis space)

Page 69:

Inductive Bias

Our hypothesis space is unable to represent a simple disjunctive target concept: (Sky=Sunny) ∨ (Sky=Cloudy).

x1 = <Sunny, Warm, Normal, Strong, Cool, Change> +
S1: { <Sunny, Warm, Normal, Strong, Cool, Change> }

x2 = <Cloudy, Warm, Normal, Strong, Cool, Change> +
S2: { <?, Warm, Normal, Strong, Cool, Change> }

x3 = <Rainy, Warm, Normal, Strong, Cool, Change> -
S3: {}

The third example x3 contradicts the already overly general specific boundary S2.

(Overly general hypothesis space)

Page 70:

Why believe we can classify the unseen?

S: { <Sunny, Warm, Normal, ?, ?, ?> + }

Unseen example: <Sunny, Warm, Normal, Strong, Warm, Same>, ….

Why trust the inductive learning hypothesis?

Page 71:

Why believe we can classify the unseen?

S: { <Sunny, Warm, Normal, ?, ?, ?> + }

Unseen example: <Sunny, Warm, Normal, Strong, Warm, Same>, ….

Why trust the inductive learning hypothesis?

Inductive learning hypothesis: "If the hypothesis works for enough data, then it will work on new examples."

Page 72:

Inductive Bias

The inductive bias of a learning algorithm is the set of assumptions that the learner uses to predict outputs given inputs that it has not encountered (Mitchell, 1980).

Examples:
- Occam's Razor
- Target concept c ∈ H (the hypothesis space) for the candidate-elimination algorithm

Page 73:

Inductive Bias

Consider:
- a concept learning algorithm L
- instances X, target concept c
- training examples Dc = {<x, c(x)>}
- let L(xi, Dc) denote the classification assigned to the instance xi by L after training on the data Dc.

Definition: the inductive bias of L is any minimal set of assertions B such that for any target concept c and corresponding training examples Dc,

(∀xi ∈ X) [(B ∧ Dc ∧ xi) ⊢ L(xi, Dc)]

where A ⊢ B means A logically entails B.

Page 74:

Inductive bias II

Page 75:

Inductive bias II

Page 76:

Inductive Systems and Equivalent Deductive Systems

Inductive system:
- Inputs: training examples, new instance
- Candidate Elimination Algorithm using hypothesis space H
- Output: classification of the new instance (or "don't know")

Equivalent deductive system:
- Inputs: training examples, new instance, and the assertion { c ∈ H } (the inductive bias made explicit)
- Theorem prover
- Output: classification of the new instance (or "don't know")

Page 77:

Three Learners with Different Biases

Rote learner:
- Weakest bias: no bias at all
- Stores examples
- Classifies x if and only if x matches a previously observed example

Version space candidate elimination algorithm:
- Stronger bias: the concept belongs to the conjunctive H
- Stores extremal generalizations and specializations
- Classifies x if and only if x "falls within" the S and G boundaries (all members agree)

Find-S:
- Even stronger bias: the most specific hypothesis
- Prior assumption: any instance not observed to be positive is negative
- Classifies x based on the S set

Page 78:

Summary Points

1. Concept learning as search through H
2. General-to-specific ordering over H
3. Version space candidate elimination algorithm
4. S and G boundaries characterize the learner's uncertainty
5. The learner can generate useful queries
6. Inductive leaps are possible only if the learner is biased
7. Inductive learners can be modelled by equivalent deductive systems