outline - cs.odu.edumukka/cs480f09/lecturenotes... · outline [read chapter 2] [suggested exercises...

Post on 31-May-2020


Outline
[read Chapter 2]
[suggested exercises 2.2, 2.3, 2.4, 2.6]

• Learning from examples
• General-to-specific ordering over hypotheses
• Version spaces and candidate elimination algorithm
• Picking new examples
• The need for inductive bias

Note: simple approach assuming no noise, illustrates key concepts

Lecture slides for textbook Machine Learning, T. Mitchell, McGraw Hill, 1997

Training Examples for EnjoySport

Sky    Temp  Humid   Wind    Water  Forecst  EnjoySpt
Sunny  Warm  Normal  Strong  Warm   Same     Yes
Sunny  Warm  High    Strong  Warm   Same     Yes
Rainy  Cold  High    Strong  Warm   Change   No
Sunny  Warm  High    Strong  Cool   Change   Yes

What is the general concept?

Representing Hypotheses

Many possible representations

Here, h is a conjunction of constraints on attributes

Each constraint can be
• a specific value (e.g., Water = Warm)
• don't care (e.g., "Water = ?")
• no value allowed (e.g., "Water = ∅")

For example,

 Sky    AirTemp  Humid  Wind    Water  Forecst
<Sunny  ?        ?      Strong  ?      Same>
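One possible way to encode these hypotheses in code is sketched below. The names ANY, NONE, and satisfies are hypothetical choices for illustration and do not come from the slides; the sketch assumes a hypothesis and an instance are both six-element tuples in the attribute order shown above.

```python
# Hypothetical encoding (not from the slides): a hypothesis is a tuple of
# six constraints, one per attribute.
ANY = "?"     # "don't care": any value is acceptable
NONE = None   # "no value allowed": the slide's empty-set constraint


def satisfies(h, x):
    """Return True if instance x meets every constraint in hypothesis h."""
    # A NONE constraint never equals an attribute value, so it always fails.
    return all(c == ANY or c == v for c, v in zip(h, x))


h = ("Sunny", ANY, ANY, "Strong", ANY, "Same")
x = ("Sunny", "Warm", "High", "Strong", "Cool", "Same")
print(satisfies(h, x))  # True
```

With this encoding, a hypothesis containing NONE in any position classifies every instance as negative.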

Prototypical Concept Learning Task

• Given:
  - Instances X: Possible days, each described by the attributes Sky, AirTemp, Humidity, Wind, Water, Forecast
  - Target function c: $EnjoySport : X \to \{0, 1\}$
  - Hypotheses H: Conjunctions of literals. E.g. <?, Cold, High, ?, ?, ?>
  - Training examples D: Positive and negative examples of the target function
    $\langle x_1, c(x_1) \rangle, \ldots, \langle x_m, c(x_m) \rangle$

• Determine: A hypothesis h in H such that h(x) = c(x) for all x in D.

The inductive learning hypothesis: Any hypothesis found to approximate the target function well over a sufficiently large set of training examples will also approximate the target function well over other unobserved examples.

Instance, Hypotheses, and More-General-Than

[Figure: instances X (left) and hypotheses H (right); the hypotheses are ordered from specific to general.]

x1 = <Sunny, Warm, High, Strong, Cool, Same>
x2 = <Sunny, Warm, High, Light, Warm, Same>

h1 = <Sunny, ?, ?, Strong, ?, ?>
h2 = <Sunny, ?, ?, ?, ?, ?>
h3 = <Sunny, ?, ?, ?, Cool, ?>
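The ordering in the figure can be checked mechanically. The sketch below assumes the tuple encoding with "?" for don't care and None for the empty constraint; the function name more_general_or_equal is a hypothetical choice. It relies on the fact that, for conjunctive hypotheses, hj covers everything hk covers exactly when each constraint of hj is "?" or equals the matching constraint of hk.

```python
ANY, NONE = "?", None  # hypothetical encoding of "?" and the empty constraint


def more_general_or_equal(hj, hk):
    """True if every instance that satisfies hk also satisfies hj."""
    if NONE in hk:
        # hk accepts no instance at all, so any hypothesis is at least as general.
        return True
    return all(cj == ANY or cj == ck for cj, ck in zip(hj, hk))


h1 = ("Sunny", ANY, ANY, "Strong", ANY, ANY)
h2 = ("Sunny", ANY, ANY, ANY, ANY, ANY)
h3 = ("Sunny", ANY, ANY, ANY, "Cool", ANY)

print(more_general_or_equal(h2, h1))  # True: h2 is more general than h1
print(more_general_or_equal(h2, h3))  # True: h2 is more general than h3
print(more_general_or_equal(h1, h3))  # False
print(more_general_or_equal(h3, h1))  # False: h1 and h3 are not comparable
```

The last two results illustrate that the relation is only a partial order: neither h1 nor h3 is more general than the other.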

Find-S Algorithm

1. Initialize h to the most specific hypothesis in H
2. For each positive training instance x
   • For each attribute constraint a_i in h
     If the constraint a_i in h is satisfied by x
     Then do nothing
     Else replace a_i in h by the next more general constraint that is satisfied by x
3. Output hypothesis h
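A minimal sketch of these steps for the conjunctive hypothesis space, run on the four EnjoySport training examples. The tuple encoding, the constants ANY and NONE, and the function name find_s are assumptions made for illustration.

```python
ANY, NONE = "?", None  # hypothetical encoding of "?" and the empty constraint


def find_s(examples, n_attributes):
    """Return the maximally specific conjunctive hypothesis covering the positives."""
    h = [NONE] * n_attributes            # step 1: most specific hypothesis
    for x, is_positive in examples:      # step 2: negatives are ignored
        if not is_positive:
            continue
        for i, value in enumerate(x):
            if h[i] is NONE:
                h[i] = value             # empty constraint generalizes to the value
            elif h[i] != ANY and h[i] != value:
                h[i] = ANY               # conflicting value generalizes to "?"
    return tuple(h)                      # step 3


D = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
    (("Sunny", "Warm", "High", "Strong", "Warm", "Same"), True),
    (("Rainy", "Cold", "High", "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High", "Strong", "Cool", "Change"), True),
]

print(find_s(D, 6))  # ('Sunny', 'Warm', '?', 'Strong', '?', '?')
```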

Hypothesis Space Search by Find-S

[Figure: instances X (left) and hypotheses H (right); the hypotheses h0 through h4 are ordered from specific to general.]

x1 = <Sunny Warm Normal Strong Warm Same>, +
x2 = <Sunny Warm High Strong Warm Same>, +
x3 = <Rainy Cold High Strong Warm Change>, -
x4 = <Sunny Warm High Strong Cool Change>, +

h0 = <∅, ∅, ∅, ∅, ∅, ∅>
h1 = <Sunny Warm Normal Strong Warm Same>
h2 = <Sunny Warm ? Strong Warm Same>
h3 = <Sunny Warm ? Strong Warm Same>
h4 = <Sunny Warm ? Strong ? ?>

Complaints about Find-S

• Can't tell whether it has learned concept
• Can't tell when training data inconsistent
• Picks a maximally specific h (why?)
• Depending on H, there might be several!

Version Spaces

A hypothesis h is consistent with a set of training examples D of target concept c if and only if h(x) = c(x) for each training example <x, c(x)> in D.

$$Consistent(h, D) \equiv (\forall \langle x, c(x) \rangle \in D)\; h(x) = c(x)$$

The version space, $VS_{H,D}$, with respect to hypothesis space H and training examples D, is the subset of hypotheses from H consistent with all training examples in D.

$$VS_{H,D} \equiv \{ h \in H \mid Consistent(h, D) \}$$
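The Consistent predicate translates almost directly into code. This is a sketch under the assumption that hypotheses and instances are tuples, that "?" means don't care, and that labels are booleans; the names classify and consistent are hypothetical.

```python
ANY = "?"  # hypothetical encoding of the "don't care" constraint


def classify(h, x):
    """h(x): True if instance x satisfies every constraint in h."""
    return all(c == ANY or c == v for c, v in zip(h, x))


def consistent(h, D):
    """Consistent(h, D): h(x) equals c(x) for every example <x, c(x)> in D."""
    return all(classify(h, x) == label for x, label in D)


D = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
    (("Sunny", "Warm", "High", "Strong", "Warm", "Same"), True),
    (("Rainy", "Cold", "High", "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High", "Strong", "Cool", "Change"), True),
]

print(consistent(("Sunny", "Warm", ANY, "Strong", ANY, ANY), D))  # True
print(consistent((ANY,) * 6, D))  # False: it accepts the negative example
```

Note that consistency requires agreement on negative examples too, which is why the all-"?" hypothesis fails here.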

The List-Then-Eliminate Algorithm:

1. VersionSpace ← a list containing every hypothesis in H
2. For each training example, <x, c(x)>
   remove from VersionSpace any hypothesis h for which h(x) ≠ c(x)
3. Output the list of hypotheses in VersionSpace
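For a hypothesis space this small, the algorithm can be run by brute force. The sketch below is illustrative only: the DOMAINS table is an assumption (the value "Cloudy" for Sky comes from the textbook's description of the task, not from these slides), and the function names are hypothetical. It enumerates the semantically distinct conjunctive hypotheses, that is, every combination of "?" or a specific value per attribute, plus one hypothesis that rejects everything.

```python
from itertools import product

ANY, NONE = "?", None  # hypothetical encoding of "?" and the empty constraint

# Assumed attribute domains; "Cloudy" does not appear in the slides.
DOMAINS = [
    ("Sunny", "Cloudy", "Rainy"),  # Sky
    ("Warm", "Cold"),              # AirTemp
    ("Normal", "High"),            # Humidity
    ("Strong", "Light"),           # Wind
    ("Warm", "Cool"),              # Water
    ("Same", "Change"),            # Forecast
]

D = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
    (("Sunny", "Warm", "High", "Strong", "Warm", "Same"), True),
    (("Rainy", "Cold", "High", "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High", "Strong", "Cool", "Change"), True),
]


def classify(h, x):
    """h(x): True if instance x satisfies every constraint in h."""
    return all(c == ANY or c == v for c, v in zip(h, x))


def all_hypotheses(domains):
    """Every semantically distinct conjunctive hypothesis over the domains."""
    reject_all = tuple([NONE] * len(domains))
    return [reject_all] + list(product(*[(ANY,) + d for d in domains]))


def list_then_eliminate(domains, examples):
    version_space = all_hypotheses(domains)            # step 1
    for x, label in examples:                          # step 2
        version_space = [h for h in version_space if classify(h, x) == label]
    return version_space                               # step 3


print(len(all_hypotheses(DOMAINS)))  # 973
for h in list_then_eliminate(DOMAINS, D):
    print(h)
```

Under these assumptions six hypotheses survive the four training examples. The approach is only practical when H can be enumerated exhaustively.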

Example Version Space

S: { <Sunny, Warm, ?, Strong, ?, ?> }

   <Sunny, Warm, ?, ?, ?, ?>   <Sunny, ?, ?, Strong, ?, ?>   <?, Warm, ?, Strong, ?, ?>

G: { <Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?> }

Representing Version Spaces

The General boundary, G, of version space $VS_{H,D}$ is the set of its maximally general members

The Specific boundary, S, of version space $VS_{H,D}$ is the set of its maximally specific members

Every member of the version space lies between these boundaries

$$VS_{H,D} = \{ h \in H \mid (\exists s \in S)(\exists g \in G)(g \geq h \geq s) \}$$

where $x \geq y$ means x is more general or equal to y

Candidate Elimination Algorithm

G ← maximally general hypotheses in H
S ← maximally specific hypotheses in H

For each training example d, do

• If d is a positive example
  - Remove from G any hypothesis inconsistent with d
  - For each hypothesis s in S that is not consistent with d
    * Remove s from S
    * Add to S all minimal generalizations h of s such that
      1. h is consistent with d, and
      2. some member of G is more general than h
    * Remove from S any hypothesis that is more general than another hypothesis in S

• If d is a negative example
  - Remove from S any hypothesis inconsistent with d
  - For each hypothesis g in G that is not consistent with d
    * Remove g from G
    * Add to G all minimal specializations h of g such that
      1. h is consistent with d, and
      2. some member of S is more specific than h
    * Remove from G any hypothesis that is less general than another hypothesis in G
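The algorithm above can be sketched for the conjunctive hypothesis space as follows. This is an illustrative sketch, not a reference implementation: the DOMAINS table is assumed (the Sky value "Cloudy" does not appear in the slides), all function names are hypothetical, and the case where a fully specific member of G must be specialized to the empty hypothesis is left out for brevity.

```python
ANY, NONE = "?", None  # hypothetical encoding of "?" and the empty constraint

DOMAINS = [("Sunny", "Cloudy", "Rainy"), ("Warm", "Cold"), ("Normal", "High"),
           ("Strong", "Light"), ("Warm", "Cool"), ("Same", "Change")]  # assumed

D = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
    (("Sunny", "Warm", "High", "Strong", "Warm", "Same"), True),
    (("Rainy", "Cold", "High", "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High", "Strong", "Cool", "Change"), True),
]


def matches(h, x):
    return all(c == ANY or c == v for c, v in zip(h, x))


def more_general_or_equal(hj, hk):
    if NONE in hk:
        return True
    return all(cj == ANY or cj == ck for cj, ck in zip(hj, hk))


def min_generalization(s, x):
    """The single minimal generalization of s that covers x."""
    return tuple(v if c is NONE else (c if c == v else ANY)
                 for c, v in zip(s, x))


def min_specializations(g, x, domains):
    """Replace one '?' in g with a value that x does not have."""
    return [g[:i] + (value,) + g[i + 1:]
            for i, c in enumerate(g) if c == ANY
            for value in domains[i] if value != x[i]]


def candidate_elimination(examples, domains):
    G = {tuple([ANY] * len(domains))}
    S = {tuple([NONE] * len(domains))}
    for x, positive in examples:
        if positive:
            G = {g for g in G if matches(g, x)}
            new_S = set()
            for s in S:
                if matches(s, x):
                    new_S.add(s)
                    continue
                h = min_generalization(s, x)
                if any(more_general_or_equal(g, h) for g in G):
                    new_S.add(h)
            # Keep only the maximally specific members.
            S = {s for s in new_S if not any(
                s != t and more_general_or_equal(s, t) for t in new_S)}
        else:
            S = {s for s in S if not matches(s, x)}
            new_G = set()
            for g in G:
                if not matches(g, x):
                    new_G.add(g)
                    continue
                for h in min_specializations(g, x, domains):
                    if any(more_general_or_equal(h, s) for s in S):
                        new_G.add(h)
            # Keep only the maximally general members.
            G = {g for g in new_G if not any(
                g != t and more_general_or_equal(t, g) for t in new_G)}
    return S, G


S, G = candidate_elimination(D, DOMAINS)
print("S:", S)
print("G:", G)
```

On the four EnjoySport examples this sketch ends with S containing <Sunny, Warm, ?, Strong, ?, ?> and G containing <Sunny, ?, ?, ?, ?, ?> and <?, Warm, ?, ?, ?, ?>.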

Example Trace

S0: {<∅, ∅, ∅, ∅, ∅, ∅>}

G0: {<?, ?, ?, ?, ?, ?>}

What Next Training Example?

S: { <Sunny, Warm, ?, Strong, ?, ?> }

   <Sunny, Warm, ?, ?, ?, ?>   <Sunny, ?, ?, Strong, ?, ?>   <?, Warm, ?, Strong, ?, ?>

G: { <Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?> }

How Should These Be Classified?

S: { <Sunny, Warm, ?, Strong, ?, ?> }

   <Sunny, Warm, ?, ?, ?, ?>   <Sunny, ?, ?, Strong, ?, ?>   <?, Warm, ?, Strong, ?, ?>

G: { <Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?> }

<Sunny Warm Normal Strong Cool Change>
<Rainy Cool Normal Light Warm Same>
<Sunny Warm Normal Light Warm Same>
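One way to answer the question is to let every member of the version space vote and to commit only when the vote is unanimous. The sketch below assumes the tuple encoding with "?" for don't care; the function names and the returned strings are hypothetical choices.

```python
ANY = "?"  # hypothetical encoding of the "don't care" constraint

# The six hypotheses of the example version space.
VERSION_SPACE = [
    ("Sunny", "Warm", ANY, "Strong", ANY, ANY),   # S
    ("Sunny", "Warm", ANY, ANY, ANY, ANY),
    ("Sunny", ANY, ANY, "Strong", ANY, ANY),
    (ANY, "Warm", ANY, "Strong", ANY, ANY),
    ("Sunny", ANY, ANY, ANY, ANY, ANY),           # G
    (ANY, "Warm", ANY, ANY, ANY, ANY),            # G
]


def matches(h, x):
    return all(c == ANY or c == v for c, v in zip(h, x))


def classify_by_version_space(version_space, x):
    """Classify x only if all remaining hypotheses agree."""
    votes = [matches(h, x) for h in version_space]
    if all(votes):
        return "positive"
    if not any(votes):
        return "negative"
    return "don't know"


queries = [
    ("Sunny", "Warm", "Normal", "Strong", "Cool", "Change"),
    ("Rainy", "Cool", "Normal", "Light", "Warm", "Same"),
    ("Sunny", "Warm", "Normal", "Light", "Warm", "Same"),
]
for x in queries:
    print(x, "->", classify_by_version_space(VERSION_SPACE, x))
```

The first instance is accepted by all six hypotheses and the second by none, so both can be classified with confidence. The third splits the vote three to three, so the learner cannot commit.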

What Justifies this Inductive Leap?

+ <Sunny Warm Normal Strong Cool Change>
+ <Sunny Warm Normal Light Warm Same>

S: <Sunny Warm Normal ? ? ?>

Why believe we can classify the unseen
<Sunny Warm Normal Strong Warm Same>?

An UNBiased Learner

Idea: Choose H that expresses every teachable concept (i.e., H is the power set of X)

Consider H' = disjunctions, conjunctions, negations over previous H. E.g.,

<Sunny Warm Normal ? ? ?> ∨ ¬<? ? ? ? ? Change>

What are S, G in this case?

S ←

G ←
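A quick count shows how much larger the unbiased space is than the conjunctive one. The attribute sizes below are an assumption taken from the textbook's version of the task (three values for Sky, two for each other attribute); the slides do not list the domains explicitly.

```python
from math import prod

# Assumed number of values per attribute: Sky has 3, the rest have 2.
values_per_attribute = [3, 2, 2, 2, 2, 2]

# Number of distinct instances in X.
num_instances = prod(values_per_attribute)

# The power set of X: every subset of instances is a possible concept.
num_unbiased_concepts = 2 ** num_instances

# Semantically distinct conjunctive hypotheses: each attribute is "?" or a
# specific value, plus one hypothesis that rejects every instance.
num_conjunctive = 1 + prod(v + 1 for v in values_per_attribute)

print(num_instances)          # 96
print(num_conjunctive)        # 973
print(num_unbiased_concepts)  # 2**96
```

Under these assumptions the conjunctive space can represent only 973 of the 2^96 possible concepts, which is the bias that the unbiased learner gives up.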

Inductive Bias

Consider
• concept learning algorithm L
• instances X, target concept c
• training examples $D_c = \{\langle x, c(x) \rangle\}$
• let $L(x_i, D_c)$ denote the classification assigned to the instance $x_i$ by L after training on data $D_c$.

Definition:

The inductive bias of L is any minimal set of assertions B such that for any target concept c and corresponding training examples $D_c$

$$(\forall x_i \in X)\,[(B \land D_c \land x_i) \vdash L(x_i, D_c)]$$

where $A \vdash B$ means A logically entails B

Inductive Systems and Equivalent Deductive Systems

[Figure: two block diagrams, an inductive system and its equivalent deductive system.]

Inductive system: Candidate Elimination Algorithm, using hypothesis space H
  Inputs: training examples; new instance
  Output: classification of new instance, or "don't know"

Equivalent deductive system: Theorem Prover
  Inputs: training examples; new instance; assertion "H contains the target concept"
  Output: classification of new instance, or "don't know"
  Inductive bias made explicit

Three Learners with Different Biases

1. Rote learner: Store examples, Classify x iff it matches previously observed example.
2. Version space candidate elimination algorithm
3. Find-S
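The first learner in the list is simple enough to sketch in a few lines. The class name RoteLearner and the use of None for "don't know" are hypothetical choices; the point is only that this learner never generalizes beyond what it has stored.

```python
class RoteLearner:
    """Stores examples and classifies x only if x was seen during training."""

    def __init__(self):
        self.memory = {}

    def train(self, examples):
        for x, label in examples:
            self.memory[x] = label

    def classify(self, x):
        # None stands for "don't know": the instance was never observed.
        return self.memory.get(x)


learner = RoteLearner()
learner.train([
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
    (("Rainy", "Cold", "High", "Strong", "Warm", "Change"), False),
])

print(learner.classify(("Sunny", "Warm", "Normal", "Strong", "Warm", "Same")))  # True
print(learner.classify(("Sunny", "Warm", "High", "Strong", "Cool", "Change")))  # None
```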

Summary Points

1. Concept learning as search through H
2. General-to-specific ordering over H
3. Version space candidate elimination algorithm
4. S and G boundaries characterize learner's uncertainty
5. Learner can generate useful queries
6. Inductive leaps possible only if learner is biased
7. Inductive learners can be modelled by equivalent deductive systems