
Upload: eugenia-hubbard

Post on 01-Jan-2016


A LOGIC BASED CLASSIFICATION TECHNIQUE

General-to-Specific Ordering

Logic Based Classification, 8/29/03

Sky    AirTemp  Humidity  Wind    Water  Forecast  EnjoySport
Sunny  Warm     Normal    Strong  Warm   Same      Yes
Sunny  Warm     High      Strong  Warm   Same      Yes
Rainy  Cold     High      Strong  Warm   Change    No
Sunny  Warm     High      Strong  Cool   Change    Yes

Logic Based

Like a decision tree, ask questions: Sky? Sunny, OK. Wind? Strong, OK. Yes, enjoy sport.


Expression: <Sunny,?,?,Strong,?,?>

Means we will enjoy sport only when the sky is sunny and the wind is strong; we don’t care about the other attributes


Candidate Elimination

With candidate elimination the object is to predict the class through the use of expressions

?’s are like wild cards

Expressions represent conjunctions
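As a minimal sketch of how such an expression can be evaluated, the function below treats a hypothesis as a tuple of constraints and checks an instance against it. The name `satisfies` and the tuple encoding are assumptions made for illustration; they are not from the slides.

```python
WILDCARD = "?"  # accepts any attribute value
EMPTY = "Ø"     # accepts no attribute value


def satisfies(instance, hypothesis):
    """Return True when every constraint in the conjunction accepts the instance."""
    # EMPTY never equals a real attribute value, so a single Ø rejects everything.
    return all(
        constraint == WILDCARD or constraint == value
        for value, constraint in zip(instance, hypothesis)
    )


h = ("Sunny", "?", "?", "Strong", "?", "?")
print(satisfies(("Sunny", "Warm", "High", "Strong", "Cool", "Change"), h))  # True
print(satisfies(("Rainy", "Cold", "High", "Strong", "Warm", "Change"), h))  # False
```

Because the expression is a conjunction, one failing constraint is enough to reject the instance.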


First Approach

Finding a maximally specific hypothesis
  Start with the most restrictive (specific) one we can get and relax it to satisfy each positive training sample

Most general (all dimensions can be any value): <?,?,?,?,?,?>
Most restrictive (no dimension can be anything): <Ø, Ø, Ø, Ø, Ø, Ø>

Ø’s mean nothing will match it


That pesky Ø

What if an expression has a single Ø? (Remember, the expression is a conjunction.)

Then no instance can satisfy it: the whole expression is equivalent to Ø


Find-S Algorithm

Initialize h to the most specific hypothesis in H (<Ø, Ø, Ø, Ø, Ø, Ø>)
For each positive training instance x
  For each attribute constraint ai in h
    If the constraint ai is satisfied by x then do nothing
    Else replace ai in h by the next more general constraint that is satisfied by x
Return h

Order of generality

? is more general than a specific attribute value, which is more general than Ø
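The steps above can be sketched in a few lines of Python. This is an illustrative sketch, not the author's code: the function name `find_s`, the `(instance, label)` pair encoding, and the `TRAINING` constant (copied from the EnjoySport table) are assumptions.

```python
EMPTY, WILDCARD = "Ø", "?"

TRAINING = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), "Yes"),
    (("Sunny", "Warm", "High", "Strong", "Warm", "Same"), "Yes"),
    (("Rainy", "Cold", "High", "Strong", "Warm", "Change"), "No"),
    (("Sunny", "Warm", "High", "Strong", "Cool", "Change"), "Yes"),
]


def find_s(examples, n_attributes):
    """Relax the most specific hypothesis just enough to cover each positive."""
    h = [EMPTY] * n_attributes
    for instance, label in examples:
        if label != "Yes":
            continue  # Find-S ignores negative examples
        for i, value in enumerate(instance):
            if h[i] == EMPTY:
                h[i] = value        # Ø -> specific value
            elif h[i] != value:
                h[i] = WILDCARD     # specific value -> ? (no change if already ?)
    return tuple(h)


print(find_s(TRAINING, 6))  # ('Sunny', 'Warm', '?', 'Strong', '?', '?')
```

Each constraint only ever moves up the order of generality (Ø, then a value, then ?), which is why a single pass over the positives is enough.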


Example

Set h to <Ø, Ø, Ø, Ø, Ø, Ø>

First positive (x): <Sunny,Warm,Normal,Strong,Warm,Same>

Which constraints of h are satisfied by x? None. Replace each ai with a relaxed form from x

h becomes <Sunny,Warm,Normal,Strong,Warm,Same>


Example

h is now <Sunny,Warm,Normal,Strong,Warm,Same>

Next positive: <Sunny,Warm,High,Strong,Warm,Same>

Which constraints of h are satisfied by x? All but humidity

Replace h with <Sunny,Warm,?,Strong,Warm,Same>


Example

h is now <Sunny,Warm,?,Strong,Warm,Same>

Next positive: <Sunny,Warm,High,Strong,Cool,Change>

Which constraints of h are satisfied by x? All but water and forecast

Replace h with <Sunny,Warm,?,Strong,?,?>

Return <Sunny,Warm,?,Strong,?,?>

Can one use this to “test” a new instance?


Next: Version Space

What if we want all hypotheses that are consistent with a training set (called a version space)?

A hypothesis h is consistent with a set of training examples if and only if h(x)=c(x) for each training example

<Sunny,Warm,?,Strong,?,?>

<Sunny,?,?,Strong,?,?> <Sunny,Warm,?,?,?,?> <?,Warm,?,Strong,?,?>

<Sunny,?,?,?,?,?> <?,Warm,?,?,?,?>

<?,?,?,?,?,Same>


List-Then-Eliminate

Algorithm (exhaustive)
  VersionSpace ← a list containing every hypothesis in H
  For each training example <x, c(x)>
    Remove from VersionSpace any hypothesis h for which h(x) ≠ c(x)
  Output the list of hypotheses in VersionSpace

• Number of hypotheses that can be represented: 5,120 (5*4*4*4*4*4)
• But a single Ø represents an empty set
• So semantically distinct hypotheses: 973
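The exhaustive procedure is small enough to run directly on this data set. The sketch below assumes the attribute domains that appear elsewhere in the slides (Sky has three values, the rest two); the names `DOMAINS`, `all_hypotheses`, and `list_then_eliminate` are illustrative choices, not from the source.

```python
from itertools import product

DOMAINS = [
    ("Sunny", "Cloudy", "Rainy"),  # Sky
    ("Warm", "Cold"),              # AirTemp
    ("Normal", "High"),            # Humidity
    ("Strong", "Light"),           # Wind
    ("Warm", "Cool"),              # Water
    ("Same", "Change"),            # Forecast
]

TRAINING = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), "Yes"),
    (("Sunny", "Warm", "High", "Strong", "Warm", "Same"), "Yes"),
    (("Rainy", "Cold", "High", "Strong", "Warm", "Change"), "No"),
    (("Sunny", "Warm", "High", "Strong", "Cool", "Change"), "Yes"),
]


def predict(h, x):
    """h(x): True when every constraint of h accepts the matching value of x."""
    return all(c == "?" or c == v for c, v in zip(h, x))


def all_hypotheses():
    """Every semantically distinct hypothesis: value-or-? per attribute, plus one all-Ø."""
    hypotheses = list(product(*[values + ("?",) for values in DOMAINS]))
    hypotheses.append(("Ø",) * len(DOMAINS))
    return hypotheses


def list_then_eliminate(examples):
    version_space = all_hypotheses()
    for x, label in examples:
        # keep only hypotheses with h(x) == c(x)
        version_space = [h for h in version_space if predict(h, x) == (label == "Yes")]
    return version_space


print(len(all_hypotheses()))                 # 973
print(len(list_then_eliminate(TRAINING)))    # 6
```

Enumerating 973 hypotheses is trivial here, but the list grows multiplicatively with every attribute, which is the motivation for a more compact representation.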


Next: Candidate Elimination

Process of elimination

More compact representation
  Just those hypotheses at the extreme ends
  Those that are the most general and those that are the most specific
  All else between would necessarily be in the version space


Definitions

And now for something totally formal:

The general boundary G, with respect to hypothesis space H and training data D, is the set of maximally general members of H consistent with D.

G is identical to the set of all g that are members of H such that g is consistent with D and there does not exist a g’ in H such that it is more general than g and it (g’) is consistent with the training data

\[ G \equiv \{\, g \in H \mid \mathit{Consistent}(g, D) \land (\neg\exists g' \in H)\,[(g' >_g g) \land \mathit{Consistent}(g', D)] \,\} \]


Definitions

The specific boundary S, with respect to hypothesis space H and training data D, is the set of minimally general members of H consistent with D.

S is identical to the set of all s that are members of H such that s is consistent with D and there does not exist an s’ in H such that it is more specific than s and it (s’) is consistent with the training data

\[ S \equiv \{\, s \in H \mid \mathit{Consistent}(s, D) \land (\neg\exists s' \in H)\,[(s >_g s') \land \mathit{Consistent}(s', D)] \,\} \]


Example

All yes’s are sunny, warm, and strong
But “strong” isn’t enough to identify a yes

S: {<Sunny, Warm, ?, Strong, ?, ?>} (3 ?’s)

<Sunny, ?, ?, Strong, ?, ?> <Sunny, Warm, ?, ?, ?, ?> <?, Warm, ?, Strong, ?, ?> (4 ?’s)

G: {<Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?>} (5 ?’s)


Approach

Start with two extremes
  Most general (all dimensions can be any value): <?,?,?,?,?,?>
  Most restrictive (no dimension can be anything): <Ø, Ø, Ø, Ø, Ø, Ø>

Slowly work inward from the specific end and from the general end


Algorithm

Initialize G to the set of maximally general hypotheses in H
Initialize S to the set of maximally specific hypotheses in H
For each training example d, do
  If d is a positive example
    Remove from G any hypothesis inconsistent with d
    For each hypothesis s in S that is not consistent with d
      Remove s from S
      Add to S all minimal generalizations h of s such that h is consistent with d and some member of G is more general than h
      Remove from S any hypothesis that is more general than another hypothesis in S
  If d is a negative example
    Remove from S any hypothesis inconsistent with d
    For each hypothesis g in G that is not consistent with d
      Remove g from G
      Add to G all minimal specializations h of g such that h is consistent with d, and some member of S is more specific than h
      Remove from G any hypothesis that is less general than another hypothesis in G
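The algorithm above can be sketched for conjunctive hypotheses as follows. This is one possible rendering under stated assumptions, not the author's implementation: the helper names, the tuple encoding, and the `DOMAINS` list (attribute values taken from the slides) are all illustrative.

```python
ANY, NONE = "?", "Ø"
DOMAINS = [("Sunny", "Cloudy", "Rainy"), ("Warm", "Cold"), ("Normal", "High"),
           ("Strong", "Light"), ("Warm", "Cool"), ("Same", "Change")]


def covers(h, x):
    """h(x) = 1: every constraint of h accepts the matching value of x."""
    return all(c == ANY or c == v for c, v in zip(h, x))


def more_general_or_equal(h1, h2):
    """True when h1 covers every instance that h2 covers."""
    if NONE in h2:
        return True  # h2 covers nothing at all
    return all(a == ANY or a == b for a, b in zip(h1, h2))


def min_generalization(s, x):
    """The single minimal generalization of s that covers x."""
    return tuple(v if c == NONE else (c if c == v else ANY) for c, v in zip(s, x))


def min_specializations(g, x):
    """Replace one ? in g by any value that rules out x."""
    return [g[:i] + (v,) + g[i + 1:]
            for i, c in enumerate(g) if c == ANY
            for v in DOMAINS[i] if v != x[i]]


def candidate_elimination(examples):
    S = {(NONE,) * len(DOMAINS)}
    G = {(ANY,) * len(DOMAINS)}
    for x, label in examples:
        if label == "Yes":
            G = {g for g in G if covers(g, x)}
            S = {s if covers(s, x) else min_generalization(s, x) for s in S}
            S = {s for s in S if any(more_general_or_equal(g, s) for g in G)}
            S = {s for s in S
                 if not any(s != t and more_general_or_equal(s, t) for t in S)}
        else:
            S = {s for s in S if not covers(s, x)}
            new_G = set()
            for g in G:
                if not covers(g, x):
                    new_G.add(g)  # already consistent with the negative
                else:
                    new_G.update(h for h in min_specializations(g, x)
                                 if any(more_general_or_equal(h, s) for s in S))
            G = {g for g in new_G
                 if not any(g != h and more_general_or_equal(h, g) for h in new_G)}
    return S, G


TRAINING = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), "Yes"),
    (("Sunny", "Warm", "High", "Strong", "Warm", "Same"), "Yes"),
    (("Rainy", "Cold", "High", "Strong", "Warm", "Change"), "No"),
    (("Sunny", "Warm", "High", "Strong", "Cool", "Change"), "Yes"),
]
S, G = candidate_elimination(TRAINING)
print(S)  # the specific boundary
print(G)  # the general boundary
```

For conjunctions of attribute constraints the minimal generalization is unique, so S stays a single hypothesis here; other hypothesis languages may need a set of generalizations at that step.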


Example

Initialize
S0: {<Ø, Ø, Ø, Ø, Ø, Ø>}
G0: {<?,?,?,?,?,?>}


Example

First record (positive)
S1: {<Sunny,Warm,Normal,Strong,Warm,Same>}
G0, G1: {<?,?,?,?,?,?>}


Example

Second record (positive)
S2: {<Sunny,Warm,?,Strong,Warm,Same>}
G0, G1, G2: {<?,?,?,?,?,?>}


Modify previous S minimally to keep consistent with d


Example

Third record (negative)
S2, S3: {<Sunny,Warm,?,Strong,Warm,Same>}
G3: {<Sunny,?,?,?,?,?>, <?,Warm,?,?,?,?>, <?,?,?,?,?,Same>}


Replace {<?,?,?,?,?,?>} with all single-constraint expressions (minimal specializations) that exclude the negative record and are still more general than S


Example

Fourth record (positive)
S4: {<Sunny,Warm,?,Strong,?,?>}
G3: {<Sunny,?,?,?,?,?>, <?,Warm,?,?,?,?>, <?,?,?,?,?,Same>}
G4: {<Sunny,?,?,?,?,?>, <?,Warm,?,?,?,?>}


Back to a positive: in S, replace Water = Warm and Forecast = Same with “?”, and remove <?,?,?,?,?,Same> from G

Then we can calculate the interior expressions:

<Sunny, ?, ?, Strong, ?, ?> <Sunny, Warm, ?, ?, ?, ?> <?, Warm, ?, Strong, ?, ?>


What if

We have two identical records but different classes?

Sky    AirTemp  Humidity  Wind    Water  Forecast  EnjoySport
Sunny  Warm     Normal    Strong  Warm   Same      Yes
Sunny  Warm     Normal    Strong  Warm   Same      No

Established by first positive:
S1: {<Sunny,Warm,Normal,Strong,Warm,Same>}
G0, G1: {<?,?,?,?,?,?>}

If the positive shows up first, the first step in evaluating the negative states “Remove from S any hypothesis that is not consistent with d” (S is now empty)
For each hypothesis g in G that is not consistent with d
  Remove g from G (all ?’s is inconsistent with No, G is empty)
  Add to G all minimal specializations h of g such that h is consistent with d, and some member of S is more specific than h
    No matter what we add to G it will violate either d or S (G remains empty)
Both are empty, broken. Known as converging to an empty version space


What if

We have two identical records but different classes?

Sky    AirTemp  Humidity  Wind    Water  Forecast  EnjoySport
Sunny  Warm     Normal    Strong  Warm   Same      No
Sunny  Warm     Normal    Strong  Warm   Same      Yes

Established by first negative:
S0: {<Ø, Ø, Ø, Ø, Ø, Ø>}
G1: {<Rainy,?,?,?,?,?>, <Cloudy,?,?,?,?,?>, <?,Cold,?,?,?,?>, <?,?,High,?,?,?>, <?,?,?,Light,?,?>, <?,?,?,?,Cool,?>, <?,?,?,?,?,Change>}

If the negative shows up first, the first step in evaluating the positive states “Remove from G any hypothesis that is not consistent with d”
  This is all of them, leaving an empty set
For each hypothesis s in S that is not consistent with d
  Remove s from S
  Add to S all minimal generalizations h of s such that h is consistent with d and some member of G is more general than h
    No minimal generalization qualifies: G is empty, so no member of G is more general than h, and S ends up empty as well


Brittle

Bad with noisy data
Similar effect with false positives or false negatives


Will it converge?

Yes, provided
1. There are no errors in the training examples
2. There is some hypothesis in H that correctly describes the target concept

Example of the second condition failing: the target concept is a disjunction (∨) of feature attributes and the hypothesis space supports only conjunctions


Classifying

Never before seen data

Sky    AirTemp  Humidity  Wind   Water  Forecast  EnjoySport
Sunny  Warm     Normal    Light  Warm   Same      ?

Vote (all training samples were strong wind)

S4: {<Sunny,Warm,?,Strong,?,?>}: No
<Sunny,?,?,Strong,?,?>: No
<Sunny,Warm,?,?,?,?>: Yes
<?,Warm,?,Strong,?,?>: No
G: <Sunny,?,?,?,?,?>: Yes, <?,Warm,?,?,?,?>: Yes, <?,?,?,Strong,?,?>: No

Proportion can be a confidence metric
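The vote can be tallied mechanically. The sketch below uses the seven hypotheses exactly as they are listed on this slide; the names `covers`, `vote`, and `SLIDE_HYPOTHESES` are assumptions for illustration.

```python
def covers(h, x):
    """True when hypothesis h classifies instance x as positive."""
    return all(c == "?" or c == v for c, v in zip(h, x))


def vote(hypotheses, x):
    """Return (yes_votes, no_votes) over the given hypotheses."""
    yes = sum(covers(h, x) for h in hypotheses)
    return yes, len(hypotheses) - yes


# The hypotheses as listed on the slide (assumed order: S, interior, G).
SLIDE_HYPOTHESES = [
    ("Sunny", "Warm", "?", "Strong", "?", "?"),
    ("Sunny", "?", "?", "Strong", "?", "?"),
    ("Sunny", "Warm", "?", "?", "?", "?"),
    ("?", "Warm", "?", "Strong", "?", "?"),
    ("Sunny", "?", "?", "?", "?", "?"),
    ("?", "Warm", "?", "?", "?", "?"),
    ("?", "?", "?", "Strong", "?", "?"),
]

new_instance = ("Sunny", "Warm", "Normal", "Light", "Warm", "Same")
yes, no = vote(SLIDE_HYPOTHESES, new_instance)
print(yes, no)              # 3 4
print(yes / (yes + no))     # proportion of Yes votes, usable as a confidence
```

A proportion near 0 or 1 signals strong agreement; a value near 0.5, as here, means the hypotheses are split on this instance.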


A Unanimous Vote

Same confidence as if already converged to the single correct target concept

Regardless of which hypothesis in the version space is eventually found to be correct, it is one of the hypotheses in the current set, and every one of them classifies the test case as positive

A 100% vote is as good as a match on the most specific hypothesis


Best for…

Discrete data
Binary classes


Now for…

Have seen 4 classifiers
  Naïve Bayesian
  KNN
  Decision Tree
  Candidate Elimination

Now for some theory


Have already…

Curse of dimensionality
Overfitting
Lazy/Eager
Radial basis
Normalization
Gradient descent
Entropy/Information gain
Occam’s razor


Biased Hypothesis Space

Another way of measuring whether a hypothesis captures the learning concept

Candidate Elimination
  Conjunction of constraints on the attributes


Biased Hypothesis Space

In regression
  Biased toward linear solutions
Naïve Bayes
  Biased to a given distribution or bin selection
KNN
  Biased toward solutions that assume cohabitation of similarly classed instances
Decision Tree
  Short trees


Unbiased learner?

Must be able to accommodate every distinct subset as class definition

96 distinct instances (3*2*2*2*2*2)
  Sky has three possible answers, the rest two
Number of distinct subsets: 2^96
  Think binary: 1 indicates membership


Search Space

S0: {<Ø, Ø, Ø, Ø, Ø, Ø>}
G0: {<?,?,?,?,?,?>}

Number of hypotheses that can be represented: 5,120 (5*4*4*4*4*4)
But a single Ø represents an empty set
So semantically distinct hypotheses: 973, that is 1+(4*3*3*3*3*3)
Each hypothesis represents a subset (due to wild cards)

• Candidate elimination can represent 973 different subsets
• But 2^96 is the number of distinct subsets
• Very biased
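The counts on this slide follow directly from the number of values per attribute. A short check, with variable names chosen here for illustration:

```python
from math import prod

value_counts = [3, 2, 2, 2, 2, 2]  # Sky has three values, the other five have two

instances = prod(value_counts)                     # distinct instances
concepts = 2 ** instances                          # distinct subsets of the instances
syntactic = prod(n + 2 for n in value_counts)      # each attribute: a value, ?, or Ø
semantic = 1 + prod(n + 1 for n in value_counts)   # one all-Ø hypothesis + value-or-?

print(instances)   # 96
print(syntactic)   # 5120
print(semantic)    # 973
print(semantic / concepts)  # fraction of all possible concepts that can be expressed
```

The last ratio makes the bias concrete: conjunctive hypotheses can express only a vanishing fraction of the 2^96 possible target concepts.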


Bias

I think of bias as inflexibility in expressing hypotheses

Or, alternatively, what are the implicit assumptions of the approach

[Figure: bias shown as inflexibility and as implicit assumptions]


Next term: inductive inference

The process by which a conclusion is inferred from multiple observations

What we’ve been doing

TRAINING DATA → CLASSIFIER → MAKE PREDICTION ON NEW DATA


The Hypothesis

Inductive learning hypothesis
  Any hypothesis found to approximate the target function well over a sufficiently large set of training examples will also approximate the target function well over other unobserved examples


Next Term

Concept learning
  Automatically inferring the general definition of some concept, given examples labeled as members or nonmembers of the concept

Roughly equate “Concept” to “Class”


Hypotheses

H is the set of all possible hypotheses that the learner may consider regarding the choice of hypothesis representation.

In general, each hypothesis h in H represents a boolean-valued function defined over the instance space X; that is, h : X → {0, 1}. Note that this is for a two class system

The goal of the learner is to find a hypothesis h such that h(x) = c(x) for all x in X, where c is the target concept


Target Concept

In regression
  The various “y” values of the training instances
  Function approximation
Naïve Bayes, KNN, and Decision Tree
  Class


Hypotheses

In regression
  Line; the coefficients (or other equation members such as exponents)
Naïve Bayes
  Class of an instance is predicted by determining the most probable class given the training data. That is, by finding the probability for each class for each dimension, multiplying these probabilities (across the dimensions for each class) and taking the class with the maximum probability as the predicted class
KNN
  Class of an instance is predicted by examining an instance’s neighborhood
Decision Tree
  Tree itself
Candidate Elimination
  Conjunction of constraints on the attributes


Something Else We’ve Been Doing

Supervised Learning
  Supervision from an oracle that knows the classes of the training data
Is there unsupervised learning?
  Yes, covered in pattern recognition
  Seeks to determine how the data are organized
    Clustering
    PCA
    Edge detection


Finally

Definition of Machine Learning

Machine learning addresses the question of how to build computer programs that improve their performance at some task through experience.


Learning Checkers

All about representation
  Our representation
End game is to develop a function that returns the best next move


chooseNextMove

Look at every legal move

Determine goodness (score) of resultant board state

Return the move with the highest score (argmax)
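These three steps amount to an argmax over the legal moves. The sketch below keeps the game itself abstract: `legal_moves`, `apply_move`, and `score` are hypothetical callables supplied by the caller, and the integer "board" in the demo is a toy stand-in, not a checkers position.

```python
def choose_next_move(board, legal_moves, apply_move, score):
    """Return the legal move whose resulting board state scores highest."""
    return max(legal_moves(board), key=lambda move: score(apply_move(board, move)))


# Toy demo: the board is an integer, a move adds to it, the score is the value itself.
best = choose_next_move(
    0,
    legal_moves=lambda board: [1, -2, 3],
    apply_move=lambda board, move: board + move,
    score=lambda board: board,
)
print(best)  # 3
```

Passing the three helpers as parameters keeps the selection logic independent of how boards and moves are represented.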


How to Assess a Board State

Score function, we will keep it simple
Work with a polynomial with just a few variables
  X1: the number of black pieces on the board
  X2: the number of red pieces on the board
  X3: the number of black kings on the board
  X4: the number of red kings on the board
  X5: the number of black pieces threatened by red
  X6: the number of red pieces threatened by black


Score(b)

\[ \mathit{Score}(b) = w_0 + w_1 x_1 + w_2 x_2 + w_3 x_3 + w_4 x_4 + w_5 x_5 + w_6 x_6 \]

X1: the number of black pieces on the board
X2: the number of red pieces on the board
X3: the number of black kings on the board
X4: the number of red kings on the board
X5: the number of black pieces threatened by red
X6: the number of red pieces threatened by black

Gotta learn them weights

But how?
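The score is a plain weighted sum of the six board features plus a constant term. A minimal sketch follows; the weight values and the feature vector are made up for illustration only and are not learned or taken from the slides.

```python
def score(weights, features):
    """Compute w0 + w1*x1 + ... + w6*x6 for one board state."""
    w0, *rest = weights  # w0 is the constant term
    return w0 + sum(w * x for w, x in zip(rest, features))


# Illustrative values only (assumed, not from the source).
weights = [0.0, 1.0, -1.0, 2.0, -2.0, -0.5, 0.5]
features = [10, 8, 1, 0, 2, 1]  # x1..x6 for some hypothetical board
print(score(weights, features))  # 3.5
```

With these made-up weights, black material counts in favor and red material counts against, so the function scores the board from black's point of view.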


Training

Train the scoring function

A bunch of board states (a series of games)

Use them to jiggle the weights
  Must know the current real “score” vs. the “predicted score” using the polynomial


A trick

If my predictor is good then it will be self-consistent

That is, the score of my best move should lead to a good scoring board state

If it doesn’t, maybe we should adjust our predictor

[Figure: precognition]


ScoreBasedUponSuccessor

\[ \mathit{ScoreBasedUponSuccessor}(b) = \mathit{score}(\mathit{successor}(b)) \]

Successor returns the board state of the best move (returned by chooseNextMove(b))

It has been found to be surprisingly successful


Learning

For each training sample (board states from a series of games)

If the game is over (zero opponent pieces on the board is a win) could give some fixed score (100 if win, -100 if lose)

\[ w_i = w_i + \eta\,(\mathit{ScoreBasedUponSuccessor}(b) - \mathit{score}(b))\,x_i \]

Look familiar?

LMS (least mean squares) weight update rule
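One step of the update rule can be sketched as below. Two assumptions are made here and are not stated on the slide: the constant weight w0 is paired with a fixed input x0 = 1, and the training target (the successor-based score, or the fixed win/lose score) is passed in as `target`.

```python
def lms_update(weights, features, target, eta=0.1):
    """Apply w_i = w_i + eta * (target - score(b)) * x_i to every weight."""
    xs = [1.0] + list(features)  # assumed x0 = 1 so that w0 is updated too
    predicted = sum(w * x for w, x in zip(weights, xs))
    error = target - predicted
    return [w + eta * error * x for w, x in zip(weights, xs)]


weights = [0.0] * 7                     # w0..w6, all starting at zero
features = [1, 0, 0, 0, 0, 0]           # x1..x6 for a hypothetical board
new_weights = lms_update(weights, features, target=10.0)
print(new_weights[:2])                  # w0 and w1 both move toward the target
```

Weights attached to features that are zero on this board are left unchanged, since the update is proportional to x_i.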


Classifier?

Is this a classifier?
Is it Machine Learning?


Makes a big deal…

At the beginning of candidate elimination (pg 29): the difference between “satisfies” and “consistent with”
  x satisfies h when h(x)=1, regardless of whether x is a positive or negative example
  Whether x is consistent with h depends on the target concept: it is when h(x)=c(x)
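The distinction is easy to see side by side. In this sketch the function names and the `(instance, label)` encoding are assumptions for illustration; the example record is the negative one from the EnjoySport table.

```python
def satisfies(x, h):
    """h(x) = 1: the instance matches every constraint, whatever its label."""
    return all(c == "?" or c == v for c, v in zip(h, x))


def consistent(x, label, h):
    """h(x) = c(x): the hypothesis agrees with the example's actual class."""
    return satisfies(x, h) == (label == "Yes")


negative = ("Rainy", "Cold", "High", "Strong", "Warm", "Change")
h1 = ("Sunny", "?", "?", "Strong", "?", "?")
h2 = ("?", "?", "?", "Strong", "?", "?")

print(satisfies(negative, h1), consistent(negative, "No", h1))  # False True
print(satisfies(negative, h2), consistent(negative, "No", h2))  # True False
```

A negative example is consistent with a hypothesis exactly when it does not satisfy it, which is why the two notions must be kept apart.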