Decision Trees

Upload: eric-owen

Post on 21-Jan-2016

TRANSCRIPT

Page 1

Decision Trees

Example  Sky     AirTemp  Humidity  Wind    Water  Forecast  EnjoySport
1        Sunny   Warm     Normal    Strong  Warm   Same      Yes
2        Sunny   Warm     High      Strong  Warm   Same      Yes
3        Rainy   Cold     High      Strong  Warm   Change    No
4        Sunny   Warm     High      Strong  Cool   Change    Yes
5        Cloudy  Warm     High      Weak    Cool   Same      Yes
6        Cloudy  Cold     High      Weak    Cool   Same      No

Page 2

Decision Trees

Sky
  Sunny: Yes
  Rainy: No
  Cloudy: AirTemp
    Warm: Yes
    Cold: No

$(Sky = Sunny) \lor (Sky = Cloudy \land AirTemp = Warm)$

Page 3

Decision Trees

Sky
  Sunny: Yes
  Rainy: No
  Cloudy: AirTemp
    Warm: Yes
    Cold: No

Example  Sky     AirTemp  Humidity  Wind    Water  Forecast  EnjoySport
7        Rainy   Warm     Normal    Weak    Cool   Same      ?
8        Cloudy  Warm     High      Strong  Cool   Change    ?
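
To see how the tree labels the two unseen examples, here is a minimal Python sketch. The nested-tuple encoding `TREE` and the helper `classify` are illustrative assumptions, not part of the slides; only the tree shape and the attribute values of examples 7 and 8 come from the deck.

```python
# Hypothetical encoding of the slide's tree: an internal node is
# (attribute, {value: subtree}); a leaf is simply the class label.
TREE = ("Sky", {
    "Sunny": "Yes",
    "Rainy": "No",
    "Cloudy": ("AirTemp", {"Warm": "Yes", "Cold": "No"}),
})


def classify(tree, example):
    """Follow the branch matching each tested attribute until a leaf is reached."""
    while not isinstance(tree, str):
        attribute, branches = tree
        tree = branches[example[attribute]]
    return tree


example_7 = {"Sky": "Rainy", "AirTemp": "Warm", "Humidity": "Normal",
             "Wind": "Weak", "Water": "Cool", "Forecast": "Same"}
example_8 = {"Sky": "Cloudy", "AirTemp": "Warm", "Humidity": "High",
             "Wind": "Strong", "Water": "Cool", "Forecast": "Change"}

print(classify(TREE, example_7))  # No
print(classify(TREE, example_8))  # Yes
```

Under this tree, example 7 follows the Rainy branch straight to a leaf, while example 8 needs the second test on AirTemp.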

Page 4

Decision Trees

Humidity
  Normal: Yes
  High: Sky
    Sunny: Yes
    Rainy: No
    Cloudy: AirTemp
      Warm: Yes
      Cold: No

Page 5

Decision Trees

[Figure: grid of + examples divided by the tests A1 = v1 and A2 = v2]

Page 6

Homogeneity of Examples

• $Entropy(S) = -p_+ \log_2 p_+ - p_- \log_2 p_-$

[Figure: plot with an axis tick at 0.5]

Page 7

Homogeneity of Examples

• $Entropy(S) = \sum_{i=1}^{c} -p_i \log_2 p_i$ (impurity measure)
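
The multi-class formula can be sketched in a few lines of Python. The function name `entropy` and the choice to pass a plain list of class labels are assumptions made for illustration; classes that do not occur are simply never counted, so no $\log_2 0$ is evaluated.

```python
import math
from collections import Counter


def entropy(labels):
    """Entropy(S) = sum over the c classes of -p_i * log2(p_i)."""
    total = len(labels)
    counts = Counter(labels)  # class label -> number of examples
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


print(entropy(["Yes", "No"]))          # 1.0  (maximally impure, two classes)
print(entropy(["Yes", "Yes", "Yes"]))  # 0.0  (perfectly homogeneous)
print(entropy(["a", "b", "c", "d"]))   # 2.0  (four equally likely classes)
```

For two classes the value ranges from 0 (all examples in one class) to 1 (an even split), which is why entropy serves as an impurity measure.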

Page 8

Information Gain

• $Gain(S, A) = Entropy(S) - \sum_{v \in Values(A)} (|S_v|/|S|)\,Entropy(S_v)$

[Figure: attribute A partitions S into subsets $S_{v_1}, S_{v_2}, \ldots$]
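
As a rough sketch of the gain formula, the function below groups the class labels by attribute value to form each $S_v$ and subtracts the weighted entropies. The names `information_gain`, `values`, and `labels`, and the toy inputs, are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    total = len(labels)
    return sum(-(n / total) * math.log2(n / total)
               for n in Counter(labels).values())


def information_gain(values, labels):
    """Gain(S, A), where values[i] is the value of attribute A for example i."""
    subsets = defaultdict(list)  # v -> class labels of S_v
    for v, label in zip(values, labels):
        subsets[v].append(label)
    # Weighted entropy remaining after splitting on A.
    remainder = sum(len(s) / len(labels) * entropy(s) for s in subsets.values())
    return entropy(labels) - remainder


# Toy data: the attribute separates the classes perfectly.
print(information_gain(["a", "a", "b", "b"], ["+", "+", "-", "-"]))  # 1.0
# Toy data: the attribute says nothing about the class.
print(information_gain(["a", "b", "a", "b"], ["+", "+", "-", "-"]))  # 0.0
```

A perfect split recovers all of Entropy(S) as gain, while an uninformative attribute leaves every subset as impure as S and yields zero gain.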

Page 9

Example

• $Entropy(S) = -p_+ \log_2 p_+ - p_- \log_2 p_- = -(4/6)\log_2(4/6) - (2/6)\log_2(2/6)$

$= 0.390 + 0.528 = 0.918$

• $Gain(S, Sky)$

$= Entropy(S) - \sum_{v \in \{Sunny, Rainy, Cloudy\}} (|S_v|/|S|)\,Entropy(S_v)$

$= Entropy(S) - [(3/6)\,Entropy(S_{Sunny}) + (1/6)\,Entropy(S_{Rainy}) + (2/6)\,Entropy(S_{Cloudy})]$

$= Entropy(S) - (2/6)\,Entropy(S_{Cloudy})$ (since $S_{Sunny}$ and $S_{Rainy}$ are pure, their entropies are 0)

$= Entropy(S) - (2/6)[-(1/2)\log_2(1/2) - (1/2)\log_2(1/2)]$

$= 0.918 - 0.333 = 0.585$

Page 10

Example

• $Entropy(S) = -p_+ \log_2 p_+ - p_- \log_2 p_- = -(4/6)\log_2(4/6) - (2/6)\log_2(2/6)$

$= 0.390 + 0.528 = 0.918$

• $Gain(S, Water)$

$= Entropy(S) - \sum_{v \in \{Warm, Cool\}} (|S_v|/|S|)\,Entropy(S_v)$

$= Entropy(S) - [(3/6)\,Entropy(S_{Warm}) + (3/6)\,Entropy(S_{Cool})]$

$= Entropy(S) - (3/6) \cdot 2 \cdot [-(2/3)\log_2(2/3) - (1/3)\log_2(1/3)]$

$= Entropy(S) - 0.390 - 0.528$

$= 0$
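
The hand calculations for Sky and Water can be checked against the six training examples with a short script. This is a sketch: the dict-per-example layout and the helper names are assumptions, and results are rounded to three decimals, so the last digit may differ from hand-truncated figures.

```python
import math
from collections import Counter

COLUMNS = ["Sky", "AirTemp", "Humidity", "Wind", "Water", "Forecast", "EnjoySport"]
ROWS = [  # the six training examples from the EnjoySport table
    ("Sunny",  "Warm", "Normal", "Strong", "Warm", "Same",   "Yes"),
    ("Sunny",  "Warm", "High",   "Strong", "Warm", "Same",   "Yes"),
    ("Rainy",  "Cold", "High",   "Strong", "Warm", "Change", "No"),
    ("Sunny",  "Warm", "High",   "Strong", "Cool", "Change", "Yes"),
    ("Cloudy", "Warm", "High",   "Weak",   "Cool", "Same",   "Yes"),
    ("Cloudy", "Cold", "High",   "Weak",   "Cool", "Same",   "No"),
]
EXAMPLES = [dict(zip(COLUMNS, row)) for row in ROWS]


def entropy(examples, target="EnjoySport"):
    counts = Counter(e[target] for e in examples)
    n = len(examples)
    return sum(-(c / n) * math.log2(c / n) for c in counts.values())


def gain(examples, attribute):
    remainder = 0.0
    for value in {e[attribute] for e in examples}:
        subset = [e for e in examples if e[attribute] == value]
        remainder += len(subset) / len(examples) * entropy(subset)
    return entropy(examples) - remainder


print(round(entropy(EXAMPLES), 3))        # 0.918
print(round(gain(EXAMPLES, "Sky"), 3))    # 0.585
print(round(gain(EXAMPLES, "Water"), 3))  # 0.0
```

Water yields no gain because both of its subsets keep the same 2:1 ratio of Yes to No as the full set.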

Page 11

Example

Sky
  Sunny: Yes
  Rainy: No
  Cloudy: ?

• $Gain(S_{Cloudy}, AirTemp)$

$= Entropy(S_{Cloudy}) - \sum_{v \in \{Warm, Cold\}} (|S_v|/|S_{Cloudy}|)\,Entropy(S_v)$ (here $S_v$ is the subset of $S_{Cloudy}$ with value $v$)

$= 1$

• $Gain(S_{Cloudy}, Humidity)$

$= Entropy(S_{Cloudy}) - \sum_{v \in \{Normal, High\}} (|S_v|/|S_{Cloudy}|)\,Entropy(S_v)$

$= 0$
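
Both values can be worked out by hand from the training table. $S_{Cloudy}$ contains only example 5 (Warm, High, Yes) and example 6 (Cold, High, No). The derivation below is a sketch that writes the subsets as sets of example numbers, a notation assumed here for brevity:

```latex
\begin{aligned}
Entropy(S_{Cloudy}) &= -\tfrac{1}{2}\log_2\tfrac{1}{2} - \tfrac{1}{2}\log_2\tfrac{1}{2} = 1 \\
Gain(S_{Cloudy}, AirTemp) &= 1 - \left[\tfrac{1}{2}\,Entropy(\{5\}) + \tfrac{1}{2}\,Entropy(\{6\})\right] = 1 - 0 = 1 \\
Gain(S_{Cloudy}, Humidity) &= 1 - \tfrac{2}{2}\,Entropy(\{5, 6\}) = 1 - 1 = 0
\end{aligned}
```

AirTemp splits the two Cloudy examples into pure singletons, whereas Humidity is High for both and leaves the subset unchanged, so AirTemp is the test chosen under the Cloudy branch.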

Page 13

Inductive Bias

• Hypothesis space: complete!

• Shorter trees are preferred over larger trees

• Prefer the simplest hypothesis that fits the data

Page 14

Inductive Bias

• Decision Tree algorithm: searches incompletely through a complete hypothesis space.

Preference bias

• Candidate-Elimination algorithm: searches completely through an incomplete hypothesis space.

Restriction bias

Page 16

Overfitting

• $h \in H$ is said to overfit the training data if there exists $h' \in H$ such that $h$ has smaller error than $h'$ over the training examples, but $h'$ has a smaller error than $h$ over the entire distribution of instances:

$error_{train}(h) < error_{train}(h')$ and $error_{D}(h') < error_{D}(h)$

• Overfitting can occur when:

– There is noise in the data

– The number of training examples is too small to produce a representative sample of the target concept
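
The definition of overfitting is a comparison between two hypotheses, which the sketch below spells out as a predicate. The function `overfits` and the error rates passed to it are hypothetical and chosen only to illustrate the two inequalities. In practice the error over the whole distribution is unknown and is usually estimated on held-out data.

```python
def overfits(h_train_err, h_true_err, alt_train_err, alt_true_err):
    """Return True if h overfits relative to the alternative hypothesis h'.

    h must have lower error than h' on the training examples, while h'
    has lower error than h over the whole distribution of instances.
    """
    return h_train_err < alt_train_err and alt_true_err < h_true_err


# Hypothetical error rates: h fits the training set perfectly but generalizes worse.
print(overfits(h_train_err=0.00, h_true_err=0.30,
               alt_train_err=0.10, alt_true_err=0.15))  # True
# Swapping the roles: the simpler hypothesis does not overfit relative to h.
print(overfits(h_train_err=0.10, h_true_err=0.15,
               alt_train_err=0.00, alt_true_err=0.30))  # False
```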

Page 17

Homework

Exercises 3-13.4 (Chapter 3, ML textbook)