Machine Learning
Lecture 10
Decision Trees
G53MLE Machine Learning, Dr Guoping Qiu
Trees: terminology — node, root, leaf, branch, path, depth.
Decision Trees
A hierarchical data structure that represents data by implementing a divide-and-conquer strategy.
Can be used as a non-parametric classification method.
Given a collection of examples, learn a decision tree that represents it.
Use this representation to classify new examples.
Decision Trees
Each node is associated with a feature (one of the elements of the feature vector that represents an object).
Each node tests the value of its associated feature.
There is one branch for each value of the feature.
Leaves specify the categories (classes).
Can categorize instances into multiple disjoint categories (multi-class).
Outlook
  Sunny    -> Humidity
               High   -> No
               Normal -> Yes
  Overcast -> Yes
  Rain     -> Wind
               Strong -> No
               Weak   -> Yes
Decision Trees: Play Tennis Example
Feature vector = (Outlook, Temperature, Humidity, Wind)
Decision Trees
In the tree above, each internal node (Outlook, Humidity, Wind) is associated with a feature.
Decision Trees: Play Tennis Example
Feature values:
Outlook = (sunny, overcast, rain)
Temperature = (hot, mild, cool)
Humidity = (high, normal)
Wind = (strong, weak)
Decision Trees
Outlook = (sunny, overcast, rain): there is one branch for each value of the feature (Sunny, Overcast, Rain).
Decision Trees
Class = (Yes, No): the leaf nodes specify the classes.
Decision Trees
Designing a decision tree classifier:
Picking the root node.
Recursively branching.
Decision Trees: Picking the Root Node
Consider data with two Boolean attributes (A, B) and two classes + and -:
{ (A=0, B=0), - }: 50 examples
{ (A=0, B=1), - }: 50 examples
{ (A=1, B=0), - }: 3 examples
{ (A=1, B=1), + }: 100 examples
Decision Trees: Picking the Root Node
The trees look structurally similar; which attribute should we choose?

Candidate 1 (root A):
  A = 0 -> -
  A = 1 -> B:
    B = 0 -> -
    B = 1 -> +

Candidate 2 (root B):
  B = 0 -> -
  B = 1 -> A:
    A = 0 -> -
    A = 1 -> +
Decision Trees: Picking the Root Node
The goal is to have the resulting decision tree as small as possible (Occam's Razor).
The main decision in the algorithm is the selection of the next attribute to condition on (starting from the root node).
We want attributes that split the examples into sets that are relatively pure in one label; this way we are closer to a leaf node.
The most popular heuristic is based on information gain, which originated with Quinlan's ID3 system.
Entropy
S is a sample of training examples.
p+ is the proportion of positive examples in S.
p- is the proportion of negative examples in S.
Entropy measures the impurity of S:

Entropy(S) = -p+ log2 p+ - p- log2 p-

(Plotted as a function of p+, the entropy peaks at p+ = 0.5 and is zero at p+ = 0 or p+ = 1.)
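The binary entropy defined above can be sketched in Python (a minimal illustration, not part of the lecture; the function name is my own):

```python
import math

def entropy(p_pos):
    """Binary entropy of a sample whose proportion of positive examples is p_pos."""
    result = 0.0
    for p in (p_pos, 1.0 - p_pos):
        if p > 0:                      # treat 0 * log2(0) as 0
            result -= p * math.log2(p)
    return result
```

For example, entropy(0.5) = 1.0 (maximally impure) and entropy(0.0) = entropy(1.0) = 0.0 (pure).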
[Figure: two collections of + and - examples.]
A highly disorganized (thoroughly mixed) collection has high entropy: much information is required to describe it.
A highly organized (well separated) collection has low entropy: little information is required to describe it.
Information Gain
Gain(S, A) = expected reduction in entropy due to sorting on A:

Gain(S, A) = Entropy(S) - Σ_{v ∈ Values(A)} (|Sv| / |S|) Entropy(Sv)

Values(A) is the set of all possible values of attribute A; Sv is the subset of S for which attribute A has value v; |S| and |Sv| represent the number of samples in set S and set Sv respectively.
Gain(S, A) is the expected reduction in entropy caused by knowing the value of attribute A.
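When the positive/negative counts within each value's subset Sv are known, Gain(S, A) can be computed directly; a sketch (function names are my own):

```python
import math

def entropy_counts(pos, neg):
    """Entropy of a set containing pos positive and neg negative examples."""
    total = pos + neg
    e = 0.0
    for c in (pos, neg):
        if c > 0:
            p = c / total
            e -= p * math.log2(p)
    return e

def info_gain(splits):
    """splits: one (pos, neg) pair per value v of attribute A, i.e. the subsets Sv."""
    pos = sum(p for p, _ in splits)
    neg = sum(n for _, n in splits)
    gain = entropy_counts(pos, neg)                       # Entropy(S)
    for p, n in splits:                                   # minus weighted Entropy(Sv)
        gain -= (p + n) / (pos + neg) * entropy_counts(p, n)
    return gain
```

For instance, for the Play Tennis data used later in the lecture, info_gain([(3, 4), (6, 1)]) reproduces Gain(S, Humidity) ≈ 0.151.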
Information Gain
Example: choose A or B?

Split on A:
  A = 0 -> 100 -
  A = 1 -> 100 +, 3 -

Split on B:
  B = 0 -> 53 -
  B = 1 -> 100 +, 50 -
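Information gain answers this numerically. A quick self-contained check (helper names are my own; the counts are the ones listed above):

```python
import math

def H(pos, neg):
    """Entropy from positive/negative example counts."""
    total = pos + neg
    return -sum(c / total * math.log2(c / total) for c in (pos, neg) if c)

def gain(splits):
    """Information gain; splits: (pos, neg) counts per attribute value."""
    pos, neg = sum(p for p, _ in splits), sum(n for _, n in splits)
    return H(pos, neg) - sum((p + n) / (pos + neg) * H(p, n) for p, n in splits)

gain_A = gain([(0, 100), (100, 3)])   # A=0: 100 -;  A=1: 100 +, 3 -
gain_B = gain([(0, 53), (100, 50)])   # B=0: 53 -;   B=1: 100 +, 50 -
# gain_A ≈ 0.90, gain_B ≈ 0.32, so splitting on A is preferred
```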
Example: Play Tennis

Day    Outlook   Temperature  Humidity  Wind    Play Tennis
Day1   Sunny     Hot          High      Weak    No
Day2   Sunny     Hot          High      Strong  No
Day3   Overcast  Hot          High      Weak    Yes
Day4   Rain      Mild         High      Weak    Yes
Day5   Rain      Cool         Normal    Weak    Yes
Day6   Rain      Cool         Normal    Strong  No
Day7   Overcast  Cool         Normal    Strong  Yes
Day8   Sunny     Mild         High      Weak    No
Day9   Sunny     Cool         Normal    Weak    Yes
Day10  Rain      Mild         Normal    Weak    Yes
Day11  Sunny     Mild         Normal    Strong  Yes
Day12  Overcast  Mild         High      Strong  Yes
Day13  Overcast  Hot          Normal    Weak    Yes
Day14  Rain      Mild         High      Strong  No

Entropy(S) = -(9/14) log2(9/14) - (5/14) log2(5/14) = 0.94
Example
Split on Humidity:
  High:   3+, 4-,  E = 0.985
  Normal: 6+, 1-,  E = 0.592

Gain(S, Humidity) = 0.94 - (7/14) × 0.985 - (7/14) × 0.592 = 0.151
Example
Split on Wind:
  Weak:   6+, 2-,  E = 0.811
  Strong: 3+, 3-,  E = 1.0

Gain(S, Wind) = 0.94 - (8/14) × 0.811 - (6/14) × 1.0 = 0.048
Example
Split on Outlook:
  Sunny    (Days 1, 2, 8, 9, 11):  2+, 3-,  E = 0.970
  Overcast (Days 3, 7, 12, 13):    4+, 0-,  E = 0.0
  Rain     (Days 4, 5, 6, 10, 14): 3+, 2-,  E = 0.970

Gain(S, Outlook) = 0.94 - (5/14) × 0.970 - (4/14) × 0 - (5/14) × 0.970 = 0.246
Example
Gain(S, Outlook) = 0.246
Gain(S, Humidity) = 0.151
Gain(S, Wind) = 0.048
Gain(S, Temperature) = 0.029

Outlook has the largest gain, so pick Outlook as the root.
Example
With Outlook as the root:
  Sunny    (Days 1, 2, 8, 9, 11):  2+, 3- -> ?
  Overcast (Days 3, 7, 12, 13):    4+, 0- -> Yes
  Rain     (Days 4, 5, 6, 10, 14): 3+, 2- -> ?

Continue until every attribute is included in the path, or all examples in the leaf have the same label.
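The recursive procedure described above (pick the highest-gain attribute, split, recurse until a stopping condition) can be sketched as a simplified ID3; this is my own illustrative implementation, applied to the (A, B) data from earlier in the lecture:

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy of a list of class labels."""
    total = len(labels)
    return -sum(c / total * math.log2(c / total) for c in Counter(labels).values())

def id3(examples, labels, attributes):
    """examples: list of dicts {attribute: value}.
    Returns a class label (leaf) or a tuple (attribute, {value: subtree})."""
    if len(set(labels)) == 1:                 # all examples share one label -> leaf
        return labels[0]
    if not attributes:                        # no attribute left -> majority-vote leaf
        return Counter(labels).most_common(1)[0][0]

    def gain(a):                              # information gain of splitting on a
        g = entropy(labels)
        for v in {e[a] for e in examples}:
            sub = [l for e, l in zip(examples, labels) if e[a] == v]
            g -= len(sub) / len(labels) * entropy(sub)
        return g

    best = max(attributes, key=gain)          # attribute with the largest gain
    rest = [a for a in attributes if a != best]
    branches = {}
    for v in {e[best] for e in examples}:     # one branch per observed value
        idx = [i for i, e in enumerate(examples) if e[best] == v]
        branches[v] = id3([examples[i] for i in idx], [labels[i] for i in idx], rest)
    return (best, branches)

# The Boolean (A, B) data from the root-picking example
X = ([{'A': 0, 'B': 0}] * 50 + [{'A': 0, 'B': 1}] * 50
     + [{'A': 1, 'B': 0}] * 3 + [{'A': 1, 'B': 1}] * 100)
y = ['-'] * 103 + ['+'] * 100
tree = id3(X, y, ['A', 'B'])   # == ('A', {0: '-', 1: ('B', {0: '-', 1: '+'})})
```

The learned tree splits on A first, as the gain comparison predicted.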
Example
Sunny branch (Days 1, 2, 8, 9, 11):

Day    Outlook  Temperature  Humidity  Wind    Play Tennis
Day1   Sunny    Hot          High      Weak    No
Day2   Sunny    Hot          High      Strong  No
Day8   Sunny    Mild         High      Weak    No
Day9   Sunny    Cool         Normal    Weak    Yes
Day11  Sunny    Mild         Normal    Strong  Yes

Gain(Ssunny, Humidity) = 0.97 - (3/5) × 0 - (2/5) × 0 = 0.97
Gain(Ssunny, Temperature) = 0.97 - (2/5) × 0 - (2/5) × 1 - (1/5) × 0 = 0.57
Gain(Ssunny, Wind) = 0.97 - (2/5) × 1 - (3/5) × 0.92 = 0.02
Example
Humidity has the largest gain on the Sunny branch, so:

Outlook
  Sunny    -> Humidity
               High   -> No
               Normal -> Yes
  Overcast -> Yes
  Rain     -> ?
Example
Rain branch (Days 4, 5, 6, 10, 14): 3+, 2- -> ?

Gain(Srain, Humidity) =
Gain(Srain, Temperature) =
Gain(Srain, Wind) =
Example
Wind is chosen for the Rain branch, giving the final decision tree:

Outlook
  Sunny    -> Humidity
               High   -> No
               Normal -> Yes
  Overcast -> Yes
  Rain     -> Wind
               Strong -> No
               Weak   -> Yes
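The finished tree can be transcribed directly into code (a hand transcription of the tree above; the function name is my own):

```python
def play_tennis(outlook, humidity, wind):
    """Classify with the learned tree: test Outlook first, then Humidity or Wind."""
    if outlook == 'Overcast':
        return 'Yes'
    if outlook == 'Sunny':
        return 'Yes' if humidity == 'Normal' else 'No'
    # remaining case: outlook == 'Rain'
    return 'Yes' if wind == 'Weak' else 'No'
```

Applied to the training rows, this tree reproduces every Play Tennis label in the table.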
Tutorial/Exercise Questions
An experiment has produced the following 3-D feature vectors X = (x1, x2, x3) belonging to two classes. Design a decision tree classifier to classify an unknown feature vector X = (1, 2, 1).

x1  x2  x3  Class
1   1   1   1
1   1   1   2
1   1   1   1
2   1   1   2
2   1   2   1
2   2   2   2
2   2   2   1
2   2   1   2
1   2   2   2
1   1   2   1

X = (1, 2, 1) = ?