TRANSCRIPT
Decision Tree & Random Forest Algorithm
Outline
Introduction
Example of Decision Tree
Principles of Decision Tree
– Entropy
– Information gain
Random Forest
The problem
Given a set of training cases/objects and their attribute values, try to determine the target attribute value of new examples.
– Classification
– Prediction
[Diagram: a learning algorithm performs induction on the Training Set to learn a Model; the Model is then applied (deduction) to the Test Set.]

Training Set:
Tid  Attrib1  Attrib2  Attrib3  Class
1    Yes      Large    125K     No
2    No       Medium   100K     No
3    No       Small    70K      No
4    Yes      Medium   120K     No
5    No       Large    95K      Yes
6    No       Medium   60K      No
7    Yes      Large    220K     No
8    No       Small    85K      Yes
9    No       Medium   75K      No
10   No       Small    90K      Yes

Test Set:
Tid  Attrib1  Attrib2  Attrib3  Class
11   No       Small    55K      ?
12   Yes      Medium   80K      ?
13   Yes      Large    110K     ?
14   No       Small    95K      ?
15   No       Large    67K      ?
Key Requirements
Attribute-value description: object or case must be expressible in terms of a fixed collection of properties or attributes (e.g., hot, mild, cold).
Predefined classes (target values): the target function has discrete output values (boolean or multiclass).
Sufficient data: enough training cases should be provided to learn the model.
A simple example
Principled Criterion
Choosing the most useful attribute for classifying examples.
Entropy
– A measure of the homogeneity of the set of examples
– If the sample is completely homogeneous the entropy is zero; if the sample is equally divided, it has entropy of one
Information Gain
– Measures how well a given attribute separates the training examples according to their target classification
– This measure is used to select among the candidate attributes at each step while growing the tree
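These two quantities can be written down directly. A minimal sketch (the function names and the list-of-dicts data layout are illustrative, not from the slides):

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a list of class labels (0 = pure, 1 = 50/50 binary split)."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(examples, attribute, target):
    """Entropy reduction from splitting `examples` (a list of dicts) on `attribute`."""
    base = entropy([ex[target] for ex in examples])
    remainder = 0.0
    for value in {ex[attribute] for ex in examples}:
        subset = [ex[target] for ex in examples if ex[attribute] == value]
        remainder += len(subset) / len(examples) * entropy(subset)
    return base - remainder
```

The attribute with the largest `information_gain` becomes the next decision node.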
Information Gain
Step 1 : Calculate entropy of the target
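For the Play Golf data usually used with this example (9 Yes and 5 No out of 14 cases — an assumption, since the table itself is not reproduced here), Step 1 works out to:

```python
from math import log2

# Target column "Play Golf": 9 Yes, 5 No out of 14 cases (assumed standard dataset)
p_yes, p_no = 9 / 14, 5 / 14
entropy_target = -(p_yes * log2(p_yes) + p_no * log2(p_no))
print(round(entropy_target, 3))  # 0.94
```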
Information Gain (Cont’d)
Step 2 : Calculate information gain for each attribute
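Continuing with the assumed Play Golf data, the gain for one attribute (Outlook) can be computed from the class counts in each branch:

```python
from math import log2

def H(p, n):
    """Entropy of a subset with p positive and n negative examples."""
    total = p + n
    h = 0.0
    for count in (p, n):
        if count:  # an empty class contributes nothing
            q = count / total
            h -= q * log2(q)
    return h

# Outlook branches in the assumed Play Golf data:
# Sunny: 2 Yes / 3 No, Overcast: 4 Yes / 0 No, Rainy: 3 Yes / 2 No
splits = [(2, 3), (4, 0), (3, 2)]
total = sum(p + n for p, n in splits)                       # 14 cases
remainder = sum((p + n) / total * H(p, n) for p, n in splits)
gain_outlook = H(9, 5) - remainder
print(round(gain_outlook, 3))  # 0.247
```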
Information Gain (Cont’d)
Step 3: Choose attribute with the largest information gain as the decision node.
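With the gains in hand, Step 3 is a single argmax. The gain values below are the ones usually quoted for the Play Golf data, assumed here for illustration:

```python
# Information gain per attribute (values assumed from the standard Play Golf example)
gains = {"Outlook": 0.247, "Temperature": 0.029, "Humidity": 0.152, "Windy": 0.048}
root = max(gains, key=gains.get)  # attribute with the largest information gain
print(root)  # Outlook
```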
Information Gain (Cont’d)
Step 4a: A branch with entropy of 0 is a leaf node.
Information Gain (Cont’d)
Step 4b: A branch with entropy more than 0 needs further splitting.
Information Gain (Cont’d)
Step 5: The algorithm is run recursively on the non-leaf branches, until all data is classified.
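Steps 1–5 together amount to a short recursive procedure. A minimal ID3-style sketch (function names and the list-of-dicts data layout are illustrative):

```python
from collections import Counter
from math import log2

def entropy(labels):
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def info_gain(examples, attr, target):
    labels = [ex[target] for ex in examples]
    rem = 0.0
    for v in {ex[attr] for ex in examples}:
        subset = [ex[target] for ex in examples if ex[attr] == v]
        rem += len(subset) / len(labels) * entropy(subset)
    return entropy(labels) - rem

def build_tree(examples, attributes, target):
    """Run recursively on non-leaf branches until all data is classified."""
    labels = [ex[target] for ex in examples]
    if len(set(labels)) == 1:            # Step 4a: entropy 0 -> leaf node
        return labels[0]
    if not attributes:                   # no attributes left -> majority-class leaf
        return Counter(labels).most_common(1)[0][0]
    best = max(attributes, key=lambda a: info_gain(examples, a, target))  # Step 3
    rest = [a for a in attributes if a != best]
    return (best, {v: build_tree([ex for ex in examples if ex[best] == v],
                                 rest, target)                            # Step 4b/5
                   for v in {ex[best] for ex in examples}})
```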
Random Forest
Decision Tree : one tree
Random Forest : more than one tree
Decision Tree & Random Forest
[Diagram: a single Decision Tree vs. a Random Forest of several trees (Tree 1, Tree 2, Tree 3).]
Decision Tree
Outlook  Temp.  Humidity  Windy  Play Golf
Rainy    Mild   High      False  ?
Result : No
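Classifying the query row is a walk from the root to a leaf. A sketch, with a tree structure assumed for illustration (the slides show only the result, not the fitted tree):

```python
def classify(tree, case):
    """Walk a nested (attribute, {value: subtree}) tree until a leaf label is reached."""
    while isinstance(tree, tuple):
        attr, branches = tree
        tree = branches[case[attr]]
    return tree

# Hypothetical tree consistent with the slide's result (structure assumed):
tree = ("Outlook", {
    "Rainy": ("Humidity", {"High": "No", "Normal": "Yes"}),
    "Overcast": "Yes",
    "Sunny": ("Windy", {False: "Yes", True: "No"}),
})
query = {"Outlook": "Rainy", "Temp": "Mild", "Humidity": "High", "Windy": False}
print(classify(tree, query))  # No
```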
Random Forest
Tree 1 : No
Tree 2 : No
Tree 3 : Yes
Yes : 1, No : 2
Result : No
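The majority vote shown above is just a count (tree outputs taken from the slide):

```python
from collections import Counter

votes = {"Tree 1": "No", "Tree 2": "No", "Tree 3": "Yes"}
tally = Counter(votes.values())          # Yes: 1, No: 2
result = tally.most_common(1)[0][0]      # majority class wins
print(result)  # No
```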
OOB Error Rate
The out-of-bag (OOB) error rate can be used to get a running unbiased estimate of the classification error as trees are added to the forest.
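The idea behind OOB: each tree is grown on a bootstrap sample, and on average about a third of the cases are never drawn into that sample; those out-of-bag cases act as a built-in test set for the tree. A small sketch of the sampling (seed and dataset size are illustrative):

```python
import random

random.seed(0)                                       # illustrative seed
n = 1000                                             # illustrative dataset size
bootstrap = [random.randrange(n) for _ in range(n)]  # draw n cases with replacement
oob = set(range(n)) - set(bootstrap)                 # cases never drawn: out-of-bag
print(round(len(oob) / n, 2))                        # roughly 1/e ≈ 0.37 of the data
```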