Lecture 21: Decision Trees - Nakul Gopalan

TRANSCRIPT

Page 1: Lecture 21 Decision Tree - Nakul Gopalan

Nakul Gopalan

Georgia Tech

Decision Tree

Machine Learning CS 4641

These slides are adapted from Polo, Vivek Srikumar, Mahdi Roozbahani and Chao Zhang.

Page 2: Lecture 21 Decision Tree - Nakul Gopalan

[Scatter plot of training data over two features, X1 and X2]

Page 3: Lecture 21 Decision Tree - Nakul Gopalan

Visual Introduction to Decision Tree

Building a tree to distinguish homes in New York from homes in San Francisco: interpretable decisions for verifiable outcomes!

Page 4: Lecture 21 Decision Tree - Nakul Gopalan

Decision Tree: Example (2)


Will I play tennis today?

Page 5: Lecture 21 Decision Tree - Nakul Gopalan

Decision trees (DT)

The classifier: f_T(x) = the majority class in the leaf of tree T containing x

Model parameters: the tree structure and size

[Example tree rooted at the "Outlook?" attribute]
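The leaf rule above can be sketched in a couple of lines of Python (a hypothetical `majority_class` helper; routing x to its leaf in T is omitted):

```python
from collections import Counter

# Sketch of the prediction rule f_T(x): once x reaches a leaf of tree T,
# predict the majority class among the training labels stored in that leaf.
def majority_class(leaf_labels):
    return Counter(leaf_labels).most_common(1)[0][0]

print(majority_class(["Yes", "No", "Yes"]))  # -> Yes
```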

Page 6: Lecture 21 Decision Tree - Nakul Gopalan

[xkcd comic: "Flow Charts"]

Page 7: Lecture 21 Decision Tree - Nakul Gopalan

Decision trees

Pieces:

1. Find the best attribute to split on
2. Find the best split on the chosen attribute
3. Decide when to stop splitting
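Piece 1 is usually done by comparing information gain across attributes, as in ID3. A minimal sketch (hypothetical helper names and toy data, not the lecture's code):

```python
import math
from collections import Counter

def entropy(labels):
    """H(S) = -sum_i p_i * log2(p_i) over the class proportions in `labels`."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(rows, labels, attr):
    """Entropy reduction from splitting the rows on a categorical attribute."""
    by_value = {}
    for row, y in zip(rows, labels):
        by_value.setdefault(row[attr], []).append(y)
    remainder = sum(len(ys) / len(labels) * entropy(ys)
                    for ys in by_value.values())
    return entropy(labels) - remainder

# Toy rows in the spirit of the "play tennis" example (made-up values):
rows = [{"Outlook": "Sunny"}, {"Outlook": "Sunny"},
        {"Outlook": "Overcast"}, {"Outlook": "Rain"}]
labels = ["No", "No", "Yes", "Yes"]
print(info_gain(rows, labels, "Outlook"))  # 1.0: this split is maximally informative
```

Choosing the best attribute then amounts to taking the argmax of `info_gain` over the candidates; piece 2 (choosing a threshold) arises when attributes are continuous.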

Page 8: Lecture 21 Decision Tree - Nakul Gopalan

Label

Categorical or Discrete attributes

Page 10: Lecture 21 Decision Tree - Nakul Gopalan

Attribute

Page 11: Lecture 21 Decision Tree - Nakul Gopalan

Continuous attributes or ordered attributes

Page 12: Lecture 21 Decision Tree - Nakul Gopalan

Test data

Page 18: Lecture 21 Decision Tree - Nakul Gopalan

Information Content

Coin flip

Which coin will give us the purest information? Entropy ~ Uncertainty

Lower uncertainty, higher information gain

[Figure: three coins C1, C2, C3 with different heads/tails probabilities]
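The entropy behind this comparison (the standard binary-entropy formula; it is not on the recovered slides) can be computed directly. A fair coin is maximally uncertain; a deterministic coin carries no uncertainty:

```python
import math

def coin_entropy(p_heads):
    """Entropy in bits of a coin with P(heads) = p_heads."""
    probs = [p_heads, 1.0 - p_heads]
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(coin_entropy(0.5))            # 1.0 bit: fair coin, highest uncertainty
print(coin_entropy(1.0))            # 0.0 bits: always heads, no uncertainty
print(round(coin_entropy(0.9), 3))  # 0.469 bits: biased coin, in between
```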

Page 28: Lecture 21 Decision Tree - Nakul Gopalan

Announcements

• Touchpoint deliverables due today: video + PPT

• Touchpoint next class: individual BlueJeans calls with mentors

• Please make sure to attend your touchpoints for feedback!

• Physical touchpoint survey is due back today

• Questions?

Page 33: Lecture 21 Decision Tree - Nakul Gopalan

At a split on a continuous attribute, smaller values go to the left branch and larger values to the right.
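A minimal sketch of that routing rule (the tie-handling at exactly the threshold is an assumption; conventions differ between implementations):

```python
# Route a continuous attribute value at an internal node:
# below the threshold -> left child, at or above -> right child.
def route(value, threshold):
    return "left" if value < threshold else "right"

print(route(3.2, 5.0))  # left
print(route(7.1, 5.0))  # right
```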


Page 47: Lecture 21 Decision Tree - Nakul Gopalan

What will happen if a tree is too large?

• Overfitting

• High variance

• Instability in predicting test data

Page 48: Lecture 21 Decision Tree - Nakul Gopalan

How to avoid overfitting?

• Acquire more training data

• Remove irrelevant attributes (manual process - not always possible)

• Grow full tree, then post-prune

• Ensemble learning

Page 49: Lecture 21 Decision Tree - Nakul Gopalan

Reduced-Error Pruning

Split data into training and validation sets.

Grow tree based on training set.

Do until further pruning is harmful:

1. Evaluate impact on validation set of pruning each possible node (plus those below it)

2. Greedily remove the node that most improves validation set accuracy
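One greedy step of this loop can be sketched on a toy dictionary-based tree (hypothetical structure and helper names, not the lecture's code):

```python
# Toy tree: either a class label (a leaf) or a dict holding the split
# attribute, one child per attribute value, and a majority-label fallback.
def predict(tree, x):
    while isinstance(tree, dict):
        tree = tree["children"].get(x[tree["attr"]], tree["majority"])
    return tree

def accuracy(tree, data):
    return sum(predict(tree, x) == y for x, y in data) / len(data)

def maybe_prune_root(tree, val_data):
    """Replace the root's subtree with its majority leaf if validation
    accuracy does not get worse -- one greedy reduced-error pruning step."""
    if isinstance(tree, dict) and accuracy(tree["majority"], val_data) >= accuracy(tree, val_data):
        return tree["majority"]
    return tree

tree = {"attr": "Wind", "majority": "No",
        "children": {"Strong": "Yes", "Weak": "No"}}
val = [({"Wind": "Strong"}, "No"),
       ({"Wind": "Weak"}, "No"),
       ({"Wind": "Strong"}, "No")]
print(maybe_prune_root(tree, val))  # -> No (the split hurts validation accuracy)
```

A full implementation would apply this test bottom-up to every internal node and repeat until no replacement improves validation accuracy.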

Page 50: Lecture 21 Decision Tree - Nakul Gopalan

How to decide whether to remove a node using pruning

• Pruning replaces a whole subtree by a leaf node.

• The replacement takes place if a decision rule establishes that the expected error rate in the subtree is greater than in the single leaf.

Example:

3 training data points
Actual label: 1 positive, 2 negative
Predicted label: 1 positive, 2 negative
3 correct and 0 incorrect

6 validation data points
Actual label: 2 positive, 4 negative
Predicted label: 4 positive, 2 negative
2 correct and 4 incorrect

If we had simply predicted the majority class (negative), we would make 2 errors instead of 4. Pruned!
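The decision can be checked with a few lines of arithmetic ("+" and "-" stand for the positive and negative classes of the six validation points in the example):

```python
# Validation labels from the example: 2 positive, 4 negative.
actual = ["+", "+", "-", "-", "-", "-"]

# A single majority-class leaf predicts the most common validation label.
majority = max(set(actual), key=actual.count)     # "-"
leaf_errors = sum(y != majority for y in actual)  # errs only on the 2 positives

# The subtree got only 2 of the 6 points right, i.e. 4 errors.
subtree_errors = 4

print(majority, leaf_errors, subtree_errors)  # - 2 4
print(leaf_errors < subtree_errors)           # True -> prune the subtree
```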