

Tutorial 2

CPSC 340: Machine Learning and Data Mining

Fall 2017

1/13


Overview

1 Decision Tree
  Decision Stump
  Decision Tree

2 Training, Testing, and Validation Set

2/13


Decision Stump

Decision stump: a simple decision tree with one splitting rule based on one feature.

Binary example:

Assigns a label to each leaf based on the most frequent label.

How to find the best splitting rule?

Score the rules. An intuitive score: classification accuracy.
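For a numeric feature, a candidate rule has the form "feature j > threshold t". A minimal sketch of scoring one such rule by classification accuracy (the helper names and the binary-label assumption are mine, not from the slide):

```julia
# Score the candidate rule "feature j > threshold t" by classification accuracy.
# Assumes binary labels 1 and 2, as in the cities data on the next slide.
majority(y) = count(==(1), y) >= count(==(2), y) ? 1 : 2

function rule_accuracy(X, y, j, t)
    above = X[:, j] .> t                   # which examples satisfy the rule
    yes = majority(y[above])               # most frequent label on each side
    no  = majority(y[.!above])
    yhat = [a ? yes : no for a in above]   # the stump's predictions
    return sum(yhat .== y) / length(y)     # fraction classified correctly
end
```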

3/13


The Dataset

X: a matrix where each row corresponds to a city; the first column is longitude, the second column is latitude.

y: class label

1 for blue states, 2 for red states.

Given a new city Xtest, predict its label ytest.
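As a concrete illustration, the variables might look like this (the coordinates below are made up for the sketch; this is not the actual course dataset):

```julia
# Made-up coordinates for illustration only (not the course dataset):
X = [-122.4  37.8;    # city in a blue state
      -74.0  40.7;    # city in a blue state
      -97.5  35.5;    # city in a red state
      -86.8  33.5]    # city in a red state
y = [1, 1, 2, 2]      # 1 = blue, 2 = red

Xtest = [-80.2 25.8]  # new city whose label ytest we want to predict
```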

4/13


exampledecisionStump.jl

5/13


decisionStump.jl
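The listing from this slide isn't shown here. A sketch in the spirit of decisionStump.jl, fitting a stump by exhaustively scoring every (feature, threshold) pair with the rule_accuracy and majority helpers from the earlier sketch (the names and details are assumptions, not the original code):

```julia
# Fit a stump by brute force: score every (feature, threshold) pair
# with rule_accuracy and keep the best one.
function fit_stump(X, y)
    d = size(X, 2)
    best = (acc = -1.0, j = 1, t = -Inf)
    for j in 1:d, t in unique(X[:, j])
        acc = rule_accuracy(X, y, j, t)
        if acc > best.acc
            best = (acc = acc, j = j, t = t)
        end
    end
    above = X[:, best.j] .> best.t
    return (j = best.j, t = best.t,
            yes = majority(y[above]),    # label predicted when x[j] > t
            no  = majority(y[.!above]))  # label predicted otherwise
end

predict_stump(stump, x) = x[stump.j] > stump.t ? stump.yes : stump.no
```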

6/13


Decision Tree

Decision stumps have only 1 rule based on only 1 feature.

Very limited class of models: usually not very accurate for most tasks.

Decision trees allow sequences of splits based on multiple features.

Very general class of models: can get very high accuracy. However, it's computationally infeasible to find the best decision tree.

Most common decision tree learning algorithm in practice:

Greedy recursive splitting.

7/13


exampleDecisionTree.jl and decisionTree.jl
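Again, the original listings aren't shown here. A compact sketch of greedy recursive splitting, reusing the fit_stump/predict_stump sketch above: fit a stump, split the data on it, and recurse on each side until a depth limit or a pure node is reached.

```julia
# A fitted tree is either a leaf (just a label) or a split node with
# a stump and two subtrees.
struct Node
    stump   # the (j, t, yes, no) stump chosen at this node
    yes     # subtree (or leaf label) when x[j] > t
    no      # subtree (or leaf label) otherwise
end

function fit_tree(X, y, depth)
    # Leaf cases: depth exhausted, or all labels already agree.
    (depth == 0 || length(unique(y)) == 1) && return majority(y)
    stump = fit_stump(X, y)
    above = X[:, stump.j] .> stump.t
    # If the best stump doesn't actually split the data, stop here.
    (all(above) || !any(above)) && return majority(y)
    return Node(stump,
                fit_tree(X[above, :], y[above], depth - 1),
                fit_tree(X[.!above, :], y[.!above], depth - 1))
end

predict_tree(node::Node, x) =
    x[node.stump.j] > node.stump.t ? predict_tree(node.yes, x) : predict_tree(node.no, x)
predict_tree(label, x) = label   # hit a leaf: return its label
```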

8/13


Rewriting a decision tree using if/else statements

Decision tree:

If-else statement:
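The slide's two panels aren't shown here. As a hypothetical example (thresholds invented for illustration), a depth-2 tree that splits on longitude and then latitude translates directly into nested if/else:

```julia
# Hypothetical depth-2 tree written as nested if/else (thresholds made up).
function predict(x)
    if x[1] > -96.0        # root: split on longitude
        if x[2] > 39.0     # second level: split on latitude
            return 1       # blue
        else
            return 2       # red
        end
    else
        return 1           # blue
    end
end
```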

9/13


Training, Testing, and Validation Set

Given training data, we would like to learn a model to minimize error on the testing data.

How do we decide decision tree depth?

We care about test error.

But we can’t look at test data.

So what do we do?

One answer: use part of your training data to approximate the test error.

Split the training objects into a training set and a validation set:

Train the model on the training set. Test the model on the validation set (see the sketch below).
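A minimal sketch of this split, reusing the fit_tree/predict_tree sketches from above (the 50/50 proportion and depth 3 are arbitrary choices for illustration):

```julia
using Random

# Hold out half of the training examples as a validation set.
n = size(X, 1)
perm = randperm(n)                  # random order, in case examples are sorted
half = div(n, 2)
Xtrain, ytrain = X[perm[1:half], :], y[perm[1:half]]
Xvalid, yvalid = X[perm[half+1:end], :], y[perm[half+1:end]]

model = fit_tree(Xtrain, ytrain, 3)
yhat  = [predict_tree(model, Xvalid[i, :]) for i in 1:size(Xvalid, 1)]
valid_error = sum(yhat .!= yvalid) / length(yvalid)   # stand-in for test error
```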

10/13


Cross-Validation

Isn’t it wasteful to only use part of your data?

k-fold cross-validation:

Train on k-1 folds of the data, validate on the other fold.

Repeat this k times with different splits, and average the score.

Figure 1: k-fold cross-validation (adapted from Wikipedia).

Note: if the examples are ordered, the split should be random (the sketch below shuffles the fold assignment for this reason).
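A sketch of the procedure, reusing the fit_tree/predict_tree sketches from earlier (the function name and the default k are mine):

```julia
using Random

# Average validation accuracy over k random folds.
function cross_validate(X, y, depth; k = 5)
    n = size(X, 1)
    fold = shuffle(repeat(1:k, outer = cld(n, k))[1:n])  # random fold id per example
    score = 0.0
    for f in 1:k
        train = fold .!= f                # train on k-1 folds...
        valid = fold .== f                # ...validate on the remaining one
        model = fit_tree(X[train, :], y[train], depth)
        yhat = [predict_tree(model, X[i, :]) for i in findall(valid)]
        score += sum(yhat .== y[valid]) / count(valid)
    end
    return score / k                      # average the k validation scores
end
```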

11/13


Problem: 2-Fold Cross Validation

Modify the code below to compute the 2-fold cross-validation scores on the training data alone.

Find the depth that would be chosen by cross-validation.

12/13


Solution: 2-Fold Cross Validation
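The solution listing itself isn't shown here. A sketch of the computation it describes, using the cross_validate helper from the previous sketch with k = 2 and keeping the depth with the best average validation score:

```julia
# Sweep candidate depths; 2-fold CV means each half of the training
# data takes one turn as the validation set.
depths = 1:10
scores = [cross_validate(X, y, d; k = 2) for d in depths]
best_depth = depths[argmax(scores)]
println("depth chosen by 2-fold cross-validation: ", best_depth)
```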

13/13