![Page 1: REUMass Amherst 2015 Data Science Bootcampmarlin/teaching/REU/bootcamp-day2.pdf · 2015-06-02 · REUMass Amherst 2015 Data Science Bootcamp Day 2: Machine Learning and Research Methods](https://reader034.vdocuments.us/reader034/viewer/2022050311/5f738ec0a677de04be4bb87b/html5/thumbnails/1.jpg)
REUMass Amherst 2015 Data Science Bootcamp
Day 2: Machine Learning and Research Methods
Prof. Ben Marlin, Prof. David Jensen
Plan for Day 2:
• Introduction to Machine Learning
• Introduction to Research Methods
Introduction to Machine Learning
What is Machine Learning?
Mitchell (1997): “A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.”
Substitute “data D” for “experience E.”
Machine Learning Tasks
• Supervised (learning to predict): Classification, Regression
• Unsupervised (learning to organize and represent): Clustering, Dimensionality Reduction
Classification
Example: Digits
Example: Natural Images
Example: Fisher’s Iris Data

| sepal length | sepal width | petal length | petal width | Type |
|---|---|---|---|---|
| 5.1 | 3.5 | 1.4 | 0.2 | Setosa |
| 4.9 | 3.0 | 1.4 | 0.2 | Setosa |
| 4.7 | 3.2 | 1.3 | 0.2 | Setosa |
| 4.6 | 3.1 | 1.5 | 0.2 | Setosa |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
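Classifier Learning

| sepal length | sepal width | petal length | petal width | Type |
|---|---|---|---|---|
| 5.1 | 3.5 | 1.4 | 0.2 | Setosa |
| 4.9 | 3.0 | 1.4 | 0.2 | Setosa |
| 4.7 | 3.2 | 1.3 | 0.2 | Setosa |
| 4.6 | 3.1 | 1.5 | 0.2 | Setosa |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |

Each row pairs a feature vector x with a label y:
• Row 1: x1 = (5.1, 3.5, 1.4, 0.2), y1 = Setosa
• Row 2: x2 = (4.9, 3.0, 1.4, 0.2), y2 = Setosa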
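The x and y labels on this slide pair each row's four measurements with its type. A minimal sketch of that representation in plain Python follows; the variable names (`FEATURE_NAMES`, `rows`, `X`, `y`) are assumptions made for illustration, and only the four rows visible on the slide are included.

```python
# The four rows shown on the slide, stored as (features, label) pairs.
FEATURE_NAMES = ["sepal length", "sepal width", "petal length", "petal width"]

rows = [
    ([5.1, 3.5, 1.4, 0.2], "Setosa"),
    ([4.9, 3.0, 1.4, 0.2], "Setosa"),
    ([4.7, 3.2, 1.3, 0.2], "Setosa"),
    ([4.6, 3.1, 1.5, 0.2], "Setosa"),
]

# Split into inputs X (one feature vector per row) and outputs y (one label per row).
X = [features for features, _ in rows]
y = [label for _, label in rows]

# The slide counts rows from 1; Python lists count from 0.
x1, y1 = X[0], y[0]
x2, y2 = X[1], y[1]

print(x1, y1)  # [5.1, 3.5, 1.4, 0.2] Setosa
print(x2, y2)  # [4.9, 3.0, 1.4, 0.2] Setosa
```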
Classifier Learning
f(5.1, 3.5, 1.4, 0.2) = Setosa
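In code, a classifier is simply a function from a feature vector to a label. The sketch below uses a hand-written rule purely to show that shape; it is not a learned classifier and not a method from the slides, and the 2.5 threshold on petal length is an assumption chosen for illustration.

```python
def f(x):
    """Hypothetical hand-written rule, not a learned classifier.

    x is [sepal length, sepal width, petal length, petal width].
    The 2.5 threshold on petal length is an illustrative assumption.
    """
    petal_length = x[2]
    return "Setosa" if petal_length < 2.5 else "not Setosa"


print(f([5.1, 3.5, 1.4, 0.2]))  # Setosa
```

Classifier learning replaces the hand-picked rule with one chosen automatically from the labeled rows.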
Classifier Performance
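The figure for this slide did not survive extraction. One common performance measure, and the one the exercises at the end of this section ask for, is the error rate: the fraction of examples whose predicted label differs from the true label. The sketch below assumes that definition; the function name `error_rate` and the example label lists are made up for illustration.

```python
def error_rate(y_true, y_pred):
    """Fraction of predictions that differ from the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("need at least one example")
    mistakes = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return mistakes / len(y_true)


# Made-up labels: one of the four predictions is wrong.
y_true = ["Setosa", "Setosa", "Versicolor", "Virginica"]
y_pred = ["Setosa", "Versicolor", "Versicolor", "Virginica"]

print(error_rate(y_true, y_pred))  # 0.25
```

Accuracy is the complement, 1 minus the error rate. Computing the error rate on the same data used to fit the classifier gives the training error.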
K Nearest Neighbors
• Need to pick number of neighbors K.
• Need to define a distance function.
[Figure: axes Feature 1 and Feature 2]
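The two choices named on this slide, the number of neighbors K and the distance function, map directly onto the two parameters of a small implementation. The following is a minimal sketch in plain Python, assuming Euclidean distance as the default and a simple majority vote; the function names and the toy 2D points are made up for illustration and do not come from the slides.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def knn_predict(X_train, y_train, x, k, distance=euclidean):
    """Predict the label of x by majority vote among its k nearest training points."""
    if not 1 <= k <= len(X_train):
        raise ValueError("k must be between 1 and the number of training points")
    # Rank training indices from nearest to farthest from x.
    order = sorted(range(len(X_train)), key=lambda i: distance(X_train[i], x))
    # Count the labels of the k nearest points. Tie-breaking is left to
    # Counter.most_common here; a real implementation should choose it explicitly.
    votes = Counter(y_train[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Made-up 2D points in two well-separated groups.
X_train = [[1.0, 1.0], [1.2, 0.8], [0.9, 1.1], [3.0, 3.0], [3.2, 2.9], [2.8, 3.1]]
y_train = ["A", "A", "A", "B", "B", "B"]

print(knn_predict(X_train, y_train, [1.1, 1.0], k=3))  # A
print(knn_predict(X_train, y_train, [3.0, 3.1], k=3))  # B
```

Passing a different `distance` callable changes how "nearest" is defined without touching the voting logic. An odd K avoids tied votes when there are only two classes.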
Exercises: KNN and Fisher’s Iris Data
• Q1: What pair of features gives the lowest training error when using K=5?
• Q2: What is the training error rate for K=5 when using all of the features?
• Q3: What value of K gives the lowest error on the training data when using all of the features?
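One way to set up these exercises is to compute the training error for a chosen subset of feature columns and then loop over all pairs. The sketch below shows that structure on a small made-up dataset with three features, not on the Iris measurements, so it does not answer the questions. All names (`knn_predict`, `training_error`, `pair_errors`) are assumptions, and K=3 is used only because the toy set has six points.

```python
import math
from collections import Counter
from itertools import combinations


def knn_predict(X_train, y_train, x, k):
    """Majority vote among the k training points closest to x (Euclidean distance)."""
    def dist(a, b):
        return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))
    order = sorted(range(len(X_train)), key=lambda i: dist(X_train[i], x))
    return Counter(y_train[i] for i in order[:k]).most_common(1)[0][0]


def training_error(X, y, k, columns):
    """Error rate when the training points themselves are classified,
    using only the feature columns listed in `columns`."""
    cols = list(columns)
    X_sub = [[row[c] for c in cols] for row in X]
    predictions = [knn_predict(X_sub, y, x, k) for x in X_sub]
    return sum(p != t for p, t in zip(predictions, y)) / len(y)


# Made-up data: three features, two classes.
X = [
    [1.0, 5.0, 0.1],
    [1.1, 1.0, 0.2],
    [0.9, 3.0, 0.1],
    [3.0, 4.8, 0.9],
    [3.1, 1.2, 1.0],
    [2.9, 3.1, 0.8],
]
y = ["A", "A", "A", "B", "B", "B"]
K = 3  # the exercises use K=5 on the full Iris data

n_features = len(X[0])

# Training error for every pair of feature columns (the shape of Q1).
pair_errors = {
    pair: training_error(X, y, K, pair)
    for pair in combinations(range(n_features), 2)
}
best_pair = min(pair_errors, key=pair_errors.get)

# Training error using all feature columns (the shape of Q2).
all_features_error = training_error(X, y, K, range(n_features))

print(pair_errors)
print(best_pair)
print(all_features_error)
```

For Q3, the same `training_error` call can be repeated over a range of K values with all columns selected.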