![Page 1: Fundamentals of machine learning 1 Types of machine learning In-sample and out-of-sample errors Version space VC dimension](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649e8f5503460f94b9284a/html5/thumbnails/1.jpg)
Fundamentals of machine learning 1
Types of machine learning
In-sample and out-of-sample errors
Version space
VC dimension
![Page 2: Fundamentals of machine learning 1 Types of machine learning In-sample and out-of-sample errors Version space VC dimension](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649e8f5503460f94b9284a/html5/thumbnails/2.jpg)
Unsupervised learning: input only – no labels
Coins in a vending machine cluster by size and weight.
How many clusters are here?
Would different attributes make the clusters more distinct?
![Page 3: Fundamentals of machine learning 1 Types of machine learning In-sample and out-of-sample errors Version space VC dimension](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649e8f5503460f94b9284a/html5/thumbnails/3.jpg)
Supervised learning: every example has a label
Labels have enabled a model based on linear discriminants that will let the vending machine guess coin value without facial recognition.
![Page 4: Fundamentals of machine learning 1 Types of machine learning In-sample and out-of-sample errors Version space VC dimension](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649e8f5503460f94b9284a/html5/thumbnails/4.jpg)
Reinforcement learning: no single correct output
Data: input, graded output
Find the relationship between input and high-grade outputs
![Page 5: Fundamentals of machine learning 1 Types of machine learning In-sample and out-of-sample errors Version space VC dimension](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649e8f5503460f94b9284a/html5/thumbnails/5.jpg)
In-sample error, Ein: How well do the boundaries match the training data?
Out-of-sample error, Eout: How often will this system fail if implemented in the field?
![Page 6: Fundamentals of machine learning 1 Types of machine learning In-sample and out-of-sample errors Version space VC dimension](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649e8f5503460f94b9284a/html5/thumbnails/6.jpg)
Quality of data mainly determines success of machine learning
How many data points? How much uncertainty?
We assume each datum is labeled correctly.
Uncertainty is in the values of the attributes.
![Page 7: Fundamentals of machine learning 1 Types of machine learning In-sample and out-of-sample errors Version space VC dimension](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649e8f5503460f94b9284a/html5/thumbnails/7.jpg)
Choosing the right model
A good model has small in-sample error and generalizes well.
Often a tradeoff between these characteristics is required.
![Page 8: Fundamentals of machine learning 1 Types of machine learning In-sample and out-of-sample errors Version space VC dimension](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649e8f5503460f94b9284a/html5/thumbnails/8.jpg)
A type of model defines a hypothesis set
A particular member of the set is selected by minimizing some in-sample error.
The error definition varies with the problem but is usually local (i.e., accumulated from the error at each data point).
Linear discriminants
![Page 9: Fundamentals of machine learning 1 Types of machine learning In-sample and out-of-sample errors Version space VC dimension](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649e8f5503460f94b9284a/html5/thumbnails/9.jpg)
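$$X = \{x^t, r^t\}_{t=1}^{N}$$
$$r = \begin{cases} 1 & \text{if } x \text{ describes a family car} \\ -1 & \text{if } x \text{ describes any other type} \end{cases}$$
$$x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$$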
Lecture Notes for E Alpaydın 2010 Introduction to Machine Learning 2e © The MIT Press (V1.0)
Examples of family cars
Supervised learning is the focus of this course.
Example: Dichotomy based on 2 attributes
![Page 10: Fundamentals of machine learning 1 Types of machine learning In-sample and out-of-sample errors Version space VC dimension](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649e8f5503460f94b9284a/html5/thumbnails/10.jpg)
Assume family car (class C) is uniquely defined by a range of price and engine power:
$$(p_1 \le \text{price} \le p_2) \ \text{AND} \ (e_1 \le \text{engine power} \le e_2)$$
Assume that the blue rectangle is the true boundary of class C.
In a real problem, of course, we don't know this.
![Page 11: Fundamentals of machine learning 1 Types of machine learning In-sample and out-of-sample errors Version space VC dimension](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649e8f5503460f94b9284a/html5/thumbnails/11.jpg)
Hypothesis class H: axis-aligned rectangles
In-sample error of h is defined by
$$E_{in}(h \mid X) = \sum_{t=1}^{N} 1\left(h(x^t) \ne r^t\right)$$
where 1(·) is 1 when its argument is true and 0 otherwise.
h = yellow rectangle is a particular member of H
Count misclassifications
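The misclassification count can be sketched in a few lines of Python. This is only an illustration: the `Rect` type, the ±1 label convention, and the data points are assumptions made for the example, not values from the slides.

```python
from typing import NamedTuple, Sequence, Tuple

Point = Tuple[float, float]  # (price, engine power)


class Rect(NamedTuple):
    """Hypothetical axis-aligned rectangle hypothesis h."""
    p1: float  # lower price bound
    p2: float  # upper price bound
    e1: float  # lower engine-power bound
    e2: float  # upper engine-power bound

    def predict(self, x: Point) -> int:
        """Return +1 if x falls inside the rectangle, otherwise -1."""
        price, power = x
        is_inside = self.p1 <= price <= self.p2 and self.e1 <= power <= self.e2
        return 1 if is_inside else -1


def in_sample_error(h: Rect, data: Sequence[Tuple[Point, int]]) -> int:
    """Count the training examples (x, r) for which h(x) != r."""
    return sum(1 for x, r in data if h.predict(x) != r)


# Made-up training set: three family cars (+1) and two other cars (-1).
train = [
    ((3.0, 3.0), 1),
    ((5.0, 4.0), 1),
    ((6.5, 4.5), 1),
    ((1.0, 1.0), -1),
    ((9.0, 5.0), -1),
]
h = Rect(p1=2.0, p2=6.0, e1=2.0, e2=5.0)
error_count = in_sample_error(h, train)
print(error_count)
```

Here h misses the family car at price 6.5 because its upper price bound is 6.0, so exactly one example is misclassified.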
![Page 12: Fundamentals of machine learning 1 Types of machine learning In-sample and out-of-sample errors Version space VC dimension](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649e8f5503460f94b9284a/html5/thumbnails/12.jpg)
Hypothesis class H: axis-aligned rectangles
$$E_{in}(h \mid X) = \sum_{t=1}^{N} 1\left(h(x^t) \ne r^t\right)$$
For dataset shown, in-sample error on h is zero, but we expect out-of-sample error to be nonzero
h = yellow rectangle is a particular member of H
h leaves room for false positives and false negatives
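A small experiment can make the gap between Ein and Eout concrete. The sketch below assumes a hypothetical true rectangle C, trains on a coarse grid of points, and uses a fine grid of held-out points as a stand-in for out-of-sample data. All numbers are invented for illustration.

```python
def inside(rect, x):
    """True if point x = (price, power) lies in rect = (p1, p2, e1, e2)."""
    p1, p2, e1, e2 = rect
    return p1 <= x[0] <= p2 and e1 <= x[1] <= e2


def label(true_rect, points):
    """Attach the correct +1/-1 label to every point."""
    return [(x, 1 if inside(true_rect, x) else -1) for x in points]


def error_rate(rect, data):
    """Fraction of examples that rect misclassifies."""
    wrong = sum(1 for x, r in data if (1 if inside(rect, x) else -1) != r)
    return wrong / len(data)


C = (2.0, 8.0, 2.0, 8.0)  # assumed true class boundary, unknown in practice

# Coarse training grid: coordinates 0, 3, 6, 9 on each axis.
train = label(C, [(float(i), float(j)) for i in range(0, 11, 3) for j in range(0, 11, 3)])

# h: the tightest rectangle around the positive training examples.
positives = [x for x, r in train if r == 1]
h = (
    min(p[0] for p in positives), max(p[0] for p in positives),
    min(p[1] for p in positives), max(p[1] for p in positives),
)

# Fine held-out grid (step 0.5) used to estimate the out-of-sample error.
held_out = label(C, [(i / 2, j / 2) for i in range(21) for j in range(21)])

e_in = error_rate(h, train)
e_out_estimate = error_rate(h, held_out)
print(e_in, e_out_estimate)
```

Because this h is the tightest fit, all of its held-out mistakes are false negatives. A looser rectangle would trade some of them for false positives.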
![Page 13: Fundamentals of machine learning 1 Types of machine learning In-sample and out-of-sample errors Version space VC dimension](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649e8f5503460f94b9284a/html5/thumbnails/13.jpg)
Should we expect the negative examples to cluster?
family car
![Page 14: Fundamentals of machine learning 1 Types of machine learning In-sample and out-of-sample errors Version space VC dimension](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649e8f5503460f94b9284a/html5/thumbnails/14.jpg)
S, G, and the Version Space
Most specific hypothesis, S, with zero Ein
Most general hypothesis, G
Any h ∈ H between S and G is consistent (no error); all such h make up the version space.
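For axis-aligned rectangles, S is simply the bounding box of the positive examples, and membership in the version space can be checked by testing consistency with the training data. The following sketch assumes at least one positive example and uses invented data. It does not compute G, which is not always unique for rectangles.

```python
def inside(rect, x):
    """True if point x lies in rect = (p1, p2, e1, e2)."""
    p1, p2, e1, e2 = rect
    return p1 <= x[0] <= p2 and e1 <= x[1] <= e2


def most_specific(train):
    """S: the tightest rectangle containing every positive example."""
    pos = [x for x, r in train if r == 1]
    return (
        min(p[0] for p in pos), max(p[0] for p in pos),
        min(p[1] for p in pos), max(p[1] for p in pos),
    )


def is_consistent(rect, train):
    """True if rect has zero in-sample error, i.e. it is in the version space."""
    return all(inside(rect, x) == (r == 1) for x, r in train)


# Made-up training set: (price, engine power) with +1/-1 labels.
train = [
    ((3, 3), 1), ((5, 4), 1), ((4, 6), 1),
    ((1, 1), -1), ((9, 5), -1), ((4, 9), -1),
]

S = most_specific(train)
wider = (2, 7, 2, 8)      # larger than S but still excludes every negative
too_wide = (2, 10, 2, 8)  # swallows the negative example at (9, 5)

print(S, is_consistent(S, train), is_consistent(wider, train), is_consistent(too_wide, train))
```

Both `S` and `wider` belong to the version space, while `too_wide` does not.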
![Page 15: Fundamentals of machine learning 1 Types of machine learning In-sample and out-of-sample errors Version space VC dimension](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649e8f5503460f94b9284a/html5/thumbnails/15.jpg)
[Figure: rectangles G and S]
A dichotomizer has been trained by N examples. Results are poor due to limited data.
An expert will label any additional attribute vector that I specify.
Where should attribute vectors be chosen to make the most effective use of the expert?
![Page 16: Fundamentals of machine learning 1 Types of machine learning In-sample and out-of-sample errors Version space VC dimension](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649e8f5503460f94b9284a/html5/thumbnails/16.jpg)
Margin: distance between boundary and closest instance in a specified class
S and G hypotheses have narrow margins; not expected to “generalize” well.
Even though Ein is zero, we expect Eout to be large. Why?
[Figure: rectangles G and S]
![Page 17: Fundamentals of machine learning 1 Types of machine learning In-sample and out-of-sample errors Version space VC dimension](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649e8f5503460f94b9284a/html5/thumbnails/17.jpg)
Choose h in the version space with largest margin to maximize generalization
Data points that determine S and G are shaded. They “support” h with largest margins
Logic behind “support vector machines”
Greatest distance between S and G
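The idea of picking the consistent hypothesis with the largest margin can be sketched as a simple search. This is not a support vector machine. It assumes, purely for illustration, that candidates are obtained by growing S by the same amount `delta` on every side, and it measures the margin to the closest instance of either class. The data are invented.

```python
import math


def inside(rect, x):
    """True if point x lies in rect = (p1, p2, e1, e2)."""
    p1, p2, e1, e2 = rect
    return p1 <= x[0] <= p2 and e1 <= x[1] <= e2


def is_consistent(rect, train):
    """True if rect classifies every training example correctly."""
    return all(inside(rect, x) == (r == 1) for x, r in train)


def distance_to_boundary(rect, x):
    """Euclidean distance from point x to the rectangle's boundary."""
    p1, p2, e1, e2 = rect
    if inside(rect, x):
        return min(x[0] - p1, p2 - x[0], x[1] - e1, e2 - x[1])
    dx = max(p1 - x[0], 0.0, x[0] - p2)
    dy = max(e1 - x[1], 0.0, x[1] - e2)
    return math.hypot(dx, dy)


def margin(rect, train):
    """Distance from the boundary to the closest training instance."""
    return min(distance_to_boundary(rect, x) for x, _ in train)


train = [
    ((3, 3), 1), ((5, 4), 1), ((4, 6), 1),
    ((1, 1), -1), ((9, 5), -1), ((4, 9), -1),
]
pos = [x for x, r in train if r == 1]
S = (min(p[0] for p in pos), max(p[0] for p in pos),
     min(p[1] for p in pos), max(p[1] for p in pos))

# Grow S in steps of 0.5 and keep only the consistent candidates.
candidates = []
for k in range(6):
    delta = 0.5 * k
    rect = (S[0] - delta, S[1] + delta, S[2] - delta, S[3] + delta)
    if is_consistent(rect, train):
        candidates.append((margin(rect, train), delta, rect))

best_margin, best_delta, best_rect = max(candidates)
print(best_margin, best_delta, best_rect)
```

S itself has margin 0 because positive examples sit on its boundary. Growing it too far brings the boundary close to the negative at (1, 1), so the best candidate lies in between.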
![Page 18: Fundamentals of machine learning 1 Types of machine learning In-sample and out-of-sample errors Version space VC dimension](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649e8f5503460f94b9284a/html5/thumbnails/18.jpg)
Vapnik Chervonenkis Dimension, dVC
• H is a hypothesis set for a dichotomizer.
• H(X) is the set of dichotomies created by applying H to a dataset X with N points.
• N points can be labeled ±1 in 2^N ways.
• Regardless of the size of H, |H(X)| is bounded by 2^N.
• H “shatters” N points if there is no way to label the points that is not consistent with some member of H.
• dVC(H) = k if k is the largest number of points that can be shattered by H.
• dVC(H) is called the “capacity” of H.
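These definitions can be checked by brute force on tiny datasets. The sketch below uses axis-aligned rectangles as H and relies on one observation: a labeling is realizable exactly when the tightest rectangle around the +1 points contains no −1 point. The point sets are made up for illustration.

```python
from itertools import product


def realizable(points, labels):
    """Can some axis-aligned rectangle give exactly these +1/-1 labels?"""
    pos = [p for p, y in zip(points, labels) if y == 1]
    if not pos:
        return True  # a rectangle placed away from all points labels everything -1
    lo_x, hi_x = min(p[0] for p in pos), max(p[0] for p in pos)
    lo_y, hi_y = min(p[1] for p in pos), max(p[1] for p in pos)
    # If the tightest rectangle captures a negative, every larger one does too.
    return not any(
        lo_x <= p[0] <= hi_x and lo_y <= p[1] <= hi_y
        for p, y in zip(points, labels) if y == -1
    )


def dichotomies(points):
    """H(X): every labeling of the points that some rectangle can produce."""
    return [labels for labels in product((1, -1), repeat=len(points))
            if realizable(points, labels)]


def shatters(points):
    """True if all 2^N labelings are realizable."""
    return len(dichotomies(points)) == 2 ** len(points)


spread = [(0, 1), (1, 0), (2, 2)]    # no point lies in the box spanned by the others
diagonal = [(0, 0), (1, 1), (2, 2)]  # the middle point is boxed in by the outer two

print(len(dichotomies(spread)), shatters(spread))
print(len(dichotomies(diagonal)), shatters(diagonal))
```

The diagonal set yields only 7 of the 8 possible labelings, which illustrates that |H(X)| can fall below 2^N for a particular X even when another set of the same size is shattered.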
![Page 19: Fundamentals of machine learning 1 Types of machine learning In-sample and out-of-sample errors Version space VC dimension](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649e8f5503460f94b9284a/html5/thumbnails/19.jpg)
Vapnik Chervonenkis Dimension, dVC
To prove that dVC = k we get to choose the k points
To prove that dVC = 3 for the 2D linear dichotomizer, it is better to choose the non-collinear black points. The fact that 3 points on a line cannot be shattered does not prove dVC < 3.
![Page 20: Fundamentals of machine learning 1 Types of machine learning In-sample and out-of-sample errors Version space VC dimension](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649e8f5503460f94b9284a/html5/thumbnails/20.jpg)
Break points
Every set of 4 points has at least 2 labelings that are not linearly separable.
k = 4 is the break point for the 2D linear dichotomizer.
dVC(H) + 1 is always a break point.
For a linear dichotomizer in d dimensions, dVC(H) = d + 1.
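The counts behind the break point can be reproduced for small examples. The sketch below enumerates lines sign(w1·x + w2·y + b) over a small grid of integer weights, which is an assumption that happens to suffice for these integer-coordinate points. It examines one particular 4-point set, so it illustrates the claim rather than proving it for every set.

```python
from itertools import product


def linear_dichotomies(points, weight_range=range(-3, 4)):
    """Labelings produced by sign(w1*x + w2*y + b) over a grid of integer weights."""
    found = set()
    for w1, w2, b in product(weight_range, repeat=3):
        scores = [w1 * x + w2 * y + b for x, y in points]
        if all(s != 0 for s in scores):  # skip lines that pass through a point
            found.add(tuple(1 if s > 0 else -1 for s in scores))
    return found


triangle = [(0, 0), (1, 0), (0, 1)]        # 3 non-collinear points
collinear = [(0, 0), (1, 0), (2, 0)]       # 3 points on a line
square = [(0, 0), (1, 0), (0, 1), (1, 1)]  # 4 points

n_triangle = len(linear_dichotomies(triangle))
n_collinear = len(linear_dichotomies(collinear))
n_square = len(linear_dichotomies(square))
print(n_triangle, n_collinear, n_square)
```

The triangle is shattered (8 of 8), the collinear points are not (6 of 8), and the square reaches only 14 of 16 because the two diagonal labelings cannot be separated by a line.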
![Page 21: Fundamentals of machine learning 1 Types of machine learning In-sample and out-of-sample errors Version space VC dimension](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649e8f5503460f94b9284a/html5/thumbnails/21.jpg)
What is the VC dimension of the hypothesis class defined by the union of all axis-aligned rectangles?
![Page 22: Fundamentals of machine learning 1 Types of machine learning In-sample and out-of-sample errors Version space VC dimension](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649e8f5503460f94b9284a/html5/thumbnails/22.jpg)
VC dimension is conservative
VC dimension is based on all possible ways to label examples
VC ignores the probability distribution from which dataset was drawn.
In the real world, examples with small differences in attributes usually belong to the same class.
Basis of “similarity” classification methods.
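A nearest-neighbor rule is one example of such a similarity method: a new point takes the label of the closest training example. The sketch below uses Euclidean distance and invented data, and is meant only to illustrate the idea.

```python
import math


def nearest_neighbor_label(x, train):
    """Return the label of the training example closest to x."""
    _, label = min(train, key=lambda example: math.dist(example[0], x))
    return label


# Made-up training set: (price, engine power) with +1/-1 labels.
train = [
    ((3.0, 3.0), 1),
    ((5.0, 4.0), 1),
    ((1.0, 1.0), -1),
    ((9.0, 5.0), -1),
]

near_family_cars = nearest_neighbor_label((4.5, 3.5), train)
near_other_cars = nearest_neighbor_label((8.0, 5.0), train)
print(near_family_cars, near_other_cars)
```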
family car