Machine Learning Group, University College Dublin
Nearest Neighbour Classifiers
Lazy vs. Eager
k-NN
Condensed NN
Classification problems
Exemplar characterised by a set of features; decide class to which exemplar belongs
Compare regression problems: an exemplar is characterised by a set of features; decide the value of a continuous output (dependent) variable.
Lazy vs. Eager
D-Trees are an example of an Eager ML algorithm: the D-Tree is built in advance, off-line, so there is less work to do at run time.
k-NN is a Lazy approach: little work is done off-line; keep the training examples and find the k nearest at run time (see the sketch below).
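A toy illustration (not from the slides) of where the work happens in each style: the eager learner computes its model in fit(), while the lazy learner's fit() just stores the data and predict() does all the search.

```python
class EagerMajority:
    """Toy eager learner: pre-computes the majority class off-line."""
    def fit(self, X, y):
        self.majority = max(set(y), key=y.count)  # work done at training time
        return self

    def predict(self, x):
        return self.majority                      # almost no work at run time


class Lazy1NN:
    """Toy lazy learner: stores the training set, searches at query time."""
    def fit(self, X, y):
        self.X, self.y = X, y                     # just keep the examples
        return self

    def predict(self, x):
        # all the work happens here: scan for the nearest stored example
        dists = [sum((a - b) ** 2 for a, b in zip(x, xi)) for xi in self.X]
        return self.y[dists.index(min(dists))]
```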
Classifying Apples & Pears

No.   Greeness  Height  Width  Taste  Weight  Height/Width  Class
1          210      60     62  Sweet     186          0.97  Apple
2          220      70     51  Sweet     180          1.37  Pear
3          215      55     55  Tart      152          1.00  Apple
4          180      76     40  Sweet     152          1.90  Pear
5          220      68     45  Sweet     153          1.51  Pear
6          160      65     68  Sour      221          0.96  Apple
7          215      63     45  Sweet     140          1.40  Pear
8          180      55     56  Sweet     154          0.98  Apple
9          220      68     65  Tart      221          1.05  Apple
10         190      60     58  Sour      174          1.03  Apple
x          222      70     55  Sweet     185          1.27  ?

To what class does No. x belong?
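As a worked sketch, we can classify No. x using plain (unweighted) Euclidean distance over the numeric features; the lecture's full distance measure, with feature weights and a discrete δ for Taste, is defined on a later slide.

```python
from math import sqrt

# (Greeness, Height, Width, Weight, Height/Width, Class)
data = [
    (210, 60, 62, 186, 0.97, "Apple"), (220, 70, 51, 180, 1.37, "Pear"),
    (215, 55, 55, 152, 1.00, "Apple"), (180, 76, 40, 152, 1.90, "Pear"),
    (220, 68, 45, 153, 1.51, "Pear"),  (160, 65, 68, 221, 0.96, "Apple"),
    (215, 63, 45, 140, 1.40, "Pear"),  (180, 55, 56, 154, 0.98, "Apple"),
    (220, 68, 65, 221, 1.05, "Apple"), (190, 60, 58, 174, 1.03, "Apple"),
]
query = (222, 70, 55, 185, 1.27)  # No. x, numeric features only

def euclidean(a, b):
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

neighbours = sorted(data, key=lambda row: euclidean(query, row[:5]))
print([(round(euclidean(query, r[:5]), 1), r[5]) for r in neighbours[:3]])
# The 3 nearest come out Pear, Pear, Apple, so both 1-NN and 3-NN label
# No. x a Pear. Note the large-range features (Greeness, Weight) dominate
# the distance unless features are scaled.
```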
Loan Approval System

[Figure: loan application cases, each described by the features Sal, Amt, Age, JCat and Gen]

Nearest Neighbour is based on similarity: what does "similar" mean?
Imagine just 2 features
2 features: Amount, Monthly_Sal

[Figure: scatter plot of x and o examples in the Monthly_Sal vs. Amount plane]
Voronoi Diagrams
[Figure: Voronoi diagram with query point q and its nearest neighbour x]

The cells indicate areas in which the prediction is influenced by the same set of examples.
3-Nearest Neighbors
[Figure: query point q and its 3 nearest neighbours: 2 x's and 1 o]
7-Nearest Neighbors
[Figure: query point q and its 7 nearest neighbours: 3 x's and 4 o's]
k-NN and Noise
1-NN is easy to implement but susceptible to noise: a misclassification occurs every time a noisy pattern is retrieved.
k-NN with k ≥ 3 will overcome this.
k-Nearest Neighbour
D: the set of training samples. Find the k nearest neighbours to q according to this distance criterion. For each $x_i \in D$:

$$d(q, x_i) = \sum_{f \in F} w_f \cdot \delta(q_f, x_{if})$$

where

$$\delta(q_f, x_{if}) = \begin{cases} 0 & \text{$f$ discrete and } q_f = x_{if} \\ 1 & \text{$f$ discrete and } q_f \neq x_{if} \\ |q_f - x_{if}| & \text{$f$ continuous} \end{cases}$$

The category of q is decided by its k nearest neighbours:

$$\mathrm{Vote}(y_j) = \sum_{c=1}^{k} \frac{1}{d(q, x_c)^n}\, \mathbb{1}(y_j, y_c)$$
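A minimal Python sketch of these definitions. The representation of cases as (features, label) pairs, the feature weights, and the exponent n are assumptions for illustration; strings are treated as discrete features.

```python
def delta(qf, xf):
    """Per-feature difference: 0/1 overlap for discrete (string) values,
    absolute difference for continuous ones."""
    if isinstance(qf, str):
        return 0.0 if qf == xf else 1.0
    return abs(qf - xf)

def distance(q, x, weights):
    """d(q, x_i) = sum over features f of w_f * delta(q_f, x_if)."""
    return sum(w * delta(qf, xf) for w, qf, xf in zip(weights, q, x))

def knn_classify(q, D, weights, k=3, n=1):
    """Distance-weighted vote: Vote(y_j) = sum_c 1/d(q, x_c)^n * 1(y_j = y_c)."""
    nearest = sorted(D, key=lambda case: distance(q, case[0], weights))[:k]
    votes = {}
    for features, label in nearest:
        d = distance(q, features, weights) or 1e-9  # guard exact matches
        votes[label] = votes.get(label, 0.0) + 1.0 / d ** n
    return max(votes, key=votes.get)
```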
Minkowski Distances
Generalisation of Euclidean (p=2) & Manhattan (p=1) distance
$$MD_p(q, x_i) = \left( \sum_{f \in F} |q_f - x_{if}|^p \right)^{1/p}$$
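For completeness, a direct one-function sketch of this formula:

```python
def minkowski(q, x, p=2):
    """MD_p(q, x) = (sum_f |q_f - x_f|^p)^(1/p); p=1 Manhattan, p=2 Euclidean."""
    return sum(abs(qf - xf) ** p for qf, xf in zip(q, x)) ** (1.0 / p)

assert minkowski((0, 0), (3, 4), p=2) == 5.0   # Euclidean
assert minkowski((0, 0), (3, 4), p=1) == 7.0   # Manhattan
```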
Appropriate δ functions
(Apples & Pears data as on the earlier slide: Taste is discrete, the other features continuous.)

To what class does No. x belong?
e.g. MVT (now part of Agilent)
Machine Vision for inspection of PCBs:
components present or absent; solder joints good or bad
Components present?
[Figure: example component images labelled Absent and Present]
Characterise image as a set of features
type       name  Wid2  Wid3  CenX  CenY  M1   Sig1  M2   Sig2  M3   Sig3  Min2
c0402_mvc  c815   556  1344     3    28  134     7   61    16  109     5    51
c0402_mvc  c804  1221  1253   -20   -49  127    30   78    34   97    39    54
c0402_mvc  c802   441  1189   -45   -52  122    28   91    24   89    40    68
c0402_mvc  c808   532  1294    59    60  130    23   74    29  138     9    58
c0402_mvc  c806  1384  1492    -9    65  140     6   72    15  144    13    62
c0402_mvc  c605   943  1278    51    -9  116    29   68    28  139     7    54
c0402_mvc  c813  1446  1462   209    48   93    15  139    29  162     6   100
c0402_mvc  c606  1219  1302    40    -8  161     7   93    25  135     3    65
c0402_mvc  c710  1113  1128   -99   -13  145     6   95    40   88    38    56
c0402_mvc  c703  1090  1386   -56   -18  149    11   72    28  147    14    52
c0402_mvc  c761  1214  1203   -95   -21  149    11   77    34  113    40    56
c0402_mvc  c701  1487  1296   -30    33  142     9   73    28  135    12    54
c0402_mvc  c732  1038  1196   -19    -3  148     8   62    10  100    44    56
c0402_mvc  c753  1015  1288    58   -16  123    13   73    35  128     8    54
c0402_mvc  c751  1146  1036  -163   -25  140     5  102    34   85     2    80
c0402_mvc  c760  1113  1091  -121    44  133    11   94    44   96    37    57
Dimension reduction in k-NN
Not all features are required: noisy features are a hindrance.
Some examples are redundant: retrieval time depends on the number of examples.
Feature Selection: reduce the p features to the q best features.
Case Selection: reduce the m examples to n covering examples.
Condensed NN
D: the set of training samples. Find E, where E ⊆ D, such that the NN rule used with E is as good as with D.

choose x ∈ D randomly, D ← D \ {x}, E ← {x}
DO
    learning? ← FALSE
    FOR EACH x ∈ D
        classify x by NN using E
        IF classification incorrect THEN
            E ← E ∪ {x}, D ← D \ {x}, learning? ← TRUE
WHILE (learning? ≠ FALSE)
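A Python rendering of this pseudocode, sketched under the assumption that cases are (features, label) pairs and dist is any distance over feature vectors, such as the weighted distance defined earlier:

```python
import random

def nn_label(x, E, dist):
    """Label of x's single nearest neighbour in E."""
    return min(E, key=lambda e: dist(x[0], e[0]))[1]

def condensed_nn(D, dist):
    """Return a condensed set E that classifies all of D correctly by 1-NN."""
    D = list(D)
    E = [D.pop(random.randrange(len(D)))]     # seed E with one random case
    learning = True
    while learning:                           # repeat until a pass adds nothing
        learning = False
        for x in list(D):
            if nn_label(x, E, dist) != x[1]:  # misclassified by E: keep it
                E.append(x)
                D.remove(x)
                learning = True
    return E
```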
Condensed NN

[Figure: 100 examples, 2 categories; different CNN solutions]
Improving Condensed NN
Different outcomes depending on data order: that's a bad thing in an algorithm.
Sort the data based on distance to the nearest unlike neighbour (NUN) to identify exemplars near the decision surface (see the sketch below).

[Figure: in the diagram, B (near the decision surface) is more useful than A]
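A sketch of that ordering (the helper names are assumptions): compute each case's distance to its nearest neighbour of a different class, then present cases to CNN in that order rather than randomly.

```python
def nun_distance(case, D, dist):
    """Distance from a case to its nearest unlike neighbour (NUN)."""
    others = [c for c in D if c[1] != case[1]]
    return min(dist(case[0], c[0]) for c in others)

def nun_sorted(D, dist):
    """Cases ordered border-first: smallest NUN distance first."""
    return sorted(D, key=lambda c: nun_distance(c, D, dist))
```

Feeding nun_sorted(D, dist) into the FOR EACH loop (and seeding E with the first case rather than a random one) makes the condensed set independent of the original data order.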
Condensed NN

[Figure: 100 examples, 2 categories; different CNN solutions, including CNN using NUN ordering]
Aside: Real Data is not Uniform
[Figure: Iris data plotted in two dimensions, features B and D]
k-NN for Spam Filtering
A Lazy Learning system will be able to adapt to the changing nature of spam.
A Local Learning system is good for diverse, disjunctive concepts like spam: porn, mortgages, religion, cheap drugs… versus work, family, play…
Case base editing techniques exist to improve the competence of a case base.
Spam Filtering
[Figure: e-mail messages C1, C2, C3, …, Cn pass through Feature Extraction to become cases:]

Case | F1 | F2 | F3 | … | Fn | S
C1   | ~  | ~  | ~  | … | ~  | S1
C2   | ~  | ~  | ~  | … | ~  | S2
C3   | ~  | ~  | ~  | … | ~  | S3
Cn   | ~  | ~  | ~  | … | ~  | Sn
e.g. Bag-of-Words model in Text Classification & Information Retrieval.
Texts as Bag-of-Words
1. The easiest online school on earth.
2. Here is the information from Computer Science for the next Graduate School Board meeting.
3. Please find attached the agenda for the Graduate School Board meeting.
No. | Easiest | Online | School | Earth | Info. | Computer | Science | Graduate | Board | Meeting | Please | Find | Attached | Agenda
1   |    x    |   x    |   x    |   x   |       |          |         |          |       |         |        |      |          |
2   |         |        |   x    |       |   x   |    x     |    x    |    x     |   x   |    x    |        |      |          |
3   |         |        |   x    |       |       |          |         |    x     |   x   |    x    |   x    |  x   |    x     |   x
Texts as Bag-of-Words
Similarity can be measured by the dot-product between these vectors (see the sketch below).
Information is lost, e.g. sequence information.
(Bag-of-Words table as on the previous slide.)
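A minimal sketch of this similarity: treat each text as a binary word vector, so the dot product is just the number of shared words. The naive tokenisation here is an assumption; real systems would add stopword removal and stemming.

```python
def bag_of_words(text):
    # naive tokenisation: lowercase, split on whitespace, drop trailing dots
    return {w.strip(".") for w in text.lower().split()}

def dot_product(a, b):
    # for binary (set-valued) vectors the dot product is the overlap size
    return len(a & b)

s1 = bag_of_words("The easiest online school on earth")
s3 = bag_of_words("Please find attached the agenda for the Graduate School Board meeting")
print(dot_product(s1, s3))   # 2: the texts share only "the" and "school"
```

Note that any word-order information is discarded, which is exactly the loss mentioned above.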
ECUE - Email Classification Using Examples

[Figure: Emails → Feature Extraction → Casebase → Feature Selection → Casebase → Case Selection → Casebase; in the Runtime System, a Target Case is classified against the casebase: "spam!"]
Classification
k-NN classifier with k = 3.
Unanimous voting to bias away from False Positives (a sketch follows below).
Case Retrieval Net [Lenz et al. 98] improves the performance of k-NN search.
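A sketch of the unanimous-voting rule (the label strings and retrieval details are assumptions): a mail is only classified as spam when all k neighbours agree, which trades some spam recall for fewer false positives.

```python
def classify_unanimous(q, casebase, dist, k=3):
    """Label q as spam only if all k nearest cases are spam."""
    nearest = sorted(casebase, key=lambda c: dist(q, c[0]))[:k]
    labels = [label for _, label in nearest]
    return "spam" if all(l == "spam" for l in labels) else "nonspam"
```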
k-NN: Summary
ML avoids some Knowledge Engineering effort
Lazy vs. Eager
How k-NN works
Dimension reduction: Feature Selection, Condensed NN (Case Selection)
Spam Filtering application