Nearest Neighbor Classification
srihari/CSE555/Chap4.NearestNeighbor.pdf · CSE 555: Srihari



TRANSCRIPT

Page 1:

Nearest Neighbor Classification

The Nearest-Neighbor Rule
Error Bounds

k-Nearest Neighbor Rule
Computational Considerations

Page 2:

CSE 555: Srihari 1

Example of Nearest Neighbor Rule

• Two-class problem: yellow triangles and blue squares. The circle represents the unknown sample x; since its nearest neighbor comes from class θ1, it is labeled class θ1.

Figure 1: The NN rule

Page 3:

CSE 555: Srihari 2

Example of k-NN rule with k = 3

• There are two classes: yellow triangles and blue squares. The circle represents the unknown sample x; since two of its three nearest neighbors come from class θ2, it is labeled class θ2.

The number k should be:
1) large, to minimize the probability of misclassifying x;
2) small (with respect to the number of samples), so that the points are close enough to x to give an accurate estimate of the true class of x.

Page 4:

CSE 555: Srihari 3

Nearest Neighbor and Voronoi Tessellation

• The NN classifier effectively partitions the feature space into cells, each consisting of all points closer to a given training point x' than to any other training point.

• All points in such a cell are thus labeled by the category of that training point – a Voronoi tessellation of the space

Figure: Voronoi tessellation in 2 dimensions and in 3 dimensions

Page 5:

CSE 555: Srihari 4

Nearest–Neighbor Rule Probability of Error

Let Dn = {x1, x2, …, xn} be a set of n labeled prototypes

• Let x’ ∈ Dn be the nearest prototype to a test point x

• The nearest-neighbor rule for classifying x is to assign it the label associated with x'

• The nearest-neighbor rule is a sub-optimal procedure
• It does not yield the Bayes error rate
• Yet it is never worse than twice the Bayes error rate
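The rule just stated lends itself to a short sketch. A minimal Python illustration (NumPy assumed; the toy prototypes and labels are hypothetical, not from the slides):

```python
import numpy as np

def nearest_neighbor_classify(x, prototypes, labels):
    """Assign x the label of its nearest prototype (the NN rule)."""
    dists = np.linalg.norm(prototypes - x, axis=1)  # Euclidean distance to each prototype
    return labels[int(np.argmin(dists))]

# Hypothetical two-class toy data: class 1 near the origin, class 2 near (5, 5)
D = np.array([[0.0, 0.0], [1.0, 0.5], [5.0, 5.0], [4.5, 5.5]])
y = [1, 1, 2, 2]
print(nearest_neighbor_classify(np.array([0.8, 0.2]), D, y))  # → 1
```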

Page 6:

CSE 555: Srihari 5

Why does Nearest–Neighbor rule work well?

• Label θ’ associated with nearest neighbor is a random variable

• Probability that θ’ = ωi is the a posteriori probability P(ωi | x’)

• As n → ∞, it is always possible to find x’ sufficiently close so that:

P(ωi | x') ≅ P(ωi | x)

• Because this is exactly the probability that nature will be in state ωi, the nearest-neighbor rule is effectively matching probabilities with nature

Page 7:

CSE 555: Srihari 6

Bayesian Probability of Error

• If we define ωm(x) by

P(ωm | x) = max_i P(ωi | x),

then the Bayes decision rule always selects ωm.

• From this, the Bayes conditional probability of error is

P*(e | x) = 1 − P(ωm | x)

Page 8:

CSE 555: Srihari 7

Bayesian Probability of Error

If we let P*(e|x) be the minimum possible value of P(e|x), and P* be the minimum possible value of P(e), then by averaging over the a priori distribution of x we get

P* = ∫ P*(e | x) p(x) dx = ∫ (1 − P(ωm | x)) p(x) dx
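The average P* = ∫ (1 − P(ωm | x)) p(x) dx can be checked directly when x is discrete. A minimal sketch (the prior p(x) and posteriors below are hypothetical numbers; NumPy assumed):

```python
import numpy as np

# Discrete sketch of P* = sum_x (1 - P(omega_m | x)) p(x).
p_x = np.array([0.5, 0.3, 0.2])        # p(x) over three possible values of x
post = np.array([[0.9, 0.1],           # rows: P(omega_1 | x), P(omega_2 | x)
                 [0.6, 0.4],
                 [0.2, 0.8]])
P_star = float(np.sum((1.0 - post.max(axis=1)) * p_x))  # Bayes picks omega_m = argmax
print(P_star)  # 0.5*0.1 + 0.3*0.4 + 0.2*0.2 = 0.21
```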

Page 9:

CSE 555: Srihari 8

Evaluation of Nearest Neighbor Error

• If Pn(e) is the n-sample error rate, and if

P = lim_{n→∞} Pn(e)

then we want to show that

P* ≤ P ≤ P* (2 − (c/(c−1)) P*)

Page 10:

CSE 555: Srihari 9

Nearest-Neighbor Probability of ErrorThe Random Variables

Begin by looking at all the random variables in the construction of an x, xn, θ, θn system.

We denote θ as the true class of x and θn as the labeled class of xn , where xn is the nearest neighbor of x.

It is clear that x and its θ are random inputs to the problem. The labeled sample set is itself random, so the pair xn, θn is also an unknown and thus a random input.

The probability of x having true class θ and that of xn being labeled θn are independent. Thus we have

P(θ, θn | x, xn) = P(θ | x) P(θn | xn)

Page 11:

CSE 555: Srihari 10

Expressing the Probability of Error

Page 12:

CSE 555: Srihari 11

Convergence of Probability of Error

• Notice that as n approaches infinity, the space of labeled samples becomes increasingly filled, so the nearest neighbor xn of x converges to x with probability 1.

• So we can say that:

lim_{n→∞} Pn(e | x, xn) = lim_{n→∞} Pn(e | x, x) = lim_{n→∞} Pn(e | x)

Page 13:

CSE 555: Srihari 12

Final Expression for Nearest-Neighbor Probability of Error

P = lim_{n→∞} Pn(e) = ∫ [1 − Σ_{i=1}^{c} P²(ωi | x)] p(x) dx

Page 14:

CSE 555: Srihari 13

Bounds on the Conditional Probability of Error

1 − Σ_{i=1}^{c} P²(ωi | x) ≤ 2 P*(e | x) − (c/(c−1)) [P*(e | x)]²

Page 15:

CSE 555: Srihari 14

Nearest Neighbor Error Bound Derivation

Page 16:

CSE 555: Srihari 15

Error Bound Conclusion

The error bounds are tight in that, for any P*, there exist conditional and prior distributions for which the bounds are achieved.

Page 17:

CSE 555: Srihari 16

Bounds on nearest neighbor error rate in the c-category problem

Figure: possible asymptotic error rates as a function of the Bayes rate, assuming infinite training data
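The asymptotic bound P* ≤ P ≤ P*(2 − (c/(c−1)) P*) is easy to evaluate numerically. A minimal sketch (the function name and sample values are my own, hypothetical choices):

```python
def nn_error_bound(p_star, c):
    """Upper bound P*(2 - (c/(c-1)) P*) on the asymptotic NN error rate."""
    return p_star * (2.0 - (c / (c - 1.0)) * p_star)

# For c = 2 classes and Bayes rate P* = 0.1, the NN rate is at most:
print(nn_error_bound(0.1, 2))  # 0.1 * (2 - 2*0.1) = 0.18, below the looser 2P* = 0.2
```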

Page 18:

CSE 555: Srihari 17

The k-Nearest-Neighbor Rule

• Classify x by assigning it the label most frequently represented among the k nearest samples, i.e. use a voting scheme

Figure: example with k = 3
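The voting scheme can be sketched as follows (NumPy assumed; the toy data are hypothetical, loosely echoing the slides' k = 3 picture in which two of the three nearest neighbors belong to class 2):

```python
import numpy as np
from collections import Counter

def knn_classify(x, prototypes, labels, k=3):
    """k-NN rule: majority vote among the k nearest prototypes."""
    dists = np.linalg.norm(prototypes - x, axis=1)
    nearest = np.argsort(dists)[:k]               # indices of the k closest samples
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]             # most frequent label wins

D = np.array([[0.0, 0.0], [2.0, 2.0], [2.5, 2.0], [6.0, 6.0]])
y = [1, 2, 2, 1]
print(knn_classify(np.array([2.2, 2.1]), D, y, k=3))  # → 2
```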

Page 19:

CSE 555: Srihari 18

Analysis of k – Nearest-Neighbor Rule

• Select ωm if a majority of the k nearest neighbors are labeled ωm, an event of probability

Σ_{i=(k+1)/2}^{k} (k choose i) P(ωm | x)^i [1 − P(ωm | x)]^{k−i}

• It can be shown that if k is odd, the large-sample two-class error rate for the k-nearest-neighbor rule is bounded above by the function Ck(P*), where Ck(P*) is defined to be the smallest concave function of P* greater than

Σ_{i=0}^{(k−1)/2} (k choose i) [ (P*)^{i+1} (1 − P*)^{k−i} + (P*)^{k−i} (1 − P*)^{i+1} ]

Page 20:

CSE 555: Srihari 19

Bounds on Error Rate of k-Nearest Neighbor Rule

The bound is Ck(P*).

As k gets larger, the error rate approaches the Bayes rate; k should remain a small fraction of the total number of samples.

Page 21:

CSE 555: Srihari 20

Computational Complexity of k-Nearest-Neighbor Rule

• Each distance calculation is O(d)
• Finding the single nearest neighbor is O(n)
• Finding the k nearest neighbors involves sorting; thus O(dn²)
• Methods for speed-up:
• Parallelism
• Partial distance
• Pre-structuring
• Editing, pruning or condensing

Page 22:

CSE 555: Srihari 21

Parallel Implementation of k-Nearest-Neighbor Rule

Constant time or O(1) in time and O(n) in space

Figure: three units corresponding to the 3 cells associated with ω1. Each box corresponds to a face of a cell and determines whether x lies on its closed or open side. Classify as ω1 if one of the cells says yes.

Page 23:

CSE 555: Srihari 22

Partial Distance Method of nn speedup

• The partial distance based on r selected dimensions is

Dr(a, b) = [ Σ_{k=1}^{r} (ak − bk)² ]^{1/2}

• Terminate a distance calculation once its partial distance is greater than the full (r = d) Euclidean distance to the current closest prototype
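The early-termination idea can be sketched as follows (a simplified, hypothetical implementation; it works with squared distances, which preserve the nearest-neighbor ordering, so no square roots are needed):

```python
import numpy as np

def partial_distance_nn(x, prototypes):
    """Nearest-neighbor search with a partial-distance cutoff.

    Accumulates the squared distance dimension by dimension and abandons a
    prototype as soon as the partial sum exceeds the best full (squared)
    distance seen so far.
    """
    best_idx, best_sq = -1, float("inf")
    for i, p in enumerate(prototypes):
        partial = 0.0
        for xk, pk in zip(x, p):
            partial += (xk - pk) ** 2
            if partial >= best_sq:   # cannot beat the current best: stop early
                break
        else:                        # loop ran over all d dimensions: full distance
            best_idx, best_sq = i, partial
    return best_idx

D = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [0.05, 0.1, 0.1]])
print(partial_distance_nn(np.array([0.0, 0.1, 0.1]), D))  # → 2
```

Note that the second prototype is abandoned after its first dimension, since (0 − 1)² already exceeds the best squared distance found.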

Page 24:

CSE 555: Srihari 23

Search Tree Method of nn speedup

• Create a search tree where prototypes are selectively linked

• Consider only the prototypes linked to entry point

• Points in a neighboring region may actually be closer
• Tradeoff of accuracy versus speed

Figure: search tree with entry points

Page 25:

CSE 555: Srihari 24

Editing Method of nn speedup

• Eliminate Prototypes that are surrounded by training points of the same category

• Complexity is O(d³ n^(d/2) ln n)