Post on 27-Jun-2020
Learning From Data, Lecture 17
Memory and Efficiency in Nearest Neighbor
M. Magdon-Ismail, CSCI 4100/6100
Recap: Similarity and Nearest Neighbor

Similarity: d(x, x′) = ||x − x′||

[Figure: decision regions of the 1-NN rule and the 21-NN rule.]
1. Simple.
2. No training.
3. Near optimal Eout: k → ∞, k/N → 0 =⇒ Eout → E∗out.
4. Good ways to choose k: k = 3; k = ⌈√N⌉; validation/cross-validation.
5. Easy to justify classification to customer.
6. Can easily do multi-class.
7. Can easily adapt to regression or logistic regression, where y[i](x) is the label of the i-th nearest neighbor of x:

   g(x) = (1/k) ∑_{i=1}^{k} y[i](x)   (regression)

   g(x) = (1/k) ∑_{i=1}^{k} ⟦y[i](x) = +1⟧   (logistic regression)
8. Computationally demanding.
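As a concrete illustration of points 6–7, here is a brute-force k-NN classifier and regressor in plain Python (a minimal sketch; the function names and toy data are made up):

```python
import math
from collections import Counter

def knn_predict(X, y, x, k=3):
    """k-NN classification: majority vote among the k nearest points."""
    # Sort training points by Euclidean distance to the query x -- O(Nd) distances.
    order = sorted(range(len(X)), key=lambda i: math.dist(X[i], x))
    votes = Counter(y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

def knn_regress(X, y, x, k=3):
    """k-NN regression: average the k nearest targets, g(x) = (1/k) sum y[i](x)."""
    order = sorted(range(len(X)), key=lambda i: math.dist(X[i], x))
    return sum(y[i] for i in order[:k]) / k

# Toy data: two well-separated clusters labeled +1 / -1.
X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
y = [+1, +1, +1, -1, -1, -1]
print(knn_predict(X, y, (0.5, 0.5), k=3))   # query near the +1 cluster
```

Each prediction scans all N points, which is exactly the computational burden of point 8.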
© AML Creator: Malik Magdon-Ismail. Memory and Efficiency in Nearest Neighbor, 25 slides.
Computational Demands of Nearest Neighbor
Memory.
Need to store all the data: O(Nd) memory.
N = 10^6, d = 100, double precision ≈ 1 GB

Finding the nearest neighbor of a test point.
Need to compute the distance to every data point: O(Nd) time.
N = 10^6, d = 100, 3 GHz processor ≈ 3 ms (compute g(x))
N = 10^6, d = 100, 3 GHz processor ≈ 1 hr (compute the CV error)
N = 10^6, d = 100, 3 GHz processor > 1 month (choose the best k from among 1000 using CV)
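A back-of-envelope check of these figures (taking the ≈ 3 ms per query above as given):

```python
# Memory: N points in d dimensions, 8 bytes per double-precision coordinate.
N, d = 10**6, 100
memory_gb = N * d * 8 / 2**30
print(f"memory: {memory_gb:.2f} GB")          # ~0.75 GB, i.e. about 1 GB

# Time: at ~3 ms per query, leave-one-out CV makes one query per data point.
secs_per_query = 0.003
cv_hours = N * secs_per_query / 3600
print(f"one CV error: {cv_hours:.2f} hours")  # about 1 hour

# Naively trying 1000 values of k repeats the whole CV computation 1000 times.
choose_k_days = 1000 * cv_hours / 24
print(f"choosing k: {choose_k_days:.0f} days")
```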
Two Basic Approaches
Reduce the amount of data.
The five-year-old does not remember every horse he has seen, only a few representative horses.
Store the data in a specialized data structure.
Ongoing research field to develop geometric data structures to make finding nearest neighbors fast.
Throw Away Irrelevant Data
[Figure: the data set is condensed to a few representative points (k = 1).]
Decision Boundary Consistent
[Figure: a condensed set that leaves the decision boundary intact: g(x) unchanged for every x.]
Training Set Consistent
[Figure: a condensed set that preserves the classifications of the training points: g(xn) unchanged for every xn.]
Decision Boundary Vs. Training Set Consistent
[Figure: decision-boundary (DB) consistent condensing keeps g(x) unchanged everywhere, versus training-set (TS) consistent condensing, which only keeps g(xn) unchanged on the data points.]
Consistent Does Not Mean g(xn) = yn
[Figure: DB- and TS-consistent condensed sets for k = 3; consistency means reproducing the original g, not fitting every yn.]
Training Set Consistent (k = 3)
[Figure: a training-set-consistent condensed set for k = 3: g(xn) unchanged.]
CNN: Condensed Nearest Neighbor (k = 3)
[Figure: the solid blue point is inconsistent — blue w.r.t. the selected set S, red w.r.t. D; the nearest unselected red point is added to S.]

Consider the solid blue point:
i. blue w.r.t. the selected points
ii. red w.r.t. D

Add a red point that is:
i. not already selected
ii. closest to the inconsistent point

1. Randomly select k data points into S.
2. Classify all the data according to S.
3. Let x∗ be an inconsistent point and y∗ its class w.r.t. D.
4. Add to S the closest point to x∗ not in S that has class y∗.
5. Iterate until S classifies all points consistently with D.
Finding the minimum consistent set (MCS) is NP-hard; CNN is a greedy heuristic.
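The condensing loop can be sketched for the 1-NN case (toy data made up; note that for 1-NN the closest point of class y∗ to x∗ is x∗ itself, which recovers Hart's original CNN step):

```python
import math

def nn_classify(S_idx, X, y, p):
    """1-NN label of point p, using only the points indexed by S_idx."""
    j = min(S_idx, key=lambda i: math.dist(X[i], p))
    return y[j]

def cnn_condense(X, y):
    """Greedy condensing: grow S until 1-NN on S agrees with D on every point."""
    S = [0]                                      # seed with one point, for determinism
    while True:
        bad = next((n for n in range(len(X))
                    if nn_classify(S, X, y, X[n]) != y[n]), None)
        if bad is None:                          # S is training-set consistent
            return [X[i] for i in S], [y[i] for i in S]
        # add the closest point to x* (not in S) whose class w.r.t. D is y*
        cand = [i for i in range(len(X)) if i not in S and y[i] == y[bad]]
        S.append(min(cand, key=lambda i: math.dist(X[i], X[bad])))

X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
y = [+1, +1, +1, -1, -1, -1]
Xs, ys = cnn_condense(X, y)
print(len(Xs), "of", len(X), "points kept")   # this toy set condenses to 2 points
```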
Nearest Neighbor on Digits Data
[Figure: decision regions of the 1-NN rule and the 21-NN rule on the digits data.]
Condensing the Digits Data
[Figure: the 1-NN and 21-NN rules after condensing the digits data.]
Finding the Nearest Neighbor
1. S1, S2 are ‘clusters’ with centers µ1, µ2 and radii r1, r2.
2. [Branch] Search S1 first → x̂[1].
3. The distance from x to any point in S2 is at least ||x − µ2|| − r2.
4. [Bound] So we are done if ||x − x̂[1]|| ≤ ||x − µ2|| − r2.

A branch and bound algorithm; can be applied recursively.

[Figure: query point x and two clusters S1, S2.]
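Steps 1–4 translate into a few lines of code (a sketch with a hypothetical cluster layout; each cluster is stored as a (center, radius, points) tuple):

```python
import math

def nn_branch_and_bound(clusters, x):
    """Nearest neighbor of x over data partitioned into (mu, r, points) clusters.
    Search the closest-center cluster first [branch]; skip any cluster whose
    lower bound ||x - mu|| - r cannot beat the best distance so far [bound]."""
    order = sorted(clusters, key=lambda c: math.dist(x, c[0]))
    best, best_d = None, float("inf")
    for mu, r, points in order:
        if math.dist(x, mu) - r >= best_d:   # bound: nothing in here can be closer
            continue
        for p in points:                     # branch: search this cluster
            d = math.dist(x, p)
            if d < best_d:
                best, best_d = p, d
    return best

S1 = ((0.0, 0.0), 1.5, [(0, 0), (1, 0), (0, 1)])
S2 = ((10.0, 10.0), 1.5, [(10, 10), (11, 10), (10, 11)])
print(nn_branch_and_bound([S1, S2], (0.5, 0.5)))   # -> (0, 0); S2 is never searched
```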
When Does the Bound Hold?
Bound condition: ||x − x̂[1]|| ≤ ||x − µ2|| − r2.

Since ||x − x̂[1]|| ≤ ||x − µ1|| + r1, it suffices that

r1 + r2 ≤ ||x − µ2|| − ||x − µ1||.

||x − µ1|| ≈ 0 means ||x − µ2|| ≈ ||µ2 − µ1||, so it suffices that

r1 + r2 ≤ ||µ2 − µ1||.

[Figure: query point x inside S1, with S2 well separated.]

The within-cluster spread should be less than the between-cluster spread.
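The chain of inequalities can be sanity-checked numerically (the centers, radii, and query distribution below are made up for the check):

```python
import math
import random

# If r1 + r2 <= ||x - mu2|| - ||x - mu1||, then for ANY candidate xhat in S1
# the triangle inequality gives ||x - xhat|| <= ||x - mu1|| + r1 <= ||x - mu2|| - r2,
# i.e. the branch-and-bound test passes.
random.seed(1)
mu1, r1 = (0.0, 0.0), 2.0
mu2, r2 = (10.0, 0.0), 3.0

def rand_in_disk(mu, r):
    a, s = random.uniform(0, 2 * math.pi), r * math.sqrt(random.random())
    return (mu[0] + s * math.cos(a), mu[1] + s * math.sin(a))

for _ in range(1000):
    x = rand_in_disk(mu1, 1.0)       # query point near mu1
    xhat = rand_in_disk(mu1, r1)     # arbitrary candidate nearest neighbor in S1
    assert math.dist(x, xhat) <= math.dist(x, mu1) + r1 + 1e-9
    if r1 + r2 <= math.dist(x, mu2) - math.dist(x, mu1):
        assert math.dist(x, xhat) <= math.dist(x, mu2) - r2 + 1e-9
print("sufficient condition implies the bound on 1000 random draws")
```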
Finding Clusters – Lloyd’s Algorithm
1. Pick well separated centers for each cluster.
2. Compute the Voronoi regions as the clusters.
3. Update the centers.
4. Update the Voronoi regions.
5. Compute centers and radii:

µj = (1/|Sj|) ∑_{xn ∈ Sj} xn;   rj = max_{xn ∈ Sj} ||xn − µj||.
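Steps 1–5 can be sketched in plain Python (the toy data and M = 2 are made up, and this sketch seeds the centers at random rather than by the farthest-point heuristic of step 1):

```python
import math
import random

def lloyd(X, M, iters=20, seed=0):
    """Lloyd's algorithm sketch: alternate assigning points to the nearest
    center (Voronoi regions) and moving each center to its cluster mean."""
    rng = random.Random(seed)
    centers = rng.sample(X, M)
    for _ in range(iters):
        # Voronoi step: each point joins the cluster of its nearest center.
        clusters = [[] for _ in range(M)]
        for p in X:
            m = min(range(M), key=lambda j: math.dist(p, centers[j]))
            clusters[m].append(p)
        # Update step: centers become cluster means mu_j (keep old center if empty).
        centers = [tuple(sum(c) / len(S) for c in zip(*S)) if S else centers[j]
                   for j, S in enumerate(clusters)]
    # Radii: r_j = max distance from a cluster point to its center.
    radii = [max((math.dist(p, centers[j]) for p in S), default=0.0)
             for j, S in enumerate(clusters)]
    return centers, radii, clusters

X = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
centers, radii, clusters = lloyd(X, 2)
print([tuple(round(c, 2) for c in mu) for mu in centers])
```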
Radial Basis Functions (RBF)
k-Nearest Neighbor: only considers the k nearest neighbors; each neighbor has equal weight.

What about using all the data to compute g(x)?

RBF: use all the data; data further away from x get less weight.
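A sketch of that idea using a Gaussian weight — the specific kernel and the scale r are illustrative choices, ahead of the details in the next lecture:

```python
import math

def rbf_predict(X, y, x, r=1.0):
    """Every data point votes on g(x), weighted by exp(-||x - xn||^2 / (2 r^2)):
    points farther from x contribute less."""
    w = [math.exp(-math.dist(x, xn) ** 2 / (2 * r * r)) for xn in X]
    return sum(wi * yi for wi, yi in zip(w, y)) / sum(w)

# Toy data: two clusters labeled +1 / -1; the sign of g(x) gives the class.
X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
y = [+1, +1, +1, -1, -1, -1]
print(rbf_predict(X, y, (0.5, 0.5)))   # positive: dominated by the nearby +1 points
```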