High-Performance Reconfigurable Computing Group University of Toronto

K-Means Implementation on FPGA for High-dimensional Data Using Triangle Inequality

Zhongduo Lin, Charles Lo, Paul Chow

September-13-12

Why K-means?

• One of the most widely used unsupervised clustering algorithms in data mining and machine learning

What's K-means?

• Unsupervised vs. supervised
  – Whether or not the classes are predetermined

• A simple iterative clustering algorithm that partitions a given dataset into k clusters

Basic K-means

• 1) K initial means are selected

• 2) K clusters are created by assigning each point to the nearest mean

• 3) The centroid of each cluster becomes the new mean

• 4) Repeat steps 2 and 3 until convergence is reached
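
The four steps can be sketched in a few lines of plain Python. This is a minimal software illustration, not the FPGA design from the talk; the function names `nearest` and `kmeans`, the `max_iters` cap, and the sample points are assumptions made for the example. Convergence is taken to mean that no point changes cluster.

```python
import math

def nearest(point, means):
    """Index of the mean closest to `point` (Euclidean distance)."""
    return min(range(len(means)), key=lambda j: math.dist(point, means[j]))

def kmeans(points, initial_means, max_iters=100):
    """Basic k-means; returns (means, labels)."""
    means = [list(m) for m in initial_means]      # step 1: K initial means
    labels = None
    for _ in range(max_iters):
        # Step 2: assign every point to its nearest mean.
        new_labels = [nearest(p, means) for p in points]
        if new_labels == labels:                  # step 4: no change, converged
            break
        labels = new_labels
        # Step 3: the centroid of each cluster becomes the new mean.
        for j in range(len(means)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:                           # an empty cluster keeps its old mean
                means[j] = [sum(c) / len(members) for c in zip(*members)]
    return means, labels

# Hypothetical data: two well-separated groups of two points each.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
means, labels = kmeans(points, initial_means=[points[0], points[2]])
```

Step 2 costs n·k distance computations per iteration, each over d dimensions, which is the cost the triangle inequality optimization targets.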

Why Triangle Inequality

• Big Data era
  – Data size
  – Number of dimensions
  – Number of clusters

• Optimization
  – Kd-tree with filter algorithm
  – Triangle inequality

Triangle Inequality

• z < x + y

• x > z – y

• If x < z/2 then x < y
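
The third statement follows from the first. A short derivation, written with the non-strict form of the inequality (equality holds only for a degenerate triangle):

```latex
z \le x + y \;\Longrightarrow\; y \ge z - x .
\text{If } x < \tfrac{z}{2}: \qquad y \ge z - x > z - \tfrac{z}{2} = \tfrac{z}{2} > x .
```

One plausible reading, since the slide's figure is not in the transcript: if x is the distance from a point to its assigned center and z is the distance from that center to another one, then x < z/2 shows the other center is farther away without computing y.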

Triangle Inequality

• z < x + y
  – x: pc1
  – y: c1c3
  – z: pc3
  – x + y: upper bound

• x > z – y
  – x: pc4
  – y: c2c4
  – z: pc2
  – z – y: lower bound
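
A small numeric check of the two bounds, reading pc3 as the distance from point p to center c3 and c1c3 as the distance between centers c1 and c3 (an interpretation, since the slide's figure is not in the transcript). The coordinates and the helper names `upper_bound` and `lower_bound` are hypothetical.

```python
import math

def upper_bound(d_p_c1, d_c1_c3):
    """z <= x + y: bounds d(p, c3) from above using d(p, c1) and d(c1, c3)."""
    return d_p_c1 + d_c1_c3

def lower_bound(d_p_c2, d_c2_c4):
    """x >= z - y: bounds d(p, c4) from below using d(p, c2) and d(c2, c4)."""
    return max(d_p_c2 - d_c2_c4, 0.0)   # a distance is never negative

# Hypothetical coordinates, chosen only for illustration.
p = (0.0, 0.0)
c1, c2, c3, c4 = (3.0, 4.0), (6.0, 8.0), (3.0, 0.0), (6.0, 5.0)

ub_pc3 = upper_bound(math.dist(p, c1), math.dist(c1, c3))   # 5 + 4
lb_pc4 = lower_bound(math.dist(p, c2), math.dist(c2, c4))   # 10 - 3
```

Both bounds use only a point-to-center distance that is already known plus a center-to-center distance, so neither requires a new d-dimensional computation for the point.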

K-means with Triangle Inequality

• Keep an upper bound on the distance to the assigned center: n values

• Keep a lower bound on the distance to every center: kn values
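
A sketch of how the stored bounds can prune the assignment step, in the style of Elkan's accelerated k-means. It is a software illustration under stated assumptions, not the hardware pipeline from the talk; `assign_with_bounds`, the list layout of `upper` and `lower`, and the sample data are hypothetical.

```python
import math

def assign_with_bounds(points, centers, labels, upper, lower):
    """One assignment pass; returns how many distances were computed.

    upper[i]    : upper bound on d(points[i], centers[labels[i]])  (n values)
    lower[i][j] : lower bound on d(points[i], centers[j])          (k*n values)
    """
    computed = 0
    for i, p in enumerate(points):
        tight = False                 # is upper[i] an exact distance yet?
        for j, c in enumerate(centers):
            if j == labels[i] or upper[i] <= lower[i][j]:
                continue              # center j cannot be strictly closer
            if not tight:
                # Replace the upper bound with the exact distance, once per point.
                upper[i] = lower[i][labels[i]] = math.dist(p, centers[labels[i]])
                tight = True
                computed += 1
                if upper[i] <= lower[i][j]:
                    continue          # pruned after tightening
            d = math.dist(p, c)
            lower[i][j] = d
            computed += 1
            if d < upper[i]:
                labels[i], upper[i] = j, d
    return computed

# Hypothetical data; bounds start out uninformative (upper = inf, lower = 0).
points = [(0.0, 0.0), (10.0, 10.0)]
centers = [(0.0, 0.0), (10.0, 10.0)]
labels = [0, 0]
upper = [math.inf, math.inf]
lower = [[0.0, 0.0], [0.0, 0.0]]
first_pass = assign_with_bounds(points, centers, labels, upper, lower)
second_pass = assign_with_bounds(points, centers, labels, upper, lower)
```

With the centers unchanged, the second pass computes no distances at all because every candidate is ruled out by its stored bounds. Once the centers move, the bounds must be adjusted before they can be reused.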

Time Overhead of Triangle Inequality

• Distance between centers: d(c(x), c)
  – Not implemented

• Distance between the new centers and the old ones: d(c, m(c))
  – Computed in parallel with updating the bounds
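
The role of d(c, m(c)) can be sketched as follows: after each center c moves to its new mean m(c), every stored bound is loosened by the distance that center moved, so it stays valid without recomputing point distances. The function name `update_bounds` and the sample values are hypothetical, and the sequential loop stands in for what the slide describes as parallel hardware.

```python
import math

def update_bounds(old_centers, new_centers, labels, upper, lower):
    """Loosen the stored bounds in place after the centers move."""
    # shift[j] = d(c_j, m(c_j)): how far center j moved in this iteration.
    shift = [math.dist(c, m) for c, m in zip(old_centers, new_centers)]
    for i in range(len(labels)):
        # A center that moved by s can be at most s closer than before.
        lower[i] = [max(l - s, 0.0) for l, s in zip(lower[i], shift)]
        # The assigned center can be at most its own shift farther away.
        upper[i] += shift[labels[i]]
    return shift

# Hypothetical state for a single point assigned to center 0.
old_centers = [(0.0, 0.0), (10.0, 0.0)]
new_centers = [(3.0, 4.0), (10.0, 0.0)]
labels = [0]
upper = [2.0]
lower = [[2.0, 8.0]]
shift = update_bounds(old_centers, new_centers, labels, upper, lower)
```

Only k center-to-center distances are needed per iteration, independent of the number of points n.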

Optimization for HW: square root

• Square root elimination
  – Distance squared
  – [formula missing in transcript]

• Bounds for [expression missing in transcript]
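
Why the square root can be dropped: squaring is monotonic for non-negative values, so comparing squared distances picks the same nearest center as comparing distances. The bounds need more care, because the triangle inequality holds for distances and not for squared distances. The slide's own formulas are missing from the transcript, so `sq_upper_bound` below is only one safe (and loose) square-root-free bound, based on 2·sqrt(ab) ≤ a + b; it is not the approximation used in the talk. All names and data are hypothetical.

```python
import math

def sq_dist(p, q):
    """Squared Euclidean distance: no square root needed."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def nearest_by_sq_dist(point, centers):
    """Same winner as with true distances, since x -> x*x is monotonic for x >= 0."""
    return min(range(len(centers)), key=lambda j: sq_dist(point, centers[j]))

def sq_upper_bound(a_sq, b_sq):
    """Upper bound on (sqrt(a_sq) + sqrt(b_sq))**2 without taking a square root.

    Follows from 2*sqrt(a*b) <= a + b. Illustrative only; not the slide's formula.
    """
    return 2 * (a_sq + b_sq)

# Hypothetical integer data, as a fixed-point hardware datapath might use.
point = (1, 2)
centers = [(5, 5), (0, 0), (2, 3)]
winner = nearest_by_sq_dist(point, centers)
```

The bound is exact when the two squared distances are equal and becomes looser as their ratio grows, which suggests why a ratio-dependent approximation can be attractive.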

Optimization for HW: square root

• Ratio: x/y

• Sum and diff: [formulas missing in transcript]

Optimization for HW: square

• 8-bit square calculator for 6-LUT FPGA

Optimization for HW: square

• Comparison: 4-LUT

Hardware Platform

• ML605 Evaluation Board:
  – XC6VLX240T
  – 512 MB DDR3 (max bandwidth: 6.4 GB/s)
  – PCIe interface (8-lane Gen 1)

Interface Overview

HW Architecture

Benchmarks: k=10, d=1024

• MNIST: grayscale pictures of digits 0-9
  – 28*28 to 32*32 (1024 dimensions)
  – Initial centers: manually picked

• Uniform Random (UR)
  – No seed is set
  – Initial centers: first 10 points

Result: approximation

• Distance calculation ratio:

• ORI: original triangle inequality optimization
• APT: naïve approximation
• AAPT: aggressive approximation

Result: approximation

2% 4% 22.7%

5% 7% 77.6%

HW experiment: cost

• 10% more slice LUTs
• 3.4% more registers
• BRAM!!

HW experiment: speed

• Total time approximation

• When n is big enough

• For 32000 MNIST data points, the processing time is 0.23T, a saving of 77%

HW experiment: speed

• SW platform:
  – Intel quad-core i5-2500 CPU, 1 thread
  – 3.3 GHz, 4 GB DDR2

• HW platform:
  – 100 MHz

Future Work

• Store the bounds in external memory
• Parallelism between different centers
• Better comparison with software implementation
