Johann M. Kraus and Hans A. Kestler AG Bioinformatics and Systems Biology Institute of Neural Information Processing University of Ulm Multi-core Parallelization in Clojure - a Case Study 29.06.2009

Page 1: Multi-core Parallelization in Clojure - a Case Study · Multi-core Parallelization in Clojure - a Case Study 29.06.2009. Outline 1. Concepts of parallel programming 2. Short introduction

Johann M. Kraus and Hans A. Kestler

AG Bioinformatics and Systems Biology

Institute of Neural Information Processing

University of Ulm

Multi-core Parallelization in Clojure - a Case Study

29.06.2009

Page 2

Outline

1. Concepts of parallel programming

2. Short introduction to Clojure

3. Multi-core parallel K-means - the case study

4. Analysis and Results

5. Summary

Page 3

Parallel Programming

Definition: Parallel programming is a form of programming where many calculations are performed simultaneously.

• Physical constraints prevent frequency scaling of processors

• This led to an increasing interest in parallel hardware and parallel programming

• Multi-core hardware is standard on desktop computers

• Parallel software can use this hardware to its full capacity

Page 4

• Large problems are divided into smaller ones and the sub-problems are solved simultaneously

• Speedup S is limited by the fraction of parallelizable code P

• Amdahl's law: $S = \frac{1}{(1 - P) + \frac{P}{N}}$, where N is the number of processors

[Figure: Amdahl's law. Speedup (0 to 20) against the number of processors (1 to 65536) for fractions of parallelizable code of 0.95, 0.90, 0.75, and 0.50.]
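As a rough illustration of the formula, the following Clojure sketch evaluates Amdahl's law for a few processor counts. The function names `amdahl-speedup` and `speedup-limit` are chosen for this example and do not appear in the talk.

```clojure
;; Amdahl's law: S = 1 / ((1 - P) + P / N)
;; p = fraction of parallelizable code, n = number of processors.
;; Hypothetical helper names, chosen for this illustration only.
(defn amdahl-speedup [p n]
  (/ 1.0 (+ (- 1.0 p) (/ p n))))

;; Upper bound on the speedup as n grows without limit: 1 / (1 - P).
(defn speedup-limit [p]
  (/ 1.0 (- 1.0 p)))

;; Speedups for P = 0.95 on 1, 8, and 1024 processors.
(def example-speedups
  (mapv #(amdahl-speedup 0.95 %) [1 8 1024]))
```

For P = 0.95 the bound `(speedup-limit 0.95)` is about 20, so adding processors beyond a few hundred yields almost no further gain.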

Page 5

Concepts of Parallel Programming

Explicit vs. implicit parallelization

• Explicitly define communication and synchronization details for each task:

• MPI

• Java Threads

• Functional programming allows implicit parallelization:

• Parallel processing of functions

• Functions are free of side-effects

• Data is immutable
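To make implicit parallelization concrete, here is a minimal sketch using Clojure's built-in `pmap`, a parallel counterpart of `map`. Because `sum-of-squares` has no side effects and the input vectors are immutable, the rows can be processed in any order, or at the same time, without changing the result. The function and data names are invented for this example.

```clojure
;; A pure function: the result depends only on the argument.
(defn sum-of-squares [v]
  (reduce + (map #(* % %) v)))

;; Immutable input data (hypothetical example values).
(def rows [[1 2 3] [4 5 6] [7 8 9]])

;; Sequential evaluation.
(def sequential (map sum-of-squares rows))

;; Parallel evaluation: only the mapping function changes,
;; no threads, locks, or messages are written by hand.
(def parallel (pmap sum-of-squares rows))
```

Both sequences contain the same values. When run as a script, `(shutdown-agents)` should be called at the end, since `pmap` uses the thread pool that also backs futures.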

Page 6

Distributed vs. local hardware

[Figure: Two diagrams. In one, a Master exchanges messages with Slaves 0 to 4 (send data, send result). In the other, CPUs 0 to 4 read from and write to a Shared Memory.]

• Master - Slave parallelization (e.g. Message Passing Interface)

• Shared memory parallelization (e.g. Open Multi-Processing)

Page 7

Thread programming

[Figure: Thread life cycle with the states new, runnable, running, waiting, and terminated, and the transitions start, schedule, block, awake, and end.]

• Threads are refinements of a process that share the same memory and can be processed separately and simultaneously

• Available in many languages, e.g. PThreads (C), Java Threads (Java), OpenMP Threads (C, Fortran)

• Execution of threads is handled by a scheduler that manages the available processing time

• Communication between threads is faster than communication between processes

• Creating threads is also faster than forking and joining processes

Page 8

Concurrency control via locking and synchronizing

• Concurrency control ensures that threads can access shared memory without violating data integrity

• The most popular approach to concurrency is locking and synchronizing

• Problems might occur when using too many locks, too few locks, wrong locks, or locks in the wrong order

• Using locks is error-prone and can lead to fatal failures, e.g. deadlocks

public class Counter {
    private int value = 0;

    // synchronized: only one thread at a time may run incr() on this object
    public synchronized void incr() {
        value = value + 1;
    }
}

Counter counter = new Counter();
counter.incr();
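For comparison with the Java class above, a lock-based counter can be sketched in Clojure with the `locking` macro, which acquires an object's monitor much like a `synchronized` block. The names `monitor`, `value`, `incr`, and `run-increments` are illustrative and not taken from the talk; `volatile!` assumes Clojure 1.7 or later.

```clojure
;; A plain mutable cell; vswap! on it is NOT atomic by itself,
;; so it plays the role of the unsynchronized field `value`.
(def value (volatile! 0))

;; The object whose monitor guards every access to `value`.
(def monitor (Object.))

(defn incr []
  (locking monitor
    (vswap! value inc)))

;; Start `threads` futures that each call incr `per-thread` times,
;; wait for all of them, and return the final counter value.
(defn run-increments [threads per-thread]
  (let [workers (doall (repeatedly threads
                                   #(future (dotimes [_ per-thread] (incr)))))]
    (doseq [w workers] (deref w))
    @value))
```

Starting from a fresh counter, `(run-increments 4 1000)` yields 4000. The guarantee holds only as long as every piece of code that touches `value` takes the same lock, which is exactly where the problems listed above come from.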

Page 9

Concurrency control via transactional memory

• Transactional memory offers a flexible alternative to lock-based concurrency control

• Functionality is analogous to controlling simultaneous access to database management systems

• Transactions ensure properties:

• Atomicity: Either all changes of a transaction occur or none do

• Consistency: Only valid changes are committed

• Isolation: No transaction sees the effect of other transactions

• Durability: Changes from transactions will be persistent

Page 10

[Figure: Sequence diagram over time with the participants Transaction 0, Data, and Transaction 1. Each transaction gets the data and later sends modified data back, guarded by the condition "consistent data".]

• Software transactional memory maps transactional memory to concurrency control in parallel programming

Page 11

Clojure

• Functional programming language hosted on the JVM

• Extends the code-as-data paradigm to maps and vectors

• Based on immutable data structures

• Provides built-in concurrency support via software transactional memory

• Completely symbiotic with Java, e.g. easy access to Java libraries

• Platform independent

Page 12

• Dynamic typing and multi-methods

• An object is defined as the sum of what it can do (methods), rather than the sum of what it is (type hierarchy)

• Java interaction

(import '(cern.jet.random.sampling RandomSamplingAssistant))

(defn sample [n k]
  (seq (. RandomSamplingAssistant
          (sampleArray k (int-array (range n))))))

• Add type hints to speed up code

(defn da+ [#^doubles as #^doubles bs]
  (amap as i ret
    (+ (aget as i) (aget bs i))))

Page 13

Transactional references and STM

• Transactional references ensure safe coordinated synchronous changes to mutable storage locations

• Are bound to a single storage location for their lifetime

• Only allow mutation of that location to occur within transactions

• Available operations are ref-set, alter, and commute

• No explicit locking is required

(def counter (ref 0))
(dosync (alter counter inc))
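A slightly larger sketch may help to show what "coordinated" means: several refs are changed in one transaction, so no other thread can observe a state in which only some of them are updated. The account refs and the `transfer!` function are invented for this illustration and are not part of the talk.

```clojure
;; Three refs that must stay consistent with each other
;; (hypothetical example state).
(def account-a (ref 100))
(def account-b (ref 0))
(def transfer-count (ref 0))

(defn transfer! [from to amount]
  (dosync
    ;; alter: read-modify-write that is retried on conflict
    (alter from - amount)
    (alter to + amount)
    ;; commute: the order of increments does not matter,
    ;; so concurrent transactions need not conflict on this ref
    (commute transfer-count inc)))

(defn reset-accounts! []
  ;; ref-set: overwrite the value regardless of the old one
  (dosync
    (ref-set account-a 100)
    (ref-set account-b 0)
    (ref-set transfer-count 0)))

(transfer! account-a account-b 30)
[@account-a @account-b @transfer-count]   ;; => [70 30 1]
```

The sum of both accounts stays at 100 however many threads call `transfer!` at the same time, and no lock appears in the code.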

Page 14

Agents

• Agents allow independent asynchronous change of mutable locations

• Are bound to a single storage location for their lifetime

• Only allow mutation of that location to a new state to occur as a result of an action

• Actions are functions that are asynchronously applied to the state of an Agent

• The return value of an action becomes new state of the Agent

• Agents are integrated with the STM

(def counter (agent 0))
(send counter inc)
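The asynchronous nature of agents can be sketched as follows: `send` returns immediately and the action runs later on a thread pool, so a caller that needs the result has to wait with `await` before dereferencing. The names `hit-counter` and `increment-n-times` are made up for this example.

```clojure
;; An agent holding a single number as its state.
(def hit-counter (agent 0))

;; Queue n inc actions, wait until all of them have been applied,
;; and return the resulting state.
(defn increment-n-times [n]
  (dotimes [_ n]
    (send hit-counter inc))   ; returns at once, inc runs asynchronously
  (await hit-counter)         ; block until the queued actions are done
  @hit-counter)
```

Actions sent to one agent are applied one at a time, so no increments are lost even though no lock is taken. A script that uses agents should call `(shutdown-agents)` before it ends, otherwise the agent thread pool keeps the JVM alive.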

Page 15

Cluster analysis

[Figure: Two panels labeled "3 cluster" and "9 cluster".]

• How many clusters are in the data set?

• Given a data set X compute a partition of X into k disjoint clusters C, such that:

(1) $\bigcup_{i=1}^{k} C_i = X$

(2) $C_i \neq \emptyset$ and $C_i \cap C_j = \emptyset$ for $i \neq j$

Page 16

Cluster algorithms

• For all possible partitions evaluate the objective function f and search the optimum.

• The cardinality of the set of all possible partitions is given by:

$S_N^k = \frac{1}{k!} \sum_{i=0}^{k} (-1)^{k-i} \binom{k}{i} i^N$

[Figure: Stirling numbers of the second kind. Axes: number of clusters, number of data points, and runtime (nanoseconds).]

Cluster algorithms provide a heuristic for this search:

• Partitional clustering (K-means, Neuralgas, SOM, Fuzzy C-means, ...)

• Hierarchical clustering (Divisive/agglomerative, Complete linkage, ...)

• Graph-based clustering (Spectral clustering, NMF, Affinity propagation, ...)

• Model-based clustering, Biclustering, Semi-supervised clustering
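To get a feeling for why an exhaustive search over all partitions is hopeless, the Stirling-number formula for the number of partitions of N data points into k clusters can be evaluated directly. The sketch below is a straightforward transcription of that formula; the helper names are ours, and the auto-promoting operators `+'` and `*'` are used so that large values do not overflow.

```clojure
(defn factorial [n]
  (reduce *' 1 (range 1 (inc n))))

(defn binomial [n k]
  (/ (factorial n) (*' (factorial k) (factorial (- n k)))))

(defn pow [base exponent]
  (reduce *' 1 (repeat exponent base)))

;; Stirling number of the second kind:
;; (1 / k!) * sum_{i=0}^{k} (-1)^(k-i) * C(k, i) * i^N
(defn stirling2 [n k]
  (/ (reduce +'
             (for [i (range (inc k))]
               (*' (if (even? (- k i)) 1 -1)
                   (binomial k i)
                   (pow i n))))
     (factorial k)))

(stirling2 4 2)    ;; => 7
(stirling2 10 3)   ;; => 9330
```

Ten data points can already be split into three clusters in 9330 ways, and the count grows rapidly with N, which is why heuristics are used instead of enumeration.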

Page 17

K-means algorithm

Function KMeans

  Input:  X = {x_1, ..., x_n}   (Data to be clustered)
          k                     (Number of clusters)

  Output: C = {c_1, ..., c_k}   (Cluster centroids)
          m: X -> C             (Cluster assignments)

  Initialize C (e.g. random selection from X)
  While C has changed
    For each x_i in X
      m(x_i) = argmin_j distance(x_i, c_j)
    End
    For each c_j in C
      c_j = centroid({x_i | m(x_i) = j})
    End
  End
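The pseudocode above can be turned into a short sequential Clojure sketch. This is not the authors' McKmeans implementation: it assumes Euclidean distance, takes the initial centroids as an argument instead of sampling them, and keeps a centroid unchanged when its cluster becomes empty. All function names are chosen for this example.

```clojure
(defn distance [a b]
  (Math/sqrt (reduce + (map (fn [x y] (let [d (- x y)] (* d d))) a b))))

;; argmin_j distance(x, c_j): index of the closest centroid.
(defn nearest-centroid [centroids x]
  (apply min-key #(distance x (nth centroids %)) (range (count centroids))))

;; Component-wise mean of a non-empty collection of points.
(defn centroid [points]
  (let [n (count points)]
    (mapv #(/ % (double n)) (apply map + points))))

;; Recompute every centroid from the points assigned to it.
(defn update-centroids [centroids assignment data]
  (let [groups (group-by first (map vector assignment data))]
    (mapv (fn [j c]
            (if-let [members (seq (map second (groups j)))]
              (centroid members)
              c))                       ; empty cluster: keep old centroid
          (range) centroids)))

;; Repeat assignment and update until the centroids stop changing.
(defn k-means [data initial-centroids]
  (loop [centroids (mapv vec initial-centroids)]
    (let [assignment (mapv #(nearest-centroid centroids %) data)
          updated    (update-centroids centroids assignment data)]
      (if (= updated centroids)
        {:centroids centroids :assignment assignment}
        (recur updated)))))
```

For example, `(k-means [[0.0 0.0] [0.0 1.0] [10.0 10.0] [10.0 11.0]] [[0.0 0.0] [10.0 10.0]])` puts the first two points into cluster 0 and the last two into cluster 1. The assignment step handles every point independently, which is the part that lends itself to parallelization.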

Page 18

Cluster Validation

• MCA-index: mean proportion of samples being consistent over different clusterings

$\mathrm{MCA} = \frac{1}{n} \max_{\pi} \sum_{i=1}^{k} |A_i \cap B_j|$

• Evaluation requires repeated runs of clustering, e.g.:

• Resampled data sets

• Different parameters
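A brute-force sketch of the MCA index may clarify what "consistent over different clusterings" means. It rests on an assumption that is our reading of the formula, not something stated on the slide: the maximum is taken over all relabelings (permutations) of the k clusters, so that two clusterings that differ only in their label names count as identical. Clusterings are represented as vectors of labels 0 to k-1, one per sample; the function names are ours.

```clojure
;; All permutations of a collection of distinct cluster indices.
(defn permutations [coll]
  (if (empty? coll)
    [[]]
    (for [x coll
          p (permutations (remove #{x} coll))]
      (cons x p))))

;; labels-a, labels-b: cluster label of every sample in two clusterings.
;; ASSUMPTION: the best match over all label permutations is used.
(defn mca [labels-a labels-b k]
  (let [n     (count labels-a)
        agree (fn [perm]
                ;; samples whose label in A, renamed by perm,
                ;; equals their label in B
                (count (filter true?
                               (map (fn [a b] (= (nth perm a) b))
                                    labels-a labels-b))))]
    (/ (apply max (map agree (permutations (range k))))
       (double n))))
```

With this reading, `(mca [0 0 1 1] [1 1 0 0] 2)` gives 1.0 because the two clusterings differ only in label names, while `(mca [0 0 1 1] [0 1 0 1] 2)` gives 0.5. Enumerating all permutations is only feasible for small k.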

Page 19

Estimation of the expected value of a validation index

[Figure: Four panels, each plotting the mean MCA index (0.0 to 1.0) against the cluster number (0 to 50).]

Random label: randomly assign each item to a cluster k

Random partition: choose a random partition

Random prototype: assign each item to its nearest prototype

Mean value from 100 runs

Page 20

Multi-core K-means with Clojure

• Split the data set into smaller pieces that are handled by agents

• Each cluster is represented by an agent

• Add a commutative list of cluster members within a transactional reference to accelerate the centroid update step

[Figure: Diagram of Data Agents 0 to n, Cluster Agents 0 to k, and Member Refs 0 to k, connected by read and write arrows.]

Page 21

[Figure: Data Agents 0 to n, Cluster Agents 0 to k, and Member Refs, with arrows labeled "simultaneous read" and "simultaneous write".]

Page 22

;; nearest-cluster, MemberRefs, and DataAgents are not shown on this slide.

(defn update-datapoint [datapoint]
  (let [newass (nearest-cluster datapoint)]
    (dosync (commute (nth MemberRefs newass)
                     conj (:data datapoint)))
    (assoc datapoint :assignment newass)))

(defn update-dataagent [datapoints]
  ;; doall forces the lazy sequence inside the agent action
  (doall (map update-datapoint datapoints)))

(defn assignment []
  ;; doall makes sure every send is actually dispatched
  (doall (map #(send % update-dataagent) DataAgents)))

read: (nearest-cluster)

write: (commute) (assoc)
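The slide code relies on definitions that are not shown. The following self-contained sketch fills them in with toy values so that one assignment step can be run end to end. Everything beyond the three functions from the slide is an assumption made for illustration: the fixed `centroids` vector (the talk keeps centroids in cluster agents), the two tiny data agents, the squared Euclidean distance, and the lower-case names `member-refs` and `data-agents`.

```clojure
;; ASSUMPTION: centroids are a fixed vector for this single step.
(def centroids [[0.0 0.0] [10.0 10.0]])

;; One transactional member list per cluster.
(def member-refs (vec (repeatedly (count centroids) #(ref []))))

;; Each data agent owns one chunk of the data set (toy values).
(def data-agents
  [(agent [{:data [0.0 1.0]} {:data [9.0 10.0]}])
   (agent [{:data [1.0 0.0]} {:data [10.0 11.0]}])])

(defn squared-distance [a b]
  (reduce + (map (fn [x y] (let [d (- x y)] (* d d))) a b)))

;; read: index of the closest centroid
(defn nearest-cluster [datapoint]
  (apply min-key
         #(squared-distance (:data datapoint) (nth centroids %))
         (range (count centroids))))

;; write: commute lets several agents append to the same member list
;; without retrying, because the order of conj calls is irrelevant
(defn update-datapoint [datapoint]
  (let [newass (nearest-cluster datapoint)]
    (dosync (commute (nth member-refs newass) conj (:data datapoint)))
    (assoc datapoint :assignment newass)))

(defn update-dataagent [datapoints]
  (mapv update-datapoint datapoints))

;; Send the update action to every data agent and wait for completion.
(defn assignment []
  (doseq [a data-agents]
    (send a update-dataagent))
  (apply await data-agents))

(assignment)
```

After `(assignment)` returns, each member ref holds the points of its cluster (in an order that depends on scheduling) and each data agent holds its points with an `:assignment` key added. A script should finish with `(shutdown-agents)`.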

Page 23

Benchmark results

Large data sets (artificial):

• Each data point is sampled from N(0,1)

• Summary for 10 runs of K-means

[Figure: Run time of ParaKMeans, K-means R, and McKmeans in two panels: 10,000 cases, 100 dimensions, 20 clusters; and 1,000,000 cases, 200 dimensions, 20 clusters. Run time axes are given in seconds and in minutes.]

Page 24

• Number of computer cores used

[Figure: Runtime (seconds) against the number of computer cores (1, 4, 8) for a 100,000 x 500 data set with 20 clusters.]

• Number of data agents used

[Figure: Runtime (seconds) against the number of data agents (4 to 10) for a 100,000 x 500 data set with 20 clusters.]

Page 25

Large data sets with cluster structure

• Data sampled from a multi-variate normal distribution

• 100000 samples, 200/500 dimensions, 10/20 clusters

[Figure: Run time in seconds (0 to 2000) of K-means R and McKmeans for the settings 200 / 10, 200 / 20, 500 / 10, and 500 / 20; the axis is labeled "Number of samples / Number of clusters".]

Page 26

Accuracy compared to the known grouping of data

• Measured with the MCA index

• Red bars indicate the random-prototype baseline

[Figure: MCA index (0.0 to 1.0) of McKmeans and K-means R for four data sets: 100,000 x 200 with 10 clusters, 100,000 x 200 with 20 clusters, 100,000 x 500 with 10 clusters, and 100,000 x 500 with 20 clusters.]

Page 27

Real world data set

• Microarray data (Radiation-induced changes in human gene expression)

• 22277 samples (genes) and 465 features (profiles)

[Figure: Run time in seconds (0 to 350) of K-means R and McKmeans for 2, 5, 10, and 20 clusters.]

Smirnov D, Morley M, Shin E, Spielman R, Cheung V: Genetic analysis of radiation-induced changes in human gene expression. Nature 2009, 459:587–591

Page 28

Application to Cluster Number Estimation

• Repeated clustering with different subsets of data

• Repeated for different numbers of clusters k

• The most stable clustering is produced for the 'real' cluster number

[Figure: MCA index (0.0 to 1.0) against the number of clusters (2 to 7).]

• Jackknife resampling

• Evaluation with MCA index

• Data set: 100000 samples, 100 features, 3 clusters

• 10 runs per cluster number

• 49.26 minutes on dual-quad core 3.2 GHz

Page 29

Java GUI

(import '(javax.swing JFrame JLabel JTextField JButton)
        '(java.awt.event ActionListener)
        '(java.awt GridLayout))

(let [frame (new JFrame "Hello, World!")
      hello-button (new JButton "Say hello")
      hello-label (new JLabel "")]
  (. hello-button
     (addActionListener
       (proxy [ActionListener] []
         (actionPerformed [evt]
           (. hello-label
              (setText "Hello, World!"))))))
  (doto frame
    (.setLayout (new GridLayout 1 1 3 3))
    (.add hello-button)
    (.add hello-label)
    (.setSize 300 80)
    (.setVisible true)))

Page 31

Summary

• Writing parallel programs usually requires careful software design and deep knowledge of thread-safe programming

• Concurrency control via transactional memory circumvents problems of lock-based concurrency strategies

• Immutable data structures play a key role in software transactional memory

• Clojure combines Lisp, Java and a powerful STM system

• This enables fast parallelization of algorithms, even for rapid prototyping

• Our simulations show good performance of the parallelized code

Page 32

Thank you for your attention.

Page 33

Statistical computing library

• http://wiki.github.com/liebke/incanter

• Clojure-based statistical computing

• R-like semantics

• COLT library for numerical computation

• JFreeChart library for graphics