Functional Collection-Oriented Programming
Guy Blelloch, Carnegie Mellon University
12/2009



Collection-oriented programming

Programmer emphasis is on operations over collections of values. (Data Driven)

Array based: APL, Nial, FP, Matlab
Database: SQL, Linq
Scripting: SETL, Python
Data parallel: *Lisp, HPF, Nesl, Id, ZPL
Map-reduce

All of these support some form of Map and some form of reduce.
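As a small illustration of the two operations every language on this list shares, here is a minimal Python sketch. The function name `sum_of_squares` and the sample values are made up for the example; they are not from the talk.

```python
from functools import reduce
from operator import add

def sum_of_squares(values):
    """Map a function over a collection, then reduce the result to one value."""
    squares = list(map(lambda x: x * x, values))  # map: one independent call per element
    return reduce(add, squares, 0)                # reduce: combine with an associative operator

result = sum_of_squares([3, 1, 4, 1, 5])
```

Because each call in the map is independent and `add` is associative, both steps could in principle be evaluated in parallel without changing the result.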


Collection-oriented programming

Concise code
Promotes a functional style of programming
Has become popular even without parallelism (Matlab, Python, SQL, …)

Parallelism
    Map is naturally parallel
    Many collection operations are parallel: reduce, scan, collect, flatten, transpose, …
    Most often DETERMINISTIC
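For readers unfamiliar with the operation names, the following Python sketch shows one plausible sequential meaning for each. The helper `collect` and all sample data are assumptions for illustration; the talk does not define these operations, and this is not a parallel implementation.

```python
from collections import defaultdict
from functools import reduce
from itertools import accumulate, chain
import operator

def collect(pairs):
    """Group (key, value) pairs into (key, [values]) pairs, sorted by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())

xs = [1, 2, 3, 4]
total = reduce(operator.add, xs, 0)                       # reduce: combine all elements
prefix = list(accumulate(xs))                             # scan: running (inclusive) sums
grouped = collect([("a", 1), ("b", 2), ("a", 3)])         # collect: group values by key
flat = list(chain.from_iterable([[1, 2], [], [3]]))       # flatten: remove one nesting level
transposed = [list(col) for col in zip(*[[1, 2, 3], [4, 5, 6]])]  # transpose rows and columns
```

Each result depends only on the input collection, not on evaluation order, which is the sense in which these operations are deterministic.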

Collection-oriented programming

“Concurrency” (Non-deterministic environment)

On its own, it is not really useful for “concurrent” applications (e.g. operating systems or the front end of a web server).

                The environment
Parallelism     sequential                    concurrent
serial          Traditional programming       Traditional OS
parallel        Deterministic parallelism     General parallelism

Flat vs. Nested
    Can collections contain collections?
    Can arbitrary functions be mapped?

Flat languages: APL, SQL, Map-reduce, HPF, Matlab
Nested languages: SETL, Python, Nesl, Id

I conjecture that flat CO languages will never be general purpose: not good for trees, divide-and-conquer, …
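The two questions can be made concrete with a short Python sketch (Python is listed above as a nested language). The function names and the `ragged` data are hypothetical and chosen only for illustration.

```python
def row_sums(nested):
    """Map a function that is itself a collection operation (sum) over each inner list."""
    return [sum(inner) for inner in nested]

def nested_scale(nested, k):
    """A map nested inside a map: scale every element of every inner list."""
    return [[k * x for x in inner] for inner in nested]

# A collection of collections; the inner lists may have different lengths.
ragged = [[1, 2, 3], [], [4, 5]]
sums = row_sums(ragged)
doubled = nested_scale(ragged, 2)
```

A flat language would have to encode `ragged` some other way, for example as one flat array plus segment lengths, before it could express the same computation.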

Quicksort in NESL

    function quicksort(S) =
    if (#S <= 1) then S
    else let
        a  = S[rand(#S)];
        S1 = {e in S | e < a};
        S2 = {e in S | e = a};
        S3 = {e in S | e > a};
        R  = {quicksort(v) : v in [S1, S3]};
    in R[0] ++ S2 ++ R[1];
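The NESL quicksort translates almost line for line into Python list comprehensions. This is a sequential sketch, not the talk's code: the comprehension over `(s1, s3)` marks where NESL would run the two recursive calls in parallel.

```python
import random

def quicksort(s):
    """Quicksort written with collection operations only (no in-place updates)."""
    if len(s) <= 1:
        return s
    a = random.choice(s)                   # random pivot, as in S[rand(#S)]
    s1 = [e for e in s if e < a]
    s2 = [e for e in s if e == a]
    s3 = [e for e in s if e > a]
    r = [quicksort(v) for v in (s1, s3)]   # the two recursive calls are independent
    return r[0] + s2 + r[1]
```

Although the pivot is random, the result is the same for every run, since only the running time depends on the choice.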

Quicksort in X10

    double[] quicksort(double[] S) {
      if (S.length < 2) return S;
      double a = S[rand(S.length)];
      double[] S1,S2,S3;
      finish {
        async { S1 = quicksort(lessThan(S,a)); }
        async { S2 = eqTo(S,a); }
        S3 = quicksort(grThan(S,a));
      }
      return append(S1,append(S2,S3));
    }

Matrix Multiplication

    Fun A*B {
      if #A < k then baseCase..
      A11,A12,A21,A22 = QuadSplit(A)
      B11,B12,B21,B22 = QuadSplit(B)
      Parallel {
        C11 = A11*B11 + A12*B21
        C12 = A11*B12 + A12*B22
        C21 = A21*B11 + A22*B21
        C22 = A21*B12 + A22*B22
      }
      return QuadJoin(C11,C12,C21,C22)
    }

$$A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}, \qquad B = \begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix}$$

Need to be able to program for locality.
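A sequential Python sketch of the recursive block multiply may help make the quadrant structure concrete. It assumes square matrices stored as lists of rows; the helper names (`quad_split`, `quad_join`, `mat_add`), the default cutoff `k`, and the naive base case are all assumptions, since the slide elides the base case.

```python
def quad_split(m):
    """Split a square matrix of even size into four quadrants."""
    h = len(m) // 2
    top, bottom = m[:h], m[h:]
    return ([r[:h] for r in top], [r[h:] for r in top],
            [r[:h] for r in bottom], [r[h:] for r in bottom])

def quad_join(c11, c12, c21, c22):
    """Reassemble four quadrants into one matrix."""
    return [l + r for l, r in zip(c11, c12)] + [l + r for l, r in zip(c21, c22)]

def mat_add(x, y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]

def matmul(a, b, k=2):
    n = len(a)
    if n <= k or n % 2:  # assumed base case: small or odd-sized block, multiply directly
        return [[sum(a[i][t] * b[t][j] for t in range(n)) for j in range(n)]
                for i in range(n)]
    a11, a12, a21, a22 = quad_split(a)
    b11, b12, b21, b22 = quad_split(b)
    # The four quadrants below are independent and could be computed in parallel.
    c11 = mat_add(matmul(a11, b11, k), matmul(a12, b21, k))
    c12 = mat_add(matmul(a11, b12, k), matmul(a12, b22, k))
    c21 = mat_add(matmul(a21, b11, k), matmul(a22, b21, k))
    c22 = mat_add(matmul(a21, b12, k), matmul(a22, b22, k))
    return quad_join(c11, c12, c21, c22)
```

The recursion works on progressively smaller blocks, which is the property a locality-aware cost model can exploit; this list-based sketch does not itself control memory layout.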

Question: How general is functional CO programming?

Advantages
    High-level/concise
    Natural/intuitive
    Deterministic parallelism (for all partial results)
    No need for annotations, commutativity, regions
    No speculation
    Simple cost model (even including locality)

Potential disadvantages
    Performance
    Major rewriting of code
    Does not support “concurrency” on its own

Barnes Hut

    function bTree(pts,box as (x0,y0,s)) =
    if #pts = 0 then EMPTY
    else if #pts = 1 then LEAF(pts[0])
    else let
        xm = x0 + s/2; ym = y0 + s/2;
        parallelLet
          T1 = bTree({(x,y,d) in pts | x<xm & y<ym}, (x0,y0,s/2));
          T2 = bTree({(x,y,d) in pts | x<xm & y>=ym}, (x0,y0+s/2,s/2));
          ..
    in NODE(cmass(T1,T2,T3,T4),box,T1,T2,T3,T4)

Barnes Hut

    function force(p,LEAF(p')) = force(p,p')
           | force(p,EMPTY) = 0
           | force(p,NODE(c,box,T1,T2,T3,T4)) =
               if far(p,box) then forceC(p,c)
               else force(p,T1)+force(p,T2)+force(p,T3)+force(p,T4)

    function forces(Points,T) = {move(p,force(p,T)) : p in Points};
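The two Barnes Hut functions can be sketched in runnable Python as follows. Several pieces are assumptions, because the slides do not define them: points are `(x, y, mass)` tuples with distinct positions, `pair_force` is an inverse-square law with G = 1, `far` uses a conventional opening-angle test with a hypothetical parameter `theta`, and the center of mass is computed from the points directly rather than from the four subtrees.

```python
import math

EMPTY = None  # stands in for the EMPTY tree of the pseudocode

def b_tree(pts, box):
    """Build a quadtree over pts = [(x, y, mass), ...] inside box = (x0, y0, s)."""
    x0, y0, s = box
    if not pts:
        return EMPTY
    if len(pts) == 1:
        return ("LEAF", pts[0])
    h = s / 2
    xm, ym = x0 + h, y0 + h
    quadrants = [
        ([p for p in pts if p[0] < xm and p[1] < ym], (x0, y0, h)),
        ([p for p in pts if p[0] < xm and p[1] >= ym], (x0, ym, h)),
        ([p for p in pts if p[0] >= xm and p[1] < ym], (xm, y0, h)),
        ([p for p in pts if p[0] >= xm and p[1] >= ym], (xm, ym, h)),
    ]
    # The four subtrees are independent, so they could be built in parallel.
    kids = [b_tree(q, b) for q, b in quadrants]
    m = sum(p[2] for p in pts)
    center = (sum(p[0] * p[2] for p in pts) / m, sum(p[1] * p[2] for p in pts) / m, m)
    return ("NODE", center, box, kids)

def pair_force(p, q):
    """Assumed force law: inverse-square attraction of p toward q, with G = 1."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    r2 = dx * dx + dy * dy
    if r2 == 0.0:
        return (0.0, 0.0)  # a point exerts no force on itself
    f = p[2] * q[2] / (r2 * math.sqrt(r2))
    return (f * dx, f * dy)

def far(p, box, theta):
    """Assumed opening test: a box is far if its side is small relative to its distance."""
    x0, y0, s = box
    d = math.hypot(x0 + s / 2 - p[0], y0 + s / 2 - p[1])
    return s < theta * d

def force(p, tree, theta=0.5):
    if tree is EMPTY:
        return (0.0, 0.0)
    if tree[0] == "LEAF":
        return pair_force(p, tree[1])
    _, center, box, kids = tree
    if far(p, box, theta):
        return pair_force(p, center)  # approximate the whole box by its center of mass
    parts = [force(p, k, theta) for k in kids]
    return (sum(f[0] for f in parts), sum(f[1] for f in parts))

def forces(points, tree, theta=0.5):
    """One independent tree walk per point, i.e. a map over the points."""
    return [force(p, tree, theta) for p in points]
```

With `theta=0` no box is ever treated as far, so the sketch reduces to the exact all-pairs sum; larger values trade accuracy for fewer tree visits.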

“Algorithms in the Real World”

Topic                    Algorithm            CO   Parallel   Functional*
Compression              JPEG                 4    4          5
                         BW                   4    4          5
Error Correcting Codes   Reed Solomon         4    4          4
                         Parity Check         5    5          5^
Cryptography             Rijndael             4    5          5
                         Bignum               4    4          5
Comp Biology             Blast                4    4          4
                         Clustal              3    3          5
N-body Codes             Barnes Hut           5    5          5
                         Callahan Kosaraju    5    5          5

* Easily expressed with no shared writeable state
^ Depends on algorithm

Topic                        Algorithm               CO   Parallel   Functional*
Linear/Integer Programming   Interior Point          4    4          4^
                             Branch and Bound        3    3          3
Web Indexing                 Index Building          5    5          5
                             Page Rank               5    5          5
Geometry                     Delaunay                4    5          4^
                             Nearest Neighbors       4    4          5
Dimensionality Reduction     SVD                     4    4          4
                             Johnson-Lindenstrauss   5    5          5
String Searching             Suffix Trees            4    4          4^
Graph Separators             Contraction             4    4          4

^ Depends on algorithm


Barnes Hut

    function force(p,LEAF(p')) = force(p,p')
           | force(p,EMPTY) = 0
           | force(p,NODE(c,box,T1,T2,T3,T4)) =
               if far(p,box) then forceC(p,c)
               else force(p,T1)+force(p,T2)+force(p,T3)+force(p,T4)

    function forces(Points,T) = {force(p,T) : p in Points};

Graph Connectivity/Spanning Tree


Graph Connectivity

Edge List Representation:
    Edges = [(0,1), (0,2), (2,3), (3,4), (3,5), (3,6), (1,3), (1,5), (5,6), (4,6)]

[Figure: the example graph on vertices 0 to 6]

Graph Contraction

[Figure: the example graph contracted in three steps: form stars, relabel, contract]

Graph Connectivity

Edge List Representation:
    Edges = [(0,1), (0,2), (2,3), (3,4), (3,5), (3,6), (1,3), (1,5), (5,6), (4,6)]

[Figure: the example graph on vertices 0 to 6]

    Hooks = [(0,1), (1,3), (1,5), (3,6), (4,6)]

Graph Connectivity

L = Vertex Labels, E = Edge List

    function connectivity(L, E) =
    if #E = 0 then L
    else let
        FL = {coinToss(.5) : x in [0:#L]};
        H  = {(u,v) in E | FL[u] and not(FL[v])};
        L  = L <- H;
        E  = {(L[u],L[v]) : (u,v) in E | L[u] /= L[v]};
    in connectivity(L,E);
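A Python sketch of the same random-mate contraction follows. Two details are my own assumptions rather than part of the slide: the optional `rng` parameter (for reproducible coin tosses) and the final relabeling step after the recursive call, which makes vertices hooked in earlier rounds point at their final representative.

```python
import random

def connectivity(labels, edges, rng=random):
    """labels[v] is the current representative of v; edges is a list of (u, v) pairs."""
    if not edges:
        return labels
    flips = [rng.random() < 0.5 for _ in labels]       # one coin toss per vertex
    # Hooks: edges from a "heads" vertex to a "tails" vertex.
    hooks = [(u, v) for (u, v) in edges if flips[u] and not flips[v]]
    labels = list(labels)
    for u, v in hooks:           # models L <- H; if several hooks share u, any winner is fine
        labels[u] = v
    # Relabel the edges and drop those that now lie inside a single component.
    edges = [(labels[u], labels[v]) for (u, v) in edges if labels[u] != labels[v]]
    labels = connectivity(labels, edges, rng)
    # Assumed extra step: follow each label once more so earlier hooks see later ones.
    return [labels[l] for l in labels]

edges = [(0, 1), (0, 2), (2, 3), (3, 4), (3, 5), (3, 6), (1, 3), (1, 5), (5, 6), (4, 6)]
result = connectivity(list(range(7)), edges, random.Random(0))
```

Which vertex ends up as the representative depends on the coin tosses, but the partition into components does not: for the example edge list all seven entries of `result` are equal.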

Conclusions/Questions

Perhaps functional programming is adequate for most/all parallel applications.
Collections seem to encourage a functional style even in non-functional languages.
They give fully deterministic results, including partial results.

Quicksort in Multilisp

    (defun quicksort (L) (qs L nil))

    (defun qs (L rest)
      (if (null L) rest
          (let* ((a (car L))
                 (L1 (filter (lambda (b) (< b a)) (cdr L)))
                 (L3 (filter (lambda (b) (>= b a)) (cdr L))))
            (qs L1 (future (cons a (qs L3 rest)))))))

    (defun filter (f L)
      (if (null L) nil
          (if (f (car L))
              (future (cons (car L) (filter f (cdr L))))
              (filter f (cdr L)))))

Quicksort in Multilisp (futures)

    Span = O(n)
    Work = O(n log n)
    Parallelism = Work/Span = O(log n)

Not a very good parallel algorithm

Scan code

    function addscan(A) =
    if (#A <= 1) then [0]
    else let
        sums  = {A[2*i] + A[2*i+1] : i in [0:#A/2]};
        evens = addscan(sums);
        odds  = {evens[i] + A[2*i] : i in [0:#A/2]};
    in interleave(evens,odds);
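The recursive scan can be checked with a direct Python transcription. The sketch assumes, as the pseudocode implicitly does, that the input length is a power of two; the result is the exclusive prefix sum, where element i is the sum of all elements before position i.

```python
def addscan(a):
    """Exclusive prefix sum by pairwise contraction; len(a) must be a power of two."""
    if len(a) <= 1:
        return [0]
    half = len(a) // 2
    sums = [a[2 * i] + a[2 * i + 1] for i in range(half)]   # add adjacent pairs
    evens = addscan(sums)                                   # scan the half-size problem
    odds = [evens[i] + a[2 * i] for i in range(half)]       # fill in the odd positions
    return [x for pair in zip(evens, odds) for x in pair]   # interleave(evens, odds)
```

Every comprehension is a map over independent indices, so each level of the recursion has constant span and the whole scan has span O(log n) with O(n) work.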

Fourier Transform

    function fft(a,w) =
    if #a == 1 then a
    else let
        r = {fft(b, w[0:#w:2]) : b in [a[0:#a:2], a[1:#a:2]]}
    in {a + b * w : a in r[0] ++ r[0]; b in r[1] ++ r[1]; w in w};
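A Python transcription of the FFT shows how the slices and concatenations line up. The slide does not say what `w` holds; the sketch assumes it is the list of the n complex roots of unity, `w[k] = exp(-2*pi*i*k/n)`, built by a hypothetical helper `roots_of_unity`, and that n is a power of two.

```python
import cmath

def roots_of_unity(n):
    """Assumed contents of w: w[k] = exp(-2*pi*i*k/n) for k = 0..n-1."""
    return [cmath.exp(-2j * cmath.pi * k / n) for k in range(n)]

def fft(a, w):
    """Radix-2 FFT over a list whose length is a power of two."""
    if len(a) == 1:
        return a
    # Transform the even- and odd-indexed halves; these two calls are independent.
    r = [fft(b, w[::2]) for b in (a[0::2], a[1::2])]
    # Each half-size result is used twice, once for each half of the output.
    return [x + y * z for x, y, z in zip(r[0] + r[0], r[1] + r[1], w)]
```

Under that assumption the result agrees with the discrete Fourier transform, because the second half of `w` is the negation of the first half, which supplies the minus sign of the usual butterfly.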

Sparse Vector Matrix Multiply

    function sparseMxV(M,v) = {sum({v[i]*w : (i,w) in row}) : row in M};
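In Python the same nested map and reduce looks like this. The sketch assumes each row of `M` is a list of `(column index, value)` pairs for its nonzero entries, which is what the pattern `(i,w) in row` suggests; the sample matrix is invented for the example.

```python
def sparse_mxv(m, v):
    """Multiply a sparse matrix, stored as rows of (column, value) pairs, by a dense vector."""
    return [sum(v[i] * w for i, w in row) for row in m]

# Rows may have different numbers of nonzeros, including none at all.
m = [[(0, 2.0), (2, 1.0)], [], [(1, 3.0)]]
product = sparse_mxv(m, [1.0, 2.0, 3.0])
```

The rows have different lengths, so this is a map over a nested collection with an inner reduction per row, the kind of computation a flat collection language cannot express directly.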

MapReduce

    function mapReduce(MAP,REDUCE,documents) =
    let temp = flatten({MAP(d) : d in documents});
    in flatten({REDUCE(k,vs) : (k,vs) in collect(temp)});

    function wordcount(docs) =
      mapReduce(d => {(w,1) : w in wordify(d)},
                (w,c) => [(w,sum(c))],
                docs);

    wordcount(["this is is document 1", "this is document 2"]);

    [("1",1),("this",2),("is",3),("document",2),("2",1)]
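The same construction can be written in a few lines of Python. This is a sequential sketch: `collect` is implemented with a dictionary, and `str.split` stands in for the slide's `wordify`, whose exact behavior is not given. The order of the output pairs here follows first appearance and need not match the order shown on the slide.

```python
from collections import defaultdict

def collect(pairs):
    """Group (key, value) pairs into (key, [values]) pairs."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return list(groups.items())

def map_reduce(map_fn, reduce_fn, documents):
    temp = [kv for d in documents for kv in map_fn(d)]                    # flatten the mapped results
    return [out for k, vs in collect(temp) for out in reduce_fn(k, vs)]   # reduce per key, then flatten

def wordcount(docs):
    return map_reduce(lambda d: [(w, 1) for w in d.split()],   # emit (word, 1) per word
                      lambda w, c: [(w, sum(c))],              # sum the counts per word
                      docs)

counts = wordcount(["this is is document 1", "this is document 2"])
```

Both the map over documents and the reduce over keys are maps over independent elements, so the only step that requires communication is `collect`.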