F. Gava, HLPP 2005
Frédéric Gava
A Modular Implementation of Parallel Data Structures in
Bulk-Synchronous Parallel ML
Outline
Introduction;
The BSML language;
Implementation of parallel data structures in BSML:
Dictionaries;
Sets;
Load-Balancing.
Application;
Conclusion and future work.
Introduction
Parallel computing for speed;
Too complex for many non-computer scientists;
Need for models/tools of parallelism.
[Figure labels: Automatic Parallelization, Concurrent Programming, Structured Parallelism, Algorithmic Skeletons, BSP, Data Structure Skeletons]
Introduction (bis)
Observations:
Data structures are as important as algorithms;
Symbolic computations make heavy use of these data structures.
Suggested solution, parallel implementations of data structures:
Interfaces as close as possible to the sequential ones;
Modular implementations for straightforward maintenance;
Load-balancing of the data.
BSML
Outline:
Introduction;
BSML;
Parallel Data Structures in BSML;
Application;
Conclusion and future work.
Advantages of the BSP model:
Portability;
Scalability, deadlock free;
Simple cost model for performance prediction.
Advantages of functional programming:
High-level features (higher-order functions, pattern matching, concrete types, etc.);
Safety of the environment;
Program proofs (proofs of BSML programs using Coq).
Bulk-Synchronous Parallelism + Functional Programming = BSML
The BSML Language
Confluent language: deterministic algorithms;
Library for the « Objective Caml » language (called BSMLlib);
Operations to access the BSP parameters;
5 primitives on a parallel data structure called a parallel vector:
mkpar: create a parallel vector;
apply: parallel point-wise application;
put: send values within a vector;
proj: parallel projection;
super: BSP divide-and-conquer.
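The first four primitives can be imitated sequentially to get a feel for their meaning. The sketch below is not BSMLlib: it assumes a parallel vector is simply an array with one slot per processor, a fixed processor count p = 4, and signatures modelled on the descriptions above. super is left out.

```ocaml
(* Sequential model of a parallel vector: one slot per processor.
   This only imitates the semantics; it is not BSMLlib. *)
let p = 4                       (* assumed number of processors *)
type 'a par = 'a array          (* hypothetical representation *)

(* mkpar f: processor i holds (f i). *)
let mkpar (f : int -> 'a) : 'a par = Array.init p f

(* apply: point-wise application of a vector of functions. *)
let apply (fs : ('a -> 'b) par) (xs : 'a par) : 'b par =
  Array.init p (fun i -> fs.(i) xs.(i))

(* put: processor i offers (fs.(i) j) to processor j; afterwards
   processor j can ask which value it received from processor i. *)
let put (fs : (int -> 'a) par) : (int -> 'a) par =
  Array.init p (fun j -> fun i -> fs.(i) j)

(* proj: turn a parallel vector into an ordinary function. *)
let proj (xs : 'a par) : int -> 'a = fun i -> xs.(i)

let pids = mkpar (fun i -> i)
let squares = apply (mkpar (fun _ x -> x * x)) pids
let received = put (mkpar (fun i j -> 10 * i + j))
```

In this model, proj squares 3 is the square of processor 3's identifier, and received.(2) 1 is the value that processor 1 sent to processor 2.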
A BSML Program
[Figure: parallel vectors f0 … fp-1 and g0 … gp-1; labels: Sequential part, Parallel part.]
Superthreads in BSML
[Figure: E1, E2 and super E1 E2 on processors 0, 1, 2, each drawn as a sequence of Communications and Synchronization phases.]
Parallel Data Structures in BSML
Outline:
Introduction;
BSML;
Parallel Data Structures in BSML;
Application;
Conclusion and future work.
General Points
5 modules: Set, Map, Stack, Queue, Hashtable;
Interfaces:
Same as the O’Caml ones;
With some specific parallel functions (skeletons) such as parallel reduction;
Pure functional implementations (for functional data);
Manual or automatic load-balancing.
Modules in O’Caml
Interface:
module type Compare = sig
  type elt
  val compare : elt -> elt -> int
end
Implementation:
module CompareInt = struct
  type elt = int
  let tools = ...
  let compare = ...
end
module AbstractCompareInt = (CompareInt : Compare)
Functor:
module Make (Ord : Compare) = struct
  type elt = Ord.elt
  type t = Empty | Node of t * elt * t * int
  let mem e s = ...
end
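A complete, runnable variant of the three pieces above may help. It is only a sketch: the bodies elided on the slide are filled with assumptions, namely that the int in Node stores the height of the subtree, and that add inserts without rebalancing. The helpers height and node are hypothetical.

```ocaml
module type Compare = sig
  type elt
  val compare : elt -> elt -> int
end

module CompareInt = struct
  type elt = int
  let compare (a : int) (b : int) = compare a b
end

module Make (Ord : Compare) = struct
  type elt = Ord.elt
  type t = Empty | Node of t * elt * t * int   (* assumed: int = height *)

  let height = function Empty -> 0 | Node (_, _, _, h) -> h
  let node l v r = Node (l, v, r, 1 + max (height l) (height r))

  (* Plain binary-search-tree insertion, no rebalancing. *)
  let rec add e = function
    | Empty -> node Empty e Empty
    | Node (l, v, r, _) as s ->
      let c = Ord.compare e v in
      if c = 0 then s
      else if c < 0 then node (add e l) v r
      else node l v (add e r)

  let rec mem e = function
    | Empty -> false
    | Node (l, v, r, _) ->
      let c = Ord.compare e v in
      c = 0 || mem e (if c < 0 then l else r)
end

module IntSet = Make (CompareInt)
let s = IntSet.(add 3 (add 1 (add 2 Empty)))
```

The functor never looks inside elt: any module matching Compare can be passed to Make, which is the property the parallel modules rely on.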
Parallel Dictionaries
A parallel map (dictionary) = a map on each processor:
module Make (Ord : OrderedType) (Bal : BALANCE)
    (MakeLocMap : functor (Ord : OrderedType) -> Map.S with type key = Ord.t) =
struct
  module Local_Map = MakeLocMap (Ord)
  type key = Ord.t
  type 'a t = ('a Local_Map.t par) * int * bool
  type 'a seq_t = 'a Local_Map.t
  (* operators as skeletons *)
end
We need to re-implement all the operations (data skeletons).
Insert a Binding
add: key -> 'a -> 'a t -> 'a t
If rebalanced
Otherwise
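The two cases can be illustrated with a sequential model. Everything here is an assumption made for the sketch, not the slide's algorithm: the parallel map is an array of local maps, a non-rebalanced map stores key k on processor home k, and a boolean records whether a rebalancing step has moved bindings away from their home.

```ocaml
module IntMap = Map.Make (struct type t = int let compare = compare end)

let p = 4                                   (* assumed number of processors *)

(* Hypothetical record standing in for the parallel map. *)
type 'a pmap = { local : 'a IntMap.t array; rebalanced : bool }

let empty () = { local = Array.make p IntMap.empty; rebalanced = false }

let home k = abs (k mod p)                  (* assumed placement rule *)

let add k v m =
  let local = Array.copy m.local in
  if m.rebalanced then begin
    (* The key may live anywhere: reuse its current owner if it is
       already bound, otherwise fall back to the home processor. *)
    let owner = ref (home k) in
    Array.iteri (fun i lm -> if IntMap.mem k lm then owner := i) local;
    local.(!owner) <- IntMap.add k v local.(!owner)
  end else
    (* Otherwise the home processor is the only possible owner. *)
    local.(home k) <- IntMap.add k v local.(home k);
  { m with local }

let find k m =
  if m.rebalanced then
    Array.fold_left
      (fun acc lm -> if acc = None then IntMap.find_opt k lm else acc)
      None m.local
  else IntMap.find_opt k m.local.(home k)

let m = add 5 "five" (add 1 "one" (empty ()))

(* Key 7 sits on processor 0 although its home is processor 3. *)
let moved =
  add 7 "y"
    { local = [| IntMap.singleton 7 "x"; IntMap.empty;
                 IntMap.empty; IntMap.empty |];
      rebalanced = true }
```

On a real machine, the search over all local maps in the rebalanced case is what costs communication.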
Parallel Iterator
let cardinal pmap = ParMap.fold (fun _ _ i -> i+1) pmap 0
Fold needs to respect the order of the keys;
Parallel map → sequential map;
Too many communications…
async_fold: (key -> 'a -> 'b -> 'b) -> 'a t -> 'b -> 'b par
let cardinal pmap = List.fold_left (+) 0 (total (ParMap.async_fold (fun _ _ i -> i+1) pmap 0))
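The difference between the two folds can be seen on a sequential model. This is a sketch under assumptions: the parallel map is an array of local maps, fold gathers them into one sequential map before folding, and async_fold returns one partial result per processor. The names cardinal_seq and cardinal_par are hypothetical.

```ocaml
module IntMap = Map.Make (struct type t = int let compare = compare end)

(* A parallel map modelled as one local map per processor. *)
let pmap : string IntMap.t array =
  [| IntMap.(add 0 "a" (add 4 "e" empty));
     IntMap.(add 1 "b" empty);
     IntMap.empty;
     IntMap.(add 3 "d" (add 7 "h" (add 11 "l" empty))) |]

(* Ordered fold: gather everything into one sequential map first,
   which is where the communication cost would come from. *)
let fold f pm init =
  let merged =
    Array.fold_left
      (fun acc lm -> IntMap.union (fun _ v _ -> Some v) acc lm)
      IntMap.empty pm in
  IntMap.fold f merged init

(* Asynchronous fold: each processor folds its own local map;
   the result is one partial value per processor. *)
let async_fold f pm init = Array.map (fun lm -> IntMap.fold f lm init) pm

let cardinal_seq pm = fold (fun _ _ i -> i + 1) pm 0
let cardinal_par pm =
  Array.fold_left ( + ) 0 (async_fold (fun _ _ i -> i + 1) pm 0)
```

Both versions count the same bindings; async_fold only works because the combining operation (+) does not depend on key order.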
Parallel Sets
A sub-set on each processor;
Insertion/iteration as for parallel maps;
But with some binary skeletons;
Load-balancing of pairs of parallel sets using superposition.
Difference
3 cases:
Two normal parallel sets;
One of the parallel sets has been rebalanced;
The two parallel sets have been rebalanced;
This implies a problem with duplicate elements.
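Why rebalancing matters for a binary operation can be shown on a sequential model. This sketch does not reproduce the slide's algorithm and merges the two rebalanced cases into one. It assumes a non-rebalanced set keeps element x on processor home x, so equal elements of two normal sets share a processor, while a rebalanced set may keep them anywhere.

```ocaml
module IntSet = Set.Make (struct type t = int let compare = compare end)

let p = 3                                   (* assumed number of processors *)
let home x = abs (x mod p)                  (* assumed placement rule *)

(* Hypothetical record standing in for the parallel set. *)
type pset = { local : IntSet.t array; rebalanced : bool }

let of_list xs =
  let local = Array.make p IntSet.empty in
  List.iter (fun x -> local.(home x) <- IntSet.add x local.(home x)) xs;
  { local; rebalanced = false }

let diff s1 s2 =
  if not s1.rebalanced && not s2.rebalanced then
    (* Two normal sets: the difference is purely local. *)
    { s1 with
      local = Array.mapi (fun i l -> IntSet.diff l s2.local.(i)) s1.local }
  else
    (* Elements may sit anywhere: every processor needs all of s2
       (a total exchange on a real machine). *)
    let all2 = Array.fold_left IntSet.union IntSet.empty s2.local in
    { s1 with local = Array.map (fun l -> IntSet.diff l all2) s1.local }

let elements s =
  IntSet.elements (Array.fold_left IntSet.union IntSet.empty s.local)

let s1 = of_list [0; 1; 2; 3; 4; 5]
let s2 = of_list [1; 3; 5]
(* The same elements as s2, all moved to processor 0. *)
let s2_moved =
  { local = [| IntSet.of_list [1; 3; 5]; IntSet.empty; IntSet.empty |];
    rebalanced = true }
```

A purely local difference against s2_moved would wrongly keep 1 and 5, since they no longer sit on the processors that hold them in s1.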
Difference (third case)
[Figure: parallel sets S1 and S2.]
Load-Balancing (1)
« Same sizes » of the local data structures;
Better performance for parallel iterations;
Load-balancing in 2 super-steps (M. Bamha and G. Hains) using a histogram.
Load-Balancing (2)
Generic code of the algorithm:
[Type signature of rebalance not recoverable from the transcript; the surviving fragments are par, int list, list and int par.]
[Figure labels: Histogram, Select « n » messages, Union of messages and data.]
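The role of the histogram can be sketched as follows. This is one simple greedy plan, not necessarily the algorithm of Bamha and Hains: it assumes the first super-step has given every processor the list of local sizes (the histogram), so each processor can compute the same transfer plan locally, and the second super-step performs the transfers. The names plan and apply_plan are hypothetical.

```ocaml
(* Given the histogram of local sizes, compute transfers (src, dst, n)
   so that every processor ends with the floor or ceiling of the mean. *)
let plan (hist : int array) : (int * int * int) list =
  let p = Array.length hist in
  let total = Array.fold_left ( + ) 0 hist in
  let target i = total / p + (if i < total mod p then 1 else 0) in
  let surplus = Array.mapi (fun i n -> n - target i) hist in
  let rec go src dst acc =
    if src >= p || dst >= p then List.rev acc
    else if surplus.(src) <= 0 then go (src + 1) dst acc
    else if surplus.(dst) >= 0 then go src (dst + 1) acc
    else begin
      (* src has too much, dst too little: move as much as both allow. *)
      let n = min surplus.(src) (- surplus.(dst)) in
      surplus.(src) <- surplus.(src) - n;
      surplus.(dst) <- surplus.(dst) + n;
      go src dst ((src, dst, n) :: acc)
    end
  in
  go 0 0 []

(* Sizes after performing the transfers of a plan. *)
let apply_plan hist moves =
  let h = Array.copy hist in
  List.iter (fun (s, d, n) -> h.(s) <- h.(s) - n; h.(d) <- h.(d) + n) moves;
  h

let hist = [| 10; 2; 0; 4 |]
let moves = plan hist
let balanced = apply_plan hist moves
```

In the generic code, the Select step would pick the n elements to send and the Union step would merge received elements into the local structure; only the counting part is modelled here.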
Application
Outline:
Introduction;
BSML;
Parallel Data Structures in BSML;
Application;
Conclusion and future work.
Computation of the « nth » nearest-neighbor atoms in a molecule
Code from « Objective Caml for Scientists » (J. Harrop);
Molecule as an infinitely repeated graph of atoms;
Computation of set differences (the neighbors);
Replace « fold » with « async_fold »;
Experiments with a silicate of 100,000 atoms on a cluster of 5/10 machines (Pentium IV, 2.8 GHz, Gigabit Ethernet card).
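The set-difference formulation can be sketched on a small finite graph. This is not the code from the book: the toy molecule (a ring of 8 atoms) and the names bonds, neighbours, shells and nth are hypothetical, and sequential sets stand in for the parallel ones. The nth shell is obtained by expanding shell n-1 and removing the atoms already seen in shells n-1 and n-2.

```ocaml
module IntSet = Set.Make (struct type t = int let compare = compare end)

(* Hypothetical toy molecule: a ring of 8 atoms, each bonded
   to its two neighbours on the ring. *)
let bonds i = IntSet.of_list [ (i + 1) mod 8; (i + 7) mod 8 ]

(* All atoms bonded to some atom of s. *)
let neighbours s =
  IntSet.fold (fun i acc -> IntSet.union (bonds i) acc) s IntSet.empty

(* shells n origin returns (shell n, shell n-1). *)
let rec shells n origin =
  if n = 0 then (IntSet.singleton origin, IntSet.empty)
  else
    let s1, s2 = shells (n - 1) origin in
    (IntSet.diff (IntSet.diff (neighbours s1) s1) s2, s1)

(* nth-nearest neighbours of origin. *)
let nth n origin = fst (shells n origin)
```

The fold inside neighbours only builds a union, so its result does not depend on traversal order; that is the kind of fold the slide replaces with async_fold.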
Experiments (1), (2), (3)
[Figures not included in the transcript.]
Conclusion and Future Work
Outline:
Introduction;
BSML;
Parallel Data Structures in BSML;
Application;
Conclusion and future work.
Conclusion
BSML = BSP + ML;
Implementation of some data structures;
Modular, for simple development and maintenance;
Pure functional implementation;
Cost prediction with the BSP model;
Generic load-balancing;
Application.
Future Work
Proof of the implementations (pure functional);
Implementation of other data structures (tree, priority list, etc.);
Application to other scientific problems;
Comparison with other parallel MLs (OCamlP3L, HirondML, OCaml-Flight, MSPML, etc.);
Development of a modular and parallel graph library:
Edges as parallel maps;
Vertices as parallel sets.