european research network on foundations, software infrastructures and applications for large scale...
TRANSCRIPT
European Research Network on Foundations, Software Infrastructures and Applications for large scale distributed, GRID and Peer-to-Peer Technologies
Experiences Deploying Parallel Applications on a Large-scale Grid
Rob van Nieuwpoort, Jason Maassen, Andrei Agapi, Ana-Maria Oprescu,
Thilo Kielmann
Vrije Universiteit, Amsterdam
European Research Network on Foundations, Software Infrastructures and Applications for large scale distributed, GRID and Peer-to-Peer Technologies
European Research Network on Foundations, Software Infrastructures and Applications for large scale distributed, GRID and Peer-to-Peer Technologies
H.M. Beatrix,
Koningin der Nederlanden
European Research Network on Foundations, Software Infrastructures and Applications for large scale distributed, GRID and Peer-to-Peer Technologies
The N-Queens Contest
• Challenge: solve the most board solutions
within 1 hour
• Testbed:
– Grid5000, DAS-2, some smaller clusters
– Globus, NorduGrid, LCG, ???
– In fact, there was not too much precise
information available in advance...
European Research Network on Foundations, Software Infrastructures and Applications for large scale distributed, GRID and Peer-to-Peer Technologies
Computing in an Unknown Grid?
• Heterogeneous machines (architectures,
compilers, etc.)
– Use Java: “write once, run anywhere”
Use Ibis!
• Heterogeneous machines (fast / slow,
small / big clusters)
– Use automatic load balancing (divide-
and-conquer)
Use Satin!
• Heterogeneous middleware (job
submission interfaces, etc.)
– Use the Grid Application Toolkit (GAT)!
European Research Network on Foundations, Software Infrastructures and Applications for large scale distributed, GRID and Peer-to-Peer Technologies
The Ibis Grid Programming System
European Research Network on Foundations, Software Infrastructures and Applications for large scale distributed, GRID and Peer-to-Peer Technologies
The Ibis System
• Java centric: “write once, run anywhere”
• Efficient communication (pure Java or native)
• Parallel programming models:
– RMI (remote method invocation)
– GMI (group method invocation)• Collective communication (MPI-like) and more
– RepMI (replicated method invocation)• Strong consistency
– Satin (divide and conquer)
– MPJ (Java binding for MPI)
European Research Network on Foundations, Software Infrastructures and Applications for large scale distributed, GRID and Peer-to-Peer Technologies
Satin: Divide-and-conquer
• Effective paradigm for Grid applications
(hierarchical)
• Satin: Grid-friendly load balancing (aware
of cluster hierarchy)
• Also support for
– Fault tolerance
– Malleability
– Migration
fib(1) fib(0) fib(0)
fib(0)
fib(4)
fib(1)
fib(2)
fib(3)
fib(3)
fib(5)
fib(1) fib(1)
fib(1)
fib(2) fib(2)cpu 2
cpu 1cpu 3
cpu 1
European Research Network on Foundations, Software Infrastructures and Applications for large scale distributed, GRID and Peer-to-Peer Technologies
Satin Example: Fibonacci
fib(1) fib(0) fib(0)
fib(0)
fib(4)
fib(1)
fib(2)
fib(3)
fib(3)
fib(5)
fib(1) fib(1)
fib(1)
fib(2) fib(2)
class Fib { int fib (int n) {
if (n < 2) return n;int x = fib(n-1);int y = fib(n-2);return x + y;
}}
Single-threaded
Java
European Research Network on Foundations, Software Infrastructures and Applications for large scale distributed, GRID and Peer-to-Peer Technologies
Satin Example: Fibonacci
public interface FibInter extends ibis.satin.Spawnable { public int fib (int n);}
class Fib extends ibis.satin.SatinObjectimplements FibInter { public int fib (int n) { if (n < 2) return n; int x = fib(n-1); /*spawned*/ int y = fib(n-2); /*spawned*/ sync(); return x + y; }}
(use byte code rewriting to generate parallel code)
Leiden
Delft
Rennes
Internet
Sophia
European Research Network on Foundations, Software Infrastructures and Applications for large scale distributed, GRID and Peer-to-Peer Technologies
Satin: Grid-friendly load balancing
(aware of cluster hierarchy)
• Random Stealing (RS)– Provably optimal on a single cluster (Cilk)
– Problems on multiple clusters:
• (C-1)/C % stealing over WAN
• Synchronous protocol
• Satin: Cluster-aware Random Stealing
(CRS) – When idle:
• Send asynchronous steal request to random node
in different cluster
• In the meantime steal locally (synchronously)
• Only one wide-area steal request at a time
European Research Network on Foundations, Software Infrastructures and Applications for large scale distributed, GRID and Peer-to-Peer Technologies
Summary: Ibis
• Java: “write once, run anywhere”
• Ibis provides efficient communication
among parallel processes
• Satin provides highly-efficient load
balancing
among nodes of multiples clusters
• But how do we deploy our Ibis / Satin
application?
European Research Network on Foundations, Software Infrastructures and Applications for large scale distributed, GRID and Peer-to-Peer Technologies
The Grid Application Toolkit (GAT)
European Research Network on Foundations, Software Infrastructures and Applications for large scale distributed, GRID and Peer-to-Peer Technologies
The Grid Application Toolkit (GAT)
• Simple and uniform API to various Grid
middleware:
– Globus 2,3,4, ssh, Unicore, ...
• Job submission, remote file access,
job monitoring and steering
• Implementations:
– C, with wrappers for C++ and Python
– Java
European Research Network on Foundations, Software Infrastructures and Applications for large scale distributed, GRID and Peer-to-Peer Technologies
The Beatrix Architecture(or: how we thought we could get away with)
European Research Network on Foundations, Software Infrastructures and Applications for large scale distributed, GRID and Peer-to-Peer Technologies
The Beatrix Architecture(what we finally ended up with)
European Research Network on Foundations, Software Infrastructures and Applications for large scale distributed, GRID and Peer-to-Peer Technologies
Why all these changes?
• Scalability
– DAS-2 has O(100)
nodes
– Grid5000 has
O(1000) nodes
• Network connectivity
– Theory meets
practice
• The is in the
details...
European Research Network on Foundations, Software Infrastructures and Applications for large scale distributed, GRID and Peer-to-Peer Technologies
Changing Satin: Scalability
Satin uses random work stealing (CRS):
• Requires connections between all Ibis
nodes
• Old solution: all nodes connect to each
other at startup time
– previously, firewalls have signaled this
as
“Denial of Service” attack
• Problem: connections fail due to TCP
server socket backlog
– typically, 50 pending connection
requests are queued
– the rest will fail
• New solution: connect “on demand” with
steal requests
European Research Network on Foundations, Software Infrastructures and Applications for large scale distributed, GRID and Peer-to-Peer Technologies
Changing the Ibis Name Server: Scalability
Ibis name server provides totally ordered joins
– Broadcast join messages to all nodes
– Old: 1-by-1,
sequentially to all nodes
– Problem: one hanging
connection slows down all nodes
– New: message combining and
multiple, sending threads
European Research Network on Foundations, Software Infrastructures and Applications for large scale distributed, GRID and Peer-to-Peer Technologies
Changing the TCP Driver: Connectivity
Nodes advertise their IP address (to connect
to)
to the name server:
• Grid5000 nodes have up to 5 IP addresses:
which one?
– Global and local (RFC1918), like 10.x.y.z
– Address registered in DNS
– Some are routed, some are not
• Problem: incomplete and inconsistent
information
• Solution: manual configuration,
TCP-Ibis tries out multiple addresses
European Research Network on Foundations, Software Infrastructures and Applications for large scale distributed, GRID and Peer-to-Peer Technologies
Results achieved on the Grid5000 Testbed
Site CPUs426
Bordeaux 92120128196
Total: 960
Orsay
Rennes, Opteron clusterRennes, Xeon clusterSophia Antipolis
• Solved n=22 in 25 minutes
• 4.7 million jobs, 800,000 load balancing
messages
European Research Network on Foundations, Software Infrastructures and Applications for large scale distributed, GRID and Peer-to-Peer Technologies
More Results...
European Research Network on Foundations, Software Infrastructures and Applications for large scale distributed, GRID and Peer-to-Peer Technologies
Conclusions
• We used a mix of software:
– Ibis with Satin
– Java GAT with ProActive and ssh
adaptors
• Scaling from 100 to 1000 processors
required some
redesign of Ibis and Satin
• Connectivity in Grid5000 is non-trivial
(we still cannot connect Grid5000 with
DAS-2)
• Ibis / Satin / GAT / ProActive has shown to
be a viable and
efficient Grid computing platform