![Page 1: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/1.jpg)
DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer
Henri Bal
bal@cs.vu.nl
Vrije Universiteit Amsterdam
![Page 2: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/2.jpg)
Introduction
● DAS: shared distributed infrastructure for experimental computer science research
  ● Controlled experiments for CS, not production
● 14 years of experience with funding & research
● Huge impact on Dutch CS
DAS-1 DAS-2 DAS-3 DAS-4
![Page 3: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/3.jpg)
Overview
● Historical overview of the DAS systems
● How to organize a national CS testbed
● Main research results at the VU
![Page 4: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/4.jpg)
Outline
● DAS (pre-)history
● DAS organization
● DAS-1 – DAS-4 systems
● DAS-1 research
● DAS-2 research
● DAS-3 research
● DAS-4 research (early results)
● DAS conclusions
![Page 5: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/5.jpg)
Historical perspective
[Figure: timeline from 1980 to 2010 showing mainframes, minicomputers, networks of workstations (NOWs), collections of workstations (COWs), processor pools and clusters, the Beowulf cluster, the metacomputing paper, flocking Condors, the Grid blueprint book, grids, e-Science, optical networks, clouds, heterogeneous computing and Green IT, with the pre-DAS and DAS periods marked]
![Page 6: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/6.jpg)
VU (pre-)history
● Andy Tanenbaum already built a cluster around 1984
  ● Pronet token ring network
  ● 8086 CPUs
  ● (no pictures available)
● He built several Amoeba processor pools
  ● MC68000, MC68020, MicroSparc
  ● VME bus, 10 Mb/s Ethernet, Myrinet, ATM
![Page 7: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/7.jpg)
Amoeba processor pool (Zoo, 1994)
![Page 8: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/8.jpg)
DAS-1 background: ASCI
● Research schools (Dutch product from 1990s)
  ● Stimulate top research & collaboration
  ● Organize Ph.D. education
● ASCI (1995):
  ● Advanced School for Computing and Imaging
  ● About 100 staff & 100 Ph.D. students from TU Delft, Vrije Universiteit, Amsterdam, Leiden, Utrecht, TU Eindhoven, TU Twente, …
![Page 9: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/9.jpg)
Motivation for DAS-1
● CS/ASCI needs its own infrastructure for
  ● Systems research and experimentation
  ● Distributed experiments
  ● Doing many small, interactive experiments
● Need distributed experimental system, rather than centralized production supercomputer
![Page 10: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/10.jpg)
Funding
● DAS proposals written by ASCI committees
  ● Chaired by Tanenbaum (DAS-1), Bal (DAS 2-4)
● NWO (national science foundation) funding for all 4 systems (100% success rate)
  ● About 900 K€ funding per system, 300 K€ matching by participants, extra funding from VU
● ASCI committee also acts as steering group
● ASCI is formal owner of DAS
![Page 11: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/11.jpg)
Goals of DAS-1
● Goals of DAS systems:
  ● Ease collaboration within ASCI
  ● Ease software exchange
  ● Ease systems management
  ● Ease experimentation
● Want a clean, laboratory-like system
● Keep DAS simple and homogeneous
  ● Same OS, local network, CPU type everywhere
  ● Single (replicated) user account file
[ACM SIGOPS 2000] (paper with 50 authors)
![Page 12: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/12.jpg)
Behind the screens ….
Source: Tanenbaum (ASCI’97 conference)
![Page 13: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/13.jpg)
DAS-1 (1997-2002)
A homogeneous wide-area system
Sites: VU (128), Amsterdam (24), Leiden (24), Delft (24)
Wide-area network: 6 Mb/s ATM
● 200 MHz Pentium Pro
● 128 MB memory
● Myrinet interconnect
● BSDI → Red Hat Linux
● Built by Parsytec
![Page 14: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/14.jpg)
BSDI Linux
![Page 15: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/15.jpg)
DAS-2 (2002-2007)
A Computer Science Grid
Sites: VU (72), Amsterdam (32), Leiden (32), Delft (32), Utrecht (32)
Wide-area network: SURFnet, 1 Gb/s
● Two 1 GHz Pentium-3s
● ≥1 GB memory
● 20-80 GB disk
● Myrinet interconnect
● Red Hat Enterprise Linux
● Globus 3.2
● PBS / Sun Grid Engine
● Built by IBM
![Page 16: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/16.jpg)
DAS-3 (2007-2010)
An optical grid
Sites: VU (85), TU Delft (68), Leiden (32), UvA/MultimediaN (40/46)
Wide-area network: SURFnet6, 10 Gb/s lambdas
● Dual AMD Opterons
● 4 GB memory
● 250-1500 GB disk
● More heterogeneous: 2.2-2.6 GHz, single/dual core nodes
● Myrinet-10G (exc. Delft)
● Gigabit Ethernet
● Scientific Linux 4
● Globus, SGE
● Built by ClusterVision
![Page 17: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/17.jpg)
DAS-4 (2011)
Testbed for clouds, diversity & Green IT
Sites: VU (74), TU Delft (32), Leiden (16), UvA/MultimediaN (16/36), ASTRON (23)
Wide-area network: SURFnet6, 10 Gb/s lambdas
● Dual quad-core Xeon E5620
● 24-48 GB memory
● 1-10 TB disk
● Infiniband + 1 Gb/s Ethernet
● Various accelerators (GPUs, multicores, …)
● Scientific Linux
● Bright Cluster Manager
● Built by ClusterVision
![Page 18: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/18.jpg)
Performance
| | DAS-1 | DAS-2 | DAS-3 | DAS-4 |
|---|---|---|---|---|
| # CPU cores | 200 | 400 | 792 | 1600 |
| SPEC CPU2000 INT (base) | 78.5 | 454 | 1445 | 4620 |
| SPEC CPU2000 FP (base) | 69.0 | 329 | 1858 | 6160 |
| 1-way latency MPI (µs) | 21.7 | 11.2 | 2.7 | 1.9 |
| Max. throughput (MB/s) | 75 | 160 | 950 | 2700 |
| Wide-area bandwidth (Mb/s) | 6 | 1000 | 40000 | 40000 |
![Page 19: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/19.jpg)
Impact of DAS
● Major incentive for VL-e 20 M€ funding
  ● Virtual Laboratory for e-Science
● Collaboration with SURFnet on DAS-3 & DAS-4
  ● SURFnet provides dedicated 10 Gb/s light paths
● About 10 Ph.D. theses per year use DAS
● Many other research grants
![Page 20: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/20.jpg)
Major VU grants
• DAS-1:
  • NWO Albatross; VU USF (0.4M)
• DAS-2:
  • NWO Ibis, Networks; FP5 GridLab (0.4M)
• DAS-3:
  • NWO StarPlane, JADE-MM, AstroStream; STW SCALP; BSIK VL-e (1.6M); VU-ERC (1.1M); FP6 XtreemOS (1.2M)
• DAS-4:
  • NWO GreenClouds (0.6M); FP7 CONTRAIL (1.4M); BSIK COMMIT (0.9M)
• Total: ~9M grants, <2M NWO/VU investments
![Page 21: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/21.jpg)
Outline
● DAS (pre-)history
● DAS organization
● DAS-1 – DAS-4 systems
● DAS-1 research
● DAS-2 research
● DAS-3 research
● DAS-4 research (early results)
● DAS conclusions
![Page 22: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/22.jpg)
DAS research agenda
• Pre-DAS: Cluster computing
• DAS-1: Wide-area computing
• DAS-2: Grids & P2P computing
• DAS-3: e-Science & optical Grids
• DAS-4: Clouds, diversity & green IT
![Page 23: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/23.jpg)
Overview VU research

| | Algorithms & applications | Programming systems |
|---|---|---|
| DAS-1 | Albatross | Manta, MagPIe |
| DAS-2 | Search algorithms, Awari | Satin |
| DAS-3 | StarPlane, Model checking | Ibis |
| DAS-4 | Multimedia analysis, Semantic web | Ibis |
![Page 24: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/24.jpg)
DAS-1 research
● Albatross: optimize wide-area algorithms
● Manta: fast parallel Java
● MagPIe: fast wide-area collective operations
![Page 25: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/25.jpg)
Albatross project
● Study algorithms and applications for wide-area parallel systems
● Basic assumption: wide-area system is hierarchical
  ● Connect clusters, not individual workstations
● General approach
  ● Optimize applications to exploit hierarchical structure, so that most communication is local
[HPCA 1999]
![Page 26: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/26.jpg)
Successive Overrelaxation
● Problem: nodes at cluster boundaries
● Optimizations:
  ● Overlap wide-area communication with computation
  ● Send boundary row once every X iterations (slower convergence, less communication)
[Figure: CPU 1-3 in Cluster 1 and CPU 4-6 in Cluster 2, with latency labels of 50 µs and 5600 µs]
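The "send boundary row once every X iterations" optimization can be illustrated with a small simulation. This is a minimal sketch, not the Albatross code: it assumes a 1D Laplace problem instead of a 2D grid, two clusters, and a hypothetical function `sor_two_clusters`. The relaxation factor defaults to `omega=1.0` (plain Gauss-Seidel) so that the iteration stays safely convergent when boundary values are stale.

```python
def sor_two_clusters(n=16, exchange_every=1, omega=1.0, tol=1e-8, max_iters=100_000):
    """Relax u'' = 0 with u[0] = 0 and u[n-1] = 1, split over two simulated clusters.

    Cells 1..mid-1 belong to cluster 1 and cells mid..n-2 to cluster 2. Each
    cluster sees the other's boundary cell only through a ghost copy that is
    refreshed every `exchange_every` iterations (one wide-area message each way).
    """
    mid = n // 2
    u = [0.0] * n
    u[-1] = 1.0
    ghost_right = u[mid]       # cluster 1's possibly stale copy of cell mid
    ghost_left = u[mid - 1]    # cluster 2's possibly stale copy of cell mid-1
    messages = 0
    for iteration in range(1, max_iters + 1):
        if (iteration - 1) % exchange_every == 0:
            ghost_right, ghost_left = u[mid], u[mid - 1]
            messages += 2
        for i in range(1, n - 1):
            left = ghost_left if i == mid else u[i - 1]
            right = ghost_right if i == mid - 1 else u[i + 1]
            u[i] = (1 - omega) * u[i] + omega * 0.5 * (left + right)
        # Convergence is judged on the true values, not on the ghost copies.
        residual = max(abs(u[i] - 0.5 * (u[i - 1] + u[i + 1])) for i in range(1, n - 1))
        if residual < tol:
            return u, iteration, messages
    raise RuntimeError("did not converge")


if __name__ == "__main__":
    for x in (1, 4):
        _, iterations, messages = sor_two_clusters(exchange_every=x)
        print(f"exchange every {x}: {iterations} iterations, {messages} wide-area messages")
```

Running it for different values of `exchange_every` shows the trade-off named on the slide: the count of wide-area messages is tied to the exchange interval, while the number of iterations needed typically grows.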
![Page 27: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/27.jpg)
Wide-area algorithms
● Discovered numerous optimizations that reduce wide-area overhead
  ● Caching, load balancing, message combining …
● Performance comparison between
  ● 1 small (15 node) cluster, 1 big (60 node) cluster, wide-area (4*15 nodes) system
![Page 28: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/28.jpg)
Manta: high-performance Java
● Native compilation (Java to executable)
● Fast RMI protocol
  ● Compiler-generated serialization routines
● Factor 35 lower latency than JDK RMI
● Used for writing wide-area applications
[ACM TOPLAS 2001]
![Page 29: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/29.jpg)
MagPIe: wide-area collective communication
● Collective communication among many processors
  ● e.g., multicast, all-to-all, scatter, gather, reduction
● MagPIe: MPI’s collective operations optimized for hierarchical wide-area systems
● Transparent to application programmer
[PPoPP’99]
![Page 30: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/30.jpg)
Spanning-tree broadcast
Cluster 1 Cluster 2 Cluster 3 Cluster 4
● MPICH (WAN-unaware)
  ● Wide-area latency is chained
  ● Data is sent multiple times over same WAN-link
● MagPIe (WAN-optimized)
  ● Each sender-receiver path contains ≤1 WAN-link
  ● No data item travels multiple times to same cluster
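The difference between the two broadcast trees can be made concrete by counting wide-area messages. The sketch below is an illustration under assumptions, not the MPICH or MagPIe implementation: it assumes the WAN-unaware tree is a binomial tree over ranks, and that the cluster-aware tree sends once to a coordinator per cluster and then fans out locally. All function names are hypothetical.

```python
def binomial_tree_edges(num_procs):
    """(sender, receiver) edges of a binomial broadcast tree rooted at rank 0."""
    edges = []
    step = 1
    while step < num_procs:
        for sender in range(step):
            receiver = sender + step
            if receiver < num_procs:
                edges.append((sender, receiver))
        step *= 2
    return edges


def cluster_aware_edges(cluster_of, root=0):
    """Root sends once to one coordinator per cluster; coordinators forward locally."""
    coordinators = {}
    for rank, cluster in enumerate(cluster_of):
        coordinators.setdefault(cluster, rank)   # lowest rank in each cluster
    coordinators[cluster_of[root]] = root        # the root coordinates its own cluster
    edges = [(root, c) for c in coordinators.values() if c != root]
    for rank, cluster in enumerate(cluster_of):
        if rank != coordinators[cluster]:
            edges.append((coordinators[cluster], rank))
    return edges


def wan_stats(edges, cluster_of, root=0):
    """Return (number of WAN messages, max WAN links on any root-to-receiver path)."""
    wan_messages = sum(1 for s, r in edges if cluster_of[s] != cluster_of[r])
    parent = {r: s for s, r in edges}

    def wan_hops(rank):
        hops = 0
        while rank != root:
            hops += cluster_of[parent[rank]] != cluster_of[rank]
            rank = parent[rank]
        return hops

    return wan_messages, max(wan_hops(r) for r in range(len(cluster_of)))


if __name__ == "__main__":
    clusters = [rank // 4 for rank in range(16)]   # 4 clusters of 4 processors
    print("WAN-unaware:", wan_stats(binomial_tree_edges(16), clusters))
    print("cluster-aware:", wan_stats(cluster_aware_edges(clusters), clusters))
```

In this model the cluster-aware tree crosses each wide-area link at most once per path, matching the two properties listed for MagPIe above; the flat local fan-out is a simplification.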
![Page 31: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/31.jpg)
DAS-2 research
● Satin: wide-area divide-and-conquer
● Search algorithms
● Solving Awari
[Figure: call tree of fib(5), from fib(5) down to fib(1) and fib(0), with subtrees assigned to cpu 1, cpu 2 and cpu 3]
![Page 32: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/32.jpg)
Satin: parallel divide-and-conquer
● Divide-and-conquer is inherently hierarchical
● More general than master/worker
● Satin: Cilk-like primitives (spawn/sync) in Java
[Figure: call tree of fib(5), from fib(5) down to fib(1) and fib(0), with subtrees assigned to cpu 1, cpu 2 and cpu 3]
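The spawn/sync style can be sketched on the fib example from the figure. This is not Satin's API (Satin is a Java system); it is a minimal Python analogy in which `pool.submit` plays the role of spawn and `Future.result` the role of sync. The `spawn_depth` cut-off and the function names are assumptions made for the sketch, and Python threads will not give real speedup here.

```python
from concurrent.futures import ThreadPoolExecutor


def fib(n, pool, spawn_depth):
    """Divide-and-conquer Fibonacci; subproblems are spawned near the top of the tree."""
    if n < 2:
        return n
    if spawn_depth == 0:
        # Below the cut-off, recurse sequentially to keep the number of tasks bounded.
        return fib(n - 1, pool, 0) + fib(n - 2, pool, 0)
    child = pool.submit(fib, n - 1, pool, spawn_depth - 1)   # "spawn"
    other = fib(n - 2, pool, spawn_depth - 1)                # keep working locally
    return child.result() + other                            # "sync"


def parallel_fib(n, spawn_depth=3):
    # At most 2**spawn_depth - 1 tasks are ever submitted, so a pool of
    # 2**spawn_depth workers cannot deadlock on blocked parents.
    with ThreadPoolExecutor(max_workers=2 ** spawn_depth) as pool:
        return fib(n, pool, spawn_depth)


if __name__ == "__main__":
    print(parallel_fib(5))
```

The recursive structure is what makes the approach hierarchical: each spawned subtree is an independent job that a load balancer could move to another processor or cluster.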
![Page 33: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/33.jpg)
Satin
● Grid-aware load-balancing [PPoPP’01]
● Supports malleability (nodes joining/leaving) and fault-tolerance (nodes crashing) [IPDPS’05]
● Self-adaptive [PPoPP’07]
● Range of applications (SAT-solver, N-body simulation, raytracing, grammar learning, ….) [ACM TOPLAS 2010]
Ph.D. theses: van Nieuwpoort (2003), Wrzesinska (2007)
![Page 34: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/34.jpg)
Satin on wide-area DAS-2
[Figure: bar chart of speedup (0-70) for Fibonacci, Adaptive integration, Set cover, Fib. threshold, IDA*, Knapsack, N choose K, N queens, Prime factors, Raytracer and TSP, comparing a single cluster of 64 machines with 4 clusters of 16 machines]
![Page 35: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/35.jpg)
Search algorithms
● Parallel search: partition search space
● Many applications have transpositions
  ● Reach same position through different moves
● Transposition table:
  ● Hash table storing positions that have been evaluated before
  ● Shared among all processors
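A transposition table can be sketched on a toy game. The example below is an assumption chosen for brevity, not one of the applications from the talk: players alternately take 1 to 3 tokens, and whoever takes the last token wins. Taking 1 then 2 tokens reaches the same position as taking 2 then 1, which is exactly a transposition. The names `wins` and `search` are hypothetical.

```python
def wins(tokens, table, stats):
    """True if the player to move can force a win with `tokens` left."""
    if table is not None and tokens in table:
        stats["hits"] += 1               # position was evaluated before
        return table[tokens]
    stats["evaluated"] += 1
    # A position is won if some move leads to a position lost for the opponent.
    result = any(not wins(tokens - take, table, stats)
                 for take in (1, 2, 3) if take <= tokens)
    if table is not None:
        table[tokens] = result
    return result


def search(tokens, use_table):
    stats = {"evaluated": 0, "hits": 0}
    table = {} if use_table else None
    return wins(tokens, table, stats), stats


if __name__ == "__main__":
    print(search(20, use_table=True))
    print(search(20, use_table=False))
```

With the table, each position is evaluated once; without it, the same positions are re-searched along every move order that reaches them.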
![Page 36: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/36.jpg)
Distributed transposition tables
● Partitioned tables
  ● 10,000s of synchronous messages per second
  ● Poor performance even on a cluster
● Replicated tables
  ● Broadcasting doesn’t scale (many updates)
![Page 37: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/37.jpg)
Transposition Driven Scheduling
● Send job asynchronously to owner of table entry
  ● Can be overlapped with computation
  ● Random (=good) load balancing
  ● Delayed & combined into fewer large messages
Bulk transfers are far more efficient on most networks
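The idea of sending work to the owner of the table entry, and combining jobs into larger messages, can be sketched as a single-process simulation. Everything here is an assumption for illustration: the toy search space (grid positions reached by moving right or up), the `owner` hash, the `batch_size` parameter, and the rule that leftover buffers are flushed when all processors are idle. It is not the TDS implementation from the cited paper.

```python
from collections import deque


def owner(position, num_procs):
    """Hypothetical hash that assigns each position to one processor's table partition."""
    x, y = position
    return (31 * x + y) % num_procs


def tds_simulation(grid=6, num_procs=4, batch_size=8):
    inbox = [deque() for _ in range(num_procs)]
    table = [set() for _ in range(num_procs)]        # partitioned transposition table
    outbuf = [[[] for _ in range(num_procs)] for _ in range(num_procs)]
    messages = 0
    expanded = 0

    def flush(src, dst):
        nonlocal messages
        if outbuf[src][dst]:
            inbox[dst].extend(outbuf[src][dst])      # one combined message
            outbuf[src][dst] = []
            messages += 1

    start = (0, 0)
    inbox[owner(start, num_procs)].append(start)
    while True:
        progressed = False
        for me in range(num_procs):
            while inbox[me]:
                progressed = True
                x, y = position = inbox[me].popleft()
                if position in table[me]:
                    continue                         # transposition: already searched
                table[me].add(position)
                expanded += 1
                for child in ((x + 1, y), (x, y + 1)):
                    if child[0] < grid and child[1] < grid:
                        dst = owner(child, num_procs)
                        if dst == me:
                            inbox[me].append(child)
                        else:
                            outbuf[me][dst].append(child)   # delay and combine
                            if len(outbuf[me][dst]) >= batch_size:
                                flush(me, dst)
        if not progressed:
            pending = [(s, d) for s in range(num_procs)
                       for d in range(num_procs) if outbuf[s][d]]
            if not pending:
                return expanded, messages
            for s, d in pending:
                flush(s, d)


if __name__ == "__main__":
    print(tds_simulation(batch_size=1))
    print(tds_simulation(batch_size=8))
```

Because the table lookup happens at the owner, no processor ever waits for a reply; in this model, combining can only lower the message count relative to `batch_size=1`, at the cost of delaying some jobs.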
![Page 38: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/38.jpg)
Speedups for Rubik’s cube
[Figure: bar chart of speedup (0-70) on a 16-node cluster, 4x 16-node clusters and a 64-node cluster; labels: single Myrinet cluster, TDS on wide-area DAS-2]
● Latency-insensitive algorithm works well even on a grid, despite huge amount of communication
[IEEE TPDS 2002]
![Page 39: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/39.jpg)
Solving Awari
● Solved by John Romein [IEEE Computer, Oct. 2003]
● Computed on VU DAS-2 cluster, using similar ideas as TDS
● Determined score for 889,063,398,406 positions
● Game is a draw
Andy Tanenbaum: "You just ruined a perfectly fine 3500 year old game"
![Page 40: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/40.jpg)
DAS-3 research
● Ibis – see tutorial & [IEEE Computer 2010]
● StarPlane
● Wide-area Awari
● Distributed Model Checking
![Page 41: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/41.jpg)
StarPlane
![Page 42: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/42.jpg)
● Multiple dedicated 10G light paths between sites
● Idea: dynamically change wide-area topology
![Page 43: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/43.jpg)
Wide-area Awari
● Based on retrograde analysis
  ● Backwards analysis of search space (database)
● Partitions database, like transposition tables
  ● Random distribution, hence good load balance
● Repeatedly send results to parent nodes
  ● Asynchronous, combined into bulk transfers
● Extremely communication intensive:
  ● 1 Pbit of data in 51 hours (on 1 DAS-2 cluster)
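To put "1 Pbit of data in 51 hours" in perspective, the average sustained rate can be worked out directly. The sketch assumes a decimal petabit (10^15 bits) and compares the result with the 1 Gb/s wide-area link listed for DAS-2; the function name is hypothetical.

```python
PETABIT = 10 ** 15   # assumption: decimal petabit


def average_throughput_gbps(total_bits, hours):
    """Average rate in Gb/s needed to move `total_bits` in `hours`."""
    return total_bits / (hours * 3600) / 1e9


if __name__ == "__main__":
    rate = average_throughput_gbps(1 * PETABIT, 51)
    das2_wide_area_gbps = 1.0    # DAS-2 wide-area bandwidth from the systems overview
    print(f"average: {rate:.2f} Gb/s, {rate / das2_wide_area_gbps:.1f}x the DAS-2 WAN link")
```

The average comes out at roughly 5.4 Gb/s, several times what the DAS-2 wide-area network offered, which is consistent with the original computation running on a single cluster.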
![Page 44: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/44.jpg)
Awari on DAS-3 grid
● Implementation on single big cluster
  ● 144 cores
  ● Myrinet (MPI)
● Naïve implementation on 3 small clusters
  ● 144 cores
  ● Myrinet + 10G light paths (OpenMPI)
![Page 45: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/45.jpg)
Initial insights
● Single-cluster version has high performance, despite high communication rate
  ● Up to 28 Gb/s cumulative network throughput
● Naïve grid version has flow control problems
  ● Faster CPUs overwhelm slower CPUs with work
  ● Unrestricted job queue growth
→ Add regular global synchronizations (barriers)
![Page 46: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/46.jpg)
Optimizations
● Scalable barrier synchronization algorithm
  ● Ring algorithm has too much latency on a grid
  ● Tree algorithm for barrier & termination detection
● Reduce host overhead
  ● CPU overhead for MPI message handling/polling
● Optimize grain size per network (LAN vs. WAN)
  ● Large messages (much combining) have lower host overhead but higher load imbalance
[CCGrid 2008]
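Why a ring barrier has too much latency compared with a tree can be seen from a simple step count. This is a rough model under stated assumptions, not a measurement: the ring is charged one sequential hop per processor for a single token circulation, and the tree is assumed to be a binomial tree with a gather phase followed by a release phase.

```python
def ring_barrier_steps(num_procs):
    """Sequential message steps for one token circulation around a ring."""
    return num_procs


def tree_barrier_steps(num_procs):
    """Sequential steps for a binomial-tree barrier: gather to the root, then release."""
    rounds = (num_procs - 1).bit_length()   # ceil(log2(num_procs)) for num_procs >= 1
    return 2 * rounds


if __name__ == "__main__":
    procs = 144    # core count used in the DAS-3 Awari runs
    print("ring:", ring_barrier_steps(procs), "steps")
    print("tree:", tree_barrier_steps(procs), "steps")
```

Under this model the tree needs a logarithmic number of sequential steps while the ring needs a linear number, and on a grid some of those steps pay wide-area latency.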
![Page 47: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/47.jpg)
Performance
● Optimizations improved grid performance by 50%
● Grid version only 15% slower than 1 big cluster
  ● Despite huge amount of communication (14.8 billion messages for 48-stone database)
![Page 48: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/48.jpg)
From Games to Model Checking
● Distributed model checking has a very similar communication pattern to Awari
  ● Search huge state spaces, random work distribution, bulk asynchronous transfers
● Can efficiently run DiVinE model checker on wide-area DAS-3, use up to 1 TB memory [IPDPS’09]
![Page 49: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/49.jpg)
Required wide-area bandwidth
![Page 50: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/50.jpg)
DAS-4 research (early results)
● Distributed reasoning
  ● (Jacopo Urbani)
● Multimedia analysis on GPUs
  ● (Ben van Werkhoven)
![Page 51: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/51.jpg)
WebPIE
● The Semantic Web is a set of technologies to enrich the current Web
● Machines can reason over SW data to find best results to the queries
● WebPIE is a MapReduce reasoner with linear scalability
● It outperforms other approaches by one to two orders of magnitude
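What a MapReduce-style reasoner does can be sketched with a single inference rule. The example is an assumption for illustration, not WebPIE's implementation: it applies only the RDFS subclass rule (if x has type C and C is a subclass of D, then x has type D), runs in one process, and repeats the map/reduce job until no new triples appear. All function names are hypothetical.

```python
from collections import defaultdict

TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"


def map_triple(triple):
    """Key every relevant triple by the class on which the rule joins."""
    subject, predicate, obj = triple
    if predicate == TYPE:
        yield obj, ("instance", subject)
    elif predicate == SUBCLASS:
        yield subject, ("superclass", obj)


def reduce_class(values):
    """Join the instances of a class with its superclasses."""
    instances = [v for tag, v in values if tag == "instance"]
    superclasses = [v for tag, v in values if tag == "superclass"]
    for instance in instances:
        for superclass in superclasses:
            yield (instance, TYPE, superclass)


def run_job(triples):
    groups = defaultdict(list)          # stands in for the shuffle phase
    for triple in triples:
        for key, value in map_triple(triple):
            groups[key].append(value)
    derived = set()
    for values in groups.values():
        derived.update(reduce_class(values))
    return derived


def closure(triples):
    """Repeat the job until a fixpoint is reached."""
    known = set(triples)
    while True:
        new = run_job(known) - known
        if not new:
            return known
        known |= new


if __name__ == "__main__":
    facts = {
        ("alice", TYPE, "Student"),
        ("Student", SUBCLASS, "Person"),
        ("Person", SUBCLASS, "Agent"),
    }
    for triple in sorted(closure(facts) - facts):
        print(triple)
```

Each reduce call only sees the values for one class, which is what lets the work be spread over many machines.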
![Page 52: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/52.jpg)
Performance previous state-of-the-art
![Page 53: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/53.jpg)
Performance WebPIE
Now we are here (DAS-4)!
Our performance at CCGrid 2010 (SCALE Award, DAS-3)
![Page 54: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/54.jpg)
Parallel-Horus: User Transparent Parallel Multimedia Computing on GPU Clusters
![Page 55: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/55.jpg)
[Figure: execution times (secs) of a line detection application on two different GPU clusters; x-axes: number of Tesla T10 GPUs and number of GTX 480 GPUs; annotation: unable to execute because of limited GTX 480 memory]
![Page 56: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/56.jpg)
Naive 2D Convolution Kernel on DAS-4 GPU
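For readers without the slide image, the structure of a naive 2D convolution can be sketched as follows. This is not the GPU kernel shown in the talk; it is a plain Python illustration with a hypothetical function name, computing the sliding-window weighted sum without flipping the filter and without border padding.

```python
def convolve2d_naive(image, kernel):
    """Slide `kernel` over `image` and return the 'valid' output region."""
    img_h, img_w = len(image), len(image[0])
    k_h, k_w = len(kernel), len(kernel[0])
    out_h, out_w = img_h - k_h + 1, img_w - k_w + 1
    output = [[0.0] * out_w for _ in range(out_h)]
    for y in range(out_h):            # on a GPU, each (y, x) would map to one thread
        for x in range(out_w):
            acc = 0.0
            for i in range(k_h):      # every output pixel re-reads its whole window
                for j in range(k_w):
                    acc += image[y + i][x + j] * kernel[i][j]
            output[y][x] = acc
    return output


if __name__ == "__main__":
    image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    kernel = [[1, 0], [0, 1]]
    print(convolve2d_naive(image, kernel))
```

The naive form reads each input pixel once per overlapping window, which is the redundancy that optimized kernels typically remove by reusing data in fast memory.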
![Page 57: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/57.jpg)
Our best performing 2D Convolution Kernel
![Page 58: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/58.jpg)
Conclusions
● Having a dedicated distributed infrastructure for CS enables experiments that are impossible on production systems
  ● Need long-term organization (like ASCI)
● DAS has had a huge impact on Dutch CS, at moderate cost
  ● Collaboration around common infrastructure led to large joint projects (VL-e, SURFnet) & grants
![Page 59: DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649e4c5503460f94b419c5/html5/thumbnails/59.jpg)
Acknowledgements
VU Group: Niels Drost, Ceriel Jacobs, Roelof Kemp, Timo van Kessel, Thilo Kielmann, Jason Maassen, Rob van Nieuwpoort, Nick Palmer, Kees van Reeuwijk, Frank Seinstra, Jacopo Urbani, Kees Verstoep, Ben van Werkhoven & many others
DAS Steering Group: Lex Wolters, Dick Epema, Cees de Laat, Frank Seinstra, John Romein, Rob van Nieuwpoort
DAS management: Kees Verstoep
DAS grandfathers: Andy Tanenbaum, Bob Hertzberger, Henk Sips
Funding: NWO, NCF, SURFnet, VL-e, MultimediaN