Cactus 4.0

TRANSCRIPT

Page 1: Cactus 4.0

Cactus 4.0

Page 2: Cactus 4.0

Cactus Computational Toolkit and Distributed Computing

• Solving Einstein’s Equations – Impact on computation

• Large collaborations essential and difficult! – Code becomes the collaborating tool.

• Cactus, a new community code for 3D GR-Astrophysics
– Toolkit for many PDE systems
– Suite of solvers for the Einstein system

• Metacomputing for the general user
– Distributed computing experiments with Cactus and Globus

Gabrielle Allen, Ed Seidel
Albert-Einstein-Institut (MPI-Gravitationsphysik)

Page 3: Cactus 4.0

Einstein’s Equations and Gravitational Waves

• Einstein’s General Relativity
– Fundamental theory of physics (gravity)
– Black holes, neutron stars, gravitational waves, ...
– Among the most complex equations of physics

• Dozens of coupled, nonlinear hyperbolic-elliptic equations with 1000’s of terms (see the 3+1 form sketched below)

• New field: Gravitational Wave Astronomy
– Will yield new information about the Universe
– What are gravitational waves? “Ripples in the curvature of spacetime”

• A last major test of Einstein’s theory: do they exist?
– Eddington: “Gravitational waves propagate at the speed of thought”
– 1993 Nobel Prize Committee: Hulse-Taylor pulsar (indirect evidence)

[Figure: detector signal s(t); strain h = Δs/s ~ 10^-22 for colliding black holes and neutron stars...]
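To make “coupled hyperbolic-elliptic” concrete: in the standard 3+1 (ADM) split that numerical relativity codes evolve (textbook material, added here for illustration, not taken from the slides), the vacuum equations G_μν = 0 become

```latex
% Evolution equations (hyperbolic in character):
\partial_t \gamma_{ij} = -2\alpha K_{ij} + \nabla_i \beta_j + \nabla_j \beta_i
\partial_t K_{ij} = -\nabla_i \nabla_j \alpha
  + \alpha \left( R_{ij} + K\,K_{ij} - 2 K_{ik} K^{k}{}_{j} \right)
  + \mathcal{L}_{\beta} K_{ij}

% Constraint equations (elliptic, solved for initial data):
\mathcal{H} \equiv R + K^2 - K_{ij} K^{ij} = 0 , \qquad
\mathcal{M}^{i} \equiv \nabla_j \left( K^{ij} - \gamma^{ij} K \right) = 0
```

Expanding the 3-Ricci tensor R_ij in a coordinate basis is what produces the thousands of terms quoted above.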

Page 4: Cactus 4.0

Detecting Gravitational Waves

• LIGO, VIRGO (Pisa), GEO600, … $1 billion worldwide

We need results from numerical relativity to:

• Detect them…pattern matching against numerical templates to enhance signal/noise ratio

• Understand them…just what are the waves telling us?

[Photo: LIGO Hanford, Washington site; 4 km arms]

Page 5: Cactus 4.0

Merger Waveform Must Be Found Numerically

Teraflop computation, AMR, elliptic-hyperbolic, ???

Page 6: Cactus 4.0

Axisymmetric Black Hole Simulations: Cray C90

Evolution of Highly Distorted Black Hole

Collision of two Black Holes (“Misner Data”)

Page 7: Cactus 4.0

Computational Needs for 3D Numerical Relativity

• Finite difference codes: ~10^4 Flops/zone/time step, ~100 3D arrays

• Currently use 250^3 zones: ~15 GBytes, ~15 TFlops/time step

• Need 1000^3 zones: ~1000 GBytes, ~1000 TFlops/time step

• Need TFlop, TByte machine

• Need parallel AMR, I/O

• Initial data: 4 coupled nonlinear elliptics

• Time step update: explicit hyperbolic update, also solve elliptics

[Figure: evolution snapshots at t=0 and t=100]
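The memory figures are simple arithmetic. A minimal sketch (Python; the 8-byte double-precision value size is an assumption consistent with the slide’s numbers):

```python
def grid_memory_gbytes(zones_per_dim, n_arrays=100, bytes_per_value=8):
    """Memory for n_arrays double-precision 3D grid functions."""
    zones = zones_per_dim ** 3
    return n_arrays * zones * bytes_per_value / 1e9

print(grid_memory_gbytes(250))   # ~12.5 GB, i.e. the slide's "~15 GBytes"
print(grid_memory_gbytes(1000))  # ~800 GB, i.e. the slide's "~1000 GBytes"
```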

Page 8: Cactus 4.0

Mix of Varied Technologies and Expertise!

• Scientific/Engineering:
– formulation of equations, equation of state, astrophysics, hydrodynamics ...

• Numerical Algorithms:
– Finite differences? Finite elements? Structured meshes?

– Hyperbolic equations: explicit vs implicit, shock treatments, dozens of methods (and presently nothing is fully satisfactory!)

– Elliptic equations: multigrid, Krylov subspace, spectral, preconditioners (elliptics currently require most of the time…)

– Mesh Refinement?

• Computer Science:
– Parallelism (HPF, MPI, PVM, ???)

– Architecture Efficiency (MPP, DSM, Vector, NOW, ???)

– I/O Bottlenecks (generate gigabytes per simulation, checkpointing…)

– Visualization of all that comes out!

Page 9: Cactus 4.0

• Clearly need huge teams, with huge expertise base to attack such problems…

• … in fact need collections of communities

• But how can they work together effectively?

• Need a code environment that encourages this…

Page 10: Cactus 4.0

NSF Black Hole Grand Challenge Alliance

• University of Texas (Matzner, Browne)
• NCSA/Illinois/AEI (Seidel, Saylor, Smarr, Shapiro, Saied)
• North Carolina (Evans, York)
• Syracuse (G. Fox)
• Cornell (Teukolsky)
• Pittsburgh (Winicour)
• Penn State (Laguna, Finn)

Develop Code to Solve G_μν = 0

Page 11: Cactus 4.0

NASA Neutron Star Grand Challenge

• NCSA/Illinois/AEI (Saylor, Seidel, Swesty, Norman)
• Argonne (Foster)
• Washington U (Suen)
• Livermore (Ashby)
• Stony Brook (Lattimer)

“A Multipurpose Scalable Code for Relativistic Astrophysics”

Develop Code to Solve G_μν = 8πT_μν

Page 12: Cactus 4.0

What we learn from Grand Challenges

• Successful, but also problematic…
– No existing infrastructure to support collaborative HPC

– Many scientists are Fortran programmers, and NOT computer scientists

– Many sociological issues of large collaborations and different cultures

– Many language barriers …

… Applied mathematicians, computational scientists, physicists have very different concepts and vocabularies…

– Code fragments, styles, routines often clash

– Successfully merged code (after years) often impossible to transplant into more modern infrastructure (e.g., add AMR or switch to MPI…)

• Many serious problems … this is what the Cactus Code seeks to address

Page 13: Cactus 4.0

What Is Cactus?

• Cactus was developed as a general, computational framework for solving PDEs (originally in numerical relativity and astrophysics)

• Modular … for easy development, maintenance and collaboration. Users supply “thorns” which plug into a compact core “flesh”

• Configurable … thorns register parameter, variable and scheduling information with a “runtime function registry” (RFR). Object-oriented-inspired features (a toy sketch of the registry idea follows this slide)

• Scientist friendly … thorns written in F77, F90, C, C++

• Accessible parallelism … the driver layer (a thorn) is hidden from physics thorns by a fixed flesh interface
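As a conceptual illustration only (a toy in Python, not Cactus’s actual API): thorns declare parameters, grid variables, and scheduled routines; the flesh owns the registry and drives the schedule.

```python
# Toy flesh/thorn registry -- illustrative only, not the real Cactus API.
class Flesh:
    """Minimal "flesh": owns parameters, variables, and the schedule."""
    def __init__(self):
        self.parameters = {}
        self.variables = {}
        self.schedule = {"STARTUP": [], "EVOL": []}

    def register_thorn(self, thorn):
        # A thorn contributes its parameters, variables, and routines.
        self.parameters.update(thorn.parameters)
        for name in thorn.variables:
            self.variables.setdefault(name, None)
        for bin_name, routine in thorn.scheduled:
            self.schedule[bin_name].append(routine)

    def run(self, steps):
        for routine in self.schedule["STARTUP"]:
            routine(self)
        for _ in range(steps):
            for routine in self.schedule["EVOL"]:
                routine(self)

class WaveToy:
    """A toy "thorn": a physics module that plugs into the flesh."""
    parameters = {"dx": 0.1, "courant": 0.5}
    variables = ["phi", "phi_prev"]
    scheduled = [("STARTUP", lambda f: print("WaveToy: initial data")),
                 ("EVOL",    lambda f: print("WaveToy: evolve one step"))]

flesh = Flesh()
flesh.register_thorn(WaveToy)
flesh.run(steps=2)
```

The point of the design, as the slides say, is that the physics thorn never touches the driver: it sees only the fixed flesh interface.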

Page 14: Cactus 4.0

What Is Cactus?

• Standard interfaces … interpolation, reduction, IO, coordinates. Actual routines supplied by thorns

• Portable … Cray T3E, Origin, NT/Win9*, Linux, O2, Dec Alpha, Exemplar, SP2

• Free and open community code … distributed under the GNU GPL. Uses as much free software as possible

• Up-to-date … new computational developments and/or thorns immediately available to users (optimisations, AMR, Globus, IO)

• Collaborative … the thorn structure makes it possible for a large number of people to use and develop toolkits … the code becomes the collaborating tool

• New version … Cactus beta-4.0 released 30th August

Page 15: Cactus 4.0

Core Thorn Arrangements Provide Tools

• Parallel drivers (presently MPI-based)
• (Mesh refinement schemes: Nested Boxes, DAGH, HLL)
• Parallel I/O for output, file reading, checkpointing (HDF5, FlexIO, Panda, etc…)
• Elliptic solvers (PETSc, Multigrid, SOR, etc…)
• Interpolators
• Visualization tools (IsoSurfacer)
• Coordinates and boundary conditions
• Many relativity thorns
• Groups develop their own thorn arrangements to add to these

Page 16: Cactus 4.0

Cactus 4.0

[Diagram: the FLESH (parameters, variables, scheduling) at the center, with plug-in thorns around it: IOFlexIO, IOHDF5, PUGH, WaveToyF90, WaveToyF77, CartGrid3D, GrACE, Boundary]

Page 17: Cactus 4.0

Current Status

• It works: many people, with different backgrounds, different personalities, on different continents, working together effectively on problems of common interest.

• Dozens of physics/astrophysics and computational modules developed and shared by “seed” community

• Connected modules work together, largely without collisions

• Test suites used to ensure integrity of both code and physics

• How to get it …

• Workshop 27 Sept - 1 Oct, NCSA: http://www.ncsa.uiuc.edu/SCD/Training/

Movie from Werner Benger, ZIB

Page 18: Cactus 4.0

Near Perfect Scaling

[Figure: two scaling plots. Left: scaling vs. processors (0-120) on Origin and NT SC. Right: “Cactus Scaling on T3E-600”, total Mflop/s vs. number of processors (log-log, 1-1000), with data points at roughly 192, 760, 5980 and 47900 Mflop/s]

• Excellent scaling on many architectures (a quick sanity check follows this list)
– Origin up to 128 processors

– T3E up to 1024

– NCSA NT cluster up to 128 processors

• Achieved 142 Gflops/s on 1024 node T3E-1200 (benchmarked for NASA NS Grand Challenge)
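The quoted benchmark works out as follows; a sketch (Python; the 150 Mflop/s single-node baseline is an assumption for illustration, not a number from the slides):

```python
def parallel_efficiency(total_mflops, procs, single_proc_mflops):
    """Achieved rate divided by ideal rate (procs x single-proc rate)."""
    return total_mflops / (procs * single_proc_mflops)

per_node = 142_000 / 1024                       # ~139 Mflop/s per T3E-1200 node
print(per_node)
print(parallel_efficiency(142_000, 1024, 150))  # ~0.92 under the assumed baseline
```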

Page 19: Cactus 4.0

Many Developers: Physics & Computational Science

[Diagram: contributing groups and technologies: DAGH/AMR (UTexas), Panda I/O (UIUC CS), Globus (Foster), PETSc (Argonne), HDF5, FlexIO; sites: AEI, NASA, NCSA, Valencia, ZIB, SGI, Wash. U]

Page 20: Cactus 4.0

Metacomputing: harnessing power when and where it is needed

• Easy access to available resources

– Find Resources for interactive use: Garching? ZIB? NCSA? SDSC?

– Do I have an account there? What’s the password?

– How do I get the executable there?

– Where to store data?

– How to launch the simulation? What are the local queue structure/OS idiosyncrasies?

Page 21: Cactus 4.0

Metacomputing: harnessing power when and where it is needed

• Access to more resources

– Einstein equations require extreme memory, speed

– Largest supercomputers too small!

– Networks very fast! (see the back-of-the-envelope sketch below)
• DFN gigabit testbed: 622 Mbit/s Potsdam-Berlin-Garching, connecting multiple supercomputers
• Gigabit networking to the US possible
• Connect workstations to make a supercomputer
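To put “very fast” in perspective, a rough sketch (Python) of shipping the 15-GByte dataset from the “Computational Needs” slide across this testbed, ignoring protocol overhead:

```python
def transfer_time_seconds(n_bytes, link_bits_per_second):
    """Ideal transfer time: payload bits divided by link rate."""
    return n_bytes * 8 / link_bits_per_second

# 15 GBytes (the 250^3 dataset) over the 622 Mbit/s DFN testbed:
print(transfer_time_seconds(15e9, 622e6))  # ~193 s, i.e. about 3 minutes
```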

Page 22: Cactus 4.0

Metacomputing: harnessing power when and where it is needed

• Acquire resources dynamically during simulation!
– Need more resolution in one area

• Interactive visualization, monitoring and steering from anywhere
– Watch simulation as it progresses … live visualisation
– Limited bandwidth: compute visualisation online with the simulation
– High bandwidth: ship data to be visualised locally
– Interactive steering (a toy sketch follows this slide)

• Are parameters screwed up? Very complex?
• Is memory running low? AMR! What to do? Refine selectively, acquire additional resources via Globus, or delete unnecessary grids?
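A toy illustration of the steering idea (a hypothetical mechanism, not Cactus’s: the simulation polls a small parameter file between iterations, so a remote user can edit values while the run is live):

```python
import json, os

PARAM_FILE = "steer.json"  # hypothetical file a remote user edits

def read_steering(params):
    """Merge any user edits into the live parameter set."""
    if os.path.exists(PARAM_FILE):
        with open(PARAM_FILE) as f:
            params.update(json.load(f))
    return params

params = {"output_every": 10, "courant": 0.5}
for step in range(100):
    params = read_steering(params)  # pick up live edits each iteration
    # ... evolve one time step using the current parameters ...
    if step % params["output_every"] == 0:
        print(step, params["courant"])
```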

Page 23: Cactus 4.0

Metacomputing: harnessing power when and where it is needed

• Call up an expert colleague … let her watch it too
– Sharing data space
– Remote collaboration tools
– Visualization server: all privileged users can log in and check status/adjust if necessary

Page 24: Cactus 4.0

Globus: Can provide many such services for Cactus

• Information (Metacomputing Directory Service: MDS)
– Uniform access to structure/state information: Where can I run Cactus today?

• Scheduling (Globus Resource Access Manager: GRAM)
– Low-level scheduler API: How do I schedule Cactus to run at NCSA?

• Communications (Nexus)
– Multimethod communication + QoS management: How do I connect Garching and ZIB together for a big run?

• Security (Globus Security Infrastructure)
– Single sign-on, key management: How do I get authority at SDSC for Cactus?

Page 25: Cactus 4.0

Globus: Can provide many such services for Cactus

• Health and status (Heartbeat monitor): Is my Cactus run dead?

• Remote file access (Global Access to Secondary Storage: GASS): How do I manage my output, and get the executable to Argonne?

Page 26: Cactus 4.0

Colliding Black Holes and MetaComputing: German Project supported by DFN-Verein

• Solving Einstein’s Equations

• Developing Techniques to Exploit High Speed Networks

• Remote Visualization

• Distributed Computing Across OC-12 Networks between AEI (Potsdam), Konrad-Zuse-Institut (Berlin), and RZG (Garching-bei-München)


Page 27: Cactus 4.0

Distributing Spacetime: SC’97 Intercontinental Metacomputing at AEI/Argonne/Garching/NCSA

[Figure: Immersadesk and a 512-node T3E]

Page 28: Cactus 4.0

Metacomputing the Einstein Equations: Connecting T3Es in Berlin, Garching, San Diego

Page 29: Cactus 4.0

Collaborators

• A distributed astrophysical simulation involving the following institutions:
– Albert Einstein Institute (Potsdam, Germany)
– Washington University (St. Louis, MO)
– Argonne National Laboratory (Chicago, IL)
– NLANR Distributed Applications Team (Champaign, IL)

• The following supercomputer centers:
– San Diego Supercomputer Center (268 proc. T3E)
– Konrad-Zuse-Zentrum in Berlin (232 proc. T3E)
– Max-Planck-Institute in Garching (768 proc. T3E)

Page 30: Cactus 4.0

The Grand Plan

• Distribute the simulation across 128 PEs of the SDSC T3E and 128 PEs of the Konrad-Zuse-Zentrum T3E in Berlin, using Globus
• Visualize isosurface data in real time on an Immersadesk in Orlando
• Transatlantic bandwidth from an OC-3 ATM network

Page 31: Cactus 4.0

SC98 Neutron Star Collision

Movie from Werner Benger, ZIB

Page 32: Cactus 4.0

Cactus scaling across PEs (Jason Novotny, NLANR)

Page 33: Cactus 4.0

Analysis of metacomputing experiments

• It works! (That’s the main thing we wanted at SC98…)
• Cactus not optimized for metacomputing: messages too small, lower MPI bandwidth, could be better (a latency model is sketched below):
– ANL-NCSA
• Measured bandwidth 17 Kbit/s (small messages) --- 25 Mbit/s (large)
• Latency 4 ms
– Munich-Berlin
• Measured bandwidth 1.5 Kbit/s (small) --- 4.2 Mbit/s (large)
• Latency 42.5 ms
– Within a single machine: order of magnitude better
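A simple latency-bandwidth model reproduces the small-message numbers. A sketch (Python; the 8-byte message size, one double of ghost-zone data, is an assumption):

```python
def effective_bandwidth(msg_bits, latency_s, peak_bps):
    """Effective rate for one message: size / (latency + size/peak)."""
    return msg_bits / (latency_s + msg_bits / peak_bps)

# ANL-NCSA: 4 ms latency, 25 Mbit/s large-message bandwidth.
print(effective_bandwidth(64, 0.004, 25e6))    # ~16 kbit/s (measured: 17 Kbit/s)

# Munich-Berlin: 42.5 ms latency, 4.2 Mbit/s large-message bandwidth.
print(effective_bandwidth(64, 0.0425, 4.2e6))  # ~1.5 kbit/s (measured: 1.5 Kbit/s)
```

Under this model the small-message rates are almost pure latency, which is why batching ghost-zone exchanges into larger messages should recover most of the lost bandwidth.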

• Bottom Line:
– Expect to be able to improve performance significantly
– Can run much larger jobs on multiple machines
– Start using Globus routinely for job submission

Page 34: Cactus 4.0

The Dream: not far away...

[Diagram: a “budding Einstein in Berlin” submits a job through the Globus Resource Manager; the Cactus/Einstein solver (Physics Module 1, BH initial data; MPI, MG, AMR, DAGH, Viz, I/O, ...) runs across the Garching T3E, an NCSA Origin 2000 array, mass storage, and an “Ultra 3000: Whatever-Wherever”]

Page 35: Cactus 4.0

Cactus 4.0 Credits

• Cactus flesh and design – Gabrielle Allen

– Tom Goodale

– Joan Massó

– Paul Walker

• Computational toolkit – Flesh authors

– Gerd Lanfermann

– Thomas Radke

– John Shalf

• Development toolkit – Bernd Bruegmann

– Manish Parashar

– Many others

• Relativity and astrophysics – Flesh authors

– Miguel Alcubierre

– Toni Arbona

– Carles Bona

– Steve Brandt

– Bernd Bruegmann

– Thomas Dramlitsch

– Ed Evans

– Carsten Gundlach

– Gerd Lanfermann

– Lars Nerger

– Mark Miller

– Hisaaki Shinkai

– Ryoji Takahashi

– Malcolm Tobias

• Vision and Motivation – Bernard Schutz

– Ed Seidel "the Evangelist"

– Wai-Mo Suen