AT LOUISIANA STATE UNIVERSITY
University of Washington – e-Science

Introduction to the TeraGrid

Jeffrey P. Gardner
Sr. Research Scientist, High Performance Computing
University of Washington
Dept. of Physics

Thanks to the TeraGrid community (the source of many of these slides), esp. Daniel S. Katz, LSU

Overview

• TeraGrid is a US national resource
• Funded by the NSF Office of Cyberinfrastructure
• Gives any researcher in the U.S. access to leading-edge computational resources
• Detailed info about the TeraGrid
• How to start using the TeraGrid

What is Cyberinfrastructure?

• “Cyberinfrastructure is a technological solution to the problem of efficiently connecting data, computers, and people with the goal of enabling derivation of novel scientific theories and knowledge.”¹
• The term was used by the NSF Blue Ribbon committee in 2003 in response to the question: “How can NSF… remove existing barriers to the rapid evolution of high performance computing, making it truly usable by all the nation's scientists, engineers, scholars, and citizens?”
• The TeraGrid² is the NSF’s response to this question.
• Cyberinfrastructure is also called e-Science³

¹ Source: Wikipedia
² More properly, the TeraGrid in its current form: the “Extensible Terascale Facility”
³ Source: NSF

What is the TeraGrid?

• World’s largest infrastructure for open scientific discovery
• Leadership class resources at eleven partner sites combined to create an integrated, persistent computational resource
  – High-performance networks
  – High-performance computers (>750 TFlops)
  – Visualization systems
  – Data resources and tools (>30 PB, >100 discipline-specific databases)
  – Science Gateways
  – User portal
  – User services - Help desk, training, advanced app support
• Allocated through national peer-review process
• (It’s free!)

~2 PFlops in another year!

TeraGrid Resources

TeraGrid Systems 2007-8

[Figure: computational resources by site (size approximate, not to scale). Sites shown: SDSC, TACC, UC/ANL, NCSA, ORNL, PU, IU, PSC, NCAR, Tennessee, LONI/LSU.]

Slide courtesy of Tommy Minyard, TACC

Who Uses TeraGrid

[Pie chart: share of usage by discipline]
• Molecular Biosciences: 31%
• Chemistry: 17%
• Physics: 17%
• Astronomical Sciences: 12%
• Materials Research: 6%
• Earth Sciences: 3%
• All 19 Others: 4%
• Advanced Scientific Computing: 2%
• Atmospheric Sciences: 3%
• Chemical, Thermal Systems: 5%

How TeraGrid Is Used

Use Modality                                     Community Size (rough est. - number of users)
Batch Computing on Individual Resources          850
Exploratory and Application Porting              650
Workflow, Ensemble, and Parameter Sweep          250
Science Gateway Access                           500
Remote Interactive Steering and Visualization    35
Tightly-Coupled Distributed Computation          10

2006 data from

TeraGrid Usage

[Figure: monthly usage in Normalized Units (millions), January 2004 through June 2007, split into Specific Allocations and Roaming Allocations; annotated “33% Annual Growth”]

TeraGrid currently delivers an average of 420,000 cpu-hours per day
Dave Hart
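
As a rough illustration of the two figures on this slide, the sketch below turns the 420,000 cpu-hours/day average into a yearly total and projects it forward. The function names are hypothetical, and the projection assumes the 33% annual growth simply continues at a constant compound rate. The chart reports growth in Normalized Units, so applying that rate to cpu-hours is itself an assumption.

```python
# Back-of-envelope arithmetic on the slide's usage figures.
# All names here are illustrative; nothing below is TeraGrid accounting code.

CPU_HOURS_PER_DAY = 420_000   # average delivery quoted on the slide
ANNUAL_GROWTH = 0.33          # "33% Annual Growth" annotation on the chart


def cpu_hours_per_year(per_day: float, days: int = 365) -> float:
    """Yearly total if the daily average held for a whole year."""
    return per_day * days


def projected_per_day(per_day: float, years: float,
                      growth: float = ANNUAL_GROWTH) -> float:
    """Daily average after `years`, ASSUMING compound growth continues."""
    return per_day * (1.0 + growth) ** years


if __name__ == "__main__":
    total = cpu_hours_per_year(CPU_HOURS_PER_DAY)
    print(f"cpu-hours per year at the current rate: {total:,.0f}")
    for years in (1, 2, 3):
        rate = projected_per_day(CPU_HOURS_PER_DAY, years)
        print(f"projected cpu-hours/day after {years} year(s): {rate:,.0f}")
```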

TG Usage: Predicting Storms

• Hurricanes and tornadoes cause massive loss of life and damage to property
• TeraGrid supported the spring 2007 NOAA and University of Oklahoma Hazardous Weather Testbed
  – Major Goal: assess how well ensemble forecasting predicts thunderstorms, including the supercells that spawn tornadoes
  – Nightly reservation at PSC
  – Delivers “better than real time” prediction
  – Used 675,000 CPU hours for the season
  – Used 312 TB on HPSS storage at PSC

Slide courtesy of Dennis Gannon, IU, and LEAD Collaboration

TG Usage: Gravitational Waves

[Figure panels: Observations, Models, Analysis & Insight]

Cactus
Gravitational waves predicted from colliding black holes, neutron stars, supernovae

Visualization Credits: Werner Benger, Ralf Kaehler, LSU/AEI
Data Simulation Credits: LSU/AEI relativity groups

TG Usage: Cosmology

The Cosmic Web
Cosmological evolution and galaxy formation using a 3D cosmological n-body gravity+hydrodynamics code, Gasoline.

[Figure scale bar: 100 million light years]
Credits: Tom Quinn, Jeff Gardner, Univ. of Washington

TG Usage: Biology

High resolution 3-D reconstruction of infectious viruses
Wen Jiang, Weimin Wu, Purdue University
Matthew L. Baker, Joanita Jakana and Wah Chiu, Baylor College of Medicine
Peter R. Weigele and Jonathan King, MIT

High resolution 3-D structure of virus particles provides important insights for the development of effective prevention and treatment strategies. This work used electron cryo-microscopy to demonstrate the 3-D reconstruction of the infectious bacterial virus ε15 at 4.5 Å resolution, which allowed tracing of the polypeptide backbone of its major capsid protein gp7. The structure reveals similar protein architecture to that of other tailed double-stranded DNA viruses, even in the absence of detectable sequence similarity. However, the connectivity of the secondary structure elements (topology) in gp7 is unique.

Large numbers (10^4-10^5) of 2-D images (800^2 pixels/image), representing the projections of identical 3-D structures viewed at different angles, were collected. These images require intensive computation to accurately determine their relative orientations before the 2-D images can be coherently merged into a single high resolution 3-D structure.

These results have just been published in Nature (Feb 28, 2008).
Slide courtesy of Purdue and TeraGrid

Solve any Rubik’s Cube in 26 moves?

• Rubik's Cube is perhaps the most famous combinatorial puzzle of its time

• > 43 quintillion states (4.3x10^19)

• Gene Cooperman and Dan Kunkle of Northeastern Univ. proved any state can be solved in 26 moves

• 7TB of distributed storage on TeraGrid allowed them to develop the proof

Source: http://www.physorg.com/news99843195.html
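
The "> 43 quintillion states" figure can be checked with a short counting argument. The sketch below uses the standard count for a 3x3x3 cube; the function name is illustrative, and this is only the state count, not the Cooperman and Kunkle proof.

```python
from math import factorial


def rubik_states() -> int:
    """Count the reachable states of a standard 3x3x3 Rubik's Cube."""
    corner_positions = factorial(8)   # 8 corner pieces can be permuted
    corner_twists = 3 ** 7            # 7 corners twist freely; the 8th is forced
    edge_positions = factorial(12)    # 12 edge pieces can be permuted
    edge_flips = 2 ** 11              # 11 edges flip freely; the 12th is forced
    # Corner and edge permutations must share the same parity,
    # which halves the total.
    return corner_positions * corner_twists * edge_positions * edge_flips // 2


if __name__ == "__main__":
    n = rubik_states()
    print(f"{n:,} states (about {n:.1e})")
```

Running it gives 43,252,003,274,489,856,000, which matches the slide's 4.3x10^19.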

Community Engagement through Science Gateways

• Increasing investment by communities in their own cyberinfrastructure, but heterogeneous:
  – Resources
  – Users – from expert to K-12
  – Software stacks, policies
• Three common forms:
  – Web portal with users in front and services in back
  – Client server model where application programs run on users' machines (e.g. desktops) and access services
  – Bridges across multiple grids, allowing communities to utilize both community developed grids and shared grids
• Science Gateways
  – Provide “TeraGrid Inside” capabilities
  – Leverage community investment

Slide courtesy of Nancy Wilkins-Diehr

Current Science Gateways

• Biology and Biomedicine Science Gateway
• Open Life Sciences Gateway
• The Telescience Project
• Grid Analysis Environment (GAE)
• Neutron Science Instrument Gateway
• TeraGrid Visualization Gateway, ANL
• BIRN
• Open Science Grid (OSG)
• Special PRiority and Urgent Computing Environment (SPRUCE)
• National Virtual Observatory (NVO)
• Linked Environments for Atmospheric Discovery (LEAD)
• Computational Chemistry Grid (GridChem)
• Computational Science and Engineering Online (CSE-Online)
• GEON (GEOsciences Network)
• Network for Earthquake Engineering Simulation (NEES)
• SCEC Earthworks Project
• Network for Computational Nanotechnology and nanoHUB
• GIScience Gateway (GISolve)
• Gridblast Bioinformatics Gateway
• Earth Systems Grid
• Astrophysical Data Repository (Cornell)

Slide courtesy of Nancy Wilkins-Diehr

SGW Highlight: National Virtual Observatory - Facilitating Scientific Discovery

• Access to telescope images from around the world
• NVO provides access to combined sky surveys
  – Different views of the same cosmological phenomenon can reveal new insights
• New science enabled by enhancing access to data and computing resources
  – Data correlation
  – Understanding of physical processes
  – Identification of new phenomena
• NVO is a set of tools used to exploit the data avalanche

SGW Highlight: Linked Environments for Atmospheric Discovery (LEAD)

• Providing tools that are needed to make accurate predictions of tornadoes and hurricanes
• Meteorological data
• Forecast models
• Analysis and visualization tools
• Data exploration and Grid workflow

Slide courtesy of Nancy Wilkins-Diehr

SGW Highlight: GridChem’s Client-Server Approach Provides Power and a Rich Feature Set

Slide courtesy of Sudhakar Pamidighantam, NCSA

Gateways

[Figure: number of gateway jobs per month, Jan-07 through Dec-07; vertical axis from 0 to 140,000]

Nearly 500k gateway jobs in CY2007
• GridChem: 192k jobs, >210k TG SUs
• CIGportal: 94k jobs, >154k TG SUs
• LEAD: 40k jobs, >54k TG SUs

Slide courtesy of Nancy Wilkins-Diehr

TG New Large Resources

• Ranger@TACC
  – First NSF ‘Track2’ HPC system
  – 504 TFlops
  – 15,744 Quad-Core AMD Opteron processors
  – 123 TB memory, 1.7 PB disk
• Kraken@NICS (UT/ORNL)
  – Second NSF ‘Track2’ HPC system
  – 170 TFlops Cray XT4 system
  – Will be upgraded to Cray XT5 at nearly 1 PFlops
    • 10,000+ compute sockets
    • 100 TB memory, 2.3 PB disk
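
To put the Ranger numbers in per-core terms, the sketch below derives the core count and the implied peak per core from the slide's figures alone. The function names are illustrative, and the per-core value is only a quotient of two quoted numbers, not a measured or vendor-stated rate.

```python
# Derived figures for Ranger, using only the numbers quoted on the slide.

RANGER_PROCESSORS = 15_744     # quad-core AMD Opteron processors
CORES_PER_PROCESSOR = 4        # "Quad-Core"
RANGER_PEAK_FLOPS = 504e12     # 504 TFlops


def total_cores(processors: int, cores_each: int) -> int:
    """Total core count across all processors."""
    return processors * cores_each


def peak_gflops_per_core(peak_flops: float, cores: int) -> float:
    """Implied peak per core, in GFlops (simple quotient)."""
    return peak_flops / cores / 1e9


if __name__ == "__main__":
    cores = total_cores(RANGER_PROCESSORS, CORES_PER_PROCESSOR)
    per_core = peak_gflops_per_core(RANGER_PEAK_FLOPS, cores)
    print(f"cores: {cores:,}")
    print(f"implied peak per core: {per_core:.2f} GFlops")
```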

So how do I get on the TeraGrid?

• The best thing to do: Talk to your local TeraGrid “Campus Champion” (for UW, that’s me)
• Campus Champion can:
  – Direct you to the most appropriate TeraGrid platforms
  – Give you an experimental TeraGrid account
  – Help you write proposals to acquire TeraGrid time

TeraGrid Resource Allocations

• Every TeraGrid award of time is either:
  – System-specific (“Type S”): Time is awarded for a specific system, e.g. PSC Cray XT3.
  – TeraGrid Roaming (“Type R”): Time is awarded that can be used on any* TeraGrid system.
• TeraGrid time is awarded in “Service Units” or “SUs”. SUs correspond roughly to CPU-hours:
  – For system-specific awards, 1 SU = 1 CPU-hour on that machine.
  – For TeraGrid Roaming awards, 1 SU = 1 CPU-hour on a 1.5GHz Itanium2 system and will be converted for whichever machine you actually run on.

*With a few exceptions
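
The roaming conversion above can be sketched as a simple scaling. The conversion factors below are hypothetical placeholders chosen for illustration (only the 1.0 for the 1.5GHz Itanium2 reference follows from the slide), and the system names and function names are invented for this example.

```python
# Sketch of roaming-SU accounting. Each factor means
# "roaming SUs charged per CPU-hour on that machine".
# The values are HYPOTHETICAL, not published TeraGrid conversion rates.
HYPOTHETICAL_FACTORS = {
    "itanium2-1.5ghz": 1.0,        # reference system: 1 SU = 1 CPU-hour
    "example-faster-system": 2.0,  # made-up value for illustration
}


def roaming_sus_charged(cpu_hours: float, system: str,
                        factors: dict = HYPOTHETICAL_FACTORS) -> float:
    """Roaming SUs consumed by running `cpu_hours` on `system`."""
    return cpu_hours * factors[system]


def cpu_hours_available(roaming_sus: float, system: str,
                        factors: dict = HYPOTHETICAL_FACTORS) -> float:
    """CPU-hours a roaming award buys on `system`."""
    return roaming_sus / factors[system]


if __name__ == "__main__":
    award = 30_000  # SUs
    for name in HYPOTHETICAL_FACTORS:
        hours = cpu_hours_available(award, name)
        print(f"{award:,} roaming SUs on {name}: {hours:,.0f} CPU-hours")
```

A system-specific award needs no conversion, since 1 SU is defined as 1 CPU-hour on that machine.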

TeraGrid Resource Allocations

• The easiest type of allocation to get is a “Development Allocation” or “DAC”*:
  – Currently DACs are 30,000 SUs
  – Submit a single page (i.e. no more than 3 paragraphs) description of your research and your goals for trying TeraGrid.
  – Development applications are reviewed and awarded continuously
  – You will be up and running within a few weeks.

*DAC = “Development Allocation Committee”

POPS - Allocations

• POPS is the on-line system used for the allocations process (pops-submit.teragrid.org)
  – Allocation Requests
  – Peer reviews
  – Usage information
• pops.teragrid.org for now – also accessible from the TeraGrid user portal (portal.teragrid.org)

Allocation Process

• Types of Requests
  – DAC (up to 30k SUs, continually reviewed)
    • DACs on Ranger can be larger, same for other new large resources (Kraken, Track2c, etc.)
  – MRAC (up to 500k SUs, reviewed quarterly)
  – LRAC (over 500k SUs, reviewed semi-annually)
  – Can apply for compute, data, support resources
  – Also, there are community accounts…
• Awards
  – Most awards are granted in full!
  – For one or more 12 month periods – can be renewed
  – Can rebut reviewers who reject or cut award
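
The size thresholds for the three request types can be written as a small lookup. This is a minimal sketch with a hypothetical function name; it uses only the default limits from the slide and does not model the larger DACs allowed on Ranger and other new large resources.

```python
# Request-type selection from the SU thresholds listed on the slide.
DAC_LIMIT = 30_000     # DAC: up to 30k SUs, continually reviewed
MRAC_LIMIT = 500_000   # MRAC: up to 500k SUs, reviewed quarterly


def request_type(sus: int) -> str:
    """Return the request type (DAC, MRAC, or LRAC) for a given SU amount."""
    if sus <= 0:
        raise ValueError("requested SUs must be positive")
    if sus <= DAC_LIMIT:
        return "DAC"
    if sus <= MRAC_LIMIT:
        return "MRAC"
    return "LRAC"      # over 500k SUs, reviewed semi-annually


if __name__ == "__main__":
    for amount in (10_000, 30_000, 200_000, 2_000_000):
        print(f"{amount:>9,} SUs -> {request_type(amount)}")
```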

How You Can Use TeraGrid

[Diagram labels: Compute Service, Viz Service, Data Service; Network, Accounting, …; Site 1, Site 2, Site 3; TeraGrid Infrastructure (Accounting, Network, Authorization, …); access paths: POPS (for now), Science Gateways, User Portal, Command Line]

Slide courtesy of Dane Skow and Craig Stewart

User Portal: portal.teragrid.org

Access to resources

• Terminal: ssh, gsissh
• Portal: TeraGrid user portal, Gateways
  – Once logged in to portal, click on “Login”

User Portal – Compute/Viz Resources

User Portal – Other Resources

User Portal – Other Information

• Knowledge Base for quick answers to technical questions

• Documentation

• Science Highlights

• News and press releases

• Education, outreach and training events and resources

Data Storage Resources

• GPFS-WAN
  – 700 TB disk storage at SDSC, accessible from machines at NCAR, NCSA, SDSC, ANL
• Data Capacitor
  – 535 TB storage at IU, including databases
• Data Collections
  – Storage at SDSC (files, databases) for collections used by communities
• Tape Storage
  – Available at IU, NCAR, NCSA, SDSC, PSC
• Access is generally through GridFTP
• Typical data transfer speeds are 100MB/s!
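
To get a feel for what 100MB/s means at these storage sizes, the sketch below estimates transfer times. It assumes a sustained rate with no protocol overhead and decimal units (1 MB = 10^6 bytes, 1 TB = 10^12 bytes); the function names are illustrative and no GridFTP client is invoked.

```python
# Transfer-time estimates at the slide's "typical" 100 MB/s rate.
# Assumes a sustained rate and decimal units; real transfers will vary.

MB = 1e6
TB = 1e12
TYPICAL_RATE = 100 * MB   # bytes per second, from the slide


def transfer_seconds(size_bytes: float, rate: float = TYPICAL_RATE) -> float:
    """Seconds needed to move `size_bytes` at `rate` bytes per second."""
    return size_bytes / rate


def transfer_hours(size_bytes: float, rate: float = TYPICAL_RATE) -> float:
    """Same estimate expressed in hours."""
    return transfer_seconds(size_bytes, rate) / 3600.0


if __name__ == "__main__":
    for label, size in (("1 TB", 1 * TB),
                        ("535 TB (Data Capacitor capacity)", 535 * TB),
                        ("700 TB (GPFS-WAN capacity)", 700 * TB)):
        print(f"{label}: {transfer_hours(size):,.1f} hours")
```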

Conclusions

• TeraGrid is not a secret government agency
• Just a collection of universities working together, funded by NSF
• Currently, an abundance of cycles available
• Talk to current users/participants or your Campus Champion for help with proposals and other information