
Big Red, the Data Capacitor, and the future (clouds)

Craig A. Stewart
[email protected]

2 March 2008

License terms

• Please cite as: Stewart, C.A. 2009. Big Red, the Data Capacitor, and the future (clouds). Presentation. Presented 2 Mar 2009, University of Houston, Houston, TX. Available from: http://hdl.handle.net/2022/13940

• Except where otherwise noted, by inclusion of a source url or some other note, the contents of this presentation are © by the Trustees of Indiana University. This content is released under the Creative Commons Attribution 3.0 Unported license (http://creativecommons.org/licenses/by/3.0/). This license includes the following terms: You are free to share – to copy, distribute and transmit the work – and to remix – to adapt the work – under the following conditions: attribution – you must attribute the work in the manner specified by the author or licensor (but not in any way that suggests that they endorse you or your use of the work). For any reuse or distribution, you must make clear to others the license terms of this work.


Big Red - Basics and history

• IBM e1350 BladeCenter Cluster, SLES 9, MPICH, LoadLeveler, MOAB
• Spring 2006: 17 days assembly at IBM facility, disassembled, reassembled in 10 days at IU
• 20.48 TFLOPS peak theoretical, 15.04 achieved on Linpack; 23rd on June 2006 Top500 List (IU’s highest listing to date)
• In production for local users on 22 August 2006, for TeraGrid users 1 October 2006
• Upgraded to 30.72 TFLOPS Spring 2008; ??? on June 2007 Top500 List
• Named after nickname for IU sports teams


Motivations and goals

• Initial goals for 20.48 TFLOPS system:
  - Local demand for cycles exceeded supply
  - TeraGrid Resource Partner commitments to meet
  - Support life science research
  - Support applications at 100s to 1000s of processors
• 2nd phase upgrade to 30.72 TFLOPS:
  - Support economic development in State of Indiana


Why a PowerPC-based blade cluster?

• Processing power per node
• Density, good power efficiency relative to available processors
• Possibility of performance gains through use of Altivec unit & VMX instructions
• Blade architecture provides flexibility for future
• Results of Request for Proposals process

Processor                    TFLOPS/MWatt   MWatts/PetaFLOPS
Intel Xeon 7041              145            6.88
AMD                          219            4.57
PowerPC 970 MP (dual core)   200            5.00
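(A quick arithmetic check on the table above, added here rather than taken from the original slide: the two efficiency columns are reciprocals, MWatts per PetaFLOPS = 1000 / (TFLOPS per MWatt). For example, the PowerPC 970 MP row works out to 1000 / 200 = 5.00 MW/PFLOPS and the AMD row to 1000 / 219 ≈ 4.57 MW/PFLOPS.)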


Feature                             20.4 TFLOPS system                         30.7 TFLOPS system

Computational hardware, RAM
  JS21 components                   Two 2.5 GHz PowerPC 970MP processors,      Same
                                    8 GB RAM, 73 GB SAS Drive, 40 GFLOPS
  No. of JS21 blades                512                                        768
  No. of processors; cores          1,024 processors; 2,048 processor cores    1,536 processors; 3,072 processor cores
  Total system memory               4 TB                                       6 TB

Disk storage
  GPFS scratch space                266 TB                                     Same
  Lustre                            535 TB                                     Same
  Home directory space              25 TB                                      Same

Networks
  Total outbound network bandwidth  40 Gbit/sec                                Same
  Bisection bandwidth               64 GB/sec - Myrinet 2000                   96 GB/sec - Myrinet 2000
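(Arithmetic note, not from the original slide: using the 40 GFLOPS per-blade figure listed above, peak theoretical performance scales directly with blade count: 512 blades × 40 GFLOPS = 20.48 TFLOPS and 768 blades × 40 GFLOPS = 30.72 TFLOPS, matching the names of the two configurations.)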


IBM e1350 vs Cray XT3 (data from http://icl.cs.utk.edu/hpcc/hpcc_results.cgi)

[Charts of HPCC benchmark results, shown per process (core) and per processor]


IBM e1350 vs HP XC4000 (data from http://icl.cs.utk.edu/hpcc/hpcc_results.cgi)


Difference: 4 KB vs 16 MB page size

Linpack performance

Benchmark set   Nodes   Peak theoretical TFLOPS   Achieved TFLOPS   %
HPCC            510     20.40                     13.53             66.3
Top500          512     20.48                     15.04             73.4
Top500          768     30.72                     21.79             70.9
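(Arithmetic note, not from the original slide: the % column is achieved divided by peak theoretical, e.g. 15.04 / 20.48 ≈ 73.4% and 21.79 / 30.72 ≈ 70.9%. The HPCC run used 510 of the 512 blades, hence its slightly lower peak of 510 × 40 GFLOPS = 20.40 TFLOPS.)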


Elapsed time per simulation timestep among best in TeraGrid


• Simulation of TonB-dependent transporter (TBDT)

• Used systems at NCSA, IU, PSC

• Modeled mechanisms for allowing transport of molecules through cell membrane

• Work by Emad Tajkhorshid and James Gumbart, of University of Illinois Urbana-Champaign. Mechanics of Force Propagation in TonB-Dependent Outer Membrane Transport. Biophysical Journal 93:496-504 (2007)

• To view the results of the simulation, please go to: http://www.life.uiuc.edu/emad/TonB-BtuB/btub-2.5Ans.mpg

Image courtesy of Emad Tajkhorshid


WxChallenge (www.wxchallenge.com)

• Over 1,000 undergraduate students, 64 teams, 56 institutions
• Usage on Big Red:
  - ~16,000 CPU hours on Big Red
  - 63% of processing done on Big Red
  - Most of the students who used Big Red couldn’t tell you what it is
• Integration of computation and data flows via Lustre (Data Capacitor)


Overall user reactions

• NAMD, WRF users very pleased
• Porting from Intel instruction set a perceived challenge in a cycle-rich environment
• MILC optimization with VMX not successful
• Keys to biggest successes:
  - Performance characteristics of JS21 nodes
  - Linkage of computation and storage (Lustre - Data Capacitor)
  - Support for grid computing via TeraGrid

Cloud computing

• Cloud computing does not exist
• Infrastructure as a Service
• Platform as a Service
• Pitfalls of ‘Gartner hype curve’ and confusing IaaS and PaaS


Question: What do you plan/want to do with cloud computing?

• Support collaborative activities
• Application hosting (institutional apps)
• Content distribution
• Use cloud computing as a way to deliver reliable 7 x 24 services when the institution’s IT organization does not run a 7 x 24 operation
• Internal collaboration within the college/university

Is it safe (results of BOF at EDUCAUSE meeting)?

• Not a clear choice now, might be compelling later
• Real worries about getting data and capability back after it has once been outsourced
• Uses will be broad
• Cloud computing will cause major realignments in funding
• Cloud computing will push more computing to the individual
• Legal compliance issues may be solved as more universities and colleges push on cloud vendors
• Utility model – is it here? Maybe

Consensus – big changes 1-3 years

• Look at history of local telcos, phone switches on campus
• Look at history of processors
• Cloud computing, long haul networks, and data management issues are deeply intertwined
• Our challenge *may be* to figure out how to make best use of IaaS in the future

No one makes their own lab glassware anymore (usually)

Acknowledgements - Funding Sources

• IU’s involvement as a TeraGrid Resource Partner is supported in part by the National Science Foundation under Grants No. ACI-0338618, OCI-0451237, OCI-0535258, and OCI-0504075

• The IU Data Capacitor is supported in part by the National Science Foundation under Grant No. CNS-0521433.

• This research was supported in part by the Indiana METACyt Initiative. The Indiana METACyt Initiative of Indiana University is supported in part by Lilly Endowment, Inc.

• This work was supported in part by Shared University Research grants from IBM, Inc. to Indiana University.

• The LEAD portal is developed under the leadership of IU Professors Dr. Dennis Gannon and Dr. Beth Plale, and supported by NSF grant 331480.

• The ChemBioGrid Portal is developed under the leadership of IU Professor Dr. Geoffrey C. Fox and Dr. Marlon Pierce and funded via the Pervasive Technology Labs (supported by the Lilly Endowment, Inc.) and the National Institutes of Health grant P20 HG003894-01

• Many of the ideas presented in this talk were developed under a Fulbright Senior Scholar’s award to Stewart, funded by the US Department of State and the Technische Universitaet Dresden.

• Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation (NSF), National Institutes of Health (NIH), Lilly Endowment, Inc., or any other funding agency

Acknowledgements - People

• Maria Morris contributed to the graphics used in this talk
• Marcus Christie and Suresh Marru of the Extreme! Computing Lab contributed the LEAD graphics
• John Morris (www.editide.us) and Cairril Mills (Cairril.com Design & Marketing) contributed graphics
• This work would not have been possible without the dedicated and expert efforts of the staff of the Research Technologies Division of University Information Technology Services, the faculty and staff of the Pervasive Technology Labs, and the staff of UITS generally.

• Thanks to the faculty and staff with whom we collaborate locally at IU and globally (via the TeraGrid, and especially at Technische Universitaet Dresden)

Co-author affiliations

Craig A. Stewart; [email protected]; Office of the Vice President and CIO, Indiana University, 601 E. Kirkwood, Bloomington, IN

Matthew Link; [email protected]; University Information Technology Services (UITS), Indiana University, 2711 E. 10th St., Bloomington, IN 47408

D. Scott McCaulay; [email protected]; UITS, Indiana University, 2711 E. 10th St., Bloomington, IN 47408

Greg Rodgers; [email protected]; IBM Corporation, 2455 South Road, Poughkeepsie, New York 12601

George Turner; [email protected]; UITS, Indiana University, 2711 E. 10th St., Bloomington, IN 47408

David Hancock; [email protected]; UITS, Indiana University – Purdue University Indianapolis, 535 W. Michigan Street, Indianapolis, IN 46202

Richard Repasky; [email protected]; UITS, Indiana University, 2711 E. 10th St., Bloomington, IN 47408

Peng Wang; [email protected]; UITS, Indiana University — Purdue University Indianapolis, 535 W. Michigan Street, Indianapolis, IN 46202

Faisal Saied; [email protected]; Rosen Center for Advanced Computing, Purdue University, 302 W. Wood Street, West Lafayette, Indiana 47907

Marlon Pierce; Community Grids Lab, Pervasive Technology Labs at Indiana University, 501 N. Morton Street, Bloomington, IN 47404

Ross Aiken; [email protected]; IBM Corporation, 9229 Delegates Row, Precedent Office Park Bldg 81, Indianapolis, IN 46240;

Matthias Mueller; [email protected]; Center for Information Services and High Performance Computing (ZIH), Dresden University of Technology, D-01062 Dresden, Germany

Matthias Jurenz; [email protected]; Center for Information Services and High Performance Computing (ZIH), Dresden University of Technology, D-01062 Dresden, Germany

Matthias Lieber; [email protected]; Center for Information Services and High Performance Computing (ZIH), Dresden University of Technology, D-01062 Dresden, Germany

Thank you

• Any questions?