HPC Directions at the
U.S. National Science Foundation
Jim Kasdorf
Director, Special Projects
Pittsburgh Supercomputing Center
Stichting Academisch Rekencentrum Amsterdam
October 12, 2010
Disclaimer
Nothing in this presentation represents an official view or position of the U.S. National Science Foundation nor of the Pittsburgh Supercomputing Center nor of Carnegie Mellon University.
October 5, 2005 3 3
National Science Foundation FY 2011 Budget Request (dollars in millions)
BIO 767.81
CISE 684.51
ENG 825.67
    ENG Programs 682.81
    SBIR/STTR 142.86
GEO 955.29
MPS 1,409.91
SBE 268.79
OCI 228.07
OIntlSE 53.26
OPolarP 527.99
Integ Act 295.93
Arctic Research Commission 1.60
Research & Related Activities Total $6,018.83
Edu&HR 892.00
Major Research Equipment & Facilities Construction 165.19
Operations & Award Management 329.19
National Science Board 4.84
OIG 14.35
NSF Total $7,424.40
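The totals in the budget table can be cross-checked against their line items. The sketch below is only an arithmetic check on the figures listed above (assumed to be in millions of dollars); the dictionary and variable names are illustrative and are not part of the original slides.

```python
from decimal import Decimal

# Research & Related Activities line items as listed on the slide.
# ENG is counted once; its two sub-lines are checked separately below.
rra_items = {
    "BIO": "767.81", "CISE": "684.51", "ENG": "825.67", "GEO": "955.29",
    "MPS": "1409.91", "SBE": "268.79", "OCI": "228.07", "OIntlSE": "53.26",
    "OPolarP": "527.99", "Integ Act": "295.93",
    "Arctic Research Commission": "1.60",
}
eng_parts = {"ENG Programs": "682.81", "SBIR/STTR": "142.86"}
other_accounts = {
    "Edu&HR": "892.00",
    "Major Research Equipment & Facilities Construction": "165.19",
    "Operations & Award Management": "329.19",
    "National Science Board": "4.84",
    "OIG": "14.35",
}

def total(items):
    """Sum a dict of decimal strings exactly (no floating-point rounding)."""
    return sum((Decimal(v) for v in items.values()), Decimal("0"))

eng_total = total(eng_parts)
rra_total = total(rra_items)
nsf_total = rra_total + total(other_accounts)

print("ENG sub-lines:", eng_total)   # 825.67
print("R&RA total:   ", rra_total)   # 6018.83
print("NSF total:    ", nsf_total)   # 7424.40
```

All three computed sums agree with the totals printed on the slide.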
The “Tracks”: 2005
• Track 1: Petaflops (PF) system for 2011, $200M
• Track 2A, 2B, 2C, 2D: one system each year, $30M each, as interim systems before final deployment
• (Amounts are for hardware, from NSF)
Track 2A: Texas Advanced Computing Center - TACC
NSF Awards TACC $59 Million for Unprecedented High Performance Computing System
University of Texas, Arizona State University, Cornell University and Sun Microsystems to deploy the world’s most powerful general-purpose computing system on the TeraGrid
09/28/2006, Marcia Inger
AUSTIN, Texas: The National Science Foundation (NSF) has made a five-year, $59 million award to the Texas Advanced Computing Center (TACC) at The University of Texas at Austin to acquire, operate and support a high-performance computing system that will provide unprecedented computational power to the nation’s research scientists and engineers.
Track 2A: Texas Advanced Computing Center
• Sun / AMD / InfiniBand: Capacity System
• Proposed peak: 421TF
• Final peak: 529 TF
• $58,233,304
Track 2B: University of Tennessee - NICS
• Principal Investigator: Thomas Zacharia, Professor, Electrical Engineering and Computer Science
• Cray “Baker” system: Kraken
• $58,233,304
Petaflop NSF Computing: The Track 2B System at UT/ORNL
SC’07
Phil Andrews, NICS Director
Buddy Bland (I stole everything from him!)
November 13, 2007
Reno, Nevada
Timeline Synopsis
(Predictions are always hard, especially about the future! – Yogi Berra)
· Phase-0: Early access to DoE Cray system, Jaguar (Now)
· Phase-1: ~40TF NSF Cray System (Valentine’s Day ‘08)
· Phase-1a: ~170TF NSF Cray System (Mid-May ‘08)
· Phase-2: ~1PF NSF Cray System (1H’09)
· Phase-3: (possible) >1PF NSF Cray System (’10)
Track 2C – Pittsburgh Supercomputing Center
NSB-08-54
May 8, 2008
• MEMORANDUM TO MEMBERS OF THE NATIONAL SCIENCE BOARD
• SUBJECT: Major Actions and Approvals at the May 6-7, 2008 Meeting
• 4. The Board authorized the NSF Director, at his discretion, to make an award to the Mellon Pitt Carnegie (MPC) Corporation for support of the proposal entitled “Transforming Science through Productive Petascale Computing.”
Track 2C
Silicon Graphics Declares Bankruptcy and Sells Itself for $25 Million
by Erick Schonfeld on April 1, 2009
Sadly, this is no April Fool’s joke. Silicon Graphics, the high-end computer workstation and server company founded by Jim Clark in 1982, today declared bankruptcy and sold itself to Rackable Systems for $25 million plus the assumption of “certain liabilities.” In its bankruptcy filing, SGI listed debt of $526 million.
Track 2C
insideHPC
Rumor: SGI breaks off NSF petaflops deal with Pittsburgh
07.14.2009
About a year ago, the National Science Foundation worked with PSC to prepare for a 1 PetaFlop system to be deployed there and integrated into the TeraGrid, a large global supercomputing network used for academic and public research. The result was an SGI UltraViolet system, approximately 197 cabinets, 100,000 cores, and all of it for the low price of $30 million dollars.
Well, that was with the old SGI. News now is that the new SGI has found other customers willing to pay higher “more reasonable” prices for these same cabinets, and has decided not to honor the original offer. Legally, they don’t have to honor them but it puts PSC and the NSF in a tight spot as they now have $30 million that’s supposed to magically turn into a 1PF supercomputer, and won’t.
Track 2C
NSB-09-87
October 9, 2009
MEMORANDUM TO MEMBERS AND CONSULTANTS OF THE NATIONAL SCIENCE BOARD
SUBJECT: Summary Report of the September 24, 2009 Meeting
Track-2C Update
Dr. Marrett presented an update regarding an information item that the Office of Cyber-infrastructure (OCI) provided to the Board in August 2009. OCI had discussed the status of the Track-2C award to the Pittsburgh Supercomputer Center (PSC). The funding had been authorized by the Board but not yet awarded due to vendor financial difficulties, timing and technical issues. Because of the concerns expressed by the Board and the fact that the awardee was unable to arrive at a set of satisfactory terms and conditions with their chosen vendor partner or with an alternative partner, OCI consulted with the Director and the Office of the General Counsel and decided not to make this award. Instead, OCI will issue a new solicitation in FY 2010.
Track 2D: Split into four parts
• Data-intensive, high-performance computing system
• Experimental high-performance computing system of innovative design
• Experimental, high-performance grid test-bed
• Pool of loosely coupled grid-computing resources.
Track 2D / Data
San Diego Supercomputer Center / UCSD
“Flash Gordon”
• Appro / Intel / ScaleMP
• 256TB Flash Memory, 200 TF Peak
• PI: Mike Norman
• $20,296,442
Track 2D / Experimental
Keeneland: National Institute for Experimental Computing
• Georgia Tech + University of Tennessee, NICS and Oak Ridge National Laboratory
• PI: Jeff Vetter, ORNL Future Technologies Group Leader, GATech Joint Professor
• $12M; Located at ORNL
• Initially HP + NVIDIA Tesla
• 2012: New technology, 2PF peak
Track 2D / Experimental Grid Test Bed
FutureGrid: Indiana University
Testbed to address complex research challenges in computer science related to the use and security of grids and clouds.
• A geographically distributed set of heterogeneous computing systems
• A data management system to hold both metadata and a growing library of software images
• A dedicated network allowing isolatable, secure experiments
• PI: Geoffrey Fox, $10,118,500
Track 2D: Loosely Coupled Grid-Computing Resources
• Not awarded
Track 1 Proposals: Rumors
State of California
• ~1M core IBM Blue Gene (not HPCS system)
• Sited at Livermore
PSC, et al.
• ?? (not HPCS system)
Oak Ridge National Laboratory
• Cray Cascade (not HPCS system?)
NCSA, et al.
• IBM PERCS (HPCS system)
Track 1: Blue Waters / NCSA
See Merle Giles
TeraGrid Phase III: eXtreme Digital Resources for Science and Engineering (XD)
High-Performance Computing and Storage Services: Four to six nodes, Track 2 and its successors
1. High-Performance Remote Visualization and Data Analysis Services – up to two
2. Technology Audit and Insertion Service
3. Advanced User Support Service
4. Training, Education and Outreach Service
5. Coordination and Management Service
XD (continued)
Coordination and Management Service
• Design XD grid architecture
• Manage its implementation
• Coordinate regular reporting of XD activities to NSF
• Manage accounting, authorization, authentication, allocation and security services
• Coordinate XD component services
• Maintain a responsive, user-centric operational posture for XD
• Coordinate service providers that provide access to physical resources to maintain an XD network that meets the needs of the user community.
XD Status
• Preliminary proposals reviewed
• More planning needed before full proposals
• Coordination and Management Services
• Advanced User Support Services
• Training, Education and Outreach Services
• Technology Audit and Insertion Services
• Full proposals June 2009
XD Remote Visualization and Data
• September 28, 2009
• AUSTIN, Texas: The National Science Foundation (NSF) has awarded a $7 million grant to the Texas Advanced Computing Center (TACC) at The University of Texas at Austin for a three-year project that will provide a new computing resource and the largest, most comprehensive suite of visualization and data analysis (VDA) services to the open science community.
• The new compute resource, "Longhorn," will provide unprecedented VDA capabilities and will enable the national and international science communities to interactively visualize and analyze datasets of near petabyte scale (a quadrillion bytes or 1,000 terabytes) for scientists to explore, gain insight and develop new knowledge.
XD Remote Visualization and Data - RDAV
• University of Tennessee: Center for Remote Data Analysis and Visualization
• with: Lawrence Berkeley National Laboratory, the University of Wisconsin, the National Center for Supercomputing Applications and Oak Ridge National Laboratory (ORNL)
• PI: Sean Ahern, ORNL Viz Task Leader, U of TN Associate Professor
• $10M, Located at ORNL
XD Remote Visualization and Data - RDAV
• SGI shared-memory system, “Nautilus”: 1,024 cores, 4,096 GB memory, and 16 graphics processing units (NVIDIA GT300 Fermi)
XD Technology Audit Service
• State University of New York, Buffalo
• PI: Tom Furlani
• $1,549,127 : First year
XD Technology Insertion Service
• NCSA, TACC, PSC, NICS
• PI: John Towns, NCSA
• CO-PIs:
• Jay Boisseau, TACC
• Ralph Roskies, PSC
• Phil Andrews, NICS
• $8.9M / five years
TeraGrid Extension
• To bridge to XD
• One year: $30M
• PI: Ian Foster, Argonne National Lab / University of Chicago
Dear Colleague Letter: March 12, 2010
• Current TG RPs (or a Track 2 resource which will be a TG provider in the near future)
• Proposals or supplements to upgrade or replace some of their older equipment
• Funds Available: approximately $9.0M in FY10
• No request over $3M will be supported
• Expect to make no more than three awards (subject to availability of funds)
• Able to deploy the resource by Jan 1, 2011
Dear Colleague Letter: March 12, 2010
• NICS: Kraken upgrade: additional 12 cabinets (144 TF): $2.85M
• NICS: Continue operation of Athena (XT4) for one year: $550,000
• TACC: 1888 Dell blades; two Westmere 6-core processors (302 TF): $2,864,065
• SDSC: Trestles: Appro, 324 nodes of 8-core Magny-Cours, 4 to a node, 53 TB flash (100 TF): $2.8M
• PSC: SGI UV, 512 8-core Nehalem-EX, 32 TB memory: $2.8M (partial funding)
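The hardware figures in these awards imply core counts and per-core peak rates that can be derived directly from the list. The following is a minimal sketch of that arithmetic; it uses only the numbers on the slide, the function and variable names are illustrative, the per-core values are implied quotients rather than published specifications, and 1 TB is assumed to mean 1,024 GB.

```python
def cores(units, sockets_per_unit, cores_per_socket):
    """Total core count for a system built from identical units."""
    return units * sockets_per_unit * cores_per_socket

def gflops_per_core(peak_tflops, n_cores):
    """Implied peak per core in GF, given a system peak in TF."""
    return peak_tflops * 1000.0 / n_cores

# TACC: 1888 blades, two 6-core processors per blade, 302 TF peak
tacc_cores = cores(1888, 2, 6)
# SDSC Trestles: 324 nodes, four 8-core processors per node, 100 TF peak
sdsc_cores = cores(324, 4, 8)
# PSC: 512 8-core processors sharing 32 TB of memory
psc_cores = cores(512, 1, 8)
psc_gb_per_core = 32 * 1024 / psc_cores  # assumes 1 TB = 1024 GB
# NICS Kraken upgrade: 12 cabinets adding 144 TF
kraken_tf_per_cabinet = 144 / 12

print("TACC cores:", tacc_cores)
print("TACC GF/core: %.2f" % gflops_per_core(302, tacc_cores))
print("SDSC cores:", sdsc_cores)
print("SDSC GF/core: %.2f" % gflops_per_core(100, sdsc_cores))
print("PSC cores:", psc_cores, "GB/core:", psc_gb_per_core)
print("Kraken TF per added cabinet:", kraken_tf_per_cabinet)
```

Under these assumptions the TACC system comes to 22,656 cores and Trestles to 10,368 cores.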
Other funding
• NCSA Ember
• PI: John Towns
• SGI UV: 1,536 cores, 8TB memory
• $3,232,158
What’s Next: Planning!
NSF Advisory Committee for Cyberinfrastructure - ACCI
Task Force Introduction
• Timeline: 12-18 months or less from June 2009
• Led by NSF Advisory Committee on Cyberinfrastructure
• Co-led by NSF PDs (OCI)
• Membership from community
• Include other agencies: DOE, EU, etc.
• Workshop(s)
• Program recommendations
• We then go back and develop programs
Task Force Leads
• Chair – Jim Bottum
• Consultant – Paul Messina
• NSF – Ed Seidel, Carmen Whitson, Jose Munoz
Task Forces & ACCI Leads
• Campus Bridging: Craig Stewart, Indiana University
• Software Infrastructure: David Keyes, Columbia University
• Data & Visualization: Shenda Baker, Harvey Mudd College
• HPC: Thomas Zacharia, U of Tennessee, ORNL
• Grand Challenge Communities: Tinsley Oden, U of Texas
• Learning & Workforce Development: Diana Oblinger, EDUCAUSE
THANKS TO ALAN BLATECKY, NSF OCI ACTING DIRECTOR
Status (As of October 8, 2010)
CI Review and Community Input
• 14 workshops and 5 BoFs have been conducted; 3 more workshops and 1 more BoF before Dec
• Over 1230 researchers involved to date
• Final reports are being published (completed 1st Qtr, 2011)
• Interim results and recommendations are being provided
• All Task Forces will report out at the Dec. 8-9 ACCI meeting in Arlington
HPC Preliminary Recommendations
1) By 2015–2016, academic researchers should have access to a rich mix of HPC systems that:
• Deliver sustained performance of 20–100 petaflops on a broad range of science and engineering codes
• Are integrated into a comprehensive, national cyberinfrastructure environment
• Are supported at national, regional, and/or campus levels
HPC Preliminary Recommendations
2) NSF should direct the evolution of its supercomputing program in a sustainable way.
(1) A stable, experienced pool of expertise in the management and operation of leading-edge resources
(2) Support services for researchers using these resources
(3) Support for the development and deployment of the new tools needed to make optimal use of these resources
HPC Preliminary Recommendations
• Commit to stable and sustained funding for HPC centers to allow them to recruit and develop the expertise needed to maximize the potential offered by NSF’s hardware investments.
• Encourage HPC centers to build long-term relationships with vendors, thus providing researchers with the benefits of a planned road map for several generations of chip technology upgrades and with continuity in architecture and software environments.
Het Einde (The End)
Jim Kasdorf
Kasdorf@psc.edu
www.psc.edu
Supplementary Slides
ACCI TASK FORCES Update
CASC
Jim Bottum
September 22, 2009
Coordination
TFs are functionally interdependent
TF leaders talk regularly with each other, NSF
Monthly conference calls with TF chairs, co-chairs, Paul M, NSF team
TF Chairs and ACCI members: please work with ADs! This is NSF wide!
Wiki site
• Public; anyone can contribute to this
NSF team will cycle through each TF
Joint workshops between TFs encouraged
Task Force Charges
Software Infrastructure Charge
Identify specific needs and opportunities across the spectrum of scientific software infrastructure
Design responsive approaches
Address issue of institutional barriers
Campus Bridging Charge
Identification of best practices for:
• General process of bridging to national infrastructure
• Interoperable identification and authentication
• Dissemination of and use of shared data collections
• Vetting and sharing definitive, open use educational materials
Suggest common elements of software stacks widely usable across nation/world to promote interoperability/economy of scale
Recommend policy documents that any research university should have in place
Identify solicitations to support this work
Data & Visualization Charge
Examine the increasing importance of data, its development cycle(s) and their integral relationships within exploration, discovery, research, engineering and educational aspects
Address the increasing interaction and interdependencies of data within the context of a range of computational capacities to catalyze the development of a system of science and engineering data collections that is open, extensive and evolvable
Emphasis will be toward identifying the requirements for digital data cyberinfrastructure that will enable significant progress in multiple fields of science and engineering and education – including visualization and inter-disciplinary research and cross-disciplinary education
HPC Charge
To provide specific advice on the broad portfolio of HPC investments that NSF could consider to best advance science and engineering over the next five to ten years.
Recommendations:
• Should be based on input from the research community and from experts in HPC technologies
• Should include hardware, software and human expertise
• Should encompass both infrastructure to support breakthrough research in science and engineering and research on the next generation of hardware, software and training
Grand Challenge Communities Charge
Which grand challenges require prediction and which do not?
What are the generic computational and social technologies that belong to OCI and are applicable to all grand challenges?
How can OCI make the software and other technical investments that are useful and cut across communities?
What are the required investments in data as well as institutional components needed for GCCs?
How can we help communities that do not yet know what they need or how to work together to work effectively (outreach)?
Grand Challenge Communities Charge (2)
How to conceive of and enable grand challenge communities that make use of cyberinfrastructure.
What type of CI is needed (hardware, networking, software, data, social science knowledge, etc.).
How to deal with the issues of data gathering and interoperability for both static and dynamic, real-time problems.
What open scientific issues transcend NSF Directorates?
Can we develop a more coherent architecture including data interoperability, a software environment people can build on, applications to be built on this environment, common institutional standards, etc.?
Learning & Workforce Development Charge
Foster the broad deployment and utilization of CI-enabled learning and research environments;
Support the development of new skills and professions needed for full realization of CI-enabled opportunities;
Promote broad participation of underserved groups, communities and institutions, both as creators and users of CI;
Stimulate new developments and continual improvements of CI-enabled learning and research environments;
Facilitate CI-enabled lifelong learning opportunities ranging from the enhancement of public understanding of science to meeting the needs of the workforce seeking continuing professional development;
Support programs that encourage faculty who exemplify the role of teacher-scholars through outstanding research, excellent education and the integration of education and research in computational science and computational science curriculum development;
Support the development of programs that connect K-12 students and educators with the types of computational thinking and computational tools that are being facilitated by cyberinfrastructure.
Status
Task force charges and membership reviewed at June ACCI meeting
NSF staff leads assigned to each TF (staffing still ramping up over summer)
Workshops held or being planned
GCC and Software Infrastructure TFs drafting a recommendation regarding CS&E program