
DOCUMENT RESUME

ED 385 244 IR 017 257

TITLE From Desktop to Teraflop: Exploiting the U.S. Lead in High Performance Computing. NSF Blue Ribbon Panel on High Performance Computing.

INSTITUTION National Science Foundation, Washington, D.C.
REPORT NO NSB-93-205
PUB DATE Aug 93
NOTE 65p.

PUB TYPE Reports Evaluative/Feasibility (142)

EDRS PRICE MF01/PC03 Plus Postage.

DESCRIPTORS Computers; *Computer Science; Economic Progress; *Engineering; Futures (of Society); Investment; Research and Development; *Technological Advancement

IDENTIFIERS Barriers to Change; *High Performance Computing; National Science Board; National Science Foundation; Supercomputers

ABSTRACT

This report addresses an opportunity to accelerate progress in virtually every branch of science and engineering concurrently, while also boosting the American economy as business firms also learn to exploit these new capabilities. The successful rapid advancement in both science and technology creates its own challenges, four of which are outlined here for the National Science Board. Four sets of interdependent recommendations are made in response to the challenges. The first implements a balanced pyramid of computing environments. Each element in the pyramid supports the others; whatever resources are applied to the whole, the balance in the pyramid should be sustained. The second set addresses the essential research investments and other steps to remove the obstacles to realizing the technologies in the pyramid and the barriers to the effective use of these environments. The third set addresses the institutional structure for delivery of the HPC capabilities, and consists itself of a pyramid. At the base of the institutional pyramid is the diverse array of investigators in their universities and other settings, who use all the facilities at all levels of the pyramid, followed by departments and research groups devoted to specific areas of computer science and engineering, and the National Science Foundation (NSF) high performance computing (HPC) Centers. At the apex is the national teraflop-class facility, which is recommended as a multi-agency facility pushing the frontiers of high performance into the next decade. A final recommendation addresses the NSF role at the national level and its relationship with the states in HPC. Concepts are illustrated with two figures and two tables. Appendices include: a list of the membership of the Blue Ribbon Panel on High Performance Computing; information on the history and origin of this study on the NSF and HPC; a discussion of technology trends and barriers to further progress; four figures illustrating supercomputer data; and a review and prospectus of computational and computer science and engineering with personal statements by panel members. (MAS)


EXPLOITING THE U.S. LEAD

HIGH PERFORMANCE COMPUTING

NSF Blue Ribbon Panel on

High Performance Computing


August 1993



Dedication

This report is dedicated to one of the nation's most distinguished computer scientists, a builder of important academic institutions, and a devoted and effective public servant: Professor Nico Habermann. Dr. Habermann took responsibility for organizing this Panel's work and saw it through to completion, but passed away just a few days before it was presented to the National Science Board. The members of the panel deeply feel the loss of his creativity, wisdom, and friendship.


FROM

DESKTOP

TO TERAFLOP:

EXPLOITING THE U.S. LEAD

IN

HIGH PERFORMANCE COMPUTING

NSF Blue Ribbon Panel on High Performance Computing

August 1993

Lewis Branscomb (Chairman)
Theodore Belytschko
Peter Bridenbaugh
Teresa Chay
Jeff Dozier
Gary S. Grest
Edward F. Hayes
Barry Honig
Neal Lane (resigned from Panel, July 1993)
William Lester, Jr.
Gregory J. McRae
James A. Sethian
Burton Smith
Mary Vernon

"It is easier to invent the future than to predict it" - Alan Kay


EXECUTIVE SUMMARY

An Introductory Remark: Many reports are prepared for the National Science Board and the National Science Foundation that make an eloquent case for more resources for one discipline or another. This is not such a report. This report addresses an opportunity to accelerate progress in virtually every branch of science and engineering concurrently, while also giving a shot in the arm to the entire American economy as business firms also learn to exploit these new capabilities. The way much of science and engineering is practiced will be transformed, if our recommendations are implemented.

The National Science Board can take pride in the Foundation's accomplishments in the decade since it implemented the recommendations of the Peter Lax Report on high performance computing (HPC). The Foundation's High Performance Computing Centers continue to play a central role in this successful strategy, creating an enthusiastic and demanding set of sophisticated users, who have acquired the specialized computational skills required to use the fast advancing but still immature high performance computing technology.

Stimulated by this growing user community, the HPC industry finds itself in a state of excitement and transition. The very success of the NSF program, together with those of sister agencies, has given rise to a growing variety of new experimental computing environments, from massively parallel systems to networks of coupled workstations, that could, with the right research investments, produce entirely new levels of computing power, economy, and usability. The U.S. enjoys a substantial lead in computational science and in the emerging technology; it is urgent that the NSF capitalize on this lead, which not only offers scientific preeminence but also the industrial lead in a growing world market.

The vision of the rapid advances in both science and technology that the new generation of supercomputers could make possible has been shown to be realistic. This very success, measured in terms of new discoveries, the thousands of researchers and engineers who have gained experience in HPC, and the extraordinary technical progress in realizing new computing environments, creates its own challenges. We invite the Board to consider four such challenges:

Challenge 1: How can NSF, as the nation's premier agency funding basic research, remove existing barriers to the rapid evolution of high performance computing, making it truly usable by all the nation's scientists and engineers? These barriers are of two kinds: technological barriers (primarily to realizing the promise of highly parallel machines, workstations, and networks) and implementation barriers (new mathematical methods and new ways to formulate science and engineering problems for efficient and effective computation). An aggressive commitment by NSF to leadership in research and prototype development, in both computer science and in computational science, will be required.

Challenge 2: How can NSF provide scalable access to a pyramid of computing resources, from the high performance workstations needed by most scientists to the critically needed teraflop-and-beyond capability required for solving Grand Challenge problems? What balance among high performance desktop workstations, mid-range or mini-supercomputers, networks of workstations, and remote, shared supercomputers of very high performance should NSF anticipate and encourage?

Challenge 3: The third challenge is to encourage the continued broadening of the base of participation in HPC, both in terms of institutions and in terms of skill levels and disciplines. This calls for expanded education and training, and participation by state-based and other HPC institutions.

Challenge 4: How can NSF best create the intellectual and management leadership for the future of high performance computing in the U.S.? What role should NSF play within the scope of the nationally coordinated HPCC program? What relationships should NSF's activities in HPC have to the activities of other federal agencies?

This report recommends significant expansion in NSF investments, both in accelerating progress in high performance computing through computer and computational science research and in providing the balanced pyramid of computing facilities to the science and engineering communities. The cost estimates are only approximate, but in total they do not exceed the Administration's stated intent to double the investments in HPCC during the next 5 years. We believe these investments are not only justified but are compatible with stated national plans, both in absolute amount and in their distribution.

RECOMMENDATIONS:

We have four sets of interdependent recommendations. The first implements a balanced pyramid of computing environments (see Figure A following this Summary). Each element in the pyramid supports the others; whatever resources are applied to the whole, the balance in the pyramid should be sustained. The second set addresses the essential research investments and other steps to remove the obstacles to realizing the technologies in the pyramid and the barriers to the effective use of these environments.

The third set addresses the institutional structure for delivery of HPC capabilities, and consists itself of a pyramid (see Figure B following this Summary), of which the NSF Centers are an important part. At the base of the institutional pyramid is the diverse array of investigators in their universities and other settings, who use all the facilities at all levels of the pyramid. At the next level are departments and research groups devoted to specific areas of computer science or computational science and engineering. At the next level are the NSF HPC Centers, which must continue to be providers of shared high capability computing systems and to provide aggregations of specialized capability for all aspects of use and advance of high performance computing. At the apex is the national teraflop-class facility, which we recommend as a multi-agency facility pushing the frontiers of high performance into the next decade.

A final recommendation addresses the NSF's role at the national level and its relationship with the states in HPC.

A. CENTRAL GOAL FOR NSF HPC POLICY

Recommendation A-1: The National Science Board should take the lead, under OSTP guidance and in collaboration with ARPA, DoE and other agencies, to expand access to all levels of the dynamically evolving pyramid of high performance computing capability for all sectors of the whole nation. The realization of this pyramid depends, of course, on rapid progress in the pyramid's technologies. The computational capability we envision includes not only the research capability for which NSF has special stewardship, but also includes a rapid expansion of capability in business and industry to use HPC profitably, and many operational uses of HPC in commercial and military activities.

VISION OF THE HPC PYRAMID

Recommendation A-2: At the apex of the pyramid is the need for a national capability at the highest level of computing power the industry can support with both efficient software and hardware. A reasonable goal would be the design, development, and realization of a national teraflop-class capability, subject to the successful development of software and computational tools for such a large machine (Recommendation B-1). NSF should initiate, through OSTP, an interagency plan to make this investment, anticipating multi-agency funding and usage.

Recommendation A-3: Over a period of 5 years the research universities should be assisted to acquire mid-range machines. These mid-sized machines are the underfunded element of the pyramid today; about 10% of NSF's FY92 HPC budget is devoted to their acquisition. They are needed both for demanding science and engineering problems that do not require the very maximum in computing capacity, and for use by the computer science and computational mathematics community in addressing the architectural, software, and algorithmic issues that are the primary barriers to progress with massively parallel processor architectures.

Recommendation A-4: We recommend that NSF double the current annual level of investment ($22 million) in providing scientific and engineering workstations to its 20,000 principal investigators. Within 4 or 5 years, workstations delivering up to 400 megaflops and costing no more than $15,000 to $20,000 should be widely available. For education and a large fraction of the computational needs of science and engineering, these facilities will be adequate.

Recommendation A-5: We recommend that the NSF expand its New Technologies program to support expanded testing of the new parallel configurations for HPC applications. For example, the use of Gigabit local area networks to link workstations may meet a significant segment of mid-range HPC science and engineering applications. A significant supplement to HPC applications research capacity can be had with minimal additional cost if such collections of workstations prove practical and efficient.

B. RECOMMENDATIONS TO IMPLEMENT THESE GOALS

REMOVING BARRIERS TO HPC TECHNICAL PROGRESS AND HPC USAGE

Recommendation B-1: To accelerate progress in developing the HPC technology needed by users, NSF should create, in the Directorate for Computer and Information Science and Engineering, a challenge program in computer science with grant size and equipment access sufficient to support the systems and algorithm research needed for more rapid progress in HPC capability. The Centers, in collaboration with hardware and software vendors, can provide test platforms for much of this work, and Recommendation A-3 provides the hardware support required for initial development of prototypes.

Recommendation B-2: A significant barrier to rapid progress in HPC application lies in the formulation of the computational strategy for solving a scientific or engineering problem. In response to Challenge 1, the NSF should focus attention, both through CISE and through its disciplinary program offices, on support for the design and development of computational techniques, algorithmic methodology, and mathematical, physical and engineering models to make efficient use of the machines.

BALANCING THE PYRAMID OF HPC ACCESS

Recommendation B-3: We recommend NSF set up a task force to develop a way to ameliorate the imbalance in the HPC "pyramid": the under-investment in the emerging mid-range scalable, parallel computers and the inequality of access to stand-alone (but potentially networked) workstations in the disciplines. This implementation plan should involve a combination of funding by disciplinary program offices and some form of more centralized allocation of NSF resources.

C. THE NSF HPC CENTERS

Recommendation C-1: The Centers should be retained and their missions should be reaffirmed. However, the NSF HPC effort now embraces a variety of institutions and programs (HPC Centers, Engineering Research Centers, and Science & Technology Centers devoted to HPC research, and disciplinary investments in computer and computational science and applied mathematics), all of which are essential elements of the HPC effort needed for the next decade. Furthermore, HPC institutions outside the NSF orbit also contribute to the goals for which the NSF Centers are chartered. Thus we ask the Board to recognize that the overall structure of the HPC program at NSF will have more institutional diversity, more flexibility, and more interdependence with other agencies and private institutions than was possible in the early years of the HPC initiative.

The NSF should continue its current practice of encouraging HPC Center collaboration, both with one another and with other entities engaged in HPC work. The division of the support budget into one component committed to the centers and another for multi-center activities is a useful management tool, even though it may have the effect of reducing competition among centers. The National Consortium for HPC (NCHPC), formed by NSF and ARPA, is a welcome measure as well.

Recommendation C-2: The current situation in HPC is more exciting, more turbulent, and more filled with promise of really big benefits to the nation than at any time since the Lax report; this is not the time to "sunset" a successful, changing venture, of which the Centers remain an important part. Furthermore, we also recommend against re-competition of the four Centers at this time, favoring periodic performance evaluation and competition for some elements of their activities, both among Centers and, when appropriate, with other HPC Centers such as those operated by states (see Recommendation D-1).

Recommendation C-3: The mission of the Centers is to foster rapid progress in the use of HPC by scientists and engineers, to accelerate progress in usability and economy of HPC, and to diffuse HPC capability throughout the technical community, including industry. Provision to scientists and engineers of access to leading edge supercomputer resources will continue to be a primary purpose of the Centers. The following additional components of the Center missions should be affirmed:

Supporting computational science, by research and demonstration in the solution of significant science and engineering problems.

Fostering interdisciplinary collaboration across sciences and between sciences and computational science and computer science, as in the Grand Challenge programs.


Prototyping and evaluating software, new architectures, and the uses of high speed data communications in collaboration with: computer and computational scientists, disciplinary scientists exploiting HPC resources, the HPC industry, and business firms exploring expanded use of HPC.

Training and education, from post-docs and faculty specialists, to introduction of less experienced researchers to HPC methods, to collaboration with state and regional HPC centers working with high schools and community colleges.

ALLOCATION OF CENTER HPC RESOURCES TO INVESTIGATORS

Recommendation C-4: The NSF should continue to monitor the administrative procedures used to allocate Center resources, and the relationship of this process to the initial funding of the research by the disciplinary program offices, to ensure that the burden on scientists applying for research support is minimized. NSF should continue to provide HPC resources to the research community through allocation committees that competitively evaluate proposals for use of Center resources.

EDUCATION AND TRAINING

Recommendation C-5: The NSF should give strong emphasis to its education mission in HPC, and should actively seek collaboration with state-sponsored and other HPC centers not supported primarily on NSF funding. Supercomputing regional affiliates should be candidates for NSF support, with education as a key role. HPC will also figure in the Administration's industrial extension program, in which the states have the primary operational role.


D. NSF AND THE NATIONAL HPC EFFORT; RELATIONSHIPS WITH THE STATES

Recommendation D-1: We recommend that NSF urge OSTP to establish an advisory committee representing the states, HPC users, NSF Centers, computer manufacturers, and computer and computational scientists (similar to the Federal Networking Council's Advisory Committee), which should report to HPCCIT. A particularly important role for this body would be to facilitate state-federal planning related to high performance computing.


Figure A. PYRAMID OF HIGH PERFORMANCE COMPUTING ENVIRONMENTS
Levels, from apex to base: Teraflop class; Center supercomputers; mid-range parallel processors and networked workstations; high performance workstations.

Figure B. PYRAMID OF HIGH PERFORMANCE COMPUTING INSTITUTIONS
Levels, from apex to base: National Teraflop facility; NSF HPC Centers and other agency and State Centers; departments, institutes, laboratories; subject-specific, computer science, and computational science and engineering groups; individual investigators and small groups.


INTRODUCTION AND BACKGROUND

A revolution is underway in the practice of science and engineering, arising from advances in computational science and new models for scientific phenomena, and made possible by advances in computer science and technology. The importance of this revolution is not yet fully appreciated because of the limited fraction of the technical community that has developed the skills required and has access to high performance computational resources. These skill and access barriers can be dramatically lowered, and if they are, a new level of creativity and progress in science and engineering may be realized which will be quite different from that known in the past. This report is about that opportunity for all of science and engineering; it is not about the needs of one or two specialized disciplines.

A little over a decade ago, the National Science Board convened a panel chaired by Prof. Peter Lax to explore what NSF should do to exploit the potential for science and industry of the rapid advances in high performance computing.1 The actions taken by the NSF with the encouragement of the Board to implement the "Large Scale Computing in Science and Engineering" Report of 1982 have helped computing foster a revolution in science and engineering research and practice, in academic institutions and to a lesser extent in industrial applications. At the time, centralized facilities were the only way to provide access to high performance computing, which compelled the Lax panel to recommend the establishment of NSF Supercomputer Centers interconnected by a high speed network. The new revolution is characterized both by advances in the power of supercomputers and by the diffusion throughout the nation of access to and experience with using high performance computing.2 This success has opened up a vast set of new research and applications problems amenable to solution through high levels of computational power and better computational tools.

1 Report of the Panel on Large Scale Computing in Science and Engineering, Peter Lax, chairman, commissioned by the National Science Board in cooperation with the U.S. Department of Defense, the Department of Energy, and the National Aeronautics and Space Administration, December 26, 1982.

The key features of the new capabilities include:

The power of the big, multiprocessing vector supercomputers, today's workhorse of supercomputing, has increased by a factor of 100 to 200 since the Lax Report.3

An exciting array of massively parallel processors (MPP) has appeared on the market, offering three possibilities: an acceleration in the rate of advance of peak processing power, an improvement in the ratio of performance to cost, and the option to grow the power of an installation incrementally as the need arises.4

Switched networks based on high speed digital communications are extending access to major computational facilities, permitting the dynamic redeployment of computing power to suit the users' needs, and improving connectivity among collaborating users.

2 With every new generation of computing machines, the capability associated with "high performance computing" changes. High performance computing (HPC) may be defined as "a computation and communications capability that allows individuals and groups to extend their ability to solve research, design, and modelling problems substantially beyond that available to them before." This definition recognizes that HPC is a relative and changing concept. For the PC user, a scientific workstation is high performance computing. For the technical people with specialized skill in computational science and access to high performance facilities, a reasonable level for 1992-1993 might be 1 Gflop for a vector machine and 2 Gflops for an MPP system.

3 As noted in Appendix C, the clock speed of a single vector processor has only increased by a factor of 5 to 6 since 1976, but a 16-way Cray C-90 with one additional vector pipe multiplies the effective speed by an estimated factor of a hundred or more.

4 The promise (not yet realized) of massively parallel systems is a much higher degree of installed-capacity expandability with minimal disruption to the user's programming.


Technical progress in computer science and microelectronics has transformed yesterday's supercomputers into today's emerging desktop workstations. These workstations offer more flexible tradeoffs between ease of access and inherent computing power and can be coupled to the largest supercomputers over a national network, used in locally-networked clusters, or as stand-alone processors.

Advances in computational mathematics, algorithmic modeling, and software, along with new computer architectures, are solving some of the most intractable but important scientific, technical, and economic problems facing our society.

To address these changes, the National Science Board charged this panel with taking a fresh look at the current situation and new directions that might be required. (See Appendix A for institutional identification of the panel membership and Appendix B for historical background leading to the present study and the Charge to the Panel.)

To provide both direction and potential to exploit these advances, a leadership role for the NSF continues to be required. The goal of this report is to suggest how NSF should evolve its role in high performance computing. Our belief that NSF can and should continue to exert influence in these fields is based in part on its past successes achieved through the NSF Program in High Performance Computing and Communications.

Achievements Since the Lax Report

In the past 10 years, the NSF Program in High Performance Computing and Communications has:

Facilitated many new scientific discoveries and new industrial processes.

In Appendix E of this report several panel members describe examples of those accomplishments and suggest their personal visions for what may be even more dramatic progress in the future.

Supported fundamental work in computer science and engineering which has led to advances in architectures, tools, and algorithms for computational science.

Initiated collaborations with many companies to help them realize the economic and technological benefits of high performance computing.

Caterpillar Inc. uses supercomputing to model diesel engines in an attempt to reduce emissions.

Dow Chemical Company simulates and visualizes fluid flow in chemical processes to ensure complete mixing.

USX has turned to supercomputing to improve the hot rolling process-control systems used in steel manufacturing.

Solar Turbine, Inc. applies computational finite-element methods to the design of very complex mechanical systems.

Opened up supercomputer access to a wide range of researchers and industrial scientists and engineers.

This was one of the key recommendations of the Lax Report. The establishment of the four NSF Supercomputer Centers (in addition to NCAR) has been extraordinarily successful. By providing network access, through the NSFNET and Internet linkages, NSF has put these computing resources at the fingertips of scientists, engineers, mathematicians and other professionals all over the nation. Users seldom need to go personally to these Centers; in fact, the distribution of computational cycles by the four NSF Supercomputer Centers shows surprisingly little geographic bias. This extension of compute power, away from dedicated, on-site facilities and towards a seamless national computing environment, has been instrumental in creating the conditions required for advances on a broad front in science, engineering, and the tools of computational science. There seems to be a lack of geographic bias in users; Figure 1 in Appendix D shows users widely distributed across the United States.

Educated literally thousands of scientists, engineers and students, as well as a new generation of researchers who now use computational science equally with theory and experiment.

At the time of the Lax Report, access to the most advanced facilities was restricted to a relatively small set of users. Furthermore, supercomputing was regarded by many scientists as either an inaccessible tool or as an inelegantly brute force approach to science. The NSF program successfully inoculated virtually all of the disciplines with the realization that HPC is both a powerful and a practical tool for many purposes. These NSF initiatives have not only pushed the technology and computational science ahead in sophistication and power, they have helped bring high performance computing to a large fraction of the technical community. There has been a 5-fold increase in the number of NSF funded scientists using HPC and a 5-fold increase in the ratio of graduate students to faculty using HPC through the NSF Supercomputer Centers. (See Figure 2 of Appendix D.)

Provided the HPC industry a committed, enthusiastic, and dedicated class of expert users who share their experience and ideas with vendors, accelerating the evolutionary improvement in the technology and its software.

One of the problems in the migration of new technologies from experimental environments to production modes is the inherent risk in committing substantial resources towards converting existing codes and developing software tools. The NSF Supercomputer Centers have provided a proving ground for these new technologies; various industrial players have entered into partnerships with the Centers aimed at accelerating this migration while maintaining solid and reliable underpinnings.

Encouraged the Supercomputer Centers to leverage their relationship with HPC producers to reduce the cost of bringing innovation to the scientific and engineering communities.

In recognition of Center activity in improving early versions of hardware and software for high performance computing systems, the computer industry has provided equipment at favorable prices and important technical support. This has allowed researchers earlier and more useful access to HPC facilities than might have been the case under commercial terms.

Joined into successful partnerships with other agencies to make coordinated contributions to the U.S. capability in HPC.

A decade ago the United States enjoyed a world-wide commercial lead in vector systems. In part as the result of more recent development and procurement actions of the Advanced Research Projects Agency, the Department of Energy, and the National Science Foundation, the U.S. now has the dominant lead in providing new Massively Parallel Processing (MPP) systems.5 As an example, the NSF has enabled NSF Supercomputer Center acquisitions of scalable parallel systems first developed under seed money provided by ARPA, and thus has been instrumental in leveraging ARPA projects into the mainstream.6 (Figure 3 of Appendix D shows data on the uptake of advanced computing by sector across the world.)

5 Massively parallel computers are constructed from large numbers of separate processors linked by high speed communications providing access to each other and to shared I/O devices and/or computer memory. There are many different architectural forms of MPP machines, but they have in common economies of scale from the use of microprocessors produced at high volumes and the ability to combine them at many levels of aggregation. The challenge in using such machines is to formulate the problem so that it can be decomposed and run efficiently on most or all of the processors concurrently. Some scientific problems lend themselves to parallel computation much more easily than others, suggesting that the improved utility of MPP machines will not be realized in all fields of science at once.
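To make the decomposition described in footnote 5 concrete, here is a minimal sketch, not drawn from the report, of a trivially parallel computation (summing a function over a large index range) spread across the processors of a message-passing machine. It uses the MPI message-passing interface, which was only being standardized as this report appeared, purely as a familiar stand-in for the message-passing systems of the period; the function f and the problem size N are hypothetical.

    /* Illustrative sketch only: each processor evaluates an interleaved
     * slice of the index range and a single reduction combines the
     * partial sums.  Compile with an MPI C compiler, e.g. mpicc. */
    #include <stdio.h>
    #include <mpi.h>

    static double f(long i)               /* hypothetical per-point work */
    {
        double x = (double) i;
        return 1.0 / (1.0 + x * x);
    }

    int main(int argc, char **argv)
    {
        const long N = 100000000L;        /* hypothetical total problem size */
        int rank, nprocs;
        double local = 0.0, global = 0.0;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

        /* Decomposition: processor "rank" takes indices rank, rank+nprocs, ... */
        for (long i = rank; i < N; i += nprocs)
            local += f(i);

        /* The only communication: combine the partial sums on processor 0. */
        MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

        if (rank == 0)
            printf("sum = %.12f (computed on %d processors)\n", global, nprocs);

        MPI_Finalize();
        return 0;
    }

The pattern generalizes: the harder scientific cases footnote 5 alludes to are those in which the partial results are not independent, so that communication rather than arithmetic dominates the running time.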

The Lax Report

All of these accomplishments have, in large part, arisen from the response by NSF to the recommendations of the 1982 Lax report, "Large Scale Computing in Science and Engineering". These recommendations included:

Increase access to regularly upgraded supercomputing facilities via high bandwidth networks.

Increase research in computational mathematics, software, and algorithms.

Train people in scientific computing.

Invest in research on new supercomputer systems.

For several reasons, NSF's investment in computational research and training has been a startling success. First, there has been a widespread acceptance of computational science as a vital component and tool in scientific and technological understanding. Second, there have been revolutionary advances in computing technology in the past decade. And third, the demonstrated ability to solve key critical problems has advanced the progress of mathematics, science and engineering in many important ways, and has created great demand for additional HPC resources.

The New Opportunities in Science and in Industry

As discussed in detail in the Appendix E essays, the prospects are for dramatic progress in science and engineering and for rapid adoption of computational science in industry. The next major HPC revolution may well be in industry, which is still seriously under-utilizing HPC (with some exceptions such as aerospace, automotive, and microelectronics).

6 Scalable parallel machines are those in which the number of processor nodes can be expanded over a wide range without substantial changes in either the shared hardware or the application interfaces of the operating system.

The success of the chemical industry in designing and simulating pilot plants, of the aircraft industry in simulating wind tunnels and performing dynamic design evaluation, and of the electronics industry in designing integrated circuits and modelling the performance of computers and networks suggests the scale of available opportunities. The most important requirements are (a) improving the usability and efficiency problems of high performance machines, and (b) training in HPC for people going into industry. The Supercomputer Centers have demonstrated they can introduce the commercial sector to HPC at little cost, and with high potential benefits to the economy (productivity of industry and stimulation of markets for U.S. HPC vendors). Success in stimulating HPC usage in industry will also accelerate the need for HPC education and technology, thus exploiting the benefits of collaboration with universities and vendors. The Centers' role can be a catalytic one, but often rises to the level of a true collaborative partnership with industry, to the mutual advantage of the firm and the NSF Centers. As industrial uses of HPC grow, the scientists, mathematicians, and engineers benefit from the falling costs and rising usability of the new equipment. In addition, the technological uses of HPC spur new and interesting problems in science. The following chart indicates the increasing importance of advanced computing in industry.

Cray Research Inc. supercomputer sales

  Era           Percent to government   Percent to industry   Percent to universities
  Early 1980s            70                      25                       5
  Late 1980s             60                      25                      15
  Today                  40                      40                      20


The New Technology

Most HPC production work being done today uses big vector machines in single processor (or loosely coupled multiprocessor) mode. Vectorizing Fortran compilers and other software tools are well tested and many people have been trained in their use. These big shared memory machines will continue to be the mainstay of high performance computing, at least for the next 5 years or so, and perhaps beyond if the promise of massively parallel supercomputing is delayed longer than many expect.
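As a minimal illustration (not from the report) of what a vectorizing compiler exploits, consider these two loops, written here in C for brevity: the first has independent iterations and maps directly onto vector hardware, while the second carries a dependence from one iteration to the next and does not.

    /* Independent iterations: a vectorizing (or parallelizing) compiler
     * can issue this loop as vector operations. */
    void scale_add(int n, double a, const double *x, double *y)
    {
        for (int i = 0; i < n; i++)
            y[i] = y[i] + a * x[i];
    }

    /* Loop-carried dependence: each iteration needs the previous result,
     * so the loop resists vectorization without algorithmic restructuring. */
    void prefix_sum(int n, double *x)
    {
        for (int i = 1; i < n; i++)
            x[i] = x[i] + x[i - 1];
    }

Much of the algorithmic research the report calls for amounts to recasting computations of the second kind into forms closer to the first.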

New desktop computers have made extraordinary gains in cost-performance (driven by competitive commodity microprocessor production). Justin Rattner of Intel estimated that in 1996 microprocessors with clock speeds of 200 MHz may power an 800 Mflops peak speed workstation! He, and others from the industry, predicted the convergence of the clock speeds of microprocessor chips and the large vector machines such as the Cray C90, perhaps as soon as 1995. They held out the likelihood that in 1997 microprocessors may be available at 1 gigaflop; a desktop PC might be available with this speed for $10,000 or less. Mid-range workstations will also show great growth in capacity; today one can purchase a mid-range workstation with a clock speed of 200 MHz for an entry price of $40,000 to $50,000.

Thus a technical transition is underway from the world in which uniprocessor supercomputers were distinguished from desktop machines by having much faster cycle times, to a world in which cycle times converge and the highest levels of computer power will be delivered through parallelism, memory size and bandwidth, and I/O speed. The widespread availability of scientific workstations will accelerate the introduction of more scientists and engineers to high performance computing, resulting in a further acceleration of the need for higher performance machines. Early exploration of message-passing distributed operating systems gives promise of loosely-coupled arrays of workstations being used to process large problems in the background and when the workstations are unused at night, as well as coupling the workstations (on which problems are initially designed and tested) to the supercomputers located at remote facilities.

7 The instruction execution speeds of scientific computers are generally reckoned in the number of floating point instructions that can be executed in one second. Thus a 1 Megaflop machine executes 1 million floating point instructions per second, a Gigaflop machine one billion per second, and a Teraflop machine 10^12 floating point instructions per second. Since different computer architectures may have quite different instruction sets, one "flop" may not be the same as another, either in application power or in the number of machine cycles required. To avoid such difficulties, those who want to compare machines of different architecture generally use a benchmark suite of test cases to measure overall performance on each machine.
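To put the units in footnote 7 to work (an illustrative calculation, not taken from the report): a computation requiring 10^15 floating point operations on a machine sustaining teraflop speed takes

    \[
      t = \frac{10^{15}\ \mathrm{flop}}{10^{12}\ \mathrm{flop/s}} = 10^{3}\ \mathrm{s} \approx 17\ \mathrm{minutes},
    \]

whereas at a sustained rate of only a tenth of that peak, the sort of derating noted in footnote 9 below, the same job takes about 10^4 seconds, or roughly three hours. This gap is one reason the report repeatedly distinguishes peak from sustained performance.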

Of course, the faster microprocessors also make possible new MPP machines of ever increasing peak processing speed. MPP is catching on fast, as researchers with sufficient expertise (and diligence) in computational science are solving a growing number of applications that lend themselves to highly parallel architectures. In some cases those investigators are realizing a ratio of sustained to peak performance approaching that achieved by vector machines, with significant cost-performance advantages. Efficient use of MPP on the broad range of scientific and engineering problems is still beyond the reach of most investigators, however, because of the expertise and effort required. Thus the first speculative phase of MPP HPC is coming to an end, but its ultimate potential is still uncertain and largely unrealized.
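For readers who want to quantify the preceding paragraph, the standard measures (not stated in the report itself) are the speedup and efficiency obtained on p processors, and the bound imposed by whatever fraction s of the work remains serial:

    \[
      S(p) = \frac{T_1}{T_p}, \qquad E(p) = \frac{S(p)}{p}, \qquad
      S(p) \le \frac{1}{s + (1 - s)/p} \le \frac{1}{s}.
    \]

Even a serial fraction of one percent caps the speedup near 100 no matter how many processors are added, which is one reason the report stresses reformulating algorithms and programming models rather than simply buying larger machines.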

Limiting progress in all three of these technologies is a set of architecture and software issues that are discussed below in Recommendations B. Principal among them is the evolution of a programming model that can allow portability of applications software across architectures.

These technical issues are discussed at greater length in Appendix C.


FOUR CHALLENGES FOR NSF

High performance computing is changing very fast, and NSF policy must chase a moving target. For that reason, the strategy adopted must be agile and flexible in order to capitalize on past investments and adapt to the emerging opportunities. The Board and the Foundation face four central challenges, on which we will make specific recommendations for policy and action. These challenges are:

Removing barriers to the rapid evolution of HPC capability

Providing scalable access to all levels of HPC capability

Finding the right incentives to promote access to all three levels of the computational power pyramid

Creating NSF's intellectual and management leadership for the future of high performance computing in the U.S.

CHALLENGE NO. 1: Removing barriers to the rapid evolution of HPC capability

How can NSF, as the nation's premier agency funding basic research, remove existing barriers to the rapid evolution of High Performance Computing? These barriers are of two kinds: technological barriers (primarily to realizing the promise of highly parallel machines, workstations, and networks) and exploitation barriers (new mathematical methods and new ways to formulate science and engineering problems for efficient and effective computation). An aggressive commitment by NSF to leadership in research and prototype development, in both computer science and computational science, will be required. Indeed, NSF's position as the leading provider of HPC capability to the nation's scientists and engineers will be strengthened if it commands a leadership role in technical advances in both areas, which will contribute to the nation's economic position as well as its position as a world leader in research.


Computer Science and Engineering. The first challenge is to accelerate the development of the technology underlying high performance computing. Among the largest barriers to effective use of the emerging HPC technologies are the need for parallel architectures from which it is easy to extract peak performance; for system software (operating systems, databases of massive size, compilers, and programming models) to take advantage of these architectures and provide portability of end-user applications; for parallel algorithms; and for advances in visualization techniques to aid in the interpretation of results. The technical barriers to progress are discussed in Appendix C. What steps will most effectively reduce these barriers?

Computational Tools for Advancing Science and Engineering. Research in the development of computational models, the design of algorithmic techniques, and their accompanying mathematical and numerical analysis is required in order to ensure the continued evolution of efficient and accurate computational algorithms designed to make optimal use of these emerging technologies.

In the past ten years, exciting developments in computer architectures, hardware and software have come in tandem with stunning breakthroughs in computational techniques, mathematical analysis, and scientific models. For example, the potential of parallel machines has been realized in part through new versions of numerical linear algebra routines and multigrid techniques; rethinking and reformulating algorithms for computational physics within the domain of parallel machines has posed significant and challenging research questions. Advances in such areas as N-body solvers, fast special function techniques, wavelets, high resolution fluid solvers, adaptive mesh techniques, and approximation theory have generated highly sophisticated algorithms to handle complex problems. At the same time, important theoretical advances in the modelling of underlying physical and engineering problems have led to new, efficient and accurate discretization techniques. Indeed, in the evolution to scalable computing across a range of levels, designing appropriate numerical and computational techniques is of paramount importance. The challenge facing NSF is to weave together existing work in these areas, as well as fostering new bridges between pure, applied and computational techniques, engaging the talents of disciplinary scientists, engineers, and mathematicians.

CHALLENGE NO. 2: Providing scalable access to all levels of HPC capability

How can NSF provide scalable access to computing resources, from the high performance workstations needed by most scientists to the critically needed teraflop-and-beyond capability required for solving Grand Challenge problems?8 What balance should NSF anticipate and encourage among high performance desktop workstations, mid-range or mini-supercomputers, networks of workstations, and remote, shared supercomputers of very high performance?

Flexible strategy. NSF must ensure that adequate additional computational capacity is available to a steadily growing user community to solve the next generation of more complex science and engineering problems. A flexible and responsive strategy that can support the large number of evolving options for HPC and can adapt to the outcomes of major current development efforts (for example in MPP systems and in networked workstations) is required.

A pyramid of computational capability. There will continue to be an available spectrum spanning almost five orders of magnitude of computer capabilities and prices.9 NSF, as a leader in the national effort in high performance computing, should support a "pyramid" of computing capability. At the apex of the pyramid are the highest performance systems that affordable technology permits, established at national facilities. At the next level, every major research university should have access to one, or a few, intermediate-scale high-performance systems and/or aggregated workstation clusters.10 At the lowest level are workstations with visualization capabilities in sufficient numbers to support computational scientists and engineers.

8 By scalable access we mean the ability to develop a problem on a workstation or intermediate sized machine and migrate the problem with relative efficiency to larger machines as increased complexity requires it. Scalable access implies scalable architectures and software.

9 A Paragon machine of 300 Gigaflops peak performance would be five orders of magnitude faster than a 3 megaflop entry workstation. Effective performance in most science applications would, however, be perhaps a factor of ten lower.
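The arithmetic behind footnote 9's "five orders of magnitude," spelled out (illustrative only):

    \[
      \frac{300\ \mathrm{Gflop/s}}{3\ \mathrm{Mflop/s}}
      = \frac{3 \times 10^{11}\ \mathrm{flop/s}}{3 \times 10^{6}\ \mathrm{flop/s}}
      = 10^{5}.
    \]

With the factor-of-ten derating for effective performance that the footnote mentions, the usable gap between the top and bottom of the pyramid is closer to four orders of magnitude, which still argues for balancing investments across every level rather than at the apex alone.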

Mid-range computational requirements. Over the next five years, the middle range of scientific computing and computational engineering will be handled by an amazing variety of moderately parallel systems. In some cases, these will be scaled-down versions of the highest performance systems available; in other cases, they will be systems targeted at the midrange computing market. The architecture will vary from shared memory at one end of the spectrum to workstation networks at the other, depending on the types of parallelism in the local spectrum of applications. Loosely coupled networks of workstations will compete with mid-range systems for performance of production HPC work. At the same time, autonomous mid-range systems are needed to support the development of next-generation architectures and software by computer science groups.

The panel perceives that there are imbalances in access to the pyramid of HPC resources (see the following table). The disciplinary NSF program offices have not been uniformly effective in responding to the need for a desktop environment for their supported researchers, and there is serious under-investment in the mid-sized machines. The distribution of investment tends to be bimodal, to the disadvantage of mid-range systems. The incentive structures internal to the Foundation do not address this distortion. NSF's HPCC coordinating mechanism needs to address this distortion in a more direct manner.

10 As discussed in the recommendations, dedicated mid-range systems are required not only for science and engineering applications but also for research to improve HPC hardware and software, and for interactive usage. For science and engineering batch applications, networks of workstations will likely develop into an alternative.

Computational Infrastructure at NSF (FY92 $, M)

                     Other NSF    ASC
  Workstations           20.1     3.2
  Small Parallel          2.1     0.5
  Large Parallel          9.4     3.2
  Mainframe               9.1    16.3
  Total                  40.8    23.2

CHALLENGE NO. 3: The right incentives to promote access to all three levels of the computational institution pyramid

The third challenge is to encourage the continued broadening of the base of participation in HPC, both in terms of institutions and in terms of skill levels and disciplines.

Lax Report incentives. At the time of the Lax report, relatively few people were interested in HPC; even fewer had access to supercomputers. Some users were fortunate to have contacts with someone at one of a few select government laboratories where computer resources were available. Most, however, were less fortunate and were forced to carry out their research on small departmental machines. This severely limited the research that could be carried out to problems that would "fit" into available resources. NSF addressed this problem by concentrating supercomputer resources in Centers; by this means those in the academic community most prepared and motivated were provided with access to machine cycles.

Need for expanded scope of access. Now that these resources are available on a peer review basis to everyone no matter where they work, it is clear the research community cannot accept a return to the previous mode of operation. The high performance computing community has grown to depend on NSF to make the necessary resources available to continually upgrade the Supercomputer Centers in support of their computational science and engineering applications. NSF needs to broaden the base of participation in HPC through NSF program offices as well as through the Supercomputer Centers. There is no question that HPC has broken out of its original narrow group of privileged HPC specialists. The SuperQuest competition for high school students already demonstrates how quickly young people can master the effective use of HPC facilities. Other agency, state, and private HPC centers are springing up, making major contributions not only to science but to K-12 education and to regional economies. NSF's policies on expanding access and training must take advantage of the leverage these Supercomputer Centers can provide.

Allocation of HPC resources. There remains the question of the best way to allocate HPC resources. Should Supercomputer Centers continue to be funded to allocate HPC cycles competitively, or should NSF depend on the "market" of funded investigators for allocation of HPC resources? This question gets at two other issues: (a) the future role of the Centers and (b) the best means for ensuring adequate funding of workstations and other means of HPC access throughout the NSF. The Centers have peer review committees which allocate HPC resources on the basis of competitive project selection. The Panel believes these allocations are fairly made and reflect solid professional evaluation of computational merit. The only remaining issue is whether there continues to be a need for protected funding for HPC access in NSF, including access to shared Supercomputer Center facilities. We believe strongly that there is such a need. The panel does have suggestions for broadening the support for the remainder of the HPC pyramid; these are articulated in the recommendations below.

Education and training. A major requirement for education and training continues to exist. Even though most disciplines have been inoculated with successful uses of HPC (see Appendix D essays), and even though graduate student and postdoctoral use of HPC resources is rising faster than faculty usage, only a minority of scientists have the training to allow them to overcome the initial barrier to proficiency, especially in the use of MPP machines, which require a high level of computational sophistication for most problems.

CHALLENGE NO. 4: How can NSF best create the intellectual and management leadership for the future of high performance computing in the U.S.?

What relationships should NSF's activities in HPC have to the activities of other federal agencies? NSF is a major player. What role should NSF play within the scope of the nationally coordinated HPCC program and budget, as indicated in the following chart?

HPCC Agency Budgets

Agency           FY92 Funding ($M)

ARPA                         232.2
NSF                          200.9
DOE                           92.3
NASA                          71.2
HHS/NIH                       41.3
DOC/NOAA                       9.8
EPA                            5.0
DOC/NIST                       2.1

NSF leadership in HPCC. The voice of HPCC users needs to be more effectively felt in the national program; NSF has the best contact with this community. NSF has played, and continues to play, a leadership role in the NREN program and the evolution of the Internet. Its initiative in creating the "meta-center" concept establishes an NSF role in the sharing and coordination of resources (not only in NSF but in other cooperating agencies as well), and the concept can be usefully extended to cooperating facilities at the state level and in private firms. The question is, does the current structure in CISE, the HPCC coordination office, the Supercomputer Centers, and the science and engineering directorates constitute the most favorable arrangement for that leadership? The panel does not attempt to suggest the best ways to manage the relationships among these important functions, but asks the NSF leadership to assure the level of attention and coordination required to implement the broad goals of this report.

Networking. The third barrier is the need for network access with adequate bandwidth. For wide area networks, this is addressed in the NSF HPCC NREN strategy. In the future, NSF will focus its network subsidies on HPC applications and their supporting infrastructure, while support for basic Internet connectivity shifts to the research and education institutions.11

11 NREN is the National Research and Education Network, envisioned in the High Performance Computing Act of 1991. NREN is not a network so much as it is a program of activities including the evolution of the Internet to serve the needs of HPC as well as other information activities.


RECOMMENDATIONS

We have four sets of interdependent recommendations for the National Science Board and the Foundation. The first implements a balanced pyramid of computing environments; each element supports the others, and as priorities are applied the balance in the pyramid should be sustained. The second set addresses the essential research investments and other steps to remove the obstacles to realizing the technologies of the pyramid and the barriers to the effective use of these environments. The third set addresses the institutional structure for the delivery of HPC capabilities, and consists itself of a pyramid. At the base of the institutional pyramid is the diverse array of investigators in their universities and other settings who use all the facilities at all levels of the pyramid. At the next level are departments and research groups devoted to specific areas of computer science or computational science and engineering. Continuing upward are the NSF HPC Centers, which must continue to play a very important role, both as providers of the major resources of high capability computing systems and as aggregations of specialized capability for all aspects of use and advance of high performance computing. At the apex is the national teraflop facility, which we recommend as a multi-agency facility pushing the frontiers of high performance into the next decade. A final recommendation addresses the NSF's role at the national level and its relationship with the states in HPC.

This report recommends significant expansion in NSF investments, both in accelerating progress in high performance computing through computer and computational science research and in providing the balanced pyramid of computing facilities to the science and engineering communities, but in total they do not exceed the Administration's stated intent to double the investments in HPCC during the next 5 years. We believe these investments are not only justified, but are compatible with stated national plans, both in absolute amount and in their distribution.


A. CENTRAL GOAL FOR NSF HPC POLICY

Recommendation A-1: We strongly recommend that NSF build on its success in helping the U.S. achieve its preeminent world position in high performance computing by taking the lead, under OSTP guidance and in collaboration with ARPA, DoE and other agencies, to expand access to all levels of the rapidly evolving pyramid of high performance computing for all sectors of the nation. The realization of this pyramid depends, of course, on rapid progress in the pyramid's technologies.

High performance computing is essential to the leading edge of U.S. research and development. It will provide the intelligence and power that justifies the breadth of connectivity and access promised by the NREN and the National Information Infrastructure. The computational capability we envision includes not only the research capability for which NSF has special stewardship, but also a rapid expansion of capability in business and industry to use HPC profitably and the many operational uses of HPC in commercial and military activities.

The panel is concerned that if the government fails to implement the planned HPCC investments to support the National Information Infrastructure, the momentum of the U.S. industry, which blossomed in the first phase of the national effort, will be lost. Supercomputers are only a $2 billion industry, but an industry that provides critical tools for innovation across all areas of U.S. competitiveness, including pharmaceuticals, oil, aerospace, automotive, and others. The administration's planned new investment of $250 million in HPCC is fully justified. Japanese competitors could easily close the gap in the HPC sectors in which the U.S. enjoys that lead; they are continuing to invest and could capture much of the market the U.S. government has been helping to create.



VISION OF THE HPC PYRAMID

Recommendation A-2: At the apex of the HPC pyramid is a need for a national capability at the highest level of computing power the industry can support with both efficient software and hardware.

A reasonable goal for the next 2-3 years would be the design, development, and realization of a national teraflop-class capability, subject to the effective implementation of Recommendation B-1 and the development of effective software and computational tools for such a large machine.12 Such a capability would provide a significant stimulus to commercial development of a prototype high-end commercial HPC system of the future. We believe the importance of NSF's mission in HPC justifies NSF initiating an interagency plan to make this investment, and further that NSF should propose to operate the facility in support of national goals in science and technology. For budgetary and interagency collaboration reasons OSTP should invoke a FCCSET project to establish such a capability on a government-wide basis with multi-agency funding and usage.

If development begins in 1995 or 1996, a reasonable guess at the cost of a teraflop machine is $50/megaflop for delivery in 1997 to 1998. If so, $50 million a year might buy one such machine per year.13 Development cost would be substantial, perhaps in excess of the production cost of one machine; although it is not clear to what extent government support would be required, this is a further reason to suggest a multi-agency program.14 Support costs would also be additional, but one can assume that one or more of the NSF Supercomputer Centers could host such a facility with something like the current staff.

12 Some panel members have reservations about the urgency of this recommendation, are pessimistic about the likelihood of realizing the effective performance in applications, or are concerned about the possible opportunity cost to NSF of such a large project. The majority notes that the recommendation is intended to drive solutions to those architectural and software problems. Intel's Paragon machine is on the market today with 0.3 teraflops peak speed, but without the support to deliver that speed in most applications. The panel also recommends a multi-agency federal effort. NSF's share of cost and role in managing such a project are left to a proposed FCCSET review.
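The arithmetic behind this estimate is a worked check added here for clarity; the $50/megaflop unit cost is the report's own assumption:

\[
1\ \text{teraflop} = 10^{6}\ \text{megaflops}, \qquad 10^{6}\ \text{megaflops} \times \$50/\text{megaflop} = \$50\ \text{million per machine},
\]

so an annual budget of roughly $50 million would purchase about one such machine per year.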

Such a nationally shared machine, or machines, must be open to competitive merit-evaluated proposals for science and engineering computation, although it could share this mission of responding to the community's research priorities with mission-directed work of the sponsoring agencies. The investment is justified by (a) the existence of problems whose solution awaits a teraflop machine, (b) the importance of driving the HPC industry's innovation rate, and (c) the need for early and concrete experience with the scalability of software environments to higher speeds and larger arrays of processors, since software development time is the limiting factor to hardware acceptance in the market.

13 The cost estimates in this report cannot be much more than informed guesses. We have assumed a cost of $50/megaflop for purchase of a one teraflop machine in 1997 or 1998. We suspect that this cost might be reached earlier, say in 1995 or 1996, in a mid-range machine, because a tightly coupled massively parallel machine may have costs rising more than linearly with the number of processors, overcoming the scale economies that might make the cost rise less than linearly. The cost estimates in recommendations A-2 through A-4 are intended to indicate that the scale of investment we recommend is not incompatible with the published plans of the administration for investment in HPCC in the next 5 years, and further that roughly equal levels of incremental expenditures in the three levels of the HPC pyramid could produce the balance among these levels that we recommend.

14 The Departments of Energy and Defense and NASA might share a major portion of the development cost and might also acquire such machines in the future as well.



Recommendation A-3: Over a period of 5 years the research universities should be assisted to acquire mid-range machines.

This will bring a rapid expansion in access to very robust capability, reducing pressure on the Supercomputer Centers' largest facilities, and allowing the variety of vendor solutions to be exercised extensively. If the new MPP architectures prove robust, usable, and scalable, these institutions will be able to grow the capacity of such systems in proportion to need and with whatever incremental resources are available. This capability is also needed to provide testbeds for computer and computational science research and testing.

These mid-sized machines are the underfunded element today: less than 5% of NSF's FY92 HPC budget is devoted to their acquisition. They are needed both for demanding science and engineering problems that do not require the very maximum in computing capacity and, importantly, for use by the computer science and computational mathematics community in addressing the architectural, software, and algorithmic issues that are the primary barriers to progress with MPP architectures.15

Engineering is also a key candidate for their use. There are 1,050 University-Industry Research Centers (UIRCs) in the U.S. Those UIRCs that are properly equipped with computational facilities can increase the coupling with industrial computation, adding greatly to what the NSF HPC Supercomputer Centers are doing. Many engineering applications, such as robotics research, require "real time" interactive computation which is incompatible with the batch environment on the highest performance machines.

15 The development of prototypes of architectures and operating systems for parallel computation requires access to a machine whose hardware and software can be experimentally modified. This research often cannot be done on machines dedicated to full-time production.

If we assume a cost in three or four years of $50/megaflop for mid-sized MPP machines, an annual expenditure of $10 million would fund the annual acquisition of one hundred 2-gigaflop (peak) computers. Support costs for users would be additional.
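Again, a worked version of this estimate (added for clarity, using the report's assumed unit cost):

\[
2\ \text{gigaflops} = 2{,}000\ \text{megaflops}, \qquad 2{,}000 \times \$50/\text{megaflop} = \$100{,}000\ \text{per machine}, \qquad \frac{\$10\ \text{million}}{\$100{,}000} = 100\ \text{machines per year}.
\]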

Recommendation A-4: We recommend that NSF double the current annual level of investment ($22 million) in scientific and engineering workstations for its 20,000 principal investigators.

Many researchers strongly prefer the new high performance workstations that are under their control and find them adequate to meet many of their initial needs. Those without access to the new workstations may apply for remote access to a supercomputer in a Center, but often they do not need all the I/O and other capabilities of the large shared facilities. NSF needs a strategy to off-load work not requiring the highest level machines in the Centers. The justification is not economy of scale, but economy of talent and time.

When the Lax report was written, a 160 Mflop peak Cray 1 was a high performance supercomputer. Within 4 or 5 years, workstations delivering up to 400 megaflops and costing no more than $15,000 to $20,000 should be widely available. For education and a large fraction of the computational needs of science and engineering, these facilities will be adequate. However, once visualization of computational output becomes routinely required, they will be ubiquitously needed. With the rapid pace of improvement, the useful lifetimes of workstations are decreasing rapidly; they often cannot cope with the latest software. Researchers face escalating costs to upgrade their computers. NSF supports perhaps some 20,000 principal investigators. Equipping an additional 10 percent of this number each year (2,000 machines) at $20,000 each requires an incremental $20 million. Recommendation B-3 addresses how this investment might be managed.

Recommendation A-5: We recommend that NSF expand its New Technologies program to support expanded testing of the practicality of new parallel configurations for HPC applications.16 For example, networks of workstations may meet a significant part of mid-range HPC science and engineering applications. As progress is made in the development of this and other technologies, experimental use of the new configurations should be encouraged. A significant supplement to HPC applications research capacity can be had with minimal additional cost if such collections of workstations prove practical and efficient.

There have already been sufficient experiments with the use of distributed file systems and loosely coupled workstations to encourage the belief that many compute-intensive problems are amenable to this approach. For those problems that do not suffer from the latency inherent in this approach, the incremental costs can be very low indeed, for the problems run in background and at times when the workstations are otherwise unengaged. There are those who strongly believe that, in combination with object-oriented programming, this approach can create a revolution in software and algorithm sharing as well as more economical machine cycles.17

B. RECOMMENDATIONS TO IMPLEMENT THESE GOALS

REMOVING BARRIERS TO HPC TECHNICAL PROGRESS AND HPC USAGE

Recommendation B-1: To accelerate progress in developing the HPC technology needed by users, NSF should create, in CISE, a challenge program in computer science with grant size and equipment access sufficient to support the systems and algorithm research needed for more rapid progress in HPC. The Supercomputer Centers, in collaboration with hardware and software vendors, can provide test platforms for much of this work. Recommendation A-3 provides the hardware support required for initial development of prototypes.

16 Today NSF CISE has a "new technologies" program that co-funds with disciplinary program offices perhaps 50 projects/yr. This program is in the division that funds the Centers, but is focused on projects which can ultimately benefit all users of parallel systems. This program funds perhaps 15 methods and tools projects annually, in addition to those co-funded with science programs.

17 MITRE Corporation, among others, is pursuing this vision.

There is consensus that the absence of sufficient funding for systems and algorithms work which is not mission-oriented is the primary barrier to lower cost, more widely accessible, and more usable massively parallel systems. This work, including bringing the most promising ideas to prototype stage for effective transfer to the HPC industry, would address the most significant barriers to the ultimate penetration of parallel architectures in workstations. Advances on the horizon that could be accelerated include more advanced network interface architectures and operating systems technologies to provide low overhead communications in collections of workstations, and advances in algorithms and software for distributed databases of massive size. Computer science has made, and continues to make, important contributions to both hardware and software technology for parallel machines, and has effectively transferred these ideas to the industry.

Two problems impede the full contribution of computer science to rapid advance in MPP development: grant sizes in the discipline are typically too small to allow enough concentrated effort to build and test prototypes, and too few computer science departments have access to a mid-sized machine on which systems development can be done.

The Board should ask for a proposal from CISE to effectively mobilize the best computer science and computational mathematics talent to address the solution of these problems in the areas of improved operating systems, architectures, compilers, and algorithms for existing systems, as well as research in next-generation systems. We recommend establishing a number of major projects, with higher levels of annual funding than is typical in Computer Science and assured duration of up to five years, for a total annual incremental investment of $10 million. We recommend that this challenge fund be managed by CISE, and be accessible to all disciplinary program offices who wish to forward team proposals for add-on funding in response to specific proposals from the community.

Recommendation B-2: A significant barrier to rapid progress in the application of HPC lies in formulating a computational strategy to solve a problem. In response to Challenge 1 above, NSF should focus attention, both through CISE and through its disciplinary program offices, on support for the design and development of computational techniques, algorithmic methodology, and mathematical, physical and engineering models to make efficient use of the machines.

Without such work in both theoretical and applied areas of numerical analysis, applied mathematics, and computational algorithms, the full benefit of advances in architecture and systems software will not be realized. In particular, significantly increased funding of collaborative and individual state-of-the-art methodology is warranted, and is crucial to the success of high performance computing. Some of this can be done through the individual directorates with funds supplemented by HPCC funds; the Grand Challenge Applications Group awards are a good first step.

Recommendation B-3: We recommend NSF set up an agency-wide task force to develop a way to ameliorate the imbalance in the HPC pyramid: the under-investment in the emerging mid-range scalable, parallel computers and the inequality of access to stand-alone (but potentially networked) workstations in the disciplines. This implementation plan should involve a combination of funding by disciplinary program offices and some form of more centralized allocation of NSF resources.

Some directorates have "infrastructure" programs; others do not. Still others fund workstations until they reach the "target" set by the HPCC coordination office. We believe that individual disciplinary program managers should consider it their responsibility to fund purchase of workstations out of their equipment funds. But we recognize that these funds need to be supplemented by HPCC funds. CISE has an office which co-funds interdisciplinary applications of HPC workstations. We believe this office may require more budgetary authority than it now enjoys, to ensure the proper balance of program and CISE budgets for workstations.

Scientific value must be a primary criterion for resource allocation. It would be unwise to support mediocre projects just because they require supercomputers. The strategy of application approval will depend very heavily on funding scenarios. If sufficient HPCC funds are made available to individual programs for computer usage, then the Supercomputer Centers should be reserved for applications that cannot be carried out elsewhere, with particular priority to novel applications. If individual science programs continue to be underfunded relative to large centers, the Supercomputer Centers may be forced into a role of supporting less novel or demanding computing applications. Under these circumstances, less stringent funding criteria should be applied.

C. THE NSF SUPERCOMPUTER CENTERS

Recommendation C-1: The Supercomputer Centers should be retained and their missions, as they have evolved since the Lax Report, should be reaffirmed. However, the NSF HPC effort now embraces a variety of institutions and programs: HPC Centers, Engineering Research Centers (ERC) and Science and Technology Centers (STC) devoted to HPC research, and disciplinary investments in computer and computational science and applied mathematics, all of which are essential elements of the HPC effort needed for the next decade. NSF plays a primary but not necessarily dominant role in each of them (see Figure 4 of Appendix D). Furthermore, HPC institutions outside the NSF orbit also contribute to the goals for which the NSF Supercomputer Centers are chartered. Thus we ask the Board to recognize that the overall structure of the HPC program at NSF will have more institutional diversity, more flexibility, and more interdependence with other agencies and private institutions than in the early years of the HPC initiative.

We anticipate an evolution, which has already begun, in which the NSF Supercomputer Centers increasingly broaden their base of support, and NSF expands its support in collaboration with other institutional settings for HPC. Center-like groups, especially NSF S&T Centers, are an important instrument for focusing on solving barriers to HPC, although they do not provide HPC resources to users. An excellent example is the multi-institutional Center for Research in Parallel Computation at Rice University, which is supported at about $4M/yr, with additional support from ARPA. Another example is the Center for Computer Graphics and Scientific Visualization, an S&T Center award to the University of Utah with participation of the University of N. Carolina, Brown, Caltech, and Cornell. Still another example is the Discrete Mathematics and Computational Science Center (DIMACS) at Rutgers and Princeton. These centers fill important roles today, and the ERC and S&T Center structures provide a necessary addition to the Supercomputer Centers for institutionalizing the programmatic work required for HPC.

The NSF should continue its current practice of encouraging HPC Center collaboration, both with one another and with other entities engaged in HPC work. The division of the support budget into one component committed to the Supercomputer Centers and another for multi-center activities is a useful management tool, even though it may have the effect of reducing competition among Supercomputer Centers. The National Consortium for HPC (NCHPC), formed by NSF and ARPA, is a welcome measure as well.

Recommendation C-2: The current situation in HPC is more exciting, more turbulent, and more filled with promise of really big benefits to the nation than at any time since the Lax report; this is not the time to "sunset" this successful, changing venture, of which the Supercomputer Centers remain an essential part. Furthermore, we also recommend against an open recompetition of the four Supercomputer Centers at this time, favoring instead periodic performance evaluation and competition for some elements of their activities, both among the Centers themselves and, when appropriate, with other HPC Centers such as those operated by states (see Recommendation D-1).

Continuing evaluation of each Center's performance, as well as the performance of the overall program, is, of course, an essential part of good management of the Supercomputer Centers program. Such evaluations must take place on a regular basis in order to develop a sound basis for adjustments in support levels, to provide incentives for quality performance, and to recognize the need to encourage other institutions such as S&T Centers that are attacking HPC barriers and state-based centers with attractive programs in education and training. While recompetition of existing Supercomputer Centers does not appear to be appropriate at this time, if regular review of the Centers and the Centers program identifies shortcomings in a Center or the total program, a recompetition of that element of the program should be initiated.

Supercomputer Centers are highly leveraged through investments by industry, vendors, and states. This diversification of support impedes unilateral action by NSF, since the Centers' other sponsors must be consulted before decisions important to the Center are made.18 It also suggests that the issue of recompetition may, in future, become moot as the formal designation "NSF HPC Center" erodes in significance. There is a form of recompetition already in place; the Centers compete for support for new machine acquisition and for roles in multi-center projects.

18 Each year each center gets a cooperative agreement level which is negotiated. Each center gets about $14M; about 15% is flexible. NSF centers have also received help from ARPA to buy new MPP machines. Most of the Centers have important outside sources of support, which imply obligations NSF must respect, such as the Cornell Center relationship with IBM and the San Diego Center's activities with the State of California.

Recommendation C-3: The NSF should continue to provide funding to support the Supercomputer Centers' HPC capacity. Any distortion in the uses of the computing pyramid that results from this dedicated funding is best offset by the recommendations we make for other elements in the pyramid. Provision to scientists and engineers of access to leading edge supercomputer resources will continue to be a primary purpose of the Centers, but it is a means to a broader mission: to foster rapid progress in the use of HPC by scientists and engineers, to accelerate progress in usability and economy of HPC, and to diffuse HPC capability throughout the technical community, including industry. The following additional components of the Center missions should be affirmed:

Supporting computational science, by research and demonstration in the solution of significant science and engineering problems.

Fostering interdisciplinary collaboration, across sciences and between sciences and computational science and computer science, as in the Grand Challenge programs.

Prototyping and evaluating software, new architectures, and the uses of high speed data communications, in collaboration with three groups: computer and computational scientists; disciplinary scientists exploiting HPC resources; and the HPC industry and business firms exploring expanded use of HPC.

Training and education, from post-docs and faculty specialists, to introduction of less experienced researchers to HPC methods, to collaboration with state and regional HPC centers working with high schools, community colleges, colleges, and universities.

The role of a Supercomputer Center should, therefore, continue to be primarily one of a facilitator, pursuing the goals just listed by making the hardware and human resources available to computational scientists, who themselves are intellectual leaders. In this way the Centers will participate in leadership but will not necessarily be its primary source. With certain notable exceptions, intellectual leadership in computational science has come from scientists around the country who have at times used the resources available at the Centers. This situation is unlikely to change, nor should it change. It would be unrealistic to place this type of demand on the Supercomputer Centers, and it would certainly not be in the successful tradition of American science.

The Supercomputer Centers facilitate interdisciplinary collaborations because they support users from a variety of disciplines, and are aware of their particular strengths. The Centers have been deeply involved in nucleating Grand Challenge teams, and particularly in reaching out to bring computer scientists together with computational scientists. Visualization, for example, is no longer just in the realm of the computational scientist; experimentalists use the same tools for designing and simulating experiments in advance of actual data generation. This common ground should not be separated from the enabling technologies which have made this work possible. Rather, high performance computing and the new science it has enabled have seeded advances that would not have happened any other way.

ALLOCATION OF CENTER HPC RESOURCES TO INVESTIGATORS

Recommendation C-4: The NSF should review the administrative procedures used to allocate Center resources, and the relationship of this process to the initial funding of the research by the disciplinary program offices, to ensure that the burden on scientists applying for research support is minimized when that research also requires access to the facilities of the Centers, or perhaps access to other elements of the HPC pyramid that will be established pursuant to our recommendations. However, we believe the NSF should continue to provide HPC resources to the research community through allocation committees that competitively evaluate proposals for use of Center resources.19

At the present time, the allocation of resources in the Supercomputer Centers for all users is handled by requiring principal investigators to submit annual proposals to a specified Center for access to specific equipment. The NSF should not require a duplicate peer review of the substantive scientific merit of the proposed scientific investigation, first by disciplinary program offices, and then again by the Center Allocation Committees. For this reason, it is proposed that the allocation of supercomputer time be combined with the allocation of research funds to the investigator.

Although this panel is not in a position to give administrative details of such a procedure, it is suggested that requests for computer time be attached to the original regular NSF proposal, with (a) experts in computational science included among peer reviewers, or (b) that portion of the proposal reviewed in parallel by a peer review process established by the Centers. In either case only one set of peer reviewers should evaluate scientific merits, and only one set of reviewers should determine that the research task is being formulated properly for use of HPC resources.

Second, we recommend that the Centers collectively establish the review and allocation mechanism, so that while investigators might express a preference for a particular computer or Center for their work, all Centers' facilities would be in the pool from which each investigator receives allocations.

19 For NSF-funded investigators, allocation committees at Supercomputer Centers should evaluate requests for HPC resources only on the appropriateness of the computational plans, choice of machine, and amount of resource requested. Centers should rely on disciplinary program office determinations of scientific merit, based on their peer review. In this way a two-level review of the merits of the science is avoided. A further simplification might be for the application for computer time at the Centers to be included in the original disciplinary proposal, and forwarded to the Centers when the proposal is approved. For non-NSF-funded investigators an alternative form of peer review of the research is required.

We recognize, of course, that the specific allocation of machine time often cannot be made at the time of the original proposal for NSF research support, since in some cases the work has not progressed to the point that the mathematical approach, algorithms, etc., are available for Center experts to evaluate and translate into estimates of machine time. Nor is the demand function for facilities known at that time.

EDUCATION AND TRAINING

Recommendation C-5: The NSF should give strong emphasis to its education mission in HPC, and should actively seek collaboration with state-sponsored and other HPC centers not supported primarily on NSF funding. Supercomputing regional affiliates should be candidates for NSF support, with education as a key role. HPC will also figure in the Administration's industrial extension program, in which the states have the primary operational role.

The serious difficulties associated with the use of parallel computers pose a new training burden. In the past it was expected that individual investigators would port their code to new computers, and this could usually be done with limited effort. This is no longer the case. The Supercomputer Centers should see their future mission as providing direct aid to the rewriting of code for parallel processors.

Computational science is proving to be an effective way to generate new knowledge. As part of its basic mission, NSF needs to teach scientists, engineers, mathematicians, and even computer scientists how high performance computing can be used to produce new scientific results. The role of the Supercomputer Centers is critical to such a mission since the Centers have expertise on existing hardware and software systems, modelling, and algorithms, as well as knowledge of useful high performance computing application packages, awareness of trends in high performance computing, and the requisite staff.

D. NSF AND THE NATIONAL HPC EFFORT; RELATIONSHIPS WITH STATES

Recommendation D-1: We recommend that the National Science Board urge OSTP to establish an advisory committee representing the states, HPC users, NSF Supercomputer Centers, computer manufacturers, and computer and computational scientists (similar to the Federal Networking Council's Advisory Committee), which should report to HPCCIT. A particularly important role for this body would be to facilitate state-federal planning related to high performance computing.

Congress required an advisory committee reporting to the PMES, but the committee has not yet been implemented. The committee we propose would provide policy level advice and coordination with the states. The main components of HPCC are networking and HPC, although the networks seem to be receiving priority attention. The Panel believes it is important to continue to emphasize the importance of ensuring adequate compute power in the network to support the National Information Infrastructure applications. We also believe that as participation in HPC continues to broaden through initiatives by the states and by industry, the NSF (and other federal agencies) should encourage their collaboration in the national effort.

The Coalition of Academic Supercomputer Centers (CASC) was founded in 1989 to provide a forum to encourage support for high performance computing and networking. Unlike the FCCSET task force, CASC is dependent on others to bring the money to support high performance computing, usually their own state government or university. The result is a valuable discussion group for exchanging information and developing a common agenda, and CASC should be encouraged. However, CASC is not a substitute for a more formal federal advisory body.

This recommendation is consistent with a recent Carnegie Commission Report entitled "Science, Technology and the States in America's Third Century," which recommends the creation of a system of joint advisory and consultative bodies to foster federal-state exchanges and to create a partnership in policy development, especially for construction of national information infrastructure and provision of services based on it. Because of the importance of high performance computing to future economic development, we need a new balance of cooperation between federal and state government in this area, as in a number of others.



Appendix A

MEMBERSHIP OF THE BLUE RIBBON PANEL ON HIGH PERFORMANCE COMPUTING

Lewis Branscomb, John F. Kennedy School of Government, Harvard University (Chairman)

Lewis Branscomb is a physicist, formerly chairman of the National Science Board (1980-1984) and Chief Scientist of IBM Corp. (1972-1986).

Theodore Belytschko, Department of Civil Engineering, Northwestern University

Ted Belytschko's research interests are in computational mechanics, particularly in the modeling of nonlinear problems, such as failure, crashworthiness, and manufacturing processes.

Peter R. Bridenbaugh, Executive Vice President - Science, Engineering, Environment, Safety & Health, Aluminum Company of America

Peter Bridenbaugh serves on a number of university advisory boards, and is a member of the National Academy of Engineering's Industrial Ecology Committee. He also serves on the NSF Task Force 1994 Budget Committee and is a Fellow of ASM International.

Teresa Chay, Professor, Department of Biological Sciences, University of Pittsburgh

Teresa Chay's research interests are in modelling biological phenomena such as nonlinear dynamics and chaos theory in excitable cells, cardiac arrhythmias by bifurcation analysis, mathematical modeling for electrical activity of insulin-secreting pancreatic B-cells and agonist-induced cytosolic calcium oscillations, and elucidation of the kinetic properties of ion channels.

Jeff Dozier, Center for Remote Sensing, University of California, Santa Barbara

Jeff Dozier, University of California, Santa Barbara, is a hydrologist and remote sensing specialist. From 1990 to 1992 he was Senior Project Scientist on NASA's Earth Observing System.

Gary Grest, Exxon Corporate Research Science Laboratory

Gary Grest's research interests are in the areas of computational physics and materials science, recently emphasizing the modeling of the properties of polymers and complex fluids.

Edward Hayes, Vice President for Research, Ohio State University

Edward F. Hayes is a computational chemist, formerly Controller and Division Director for Chemistry at NSF.

Barry Honig, Department of Biochemistry and Molecular Biology, Columbia University

Barry Honig's research interests are in theoretical and computational studies of biological macromolecules. He is an associate editor of the Journal of Molecular Biology and is a former president of the Biophysical Society (1990-1991).

Neal Lane, Provost, Rice University (resigned from the Panel July 1993)

William A. Lester, Jr., Professor and Associate Dean, Department of Chemistry, University of California, Berkeley

William A. Lester, Jr., is a theoretical chemist, formerly Director of the National Resource for Computation in Chemistry (1978-81) and Chairman of the NSF Joint Advisory Committees for Advanced Scientific Computing and Networking and Communications Research and Infrastructure (1987).

Gregory McRae, Professor, Department of Chemical Engineering, MIT

James Sethian, Professor, University of California at Berkeley

James Sethian is an applied mathematician in the Mathematics Department at the University of California at Berkeley and in the Physics Division of the Lawrence Berkeley Laboratory.

Burton Smith, Tera Computer Company

Burton Smith is Chairman and Chief Scientist of Tera Computer Company, a manufacturer of high performance computer systems.

Mary Vernon, Department of Computer Science, University of Wisconsin

Mary Vernon is a computer scientist who has received the NSF Presidential Young Investigator Award and the NSF Faculty Award for Women Scientists and Engineers in recognition of her research in parallel computer architectures and their performance.



Appendix B

NSF AND HIGH PERFORMANCE COMPUTING: HISTORY AND ORIGIN OF THIS STUDY

Introduction

This report of the Blue Ribbon Panel on High Performance Computing follows a number of separate, but related, activities in this area by the NSF, the computational science community, and the Federal Government in general, acting in concert through the Federal Coordinating Committee on Science, Engineering, and Technology. The Panel's findings and recommendations must be viewed within this broad context of HPC. This section provides a description of the way in which the panel has conducted its work and a brief overview of the preceding accomplishments which were used as the starting point for the Panel's deliberations.

The Origin of the Present Panel and Charter

Following the renewal of four of the five NSF Supercomputer Centers in 1990, the National Science Board (NSB) maintained an interest in the Centers' operations and activities. Given the national scope of the Centers, and the possible implications for them contained in the HPCC Act of 1991, the NSB commissioned the formation of a blue ribbon panel to investigate the future changes in the overall scientific environment due to the rapid advances occurring in the field of computers and scientific computing. The panel was instructed to investigate the way science will be practiced in the next decade, and recommend an appropriate role for NSF to enable research in the overall computing environment of the future. The panel consists of representatives from a wide spectrum of the computer and computational science communities in industry and academia. The role expected of the Panel is reflected by its Charter:

A. Assess the contributions of high performance computing to scientific and engineering research and education, including ancillary benefits, such as the stimulus to the pace of innovation in U.S. industries and the public sector.

B. Project what hardware, software and communication resources may be available in the next five to ten years to further these advances, and identify elements that may be particularly important to the development of HPC.

C. Assess the variety of institutional forms through which access to high performance computing may be gained, including funding of equipment acquisition, shared access through local centers, and shared access through broadband telecommunications.

D. Project sources, other than NSF, for support of such capabilities, and potential cooperative relationships with: states, private sector, other federal agencies, and international programs.

E. Identify barriers to the development of more efficient, usable, and powerful means for applying high performance computing, and means for overcoming them.

F. Provide recommendations to help guide the development of NSF's participation in supercomputing and its relation to the federal interagency High Performance Computing and Communications Program.

G. Recommend policies and managerial structures needed to achieve NSF program goals, including clarification of the peer review procedures and suggesting appropriate processes and mechanisms to assess program effectiveness necessary for insuring the highest quality science and engineering research.

At its first meeting in January 1993, the panel approved its Charter, and established a scope of work which would allow a final report to be presented to the NSB in Summer 1993. A large number of questions were raised amplifying the Charter's directions. Prior to its second meeting in March 1993 the Panel solicited input from the national research community; a response to the following four questions was requested.

How would you project the emerging high performance computing environment and market forces over the next five years and the implications for change in the way scientists and engineers will conduct R&D, design and production modeling?

What do you see as the largest barriers to the effective use of these emergent technologies by scientists and engineers, and what efforts will be needed to remove these barriers? What is the proper role of government, and, in particular, the NSF to foster progress?

To what extent do you believe there is a future role for government-supported supercomputer centers? What role should NSF play in this spectrum of capabilities?

To what extent should NSF use its resources to encourage use of high performance computing in commercial industrial applications through collaboration between high performance computing centers, academic users and industrial groups?

Over fifty responses were received and were considered and discussed by the Panel at its March meeting. The Panel also received presentations, based on these questions, from vendors of high performance computing equipment and representatives from non-NSF supercomputer centers.

NSF's Early Participation in High Performance Computing

Although the National Science Foundation is now a major partner in the nation's high performance computing effort, this was not always the case. In the early 1970s the NSF ceased its support of campus computing centers, and by the mid-1970s there were no "supercomputers" on any campus available to the academic community. Certainly computers of this capability were available through other government agency (DoE and NASA) laboratories, but NSF did not play a role, and hence many of its academic researchers did not have the ability to perform computational research on anything other than a departmental minicomputer, thereby limiting the scope of their research.

This lack of NSF participation in the high performance computing environment began to be noted in the early 1980s with the publication of a growing number of reports on the subject. A report to the NSF Division of Physics Advisory Committee in March 1981 entitled "Prospectus for Computational Physics", edited by W. Press, identified a "crisis" in computational physics, and recommended support for facilities. Subsequent to this report a joint agency study, "Large Scale Computing in Science and Engineering", edited by P. Lax, appeared in December 1982 and acted as the catalyst for NSF's reemergence in the support of high performance computing. The Lax Report presented four recommendations for a government-wide program:

Increased access to regularly upgraded supercomputing facilities via high bandwidth networks

Increased research in computational mathematics, software, and algorithms

Training of personnel in scientific computing

R&D of new supercomputer systems

The key suggestions contained in the Lax Report were studied by an internal NSF working group, and the findings were issued in July 1983 as "A National Computing Environment for Academic Research", a report edited by M. Bardon and K. Curtis. The report studied NSF-supported scientists' needs for academic computing, and validated the conclusions of the Lax Report for the NSF-supported research community. The findings of Bardon/Curtis reformulated the four recommendations of the Lax Report into a six-point implementation plan for the NSF. Part of this action plan was a recommendation to establish ten academic supercomputer centers.



The immediate NSF response was to set up a means for academic researchers to have access, at existing sites, to the most powerful computers of the day. This was an interim step prior to a solicitation for the formation of academic supercomputer centers directly supported by the NSF. By 1987, five NSF Supercomputer Centers had been established, and all had completed at least one year of operation.

During this phase the Centers were essentially isolated "islands of supercomputing" whose role was to provide supercomputer access to the academic community. This aspect of the Centers' activities has changed considerably. The NSF concept of the Centers' activities was mandated to be much broader, as indicated by the Centers' original objectives:

Access to state of the art supercomputers

Training of computational scientists and engineers

Stimulate the U.S. supercomputer industry

Nurture computational science and engineering

Encourage collaboration among researchers in academia, industry and government

In 1988-1989 NSF conducted a review to determine whether support was justified beyond 1990. In developing proposals, the Centers were advised to increase their scope of responsibilities. Quoting from the solicitation:

"To insure the long term health and value of a supercomputer center, an intellectual environment, as well as first class service, is necessary. Centers should identify an intellectual component and research agenda".

In 1989 NSF approved continuation through 1995 of the Cornell Theory Center, the National Center for Supercomputing Applications, the Pittsburgh Supercomputing Center, and the San Diego Supercomputer Center. Support for the John von Neumann Center was not continued.

The Federal High Performance Computing and Communications Initiative

At the same time the NSF Supercomputer Centers were beginning the early phases of their operations, the Federal Coordinating Committee for Science, Engineering, and Technology began a study in 1987 on the status and direction of high performance computing and its relationship to federal research and development. The results were "A Research and Development Strategy for High Performance Computing", issued by the Office of Science and Technology Policy (OSTP) in November 1987, followed in September 1989 by another OSTP document, "The Federal High Performance Computing Program". These two reports set the framework for the inter-governmental agency cooperation on high performance computing which led to the High Performance Computing and Communications (HPCC) Act of 1991.

HPCC focuses on four integrated components* of computer research and applications which very closely echo the Lax Report conclusions:

High Performance Computing Systems - technology development for scalable parallel systems to achieve teraflop speed

Advanced Software Technology and Algorithms - generic software and algorithm development to support Grand Challenge projects, including early access to production scalable systems

National Research and Education Network - to further develop the national network and networking tools, and to support the research and development of gigabit networks

Basic Research and Human Resources - to support individual investigator research on fundamental and novel science and to initiate activities to significantly increase the pool of trained personnel

*At the time of writing this Report, a fifth component, entitled Information Infrastructure, Technology, and Applications, is being defined for inclusion in the HPCC Program.



With this common structure across all the participating agencies, the Program outlines each agency's roles and responsibilities. NSF is the lead agency in the National Research and Education Network, and has major roles in Advanced Software Technology and Algorithms, and in Basic Research and Human Resources.

The Sugar Report

After the renewal of the four NSF Supercomputer Centers, the NSF Division of Advanced Scientific Computing recognized that the computing environment within the nation had changed considerably from that which existed at the inception of the Centers Program. The Division's Advisory Committee was asked to survey the future possibilities for high performance computing, and report back to the Division. Two workshops were held in the Fall of 1991 and Spring of 1992. Thirty-one participants with expertise in computational science, computer science, and the operation of major supercomputer centers were involved.

The final report, edited by R. Sugar of the U. of California at Santa Barbara, recommended future directions for the Supercomputer Centers Program which would "enable it to take advantage of these (HPCC) opportunities and to meet its responsibilities to the national research community". The committee's recommendations can be summarized as:

Decisions and planning by the Division need to be made in a programmatic way, rather than on an individual Center-by-Center basis; the meta-center concept provides a vehicle for this management capability which goes beyond the existing Centers.

Access to stable computing platforms (currently vector supercomputers) needs to be augmented by access to state of the art technology (currently massively parallel computers), but the former cannot be sacrificed to provide the latter.

The Supercomputer Centers can be focal points for enabling collaborative efforts across many communities: computational and computer science, private sector and academia, vendors and academia.


Appendix C

TECHNOLOGY TRENDS and BARRIERS to FURTHER PROGRESS

BACKGROUND:

What is the state of the HPC industry here and abroad? What is its prognosis?

The high performance computer industry is in a state of turmoil, excitement and opportunity. On the one hand, the vector multiprocessors manufactured by many firms, large and small, have continued to improve in capability over the years. These systems are now quite mature, as measured by the fact that delivered performance is a significant fraction of the theoretical peak performance of the hardware, and are still the preferred platform for many computational scientists and engineers. They are the workhorses of high performance computing today and will continue in that role even as alternatives mature.

On the other hand, dramatic improvements in microprocessor performance and advances in microprocessor-based parallel architectures have resulted in "massively parallel" systems that offer the potential for high performance at lower cost.20 For example, $10 million in 1993 buys over 40 gigaflops peak processing power in a multicomputer but only 5 gigaflops in a vector multiprocessor. As a result, increasing numbers of computational scientists and engineers are turning to the highly parallel systems manufactured by companies such as Cray Research Inc., IBM, Intel, Kendall Square, Thinking Machines Inc., MasPar, and nCUBE.

20 Note that the higher cost of vector machines is partly caused by their extensive use of static memory chips for main memory and the interconnection networks they use for high shared-memory bandwidth. These attributes contribute to increased programmability and the realization of a high fraction of peak performance on user applications. Realized performance on MPP machines is still uncertain. A comparison of today's vector machines versus MPP systems based on realized performance per dollar reveals much less difference in cost-performance than comparisons based on peak performance.
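Footnote 20's point about peak versus realized performance can be made concrete with a little arithmetic. In the sketch below, the $10 million price and the 40 and 5 gigaflops peak figures come from the text; the assumed fractions of peak actually delivered are illustrative placeholders only, not data from the report.

# Peak versus (hypothetical) realized cost-performance for a $10M purchase.
price_musd = 10.0                                   # millions of 1993 dollars (from the text)
peak_gflops = {"multicomputer (MPP)": 40.0, "vector multiprocessor": 5.0}
assumed_efficiency = {                              # fraction of peak delivered -- illustrative only
    "multicomputer (MPP)": 0.10,
    "vector multiprocessor": 0.50,
}
for system, peak in peak_gflops.items():
    realized = peak * assumed_efficiency[system]
    print(f"{system}: {peak / price_musd:.1f} peak Gflops/$M, "
          f"~{realized / price_musd:.2f} realized Gflops/$M (assumed efficiency)")

With these made-up efficiencies, the eightfold gap in peak gigaflops per dollar shrinks to well under a factor of two in realized terms, which is the footnote's point.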

Microprocessor performance has increased by 4X every three years, matching the rate of integrated circuit logic density improvement as predicted by Moore's law. For example, the microprocessors of 1993 are around 200 times faster than those of 1981. By contrast, the clock rates of vector processors have improved much more slowly; today's fastest vector processors are only five or six times faster than 1976's Cray 1. Thus, the performance gap between these two technologies is quickly disappearing in spite of other performance improvements in vector processor architecture.
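A quick back-of-the-envelope check of these growth rates (the 4X-per-three-years figure and the 1976, 1981 and 1993 endpoints come from the paragraph above; the arithmetic itself is only illustrative):

# Compounded speedup implied by "4X every three years" between 1981 and 1993.
years = 1993 - 1981                                # 12 years, i.e., four 3-year periods
micro_speedup = 4 ** (years / 3)                   # 4**4 = 256, consistent with "around 200 times"
print(f"Implied microprocessor speedup, 1981-1993: about {micro_speedup:.0f}x")

# Equivalent compound annual growth rates, for comparison with vector clock rates.
micro_rate = 4 ** (1 / 3) - 1                      # roughly 59% per year
vector_rate = 6 ** (1 / (1993 - 1976)) - 1         # roughly 11% per year, using the 6x upper figure
print(f"Microprocessors: ~{micro_rate:.0%}/year; vector clock rates: ~{vector_rate:.0%}/year")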

Although microprocessor-based massively parallel systems hold considerable promise for the future, they have not yet reached maturity in terms of ease of programming and ability to deliver high performance routinely to large classes of applications. Unfortunately, the programming technology that has evolved for the vector multiprocessors does not directly transfer to highly parallel systems. New mechanisms must be devised for high performance communication and coordination among the processors. These mechanisms must be efficiently supported in the hardware and effectively embodied in programming models.

Currently, vendors are providing a variety of systems based on different approaches, each of which has the potential to evolve into the method of choice. Vector multiprocessors support a simple shared memory model which demands no particular attention to data arrangement in memory. Many of the currently available highly parallel architectures are based on the "multicomputer" architecture which provides only a message-passing interface for inter-processor communication. Emerging architectures, including the Kendall Square KSR-1 and systems being developed by Convex, Cray Research, and Silicon Graphics, have shared address spaces with varying degrees of hardware support and different refinements of the shared memory programming model. These computers represent a compromise in that they offer much of the programming simplicity of shared memory yet still (at least so far) require careful data arrangement to achieve good performance. (The data parallel language on the CM-5 has similar properties.) A true shared memory parallel architecture, based on mechanisms that hide memory access latency, is under development at Tera Computer.

The size of the high performance computer market worldwide is about $2 billion (excluding sales of the IBM add-on vector hardware), with Cray Research accounting for roughly $800 million of it. IBM and Fujitsu are also significant contributors to this total, but most companies engaged in this business have sales of $100 million or less. Some companies engaged in high speed computing have other, larger sources of revenue (IBM, Fujitsu, Intel, NEC, Hitachi); other companies both large (Cray Research) and small (Thinking Machines, Kendall Square, Meiko, Tera Computer) are high performance computer manufacturers exclusively. There are certainly more companies in the business than can possibly be successful, and no doubt there are new competitors that will appear. Helping to sustain this high level of competitive innovation should be an important objective for NSB policy in HPC.

FINDINGS

Where is the hardware going to be in 5 years? What will be the performance and cost of the most powerful machines, the best workstations, the mid-range computers?

The next five years will continue to see improvements in hardware price/performance ratios. Since microprocessor speeds now closely approach those of vector processors, it is unclear whether microprocessor performance improvement can maintain its current pace. Still, as long as integrated circuits continue to quadruple in density but only double in cost every three years, we can probably expect a fourfold price/performance improvement in both processors and memory by 1998. Estimating in constant 1993 dollars, the most powerful machines ($50 million) will have peak performance of nearly a teraflop;21 mini-supercomputers ($1 million) will advertise 20 gigaflops peak performance; workstations ($50,000) will approach 1 gigaflops, and personal computers ($10,000) will approach 200 megaflops.22
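The 1998 estimates follow directly from the assumed fourfold improvement; a minimal sketch of that projection is given below. The 1993 baselines are simply the report's 1998 figures divided by four (they are back-calculated, not data from the report):

# Project 1998 peak performance per price class from an assumed fourfold
# price/performance gain over 1993 (constant 1993 dollars).
baseline_1993 = {                  # price class: peak gigaflops in 1993 (assumed, back-calculated)
    "$50M high-end system": 250,
    "$1M mini-supercomputer": 5,
    "$50K workstation": 0.25,
    "$10K personal computer": 0.05,
}
improvement = 4                    # fourfold gain over the five-year horizon
for machine, gflops in baseline_1993.items():
    print(f"{machine}: ~{gflops * improvement:g} gigaflops peak by 1998")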

During this period, parallel architectures will continue to emerge and evolve. Just as the CM-5 represented a convergence between SIMD and MIMD parallel architectures and brought about a generalization of the data-parallel programming model, it is likely that the architectures will continue to converge and better user-level programming models will continue to emerge. These developments will improve software portability and reduce the variety of architectures that are required for computational science and engineering research, although there will likely still be some diversity of approaches at the end of this 5-year horizon. Questions that may be resolved by 1998 include:

Which varieties of shared memory architecture provide the most effective tradeoff between hardware simplicity, system performance, and programming convenience? and

What special synchronization mechanisms for processor coordination should be supported in the hardware?

Most current systems are evolving in these directions, and answers to the issues will provide a more stable base for software efforts. Furthermore, much of the current computer science research in shared memory architectures is looking for cost-effective hardware support that can be implemented in multiprocessor workstations that are interconnected by general-purpose local area networks. Thus, technology from high performance parallel systems may be expected to migrate to workstation networks, further improving the capabilities of the systems to deliver high-performance computing to particular applications. It is possible that in the end the only substantial difference between the supercomputers of tomorrow and the workstation networks of tomorrow will be the installed network bandwidth.23

21 One teraflop is 1000 gigaflops or 10^12 floating point operations per second.

22 Spokesmen from Intel, Convex and Silicon Graphics, in addressing the panel, all made even higher estimates than this.

Where is the software/programmability going to be in 5 years? What new programming models will emerge for the new technology? How transparent will parallel computers be to users?

While the architectural issues are being resolved, parallel languages and their compilers will need to continue to improve the programmability of new high performance computer systems. Implementations of "data parallel" language dialects like High Performance Fortran, Fortran D, and High Performance C will steadily improve in quality over the next five years and will simplify programming of both multicomputers and shared address systems for many applications. For the applications that are not helped by these languages, new languages and programming models will emerge, although at a slower pace. Despite strong efforts addressing the problem from the language research community, the general purpose parallel programming language is an elusive and difficult quarry, especially if the existing Fortran software base must be accommodated, because of difficulties with the correct and efficient use of shared variables.
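To make the "data parallel" idea concrete, the sketch below contrasts an explicit element-by-element loop with a single whole-array expression. It uses Python with NumPy purely as an analogy for the programming style; the report's examples are Fortran dialects such as High Performance Fortran, and nothing here represents those languages' actual syntax.

import numpy as np

# Data-parallel style: one whole-array expression replaces an explicit loop,
# leaving the library (or, in HPF, the compiler) free to apply the operation
# to the whole array at once and, on parallel hardware, to distribute it.
n = 100_000
a = np.random.rand(n)
b = np.random.rand(n)

# Explicit, element-by-element formulation:
c_loop = np.empty(n)
for i in range(n):
    c_loop[i] = 2.0 * a[i] + b[i]

# Data-parallel formulation: the same computation as a single array expression.
c_parallel = 2.0 * a + b

assert np.allclose(c_loop, c_parallel)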

Support tools for software development have also been making progress, with emphasis on visualization of a program's communication and synchronization behavior. Vendors are increasingly recognizing the need for sophisticated performance tuning tools, with most now developing or beginning to develop such tools for their machines. The increasing number of computer scientists who are also using these tools could lead to even more rapid improvement in the quality and usability of these support tools. Operating systems for high performance computers are increasingly ill suited to the demands placed on them. Virtualization of processors and memory often leads to poor performance, whereas relatively fixed resource partitioning produces inefficiency, especially when parallelism within the application varies. High performance I/O is another area of shortfall in many systems, especially the multicomputers. Research is needed in nearly every aspect of operating systems for highly parallel computers.

23 While parallel architectures mature, vector multiprocessors will continue to evolve. Scaling to larger numbers of processors ultimately involves solving the same issues as for the microprocessor-based systems.

What market forces or technology investments drive HPC technologies and products?

Future high performance systems will continue to be built using technologies and components built for the rest of the computer industry. Since integrated circuit fabrication facilities now represent billion dollar capital investments, integrated circuits benefit from very large scale economies; accordingly, it has been predicted that only mass-market microprocessors will prove to have acceptable costs in future high performance systems. Certainly current use of workstation microprocessors such as Sparc, Alpha and the RS-6000 chips suggests this trend. Even so, the cost of memory chips is likely to be a major factor in the costs of massively parallel systems, which require massive amounts of fast memory. Thus the integrated circuit technology available for both custom designs and industry standard processors will increasingly be driven by the requirements of much larger markets, including consumer electronics.

The health of the HPC vendors and the structure of their products will be heavily influenced by demand from industrial customers. Business applications represent the most rapidly growing market for HPC products; they have much higher potential growth than government or academic uses. Quite apart from NSF's obligation to contribute to the nation's economic health through its research activities, this fact motivates the importance of cooperation with industry users in expanding HPC usage. This reality means that NSF should be attentive to the value of throughput as a figure of merit in HPC systems (in contrast with turnaround time, which academic researchers usually favor), as well as the speed with which large volumes of data can be accessed. Industry won't put up with a stand-alone, idiosyncratic environment.

How practical will be the loose coupling of desktop workstations to aggregate their unused compute power?

Networks of workstations will become an important resource for the many computations that perform well on them. The probable success of these loosely coupled systems will inevitably raise the standard for communication capabilities in the multicomputer arena. Many observers believe that competition from workstation networks on one side and shared address space systems on the other will drive multicomputers from the scene entirely; in any event, the network bandwidth and latency of multicomputers must improve to differentiate them from workstation networks. Many large institutions have 1000 or more workstations already installed; the utilization rate of their processors on a 24 hour basis is probably only a few percent. An efficient way to use the power of such heterogeneous networks would be financially attractive. It will, however, raise serious questions about security, control, virus prevention, and accounting programs.

Are there some emerging HPC technologies of interest other than parallel processing? What is their significance?

Neural networks have recently become popular and have been successfully applied to many pattern recognition and classification problems. Fuzzy logic has enjoyed an analogous renaissance. Technologies of this sort are both interesting and important in a broad engineering context and also are having impact on computational science and engineering. Machine learning approaches, such as neural networks, are most appropriate in scientific disciplines where there is insufficient theory to support accurate computer modeling and simulation.

How important are simulation and visualization capabilities?

Simulation will play an ever increasing role in science and engineering. Much of this work will be able to be carried out on workstations or intermediate-scale systems, but it will continue to be appropriate to share the highest performance systems (and the expertise in using them) on a national scale, to accomplish large simulations within human time scales. Smaller configurations of these machines should be provided to individual research universities for application software development and research that involves modifying the operating system and/or hardware.

Personal computer capabilities will improve, and visualization on the desktop will become more routine. Scientists and engineers in increasing numbers will need to be equipped with visualization capabilities. The usefulness of high performance computing relies on these systems because printed lists of numbers (or printed sheaves of pictures, for that matter) are increasingly unsatisfactory as an output medium, even for moderately sized simulations.

BARRIERS TO CONTINUED RAPID PROGRESS

What software and/or hardware "inventions" are needed? Who will address meeting these needs?

The most important impediment to the use of new highly parallel systems has been the difficulty of programming these machines and the wide variation that exists in communication capabilities across generations of machines as well as among the machines in a given generation. Application software developers are understandably reluctant to re-implement their large scale production codes on multicomputers, when significant effort is required to port the codes across parallel systems as they evolve. In theory, any programming model can be implemented (by appropriate compilers) on any machine. However, the inefficiency of certain models on certain architectures is so great as to render them impractical.24 What is needed in high performance computing is an architectural consensus and a simple model to summarize and abstract the machine interface to allow compilers to be ported more easily across systems, facilitating the portability of application programs. Ideally, the consensus interface should efficiently support existing programming models (even the multicomputers have created their own dusty decks), as well as more powerful models. Considerable research in the computer science community is currently devoted to these issues. It is unlikely that the diversity of programming models will decrease within the next five years, but it is likely that models will become more portable.

How important will be access to data, data management?

Besides needing high performance I/O, some fields of computational science need widely distributed access to data bases that are extremely large and constantly growing. The need is particularly felt in the earth and planetary sciences, although the requirements are also great in cellular biology, high energy physics, and other disciplines. Large scale storage hierarchies and the software to manage them must be developed, and means to distribute the data nationally and internationally are also required. Although this area of high performance computing has been relatively neglected in the past, these problems are now receiving significantly more attention.

24 For example, it is not practical to implement data-parallel compilers on the Intel iPSC/860.


ROLES FOR GOVERNMENT AGENCIES

What should government agencies (NSF, DoD, DoE) do to advance HPC beyond today's state of the art? What more might they be doing?

The National Science Foundation plays several critical roles in advancing high performance computing. First, NSF's support of basic research and human resources in all areas of science and engineering (and particularly in mathematics, computer science and engineering) has been responsible for many of the advances in our ability to successfully tackle grand challenge problems. The Supercomputer Centers and the NSFnet have been essential to the growth of high performance computing as a basic paradigm in science and engineering. These efforts have been successful and should be continued.

However, NSF has done too little in supporting computational engineering in the computer science community. For example, the NSF Supercomputer Centers were slow in providing experimental parallel computing facilities and are currently not responding adequately to integrating emerging technologies from the computer science community. Although this situation is gradually changing, the pace of the change should be accelerated.

Many advances in high performance computer systems have been funded and encouraged by the Advanced Research Projects Agency (ARPA), the major supporter of large scale projects in computer science and engineering research and development in the US. ARPA has been charged by Congress to champion "dual use" technology; in so doing it is addressing many of the needs of computational science and engineering, even in the mathematical software arena, that are common to defense and commercial applications, and the science that underlies both.

The Department of Energy has traditionally provided substantial support to computer science and engineering research within its national laboratories and at universities, with strong impetus being provided by national defense requirements and resources. More recently, the focus has shifted to the high performance computing and communications needs of the unclassified Energy Research programs within DoE. The National Energy Research Supercomputer Center (NERSC) and the Energy Sciences Network (ESnet) provide production services similar to the NSF supercomputer centers and the NSFnet. Under the DoE HPCC component, "grand challenge" applications are supported at NERSC and also at two High Performance Computing Research Centers (HPCRCs) which offer selected access for grand challenge applications to leading edge parallel computing machines. DoE also sponsors a variety of graduate fellowships in the computational sciences. The computational science infrastructure and traditions of DoE remain sound; however, the ability of the Department to advance the state-of-the-art in high performance computing systems will be paced by its share of the funding available through the Federal High Performance Computing Initiative or through Defense conversion funds.

The Department of Commerce has not been a significant source of funds for computer system research and development since the very early days of the computer industry, when the National Bureau of Standards built one of the first digital computers. NBS has been an important factor in supporting standards development, particularly for the Federal Information Processing Standards issued by GSA. The expanded role of the National Institute of Standards and Technology (as NBS is now called) under the Clinton administration may include this kind of activity, especially when industrial participation is a desired component.

NASA is embarked on a number of projects of potential importance, especially in the development of a shared data system for the global climate change program, which will generate massive amounts of data from the Earth Observing Satellite Program.

What is the role of the NSF computer science and applied mathematics research program? Is it relevant to the availability of HPC resources in a five year time span?

Investments in mathematics and computer science research provide the foundation for attacking today's problems in high performance computing and must continue. NSF continues to be the primary U.S. source of funds for mathematics and computer science research within the scope of what one or two investigators and several graduate assistants can do. Many fundamental advances in algorithms, programming languages, operating systems, and computer architecture have been NSF funded. This mission has been just as vital as ARPA's and is complementary to it.

Among the largest barriers to the effective use of emergent computing technologies are parallel architectures from which it is relatively easy to extract peak performance, system software (operating systems, databases, compilers, programming models) to take advantage of these architectures, parallel algorithms, mathematical modeling, and efficient and high order numerical techniques. These are core computational mathematics and computer science/engineering research issues, many of which are best tackled through NSF's traditional peer-reviewed model. NSF should increase its support of this work. Increased investment in basic research and human resources in mathematics and computer science/engineering could significantly accelerate the pace of HPC technology development. In addition, technology transfer can be increased by supporting a new scale of research not currently being funded by any agency: small teams with annual budgets in the $250K-$1M range. These projects were once supported by DoE, and also indirectly by ARPA at the former "block grant" universities.25 These enterprises are now generally too small for ARPA and seem to be too big for current NSF budget levels in computer science. A project of this scale could develop and release an innovative piece of software to the high performance computing community at large, or build a modest hardware prototype as a stepping stone to more significant funding. A project of this scale would also allow multi-disciplinary collaboration, either within mathematics and computer science/engineering (architecture, operating systems, compilers, algorithms) or with related disciplines (astronomers or chemists working with computer scientists and mathematicians interested in innovative programming or architectural support for that problem domain).

25 ARPA made an enormous contribution to the maturing of computer science as a discipline in U.S. universities by consistently funding research at about $1 million per year at MIT, Carnegie Mellon, Stanford University and Berkeley. At this level of consistent support these universities could build up a critical mass of faculty and train the next generation of faculty leadership for departments being set up at every substantial research university. This targeted investment played a role in computer science not unlike what NSF has done in computational science at the four Centers.


Figure 2: Trends in user status at four National Supercomputer Centers (data from use of vector supercomputers only). [Panels plot the number of users by academic status, FY86 through FY92; categories: Faculty, Post Doc, Grad., Undergrad., HS, and Industry.]

Figure 3: Installed Supercomputer Base (U.S. vendors only). [Panels break down the installed base by region (US, Japan, Europe, Other) and by sector (Academic, Government, Industry).]

Figure 4: Trend in NSF Supercomputer Center leverage and growth in number of users, FY88-FY92. [Plots dollars in millions (NSF $ and Other $) against the number of active users.]

Appendix E

REVIEW AND PROSPECTUS OF COMPUTATIONAL AND COMPUTER SCIENCE AND ENGINEERING

Personal Statements by Panel Members

Computational Mechanics and Structural Analysis

by Theodore Belytschko

High performance computing has had a dramatic impact on structural analysis and computational mechanics, with significant benefits for various industries. The finite element method, which was developed at aerospace companies such as Boeing in the late 1950's and subsequently at the universities, has become one of the key tools for the mechanical design of almost all industrial products, including aircraft, automobiles, power plants, packaging, etc. The original applications of finite element methods were primarily in linear analysis, which is useful for determining the behavior of engineering products in normal operating conditions. Most linear finite element analyses are today performed on workstations, except for problems on the order of 1 million unknowns.

Supercomputers are used primarily for nonlinear analysis, when they replace prototype testing. One rapidly developing area has been automobile crashworthiness analysis, where models of automobiles are used to design for occupant safety and for features such as accelerometer placement for air bag deployment. The models which are currently used are generally on the order of 100,000 to 250,000 unknowns, and even on the latest supercomputers such as the CRAY C90 require on the order of 10 to 20 hours of computer time. Nevertheless these models are still often too coarse to permit true prediction and hence they must be tuned by tests.

Such models have had a tremendous impact on reducing the design time for automobiles, since they eliminate the need for building numerous prototypes. Almost all major automobile manufacturers have undertaken extensive programs in crashworthiness simulations by computer on high performance machines, and many manufacturers have bought supercomputers almost expressly for crashworthiness simulation.

Because of the increasing concern with safety among manufacturers of many other products, nonlinear analyses are also emerging in many other industries: the manufacture of trucks and construction equipment, where the product must be certified for safety in various accidents such as overturning or impact due to falling construction equipment; railroad car safety; and the safety of aircraft, where recently the FAA has undertaken programs to simulate the response of aircraft to small weapons so that damage from such explosives can be minimized. Techniques of this type are also being used to analyze the safety of jet engines under bird impact, the containment of fragments in case of jet engine failure, and bird impact on aircraft canopies. In several cases, NSF Supercomputer Centers have introduced industry to the potentials of this type of simulation. In all of these, highly nonlinear analyses which require on the order of 10x floating point operations for a simulation must be made: such simulations even on today's supercomputers are still often so time consuming that decisions cannot be reached fast enough. Therefore an urgent need exists for increasing the speed with which such simulations can be made.

Nonlinear finite element analysis is also becoming increasingly important in the simulation of manufacturing processes. For example, tremendous improvements can be made in processes such as sheet metal forming, extrusion, and machining if these are carefully designed through nonlinear finite element simulation. These simulations offer large cost reductions and reduce design time. Also the design of materials can be improved if computers are first used to examine how these materials fail and then to design the material so that failure is either decreased or so that the material fails in a less catastrophic manner. Such simulations require great resolution, and at the tips of cracks phenomena at the atomic scale must be considered.

Most of the calculations mentioned above are not made with sufficient resolution because of limitations in computational power and speed. Also, important physical phenomena are omitted for reasons of expediency, and their computational modeling is not well understood. Therefore, the availability of more computational power will increase our understanding of modeling nonlinear structural response and provide industry with more effective tools for design.

Cellular and Systemic Biology

by Teresa Chay

HPC has made a great impact on a variety of biological disciplines, such as physiology, biological macromolecules, and genetics. I will discuss below three vital organs in our body where our understanding has greatly benefitted from high-performance computing and will continue to do so in the future.

Computer Models For Vital Organs In Our Body

Although the heart, brain, and pancreas function differently in our body (i.e., the heart circulates the blood, the brain stores and transfers information, and the pancreas secretes vital hormones such as insulin), the mechanisms underlying their functioning are quite similar - "excitable" cells that are coupled electrically and chemically, forming a network.

Ion channels in the cell membranes are involved in information transfer. The ion channels receive stimuli from neighboring cells and from cells in other organs. Upon receiving stimuli some ion channels open while others close. When these channels are open, they pass ions into or out of the cells, creating an electrical difference (membrane potential) between the outside and the inside of the cell. Some of these ion channels are sensitive to the voltage (i.e., membrane potential) and others are responsive to chemical substances (e.g., neurotransmitters/hormones).

Opening of ion channels creates the "action potential" which spreads from cell to cell, either directly or via chemical mediators. Electrical transmission and chemical transmission are interdependent in that chemical substances can influence the ionic currents and vice versa. For example, the arrival of the action potential at a presynaptic terminal may cause a release of chemical substances; in turn these chemicals can open/close the ion channels in the postsynaptic cell.

Why is high-performance computing needed? How the signals are passed from cell to cell is a nonlinear dynamical problem and can be treated mathematically by solving simultaneous differential equations. These equations involve voltage, conductance of the ionic current, and concentration of those chemical substances that influence conductances. Depending on the model, each network can be represented by a set of several million differential equations. The need for parallel processors is obvious: the organs process information in much the same way as the most powerful parallel supercomputers do. Since the mechanisms involved in these three organs are essentially the same, algorithms developed for one can be easily modified to solve for another.
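As a concrete (and drastically simplified) illustration of the kind of computation described above, the sketch below integrates two electrically coupled "excitable cells" with forward Euler. The FitzHugh-Nagumo equations, the coupling strength and all parameter values are generic textbook choices made for brevity; they are not the specific heart, brain, or pancreas models discussed here, which involve millions of equations rather than four.

# Two excitable cells exchanging current through a gap-junction-like coupling.
def simulate(steps=50000, dt=0.01, g=0.1, I_ext=0.5):
    a, b, eps = 0.7, 0.8, 0.08            # FitzHugh-Nagumo parameters (illustrative)
    v = [0.0, -1.0]                       # membrane potentials of the two cells
    w = [0.0, 0.0]                        # recovery variables
    trace = []
    for _ in range(steps):
        dv, dw = [], []
        for i in (0, 1):
            j = 1 - i
            coupling = g * (v[j] - v[i])              # gap-junction current
            stim = I_ext if i == 0 else 0.0           # only cell 0 is stimulated
            dv.append(v[i] - v[i] ** 3 / 3 - w[i] + stim + coupling)
            dw.append(eps * (v[i] + a - b * w[i]))
        v = [v[i] + dt * dv[i] for i in (0, 1)]
        w = [w[i] + dt * dw[i] for i in (0, 1)]
        trace.append(tuple(v))
    return trace

trace = simulate()
print("final membrane potentials:", trace[-1])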

Three specific areas in which high-performance computing is central are cardiac research, neural networks, and insulin secretion. These are detailed below.

Cardiac Research

It would be a great benefit to cardiac research if a realistic computer model of the heart, its valves and the nearby major vessels were to become available. With such a model the general public would be able to see how the heart generates its rhythm, how this rhythm leads to contraction, and how the contraction leads to blood circulation. Scientists, on the other hand, could study normal and diseased heart function without the limitations of using human and animal subjects. With future HPC and parallel processing, it may be possible to build a model heart without consuming too many hours of computer time. As a step toward achieving this goal, the scientists in computational cardiology have thus far accomplished the following three objectives.

1. A computer model of blood flow in the heart: Researchers have used supercomputers to design an artificial mitral valve (the valve that controls blood flow between the left atrium and left ventricle) which is less likely to induce clots. The computer-simulated mitral valve has been patented and licensed to a corporation developing it for clinical use. With parallel processors, this technique is now being expanded in order to construct a realistic three dimensional heart.

2. Constructing an accurate map of the electrical potential of the heart surface (epicardium): Arrhythmia in the heart is caused by a breakdown in the normal pattern of cardiac electrical activity. Many arrhythmias occur because of an abnormal tissue inside the heart. Bioengineers have been developing a technique with which to obtain the epicardial potential map from the coarse information of it that can be recorded on the surface of the body via electrocardiogram. With such a map, clinicians can accurately locate the problem tissue and remove it with a relatively simple surgical procedure instead of with drastic open-heart surgery.

3. Controlling sudden cardiac death: Sudden cardiac death is triggered by an extra heart beat. Such a beat is believed to initiate spiral waves (i.e., reentrant arrhythmias) on the main pumping muscle of the heart known as the ventricular myocardium. With HPC it is possible to simulate how this part of the heart can generate reentrant arrhythmia upon receiving a premature pulse. Computer modelling of reentrant arrhythmia is very important clinically since it can be used as a tool to predict the onset of this type of deadly arrhythmia and find a means to cure it by properly administering antiarrhythmia drugs (instead of actually carrying out experiments on animals).

Parallel computing and the development of better software will soon enable researchers to extend their simulations to a more realistic three-dimensional system which includes the detailed geometry of ventricular muscle.

Neural Networks

Learning how the brain works is a grand challenge in science and engineering. Artificial neural nets are based largely on their connection patterns, but they have very simple processing elements or nodes (either ON or OFF). That is, a simple network consists of a layered network with an input layer (sensory receptors), one or more "hidden" layers (representing the interneurons which allow animals to have complex behaviors), and an output layer (motor neurons). Each unit in a neural net receives inputs, both excitatory and inhibitory, from a number of other units and, if the strength of the signal exceeds a given threshold, the "on-unit" sends signals to other units.
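A minimal sketch of such a layered network of ON/OFF threshold units is given below. The weights and thresholds are hand-picked so that the net computes the exclusive-or of its two inputs; the example is purely illustrative and is not drawn from the report.

# ON/OFF units with excitatory (positive) and inhibitory (negative) connections.
def unit(inputs, weights, threshold):
    # A unit is ON (1) if the weighted sum of its inputs exceeds its threshold.
    return 1 if sum(x * w for x, w in zip(inputs, weights)) > threshold else 0

def network(x1, x2):
    # Hidden layer: one unit detects "either input on", another "both inputs on".
    h1 = unit([x1, x2], [1.0, 1.0], threshold=0.5)
    h2 = unit([x1, x2], [1.0, 1.0], threshold=1.5)
    # Output layer: excitatory connection from h1, inhibitory from h2.
    return unit([h1, h2], [1.0, -1.0], threshold=0.5)

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "->", network(x1, x2))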

The real nervous system, however, is a complex organ that cannot be viewed simply as an artificial neural net. Neural nets are not hard-wired but are made of neurons which are connected by synapses. There are at least 10 billion active neurons in the brain. There are thousands of synapses per neuron, and hundreds of active chemicals which can modify the properties of ion channels in the membrane.

With HPC and massively parallel computing, neuroscientists are moving into a new phase of investigation which focuses on biological neural nets, incorporating features of real neurons and the connectivity of real neural nets. Some of these models are capable of simulating patterns of electrical activity, which can be compared to actual neuronal activity observed in experiments. With the biological neural nets, we begin to understand the operation of the nervous system in terms of the structure, function and synaptic connectivity of the individual neurons.

Insulin Secretion

Insulin is secreted from the beta cells in the pancreas. To cure diabetes it is essential to understand how beta cells release insulin. The beta cells are located in a part of the pancreas known as the islet of Langerhans. In islets, beta cells are coupled by a special channel (gap junctional channel) which connects one cell to the next. Gap junctional channels allow small ions such as calcium ions to pass through from cell to cell. In the plasma membrane of beta cells, there are ion channels whose properties change when the content of calcium ions changes. There are other types of cells in the islet which secrete hormones. These hormones in turn influence insulin secretion by altering the properties of the receptors bound in the membrane of a beta cell. Thus the study of how beta cells release insulin involves very complex non-linear dynamics. With a supercomputer it is possible to construct a model of the islet of Langerhans. With this model, researchers would learn how beta cells release insulin in response to external signals such as glucose, neurotransmitters and hormones. They would also learn the roles of other cell types in the islet of Langerhans and how they influence the functional properties of beta cells. A model in which beta cells function as a cluster has already been constructed.

Material Science and Condensed Matter Physics

by Gary S. Grest

The impact of high performance computing on material science and condensed matter physics has been enormous. Major developments in the sixties and seventies set the stage for the establishment of computational material science as a third discipline, equal to, yet distinct from, analytic theory and experiment. These developments include the introduction of molecular dynamics and Monte Carlo methods to simulate the properties of liquids and solids under a variety of conditions. Density functional theory was developed to model the electron-electron interactions and pseudopotential methods to model the electron-ion interactions. These methods were crucial in computing the electronic structure for a wide variety of solids. Later, the development of path integral and Green's function Monte Carlo methods allowed one to begin to simulate quantum many-body problems. Quantum molecular dynamics methods, which combine well-established electronic methods based on local density theory with molecular dynamics for atoms, have recently been introduced. On a more macroscopic scale, computational mechanics, which was discussed above by T. Belytschko, was developed to study structure-property relations.

Current usage of high performance computing in material science and condensed matter physics can be broadly classified as Classical Many-Body and Statistical Mechanics, Electronic Structure and Quantum Molecular Dynamics, and Quantum Many-Body, which are discussed below.


Classical Many-Body and Statistical Mechanics

Classical statistical mechanics, where one treats a huge number of atoms collectively, dates back to Boltzmann and Gibbs. In these systems, quantum mechanics plays only a subsidiary role. While it is needed to determine the interaction between atoms, in practice these interactions are often replaced by phenomenologically determined pairwise forces between the atoms. This allows one to treat large ensembles of atoms by molecular dynamics and Monte Carlo methods. Successes of this approach include insight into the properties of liquids, phase transitions and critical phenomena, crystallization of solids and compositional ordering in alloys. For systems where one needs a quantitative comparison to experiment, embedded atom methods have been developed in which empirically determined functions are employed to evaluate the energy and forces. Although the details of the electronic structure are lost, these empirical methods have been successful in giving reasonable descriptions of the physical processes in many systems in which directional bonding is not important. Theoretical work in the mid-70's on renormalization group methods showed that a wide variety of different kinds of phase transitions could be classified according to the symmetry of the order parameter and the range of the interaction and did not depend on the details of the interaction potential. This allowed one to use relatively simple models, usually on a lattice, to study critical phenomena and phase behavior.

While the basic computational techniques used in classical many-body theory are now well established, there remain a large number of important problems in material science which can only be addressed with these techniques. At present, with Cray YMP class computers, one can typically handle thousands of atoms for hundreds of picoseconds. With the next generation of massively parallel machines, this can be extended to millions of particles for microseconds. While not all problems require this large number of particles or long times, many do. Problems which will benefit from the faster computational speed typically involve either large lengths and/or long time scales.

Examples include polymers and macromolecular liquids, where typical length scales of each molecule can be hundreds of angstroms and relaxation time scales extend to microseconds and longer; liquids near their glass transition, where relaxation times diverge exponentially; nucleation and phase separation, which require both large systems and long times; and effects of shear. Macromolecular liquids typically contain objects of very different sizes. For example, in most colloidal suspensions, the colloid particles are hundreds of angstroms in size while the solvent is only a few angstroms. At present, the solvent molecules must be treated as a continuum background. While this allows one to study the static properties of the system, the dynamics are incorrect. Faster computers will allow us to study flocculation, sedimentation and the effects of shear on order. Non-equilibrium molecular dynamics methods have been developed to simulate particles under shear. However, due to the lack of adequate computational power, simulations at present can only be carried out at unphysically high shear rates. Access to HPC will enable one to understand the origins of shear thinning and thickening in a variety of technologically important systems. While molecular dynamics simulations are inherently difficult to vectorize, recent efforts to run them on parallel computers have been very encouraging, with increases in speed of nearly a factor of 30 in comparison to the Cray YMP.
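For readers unfamiliar with the technique, the sketch below shows the skeleton of a classical molecular dynamics calculation of the kind described above: a handful of atoms interacting through a phenomenological pairwise (Lennard-Jones) potential, advanced with the velocity Verlet integrator. The system size, parameters and reduced units are illustrative toys; production simulations of the period involved thousands to millions of atoms on vector or parallel machines.

import random

N, DT, STEPS = 8, 0.005, 1000
EPS, SIGMA = 1.0, 1.0              # Lennard-Jones well depth and diameter (reduced units)

def forces(pos):
    f = [[0.0, 0.0, 0.0] for _ in range(N)]
    for i in range(N):
        for j in range(i + 1, N):
            d = [pos[i][k] - pos[j][k] for k in range(3)]
            r2 = sum(x * x for x in d)
            inv6 = (SIGMA * SIGMA / r2) ** 3
            # Pair force written as a factor multiplying the separation vector.
            fac = 24.0 * EPS * inv6 * (2.0 * inv6 - 1.0) / r2
            for k in range(3):
                f[i][k] += fac * d[k]
                f[j][k] -= fac * d[k]
    return f

# Eight atoms on the corners of a small cube, with small random velocities.
pos = [[2.0 * (i % 2), 2.0 * ((i // 2) % 2), 2.0 * (i // 4)] for i in range(N)]
vel = [[random.uniform(-0.1, 0.1) for _ in range(3)] for _ in range(N)]
f = forces(pos)
for _ in range(STEPS):
    for i in range(N):
        for k in range(3):
            pos[i][k] += DT * vel[i][k] + 0.5 * DT * DT * f[i][k]
    f_new = forces(pos)
    for i in range(N):
        for k in range(3):
            vel[i][k] += 0.5 * DT * (f[i][k] + f_new[i][k])
    f = f_new
print("final position of atom 0:", pos[0])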

Monte Carlo simulations on a lattice remain a very powerful computational technique. Simulations of this type have been very successful in understanding critical phenomena, phase separation, growth kinetics and disordered magnetic systems. Successes include accurate determination of universal critical exponents, both static and dynamic, and evidence for the existence of a phase transition in spin glasses. Future work using massively parallel computers will be essential to understand wetting and surface critical exponents as well as systems with complex order parameters. Direct numerical integration of the set of Langevin equations that describe nonlinear fluctuating hydrodynamics can be carried out in two dimensions on a Cray YMP class supercomputer, but the extension to three dimensions requires HPC. Finally, cellular automata solutions of the Navier-Stokes and Boltzmann equations are a powerful method for studying hydrodynamics. All of these methods, because of their inherent locality, run very efficiently on parallel computers.
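A minimal sketch of the lattice Monte Carlo technique mentioned above: Metropolis sampling of a small two-dimensional Ising model. The lattice size, temperature and number of sweeps are illustrative only; studies of critical exponents or spin glasses use far larger lattices and vastly longer runs.

import math, random

L, T, SWEEPS = 16, 2.3, 2000           # lattice edge, temperature, Monte Carlo sweeps
spin = [[random.choice((-1, 1)) for _ in range(L)] for _ in range(L)]

def neighbor_sum(i, j):
    # Nearest neighbors with periodic boundary conditions.
    return (spin[(i + 1) % L][j] + spin[(i - 1) % L][j] +
            spin[i][(j + 1) % L] + spin[i][(j - 1) % L])

for _ in range(SWEEPS):
    for _ in range(L * L):
        i, j = random.randrange(L), random.randrange(L)
        dE = 2.0 * spin[i][j] * neighbor_sum(i, j)    # energy cost of flipping one spin
        if dE <= 0 or random.random() < math.exp(-dE / T):
            spin[i][j] = -spin[i][j]                  # accept the flip (Metropolis rule)

m = abs(sum(sum(row) for row in spin)) / (L * L)
print(f"|magnetization| per spin at T={T}: {m:.3f}")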

Electronic Structure and Quantum Molecular Dynamics

The ability of quantum mechanics to predict the total energy of a system of electrons and nuclei enables one to reap tremendous benefits from quantum-mechanical calculations. Since many physical properties can be related to the total energy of a system or to differences in total energy, tremendous theoretical effort has gone into developing accurate local density functional total energy techniques. These methods have been very successful in predicting with accuracy equilibrium constants, bulk moduli, phonons, piezoelectric constants and phase-transition pressures and temperatures for a variety of materials. These methods have recently been applied to study the structural, vibrational, mechanical and other ground state properties of systems containing up to several hundred atoms. Some recent successes include the unraveling of the normal state properties of high T_c superconducting oxides, predictions of new phases of materials under high pressure, predictions of superhard materials, determination of the structure and properties of surfaces, interfaces and clusters, and calculations of properties of fullerenes and fullerites.

Particularly important are the developments of the past few years which make it possible to carry out "first principles" computations of complex atomic arrangements in materials starting from nothing more than the identities of the atoms and the rules of quantum mechanics. Recent developments in new iterative diagonalization algorithms, coupled with increases in the computational efficiency of modern high performance computers, have made it possible to use quantum mechanical calculations of the dynamics of systems in the solid, liquid and gaseous state. The basic idea of these methods, which are known as ab initio methods, is to minimize the total energy of the system by allowing both the electronic and the ionic degrees of freedom to relax towards equilibrium simultaneously. While ab initio methods have been around for more than a decade, only recently have they been applied to systems of more than a few atoms. Now, however, this method can be used to model a few hundred atom system, and this number will increase by at least a factor of 10 within the next five years. The method has already led to new insights into the structure of amorphous materials, finite temperature simulations of the new C60 solid, computation of the atomic and electronic structures of the 7x7 reconstruction of the Si(111) surface, melting of carbon, and studies of step geometries on semiconductor surfaces. In the future, it will be possible to address many important materials phenomena including phase transformations, grain boundaries, dislocations, disorder and melting.

The problem of understanding and improving the methods of growth of complicated materials, such as multi-component heterostructures which are produced by epitaxial growth using molecular beam or chemical vapor deposition techniques, stands out as one very important technological application of this method. Although a "brute-force" simulation of atomic deposition on experimental time scales will not be possible for some time, one can learn a great deal from studying the mechanisms of reactive film growth. Combining atomic calculations for the structure of an interface with continuum theories of elasticity and plastic deformation is also an important area for the future.

One of the most obvious areas for future applications is biological systems, where key reaction sequences would be simulated in ab initio fashion. These calculations would not replace existing molecular mechanics approaches, but rather supplement them in those areas where they are not sufficiently reliable. This includes enzymatic reactions involving transition metal centers and other multi-center bond-reforming processes. A related area is catalysis, where the various proposed reaction mechanisms could be explicitly evaluated. Short-time finite temperature simulations can also be explored to search for unforeseen reaction patterns. The potential for new discoveries in these areas is high.

Important progress has also been made in understanding the excitation properties of solids, in particular the predictions of band offsets and optical properties. This requires the evaluation of the electron self-energy and is computationally much heavier than the local density approaches discussed above. This first principles quasiparticle approach has allowed for the first time the ab initio calculation of electron excitation energies in solids valid for quantitative interpretation of spectroscopic measurements. The excitations of systems as complex as C_60 fullerites have been computed. Although the quasiparticle calculations have yet to be implemented on massively parallel machines, it is doable, and the gain in efficiency and power is expected to be similar to the ab initio molecular dynamics types of calculations.

Much effort has been devoted in the past several years to algorithm development to extend the applicability of these new methods to ever larger systems. The ab initio molecular dynamics have been successfully implemented on massively parallel machines for systems as large as 700 atoms. Tight binding molecular dynamics methods are an accurate, empirical way to include the electronic degrees of freedom which are important for covalently bonded materials, at speeds 300 times faster than ab initio methods. This method has already been used to simulate 2000 atoms, and with the new massively parallel machines this number will easily increase to 10,000 within a year. Another very exciting recent development in this area is work on the so-called order N methods for electronic structure calculations. At present, quantum mechanical calculations scale at least as N^3 in the large N limit, where N is the number of atoms in a unit cell. Significant progress has been made recently by several groups in developing methods which would scale as N. The success of these approaches would further enhance our ability to study very large molecular and materials systems, including systems with perhaps thousands of atoms, in the near future.
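The payoff from order-N scaling is easy to quantify. In the sketch below, the N^3 and N scaling laws come from the paragraph above, while the specific system sizes are arbitrary illustrative choices:

# Compare how the cost of an electronic structure calculation grows under
# N^3 scaling versus order-N scaling as the system size increases.
n_small, n_large = 100, 1000               # atoms per unit cell, before and after (illustrative)
growth = n_large / n_small
print(f"System grows by {growth:.0f}x")
print(f"Cost growth with N^3 scaling: {growth ** 3:.0f}x")
print(f"Cost growth with order-N scaling: {growth:.0f}x")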


Quantum Many-Body

The quantum many-body problem lies at the core of understanding many properties of materials. Over the last decade much of the classical methodology discussed above has been extended into the quantum regime, in particular with the development of the path integral and Green's function Monte Carlo methods. Early calculations of the correlation energy of the electron gas are extensively used in local density theory to estimate correlation energy in solids. The low temperature properties of liquid and solid helium-3 and helium-4, the simplest strongly correlated many-body quantum systems, are now well understood thanks in large part to computer simulations. These quantum simulations have required thousands of hours on Cray-YMP class computers.

While there still remain very difficult algorithmic issues, exact fermion methods and quantum dynamical methods to name two, the progress in the next decade should parallel the previous developments in classical statistical mechanics. Computer simulations of quantum many-body systems will become a ubiquitous tool, integrated into theory and experiment. The software and hardware have reached a state where much larger, complex and realistic systems can be studied. Some particular examples are electrons in transition metals, in restricted geometries, at high temperatures and pressures, or in strong magnetic fields. Mean field theory is unreliable in many of these situations. However, these applications, if they are to become routine and widely distributed in the materials science community, will require high performance hardware. Quantum simulations are naturally parallel and are likely to be among the first applications using massively parallel computers.

Thanks to J. Bernholc, D. Ceperley, J. Joannopoulos, B. Harmon and S. Louie for their help in preparing this subsection.


Computational Molecular Biology/Chemistry/Biochemistry

by Barry Honig

Background

There have been a number of revolutionary developments in molecular biology that have greatly expanded the need for high performance computing within the biological community. First, there has been an exponential growth in the number of gene sequences that have been determined, and no end is in sight. Second, there has been a parallel (although slower) growth in the number of proteins whose structures have been determined from x-ray crystallography and, increasingly, from multidimensional NMR. This literal explosion in new information has led to developments in areas such as statistical analysis of gene sequences and of three dimensional structural data, new databases for sequence and structural information, molecular modeling of proteins and nucleic acids, and three-dimensional pattern recognition. The recognition of grand challenge problems such as protein folding or drug design has resulted in large part from these developments. Moreover, the continuing interest both in sequencing the human genome and in the field of structural biology guarantees that computational requirements will continue to grow rapidly in the coming decade.

To illustrate the type of problems that can arise, consider the case where a new gene has been isolated and its sequence is known. In order to fully exploit this information it is necessary to first obtain maximum information about the protein this gene encodes. This can be accomplished by searching a nucleic acid sequence data base for structurally or functionally related proteins, and/or by detecting sequence patterns characteristic of the three dimensional fold of a particular class of proteins. There are numerous complexities that arise in such searches, and the computational demands imposed by the increasingly sophisticated statistical techniques that are being used can be imposing.
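A toy sketch of the search task just described is given below: scanning a handful of sequences for the best-scoring ungapped match to a query pattern. The sequences and the simple +1/-1 scoring scheme are invented for illustration; real searches of the period used far more sophisticated alignment and statistical methods over databases many orders of magnitude larger.

# Score every ungapped placement of the query against each database sequence.
database = {                                     # invented example sequences
    "seqA": "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ",
    "seqB": "MKVAYLAKQRNISFVKAHFSRQ",
    "seqC": "GGSSGGSSGGSS",
}
query = "AKQRQISF"                               # invented query pattern

def best_ungapped_score(seq, query):
    best = None
    for start in range(len(seq) - len(query) + 1):
        window = seq[start:start + len(query)]
        score = sum(1 if a == b else -1 for a, b in zip(window, query))
        if best is None or score > best[0]:
            best = (score, start)
    return best

for name, seq in database.items():
    score, start = best_ungapped_score(seq, query)
    print(f"{name}: best score {score} at position {start}")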

A variety of methods, all of them requiring vast computational resources, are currently being applied to the protein folding problem (predicting three dimensional structure from amino acid sequence). Methods include statistical analyses (including neural nets) of homologies to known structures, approaches based on physical chemical principles, and simplified lattice models of the type used in polymer physics. A major problem in understanding the physical principles of protein and nucleic acid conformation is the treatment of the surrounding solvent. Molecular dynamics techniques are widely used to model the solvent but their accuracy depends on the potential functions that are used as well as the number of solvent molecules that can be included in a simulation. Thus, the technique is limited by the available computational power. Continuum solvent models offer an alternative approach but these too are highly computer intensive.

Even assuming a reliable method to evaluate free energies, the problem of conformational search is daunting. There are a large number of possible conformations available to a macromolecule, and it is necessary to develop methods, such as Monte Carlo techniques with simulated annealing, to ensure that the correct one has been included in the generated set of possibilities. A similar set of problems arises, for example, in the problem of structure-based drug design. In this case one may know the three-dimensional structure of a protein and it is necessary to design a molecule that binds tightly to a particular site on the surface. Efficient conformational search, energy evaluation and pattern recognition are requirements of this problem, all requiring significant computational power.
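
A minimal sketch of the Monte Carlo/simulated annealing idea mentioned above, with a made-up energy function standing in for a real molecular force field (the function and parameter names are illustrative, not taken from any specific code):

    import math
    import random

    def anneal(energy, propose, x0, t_start=5.0, t_end=1e-3, steps=50_000, seed=0):
        """Metropolis Monte Carlo with a geometric cooling schedule.

        energy(x)  -> float      : stand-in for a conformational energy function
        propose(x) -> new state  : small random perturbation of a conformation
        """
        rng = random.Random(seed)
        x, e = x0, energy(x0)
        best_x, best_e = x, e
        cooling = (t_end / t_start) ** (1.0 / steps)
        t = t_start
        for _ in range(steps):
            x_new = propose(x)
            e_new = energy(x_new)
            # Accept downhill moves always; uphill moves with Boltzmann probability.
            if e_new <= e or rng.random() < math.exp(-(e_new - e) / t):
                x, e = x_new, e_new
                if e < best_e:
                    best_x, best_e = x, e
            t *= cooling
        return best_x, best_e

    # Toy "conformation": a list of torsion angles; toy energy: a rugged landscape.
    # A real application would evaluate a molecular mechanics force field instead.
    def toy_energy(angles):
        return sum(math.cos(3 * a) + 0.1 * a * a for a in angles)

    def toy_propose(angles):
        new = list(angles)
        new[random.randrange(len(new))] += random.gauss(0.0, 0.3)
        return new

    best, e = anneal(toy_energy, toy_propose, [0.0] * 10)
    print("best energy found:", round(e, 3))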

Significant progress has been made in these and many other related areas. Ten years ago most calculations were made without including the effects of solvent. This situation has changed dramatically due to scientific progress that has been potentiated by the availability of significant computational power. Some of this has been provided by supercomputer centers, while some has been made available by increasingly powerful workstations. Fast computers have also been crucial in the very process of three dimensional structure determination. Both x-ray and NMR data analysis have exploited methods such as molecular dynamics and simulated annealing to yield atomic coordinates of macromolecules. More generally, the new discipline of structural biology, which involves the structure determination and analysis of biological macromolecules, has been able to evolve due to the increased availability of high performance computing.

Future

Despite the enormous progress that has been made, the field is just beginning to take off. Gene sequence analysis will continue to become more effective as the available data continue to grow and as increasingly sophisticated data analysis techniques are applied. It will be necessary to make state-of-the-art sequence analysis available to individual investigators, presumably through distributed workstations and through access to centralized resources. This will require a significant training effort as well as the development of user-friendly programs for the biological community.

There is enormous potential in the area of three dimensional structure analysis. There is certain to be major progress in understanding the physical chemical basis of biological structure and function. Improved energy functionals resulting from progress in quantum mechanics will become available. Indeed, a combination of quantum mechanics and reaction field methods will make it possible to obtain accurate descriptions of molecules in the condensed phase. The impact of such work will be felt in chemistry as well as in biology. Improved descriptions of the solvent, through a combination of continuum treatments and detailed molecular dynamics simulations at the atomic level, will lead to reliable descriptions of the conformational free energies and binding free energies of biological macromolecules. When combined with sophisticated conformational search techniques, simplified lattice models, and sophisticated statistical techniques that identify sequence and structural homologies, there is every reason to expect major progress on the protein folding problem.

There will be parallel improvements in structure-based design of biologically active compounds such as pharmaceuticals. Moreover, the development of new compounds based on biomimetic chemistry and new materials based on polymer design principles deduced from biomolecules should become a reality. All of this progress will require increased access to high performance computing for the reasons given above. The various simulation and conformational search techniques will continue to benefit dramatically from increased computational power. This will be true at the level of individual workstations, which a bench chemist, for example, might use to design a new drug. Work of this type often requires sophisticated three dimensional graphics and will benefit from progress in this area. Massively parallel machines will certainly be required for the most ambitious projects. Indeed, it is likely that for some applications the need for raw computing power will exceed what is available in the foreseeable future.

New developments in the areas covered in this section will have major economic impact. The biotechnology, pharmaceutical and general health industries are obvious beneficiaries, but there will be considerable spin-off in materials science as well.

Recommendations

Support the development of software in computational biology and chemistry. This should take the form of improved software and algorithms for workstations as well as the porting of existing programs and the development of new ones on massively parallel machines.

Make funds available for training that will exploit new technologies and for familiarizing biologists with existing technologies.

Funding should be divided among large centers; smaller centers involving a group of investigators at a few sites developing new technologies; and individual investigators.


Molecular Modeling and Quantum Chemistry

by William Lester

The need for high performance computing has been met historically by large vector supercomputers. It is generally agreed in the computer and computational science communities that significant improvements in computational efficiency will arise from parallelism. The advent of conventional parallel computer systems has generally required major computer code restructuring to move applications from vector serial computers to parallel architectures. The move to parallelism has occurred in two forms: distributed MIMD machines and clusters of workstations, with the former receiving the focus of attention in large multi-user center facilities and the latter in local research installations.

The tremendous interest in the simulation of biological processes at the molecular level using molecular mechanics and molecular dynamics methods has led to a continuing increase in the demand for computational power. Applications have potentially high practical value and include, for example, the design of inhibitors for enzymes that are suspected to play a role in disease states and the effect of various carcinogens on the structure of DNA.

In the first case, one expects that a molecule designed to conform within the three-dimensional arrangement of the enzyme structure should be bound tightly to the enzyme in solution. This requirement, and others, make it desirable to know the tertiary structure of the enzyme. The use of computation for this purpose is contributing significantly to the understanding of those structures, which are then used to guide organic synthesis.

In the second case, serial vector supercomputers typically can carry out molecular dynamics simulations of DNA for time frames of only picoseconds to nanoseconds. A recent calculation of 200 ps involving 3542 water molecules and 16 sodium ions took 140 hours of Cray Y-MP time. Extending such calculations to the millisecond or even the second range, where important motions can occur, remains a major computational challenge that will require the use of massively parallel computer systems.
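
A back-of-the-envelope extrapolation of the figures just quoted shows the scale of the problem (assuming, simplistically, that cost grows linearly with simulated time):

    # Extrapolation from the 200 ps / 140 Cray Y-MP hour figure quoted above,
    # assuming cost grows linearly with simulated time.
    hours_per_ps = 140.0 / 200.0            # roughly 0.7 machine-hours per picosecond
    target_ps = 1e-3 / 1e-12                # one millisecond expressed in picoseconds
    total_hours = hours_per_ps * target_ps  # about 7e8 hours at the same rate
    print(f"{total_hours:.1e} Cray Y-MP hours, or about {total_hours / 8760:.0f} machine-years")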

In molecular mechanics or force field methods, computational effort is dominated by the evaluation of the force field that gives the potential energy as a function of the internal coordinates of the molecules and of the non-bonded interactions between atoms. For small organic molecules, a popular alternative is the ab initio Hartree-Fock (HF) method, which has come into routine use by organic and medicinal chemists to study compounds and drugs. The HF method is ab initio because the calculation depends only on basic information about the molecule, including the number and types of atoms and the total charge. The computational effort of HF computations scales as N^4, where N is the number of basis functions used to describe the atoms of the molecule. Because the HF method describes only the "average" behavior of electrons, it typically provides a better description of relative geometries than of energetics. The accurate treatment of the latter requires proper account of the instantaneous correlated motions of electrons, which the HF method inherently does not describe.

For systems larger than those accessible with the HF methods, one has, in addition to molecular mechanics methods, semiempirical approaches. Their name arises from the use of experimental data to parameterize integrals and from other simplifications of the HF method, leading to a reduction of computational effort to order N^3. Results of these methods can be informative for systems where parametrization has been performed.
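
The practical weight of these scaling exponents is easy to see; the short sketch below uses arbitrary cost units and simply compares the formal N^4 and N^3 growth rates:

    # Formal scaling comparison: doubling the number of basis functions N multiplies
    # the cost of an O(N^4) Hartree-Fock calculation by 2**4 = 16, versus 2**3 = 8
    # for an O(N^3) semiempirical method.  The cost units are arbitrary; only the
    # growth rates reflect the text.
    for n in (100, 200, 400, 800):
        print(f"N={n:4d}   HF ~ {n ** 4:.2e}   semiempirical ~ {n ** 3:.2e}")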

Recently, the density functional (DF) method has become popular, overcoming deficiencies in accuracy for chemical applications that limited earlier use. Improvements have come in the form of better basis sets, advances in computational algorithms for solving the DF equations, and the development of analytical geometry optimization methods. The DF method is an ab initio approach that takes into account electron correlation. In view of the latter capability, it can be used to study a wide variety of systems, including metals and inorganic species.

The move to parallel systems has turned out to be a major undertaking in software development for the approaches described. Serious impediments have been encountered in algorithm modification for methods that go beyond the ab initio HF method, and in steps to maximize efficiency with increased numbers of processors. These circumstances have increased interest in quantum Monte Carlo (QMC) methods for electronic structure. In addition, QMC methods have been used with considerable success for the calculation of vibrational eigenvalues and in statistical mechanics studies.

QMC, as used here in the context of electronic structure, is an ab initio method for solving the Schroedinger equation stochastically, based on the formal similarity between the Schroedinger equation and the classical diffusion equation. The power of the method is that it is inherently an N-body method that can capture all of the instantaneous correlation of the electrons. The QMC method is readily ported to parallel computer systems, with orders of magnitude savings in computational effort over serial vector supercomputers.
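
To make the diffusion analogy concrete, the sketch below implements a deliberately simplified diffusion Monte Carlo walk for a single particle in a one-dimensional harmonic well. Real electronic-structure QMC adds importance sampling, trial wave functions, and fermion sign handling, none of which is attempted here; all parameters are illustrative.

    import math
    import random

    # Minimal diffusion Monte Carlo for V(x) = x^2 / 2, whose ground-state
    # energy is 0.5 in these units.  Walkers diffuse and branch; the reference
    # energy that keeps the population stable estimates the ground-state energy.
    def dmc(n_walkers=500, n_steps=2000, dt=0.01, seed=1):
        rng = random.Random(seed)
        V = lambda x: 0.5 * x * x
        walkers = [rng.gauss(0.0, 1.0) for _ in range(n_walkers)]
        e_ref = sum(V(x) for x in walkers) / len(walkers)
        energies = []
        for step in range(n_steps):
            new_walkers = []
            for x in walkers:
                x_new = x + rng.gauss(0.0, math.sqrt(dt))      # diffusion step
                w = math.exp(-dt * (V(x_new) - e_ref))         # branching weight
                copies = int(w + rng.random())                 # stochastic rounding
                new_walkers.extend([x_new] * copies)
            walkers = new_walkers or [rng.gauss(0.0, 1.0)]
            # Crude population control: steer e_ref toward the target walker count.
            e_local = sum(V(x) for x in walkers) / len(walkers)
            e_ref = e_local - 0.1 * math.log(len(walkers) / n_walkers)
            if step > n_steps // 2:                            # discard equilibration
                energies.append(e_ref)
        return sum(energies) / len(energies)

    print("estimated ground-state energy:", round(dmc(), 3))   # expect roughly 0.5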

In statistical mechanical studies of complex systems, one is often interested in the spontaneous formation and energetics of structure over large length scales. The mesoscopic structures, such as vesicles and lamellae, formed from self-assembly in oil/water-surfactant mixtures are important examples. The systematic analysis of these phenomena has only recently begun, and computer simulation is one of the important tools in this analysis. Due to the large length scales involved, simulation is necessarily confined to very simple classes of models. Even so, the work presses the capabilities of current computational equipment to their limits. While the stability of various nontrivial structures has been documented, we are still far from understanding the rich phase diagram of such systems. The work of Smit and coworkers using transputers demonstrates the utility of parallelization in these simulations. Future equipment should carry us much further toward understanding.

Competing interactions and the concomitant "frustration" characterize complex fluids and the resulting mesoscopic structures. Such competition is also a central feature of "random polymers," a model for proteins and also for manufactured polymers. Computer simulation studies of random polymers can be extremely useful, though here too the computations of even the simplest models press the limits of current technology. To treat this class of systems, Binder and coworkers and Frenkel and coworkers have developed new algorithms, some of which are manifestly parallelizable. Thus, this area is one where the new computer technology should be very helpful.

Along with large length scale fluctuations, as in self-assembly and polymers, simulations press current computational equipment where relaxation occurs over many orders of magnitude in time. This is the phenomenon of glasses. Here, the work of Fredrickson on the spin-facilitated Ising system demonstrates the feasibility of parallelization in studying long time relaxation and the glass transition by simulation.

Polymers and glasses are examples pertinent to the understanding and design of advanced materials. In addition, one needs to understand the electronic and magnetic behavior of these and other condensed matter systems. In recent years, a few methods have appeared, especially the Car-Parrinello approach, which now make feasible the calculation of electronic properties of complex materials. The calculations are intensive. For example, studying the dynamics and electronic structure of a system with only 64 atoms, periodically replicated, for only a picosecond is at the limits of current capabilities. With parallelization, and with simplified models, one can imagine, however, significant progress in our understanding of metal-insulator transitions and localization in correlated disordered systems.


Mathematics and High Performance Computing

by James Sethian

A. Introduction

Mathematics underlies much, if not all, of high performance computing. At first glance, it might seem that mathematics, with its emphasis on theorems and proofs, might have little to contribute to solving large problems in the physical sciences and engineering. On the contrary, in the same way that mathematics contributes the underlying language for problems in the sciences, engineering, discrete systems, and so on, mathematical theory underlies such factors as the design and understanding of algorithms, error analysis, approximation accuracy, and optimal execution. Mathematics plays a key role in the drive to produce faster and more accurate algorithms which, in tandem with hardware advances, produce state-of-the-art simulations across the wide spectrum of the sciences.

At the same time, high performance computing provides a valuable laboratory tool for many areas of theoretical mathematics such as number theory and differential geometry. At the heart of most simulations lies a mathematical model and an algorithmic technique for approximating the solution to this model. Aspects of such areas as approximation theory, functional analysis, numerical analysis, probability theory, and the theory of differential equations provide valuable tools for designing effective algorithms, assessing their accuracy and stability, and suggesting new techniques.

What is so fascinating about the intertwining of computing and mathematics is that each invigorates the other. For example, understanding of entropy properties of differential equations has led to new methods for high resolution shock dynamics, approximation theory using multipoles has led to fast methods for N-body problems, methods from hyperbolic conservation laws and differential geometry have produced exciting schemes for image processing, parallel computing has spawned new schemes for numerical linear algebra and multi-grid techniques, and methods designed for tracking physical interfaces have launched new theoretical investigations in differential geometry, to name just a few. Along the way, this interrelation between mathematics and computing has brought breakthroughs in such areas as material science (such as new schemes for solidification and fracture problems), computational fluid dynamics (e.g., high order projection methods and sophisticated particle schemes), computational physics (such as new schemes for Ising models and percolation problems), environmental modeling (such as new schemes for groundwater transport and pollutant modeling), and combustion (e.g., new approximation models and algorithmic techniques for flame chemistry/fluid mechanical interactions).

B. Current State

Mathematical research which contributes to high performance computing exists across a wide range. On one end are individual investigators or small, joint collaborations. In these settings, the work takes a myriad of forms: brand-new algorithms are invented which can yield an order of magnitude savings in computer resources, existing techniques are analyzed for convergence properties and accuracy, and model problems are posed which can isolate particular phenomena. For example, such work includes analysts working on fundamental aspects of the Navier-Stokes equations and turbulence theory (cf. the following section on Computational Fluid Dynamics), applied mathematicians designing new algorithms for model equations, discrete mathematicians focussed on combinatorics problems, and numerical linear algebraists working in optimization theory. At the other end are mathematicians working in focussed teams on particular problems, for example, in combustion, oil recovery, aerodynamics, material science, computational fluid dynamics, operations research, cryptography, and computational biology.


Institutionally, mathematical work in high performance computing is undertaken at universities, the National Laboratories, the NSF High Performance Computing Centers, NSF Mathematics Centers (such as the Institute for Mathematical Analysis, the Mathematical Sciences Research Institute, and the Institute for Advanced Study), and across a spectrum of industries. In recent years, high performance computing has become a valuable tool for understanding subtle aspects of theoretical mathematics. For example, computing has revolutionized the ability to visualize and evolve complex geometric surfaces, provided techniques to untie knots, and helped compute algebraic structures.

C. Recommendations

Mathematical research is critical to ensure state-of-the-art computational and algorithmic techniques which foster the efficient use of national computing resources. In order to promote this work, it is important that:

1. Mathematicians be supported in their need to have access to the most advanced computing systems available, through networks to supercomputer facilities, on-site experimental machines (such as parallel processors), and individual high-speed workstations.

2. State-of-the-art research in modeling, new algorithms, applied mathematics, numerical analysis, and associated theoretical analysis be amply supported; it is this work that continually and continuously rejuvenates computational techniques. Without it, yesterday's algorithms will be running on tomorrow's machines.

3. Such research be supported at all levels: the individual investigator, small joint collaborations, interdisciplinary teams, and large projects.

4. Funding be significantly increased in the above areas, both to foster frontier research in computational techniques and to use computation as a bridge to bring mathematics and the sciences closer together.

Computational Fluid Dynamics

by James Sethian

The central goal of computational fluid dynamics (CFD) is to follow the evolution of a fluid by solving the appropriate equations of motion on a numerical computer. The fundamental equations require that the mass, momentum, and energy of a liquid or gas are conserved as the fluid moves. In all but the simplest cases, these equations are too difficult to solve mathematically, and instead one resorts to computer algorithms which approximate the equations of motion. The yardstick of success is how well the results of numerical simulation agree with experiment in cases where careful laboratory experiments can be established, and how well the simulations can predict highly complex phenomena that cannot be isolated in the laboratory.

The effectiveness and versatility of a computational fluid dynamics simulation rest on several factors. First, the underlying model must adequately describe the essential physics. Second, the algorithm must accurately approximate the equations of motion. Third, the computer program must be constructed to execute efficiently. And fourth, the computer equipment must be fast enough and large enough to calculate the answers sufficiently rapidly to be of use. Weaving these factors together, so that answers are accurate, reliable, and obtained with acceptable cost in an acceptable amount of time, is both an art and a science. Current uses of computational fluid dynamics range from analysis of basic research into fundamental physics to commercial applications. While the boundaries are not sharp, CFD work may be roughly categorized in three ways: Fundamental Research, Applied Science, and Industrial Design and Manufacturing.
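
As a concrete, if drastically simplified, example of approximating the equations of motion on a grid, the sketch below applies first-order upwind differencing to one-dimensional linear advection; it is a stand-in for the far more complex conservation laws treated in real CFD codes, and the grid sizes and step counts are illustrative only:

    # First-order upwind differencing for 1-D linear advection, u_t + c*u_x = 0,
    # with a periodic boundary.  A stand-in for real conservation-law solvers.
    nx, c = 200, 1.0
    dx = 1.0 / nx
    dt = 0.5 * dx / c                      # CFL condition: c*dt/dx <= 1 for stability
    u = [1.0 if 0.1 <= i * dx <= 0.3 else 0.0 for i in range(nx)]   # square pulse

    for step in range(200):
        u_new = u[:]
        for i in range(nx):
            # upwind difference (flow moves left to right; u[-1] wraps around)
            u_new[i] = u[i] - c * dt / dx * (u[i] - u[i - 1])
        u = u_new

    # After 200 steps the pulse has moved a distance c*dt*200 = 0.5, with some
    # numerical smearing; the higher-order methods discussed in this section
    # are designed to reduce exactly that kind of error.
    print("pulse mass (conserved by the scheme):", round(sum(u) * dx, 4))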

CFD and Fundamental Research

At many of the nation's research universities and national laboratories, much of the focus of CFD work is on fundamental research into fluid flow phenomena. The goal is to understand the role that fluid motion plays in such areas as the evolution of turbulence in the atmosphere and in the oceans, the birth and evolution of galaxies, atmospheres on other planets, the formation of polymers, physiological fluid flow in the body, and the interplay of fluid mechanics and material science, such as in the physics of superconductors. In these simulations, often employing the most advanced and sophisticated algorithms, the emphasis is on accurate solutions and basic insight. These calculations are often among the most expensive of all CFD simulations, requiring many hundreds of hours of computer time on the most advanced machines available for a single simulation. The modeling and algorithmic techniques for such problems are constantly under revision and refinement. For the most part, the major advances in new algorithmic tools, from schemes to handle the associated numerical linear algebra to high order methods for approximating the differential equations, have their roots in basic research into CFD applied to fundamental physics.

CFD and Applied Science

Here, the main emphasis is on the application of the tools of computational fluid dynamics to problems motivated by specific natural phenomena or physical processes. Such work might include detailed studies of the propagation of flames in engines or fire research in closed rooms, the fluid mechanics involved in the dispersal of pollutants or toxic groundwater transport, the hydraulic response of a proposed heart valve, the development of severe storms in the atmosphere, and the aerodynamic properties of a proposed space shuttle design. For the most part, this work is also carried out throughout the national laboratories and universities with government support. While cost is not a major issue in these investigations, the focus is more on obtaining answers to directed questions. Less concerned with the algorithm for its own sake, this work links basic research with commercial CFD applications, and provides a stepping stone for advances in algorithms to propagate into industrial sectors.

CFD and Industrial Design and Manufacturing

The focus in this stage of the process is on applying the tools of computational fluid dynamics to solve problems that directly relate to technology. A vast array of examples exist, such as the development of a high-speed inkjet plotter, the action of slurry beds for processing minerals, analysis of the aerodynamic characteristics of an automobile or airplane, efficiency analysis of an internal combustion engine, performance of high-speed computer disk drives, and optimal pouring and packaging techniques in manufacturing. For the most part, such work is carried out in private industry, often with only informal ties to academic and government scientists. Communication of new ideas rests loosely on the influx of new employees trained in the latest techniques, journal articles, and professional conferences. A distinguishing characteristic of this work is its emphasis on turnaround time and cost. Here, cost is not only the cost of the equipment to perform the calculation, but also the person-years involved in developing the computer code and the time involved in performing many hundreds of simulations as part of a detailed parameter study. This motivation is quite different from that in the other two areas. The need to perform a large number of simulations under extremely general circumstances may mean that a simple and fast technique that attacks only a highly simplified version of the problem may be preferable to a highly sophisticated and accurate technique that requires many orders of magnitude more computational effort. This orientation lies at the heart of the applicability and suitability of computational techniques to a competitive industry.


Future of CFD

Modeling and Algorithmic Issues.

In many ways, the research, applied science, and industry agenda in CFD has changed, mostly in response to increased computational power coupled to significant algorithmic and theoretical/numerical advances. On the research side, up through the early 1980's, the emphasis was on basic discretization methods. In that setting, it was possible to develop methods based on looking at simple problems in two, or even one, space dimension, in simple geometries, and in a fair degree of isolation from the fluid dynamics applications. Over the last five years, however, there has been a transition to the next generation of problems. These problems are more difficult in part because they are attempting more refined and detailed simulations, necessitating finer grids and more computational elements. However, a more fundamental issue is that these problems are qualitatively different from those previously considered. To begin, they often involve complex and less well-understood physical models: chemically reacting fluid flows, flows involving multiphase or multicomponent mixtures of fluids, or other complex constitutive behavior. They are often set in three dimensions, in which both the solution geometry and the boundary geometry are more complicated than in two dimensions. Finally, they often involve resolving multiple length and time scales, such as boundary and interior layers coming from small diffusive terms, and intermittent large variations that arise in fluid turbulence.

Algorithmically, these require work in several areas. Complex physical behavior makes it necessary to develop a deeper understanding of the physics and modeling than had previously been required. Additional physics requires the theory and design of complex boundary conditions to couple the fluid mechanics to the rest of the problem. Multiple length scales and complex geometries lead to dynamically adaptive methods, since memory and compute power are still insufficient to brute-force through most problems. For example, the wide variation in length and time scales in turbulent combustion requires a host of iterative techniques, stiff ODE solvers, and adaptive techniques. Further algorithmic advances are required in areas such as domain decomposition, grid partitioning, error estimation, and preconditioners and iterative solvers for non-symmetric, non-diagonally dominant matrices. Mesh generation, while critical, has not progressed far.

All in all, to accomplish in three-dimensional complex flow what is now routine in two-dimensional basic flow will require theory, numerics, and considerable cleverness. The net effect of these developments is to make the buy-in to perform CFD research much higher. Complex physical behavior makes it necessary to become more involved with the physics modeling than had previously been the case. The problems are sufficiently difficult that one cannot blindly throw them at the computer and overwhelm them. A considerable degree of mathematical, numerical, and physical understanding must be obtained about the problems in order to obtain efficient and accurate solution techniques. On the industrial side, truly complex problems are still out of reach. For example, in the aircraft industry, we are still a long way from a full Navier-Stokes high Reynolds number unsteady flow around a commercial aircraft. Off in the distance are problems of takeoff and landing, multiple wings in close relation to each other, and flight recovery from sudden changes in conditions. In the automotive industry, a solid numerical simulation of the complete combustion cycle (as opposed to a time-averaged transport model) is still many years away. Other automotive CFD problems include analysis of coolant flows, thermal heat transfer, plastic mold problems, and sheet metal formation.

High Performance Computing Issues.


The computer needs for the next five years of CFD work are substantial. As an example, an unforced Navier-Stokes simulation might require 1000 cells in each of three space dimensions. Presuming 100 flops per cell per time step and 25,000 time steps, this yields 10^9 x 10^2 x 2.5 x 10^4 = 2.5 x 10^15 flops; a typical compute time of two hours would then require a machine sustaining roughly 350 gigaflops. Adding forcing, combustion, or other physics severely extends this calculation. As a related issue, memory requirements pose an additional problem. Ultimately, more compute power and memory are needed. The promise of a single, much faster, larger vector machine is not being made convincingly, and CFD is attempting to adapt accordingly. Here is an area where the dream of parallelism is both tantalizing and frustrating. To begin, parallel computing has naturally caused emphasis on issues related to processor allocation and load balancing. To this end, communications cost accounting (as opposed to simply accounting for floating point costs) has become important in program design. For example, parallel machines can often have poor cache memory management and a limited number of paths to/from main memory; this can imply a long memory fetch/store time, which can result in actual computational speeds for real CFD problems far below the optimal peak speed performance. To compensate, parallel computers tend to be less memory efficient than vector machines, as space is exchanged for communication time (duplicating data where possible rather than sending it between processors). The move to parallel machines is complicated by the fact that millions of lines of CFD code have been written in the serial/vector format. The instability of the hardware platforms, the lack of a standard global high performance Fortran and C, the lack of complete libraries, and insecurity associated with a volatile industry all contribute to the caution and reluctance of all but the most advanced research practitioners of CFD.
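
The arithmetic above can be written out explicitly; the short sketch below uses only the figures quoted in the text:

    # Reproduce the back-of-the-envelope estimate quoted above.
    cells = 1000 ** 3          # 1000 cells in each of three space dimensions
    flops_per_cell = 100       # per time step
    time_steps = 25_000
    total_flops = cells * flops_per_cell * time_steps      # 2.5e15 flops
    seconds = 2 * 3600                                      # two-hour turnaround
    print(f"total work : {total_flops:.1e} flops")
    print(f"sustained  : {total_flops / seconds / 1e9:.0f} gigaflops required")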

Recommendations

In order to tackle the next generation of CFD problems, the field will require:

Significant accessibility to the fastest current coarse-grained parallel machines.

Massively parallel machines with large memory, programmable under stable programming environments, including high performance Fortran and C, mathematical libraries, functioning I/O systems, and advanced visualization systems.

Algorithmic advances in adaptive meshing, grid generation, load balancing, and high order difference, element, and particle schemes.

Modeling and theoretical advances coupling fluid mechanics to other related physics.

High Performance Computing in Physics

by James Sethian and Neal Lane

INTRODUCTION AND BACKGROUND

High Energy Physics

Two areas in which high performance computing plays a crucial role are lattice gauge theory and the analysis of experimental data. Lattice gauge theory addresses some of the fundamental theoretical problems in high energy physics, and is relevant to experimental programs in high energy and nuclear physics. In the standard model of high energy physics, the strong interactions are described by quantum chromodynamics (QCD). In this theory the forces are so strong that the fundamental entities, the quarks and gluons, are not observed directly under ordinary laboratory conditions. Instead one observes their bound states, protons and neutrons, the basic constituents of the atomic nucleus, and a host of short-lived particles produced in high energy accelerator collisions. One of the major objectives of lattice gauge theory is to calculate the masses and other basic properties of these strongly interacting particles from first principles, providing a test of QCD as well as a way to calculate additional physical quantities which may not be so well determined experimentally. In addition, lattice gauge theory provides an avenue for making first-principles calculations of the effects of strong interactions on weak interaction processes, and thus holds the promise of providing crucial tests of the standard model at its most vulnerable points. And, although quarks and gluons are not directly observed in the laboratory, it is expected that at extremely high temperatures one would find a new state of matter consisting of a plasma of these particles. The questions being addressed by lattice gauge theorists are the nature of the transition between the lower temperature state of ordinary matter and the high temperature quark-gluon plasma, the temperature at which this transition occurs, and the properties of the plasma. In the general area of experimental high energy physics, there are three primary areas of computing:

1) The processing of the raw data that is usually accumulated at a central accelerator laboratory, such as the Wilson Laboratory at Cornell.

2) The simulation of physical processes of interest, and the simulation of the behavior of the final states in the detector.

3) The analysis of the compressed data that results from processing of the raw data and the simulations.

Atomic and Molecular Physics

In contrast to some areas of theoretical physics, the AM theorist has the advantage that he understands the basic equations governing the evolution of the system of particles under consideration. However, the wealth of phenomena that derive from the many-body interactions of the constituents and their interactions with external probes such as electric and magnetic fields is truly astounding. Computation now provides a practical and useful alternative method to study these problems. Most importantly, it is now possible to perform calculations sophisticated enough to have a real impact on AM science. These include high precision computations of the ground and excited states of small molecules, scattering of electrons from atoms, atomic ions and small polyatomic molecules, simple chemical reactions involving atom-diatom collisions, photoionization and photodissociation, and various time dependent processes such as multiphoton ionization and the interaction of atoms with ultra-strong (or short) electromagnetic fields.

Gravitational Physics

The computational goal of classical and astrophysical relativity is the solution of the associated nonlinear partial differential equations. For example, simulations have been performed of the critical behavior in black hole formation, using high accuracy adaptive grid methods which follow the collapse of spherical scalar wave pulses over at least fourteen orders of magnitude; of the structure we currently see in the universe (galaxies, clusters of galaxies arranged in sheets, voids, etc.) and how it may have arisen through fluctuations generated during an inflationary epoch; of head-on collisions of black holes; and of horizon behavior in a number of black hole configurations.

CURRENT STATE

Definitive calculations in all of the areas mentioned above would require significantly greater computing resources than have been available up to now. Nevertheless, steady progress has been made during the last decade due to important improvements in algorithms and calculational techniques, and very rapid increases in available computing power. For example, among the major achievements in lattice gauge theory have been: a demonstration that quarks and gluons are confined at low temperatures; steady improvements in spectrum calculations, which have accelerated markedly in the last year; an estimate of the transition temperature between the ordinary state of matter and the quark-gluon plasma; a bound on the mass of the Higgs boson; calculations of weak decay parameters, including a determination of the mixing parameter; and a determination of the strong coupling constant at the energy scale of 5 GeV from a study of the charmonium spectrum. Much of this work has been carried out at the NSF Supercomputer Centers.

In the area of atomic and molecular physics, until quite recently most AM theorists were required to compute using simplified models or, if the research was computationally intensive, to use vector supercomputers. This has changed with the widespread availability of cheap, fast, Unix based, RISC workstations. These "boxes" are now capable of performing at the 40-50 megaflop level and can have as much as 256 megabytes of memory. In addition, it is possible to cluster these workstations and distribute the computational task among the CPUs. There are a few researchers in the US who have as many as twenty or thirty of these workstations for their own group. This has enabled computational experiments using a loosely coupled, parallel model on selected problems in AM physics. However, this is not the norm in AM theory. More typically, the most computationally intensive calculations are still performed on the large mainframe, vector supercomputers available only to a limited number of users. The majority of the AM researchers in the country are still computing on single workstations or (outdated) mainframes of one kind or another.

FUTURE COMPUTING NEEDS

In the field of high energy physics, the processing of raw data and the simulation of generic mixing processes are activities that are well suited to a centralized computer center. The processing of the raw data should be done in a consistent, organized, and reliable manner. Often, in the middle of the data processing, special features in some data are discovered, and these need to be treated quickly and in a manner consistent with the entire sample. It is usual and logical that the processing take place at the central accelerator center where the apparatus sits, because the complete records of the data taking usually reside there. In contrast, the high energy physics work in simulation of specific processes and analysis of compressed data is well matched to individual university groups.

In terms of computing needs, all of the above are well served either by a large, powerful, central computer system or by a cluster or farm of workstations. For example, CERN does the majority of its computing on central computers, while the Wilson Lab has a farm of DECstation 5000/240's to process its raw data. High energy computing usually proceeds one event at a time; each event, whether from simulation or raw data, can be dealt to an individual workstation for processing. There is no need to put the entire resources of a supercomputer on one event. However, massively parallel computers have the potential to handle large numbers of events simultaneously in an efficient manner. These machines will also be of importance in the work on lattice gauge theory. Conversely, university groups are well served by the powerful workstations now available, which have freed them from dependence on the central laboratory to study the physics signals of interest to them. These groups need both fast CPUs, for simulation of data, and relatively large and fast disk farms, for repeated processing of the compressed data.

In the field of atomic and molecular modeling, the future lies in the use of massively parallel, scaleable multicomputers. AM theorists, with rare exceptions, have not been as active as other disciplines in moving to these platforms. This is to be contrasted with the quantum chemists, the lattice QCD theorists, and many materials scientists, who are becoming active users of these computers. The lack of portability of typical AM codes and the need to expend lots of time and effort in rewriting or rethinking algorithms has prevented a mass migration to these platforms.
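
The event-level parallelism described above, in which each independent event is dealt out to its own workstation or processor, maps naturally onto a pool of workers; the sketch below illustrates the pattern with a standard process pool and a stand-in "reconstruction" function (all names and numbers here are illustrative):

    from multiprocessing import Pool
    import random

    # Toy event-level parallelism: each event is independent, so it can simply
    # be handed to any free worker.  The "reconstruction" is a placeholder for
    # real detector code.
    def reconstruct(event):
        rng = random.Random(event["id"])
        # pretend to fit tracks and accumulate a mass-like quantity
        return {"id": event["id"],
                "mass": sum(rng.gauss(0.3, 0.05) for _ in range(event["ntracks"]))}

    if __name__ == "__main__":
        events = [{"id": i, "ntracks": 10} for i in range(10_000)]
        with Pool() as pool:                   # one worker per available CPU
            results = pool.map(reconstruct, events, chunksize=100)
        mean_mass = sum(r["mass"] for r in results) / len(results)
        print(len(results), "events reconstructed; mean mass =", round(mean_mass, 3))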


Accomplishments in Computer Science and Engineering since the Lax Report

by Mary K. Vernon

While vector multiprocessors have been the workhorses for many fields of computational science and engineering over the past ten years, research in computer science and engineering has been focused both on improving the capabilities of these systems and on developing the next generation of high-performance computing systems, namely the scalable, highly parallel computers which have recently been commercially realized as systems such as the Intel Paragon, the Kendall Square Research KSR-1, and the Thinking Machines Corporation CM-5.

A variety of factors make scalable, highly parallel computers the only viable way to achieve the teraflop capability required by Grand Challenge applications. These systems represent far more than an evolutionary step from their modestly parallel vector predecessors. Realizing the teraflop potential of massively parallel systems requires advances in a broad range of computer science and engineering subareas, including VLSI, computer architecture, operating systems, runtime systems, compilers, programming languages, and algorithms. The development of the new capabilities in turn requires computationally intensive experimentation and/or simulations that have been carried out on experimental prototypes (e.g., the NYU Ultracomputer), early commercial parallel machines (such as the BBN Butterfly or the Intel iPSC/2), and more recently on high-performance workstations as well as the emerging massively parallel systems such as the Thinking Machines CM-5 and the Intel Paragon.

Computer science and engineering researchers have made tremendous progress in the past ten years in the development of high performance computing technology, including the development of ALL of the major technologies in massively parallel systems. Among the specific accomplishments are:

development of RISC processor technology and the compiler technology for RISC processors, which is used in all high performance workstations as well as in the massively parallel machines.

development of computer-aided tools to facilitate the design, testing, and fabrication of complex digital systems and their constituent components.

invention of the multicomputer and development of the message-passing programming paradigm, which is used in many of today's massively parallel systems.

refinement of shared memory architectures and the shared memory programming model, which is used in the KSR-1, the Cray T3D, and other emerging massively parallel machines.

invention of the hypercube interconnection network and refinement of this network to lower-dimensional, 2-d and 3-d, mesh networks that are currently used in the Intel Paragon and the Cray T3D.

invention of the fat-tree interconnection network, which is currently used in the Thinking Machines CM-5.

refinement of the SIMD architecture, which is used for example in the Thinking Machines CM-2 and the MasPar-1.

invention and refinement of the SPMD and data parallel programming models, which are supported in several massively parallel systems.

development of the technology underlying the mature compilers for vector machines (i.e., compilers that deliver performance that is a substantial fraction of the theoretical peak performance of these machines).


development of the technology underlying all of the existing compilers for parallel machines.

development of the Mach operating system, which provided the basis for the OSF/1 standard used, for example, in the Intel Paragon.

development of lightweight and wait-free synchronization primitives.

development of performance debugging tools.

development of high-performance database technologies, including both algorithms and architectures that have influenced emerging systems, for example from NCR, Teradata, and IBM.

development of parallel algorithms for high-performance optimization.

development of parallel algorithms for numerical linear algebra.

development of machine learning technology for computational biology.

In other words, key hardware, system software, and algorithm technologies are directly the result of computer science and engineering research across a broad range of subdisciplines. Much of this work has been highly experimental, and has made extensive use of current-generation, early commercial, and prototype high performance systems. For example, simulations of next-generation architectures, multi-user database systems, and the like, as well as the development and testing of new algorithms for large-scale optimization, numerical linear algebra, computational biology, and the like, often require days of simulation time on the most advanced platforms available.

Research efforts today are focused on improving the capabilities, performance, and ease of use of parallel machine technology, including the capabilities of workstation networks. Experiments to evaluate the technology for next-generation systems, like many other applications that would be classified as "computational engineering," require the highest performance systems available. In addition, simulation and/or testing of innovations in computer architecture or operating systems sometimes involves modifications to the host hardware and/or operating software. These modifications can be developed and debugged on medium-scale versions of the high-end systems. Support for such initial development, as well as for porting working modifications to larger-scale systems for further testing, is critical to the rapid development of new HPC technologies.
