Community Input on the Future of High-Performance Computing Robert J. Harrison [email protected] [email protected]

Post on 24-Sep-2020


Page 1: Community Input on the Future ofHigh-Performance Computing · 7 Theory and modeling of Supercapacitors for Energy Storage (Sumpter and Meunier) ACS Nano, 2010, 4 (4), pp 2382–2390

Community Input on the Future of High-Performance Computing

Robert J. Harrison

[email protected]

[email protected]

Questions

● What are Exascale computing and data cyber-infrastructure essential for?

● What new science do Exascale computing and data CI enable?

● What are the challenges to reach Exascale?

● What scientific areas have not yet considered Exascale, and how do we help them get there?

What is a processor in 2015?

● Probably not a conventional x86

● E.g., Nvidia C2060 480 cores

  ● ~1 TF peak (single), ~0.5 TF (double)

● LANL RoadRunner – hybrid

  ● 1 PF of IBM Cell + Opteron

● Cyclops

  ● 160 simple thread units per chip

  ● no cache - S/W managed on-chip memory

● FLOPs are cheap; bandwidth is expensive

● Exascale is not inherently scary

  ● it’s the path/rush to exascale that frightens me

Exascale technologies

● Will still be just one corner of the entire ecosystem

  ● Cost dictates this

  ● In 2020 1 EF = $100M = 1000 PF → 1 PF ≤ $0.1M

  ● Software is still more expensive than HW

  ● Most science will happen at the petascale

● Hardware

  ● Will leverage high-end server and professional computing platforms

● Software

  ● Will (perhaps with reconfiguration) run everywhere

National benefits of exascale and associated technologies

● Basic science currently drives high-end HPC

  ● It consumes (nearly) all petascale cycles

  ● Product design/engineering at terascale or below – expertise and startup cost are barriers

  ● We must change this

● Mature simulation (e.g., comp. chem.) must eventually become relevant to new technologies

  ● Can we eliminate most of the time lag?

  ● Simulation is effective at encapsulating basic science knowledge for engineering

  ● Creating a vastly larger market for HPC will drive down our H/W costs and change our S/W cost-model

3 examples from energy science

● Knowledge-based design of new energy technologies

● Sustained, predictive exascale simulation will demand

  ● A new generation of codes and expertise

    ● managed as a long-lived facility

  ● Reorganization of our community

    ● to connect theoretical innovation to HPC

    ● to integrate HP simulation and experiment

    ● e.g., success stories at some of the DOE nanocenters

  ● Breakthroughs in key theoretical, algorithmic, and mathematical challenges

Theory and modeling of Supercapacitors for Energy Storage (Sumpter and Meunier)

ACS Nano, 2010, 4 (4), pp 2382–2390. Publication Date (Web): April 5, 2010 (Article). DOI: 10.1021/nn100126w

Supercapacitors are electrical energy storage devices that have large energy and power density. The density is tremendously improved by reducing pore sizes to the nanometer regime. Computational studies show that partial desolvation is important and leads to replacing the conventional model with a “wire-in-a-cylinder or sandwich” model, which accounts for all the observed phenomena.

Challenges

● Time and length scales

  ● Need ab initio dynamics – strong scaling

● Limitations of current density functionals

  ● Long-range interactions, exchange

● Speed and accuracy of many-body methods

  ● Slow convergence with basis (solved by explicitly correlated wave function)

  ● Poor scaling with system size O(N^7) (solved by local-correlation & low-rank)
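
To make the O(N^7) bullet concrete, here is a back-of-the-envelope estimate. It is an illustration only, not a figure from the slides, and the prefactor c is a hypothetical method-dependent constant. If cost grows as the seventh power of system size N, then doubling or increasing the system size tenfold gives:

```latex
\frac{c\,(2N)^7}{c\,N^7} = 2^7 = 128,
\qquad
\frac{c\,(10N)^7}{c\,N^7} = 10^7 .
```

Read the other way, a thousandfold faster machine (petascale to exascale) would allow systems only about 1000^{1/7} ≈ 2.7 times larger at this scaling, which is one way to see why the reduced-scaling approaches named in the bullet matter.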

• Demonstrated the emergent surface-confined growth of ordered arrays of PEDOT chains using in situ STM imaging combined with first principles density functional theory calculations

• Developed controlled assembly of aromatic building blocks into ordered structures on a metallic substrate with mechanically and electrically robust contacts

• The epitaxially aligned conjugated systems show promise for use in future electronic and optoelectronic devices requiring high-quality active materials.

J. A. Lipton-Duffin, J. A. Miwa, M. Kondratenko, F. Cicoira, B. G. Sumpter, V. Meunier, D. F. Perepichka, F. Rosei, “Step-by-step growth of epitaxially aligned polythiophene by surface-confined oligomerization,” Proc. Nat. Acad. Sci. (2010). DOI: 10.1073/pnas.1000726107. BGS and VM were funded in part by the CNMS at ORNL. Computing resources: NCCS at ORNL.

Step-by-step growth of epitaxially aligned polythiophene by surface-confined oligomerization

J. A. Lipton-Duffin, J. A. Miwa, M. Kondratenko, F. Cicoira, B. G. Sumpter, V. Meunier, D. F. Perepichka, F. Rosei

PEDOT oligomers, showing the formation of dimers, trimers and tetramers (a), dimers (b), trimers (c) and the characteristic length of polymers that are grown (d): arrows on (d) mark oligomers with 14 and 16 monomers. DFT computed geometries are shown on (b) and (c). (e) shows simulated (lower) STM images compared to the measured STM (upper).

Challenges

● Chemistry at interfaces

● Time and length scales

  ● Need ab initio dynamics – strong scaling

● Limitations of current density functionals

  ● Incompatible screening models (or fudge factors)

● Many-body methods for materials

  ● Much less mature than for molecules (implementation complexity can be eliminated as done in chemistry)

  ● Incompatible formulations (propagator methods less developed in chemistry)

  ● Lots of calibration of numerous models still required

● Accurate treatment of 1e and 2e excited states

Formation of “sharp edges” in graphene nanoribbons: quantum transport and electronic structure calculations

Science 323, 1701 (2009). Jia, Dresselhaus, Sumpter, Meunier, Terrones, ...

Challenges

● Time and length scales; rare events

● Coupling of electronic and nuclear motion

  ● Electron-phonon interactions – inelastic transport (response calculation for 1000s of atoms)

● Open-shell systems – complex chemistry

● Current DFT functionals neglect long-range correlations

● Spectroscopy (Raman – 3rd derivatives!)

Exascale simulation

● Changes how we ask questions

  ● E.g., ab initio bond strength in 100,000 atom system (know how to do this courtesy of classical MD)

  ● E.g., quantitative reaction dynamics at nanoscale (don’t know how to do this – need statistical model)

● Challenges our ability to compute accurately

  ● E.g., UQ for local, explicitly correlated, CCSD(T) in aug-cc-PVTZ basis on solvated photochemistry

  ● Most chemists are in a state of denial about their numerical problems – small molecule methods will *NOT* work for large systems

O(1) programmers

O(10,000) nodes

O(100,000) processors

O(100,000,000) threads and growing

• Complexity constrains all of our ambitions (cost & feasibility)

  – Growing intrinsic complexity of science problem

  – Architectural flux on path to exascale

  – Expressing concurrency at extreme scale

  – Managing the memory hierarchy

• Semantic gap (Colella)

  – Our equations are O(100) lines but the program is O(1M) & growing

HPC futures we’d like to avoid

High-end simulation is the most credible vehicle for accelerating the application of knowledge from basic science to the design of new energy technologies.

Complexity constrains all of our ambitions (cost & feasibility).

We hear about the successes, but what about the failures?

What about the disciplines that are not computing?

Scientific vs. WWW software

● Why are we not experiencing the same nearly exponential growth in functionality?

  ● Level of investment or number of developers?

  ● Lack of software interoperability and standards?

  ● Competition not cooperation between groups?

  ● Shifting scientific objectives?

  ● Our problems are intrinsically harder?

  ● Failure to embrace/develop higher levels for composing applications?

  ● Differing impact of hardware complexity?

Dead code

● Requires human labor

  ● to migrate to new architectures, or

  ● to exploit additional concurrency, etc.

● By these criteria most extant code is dead

  ● E.g., porting effort for hybrid cpu+GPGPU

● Conclusion: We need “performance portable” or “future proof” code by automating transformations and re-targeting of code

● To do this we need higher-level semantic information and a rich, easily extended tool set

7 December 1969

● Lisp, Scheme, Java, Fortress

Guy Steele’s example

● Natural language problem description

I will give you some text – please remove white space and give me the list of words

● The wetware of an undistracted first grader is capable of running this two-line “program” with no more guidance
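
For contrast with the two-line natural-language "program", here is a minimal Python sketch of one way to turn it into an algorithm that can run in parallel. It follows the general idea of summarizing each piece of text and merging the summaries with an associative operation. The tuple encoding and every function name below are assumptions made for this illustration; none of it is taken from the slides or from Fortress.

```python
from functools import reduce

# Partial results (hypothetical encoding chosen for this sketch):
#   ("chunk", s)                     no white space seen; s is a word fragment
#   ("segment", left, words, right)  fragment before the first white space,
#                                    complete words, fragment after the last
IDENTITY = ("chunk", "")


def maybe_word(s):
    """Return [s] for a non-empty fragment, [] otherwise."""
    return [s] if s else []


def process_char(ch):
    """Map a single character to a partial result."""
    if ch.isspace():
        return ("segment", "", [], "")
    return ("chunk", ch)


def combine(x, y):
    """Associatively merge the partial results of two adjacent pieces of text."""
    if x[0] == "chunk" and y[0] == "chunk":
        return ("chunk", x[1] + y[1])
    if x[0] == "chunk":
        _, left, words, right = y
        return ("segment", x[1] + left, words, right)
    if y[0] == "chunk":
        _, left, words, right = x
        return ("segment", left, words, right + y[1])
    _, left1, words1, right1 = x
    _, left2, words2, right2 = y
    # Fragments that meet at the boundary join into one word (if non-empty).
    return ("segment", left1, words1 + maybe_word(right1 + left2) + words2, right2)


def partial(text):
    """Summarize one piece of text as a single partial result."""
    return reduce(combine, map(process_char, text), IDENTITY)


def finish(x):
    """Turn the final partial result into the list of words."""
    if x[0] == "chunk":
        return maybe_word(x[1])
    _, left, words, right = x
    return maybe_word(left) + words + maybe_word(right)


def words(text):
    return finish(partial(text))


def words_in_pieces(text, n_pieces):
    """Summarize n_pieces slices independently, then merge the summaries."""
    step = max(1, -(-len(text) // n_pieces))  # ceiling division
    slices = [text[i:i + step] for i in range(0, len(text), step)]
    return finish(reduce(combine, (partial(s) for s in slices), IDENTITY))


if __name__ == "__main__":
    sample = "  This is a sample  string of words "
    print(words(sample))
    print(words_in_pieces(sample, 5) == sample.split())
```

Because `combine` is associative and `IDENTITY` is its neutral element, the slices in `words_in_pieces` could be summarized on different processors and merged in any grouping. The sketch runs them sequentially only to stay short.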

Missing ingredients

● Natural language problem description

  ● Implicit & fuzzy definitions of verbs, nouns, ...

  ● Incomplete without more context

  ● E.g., what if an American first grader was given

وقال خليل إبراهيم في برنامج ضيف المنتصف على قناة الجزيرة اليوم إن "ما يحدث في الدوحة هو تزوير لن الطراف الرئيسية ."غير موجودة والحرب دائرة على الرض وتمتد للمدن

● No ordering specified – parallelism

● Does not include an algorithm

Khalil Ibrahim in the middle Guest on Al-Jazeera today, "what is happening in Doha is a fraud because the main parties do not exist and the war department on the ground and extends to cities”

Conventional solution

● Problem statement + brain → algorithm

● Algorithm + language + brain → program

● Computer + program + input → result

● The brain is

  ● Expensive

  ● Finite

  ● Not growing exponentially

Image from http://www.ucdmc.ucdavis.edu/welcome/features/20071017_Medicine_whitematter/Photos/head_and_brain.jpg

This is clearly Guy’s brain - Flashes of inspiration - Blue aura of authority

Impact of sustained exponential growth

● We are only beginning to realize the transforming power of computing as an enabler of innovation and discovery.

● A characteristic of exponential growth is that we will make as much progress in the next doubling cycle as we’ve made since the birth of the field:

● 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, ...
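
The doubling claim can be checked with a geometric sum. The indexing is an assumption made here: the sequence above is taken to start at 2^1, so everything accumulated through the n-th term is

```latex
\sum_{k=1}^{n} 2^{k} = 2^{n+1} - 2 \;<\; 2^{n+1}.
```

For the listed terms, 2 + 4 + ... + 512 = 1022, just under the next term, 1024. Each new doubling therefore contributes slightly more than all earlier terms combined.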

Back to the Gap – what did Guy’s brain do?

● Complete the specification with mathematical rigor

  ● Including aspects of representation

● Provide an algorithm

● Whence came the algorithm?

  ● Derived from the specification?

  ● Instantly and unconsciously pattern matched against decades of prior experience?

● Express the algorithm as Fortress code

Fortunately for scientific HPC

● Mathematical rigor is the norm

  ● Which partially explains why some disciplines are late to the table

● Unfortunately, also the norm are neglect or ignorance of

  ● difference between algebraic and floating-point numbers,

  ● accuracy, stability, and other properties of common algorithms

  ● parallel algorithms and programming
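
A small Python illustration of the first sub-bullet, the gap between algebraic and floating-point numbers. It is a sketch added for this transcript, with values chosen for the example, and assumes IEEE 754 double precision (which standard Python floats use on common platforms). Addition of real numbers is associative; floating-point addition is not, so the order of a parallel reduction can change the answer.

```python
import math

# Algebraically both groupings equal 0.6; in double precision they differ.
grouped_left = (0.1 + 0.2) + 0.3
grouped_right = 0.1 + (0.2 + 0.3)


def naive_sum(xs):
    """Plain left-to-right accumulation, as a simple loop would do."""
    total = 0.0
    for x in xs:
        total += x
    return total


values = [1e16, 1.0, -1e16]

in_order = naive_sum(values)               # the 1.0 is absorbed by 1e16
reordered = naive_sum([1e16, -1e16, 1.0])  # large terms cancel first
compensated = math.fsum(values)            # correctly rounded sum

if __name__ == "__main__":
    print(grouped_left == grouped_right)
    print(in_order, reordered, compensated)
```

The same three numbers sum to 0.0 or 1.0 depending only on the order, which is exactly what changes when a reduction is split across processors.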

The language of many-body physics

CCSD Doubles Equation

hbar[a,b,i,j] == sum[f[b,c]*t[i,j,a,c],{c}] -sum[f[k,c]*t[k,b]*t[i,j,a,c],{k,c}] +sum[f[a,c]*t[i,j,c,b],{c}] -sum[f[k,c]*t[k,a]*t[i,j,c,b],{k,c}]

-sum[f[k,j]*t[i,k,a,b],{k}] -sum[f[k,c]*t[j,c]*t[i,k,a,b],{k,c}] -sum[f[k,i]*t[j,k,b,a],{k}] -sum[f[k,c]*t[i,c]*t[j,k,b,a],{k,c}] +sum[t[i,c]*t[j,d]*v[a,b,c,d],{c,d}] +sum[t[i,j,c,d]*v[a,b,c,d],{c,d}] +sum[t[j,c]*v[a,b,i,c],{c}] -sum[t[k,b]*v[a,k,i,j],{k}] +sum[t[i,c]*v[b,a,j,c],{c}] -sum[t[k,a]*v[b,k,j,i],{k}] -sum[t[k,d]*t[i,j,c,b]*v[k,a,c,d],{k,c,d}] -sum[t[i,c]*t[j,k,b,d]*v[k,a,c,d],{k,c,d}] -sum[t[j,c]*t[k,b]*v[k,a,c,i],{k,c}] +2*sum[t[j,k,b,c]*v[k,a,c,i],{k,c}] -sum[t[j,k,c,b]*v[k,a,c,i],{k,c}] -sum[t[i,c]*t[j,d]*t[k,b]*v[k,a,d,c],{k,c,d}] +2*sum[t[k,d]*t[i,j,c,b]*v[k,a,d,c],{k,c,d}] -sum[t[k,b]*t[i,j,c,d]*v[k,a,d,c],{k,c,d}] -sum[t[j,d]*t[i,k,c,b]*v[k,a,d,c],{k,c,d}] +2*sum[t[i,c]*t[j,k,b,d]*v[k,a,d,c],{k,c,d}] -sum[t[i,c]*t[j,k,d,b]*v[k,a,d,c],{k,c,d}] -sum[t[j,k,b,c]*v[k,a,i,c],{k,c}] -sum[t[i,c]*t[k,b]*v[k,a,j,c],{k,c}] -sum[t[i,k,c,b]*v[k,a,j,c],{k,c}] -sum[t[i,c]*t[j,d]*t[k,a]*v[k,b,c,d],{k,c,d}] -sum[t[k,d]*t[i,j,a,c]*v[k,b,c,d],{k,c,d}] -sum[t[k,a]*t[i,j,c,d]*v[k,b,c,d],{k,c,d}] +2*sum[t[j,d]*t[i,k,a,c]*v[k,b,c,d],{k,c,d}] -sum[t[j,d]*t[i,k,c,a]*v[k,b,c,d],{k,c,d}] -sum[t[i,c]*t[j,k,d,a]*v[k,b,c,d],{k,c,d}] -sum[t[i,c]*t[k,a]*v[k,b,c,j],{k,c}] +2*sum[t[i,k,a,c]*v[k,b,c,j],{k,c}] -sum[t[i,k,c,a]*v[k,b,c,j],{k,c}] +2*sum[t[k,d]*t[i,j,a,c]*v[k,b,d,c],{k,c,d}] -sum[t[j,d]*t[i,k,a,c]*v[k,b,d,c],{k,c,d}] -sum[t[j,c]*t[k,a]*v[k,b,i,c],{k,c}] -sum[t[j,k,c,a]*v[k,b,i,c],{k,c}] -sum[t[i,k,a,c]*v[k,b,j,c],{k,c}] +sum[t[i,c]*t[j,d]*t[k,a]*t[l,b]*v[k,l,c,d],{k,l,c,d}] -2*sum[t[k,b]*t[l,d]*t[i,j,a,c]*v[k,l,c,d],{k,l,c,d}] -2*sum[t[k,a]*t[l,d]*t[i,j,c,b]*v[k,l,c,d],{k,l,c,d}] +sum[t[k,a]*t[l,b]*t[i,j,c,d]*v[k,l,c,d],{k,l,c,d}] -2*sum[t[j,c]*t[l,d]*t[i,k,a,b]*v[k,l,c,d],{k,l,c,d}] -2*sum[t[j,d]*t[l,b]*t[i,k,a,c]*v[k,l,c,d],{k,l,c,d}] +sum[t[j,d]*t[l,b]*t[i,k,c,a]*v[k,l,c,d],{k,l,c,d}] -2*sum[t[i,c]*t[l,d]*t[j,k,b,a]*v[k,l,c,d],{k,l,c,d}] +sum[t[i,c]*t[l,a]*t[j,k,b,d]*v[k,l,c,d],{k,l,c,d}] +sum[t[i,c]*t[l,b]*t[j,k,d,a]*v[k,l,c,d],{k,l,c,d}] 
+sum[t[i,k,c,d]*t[j,l,b,a]*v[k,l,c,d],{k,l,c,d}] +4*sum[t[i,k,a,c]*t[j,l,b,d]*v[k,l,c,d],{k,l,c,d}] -2*sum[t[i,k,c,a]*t[j,l,b,d]*v[k,l,c,d],{k,l,c,d}] -2*sum[t[i,k,a,b]*t[j,l,c,d]*v[k,l,c,d],{k,l,c,d}] -2*sum[t[i,k,a,c]*t[j,l,d,b]*v[k,l,c,d],{k,l,c,d}] +sum[t[i,k,c,a]*t[j,l,d,b]*v[k,l,c,d],{k,l,c,d}] +sum[t[i,c]*t[j,d]*t[k,l,a,b]*v[k,l,c,d],{k,l,c,d}] +sum[t[i,j,c,d]*t[k,l,a,b]*v[k,l,c,d],{k,l,c,d}] -2*sum[t[i,j,c,b]*t[k,l,a,d]*v[k,l,c,d],{k,l,c,d}] -2*sum[t[i,j,a,c]*t[k,l,b,d]*v[k,l,c,d],{k,l,c,d}] +sum[t[j,c]*t[k,b]*t[l,a]*v[k,l,c,i],{k,l,c}] +sum[t[l,c]*t[j,k,b,a]*v[k,l,c,i],{k,l,c}] -2*sum[t[l,a]*t[j,k,b,c]*v[k,l,c,i],{k,l,c}] +sum[t[l,a]*t[j,k,c,b]*v[k,l,c,i],{k,l,c}] -2*sum[t[k,c]*t[j,l,b,a]*v[k,l,c,i],{k,l,c}] +sum[t[k,a]*t[j,l,b,c]*v[k,l,c,i],{k,l,c}] +sum[t[k,b]*t[j,l,c,a]*v[k,l,c,i],{k,l,c}] +sum[t[j,c]*t[l,k,a,b]*v[k,l,c,i],{k,l,c}] +sum[t[i,c]*t[k,a]*t[l,b]*v[k,l,c,j],{k,l,c}] +sum[t[l,c]*t[i,k,a,b]*v[k,l,c,j],{k,l,c}] -2*sum[t[l,b]*t[i,k,a,c]*v[k,l,c,j],{k,l,c}] +sum[t[l,b]*t[i,k,c,a]*v[k,l,c,j],{k,l,c}] +sum[t[i,c]*t[k,l,a,b]*v[k,l,c,j],{k,l,c}] +sum[t[j,c]*t[l,d]*t[i,k,a,b]*v[k,l,d,c],{k,l,c,d}] +sum[t[j,d]*t[l,b]*t[i,k,a,c]*v[k,l,d,c],{k,l,c,d}] +sum[t[j,d]*t[l,a]*t[i,k,c,b]*v[k,l,d,c],{k,l,c,d}] -2*sum[t[i,k,c,d]*t[j,l,b,a]*v[k,l,d,c],{k,l,c,d}] -2*sum[t[i,k,a,c]*t[j,l,b,d]*v[k,l,d,c],{k,l,c,d}] +sum[t[i,k,c,a]*t[j,l,b,d]*v[k,l,d,c],{k,l,c,d}] +sum[t[i,k,a,b]*t[j,l,c,d]*v[k,l,d,c],{k,l,c,d}] +sum[t[i,k,c,b]*t[j,l,d,a]*v[k,l,d,c],{k,l,c,d}] +sum[t[i,k,a,c]*t[j,l,d,b]*v[k,l,d,c],{k,l,c,d}] +sum[t[k,a]*t[l,b]*v[k,l,i,j],{k,l}] +sum[t[k,l,a,b]*v[k,l,i,j],{k,l}] +sum[t[k,b]*t[l,d]*t[i,j,a,c]*v[l,k,c,d],{k,l,c,d}] +sum[t[k,a]*t[l,d]*t[i,j,c,b]*v[l,k,c,d],{k,l,c,d}] +sum[t[i,c]*t[l,d]*t[j,k,b,a]*v[l,k,c,d],{k,l,c,d}] -2*sum[t[i,c]*t[l,a]*t[j,k,b,d]*v[l,k,c,d],{k,l,c,d}] +sum[t[i,c]*t[l,a]*t[j,k,d,b]*v[l,k,c,d],{k,l,c,d}] +sum[t[i,j,c,b]*t[k,l,a,d]*v[l,k,c,d],{k,l,c,d}] +sum[t[i,j,a,c]*t[k,l,b,d]*v[l,k,c,d],{k,l,c,d}] 
-2*sum[t[l,c]*t[i,k,a,b]*v[l,k,c,j],{k,l,c}] +sum[t[l,b]*t[i,k,a,c]*v[l,k,c,j],{k,l,c}] +sum[t[l,a]*t[i,k,c,b]*v[l,k,c,j],{k,l,c}] +v[a,b,i,j]

$$\bar{h}^{ab}_{ij} = \langle {}^{ab}_{ij} \,|\, e^{-T_1 - T_2}\, H\, e^{T_1 + T_2} \,|\, 0 \rangle$$

Robert J. Harrison, UT/ORNL Joint Institute of Computational Science, 4/20/2007

The Tensor Contraction Engine: A Tool for Quantum Chemistry

Oak Ridge National Laboratory

David E. Bernholdt, Venkatesh Choppella, Robert Harrison

Pacific Northwest National Laboratory

So Hirata

Louisiana State University

J Ramanujam

Ohio State University

Gerald Baumgartner, Alina Bibireata, Daniel Cociorva, Xiaoyang Gao, Sriram Krishnamoorthy, Sandhya Krishnan, Chi-Chung Lam, Quingda Lu, Russell M. Pitzer, P Sadayappan, Alexander Sibiryakov

University of Waterloo

Marcel Nooijen, Alexander Auer

Research at ORNL supported by the Laboratory Directed Research and Development Program. Research at PNNL supported by the Office of Basic Energy Sciences, U. S. Dept. of Energy. Research at OSU, Waterloo, and LSU supported by the National Science Foundation Information Technology Research Program

http://www.cis.ohio-state.edu/~gb/TCE/

range V = 3000;
range O = 100;

index a,b,c,d,e,f : V;
index i,j,k,l : O;

mlimit = 100GB;

procedure P(in A[V,V,O,O], in B[V,V,V,O], in C[V,V,O,O], in D[V,V,V,O], out S[V,V,O,O]) =
begin
  S[a,b,i,j] == sum[ A[a,c,i,k] * B[b,e,f,l] * C[d,f,j,k] * D[c,d,e,l], {c,e,f,k,l} ];
end

Tensor Contraction Engine (TCE)

• Sadayappan et al. Proc. IEEE, 93, 2005

• High-level domain-specific language for a class of problems in chemistry/physics based on contraction of large multi-dimensional tensors (NSF + DOE)

• Specialized optimizing compiler

  – Produces F77+GA code, linked to runtime libs

$$S_{abij} = \sum_{cefkl} A_{acik}\, B_{befl}\, C_{dfjk}\, D_{cdel}$$
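
To show why a specialized compiler helps with an expression like this, here is a pure-Python sketch. It is not TCE output. It evaluates the contraction once as a single loop nest and once as a sequence of three pairwise contractions through intermediates. The tiny ranges, the dict-based tensors, the function names, and the particular factorization order are all assumptions made for illustration. One further assumption: the index d appears only on the right-hand side, so the sketch sums over it along with c, e, f, k, l, although the sum list on the slide does not name it.

```python
import itertools
import random

V, O = 3, 2  # tiny illustrative ranges (the slide uses V = 3000, O = 100)


def random_tensor(shape, rng):
    """Return a dict mapping index tuples to small random integers."""
    return {idx: rng.randint(-3, 3)
            for idx in itertools.product(*(range(n) for n in shape))}


def contract_direct(A, B, C, D):
    """One ten-index loop nest: work grows as V^6 * O^4."""
    S = {}
    for a, b, i, j in itertools.product(range(V), range(V), range(O), range(O)):
        total = 0
        for c, d, e, f in itertools.product(range(V), repeat=4):
            for k, l in itertools.product(range(O), repeat=2):
                total += (A[a, c, i, k] * B[b, e, f, l]
                          * C[d, f, j, k] * D[c, d, e, l])
        S[a, b, i, j] = total
    return S


def contract_factored(A, B, C, D):
    """Three pairwise contractions; no step needs more than six loops."""
    # T1[b,c,d,f] = sum over e,l of B[b,e,f,l] * D[c,d,e,l]
    T1 = {}
    for b, c, d, f in itertools.product(range(V), repeat=4):
        T1[b, c, d, f] = sum(B[b, e, f, l] * D[c, d, e, l]
                             for e in range(V) for l in range(O))
    # T2[b,c,j,k] = sum over d,f of T1[b,c,d,f] * C[d,f,j,k]
    T2 = {}
    for b, c in itertools.product(range(V), repeat=2):
        for j, k in itertools.product(range(O), repeat=2):
            T2[b, c, j, k] = sum(T1[b, c, d, f] * C[d, f, j, k]
                                 for d in range(V) for f in range(V))
    # S[a,b,i,j] = sum over c,k of T2[b,c,j,k] * A[a,c,i,k]
    S = {}
    for a, b, i, j in itertools.product(range(V), range(V), range(O), range(O)):
        S[a, b, i, j] = sum(T2[b, c, j, k] * A[a, c, i, k]
                            for c in range(V) for k in range(O))
    return S


if __name__ == "__main__":
    rng = random.Random(0)
    A = random_tensor((V, V, O, O), rng)
    B = random_tensor((V, V, V, O), rng)
    C = random_tensor((V, V, O, O), rng)
    D = random_tensor((V, V, V, O), rng)
    print(contract_direct(A, B, C, D) == contract_factored(A, B, C, D))
```

The factored form does far less arithmetic, but at the slide's ranges the intermediate T1 would have V^4 elements. That is the kind of trade-off against a memory bound such as `mlimit` that a compiler for this class of problems has to manage.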

Python vs. Java

● The initial Python prototype written by chemists works but has lots of “issues” with memory, speed, ...

● The OSU TCE generates better code, respects bounds on memory use, but is written in Java by C/S graduate students

● And none of the chemists have a clue how it works and none of them know Java

● Guess which is in use

Thoughts/Recommendations

● Expand efforts directed at technology transfer of HPC sci/eng to industry

  ● E.g., program to facilitate initial use or evaluation of HPC at NSF centers ($$ needed at center and company)

  ● Adding value to and increasing usability of research codes

● Extensible programming models and tools

  ● Enables creation of nimble, domain specific languages that raise the abstraction level to increase productivity and performance portability

Thoughts/Recommendations

● Teach more students in more fields how to compute, and expose them to simulation-based discovery

  ● At all levels (HS, UG, G)

  ● They will naturally apply this knowledge and create new computational disciplines

  ● Example: DOE CSGF deliberately supports a few students far outside the energy sciences

  ● IGERTs bridging mature and new HPC fields

● How to compute means

  ● Algorithmic thinking, numerical algorithms, programming, exposure to HPC