50 years of scientific computing: from floating point to

31
Austin SciSoftDays Feb 2016 1/31 50 Years of Scientific Computing: from Floating Point to Integers L. Ridgway Scott The Institute for Biophysical Dynamics, The Computation Institute, and the Departments of Computer Science and Mathematics, The University of Chicago

Upload: others

Post on 22-Apr-2022

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: 50 Years of Scientific Computing: from Floating Point to

Austin SciSoftDays Feb 2016 1/31

50 Years of Scientific Computing:

from Floating Point to Integers

L. Ridgway ScottThe Institute for Biophysical Dynamics,The Computation Institute, and theDepartments of Computer Science and Mathematics,The University of Chicago

Page 2: 50 Years of Scientific Computing: from Floating Point to

Computing in high school

Austin SciSoftDays Feb 2016 2/31

50 years ago....

π

4= 1−

1

3+1

5−

1

7+1

9−

1

11+

1

13· · ·

This is a very poor algorithm.

It is all about the algorithms.

Page 3: 50 Years of Scientific Computing: from Floating Point to

Computing in college

Austin SciSoftDays Feb 2016 3/31

48 years ago....

The Perkin-Elmer Thermal Analysis

Suite of Codes, including

Differential scanning calorimeter codes

• Purity detection• Baseline correction

It is all about code correctness.

Page 4: 50 Years of Scientific Computing: from Floating Point to

Navier-Stokes free-boundary flow

Austin SciSoftDays Feb 2016 4/31

Some years later .... Worked with BillPritchard and Simon Tavener [5]

Used Crouzeix-Raviart nonconformingpiecewise linear element

Coercive for the gradient formulation

Not coercive for the strain formulation

Check your math carefullybefore designing the code

Page 5: 50 Years of Scientific Computing: from Floating Point to

Deterministic parallel computing abstractions

Austin SciSoftDays Feb 2016 5/31

Thirty years ago .... Fusions

x@p = y@q

Implementation: the Golden Rule

• First see if you should send data tosomeone else; if so, do so

• Then recieve any data that this statementimplies

Deterministic: avoids race conditionsL. R. Scott, J. M. Boyle, and B. Bagheri. Distributed data structures forscientific computation. In Hypercube Multiprocessors 1987, M. T. Heath,ed., pages 55–66. Philadelphia: SIAM, 1987.

Page 6: 50 Years of Scientific Computing: from Floating Point to

Planguages

Austin SciSoftDays Feb 2016 6/31

PC and Pfortran were used for many projects

Molecular Dynamics code GROMOSparallelized with Pfortran by Terry Clark

However there were less than 20 parallel linesof code in a 70K deck

The algorithms are more important than

the parallel programming language

But fusions could augment message passingproviding deadlock-free code

Page 7: 50 Years of Scientific Computing: from Floating Point to

Depiction of a critical section.

Austin SciSoftDays Feb 2016 7/31

Page 8: 50 Years of Scientific Computing: from Floating Point to

Scientific Parallel Computing; Scott, Clark & Bagheri, Princeton 20 05

Austin SciSoftDays Feb 2016 8/31

Mil

lio

ns

of

op

era

tio

ns

pe

r se

con

d

Year of introduction of various supercomputers and micro processors

1970 1975 1980 1985 1990 1995 200010-3

10 0

101

10 2

10 3

10 4

Cray XMP

Cray 1

Cray T90

NEC SX3Pentium III

Pentium II

Pentium Pro

Pentium

80486

80386

80286

80808086

Page 9: 50 Years of Scientific Computing: from Floating Point to

Top 500 supercomputers, Moore’s Law

Austin SciSoftDays Feb 2016 9/31

19931995

20002005

20102014

0.1

1

10

100

1000

10000

100000

1000000

10000000

100000000

1000000000

Sum

Top

#500

Page 10: 50 Years of Scientific Computing: from Floating Point to

Direct numerical solution: Alexander Veit

Austin SciSoftDays Feb 2016 10/31

We solved 6-D Schrodinger equations with a finitedifference method using tensor contraction techniques

For a given positive integer r, the (simplified) rank-r Tensor-Train formatcontains all elements A such that

Ai1,...,id = A(1)i1

A(2)i2

· · ·A(d)id

,

where each A(k)ik

are r × r matrices.

The A(k) can be considered as three-dimensional arrays of size r2 × n

and are called cores of the TT-tensor A.

More generally, r’s can be different for each i and

we define an effective rank based on an average.

We used the TT-toolbox 2.2 by I. Oseledetsspring.inm.ras.ru/osel

Page 11: 50 Years of Scientific Computing: from Floating Point to

Test problem

Austin SciSoftDays Feb 2016 11/31

Test solution Z (left) and pointwise error Z − Zh (right) evaluated at(

·, π2, π2, ·, π

2, π2

)

for n = 210 in each direction of 6 dimensions.

(H0 − λ0)Z =

(

4−1

|x|−

1

|y|

) 3∏

i=1

sin(xi)3∏

j=1

sin(yj).

Page 12: 50 Years of Scientific Computing: from Floating Point to

Real problem: n = 212

Austin SciSoftDays Feb 2016 12/31

R λR − λ0 “exact” rankR = 10.0 −8.9520 · 10−6 −8.754 · 10−6 27.7R = 12.0 −2.3335 · 10−6 −2.546 · 10−6 49.5R = 14.0 −9.5037 · 10−7 37.4R = 16.0 −4.1749 · 10−7 20.8R = 18.0 −2.0325 · 10−7 39.2R = 20.0 −1.0673 · 10−7 38.7R = 22.0 −5.9265 · 10−8 40.4R = 24.0 −3.5191 · 10−8 39.7R = 26.0 −2.1657 · 10−8 40.7

Number of grid points approaches Avogadro’s Number.

Exact values from TABLE II, Relativistic energies of the ground state ofthe hydrogen molecule, L. Wolniewicz, J. Chem. Phys. 99, 1851 (1993).

Page 13: 50 Years of Scientific Computing: from Floating Point to

Finite element libraries

Austin SciSoftDays Feb 2016 13/31

First steps towards automation

A finite element code in Ada

• most errors caught at compile time due totype mismatch

fec: C++ library (Babak Bagheri)

• used in theses by several students andpost-docs at the U. Houston [2, 3, 4, 6, 7]

L. R. Scott and B. Bagheri. Software environments for the parallel solutionof partial differential equations. In Computing Methods in AppliedSciences and Engineering IX, R. Glowinski and A. Lichnewsky, eds.,pages 378-392. Philadelphia: SIAM, 1990.

Page 14: 50 Years of Scientific Computing: from Floating Point to

Automated modeling

Austin SciSoftDays Feb 2016 14/31

Automatation provides higher-levelopportunities for optimization

• Variational formulation provides a languagefor PDE’s

• Finite element simulation can be furtherautomated (fiat, FErari, finat)

Implementations:

• Analysa was written by Babak Bagheri• FEniCS Project: too many contributors

here to list everyone

Page 15: 50 Years of Scientific Computing: from Floating Point to

Automated modeling process

Austin SciSoftDays Feb 2016 15/31

Models provide abstractions

Two aspects to implementation

• Need to translate correctly toappropriate (e.g., machine)language

• Opportunity to performoptimizations

Optimization involves algorithmicchallenges

Page 16: 50 Years of Scientific Computing: from Floating Point to

SDSC transitions to NPACI

Austin SciSoftDays Feb 2016 16/31

NIC versus DIC

Numeric-intensive versus dataintensive computing

Example: extracting informationfrom genome and protein data

Computing with integers is stillhard

Page 17: 50 Years of Scientific Computing: from Floating Point to

Dehydrons in human hemoglobin

Austin SciSoftDays Feb 2016 17/31

Dehydronsin human hemoglobinFrom PNAS 100: 6446-6451 (2003)Ariel Fernandez, Jozsef Kardos,L. Ridgway Scott, Yuji Goto,and R. Stephen Berry.Structural defects and thediagnosis of amyloidogenic propensity.

Well-wrapped hydrogen bonds aregrey, and dehydrons are green.

The standard ribbon modelof “structure” lacks indicatorsof electrostatic environment.

Page 18: 50 Years of Scientific Computing: from Floating Point to

Amino acid sidechains have different properties

Austin SciSoftDays Feb 2016 18/31

Carbonaceous groups on sidechains are hydrophobic:

�� ❅❅

CH2

CH2CH2

Valine

�� ❅❅

CH2

CH

CH3 CH3

Leucine

CH CH3

CH2

CH3

Isoleucine

��❅❅

CH2CH2

CH2

Proline

✟✟✟✟❍❍

❍❍ ✟✟❍❍

CH2

Phenyl-alanine

Amino acid residues (sidechains only shown)having only carbonaceous groups.

Page 19: 50 Years of Scientific Computing: from Floating Point to

What sidechains are found at interfaces?

Austin SciSoftDays Feb 2016 19/31

By examining interfaces in PDB structures, we can seewhich residues are most likely to be found at interfaces.

�� ❅❅❅❅

CH2

C

NH2 O

Asparagine

C

CH3

H OH

Threonine

H

Glycine

C

H

H OH

Serine

�� ❅❅❅❅

CH2

C

O− O

Asparticacid

CH3

Alanine

CH2

SH

Cysteine

Sidechains most likely to be involved in interactions,

ordered from the left (asparagine), are not hydrophobic.

Page 20: 50 Years of Scientific Computing: from Floating Point to

Genetic code

Austin SciSoftDays Feb 2016 20/31

Genetic code minimizes changes of polarity due to single-letter codonmutations, but it facilitates changes in wrapping due to single-letter codonmutations.

+ −uuauug

uuuuuc

Leu

Phe 7

4

gucuucc

ucguca

cuucuccuacug

Leu 4

Ser 0 + −

auuaucauaaug

4Ile

Met + −guugucguagug

Val 3

cagu

u

u

c

c

c

a

a

a

g

g

g

ccu

ccgccaccc

Pro 2

acuaccacaacg

Thr + −1

gcugccgcagcg

Ala 1

uacuaauag

uau

caucaccaacag

+ −

+ −

aauaacaaaaag

+ −

+ +

gacgaagag

gau − −

− −

1

Tyr 6

His 1

Gln 2

Asn 1

Lys 3

Asp

Glu 2

1

+ − ugcugaugg

ugu Cys + −0

Trp 7cgucgccgacgg

2 + +

| |

| |

| |

| |

| |

| |

| |

aguagcagaagg

Ser 0 + −

2 + +

Arg

Arg

ggcggaggg

ggu

c

a

g

u stopstopstop

Gly 0 + −

u

Second Position

Firs

t Pos

ition

Third P

osition

u c a

First digit after residue name is amount of wrapping. Second indicator ispolarity; | |: nonpolar, +−: polar, −−: negatively charged, ++: positivelycharged.

Page 21: 50 Years of Scientific Computing: from Floating Point to

Wrapping protects hydrogen bond from water

Austin SciSoftDays Feb 2016 21/31

C

OHHO

H

H

CHn

CHn CHn

CHn

CHn

NHO

C

OH

CHn

CHnCHn

O HH

H

HON

Well wrapped hydrogen bond Underwrapped hydrogen bond

Page 22: 50 Years of Scientific Computing: from Floating Point to

Extent of wrapping changes nature of hydrogen bond

Austin SciSoftDays Feb 2016 22/31

Hydrogen bonds (B) not protected from water do not persist.

From De Simone, et al., PNAS 102 no 21 7535-7540 (2005)

Page 23: 50 Years of Scientific Computing: from Floating Point to

Dynamics of hydrogen bonds and wrapping

Austin SciSoftDays Feb 2016 23/31

0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45 0.50

100

200

300

400

500

600

700

800

900

Figure 2: Distribution of bond lengths for two hydrogen bonds formed in astructure of the sheep prion [1]. Horizontal axis measured in nanometers,vertical axis represents numbers of occurrences taken from a simulationwith 20, 000 data points with bin widths of 0.1 Angstrom. Distribution for thewell-wrapped hydrogen bond (H3) has smaller mean value but a longer (ex-ponential) tail, whereas distribution for the underwrapped hydrogen bond(H1) has larger mean but Gaussian tail.

Page 24: 50 Years of Scientific Computing: from Floating Point to

Ligand binding removes water

Austin SciSoftDays Feb 2016 24/31

L IG A N D

������������������������������

������������������������������

OH

CHn

CHnCHn

O HH

H

HONC C O H N

CHn

CHn

CHn

Binding of ligand changes underprotected hydrogen

bond (high dielectric) to strong bond (low dielectric)

No intermolecular bonds needed!

Page 25: 50 Years of Scientific Computing: from Floating Point to

Wrapping made quantitative

Austin SciSoftDays Feb 2016 25/31

Wrapping made quantitative by counting carbonaceous groups inthe neighborhood of a hydrogen bond.

Page 26: 50 Years of Scientific Computing: from Floating Point to

Distribution of wrapping

Austin SciSoftDays Feb 2016 26/31

Distribution of wrapping for an antibody complex.

0

2

4

6

8

10

12

0 5 10 15 20 25 30 35 40

freq

uenc

y of

occ

urre

nce

number of noncarbonaceous groups in each desolvation sphere: radius=6.0 Angstroms

PDB file 1P2C: Light chain, A, dotted line; Heavy chain, B, dashed line; HEL, C, solid line

line 2line 3line 4

0

2

4

6

8

10

12

0 5 10 15 20 25 30 35 40

freq

uenc

y of

occ

urre

nce

number of noncarbonaceous groups in each desolvation sphere: radius=6.0 Angstroms

PDB file 1P2C: Light chain, A, dotted line; Heavy chain, B, dashed line; HEL, C, solid line

line 2line 3line 4

Page 27: 50 Years of Scientific Computing: from Floating Point to

Dehydrons in human hemoglobin

Austin SciSoftDays Feb 2016 27/31

Dehydronsin human hemoglobinFrom PNAS 100: 6446-6451 (2003)Ariel Fernandez, Jozsef Kardos,L. Ridgway Scott, Yuji Goto,and R. Stephen Berry.Structural defects and thediagnosis of amyloidogenic propensity.

Well-wrapped hydrogen bonds aregrey, and dehydrons are green.

The standard ribbon modelof “structure” lacks indicatorsof electrostatic environment.

Page 28: 50 Years of Scientific Computing: from Floating Point to

Introducing wrappa: Chris Fraser

Austin SciSoftDays Feb 2016 28/31

WRAPPA 1.0

Home About Instructions WRAPPA Updates References Contact

upload configure analyze download

PDB Selection

model chain(s)

- ALL

PDB File

PDB file: no file selectedChoose File

Upload

PDB Selection

Specify the PDB model number and chain(s) to be analyzed by WRAPPA:

model: enter the MODEL number to be analyzed. If the PDB file to be analyzed contains only a singlemodel, leave the field unchanged.

chains: select all chains (default) or specify a single chain to analyze. No more than 36 chains may beanalyzed during a single WRAPPA session.

PDB File

Select a PDB file from a local directory. After uploading, the file will be checked for PDB Format Version3.0 compliance. This may take some time for larger PDB files. Please note that there is a 3MB size limitfor uploaded files.

Page 29: 50 Years of Scientific Computing: from Floating Point to

What is hard about this?

Austin SciSoftDays Feb 2016 29/31

Protein (and other bio-) data is messy

Some of this is formal in the PDB

• alternate atom locations• inserted residues

Some of this is not formalized

• missing sidechains• missing residues

Makes definitive statements difficult

Page 30: 50 Years of Scientific Computing: from Floating Point to

Thanks

Austin SciSoftDays Feb 2016 30/31

This talk featured joint work with

• Babak Bagheri: PC, guarded memory,Pfortran, UHBD, fec, Analysa

• Terry Clark: Pfortran, Euler Gromos, NPACIdata-intensive system under linux, SAGE

• Reinhard von Hanxleden: Euler Gromos• Ernesto Gomez: PC, SOS• Alexander Veit: compressed Schrodinger• Chris Fraser: Wrappa, various Python tools

We acknowledge partial support by NSF grant DMS-1226019.

Page 31: 50 Years of Scientific Computing: from Floating Point to

References

Austin SciSoftDays Feb 2016 31/31

[1] Alfonso De Simone, Guy G. Dodson, Chandra S. Verma, Adriana Zagari, and Franca Fraternali. Prion and water: Tight anddynamical hydration sites have a key role in structural stability. Proceedings of the National Academy of Sciences, USA,102:7535–7540, 2005.

[2] A. Ilin, B. Bagheri, R. Metcalfe, and L. R. Scott. Error control and mesh optimization for high-order finite element approximation ofincompressible viscous flow. Comp. methods appl. Mechanics and Eng., 150:313–325, 1997.

[3] Hector Juarez, L. R. Scott, R. Metcalfe, and B. Bagheri. Direct simulation of freely rotating cylinders in viscous flows by high–orderfinite element methods. Research Report UH/MD 241, Dept. Math., Univ. Houston, 1998.

[4] Hector Juarez, L. R. Scott, R. Metcalfe, and B. Bagheri. Direct simulation of freely rotating cylinders in viscous flows by high–orderfinite element methods. Computers & Fluids, 29:547–582, 2000.

[5] W. G. Pritchard, S. J. Tavener, and L. R. Scott. Numerical and asymptotic methods for certain viscous free-surface flows. Philos.Trans. Roy. Soc. London A, 340:1–45, 1992.

[6] Anna-Karin Tornberg, R. Metcalfe, L. R. Scott, and B. Bagheri. A fluid particle simulation method. In eds. M-O. Bristeau et al.,editor, Computational Science for the 21st century, pages 312–321. John Wiley & Sons, New York, 1997.

[7] Anna-Karin Tornberg, R. Metcalfe, L. R. Scott, and B. Bagheri. A front-tracking method for simulating bubble motion in a viscousincompressible fluid using high–order finite element methods. In 1997 ASME Fluids Engineering Division Summer MeetingFEDSM97-3493, 1997.