New Architectures for a New Biology
Martin M. Deneroff, deneroff@deshaw.com
D. E. Shaw Research, LLC
![Page 2: New Architectures for a New Biology Martin M. Deneroff deneroff@deshaw.com D. E. Shaw Research, LLC](https://reader035.vdocuments.us/reader035/viewer/2022070411/56649cb15503460f949761b3/html5/thumbnails/2.jpg)
*** Background (A Bit of Basic Biochemistry)
![Page 3: New Architectures for a New Biology Martin M. Deneroff deneroff@deshaw.com D. E. Shaw Research, LLC](https://reader035.vdocuments.us/reader035/viewer/2022070411/56649cb15503460f949761b3/html5/thumbnails/3.jpg)
DNA Codes for Proteins
![Page 4: New Architectures for a New Biology Martin M. Deneroff deneroff@deshaw.com D. E. Shaw Research, LLC](https://reader035.vdocuments.us/reader035/viewer/2022070411/56649cb15503460f949761b3/html5/thumbnails/4.jpg)
The 20 Amino Acids
![Page 5: New Architectures for a New Biology Martin M. Deneroff deneroff@deshaw.com D. E. Shaw Research, LLC](https://reader035.vdocuments.us/reader035/viewer/2022070411/56649cb15503460f949761b3/html5/thumbnails/5.jpg)
Polypeptide Chain
Source: www.yourgenome.org
![Page 6: New Architectures for a New Biology Martin M. Deneroff deneroff@deshaw.com D. E. Shaw Research, LLC](https://reader035.vdocuments.us/reader035/viewer/2022070411/56649cb15503460f949761b3/html5/thumbnails/6.jpg)
Levels of Protein Structure
Source: Robert Melamede, U. Colorado
![Page 7: New Architectures for a New Biology Martin M. Deneroff deneroff@deshaw.com D. E. Shaw Research, LLC](https://reader035.vdocuments.us/reader035/viewer/2022070411/56649cb15503460f949761b3/html5/thumbnails/7.jpg)
What We Know and What We Don’t
Decoded the genome
Don’t know most protein structures
– Especially membrane proteins
No detailed picture of what most proteins do
Don’t know how everything fits together into a working system
![Page 8: New Architectures for a New Biology Martin M. Deneroff deneroff@deshaw.com D. E. Shaw Research, LLC](https://reader035.vdocuments.us/reader035/viewer/2022070411/56649cb15503460f949761b3/html5/thumbnails/8.jpg)
We Now Have The Parts List ...
![Page 9: New Architectures for a New Biology Martin M. Deneroff deneroff@deshaw.com D. E. Shaw Research, LLC](https://reader035.vdocuments.us/reader035/viewer/2022070411/56649cb15503460f949761b3/html5/thumbnails/9.jpg)
But We Don’t Know What the Parts Look Like ...
![Page 10: New Architectures for a New Biology Martin M. Deneroff deneroff@deshaw.com D. E. Shaw Research, LLC](https://reader035.vdocuments.us/reader035/viewer/2022070411/56649cb15503460f949761b3/html5/thumbnails/10.jpg)
Or How They Fit Together ...
![Page 11: New Architectures for a New Biology Martin M. Deneroff deneroff@deshaw.com D. E. Shaw Research, LLC](https://reader035.vdocuments.us/reader035/viewer/2022070411/56649cb15503460f949761b3/html5/thumbnails/11.jpg)
Or How The Whole Machine Works
![Page 12: New Architectures for a New Biology Martin M. Deneroff deneroff@deshaw.com D. E. Shaw Research, LLC](https://reader035.vdocuments.us/reader035/viewer/2022070411/56649cb15503460f949761b3/html5/thumbnails/12.jpg)
How Can We Get There?
Two major approaches:
Experiments
– Wet lab
– Hard, since everything is so small
Simulation
– Simulate:
• How proteins fold (structure, dynamics)
• How proteins interact with
- Other proteins
- Nucleic acids
- Drug molecules
– Gold standard: Molecular dynamics (MD)
![Page 13: New Architectures for a New Biology Martin M. Deneroff deneroff@deshaw.com D. E. Shaw Research, LLC](https://reader035.vdocuments.us/reader035/viewer/2022070411/56649cb15503460f949761b3/html5/thumbnails/13.jpg)
*** Molecular Dynamics
![Page 14: New Architectures for a New Biology Martin M. Deneroff deneroff@deshaw.com D. E. Shaw Research, LLC](https://reader035.vdocuments.us/reader035/viewer/2022070411/56649cb15503460f949761b3/html5/thumbnails/14.jpg)
Molecular Dynamics
t
Divide time into discrete time steps
~1 fs time step
![Page 15: New Architectures for a New Biology Martin M. Deneroff deneroff@deshaw.com D. E. Shaw Research, LLC](https://reader035.vdocuments.us/reader035/viewer/2022070411/56649cb15503460f949761b3/html5/thumbnails/15.jpg)
Molecular Dynamics
Calculate forces
Molecular mechanics force field
![Page 16: New Architectures for a New Biology Martin M. Deneroff deneroff@deshaw.com D. E. Shaw Research, LLC](https://reader035.vdocuments.us/reader035/viewer/2022070411/56649cb15503460f949761b3/html5/thumbnails/16.jpg)
Molecular Dynamics
Move atoms
![Page 17: New Architectures for a New Biology Martin M. Deneroff deneroff@deshaw.com D. E. Shaw Research, LLC](https://reader035.vdocuments.us/reader035/viewer/2022070411/56649cb15503460f949761b3/html5/thumbnails/17.jpg)
Molecular Dynamics
Move atoms
... a little bit
![Page 18: New Architectures for a New Biology Martin M. Deneroff deneroff@deshaw.com D. E. Shaw Research, LLC](https://reader035.vdocuments.us/reader035/viewer/2022070411/56649cb15503460f949761b3/html5/thumbnails/18.jpg)
Molecular Dynamics
Iterate
... and iterate
... and iterate
Integrate Newton’s laws of motion
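The loop on the preceding slides (calculate forces, move atoms a little bit, iterate) can be sketched as follows. This is a minimal illustration, not the integrator used by any particular MD package: it assumes the velocity Verlet scheme, a toy one-dimensional system of two particles joined by a harmonic spring, and reduced units rather than femtoseconds. The names `velocity_verlet`, `spring_forces`, and `total_energy` are hypothetical.

```python
def velocity_verlet(positions, velocities, masses, force_fn, dt, n_steps):
    """Advance the particles by n_steps discrete time steps of size dt."""
    x = list(positions)
    v = list(velocities)
    f = force_fn(x)                      # calculate forces
    for _ in range(n_steps):             # iterate ... and iterate
        # First half of the velocity update, using the current forces.
        v = [vi + 0.5 * dt * fi / m for vi, fi, m in zip(v, f, masses)]
        # Move atoms ... a little bit.
        x = [xi + dt * vi for xi, vi in zip(x, v)]
        # Recalculate forces at the new positions, then finish the update.
        f = force_fn(x)
        v = [vi + 0.5 * dt * fi / m for vi, fi, m in zip(v, f, masses)]
    return x, v


def spring_forces(x, k=1.0, r0=1.0):
    """Toy force field (assumption): one harmonic spring, U = k/2 * (r - r0)^2."""
    stretch = (x[1] - x[0]) - r0
    return [k * stretch, -k * stretch]


def total_energy(x, v, masses, k=1.0, r0=1.0):
    """Kinetic plus potential energy of the toy system."""
    kinetic = sum(0.5 * m * vi * vi for m, vi in zip(masses, v))
    potential = 0.5 * k * ((x[1] - x[0]) - r0) ** 2
    return kinetic + potential


# Start with the spring stretched by 0.5 and both particles at rest.
x0, v0, m = [0.0, 1.5], [0.0, 0.0], [1.0, 1.0]
x1, v1 = velocity_verlet(x0, v0, m, spring_forces, dt=0.01, n_steps=1000)
print(total_energy(x0, v0, m), total_energy(x1, v1, m))
```

With a small enough time step the total energy stays nearly constant, which is one common sanity check on an integrator of this kind.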
![Page 19: New Architectures for a New Biology Martin M. Deneroff deneroff@deshaw.com D. E. Shaw Research, LLC](https://reader035.vdocuments.us/reader035/viewer/2022070411/56649cb15503460f949761b3/html5/thumbnails/19.jpg)
Example of an MD Simulation
![Page 20: New Architectures for a New Biology Martin M. Deneroff deneroff@deshaw.com D. E. Shaw Research, LLC](https://reader035.vdocuments.us/reader035/viewer/2022070411/56649cb15503460f949761b3/html5/thumbnails/20.jpg)
Main Problem With MD
Too slow!
Example I just showed:
2 ns simulated time
3.4 CPU-days to simulate
![Page 21: New Architectures for a New Biology Martin M. Deneroff deneroff@deshaw.com D. E. Shaw Research, LLC](https://reader035.vdocuments.us/reader035/viewer/2022070411/56649cb15503460f949761b3/html5/thumbnails/21.jpg)
*** Goals and Strategy
![Page 22: New Architectures for a New Biology Martin M. Deneroff deneroff@deshaw.com D. E. Shaw Research, LLC](https://reader035.vdocuments.us/reader035/viewer/2022070411/56649cb15503460f949761b3/html5/thumbnails/22.jpg)
Thought Experiment
What if MD were
– Perfectly accurate?
– Infinitely fast?
Would be easy to perform arbitrary computational experiments
– Determine structures by watching them form
– Figure out what happens by watching it happen
– Transform measurement into data mining
![Page 23: New Architectures for a New Biology Martin M. Deneroff deneroff@deshaw.com D. E. Shaw Research, LLC](https://reader035.vdocuments.us/reader035/viewer/2022070411/56649cb15503460f949761b3/html5/thumbnails/23.jpg)
Two Distinct Problems
Problem 1: Simulate many short trajectories
Problem 2: Simulate one long trajectory
![Page 24: New Architectures for a New Biology Martin M. Deneroff deneroff@deshaw.com D. E. Shaw Research, LLC](https://reader035.vdocuments.us/reader035/viewer/2022070411/56649cb15503460f949761b3/html5/thumbnails/24.jpg)
Simulating Many Short Trajectories
Can answer surprising number of interesting questions
Can be done using
– Many slow computers
– Distributed processing approach
– Little inter-processor communication
E.g., Pande’s Folding at Home project
![Page 25: New Architectures for a New Biology Martin M. Deneroff deneroff@deshaw.com D. E. Shaw Research, LLC](https://reader035.vdocuments.us/reader035/viewer/2022070411/56649cb15503460f949761b3/html5/thumbnails/25.jpg)
Simulating One Long Trajectory
Harder problem
Essential to elucidate many biologically interesting processes
Requires a single machine with
– Extremely high performance
– Truly massive parallelism
– Lots of inter-processor communication
![Page 26: New Architectures for a New Biology Martin M. Deneroff deneroff@deshaw.com D. E. Shaw Research, LLC](https://reader035.vdocuments.us/reader035/viewer/2022070411/56649cb15503460f949761b3/html5/thumbnails/26.jpg)
DESRES Goal
Single, millisecond-scale MD simulations (long trajectories)
– Protein with 64K or more atoms
– Explicit water molecules
Why?
– That’s the time scale at which many biologically interesting things start to happen
![Page 27: New Architectures for a New Biology Martin M. Deneroff deneroff@deshaw.com D. E. Shaw Research, LLC](https://reader035.vdocuments.us/reader035/viewer/2022070411/56649cb15503460f949761b3/html5/thumbnails/27.jpg)
Image: Istvan Kolossvary & Annabel Todd, D. E. Shaw Research
Protein Folding
![Page 28: New Architectures for a New Biology Martin M. Deneroff deneroff@deshaw.com D. E. Shaw Research, LLC](https://reader035.vdocuments.us/reader035/viewer/2022070411/56649cb15503460f949761b3/html5/thumbnails/28.jpg)
Interactions Between Proteins
Image: Vijayakumar, et al., J. Mol. Biol. 278, 1015 (1998)
![Page 29: New Architectures for a New Biology Martin M. Deneroff deneroff@deshaw.com D. E. Shaw Research, LLC](https://reader035.vdocuments.us/reader035/viewer/2022070411/56649cb15503460f949761b3/html5/thumbnails/29.jpg)
Image: Nagar, et al., Cancer Res. 62, 4236 (2002)
Binding of Drugs to their Molecular Targets
![Page 30: New Architectures for a New Biology Martin M. Deneroff deneroff@deshaw.com D. E. Shaw Research, LLC](https://reader035.vdocuments.us/reader035/viewer/2022070411/56649cb15503460f949761b3/html5/thumbnails/30.jpg)
Image: H. Grubmüller, in Attig, et al. (eds.), Computational Soft Matter (2004)
Mechanisms of Intracellular Machines
![Page 31: New Architectures for a New Biology Martin M. Deneroff deneroff@deshaw.com D. E. Shaw Research, LLC](https://reader035.vdocuments.us/reader035/viewer/2022070411/56649cb15503460f949761b3/html5/thumbnails/31.jpg)
What Will It Take to Simulate a Millisecond?
We need an enormous increase in speed
– Current (single processor): ~ 100 ms / fs
– Goal will require < 10 µs / fs
Required speedup:
> 10,000 x faster than current single-processor speed
~ 1,000x faster than current parallel implementations
Can’t accept >10,000x the power (~5 Megawatts)!
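The speedup figures can be checked with a little arithmetic. The sketch below starts from the earlier example (2 ns simulated in 3.4 CPU-days) and assumes a goal of 10 microseconds of wall-clock time per femtosecond, which is what a 10,000x speedup over ~100 ms / fs implies. The variable and function names are hypothetical.

```python
FS_PER_NS = 1e6            # femtoseconds in a nanosecond
FS_PER_MS = 1e12           # femtoseconds in a millisecond
SECONDS_PER_DAY = 86_400


def wall_seconds_per_fs(simulated_ns, wall_days):
    """Wall-clock seconds needed per simulated femtosecond."""
    return wall_days * SECONDS_PER_DAY / (simulated_ns * FS_PER_NS)


# Earlier example: 2 ns of simulated time took 3.4 CPU-days.
current = wall_seconds_per_fs(2, 3.4)

# Assumed goal: 10 microseconds of wall-clock time per femtosecond.
goal = 10e-6
speedup = current / goal

# Time to simulate one millisecond at each rate.
years_for_ms_current = current * FS_PER_MS / SECONDS_PER_DAY / 365
days_for_ms_at_goal = goal * FS_PER_MS / SECONDS_PER_DAY

print(f"current rate: {current * 1e3:.0f} ms per fs")
print(f"required speedup: {speedup:,.0f}x")
print(f"one millisecond today: about {years_for_ms_current:,.0f} years")
print(f"one millisecond at the goal rate: about {days_for_ms_at_goal:.0f} days")
```

At today's single-processor rate a millisecond would take thousands of years, while at the goal rate it takes a few months.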
![Page 32: New Architectures for a New Biology Martin M. Deneroff deneroff@deshaw.com D. E. Shaw Research, LLC](https://reader035.vdocuments.us/reader035/viewer/2022070411/56649cb15503460f949761b3/html5/thumbnails/32.jpg)
Target Simulation Speed
3.4 days today (one processor)
~ 13 seconds on our machine
(one segment)
![Page 33: New Architectures for a New Biology Martin M. Deneroff deneroff@deshaw.com D. E. Shaw Research, LLC](https://reader035.vdocuments.us/reader035/viewer/2022070411/56649cb15503460f949761b3/html5/thumbnails/33.jpg)
Molecular Mechanics Force Field
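$$
E = \sum_{\text{bonds}} k_b (r - r_0)^2
  + \sum_{\text{angles}} k_\theta (\theta - \theta_0)^2
  + \sum_{\text{torsions}} A \, [1 + \cos(n\tau - \phi)]
  + \sum_i \sum_{j<i} \frac{q_i q_j}{r_{ij}}
  + \sum_i \sum_{j<i} \left( \frac{A_{ij}}{r_{ij}^{12}} - \frac{B_{ij}}{r_{ij}^{6}} \right)
$$
Bonded terms: Stretch (bonds), Bend (angles), Torsion
Non-Bonded terms: Electrostatic, Van der Waals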
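The five terms named on this slide (stretch, bend, torsion, electrostatic, van der Waals) can be sketched in code as below. This is an illustration under simplifying assumptions, not a production force field: physical constants such as the Coulomb prefactor are omitted, a single pair of van der Waals coefficients is used for every atom pair, and all function and parameter names are hypothetical.

```python
import math


def stretch_energy(r, k_b, r0):
    """Bond stretch: quadratic penalty for deviating from length r0."""
    return k_b * (r - r0) ** 2


def bend_energy(theta, k_theta, theta0):
    """Angle bend: quadratic penalty for deviating from angle theta0."""
    return k_theta * (theta - theta0) ** 2


def torsion_energy(tau, a, n, phi):
    """Torsion: periodic term with amplitude a, periodicity n, phase phi."""
    return a * (1.0 + math.cos(n * tau - phi))


def nonbonded_energy(coords, charges, a_vdw, b_vdw):
    """Electrostatic and van der Waals energies summed over all pairs j < i.

    Assumption: the same a_vdw and b_vdw apply to every pair; a real force
    field looks these up per pair of atom types.
    """
    e_elec = 0.0
    e_vdw = 0.0
    for i in range(len(coords)):
        for j in range(i):
            r = math.dist(coords[i], coords[j])
            e_elec += charges[i] * charges[j] / r
            e_vdw += a_vdw / r ** 12 - b_vdw / r ** 6
    return e_elec, e_vdw


# Two oppositely charged atoms one unit apart.
e_elec, e_vdw = nonbonded_energy(
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)], [1.0, -1.0], a_vdw=1.0, b_vdw=2.0
)
print(e_elec, e_vdw)
```

The double loop over pairs in `nonbonded_energy` is the part whose cost grows fastest with system size, since the bonded terms only involve atoms that are chemically connected.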
![Page 34: New Architectures for a New Biology Martin M. Deneroff deneroff@deshaw.com D. E. Shaw Research, LLC](https://reader035.vdocuments.us/reader035/viewer/2022070411/56649cb15503460f949761b3/html5/thumbnails/34.jpg)
What Takes So Long?
Inner loop of force field evaluation looks at all pairs of atoms (within distance R)
On the order of 64K atoms in typical system
Repeat ~10^12 times (one millisecond of ~1 fs time steps)
Current approaches too slow by several orders of magnitude
What can be done?
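A rough sense of the workload comes from counting pairs. The sketch below pairs a brute-force count of atom pairs within a cutoff (checked on a toy lattice) with a back-of-the-envelope estimate for a 64K-atom system. The cutoff of 10 angstroms and the density of 0.1 atoms per cubic angstrom (roughly that of liquid water) are assumptions chosen for illustration, not figures from the talk, and the function names are hypothetical.

```python
import math
from itertools import combinations


def count_pairs_within(coords, cutoff):
    """Brute-force count of atom pairs closer than cutoff (O(N^2) work)."""
    return sum(1 for a, b in combinations(coords, 2) if math.dist(a, b) < cutoff)


def estimated_pairs(n_atoms, cutoff, number_density):
    """Expected pair count for uniformly distributed atoms, ignoring edges."""
    neighbors_per_atom = number_density * (4.0 / 3.0) * math.pi * cutoff ** 3
    return 0.5 * n_atoms * neighbors_per_atom


# Toy check: a 3 x 3 x 3 cubic lattice with unit spacing.
lattice = [(x, y, z) for x in range(3) for y in range(3) for z in range(3)]
nearest_neighbor_pairs = count_pairs_within(lattice, 1.1)

# Assumed values: 64K atoms, 10 angstrom cutoff, 0.1 atoms per cubic angstrom.
pairs_per_step = estimated_pairs(64 * 1024, 10.0, 0.1)
total_pair_evaluations = pairs_per_step * 1e12

print(nearest_neighbor_pairs)
print(f"{pairs_per_step:.2e} pairs per step")
print(f"{total_pair_evaluations:.2e} pair evaluations per millisecond")
```

Under these assumptions each time step touches on the order of ten million pairs, and the whole simulation on the order of 10^19 pair evaluations.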
![Page 35: New Architectures for a New Biology Martin M. Deneroff deneroff@deshaw.com D. E. Shaw Research, LLC](https://reader035.vdocuments.us/reader035/viewer/2022070411/56649cb15503460f949761b3/html5/thumbnails/35.jpg)
Our Strategy
New architectures
– Design a specialized machine
– Enormously parallel architecture
– Based on special-purpose ASICs
– Dramatically faster for MD, but less flexible
– Projected completion: 2008
New algorithms
– Applicable to
• Conventional clusters
• Our own machine
– Scale to very large # of processing elements
![Page 36: New Architectures for a New Biology Martin M. Deneroff deneroff@deshaw.com D. E. Shaw Research, LLC](https://reader035.vdocuments.us/reader035/viewer/2022070411/56649cb15503460f949761b3/html5/thumbnails/36.jpg)
Interdisciplinary Lab
Computational Chemists and Biologists
Computer Scientists and Applied Mathematicians
Computer Architects and Engineers
![Page 37: New Architectures for a New Biology Martin M. Deneroff deneroff@deshaw.com D. E. Shaw Research, LLC](https://reader035.vdocuments.us/reader035/viewer/2022070411/56649cb15503460f949761b3/html5/thumbnails/37.jpg)
*** New Architectures
![Page 38: New Architectures for a New Biology Martin M. Deneroff deneroff@deshaw.com D. E. Shaw Research, LLC](https://reader035.vdocuments.us/reader035/viewer/2022070411/56649cb15503460f949761b3/html5/thumbnails/38.jpg)
Alternative Machine Architectures
Conventional cluster of commodity processors
General-purpose scientific supercomputer
Special-purpose molecular dynamics machine
![Page 39: New Architectures for a New Biology Martin M. Deneroff deneroff@deshaw.com D. E. Shaw Research, LLC](https://reader035.vdocuments.us/reader035/viewer/2022070411/56649cb15503460f949761b3/html5/thumbnails/39.jpg)
Conventional Cluster of Commodity Processors
Strengths:
– Flexibility
– Mass market economies of scale
Limitations
– Doesn’t exploit special features of the problem
– Communication bottlenecks
• Between processor and memory
• Among processors
– Insufficient arithmetic power
![Page 40: New Architectures for a New Biology Martin M. Deneroff deneroff@deshaw.com D. E. Shaw Research, LLC](https://reader035.vdocuments.us/reader035/viewer/2022070411/56649cb15503460f949761b3/html5/thumbnails/40.jpg)
General-Purpose Scientific Supercomputer
E.g., IBM Blue Gene
More demanding goal than ours
– General-purpose scientific supercomputing
– Fast for wide range of applications
Strengths:
– Flexibility
– Ease of programmability
Limitations for MD simulations
– Expensive
– Still not fast enough for our purposes
![Page 41: New Architectures for a New Biology Martin M. Deneroff deneroff@deshaw.com D. E. Shaw Research, LLC](https://reader035.vdocuments.us/reader035/viewer/2022070411/56649cb15503460f949761b3/html5/thumbnails/41.jpg)
Anton: DESRES’ Special-Purpose MD Machine
Strengths:
– Several orders of magnitude faster for MD
– Excellent cost/performance characteristics
Limitations:
– Not designed for other scientific applications
• They’d be difficult to program
• Still wouldn’t be especially fast
– Limited flexibility
![Page 42: New Architectures for a New Biology Martin M. Deneroff deneroff@deshaw.com D. E. Shaw Research, LLC](https://reader035.vdocuments.us/reader035/viewer/2022070411/56649cb15503460f949761b3/html5/thumbnails/42.jpg)
Anton System-Level Organization
Multiple segments (probably 8 in first machine)
512 nodes (each consists of one ASIC plus DRAM) per segment
– Organized in an 8 x 8 x 8 toroidal mesh
Each ASIC has performance roughly equivalent to 500 general-purpose microprocessors
– ASIC power similar to a single microprocessor
![Page 43: New Architectures for a New Biology Martin M. Deneroff deneroff@deshaw.com D. E. Shaw Research, LLC](https://reader035.vdocuments.us/reader035/viewer/2022070411/56649cb15503460f949761b3/html5/thumbnails/43.jpg)
3D Torus Network
![Page 44: New Architectures for a New Biology Martin M. Deneroff deneroff@deshaw.com D. E. Shaw Research, LLC](https://reader035.vdocuments.us/reader035/viewer/2022070411/56649cb15503460f949761b3/html5/thumbnails/44.jpg)
Why a 3D Torus?
Topology reflects physical space being simulated:
– Three-dimensional nearest neighbor connections
– Periodic boundary conditions
Bulk of communications is to near neighbors
– No switching to reach immediate neighbors
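The wraparound connectivity of an 8 x 8 x 8 torus can be sketched as below. This is only an illustration of the topology, with hypothetical function names. It says nothing about how Anton actually routes messages.

```python
from itertools import product


def torus_neighbors(node, dims=(8, 8, 8)):
    """Return the six nearest neighbors of a node, wrapping at the edges."""
    x, y, z = node
    nx, ny, nz = dims
    return [
        ((x + 1) % nx, y, z), ((x - 1) % nx, y, z),
        (x, (y + 1) % ny, z), (x, (y - 1) % ny, z),
        (x, y, (z + 1) % nz), (x, y, (z - 1) % nz),
    ]


def torus_hops(a, b, dims=(8, 8, 8)):
    """Minimum number of nearest-neighbor hops between two nodes."""
    return sum(min((p - q) % n, (q - p) % n) for p, q, n in zip(a, b, dims))


nodes = list(product(range(8), repeat=3))
corner_neighbors = torus_neighbors((0, 0, 0))
worst_case_hops = max(torus_hops((0, 0, 0), n) for n in nodes)

print(len(nodes))            # nodes in one segment
print(corner_neighbors)
print(worst_case_hops)
```

Because the mesh wraps around, a node on the "edge" of the machine is directly connected to the node on the opposite face, mirroring the periodic boundary conditions of the simulated volume.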
![Page 45: New Architectures for a New Biology Martin M. Deneroff deneroff@deshaw.com D. E. Shaw Research, LLC](https://reader035.vdocuments.us/reader035/viewer/2022070411/56649cb15503460f949761b3/html5/thumbnails/45.jpg)
Source of Speedup on Our Machine
Judicious use of arithmetic specialization
– Flexibility, programmability only where needed
– Elsewhere, hardware tailored for speed
• Tables and parameters, but not programmable
Carefully choreographed communication
– Data flows to just where it’s needed
– Almost never need to access off-chip memory
![Page 46: New Architectures for a New Biology Martin M. Deneroff deneroff@deshaw.com D. E. Shaw Research, LLC](https://reader035.vdocuments.us/reader035/viewer/2022070411/56649cb15503460f949761b3/html5/thumbnails/46.jpg)
Two Subsystems on Each ASIC
Specialized Subsystem
– Pairwise point interactions
– Enormously parallel
– Aggressive clock rate
Flexible Subsystem
– Programmable, general-purpose
– Efficient geometric operations
– Modest clock rate
![Page 47: New Architectures for a New Biology Martin M. Deneroff deneroff@deshaw.com D. E. Shaw Research, LLC](https://reader035.vdocuments.us/reader035/viewer/2022070411/56649cb15503460f949761b3/html5/thumbnails/47.jpg)
Where We Use Specialized Hardware
Specialized hardware (with tables, parameters) where:
Inner loop
Simple, regular algorithmic structure
Unlikely to change
Examples:
Electrostatic forces
Van der Waals interactions
![Page 48: New Architectures for a New Biology Martin M. Deneroff deneroff@deshaw.com D. E. Shaw Research, LLC](https://reader035.vdocuments.us/reader035/viewer/2022070411/56649cb15503460f949761b3/html5/thumbnails/48.jpg)
Example: Particle Interaction Pipeline (one of 32)
![Page 49: New Architectures for a New Biology Martin M. Deneroff deneroff@deshaw.com D. E. Shaw Research, LLC](https://reader035.vdocuments.us/reader035/viewer/2022070411/56649cb15503460f949761b3/html5/thumbnails/49.jpg)
Array of 32 Particle Interaction Pipelines
![Page 50: New Architectures for a New Biology Martin M. Deneroff deneroff@deshaw.com D. E. Shaw Research, LLC](https://reader035.vdocuments.us/reader035/viewer/2022070411/56649cb15503460f949761b3/html5/thumbnails/50.jpg)
Advantages of Particle Interaction Pipelines
Save area that would have been allocated to
– Cache
– Control logic
– Wires
Achieve extremely high arithmetic density
Save time that would have been spent on
– Cache misses
– Load/store instructions
– Misc. data shuffling
![Page 51: New Architectures for a New Biology Martin M. Deneroff deneroff@deshaw.com D. E. Shaw Research, LLC](https://reader035.vdocuments.us/reader035/viewer/2022070411/56649cb15503460f949761b3/html5/thumbnails/51.jpg)
Where We Use Flexible Hardware
– Use programmable hardware where:
• Algorithm less regular
• Smaller % of total computation
- E.g., local interactions (fewer of them)
• More likely to change
– Examples:
• Bonded interactions
• Bond length constraints
• Experimentation with
- New, short-range force field terms
- Alternative integration techniques
![Page 52: New Architectures for a New Biology Martin M. Deneroff deneroff@deshaw.com D. E. Shaw Research, LLC](https://reader035.vdocuments.us/reader035/viewer/2022070411/56649cb15503460f949761b3/html5/thumbnails/52.jpg)
Forms of Parallelism in Flexible Subsystem
The Flexible Subsystem exploits three forms of parallelism:
– Multi-core parallelism (4 Tensilicas, 8 Geometry Cores)
– Instruction-level parallelism
– SIMD parallelism – calculate on 3D and 4D vectors as single operation
![Page 53: New Architectures for a New Biology Martin M. Deneroff deneroff@deshaw.com D. E. Shaw Research, LLC](https://reader035.vdocuments.us/reader035/viewer/2022070411/56649cb15503460f949761b3/html5/thumbnails/53.jpg)
Overview of the Flexible Subsystem
GC = Geometry Core
(each a VLIW processor)
![Page 54: New Architectures for a New Biology Martin M. Deneroff deneroff@deshaw.com D. E. Shaw Research, LLC](https://reader035.vdocuments.us/reader035/viewer/2022070411/56649cb15503460f949761b3/html5/thumbnails/54.jpg)
Geometry Core (one of 8; 64 pipelined lanes/chip)
[Block diagram: Instruction Memory, Decode, and PC, with input from the Tensilica core; Data Memory; two groups of X, Y, Z, W lanes with multiply (X) and add (+) units; eight units labeled f]
![Page 55: New Architectures for a New Biology Martin M. Deneroff deneroff@deshaw.com D. E. Shaw Research, LLC](https://reader035.vdocuments.us/reader035/viewer/2022070411/56649cb15503460f949761b3/html5/thumbnails/55.jpg)
But Communication is Still a Bottleneck
Scalability limited by inter-chip communication
To execute a single millisecond-scale simulation,
– Need a huge number of processing elements
– Must dramatically reduce amount of data transferred between these processing elements
Can’t do this without fundamentally new algorithms:
– A family of Neutral Territory (NT) methods that reduce pair interaction communication load significantly
– A new variant of the Ewald method for distant interactions, Gaussian Split Ewald (GSE), which simplifies calculation and communication for distant interactions
– These are the subject of a different talk.
![Page 56: New Architectures for a New Biology Martin M. Deneroff deneroff@deshaw.com D. E. Shaw Research, LLC](https://reader035.vdocuments.us/reader035/viewer/2022070411/56649cb15503460f949761b3/html5/thumbnails/56.jpg)
An Open Question That Keeps Us Awake at Night
![Page 57: New Architectures for a New Biology Martin M. Deneroff deneroff@deshaw.com D. E. Shaw Research, LLC](https://reader035.vdocuments.us/reader035/viewer/2022070411/56649cb15503460f949761b3/html5/thumbnails/57.jpg)
Are Force Fields Accurate Enough?
Nobody knows how accurate the force fields that everyone uses actually are
– Can’t simulate for long enough to know (until we use Anton for the first time!)
– If problems surface, we should at least be able to• Figure out why• Take steps to fix them
But we already know that fast, single MD simulations will prove sufficient to answer at least some major scientific questions
![Page 58: New Architectures for a New Biology Martin M. Deneroff deneroff@deshaw.com D. E. Shaw Research, LLC](https://reader035.vdocuments.us/reader035/viewer/2022070411/56649cb15503460f949761b3/html5/thumbnails/58.jpg)
Example: Simulation of a Na+/H+ Antiporter
Cytoplasm
Periplasm
![Page 59: New Architectures for a New Biology Martin M. Deneroff deneroff@deshaw.com D. E. Shaw Research, LLC](https://reader035.vdocuments.us/reader035/viewer/2022070411/56649cb15503460f949761b3/html5/thumbnails/59.jpg)
Our Functional Model of the Na+/H+ Antiporter