![Page 1: Co-processors for speeding up drug design algorithms Advait Jain Priyanka Jindal Pulkit Gambhir Under the guidance of: Prof. M Balakrishnan Prof. Kolin](https://reader035.vdocuments.us/reader035/viewer/2022062401/5a4d1b307f8b9ab05999b037/html5/thumbnails/1.jpg)
Co-processors for speeding up drug design algorithms
Advait Jain, Priyanka Jindal, Pulkit Gambhir
Under the guidance of: Prof. M Balakrishnan, Prof. Kolin Paul
![Page 2: Co-processors for speeding up drug design algorithms Advait Jain Priyanka Jindal Pulkit Gambhir Under the guidance of: Prof. M Balakrishnan Prof. Kolin](https://reader035.vdocuments.us/reader035/viewer/2022062401/5a4d1b307f8b9ab05999b037/html5/thumbnails/2.jpg)
Objective
To design FPGA-based hardware accelerators for speeding up the energy minimization process.
![Page 3: Co-processors for speeding up drug design algorithms Advait Jain Priyanka Jindal Pulkit Gambhir Under the guidance of: Prof. M Balakrishnan Prof. Kolin](https://reader035.vdocuments.us/reader035/viewer/2022062401/5a4d1b307f8b9ab05999b037/html5/thumbnails/3.jpg)
Approach to the problem
Familiarization with the code
Software profiling
  Identifying bottleneck procedures/loops
  Compiler level optimizations
H/w - S/w partitioning
  Where to partition
  APIs to export
Hardware Design
Performance Analysis
![Page 4: Co-processors for speeding up drug design algorithms Advait Jain Priyanka Jindal Pulkit Gambhir Under the guidance of: Prof. M Balakrishnan Prof. Kolin](https://reader035.vdocuments.us/reader035/viewer/2022062401/5a4d1b307f8b9ab05999b037/html5/thumbnails/4.jpg)
Overall Control Flow
![Page 5: Co-processors for speeding up drug design algorithms Advait Jain Priyanka Jindal Pulkit Gambhir Under the guidance of: Prof. M Balakrishnan Prof. Kolin](https://reader035.vdocuments.us/reader035/viewer/2022062401/5a4d1b307f8b9ab05999b037/html5/thumbnails/5.jpg)
Bottleneck Functions
![Page 6: Co-processors for speeding up drug design algorithms Advait Jain Priyanka Jindal Pulkit Gambhir Under the guidance of: Prof. M Balakrishnan Prof. Kolin](https://reader035.vdocuments.us/reader035/viewer/2022062401/5a4d1b307f8b9ab05999b037/html5/thumbnails/6.jpg)
Bottleneck Functions
![Page 7: Co-processors for speeding up drug design algorithms Advait Jain Priyanka Jindal Pulkit Gambhir Under the guidance of: Prof. M Balakrishnan Prof. Kolin](https://reader035.vdocuments.us/reader035/viewer/2022062401/5a4d1b307f8b9ab05999b037/html5/thumbnails/7.jpg)
Split Up

| Code | Eval_Energy_for_step (%) | Diff_Energy (%) |
|---|---|---|
| Non-bonded pairs | 68.61 | 29.10 |
| Dihedrals | 0.54 | 0.56 |
| Angles | 0.17 | 0.12 |
| Bonded | 0.00 | 0.00 |
![Page 8: Co-processors for speeding up drug design algorithms Advait Jain Priyanka Jindal Pulkit Gambhir Under the guidance of: Prof. M Balakrishnan Prof. Kolin](https://reader035.vdocuments.us/reader035/viewer/2022062401/5a4d1b307f8b9ab05999b037/html5/thumbnails/8.jpg)
Bottleneck Functions
Iterate over list of bonds {O(N) elements}
Iterate over list of angles {O(N) elements}
Iterate over list of dihedrals {O(N) elements}
Iterate over list of non-bonded pairs {O(N^2) elements}
Functions: Eval energy, Eval Energy for step, Diff energy
![Page 9: Co-processors for speeding up drug design algorithms Advait Jain Priyanka Jindal Pulkit Gambhir Under the guidance of: Prof. M Balakrishnan Prof. Kolin](https://reader035.vdocuments.us/reader035/viewer/2022062401/5a4d1b307f8b9ab05999b037/html5/thumbnails/9.jpg)
Molecule Size v/s Time (log plot)
Average Slope = 2.03
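The slope of a log-log plot is the exponent of the scaling law, so a slope close to 2 is what the O(N^2) non-bonded pair list would predict. A minimal sketch of how such a slope can be estimated by least squares follows; the sizes and timings are synthetic placeholders, not the measurements behind the plot, and `loglog_slope` is a hypothetical helper name.

```python
import math

def loglog_slope(sizes, times):
    """Least-squares slope of log(time) against log(size)."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(t) for t in times]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

# Synthetic example: time grows exactly as N^2 (illustrative values only).
sizes = [100, 200, 400, 800]
times = [1e-6 * n * n for n in sizes]
slope = loglog_slope(sizes, times)
print(f"fitted exponent: {slope:.2f}")
```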
![Page 10: Co-processors for speeding up drug design algorithms Advait Jain Priyanka Jindal Pulkit Gambhir Under the guidance of: Prof. M Balakrishnan Prof. Kolin](https://reader035.vdocuments.us/reader035/viewer/2022062401/5a4d1b307f8b9ab05999b037/html5/thumbnails/10.jpg)
Energy v/s CG Steps
We are here
![Page 11: Co-processors for speeding up drug design algorithms Advait Jain Priyanka Jindal Pulkit Gambhir Under the guidance of: Prof. M Balakrishnan Prof. Kolin](https://reader035.vdocuments.us/reader035/viewer/2022062401/5a4d1b307f8b9ab05999b037/html5/thumbnails/11.jpg)
Non-bonded List
Node structure
Float A, B, C (4*3 bytes)
Int a1, a2
C is a function of the charges q1 and q2 of the atoms.
  471,282 distinct Cs (3 bytes)
A, B are a function of the radius and epsilon of the atoms.
  192 distinct pairs of A,B (1 byte)
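As a rough illustration of how one node of this list is consumed, each node can be read as a single pairwise energy term. The sketch assumes a Lennard-Jones 12-6 plus Coulomb form, E = A/r^12 - B/r^6 + C/r, which is a common force-field convention and is not confirmed by the slides; `NonBondedNode` and `nonbonded_energy` are hypothetical names.

```python
import math
from dataclasses import dataclass

@dataclass
class NonBondedNode:
    # Mirrors the node structure above: three float coefficients, two atom indices.
    A: float
    B: float
    C: float
    a1: int
    a2: int

def nonbonded_energy(nodes, coords):
    """Sum pair energies over the non-bonded list.

    Assumed functional form (not taken from the slides):
        E = A / r^12 - B / r^6 + C / r
    """
    total = 0.0
    for node in nodes:
        r = math.dist(coords[node.a1], coords[node.a2])
        r6 = r ** 6
        total += node.A / (r6 * r6) - node.B / r6 + node.C / r
    return total

# Two atoms 2 units apart; coefficients chosen so each term equals 1.
coords = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
nodes = [NonBondedNode(A=4096.0, B=64.0, C=2.0, a1=0, a2=1)]
energy = nonbonded_energy(nodes, coords)
```

Because the loop touches every node once, its cost grows with the length of the list, which is quadratic in the number of atoms.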
![Page 12: Co-processors for speeding up drug design algorithms Advait Jain Priyanka Jindal Pulkit Gambhir Under the guidance of: Prof. M Balakrishnan Prof. Kolin](https://reader035.vdocuments.us/reader035/viewer/2022062401/5a4d1b307f8b9ab05999b037/html5/thumbnails/12.jpg)
New Data Structure
Vector of distinct Cs
Vector of distinct (A,B) pairs
3d coordinates of atoms
New node structure
  Int a1, a2
  Unsigned common_index (top bit labelled 31 in the diagram)
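One way such a layout could be built is to deduplicate the C values and the (A,B) pairs into two vectors and keep only indices in each node. Packing both indices into a single 32-bit common_index (24 bits for C, 8 bits for (A,B)) is an assumption based on the 3-byte and 1-byte figures quoted on the previous slide; the helper names are hypothetical.

```python
C_BITS = 24   # assumed: 471,282 distinct Cs need an index of at most 3 bytes
AB_BITS = 8   # assumed: 192 distinct (A,B) pairs need an index of 1 byte

def pack_index(ab_index, c_index):
    """Pack the two table indices into one unsigned 32-bit word."""
    assert 0 <= ab_index < (1 << AB_BITS) and 0 <= c_index < (1 << C_BITS)
    return (ab_index << C_BITS) | c_index

def unpack_index(common_index):
    """Return (ab_index, c_index) from a packed word."""
    return common_index >> C_BITS, common_index & ((1 << C_BITS) - 1)

def compress(nodes):
    """nodes: iterable of (A, B, C, a1, a2). Returns (ab_vec, c_vec, new_nodes)."""
    ab_vec, c_vec, ab_pos, c_pos, new_nodes = [], [], {}, {}, []
    for A, B, C, a1, a2 in nodes:
        if (A, B) not in ab_pos:
            ab_pos[(A, B)] = len(ab_vec)
            ab_vec.append((A, B))
        if C not in c_pos:
            c_pos[C] = len(c_vec)
            c_vec.append(C)
        new_nodes.append((a1, a2, pack_index(ab_pos[(A, B)], c_pos[C])))
    return ab_vec, c_vec, new_nodes

old = [(1.0, 2.0, 0.5, 0, 1), (1.0, 2.0, 0.7, 0, 2), (3.0, 4.0, 0.5, 1, 2)]
ab_vec, c_vec, new_nodes = compress(old)
```

Each new node then holds two atom indices and one index word, and the coefficients are looked up through `unpack_index` when the energy is evaluated.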
![Page 13: Co-processors for speeding up drug design algorithms Advait Jain Priyanka Jindal Pulkit Gambhir Under the guidance of: Prof. M Balakrishnan Prof. Kolin](https://reader035.vdocuments.us/reader035/viewer/2022062401/5a4d1b307f8b9ab05999b037/html5/thumbnails/13.jpg)
Result of new data structure
Molecule Size: 2008
VanderList: 2,008,417; AB_Vander list: 136; C_Vanderlist: 21,651

| Data structure | Size (bytes) | Approx. |
|---|---|---|
| Old Data Structure | 2,008,417 * 20 | ~ 40 MB |
| New Data Structure | 2,008,417 * 12 + 136 * 8 + 21,651 * 4 | ~ 24 MB |
| Projected Data Structure | 2,008,417 * 8 + 136 * 8 + 21,651 * 4 | ~ 16 MB |

Improvement in cache performance
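The three size estimates can be reproduced directly from the list lengths. The per-node byte counts (20, 12, 8) and the table entry sizes (8 and 4 bytes) are taken from the slide; `footprint` is a hypothetical helper.

```python
PAIRS = 2_008_417    # VanderList entries
AB_ENTRIES = 136     # distinct (A,B) pairs, 8 bytes each
C_ENTRIES = 21_651   # distinct C values, 4 bytes each

def footprint(node_bytes, with_tables=True):
    """Bytes used by the pair list, plus the shared vectors if present."""
    tables = AB_ENTRIES * 8 + C_ENTRIES * 4 if with_tables else 0
    return PAIRS * node_bytes + tables

old_bytes = footprint(20, with_tables=False)
new_bytes = footprint(12)
projected_bytes = footprint(8)

for label, size in [("old", old_bytes), ("new", new_bytes), ("projected", projected_bytes)]:
    print(f"{label:>9}: {size:,} bytes (~{size / 1e6:.1f} MB)")
```

The shared vectors add under 0.1 MB, so almost all of the saving comes from shrinking the per-pair node.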
![Page 14: Co-processors for speeding up drug design algorithms Advait Jain Priyanka Jindal Pulkit Gambhir Under the guidance of: Prof. M Balakrishnan Prof. Kolin](https://reader035.vdocuments.us/reader035/viewer/2022062401/5a4d1b307f8b9ab05999b037/html5/thumbnails/14.jpg)
Sorting to improve performance
Consecutive nodes of the van-der list can point randomly anywhere in the C and (A,B) vectors
  Scope for further improving cache performance
Radix sort on the van-der list
  First bucket sort on the C-index
  Second stable bucket sort on the (A,B)-index
  Sequential access of (A,B) vector
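The two passes described above form a least-significant-key-first radix sort: because the second pass is stable, nodes end up ordered by (A,B)-index, and by C-index within each (A,B) group. A minimal sketch follows; the tuple layout and function names are hypothetical, and the indices are kept as separate fields for readability.

```python
def stable_bucket_sort(items, key, num_buckets):
    """Stable bucket sort: items with equal keys keep their input order."""
    buckets = [[] for _ in range(num_buckets)]
    for item in items:
        buckets[key(item)].append(item)
    return [item for bucket in buckets for item in bucket]

def radix_sort_vander(nodes, num_c, num_ab):
    """nodes: (a1, a2, ab_index, c_index) tuples (hypothetical layout)."""
    by_c = stable_bucket_sort(nodes, key=lambda n: n[3], num_buckets=num_c)
    return stable_bucket_sort(by_c, key=lambda n: n[2], num_buckets=num_ab)

nodes = [(0, 1, 1, 2), (0, 2, 0, 1), (1, 2, 1, 0), (1, 3, 0, 0)]
sorted_nodes = radix_sort_vander(nodes, num_c=3, num_ab=2)
```

With this ordering a walk over the list moves through the (A,B) vector front to back, and through the C vector in increasing order inside each (A,B) group.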
![Page 15: Co-processors for speeding up drug design algorithms Advait Jain Priyanka Jindal Pulkit Gambhir Under the guidance of: Prof. M Balakrishnan Prof. Kolin](https://reader035.vdocuments.us/reader035/viewer/2022062401/5a4d1b307f8b9ab05999b037/html5/thumbnails/15.jpg)
Cache Profiling (unsorted vs sorted)

| Run | Access | L1D refs | L1D misses | L2 refs |
|---|---|---|---|---|
| Unsorted | Total | 1,773,145,080 | 44,016,787 | 44,754,341 |
| Unsorted | Rd | 1,451,802,230 | 43,429,781 (3%) | 44,167,335 |
| Unsorted | Wr | 321,342,785 | 587,006 (0.1826%) | 587,006 |
| Sorted | Total | 1,842,686,500 | 29,287,877 | 30,152,893 |
| Sorted | Rd | 1,495,124,238 | 28,470,590 (1.9%) | 29,335,606 |
| Sorted | Wr | 347,562,262 | 817,287 (0.235%) | 817,287 |

Test Case: Molecule of size 413 atoms with 25 SD and 100 CG steps
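The effect of sorting is easier to see as ratios. The short calculation below uses the total L1D counts listed above and assumes the first set of numbers is the unsorted run and the second the sorted run, following the order in the slide title.

```python
# Total L1D counts copied from the profile above (row assignment is an assumption).
runs = {
    "unsorted": {"l1d_refs": 1_773_145_080, "l1d_misses": 44_016_787},
    "sorted": {"l1d_refs": 1_842_686_500, "l1d_misses": 29_287_877},
}

def miss_rate(run):
    """Fraction of L1 data references that missed."""
    return run["l1d_misses"] / run["l1d_refs"]

rates = {name: miss_rate(run) for name, run in runs.items()}
miss_reduction = 1 - runs["sorted"]["l1d_misses"] / runs["unsorted"]["l1d_misses"]

for name, rate in rates.items():
    print(f"{name:>8}: overall L1D miss rate {rate:.2%}")
print(f"L1D misses removed by sorting: {miss_reduction:.1%}")
```

Under that reading, sorting removes roughly a third of the L1D misses even though the sorted run issues slightly more references.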
![Page 16: Co-processors for speeding up drug design algorithms Advait Jain Priyanka Jindal Pulkit Gambhir Under the guidance of: Prof. M Balakrishnan Prof. Kolin](https://reader035.vdocuments.us/reader035/viewer/2022062401/5a4d1b307f8b9ab05999b037/html5/thumbnails/16.jpg)
Converting to floating point
All the code is written in double precision
Double precision is difficult to replicate in hardware
Need to test feasibility of conversion to single precision
![Page 17: Co-processors for speeding up drug design algorithms Advait Jain Priyanka Jindal Pulkit Gambhir Under the guidance of: Prof. M Balakrishnan Prof. Kolin](https://reader035.vdocuments.us/reader035/viewer/2022062401/5a4d1b307f8b9ab05999b037/html5/thumbnails/17.jpg)
Single Precision
Call graph: minEnergyCG(), diffEnergy(), evalEnergy_for_step(), moveStep()
Annotation: Precision lost here
Annotation: Instability introduced here, resulting in NaN
![Page 18: Co-processors for speeding up drug design algorithms Advait Jain Priyanka Jindal Pulkit Gambhir Under the guidance of: Prof. M Balakrishnan Prof. Kolin](https://reader035.vdocuments.us/reader035/viewer/2022062401/5a4d1b307f8b9ab05999b037/html5/thumbnails/18.jpg)
Single Precision
Removed the instability
  Parabolic interpolation replaced by lnSearch() whenever points are collinear.
Time taken to evaluate the energy increased.
  Increase in the number of calls to evalEnergy_for_step().
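Collinear points matter because the parabola through three points has a vertex formula whose denominator vanishes when the points lie on a line, so the division can produce inf or NaN. The slides do not show the interpolation code; the sketch below uses the standard three-point vertex formula, and `parabolic_minimum`, `next_step`, and the `line_search` callback (standing in for the line-search routine) are hypothetical names.

```python
def parabolic_minimum(p1, p2, p3, eps=1e-12):
    """Abscissa of the vertex of the parabola through three (x, y) points.

    Returns None when the points are (nearly) collinear, because the
    denominator vanishes and the division would blow up.
    """
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    d21, d23 = x2 - x1, x2 - x3
    num = d21 * d21 * (y2 - y3) - d23 * d23 * (y2 - y1)
    den = d21 * (y2 - y3) - d23 * (y2 - y1)
    if abs(den) < eps:  # crude absolute threshold, chosen for illustration
        return None
    return x2 - 0.5 * num / den

def next_step(p1, p2, p3, line_search):
    """Use the parabola when it is well defined, else fall back to a line search."""
    x = parabolic_minimum(p1, p2, p3)
    return line_search() if x is None else x

# Three samples of (x - 2)^2, then three collinear samples that trigger the fallback.
vertex = parabolic_minimum((0.0, 4.0), (1.0, 1.0), (3.0, 1.0))
fallback = next_step((0.0, 0.0), (1.0, 1.0), (2.0, 2.0), line_search=lambda: 0.25)
```

The fallback costs extra energy evaluations, which is consistent with the increase in call count noted above.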
![Page 19: Co-processors for speeding up drug design algorithms Advait Jain Priyanka Jindal Pulkit Gambhir Under the guidance of: Prof. M Balakrishnan Prof. Kolin](https://reader035.vdocuments.us/reader035/viewer/2022062401/5a4d1b307f8b9ab05999b037/html5/thumbnails/19.jpg)
Slow Float Vs Double: Time Plot
![Page 20: Co-processors for speeding up drug design algorithms Advait Jain Priyanka Jindal Pulkit Gambhir Under the guidance of: Prof. M Balakrishnan Prof. Kolin](https://reader035.vdocuments.us/reader035/viewer/2022062401/5a4d1b307f8b9ab05999b037/html5/thumbnails/20.jpg)
Control Flow
![Page 21: Co-processors for speeding up drug design algorithms Advait Jain Priyanka Jindal Pulkit Gambhir Under the guidance of: Prof. M Balakrishnan Prof. Kolin](https://reader035.vdocuments.us/reader035/viewer/2022062401/5a4d1b307f8b9ab05999b037/html5/thumbnails/21.jpg)
Single Precision (Molecule Size: 2008, SD: 100, CG: 150)

| | Double | Slow Float |
|---|---|---|
| # of calls to evalEnergyForStep() | 642 | 893 |
| From minEnergyCG() | 450 | 450 |
| From lnSearch() | 192 | 443 |
| # of calls to lnSearch() | 100 | 177 |
| evalEnergyForStep() per lnSearch() | 1.92 | 2.5 |
![Page 22: Co-processors for speeding up drug design algorithms Advait Jain Priyanka Jindal Pulkit Gambhir Under the guidance of: Prof. M Balakrishnan Prof. Kolin](https://reader035.vdocuments.us/reader035/viewer/2022062401/5a4d1b307f8b9ab05999b037/html5/thumbnails/22.jpg)
Reducing the number of calls
minEnergyCG:
  Parabolic interpolation: which 3 points to choose.
lnSearch:
  Iteratively calculates the step size.
  When to stop the iteration is determined by 2 tolerances.
What we did:
  Points for parabolic interpolation are further apart.
  Increased the tolerances until the time to minimize the energy was the same as double.
  Then profiled to check the actual energy.
![Page 23: Co-processors for speeding up drug design algorithms Advait Jain Priyanka Jindal Pulkit Gambhir Under the guidance of: Prof. M Balakrishnan Prof. Kolin](https://reader035.vdocuments.us/reader035/viewer/2022062401/5a4d1b307f8b9ab05999b037/html5/thumbnails/23.jpg)
Fast Float Vs Double: Time Plot
![Page 24: Co-processors for speeding up drug design algorithms Advait Jain Priyanka Jindal Pulkit Gambhir Under the guidance of: Prof. M Balakrishnan Prof. Kolin](https://reader035.vdocuments.us/reader035/viewer/2022062401/5a4d1b307f8b9ab05999b037/html5/thumbnails/24.jpg)
Fast Float Vs Double: Energy Plot
![Page 25: Co-processors for speeding up drug design algorithms Advait Jain Priyanka Jindal Pulkit Gambhir Under the guidance of: Prof. M Balakrishnan Prof. Kolin](https://reader035.vdocuments.us/reader035/viewer/2022062401/5a4d1b307f8b9ab05999b037/html5/thumbnails/25.jpg)
Our conclusions from this exercise
Located the source of instability.
However, converting to float increased the time required for the code to run.
Increasing tolerances again made the code fast.
The energy in case of float did not agree well with the double computation.
![Page 26: Co-processors for speeding up drug design algorithms Advait Jain Priyanka Jindal Pulkit Gambhir Under the guidance of: Prof. M Balakrishnan Prof. Kolin](https://reader035.vdocuments.us/reader035/viewer/2022062401/5a4d1b307f8b9ab05999b037/html5/thumbnails/26.jpg)
Feedback from SCF-Bio team
They are interested primarily in “relaxing” the molecule.
Actual energy is not of any consequence.
To check the float code, the metric should be the error between the molecular structures (float vs double).
![Page 27: Co-processors for speeding up drug design algorithms Advait Jain Priyanka Jindal Pulkit Gambhir Under the guidance of: Prof. M Balakrishnan Prof. Kolin](https://reader035.vdocuments.us/reader035/viewer/2022062401/5a4d1b307f8b9ab05999b037/html5/thumbnails/27.jpg)
New Checking Methodology
Start Structure
  Double Relaxed Structure
  Float Relaxed Structure
RMS Distance between the two relaxed structures
Acceptance: < 0.5
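A minimal sketch of this check, assuming the RMS distance is taken atom by atom between the double-relaxed and float-relaxed coordinates without any re-alignment of the two structures. The slide does not state the units of the 0.5 threshold, and the coordinates below are made up for illustration.

```python
import math

def rms_distance(coords_a, coords_b):
    """Root-mean-square distance between matching atoms of two structures."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("structures must be non-empty and the same length")
    total = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))

ACCEPTANCE = 0.5  # threshold quoted on the slide; units are not stated there

# Illustrative coordinates only, not data from the project.
double_relaxed = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
float_relaxed = [(0.0, 0.0, 0.3), (1.0, 0.4, 0.0)]
rms = rms_distance(double_relaxed, float_relaxed)
accepted = rms < ACCEPTANCE
```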
![Page 28: Co-processors for speeding up drug design algorithms Advait Jain Priyanka Jindal Pulkit Gambhir Under the guidance of: Prof. M Balakrishnan Prof. Kolin](https://reader035.vdocuments.us/reader035/viewer/2022062401/5a4d1b307f8b9ab05999b037/html5/thumbnails/28.jpg)
RMS Distance vs CG Steps
We are here
![Page 29: Co-processors for speeding up drug design algorithms Advait Jain Priyanka Jindal Pulkit Gambhir Under the guidance of: Prof. M Balakrishnan Prof. Kolin](https://reader035.vdocuments.us/reader035/viewer/2022062401/5a4d1b307f8b9ab05999b037/html5/thumbnails/29.jpg)
Comparison with new metric
![Page 30: Co-processors for speeding up drug design algorithms Advait Jain Priyanka Jindal Pulkit Gambhir Under the guidance of: Prof. M Balakrishnan Prof. Kolin](https://reader035.vdocuments.us/reader035/viewer/2022062401/5a4d1b307f8b9ab05999b037/html5/thumbnails/30.jpg)
Tasks completed this semester
Software Profiling
  No. of calls
  Cache misses
  Effect of parameters
Control Flow Analysis
  Flow Diagram
  Data parallelism
Floating point precision requirement
Exploring H/W Options
  Platform Selection
  S/W H/W Partitioning
![Page 31: Co-processors for speeding up drug design algorithms Advait Jain Priyanka Jindal Pulkit Gambhir Under the guidance of: Prof. M Balakrishnan Prof. Kolin](https://reader035.vdocuments.us/reader035/viewer/2022062401/5a4d1b307f8b9ab05999b037/html5/thumbnails/31.jpg)
Ongoing work + next semester
Setting up building blocks
  ZBT RAM access
  PCI Interface
  Floating Point Unit
Combining blocks for a simple implementation
Refining the implementation
  Multiple compute engines
  Multiple PCI cards