Molecular Shape Searching on GPUs: A Brave New World
DESCRIPTION
Shape is a fundamental three-dimensional molecular property and a powerful descriptor for molecular comparison and similarity assessment; similarity in shape has proven to be a very effective predictor of similarity in biology. As such, shape-based virtual screening has become an integral part of computational drug discovery, due to both its speed and efficacy. OpenEye’s recent port of its shape similarity application, ROCS, to the GPU has resulted in a virtual screening tool of unprecedented power – FastROCS. FastROCS’ speed allows it to perform large-scale calculations of a kind inaccessible in the past, and has accelerated more routine shape searching to the point that it has become competitive with more traditional, but less effective, two-dimensional methods. Go through the slides to learn more. Try GPUs for free here: www.Nvidia.com/GPUTestDrive
TRANSCRIPT
![Page 1: Molecular Shape Searching on GPUs: A Brave New World](https://reader034.vdocuments.us/reader034/viewer/2022051818/54bac6c14a7959f6498b45c3/html5/thumbnails/1.jpg)
FastROCS: What does it mean to be “fast”?
OpenEye Scientific Software Brian Cole
March 26, 2013 © 2013 OpenEye Scientific Software
![Page 2: Molecular Shape Searching on GPUs: A Brave New World](https://reader034.vdocuments.us/reader034/viewer/2022051818/54bac6c14a7959f6498b45c3/html5/thumbnails/2.jpg)
FastROCS and the “Chasm”
OpenEye Scientific Software Brian Cole
![Page 3: Molecular Shape Searching on GPUs: A Brave New World](https://reader034.vdocuments.us/reader034/viewer/2022051818/54bac6c14a7959f6498b45c3/html5/thumbnails/3.jpg)
ROCS: Rapid Overlay of Chemical Structures
![Page 4: Molecular Shape Searching on GPUs: A Brave New World](https://reader034.vdocuments.us/reader034/viewer/2022051818/54bac6c14a7959f6498b45c3/html5/thumbnails/4.jpg)
LeadHopper
![Page 5: Molecular Shape Searching on GPUs: A Brave New World](https://reader034.vdocuments.us/reader034/viewer/2022051818/54bac6c14a7959f6498b45c3/html5/thumbnails/5.jpg)
And then you wait…
![Page 6: Molecular Shape Searching on GPUs: A Brave New World](https://reader034.vdocuments.us/reader034/viewer/2022051818/54bac6c14a7959f6498b45c3/html5/thumbnails/6.jpg)
What is FastROCS?
[Bar chart: shape overlays per second, CPU versus GPU; higher is better]
![Page 7: Molecular Shape Searching on GPUs: A Brave New World](https://reader034.vdocuments.us/reader034/viewer/2022051818/54bac6c14a7959f6498b45c3/html5/thumbnails/7.jpg)
What is FastROCS?
[Bar chart, log scale from 1 to 1,000,000: shape overlays per second, CPU versus GPU; higher is better]
![Page 8: Molecular Shape Searching on GPUs: A Brave New World](https://reader034.vdocuments.us/reader034/viewer/2022051818/54bac6c14a7959f6498b45c3/html5/thumbnails/8.jpg)
What is FastROCS?
[Bar chart, linear scale from 0 to 600,000: shape overlays per second, CPU versus GPU; higher is better]
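The throughput axis on these charts (shape overlays per second) translates directly into how long a search takes. The sketch below shows that arithmetic in Python. It assumes one overlay per database conformer per query, and the database size and both rates are hypothetical values chosen only to illustrate the calculation; none of them are measurements from the talk.

```python
def search_time_seconds(n_conformers, overlays_per_second):
    """Estimate elapsed time for one query, assuming one overlay per conformer."""
    if overlays_per_second <= 0:
        raise ValueError("overlays_per_second must be positive")
    return n_conformers / overlays_per_second


# Hypothetical database size and throughputs (assumptions, not data from the slides).
n_conformers = 10_000_000
for label, rate in (("slow device", 1_000.0), ("fast device", 500_000.0)):
    seconds = search_time_seconds(n_conformers, rate)
    print(f"{label}: {seconds:,.0f} s ({seconds / 3600:.2f} h)")
```

Because the estimate is a simple ratio, a device with N times the throughput cuts the wait by the same factor N.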
![Page 9: Molecular Shape Searching on GPUs: A Brave New World](https://reader034.vdocuments.us/reader034/viewer/2022051818/54bac6c14a7959f6498b45c3/html5/thumbnails/9.jpg)
But I want it now!
[Log-log plot: elapsed time in seconds (1 to 100,000) versus number of cores/GPUs (1 to 100), for ROCS and FastROCS; lower is better]
![Page 10: Molecular Shape Searching on GPUs: A Brave New World](https://reader034.vdocuments.us/reader034/viewer/2022051818/54bac6c14a7959f6498b45c3/html5/thumbnails/10.jpg)
Riding Moore’s Law
[Bar chart: shape overlays per second (0 to 2,000,000) on the C1060, C2050, C2075, C2090, K10, and K20; higher is better]
![Page 11: Molecular Shape Searching on GPUs: A Brave New World](https://reader034.vdocuments.us/reader034/viewer/2022051818/54bac6c14a7959f6498b45c3/html5/thumbnails/11.jpg)
ROCS user base
• Every Pharma R&D
• Many BioTechs
• Many Universities
• National Labs and Research Centers
• Other software companies
![Page 12: Molecular Shape Searching on GPUs: A Brave New World](https://reader034.vdocuments.us/reader034/viewer/2022051818/54bac6c14a7959f6498b45c3/html5/thumbnails/12.jpg)
Licenses by Year
[Chart: ROCS and FastROCS licenses by year, 2009 to 2012; higher is better]
![Page 13: Molecular Shape Searching on GPUs: A Brave New World](https://reader034.vdocuments.us/reader034/viewer/2022051818/54bac6c14a7959f6498b45c3/html5/thumbnails/13.jpg)
Licenses by Year (Linear Scale)
[Chart, linear scale: ROCS and FastROCS licenses by year, 2009 to 2012; annotations: "15%" and "Pharmageddon"]
![Page 14: Molecular Shape Searching on GPUs: A Brave New World](https://reader034.vdocuments.us/reader034/viewer/2022051818/54bac6c14a7959f6498b45c3/html5/thumbnails/14.jpg)
All ROCS users (linear scale)
[Chart, linear scale: Academics, ROCS, and FastROCS by year, 2009 to 2012; annotation: "3%"]
![Page 15: Molecular Shape Searching on GPUs: A Brave New World](https://reader034.vdocuments.us/reader034/viewer/2022051818/54bac6c14a7959f6498b45c3/html5/thumbnails/15.jpg)
Technology Adoption Lifecycle
[Diagram: technology adoption lifecycle curve with segments of 2.5%, 13.5%, 34%, 34%, and 16%; FastROCS marked on the curve]
![Page 16: Molecular Shape Searching on GPUs: A Brave New World](https://reader034.vdocuments.us/reader034/viewer/2022051818/54bac6c14a7959f6498b45c3/html5/thumbnails/16.jpg)
What’s in the “chasm”?
• “ROCS is already fast enough”
• “The results aren’t bitwise comparable”
• “There’s nothing else to run on the GPU”
• “GPUs are different”
GTC!
Some other time…
![Page 17: Molecular Shape Searching on GPUs: A Brave New World](https://reader034.vdocuments.us/reader034/viewer/2022051818/54bac6c14a7959f6498b45c3/html5/thumbnails/17.jpg)
FastROCS Quick Start
• Ctrl-Alt-F1 (to switch to a non-X-server terminal)
• Log in as root
• /sbin/init 3 (to turn off the X server)
• ./NVIDIA-Linux-x86_64-285.05.09.run
• reboot
• ./cuda.sh (to give /dev/nvidia* the correct permissions)
• tar -xzf fastrocs-1.3.1-RHEL5-x64-OpenCL-1.1-CUDA-4.1.tar.gz
• openeye/bin/ShapeDatabaseServer.py database.oeb.gz
• openeye/bin/ShapeDatabaseClient.py localhost:8080 query.sdf out.sdf
![Page 18: Molecular Shape Searching on GPUs: A Brave New World](https://reader034.vdocuments.us/reader034/viewer/2022051818/54bac6c14a7959f6498b45c3/html5/thumbnails/18.jpg)
ROCS Quick Start
• tar -xzf ROCS-3.1.1-RHEL5-x64.tar.gz
• openeye/bin/rocs query.sdf database.oeb.gz
Still a barrier to entry to work around!
![Page 19: Molecular Shape Searching on GPUs: A Brave New World](https://reader034.vdocuments.us/reader034/viewer/2022051818/54bac6c14a7959f6498b45c3/html5/thumbnails/19.jpg)
This is even worse!
fastrocs-1.3.1-RHEL5-x64-OpenCL-1.1-CUDA-4.1.tar.gz
NVidia OpenCL binaries are tightly locked to a particular driver version
![Page 20: Molecular Shape Searching on GPUs: A Brave New World](https://reader034.vdocuments.us/reader034/viewer/2022051818/54bac6c14a7959f6498b45c3/html5/thumbnails/20.jpg)
Worthwhile to upgrade
[Bar chart: conformers per second (0 to 800,000) on a C2050 with the 260 driver versus the 295 driver; annotation: "11%"; higher is better]
![Page 21: Molecular Shape Searching on GPUs: A Brave New World](https://reader034.vdocuments.us/reader034/viewer/2022051818/54bac6c14a7959f6498b45c3/html5/thumbnails/21.jpg)
Needed for new hardware
[Bar chart: conformers per second (0 to 1,200,000) on a C2050 versus an M2090, both with the 295 driver; higher is better]
![Page 22: Molecular Shape Searching on GPUs: A Brave New World](https://reader034.vdocuments.us/reader034/viewer/2022051818/54bac6c14a7959f6498b45c3/html5/thumbnails/22.jpg)
Scalability between drivers (4x C2050)
[Line plot: speedup (single-GPU time / multi-GPU time) versus number of GPUs (1 to 4), for the ideal case, the 260 driver, and the 295 driver; higher is better]
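The speedup on the vertical axis of these scaling plots is the single-GPU time divided by the multi-GPU time, and the "Ideal" line corresponds to a speedup equal to the GPU count. A minimal Python sketch of that calculation follows. The timings are hypothetical placeholders rather than measurements from the talk, and `parallel_efficiency` is a helper name introduced here for illustration.

```python
def speedup(single_gpu_time, multi_gpu_time):
    """Speedup as defined on the plot axis: single-GPU time / multi-GPU time."""
    if multi_gpu_time <= 0:
        raise ValueError("multi_gpu_time must be positive")
    return single_gpu_time / multi_gpu_time


def parallel_efficiency(single_gpu_time, multi_gpu_time, n_gpus):
    """Fraction of ideal (linear) scaling achieved; 1.0 matches the ideal line."""
    return speedup(single_gpu_time, multi_gpu_time) / n_gpus


# Hypothetical wall-clock timings in seconds, keyed by GPU count (not from the talk).
timings = {1: 100.0, 2: 55.0, 3: 40.0, 4: 32.0}

for n_gpus, elapsed in sorted(timings.items()):
    s = speedup(timings[1], elapsed)
    e = parallel_efficiency(timings[1], elapsed, n_gpus)
    print(f"{n_gpus} GPU(s): speedup {s:.2f}x, efficiency {e:.0%}")
```

An efficiency well below 1.0 at higher GPU counts is the kind of shortfall that host-to-device transfer overhead can produce.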
![Page 23: Molecular Shape Searching on GPUs: A Brave New World](https://reader034.vdocuments.us/reader034/viewer/2022051818/54bac6c14a7959f6498b45c3/html5/thumbnails/23.jpg)
Really bad for 8x M2090
[Plot: speedup (single-GPU time / multi-GPU time, 0 to 8) versus number of GPUs (1 to 8); higher is better]
![Page 24: Molecular Shape Searching on GPUs: A Brave New World](https://reader034.vdocuments.us/reader034/viewer/2022051818/54bac6c14a7959f6498b45c3/html5/thumbnails/24.jpg)
Ways to transfer to device
• CL_MEM_USE_HOST_PTR
  – kernelBuf = clCreateBuffer(CL_MEM_USE_HOST_PTR)
• CL_MEM_ALLOC_HOST_PTR|CL_MEM_COPY_HOST_PTR
  – kernelBuf = clCreateBuffer(CL_MEM_ALLOC_HOST_PTR|CL_MEM_COPY_HOST_PTR)
• CL_MEM_ALLOC_HOST_PTR
  – kernelBuf = clCreateBuffer(CL_MEM_ALLOC_HOST_PTR) (cacheable)
  – ptr = clEnqueueMapBuffer(kernelBuf, CL_MAP_WRITE)
  – memcpy(ptr, data)
  – clEnqueueUnmapMemObject(ptr)
• clEnqueueMapBuffer
  – kernelBuf = clCreateBuffer() (cacheable)
  – ptr = clEnqueueMapBuffer(kernelBuf, CL_MAP_WRITE)
  – memcpy(ptr, data)
  – clEnqueueUnmapMemObject(ptr)
• clEnqueueWriteBuffer
  – kernelBuf = clCreateBuffer() (cacheable)
  – clEnqueueWriteBuffer(kernelBuf, data)
• oclCopyCompute
  – pinnedBuf = clCreateBuffer(CL_MEM_ALLOC_HOST_PTR|CL_MEM_READ_WRITE) (cacheable)
  – pinnedPtr = clEnqueueMapBuffer(pinnedBuf, CL_MAP_WRITE) (cacheable)
  – memcpy(pinnedPtr, data)
  – kernelBuf = clCreateBuffer() (cacheable)
  – clEnqueueWriteBuffer(kernelBuf, pinnedPtr)
![Page 25: Molecular Shape Searching on GPUs: A Brave New World](https://reader034.vdocuments.us/reader034/viewer/2022051818/54bac6c14a7959f6498b45c3/html5/thumbnails/25.jpg)
Ways to transfer from device
• CL_MEM_ALLOC_HOST_PTR
  – kernelBuf = clCreateBuffer(CL_MEM_ALLOC_HOST_PTR) (cacheable)
  – ptr = clEnqueueMapBuffer(kernelBuf, CL_MAP_READ)
  – memcpy(data, ptr)
  – clEnqueueUnmapMemObject(ptr)
• clEnqueueMapBuffer
  – kernelBuf = clCreateBuffer() (cacheable)
  – ptr = clEnqueueMapBuffer(kernelBuf, CL_MAP_READ)
  – memcpy(data, ptr)
  – clEnqueueUnmapMemObject(ptr)
• clEnqueueReadBuffer
  – kernelBuf = clCreateBuffer() (cacheable)
  – clEnqueueReadBuffer(kernelBuf, data)
• oclCopyCompute
  – pinnedBuf = clCreateBuffer(CL_MEM_ALLOC_HOST_PTR|CL_MEM_READ_WRITE) (cacheable)
  – pinnedPtr = clEnqueueMapBuffer(pinnedBuf, CL_MAP_READ) (cacheable)
  – kernelBuf = clCreateBuffer() (cacheable)
  – clEnqueueReadBuffer(kernelBuf, pinnedPtr)
  – memcpy(data, pinnedPtr)
![Page 26: Molecular Shape Searching on GPUs: A Brave New World](https://reader034.vdocuments.us/reader034/viewer/2022051818/54bac6c14a7959f6498b45c3/html5/thumbnails/26.jpg)
FastROCS scalability across 8x M2070
[Bar chart: speedup (sequential time / parallel time, 0 to 9) versus number of GPUs utilized (1 to 8), five bars per GPU count]
![Page 27: Molecular Shape Searching on GPUs: A Brave New World](https://reader034.vdocuments.us/reader034/viewer/2022051818/54bac6c14a7959f6498b45c3/html5/thumbnails/27.jpg)
Lessons from the mess
• clEnqueueWriteBuffer > clEnqueueMapBuffer
• clEnqueueMapBuffer >> clEnqueueReadBuffer
• CL_MEM_* constants aren’t worth the effort
![Page 28: Molecular Shape Searching on GPUs: A Brave New World](https://reader034.vdocuments.us/reader034/viewer/2022051818/54bac6c14a7959f6498b45c3/html5/thumbnails/28.jpg)
CUDA?
• Serious customers will only use NVidia cards
• Pinned memory
• Better support for binaries and compatibility
• CUDA support >> OpenCL support
![Page 29: Molecular Shape Searching on GPUs: A Brave New World](https://reader034.vdocuments.us/reader034/viewer/2022051818/54bac6c14a7959f6498b45c3/html5/thumbnails/29.jpg)
FastROCS CUDA port
[Bar chart: conformers per second (0 to 3,000,000) for OpenCL, CUDA, and CUDA-pinned, on 2x C2075, 2x C2090, and 2x K20; higher is better]
![Page 30: Molecular Shape Searching on GPUs: A Brave New World](https://reader034.vdocuments.us/reader034/viewer/2022051818/54bac6c14a7959f6498b45c3/html5/thumbnails/30.jpg)
CUDA Scaling?
[Line plot: conformers per second (0 to 8,000,000) versus number of individual K10 GPUs (1 to 8), for CUDA, OpenCL, and the ideal case; higher is better. Note: each K10 has 2 physical GPUs on the board.]
![Page 31: Molecular Shape Searching on GPUs: A Brave New World](https://reader034.vdocuments.us/reader034/viewer/2022051818/54bac6c14a7959f6498b45c3/html5/thumbnails/31.jpg)
CUDA vs OpenCL: Ding Ding!
• Portability vs Innovation
• NVidia vs Intel and AMD
• Open vs Proprietary
• Customers don’t care…
![Page 32: Molecular Shape Searching on GPUs: A Brave New World](https://reader034.vdocuments.us/reader034/viewer/2022051818/54bac6c14a7959f6498b45c3/html5/thumbnails/32.jpg)
ROCS Implementations
• We only care a little…
• Fortran code (1995)
• C code (1999)
• C++ wrapper code (2003)
• OpenCL code (2009)
• CUDA code (2012)
• C++ thread-safe code (2013)
![Page 33: Molecular Shape Searching on GPUs: A Brave New World](https://reader034.vdocuments.us/reader034/viewer/2022051818/54bac6c14a7959f6498b45c3/html5/thumbnails/33.jpg)
OpenEye Software
• Lots of Software – 14 products – 13 software libraries
• C++ (no SIMD) – 2.5 million lines
• Python – 416 thousand lines
• Java – 63 thousand lines
• C# – 38 thousand lines
![Page 34: Molecular Shape Searching on GPUs: A Brave New World](https://reader034.vdocuments.us/reader034/viewer/2022051818/54bac6c14a7959f6498b45c3/html5/thumbnails/34.jpg)
The People
[Chart: values of 20, 12, and 10 for the categories Programmers, Hardcore Scripter, and Other stuff]
• GPGPU = ½ of a developer – Only 2.5% of development effort
![Page 35: Molecular Shape Searching on GPUs: A Brave New World](https://reader034.vdocuments.us/reader034/viewer/2022051818/54bac6c14a7959f6498b45c3/html5/thumbnails/35.jpg)
Technology Adoption Lifecycle
[Diagram: technology adoption lifecycle curve with segments of 2.5%, 13.5%, 34%, 34%, and 16%; OpenEye GPGPU development marked on the curve]
![Page 36: Molecular Shape Searching on GPUs: A Brave New World](https://reader034.vdocuments.us/reader034/viewer/2022051818/54bac6c14a7959f6498b45c3/html5/thumbnails/36.jpg)
LinkedIn skills
[Chart annotation: "2.2%"]
![Page 37: Molecular Shape Searching on GPUs: A Brave New World](https://reader034.vdocuments.us/reader034/viewer/2022051818/54bac6c14a7959f6498b45c3/html5/thumbnails/37.jpg)
Technology Adoption Lifecycle
[Diagram: technology adoption lifecycle curve with segments of 2.5%, 13.5%, 34%, 34%, and 16%; GPGPU development marked on the curve]
![Page 38: Molecular Shape Searching on GPUs: A Brave New World](https://reader034.vdocuments.us/reader034/viewer/2022051818/54bac6c14a7959f6498b45c3/html5/thumbnails/38.jpg)
I Believe…
• GPGPU computing can become ubiquitous…
• By expressing parallelism everywhere…
• We can make it easy for our customers…
  – Pre-installed in every operating system
  – Integrated seamlessly into every language
  – Then eventually becoming the CPU
![Page 39: Molecular Shape Searching on GPUs: A Brave New World](https://reader034.vdocuments.us/reader034/viewer/2022051818/54bac6c14a7959f6498b45c3/html5/thumbnails/39.jpg)
Acknowledgements
• Nikolai Sakharnykh (NVidia)
• Dave Mullaly (HP)
• Exxact Computing
![Page 40: Molecular Shape Searching on GPUs: A Brave New World](https://reader034.vdocuments.us/reader034/viewer/2022051818/54bac6c14a7959f6498b45c3/html5/thumbnails/40.jpg)
Father of “ROCS”
Andrew Grant April 28th 1963 - December 29th 2012
![Page 41: Molecular Shape Searching on GPUs: A Brave New World](https://reader034.vdocuments.us/reader034/viewer/2022051818/54bac6c14a7959f6498b45c3/html5/thumbnails/41.jpg)
![Page 42: Molecular Shape Searching on GPUs: A Brave New World](https://reader034.vdocuments.us/reader034/viewer/2022051818/54bac6c14a7959f6498b45c3/html5/thumbnails/42.jpg)
Dude, where’s my color?
[Bar chart: DUD average AUC (0 to 0.9) for ROCS and FastROCS, shape only versus with color]
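For readers unfamiliar with the metric on this chart, the AUC for one target can be read as the probability that a randomly chosen active scores higher than a randomly chosen decoy, and the chart averages it over targets. The Python sketch below computes it that way. The function names and the scores are hypothetical and are not taken from the talk or from any OpenEye API.

```python
def roc_auc(active_scores, decoy_scores):
    """AUC as the fraction of (active, decoy) pairs ranked correctly; ties count half."""
    if not active_scores or not decoy_scores:
        raise ValueError("need at least one active and one decoy score")
    wins = 0.0
    for a in active_scores:
        for d in decoy_scores:
            if a > d:
                wins += 1.0
            elif a == d:
                wins += 0.5
    return wins / (len(active_scores) * len(decoy_scores))


def average_auc(per_target_scores):
    """Mean AUC over targets; each item is an (active_scores, decoy_scores) pair."""
    aucs = [roc_auc(actives, decoys) for actives, decoys in per_target_scores]
    return sum(aucs) / len(aucs)


# Hypothetical similarity scores for two targets (illustration only).
targets = [
    ([0.9, 0.8], [0.85, 0.1]),
    ([0.7, 0.6], [0.3, 0.2]),
]
print(f"average AUC: {average_auc(targets):.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of actives from decoys.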
![Page 43: Molecular Shape Searching on GPUs: A Brave New World](https://reader034.vdocuments.us/reader034/viewer/2022051818/54bac6c14a7959f6498b45c3/html5/thumbnails/43.jpg)
ROCS vs FastROCS Histogram
[Histogram: number of targets (0 to 12) by Kendall tau correlation coefficient, binned from 0.10 to 1.00]
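The Kendall tau coefficient on the horizontal axis measures how similarly two methods rank the same set of molecules: 1.0 means identical ordering and -1.0 means fully reversed ordering. A minimal Python sketch follows. It implements the tau-a variant, which ignores tie corrections; the slides do not say which variant was used, so that choice is an assumption, and the score lists are hypothetical.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """Kendall tau-a: (concordant - discordant) pairs over all pairs, no tie correction."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    if n < 2:
        raise ValueError("need at least two observations")
    concordant = 0
    discordant = 0
    for i, j in combinations(range(n), 2):
        product = (x[i] - x[j]) * (y[i] - y[j])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


# Hypothetical scores for the same four molecules from two methods (illustration only).
method_a_scores = [1.0, 2.0, 3.0, 4.0]
method_b_scores = [1.0, 3.0, 2.0, 4.0]
print(f"tau-a: {kendall_tau_a(method_a_scores, method_b_scores):.3f}")
```

With one swapped pair out of six, the example gives five concordant pairs and one discordant pair.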