indiana university bloomington - rick weber, david d...

17
Accelerating Tandem MS Protein Database Searches Using OpenCL Rick Weber, David D. Jenkins, Nicholas Lineback, Robert Hettich, Gregory D. Peterson

Upload: others

Post on 09-Oct-2020

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Indiana University Bloomington - Rick Weber, David D ...salsahpc.indiana.edu/ECMLS2012/slides/ECMLS12...Specmaster OpenCL Myrimatch implementation Runs correctly on AMD, Nvidia GPUs;

Accelerating Tandem MS Protein Database Searches Using OpenCL

Rick Weber, David D. Jenkins, Nicholas Lineback, Robert Hettich, Gregory D. Peterson

Page 2: Indiana University Bloomington - Rick Weber, David D ...salsahpc.indiana.edu/ECMLS2012/slides/ECMLS12...Specmaster OpenCL Myrimatch implementation Runs correctly on AMD, Nvidia GPUs;

Programming devices the intractable way

Page 3: Indiana University Bloomington - Rick Weber, David D ...salsahpc.indiana.edu/ECMLS2012/slides/ECMLS12...Specmaster OpenCL Myrimatch implementation Runs correctly on AMD, Nvidia GPUs;

Programming devices with OpenCL

Page 4: Indiana University Bloomington - Rick Weber, David D ...salsahpc.indiana.edu/ECMLS2012/slides/ECMLS12...Specmaster OpenCL Myrimatch implementation Runs correctly on AMD, Nvidia GPUs;

Tandem MS/MS experiment

Collect a sample

Clean it

Try to remove things that aren’t proteins

Dissolve proteins into peptidesTrypsin

Shoot mixture through mass spectrometer

Mass spectrometer gives ~100k scans containing m/Z and intensities

Page 5: Indiana University Bloomington - Rick Weber, David D ...salsahpc.indiana.edu/ECMLS2012/slides/ECMLS12...Specmaster OpenCL Myrimatch implementation Runs correctly on AMD, Nvidia GPUs;

Peptide searching with database

Page 6: Indiana University Bloomington - Rick Weber, David D ...salsahpc.indiana.edu/ECMLS2012/slides/ECMLS12...Specmaster OpenCL Myrimatch implementation Runs correctly on AMD, Nvidia GPUs;

Search algorithms

Mostly differ in the scoring algorithm

Consequently, different execution rates

Sequest

Cross correlation

Most widely used

X! Tandem

Dot product

Myrimatch

Multi-Variate Hypergeometric (MVH) distribution

Page 7: Indiana University Bloomington - Rick Weber, David D ...salsahpc.indiana.edu/ECMLS2012/slides/ECMLS12...Specmaster OpenCL Myrimatch implementation Runs correctly on AMD, Nvidia GPUs;

Specmaster

OpenCL Myrimatch implementation

Runs correctly on AMD, Nvidia GPUs; AMD, Intel CPUs

Not tested anything else

Designed from ground up for speed

Myrimatch already multi-threaded

No 400x speedup using GPU

10x is more reasonable

Page 8: Indiana University Bloomington - Rick Weber, David D ...salsahpc.indiana.edu/ECMLS2012/slides/ECMLS12...Specmaster OpenCL Myrimatch implementation Runs correctly on AMD, Nvidia GPUs;

Algorithm design

Make peptides from proteins sequentially on CPU

Needs to be done in OpenCL (future work)

Amdahl’s law

Perform search using OpenCL devices

Each workgroup processes different MS2+ scan

Each work item searches a different candidate

Page 9: Indiana University Bloomington - Rick Weber, David D ...salsahpc.indiana.edu/ECMLS2012/slides/ECMLS12...Specmaster OpenCL Myrimatch implementation Runs correctly on AMD, Nvidia GPUs;

Search

Binary search for candidates

Precursor masses within tolerance for assumed charge state

Binary search for ions

Look for peaks theoretically predicted for peptide’s amino acid in multiple charge states

Compute MVH as a function of number of found peaks by intensity class

Page 10: Indiana University Bloomington - Rick Weber, David D ...salsahpc.indiana.edu/ECMLS2012/slides/ECMLS12...Specmaster OpenCL Myrimatch implementation Runs correctly on AMD, Nvidia GPUs;

OpenCL and the lack of free lunch

Little performance portability

Different devices have:

Different memories

Different SIMD sizes

Different branch penalties

Different execution models

Page 11: Indiana University Bloomington - Rick Weber, David D ...salsahpc.indiana.edu/ECMLS2012/slides/ECMLS12...Specmaster OpenCL Myrimatch implementation Runs correctly on AMD, Nvidia GPUs;

Memory speeds

__constant __local __global (cached)

__global(raw)

E5-2680 518GB/s 425GB/s 469GB/s 51GB/s

GTX 480 1.29TB/s 1.3TB/s 588GB/s 152GB/s

Radeon 7970 7TB/s 3.6TB/s 1.7TB/s 213GB/s

Page 12: Indiana University Bloomington - Rick Weber, David D ...salsahpc.indiana.edu/ECMLS2012/slides/ECMLS12...Specmaster OpenCL Myrimatch implementation Runs correctly on AMD, Nvidia GPUs;

Preferred work group sizes

CPU: 1

AMD GPU: multiple of 32 or 64

Nvidia GPU: multiple of 32 or 64

Page 13: Indiana University Bloomington - Rick Weber, David D ...salsahpc.indiana.edu/ECMLS2012/slides/ECMLS12...Specmaster OpenCL Myrimatch implementation Runs correctly on AMD, Nvidia GPUs;

Peformance (as of time of publication)

Page 14: Indiana University Bloomington - Rick Weber, David D ...salsahpc.indiana.edu/ECMLS2012/slides/ECMLS12...Specmaster OpenCL Myrimatch implementation Runs correctly on AMD, Nvidia GPUs;

“Future work” already completed

Portable device specific tuning

Still running with same kernel code on all devices!

Preprocessor abuse

Kernel apathetic to work group size

Heterogeneous scan scoring

Use every device in CPU to score

Up to 90% of peak strong-scaled throughput using 32 cores and 3 Radeon 7970s

Page 15: Indiana University Bloomington - Rick Weber, David D ...salsahpc.indiana.edu/ECMLS2012/slides/ECMLS12...Specmaster OpenCL Myrimatch implementation Runs correctly on AMD, Nvidia GPUs;

Actual future work

Post translational modifications

When generating peptides, create each modified variant of pepties on CPU

Easy (Don’t need to modify kernels)

Probably slow

Take existing unmodified list and modify on the fly on the device

Hard due to lack of recursion in OpenCL

Amortizes sequential execution and PCIe transfers

Page 16: Indiana University Bloomington - Rick Weber, David D ...salsahpc.indiana.edu/ECMLS2012/slides/ECMLS12...Specmaster OpenCL Myrimatch implementation Runs correctly on AMD, Nvidia GPUs;

Acknowledgements

The University of Tennessee

NSF and SCALE-IT

IntelFor donating the research machine

Page 17: Indiana University Bloomington - Rick Weber, David D ...salsahpc.indiana.edu/ECMLS2012/slides/ECMLS12...Specmaster OpenCL Myrimatch implementation Runs correctly on AMD, Nvidia GPUs;

Questions?