InCoB 2009, Singapore. René Hussong et al. Highly accelerated feature detection in mass spectrometry data using modern graphics processing units. Bioinformatics 25 (2009). Junior Research Group for Protein-Protein-Interactions and Computational Proteomics, Saarland University, Saarbruecken, Germany


Page 1

InCoB 2009, Singapore

René Hussong et al.

Highly accelerated feature detection in mass spectrometry data using modern graphics processing units

Bioinformatics 25 (2009).

Junior Research Group for Protein-Protein-Interactions and Computational Proteomics

Saarland University, Saarbruecken, Germany

Page 2

Outline

∙ Introduction & Motivation
  - The Differential Proteomics Pipeline

∙ Computational Proteomics
  - Signal Processing and Feature Detection
  - The Isotope Wavelet Transform

∙ Parallelization via GPUs

∙ Results & Discussion

Page 3

The Differential Proteomics Pipeline

Two samples (e.g. sick vs. healthy) → Mass Spectrometer → List of differentially expressed proteins

Applications range from basic pharmaceutical research through medical diagnostics and therapy to biotechnology and engineering.

Page 4

Principle of Biological Mass Spectrometry

Proteins are digested into peptides; the peptides are ionized and accelerated.

[Figure, built up over pages 4 to 6: fingerprint spectrum (intensity vs. mass); the annotation added on pages 5 and 6 reads “mass of a single neutron”]

Page 7

(Simple) Feature Finding

Typically done by simple thresholding.

Needs additional preprocessing steps, e.g.:
- Baseline elimination (e.g. by morphological filters)
- Noise reduction and/or smoothing

(Mostly) needs resampling

Needs additional postprocessing steps, e.g.:
- Peak clustering (so-called “deconvolution”)
- Model fitting, charge prediction
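The simple approach above can be sketched in a few lines. This is a minimal illustration, not the pipeline used in the talk: it assumes a morphological opening (running minimum followed by running maximum) as the baseline filter, a moving average for smoothing, and a fixed intensity threshold. All function names, window sizes and the toy spectrum are hypothetical.

```python
def erode(values, half_width):
    """Running minimum over a window of 2 * half_width + 1 samples."""
    n = len(values)
    return [min(values[max(0, i - half_width):min(n, i + half_width + 1)])
            for i in range(n)]


def dilate(values, half_width):
    """Running maximum over the same window."""
    n = len(values)
    return [max(values[max(0, i - half_width):min(n, i + half_width + 1)])
            for i in range(n)]


def remove_baseline(intensities, half_width):
    """Estimate the baseline by a morphological opening and subtract it."""
    baseline = dilate(erode(intensities, half_width), half_width)
    return [y - b for y, b in zip(intensities, baseline)]


def smooth(intensities, half_width=1):
    """Moving-average smoothing."""
    n = len(intensities)
    out = []
    for i in range(n):
        window = intensities[max(0, i - half_width):min(n, i + half_width + 1)]
        out.append(sum(window) / len(window))
    return out


def pick_peaks(intensities, threshold):
    """Indices of local maxima whose intensity exceeds the threshold."""
    return [i for i in range(1, len(intensities) - 1)
            if intensities[i] > threshold
            and intensities[i] >= intensities[i - 1]
            and intensities[i] > intensities[i + 1]]


# Toy spectrum: sloping baseline plus two narrow peaks at indices 10 and 30.
raw = [5.0 + 0.1 * i for i in range(40)]
for center, height in ((10, 10.0), (30, 8.0)):
    raw[center - 1] += 0.4 * height
    raw[center] += height
    raw[center + 1] += 0.4 * height

corrected = smooth(remove_baseline(raw, half_width=4))
peaks = pick_peaks(corrected, threshold=2.0)
print(peaks)
```

Even in this toy version there are three tunable values (baseline window, smoothing window, threshold), and the picked peaks would still need clustering and charge assignment afterwards.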

Page 8

The Isotope Wavelet Transform

Convolution with a kernel function

- by construction robust against noise and baseline artifacts
- also acts as a filter for chemical noise
- simultaneously predicts the charge state
- needs no explicit resampling
- only a single parameter (threshold)
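Why a convolution can be robust against baseline artifacts is easiest to see with a kernel whose samples sum to zero: any constant offset in the signal then cancels out of the transform. The sketch below illustrates only this property. It uses a plain sampled sine as a stand-in, not the isotope wavelet itself, and all names and numbers are hypothetical.

```python
import math


def make_kernel(n_taps=32, period=8):
    """Sampled sine over whole periods, shifted so the taps sum to zero."""
    taps = [math.sin(2.0 * math.pi * k / period) for k in range(n_taps)]
    mean = sum(taps) / n_taps
    return [t - mean for t in taps]


def transform(signal, kernel):
    """Correlate the signal with the kernel at every valid start position."""
    n, m = len(signal), len(kernel)
    return [sum(signal[i + k] * kernel[k] for k in range(m))
            for i in range(n - m + 1)]


kernel = make_kernel()

# Four decaying peaks, one kernel period (8 samples) apart.
clean = [0.0] * 100
for j, height in enumerate((1.0, 0.8, 0.4, 0.1)):
    clean[42 + 8 * j] = height

# The same signal sitting on a large constant baseline.
shifted = [y + 50.0 for y in clean]

response_clean = transform(clean, kernel)
response_shifted = transform(shifted, kernel)

max_difference = max(abs(a - b)
                     for a, b in zip(response_clean, response_shifted))
best_start = max(range(len(response_clean)), key=response_clean.__getitem__)
print(max_difference, best_start)
```

The two responses agree up to floating-point rounding, and the strongest response sits where the kernel maxima line up with the peaks, so a single threshold on the transform is enough to locate the pattern.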

Page 9

Results – Myoglobin PMF

Page 10

Parallelization via CUDA

[Figure, built up over pages 10 to 16: labels “T0”, “b-th data point”, “Tn”]
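One plausible reading of the figure labels is a data-parallel layout in which each thread computes the transform value for one data point b, reading the shared input and writing only its own output. The sketch below shows that decomposition with Python threads as a stand-in for CUDA threads. It is an assumption about the scheme, not the authors' kernel, and it will not reproduce GPU performance; the kernel taps and signal are hypothetical.

```python
import math
from concurrent.futures import ThreadPoolExecutor

# Hypothetical kernel taps; a plain sine stands in for the real kernel.
KERNEL = [math.sin(2.0 * math.pi * k / 8) for k in range(32)]


def transform_at(signal, b):
    """Transform value at the b-th data point; only reads the input."""
    return sum(signal[b + k] * KERNEL[k]
               for k in range(len(KERNEL)) if b + k < len(signal))


def transform_sequential(signal):
    """Reference: one loop over all data points."""
    return [transform_at(signal, b) for b in range(len(signal))]


def transform_parallel(signal, workers=4):
    """Same result, with every data point handled as an independent task."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda b: transform_at(signal, b),
                             range(len(signal))))


signal = [math.exp(-0.5 * ((i - 60) / 3.0) ** 2) for i in range(128)]
sequential = transform_sequential(signal)
parallel = transform_parallel(signal)
print(sequential == parallel)
```

Because no task writes to data that another task reads, no synchronization is needed between data points, which is the property that makes this kind of transform a good fit for a GPU.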

Page 17

Parallelization via CUDA and TBB

2x NVIDIA Tesla C870 via Intel Threading Building Blocks

1x NVIDIA Tesla C870

1x CPU 2.3 GHz

>200x speedup

Page 18

Open Issues – Future Work

∙ Solutions for machine-specific ‘artifacts’, e.g.
  - Tailing effects in TOF analyzers
  - Severe mass discretization in high-resolution data

∙ Separating overlapping patterns

∙ Tests for MSn spectra
  - Refined averagine model

GPU solutions

Page 19

Availability: OpenMS

∙ An open source C++ library for mass spectrometry

∙ Designed for “users” as well as for “developers”

∙ TOPP
  - “The OpenMS proteomics pipeline”
  - suite of independent software tools
  - includes file handling / conversion
  - peak picking and feature detection
  - includes visualizer TOPPView…

http://www.openms.de

Page 20

References

Hussong, R, Gregorius, B, Tholey, A, and Hildebrandt, A (2009). Highly accelerated feature detection in proteomics data sets using modern graphics processing units. Bioinformatics 25.

Schulz-Trieglaff, O, Hussong, R, Gröpl, C, Leinenbach, A, Hildebrandt, A, Huber, C, and Reinert, K (2008). Computational Quantification of Peptides from LC-MS Data. Journal of Computational Biology 15(7).

Sturm, M, Bertsch, A, Gröpl, C, Hildebrandt, A, Hussong, R, Lange, E, Pfeifer, N, Schulz-Trieglaff, O, Zerck, A, Reinert, K, and Kohlbacher, O (2008). OpenMS - An open-source software framework for mass spectrometry. BMC Bioinformatics 9(163).

Hussong, R, Tholey, A, and Hildebrandt, A (2007). Efficient Analysis of Mass Spectrometry Data Using the Isotope Wavelet. In: COMPLIFE 2007: The Third International Symposium on Computational Life Science. American Institute of Physics (AIP) 940.

Schulz-Trieglaff, O, Hussong, R, Gröpl, C, Hildebrandt, A, and Reinert, K (2007). A Fast and Accurate Algorithm for the Quantification of Peptides from Mass Spectrometry Data. In: Proceedings of the Eleventh Annual International Conference on Research in Computational Molecular Biology (RECOMB). Lecture Notes in Bioinformatics (LNBI) 4453.

Page 21

The Isotope Wavelet Transform

Convolution with a kernel function

[Figure: kernel function for charge state 1 at mass 1000 Da and at mass 2000 Da]

- by construction robust against noise and baseline artifacts
- also acts as a filter for chemical noise
- simultaneously predicts the charge state
- needs no explicit resampling
- only a single parameter (threshold)

Page 22

The Isotope Wavelet Transform

MS spectrum (charge state 3)

charge-1-transform

charge-2-transform

charge-3-transform
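The panel titles suggest that the same spectrum is transformed once per candidate charge and that the matching charge gives the strongest response. A minimal sketch of that idea follows, under loud assumptions: a cosine comb with period 1/z stands in for the isotope wavelet, the isotope spacing is taken as exactly 1/z Th, and the spectrum is synthetic. None of the names or numbers come from the slides.

```python
import math


def gaussian_pattern(mz_grid, mono_mz, charge, heights, sigma=0.03):
    """Profile spectrum of one isotope pattern: peaks spaced 1/charge apart."""
    out = []
    for x in mz_grid:
        y = 0.0
        for k, h in enumerate(heights):
            center = mono_mz + k / charge
            y += h * math.exp(-0.5 * ((x - center) / sigma) ** 2)
        out.append(y)
    return out


def charge_transform(mz_grid, intensities, position, charge, width=1.5):
    """Response of a cosine comb with period 1/charge anchored at `position`.

    Works directly on the given m/z values, so no resampling is required.
    """
    total = 0.0
    for x, y in zip(mz_grid, intensities):
        t = x - position
        if 0.0 <= t < width:
            total += y * math.cos(2.0 * math.pi * charge * t)
    return total


mz_grid = [500.0 + 0.01 * i for i in range(300)]
spectrum = gaussian_pattern(mz_grid, mono_mz=500.5, charge=3,
                            heights=[1.0, 0.8, 0.4, 0.15, 0.05])

responses = {z: charge_transform(mz_grid, spectrum, 500.5, z)
             for z in (1, 2, 3)}
best_charge = max(responses, key=responses.get)
print(best_charge)
```

For the charge-3 pattern, the charge-1 and charge-2 combs hit the peaks with alternating positive and negative weights and nearly cancel, while the charge-3 comb adds all peaks constructively.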

Page 23

The Sweep Line Idea

[Figure: LC-MS map with axes m/z [Th] and RT [s]]

2 additional parameters:
- RT_cutoff
- RT_interleave
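The slide names the two parameters without defining them. The sketch below assumes that RT_cutoff is the minimum number of scans in which a pattern must be seen to count as a feature, and that RT_interleave is the maximum number of consecutive scans in which it may be missing. Both meanings, the m/z tolerance `mz_tol` and the data layout are assumptions for illustration only.

```python
def sweep_line(scans, rt_cutoff, rt_interleave, mz_tol=0.02):
    """Group per-scan hits into features while sweeping along RT.

    scans: list of scans in RT order, each a list of m/z hits.
    Returns a list of (mean_mz, first_scan, last_scan) tuples.
    """
    open_boxes = []  # each box: {"mzs": [...], "first": int, "last": int}
    features = []

    def close(box):
        # Keep only patterns seen in at least rt_cutoff scans (assumed meaning).
        if len(box["mzs"]) >= rt_cutoff:
            mean_mz = sum(box["mzs"]) / len(box["mzs"])
            features.append((mean_mz, box["first"], box["last"]))

    for i, hits in enumerate(scans):
        # Close boxes missing for more than rt_interleave scans in a row.
        still_open = []
        for box in open_boxes:
            if i - box["last"] - 1 > rt_interleave:
                close(box)
            else:
                still_open.append(box)
        open_boxes = still_open

        for mz in hits:
            for box in open_boxes:
                if box["last"] < i and abs(box["mzs"][-1] - mz) <= mz_tol:
                    box["mzs"].append(mz)
                    box["last"] = i
                    break
            else:
                open_boxes.append({"mzs": [mz], "first": i, "last": i})

    for box in open_boxes:
        close(box)
    return features


scans = [
    [500.30],          # scan 0
    [500.30, 612.80],  # scan 1
    [],                # scan 2: pattern missing once
    [500.31],          # scan 3
    [500.30],          # scan 4
    [], [], [],        # scans 5 to 7
]
features = sweep_line(scans, rt_cutoff=3, rt_interleave=1)
print(features)
```

In this toy input the hit near 500.30 Th survives a one-scan gap and becomes a feature spanning scans 0 to 4, while the isolated hit at 612.80 Th is discarded.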

Page 24

Open Issues – Future Work

[Figure: digest; fingerprint (charge state 1), intensity vs. mass/charge; fragment fingerprint]

Page 25

Open Issues – Future Work

∙ Separating overlapping patterns

Page 26

The Retention Time

Page 27

Results – 2D noisy data

Page 28

The Adaptive Isotope Wavelet Kernel

[Formula not preserved in the transcript]

- [symbol not preserved] denotes the Heaviside step function
- λ(m) is a linear function fit to the averagine model
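Since the formula itself did not survive, the following is only a sketch of a kernel with the two named ingredients: a Heaviside cutoff for negative offsets and a mass-dependent λ(m). It assumes a Poisson-like envelope with parameter λ(m) multiplied by a cosine whose maxima fall on the isotope peak positions. The exact published kernel may differ in phase and normalization, and the coefficients of λ(m) and the peak spacing are illustrative placeholders.

```python
import math

ISOTOPE_SPACING = 1.003  # assumed spacing of isotope peaks in Da (about one neutron)


def averagine_lambda(mass, slope=0.000635, intercept=0.120):
    """Linear model lambda(m); both coefficients are placeholder values."""
    return intercept + slope * mass


def heaviside(t):
    """Heaviside step function: 0 for negative t, 1 otherwise."""
    return 1.0 if t >= 0.0 else 0.0


def isotope_kernel(t, mass, charge):
    """Kernel value at offset t (in Th) from the monoisotopic position."""
    if heaviside(t) == 0.0:
        return 0.0
    lam = averagine_lambda(mass)
    u = t * charge / ISOTOPE_SPACING  # offset in units of isotope peak spacings
    # Continuous extension of the Poisson weight exp(-lam) * lam**u / Gamma(u + 1).
    envelope = math.exp(-lam + u * math.log(lam) - math.lgamma(u + 1.0))
    return math.cos(2.0 * math.pi * u) * envelope


for mass in (1000.0, 2000.0):
    mono = isotope_kernel(0.0, mass, charge=1)
    first = isotope_kernel(ISOTOPE_SPACING, mass, charge=1)
    print(mass, round(mono, 3), round(first, 3))
```

With these placeholder coefficients the monoisotopic lobe dominates at 1000 Da, while at 2000 Da the second lobe is the larger one, which is the kind of mass dependence an adaptive kernel is meant to capture.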