![Page 1: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/1.jpg)
PeptideProphet: Validation of Peptide Assignments to MS/MS Spectra
Andrew Keller
![Page 2: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/2.jpg)
Outline
• Need to validate peptide assignments to MS/MS spectra
• Statistical approach to validation
• Running PeptideProphet software
• Interpreting results of PeptideProphet
• Exercises
![Page 3: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/3.jpg)
Most search results are wrong
• [M+2H]2+/[M+3H]3+ uncertainty (LCQ)
• Non-peptide noise
• Incomplete database, e.g. post-translational modifications
• Limitation of database search algorithm
![Page 4: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/4.jpg)
Validation of peptide assignments
In the past, a majority of analysis time was devoted to identifying the minority of correct search results from the majority of incorrect results
Required manual judgment
[Pie chart: Peptide Identification Effort. Sample prep 8.3%, instrument time 8.3%, database search 8.3%, manual validation 75.0%]
![Page 5: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/5.jpg)
Results of 50 Spectrum Test
Consistency among ‘Experts’:
Of 50 search results, 9 had < 67% ‘publishable’, ‘borderline’, or ‘not pub’
Consistency of Individual ‘Experts’:
Of 10 duplicated search results, on average
0.4 were assessed ‘publishable’/‘not publishable’
2 were assessed inconsistently
The true validity of the search results is known.
Accuracy of ‘Experts’:
of 511 total ‘publishable’: 95% correct
of 102 total ‘borderline’: 49% correct
of 387 total ‘not publishable’: 14% correct
Even ‘Experts’ are not dependable!
![Page 6: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/6.jpg)
Need for objective criteria
Manual scrutiny of search results is not practical for large datasets common to high throughput proteomics
As an alternative to relying on human judgment, many research groups employ search scores and properties of the assigned peptides to discriminate between correct and incorrect results
![Page 7: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/7.jpg)
Traditional filtering criteria
Each SEQUEST search result has a:
Xcorr, dCn, Sp, NTT (number of tryptic termini)
Accept all results that satisfy:
[M+2H]2+: Xcorr ≥ 2, dCn ≥ 0.1, Sp ≤ 50
[M+3H]3+: Xcorr ≥ 2.5, dCn ≥ 0.1, Sp ≤ 50
![Page 8: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/8.jpg)
Traditional filtering criteria
Each SEQUEST search result has a:
Xcorr, dCn, Sp, NTT (number of tryptic termini)
Accept all results that satisfy:
[M+2H]2+: Xcorr ≥ 2, dCn ≥ 0.1, Sp ≤ 50, NTT ≥ 1
[M+3H]3+: Xcorr ≥ 2.5, dCn ≥ 0.1, Sp ≤ 50, NTT ≥ 1
![Page 9: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/9.jpg)
Traditional filtering criteria
Each SEQUEST search result has a:
Xcorr, dCn, Sp, NTT (number of tryptic termini)
Accept all results that satisfy:
[M+2H]2+: Xcorr ≥ 2, dCn ≥ 0.1, Sp ≤ 50 (NTT ≥ 1)
[M+3H]3+: Xcorr ≥ 2.5, dCn ≥ 0.1, Sp ≤ 50 (NTT ≥ 1)

[M+2H]2+: Xcorr ≥ 2, dCn ≥ 0.1, Sp ≤ 50
[M+3H]3+: Xcorr ≥ 2, dCn ≥ 0.1, Sp ≤ 50
![Page 10: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/10.jpg)
Traditional filtering criteria
Each SEQUEST search result has a:
Xcorr, dCn, Sp, NTT (number of tryptic termini)
Accept all results that satisfy:
[M+2H]2+: Xcorr ≥ 2, dCn ≥ 0.1, Sp ≤ 50 (NTT ≥ 1)
[M+3H]3+: Xcorr ≥ 2.5, dCn ≥ 0.1, Sp ≤ 50 (NTT ≥ 1)

[M+2H]2+: Xcorr ≥ 2, dCn ≥ 0.1, Sp ≤ 50, NTT ≥ 1
[M+3H]3+: Xcorr ≥ 2, dCn ≥ 0.1, Sp ≤ 50, NTT ≥ 1
![Page 11: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/11.jpg)
Traditional filtering criteria
Each SEQUEST search result has a:
Xcorr, dCn, Sp, NTT (number of tryptic termini)
Accept all results that satisfy:
[M+2H]2+: Xcorr ≥ 2, dCn ≥ 0.1, Sp ≤ 50 (NTT ≥ 1)
[M+3H]3+: Xcorr ≥ 2.5, dCn ≥ 0.1, Sp ≤ 50 (NTT ≥ 1)

[M+2H]2+: Xcorr ≥ 2, dCn ≥ 0.1, Sp ≤ 50 (NTT ≥ 1)
[M+3H]3+: Xcorr ≥ 2, dCn ≥ 0.1, Sp ≤ 50 (NTT ≥ 1)
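A hard-threshold filter of this kind can be sketched in a few lines. The sketch below uses the first rule set quoted on the slide; the `SearchResult` class, its field names, and the `require_ntt` switch are hypothetical and only serve the illustration.

```python
from dataclasses import dataclass


@dataclass
class SearchResult:
    """One SEQUEST search result (hypothetical container for this sketch)."""
    charge: int    # precursor ion charge (2 or 3)
    xcorr: float
    dcn: float
    sp: float      # Sp value as used on the slide (accepted when <= 50)
    ntt: int       # number of tryptic termini (0, 1, or 2)


# Minimum Xcorr per precursor charge, from the first rule set on the slide.
MIN_XCORR = {2: 2.0, 3: 2.5}


def passes_filter(result: SearchResult, require_ntt: bool = False) -> bool:
    """Return True only if every threshold is satisfied at once."""
    if result.charge not in MIN_XCORR:
        return False
    ok = (
        result.xcorr >= MIN_XCORR[result.charge]
        and result.dcn >= 0.1
        and result.sp <= 50
    )
    if require_ntt:
        ok = ok and result.ntt >= 1
    return ok
```

Note that a result with Xcorr = 1.99 is rejected no matter how high its dCn is, because each score is tested in isolation.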
![Page 12: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/12.jpg)
Problems with traditional filtering
• Different research groups use different thresholds
• Combines scores in unsatisfactory manner: What if Xcorr is just below its threshold, but dCn is far above?
• Divides data into correct and incorrect - no in between
• Unknown error rates (fraction of data passing filter that are incorrect)
• Unknown sensitivity (fraction of correct results passing filter)
• Appropriate threshold may depend on database, mass spectrometer type, sample, etc.
![Page 13: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/13.jpg)
Statistical Approach
Use search scores and properties of the assigned peptides to compute a probability that each search result is correct
Desirable model properties:
• Accurate
• High power to discriminate correct and incorrect results
• Robust
![Page 14: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/14.jpg)
Training dataset
Want dataset of SEQUEST search results for which the true correct and incorrect peptide assignments are known
Sample of 18 control proteins (bovine, yeast, bacterial)
Collect ~40,000 MS/MS spectra, and search using SEQUEST vs. a Drosophila database appended with sequences of 18 control proteins and common sample contaminants
![Page 15: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/15.jpg)
Training dataset
Peptides corresponding to Drosophila proteins are incorrect
Peptides corresponding to 18 control proteins or contaminants are correct*
![Page 16: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/16.jpg)
Combine multiple SEQUEST scores into single discriminant score F
Want to combine Xcorr, dCn, and Sp in a linear manner to produce a new score, F, that maximally separates the correct and incorrect search results in the training dataset:
F = c0 + c1 * Xcorr + c2 * dCn + c3 * Sp
Actually first transform Xcorr and Sp:
[M+2H]2+: Xcorr′ = log(Xcorr) / log(2 * peplength)
[M+3H]3+: Xcorr′ = log(Xcorr) / log(4 * peplength)
Sp′ = log(Sp)
![Page 17: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/17.jpg)
Derive Discriminant Function
Derive F for each precursor ion charge separately:
F = c0 + c1 * Xcorr′ + c2 * dCn + c3 * Sp′
For [M+3H]3+ search results in training dataset,
c0 = -2.0
c1 = 10.68
c2 = 11.26
c3 = -0.2
F = -2.0 + 10.68 * Xcorr′ + 11.26 * dCn - 0.2 * Sp′
![Page 18: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/18.jpg)
Compute Discriminant Score
F = -2.0 + 10.68 * Xcorr′ + 11.26 * dCn - 0.2 * Sp′
Example:
Peptide K.ARPEFTLPVHFYGR.V
Xcorr = 3.29
dCn = 0.233
Sp = 3
Precursor Ion Charge = 3
Peplength = 14
Xcorr′ = log(3.29) / log(56) = 0.296
Sp′ = log(3) = 1.10
F = -2.0 + 10.68 * 0.296 + 11.26 * 0.233 - 0.2 * 1.10 = 3.56
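The worked example can be reproduced with a short script. This is a minimal sketch that uses natural logarithms and reads the Xcorr′ denominator as log(4 × peplength), which is what the example's log 56 for a 14-residue peptide implies; the function name is hypothetical.

```python
import math

# Coefficients quoted on the slide for [M+3H]3+ results in the training dataset.
C0, C1, C2, C3 = -2.0, 10.68, 11.26, -0.2


def discriminant_score_3plus(xcorr: float, dcn: float, sp: float, peplength: int) -> float:
    """Discriminant score F for a triply charged precursor (sketch)."""
    xcorr_t = math.log(xcorr) / math.log(4 * peplength)  # Xcorr' for [M+3H]3+
    sp_t = math.log(sp)                                  # Sp'
    return C0 + C1 * xcorr_t + C2 * dcn + C3 * sp_t


# Peptide K.ARPEFTLPVHFYGR.V from the slide.
f_example = discriminant_score_3plus(xcorr=3.29, dcn=0.233, sp=3, peplength=14)
print(f"F = {f_example:.2f}")
```

Run on the slide's inputs, the script agrees with the quoted F of 3.56 to two decimal places.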
![Page 19: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/19.jpg)
Discriminant Score Distributions
[Figure: no. of spectra vs. discriminant score (F) for correct (+) and incorrect (-) search results; training dataset, [M+2H]2+ spectra]
![Page 20: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/20.jpg)
Computing probabilities from discriminant score distributions
[Figure: no. of spectra vs. discriminant score (F) for correct (+) and incorrect (-) search results, with the point where p = 0.5 marked]
Probability of being correct, given discriminant score Fobs, is:
$$p = \frac{\text{Number of correct search results with } F_{obs}}{\text{Total number of search results with } F_{obs}}$$
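When the true labels are known, as in the training dataset, this ratio can be estimated directly by counting results whose scores fall near Fobs. The sketch below does that with a simple window around the observed score; the function name and the `bin_width` parameter are assumptions made for illustration.

```python
def empirical_probability(correct_scores, incorrect_scores, f_obs, bin_width=0.5):
    """Fraction of results with a score near f_obs that are correct.

    Counts scores inside [f_obs - bin_width/2, f_obs + bin_width/2).
    Returns None when no result falls in the window.
    """
    lo = f_obs - bin_width / 2
    hi = f_obs + bin_width / 2
    n_correct = sum(lo <= f < hi for f in correct_scores)
    n_incorrect = sum(lo <= f < hi for f in incorrect_scores)
    total = n_correct + n_incorrect
    return n_correct / total if total else None
```

Counting like this becomes noisy where few spectra share a score, which is one motivation for fitting smooth distributions to the two populations instead.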
![Page 21: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/21.jpg)
Computing probabilities from discriminant score distributions
[Figure: no. of spectra vs. discriminant score (F), with incorrect (-) results modeled as a Gamma distribution and correct (+) results modeled as a Normal distribution; the point where p = 0.5 is marked]
Probability of being correct, given discriminant score Fobs, is:
$$p = \frac{\mathrm{Normal}_{\mu,\sigma}(F_{obs}) \cdot \text{Total correct}}{\mathrm{Normal}_{\mu,\sigma}(F_{obs}) \cdot \text{Total correct} + \mathrm{Gamma}_{\alpha,\beta,zero}(F_{obs}) \cdot \text{Total incorrect}}$$
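The formula can be sketched with standard-library density functions. Several details here are assumptions rather than facts from the slides: β is treated as a scale parameter, `zero` as the shift at which the Gamma distribution starts, and any parameter values passed in are illustrative rather than fitted.

```python
import math


def normal_pdf(x, mu, sigma):
    """Normal density with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


def gamma_pdf(x, alpha, beta, zero):
    """Gamma density with shape alpha and scale beta, shifted to start at `zero`."""
    y = x - zero
    if y <= 0:
        return 0.0
    return y ** (alpha - 1) * math.exp(-y / beta) / (math.gamma(alpha) * beta ** alpha)


def prob_correct(f_obs, mu, sigma, alpha, beta, zero, n_correct, n_incorrect):
    """Probability that a result with score f_obs is correct (two-class mixture)."""
    pos = normal_pdf(f_obs, mu, sigma) * n_correct
    neg = gamma_pdf(f_obs, alpha, beta, zero) * n_incorrect
    return pos / (pos + neg) if pos + neg > 0 else 0.0
```

With illustrative parameters such as `mu=3, sigma=1.5, alpha=3, beta=0.5, zero=-3` and far more incorrect than correct results, high scores come out close to 1 and low scores close to 0.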
![Page 22: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/22.jpg)
Employing peptide properties
Properties of the assigned peptides, in addition to search scores, are useful information for distinguishing correct and incorrect results
For example, in unconstrained SEQUEST searches of MS/MS spectra collected from trypsinized samples, a majority of correctly assigned peptides have 2 tryptic termini (cleavage sites preceded by K or R), whereas a majority of incorrectly assigned peptides have 0 tryptic termini
![Page 23: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/23.jpg)
Number of tryptic termini (NTT)
NTT can equal 0, 1, or 2:
G.HVEQLDSSS.D NTT = 0
K.HVEQLDSSS.D NTT = 1
G.HVEQLDSSR.D NTT = 1
K.HVEQLDSSR.D NTT = 2
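The four cases above follow from a simple rule: one terminus counts as tryptic if the residue before the peptide is K or R, and the other if the peptide itself ends in K or R. A minimal sketch, assuming the `prev.PEPTIDE.next` notation used on the slide and ignoring protein termini and any proline exception:

```python
def count_tryptic_termini(assignment: str) -> int:
    """Count tryptic termini for an assignment written as 'prev.PEPTIDE.next'."""
    prev_aa, peptide, _next_aa = assignment.split(".")
    ntt = 0
    if prev_aa in ("K", "R"):      # N-terminal cleavage site is preceded by K or R
        ntt += 1
    if peptide[-1] in ("K", "R"):  # C-terminal cleavage site is preceded by K or R
        ntt += 1
    return ntt


for example in ("G.HVEQLDSSS.D", "K.HVEQLDSSS.D", "G.HVEQLDSSR.D", "K.HVEQLDSSR.D"):
    print(example, "NTT =", count_tryptic_termini(example))
```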
![Page 24: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/24.jpg)
Number of tryptic termini (NTT)
For the same value of F, assigned peptides with higher NTT values are more likely to be correct
Example: training dataset
Correct: 0.03 NTT=0, 0.28 NTT=1, 0.69 NTT=2
Incorrect: 0.80 NTT=0, 0.19 NTT=1, 0.01 NTT=2
Probability of being correct, given discriminant score Fobs with NTT=2, is:
$$p = \frac{\mathrm{Normal}_{\mu,\sigma}(F_{obs}) \cdot \text{Total correct} \cdot 0.69}{\mathrm{Normal}_{\mu,\sigma}(F_{obs}) \cdot \text{Total correct} \cdot 0.69 + \mathrm{Gamma}_{\alpha,\beta,zero}(F_{obs}) \cdot \text{Total incorrect} \cdot 0.01}$$
Fobs: p = 0.5 without NTT becomes p = 0.99 using NTT
![Page 25: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/25.jpg)
Number of tryptic termini (NTT)
For the same value of F, assigned peptides with lower NTT values are less likely to be correct
Example: training dataset
Correct: 0.03 NTT=0, 0.28 NTT=1, 0.69 NTT=2
Incorrect: 0.80 NTT=0, 0.19 NTT=1, 0.01 NTT=2
Probability of being correct, given discriminant score Fobs with NTT=0, is:
$$p = \frac{\mathrm{Normal}_{\mu,\sigma}(F_{obs}) \cdot \text{Total correct} \cdot 0.03}{\mathrm{Normal}_{\mu,\sigma}(F_{obs}) \cdot \text{Total correct} \cdot 0.03 + \mathrm{Gamma}_{\alpha,\beta,zero}(F_{obs}) \cdot \text{Total incorrect} \cdot 0.80}$$
Fobs: p = 0.5 without NTT becomes p = 0.04 using NTT
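The two NTT examples can be checked numerically. Because the probability without NTT already fixes the ratio of the Normal and Gamma terms, it is enough to reweight p and 1 - p by the NTT fractions quoted for the training dataset. The function name below is hypothetical.

```python
# Fraction of correct / incorrect training results with each NTT value (from the slide).
NTT_CORRECT = {0: 0.03, 1: 0.28, 2: 0.69}
NTT_INCORRECT = {0: 0.80, 1: 0.19, 2: 0.01}


def adjust_for_ntt(p_without_ntt: float, ntt: int) -> float:
    """Update a score-only probability with the NTT of the assigned peptide."""
    pos = p_without_ntt * NTT_CORRECT[ntt]
    neg = (1.0 - p_without_ntt) * NTT_INCORRECT[ntt]
    return pos / (pos + neg) if pos + neg > 0 else 0.0


for ntt in (2, 0):
    print(f"NTT={ntt}: p = 0.5 becomes p = {adjust_for_ntt(0.5, ntt):.2f}")
```

This reproduces the slide values: 0.69 / (0.69 + 0.01) rounds to 0.99 for NTT=2, and 0.03 / (0.03 + 0.80) rounds to 0.04 for NTT=0.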
![Page 26: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/26.jpg)
Additional peptide properties
Number of missed tryptic cleavages (NMC)
Mass difference between precursor ion and peptide
Presence of light or heavy cysteine (ICAT)
Presence of N-glyc motif (N-glycosylation capture)
Calculated pI (FFE)
These are incorporated in the same way as NTT above, assuming independence of peptide properties and search scores among correct and incorrect results
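Under that independence assumption, each extra property simply contributes one more multiplicative factor to the correct and incorrect terms. The sketch below shows the pattern for an arbitrary list of properties; the function and argument names are hypothetical, and the fractions would come from the learned distributions.

```python
def combine_evidence(like_correct, like_incorrect, n_correct, n_incorrect, property_fractions):
    """Probability of being correct from a score likelihood plus independent properties.

    property_fractions: iterable of (fraction_among_correct, fraction_among_incorrect),
    one pair per observed property value (e.g. NTT, NMC, mass difference).
    """
    pos = like_correct * n_correct
    neg = like_incorrect * n_incorrect
    for frac_correct, frac_incorrect in property_fractions:
        pos *= frac_correct
        neg *= frac_incorrect
    return pos / (pos + neg) if pos + neg > 0 else 0.0
```

A property whose value is equally common among correct and incorrect results leaves the probability unchanged, since its two factors cancel.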
![Page 27: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/27.jpg)
Computed Probabilities
Given training dataset distributions of F, NTT, NMC, Massdiff, ICAT, N-glyc, and pI among correct and incorrect search results, the probability of any search result with Fobs, NTTobs, NMCobs, Massdiffobs, ICATobs, N-glycobs, and pIobs can be computed as described above, with terms for each piece of information
Accurate
Discriminating
![Page 28: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/28.jpg)
Robust Model
One cannot rely on the training dataset distributions of F, NTT, NMC, Massdiff, ICAT, N-glyc, and pI among correct and incorrect search results
These distributions are expected to vary depending on:
• Database used for search
• Mass spectrometer
• Spectrum quality
• Sample preparation and purity
![Page 29: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/29.jpg)
EM Algorithm
PeptideProphet learns the distributions of F and peptide properties among correct and incorrect search results in each dataset
It then uses the learned distributions to compute probabilities that each search result is correct
EM algorithm: unsupervised learning method that iteratively estimates the distributions given probabilities that each search result is correct, and then computes those probabilities given the distributions
Initial settings help guide algorithm to good solution
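The alternation between the two steps can be illustrated on synthetic scores. The sketch below is a deliberate simplification and not the PeptideProphet implementation: both populations are modeled as Normal distributions (the slides use a Gamma distribution for incorrect results), peptide properties are left out, and all names and the simulated data are hypothetical.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


def simulate_scores(n_correct, n_incorrect, seed=0):
    """Synthetic discriminant scores: correct ~ N(4, 1), incorrect ~ N(-1, 1)."""
    rng = random.Random(seed)
    return ([rng.gauss(4.0, 1.0) for _ in range(n_correct)]
            + [rng.gauss(-1.0, 1.0) for _ in range(n_incorrect)])


def em_two_normals(scores, init_correct_mean, init_incorrect_mean, n_iter=50):
    """Fit a two-component Normal mixture by EM; return (params, probabilities)."""
    mu_c, mu_i = init_correct_mean, init_incorrect_mean  # initial settings guide the fit
    sd_c = sd_i = 1.0
    prior_c = 0.5  # initial guess for the fraction of correct results
    probs = []
    for _ in range(n_iter):
        # E-step: probability that each result is correct, given the distributions.
        probs = []
        for f in scores:
            pos = prior_c * normal_pdf(f, mu_c, sd_c)
            neg = (1.0 - prior_c) * normal_pdf(f, mu_i, sd_i)
            probs.append(pos / (pos + neg) if pos + neg > 0 else 0.5)
        # M-step: re-estimate the distributions, weighting scores by those probabilities.
        w_c = sum(probs)
        w_i = len(scores) - w_c
        if w_c < 1e-9 or w_i < 1e-9:
            break  # one class has vanished; stop rather than divide by zero
        mu_c = sum(p * f for p, f in zip(probs, scores)) / w_c
        mu_i = sum((1.0 - p) * f for p, f in zip(probs, scores)) / w_i
        var_c = sum(p * (f - mu_c) ** 2 for p, f in zip(probs, scores)) / w_c
        var_i = sum((1.0 - p) * (f - mu_i) ** 2 for p, f in zip(probs, scores)) / w_i
        sd_c = max(math.sqrt(var_c), 1e-3)
        sd_i = max(math.sqrt(var_i), 1e-3)
        prior_c = w_c / len(scores)
    params = {"mu_correct": mu_c, "sd_correct": sd_c,
              "mu_incorrect": mu_i, "sd_incorrect": sd_i,
              "fraction_correct": prior_c}
    return params, probs
```

For example, `em_two_normals(simulate_scores(300, 700), 3.0, 0.0)` starts from rough guesses for the two means and refines them using only the unlabeled scores.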
![Page 30: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/30.jpg)
E-M Algorithm learns test data score distributions
[Figure series: no. of spectra vs. discriminant search score, showing the learned distributions of incorrect and correct peptide assignments for NTT = 0, NTT = 1, and NTT = 2 at E-M iterations 0, 1, 2, 3, 4, and 7]
![Page 37: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/37.jpg)
Accuracy of the Model
[Figure: observed probability vs. computed probability, comparing the model curve with the ideal line; test data: A. Keller et al. OMICS 6, 207 (2002)]
Example: of 100 spectra with computed p ~ 0.9, 90% of them (90) should be correct; the observed probability is around 0.9
Model is accurate
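An accuracy check of this kind can be sketched as a calibration table: group results by computed probability and compare the mean computed probability in each group with the fraction actually correct. The sketch assumes a test set with known answers; the function name and the binning scheme are hypothetical.

```python
def calibration_table(probs, is_correct, n_bins=10):
    """Return rows of (mean computed p, observed fraction correct, count) per bin."""
    bins = [[0, 0.0, 0] for _ in range(n_bins)]  # count, sum of p, number correct
    for p, ok in zip(probs, is_correct):
        idx = min(int(p * n_bins), n_bins - 1)   # p = 1.0 goes into the last bin
        bins[idx][0] += 1
        bins[idx][1] += p
        bins[idx][2] += 1 if ok else 0
    rows = []
    for count, p_sum, n_ok in bins:
        if count:
            rows.append((p_sum / count, n_ok / count, count))
    return rows
```

For an accurate model the first two columns of every row are close to each other, which corresponds to points lying near the ideal diagonal.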
![Page 38: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/38.jpg)
Discriminating Power of Computed Probabilities
test data: A. Keller et al. OMICS 6, 207 (2002)
Sensitivity: fraction of all correct results passing filter
Error: fraction of all results passing filter that are incorrect
Ideal Spot
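Both quantities can be computed for any minimum probability threshold. The sketch below gives an observed version, which needs known answers, and a predicted version that relies only on the computed probabilities by treating each p as the expected chance of being correct. The predicted version is an assumption about how such estimates can be formed, not a statement of the tool's internals; the function names are hypothetical.

```python
def observed_rates(probs, is_correct, min_prob):
    """(sensitivity, error) at a threshold, using known correct/incorrect labels."""
    passing = [ok for p, ok in zip(probs, is_correct) if p >= min_prob]
    total_correct = sum(1 for ok in is_correct if ok)
    pass_correct = sum(1 for ok in passing if ok)
    sensitivity = pass_correct / total_correct if total_correct else 0.0
    error = (len(passing) - pass_correct) / len(passing) if passing else 0.0
    return sensitivity, error


def predicted_rates(probs, min_prob):
    """(sensitivity, error) at a threshold, estimated from the probabilities alone."""
    passing = [p for p in probs if p >= min_prob]
    expected_correct_all = sum(probs)     # expected number of correct results overall
    expected_correct_pass = sum(passing)  # expected number correct among those passing
    sensitivity = expected_correct_pass / expected_correct_all if expected_correct_all else 0.0
    error = (len(passing) - expected_correct_pass) / len(passing) if passing else 0.0
    return sensitivity, error
```

Sweeping `min_prob` from 0 to 1 traces out the trade-off between the two: a higher threshold lowers the error but also the sensitivity.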
![Page 39: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/39.jpg)
Discriminating Power of Computed Probabilities
[Figure: observed and predicted sensitivity and error vs. min. probability threshold; test data: A. Keller et al. OMICS 6, 207 (2002)]
Sensitivity: fraction of all correct results passing filter
Error: fraction of all results passing filter that are incorrect
![Page 40: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/40.jpg)
Discriminating Power : Example p ≥ 0.9
[Figure: observed and predicted sensitivity and error vs. min. probability threshold; test data: A. Keller et al. OMICS 6, 207 (2002)]
At p ≥ 0.9: sensitivity 0.83, error 0.01
Sensitivity: fraction of all correct results passing filter
Error: fraction of all results passing filter that are incorrect
![Page 41: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/41.jpg)
Discriminating Power : Example p ≥ 0.5
[Figure: observed and predicted sensitivity and error vs. min. probability threshold; test data: A. Keller et al. OMICS 6, 207 (2002)]
At p ≥ 0.5: sensitivity 0.9, error 0.07
Sensitivity: fraction of all correct results passing filter
Error: fraction of all results passing filter that are incorrect
![Page 42: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/42.jpg)
Can experts discriminate better than model?
20 of the test spectra were assigned probabilities close to 0.5 (in complete test dataset)
Of those, 10 were correct; on average: 51% ‘publishable’, 24% ‘borderline’, 25% ‘not pub’
and 10 were incorrect; on average: 11% ‘publishable’, 16% ‘borderline’, 74% ‘not pub’
![Page 43: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/43.jpg)
Getting started with PeptideProphet
Input: SEQUEST summary html files (file.html)
Interact merges files together into interact.htm, then PeptideProphet runs model, computes probabilities, and writes probabilities as first column
Combine together runs that are similar (sample, database, search constraints, mass spectrometer)
![Page 44: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/44.jpg)
Getting started with PeptideProphet
![Page 45: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/45.jpg)
PeptideProphet Results
![Page 46: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/46.jpg)
PeptideProphet Results: Model Summary
![Page 47: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/47.jpg)
Reasonable Learned Discriminant Score Distributions
![Page 48: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/48.jpg)
Suspicious Looking Learned Discriminant Score Distributions
![Page 49: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/49.jpg)
PeptideProphet Results: Model Summary
![Page 50: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/50.jpg)
PeptideProphet Results: Model Summary
Good
Not so good
![Page 51: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/51.jpg)
PeptideProphet Results: Model Summary
![Page 52: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/52.jpg)
PeptideProphet Results: Model Summary
![Page 53: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/53.jpg)
PeptideProphet Results: Predicted Numbers of Correct and Incorrect Peptides
![Page 54: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/54.jpg)
PeptideProphet [M+2H]2+ vs [M+3H]3+
Precursor Ions
![Page 55: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/55.jpg)
PeptideProphet Results: Incomplete Analysis
![Page 56: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/56.jpg)
Sort Data by Computed Probability
![Page 57: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/57.jpg)
Some Options for Interact
Rename Output File (e.g. to interact-noicat.htm):
![Page 58: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/58.jpg)
Some Options for Interact
• no PeptideProphet analysis
• alternative minimum probability
• sample enzyme other than trypsin
![Page 59: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/59.jpg)
Use of Supplemental Discriminating Information
Use additional discriminating information, including ICAT or N-glyc, when relevant
PeptideProphet automatically uses ICAT information when it thinks appropriate. Nevertheless, you can explicitly set whether or not ICAT information is utilized
![Page 60: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/60.jpg)
DeltaCn* Example
![Page 61: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/61.jpg)
DeltaCn* Options
There are three ways asterisked deltas can be treated by PeptideProphet:
1. Penalize (the default option, sets asterisked deltas to 0)
2. Leave alone (suitable for the context of homologues)
3. Exclude (the most conservative, assigns probability 0)
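The three treatments listed above can be sketched as a small function. This is only an illustration of the logic stated on the slide, not PeptideProphet's actual code: the names `StarredDeltaPolicy` and `treat_delta` are hypothetical, and the way the result is handed to the model is an assumption.

```python
from enum import Enum
from typing import Tuple


class StarredDeltaPolicy(Enum):
    """Hypothetical names for the three options on the slide."""
    PENALIZE = "penalize"  # default: asterisked deltas are set to 0
    LEAVE = "leave"        # asterisked deltas are used unchanged
    EXCLUDE = "exclude"    # result is assigned probability 0


def treat_delta(delta_cn: float, starred: bool,
                policy: StarredDeltaPolicy = StarredDeltaPolicy.PENALIZE
                ) -> Tuple[float, bool]:
    """Return (delta value to use, whether probability is forced to 0)."""
    if not starred or policy is StarredDeltaPolicy.LEAVE:
        return delta_cn, False
    if policy is StarredDeltaPolicy.PENALIZE:
        return 0.0, False
    # EXCLUDE: keep the value, but flag the result for probability 0.
    return delta_cn, True


print(treat_delta(0.25, starred=True))
print(treat_delta(0.25, starred=True, policy=StarredDeltaPolicy.LEAVE))
print(treat_delta(0.25, starred=True, policy=StarredDeltaPolicy.EXCLUDE))
```

Results without an asterisk pass through unchanged under every policy, which is why the `starred` check comes first.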
![Page 62: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/62.jpg)
Ongoing Developments for PeptideProphet
Optimize for various additional mass spectrometers
New discriminant function
Adapt to additional methods for assigning peptides to tandem mass spectra
• Mascot
• COMET
• X!Tandem
• Others
![Page 63: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/63.jpg)
Pep3D mzXML Data Viewer
Xiao-jun Li
1. Main features of Pep3D
2. Evaluating sample quality
3. Evaluating LC-ESI-MS/MS performance
![Page 64: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/64.jpg)
1. Main Features of Pep3D
interact
![Page 65: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/65.jpg)
Pep3D Images: Type A
![Page 66: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/66.jpg)
Pep3D Images: Type B
CID (1198)
![Page 67: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/67.jpg)
Pep3D Images: Type C
peptide (221)
![Page 68: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/68.jpg)
Pep3D Images: Type D
peptide/CID
(221/1196 = 0.184783)
![Page 69: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/69.jpg)
Display CID Spectrum
![Page 70: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/70.jpg)
Features: 2720
![Page 71: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/71.jpg)
Features: 2720
CIDs: 1633
![Page 72: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/72.jpg)
Features: 2720
CIDs: 1633
IDs: 363
ID/CID: 22%
ID/feature: 13%
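The two percentages on this slide follow directly from the three counts. A minimal sketch of the arithmetic (the function name `yield_ratios` is made up for illustration):

```python
def yield_ratios(features: int, cids: int, ids: int):
    """Return (ID/CID, ID/feature) as fractions."""
    return ids / cids, ids / features


# Counts taken from the slide.
id_per_cid, id_per_feature = yield_ratios(features=2720, cids=1633, ids=363)

print(f"ID/CID: {id_per_cid:.0%}")          # 22%
print(f"ID/feature: {id_per_feature:.0%}")  # 13%
```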
![Page 73: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/73.jpg)
2. Evaluating Sample Quality
Good sample:
Plenty of well-localized spots without any particular large-scale pattern
![Page 74: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/74.jpg)
Empty Sample
• Very few localized spots
• Mainly background noise
• Distinguishable from no-spray
peptide/CID (0/311 = 0)
![Page 75: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/75.jpg)
Chemical Contamination
CID (588); no positive ID
CID (26)
![Page 76: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/76.jpg)
Features of Chemical Contamination
• Long horizontal streaks
• Low m/z values
• Singly charged ions
• Many wasteful CID attempts
• Can be put on CID exclusion list
![Page 77: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/77.jpg)
Polymer Contamination
Repeat unit of 44 amu
Charge states: +2, +3, +4
![Page 78: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/78.jpg)
Features of Polymer Contamination
• Localized spots running off-diagonally
• Equal distance in m/z
• Almost equal distance in time
• May be ionized in multiple charge states
• Many wasteful CID attempts
• Eliminated by better washing steps?
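The "equal distance in m/z" feature can be made concrete: a repeat unit of 44 amu seen at charge z gives peaks spaced 44/z apart in m/z. The sketch below checks a list of m/z values for such a ladder. It is an illustration only; the function names, the tolerance, and the minimum peak count are assumptions, and the nominal value 44.0 is taken from the slide rather than an exact monomer mass.

```python
REPEAT_MASS = 44.0  # repeat unit in amu, as stated on the slide


def expected_spacing(charge: int) -> float:
    """m/z distance between neighbouring polymer peaks at a given charge."""
    return REPEAT_MASS / charge


def looks_like_polymer_ladder(mz_values, charge: int,
                              tol: float = 0.05, min_peaks: int = 4) -> bool:
    """True if consecutive m/z values are all spaced by 44/charge (within tol)."""
    mz = sorted(mz_values)
    if len(mz) < min_peaks:
        return False
    step = expected_spacing(charge)
    return all(abs((b - a) - step) <= tol for a, b in zip(mz, mz[1:]))


ladder = [500.0, 522.0, 544.0, 566.0]        # hypothetical peaks, 22 apart
print(expected_spacing(2))                   # 22.0
print(looks_like_polymer_ladder(ladder, 2))  # True
print(looks_like_polymer_ladder(ladder, 1))  # False
```

A series that matches at one charge state but not at others is consistent with the separate +2, +3 and +4 ladders shown on the preceding slide.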
![Page 79: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/79.jpg)
3. Evaluating LC-ESI-MS/MS performance
Good performance:
• Peptide ions evenly distributed
• Smooth background
• Majority of intense ions fragmented
End too soon
![Page 80: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/80.jpg)
Insufficient Sample Separation
peptide/CID (527/2150 = 0.245116)
• Large proportion of intense ions not fragmented
• Insufficient SCX separation
• Non-optimal RP separation
![Page 81: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/81.jpg)
Non-Optimal RP Gradient
[Figure: RP gradient profile, %B axis from 0 to 80]
• 1st ID: 42 min
• Effective range: 60-120 min
• Horizontal streaks
• Larger slope at beginning
• Slower slope in middle
• Higher %B at end
![Page 82: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/82.jpg)
Bad RP Column
Same sample, same system, different columns
peptide/CID (354/3004 = 0.12) (224/2760 = 0.08)
37% fewer IDs
Quantification also suffers
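The figures on this slide can be reproduced from the four counts. A minimal sketch; the labels `column_a` and `column_b` are placeholders, since the slide does not name the columns:

```python
def relative_drop(before: int, after: int) -> float:
    """Fractional decrease from `before` to `after`."""
    return (before - after) / before


# (peptide IDs, CID attempts) for the two columns, from the slide.
column_a = (354, 3004)
column_b = (224, 2760)

print(f"{column_a[0] / column_a[1]:.2f}")             # 0.12
print(f"{column_b[0] / column_b[1]:.2f}")             # 0.08
print(f"{relative_drop(column_a[0], column_b[0]):.0%}")  # 37%
```

Note that the 37% refers to the drop in the number of identifications (354 to 224), not to the drop in the peptide/CID ratio.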
![Page 83: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/83.jpg)
Summary
• Pep3D can be used to evaluate sample quality and LC-ESI-MS/MS performance
• Other applications possible
• Suggestion: Use complex standard sample to check system performance
![Page 84: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/84.jpg)
Exercises with PeptideProphet
• Accuracy of computed probabilities
• Utility of conventional SEQUEST score thresholds and PeptideProphet analysis
• Model results for ICAT data analyzed with and without ICAT information
• Model results for unconstrained vs. tryptic constrained search results
![Page 85: PeptideProphet: Validation of Peptide Assignments to MS/MS ...tools.proteomecenter.org/course/lectures/0501-Day2.keller.pdf · • Need to validate peptide assignments to MS/MS spectra](https://reader034.vdocuments.us/reader034/viewer/2022051601/5add94557f8b9ae1408d29b5/html5/thumbnails/85.jpg)
Exercise Datasets
Many of the exercises utilize SEQUEST search results generated from datasets for which true results are known:
• HaloICAT: ICAT halobacterium sample searched against a halo_plus_human protein database
When PeptideProphet is run on these datasets, all proteins corresponding to correct results are automatically colored red!