In-depth Analysis of Protein Amino Acid Sequence and PTMs with High-resolution Mass Spectrometry
Lian Yang (2); Baozhen Shan (1); Bin Ma (2)
(1) Bioinformatics Solutions Inc, Canada; (2) University of Waterloo, Canada
• Problem
• Complete protein sequence coverage
o antibody confirmation
o biomarker discovery
• Database search software alone is insufficient
Protein sequence analysis
• Possible reasons for incomplete coverage
• “Non-database” peptides
o unexpected modifications
o mutated residues
o novel peptides
• Database errors
• Meanwhile, a large number of high-quality spectra are not matched.
• A workflow to identify both the database and “non-database” peptides
• Objective
• Maximize protein sequence coverage
• Explain more high-quality MS/MS spectra
Proposed workflow for in-depth analysis
• Workflow
• Multiple enzymes: multiple protein digests with different enzymes
• High accuracy MS for both precursor and fragment ions
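A minimal sketch of why digests with several enzymes help coverage: each enzyme cuts at different residues, so regions that give awkward peptides with one enzyme may give usable peptides with another. The cleavage rules below are simplified assumptions (complete digestion, no missed cleavages), and the toy sequence is hypothetical, not taken from BSA.

```python
import re

# Simplified cleavage rules, assumed here for illustration only:
#   Trypsin cleaves after K or R unless the next residue is P,
#   LysC cleaves after K, GluC cleaves after E.
# Splitting on zero-width matches requires Python 3.7 or later.
CLEAVAGE_SITES = {
    "Trypsin": r"(?<=[KR])(?!P)",
    "LysC": r"(?<=K)",
    "GluC": r"(?<=E)",
}


def digest(sequence, enzyme):
    """Return the peptides of a complete digest (no missed cleavages)."""
    pieces = re.split(CLEAVAGE_SITES[enzyme], sequence)
    return [p for p in pieces if p]


# Hypothetical toy sequence, not taken from BSA.
toy_protein = "AEKLTPRGEDKPWR"
for enzyme in CLEAVAGE_SITES:
    print(enzyme, digest(toy_protein, enzyme))
```

Each enzyme yields a different set of peptides from the same sequence, which is the property the multi-enzyme design relies on.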
PEAKS: powerful software for peptide de novo sequencing by tandem mass spectrometry. Rapid Commun Mass Spectrom. 2003;17(20):2337-42.
• Identify de novo sequence tags
• Reveal a set of high quality spectra
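As a rough illustration of how de novo results can flag high-quality spectra, the sketch below averages per-residue confidence values into an average local confidence (ALC) and keeps spectra above the 70% threshold quoted in the results. The spectrum identifiers and confidence values are hypothetical, and this is not PEAKS's own scoring code.

```python
def average_local_confidence(local_confidences):
    """Mean of the per-residue confidence values (in percent)."""
    return sum(local_confidences) / len(local_confidences)


def high_quality_spectra(de_novo_results, threshold=70.0):
    """Return the IDs of spectra whose de novo sequence has ALC above threshold."""
    return [
        spectrum_id
        for spectrum_id, confidences in de_novo_results.items()
        if average_local_confidence(confidences) > threshold
    ]


# Hypothetical per-residue confidences for three spectra.
de_novo_results = {
    "s1": [90, 80, 95, 85],
    "s2": [40, 60, 55, 50],
    "s3": [70, 70, 70, 70],
}
print(high_quality_spectra(de_novo_results))
```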
PEAKS DB: De Novo sequencing assisted database search for sensitive and accurate peptide identification. Mol Cell Proteomics 2012; 11:10.1074, 1–8.
• Identify database peptides.
• Database search result validated by de novo tags
• Reveal a set of confident proteins
PeaksPTM: Mass spectrometry-based identification of peptides with unspecified modifications. Journal of Proteome Research 10.7 (2011): 2930-2936
• Identify peptides with unexpected modifications
• Peptides from the set of confident proteins are “modified” in-silico by trying all possible modifications in UNIMOD.
• Speed up by de novo tags
For input spectra with
+ highly confident de novo tags
- no significant database matches
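The idea of “modifying” a peptide in silico can be sketched as a mass-shift lookup: compute the unmodified peptide mass, compare it with the observed precursor mass, and report modifications whose delta mass explains the difference. This is a simplified illustration, not the PeaksPTM algorithm; the five modifications are a small hand-picked subset standing in for the full UNIMOD list, and the tolerance is an assumed value.

```python
# Standard monoisotopic residue masses in Da.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056

# Small illustrative subset: (name, target residues, monoisotopic delta in Da).
MODIFICATIONS = [
    ("Carbamidomethyl", "C", 57.02146),
    ("Oxidation", "M", 15.99491),
    ("Deamidated", "NQ", 0.98402),
    ("Phospho", "STY", 79.96633),
    ("Acetyl", "K", 42.01057),
]


def peptide_mass(sequence):
    """Monoisotopic mass of the unmodified, neutral peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def explain_mass_shift(sequence, observed_mass, tolerance=0.01):
    """List (modification, candidate 1-based sites) matching the mass shift."""
    delta = observed_mass - peptide_mass(sequence)
    hits = []
    for name, targets, mod_delta in MODIFICATIONS:
        if abs(delta - mod_delta) <= tolerance:
            sites = [i + 1 for i, aa in enumerate(sequence) if aa in targets]
            if sites:  # the peptide must contain a residue the mod can sit on
                hits.append((name, sites))
    return hits


# A precursor that is about 0.984 Da heavier than unmodified QTALVELLK.
observed = peptide_mass("QTALVELLK") + 0.98402
print(explain_mass_shift("QTALVELLK", observed))
```

Restricting this search to peptides from the confident proteins, and to spectra with confident de novo tags, keeps the number of candidates manageable.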
SPIDER: software for protein identification from sequence tags with de novo sequencing error. J Bioinform Comput Biol. 2005 Jun;3(3):697-716.
• Identify peptides with mutations, such as residue insertion, deletion, and substitution.
• Screen the protein database to find short sequences similar to de novo tags
• Use both the de novo tags and the database sequences to reconstruct the most probable sequences that match the spectrum
For input spectra with
+ highly confident de novo tags
- no significant database matches
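The screening step can be illustrated with a plain edit-distance scan: slide a window along a database sequence and keep the window closest to the de novo tag. This is a simplified stand-in, not the SPIDER algorithm, which also models de novo sequencing errors. The sequences come from the alignment shown later in these slides; the function names and the `max_len_diff` parameter are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance counting insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete from a
                           cur[j - 1] + 1,             # insert into a
                           prev[j - 1] + (ca != cb)))  # match or substitute
        prev = cur
    return prev[-1]


def best_match(tag, protein, max_len_diff=1):
    """Return (distance, start, window) for the window most similar to tag."""
    best = None
    for start in range(len(protein)):
        for length in range(len(tag) - max_len_diff, len(tag) + max_len_diff + 1):
            if length <= 0 or start + length > len(protein):
                continue
            window = protein[start:start + length]
            distance = edit_distance(tag, window)
            if best is None or distance < best[0]:
                best = (distance, start, window)
    return best


de_novo_tag = "DPALVELLKK"
database_region = "KKQTALVELLKHK"  # database peptide with its flanking residues
print(best_match(de_novo_tag, database_region))
```

A low distance marks the region as a candidate; the reconstruction step would then decide which residues come from the tag and which from the database.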
• Unassigned de novo sequence tags are reported as possible novel peptides
• Result integration
Test the workflow with standard bovine serum albumin (BSA)
In-depth analysis of BSA
• Sample
o Pure ALBU_BOVIN from SIGMA
o 3 digests with Trypsin, LysC, GluC
o LC-MS/MS with Thermo LTQ-Orbitrap XL
• Workflow
o Workflow implemented in PEAKS 6
o 3 digests in one project
o Searched database: Swiss-Prot
[Figure: workflow diagram with Trypsin, LysC, and GluC digests and LC-MS/MS]
Result
• More PSMs are identified in each additional step:
o 5,152 MS/MS spectra
o 1,737 PSMs
o 906 PSMs
o 44 PSMs
o 38 MS/MS spectra
• Filtered at 1% FDR
• 1,737 -> 2,687 PSMs
• PEAKS ALC score > 70%
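The three PSM counts add up to the reported total, which can be checked with a running sum. The step labels are an assumption: they take the counts to follow the workflow order (database search, then PTM search, then mutation search), which the slide itself does not state.

```python
from itertools import accumulate

# PSM counts from the slide; the step labels are assumed from the workflow order.
psms_per_step = [
    ("database search", 1737),
    ("PTM search", 906),
    ("mutation search", 44),
]

counts = [count for _, count in psms_per_step]
running_total = list(accumulate(counts))

for (step, count), total in zip(psms_per_step, running_total):
    print(f"{step:16s} +{count:5d} PSMs -> {total:5d} in total")

# Relative gain of the full workflow over the first step alone.
gain = (running_total[-1] - running_total[0]) / running_total[0]
print(f"relative gain: {gain:.1%}")
```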
• BSA coverage
o Trypsin + PEAKS DB: 87%
o Proposed workflow: 96%
• The uncovered 4% is in the protein N-terminal region, which is most likely cleaved off and not in the purchased sample [1].
[1] specific binding site (Asp-Thr-His-Lys) for Cu(II) ions. T. Peters Jr., F.A. Blumenstock. J. Biol. Chem., 242 (1967), p. 1574
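Sequence coverage is the fraction of protein residues that fall inside at least one identified peptide. The sketch below computes it by marking covered positions; the protein and peptides are hypothetical toy data, and matching by exact substring is a simplification that ignores modified or mutated peptides.

```python
def sequence_coverage(protein, peptides):
    """Fraction of residues covered by at least one exactly matching peptide."""
    covered = [False] * len(protein)
    for peptide in peptides:
        start = protein.find(peptide)
        while start != -1:  # mark every occurrence of the peptide
            for i in range(start, start + len(peptide)):
                covered[i] = True
            start = protein.find(peptide, start + 1)
    return sum(covered) / len(protein)


# Hypothetical toy data: peptides found with one enzyme, then with a second one.
toy_protein = "AEKLTPRGEDKPWR"
one_enzyme = ["LTPR"]
two_enzymes = one_enzyme + ["KLTPRGE"]

print(f"{sequence_coverage(toy_protein, one_enzyme):.0%}")
print(f"{sequence_coverage(toy_protein, two_enzymes):.0%}")
```

Overlapping peptides are counted once per residue, so adding peptides from further digests or search steps can only keep coverage the same or raise it.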
• Contaminants
• Identified with at least 3 unique peptides.
– Human keratin proteins (K2C1_HUMAN and K1C_HUMAN)
– Bacteria protein (SSPA_STAAR)
– Trypsin (TRY1_BOVIN)
• PTMs
• Unsuspected modifications identified by PTM search
– Three PTMs specified in database search
» Carbamidomethylation (C)
» Oxidation (M)
» Deamidation (NQ)
• Mutation
• 214th amino acid: A -> T
• Brown 1975, Fed. Proc. 34:591
• Unexplained de novo tags
• Might be…
– Novel peptides outside of the searched database
KK.QTALVELLK.HK
     |||||||
   DPALVELLKK
Summary
• A software workflow proposed for in-depth protein sequence analysis
• Found many things in a “pure” sample
– Contaminants
– Unsuspected PTMs
– Mutations
• Improved protein sequence coverage
– BSA coverage: 87% -> 96%
• Explained more high-quality MS/MS spectra
– Identified MS/MS spectra: 1,737 -> 2,687
Q / A