In-depth Analysis of Protein Amino Acid Sequence and PTMs with High-resolution Mass Spectrometry
Lian Yang²; Baozhen Shan¹; Bin Ma²
¹Bioinformatics Solutions Inc, Canada; ²University of Waterloo, Canada
Protein sequence analysis
• Problem
  • Complete protein sequence coverage
    o antibody confirmation
    o biomarker discovery
  • Database search software alone is insufficient
Protein sequence analysis
• Possible reasons for incomplete coverage
  • “Non-database” peptides
    o unexpected modifications
    o mutated residues
    o novel peptides
  • Database errors
• Meanwhile, a large number of high-quality spectra are not matched.
Proposed workflow for in-depth analysis
• A workflow to identify both the database and “non-database” peptides
• Objective
  • Maximize protein sequence coverage
  • Explain more high-quality MS/MS spectra
Proposed workflow for in-depth analysis
• Workflow: multiple enzymes
  • Multiple protein digests with different enzymes
  • High-accuracy MS for both precursor and fragment ions
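To illustrate why digests with different enzymes yield complementary peptides, here is a minimal Python sketch of an in-silico digest. The cleavage rules are deliberately simplified (trypsin after K/R unless followed by P, LysC after K, GluC after E), and the toy sequence, the length window, and the function names are assumptions made for illustration, not settings from the presented workflow.

```python
import re

# Simplified cleavage rules (illustrative assumptions, not the presenters' settings).
CLEAVAGE_RULES = {
    "Trypsin": r"(?<=[KR])(?!P)",  # C-terminal to K or R, unless followed by P
    "LysC": r"(?<=K)",             # C-terminal to K
    "GluC": r"(?<=E)",             # C-terminal to E
}


def digest(sequence, enzyme):
    """Return (start, peptide) pairs for a complete digest without missed cleavages."""
    cuts = [m.start() for m in re.finditer(CLEAVAGE_RULES[enzyme], sequence)]
    # Drop a cut at the very end of the sequence; it would create an empty peptide.
    bounds = [0] + [c for c in cuts if c < len(sequence)] + [len(sequence)]
    return [(a, sequence[a:b]) for a, b in zip(bounds, bounds[1:])]


def usable_peptides(sequence, enzyme, min_len=4, max_len=10):
    """Keep only peptides inside a hypothetical 'observable' length window."""
    return [p for _, p in digest(sequence, enzyme) if min_len <= len(p) <= max_len]


toy = "AKPRGEKLLDESTVKR"  # made-up sequence, not BSA
for enzyme in CLEAVAGE_RULES:
    print(enzyme, usable_peptides(toy, enzyme))
```

Each enzyme cuts the same sequence at different positions, so a region that falls into a too-short or too-long peptide with one enzyme may be observable with another.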
Proposed workflow for in-depth analysis
• Workflow: de novo sequencing
  • Identify de novo sequence tags
  • Reveal a set of high-quality spectra
Reference: PEAKS: powerful software for peptide de novo sequencing by tandem mass spectrometry. Rapid Commun Mass Spectrom. 2003;17(20):2337-42.
Proposed workflow for in-depth analysis
• Workflow: database search
  • Identify database peptides
  • Database search results are validated by de novo tags
  • Reveal a set of confident proteins
Reference: PEAKS DB: De novo sequencing assisted database search for sensitive and accurate peptide identification. Mol Cell Proteomics 2012; 11:10.1074, 1–8.
Proposed workflow for in-depth analysis
• Workflow: PTM search
  • Identify peptides with unexpected modifications
  • Peptides from the set of confident proteins are “modified” in silico by trying all possible modifications in UNIMOD
  • Speed-up by de novo tags
• Input: spectra with
  + highly confident de novo tags
  - no significant database matches
Reference: PeaksPTM: Mass spectrometry-based identification of peptides with unspecified modifications. Journal of Proteome Research 10.7 (2011): 2930-2936.
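The idea of trying modifications in silico can be sketched at the precursor-mass level: compute the mass of the unmodified peptide, then ask which known modification would explain the difference from the observed mass. This is a minimal illustration and not the PeaksPTM algorithm. The short `MODIFICATIONS` table stands in for UNIMOD, the function names and the 0.01 Da tolerance are assumptions, and the masses are standard monoisotopic values.

```python
# Monoisotopic residue masses in Da (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056

# Tiny stand-in for UNIMOD: name -> (mass delta in Da, residues it can sit on).
MODIFICATIONS = {
    "Oxidation": (15.99491, "M"),
    "Carbamidomethylation": (57.02146, "C"),
    "Deamidation": (0.98402, "NQ"),
    "Phosphorylation": (79.96633, "STY"),
}


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def explain_mass_shift(peptide, observed_mass, tol_da=0.01):
    """List (modification, candidate 1-based positions) that explain the mass shift."""
    shift = observed_mass - peptide_mass(peptide)
    hits = []
    for name, (delta, sites) in MODIFICATIONS.items():
        if abs(shift - delta) <= tol_da:
            positions = [i + 1 for i, aa in enumerate(peptide) if aa in sites]
            if positions:  # the peptide must contain a residue that can carry it
                hits.append((name, positions))
    return hits


observed = peptide_mass("QTALVELLK") + 0.98402  # simulated precursor mass
print(explain_mass_shift("QTALVELLK", observed))
```

A precursor mass match only proposes candidates; confirming the modification and its position requires fragment ion evidence, which is where confident de novo tags help.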
Proposed workflow for in-depth analysis
• Workflow: mutation search
  • Identify peptides with mutations, such as residue insertion, deletion, and substitution
  • Screen the protein database to find short sequences similar to de novo tags
  • Use both the de novo tags and the database sequences to reconstruct the most probable sequences that match the spectrum
• Input: spectra with
  + highly confident de novo tags
  - no significant database matches
Reference: SPIDER: software for protein identification from sequence tags with de novo sequencing error. J Bioinform Comput Biol. 2005 Jun;3(3):697-716.
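The screening step can be pictured as a similarity search of a de novo tag against a protein sequence. The sketch below is a strong simplification and not SPIDER itself: it only compares ungapped windows and counts mismatches, so it can suggest substitutions but not insertions or deletions. The sequence fragment, the tag, and the function names are hypothetical. I and L are treated as identical because they have the same mass and cannot be told apart by de novo sequencing.

```python
def _normalize(seq):
    """Treat the isobaric residues I and L as the same letter."""
    return seq.replace("I", "L")


def best_ungapped_match(tag, protein):
    """Return (start, mismatches) of the protein window most similar to the tag."""
    t, p = _normalize(tag), _normalize(protein)
    best = None
    for start in range(len(p) - len(t) + 1):
        window = p[start:start + len(t)]
        mismatches = sum(a != b for a, b in zip(t, window))
        if best is None or mismatches < best[1]:
            best = (start, mismatches)
    return best


def substitutions(tag, protein, start):
    """List (1-based protein position, database residue, tag residue) differences."""
    window = protein[start:start + len(tag)]
    return [
        (start + i + 1, db, obs)
        for i, (db, obs) in enumerate(zip(window, tag))
        if _normalize(db) != _normalize(obs)
    ]


protein = "KKQTALVELLKHK"  # hypothetical database fragment
tag = "QTTLVELLK"          # hypothetical de novo tag with one substituted residue
start, mismatches = best_ungapped_match(tag, protein)
print(start, mismatches, substitutions(tag, protein, start))
```

A tag that matches a database window with only one or two differences is a candidate for a mutated peptide; a real implementation would also weigh likely de novo sequencing errors and re-score the candidate against the spectrum.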
Proposed workflow for in-depth analysis
• Workflow: novel peptides
  • Unassigned de novo sequence tags are reported as possible novel peptides
Proposed workflow for in-depth analysis
• Result integration
In-depth analysis of BSA
Test the workflow with the standard bovine serum albumin
• Sample
  • Pure ALBU_BOVIN from SIGMA
  • 3 digests with trypsin, LysC, and GluC
  • LC-MS/MS with Thermo LTQ-Orbitrap XL
• Workflow
  • Workflow implemented in PEAKS 6
  • 3 digests in one project
  • Searched database: Swiss-Prot
[Figure: trypsin, LysC, and GluC digests; LC-MS/MS; workflow]
Result
• More PSMs are identified in each additional step:
  • 5,152 MS/MS spectra
  • 1,737 PSMs
  • 906 PSMs
  • 44 PSMs
  • 38 MS/MS spectra
• Filtered at 1% FDR
• PEAKS ALC score > 70%
• 1,737 -> 2,687 PSMs
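The reported total can be checked by adding the three PSM counts. In the sketch below, reading the counts as the successive contributions of the database search, the PTM search, and the mutation search is an assumption based on the order of the workflow steps; the labels are not stated on the slide.

```python
from itertools import accumulate

# Step labels are assumed from the workflow order; the counts are from the slide.
psms_per_step = {
    "database search": 1737,
    "PTM search": 906,
    "mutation search": 44,
}

# Running total of identified PSMs after each step.
running_totals = list(accumulate(psms_per_step.values()))

for step, total in zip(psms_per_step, running_totals):
    print(f"after {step}: {total} PSMs")
```

The final running total equals the 2,687 PSMs quoted on the slide, so the three counts account for the whole increase from 1,737.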
Result
• BSA coverage
  • Trypsin + PEAKS DB: 87%
  • Proposed workflow: 96%
• The uncovered 4% is in the protein N-terminal region, which is most likely cleaved off and not in the purchased sample¹.
¹Specific binding site (Asp-Thr-His-Lys) for Cu(II) ions. T. Peters Jr., F.A. Blumenstock. J. Biol. Chem., 242 (1967), p. 1574
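Sequence coverage here means the fraction of residues in the protein that are covered by at least one identified peptide. A minimal sketch of that calculation follows; the toy sequence and peptides are made up. The last lines rest on an assumption that should be checked against the UniProt entry for ALBU_BOVIN: that the database sequence is a 607-residue precursor whose first 24 residues (signal peptide and propeptide) are absent from the mature protein.

```python
def sequence_coverage(protein, peptides):
    """Fraction of protein residues covered by at least one peptide."""
    covered = [False] * len(protein)
    for pep in peptides:
        start = protein.find(pep)
        while start != -1:  # mark every occurrence of the peptide
            for i in range(start, start + len(pep)):
                covered[i] = True
            start = protein.find(pep, start + 1)
    return sum(covered) / len(protein)


# Toy example: overlapping peptides are counted once per residue.
print(sequence_coverage("ACDEFGHIKL", ["ACD", "DEF", "IKL"]))

# Assumed values (verify against UniProt): 607-residue precursor, 24 residues removed.
precursor_length, removed = 607, 24
max_coverage = (precursor_length - removed) / precursor_length
print(f"{max_coverage:.1%}")
```

Under that assumption, full coverage of the mature protein corresponds to roughly 96% of the database sequence, which is consistent with the explanation given for the uncovered 4%.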
Result
• Contaminants
  • Identified with at least 3 unique peptides
    – Human keratin proteins (K2C1_HUMAN and K1C_HUMAN)
    – Bacterial protein (SSPA_STAAR)
    – Trypsin (TRY1_BOVIN)
Result
• PTMs
  • Unsuspected modifications identified by PTM search
    – Three PTMs specified in database search
      » Carbamidomethylation (C)
      » Oxidation (M)
      » Deamidation (NQ)
Result
• Mutation
  • 214th amino acid: A -> T
  • Brown 1975, Fed. Proc. 34:591
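As a worked check of what such a substitution looks like in the mass domain, using standard monoisotopic residue masses (not values taken from the slides), replacing Ala with Thr shifts the peptide mass by:

$$\Delta m = m_{\mathrm{Thr}} - m_{\mathrm{Ala}} = 101.04768 - 71.03711 = 30.01057\ \mathrm{Da}$$

A peptide carrying this substitution therefore no longer matches the database sequence at the precursor or fragment level, which is why a plain database search misses it.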
Result
• Unexplained de novo tags
  • Might be…
    – Novel peptides outside of the searched database

KK.QTALVELLK.HK
     |||||||
   DPALVELLKK
Summary
• A software workflow is proposed for in-depth protein sequence analysis
• Found several unexpected features in a “pure” sample
  – Contaminants
  – Unsuspected PTMs
  – Mutations
• Improved protein sequence coverage
  – BSA coverage: 87% -> 96%
• Explained more high-quality MS/MS spectra
  – Identified MS/MS spectra: 1,737 -> 2,687
Q / A