In-depth Analysis of Protein Amino Acid Sequence and PTMs with High- resolution Mass Spectrometry Lian Yang 2 ; Baozhen Shan 1 ; Bin Ma 2 1 Bioinformatics Solutions Inc, Canada 2 University of Waterloo, Canada


Post on 16-Dec-2015


In-depth Analysis of Protein Amino Acid Sequence and PTMs with High-resolution Mass Spectrometry

Lian Yang2; Baozhen Shan1; Bin Ma2

1Bioinformatics Solutions Inc, Canada 2University of Waterloo, Canada


• Problem

Complete protein sequence coverage
o antibody confirmation
o biomarker discovery

Database search software alone is insufficient

Protein sequence analysis


• Possible reasons for incomplete coverage

• “non-database” peptides
o unexpected modifications
o mutated residues
o novel peptides

• database errors

• Meanwhile, a large number of high-quality spectra are not matched.

Protein sequence analysis


• A workflow to identify both the database and “non-database” peptides

• Objective

• Maximize protein sequence coverage

• Explain more high-quality MS/MS spectra

Proposed workflow for in-depth analysis


• Workflow

Proposed workflow for in-depth analysis

• Multiple protein digests with different enzymes

• High accuracy MS for both precursor and fragment ions
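Why several enzymes help can be sketched with a small in-silico digestion. The cleavage rules below are simplified assumptions (trypsin after K/R unless followed by P, LysC after K, GluC after E), the 6 to 30 residue length window is only a crude stand-in for "detectable by LC-MS/MS", and the sequence is a toy example, not data from the study.

```python
# Simplified cleavage rules (illustrative assumptions): residues after which
# the enzyme cuts, and whether a following proline blocks the cut.
RULES = {
    "trypsin": ("KR", True),
    "lysc": ("K", False),
    "gluc": ("E", False),
}


def digest(protein, enzyme):
    """Cut the protein after every cleavage residue allowed by the rule."""
    residues, proline_blocks = RULES[enzyme]
    peptides, start = [], 0
    for i in range(len(protein) - 1):  # never cut after the last residue
        blocked = proline_blocks and protein[i + 1] == "P"
        if protein[i] in residues and not blocked:
            peptides.append(protein[start:i + 1])
            start = i + 1
    peptides.append(protein[start:])
    return peptides


def covered_positions(protein, enzyme, min_len=6, max_len=30):
    """0-based positions covered by peptides inside the length window."""
    covered, pos = set(), 0
    for pep in digest(protein, enzyme):
        if min_len <= len(pep) <= max_len:
            covered.update(range(pos, pos + len(pep)))
        pos += len(pep)
    return covered


if __name__ == "__main__":
    # Hypothetical toy sequence used only to exercise the functions.
    toy = "MKWVTFISLLLLFSSAYSRGVFRRDTHKSEIAHRFKDLGEEHFKGLVLIAFSQYLQQCPFDEHVK"
    union = set()
    for enzyme in RULES:
        cov = covered_positions(toy, enzyme)
        union |= cov
        print(enzyme, f"{len(cov) / len(toy):.0%}")
    print("combined", f"{len(union) / len(toy):.0%}")
```

Regions that one enzyme cuts into peptides that are too short or too long can still be covered by a peptide from another digest, so the combined coverage is never lower than that of any single enzyme.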


• Workflow

Proposed workflow for in-depth analysis

PEAKS: powerful software for peptide de novo sequencing by tandem mass spectrometry. Rapid Commun Mass Spectrom. 2003;17(20):2337-42.


• Identify de novo sequence tags

• Reveal a set of high quality spectra


• Workflow

Proposed workflow for in-depth analysis

PEAKS DB: De Novo sequencing assisted database search for sensitive and accurate peptide identification. Mol Cell Proteomics 2012; 11:10.1074, 1–8.


• Identify database peptides.

• Database search result validated by de novo tags

• Reveal a set of confident proteins


• Workflow

Proposed workflow for in-depth analysis

PeaksPTM: Mass spectrometry-based identification of peptides with unspecified modifications. Journal of Proteome Research 10.7 (2011): 2930-2936.


• Identify peptides with unexpected modifications

• Peptides from the set of confident proteins are “modified” in-silico by trying all possible modifications in UNIMOD.

• Speed up by de novo tags

For input spectra with
+ highly confident de novo tags
- no significant database matches
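The idea of "modifying" a known peptide in silico can be sketched as a mass-difference lookup: compare the observed precursor mass with the unmodified peptide mass and see which modification would explain the gap. This is a minimal sketch, not the PeaksPTM algorithm. It considers a single modification per peptide, ignores fragment ions and de novo tags, and uses a small hand-picked subset of modifications with a simplified site list rather than the full UNIMOD table. The function names and the 10 ppm tolerance are assumptions for illustration.

```python
# Monoisotopic residue masses in Da.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565

# Illustrative subset: name -> (monoisotopic mass shift, allowed residues).
MODS = {
    "Carbamidomethyl": (57.021464, "C"),
    "Oxidation": (15.994915, "M"),
    "Deamidation": (0.984016, "NQ"),
    "Phospho": (79.966331, "STY"),
    "Acetyl": (42.010565, "K"),
    "Methyl": (14.015650, "KR"),
}


def peptide_mass(seq):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(MONO[aa] for aa in seq) + WATER


def candidate_mods(seq, observed_mass, tol_ppm=10.0):
    """Modifications whose mass shift explains the observed mass difference.

    Returns (name, possible 1-based sites) pairs. Only one modification
    per peptide is considered in this sketch.
    """
    delta = observed_mass - peptide_mass(seq)
    tol = observed_mass * tol_ppm * 1e-6
    hits = []
    for name, (shift, residues) in MODS.items():
        sites = [i + 1 for i, aa in enumerate(seq) if aa in residues]
        if sites and abs(delta - shift) <= tol:
            hits.append((name, sites))
    return hits


if __name__ == "__main__":
    peptide = "MPCTEDYLSLILNR"
    observed = peptide_mass(peptide) + 15.994915  # simulated oxidized precursor
    print(candidate_mods(peptide, observed))
```

Localizing the modification to one of the candidate sites would require matching fragment ions, which is where the de novo tags mentioned above become useful.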


• Workflow

Proposed workflow for in-depth analysis

SPIDER: software for protein identification from sequence tags with de novo sequencing error. J Bioinform Comput Biol. 2005 Jun;3(3):697-716.


• Identify peptides with mutations, such as residue insertions, deletions, and substitutions.

• Screen the protein database to find short sequences similar to de novo tags

• Use both the de novo tags and database sequence to reconstruct the most probable sequences that match the spectrum

For input spectra with
+ highly confident de novo tags
- no significant database matches
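The screening step can be illustrated with a much simpler stand-in for SPIDER: slide the de novo tag along a database sequence, count mismatches, and report the residues that differ in the best window. This sketch handles substitutions only (no insertions or deletions), treats the isobaric residues I and L as identical, and does not model de novo sequencing errors the way SPIDER does. The function names, the toy sequences, and the mismatch limit are assumptions.

```python
def normalize(seq):
    """Treat I and L as the same residue, since they have identical mass."""
    return seq.replace("I", "L")


def best_window(tag, protein):
    """Return (start, mismatches) of the best ungapped placement, or None."""
    t, p = normalize(tag), normalize(protein)
    best = None
    for start in range(len(p) - len(t) + 1):
        window = p[start:start + len(t)]
        mismatches = sum(1 for a, b in zip(t, window) if a != b)
        if best is None or mismatches < best[1]:
            best = (start, mismatches)
    return best


def substitutions(tag, protein, max_mismatches=1):
    """List (1-based protein position, database residue, de novo residue)."""
    hit = best_window(tag, protein)
    if hit is None or hit[1] > max_mismatches:
        return None
    start = hit[0]
    window = protein[start:start + len(tag)]
    return [
        (start + i + 1, db, dn)
        for i, (db, dn) in enumerate(zip(window, tag))
        if normalize(db) != normalize(dn)
    ]


if __name__ == "__main__":
    # Hypothetical database sequence and de novo tag.
    print(substitutions("LVADITK", "GGKLVTDLTKVHK"))
```

A tag that places with a single mismatch points to a candidate substitution, which would then need to be confirmed by rescoring the reconstructed sequence against the spectrum.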


• Workflow

Proposed workflow for in-depth analysis


Unassigned de novo sequence tags are reported as possible novel peptides
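Reporting the leftovers amounts to a filter: keep the de novo results for spectra that no earlier step assigned, provided the sequence is confident enough. The sketch below assumes an ALC (average local confidence) defined as the mean of per-residue confidence percentages and uses the ALC > 70% cutoff quoted in the results. The data layout and function names are hypothetical.

```python
def alc(local_confidences):
    """Average local confidence: mean of per-residue confidences in percent."""
    return sum(local_confidences) / len(local_confidences)


def possible_novel_peptides(de_novo, assigned_ids, min_alc=70.0):
    """Return (spectrum id, sequence, ALC) for confident, unassigned tags.

    de_novo maps a spectrum id to (sequence, per-residue confidences);
    assigned_ids holds spectra already explained by an earlier step.
    """
    report = []
    for spectrum_id, (sequence, confidences) in de_novo.items():
        if spectrum_id in assigned_ids:
            continue
        score = alc(confidences)
        if score > min_alc:
            report.append((spectrum_id, sequence, round(score, 1)))
    # Most confident candidates first.
    return sorted(report, key=lambda row: row[2], reverse=True)


if __name__ == "__main__":
    # Hypothetical de novo results for three spectra.
    de_novo = {
        1: ("DPALVELLKK", [90] * 10),
        2: ("AAAA", [50, 60, 70, 80]),
        3: ("PEPTLDE", [95] * 7),
    }
    print(possible_novel_peptides(de_novo, assigned_ids={3}))
```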


• Result integration

Proposed workflow for in-depth analysis


Test the workflow with the standard bovine serum albumin

In-depth analysis of BSA

• Sample
• Pure ALBU_BOVIN from Sigma
• 3 digests with trypsin, LysC, and GluC
• LC-MS/MS with Thermo LTQ-Orbitrap XL

• Workflow
• Workflow implemented in PEAKS 6
• 3 digests in one project
• Searched database: Swiss-Prot

[Figure: trypsin, LysC, and GluC digests; LC-MS/MS; workflow]


• More PSMs are identified in each additional step:

Result

[Figure: workflow diagram annotated with result counts]
o 5,152 MS/MS spectra
o 1,737 PSMs, then 906 PSMs, then 44 PSMs (filtered at 1% FDR)
o 1,737 -> 2,687 PSMs
o 38 MS/MS spectra (PEAKS ALC score > 70%)
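The counts on this slide can be checked with a few lines of arithmetic. The sketch assumes that the three PSM counts belong to three successive identification steps and that 5,152 is the number of spectra entering the workflow. The slide does not label them explicitly, so the step numbering below is an interpretation.

```python
total_spectra = 5152
psms_per_step = [1737, 906, 44]  # assumed order of the successive steps

# Running total of PSMs after each step.
cumulative = []
running = 0
for count in psms_per_step:
    running += count
    cumulative.append(running)

for step, (added, total) in enumerate(zip(psms_per_step, cumulative), start=1):
    share = total / total_spectra
    print(f"step {step}: +{added} PSMs, {total} in total ({share:.1%} of spectra)")
```

The running total ends at 2,687, which matches the "1,737 -> 2,687 PSMs" figure quoted on the slide.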


• BSA coverage

Result

The uncovered 4% is in the protein N-terminal region, which is most likely cleaved off and not present in the purchased sample [1].

[1] Specific binding site (Asp-Thr-His-Lys) for Cu(II) ions. T. Peters Jr., F.A. Blumenstock. J. Biol. Chem., 242 (1967), p. 1574

[Figure: bar chart of BSA sequence coverage (axis 82% to 98%). Trypsin + PEAKS DB: 87%; proposed workflow: 96%]
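Sequence coverage, as compared here, is the fraction of protein residues that fall inside at least one identified peptide. A minimal sketch follows, assuming peptides are located by exact substring match. The function name and the toy sequence are illustrative and unrelated to the BSA data.

```python
def sequence_coverage(protein, peptides):
    """Fraction of residues covered by at least one peptide occurrence."""
    covered = [False] * len(protein)
    for pep in peptides:
        if not pep:
            continue
        start = protein.find(pep)
        while start != -1:
            for i in range(start, start + len(pep)):
                covered[i] = True
            start = protein.find(pep, start + 1)
    return sum(covered) / len(protein)


if __name__ == "__main__":
    toy_protein = "MKWVTFISLL"
    # Overlapping peptides count each residue only once.
    print(sequence_coverage(toy_protein, ["KWVT", "VTFI"]))
```

Peptides carrying a mutation or an unexpected modification do not match the database sequence literally, so in practice they have to be mapped back to their database positions before they contribute to coverage.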


• Contaminants

• Identified with at least 3 unique peptides.

– Human keratin proteins (K2C1_HUMAN and K1C_HUMAN)

– Bacterial protein (SSPA_STAAR)

– Trypsin (TRY1_BOVIN)

Result


• PTMs

• Unsuspected modifications identified by PTM search

– Three PTMs specified in database search
» Carbamidomethylation (C)
» Oxidation (M)
» Deamidation (NQ)

Result


• Mutation

• 214th amino acid: A -> T

• Brown 1975, Fed. Proc. 34:591

Result


• Unexplained de novo tags

• Might be…

– Novel peptides outside of the searched database

Result

KK.QTALVELLK.HK
     |||||||
   DPALVELLKK
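The seven match bars in the alignment above mark the longest run of residues shared by the two sequences. A small sketch that recovers that run with a standard longest-common-substring dynamic program is shown below. It is only a way to reproduce the alignment, not a tool from the workflow.

```python
def longest_common_substring(a, b):
    """Return the longest contiguous run of characters shared by a and b."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the run that ended at the previous pair of residues.
                cur[j] = prev[j - 1] + 1
                if cur[j] > best_len:
                    best_len, best_end = cur[j], i
        prev = cur
    return a[best_end - best_len:best_end]


if __name__ == "__main__":
    print(longest_common_substring("QTALVELLK", "DPALVELLKK"))  # ALVELLK
```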


• A software workflow proposed for in-depth protein sequence analysis

• Found many things in a “pure” sample
– Contaminants
– Unsuspected PTMs
– Mutations

• Improved protein sequence coverage
– BSA coverage: 87% -> 96%

• Explained more high-quality MS/MS spectra
– Identified MS/MS spectra: 1,737 -> 2,687

Summary


Q / A