Proteome Bioinformatics
Applied Bioinformatics lecture 6
David L. Tabb
Overview
• Identifying proteins through “shotgun” proteomics: bench and bioinformatics
• Understanding Peptide-Spectrum Matches
• Trawling ProteomeXchange via PeptideAtlas
Mass spectrometers
• IONIZATION: Produce ions from biological materials.
• MASS ANALYSIS: Separate or select ions by mass-to-charge (m/z) ratio.
• DETECTION: Report intensity of ions in mass spectrum.
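The mass-to-charge ratio above can be made concrete with a little arithmetic. The following is a minimal sketch, assuming positive-mode ions formed by adding one proton per charge (as is typical for electrospray of peptides); the function names are illustrative and do not come from the lecture.

```python
PROTON_MASS = 1.007276  # mass of a proton in daltons (Da)


def mz_from_neutral_mass(neutral_mass: float, charge: int) -> float:
    """Return m/z for a positive ion formed by adding `charge` protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


def neutral_mass_from_mz(mz: float, charge: int) -> float:
    """Invert the calculation: recover the neutral mass from an observed m/z."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return charge * (mz - PROTON_MASS)


if __name__ == "__main__":
    # The same 1000 Da molecule appears at different m/z for each charge state.
    for z in (1, 2, 3):
        print(z, round(mz_from_neutral_mass(1000.0, z), 4))
```

The point of the sketch is that a mass analyzer never sees mass directly: the same molecule lands at a different m/z for each charge state it adopts.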
Discovery Proteomics
Peptide Mixture → Liquid Chromatography → Electrospray Ionization → High-Resolution Mass Spectrometry → Isolate Ions of Peptide → Collide Ions to Dissociate → Collect Fragments in Tandem MS → Tandem Mass Spectra → Peptide Identifications → Confident Peptide List → Assembled Protein List
Disassembly and reassembly
• Mixture of Proteins
• Mixture of Peptides
• Collection of tandem mass spectra
• Collection of raw peptide identifications (e.g., LSELIGAR, z=2, XCorr=3.5)
• Confidently identified peptide sequences (...LSEGTSFR, LSELIGAR, LSENLRK, LSEPVHK...)
• Confidently identified proteins (...YGR192C, YGR204W, YGR208W, YGR209C...)
After AI Nesvizhskii, Mol Cell Proteomics (2005) 4: 1419-40.
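The reassembly step, going from confident peptides back to proteins, can be sketched as a simple lookup. This is a minimal illustration, not the method of any particular tool: the accessions and sequences in `toy_db` are hypothetical stand-ins (they are not the real yeast entries named on the slide), and real protein assembly must also handle peptides shared between proteins.

```python
from collections import defaultdict


def assemble_proteins(peptides, proteins):
    """Map each protein accession to the identified peptides found in its sequence."""
    evidence = defaultdict(list)
    for peptide in peptides:
        for accession, sequence in proteins.items():
            if peptide in sequence:  # exact substring match
                evidence[accession].append(peptide)
    return dict(evidence)


# Hypothetical toy database: names and sequences are invented for illustration.
toy_db = {
    "PROT_A": "MKLSELIGARTTLSEGTSFRAA",
    "PROT_B": "MALSEPVHKGG",
    "PROT_C": "MWWWW",
}

confident_peptides = ["LSEGTSFR", "LSELIGAR", "LSEPVHK"]
protein_list = assemble_proteins(confident_peptides, toy_db)

for accession, found in sorted(protein_list.items()):
    print(accession, found)
```

Proteins with no peptide evidence (here `PROT_C`) simply never enter the assembled list.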
Database search algorithms
First published in 1994, these tools identify MS/MS scans by comparing them
to predictions from database peptide sequences. Prominent examples include:
Sequest: Eng (1994) J. Amer. Soc. Mass Spectrom. 5: 976-989.
Mascot: Perkins (1999) Electrophoresis 20: 3551-3567.
X!Tandem: Craig (2003) Rapid Comm. Mass Spectrom. 17: 2310-2316.
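The idea of comparing a scan to predictions from database peptides can be sketched in a few lines. This is a deliberately simplified toy, assuming singly charged b and y ions only and a plain shared-peak count as the score; it is not the scoring function of Sequest, Mascot, or X!Tandem, and the tolerances are arbitrary choices for illustration.

```python
# Monoisotopic residue masses in Da (standard values, rounded to 5 decimals).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def predicted_peaks(peptide):
    """Singly charged b and y ion m/z values for every peptide bond."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    peaks = []
    for i in range(1, len(masses)):
        peaks.append(sum(masses[:i]) + PROTON)          # b ion (N-terminal piece)
        peaks.append(sum(masses[i:]) + WATER + PROTON)  # y ion (C-terminal piece)
    return peaks


def shared_peak_count(observed, peptide, frag_tol=0.5):
    """Count observed peaks lying within frag_tol of any predicted peak."""
    predicted = predicted_peaks(peptide)
    return sum(any(abs(o - p) <= frag_tol for p in predicted) for o in observed)


def search(observed, precursor_mass, candidates, mass_tol=1.5, frag_tol=0.5):
    """Keep candidates matching the precursor mass, then rank by shared peaks."""
    hits = [(pep, shared_peak_count(observed, pep, frag_tol))
            for pep in candidates
            if abs(peptide_mass(pep) - precursor_mass) <= mass_tol]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


candidates = ["LSEGTSFR", "LSELIGAR", "LSENLRK", "LSEPVHK"]
observed = predicted_peaks("LSELIGAR")  # idealized, noise-free spectrum
results = search(observed, peptide_mass("LSELIGAR"), candidates)
print(results)
```

With this idealized spectrum, LSELIGAR ranks first because every one of its predicted peaks is present; a real spectrum has missing and extra peaks, which is why production tools use more elaborate scores.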
Fragment ions result from breakage of peptide bonds
• TSIIGTIGPK: N-terminal b ion, C-terminal y ion
• HFISELEK, +2 charge state: labeled fragments HF-, -LEK, -SELEK, -ISELEK, -FISELEK
• Neutral loss of water from peptide
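The fragment masses for HFISELEK can be worked out directly from residue masses. The sketch below assumes an unmodified peptide, monoisotopic masses, and singly charged fragments; the function names are illustrative rather than taken from any library.

```python
# Monoisotopic residue masses in Da (standard values, rounded to 5 decimals).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276


def fragment_ions(peptide):
    """Return (b_ions, y_ions) as lists of singly charged m/z values.

    b_ions[i] is b(i+1): the first i+1 residues from the N-terminus.
    y_ions[i] is y(i+1): the last i+1 residues from the C-terminus plus water.
    """
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    b_ions, running = [], 0.0
    for mass in masses[:-1]:
        running += mass
        b_ions.append(running + PROTON)
    y_ions, running = [], WATER
    for mass in reversed(masses[1:]):
        running += mass
        y_ions.append(running + PROTON)
    return b_ions, y_ions


def precursor_mz(peptide, charge, water_losses=0):
    """m/z of the intact peptide ion, optionally after neutral loss of water."""
    neutral = sum(RESIDUE_MASS[aa] for aa in peptide) + WATER
    neutral -= water_losses * WATER
    return (neutral + charge * PROTON) / charge


b_ions, y_ions = fragment_ions("HFISELEK")
print("b2 (HF-):", round(b_ions[1], 4))
print("y3 (-LEK):", round(y_ions[2], 4))
print("[M+2H]2+:", round(precursor_mz("HFISELEK", 2), 4))
print("[M+2H-H2O]2+:", round(precursor_mz("HFISELEK", 2, water_losses=1), 4))
```

A useful sanity check is that complementary fragments sum to the precursor: for singly charged ions, b(i) + y(n-i) equals the neutral peptide mass plus two protons.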
Proteomic Repositories
Web resources of proteomic data have become substantial in recent years.
• Raw data and identification archives
• Peer-to-peer file storage
• Data analysis tools and databases
PeptideAtlas: protein coverage
Search http://www.peptideatlas.org for NP_862897 in the mouse build.
Peptide-spectrum match view
• Predicted and observed fragment ions
• y ion contains peptide C-terminus
• b ion contains peptide N-terminus
• Distance between b8 and b9 is mass of ninth amino acid
Tabb (2006) Nat. Protocols 1: 2213-2222.
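The rule that the spacing between consecutive b ions equals a residue mass can be turned into a small lookup. This is a minimal sketch assuming unmodified residues and monoisotopic masses; the m/z values in the usage lines are made-up numbers chosen only to produce a known gap, and the tolerance is an arbitrary illustrative choice.

```python
# Monoisotopic residue masses in Da (standard values, rounded to 5 decimals).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def residues_for_gap(lower_mz, upper_mz, charge=1, tol=0.02):
    """Return residues whose mass matches the gap between two ions of one series.

    Both peaks must belong to the same ion series and share the same charge.
    """
    gap = (upper_mz - lower_mz) * charge
    return sorted(aa for aa, mass in RESIDUE_MASS.items() if abs(mass - gap) <= tol)


# Hypothetical peak pairs, chosen only to illustrate the lookup.
print(residues_for_gap(500.0, 597.0528))  # gap of about 97.05 Da
print(residues_for_gap(500.0, 613.0841))  # gap of about 113.08 Da
```

The second lookup returns two candidates because isoleucine and leucine have identical masses, so this kind of inspection cannot tell them apart.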
Summary
• Proteomics generates large data sets that require automated interpretation.
• Proteomic repositories are relatively new, and interpreting experiments requires expertise.
• Visual inspection of tandem mass spectra relies largely on rules of thumb.
Challenges
• What areas of mouse dihydropyrimidine dehydrogenase (NP_740748.1) have been observed by proteomics?
• I want to create a targeted measurement of mouse adenylyl cyclase-associated protein 1 (NP_031624.2). What charge state does KEPALLELEGK adopt?