august 11th 2016 pratik jagtap - college of biological … jagtap ©2016 regents of the university...
TRANSCRIPT
Center for Mass Spectrometry and Proteomics | Phone | (612)625-2280 | (612)625-2279
© 2015 Regents of the University of Minnesota. All rights reserved.
PROTEOINFORMATICS OVERVIEW
Center for Mass Spectrometry and Proteomics
August 11th 2016
Pratik Jagtap
http://www.cbs.umn.edu/msp
©2016 Regents of the University of Minnesota, All rights reserved.
Outline
• PROTEOMICS WORKFLOW
• PEAKLIST PROCESSING
• Search Databases Overview
• Protein Identification
• Protein Validation and Quantification
• Publication Guidelines
Terminology
• RAW file
• Peaklist
• Peaklist processing
• Peptide-Spectral Match (PSM)
• Genome Assembly and annotation
• Variety of search databases
PROTEOMICS WORKFLOW
Eng et al 2011 Mol Cell Proteomics. 10(11): R111.009522.
PROTEOMICS WORKFLOW
Mass spectrometer → Mass spectral data (.RAW) → Processing → Search databases → Protein identification → Statistical validation of protein identification → Protein quantitation
MASS SPECTRAL DATA
Cappadona et al 2012 Amino Acids. Sep 2012; 43(3): 1087–1108
Eng et al 2011 Mol Cell Proteomics. 10(11): R111.009522.
MASS SPECTRAL DATA
Peaklist Processing
RAW DATA CONVERSION TOOLS
.RAW files are read with the XRawfile library from ThermoFinnigan Xcalibur software.
• ReAdW: converts .RAW to mzXML
http://tools.proteomecenter.org/wiki/index.php?title=Software:ReAdW
• msconvert (ProteoWizard): converts .RAW to mzML
http://proteowizard.sourceforge.net/
• Others: Raw2MSM, extract_msn, DeconMSn, DTASuperCharge
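To make the term "peaklist" concrete, here is a minimal sketch of reading one in Python. It assumes the converted file is in Mascot Generic Format (MGF), a common plain-text peaklist format that the slide itself does not name. The function name `parse_mgf` and the sample spectrum are illustrative only, not taken from any of the tools listed above.

```python
from typing import Dict, List

# Made-up example peaklist in MGF layout (an assumption; the slide names mzXML and mzML).
SAMPLE_MGF = """\
BEGIN IONS
TITLE=example scan 1
PEPMASS=500.25 12000
CHARGE=2+
100.1 50
200.2 150.5
END IONS
"""


def parse_mgf(text: str) -> List[Dict]:
    """Parse MGF text into a list of spectra with 'params' and 'peaks'."""
    spectra = []
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line == "BEGIN IONS":
            current = {"params": {}, "peaks": []}
        elif line == "END IONS":
            if current is not None:
                spectra.append(current)
            current = None
        elif current is not None:
            if "=" in line and not line[0].isdigit():
                # Header line such as PEPMASS=500.25 12000
                key, value = line.split("=", 1)
                current["params"][key.upper()] = value
            else:
                # Peak line: m/z followed by an optional intensity
                fields = line.split()
                mz = float(fields[0])
                intensity = float(fields[1]) if len(fields) > 1 else 0.0
                current["peaks"].append((mz, intensity))
    return spectra


spectra = parse_mgf(SAMPLE_MGF)
print(len(spectra), "spectrum parsed with", len(spectra[0]["peaks"]), "peaks")
```

Each parsed spectrum keeps the precursor information (PEPMASS, CHARGE) next to its fragment peaks, which is the information a search engine needs for one MS/MS scan.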
ORBITRAP: PROCESSING AND EFFECTS
The average mass error (ppm) and its standard deviation improve when MaxQuant-processed files are used.
DATABASE SEARCH
Mass spectrum → Search against database.
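The idea of searching a spectrum against a database can be sketched in a simplified form: digest each database protein in silico, compute peptide masses, and keep the peptides whose mass matches the measured precursor mass within a tolerance. The example assumes trypsin digestion, monoisotopic masses, a 10 ppm tolerance, and a made-up one-protein database. Real search engines also score the fragment ions, which is not shown here.

```python
from typing import Dict, List, Tuple

# Monoisotopic residue masses (Da) for the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565  # added once per peptide for the free termini


def tryptic_peptides(protein: str, min_len: int = 6) -> List[str]:
    """Cleave after K or R unless the next residue is P; no missed cleavages."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        at_end = i == len(protein) - 1
        if at_end or (aa in "KR" and protein[i + 1] != "P"):
            peptide = protein[start:i + 1]
            if len(peptide) >= min_len:
                peptides.append(peptide)
            start = i + 1
    return peptides


def peptide_mass(peptide: str) -> float:
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(MONO[aa] for aa in peptide) + WATER


def candidates(precursor_mass: float, database: Dict[str, str],
               tol_ppm: float = 10.0) -> List[Tuple[str, str, float]]:
    """Return (protein, peptide, mass) for peptides within tol_ppm of the precursor."""
    hits = []
    for name, sequence in database.items():
        for peptide in tryptic_peptides(sequence):
            mass = peptide_mass(peptide)
            if abs(precursor_mass - mass) / mass * 1e6 <= tol_ppm:
                hits.append((name, peptide, mass))
    return hits


# Hypothetical database entry and precursor mass, for illustration only.
toy_database = {"protA": "MKTAYIAKQRGGSLLEK"}
observed = peptide_mass("TAYIAK")
print(candidates(observed, toy_database))
```

The size and content of the database determine which candidate peptides exist at all, which is why the choice of search database matters for the rest of the workflow.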
Salzberg Genome Biology 2007 8:102 doi:10.1186
DNA → GENOME → PROTEOMIC DATABASE.
GENOMIC AND PROTEOMIC DATABASES
Finished and Published Genomes
• 3551 Bacterial Genomes.
• 211 Archaeal Genomes.
• 58 Eukaryal Genomes.
• 3363 Viral Genomes
http://www.genomesonline.org/index
PROTEOMIC DATABASES
CUSTOMIZED DATABASES
PROTEOMIC DATABASES
Swiss-Prot is the manually annotated and reviewed section of the UniProt Knowledgebase (UniProtKB). It is a high-quality annotated and non-redundant protein sequence database, which brings together experimental results, computed features and scientific conclusions.
http://en.wikipedia.org/wiki/Swiss-Prot
TrEMBL contains high-quality computationally analyzed records, which are enriched with automatic annotation. The translations of annotated coding sequences in the EMBL-Bank/GenBank/DDBJ nucleotide sequence database are automatically processed and entered in TrEMBL.
http://en.wikipedia.org/wiki/TrEMBL
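In UniProtKB FASTA downloads, Swiss-Prot entries carry headers beginning with `>sp|` and TrEMBL entries carry headers beginning with `>tr|`. The sketch below uses that prefix to separate reviewed from unreviewed sequences. The accessions and sequences in the example are hypothetical placeholders, and the function names are illustrative.

```python
from typing import Dict, List, Tuple

# Hypothetical FASTA text; accessions and sequences are placeholders.
SAMPLE_FASTA = """\
>sp|P00000|EXMPL_HUMAN Example reviewed entry OS=Homo sapiens
MKTAYIAKQR
GGSLLEK
>tr|A0A000|A0A000_HUMAN Example unreviewed entry OS=Homo sapiens
MSTNPKPQR
"""


def read_fasta(text: str) -> List[Tuple[str, str]]:
    """Return (header, sequence) pairs from FASTA text."""
    records = []
    header, chunks = None, []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence may span several lines
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def split_by_review_status(records: List[Tuple[str, str]]) -> Dict[str, list]:
    """Group records by header prefix: 'sp' (Swiss-Prot), 'tr' (TrEMBL), or 'other'."""
    groups = {"sp": [], "tr": [], "other": []}
    for header, sequence in records:
        prefix = header.split("|", 1)[0]
        key = prefix if prefix in ("sp", "tr") else "other"
        groups[key].append((header, sequence))
    return groups


groups = split_by_review_status(read_fasta(SAMPLE_FASTA))
print({key: len(value) for key, value in groups.items()})
```

A split like this is one way to build a reviewed-only search database, or to report how many identifications rest on automatically annotated entries.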
UNIPROT DATABASE
The Reference Sequence (RefSeq) collection provides a comprehensive, integrated,
non-redundant, well-annotated set of sequences, including genomic DNA, transcripts,
and proteins. RefSeq sequences form a foundation for medical, functional, and
diversity studies. They provide a stable reference for genome annotation, gene
identification and characterization, mutation and polymorphism analysis (especially
RefSeqGene records), expression studies, and comparative analyses.
http://www.ncbi.nlm.nih.gov/refseq/
PROTEOMIC DATABASES
CUSTOMIZED PROTEOMIC DATABASES
Sources that feed into proteomic databases:
• Customized database repositories (CPTAC / UniMesh)
• Genomic DNA sequences: six-frame translation
• Expressed sequence tags / cDNA sequences: three-frame translation
• Metagenomic databases: translation
• RNASeq data: translation and database reduction workflows
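Six-frame translation, used to turn genomic DNA into candidate protein sequences, can be sketched as follows. This assumes the standard genetic code and an input containing only A, C, G and T; stop codons are written as `*`. The function names are illustrative and do not come from any tool named in the slides.

```python
from typing import List

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate complete codons from position 0; a trailing partial codon is dropped."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def six_frame_translate(dna: str) -> List[str]:
    """Return three forward-strand frames followed by three reverse-strand frames."""
    dna = dna.upper()
    reverse = reverse_complement(dna)
    return [translate(strand[offset:])
            for strand in (dna, reverse)
            for offset in range(3)]


for frame in six_frame_translate("ATGGCCTAA"):
    print(frame)
```

Three-frame translation of EST or cDNA sequences is the same idea restricted to the forward strand. In practice the translated frames are split at stop codons and short fragments are discarded before the result is used as a search database.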