
Center for Mass Spectrometry and Proteomics | Phone | (612)625-2280 | (612)625-2279

© 2015 Regents of the University of Minnesota. All rights reserved.

PROTEOINFORMATICS OVERVIEW

Center for Mass Spectrometry and Proteomics

August 11th 2016

Pratik Jagtap

http://www.cbs.umn.edu/msp

©2016 Regents of the University of Minnesota, All rights reserved.


Outline

• PROTEOMICS WORKFLOW

• PEAKLIST PROCESSING

• Search Databases Overview

• Protein Identification

• Protein Validation and Quantification

• Publication Guidelines

Terminology

• RAW file

• Peaklist

• Peaklist processing

• Peptide-Spectral Match (PSM)

• Genome Assembly and annotation

• Variety of search databases


PROTEOMICS WORKFLOW

Eng et al 2011 Mol Cell Proteomics. 10(11): R111.009522.


PROTEOMICS WORKFLOW

Mass Spectrometer → Mass spectral data (.RAW) → Processing → Search databases → Protein Identification → Statistical validation of Protein Identification → Protein Quantitation


MASS SPECTRAL DATA

Cappadona et al 2012 Amino Acids. Sep 2012; 43(3): 1087–1108

Eng et al 2011 Mol Cell Proteomics. 10(11): R111.009522.


Peaklist Processing


RAW DATA CONVERSION TOOLS

• .RAW files are read with the XRawfile library from the ThermoFinnigan Xcalibur software.

• ReAdW: .RAW → mzXML

http://tools.proteomecenter.org/wiki/index.php?title=Software:ReAdW

• msconvert (ProteoWizard): .RAW → mzML

http://proteowizard.sourceforge.net/

• Others: Raw2MSM, extract_msn, DeconMSn, DTASuperCharge
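As a rough sketch of how the msconvert route listed above might be scripted, the helper below builds an msconvert command line for a .RAW file. It assumes ProteoWizard's msconvert is installed and on the PATH. The file and directory names are hypothetical, and only the `--mzML`, `--mzXML` and `-o` options are used.

```python
import subprocess
from pathlib import Path


def msconvert_command(raw_file, out_dir, fmt="mzML"):
    """Build an msconvert command line (assumes msconvert is on the PATH)."""
    if fmt not in ("mzML", "mzXML"):
        raise ValueError("fmt must be 'mzML' or 'mzXML'")
    # msconvert <input> --mzML|--mzXML -o <output directory>
    return ["msconvert", str(Path(raw_file)), f"--{fmt}", "-o", str(Path(out_dir))]


def convert(raw_file, out_dir, fmt="mzML"):
    """Run the conversion; raises CalledProcessError if msconvert fails."""
    subprocess.run(msconvert_command(raw_file, out_dir, fmt), check=True)


if __name__ == "__main__":
    # Hypothetical file names; the command is printed rather than executed.
    print(" ".join(msconvert_command("sample.RAW", "converted")))
```

Keeping command construction separate from execution makes it easy to inspect the command before running it on a machine where the vendor libraries are available.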


ORBITRAP: PROCESSING AND EFFECTS

The average ppm error and its standard deviation improve when MaxQuant-processed files are used.
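For reference, the ppm statistics mentioned on this slide can be computed as sketched below. The sketch assumes the usual definition, ppm error = (observed − theoretical) / theoretical × 10^6, and the m/z values in the usage example are made up for illustration, not taken from the slide.

```python
from statistics import mean, stdev


def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def ppm_summary(pairs):
    """Mean and sample standard deviation of ppm errors.

    pairs: iterable of (observed_mz, theoretical_mz); needs at least two pairs.
    """
    errors = [ppm_error(obs, theo) for obs, theo in pairs]
    return mean(errors), stdev(errors)


if __name__ == "__main__":
    # Hypothetical (observed, theoretical) m/z pairs.
    example = [(500.001, 500.0), (1000.004, 1000.0)]
    avg, sd = ppm_summary(example)
    print(f"average ppm error: {avg:.2f}, standard deviation: {sd:.2f}")
```

Running the same summary on peaklists before and after processing gives a direct way to compare the two.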




DATABASE SEARCH

Mass spectrum → Search against database
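To make the idea of searching a spectrum against a database concrete, here is a deliberately simplified sketch. It digests each database protein in silico and keeps the peptides whose mass falls within a ppm tolerance of an observed precursor mass. The trypsin cleavage rule, the 10 ppm tolerance and the toy protein are assumptions for illustration. Real search engines go further and score the fragment ions of each MS/MS spectrum, which is not shown here.

```python
# Monoisotopic residue masses in daltons (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056


def tryptic_digest(protein):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        next_aa = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def match_precursor(observed_mass, proteins, tol_ppm=10.0):
    """Return (protein name, peptide) pairs within tol_ppm of the observed mass."""
    hits = []
    for name, sequence in proteins.items():
        for peptide in tryptic_digest(sequence):
            theoretical = peptide_mass(peptide)
            if abs(observed_mass - theoretical) / theoretical * 1e6 <= tol_ppm:
                hits.append((name, peptide))
    return hits


if __name__ == "__main__":
    database = {"toy_protein": "GGKAARPGK"}  # hypothetical sequence
    print(match_precursor(peptide_mass("GGK"), database))
```

The candidate list produced this way is only the first filter; the peptide-spectral match (PSM) itself comes from comparing predicted and observed fragment ions.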


DNA → GENOME → PROTEOMIC DATABASE

Salzberg Genome Biology 2007 8:102 doi:10.1186


GENOMIC AND PROTEOMIC DATABASES

Finished and Published Genomes

• 3551 Bacterial Genomes

• 211 Archaeal Genomes

• 58 Eukaryal Genomes

• 3363 Viral Genomes

http://www.genomesonline.org/index


PROTEOMIC DATABASES

CUSTOMIZED DATABASES


PROTEOMIC DATABASES

Swiss-Prot is the manually annotated and reviewed section of the UniProt Knowledgebase (UniProtKB). It is a high quality annotated and non-redundant protein sequence database, which brings together experimental results, computed features and scientific conclusions.

http://en.wikipedia.org/wiki/Swiss-Prot

TrEMBL contains high-quality computationally analyzed records, which are enriched with automatic annotation. The translations of annotated coding sequences in the EMBL-Bank/GenBank/DDBJ nucleotide sequence database are automatically processed and entered in TrEMBL.

http://en.wikipedia.org/wiki/TrEMBL
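In UniProt FASTA downloads the two sections can be told apart by the header prefix: `>sp|` marks reviewed Swiss-Prot entries and `>tr|` marks TrEMBL entries. The sketch below groups records on that prefix. The accessions and sequences in the usage example are made up for illustration and are not real UniProt entries.

```python
def split_uniprot_fasta(fasta_text):
    """Group UniProt FASTA records by header prefix.

    Returns {"sp": [(header, sequence), ...], "tr": [(header, sequence), ...]}.
    Records with any other prefix are skipped.
    """
    groups = {"sp": [], "tr": []}
    current = None  # the [header, sequence] record being filled, if any
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            prefix = line[1:3]
            current = None
            if prefix in groups and line[3:4] == "|":
                current = [line, ""]
                groups[prefix].append(current)
        elif current is not None:
            current[1] += line
    return {key: [tuple(rec) for rec in recs] for key, recs in groups.items()}


if __name__ == "__main__":
    # Hypothetical records, not real UniProt entries.
    example = """>sp|X00001|EXMP1_HUMAN Example reviewed protein
MKTAYIAK
QRQISFVK
>tr|X00002|X00002_HUMAN Example unreviewed protein
MSDNELQK
"""
    result = split_uniprot_fasta(example)
    print(len(result["sp"]), "Swiss-Prot and", len(result["tr"]), "TrEMBL records")
```

A split like this is one way to build a search database restricted to reviewed entries, or to report how many identifications rely on unreviewed sequences.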


UNIPROT DATABASE


PROTEOMIC DATABASES

The Reference Sequence (RefSeq) collection provides a comprehensive, integrated, non-redundant, well-annotated set of sequences, including genomic DNA, transcripts, and proteins. RefSeq sequences form a foundation for medical, functional, and diversity studies. They provide a stable reference for genome annotation, gene identification and characterization, mutation and polymorphism analysis (especially RefSeqGene records), expression studies, and comparative analyses.

http://www.ncbi.nlm.nih.gov/refseq/


CUSTOMIZED PROTEOMIC DATABASES

• Genomic DNA sequences → Six-frame translation → Proteomic databases

• Expressed sequence tags / cDNA sequences → Three-frame translation → Proteomic databases

• Metagenomic databases → Translation → Proteomic databases

• RNASeq data → Translation and database reduction workflows → Proteomic databases

• Customized database repositories (CPTAC / UniMesh)
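The six-frame and three-frame translations named on this slide can be illustrated with a short sketch. It assumes the standard genetic code, marks stop codons with `*` and unknown codons with `X`, and uses a made-up input sequence. Splitting the output into open reading frames and reducing the resulting database are not shown.

```python
# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons from position 0; '*' is stop, 'X' is unknown."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def three_frame_translation(dna):
    """Translate the three forward reading frames."""
    dna = dna.upper()
    return [translate(dna[frame:]) for frame in range(3)]


def six_frame_translation(dna):
    """Translate three forward frames, then three reverse-complement frames."""
    dna = dna.upper()
    return three_frame_translation(dna) + three_frame_translation(
        reverse_complement(dna)
    )


if __name__ == "__main__":
    # Hypothetical nine-base sequence.
    for frame, protein in enumerate(six_frame_translation("ATGGCCTAA"), start=1):
        print(frame, protein)
```

Three forward frames are enough when the strand is known, as with expressed sequence tags or cDNA, while genomic DNA needs all six because a gene may lie on either strand.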

