
Post on 24-Feb-2016

Page 1: MN-B-C 2  Analysis of High Dimensional (-omics) Data

MN-B-C 2 Analysis of High Dimensional (-omics) Data

Kay Hofmann – Protein Evolution Group
http://www.genetik.uni-koeln.de/groups/Hofmann

Week 5: Proteomics 1

Page 2

Proteomics - what is it good for?

Detection Proteomics
· Which proteins are there? Which are abundant?
· Which part of a protein is there?
· Are there changes in protein presence or abundance?

Modification Proteomics
· Which proteins have posttranslational modifications?
· What fraction of a protein pool is modified?
· Are there changes in protein modification level?

Interaction Proteomics
· Which proteins bind to each other or form complexes?
· What fraction of a protein pool is bound/complexed?
· Are there changes in protein interaction patterns?

Page 3

Major methods for protein detection

Two-dimensional gel electrophoresis
· Detection (and possibly quantification) of entire proteins
· Limited scope, lack of reproducibility
· Old school

Mass spectrometry
· Possibly coupled with liquid chromatography (LC)
· Detection (and possibly quantification) of peptides
· Requires sophisticated instrumentation

Specific protein tags
· Candidate proteins have to be known and be modified artificially
· Only suitable for special applications

Antibodies
· Candidate proteins have to be known
· Specific antibodies have to be available
· Only suitable for small- and medium-scale studies

Page 4

Two-dimensional gel electrophoresis

1st Dimension: Isoelectric focussing
Separation by isoelectric point

2nd Dimension: SDS-PAGE
Separation by size

Modification

Page 5

Two-dimensional gel electrophoresis

Differences in gel properties can be (partially) compensated.

Protein Recognition
Can (to some degree) be done directly on the gel by combining IEP/MW values with recognition of spot patterns.
Usually, MS analysis of eluted spots is required.

Page 6

Principles of Protein Analysis by MS/MS

Page 7

Coverage of a good MS analysis

How many proteins can be detected?
In a 4 h run on the newest-generation instrument:
• 25,000 peptides
• 4,000 proteins

Sample requirement
4 µg peptide sample

Sample preparation time
6 h plus digest time

Page 8

Fragmentation pattern

[Figure: backbone of a four-residue peptide (side chains R1 to R4) with the cleavage sites that produce the N-terminal fragment ions a1/b1/c1, a2/b2/c2, a3/b3/c3 and the C-terminal fragment ions x3/y3/z3, x2/y2/z2, x1/y1/z1]

a, b, c ions: charge retained in the N-terminal fragment
x, y, z ions: charge retained in the C-terminal fragment
The type of ion generated depends on the MS method
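The relation between a peptide sequence and its b and y ion series can be sketched in a few lines. This is a minimal illustration, not part of the original slides: the function names are hypothetical, only singly charged b and y ions are computed, and the residue masses are standard monoisotopic values.

```python
# Monoisotopic residue masses in Da (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.00728   # mass of the charge-carrying proton
WATER = 18.01056   # H2O retained by the C-terminal fragment


def b_ions(peptide):
    """m/z of singly charged b ions b1..b(n-1) (N-terminal fragments)."""
    mz_values, running = [], 0.0
    for residue in peptide[:-1]:
        running += RESIDUE_MASS[residue]
        mz_values.append(running + PROTON)
    return mz_values


def y_ions(peptide):
    """m/z of singly charged y ions y1..y(n-1) (C-terminal fragments)."""
    mz_values, running = [], 0.0
    for residue in reversed(peptide[1:]):
        running += RESIDUE_MASS[residue]
        mz_values.append(running + WATER + PROTON)
    return mz_values


if __name__ == "__main__":
    # Hypothetical example peptide, chosen only for illustration.
    print([round(mz, 3) for mz in b_ions("GWSVK")])
    print([round(mz, 3) for mz in y_ions("GWSVK")])
```

Consecutive b ions (or y ions) differ by exactly one residue mass, which is what makes it possible to read a sequence from a complete ion series.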

Page 9

Identification by 'sequence tags'

In principle, the fragmentation pattern of a peptide could result in a complete series of (e.g.) b-ions that allows the determination of the peptide sequence.

Real data typically allow only short stretches of 3 or 4 residues (3-mer or 4-mer tags) to be read.

Short tags can identify a peptide if combined with additional data (size).

[Figure: example sequence tag GWSV with flanking masses 650.213 and 1489.430, bounded by K/R cleavage sites]

This method can tolerate some degree of modification or variability, even if it is unknown or unexpected.

Identification of suitable tags is difficult, often done manually
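How a short tag combined with a size constraint narrows down the candidates can be sketched as follows. This is a simplified illustration under stated assumptions: the candidate peptides are invented, the function names are hypothetical, and the "additional data" is taken to be the total peptide mass rather than the flanking masses shown in the figure.

```python
# Monoisotopic residue masses in Da (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[r] for r in peptide) + WATER


def match_by_tag(candidates, tag, observed_mass, tolerance=0.02):
    """Keep candidates that contain the tag AND fit the observed mass."""
    return [
        peptide for peptide in candidates
        if tag in peptide
        and abs(peptide_mass(peptide) - observed_mass) <= tolerance
    ]


if __name__ == "__main__":
    # Hypothetical candidate peptides; two of them contain the tag GWSV.
    candidates = ["AAGWSVTK", "GWSVR", "LLGWAVK"]
    observed = peptide_mass("AAGWSVTK")
    print(match_by_tag(candidates, "GWSV", observed))
```

Neither the tag nor the mass alone is specific here; only the combination singles out one candidate.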

Page 10

Identification by comparison with precalculated spectra

The entire fragment spectrum of a peptide is compared to a database of expected spectra for every possible peptide.

Possible peptides are taken from a proteome-wide sequence database, taking the cleaving enzyme into account.

Even if some ions are missing or extra ions are present, the correct peptide can be identified by a good correlation between the expected and observed patterns.

Problems with polymorphisms, modifications, other unexpected things.
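The database-matching idea can be sketched as an in-silico digest followed by a comparison of theoretical and observed fragment masses. This is a toy illustration, not how Mascot or any other search engine actually scores: the protein sequences are invented, trypsin is assumed as the cleaving enzyme (cut after K or R, not before P), only singly charged b and y ions are predicted, and a plain shared-peak count stands in for a proper correlation score.

```python
# Monoisotopic residue masses in Da (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.00728
WATER = 18.01056


def tryptic_digest(protein):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def theoretical_peaks(peptide):
    """Singly charged b and y ion m/z values expected for a peptide."""
    total = sum(RESIDUE_MASS[r] for r in peptide)
    peaks, prefix = [], 0.0
    for residue in peptide[:-1]:
        prefix += RESIDUE_MASS[residue]
        peaks.append(prefix + PROTON)                  # b ion
        peaks.append(total - prefix + WATER + PROTON)  # complementary y ion
    return peaks


def shared_peak_count(observed, theoretical, tolerance=0.02):
    """Number of theoretical peaks that have an observed peak nearby."""
    return sum(
        any(abs(obs - theo) <= tolerance for obs in observed)
        for theo in theoretical
    )


def best_match(observed, proteins, tolerance=0.02):
    """Return (score, peptide, protein name) of the best-scoring peptide."""
    scored = [
        (shared_peak_count(observed, theoretical_peaks(pep), tolerance), pep, name)
        for name, sequence in proteins.items()
        for pep in tryptic_digest(sequence)
    ]
    return max(scored, key=lambda entry: entry[0])


if __name__ == "__main__":
    # Hypothetical two-protein "database".
    proteins = {"P1": "MAGWSVKLLEDR", "P2": "TTPEAKGGWR"}
    # Simulated spectrum: two expected ions missing, one unexplained peak added.
    observed = theoretical_peaks("MAGWSVK")[2:] + [400.0]
    print(best_match(observed, proteins))
```

Even with missing and surplus peaks, the correct peptide scores far above the others. A peptide carrying an unexpected modification or polymorphism would shift many fragment masses at once and fail to match, which illustrates the problem noted above.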

Page 11

Does a pre-separation of proteins make sense ?

Nowadays rarely done

Can introduce bias
Protein separation is often not very reproducible.

Reduces sample/spectrum complexity
Spectra contain data from fewer proteins. But: more spectra have to be measured.

Allows quantification at protein level
Protein amounts (e.g. LC peaks) are more quantitative than peptide amounts or even MS peaks.

Allows the detection of minor components
Minor proteins are not overwhelmed by peptides from major components.

Page 12

Example output of Mascot (Protein Level)

[Screenshot annotations: missed cleavage site; peptide matches this protein, but also another one (with better score)]

Page 13

Example output of Mascot (Peptide Level)

Page 14

Example output of Mascot (Ion Level)

b* = b without NH3
b0 = b without H2O
b++ = b with two charges
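The three derived ion types translate into simple shifts of the m/z value. A minimal sketch, assuming the input is the m/z of a singly charged b ion; the function name is hypothetical and the constants are standard monoisotopic masses.

```python
PROTON = 1.00728    # mass of one charge-carrying proton
AMMONIA = 17.02655  # NH3
WATER = 18.01056    # H2O


def b_ion_variants(b_mz):
    """Derive b*, b0 and b++ m/z values from a singly charged b ion m/z."""
    return {
        "b": b_mz,
        "b*": b_mz - AMMONIA,         # neutral loss of ammonia
        "b0": b_mz - WATER,           # neutral loss of water
        "b++": (b_mz + PROTON) / 2,   # second proton added, charge 2
    }


if __name__ == "__main__":
    for name, mz in b_ion_variants(500.0).items():
        print(name, round(mz, 4))
```

The doubly charged ion appears at roughly half the m/z because the mass (plus one extra proton) is divided by a charge of two.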

Page 15

Quantitative MS Proteomics

Mass spectrometry is not a quantitative method
Different ions have different physicochemical properties, 'flyability', stability etc.

Quantification before MS
In some settings, it is possible to quantify the proteins or peptides before MS analysis, e.g. by gels or LC.

Semi-quantitative 'label-free' approaches
While the peak intensity does not correlate with protein abundance, the peptide count can be used for quantification (iBAQ, spectral counting, PAI).

Quantitative labeling approaches
Allow quantitative comparison of protein abundance under various conditions (iCAT, iTRAQ, SILAC).

Page 16

Label-free 'Spectral Counting' approaches

Peptide frequency is correlated with protein abundance
• Only semi-quantitative
• Requires comparison over multiple runs; conditions must be stable and reproducible
• Normalization for protein size required (big proteins generate more peptides)

Observed vs. non-observed
• Frequently used line of argumentation. Statistics are difficult; effects are easily overestimated
• A difference of 6 vs. 0 observations is typically not significant (Audic & Claverie statistics)
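As a worked illustration of the Audic & Claverie statistic, the sketch below evaluates their probability p(y | x) of seeing y counts in one run given x counts in the other. Assumptions that are not on the slide: both runs have equal total counts by default (n1 = n2), and the function names are hypothetical.

```python
from math import comb


def audic_claverie_p(y, x, n1=1, n2=1):
    """Probability of observing y counts in run 2 given x counts in run 1.

    n1 and n2 are the total counts of the two runs; only their ratio matters.
    """
    ratio = n2 / n1
    return ratio ** y * comb(x + y, y) / (1 + ratio) ** (x + y + 1)


def upper_tail(y_min, x, n1=1, n2=1):
    """P(Y >= y_min | x), computed as 1 minus the lower part of the distribution."""
    return 1.0 - sum(audic_claverie_p(y, x, n1, n2) for y in range(y_min))


if __name__ == "__main__":
    p_single = upper_tail(6, 0)  # 6 or more observations versus 0
    print(f"single-protein tail probability: {p_single:.4f}")
    # Hypothetical Bonferroni-style correction for 4000 proteins tested in parallel.
    print(f"corrected for 4000 tests: {min(1.0, p_single * 4000):.4f}")
```

For 0 versus 6 observations with equal totals, the tail probability is 1/64 (about 0.016) for a single protein. When thousands of proteins are tested in parallel, such a value would typically not survive multiple-testing correction, which is one plausible reading of the statement above.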

Protein abundance index (PAI)
• The number of observed peptides divided by the number of observable peptides per protein
• Related to the logarithm of protein abundance
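The PAI definition and the size normalization mentioned above can be written down directly. A minimal sketch with hypothetical function names and invented numbers; the exponentially modified variant emPAI = 10^PAI - 1 is not on the slide and is included only as a commonly cited way to turn the logarithmic relationship into an abundance estimate.

```python
def pai(observed_peptides, observable_peptides):
    """Protein abundance index: observed / observable peptides."""
    return observed_peptides / observable_peptides


def empai(observed_peptides, observable_peptides):
    """Exponentially modified PAI (assumption: not part of the slide)."""
    return 10 ** pai(observed_peptides, observable_peptides) - 1


def length_normalized_count(spectral_count, protein_length):
    """Spectral count per residue, a simple correction for protein size."""
    return spectral_count / protein_length


if __name__ == "__main__":
    # Hypothetical proteins: same spectral count, very different sizes.
    print(length_normalized_count(30, 300))   # small protein
    print(length_normalized_count(30, 3000))  # large protein
    print(pai(5, 20), round(empai(5, 20), 3))
```

With the same raw count, the smaller protein gets a ten-fold higher normalized value, reflecting that big proteins generate more peptides at equal abundance.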

Page 17

SILAC
Stable isotope labeling by amino acids in cell culture
• A frequently used type of metabolic labeling

One cell culture is fed with Lys/Arg containing light C12 atoms

One cell culture is fed with Lys/Arg containing heavy C13 atoms

Proteins from the two cultures are mixed and analysed in a single experiment. The proteins and resulting peptides behave identically

During MS analysis, all labeled ions appear as a doublet with a defined mass difference. The intensity ratio of these two peaks is a good proxy for the ratio of protein abundance in the two cultures.
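The expected spacing of a SILAC doublet follows from the number of labeled residues and the charge state. A minimal sketch, assuming that every carbon in Lys and Arg is replaced by 13C in the heavy culture (six carbons each); the slide only states "heavy C13 atoms", so the exact label, the function names, and the example intensities are assumptions.

```python
C13_MINUS_C12 = 1.003355              # mass difference between 13C and 12C in Da
LABELED_CARBONS = {"K": 6, "R": 6}    # carbons per residue, assuming full 13C labeling


def silac_mz_shift(peptide, charge):
    """m/z distance between the light and heavy peak of a SILAC doublet."""
    carbons = sum(LABELED_CARBONS.get(residue, 0) for residue in peptide)
    return carbons * C13_MINUS_C12 / charge


def heavy_to_light_ratio(light_intensity, heavy_intensity):
    """Intensity ratio of the doublet, used as a proxy for relative abundance."""
    return heavy_intensity / light_intensity


if __name__ == "__main__":
    # Hypothetical tryptic peptide with one Lys, measured at charge 2.
    print(round(silac_mz_shift("MAGWSVK", 2), 4))
    # Hypothetical peak intensities of the light and heavy peaks.
    print(heavy_to_light_ratio(2.0e6, 5.0e6))
```

Because trypsin cuts after Lys and Arg, most tryptic peptides carry at least one labeled residue, which may be one reason why these two amino acids are chosen for labeling.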

Page 18

Isobaric labeling (TMT, iTRAQ)

Two commercially available variants:
TMT: Tandem Mass Tag
iTRAQ: Isobaric Tag for Relative and Absolute Quantitation

Labeling is not done metabolically but at the protein or peptide level.

Isobaric properties ensure that peptide differences are only observed after fragmentation

Page 19

Demonstration of

• Human Protein Atlas (http://www.proteinatlas.org)
• ArrayExpress Expression Atlas (http://www.ebi.ac.uk/gxa)