ms based workflow for quantitative proteomics · nesvizhskii, a.i. and aebersold, r. mol cell...

35
MS E based workflow for quantitative proteomics Kerman Aloria Proteomics Core Facility-SGIKER, UPV/EHU MS Technology Days: Biomoléculas Barcelona, 21-03-2012

Upload: others

Post on 18-Apr-2020

18 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: MS based workflow for quantitative proteomics · Nesvizhskii, A.I. and Aebersold, R. Mol Cell Proteomics. 2005 4: 1419-1440. Bioinformatics PAnalyzer:A software tool that classifies

MSE based workflow for quantitative proteomics

Kerman Aloria Proteomics Core Facility-SGIKER, UPV/EHU

MS Technology Days: Biomoléculas Barcelona, 21-03-2012

Page 2: MS based workflow for quantitative proteomics · Nesvizhskii, A.I. and Aebersold, R. Mol Cell Proteomics. 2005 4: 1419-1440. Bioinformatics PAnalyzer:A software tool that classifies

Sample diversity

Proteomics Core Facility

Proteomics Core Facility

Quantitative analysis

Quantitation results

Page 3: MS based workflow for quantitative proteomics · Nesvizhskii, A.I. and Aebersold, R. Mol Cell Proteomics. 2005 4: 1419-1440. Bioinformatics PAnalyzer:A software tool that classifies

Proteomics Core Facility

Quantification workflow:

Sample preparation

Liquid Chromatography

Mass Spectrometry

Bioinformatic analysis

Standard, flexible and robust workflow

Page 4: MS based workflow for quantitative proteomics · Nesvizhskii, A.I. and Aebersold, R. Mol Cell Proteomics. 2005 4: 1419-1440. Bioinformatics PAnalyzer:A software tool that classifies

MSE acquisition mode

Low collision energy MS MS

Retention time Retention time

Inte

nsi

ty

Inte

nsi

ty

Page 5: MS based workflow for quantitative proteomics · Nesvizhskii, A.I. and Aebersold, R. Mol Cell Proteomics. 2005 4: 1419-1440. Bioinformatics PAnalyzer:A software tool that classifies

MSE based protein identification and quantification

Absolute quantification

Based on an added standard protein

MSE data

Protein identification

Based on peptide intensities

Relative quantification

Based on absolute values

All information in one run

Page 6: MS based workflow for quantitative proteomics · Nesvizhskii, A.I. and Aebersold, R. Mol Cell Proteomics. 2005 4: 1419-1440. Bioinformatics PAnalyzer:A software tool that classifies

Estimated absolute protein quantification

“the average MS signal response for the three most intense tryptic peptides per mole of protein is constant within a coefficient of variation of less than ±10%”

Silva et al. (2006) Mol. Cell. Proteomics 5, 144-156

Sample A

Sample B

MSE based protein quantification

X (fmol)

2X (fmol)

About the accuracy of the absolute abundance measurement: “…provided an estimated average error rate of 1.8-fold for the extracted ion intensities, and ~threefold for the spectral counting.”

Malsmtron et al. (2009) Nature 460, 762-765

inte

nsi

ty

Comparative analysis of ion currents

Probabilistic algorithm, where relative quantification of all peptides is performed

Page 7: MS based workflow for quantitative proteomics · Nesvizhskii, A.I. and Aebersold, R. Mol Cell Proteomics. 2005 4: 1419-1440. Bioinformatics PAnalyzer:A software tool that classifies

Sample preparation:

- Label-free approach - 2-D Clean-Up - RapiGest - Tryptic digestion

LC: NanoAcquity UPLC System

MS: SYNAPT HDMS MSE acquisition mode

Data analysis:

- ProteinLynx Global Server - PAnalyzer - Excel (MS office)

Quantification workflow

Quantification workflow:

Sample preparation

Liquid Chromatography

Mass Spectrometry

Bioinformatic analysis

Page 8: MS based workflow for quantitative proteomics · Nesvizhskii, A.I. and Aebersold, R. Mol Cell Proteomics. 2005 4: 1419-1440. Bioinformatics PAnalyzer:A software tool that classifies

Label-free quantification

SILAC ICAT ICPL

ICPL 18O

iTRAQ

Label-free

Independent and parallel sample analysis

Reproducible manipulation along the whole workflow

Page 9: MS based workflow for quantitative proteomics · Nesvizhskii, A.I. and Aebersold, R. Mol Cell Proteomics. 2005 4: 1419-1440. Bioinformatics PAnalyzer:A software tool that classifies

Sample preparation

Sample preparation:

Sample cleaning and concentration: 2-D Clean-Up

Protein denaturation: Rapigest

Standard in solution tryptic digestion

- Escherichia coli

- Vibrio harveyi: different growth temperatures

- Brucella abortus: ± erythritol treatment

- Rhodococcus sp.: different culture media

- Synechococcus sp.: different culture media

- Saccharomyces cerevisiae

- Botrytis cinerea (plant pathogenic fungus):

- mycelium of different strains

- culture media

Human:

- Tear (control and disease)

- Depleted plasma

- MDA-MB-468 breast cancer cells (± EGF treatment)

- HEK 293T cells (± sh RNA treatment)

- Adipose Stem Cells: exosomes

Mouse:

- T-lymphocytes: wt and KO

- NIH 3T3 cells: - nuclear proteins (± doxycycline treatment)

- chromatin bound proteins

* Betula pendula: pollen extract

* Phleum pratense: pollen extract

Page 10: MS based workflow for quantitative proteomics · Nesvizhskii, A.I. and Aebersold, R. Mol Cell Proteomics. 2005 4: 1419-1440. Bioinformatics PAnalyzer:A software tool that classifies

0

25

50

75

100

93.2

98.3 96.3 95.8 95.4 94.9 93.4

95.9 96.8

90.7

6.8

1.7 3.7 4.2 4.6 5.1 6.6

4.1 3.2

9.3

80

85

90

95

100

Sample preparation

Digestion efficiency: percentage of peptides with missed cleavages

0 missed cleavages

1 missed cleavage

Digestion efficiency independent of the sample

% o

f p

epti

des

Page 11: MS based workflow for quantitative proteomics · Nesvizhskii, A.I. and Aebersold, R. Mol Cell Proteomics. 2005 4: 1419-1440. Bioinformatics PAnalyzer:A software tool that classifies

Sample preparation

BPI chromatogram of 6 different sample preparations of protein extracts form human HEK293T cells

2.45e3

2.37e3

2.42e3

2.24e3

2.32e3

2.34e3

Reproducible sample preparation

Page 12: MS based workflow for quantitative proteomics · Nesvizhskii, A.I. and Aebersold, R. Mol Cell Proteomics. 2005 4: 1419-1440. Bioinformatics PAnalyzer:A software tool that classifies

0

10

20

30

40

50

60

70

1 2 3

Sample preparation

Overlap in protein identification is not affected by sample preparation

122 108

106 replicates

% o

f id

enti

fied

pro

tein

s Plasma: Different digestions

E. coli: commercial sample (LC-MSE replicates) Plasma: different digestions

Protein identifications

Page 13: MS based workflow for quantitative proteomics · Nesvizhskii, A.I. and Aebersold, R. Mol Cell Proteomics. 2005 4: 1419-1440. Bioinformatics PAnalyzer:A software tool that classifies

Liquid chromatography

LC reproducibility: Standard deviation of the retention time of replicate analysis

% o

f p

epti

des

minutes

E. coli: commercial sample (no sample preparation). LC replicates. Botrytis (laboratory sample): LC replicates HEK 293T (laboratory sample): biological replicates

Reproducible LC performance independently of sample type and preparation

Page 14: MS based workflow for quantitative proteomics · Nesvizhskii, A.I. and Aebersold, R. Mol Cell Proteomics. 2005 4: 1419-1440. Bioinformatics PAnalyzer:A software tool that classifies

replicates

Mass Spectrometry

0

10

20

30

40

50

1 2 3 4 5

Escherichia_1

Escherichia_2

Botrytis_1

Botrytis_2

T-Lymphocytes_1

T-Lymphocytes_2

NIH 3T3_1

NIH 3T3_2

% o

f p

rote

ins

~70% of the proteins are identified in >1 replicate

E. coli: commercial sample (no sample preparation). LC-MSE replicates. Botrytis, T-Lymphocytes and NIH 3T3 (laboratory samples) LC-MSE replicates

Page 15: MS based workflow for quantitative proteomics · Nesvizhskii, A.I. and Aebersold, R. Mol Cell Proteomics. 2005 4: 1419-1440. Bioinformatics PAnalyzer:A software tool that classifies

0

25

50

75

100

125

150

0 20 40 60 80 100

Escherichia

Botrytis

T-lymphocytes

NIH 3T3

E. coli: commercial sample (no sample preparation). LC-MSE replicates. Botrytis, T-Lymphocytes and NIH 3T3 (laboratory samples) LC-MSE replicates

~75% of the quantified proteins have a CV < 25%

Coefficient of variation of the absolute protein quantification

Mass Spectrometry

% of proteins

Co

effi

cien

t o

f va

riat

ion

(%

)

Page 16: MS based workflow for quantitative proteomics · Nesvizhskii, A.I. and Aebersold, R. Mol Cell Proteomics. 2005 4: 1419-1440. Bioinformatics PAnalyzer:A software tool that classifies

Bioinformatics

Protein identification

Based on peptide intensities

Relative quantification Based on

absolute values

Based on an added standard protein

Absolute quantification

Page 17: MS based workflow for quantitative proteomics · Nesvizhskii, A.I. and Aebersold, R. Mol Cell Proteomics. 2005 4: 1419-1440. Bioinformatics PAnalyzer:A software tool that classifies

Bioinformatics

Protein identification

Based on peptide intensities

Relative quantification Based on

absolute values

Based on an added standard protein

Absolute quantification

Page 18: MS based workflow for quantitative proteomics · Nesvizhskii, A.I. and Aebersold, R. Mol Cell Proteomics. 2005 4: 1419-1440. Bioinformatics PAnalyzer:A software tool that classifies

Bioinformatics

Protein inference: the task of assembling the sequences of identified peptides to infer the protein content of the sample

1

1

2

2

3 A

B

Identifications in PLGS 2.4:

- Hit: a group of proteins (can have only one member) in which one of them has a exlusive peptide - Protein: proteins that have at least one identified peptide

Proteins are grouped in hits where one of the proteins has exclusive peptide(s) and the other members have only shared peptide(s)

1 hit and 2 proteins: protein A names the hit

Page 19: MS based workflow for quantitative proteomics · Nesvizhskii, A.I. and Aebersold, R. Mol Cell Proteomics. 2005 4: 1419-1440. Bioinformatics PAnalyzer:A software tool that classifies

1

1

2

2

3 A

B

1 hit and 2 proteins. Which protein names the hit?

3

Bioinformatics

No experimental evidence to select “Polyubiquitin B” to name the hit

Page 20: MS based workflow for quantitative proteomics · Nesvizhskii, A.I. and Aebersold, R. Mol Cell Proteomics. 2005 4: 1419-1440. Bioinformatics PAnalyzer:A software tool that classifies

Bioinformatics

Hit_1: 4 Proteins

Hit_2: 4 Proteins

1

1

2

2

3 A

B

1 2 C 4

2 hits: - Protein A - Protein C In which hit should be protein B?

One proetin assigned to more than one group and counted several times (2 hits, 8 proteins)

Page 21: MS based workflow for quantitative proteomics · Nesvizhskii, A.I. and Aebersold, R. Mol Cell Proteomics. 2005 4: 1419-1440. Bioinformatics PAnalyzer:A software tool that classifies

Bioinformatics

Hit_1: 4 Proteins

Hit_2: 4 Proteins

1

1

2

2

3 A

B

1 2 C 4

2 hits: - Protein A - Protein C In which hit should be protein B?

One proetin assigned to more than one group and counted several times (2 hits, 8 proteins)

ProteinLynx Global Server 2.4: 8 hits, 32 proteins

Totally identified proteins: 11

Page 22: MS based workflow for quantitative proteomics · Nesvizhskii, A.I. and Aebersold, R. Mol Cell Proteomics. 2005 4: 1419-1440. Bioinformatics PAnalyzer:A software tool that classifies

Conclusive proteins: they have at least

one distinct/unique peptide

Protein group: All peptides are shared and several

combinations of proteins can explain the presence

of all peptides

Nonclonclusive proteins: All peptides are shared and

their presence is explained by other proteins that must

be present in the sample

Indistinguishable proteins: all the peptides are

shared and at least one of them is distinct/unique

for this group

Bioinformatics

Interpretation of Shotgun Proteomic Data. The Protein Inference Problem

Nesvizhskii, A.I. and Aebersold, R. Mol Cell Proteomics. 2005 4: 1419-1440.

Page 23: MS based workflow for quantitative proteomics · Nesvizhskii, A.I. and Aebersold, R. Mol Cell Proteomics. 2005 4: 1419-1440. Bioinformatics PAnalyzer:A software tool that classifies

Bioinformatics

PAnalyzer:A software tool that classifies peptide and protein identifications from ProteinLynx Global Server based on the proposal of Nesvizhskii and Aebersold (with minor modifications).

Page 24: MS based workflow for quantitative proteomics · Nesvizhskii, A.I. and Aebersold, R. Mol Cell Proteomics. 2005 4: 1419-1440. Bioinformatics PAnalyzer:A software tool that classifies

Peptide list Protein list

PAnalyzer reads the results .xml file from ProteinLynx Global Server and classifies the identified proteins based on the defined criteria.

Possibility to filter peptides by score thresholds

Bioinformatics

Page 25: MS based workflow for quantitative proteomics · Nesvizhskii, A.I. and Aebersold, R. Mol Cell Proteomics. 2005 4: 1419-1440. Bioinformatics PAnalyzer:A software tool that classifies

Bioinformatics

PAnalyzer PLGS 2.4

High confidence Medium confidence Low confidence

Conclusive 149 194 214

188 hits

417 proteins (389 proteins)

Indistinguishable 81 (34 groups) 122 (49 groups) 102 (41 groups)

Group 13 (1 group) 0 3 (1 group)

Non conclusive 92 62 70

Filtered 54 11 0

Total 389 389 389

Comparison between Panalyzer and PLGS 2.4

Panalyzer reports more precise and clear results than PLGS 2.4

Page 26: MS based workflow for quantitative proteomics · Nesvizhskii, A.I. and Aebersold, R. Mol Cell Proteomics. 2005 4: 1419-1440. Bioinformatics PAnalyzer:A software tool that classifies

Bioinformatics

estimated absolute protein quantification “the average MS signal response for the three most intense tryptic peptides per mole of protein is constant within a coefficient of variation of less than ±10%”

Silva et al. (2006) Mol. Cell. Proteomics 5, 144-156

Quantification in PLGS 2.4:

Absolute quantification of proteins identified with only 1 or 2 peptides is performed

Export data to excel and filter

Page 27: MS based workflow for quantitative proteomics · Nesvizhskii, A.I. and Aebersold, R. Mol Cell Proteomics. 2005 4: 1419-1440. Bioinformatics PAnalyzer:A software tool that classifies

Results

ECOLI_1

257 237

ECOLI_2

251

132

36 32 45

180

134

43 29 35

187

0

50

100

150

200

1 2 3 4 5

number of replicates

nu

mb

er

of

pro

tein

s ECOLI_1 ECOLI_2

0

25

50

75

100

125

0 25 50 75 100

% of quantified proteins

CV

ECOLI_1 ECOLI_2

ECOLI _1

0.5 µg E. coli digest

ADH1_YEAST: 1

ALBU_BOVIN: 1

ENO1_YEAST: 1

PYGM_RABIT: 1

Commercial samples (LC-MSE replicates only)

+

ECOLI _2

0.5 µg E. coli digest

ADH1_YEAST: 1

ALBU_BOVIN: 8

ENO1_YEAST: 2

PYGM_RABIT: 0.5

+

Identified proteins

Quantified proteins: > 2 replicates > 2 peptides

Coefficient of variation of the quantified proteins

Page 28: MS based workflow for quantitative proteomics · Nesvizhskii, A.I. and Aebersold, R. Mol Cell Proteomics. 2005 4: 1419-1440. Bioinformatics PAnalyzer:A software tool that classifies

Results

ECOLI _1

0.5 µg E. coli digest

ADH1_YEAST: 1

ALBU_BOVIN: 1

ENO1_YEAST: 1

PYGM_RABIT: 1

Commercial samples (LC-MSE replicates only)

+

ECOLI _2

0.5 µg E. coli digest

ADH1_YEAST: 1

ALBU_BOVIN: 8

ENO1_YEAST: 2

PYGM_RABIT: 0.5

+

Protein Expected ratio

Ratio based on ion current

Ratio based on absolute quantification

ALBU_BOVIN 8.0 7.61 6.95

ENO1_YEAST 2.0 2.04 1.91

PYGM_RABIT 0.5 0.55 0.54

False positive quantifications: Ion current: 0.05>p>0.95 0.66>ratio>1.5 Absolute quantification: t-test p<0.05 0.66>ratio>1.5

Protein Expected ratio

Ratio based on ion current

Ratio based on absolute quantification

HSLV_ECOLI 1 0.62

RL13_ECOLI 1 2.10

TDCE_ECOLI 1 0.65

0.42% (1 out of 237)

0.84% (2 out 237)

Page 29: MS based workflow for quantitative proteomics · Nesvizhskii, A.I. and Aebersold, R. Mol Cell Proteomics. 2005 4: 1419-1440. Bioinformatics PAnalyzer:A software tool that classifies

Results

SAMPLE_A

1 µg E. coli digest

CYC_HORSE: 1 pmol

MYG_HORSE: 525 fmol

ALDOA_RABIT: 25 fmol

ALBU_BOVIN: 1 fmol

+

SAMPLE_B

1 µg E. coli digest

+

CYC_HORSE: 1.5 pmol

MYG_HORSE: 200 fmol

ALDOA_RABIT: 12.5 fmol

ALBU_BOVIN: 5 fmol

Protein Theoretical ratio (B/A)

Ratio based on absolute

quantification (B/A)

CV Sample A

CV Sample B

p (t-test)

CYC_HORSE 1.50 1.61 1.0 7.2 8.3e-4

MYG_RABIT 0.38 0.41 7.9 9.5 3.0e-4

ALDOA_RABIT 0.50 0.4 4.0 2.9 8.7e-5

ALBU_BOVIN 5.0 2.08 15.9 6.1 7.5e-4

False positive quantifications: t-test p<0.05 0.8>ratio>1.25

Protein Theoretical ratio (B/A)

Ratio based on absolute

quantification (B/A)

CV Sample A

CV Sample B

p (t-test)

FETP_ECOLI 1 0.2 5.5 7.9 2.3e-5

0.65% (1 out of 153)

Page 30: MS based workflow for quantitative proteomics · Nesvizhskii, A.I. and Aebersold, R. Mol Cell Proteomics. 2005 4: 1419-1440. Bioinformatics PAnalyzer:A software tool that classifies

Results

Protein Sample A Sample B Sample C Sample D Sample E

CAH1_HUMAN 1 2 5 10 50

CAH2_HUMAN 1 2 5 10 50

CRP_HUMAN 1 2 5 10 50

S. cerevisiae background

+

Protein E/A E/B E/C E/D

Theor Exp Theor Exp Theor Exp Theor Exp

CAH1 50 42.4 25 31.1 10 10.1 5 4.9

CAH2 50 42.8 25 18.9 10 10.7 5 3.9

CRP 50 54.2 25 17.7 10 6.4 5 7.1

84.9 85.6

108.4

124.5

75.670.8

100.7107.1

64.4

98.2

79

142.8

0

25

50

75

100

125

150

CAH1 CAH2 CRP CAH1 CAH2 CRP CAH1 CAH2 CRP CAH1 CAH2 CRP

A B C D

SAMPLE

Page 31: MS based workflow for quantitative proteomics · Nesvizhskii, A.I. and Aebersold, R. Mol Cell Proteomics. 2005 4: 1419-1440. Bioinformatics PAnalyzer:A software tool that classifies

Results

WT KO

T-lymphocyte purification

protein extraction

Sample cleaning and digestion

LC-MSE 5 replicates

WT 333

288

E2F2-/-

331

41

deregulated Based on

ion current 48

Based on absolute

estimation 44

Identified

quantified

* Identified: proteins identified in > 1 replicate

* Quantified: proteins identified in >2 replicates with > 2 peptides * Deregulated: - ion current: 0.05>p>0.95 ratio>1.25 -absolute: t-test: p<0.05 ratio>1.25

Page 32: MS based workflow for quantitative proteomics · Nesvizhskii, A.I. and Aebersold, R. Mol Cell Proteomics. 2005 4: 1419-1440. Bioinformatics PAnalyzer:A software tool that classifies

Nuclear protein extraction

Sample cleaning and digestion

LC-MSE 5 replicates

Results

Cation exchange chromatography

pH=8.5

NIH 3T3

- Treatment 252

+ Treatment

207

Identified proteins

- treatment + treatment

156

84

deregulated

Based on ion current

96

Based on absolute

estimation 97

quantified

* Identified: proteins identified in > 1 replicate

* Quantified: proteins identified in >2 replicates with > 2 peptides * Deregulated: - ion current: 0.05>p>0.95 ratio>1.25 -absolute: t-test: p<0.05 ratio>1.25

Page 33: MS based workflow for quantitative proteomics · Nesvizhskii, A.I. and Aebersold, R. Mol Cell Proteomics. 2005 4: 1419-1440. Bioinformatics PAnalyzer:A software tool that classifies

Results Bacteria: 3 growing conditions

3 biological replicates

Extraction of membrane proteins

Sample cleaning and digestion

LC-MSE LC-MSE LC-MSE

A B C

Identified A Identified B Quantified Deregulated Unique

111 69 56 7 17

Identified A Identified C Quantified Deregulated Unique

111 94 62 12 13

* Identified: proteins identified in > 1 replicate * Quantified: proteins identified in >1 replicate with > 2 peptides * Deregulated: t-test: p<0.05 ratio>1.25 * Unique: proteins identified 3 times in one condition and 0 in the other

Page 34: MS based workflow for quantitative proteomics · Nesvizhskii, A.I. and Aebersold, R. Mol Cell Proteomics. 2005 4: 1419-1440. Bioinformatics PAnalyzer:A software tool that classifies

MSE is an accurate approach for label-free relative quantification in discovery type of experiments MSE approach can be applied to a wide variety of protein samples

Conclusions

Page 35: MS based workflow for quantitative proteomics · Nesvizhskii, A.I. and Aebersold, R. Mol Cell Proteomics. 2005 4: 1419-1440. Bioinformatics PAnalyzer:A software tool that classifies

Proteomics Core Facility - SGIKER Kerman Aloria Biochemistry and Molecular Biology Jabier Beaskoetxea Miren Josu Omaetxebarria Jesus Mari Arizmendi Genetics, Physical Anthropology and Animal Physiology Asier Fullaondo Electronics and Telecommunications Gorka Prieto

www.ehu.es/sgiker www.proteored.org www.ehu.es