comprehensive, quantitative, intact proteoform

1
Tao Liu 1 , Paul Piehowski 1 , Samuel Payne 1 , Sangtae Kim 1 , Jungkap Park 1 , Christopher Wilkins 1 , Carrie Nicora 1 , Yufeng Shen 1 , Rui Zhao 1 , Anil Shukla 1 , Ronald Moore 1 , Sherri Davies 2 , Shunqiang Li 2 , Reid Townsend 2 , Matthew Ellis 3 , Emily Boja 4 , Henry Rodriguez 4 , Karin Rodland 1 , and Richard D. Smith 1 Introduction Overview Methods Results Acknowledgements This project was funded by NIH grant U24-CA-160019 from the National Cancer Institute Clinical Proteomic Tumor Analysis Consortium (CPTAC). Samples were analyzed using capabilities developed under this grant and NIGMS Biomedical Technology Research Resource P41GM103493, and performed in the Environmental Molecular Sciences Laboratory, a U.S. Department of Energy (DOE) Office of Biological and Environmental Research (OBER) national scientific user facility on the Pacific Northwest National Laboratory (PNNL) campus. PNNL is a multiprogram national laboratory operated by Battelle for the DOE under contract DE-AC05-76RL01830. References 1. Li et al., Cell Rep. 2013 Sep 26;4(6):1116-30 2. Ntai et al., Mol Cell Proteomics. 2016 Jan;15(1):45-56 Conclusions Comprehensive and reproducible detection of proteoforms CONTACT: Tao Liu, Ph.D. Biological Sciences Division Pacific Northwest National Laboratory E-mail: [email protected] Many biologically active proteoforms cannot be identified or distinguished by conventional bottom- up approaches since the information is lost with trypsin digestion. Integrating data from bottom-up and top-down proteomics measurements with available genomic data enables characterization of protein variants, including coding polymorphisms, processing products, and post-translational modifications. There is an urgent need to be able to make comprehensive and reproducible measurements of proteoforms in biological or clinical samples. The available patient-derived xenografts (PDX) for two breast cancer subtypes (luminal B and basal) 1 provide an ideal system for this evaluation. Our top-down proteomics pipeline provided significant improvement in proteome coverage, precision of the quantification, and the ability to identify statistically significant changes on proteoform abundances, in comparison to another recent report that analyzed proteoforms in the same PDX samples 2 . Although still relatively limited in coverage compared to the conventional bottom-up approach, top-down analysis provides a complementary view of the proteome (e.g., truncated and/or post-translationally modified proteoforms, variants, additional protein identifications). The top-down, bottom-up, and peptidomics analysis of the same PDX samples showed the same degree of changes in protein/peptide abundances between the two subtypes. The top-down measurements provided unique information on up-regulation of many proteoforms of histones and 60S ribosomal proteins as well as key pathway level changes (e.g., glycolysis) in the luminal subtype. This holds great promise in providing new insights into cancer biology. Fractionation and new technologies (e.g., SLIM IMS) will greatly improve proteome coverage. The widely used bottom-up proteomics approaches in many cases cannot identify or distinguish many different processed and/or modified versions of the gene products whose coverage is important for understanding the development and treatment of cancers. Bottom-up proteomics approaches are inherently limited by the “peptide-to-protein” inference problem, affecting both accuracy and precision of the quantification. We have recently developed a significantly improved top-down proteomics pipeline that allows for comprehensive quantitative intact proteoform measurements. Application of this pipeline to the comparative analysis of PDX models of basal and luminal B human breast cancer provided much improved performance in the ability to obtain both comprehensive proteome coverage, and precise intact proteoform quantification, resulting in novel biological insights otherwise unseen in bottom-up analysis approaches. www.omics.pnl.gov Career Opportunities: For openings in the Integrative Omics Group at PNNL please visit http:// omics.pnl.gov/careers Washington human-in-mouse (WHIM) PDX samples representing luminal B (P6 and P33) and basal (P5 and P32) breast cancer subtypes Top-down 949 up and 929 down out of 3123 proteoforms at 0.05 FDR Bottom-up 2792 up and 2779 down out of 7412 proteins at 0.05 FDR Peptidomics 215 up and 235 down out of 815 peptides at 0.05 FDR 1) clustering isotopomer envelopes across adjacent time and charge state; 2) the initial cluster is refined to accurately determine its elution-time span and range of charge states; 3) calculates the likelihood that the final cluster is an LC-MS feature. K R A Q K T T R A M 1 Methyl 2 Methyls 1 Methyl, 1 Oxidation 1 Oxidation 2 Oxidations No modification Protein Extraction 8 M Urea; 1 mM PMSF Phosphatase inhibitor cocktail Mechanical homogenization Incubate 30 min at 37 °C 100 kDa MWCO Retain filtrate Buffer Exchange LC-MS/MS Analysis 10 kDa MWCO; Buffer A Retain retentate Acidify lysate: remove proteins incompatible with RPLC Enrich LMW Proteome LC: 75 um ID, 80 cm, C2 column, 3-h gradient MS: 240K MS; Top 3 CID MS/MS at 60K (Orbitrap Velos Elite) Log2-scaled mean abundance) 14 16 18 20 22 24 26 28 30 32 CV (%) 0 50 100 150 0 10 20 30 40 50 60 70 80 90 100 0 10 20 30 40 50 60 70 80 90 100 CV (%) Cumulative percentage (%) * A total of 7392 differentially expressed, out of 14412 clustered features Log-2 Fold Change -10 -5 0 5 10 15 - Log-2 false discovery rate 0 5 10 15 20 25 Log-2 Fold Change -10 -5 0 5 10 15 - Log-2 false discovery rate 0 5 10 15 20 25 30 A total of 1780 differentially expressed, out of 3123 proteoforms Adjusted p-value < 0.01 fold change > 2 Upregulated proteins Downregulated proteins REACTOME_INFLUENZA_VIRAL_RNA_TRANSCRIPTION_AND_REPLICATION HSIAO_HOUSEKEEPING_GENES REACTOME_PEPTIDE_CHAIN_ELONGATION REACTOME_DIABETES_PATHWAYS REACTOME_VIRAL_MRNA_TRANSLATION KEGG_OXIDATIVE_PHOSPHORYLATION REACTOME_TRANSLATION KEGG_PARKINSONS_DISEASE REACTOME_REGULATION_OF_GENE_EXPRESSION_IN_BETA_CELLS REACTOME_FORMATION_OF_ATP_BY_CHEMIOSMOTIC_COUPLING KEGG_RIBOSOME PROTON_TRANSPORTING_TWO_SECTOR_ATPASE_COMPLEX REACTOME_GTP_HYDROLYSIS_AND_JOINING_OF_THE_60S_RIBOSOMAL_SUBUNIT BIOCARTA_PROTEASOME_PATHWAY REACTOME_METABOLISM_OF_PROTEINS REACTOME_GLUCOSE_REGULATION_OF_INSULIN_SECRETION REACTOME_FORMATION_OF_A_POOL_OF_FREE_40S_SUBUNITS REACTOME_REGULATION_OF_INSULIN_SECRETION REACTOME_GENE_EXPRESSION REACTOME_INTEGRATION_OF_ENERGY_METABOLISM REACTOME_REGULATION_OF_BETA_CELL_DEVELOPMENT MITOCHONDRIAL_INNER_MEMBRANE REACTOME_INFLUENZA_LIFE_CYCLE MITOCHONDRIAL_ENVELOPE STRUCTURAL_CONSTITUENT_OF_RIBOSOME REACTOME_INSULIN_SYNTHESIS_AND_SECRETION STRUCTURAL_MOLECULE_ACTIVITY REACTOME_PACKAGING_OF_TELOMERE_ENDS REACTOME_RNA_POLYMERASE_I_PROMOTER_OPENING REACTOME_RNA_POLYMERASE_I_PROMOTER_CLEARANCE REACTOME_TELOMERE_MAINTENANCE REACTOME_TRANSLATION_INITIATION_COMPLEX_FORMATION REACTOME_PROCESSING_OF_CAPPED_INTRON_CONTAINING_PRE_MRNA REACTOME_RNA_POLYMERASE_I_III_AND_MITOCHONDRIAL_TRANSCRIPTION REACTOME_TRANSCRIPTION KEGG_SPLICEOSOME KEGG_OXIDATIVE_PHOSPHORYLATION REACTOME_INTEGRATION_OF_ENERGY_METABOLISM REACTOME_PREFOLDIN_MEDIATED_TRANSFER_OF_SUBSTRATE_TO_CCT_TRIC REACTOME_ELECTRON_TRANSPORT_CHAIN REACTOME_MRNA_SPLICING_MINOR_PATHWAY NUCLEOBASENUCLEOSIDENUCLEOTIDE_AND_NUCLEIC_ACID_METABOLIC_PROCESS Fisher exact test adjusted p value <0.05 Many proteoforms of histones and 60S ribosomal proteins showed upregulation in the luminal subtype Green: KEGG default Blue: up in luminal Red: up in basal Grey: mixed Blank: not in human Warburg Effect Bottom-up 7412 Top-down 867 Peptidomics 429 123 370 177 31 Precise quantification of the proteoforms Separation of the two BrCa subtypes Comparison of top-down, bottom-up, and peptidomics results Top-down vs. bottom-up: Enriched pathways Example: Affected pathways LC-MS feature finding in ProMex Sample preparation Proteoform ID by sequence graphs Comprehensive, Quantitative, Intact Proteoform Measurements of Patient-Derived Breast Tumor Xenografts Using an Improved Top-Down Proteomics Pipeline 1 Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA; 2 Department of Medicine, Washington University, St. Louis, MO; 3 Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, TX; 4 National Cancer Institute, National Institutes of Health, Bethesda, MD P32: basal (WHIM2) P33: luminal B (WHIM16) Over 90% of all detected clustered features have CV <30% in abundance

Upload: others

Post on 22-Apr-2022

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Comprehensive, Quantitative, Intact Proteoform

Tao Liu1, Paul Piehowski1, Samuel Payne1, Sangtae Kim1, Jungkap Park1, Christopher Wilkins1, Carrie Nicora1, Yufeng Shen1, Rui Zhao1, Anil Shukla1, Ronald Moore1, Sherri Davies2, Shunqiang Li2, Reid Townsend2, Matthew Ellis3, Emily Boja4, Henry Rodriguez4, Karin Rodland1, and Richard D. Smith1

Introduction

Overview Methods Results

AcknowledgementsThis project was funded by NIH grant U24-CA-160019 from the National Cancer Institute Clinical Proteomic Tumor Analysis Consortium (CPTAC). Samples were analyzed using capabilities developed under this grant and NIGMS Biomedical Technology Research Resource P41GM103493, and performed in the Environmental Molecular Sciences Laboratory, a U.S. Department of Energy (DOE) Office of Biological and Environmental Research (OBER) national scientific user facility on the Pacific Northwest National Laboratory (PNNL) campus. PNNL is a multiprogram national laboratory operated by Battelle for the DOE under contract DE-AC05-76RL01830.

References1. Li et al., Cell Rep. 2013 Sep 26;4(6):1116-30

2. Ntai et al., Mol Cell Proteomics. 2016 Jan;15(1):45-56

ConclusionsComprehensive and reproducible detection of proteoforms

CONTACT: Tao Liu, Ph.D.Biological Sciences DivisionPacific Northwest National LaboratoryE-mail: [email protected]

• Many biologically active proteoforms cannot be identified or distinguished by conventional bottom-up approaches since the information is lost with trypsin digestion.

• Integrating data from bottom-up and top-down proteomics measurements with available genomic data enables characterization of protein variants, including coding polymorphisms, processing products, and post-translational modifications.

• There is an urgent need to be able to make comprehensive and reproducible measurements of proteoforms in biological or clinical samples.

• The available patient-derived xenografts (PDX) for two breast cancer subtypes (luminal B and basal)1

provide an ideal system for this evaluation.

• Our top-down proteomics pipeline provided significant improvement in proteome coverage, precision of the quantification, and the ability to identify statistically significant changes on proteoform abundances, in comparison to another recent report that analyzed proteoforms in the same PDX samples2.

• Although still relatively limited in coverage compared to the conventional bottom-up approach, top-down analysis provides a complementary view of the proteome (e.g., truncated and/or post-translationally modified proteoforms, variants, additional protein identifications).

• The top-down, bottom-up, and peptidomics analysis of the same PDX samples showed the same degree of changes in protein/peptide abundances between the two subtypes.

• The top-down measurements provided unique information on up-regulation of many proteoforms of histones and 60S ribosomal proteins as well as key pathway level changes (e.g., glycolysis) in the luminal subtype. This holds great promise in providing new insights into cancer biology.

• Fractionation and new technologies (e.g., SLIM IMS) will greatly improve proteome coverage.

• The widely used bottom-up proteomics approaches in many cases cannot identify or distinguish many different processed and/or modified versions of the gene products whose coverage is important for understanding the development and treatment of cancers.

• Bottom-up proteomics approaches are inherently limited by the “peptide-to-protein” inference problem, affecting both accuracy and precision of the quantification.

• We have recently developed a significantly improved top-down proteomics pipeline that allows for comprehensive quantitative intact proteoform measurements.

• Application of this pipeline to the comparative analysis of PDX models of basal and luminal B human breast cancer provided much improved performance in the ability to obtain both comprehensive proteome coverage, and precise intact proteoform quantification, resulting in novel biological insights otherwise unseen in bottom-up analysis approaches.

www.omics.pnl.govCareer Opportunities: For openings in the Integrative Omics Group at PNNL please visit http://omics.pnl.gov/careers

Washington human-in-mouse (WHIM) PDX samples representing luminal B (P6 and P33) and basal (P5 and P32) breast cancer subtypes

Top-down949 up and 929 down out of

3123 proteoforms at 0.05 FDR

Bottom-up2792 up and 2779 down out of

7412 proteins at 0.05 FDR

Peptidomics215 up and 235 down out of

815 peptides at 0.05 FDR1) clustering isotopomer envelopes across adjacent time and charge state; 2) the initial cluster is refined to accurately determine its elution-time span and range of charge states; 3) calculates the likelihood that the final cluster is an LC-MS feature.

K R A Q K TT R A M

1 Methyl

2 Methyls

1 Methyl, 1 Oxidation

1 Oxidation

2 Oxidations

No modification

Protein Extraction

8 M Urea; 1 mM PMSFPhosphatase inhibitor cocktail

Mechanical homogenizationIncubate 30 min at 37 °C

100 kDa MWCORetain filtrate

Buffer ExchangeLC-MS/MS Analysis

10 kDa MWCO; Buffer A Retain retentate

Acidify lysate: remove proteins incompatible with RPLC

Enrich LMW Proteome

LC: 75 um ID, 80 cm, C2 column, 3-h gradient MS: 240K MS; Top 3 CID MS/MS at 60K (Orbitrap Velos Elite)

Log2-scaled mean abundance)

14 16 18 20 22 24 26 28 30 32

CV

(%

)

0

50

100

150

0 10 20 30 40 50 60 70 80 90 1000

10

20

30

40

50

60

70

80

90

100

CV (%)

Cum

ulat

ive

perc

enta

ge (%

)

*

A total of 7392 differentially expressed, out of 14412 clustered features

Log-2 Fold Change

-10 -5 0 5 10 15

- Log

-2 fa

lse

disc

over

y ra

te

0

5

10

15

20

25

Log-2 Fold Change

-10 -5 0 5 10 15

- Log

-2 fa

lse

disc

over

y ra

te

0

5

10

15

20

25

30

A total of 1780 differentially expressed, out of 3123 proteoforms

Adjusted p-value < 0.01 fold change > 2

Upregulated proteins Downregulated proteinsREACTOME_INFLUENZA_VIRAL_RNA_TRANSCRIPTION_AND_REPLICATION HSIAO_HOUSEKEEPING_GENESREACTOME_PEPTIDE_CHAIN_ELONGATION REACTOME_DIABETES_PATHWAYSREACTOME_VIRAL_MRNA_TRANSLATION KEGG_OXIDATIVE_PHOSPHORYLATIONREACTOME_TRANSLATION KEGG_PARKINSONS_DISEASEREACTOME_REGULATION_OF_GENE_EXPRESSION_IN_BETA_CELLS REACTOME_FORMATION_OF_ATP_BY_CHEMIOSMOTIC_COUPLINGKEGG_RIBOSOME PROTON_TRANSPORTING_TWO_SECTOR_ATPASE_COMPLEXREACTOME_GTP_HYDROLYSIS_AND_JOINING_OF_THE_60S_RIBOSOMAL_SUBUNIT BIOCARTA_PROTEASOME_PATHWAYREACTOME_METABOLISM_OF_PROTEINS REACTOME_GLUCOSE_REGULATION_OF_INSULIN_SECRETIONREACTOME_FORMATION_OF_A_POOL_OF_FREE_40S_SUBUNITS REACTOME_REGULATION_OF_INSULIN_SECRETIONREACTOME_GENE_EXPRESSION REACTOME_INTEGRATION_OF_ENERGY_METABOLISMREACTOME_REGULATION_OF_BETA_CELL_DEVELOPMENT MITOCHONDRIAL_INNER_MEMBRANEREACTOME_INFLUENZA_LIFE_CYCLE MITOCHONDRIAL_ENVELOPESTRUCTURAL_CONSTITUENT_OF_RIBOSOMEREACTOME_INSULIN_SYNTHESIS_AND_SECRETIONSTRUCTURAL_MOLECULE_ACTIVITYREACTOME_PACKAGING_OF_TELOMERE_ENDSREACTOME_RNA_POLYMERASE_I_PROMOTER_OPENINGREACTOME_RNA_POLYMERASE_I_PROMOTER_CLEARANCEREACTOME_TELOMERE_MAINTENANCEREACTOME_TRANSLATION_INITIATION_COMPLEX_FORMATIONREACTOME_PROCESSING_OF_CAPPED_INTRON_CONTAINING_PRE_MRNAREACTOME_RNA_POLYMERASE_I_III_AND_MITOCHONDRIAL_TRANSCRIPTIONREACTOME_TRANSCRIPTIONKEGG_SPLICEOSOMEKEGG_OXIDATIVE_PHOSPHORYLATIONREACTOME_INTEGRATION_OF_ENERGY_METABOLISMREACTOME_PREFOLDIN_MEDIATED_TRANSFER_OF_SUBSTRATE_TO_CCT_TRICREACTOME_ELECTRON_TRANSPORT_CHAINREACTOME_MRNA_SPLICING_MINOR_PATHWAYNUCLEOBASENUCLEOSIDENUCLEOTIDE_AND_NUCLEIC_ACID_METABOLIC_PROCESS

Fisher exact testadjusted p value <0.05

Many proteoforms of histones and 60S ribosomal proteins showed upregulation in the luminal subtype

Green: KEGG defaultBlue: up in luminalRed: up in basalGrey: mixedBlank: not in human

Warburg Effect

Bottom-up7412

Top-down867

Peptidomics429

123

370

17731

Precise quantification of the proteoforms

Separation of the two BrCa subtypes

Comparison of top-down, bottom-up, and peptidomics results

Top-down vs. bottom-up: Enriched pathways

Example: Affected pathwaysLC-MS feature finding in ProMex

Sample preparation

Proteoform ID by sequence graphs

Comprehensive, Quantitative, Intact Proteoform Measurements of Patient-Derived Breast Tumor Xenografts Using an Improved Top-Down Proteomics Pipeline

1Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA; 2Department of Medicine, Washington University, St. Louis, MO; 3Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, TX; 4National Cancer Institute, National Institutes of Health, Bethesda, MD

P32: basal (WHIM2) P33: luminal B (WHIM16) Over 90% of all detected clustered features have CV <30% in abundance