

Abstract

The calcium-dependent protein kinases (CPKs) are exclusive to plants and protozoans. CPKs are activated upon Ca2+ binding and are involved in a broad spectrum of Ca2+-induced responses. In Arabidopsis thaliana, the CPK family comprises 34 members, although client/substrate specificity is largely unknown. In this study, a quantitative, high-throughput, mass spectrometry-based in vitro phosphorylation strategy termed the Kinase Client (KiC) assay was employed to identify potential CPK clients. Using this approach, a synthetic peptide library comprising 2,095 peptides and 2,661 in vivo mapped phosphorylation sites from Arabidopsis was screened for potential clients using CPK1, CPK3, CPK6, CPK17, CPK24 and CPK32, which correspond to six different subfamilies (A, E, B, D, J and K, respectively). A total of 180 non-redundant phosphorylated peptides were identified. Six peptides were phosphorylated by all CPK isoforms, and five peptides were phosphorylated by five isoforms. Identification of shared clients indicates overlapping preferences among some related Arabidopsis CPK family kinases. Among the CPK isoforms, CPK17 showed the highest number of phosphorylated peptides (151), of which 105 were isoform-specific. Subcellular localization of clients showed good correlation with that of the respective CPK. Identification of a significant number of validated CPK clients, including nitrate reductase 2, calmodulin-binding transcription activator 1, plasma membrane protein, CRK6, CRK8, calcium-binding EF hand family protein, AtRbohD, and syntaxin, indicates that the CPKs are functionally active and that the MS-based KiC assay is a feasible approach to identify novel clients.

Materials and Methods

Figure 1. Properties of 2095 synthetic peptides used for KiC assay. A, Group distribution of synthetic peptides based on hydrophobicity. B, Distribution of five different hydrophobic groups of peptides in each peptide pool. C and D, Subcellular localization and percentage of in vivo phosphosites, respectively.

Results

Client specificity of Arabidopsis calcium-dependent protein kinases

Nagib Ahsan, R. Shyama Prasad Rao, Rashaun S. Wilson, Kirby N. Swatek and Jay J. Thelen*

Department of Biochemistry, Interdisciplinary Plant Group, University of Missouri-Columbia, MO 65211, USA *Corresponding author: [email protected]

Figure 2. Diagrammatic representation of the KiC assay. Each pool was phosphorylated in vitro by individual CPKs and subsequently analyzed by an LTQ Orbitrap XL ETD mass spectrometer (Thermo Fisher, CA). For identification and phosphosite localization, raw MS files were searched using SEQUEST (Proteome Discoverer, v. 1.0.3, Thermo Fisher).

Figure 3. Relative phosphorylation of syntide-2 peptide by CPKs demonstrates that the recombinant kinases are comparably active. A, Phylogenetic relationship of CPKs used in this study. B, Relative phosphorylation of syntide-2 peptide standard by CPKs. C, MS/MS spectrum of the syntide-2 peptide showing Thr phosphorylation by CPK17. A lowercase letter in the sequence and a star in the MS/MS spectrum indicate the trans-phosphorylation site and detection of the phosphorylated residue, respectively.

[Figure 2 panel labels (A-D): KiC assay; synthetic peptide library (208-211 peptides/pool); recombinant purified CPKs; kinase buffer containing ATP, Mg, DTT, etc.; MS analysis (LTQ XL Orbitrap); database search; pRS score; identification of phosphopeptides.]

A total of 2095 synthetic peptides were designed based on the Arabidopsis in vivo phosphoproteomic dataset available from P3DB. Because the peptides span a wide range of hydrophobicity (Figure 1A), ten pools were generated with an equal distribution of hydrophobic peptides (Figure 1B). Each pool consists of 208-211 peptides at equimolar concentration.
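The pooling step can be illustrated with a minimal sketch. It assumes that each peptide already has a numeric hydrophobicity score and that the five hydrophobicity groups are equal-sized rank bins. Both are assumptions, since the poster names neither the scale nor the group boundaries. The function `make_pools` and its input format are hypothetical.

```python
from itertools import cycle


def make_pools(scores, n_pools=10, n_groups=5):
    """Deal peptides into pools so every pool samples each hydrophobicity group.

    scores: dict mapping peptide ID -> hydrophobicity score (hypothetical input).
    """
    ranked = sorted(scores, key=scores.get)
    # Assumption: groups are contiguous, equal-sized bins of the ranked list.
    group_size = -(-len(ranked) // n_groups)  # ceiling division
    groups = [ranked[i:i + group_size] for i in range(0, len(ranked), group_size)]

    pools = [[] for _ in range(n_pools)]
    slot = cycle(range(n_pools))  # keeps dealing position across groups
    for group in groups:
        for peptide in group:
            pools[next(slot)].append(peptide)
    return pools


# Toy library of 2095 peptides with made-up scores.
toy_scores = {f"pep{i}": i * 0.01 for i in range(2095)}
toy_pools = make_pools(toy_scores)
pool_sizes = sorted(len(pool) for pool in toy_pools)
```

With 2095 peptides this round-robin scheme gives pools of 209 or 210 peptides, which is close to, but not the same as, the 208-211 reported here, so the actual pooling procedure evidently differed in detail.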

The kinase assay was conducted using kinase buffer, recombinant CPK, and a pool of synthetic peptides (Figure 2A). Samples were analyzed by mass spectrometry (Figure 2B) as described previously (Ahsan et al., 2013; J Proteome Res. 12: 937-948). Raw MS files were searched against a decoy database consisting of the random complement of the sequences comprising the peptide library, using SEQUEST (Proteome Discoverer, v. 1.0.3, Thermo Fisher) (Figure 2C). Phosphorylation sites were localized using phosphoRS. For final validation, each spectrum was inspected manually and accepted only when the phosphopeptide had the highest pRS site probability and pRS score (Figure 2D).

Analysis of CPK activity and phosphorylation preference on peptide standard

Identification of CPK-specific clients

A total of 36, 25, 20, 151, 33 and 27 peptides were phosphorylated by CPK1, CPK3, CPK6, CPK17, CPK24 and CPK32, respectively. Many peptides were phosphorylated by multiple CPKs (Figure 4), so the total number of non-redundant phosphorylated peptides identified was 180. As expected, Ser was the preferred residue (81%), followed by Thr (15%) and Tyr (4%) (Figure 5).
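The per-kinase counts above sum to 292, whereas only 180 distinct peptides were identified, because a peptide phosphorylated by several CPKs is counted once per kinase. A minimal sketch of that bookkeeping follows. The peptide IDs are toy values and the function `summarize_clients` is hypothetical, not part of the KiC workflow.

```python
from collections import Counter


def summarize_clients(hits):
    """hits: dict mapping kinase name -> set of phosphorylated peptide IDs."""
    # Number of kinases that phosphorylated each peptide.
    kinases_per_peptide = Counter(pep for peps in hits.values() for pep in peps)
    non_redundant = len(kinases_per_peptide)
    shared_by_all = sorted(
        pep for pep, n in kinases_per_peptide.items() if n == len(hits)
    )
    isoform_specific = {
        kinase: sorted(pep for pep in peps if kinases_per_peptide[pep] == 1)
        for kinase, peps in hits.items()
    }
    return non_redundant, shared_by_all, isoform_specific


# Toy data only; these are not peptides from the library.
toy_hits = {
    "CPK1": {"pepA", "pepB"},
    "CPK3": {"pepA"},
    "CPK17": {"pepA", "pepC", "pepD"},
}
non_redundant, shared_by_all, isoform_specific = summarize_clients(toy_hits)
```

In the toy data the per-kinase counts sum to six, but there are four non-redundant peptides, one of which is shared by all three kinases.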

Figure 5. CPK phosphorylation sites identified by KiC assay. The inset pie chart represents the total number of Ser, Thr and Tyr sites identified.


No.  Motif*         Motif score¹  Foreground %¹  Background %¹  Fold increase¹  Foreground %²  Background %²
1    .L.R..S......  23.54         24.84          1.43           17.42           24.84          1.43
2    ...R..SF.....  20.60         12.17          0.53           22.80           13.73          0.71
3    ...R..S......  16.00         35.64          6.18           5.77            57.52          8.01
4    .L....S......  7.72          27.69          5.79           4.78            36.60          6.76

* Motifs are significant at p << 0.01 after Bonferroni correction, Binomial probability
¹ Based on motif-x (http://motif-x.med.harvard.edu/, Schwartz and Gygi, 2005, Nature Biotech 23:1391–1398)
² Based on actual frequencies (Rao and Møller, 2012, Biochim Biophys Acta 1824:405–412)


Figure 6. Consensus sequence and motif analysis of peptides phosphorylated by CPK17. A and B represent the consensus sequences. C-F represent the CPK17 recognition motifs (p << 0.01), which account for ~69% of Ser sites. G represents the consensus sequence of the remaining ~31% of Ser sites, which did not produce any motif.

Table 1. CPK17 recognition motifs for Ser-site phosphorylation (153 sites), based on the 5921 Ser sites in the 2095 synthetic peptides.

Motif analysis of CPK17 clients

Among the CPK17 phosphopeptides, a total of four recognition motifs (p << 0.01 after Bonferroni correction, Binomial probability) were identified (Figure 6, Table 1). LxRxxS is the most significant motif and accounted for ~25% of Ser sites, compared to a background of ~1.4% (Figure 6C, Table 1). The other motifs are RxxSF, RxxS, and LxxxxS. Based on the actual frequency, ~58% of Ser clients contain an R residue at the -3 position (Figure 6). The four motifs together accounted for 69.3% of the 153 Ser clients of CPK17, while the remaining 30.7% of sites did not produce significant motifs (Figure 6G).
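The enrichment figures reported for each motif (foreground %, background %, fold increase, binomial probability) can be sketched for a single motif as follows. This is a simplified illustration, not the motif-x algorithm, which builds motifs iteratively. It assumes 13-residue windows centered on the phosphosite, matching the motif strings shown for Table 1. The toy windows and the function names are hypothetical.

```python
import re
from math import comb


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def motif_enrichment(motif, foreground, background):
    """Score one motif (e.g. '.L.R..S......', '.' = any residue).

    foreground: windows around phosphorylated Ser sites.
    background: windows around all Ser sites in the library.
    """
    fg_hits = sum(1 for w in foreground if re.fullmatch(motif, w))
    bg_hits = sum(1 for w in background if re.fullmatch(motif, w))
    fg_frac = fg_hits / len(foreground)
    bg_frac = bg_hits / len(background)
    fold = fg_frac / bg_frac if bg_frac else float("inf")
    # Chance of seeing at least fg_hits matches at the background rate.
    p_value = binomial_tail(fg_hits, len(foreground), bg_frac)
    return fg_frac, bg_frac, fold, p_value


# Toy 13-mers with Ser at the center (index 6); not real library peptides.
toy_foreground = ["ALKRAASAAAAAA", "AAARAASAAAAAA", "ALAAAASAAAAAA", "ALGRAASFAAAAA"]
toy_background = toy_foreground + ["AAAAAASAAAAAA"] * 6
fg_frac, bg_frac, fold, p_value = motif_enrichment(
    ".L.R..S......", toy_foreground, toy_background
)
```

A Bonferroni correction, as used for Table 1, would multiply each such p-value by the number of motifs tested. That step is left out of the sketch because the number of tests is not stated here.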

Summary

Among the CPKs analyzed, CPK17 had the largest number of specific clients, all of which contained at least one of four motifs (Figure 6C-F). Co-expression analysis of these candidates may further refine client authenticity. Curran et al. (Front Plant Sci., 2011, 2:36) demonstrated that CPK34 (subfamily D) phosphorylated a greater number of substrates than other CPKs. Notably, 25% of the CPK34 clients overlapped with the CPK17 clients. Additionally, the RxxS motif is common to both CPK17 and CPK34 clients. CPK17 and CPK34 are both D subfamily CPKs, suggesting that members of this subfamily are more promiscuous than other CPKs.

Acknowledgement: This work is supported by the NSF Plant Genome Research Program (DBI-0604439).

Figure 4. Topological relationship of the identified cognate clients for CPKs. Data were obtained by KiC assay screening of a library consisting of 2095 synthetic peptides as substrates for recombinant purified CPKs. The cartograph was assembled with Cytoscape 3.0.1 (http://www.cytoscape.org). Node shape distinguishes kinase-specific from overlapping clients. Node color indicates subcellular localization. Edge thickness represents phosphorylation stoichiometry as relative phosphorylation of each site. Edge color denotes the co-expression relationship between a pair of genes/proteins, based on the Pearson correlation of log2-transformed normalized expression values from GSE3011 for A. thaliana (http://www.ncbi.nlm.nih.gov/geo).
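The co-expression measure used for edge color can be sketched as a Pearson correlation computed on log2-transformed expression values. The expression vectors below are toy numbers, not values from GSE3011, and the function `pearson_log2` is a hypothetical helper. It assumes strictly positive, already normalized inputs with some variation across samples.

```python
from math import log2, sqrt


def pearson_log2(x, y):
    """Pearson correlation of two expression profiles after log2 transform."""
    lx = [log2(v) for v in x]
    ly = [log2(v) for v in y]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in lx))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in ly))
    return cov / (sd_x * sd_y)


# Toy profiles across four samples (assumed normalized, all > 0).
kinase_expr = [1.0, 2.0, 4.0, 8.0]
client_expr = [2.0, 4.0, 8.0, 16.0]
inverse_expr = [16.0, 8.0, 4.0, 2.0]
r_same = pearson_log2(kinase_expr, client_expr)
r_opposite = pearson_log2(kinase_expr, inverse_expr)
```

Values near +1 indicate a kinase and client whose transcripts rise and fall together across samples, and values near -1 indicate opposite trends.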