www.sciencesignaling.org/cgi/content/full/4/164/rs3/DC1
Supplementary Materials for
System-Wide Temporal Characterization of the Proteome and Phosphoproteome of Human Embryonic Stem Cell Differentiation
Kristoffer T. G. Rigbolt, Tatyana A. Prokhorova, Vyacheslav Akimov, Jeanette
Henningsen, Pia T. Johansen, Irina Kratchmarova, Moustapha Kassem, Matthias Mann, Jesper V. Olsen,* Blagoy Blagoev*
*To whom correspondence should be addressed. E-mail: [email protected] (B.B.); [email protected]
(J.V.O.)
Published 15 March 2011, Sci. Signal. 4, rs3 (2011) DOI: 10.1126/scisignal.2001570
This PDF file includes:
Materials and Methods
Fig. S1. Cells treated with NCM or PMA differentiate in a nondirected manner.
Fig. S2. General properties of the data sets.
Fig. S3. Frequency of protein and phosphorylation site identifications with selected GO Molecular Function and Cellular Component terms.
Fig. S4. Dynamic changes in phosphorylation organized by phosphorylation motif.
Fig. S5. Double-phosphorylation motifs.
Fig. S6. The S-score algorithm.
Fig. S7. Quantitative proteomics approach for identification of DNMT3A- and DNMT3B-interacting proteins.
Table S8. Primers used for real-time QPCR.
References
Other Supplementary Material for this manuscript includes the following: (available at www.sciencesignaling.org/cgi/content/full/4/164/rs3/DC1)
Table S1 (Microsoft Excel format). Protein identifications.
Table S2 (Microsoft Excel format). Phosphorylation site identifications.
Table S3 (Microsoft Excel format). Protein and phosphorylation site identifications for 30-min and 6-hour biological replicas.
Table S4 (Microsoft Excel format). Identified transcription factors and transcription factor phosphorylation sites.
Table S5 (Microsoft Excel format). Linear phosphorylation motifs.
Table S6 (Microsoft Excel format). Identified kinases and kinase phosphorylation sites.
Table S7 (Microsoft Excel format). Protein identifications from DNMT3A and DNMT3B immunoprecipitations.
Supplementary Materials and Methods
Fractionation and Enrichment of Phosphorylated Peptides
Peptide extracts from in-solution digests were desalted and concentrated by solid phase extraction (SPE)
using SepPak Classic C18 cartridges (Waters). Each SPE eluate was diluted to 2 ml in 30% ACN / 0.1%
TFA and loaded on a 1 ml Resource S column (GE Healthcare) for strong cation exchange
chromatography (SCX). SCX was performed as described in (1) on an Äkta purifier (GE Healthcare) with a
flow rate of 1 ml/min on a 30-min linear gradient from 5 mM to 100 mM KCl in 30% ACN / 5 mM
KH2PO4 / 0.1% TFA during which 15 fractions of 2 ml were collected. Flow-through fractions containing
peptides not retained on the column during sample loading were collected, as was the solvent flow during
the pre-gradient column equilibration. Early fractions expected to have the highest abundance of
phosphopeptides were processed further individually, whereas later fractions were combined on the basis
of the 215 nm and 280 nm absorption chromatogram.
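As a quick check on the gradient arithmetic above: at 1 ml/min, fifteen 2 ml fractions span exactly the 30-min gradient, so each fraction corresponds to a 2-min window. The sketch below estimates the nominal KCl range covered by each fraction. It assumes that fractions are collected back to back from the start of the gradient and that system delay volume can be ignored; neither is stated in the protocol, and the function and parameter names are illustrative only.

```python
def scx_fraction_windows(start_mm=5.0, end_mm=100.0, gradient_min=30.0,
                         flow_ml_per_min=1.0, fraction_ml=2.0):
    """Return nominal (KCl start, KCl end) in mM for each collected fraction.

    Assumes contiguous fraction collection from the start of the gradient
    and no delay volume; both are simplifications, not protocol details.
    """
    minutes_per_fraction = fraction_ml / flow_ml_per_min
    n_fractions = int(round(gradient_min / minutes_per_fraction))
    slope = (end_mm - start_mm) / gradient_min  # mM KCl per minute

    windows = []
    for k in range(n_fractions):
        t0 = k * minutes_per_fraction
        t1 = t0 + minutes_per_fraction
        windows.append((start_mm + slope * t0, start_mm + slope * t1))
    return windows


if __name__ == "__main__":
    for number, (low, high) in enumerate(scx_fraction_windows(), start=1):
        print(f"fraction {number:2d}: {low:6.1f} to {high:6.1f} mM KCl")
```

Under these assumptions each fraction covers a window of roughly 6.3 mM KCl.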
To enrich for phosphorylated peptides, titanium dioxide chromatography was performed essentially as
described in (2). Briefly, between 1 and 5 mg of 10 µm Titansphere material (GL Sciences), taken from a slurry in 80% ACN / 1%
TFA containing 50 mg/ml 2,5-dihydroxybenzoic acid, was added directly to the SCX
fractions. The slurry was incubated for 15-30 min at room temperature with end-over-end rotation, spun
briefly on a bench-top centrifuge, and the supernatant was collected. Titansphere material with bound
phosphopeptides was washed three times with 150 µl 60% ACN / 1% TFA and transferred on top of a C8
disc (Empore) placed in a 200 µl pipette tip. Bound phosphopeptides were eluted directly into a 96-well
plate by passing 20 µl of 15% NH4OH / 40% ACN (pH ~11) through the beads twice. Eluates were then
dried almost to completion and diluted to 10 µl in 2% ACN / 0.3% TFA. For the biological replica
experiments, phosphorylated peptides were directly enriched from the in-solution digests, without prior
SCX fractionation.
LC-MS/MS
All samples were analyzed by nanoscale C18-HPLC coupled online to a mass spectrometer. For C18-HPLC,
an Easy-nLC (Proxeon), Agilent 1100, or Agilent 1200 system was used to separate the
peptides over a 90-min linear gradient from 8% to 24% ACN in 0.5% acetic acid with a flow rate of 250
nl/min. Mass spectrometry was performed in positive ion mode on an LTQ-FT Ultra, LTQ-Orbitrap or
LTQ-Orbitrap XL (all Thermo Scientific) equipped with a nano-electrospray source (Proxeon). Eluate
was ionized by electrospray with a voltage of 2 kV using no sheath and auxiliary gas flow and introduced
into the mass spectrometer through an ion transfer tube heated to 200°C.
All full-scan acquisition was done in the FT-MS part of the mass spectrometers in the range from m/z
300-1650 with an automatic gain control (AGC) target value of 5 × 10^6 or 1 × 10^6 and at a resolution of
100,000 or 60,000 at m/z 400, for the LTQ-FT or LTQ-Orbitrap instruments, respectively. For the Orbitrap
instruments, lock-mass operation was enabled using polysiloxane and phthalate ions from ambient air at
m/z 371.101233, m/z 391.284286, m/z 429.088724, m/z 445.120024, m/z 503.107515, and m/z
519.138815 for internal calibration (3). MS acquisition was done in data-dependent mode to sequentially
perform MS/MS on the ten most intense ions in the full scan (Top10) in the LTQ using the following
parameters: AGC target value, 5 × 10^3; ion selection threshold, 100 counts; maximum fill time,
150 ms. Ion activation was done in wide-band mode with an activation q = 0.25 applied for 30 ms at a
normalized collision energy of 40%. Dynamic exclusion was applied to reject ions from repeated MS/MS
selection for 90 ms using monoisotopic precursor selection. Singly charged ions and ions with unassigned
charge state were also excluded from MS/MS. For samples targeting phosphopeptides, multistage
activation was enabled with a neutral loss mass list of m/z 97.97, m/z 48.99 and m/z 32.66 (4).
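The three values in the neutral loss list are consistent with the loss of one phosphoric acid molecule (H3PO4, monoisotopic mass about 97.977 Da) from precursor ions carrying one, two, or three charges. The sketch below reproduces that arithmetic; the atomic masses are standard monoisotopic values typed in by hand, and the function name is illustrative.

```python
# Monoisotopic atomic masses in Da (standard values, entered by hand).
MASS_H = 1.007825
MASS_O = 15.994915
MASS_P = 30.973762


def phosphoric_acid_neutral_loss(charge):
    """m/z shift caused by losing one H3PO4 from an ion of the given charge."""
    h3po4 = 3 * MASS_H + MASS_P + 4 * MASS_O
    return h3po4 / charge


if __name__ == "__main__":
    for z in (1, 2, 3):
        print(f"charge {z}+: {phosphoric_acid_neutral_loss(z):.4f}")
```

The computed shifts agree with the listed m/z 97.97, 48.99, and 32.66 to within 0.01.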
Identification and Quantitation of Peptides and Phosphorylation Sites
A developmental edition (v. 1.0.12.25) of the MaxQuant software suite (5) was used in conjunction with Mascot
(Matrix Science) to process the raw data. The Quant element of MaxQuant was used to produce
peak lists based on the following parameters: Orbitrap/FT Ultra instrument; double triple SILAC with
medium labels Arg6/Lys4 and heavy labels Arg10/Lys8; a maximum of three missed cleavages; six most
intense peaks per 100 Da for peak lists; 0.5 Da MS/MS tolerance. Carbamidomethylation of cysteines
was set as a fixed modification, and protein N-terminal acetylation, methionine oxidation, pyro-glutamate
for N-terminal glutamine, and phosphorylation of serine, threonine, and tyrosine were selected as variable
modifications.
Resulting peak lists were searched by Mascot v. 2.2 against the human IPI database (6) v. 3.37
concatenated with known contaminants and reversed sequences of all entries. The resulting Mascot output
files (DAT-files) were used as input for the Identify element of MaxQuant, which was applied using the
following parameters: peptide and protein FDR = 0.01; maximum peptide PEP = 1 based on Mascot
score; minimum peptide length = 6; minimum score = 7; minimum unique sequences = 1; minimum
peptides = 1. Unmodified peptides and all modified peptides except phosphorylated ones were used for protein
quantitation based on both razor and unique peptides, requiring a minimum ratio count of one. Site
quantitation was based on the highest observed change, and the option for match between runs was enabled
using an elution time window of 2 min.
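Searching against a database that also contains reversed sequences makes it possible to estimate the false discovery rate (FDR) from the number of reversed (decoy) hits that pass a score cutoff. The sketch below illustrates only that general target-decoy idea; it is not MaxQuant's procedure, which the text describes as based on a posterior error probability derived from the Mascot score. The function name and the input layout are hypothetical.

```python
def score_cutoff_at_fdr(hits, max_fdr=0.01):
    """Find the lowest score cutoff whose estimated FDR stays within max_fdr.

    hits: iterable of (score, is_decoy) pairs, one per identification.
    FDR is estimated as decoys / targets among hits at or above the cutoff.
    Returns None if no cutoff satisfies the limit. Score ties are not
    treated specially in this simplified version.
    """
    ranked = sorted(hits, key=lambda hit: hit[0], reverse=True)
    targets = 0
    decoys = 0
    cutoff = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets > 0 and decoys / targets <= max_fdr:
            cutoff = score
    return cutoff
```

A stricter `max_fdr` moves the cutoff toward higher scores, trading identifications for confidence.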
Gene Ontology Enrichment
To evaluate functional categories overrepresented in each of the clusters, the biological process Gene
Ontology (GO) (7) information provided in the MaxQuant output tables was used. The phosphorylation
sites from each protein in each of the clusters were collapsed to one phosphoprotein per cluster, and all
GO terms associated with the phosphoproteins in the cluster were then extracted to produce a table of GO
term appearances in each of the clusters. Subsequently, all GO terms with fewer than four occurrences in a
cluster were discarded, and the remaining terms were tested for overrepresentation versus the group
of proteins carrying only unregulated phosphorylation sites, using binomial probability.
The Benjamini & Hochberg algorithm was used to adjust p-values for multiple testing. If a GO term was
significantly overrepresented at p < 0.05 in one or more of the fuzzy c-means clusters, this term
was kept; all remaining terms were discarded. To extract the most descriptive terms, we used the
hierarchical structure of the GO annotations by discarding all terms that had an offspring term within our
group of overrepresented terms. As the final step in the analysis, we standardized the p-values for
overrepresentation in each of the individual clusters and performed hierarchical clustering on the -log10 of
the standardized p-values.
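The filtering, binomial testing, and multiple-testing steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the background frequency of a term is taken as its count among the unregulated phosphoproteins divided by the size of that group, the test is one-sided (upper tail), and all function names and input layouts are hypothetical. The GO hierarchy pruning and the final clustering are not shown.

```python
import math


def binom_upper_tail(k, n, p):
    """Pr(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(k, n + 1))


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def enriched_terms(cluster_counts, cluster_size,
                   background_counts, background_size,
                   min_count=4, alpha=0.05):
    """Return {term: adjusted p} for terms overrepresented in one cluster.

    cluster_counts / background_counts map GO term -> number of
    phosphoproteins carrying that term in the cluster / unregulated group.
    """
    # Discard terms with fewer than min_count occurrences in the cluster.
    terms = [t for t, c in cluster_counts.items() if c >= min_count]
    pvalues = []
    for term in terms:
        p0 = background_counts.get(term, 0) / background_size
        pvalues.append(binom_upper_tail(cluster_counts[term], cluster_size, p0))
    adjusted = benjamini_hochberg(pvalues)
    return {t: q for t, q in zip(terms, adjusted) if q < alpha}
```

Here the adjustment is applied within a single cluster; whether the original analysis adjusted per cluster or across all clusters is not stated.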
Supplementary References
1. J. V. Olsen, B. Blagoev, F. Gnad, B. Macek, C. Kumar, P. Mortensen, M. Mann, Global, in vivo, and site-specific phosphorylation dynamics in signaling networks. Cell 127, 635-648 (2006).
2. M. R. Larsen, T. E. Thingholm, O. N. Jensen, P. Roepstorff, T. J. Jorgensen, Highly selective enrichment of phosphorylated peptides from peptide mixtures using titanium dioxide microcolumns. Mol. Cell. Proteomics 4, 873-886 (2005).
3. J. V. Olsen, L. M. de Godoy, G. Li, B. Macek, P. Mortensen, R. Pesch, A. Makarov, O. Lange, S. Horning, M. Mann, Parts per million mass accuracy on an Orbitrap mass spectrometer via lock mass injection into a C-trap. Mol. Cell. Proteomics 4, 2010-2021 (2005).
4. M. J. Schroeder, J. Shabanowitz, J. C. Schwartz, D. F. Hunt, J. J. Coon, A neutral loss activation method for improved phosphopeptide sequence analysis by quadrupole ion trap mass spectrometry. Anal. Chem. 76, 3590-3598 (2004).
5. J. Cox, M. Mann, MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification. Nat. Biotechnol. 26, 1367-1372 (2008).
6. P. J. Kersey, J. Duarte, A. Williams, Y. Karavidopoulou, E. Birney, R. Apweiler, The International Protein Index: an integrated database for proteomics experiments. Proteomics 4, 1985-1988 (2004).
7. M. Ashburner, C. A. Ball, J. A. Blake, D. Botstein, H. Butler, J. M. Cherry, A. P. Davis, K. Dolinski, S. S. Dwight, J. T. Eppig, M. A. Harris, D. P. Hill, L. Issel-Tarver, A. Kasarskis, S. Lewis, J. C. Matese, J. E. Richardson, M. Ringwald, G. M. Rubin, G. Sherlock, Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat. Genet. 25, 25-29 (2000).
Supplementary Figure S1
[Figure: bar charts of Relative Expression in the Odense-3 (OD3) and HUES9 lines for samples CM, NCMd1, NCMd2, PMAd1, and PMAd2. Panels: SOX1, PAX6, MAP2, Beta-TUBULIN, SOX17, AMN, AFP, SOX7, PDGFRA, PDGFRB, FLK1, and BRACHYURY, grouped under the headings Ectoderm; Endoderm; and Mesoderm, mesendoderm.]
Fig. S1. Cells treated with NCM or PMA differentiate in a nondirected manner. Real-time quantitative PCR analysis of mRNA levels from selected differentiation markers in untreated cells and cells treated with NCM or PMA. Analyses were performed in Odense-3 (blue) and HUES9 (red) hESC lines. Target gene expression levels were normalized to expression of actin; the normalized values for undifferentiated cells were then set at 1 separately for Odense-3 and HUES9 cell lines in all panels. Error bars indicate standard deviation of three samples.
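The two-step normalization described in the caption (first to actin, then to the undifferentiated sample) can be sketched as below. This is a minimal illustration that assumes the inputs are already linear-scale expression quantities; how these were derived from the raw QPCR readout is not stated. The function name and the control label "CM" are placeholders.

```python
def relative_expression(target, actin, control="CM"):
    """Normalize target expression to actin, then scale the control to 1.

    target, actin: dicts mapping sample name -> linear expression value
    for one cell line. The control sample name is an assumption.
    """
    normalized = {sample: target[sample] / actin[sample] for sample in target}
    reference = normalized[control]
    return {sample: value / reference for sample, value in normalized.items()}
```

Applying the function separately to each cell line mirrors the caption's statement that undifferentiated cells were set to 1 separately for Odense-3 and HUES9.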
Supplementary Figure S2
[Figure: (A) Distribution of precursor Mass Error (ppm) plotted as percent of max, with the 25th, 50th, 75th, and 90th percentiles marked. (B) Bar chart of the percentage of occurrences with R.S.D. < 50% and > 50% for PMA 30 min, PMA 6 hrs, NCM 30 min, and NCM 6 hrs. (C) Density scatter plot of Log2 (Ratio) Original against Log2 (Ratio) Replicate; correlation coefficient 0.67, n = 4364; 92% of pairs fall within the dotted lines.]
Fig. S2. General properties of the data sets. (A) Analysis of precursor mass deviations of all identified peptides: 90% of all peptides were identified with a mass deviation of less than 1.1 ppm. (B) and (C) Replica experiments for the 30-min and 6-hr time points of both treatments identified 8,205 phosphorylation sites and 4,667 proteins (see table S3). The average relative standard deviation (RSD) of the SILAC ratios between the original and the replica experiments was 14.5% for proteins and 21.3% for phosphopeptides. (B) Relative standard deviations of SILAC quantitation ratios of the phosphopeptides identified in the complete and replicate sets. Each bar represents an individual time point and treatment, as indicated. The fractions of phosphopeptides with an RSD of less than 50% between the two replicas are shown in grey. (C) Density scatter plot of phosphopeptides identified in both the complete and the replicate data sets. The majority (92%) of quantitations for phosphopeptide pairs are within the range indicated by the dotted lines, which mark a one-fold difference in quantitation between the replicates. Counts represent the number of phosphopeptides in each bin of combined phosphopeptide ratios from the two replicate experiments.
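The relative standard deviation between the original and replica measurements can be illustrated as follows. This is a sketch under assumptions the caption does not spell out: RSD is computed on linear SILAC ratios using the sample standard deviation (n - 1 in the denominator), and the function names are illustrative.

```python
import statistics


def relative_sd_percent(ratios):
    """Relative standard deviation (%) of replicate SILAC ratios.

    Uses the sample standard deviation; needs at least two values.
    """
    return 100.0 * statistics.stdev(ratios) / statistics.mean(ratios)


def fraction_below(rsd_values, cutoff=50.0):
    """Fraction of RSD values strictly below the cutoff (in percent)."""
    return sum(value < cutoff for value in rsd_values) / len(rsd_values)
```

For example, `relative_sd_percent([1.0, 1.5])` describes a site whose ratio was 1.0 in the original experiment and 1.5 in the replica.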
Supplementary Figure S3
Molecular Function
GO term                              Proteins  All phosphorylation sites  Class 1 phosphorylation sites
transcription factor activity        216       1100 on 180 proteins       714 on 165 proteins
actin binding                        192       1015 on 146 proteins       625 on 130 proteins
GTPase activity                      113       231 on 59 proteins         145 on 47 proteins
protein kinase activity              223       1017 on 181 proteins       654 on 169 proteins
ubiquitin-protein ligase activity    65        332 on 44 proteins         199 on 40 proteins
receptor activity                    137       482 on 97 proteins         284 on 90 proteins
structural molecule activity         356       1331 on 225 proteins       889 on 202 proteins
transporter activity                 292       788 on 168 proteins        481 on 144 proteins
phosphatase activity                 102       211 on 59 proteins         123 on 53 proteins
transcription regulator activity     467       2651 on 363 proteins       1679 on 331 proteins
GTPase regulator activity            159       861 on 137 proteins        519 on 125 proteins
translation regulator activity       99        334 on 68 proteins         245 on 60 proteins
Cellular Component
GO term                              Proteins  All phosphorylation sites  Class 1 phosphorylation sites
nucleus                              2307      12609 on 1701 proteins     8238 on 1552 proteins
spliceosome                          126       1174 on 90 proteins        870 on 81 proteins
cytoplasm                            2128      8075 on 1436 proteins      5077 on 1285 proteins
mitochondrion                        621       785 on 274 proteins        466 on 213 proteins
endoplasmic reticulum                372       744 on 196 proteins        441 on 166 proteins
cytosol                              604       1945 on 400 proteins       1270 on 360 proteins
ribosome                             179       305 on 95 proteins         205 on 85 proteins
cytoskeleton                         347       2101 on 267 proteins       1326 on 249 proteins
membrane                             1531      4264 on 890 proteins       2622 on 785 proteins
extracellular matrix                 54        112 on 30 proteins         70 on 26 proteins
organelle lumen                      56        67 on 24 proteins          37 on 20 proteins
Fig. S3. Frequency of protein and phosphorylation site identifications with selected GO Molecular Function and Cellular Component terms. For selected GO Molecular Function and Cellular Component terms, the numbers of proteins and phosphorylation sites were counted.
Supplementary Figure S4
[Figure: one panel per motif, plotting Mean ratio against Time point (hrs) at 0, ½, 1, 6, and 24. Panels are arranged under Serine Motifs and Threonine Motifs, with group labels Acidic, Basic, Charged, and Uncharged. Panel titles (number, then motif) are listed below.]
Serine Motifs
378   ....P.SP.....
173   ......SE.EE..
142   ....E.SE.E...
249   ..R.S.S......
232   ..E...S.E....
35    ......SEEEEE.
112   .R.RS.S......
106   ......S.E.E.E
432   ..S...SP.....
286   ......SP.S...
66    .....ESEEE...
150   ......S.EE.E.
282   ......S.SE...
198   R..S..S......
1822  ......SP.....
59    ......SP.R..S
55    ......SESE.E.
298   ......SP.P...
250   ...R..S..S...
263   ...R..S.....S
210   ....SPS......
48    ...R..SP...R.
337   ......SP.R...
286   ...RR.S......
423   ......S.EE...
253   .....RS.S....
265   .....ESE.....
70    ......SP..R.R
106   ...R.RS.S....
388   ....S.SP.....
400   ......SP..S..
225   .RR...S......
156   ......SSP....
Threonine Motifs
68    ......TP....R
22    ...R..TPP....
77    ....SPT......
69    ......TP.R...
30    .....RT....R.
73    ....P.TP.....
20    .....ETE.E...
56    ......TP.P...
61    ......TSP....
80    ......TPP....
11    .....RTR.....
Fig. S4. Dynamic changes in phosphorylation organized by phosphorylation motif. For each motif, the regulated sites were extracted and their mean ratio was plotted for PMA (red) and NCM (blue); error bars indicate 95% confidence intervals.
Supplementary Figure S5
[Figure: sequence motif panels spanning positions -6 to +6 around the central phosphorylation site, grouped as Acidic, Phosphorylation directed, Basic, and Proline directed.]
Fig. S5. Double-phosphorylation motifs. To test for predominant double-phosphorylation motifs, ±6-residue sequence windows were generated from all multi-phosphorylated peptides and tested against sequence windows from singly phosphorylated peptides. The identified motifs were grouped according to the amino acid residues surrounding the central phosphorylation site. B residues in the motifs represent a phosphorylated serine or threonine.
Supplementary Figure S6
(A) Magnitude Component. For a profile with values $y_1, \dots, y_n$ at time points $t_1, \dots, t_n$, the area under the curve is obtained by trapezoid approximation, and the areas of the two treatments ($a$ and $b$) are combined:
$$\mathrm{AUC} = \sum_{i=2}^{n} \frac{y_i + y_{i-1}}{2} \cdot (t_i - t_{i-1}), \qquad \text{M-comp} = \frac{\mathrm{AUC}_a - \mathrm{AUC}_b}{\mathrm{AUC}_a + \mathrm{AUC}_b}$$
Pearson Component. For the two profiles $X$ and $Y$:
$$r = \frac{\operatorname{cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}}$$
$$t = r \sqrt{\frac{n - 2}{1 - r^2}} \approx t_{n-2}, \qquad \text{P-comp} = \Pr(T > t)$$
$$\text{S-score} = -10 \cdot \log_{10}(\text{P-comp} \cdot \text{M-comp})$$
Example, with time points indexed 1 (0 hrs), 2 (½ hr), 3 (1 hr), 4 (6 hrs), 5 (24 hrs), and profiles X = 0, 0, 2, 2, 2 and Y = 0, 0, 0, 1, 1:
$$A_1 = \tfrac{0 + 0}{2} \cdot (2 - 1) = 0, \quad A_2 = \tfrac{2 + 0}{2} \cdot (3 - 2) = 1, \quad A_3 = \tfrac{2 + 2}{2} \cdot (4 - 3) = 2, \quad A_4 = \tfrac{2 + 2}{2} \cdot (5 - 4) = 2$$
$$\mathrm{AUC}_a = 0 + 1 + 2 + 2 = 5$$
$$B_1 = \tfrac{0 + 0}{2} \cdot (2 - 1) = 0, \quad B_2 = \tfrac{0 + 0}{2} \cdot (3 - 2) = 0, \quad B_3 = \tfrac{0 + 1}{2} \cdot (4 - 3) = 0.5, \quad B_4 = \tfrac{1 + 1}{2} \cdot (5 - 4) = 1$$
$$\mathrm{AUC}_b = 0 + 0 + 0.5 + 1 = 1.5, \qquad \text{M-comp} = \frac{5 - 1.5}{5 + 1.5} = 0.54$$
$$\bar{x} = \tfrac{1}{5}(0 + 0 + 2 + 2 + 2) = 1.2, \qquad \bar{y} = \tfrac{1}{5}(0 + 0 + 0 + 1 + 1) = 0.4$$
$$\sum (x_i - \bar{x})(y_i - \bar{y}) = (0 - 1.2)(0 - 0.4) + (0 - 1.2)(0 - 0.4) + (2 - 1.2)(0 - 0.4) + (2 - 1.2)(1 - 0.4) + (2 - 1.2)(1 - 0.4) = 1.6$$
$$\sum (x_i - \bar{x})^2 = (0 - 1.2)^2 + (0 - 1.2)^2 + (2 - 1.2)^2 + (2 - 1.2)^2 + (2 - 1.2)^2 = 4.8$$
$$\sum (y_i - \bar{y})^2 = (0 - 0.4)^2 + (0 - 0.4)^2 + (0 - 0.4)^2 + (1 - 0.4)^2 + (1 - 0.4)^2 = 1.2$$
$$r = \frac{1.6}{\sqrt{4.8 \cdot 1.2}} = \frac{2}{3}, \qquad t = \frac{2}{3} \sqrt{\frac{5 - 2}{1 - (2/3)^2}} = 1.5492$$
$$\text{P-comp} = \Pr(T > 1.5492) = 0.1095, \qquad \text{S-score} = -10 \cdot \log_{10}(0.1095 \cdot 0.54) = 12.3$$
(B) [Figure: example profiles over 0, ½, 1, 6, and 24 hrs, in columns of S-score ~1, ~10, ~20, ~30, ~40, and ~50.]
S-score ~1: PRPF4B - Ser20 (0.7), FAM21B - Ser640 (0.9), JMJD1C - Thr622 (0.9)
S-score ~10: CDC2L5 - Ser439 (10), MYH10 - Ser1938 (10), RBM15 - Thr737 (10)
S-score ~20: RDBP - Ser115 (20), DEK - Ser68 (20), ARID1A - Ser1184 (20)
S-score ~30: BCLAF1 - Ser290 (30), ABCF1 - Ser24 (30), PDS5B - Tyr1187 (30.1)
S-score ~40: PCM1 - Ser65 (40.5), BCR - Ser459 (40.7), SRRM1 - Ser786 (40.9)
S-score ~50: SOX15 - Ser37 (50.1), CHERP - Ser813 (50.3), MYEF2 - Thr13 (50.4)
Fig. S6. The S-score algorithm. (A) The S-score contains contributions from two components: a magnitude and a Pearson component. The areas defined by the profile and the x-axis were calculated using trapezoid approximation, and the magnitude component is then the difference in the area under the curve for each treatment divided by the sum of the two areas. The Pearson component is calculated as the p-value of the Pearson correlation coefficient. The S-score is calculated as minus ten times log10 of the product of the magnitude and Pearson components. (B) Examples from the data on sites with S-scores increasing from ~1 to ~50. PMA profiles are shown in red and NCM profiles are shown in blue.
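The worked example in panel A can be reproduced with a short script. This is a minimal sketch rather than the authors' implementation, and it makes several assumptions: time points are indexed 1 to n as in the example, the magnitude component is taken as an absolute value so that the logarithm is defined for either ordering of the treatments, the p-value is the one-sided upper tail of a t distribution with n - 2 degrees of freedom, and that tail is obtained by Simpson integration of the density in place of a statistics library. All function names are illustrative.

```python
import math


def trapezoid_auc(values, times):
    """Area under a profile by trapezoid approximation."""
    return sum((values[i] + values[i - 1]) / 2 * (times[i] - times[i - 1])
               for i in range(1, len(values)))


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)  # undefined for a constant profile


def t_upper_tail(t, df, steps=2000):
    """Pr(T > t) for Student's t, via Simpson integration from 0 to |t|."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(v):
        return coef * (1 + v * v / df) ** (-(df + 1) / 2)

    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(upper)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * pdf(k * h)
    area = total * h / 3
    return 0.5 - area if t >= 0 else 0.5 + area


def s_score(x, y, times=None):
    """S-score of two profiles; times default to indices 1..n (assumption)."""
    if times is None:
        times = list(range(1, len(x) + 1))
    auc_a = trapezoid_auc(x, times)
    auc_b = trapezoid_auc(y, times)
    m_comp = abs(auc_a - auc_b) / (auc_a + auc_b)  # abs() is an assumption
    r = pearson_r(x, y)
    n = len(x)
    t = r * math.sqrt((n - 2) / (1 - r * r))  # fails when |r| == 1
    p_comp = t_upper_tail(t, n - 2)
    return -10 * math.log10(p_comp * m_comp)


if __name__ == "__main__":
    print(round(s_score([0, 0, 2, 2, 2], [0, 0, 0, 1, 1]), 1))
```

Degenerate inputs (perfectly correlated profiles, constant profiles, or identical areas) are not handled here and would need explicit treatment in real use.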
Supplementary Figure S7
[Figure: flow diagram. Undifferentiated (control) cells labeled with Arg0/Lys0 and differentiated (6 hrs PMA) cells labeled with Arg6/Lys4 are each processed by cell lysis and IP: DNMT3A; the eluates are then mixed, followed by in-gel digest, LC-MS/MS, and data processing (MaxQuant).]
Fig. S7. Quantitative proteomics approach for identification of DNMT3A- and DNMT3B-interacting proteins. The flow diagram exemplifies the experiment carried out for DNMT3A. An identical experiment was performed for DNMT3B as well.
Gene          Forward primer                    Reverse primer
OCT4          5’-gctgacaacaatgaaaatcttcagg-3’   5’-gttacagaaccacactcggacc-3’
NANOG         5’-aaagaatcttcacctatgcc-3’        5’-gaaggaagaggagagacagt-3’
SOX1          5’-cacaactcggagatcagcaa-3’        5’-ggtacttgtaatccgggtgc-3’
PAX6          5’-gtccatctttgcttgggaaa-3’        5’-tagccaggttgcgaagaact-3’
MAP2          5’-caggagacagagatgagaattcctt-3’   5’-gtagtgggtgttgaggtaccactctt-3’
Beta-TUBULIN  5’-catccaggagctgttcaagc-3’        5’-tttagacactgctggcttcg-3’
SOX17         5’-ctctgcctcctccacgaa-3’          5’-cagaatccagacctgcacaa-3’
AMN           5’-gactctgaccgcttctcctg-3’        5’-cactaggcggaaagaagacg-3’
AFP           5’-aaatgcgtttctcgttgctt-3’        5’-gccacaggccaatagtttgt-3’
SOX7          5’-acgccgagctcagcaagat-3’         5’-tccacgtacggcctcttctg-3’
PDGFRA        5’-acaggttggtgtgggttcat-3’        5’-ctgcatcttccaaagcatca-3’
PDGFRB        5’-cccttatgtcggagctgaag-3’        5’-gcggcagtactcagtgatga-3’
FLK1          5’-tgatcggaaatgacactgga-3’        5’-cacgactccatgttggtcac-3’
BRACHYURY     5’-aattggtccagccttggaat-3’        5’-cgttgctcacagaccaca-3’
Beta-ACTIN    5’-cgtaccactggcatcgtgat-3’        5’-ttctccttaatgtcacgcac-3’
Table S8. Primers used for real-time QPCR.