Supplementary Information for:
5-Formylcytosine can be a stable DNA modification in mammals
Martin Bachman1,2, Santiago Uribe-Lewis2, Xiaoping Yang2, Heather E
Burgess3, Mario Iurlaro3, Wolf Reik3,4, Adele Murrell2,5 & Shankar
Balasubramanian1,2*
Affiliations
1Department of Chemistry, University of Cambridge, Cambridge CB2 1EW,
UK
2Cancer Research UK Cambridge Institute, University of Cambridge,
Cambridge CB2 0RE, UK
3Babraham Institute, Babraham CB22 3AT, UK
4Wellcome Trust Sanger Institute, Hinxton, CB10 1SA, UK
5Present address: Centre for Regenerative Medicine, Department of Biology
and Biochemistry, University of Bath, Bath BA2 7AY, UK
*e-mail: [email protected]
Nature Chemical Biology: doi:10.1038/nchembio.1848
Supplementary Results
Supplementary Fig. 1. Mass spectra (measurements of global levels).
Supplementary Fig. 2. Extracted ion chromatograms (synthetic standards).
Supplementary Fig. 3. Calibration curves.
Supplementary Fig. 4. Extracted ion chromatograms (genomic DNA samples).
Supplementary Fig. 5. Reproducibility of mass spectrometry measurements.
Supplementary Fig. 6. Global levels of 5mC, 5hmC and 5fC in all tissues.
Supplementary Fig. 7. Global levels of 5mC, 5hmC, 5fC and 5caC in CD1
mice.
Supplementary Fig. 8. Correlations between 5fC, 5mC and 5hmC.
Supplementary Fig. 9. Mass spectra (measurements of labelling ratios).
Supplementary Table 1. Masses of parent ions and base fragments.
Supplementary Figure 1. Mass spectra of all nucleoside base fragments
acquired at a resolution of 70,000 and used for quantification of global
5mC, 5hmC, 5fC and 5caC levels. Theoretical and found accurate masses
are listed in Supplementary Table 1. Theoretical masses ± 5 ppm were used
to generate the extracted ion chromatograms shown in Supplementary Fig. 2.
[Figure: one mass spectrum panel each for C, C_IS, 5mC, 5mC_IS, 5hmC, 5hmC_IS, 5fC and 5caC. x-axis: m/z; y-axis: relative abundance (A.U.). The chemical structure of each base fragment is drawn next to its spectrum.]
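The ± 5 ppm extraction window mentioned in the caption can be illustrated with a minimal sketch. The fragment masses are taken from Supplementary Table 1; the function names `ppm_window` and `in_window` are hypothetical and are not part of the authors' software.

```python
# Sketch of a +/- ppm mass window, assuming a symmetric tolerance around
# the theoretical m/z. Function names are illustrative only.

# Theoretical fragment masses copied from Supplementary Table 1.
FRAGMENT_MZ = {
    "C": 112.05054,
    "5mC": 126.06619,
    "5hmC": 142.06110,
    "5fC": 140.04545,
    "5caC": 156.04037,
}


def ppm_window(theoretical_mz: float, tolerance_ppm: float = 5.0) -> tuple[float, float]:
    """Return the (low, high) m/z bounds of a +/- tolerance_ppm window."""
    half_width = theoretical_mz * tolerance_ppm / 1e6
    return theoretical_mz - half_width, theoretical_mz + half_width


def in_window(observed_mz: float, theoretical_mz: float, tolerance_ppm: float = 5.0) -> bool:
    """True if an observed m/z falls inside the extraction window."""
    low, high = ppm_window(theoretical_mz, tolerance_ppm)
    return low <= observed_mz <= high
```

For the cytosine fragment at m/z 112.05054, a 5 ppm tolerance corresponds to a half-width of roughly 0.00056 m/z units.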
Supplementary Figure 2. Example of extracted ion chromatograms for each
analyte and their corresponding internal standards. Areas under the curve
were used for quantification. The numbers show retention times (in min)
achieved on a capillary column packed with 3 µm Hypercarb beads and a 15-
min gradient.
[Figure: extracted ion chromatograms, one panel per analyte. x-axis: RT (min), 4 to 14; y-axis: intensity (AU). Retention times shown in the panels:
C, 7.07; C_IS, 7.11; 5mC, 7.67; 5mC_IS, 7.64; 5hmC, 7.50; 5hmC_IS, 7.47; 5fC, 9.98; 5caC, 11.40.]
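As an illustration of how an area under the curve can be obtained from an extracted ion chromatogram, the sketch below integrates intensity over a retention-time interval with the trapezoidal rule. The source does not state which integration method or software was used, so the trapezoidal rule and the name `peak_area` are assumptions.

```python
# Sketch: trapezoidal peak integration over a retention-time interval.
# The integration method is an assumption; the source only states that
# areas under the curve were used.

def peak_area(times_min, intensities, start_min, end_min):
    """Integrate intensity over [start_min, end_min] with the trapezoidal rule.

    times_min must be sorted in ascending order; the result has units of
    intensity x minutes.
    """
    points = [
        (t, y) for t, y in zip(times_min, intensities)
        if start_min <= t <= end_min
    ]
    area = 0.0
    # Sum the trapezoids formed by each pair of neighbouring points.
    for (t0, y0), (t1, y1) in zip(points, points[1:]):
        area += 0.5 * (y0 + y1) * (t1 - t0)
    return area
```

In practice the integration limits would be chosen around the retention time of each analyte, and a baseline correction may be needed; neither is specified here.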
Supplementary Figure 3. Calibration curves were generated using area
ratios (unlabelled vs. IS) for C, 5mC and 5hmC, and areas only for 5fC and
5caC. Excellent linearity was obtained for all analytes.
[Figure: log-log calibration curves, one panel per analyte. x-axis: Log10 of concentration (nM); y-axis: Log10(area ratio) for C, 5mC and 5hmC, and Log10(area) for 5fC and 5caC. Coefficients of determination shown in the panels:
C, r2 = 0.9930; 5mC, r2 = 0.9994; 5hmC, r2 = 0.9982; 5fC, r2 = 0.9947; 5caC, r2 = 0.9924.]
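A log-log calibration of the kind plotted above can be sketched as an ordinary least-squares fit of Log10(response) against Log10(concentration). The fitting procedure the authors used is not described, so unweighted least squares is an assumption, and the function names are hypothetical.

```python
import math

# Sketch: unweighted least-squares calibration on log10-transformed data.
# The response is an area ratio (analyte / internal standard) or a raw area,
# depending on the analyte.

def fit_line(xs, ys):
    """Return (slope, intercept, r_squared) of an ordinary least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept, (sxy * sxy) / (sxx * syy)


def fit_calibration(concentrations_nm, responses):
    """Fit log10(response) against log10(concentration in nM)."""
    log_conc = [math.log10(c) for c in concentrations_nm]
    log_resp = [math.log10(r) for r in responses]
    return fit_line(log_conc, log_resp)


def concentration_from_response(response, slope, intercept):
    """Invert the calibration line to estimate a concentration in nM."""
    return 10 ** ((math.log10(response) - intercept) / slope)
```

A sample is then quantified by passing its measured area ratio (or area) to `concentration_from_response` together with the fitted slope and intercept.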
Supplementary Figure 4. Example of extracted ion chromatograms for each
analyte and their corresponding internal standards obtained from a low-5fC-
containing mouse genomic DNA sample (adult spleen). Areas under the curve
were used for quantification. The numbers show retention times achieved on
a capillary column packed with 3 µm Hypercarb beads and a 20-min gradient.
[Figure: extracted ion chromatograms, one panel per analyte, plus a zoomed view of the 5fC peak (14.0 to 15.5 min). x-axis: RT (min), 5 to 20; y-axis: intensity (AU). Retention times shown in the panels:
C, 9.57 min; C_IS, 9.55 min; 5mC, 10.20 min; 5mC_IS, 10.18 min; 5hmC, 10.05 min; 5hmC_IS, 10.02 min; 5fC, 14.72 min.]
Supplementary Figure 5. Reproducibility of concentration measurements
(left) and global levels of 5mC, 5hmC, 5fC and 5caC (right). Each dot
represents the coefficient of variation (% CV) of 2 technical replicates for
genomic DNA samples. Bars represent mean ± SD of 47 random samples.
[Figure: two dot plots titled "Reproducibility". y-axis: % CV (0 to 60). Left panel categories: [C], [5mC], [5hmC], [5fC], [5caC]. Right panel categories: % 5mC, % 5hmC, ppm 5fC, ppm 5caC.]
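The coefficient of variation of a pair of technical replicates can be computed as follows. This is a minimal sketch that assumes the sample standard deviation (n - 1 denominator); the source does not say which estimator was used.

```python
import statistics

# Sketch: coefficient of variation (% CV) of technical replicates,
# assuming the sample standard deviation (n - 1 in the denominator).

def percent_cv(replicates):
    """Return 100 * SD / mean for a list of replicate measurements."""
    mean = statistics.fmean(replicates)
    return 100.0 * statistics.stdev(replicates) / mean
```

With only two replicates a and b, the sample standard deviation reduces to |a - b| / sqrt(2), so the % CV depends only on the relative difference between the two measurements.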
Supplementary Figure 6. Global levels of 5mC, 5hmC and 5fC in the
genomic DNA of all studied unlabelled C57BL/6 mouse tissues and mES
cells. Shown are mean ± SEM of 3 embryos (E11.5), 3 newborns (1 d old)
and mean and range for 2 adolescent (21 d old) and 2 adult (15 w old) mice.
Each biological sample was analysed in 2 technical replicates, and the mean
value was used.
[Figure: bar charts of global levels per tissue, grouped by age. y-axes: % mC (over total C), 0 to 5; % hmC (over total C), 0.0 to 0.8; ppm 5fC (over total C), 0 to 18. The 5fC panels recovered here cover the newborn, adolescent and adult groups. Tissues per group, in plotted order:
Embryo (11.5 d): carcass, forebrain, hindbrain, heart, aorta, liver, mES (WT), mES (TET-TKO).
Newborn (1 d): brain, tail, heart, kidney, skin, gut, liver, lung, spleen.
Adolescent (21 d): brain, cerebellum, tail, muscle, skin, kidney, lung, tongue, heart, dist gut, mid gut, colon, liver, spleen, prox gut, thymus.
Adult (15 w): brain, cerebellum, prox gut, tongue, tail, liver, skin, mid gut, spleen, thymus, heart, muscle, kidney, dist gut, colon, lung.]
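The "over total C" units on the axes can be illustrated with a short sketch. It assumes that total C is the sum of the measured concentrations of C, 5mC, 5hmC, 5fC and 5caC; the source does not spell out this definition, and the function name and input values are hypothetical.

```python
# Sketch: express modified cytosines relative to total cytosine.
# Assumption: total C = C + 5mC + 5hmC + 5fC + 5caC (all in the same units).

def global_levels(concentrations):
    """Return 5mC and 5hmC as % and 5fC and 5caC as ppm of total C."""
    total_c = sum(concentrations.values())
    return {
        "% 5mC": 100.0 * concentrations["5mC"] / total_c,
        "% 5hmC": 100.0 * concentrations["5hmC"] / total_c,
        "ppm 5fC": 1e6 * concentrations["5fC"] / total_c,
        "ppm 5caC": 1e6 * concentrations["5caC"] / total_c,
    }


# Made-up concentrations (nM), for illustration only.
example = {"C": 960.0, "5mC": 38.0, "5hmC": 1.99, "5fC": 0.009, "5caC": 0.001}
levels = global_levels(example)
```

Because 5fC and 5caC are several orders of magnitude less abundant than 5mC, parts per million is the more readable unit for them.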
Supplementary Figure 7. Global levels of 5mC, 5hmC, 5fC and 5caC in adult
12-week-old female CD1 mice. Shown are mean ± SEM of 3 animals. The
high 5fC levels in adult spleen are consistent with values reported by Ito et al.
(ref. 1; strain not given).
[Figure: four bar charts for CD1 mice (12 w). y-axes: % 5mC (over total C), 0 to 5; % 5hmC (over total C), 0.0 to 0.6; ppm 5fC (over total C), 0 to 15; ppm 5caC (over total C), 0.0 to 2.0. Tissues, in plotted order: brain, kidney, heart, muscle, lung, liver, tail, colon, tongue, skin, dist gut, mid gut, prox gut, spleen, thymus.]
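The summary statistics quoted in the captions (mean ± SEM for 3 biological replicates, mean and range for 2) can be sketched as below. The SEM is assumed to be the sample standard deviation divided by the square root of n, which is the usual convention but is not stated in the source.

```python
import math
import statistics

# Sketch: summary statistics for biological replicates.
# Assumption: SEM = sample SD / sqrt(n).

def mean_sem(values):
    """Return (mean, standard error of the mean)."""
    mean = statistics.fmean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


def mean_range(values):
    """Return (mean, (min, max)), as used when only 2 animals are available."""
    return statistics.fmean(values), (min(values), max(values))
```

Each input value would itself be the mean of the 2 technical replicates measured for that biological sample.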
Supplementary Figure 8. Correlations between global levels of 5fC, 5hmC
and 5mC in DNA from the same tissues and individuals. Shown are mean
values from individual biological replicates presented in Fig. 2; brain
samples are indicated with an asterisk.
[Figure: eight scatter plots. x-axis: % 5mC or % 5hmC (over total C); y-axis: ppm 5fC (over total C). Coefficients of determination shown in the panels:
5fC vs. 5mC, E11.5: r2 = 0.30
5fC vs. 5hmC, E11.5: r2 = 0.24
5fC vs. 5mC, newborn: r2 = 0.02 including brain, r2 = 0.00 omitting brain
5fC vs. 5mC, adolescent: r2 = 0.15 including brain, r2 = 0.01 omitting brain
5fC vs. 5mC, adult: r2 = 0.05 including brain, r2 = 0.00 omitting brain
5fC vs. 5hmC, newborn: r2 = 0.53 including brain, r2 = 0.27 omitting brain
5fC vs. 5hmC, adolescent: r2 = 0.73 including brain, r2 = 0.36 omitting brain
5fC vs. 5hmC, adult: r2 = 0.64 including brain, r2 = 0.10 omitting brain]
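The comparison of r2 values with and without brain samples can be sketched as follows. The code assumes that r2 is the squared Pearson correlation coefficient of a simple linear relationship; the helper names and the tissue labels used to filter out brain samples are hypothetical.

```python
# Sketch: squared Pearson correlation, with and without brain samples.
# Assumption: the reported r2 is the squared Pearson coefficient.

def r_squared(xs, ys):
    """Return the squared Pearson correlation between xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return (sxy * sxy) / (sxx * syy)


def r_squared_with_and_without(samples, excluded=("Brain", "Cerebellum")):
    """samples: list of (tissue, x, y). Returns (r2 of all, r2 without excluded)."""
    all_x = [x for _, x, _ in samples]
    all_y = [y for _, _, y in samples]
    kept = [(x, y) for tissue, x, y in samples if tissue not in excluded]
    kept_x = [x for x, _ in kept]
    kept_y = [y for _, y in kept]
    return r_squared(all_x, all_y), r_squared(kept_x, kept_y)
```

A large drop in r2 after removing the brain samples, as in the 5fC vs. 5hmC panels, indicates that the apparent correlation is driven mainly by those few high-5hmC points.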
Supplementary Figure 9. Mass spectra of all nucleoside base fragments
acquired at a resolution of 70,000 and used for quantification of labelling
ratios of 5mC, 5hmC and 5fC. Theoretical and found accurate masses are
listed in Supplementary Table 1. Theoretical masses ± 5 ppm were used to
generate extracted ion chromatograms and obtain areas under the curve for
quantification.
[Figure: one mass spectrum panel each for 5mC, 5mC[+4], 5hmC, 5hmC[+3], 5fC and 5fC[+2]. x-axis: m/z. The chemical structures of the unlabelled and isotopically labelled base fragments are drawn next to the spectra.]
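One plausible way to turn the labelled and unlabelled peak areas into a labelling ratio is shown below. The definition used here, labelled area divided by the sum of labelled and unlabelled areas, is an assumption made for illustration; the supplementary text itself does not give the formula.

```python
# Sketch: labelling ratio from extracted-ion-chromatogram areas.
# Assumed definition: labelled / (labelled + unlabelled).

def labelling_ratio(labelled_area: float, unlabelled_area: float) -> float:
    """Return the fraction of the signal that carries the isotope label."""
    total = labelled_area + unlabelled_area
    if total <= 0:
        raise ValueError("at least one of the two areas must be positive")
    return labelled_area / total
```

The same function would apply to each pair in the figure: 5mC[+4] and 5mC, 5hmC[+3] and 5hmC, and 5fC[+2] and 5fC.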
Supplementary Table 1. A list of all nucleosides measured in this study. The
Q Exactive mass spectrometer was set to select given parent ions (± 0.2 Da)
in the quadrupole mass filter and perform fragmentation and acquisition of full
scans (50 - 300 Da) for each ion. Indicated accurate masses of fragment ions
(± 5 ppm) were used for quantification. IS = internal standard. [13CD3] L-met =
[methyl-13CD3] L-methionine.
Isotope labels within formulas are written in square brackets, e.g. [13C], [15N].

Base     Source         Isotopes  Parent ion formula  Parent ion [M+H]+  Fragment formula  Fragment [M+H]+
C        Natural        -         C9H14O4N3           228.1              C4H6ON3           112.05054
C        IS             15N, D2   C9H12D2O4N2[15N]    231.1              C4H4D2ON2[15N]    115.06013
5mC      Natural        -         C10H16O4N3          242.1              C5H8ON3           126.06619
5mC      IS             D3        C10H13D3O4N3        245.1              C5H5D3ON3         129.08502
5mC      [13CD3] L-met  13C, D3   C9[13C]H13D3O4N3    246.1              C4[13C]H5D3ON3    130.08837
5hmC     Natural        -         C10H16O5N3          258.1              C5H8O2N3          142.06110
5hmC     IS             D3        C10H13D3O5N3        261.1              C5H5D3O2N3        145.07993
5hmC     [13CD3] L-met  13C, D2   C9[13C]H14D2O5N3    261.1              C4[13C]H6D2O2N3   145.07701
5fC      Natural        -         C10H14O5N3          256.1              C5H6O2N3          140.04545
5fC      [13CD3] L-met  13C, D    C9[13C]H13DO5N3     258.1              C4[13C]H5DO2N3    142.05508
5caC     Natural        -         C10H14O6N3          272.1              C5H6O3N3          156.04037
RNA 5mC  Natural        -         C10H16O5N3          258.1              C5H8ON3           126.06619
RNA 5mC  [13CD3] L-met  13C, D3   C9[13C]H13D3O5N3    262.1              C4[13C]H5D3ON3    130.08837
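The theoretical fragment masses in the table can be reproduced from the ion formulas, as the sketch below illustrates. It assumes that each listed formula already includes the extra proton, so the m/z of the singly charged ion is the sum of the monoisotopic atomic masses minus one electron mass. The isotope masses are standard reference values; the function name `cation_mz` is hypothetical.

```python
# Sketch: theoretical m/z of a singly charged cation from its formula.
# Assumption: the formula describes the protonated ion, so only one
# electron mass is subtracted.

MONOISOTOPIC_MASS = {
    "H": 1.00782503223,
    "D": 2.01410177812,
    "C": 12.0,
    "13C": 13.00335483507,
    "N": 14.00307400443,
    "15N": 15.00010889888,
    "O": 15.99491461957,
}
ELECTRON_MASS = 0.000548579909


def cation_mz(composition: dict[str, int], charge: int = 1) -> float:
    """Return the m/z of a cation given atom counts per isotope symbol."""
    atoms = sum(MONOISOTOPIC_MASS[sym] * n for sym, n in composition.items())
    return (atoms - charge * ELECTRON_MASS) / charge


# Cytosine base fragment C4H6ON3 and the [13CD3]-labelled 5mC fragment.
c_fragment = cation_mz({"C": 4, "H": 6, "O": 1, "N": 3})
labelled_5mc = cation_mz({"C": 4, "13C": 1, "H": 5, "D": 3, "O": 1, "N": 3})
```

Under these assumptions the two example ions come out within about 0.00001 of the tabulated values 112.05054 and 130.08837.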
References
1. Ito, S. et al. Tet proteins can convert 5-methylcytosine to 5-
formylcytosine and 5-carboxylcytosine. Science 333, 1300-3 (2011).