supplementary information: genome-wide identi cation of...
TRANSCRIPT
Supplementary information: Genome-wide identification of
microRNA targets in human ES cells reveals a role for miR-302 in
modulating BMP response
Inna Lipchina1, Yechiel Elkabetz2, Markus Hafner3, Robert Sheridan4, Aleksandra Mihailovic3,Thomas Tuschl3, Chris Sander4, Lorenz Studer1, Doron Betel4,5
Correspondence to [email protected] reaches all authors1Developmental Biology Program, Memorial Sloan-Kettering Cancer Center, New York, NY
2Department of Cell and Developmental Biology, Sackler School of Medicine, Tel Aviv University, Israel3Howard Hughes Medical Institute, Laboratory of RNA Molecular Biology, The Rockefeller University, New York, NY
4Computational Biology Program, Memorial Sloan-Kettering Cancer Center, New York, NY5
Current address: Department of Medicine and Institute of Computational Biomedicine, Weill Cornell Medical College, New York, NY
September 12, 2011
Supplementary Figures
Supplemental Figure 1: MicroRNA expression profile determined by hybridization array is consistentwith sequence-based expression profile
Supplemental Figure 2: Efficacy of coding region CCRs in target downregulation
Supplemental Figure 3: Sequence similarity among the highly expressed microRNAs results in ambiguoustarget assignment in PAR-CLIP data
Supplemental Figure 4: Validation of miR-302/367 inhibition and overexpression by luciferase assays
Supplemental Figure 5: Antagomir treatment downregulates both miR-302 and miR-367 activity inde-pendently
Supplemental Figure 6: Identification of the high confidence miR-302/367 targets
Supplemental Figure 7: miR-302/367 inhibition leads to reduction of BMP activity during neural induc-tion
Supplemental Figure 8: miR-302/367 inhibition decreases the efficiency of mesoderm and endoderm in-duction
Supplemental Figure 9: Luciferase assay validation of DAZAP2, SLAIN1, TOB2 regulation by miR-302
Supplemental Files
Supplementary Table 1: 454 and Solexa microRNA sequencing data
Supplementary Table 2: Upregulated targets from miR-302/367 antagomir experiments
Supplementary File 1: PAR-CLIP target sites (5′UTR, CD, 3′UTR). Excel file containing the miRNAbinding sites found in CCRs that matched annotated regions.
1
Description of Supplementary Files
Supplementary File 1: PAR-CLIP target sites (5′UTR, CD, 3′UTR).This Excel file contains three tables:CLIP 3UTR HitsCLIP 5UTR HitsCLIP CD Hits
Each table includes a description of the crosslink centered regions (CCRs) that were found in annotatedgene regions (3′UTR, 5′UTR and CD) and that contain target sites to the top 29 expressed seed familiesin hESC.
CCR records begin with a description line followed by the specific microRNA targets. The descriptionline contains the following fields:[Gene Name] [RefSeq ID] [CCR id] [number of crosslinked reads] [number of reads] [miRNA seed family]
Note that there could be more than one seed family with target sites in the CCR. The microRNAtarget lines contain the predicted target sites found in the CCR. Each line includes the following fields:[miRNA name] [miRNA alignment] [alignment string] [mRNA alignment] [mirSVR score (Betel et al., 2010)] [conservation score (Siepel et
al., 2005)] [position in UTR or CD]
For example:
LEFTY2 NM 003240 slc28214 chr1 10 43 miR-302a/302b/302c/302d/372/373/520e/520a-3p/520b/520c-3p/520d-3p/302e
hsa-miR-302a aguGGU--UUUGUA-CCUUCGUGAAu ||| | ||| |||||||| cgtCCATCACCCATCCTAAGCACTTa -0.893 0.590 153
hsa-miR-302b gaugAUUUUGUA-CCUUCGUGAAu | | ||| |||||||| tccaTCACCCATCCTAAGCACTTa -0.893 0.590 155
2
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
−14 −12 −10 −8 −6 −4 −2 0
02
46
810
12
hESC miRNA expression signal
454 seq log2(fraction)
Agi
lent
nor
mal
ized
sig
nal i
nten
sity
hsa−miR−103
hsa−miR−106b
hsa−miR−18ahsa−miR−16
hsa−miR−17
hsa−miR−19a
hsa−miR−20a
hsa−miR−21
hsa−miR−19b
hsa−miR−20b
hsa−miR−302ahsa−miR−367
hsa−miR−302bhsa−miR−302c
hsa−miR−302d
Supplemental Figure S1: MicroRNA expression profile determined by hybridization array isconsistent with sequence-based expression profile. MicroRNA expression levels measured by Agilentarray are consistent with the deep sequencing profile as determined by 454 sequencing. In particular, thenormalized array intensities are highly correlated with the sequencing coverage for the abundantly expressedmicroRNA clusters in undifferentiated hESCs.
3
−0.4 −0.2 0.0 0.2 0.4 0.6 0.8
0.0
0.2
0.4
0.6
0.8
1.0
CDF of coding regions and 3'UTR CLIP targets
log2 expression change
Cum
ulat
ive D
istri
butio
n Fu
nctio
n
bakcground3‘UTRCD
KS−test backVS_CD p−value = 0.0334
KS−test CDvsUTR p−value = 0.000111
Supplemental Figure S2: CCRs in coding regions are generally less effective in target downreg-ulation than CCRs in 3′UTRs. Similar to figure 2d,e in the manuscript, we performed a CDF analysisof genes with miR-302/367 CCR sites in their coding regions (184 genes) and similarly for genes with miR-302/367 CCRs in their 3′UTRs (654 genes). These gene sets were mutually exclusive such that genes withmiR-302/367 CCRs in both coding region and 3′UTR were excluded from the analysis. We found that theextent of target upregulation following miR-302/367 inhibition for genes with coding region targets (green)is significantly smaller than genes with 3′UTR target sites (blue). KS p-value=0.0334 for the difference be-tween coding-region set and background and the p-value=0.000111 for the difference between coding-regionand 3′UTR.
4
miR 302a/302b/302c/302d/372/373/302e
miR 17/20a/93/106a/106b/20b/519dmiR 130a/301a/130b/454/301b
139 87
19
108
61
178
201
hsa miR 130a Chsa miR 17 CAAhsa miR 302a UA
AGUGCAGUGC
AGUGC
AAUGUUAAAAGGGCAUUUACAGUGCAGGUAG
UUCCAUGUUUUGGUGA
Supplemental Figure S3: Sequence similarity among the highly expressed microRNAs resultsin ambiguous target assignment. Most target prediction methods are heavily biased towards perfectcomplementarity between the microRNA seed sequence (positions 2-8) and the target mRNA. As a result,microRNAs with similar seed sequences that are few bases shifted are often predicted to target the samesites. Among the highly expressed microRNAs in undifferentiated hESC are miR-302, miR-17/20 and miR-130 which are highly similar in their seed sequences (marked in red). Consequently, there is a large overlapamong their respective targets identified by the PAR-CLIP experiment.
5
Antagomir
Control a
ntagomir
Lipofectam
ine0
10
20
30
40
Nor
mal
ized
luci
fera
seun
itsFC
over
lipof
ecta
min
e
*
Mimic
Control m
imic
Lipofectam
ine0.0
0.5
1.0
1.5
Nor
mal
ized
luci
fera
se u
nits
FC
ove
r lip
ofec
tam
ine
a b
Supplemental Figure S4: Validation of miR-302/367 inhibition and overexpression by luciferaseassays. Cells were transfected with psi-check2 plasmid containing four miR-302a target sites and assayedfor luciferase activity after 24 hours by dual-luciferase assay. (a) Inhibition of miR-302a by antagomirs ledto increased luciferase activity relative to the corresponding controls of random antagomir and transfectionvehicle (Lipofectamine) indicating loss of miR-302a-mediated downregulation (p-value < 0.05 comparedto control antagomir). (b) Conversely, overexpression of miR-302/367 (mimics) resulted in reduction ofluciferase activity relative to controls indicating increased downregulation of the target.
6
−0.4 −0.2 0.0 0.2 0.4
0.0
0.2
0.4
0.6
0.8
1.0
CDF of miR-302/367/106 targets after miR-302~367 inhibition
Log2 expression change
Cum
ulat
ive D
istri
butio
n Fu
nctio
n
bakcgroundmiR-302miR-367miR-106
KS−test miR-302 p−value = 4.49e−06
KS−test miR-367 p−value = 0.00230
KS−test miR-106 p−value = 0.781
Supplemental Figure S5: Antagomir treatment downregulates both miR-302 and miR-367activity. CDF analysis of miR-367 and miR-302 targets following antagomir treatment indicates clearupregulation of miR-367 targets independently of miR-302 targets (gene sets are mutually exclusive). Fur-thermore, the expression levels of miR-106 targets are not significantly different from the background geneset even though miR-106 seed sequence is one base shifted relative to miR-302 (see Supplemental Fig. 3)indicating that the antagomir treatment is highly specific to the mir-302 cluster.
7
0.2 0.4 0.6 0.8 1.0
0.0
0.2
0.4
0.6
0.8
1.0
Inhibition FC enrichment
x
f(x)
Inhibitionoverlap
p−value 3.7208e−07
0 500 1000 1500
0.0
0.2
0.4
0.6
0.8
1.0
CLIP x−link enrichment
x
f(x)
PAR−CLIPoverlap
p−value 1.6184e−05
a b
Supplemental Figure S6: Identification of the high confidence miR-302/367 targets. CDF plotsdemonstrating the enrichment of both gene upregulation and number of crosslinked reads in the set of high-confidence targets. The list of 146 high-confidence targets was defined as described in the text (Fig. 3a)and compared to the respective background datasets. (a) Among the set of 622 miR-302/367 targets thatare upregulated by more than 0.2 log change upon miR-302/367 inhibition (blue circle in Fig. 3a), the 146high-confidence targets (green line) are significantly more upregulated than the remaining 476 targets (p-value ≤ 3.72e− 7, KS-test). (b) Similarly, among 734 genes identified in the PAR-CLIP experiment (greencircle in Fig. 3a) the set of high-confidence 146 miR-302/367 targets (green line) are enriched for crosslinkedsequence reads (p-value ≤ 1.62e− 5, KS-test) relative to the remaining 588 targets.
8
100 101 102 103 104
FL1
100
101
102
103
104
FL2
99.7 3.97e-3
5.96e-3
Unstained control
100 101 102 103 104
FL1- PAX6 Alexa 488
100
101
102
103
104
FL2-
H
1.7 97.3
0
Noggin, SB
100 101 102 103 104
FL1- PAX6 Alexa 488
100
101
102
103
104
FL2
99.5 0.109
1.98e-3
SB with control antagomir
100 101 102 103 104
FL1- PAX6 Alexa 488
100
101
102
103
104
FL2
97.9 1.43
0.0994
SB with miR-302 antagomir
Antagomir
Control
0.0
0.5
1.0
1.5
% o
f PA
X6 p
ositi
ve c
ells
PAX6 FACS Analysis
SB Noggin No inhibitor Noggin SB
SOX1m
iR-3
02/3
67 a
ntag
omir/
cont
rol S
OX
1 le
vels
1
2
3
4
5
6
7
8
a
b c
d fe
*
Supplemental Figure S7: miR-302/367 inhibition leads to reduction of BMP activity duringneural induction. (a) qRT- PCR for early neural marker SOX1 after 7 days of differentiation in knockoutserum replacement (KSR)-based medium containing inhibitors of BMP (Noggin) and TGF-β/Activin/Nodal(SB-431542) signaling pathways. In all pathway inhibition experiments the miR-302/367 antagomir treat-ment led to increase SOX1 levels relative to control antagomir (p-values 0.036 for SB, 0.159 for Noggin, 0.005for Noggin SB, Wilcoxon rank sum test, n=7 ). In contrast, without the inhibitor treatments there is nodifference between the two antagomir treatments (average ratio of 1).In the analysis of PAX6 expression by intracellular FACS, hESCs were transfected with control antagomirs(b) or miR-302/367 antagomir (c) and cultured in media containing the inhibitor of TGF-β/Activin/Nodalsignaling pathway (SB-431542). On day 11 of differentiation, cells were fixed and analyzed for PAX6 ex-pression by intracellular flow cytometry. (d) In all 4 replicates the percentage of PAX6 positive cells waslarger in the miR-302 antagomir vs. control antagomir (p-value = 0.004 ). As a positive control (e) cellswere grown in the presence of both TGF-β/Activin/Nodal and BMP inhibitors (SB-431542 and Noggin).(f) unstained negative control.
9
Sox17 (qRT-PCR)
Antagomir
Control a
ntagomir
0.0
0.5
1.0
1.5
Fcof
norm
aliz
edG
APD
Hra
tio
**
Brachyury (qRT-PCR)
Antagomir
Control a
ntagomir
0.0
0.5
1.0
1.5
Fcof
norm
aliz
edG
APD
Hra
tio
***
a b
Supplemental Figure S8: miR-302/367 inhibition decreases the efficiency of mesoderm andendoderm induction. Cells were transfected with miR-302/367 antagomirs and cultured in endoderm (a)or mesoderm (b) differentiation conditions. Induction was measured by qRT-PCR of SOX17 for endodermdifferentiation and BRACHYURY for mesoderm differentiation. In both cases loss of miR-302/367 led toreduced differentiation possibly through increased inhibition of TGF-β signaling (** p-value < 0.01, ***p-value < 0.001 compared to control antagomir).
10
SMAD6
SMAD7
Control
DAZAP2
SLAIN1TOB2
BTG3
CNOT6FBN2
LCOR
LEFTY1
MTMR4PTEN
RAB320
1
2
3
4
Fc o
f nor
mal
ized
GA
PDH
ratio
s ID1 levels following siRNA treatment
*** ***
**
** ***
siRNA knockdown validation (qRT-PCR)
siR
NA
/con
trol
siR
NA
fc o
f no
rmal
ized
GA
PDH
ratio
SMAD6
SMAD7
DAZAP2
SLAIN1
TOB20.0
0.2
0.4
0.6
a b
Empty ve
ctor
WT
M1+M2+
M3 M1 M2 M30.0
0.5
1.0
1.5
2.0
2.5
Empty ve
ctor
WT M0.0
0.5
1.0
1.5
2.0
Empty ve
ctor
WT M0.0
0.5
1.0
1.5
Rel
ativ
e lu
cife
rase
act
ivity
M
ir-30
2/36
7 an
tago
mir/
cont
rol
Rel
ativ
e lu
cife
rase
act
ivity
M
ir-30
2/36
7 m
imic
/con
trol
Empty ve
ctor
WT
M1+M2+
M3 M1 M2 M30.0
0.2
0.4
0.6
0.8
1.0
1.2
Empty ve
ctor
WT M0.0
0.2
0.4
0.6
0.8
1.0
1.2
Empty ve
ctor
WT M0.0
0.2
0.4
0.6
0.8
1.0
1.2
DAZAP2: Site 1, position 127 hsa-miR-302a agUGGUUUUGUACCUUCGUGAAu :|| |||| | |:|||||| WT DAZAP2 gtGCCCAAACTTTCAGGCACTTt MUT1 DAZAP2 gtgcccaaactttcaggGTAGtt Site 2, position 566 hsa-miR-302a agugguuuuguaccuuCGUGAAu |||||| WT DAZAP2 atttccttctttcctcGCACTTc MUT2 DAZAP2 atttccttctttcctcgGTAGtc Site 3, position 1227 hsa-miR-302a aguGGUUUUGUACCUUCGUGAAu |:| | ::||| |||||| WT DAZAP2 tctCTACACTGTGG-TGCACTTa MUT3 DAZAP2 tctctacactgtgg-tgGTAGta SLAIN1: Position 145 hsa-miR-302a agUGGU-UUUG--UACCUUCGUGAAu |||| |:|| || |||||| WT SLAIN1 tcACCAGAGACTAATTATTGCACTTa MUT SLAIN1 tcaccagagactaattattgGTAGta TOB2: Position 641 hsa-miR-302a agUGGUUUUGUACCUUCGUGAAu || :::| || ||||||| WT TOB2 ctACAGGGAGAT--TAGCACTTc MUT TOB2 ctacagggagat--tagGTAGtc
DAZAP2
DAZAP2
SLAIN1
SLAIN1
TOB2
TOB2
*** * * ** *
** * ** ** *
c
d
e
Supplemental Figure S9: Validation of DAZAP2, SLAIN1, TOB2 siRNA knockdown and regulation bymiR-302 by luciferase assay. (a) siRNA screen of miR-302/367 targets for BMP signaling regulation. BMP-specificmarker ID1 levels were assayed by qRT-PCR following siRNA treatment of the 11 genes as well as two positive controlsSMAD6/7 (known inhibitors of BMP signaling) and non-targeting siRNA control. Of the 11 targets knockdown ofDAZAP2, SLAIN1 and TOB2 had robust and consistent upregulation of ID1 levels when compared to the control siRNAsuggesting that these are negative regulators of BMP signaling. (b) siRNA functional validation by qRT-PCR 48 hoursafter knockdown, normalized to negative control siRNA. (c) Sequences of miR-302 target sites identified by PAR-CLIPand corresponding mutated sites. Luciferase reporter constructs containing wild type or mutant 3′UTR of DAZAP2,SLAIN1, and TOB2 were transfected into (d) hESC together with miR-302/367 antagomirs, or (e) HEK293T cellswith miR-302/367 mimics. Three DAZAP2 miR-302 target sites were mutated either individually or together. Datais presented as fold change of luciferase signal between miR-302/367 antagomir or mimic treated cells over the control,normalized to empty pMIR-REPORT plasmid (WT = wild type 3′UTR; M = mutated 3′UTR; * p-value < 0.05,**p-value < 0.01, *** p-value < 0.001)
11