supplementary information: genome-wide identi cation of...

Supplementary information: Genome-wide identification of

microRNA targets in human ES cells reveals a role for miR-302 in

modulating BMP response

Inna Lipchina1, Yechiel Elkabetz2, Markus Hafner3, Robert Sheridan4, Aleksandra Mihailovic3,Thomas Tuschl3, Chris Sander4, Lorenz Studer1, Doron Betel4,5

Correspondence to [email protected] reaches all authors1Developmental Biology Program, Memorial Sloan-Kettering Cancer Center, New York, NY

2Department of Cell and Developmental Biology, Sackler School of Medicine, Tel Aviv University, Israel3Howard Hughes Medical Institute, Laboratory of RNA Molecular Biology, The Rockefeller University, New York, NY

4Computational Biology Program, Memorial Sloan-Kettering Cancer Center, New York, NY5

Current address: Department of Medicine and Institute of Computational Biomedicine, Weill Cornell Medical College, New York, NY

September 12, 2011

Supplementary Figures

Supplemental Figure 1: MicroRNA expression profile determined by hybridization array is consistentwith sequence-based expression profile

Supplemental Figure 2: Efficacy of coding region CCRs in target downregulation

Supplemental Figure 3: Sequence similarity among the highly expressed microRNAs results in ambiguoustarget assignment in PAR-CLIP data

Supplemental Figure 4: Validation of miR-302/367 inhibition and overexpression by luciferase assays

Supplemental Figure 5: Antagomir treatment downregulates both miR-302 and miR-367 activity inde-pendently

Supplemental Figure 6: Identification of the high confidence miR-302/367 targets

Supplemental Figure 7: miR-302/367 inhibition leads to reduction of BMP activity during neural induc-tion

Supplemental Figure 8: miR-302/367 inhibition decreases the efficiency of mesoderm and endoderm in-duction

Supplemental Figure 9: Luciferase assay validation of DAZAP2, SLAIN1, TOB2 regulation by miR-302

Supplemental Files

Supplementary Table 1: 454 and Solexa microRNA sequencing data

Supplementary Table 2: Upregulated targets from miR-302/367 antagomir experiments

Supplementary File 1: PAR-CLIP target sites (5′UTR, CD, 3′UTR). Excel file containing the miRNAbinding sites found in CCRs that matched annotated regions.

1

Description of Supplementary Files

Supplementary File 1: PAR-CLIP target sites (5′UTR, CD, 3′UTR).This Excel file contains three tables:CLIP 3UTR HitsCLIP 5UTR HitsCLIP CD Hits

Each table includes a description of the crosslink centered regions (CCRs) that were found in annotatedgene regions (3′UTR, 5′UTR and CD) and that contain target sites to the top 29 expressed seed familiesin hESC.

CCR records begin with a description line followed by the specific microRNA targets. The descriptionline contains the following fields:[Gene Name] [RefSeq ID] [CCR id] [number of crosslinked reads] [number of reads] [miRNA seed family]

Note that there could be more than one seed family with target sites in the CCR. The microRNAtarget lines contain the predicted target sites found in the CCR. Each line includes the following fields:[miRNA name] [miRNA alignment] [alignment string] [mRNA alignment] [mirSVR score (Betel et al., 2010)] [conservation score (Siepel et

al., 2005)] [position in UTR or CD]

For example:

LEFTY2 NM 003240 slc28214 chr1 10 43 miR-302a/302b/302c/302d/372/373/520e/520a-3p/520b/520c-3p/520d-3p/302e

hsa-miR-302a aguGGU--UUUGUA-CCUUCGUGAAu ||| | ||| |||||||| cgtCCATCACCCATCCTAAGCACTTa -0.893 0.590 153

hsa-miR-302b gaugAUUUUGUA-CCUUCGUGAAu | | ||| |||||||| tccaTCACCCATCCTAAGCACTTa -0.893 0.590 155

2

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●●

●

●

●

●

●

●

● ●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

−14 −12 −10 −8 −6 −4 −2 0

02

46

810

12

hESC miRNA expression signal

454 seq log2(fraction)

Agi

lent

nor

mal

ized

sig

nal i

nten

sity

hsa−miR−103

hsa−miR−106b

hsa−miR−18ahsa−miR−16

hsa−miR−17

hsa−miR−19a

hsa−miR−20a

hsa−miR−21

hsa−miR−19b

hsa−miR−20b

hsa−miR−302ahsa−miR−367

hsa−miR−302bhsa−miR−302c

hsa−miR−302d

Supplemental Figure S1: MicroRNA expression profile determined by hybridization array isconsistent with sequence-based expression profile. MicroRNA expression levels measured by Agilentarray are consistent with the deep sequencing profile as determined by 454 sequencing. In particular, thenormalized array intensities are highly correlated with the sequencing coverage for the abundantly expressedmicroRNA clusters in undifferentiated hESCs.

3

−0.4 −0.2 0.0 0.2 0.4 0.6 0.8

0.0

0.2

0.4

0.6

0.8

1.0

CDF of coding regions and 3'UTR CLIP targets

log2 expression change

Cum

ulat

ive D

istri

butio

n Fu

nctio

n

bakcground3‘UTRCD

KS−test backVS_CD p−value = 0.0334

KS−test CDvsUTR p−value = 0.000111

Supplemental Figure S2: CCRs in coding regions are generally less effective in target downreg-ulation than CCRs in 3′UTRs. Similar to figure 2d,e in the manuscript, we performed a CDF analysisof genes with miR-302/367 CCR sites in their coding regions (184 genes) and similarly for genes with miR-302/367 CCRs in their 3′UTRs (654 genes). These gene sets were mutually exclusive such that genes withmiR-302/367 CCRs in both coding region and 3′UTR were excluded from the analysis. We found that theextent of target upregulation following miR-302/367 inhibition for genes with coding region targets (green)is significantly smaller than genes with 3′UTR target sites (blue). KS p-value=0.0334 for the difference be-tween coding-region set and background and the p-value=0.000111 for the difference between coding-regionand 3′UTR.

4

miR 302a/302b/302c/302d/372/373/302e

miR 17/20a/93/106a/106b/20b/519dmiR 130a/301a/130b/454/301b

139 87

19

108

61

178

201

hsa miR 130a Chsa miR 17 CAAhsa miR 302a UA

AGUGCAGUGC

AGUGC

AAUGUUAAAAGGGCAUUUACAGUGCAGGUAG

UUCCAUGUUUUGGUGA

Supplemental Figure S3: Sequence similarity among the highly expressed microRNAs resultsin ambiguous target assignment. Most target prediction methods are heavily biased towards perfectcomplementarity between the microRNA seed sequence (positions 2-8) and the target mRNA. As a result,microRNAs with similar seed sequences that are few bases shifted are often predicted to target the samesites. Among the highly expressed microRNAs in undifferentiated hESC are miR-302, miR-17/20 and miR-130 which are highly similar in their seed sequences (marked in red). Consequently, there is a large overlapamong their respective targets identified by the PAR-CLIP experiment.

5

Antagomir

Control a

ntagomir

Lipofectam

ine0

10

20

30

40

Nor

mal

ized

luci

fera

seun

itsFC

over

lipof

ecta

min

e

*

Mimic

Control m

imic

Lipofectam

ine0.0

0.5

1.0

1.5

Nor

mal

ized

luci

fera

se u

nits

FC

ove

r lip

ofec

tam

ine

a b

Supplemental Figure S4: Validation of miR-302/367 inhibition and overexpression by luciferaseassays. Cells were transfected with psi-check2 plasmid containing four miR-302a target sites and assayedfor luciferase activity after 24 hours by dual-luciferase assay. (a) Inhibition of miR-302a by antagomirs ledto increased luciferase activity relative to the corresponding controls of random antagomir and transfectionvehicle (Lipofectamine) indicating loss of miR-302a-mediated downregulation (p-value < 0.05 comparedto control antagomir). (b) Conversely, overexpression of miR-302/367 (mimics) resulted in reduction ofluciferase activity relative to controls indicating increased downregulation of the target.

6

−0.4 −0.2 0.0 0.2 0.4

0.0

0.2

0.4

0.6

0.8

1.0

CDF of miR-302/367/106 targets after miR-302~367 inhibition

Log2 expression change

Cum

ulat

ive D

istri

butio

n Fu

nctio

n

bakcgroundmiR-302miR-367miR-106

KS−test miR-302 p−value = 4.49e−06

KS−test miR-367 p−value = 0.00230

KS−test miR-106 p−value = 0.781

Supplemental Figure S5: Antagomir treatment downregulates both miR-302 and miR-367activity. CDF analysis of miR-367 and miR-302 targets following antagomir treatment indicates clearupregulation of miR-367 targets independently of miR-302 targets (gene sets are mutually exclusive). Fur-thermore, the expression levels of miR-106 targets are not significantly different from the background geneset even though miR-106 seed sequence is one base shifted relative to miR-302 (see Supplemental Fig. 3)indicating that the antagomir treatment is highly specific to the mir-302 cluster.

7

0.2 0.4 0.6 0.8 1.0

0.0

0.2

0.4

0.6

0.8

1.0

Inhibition FC enrichment

x

f(x)

Inhibitionoverlap

p−value 3.7208e−07

0 500 1000 1500

0.0

0.2

0.4

0.6

0.8

1.0

CLIP x−link enrichment

x

f(x)

PAR−CLIPoverlap

p−value 1.6184e−05

a b

Supplemental Figure S6: Identification of the high confidence miR-302/367 targets. CDF plotsdemonstrating the enrichment of both gene upregulation and number of crosslinked reads in the set of high-confidence targets. The list of 146 high-confidence targets was defined as described in the text (Fig. 3a)and compared to the respective background datasets. (a) Among the set of 622 miR-302/367 targets thatare upregulated by more than 0.2 log change upon miR-302/367 inhibition (blue circle in Fig. 3a), the 146high-confidence targets (green line) are significantly more upregulated than the remaining 476 targets (p-value ≤ 3.72e− 7, KS-test). (b) Similarly, among 734 genes identified in the PAR-CLIP experiment (greencircle in Fig. 3a) the set of high-confidence 146 miR-302/367 targets (green line) are enriched for crosslinkedsequence reads (p-value ≤ 1.62e− 5, KS-test) relative to the remaining 588 targets.

8

100 101 102 103 104

FL1

100

101

102

103

104

FL2

99.7 3.97e-3

5.96e-3

Unstained control

100 101 102 103 104

FL1- PAX6 Alexa 488

100

101

102

103

104

FL2-

H

1.7 97.3

0

Noggin, SB

100 101 102 103 104

FL1- PAX6 Alexa 488

100

101

102

103

104

FL2

99.5 0.109

1.98e-3

SB with control antagomir

100 101 102 103 104

FL1- PAX6 Alexa 488

100

101

102

103

104

FL2

97.9 1.43

0.0994

SB with miR-302 antagomir

Antagomir

Control

0.0

0.5

1.0

1.5

% o

f PA

X6 p

ositi

ve c

ells

PAX6 FACS Analysis

SB Noggin No inhibitor Noggin SB

SOX1m

iR-3

02/3

67 a

ntag

omir/

cont

rol S

OX

1 le

vels

1

2

3

4

5

6

7

8

a

b c

d fe

*

Supplemental Figure S7: miR-302/367 inhibition leads to reduction of BMP activity duringneural induction. (a) qRT- PCR for early neural marker SOX1 after 7 days of differentiation in knockoutserum replacement (KSR)-based medium containing inhibitors of BMP (Noggin) and TGF-β/Activin/Nodal(SB-431542) signaling pathways. In all pathway inhibition experiments the miR-302/367 antagomir treat-ment led to increase SOX1 levels relative to control antagomir (p-values 0.036 for SB, 0.159 for Noggin, 0.005for Noggin SB, Wilcoxon rank sum test, n=7 ). In contrast, without the inhibitor treatments there is nodifference between the two antagomir treatments (average ratio of 1).In the analysis of PAX6 expression by intracellular FACS, hESCs were transfected with control antagomirs(b) or miR-302/367 antagomir (c) and cultured in media containing the inhibitor of TGF-β/Activin/Nodalsignaling pathway (SB-431542). On day 11 of differentiation, cells were fixed and analyzed for PAX6 ex-pression by intracellular flow cytometry. (d) In all 4 replicates the percentage of PAX6 positive cells waslarger in the miR-302 antagomir vs. control antagomir (p-value = 0.004 ). As a positive control (e) cellswere grown in the presence of both TGF-β/Activin/Nodal and BMP inhibitors (SB-431542 and Noggin).(f) unstained negative control.

9

Sox17 (qRT-PCR)

Antagomir

Control a

ntagomir

0.0

0.5

1.0

1.5

Fcof

norm

aliz

edG

APD

Hra

tio

**

Brachyury (qRT-PCR)

Antagomir

Control a

ntagomir

0.0

0.5

1.0

1.5

Fcof

norm

aliz

edG

APD

Hra

tio

***

a b

Supplemental Figure S8: miR-302/367 inhibition decreases the efficiency of mesoderm andendoderm induction. Cells were transfected with miR-302/367 antagomirs and cultured in endoderm (a)or mesoderm (b) differentiation conditions. Induction was measured by qRT-PCR of SOX17 for endodermdifferentiation and BRACHYURY for mesoderm differentiation. In both cases loss of miR-302/367 led toreduced differentiation possibly through increased inhibition of TGF-β signaling (** p-value < 0.01, ***p-value < 0.001 compared to control antagomir).

10

SMAD6

SMAD7

Control

DAZAP2

SLAIN1TOB2

BTG3

CNOT6FBN2

LCOR

LEFTY1

MTMR4PTEN

RAB320

1

2

3

4

Fc o

f nor

mal

ized

GA

PDH

ratio

s ID1 levels following siRNA treatment

*** ***

**

** ***

siRNA knockdown validation (qRT-PCR)

siR

NA

/con

trol

siR

NA

fc o

f no

rmal

ized

GA

PDH

ratio

SMAD6

SMAD7

DAZAP2

SLAIN1

TOB20.0

0.2

0.4

0.6

a b

Empty ve

ctor

WT

M1+M2+

M3 M1 M2 M30.0

0.5

1.0

1.5

2.0

2.5

Empty ve

ctor

WT M0.0

0.5

1.0

1.5

2.0

Empty ve

ctor

WT M0.0

0.5

1.0

1.5

Rel

ativ

e lu

cife

rase

act

ivity

M

ir-30

2/36

7 an

tago

mir/

cont

rol

Rel

ativ

e lu

cife

rase

act

ivity

M

ir-30

2/36

7 m

imic

/con

trol

Empty ve

ctor

WT

M1+M2+

M3 M1 M2 M30.0

0.2

0.4

0.6

0.8

1.0

1.2

Empty ve

ctor

WT M0.0

0.2

0.4

0.6

0.8

1.0

1.2

Empty ve

ctor

WT M0.0

0.2

0.4

0.6

0.8

1.0

1.2

DAZAP2: Site 1, position 127 hsa-miR-302a agUGGUUUUGUACCUUCGUGAAu :|| |||| | |:|||||| WT DAZAP2 gtGCCCAAACTTTCAGGCACTTt MUT1 DAZAP2 gtgcccaaactttcaggGTAGtt Site 2, position 566 hsa-miR-302a agugguuuuguaccuuCGUGAAu |||||| WT DAZAP2 atttccttctttcctcGCACTTc MUT2 DAZAP2 atttccttctttcctcgGTAGtc Site 3, position 1227 hsa-miR-302a aguGGUUUUGUACCUUCGUGAAu |:| | ::||| |||||| WT DAZAP2 tctCTACACTGTGG-TGCACTTa MUT3 DAZAP2 tctctacactgtgg-tgGTAGta SLAIN1: Position 145 hsa-miR-302a agUGGU-UUUG--UACCUUCGUGAAu |||| |:|| || |||||| WT SLAIN1 tcACCAGAGACTAATTATTGCACTTa MUT SLAIN1 tcaccagagactaattattgGTAGta TOB2: Position 641 hsa-miR-302a agUGGUUUUGUACCUUCGUGAAu || :::| || ||||||| WT TOB2 ctACAGGGAGAT--TAGCACTTc MUT TOB2 ctacagggagat--tagGTAGtc

DAZAP2

DAZAP2

SLAIN1

SLAIN1

TOB2

TOB2

*** * * ** *

** * ** ** *

c

d

e

Supplemental Figure S9: Validation of DAZAP2, SLAIN1, TOB2 siRNA knockdown and regulation bymiR-302 by luciferase assay. (a) siRNA screen of miR-302/367 targets for BMP signaling regulation. BMP-specificmarker ID1 levels were assayed by qRT-PCR following siRNA treatment of the 11 genes as well as two positive controlsSMAD6/7 (known inhibitors of BMP signaling) and non-targeting siRNA control. Of the 11 targets knockdown ofDAZAP2, SLAIN1 and TOB2 had robust and consistent upregulation of ID1 levels when compared to the control siRNAsuggesting that these are negative regulators of BMP signaling. (b) siRNA functional validation by qRT-PCR 48 hoursafter knockdown, normalized to negative control siRNA. (c) Sequences of miR-302 target sites identified by PAR-CLIPand corresponding mutated sites. Luciferase reporter constructs containing wild type or mutant 3′UTR of DAZAP2,SLAIN1, and TOB2 were transfected into (d) hESC together with miR-302/367 antagomirs, or (e) HEK293T cellswith miR-302/367 mimics. Three DAZAP2 miR-302 target sites were mutated either individually or together. Datais presented as fold change of luciferase signal between miR-302/367 antagomir or mimic treated cells over the control,normalized to empty pMIR-REPORT plasmid (WT = wild type 3′UTR; M = mutated 3′UTR; * p-value < 0.05,**p-value < 0.01, *** p-value < 0.001)

11

supplementary information: genome-wide identi cation of...

Documents