Extended Experimental Procedures
Cell Culture
Mouse partial iPS cells, mouse iPS cells and mouse ES cells were maintained in ES cell
medium (DMEM containing 15% fetal calf serum (FCS), 1X non-essential amino acids
(NEAA), 5.5 mM 2-mercaptoethanol (2-ME), 50 units/ml penicillin and 50 mg/ml
streptomycin) on feeder layers of mitomycin-C-treated SNL cells stably expressing the
puromycin-resistance gene. We used conditioned medium from Plat-E cell cultures that
had been transduced with a LIF-expressing vector as a source of leukemia inhibitory
factor (LIF). MEFs were maintained in DMEM containing 10% FCS, 50 units/ml
penicillin and 50 mg/ml streptomycin. Plat-E cells were maintained in DMEM
containing 10% FCS, 50 units/ml penicillin, 50 mg/ml streptomycin, 1 mg/ml
puromycin and 10 mg/ml blasticidin S. We used 13.5 d.p.c. embryos for MEF isolation.
Mouse partial iPS cell lines used in this study were 20A4 (Okita et al., 2007), 20A5
(Okita et al., 2007) and 20A16 (Okita et al., 2007). Mouse iPS cell lines used in this
study were 492B4 (Okita et al., 2008), 492B9 (Okita et al., 2008), 178B5 (Nakagawa et
al., 2008), 20D17 (Okita et al., 2007), 256H18 (Nakagawa et al., 2008), 98A-1 (Aoi et
al., 2008) and 99-1 (Aoi et al., 2008). Mouse ES cell lines used in this study were v6.5,
RF8, and 1A2 (Okita et al., 2007). MEFs used in this study were C57BL/B6 MEF (wt
MEF), Fbx-GFP reporter MEF (Takahashi and Yamanaka, 2006) and Nanog-GFP
reporter MEF (Okita et al., 2007).
RNA Sequencing
Mouse iPS cells grown on feeder cells were passaged on gelatin-coated plates twice to
remove as many feeder cells as possible, and then their RNA was isolated using the
RNeasy Mini Kit (QIAGEN). The polyA fraction was selected using a magnetic bead-based
purification kit (Dynabeads mRNA purification kit, Invitrogen). The strand-specific
cDNA library was generated with the Whole Transcriptome Analysis Kit
(Life Technologies) using either adaptor mix A or B. The cDNA libraries for iPS cells
and MEFs were sequenced with the SOLiD system (Life Technologies) with 50-bp
single-end reads according to the manufacturer's instructions.
Data Analysis
The sequenced reads obtained from the SOLiD 3 Plus system were mapped to both the
mouse exon-exon junction sequences, which were defined by three transcript databases
(i.e., RefSeq, UCSC knownGene and Ensembl transcripts), and the mouse reference
genome (mm9) using BioScope v. 1.3 (Life Technologies) with default mapping
parameters. Only reads that mapped to exon-exon junctions were further processed in
downstream analyses. The possible alternatively spliced regions were identified using
the same three transcript databases (i.e., RefSeq, UCSC knownGene and Ensembl transcripts).
The alternatively spliced regions that were expressed in MEF and iPS cells (total
junction reads >= 10) were used to analyze the exons that were alternatively spliced
between MEF and iPS cells. Alternatively spliced exons were identified from the
inclusion ratios by statistical analysis: Fisher's exact test was applied to a 2x2
table in which junction reads were divided by cell type (i.e., MEF versus iPS cells) and
read type (i.e., inclusion versus exclusion), followed by multiple-test correction with a
false discovery rate (Benjamini-Hochberg FDR) of less than 0.01. For clustering
analysis, we obtained an additional data set including MEF, two iPS cell lines and one
ES cell line from the SOLiD 4 system to identify the splicing patterns that differed
between the MEFs and ES cells; these differences were identified by the same method
described above. The calculation of the inclusion ratio, the statistical analysis and the
clustering analysis were performed using Microsoft SQL server and R software with
custom-made programs. The functional analyses were performed using IPA
(Ingenuity Systems, www.ingenuity.com). The genes that were associated with
biological functions and pathways in the Ingenuity Knowledge Base were considered
for the analysis. Fisher's exact test was used to calculate a p-value. For motif analysis,
nucleotide sequences (300 bp in size for intronic regions and 40 bp for exonic regions)
around the skipped exons were extracted to search for overrepresented motifs of 3
to 7 nucleotides in size. A region-specific background set that includes sequences
from the set of skipped exons expressed in both MEF and iPS cells was used to
calculate motif-enrichment p-values (Fisher's exact test), followed by multiple-test
correction with a false discovery rate (Benjamini-Hochberg FDR) of less than 0.05.
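The statistical step described above (a 2x2 Fisher's exact test per alternatively spliced region, then Benjamini-Hochberg correction) can be sketched as follows. This is a minimal illustration in Python using only the standard library, not the custom SQL Server/R programs used in the study. The event names and read counts are hypothetical, the test is assumed to be two-sided, and the read filter is assumed to apply to the summed junction reads of each region.

```python
from math import comb

def inclusion_ratio(inc, exc):
    """Fraction of junction reads that support exon inclusion."""
    return inc / (inc + exc)

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are as likely as, or less likely than, the observed one.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical junction-read counts: (MEF inclusion, MEF exclusion, iPS inclusion, iPS exclusion)
events = {
    "exon_A": (90, 10, 20, 80),
    "exon_B": (30, 30, 28, 32),
    "exon_C": (2, 3, 1, 2),  # dropped by the read filter below
}
MIN_READS = 10   # assumed to mean total junction reads per region
FDR = 0.01

tested = {name: c for name, c in events.items() if sum(c) >= MIN_READS}
names = sorted(tested)
pvals = [fisher_exact_two_sided(*tested[n]) for n in names]
qvals = benjamini_hochberg(pvals)
significant = [n for n, q in zip(names, qvals) if q < FDR]
```

With these made-up counts only `exon_A` passes the FDR threshold. The same pair of functions would apply to the motif-enrichment step by replacing the read counts with counts of sequences that do or do not contain a motif.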
Mouse iPSC Generation
The generation of mouse iPS cells with retroviruses was performed as previously
described (Takahashi and Yamanaka, 2006) with some modifications. Briefly, Plat-E
cells were seeded at 8.1 x 10⁶ cells per 150-mm dish. On the next day, pMX-based
retroviral vectors encoding the four reprogramming factors were independently
introduced into Plat-E cells using the FuGENE 6 transfection reagent. After 24 h, the
medium was replaced with 20 ml of DMEM containing 10% FBS. Nanog-GFP reporter
MEFs were seeded at 0.7 x 10⁶ cells per dish in 100-mm dishes coated with a layer of
gelatin. The next day, virus-containing supernatants from the Plat-E cultures were
recovered and combined. The Nanog-GFP reporter MEFs were incubated in the
virus-containing supernatants supplemented with polybrene at a final concentration of 4 μg/ml for 24 h.
Three days after infection, the medium was changed to ES cell medium supplemented
with LIF. In knockdown experiments during iPS cell induction, retroviruses encoding shRNAs
and the four reprogramming factors were produced in Plat-E cells. Nanog-GFP reporter
MEFs were seeded at 1.0 x 10⁴ cells per well in 24-well plates coated with a layer of
gelatin one day before infection. The next day, virus-containing supernatants from the
Plat-E cultures were recovered and combined. The Nanog-GFP reporter MEFs were
incubated in the virus-containing supernatants supplemented with polybrene at a final concentration of 4
μg/ml for 24 h. Then, feeder cells were seeded on the infected fibroblasts. The
following day, the medium was changed to ES cell medium containing KSR
(Invitrogen) instead of FCS. At day 14 after infection, the area of AP-positive cells and
Nanog-GFP intensity per well were scored. GFP intensity was measured using
Powerscan4 (DS Pharma Biomedical).
Flow Cytometry
Cultures were harvested by incubation in 0.25% trypsin/1 mM EDTA for 5 min at 37°C,
and single-cell suspensions were obtained by repetitive pipetting and transfer through a
70 μm cell strainer. Cells were incubated with PE-conjugated rat anti-Thy1 (sc-52616
PE, Santa Cruz) and Alexa Fluor 647-conjugated mouse anti-SSEA-1 (sc-21702 AF647,
Santa Cruz) antibodies and analyzed on a FACSAria II instrument (BD Biosciences).
Dead cells were excluded by staining with DAPI. The data were analyzed using FlowJo
software (Tree Star).
RNA Isolation, qPCR and Absolute qPCR
RNA was isolated from MEF, partial iPS cell lines and iPS/ES cell lines using the
RNeasy Mini Kit (QIAGEN) following the manufacturer’s instructions, and cDNA was
produced using the QuantiTect Reverse Transcription Kit (QIAGEN). Real-time
quantitative PCR reactions were set up with SYBR Premix Ex Taq II (Perfect Real
Time) (TaKaRa) and run on a StepOne Plus QPCR System (Applied Biosystems). RNA
was isolated from tissues and FACS-sorted cells using the RNeasy Mini Kit (QIAGEN)
following the manufacturer’s instructions, and cDNA was produced using the
QuantiTect Reverse Transcription Kit (QIAGEN). For high-throughput qPCR, cDNA
was pre-amplified, and real-time quantitative PCR reactions were loaded onto a 96.96
qPCR Dynamic Array and run on the BioMark System (Fluidigm) following the
manufacturer’s instructions. For digital PCR, cDNA was diluted on dqPCR 37K chips,
and PCR reactions were run using the BioMark System (Fluidigm). To determine
inclusion ratios, the expression levels of the splicing variants were detected separately: for
inclusion variants, one primer of each pair was designed within the alternative exon, and for
exclusion variants, one primer of each pair was designed to span the exon-exon junction.
The primer sequences for qPCR are listed in Supplementary Table S3. For each
primer pair, we checked the products by gel electrophoresis to confirm
that only a single band was detected and that the size of the qPCR product was correct. Further,
we purified a subset of the PCR products and confirmed their sequences. Finally, we
cloned both the inclusion and exclusion isoforms of almost all the genes whose splicing
ratio was examined into plasmid vectors. For some genes, a region of several hundred bp
containing the alternative exon was cloned rather than the whole coding
sequence. With these plasmids as templates, we tested whether the PCR primers specifically
detected only their intended targets (data not shown).
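The absolute qPCR calculation outlined in the Figure S2A workflow (counting positive chambers, estimating molecule numbers from the Poisson distribution, then scaling qPCR data by a digital-PCR reference) can be sketched as below. This is an illustrative sketch only. The Poisson correction is the standard digital PCR formula; the way the reference count is combined with Ct values (`absolute_expression`, with an assumed amplification factor of 2 per cycle) is an assumption, since the text does not give the exact formula, and all numbers are hypothetical.

```python
import math

CHAMBERS_PER_PANEL = 770  # chambers per panel, as stated in the Figure S2A workflow

def poisson_corrected_molecules(positive, total=CHAMBERS_PER_PANEL):
    """Estimate the number of template molecules loaded into a panel.

    A chamber is negative only if it received zero molecules, so the mean
    number of molecules per chamber is -ln(fraction of negative chambers).
    """
    if not 0 <= positive < total:
        raise ValueError("positive chambers must be in [0, total)")
    mean_per_chamber = -math.log(1.0 - positive / total)
    return total * mean_per_chamber

def absolute_expression(ct_sample, ct_reference, reference_molecules, efficiency=2.0):
    """Scale a qPCR Ct value to molecule units using a digital-PCR reference.

    Assumption: the same target is measured in the reference sample by both
    digital PCR and qPCR, and amplification multiplies by `efficiency` per cycle.
    """
    return reference_molecules * efficiency ** (ct_reference - ct_sample)

def inclusion_ratio(inclusion, exclusion):
    """Inclusion variant abundance as a fraction of both variants."""
    return inclusion / (inclusion + exclusion)

# Hypothetical reference sample: positive chambers counted for each variant.
ref_inc = poisson_corrected_molecules(385)
ref_exc = poisson_corrected_molecules(120)

# Hypothetical Ct values for the reference and for one test sample.
inc = absolute_expression(ct_sample=21.0, ct_reference=22.0, reference_molecules=ref_inc)
exc = absolute_expression(ct_sample=24.0, ct_reference=23.5, reference_molecules=ref_exc)
ratio = inclusion_ratio(inc, exc)
```

Because both variants are expressed in the same molecule units, the ratio is not distorted by differences in primer efficiency between the inclusion and exclusion assays, which is the stated purpose of anchoring the qPCR data to digital PCR.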
Microarray Experiments
Total RNA was prepared using the RNeasy Mini Kit (Qiagen) according to the
manufacturer’s instructions. cDNA synthesis and transcriptional amplification were
performed using 200 ng of total RNA following the "Whole Transcript (WT) Expression
kit" (Ambion/Affymetrix). The fragmented and biotin-labeled cDNA targets were
hybridized to mouse Gene 1.0 ST arrays (Affymetrix) according to Affymetrix protocols.
Hybridized arrays were scanned using an Affymetrix GeneChip Scanner. The data
analyses were performed using GeneSpring GX software (Agilent Technologies).
siRNA Screen
For the RNAi screen, the target genes (92 genes in total) were selected according to the following
criteria: (1) their expression levels were increased by 2-fold or more in both iPS cells
and ES cells compared with MEFs based on our microarray data, and (2) they were in
the “RNA binding” biogroup in the NextBio database (October 2010). Three Silencer
siRNAs (Life Technologies) for a single target gene were pooled. The siRNA pools
were then transfected into murine iPS cells and ES cells using Lipofectamine 2000 (Life
Technologies) by reverse transfection according to the manufacturer’s instructions. siRNAs
were used at 25 nM. Transfection efficiency was examined using an Alexa Fluor
555-labeled, double-stranded RNA oligomer (Invitrogen). After siRNA treatment for 48
hr, cDNA was synthesized using the Cells-to-Ct kit (Life Technologies). The cDNA was then
pre-amplified, and real-time quantitative PCR reactions were run on the BioMark
System (Fluidigm) following the manufacturer’s instructions. Each of the siRNA pools
for U2af1 and Srsf3 was tested under the same conditions as above.
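The two selection criteria for the screen targets can be expressed as a simple filter. The sketch below is only an illustration: the gene names, expression values and the membership set are hypothetical placeholders, and expression is assumed to be on a linear (not log) scale.

```python
def select_targets(expression, rna_binding, min_fold=2.0):
    """Return genes that are up at least `min_fold` in both iPS and ES cells
    relative to MEFs and that belong to the RNA-binding annotation set."""
    selected = []
    for gene, levels in expression.items():
        up_in_ips = levels["iPS"] >= min_fold * levels["MEF"]
        up_in_es = levels["ES"] >= min_fold * levels["MEF"]
        if up_in_ips and up_in_es and gene in rna_binding:
            selected.append(gene)
    return sorted(selected)

# Hypothetical linear-scale microarray signals.
expression = {
    "GeneA": {"MEF": 100.0, "iPS": 450.0, "ES": 380.0},  # passes both criteria
    "GeneB": {"MEF": 100.0, "iPS": 250.0, "ES": 150.0},  # not 2-fold up in ES cells
    "GeneC": {"MEF": 50.0, "iPS": 400.0, "ES": 420.0},   # up, but not annotated
}
rna_binding = {"GeneA", "GeneB"}  # hypothetical annotation set

targets = select_targets(expression, rna_binding)
```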
Immunoblotting
Cells were lysed in 1 x Cell Lysis Buffer (Cell Signaling Technology) with protease
inhibitors (Complete; Roche) and 0.5% deoxycholate. Cell lysates were separated by
SDS-PAGE and analyzed by immunoblotting with anti-U2AF1 and anti-SRSF3 antibodies
(Santa Cruz Biotechnology) and an anti-GAPDH antibody (Ambion).
Public Microarray Data Analysis
The expression profiles of the 92 RNA-binding genes in tissues and cell lines were
obtained from the BioGPS public database (http://biogps.gnf.org).
Extended Discussion
In this study, we performed a global analysis of alternative splicing and identified several
hundred genes whose splicing patterns change during the reprogramming process.
Moreover, our data indicate that the molecular properties of somatic cells revert to those of
embryonic stem cells in terms of isoform expression. Although the functional
significance of each splicing variant in the reprogramming process remains to be
elucidated, our analysis reveals that cellular reprogramming is accompanied by drastic
changes in the splicing regulation of genes that are expressed in somatic cells and
pluripotent stem cells.
Our analyses identified several hundred genes that undergo alterations in
splicing patterns during somatic cell reprogramming. We also found that about half of
these genes change in expression by more than two-fold based on our RNA-seq
data. Alternative splicing can induce nonsense-mediated RNA decay (NMD) and
thereby modulate transcript levels. However, only approximately 10% of the alternative splicing
changes are predicted to modulate transcript levels by NMD (data not shown).
Therefore, the changes in gene expression cannot be explained by NMD
alone. It is assumed that gene expression levels are determined not only by NMD but
also by transcriptional regulation and/or modulation of mRNA stability. The relationship
between alternative splicing and transcriptional regulation should be investigated in a
future study. Our computational analyses provided mechanistic insight into the
differences in splicing patterns between somatic cells and pluripotent stem cells. First,
overrepresented motifs were identified in and around the alternative exons, implying
that particular RNA-binding proteins recognize these sequences to control splicing
outcomes during iPS cell induction. Our analysis also showed that the
introns around exons that are preferentially included in iPS cells are shorter than
those in MEFs, whereas the exons are longer. We also found that there is a
difference in the distribution of the inclusion ratios in two classes of AS events, alternative
last exon (ALE) and alternative 3' splice site (A3SS), between iPS cells and MEFs,
indicating that shorter introns tend to be removed more easily in iPS cells than in MEFs
in those two classes of AS events. The lengths of exons and their surrounding introns
have been proposed to be associated with the exon recognition efficiency (Pandit et al.,
2013). Thus, our analysis suggests that the drastic changes in the molecular machinery of
splicing regulation that occur during somatic cell reprogramming are associated with
changes in the molecular repertoire of RNA-binding proteins.
Our clustering analysis based on our absolute qPCR data demonstrated that overall
splicing patterns were similar among the iPS and ES cell lines, and that the splicing
patterns in iPS cell lines used in our study were clearly different from those in partial
iPS (piPS) cells. In contrast, the inclusion ratios of several genes varied among
pluripotent stem cells. However, we could not find any splicing events that clearly
distinguish iPS cell lines from ES cell lines, or any attributes, such as cell origin, which
could account for the differences in splicing patterns among iPS cell lines. It will be
interesting to explore the mechanisms that cause the differences in splicing among iPS
cell lines for elucidating the nature of reprogramming by transcription factors.
We found that the splicing patterns of genes in pluripotent stem cells are similar to
those in testes. This may imply that alternative splicing regulation in pluripotent stem
cells uses the same mechanisms as those in the testes. Previous studies showed that
pluripotent cells can be derived from neonatal and adult testes in mice and humans
(Conrad et al., 2008; Guan et al., 2006; Kanatsu-Shinohara et al., 2004; Kee et al., 2010).
Thus, the similarity of the splicing characteristics may reflect the potential capacity to
derive pluripotent cell lines.
Our results also indicate that the timing of splicing pattern transitions during
somatic cell reprogramming significantly varies from gene to gene. This result may
support the hypothesis that induced reprogramming is a gradual process that involves
several intermediate states (Stadtfeld et al., 2008). This sequential splicing switching,
along with the sequential events of gene expression and repression, may provide clues
to dissect the intermediate states. So far, a few genes have been proposed as markers of
cells that are committed to reprogramming (Brambrink et al., 2008; Stadtfeld et al.,
2008). However, most of them, despite being enriched in committed cells, cannot be
used to distinguish committed from non-committed cells. The splicing patterns of the
genes we identified are promising candidates for early reprogramming markers, or for the
verification of pluripotency, through the development of reporter constructs that visualize their splicing
patterns.
Our siRNA screen experiment has identified candidate RNA-binding proteins that
function as splicing regulators in pluripotent stem cells. Moreover, our analysis showed
that U2af1 and Srsf3 play a role in somatic cell reprogramming. U2af1 is one of two
subunits of U2 snRNP auxiliary factor (U2af) and binds to the 3’ splice site of the
pre-mRNA intron to enhance splicing (Webb and Wise, 2004; Wu et al., 1999). It has
also been shown that the abundance of U2af1 could affect alternative splicing (Fu et al.,
2011; Pacheco et al., 2006). Thus, it is likely that the splicing pattern in pluripotent stem
cells is maintained by the abundance of U2af1. Srsf3 is a member of the SR protein family,
whose members are known regulators of alternative splicing (Zahler et al., 1992). It has
been reported that Srsf3 knockout mice fail to form blastocysts, and die at the morula
stage (Jumaa et al., 1999). Moreover, we found that siRNAs against Srsf3 suppressed
the expression of pluripotency genes, including Nanog and Oct4 (Figure S4D),
indicating that Srsf3 is also important for the maintenance of pluripotency. The splicing
regulation by particular RNA-binding proteins is known to be critical in a number of
biological processes, such as postnatal heart development (Kalsotra and Cooper, 2011;
Kalsotra et al., 2008). After completion of our work, Han et al.
(2013) reported that two RNA-binding proteins, MBNL1 and MBNL2, whose
expression levels are much lower in pluripotent stem cells than in many differentiated
cells, are involved in the maintenance of pluripotency and cellular reprogramming by
negatively regulating alternative splicing specific for pluripotent stem cells. Hence, the
RNA-binding proteins might have a role in the coordinated splicing transition during
reprogramming, and be integrated into the molecular mechanisms underlying
reprogramming.
In summary, our study describes the drastic change in splicing isoform expression
and its regulatory mechanisms during reprogramming, and suggests that alternative
splicing regulation represents part of the mechanisms of cellular reprogramming and has
important roles in pluripotency, although the functional relevance of splicing during
cellular reprogramming remains to be elucidated.
Figure S1 (image; labels recovered from the figure)
Panel A: inclusion ratio (scale 0.0 to 1.0) for alternative 5' splice site (A5SS), alternative 3' splice site (A3SS), alternative first exon (AFE), mutually exclusive exon (MXE) and alternative last exon (ALE) events. Sample labels: iPS (492B4), iPS (178B5), MEF, ES (v6.5).
Panel B: -log10(p-value) (axis 0 to 10) for Embryonic Development; Cellular Assembly and Organization; Cellular Function and Maintenance; Gene Expression; Organismal Survival; RNA Post-Transcriptional Modification; Post-Translational Modification; Cell Morphology; Connective Tissue Development and Function; Renal and Urological System Development and Function; Tissue Morphology; Cell Cycle.
Panel C: Regions 1, 4, 5 and 8 (40 bp) and Regions 2, 3, 6 and 7 (300 bp), for Group_iPSinc and Group_MEFinc. Motif labels: ACAA (p = 2.0×10⁻²), ACAAA (p = 2.6×10⁻³), AACUA (p = 4.8×10⁻²), CAAU (p = 4.0×10⁻²), UCG (p = 3.2×10⁻²), CAA (p = 1.1×10⁻²), UUUGC (p = 3.7×10⁻²).
Figure S1. Clustering Analysis for Each Splicing Pattern, the Functional analysis
and Motif Identification, Related to Figure 1
(A) Clustering analysis of the splicing profiles for each splicing pattern, determined by
inclusion ratios, is shown as described in Figure 1B.
(B) The functional analysis using IPA identified the biological functions and pathways
that were significantly enriched in the genes whose splicing patterns are different
between MEFs and iPS cells (Figure 1A). Fisher's exact test was used to calculate a
p-value.
(C) Nucleotide sequences (300 bp in size for intronic regions and 40 bp for exonic
regions) in eight regions (from Region 1 to Region 8) around the skipped exons were
extracted to search for overrepresented motifs of 3 to 7 nucleotides in size in
Group_MEFinc and Group_iPSinc (Figure 1C). A region-specific background set that
includes sequences from the set of skipped exons expressed in both MEFs and iPS cells
was used to calculate motif-enrichment p-values (Fisher's exact test). Overrepresented
motifs in each group with p-value (Benjamini-Hochberg FDR) < 0.05 are presented.
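The motif search described in (C) can be illustrated with a short sketch: enumerate every 3- to 7-nucleotide word in each extracted sequence, count how many sequences in the group and in the background contain it, and test for enrichment. This is a sketch under stated assumptions, not the authors' program. The one-sided hypergeometric tail (equivalent to a one-sided Fisher's exact test) and the treatment of the group as a subset of the background are assumptions, and the sequences are made up.

```python
from math import comb

def kmers(seq, kmin=3, kmax=7):
    """All distinct words of length kmin..kmax in a sequence (RNA alphabet)."""
    seq = seq.upper().replace("T", "U")
    return {seq[i:i + k]
            for k in range(kmin, kmax + 1)
            for i in range(len(seq) - k + 1)}

def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) when drawing n sequences from N, of which K contain the motif."""
    denom = comb(N, n)
    return sum(comb(K, x) * comb(N - K, n - x)
               for x in range(k, min(n, K) + 1)) / denom

def motif_enrichment(group, background):
    """Enrichment p-value per motif; `background` is assumed to include `group`."""
    group_sets = [kmers(s) for s in group]
    background_sets = [kmers(s) for s in background]
    pvalues = {}
    for motif in set().union(*group_sets):
        k = sum(motif in s for s in group_sets)
        K = sum(motif in s for s in background_sets)
        pvalues[motif] = hypergeom_upper_tail(k, len(group_sets), K,
                                              len(background_sets))
    return pvalues

# Made-up sequences for one region; real input would be 40 bp or 300 bp long.
group = ["ACAAG", "GACAA"]
background = group + ["GGGCC", "CCGGG"]
pvalues = motif_enrichment(group, background)
```

The resulting p-values would then be corrected with the Benjamini-Hochberg procedure across all motifs tested in a region, as stated in the legend.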
Figure S2 (image; labels recovered from the figure)
Panel A workflow text:
1) Each PCR reaction solution is diluted into 770 chambers on “dqPCR 37K chips” (Fluidigm®). PCR reactions are run using the BioMark™ real-time PCR reader (Fluidigm®).
2) The number of chambers in which a PCR reaction occurs is counted.
3) The actual number of molecules in a unit volume is estimated based on the Poisson distribution.
4) By combining digital PCR data of a single reference sample and high-throughput qPCR data, absolute expression values are determined to directly compare expression levels of different targets (“absolute qPCR”, left panel). Then, the inclusion ratios for each sample are calculated based on the absolute qPCR data (right panel).
Panel A axes: estimated number of molecules (digital PCR); absolute expression (A.U.) and inclusion ratio for the inclusion and exclusion variants in iPSC, ESC and MEF.
Gene panels: inclusion ratio (axis 0 to 1.0) with exon inclusion and exclusion indicated, for samples 1 to 16 grouped as MEF, piPSC, iPSC and ESC. Gene labels (27): Csda, Csnk1d, Dclk2, Ezh2, Fgfr1, Foxm1, Hmgxb4, Hsf2, Map3k7, Map4k4, Mapk9, Mark2, Mark3, Max, Maz, Mlh3, Mta1, Myef2, Nek1, Nrf1, Prpf4b, Rnps1, Tbx3, Tfdp2, Trim33, Ubn1, Ubtf. Gene labels (8): Golga2, Mpzl1, Nasp, Palm, Plod2, Prrc2b, Smg7, Ubqln1.
Clustering panel sample labels: ES (1A2), ES (v6.5), iPS (98A-1), iPS (99-1), iPS (178B5), iPS (256H18), iPS (492B9), ES (RF8), iPS (492B4), iPS (20D17), wt MEF, Ng MEF, Fbx MEF, piPS (20A4), piPS (20A5), piPS (20A16).
Tissue panel sample labels: Muscle, Ovary, Testis, Kidney, Intestine, Spleen, Stomach, Liver, Lung, Heart, Thymus, Brain, iPSC, MEF.
Figure S2. Characterization of Splicing Patterns in MEFs and iPS Cells, Related to
Figure 2
(A) Workflow schematic of the absolute qPCR method.
(B) Splicing patterns of 27 genes that belong to GO categories, “DNA-binding” and
“Protein-kinase” in MEFs, three piPS cell lines, seven iPS cell lines and three ES cell
lines using absolute qPCR. Experiments were carried out as in Figure 2A.
(C) Clustering analysis of splicing patterns of MEF, partial iPS (piPS) cells, iPS cells
and ES cells. This analysis was based on the inclusion ratios obtained in Figure 2A and
S2B.
(D) Splicing patterns of 8 genes, that are not related to “DNA-binding” and
“Protein-kinase”, in MEFs, three piPS cell lines, seven iPS cell lines and three ES cell
lines using absolute qPCR. Experiments were carried out as in Figure 2A.
(E) Splicing patterns across multiple adult mouse tissues. Experiments were carried out
as in Figure 2B.
Figure S3 (image; labels recovered from the figure)
Panel A: relative expression of Thy1 and Nanog for samples 1 to 6.
Panel B: inclusion ratio (axis 0 to 1.0) with exon inclusion and exclusion indicated, for samples 1 to 6, with genes grouped as early, middle and late. Sample labels: MEF; Day -1, Thy1+; Day 4, Thy1-; Day 8, SSEA-1+; Day 14, Nanog+; iPS. Gene labels: Csda, Csnk1d, Dclk2, Ezh2, Fgfr1, Foxm1, Hmgxb4, Hsf2, Map3k7, Map4k4, Mapk9, Mark2, Mark3, Max, Maz, Mlh3, Mta1, Myef2, Nek1, Nrf1, Prpf4b, Rnps1, Tbx3, Tfdp2, Trim33, Ubn1, Ubtf.
Panel C sample labels: iPS (492B4), Ng MEF, Day -1 Thy+, Day 4 Thy-, piPS (20A-4), piPS (20A-5), piPS (20A-16), Day 8 SS+, Day 14 Nanog+.
Figure S3. Splicing Pattern Transitions during Somatic Cell Reprogramming,
Related to Figure 3
(A) qPCR analysis for marker genes. Larger sample numbers correspond to more
reprogrammed samples. The expression levels of Nanog and Thy1 were normalized to
Gapdh expression. For Thy1, the expression level in the day -1 sample was set to 1.
For Nanog, the expression level in the day 14 sample was set to 1. The sample numbers
are the same as in Figure 3B.
(B) Determination of splicing switch times during reprogramming using absolute qPCR.
Experiments were carried out as in Figure 3B.
(C) Clustering analysis of splicing patterns of FACS-sorted cells, piPS cells, iPS cells
and MEF. This analysis was based on the inclusion ratios obtained in Figure 2A, 3B,
S2B and S3B.
Figure S4 (image; labels recovered from the figure)
Panels A to J. Axis and label text: scatter plots of MEF (log10) against iPS 492B4, iPS 178B5 and ES V6.5 (log10); expression heat map of the 92 RNA-binding genes (low to high, -3 to 3, log2) across samples including MEF, testis, ES cell 1 and ES cell 2; relative expression of Oct4 and Cdx2 in iPSC and ESC (N.T., N.C.1, siOct4 1 to 3); relative expression (log2, axis -4 to 4) of Nanog and Oct4 for N.C.1, N.C.2, Oct4 and each siRNA target gene; relative expression of U2af1 and Srsf3 in iPSC (N.C., siU2af1 1 to 3; N.C., siSrsf3 1 to 3); Inc/Exc (log2) for Mark3, Foxm1, Trim33, Ezh2, Mta1, Myef2 and Nrf1 (siU2af1) and for Hmgxb4, Rnps1, Map4k4 and Mark2 (siSrsf3); relative expression and GFP intensity for shNC and shRNAs against Snrpa1, Nsun2, Ddx46, Hnrpa1, Celf1, Nhp2 and Larp4.
Figure S4. siRNA Screen for RNA-binding Proteins which Regulate Splicing
Patterns in Pluripotent Stem Cells, Related to Figure 4
(A) Scatter plots of gene expression profiles in MEFs and ES/iPS cells. Only the 92
RNA-binding protein-encoding genes, which were selected based on the criterion that
their expression level is at least 2-fold higher in iPS/ES cells than in MEFs, are shown in
the scatter plots.
(B) Expression profiles, across various tissues and cell lines, of the RNA-binding genes
that are highly expressed in pluripotent stem cells. Hierarchical clustering of the
expression profiles of the 92 RNA-binding genes was based on the data sets registered in
the BioGPS database.
(C) qPCR analysis in iPS cells (492B4) and ES cells (RF8) treated with siRNAs against
Oct4. An siRNA against Oct4 effectively downregulated its target expression by ~90%
relative to negative control siRNA, and the treatment with the siRNA against Oct4 for
48 hr was sufficient to induce Cdx2, a trophectoderm marker gene. The expression
levels of Oct4 and Cdx2 were normalized to Gapdh expression. The expression levels in
the N.T. sample were set to 1. N.T.: no treatment.
(D) qPCR analysis in iPS cells (492B4) and ES cells (RF8) treated with siRNAs against
RNA-binding protein-encoding genes. Relative expression of the pluripotency genes
Nanog and Oct4 in each siRNA-treated sample is shown. The expression levels of Nanog
and Oct4 were normalized to Gapdh expression. The expression levels in the sample
treated with N.C.1 siRNA were set to 0 (log2). N.C.: negative control.
(E and G) The expression levels of U2af1 (E) and Srsf3 (G) were determined by qPCR in
iPS cells (492B4) after treatment with each of the siRNA pools against U2af1 and Srsf3,
respectively. The expression levels were normalized to Gapdh. The expression levels in
the N.C. sample were set to 1. N.C.: negative control.
(F and H) The graphs indicate fold changes in the relative inclusion ratio for each gene
after treatment with each of the siRNA pools against U2af1 and Srsf3. The inclusion
ratios in the N.C. sample were set to 0 (log2). Experiments were performed in biological
triplicate. The error bars represent standard deviations.
(I) Knockdown efficiency of each shRNA in MEFs. The expression level of each
RNA-binding protein-encoding gene was analyzed by qPCR and normalized to Gapdh.
The expression levels in the shNC-treated sample were set to 1; shNC, negative control
shRNA. Mean ± SD; n=3. *p < 0.05 by Student’s t test compared with control
shRNA-expressing MEFs.
(J) The effect of each shRNA on somatic cell reprogramming. Retroviruses
expressing Oct4, Sox2, Klf4, c-Myc and shRNAs were used to infect Nanog-GFP
reporter MEFs on day 0. Nanog-GFP reporter activity was measured with a microplate
reader on day 14 after infection. Non-infected MEFs were used as a GFP-negative control.
The GFP intensity in the shNC-treated sample was set to 1. Mean ± SD; n=3. *p < 0.05
by Student’s t test compared with control shRNA-expressing cells.
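The normalization described for panels (F) and (H), in which inclusion ratios are expressed as log2 fold changes with the N.C. sample set to 0 and error bars showing the standard deviation of triplicates, can be sketched as follows. This is a minimal illustration and not the authors' analysis script: the function names and the input values are hypothetical, and dividing each replicate by the mean control ratio is an assumption, since the legend does not state how replicates were paired.

```python
import math
from statistics import mean, stdev


def log2_fold_change(ratio, control_ratio):
    """Return log2(ratio / control_ratio); the control itself maps to 0."""
    return math.log2(ratio / control_ratio)


def summarize(replicates):
    """Return the mean and sample standard deviation of replicate values."""
    return mean(replicates), stdev(replicates)


# Hypothetical Inc/Exc ratios from three biological replicates
# (illustrative numbers only, not data from this study).
control_ratios = [2.0, 2.0, 2.0]      # N.C. siRNA
knockdown_ratios = [1.0, 0.5, 1.0]    # siRNA pool against a splicing factor

# Assumption: each knockdown replicate is compared with the mean control ratio.
control_mean = mean(control_ratios)
log2_values = [log2_fold_change(r, control_mean) for r in knockdown_ratios]
avg, sd = summarize(log2_values)

print(f"log2 fold change: {avg:.2f} +/- {sd:.2f} (mean +/- SD, n={len(log2_values)})")
```

A negative value on this scale indicates that the inclusion ratio fell after knockdown, and a value of -1 corresponds to a 2-fold decrease.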
Table S1. Summary of the Mapping of SOLiD Reads by RNA-seq, Related to Figure 1

SOLiD3plus
                  MEF                   MEF                   iPS, 492B4            iPS, 492B4
                  (Adaptor A)           (Adaptor B)           (Adaptor A)           (Adaptor B)
Total reads       99,678,123            94,618,693            105,705,946           80,531,414
Mapped reads      74,371,900  74.61%    65,943,715  69.69%    79,926,847  75.61%    56,713,380  70.42%
Junction reads    6,024,795   6.04%     5,015,361   5.30%     5,956,049   5.63%     3,787,470   4.70%

SOLiD3plus, Adaptor A and B libraries combined
                        MEF                    iPS, 492B4
Total sequenced reads   194,296,816            186,237,360
Mapped reads            140,315,615  72.22%    136,640,227  73.37%
Junction reads          11,040,156   5.68%     9,743,519    5.23%

SOLiD4
                  MEF                   iPS, 492B4            iPS, 178B5            ES, V6.5
                  (Adaptor A)           (Adaptor A)           (Adaptor A)           (Adaptor A)
Total reads       119,542,102           125,608,031           92,175,180            120,056,116
Mapped reads      104,563,190  87.47%   107,612,507  85.67%   80,140,580  86.94%    103,025,982  85.81%
Junction reads    8,513,309    7.12%    8,204,982    6.53%    6,178,685   6.70%     7,970,114    6.64%
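The percentages in Table S1 are consistent with each count being divided by the total reads of the same library, and the combined SOLiD3plus figures with summing the Adaptor A and Adaptor B libraries of each sample. The short sketch below reproduces that arithmetic from the SOLiD3plus counts in the table. The dictionary layout and the helper names `pooled` and `percent` are illustrative assumptions and not part of the original analysis.

```python
# Read counts copied from the SOLiD3plus section of Table S1.
solid3plus = {
    "MEF": {
        "A": {"total": 99_678_123, "mapped": 74_371_900, "junction": 6_024_795},
        "B": {"total": 94_618_693, "mapped": 65_943_715, "junction": 5_015_361},
    },
    "iPS 492B4": {
        "A": {"total": 105_705_946, "mapped": 79_926_847, "junction": 5_956_049},
        "B": {"total": 80_531_414, "mapped": 56_713_380, "junction": 3_787_470},
    },
}


def pooled(libraries):
    """Sum total, mapped and junction reads over the libraries of one sample."""
    keys = ("total", "mapped", "junction")
    return {k: sum(lib[k] for lib in libraries.values()) for k in keys}


def percent(part, whole):
    """Percentage of `whole` represented by `part`, rounded to two decimals."""
    return round(100 * part / whole, 2)


for sample, libraries in solid3plus.items():
    p = pooled(libraries)
    print(
        sample,
        f"total={p['total']:,}",
        f"mapped={percent(p['mapped'], p['total'])}%",
        f"junction={percent(p['junction'], p['total'])}%",
    )
```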
Table S2. The List of Genes Whose Splicing Patterns Differ between MEFs and iPS
Cells by More Than 0.2 with Respect to the Inclusion Ratio with Statistical
Significance, Related to Figure 1
See separate Excel file.
Table S3. PCR Primers List, Related to Figures 2, 3, and 4
See separate Excel file.
Supplemental References
Aoi, T., Yae, K., Nakagawa, M., Ichisaka, T., Okita, K., Takahashi, K., Chiba, T., and
Yamanaka, S. (2008). Generation of pluripotent stem cells from adult mouse liver and
stomach cells. Science 321, 699-702.
Conrad, S., Renninger, M., Hennenlotter, J., Wiesner, T., Just, L., Bonin, M., Aicher, W.,
Buhring, H.J., Mattheus, U., Mack, A., et al. (2008). Generation of pluripotent stem cells
from adult human testis. Nature 456, 344-349.
Fu, Y., Masuda, A., Ito, M., Shinmi, J., and Ohno, K. (2011). AG-dependent 3'-splice sites are
predisposed to aberrant splicing due to a mutation at the first nucleotide of an exon. Nucleic
Acids Res. 39, 4396-4404.
Guan, K., Nayernia, K., Maier, L.S., Wagner, S., Dressel, R., Lee, J.H., Nolte, J., Wolf, F., Li,
M., Engel, W., et al. (2006). Pluripotency of spermatogonial stem cells from adult mouse
testis. Nature 440, 1199-1203.
Jumaa, H., Wei, G., and Nielsen, P.J. (1999). Blastocyst formation is blocked in mouse
embryos lacking the splicing factor SRp20. Curr. Biol. 9, 899-902.
Kalsotra, A., and Cooper, T.A. (2011). Functional consequences of developmentally regulated
alternative splicing. Nat. Rev. Genet. 12, 715-729.
Kalsotra, A., Xiao, X., Ward, A.J., Castle, J.C., Johnson, J.M., Burge, C.B., and Cooper, T.A.
(2008). A postnatal switch of CELF and MBNL proteins reprograms alternative splicing in
the developing heart. Proc. Natl. Acad. Sci. USA 105, 20333-20338.
Kanatsu-Shinohara, M., Inoue, K., Lee, J., Yoshimoto, M., Ogonuki, N., Miki, H., Baba, S.,
Kato, T., Kazuki, Y., Toyokuni, S., et al. (2004). Generation of pluripotent stem cells from
neonatal mouse testis. Cell 119, 1001-1012.
Kee, K., Pera, R.A., and Turek, P.J. (2010). Testicular germline stem cells. Nat. Rev. Urol. 7,
94-100.
Nakagawa, M., Koyanagi, M., Tanabe, K., Takahashi, K., Ichisaka, T., Aoi, T., Okita, K.,
Mochiduki, Y., Takizawa, N., and Yamanaka, S. (2008). Generation of induced pluripotent
stem cells without Myc from mouse and human fibroblasts. Nat. Biotechnol. 26, 101-106.
Okita, K., Nakagawa, M., Hyenjong, H., Ichisaka, T., and Yamanaka, S. (2008). Generation
of mouse induced pluripotent stem cells without viral vectors. Science 322, 949-953.
Pacheco, T.R., Coelho, M.B., Desterro, J.M., Mollet, I., and Carmo-Fonseca, M. (2006). In
vivo requirement of the small subunit of U2AF for recognition of a weak 3' splice site. Mol.
Cell. Biol. 26, 8183-8190.
Pandit, S., Zhou, Y., Shiue, L., Coutinho-Mansfield, G., Li, H., Qiu, J., Huang, J., Yeo, G.W.,
Ares, M., Jr., and Fu, X.D. (2013). Genome-wide analysis reveals SR protein cooperation and
competition in regulated splicing. Mol. Cell 50, 223-235.
Webb, C.J., and Wise, J.A. (2004). The splicing factor U2AF small subunit is functionally
conserved between fission yeast and humans. Mol. Cell. Biol. 24, 4229-4240.
Wu, S., Romfo, C.M., Nilsen, T.W., and Green, M.R. (1999). Functional recognition of the 3'
splice site AG by the splicing factor U2AF35. Nature 402, 832-835.
Zahler, A.M., Lane, W.S., Stolk, J.A., and Roth, M.B. (1992). SR proteins: a conserved family
of pre-mRNA splicing factors. Genes Dev. 6, 837-847.