microscopic chromosomal structural and …...2020/07/16  · experiments. our model can be extended...

19
Microscopic Chromosomal Structural and Dynamical Origin of Cell Differentiation and Reprogramming Xiakun Chu 1 , and Jin Wang 1,2,* 1 Department of Chemistry 2 Department of Physics and Astronomy State University of New York at Stony Brook, Stony Brook, NY 11794, USA * [email protected] Abstract As an essential and fundamental process of life, cell de- velopment involves large-scale reorganization of the three- dimensional genome architecture, which forms the basis of gene regulation. Here, we develop a landscape-switching model to explore the microscopic chromosomal structural ori- gin of the embryonic stem cell (ESC) differentiation and the somatic cell reprogramming. We show that chromosome struc- ture exhibits significant compartment-switching in the unit of topologically associating domain. We find that the chromo- some during differentiation undergoes monotonic compaction with spatial re-positioning of active and inactive chromoso- mal loci towards the chromosome surface and interior, respec- tively. In contrast, an over-expanded chromosome, which ex- hibits universal localization of loci at the chromosomal sur- face with erasing the structural characteristics formed in the somatic cells, is observed during reprogramming. We suggest an early distinct differentiation pathway from the ESC to the terminally differentiated cell, giving rise to early bifurcation on the Waddington landscape for the ESC differentiation. Our theoretical model including the non-equilibrium effects, draws a picture of the highly irreversible cell differentiation and re- programming processes, in line with the experiments. The pre- dictions from our model provide a physical understanding of cell differentiation and reprogramming from the chromosomal structural and dynamical perspective and can be tested by fu- ture experiments. 1 Introduction Cell is the basic unit of life. Cells have two essential functions. One is the division for reproduction, and the other is differ- entiation for multi-purpose functionality. Cell differentiation has been suggested to be determined by the underlying gene regulatory network [1]. A fundamental question is how the unique gene expression pattern is progressively established in the specialized cells during differentiation. The roles of epige- netic modifications [2], such as DNA methylation and histone modifications, have been highlighted in modulating the gene expression across different cell types and lineages [3]. As the scaffold for gene expression, genome structure is constantly altered by epigenetic modifications [4]. Therefore, elucidat- ing the microscopic molecular structural origin of cell differ- entiation at the genomic level is critical for understanding the differentiation mechanism yet still challenging. Increasing evidence suggests that genome folds into a 3D architecture [5, 6, 7]. Derived from the chromosome confor- mation capture techniques [8], Hi-C measures the frequency of genomic loci spatially close to each other, resulting in a high-resolution chromosome contact map to infer chromo- some structure [9]. The prominent findings of Hi-C show that the chromosome is hierarchically organized through multiple scales from the local self-interacting domains, termed as topo- logically associating domains (TADs) [10, 11, 12], to the long- range segregation patterns, termed as compartments [9]. TADs are characterized as the functional units of the genome in con- trolling the gene expressions by restricting the promoters and enhancers within one TAD [13, 7, 14]. On the other hand, com- partments form two types of mutually excluded segregations, referred to as compartment A and B, corresponding to the gene-rich, open active euchromatin and the gene-poor, closed repressive heterochromatin, respectively [9, 15, 16]. TADs and compartments are critical genome structural features that are found to play important roles in many biological processes, in- cluding DNA replication [17], transcription [7] and gene regu- lation [18, 19, 20]. In the context of Waddington’s epigenetic landscape [21], the cell differentiation process can be metaphorically referred to as an iconic picture, where a ball rolls down on a surface (landscape) from the top to the lowest point. The guidance of the ball in seeking stable positions, which are the differenti- ated cell states, is controlled by the landscape that is shaped by the epigenetic mechanism. Genome structure, which is in intimate relation to epigenetic modifications, has been found to undergo high-order structural rearrangement during differ- entiation [22], in order to accommodate the required gene ex- pression pattern possibly through modulating the accessibility of the genomic loci to transcription factors. Therefore, cell differentiation should be associated with a landscape at the mi- croscopic molecular level that describes the genome structural dynamics, as a reminiscence of the one widely used in protein folding [23]. The information of this landscape can provide a physical understanding of genome dynamics and microscopic molecular structural origin for cell differentiation, giving in- sights into the underlying mechanism. Therefore, such struc- tural landscape at the genomic level can serve as the physi- cal and structural basis for understanding and quantifying the Waddington landscape at the gene expression level for cell dif- 1 (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint this version posted July 17, 2020. . https://doi.org/10.1101/2020.07.16.207662 doi: bioRxiv preprint

Upload: others

Post on 22-Jul-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Microscopic Chromosomal Structural and …...2020/07/16  · experiments. Our model can be extended to gain insights into the chromosome pathological disorganization in cancer cells

Microscopic Chromosomal Structural and Dynamical Origin of CellDifferentiation and Reprogramming

Xiakun Chu1, and Jin Wang1,2,∗1 Department of Chemistry

2 Department of Physics and AstronomyState University of New York at Stony Brook, Stony Brook, NY 11794, USA

[email protected]

Abstract

As an essential and fundamental process of life, cell de-velopment involves large-scale reorganization of the three-dimensional genome architecture, which forms the basis ofgene regulation. Here, we develop a landscape-switchingmodel to explore the microscopic chromosomal structural ori-gin of the embryonic stem cell (ESC) differentiation and thesomatic cell reprogramming. We show that chromosome struc-ture exhibits significant compartment-switching in the unit oftopologically associating domain. We find that the chromo-some during differentiation undergoes monotonic compactionwith spatial re-positioning of active and inactive chromoso-mal loci towards the chromosome surface and interior, respec-tively. In contrast, an over-expanded chromosome, which ex-hibits universal localization of loci at the chromosomal sur-face with erasing the structural characteristics formed in thesomatic cells, is observed during reprogramming. We suggestan early distinct differentiation pathway from the ESC to theterminally differentiated cell, giving rise to early bifurcationon the Waddington landscape for the ESC differentiation. Ourtheoretical model including the non-equilibrium effects, drawsa picture of the highly irreversible cell differentiation and re-programming processes, in line with the experiments. The pre-dictions from our model provide a physical understanding ofcell differentiation and reprogramming from the chromosomalstructural and dynamical perspective and can be tested by fu-ture experiments.

1 Introduction

Cell is the basic unit of life. Cells have two essential functions.One is the division for reproduction, and the other is differ-entiation for multi-purpose functionality. Cell differentiationhas been suggested to be determined by the underlying generegulatory network [1]. A fundamental question is how theunique gene expression pattern is progressively established inthe specialized cells during differentiation. The roles of epige-netic modifications [2], such as DNA methylation and histonemodifications, have been highlighted in modulating the geneexpression across different cell types and lineages [3]. As thescaffold for gene expression, genome structure is constantlyaltered by epigenetic modifications [4]. Therefore, elucidat-ing the microscopic molecular structural origin of cell differ-

entiation at the genomic level is critical for understanding thedifferentiation mechanism yet still challenging.

Increasing evidence suggests that genome folds into a 3Darchitecture [5, 6, 7]. Derived from the chromosome confor-mation capture techniques [8], Hi-C measures the frequencyof genomic loci spatially close to each other, resulting ina high-resolution chromosome contact map to infer chromo-some structure [9]. The prominent findings of Hi-C show thatthe chromosome is hierarchically organized through multiplescales from the local self-interacting domains, termed as topo-logically associating domains (TADs) [10, 11, 12], to the long-range segregation patterns, termed as compartments [9]. TADsare characterized as the functional units of the genome in con-trolling the gene expressions by restricting the promoters andenhancers within one TAD [13, 7, 14]. On the other hand, com-partments form two types of mutually excluded segregations,referred to as compartment A and B, corresponding to thegene-rich, open active euchromatin and the gene-poor, closedrepressive heterochromatin, respectively [9, 15, 16]. TADs andcompartments are critical genome structural features that arefound to play important roles in many biological processes, in-cluding DNA replication [17], transcription [7] and gene regu-lation [18, 19, 20].

In the context of Waddington’s epigenetic landscape [21],the cell differentiation process can be metaphorically referredto as an iconic picture, where a ball rolls down on a surface(landscape) from the top to the lowest point. The guidance ofthe ball in seeking stable positions, which are the differenti-ated cell states, is controlled by the landscape that is shapedby the epigenetic mechanism. Genome structure, which is inintimate relation to epigenetic modifications, has been foundto undergo high-order structural rearrangement during differ-entiation [22], in order to accommodate the required gene ex-pression pattern possibly through modulating the accessibilityof the genomic loci to transcription factors. Therefore, celldifferentiation should be associated with a landscape at the mi-croscopic molecular level that describes the genome structuraldynamics, as a reminiscence of the one widely used in proteinfolding [23]. The information of this landscape can provide aphysical understanding of genome dynamics and microscopicmolecular structural origin for cell differentiation, giving in-sights into the underlying mechanism. Therefore, such struc-tural landscape at the genomic level can serve as the physi-cal and structural basis for understanding and quantifying theWaddington landscape at the gene expression level for cell dif-

1

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted July 17, 2020. . https://doi.org/10.1101/2020.07.16.207662doi: bioRxiv preprint

Page 2: Microscopic Chromosomal Structural and …...2020/07/16  · experiments. Our model can be extended to gain insights into the chromosome pathological disorganization in cancer cells

ferentiation and development. However, exploring such struc-tural landscape in practice is very challenging due to the lackof precise determination of the chromosome structure and dy-namics, as well as the non-equilibrium nature of cell differenti-ation distinctly different from protein folding. A natural initialstep towards this ultimate goal is to quantify the chromosomalstructural rearrangement during cell differentiation.

Embryonic stem cells (ESCs) are pluripotent cells and candifferentiate to the somatic cells during cell development. Onthe other hand, a somatic cell can be reprogrammed back tothe induced pluripotent stem cell (iPSC), which genetically be-haves like the ESC with high pluripotency [24, 25]. Cell repro-gramming, induced by ectopic expression of the key transcrip-tional factors, erases the epigenetic modifications establishedin the differentiated cell [26], and results in the reorganizationof chromosome structure into an ESC-like form [27]. Under-standing how the chromosome undergoes the structural rear-rangement during reprogramming has important implicationsin tissue engineering for developing new approaches to im-prove the reprogramming efficiency, however, is currently notwell addressed. The understanding of cell reprogramming atchromosome structural level can complete our knowledge ofthe forward and backward cell development processes, how-ever, is still lacking.

Herein, we developed a non-equilibrium landscape-switching model to investigate chromosome dynamics duringcell differentiation and reprogramming. The key element thatcontrols and induces the differentiation and reprogrammingwas modeled as an energy excitation that triggers the processesfor switching of chromosome structural dynamics between thetwo landscapes, representing the ESC and the terminally dif-ferentiated cell, followed by the subsequent conformational re-laxation. The model allowed us to accumulate a statisticallysufficient number of differentiation and reprogramming tra-jectories and perform the subsequent chromosome structuralanalysis. We found that the chromosome shows compartment-switching in the units of relatively stable TADs during both thedifferentiation and reprogramming processes. Chromosomein differentiation undergoes gradually monotonic compactionwith spatial re-positioning of active and inactive genomic loci,while an over-expanded chromosome structure with universallocalization of the loci at the chromosome surface appears inreprogramming. Further analysis of the differentiation pro-cess exhibited a distinct pathway that bifurcates at the quiteearly developmental stages to the destined differentiated cell.Our model made extensive predictions that have suggesteda chromosomal-level structural and dynamical picture of celldifferentiation and reprogramming.

2 Results

2.1 Chromosome dynamics during cell differ-entiation and reprogramming

We developed a non-equilibrium excitation-relaxation energylandscape-switching model to investigate the chromosomalstructural rearrangement during the differentiation of the ESCto the somatic cell and the reprogramming of the somatic cellto the iPSC. The model is similar to the one used for study-

ing the cell-cycle chromosome dynamics in our previous study[28]. To simplify the processes, here we used the ESC to re-place the iPSC as the destination for reprogramming. This isa reasonable approximation based on the recent evidence thatthe human ESC and iPSC are equivalent in terms of gene ex-pression patterns [29]. On the other hand, we used the IMR90as the somatic cell in our study. IMR90 is the terminally differ-entiated human fetal lung fibroblast cell and has been provedcapable of reprogramming to the iPSC by various transcrip-tional factors [30].

We focused on the long arm of chromosome 14 (20.5Mb-106.1Mb). With a resolution of 100 kb, the chromosome seg-ment was reduced into a coarse-grained polymer representa-tion with 857 beads. The Hi-C maps and compartment profilesof the chromosome segment at the ESC and IMR90 show dis-tinct differences (Figure 1), indicating that significant struc-tural rearrangement should occur during the transitions be-tween these two cell states. We first used the maximum entropyprinciple to implement Hi-C data to a generic polymer modelas a minimally biased experimental restraint [31, 32]. This ap-proach provides two effective energy landscapes that generatethe experimentally consistent chromosome structural ensem-bles and kinetically govern the chromosome dynamics at theESC and IMR90, respectively (Figure S1-S2 in SI) [33, 34].It is worth noting that chromosome dynamics at both the ESCand IMR90 are out-of-equilibrium [35, 36, 33]. However, aneffective energy landscape was demonstrated to be capable ofdescribing the behaviors of non-equilibrium systems in certaincases not too far from equilibrium [37, 38, 39]. Then we trig-gered the differentiation/reprogramming process by switchingbetween these two effective landscapes. Finally, the chromo-some was relaxed under the new effective landscape. In prac-tice, we termed the switching and relaxation from the ESC tothe IMR90 landscape as a differentiation process and from theIMR90 to the ESC landscape as a reprogramming process, re-spectively. Such switching can be regarded as an energy exci-tation that activates the system to trigger the escape from thelocal minima on the energy landscape and achieves the tran-sition to a new one. This excitation requires energy input orcost, which is provided by the underlying molecular processes(e. g., ATP hydrolysis). The net energy input or cost leadsto the detailed balance breaking of the system, resulting in anon-equilibrium process. Therefore, the model captured theessence of cell differentiation and reprogramming processesand can provide insights for the underlying non-equilibriumdynamics. Further details of the model can be found in ”Modeland Simulation Section” and our previous study [28].

We accumulated a statistically sufficient number of chromo-somal structural rearrangement pathways during the ESC dif-ferentiation and the IMR90 reprogramming, and then calcu-lated the contact probability map of the chromosome at eachtime frame, referring to as the evolution of the Hi-C heat mapin differentiation and reprogramming (Figure 2). We quanti-fied the similarity of the Hi-C maps by a sum of the differ-ences of the pairwise contact probability Pt

i, j (for chromoso-mal loci i and j) between the time frames t = I and t = J, andthen obtained a time-dependent matrix ∆PI,J (Figure 2A and2F). These matrices clearly show that chromosomal structuralrearrangement during both differentiation and reprogrammingundergo stepwise changes. Since chromosome structure is or-

2

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted July 17, 2020. . https://doi.org/10.1101/2020.07.16.207662doi: bioRxiv preprint

Page 3: Microscopic Chromosomal Structural and …...2020/07/16  · experiments. Our model can be extended to gain insights into the chromosome pathological disorganization in cancer cells

ganized at multiple levels (e. g., TADs and compartments), wefurther calculated ∆PI,J at local and non-local contact rangesbased on the genomic distance between the involved contactpair of chromosomal loci i and j. Hierarchical clustering ofmatrices ∆PI,J exhibits two apparent large clusters that canroughly group the trajectories into two steps or stages in thedifferentiation and reprogramming processes (Figure 2A and2F), indicated as blue and red linkages on the hierarchicaltrees. Moreover, we observed different time points for sepa-rating these two linkages at local and non-local contacts, im-plying that the local and non-local contacts in the chromosomeevolve asynchronously. During differentiation (Figure 2A), thechromosome appears to form the destined local contacts fasterthan the non-local ones, while the local and non-local struc-tural changes in the chromosome are more synchronous dur-ing reprogramming (Figure 2F). We then calculated the contactprobability P(s) versus the genomic distance (s) in the chro-mosome and we see the slope a in the relation of P(s) ∼ sa

gradually changes from -1.5 to -1.0 in differentiation and -1.0to -1.5 in reprogramming (Figure 2B and 2H, Figure S7A, Fig-ure S7B, S8A and S8B). This is consistent with the consen-sus that the chromosome in the human somatic cell behavesas a fractal-like globule [40] (Figure S4) and is more looselyformed in the ESC (Figure S3).

We further reduced the complexity of differentiation andreprogramming by respectively coarse-graining the processesinto 7 stages based on hierarchical clustering of chromosomecontact map similarity ∆PI,J (Figure 2B and 2G). This wasdone by directly comparing the hierarchical trees for local andnon-local ∆PI,J and then sequentially and gradually groupingthe time-evolving states that have high similarity at both localand non-local ranges from the contact map perspective (stateswithin one stage are within the same cluster in terms of simi-lar formed contacts). Chromosomes at these stages are shownwith Hi-C heat maps and representative structures (Figure 2D,2E and 2I, 2J). The plaid pattern on the chromosome con-tact probability map in the IMR90 is gradually gained (lost)during the differentiation (reprogramming) process along withthe shape changing from large to small (small to large) glob-ules. Careful examination on reprogramming can detect pos-sible over-expanded chromosome structures at stages R4 andR5, where P(s) is over-decreased under that in the ESC (Figure2H and Figure S8A). In contrast, the chromosome undergoesmonotonic compaction during differentiation (Figure 2C andFigure S7A). These results clearly show that the chromosomeadapts large-scale structural rearrangement during cell differ-entiation and reprogramming, and these two processes do notshare the same pathway at the chromosomal level.

2.2 Chromosomal structural rearrangementsin the units of TADs

Since the TAD is regarded as the structural and functional unitof the chromosome [41, 17, 7] and the compartment spatiallysegregates the active and inactive genomic loci from functionalperspective [9], here we examined chromosomal structural re-arrangement in terms of TADs and compartments during dif-ferentiation and reprogramming. Compartment profiles werecalculated along all the loci in the chromosome segment, and

the reduced 7 stages in differentiation and reprogramming arepresented (Figure 3A and 3G). We can see that the chromo-some undergoes apparent compartment-switching during bothof the two processes. In detail, differentiation exhibits a moresubstantial degree of switching from compartment A to B(blue to yellow) than that from compartment B to A (yellowto blue) (Figure 3A). This is a sign of more regions switch-ing from active to inactive loci, consistent with the fact thatthe ESC possesses the most expressive euchromatin in orderto execute the highest pluripotency [42, 22]. This is oppo-sitely found during cell reprogramming (Figure 3G). In addi-tion, the most significant compartment-switching during dif-ferentiation occurs at stage D5 (Figure 3A), which appears tobe after 2τ , much slower than the structural rearrangementsin TADs (Figure 3F). Since compartment forms at a longerrange (beyond 5Mb) than TAD does, this shows again that lo-cal chromosome structural arrangement occurs faster than non-local one does. On the other hand, we observed quite differentcompartment-switching behaviors in the chromosome duringreprogramming, which is not a simple reversal of that duringdifferentiation. The compartment profiles appear to proceedwith gradual change during reprogramming from stages R1 toR4 (Figure 3G). At stage R5, a large proportion of switchingin compartment profiles occurs with extensive regions show-ing that the associated compartment profiles are different fromthose in either the ESC or IMR90 (e. g., at 6-22Mb). This im-plies an aberrant structural rearrangement in the chromosome.We note that the chromosome in R5 exhibits an over-expandedconfiguration. Such configuration has completely disturbed thechromosome compartment formation reserved in the ESC andIMR90. This is also evidenced by the enhanced contact proba-bility map (Figure S9), which essentially determines the com-partment profiles [43].

On the other hand, we observed very similar Hi-C heat mapsat local regions in the chromosome during differentiation andreprogramming, implying that the structures of TADs are unal-tered during these two processes (Figure 3B and 3H). To quan-titatively investigate the structural change of TAD, we firstlyused the insulation score, proposed by Crane et al. [44], todefine the TAD boundary (Figure S1C and S2C). We see thatmost of the TAD boundaries are conserved during differenti-ation and reprogramming (bars in Figure 3B and 3H). Thiswas also observed in stages D5 and R5, where the compart-ment profiles of the chromosome have significantly changed.Since the boundaries of TADs in the ESC and IMR90 arestrongly overlapped (Figure 3C), the differentiation and re-programming processes do not change TADs significantly ina way that would deviate from the destined states. This is alsoquantitatively confirmed by the observation that the correlationbetween the insulation score and the TAD boundary overlap-ping population to the destined state (IMR90 in differentiationand ESC in reprogramming) increases monotonically as theprocess proceeds (Figure 3D, 3E, 3I and 3J).

Therefore, our findings have reached a conclusion that chro-mosomal structural rearrangements during differentiation andreprogramming undergo changes in the unit of TADs. Thisis consistent with the fact that TADs are conserved struc-tural units across different cell types [10]. However, throughthe detailed analysis, we observed that the size of TADslightly increases and decreases in differentiation and repro-

3

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted July 17, 2020. . https://doi.org/10.1101/2020.07.16.207662doi: bioRxiv preprint

Page 4: Microscopic Chromosomal Structural and …...2020/07/16  · experiments. Our model can be extended to gain insights into the chromosome pathological disorganization in cancer cells

gramming, respectively (Figure 3F and 3K). Interestingly, theover-expanded chromosome during reprogramming possessesa larger average TAD size than that in both the ESC andIMR90, followed by a quick relaxation process to the destinedstate. This implies that expanding of TADs also occurs withinthe over-expanded chromosome. Overall, we conclude thatthe chromosome rearranges its structure mostly through thecompartment-switching without significantly disturbing the lo-cal TAD formation.

2.3 Identification of cell differentiation and re-programming pathways and irreversibilityof chromosomal structural rearrangements

To investigate the chromosomal structural rearrangements dur-ing differentiation and reprogramming, we quantified the tran-sition pathways and projected them onto different structuralorder parameters. It is shown that the pathways are highlystochastic during both differentiation and reprogramming (Fig-ure 4). We first examined from the geometrical perspectiveby using the structural extension at the three principal axes(PA) (Figure 4A, Left panel). We show that the chromosomecompacts its configuration via a two-step scenario with ini-tially shortening along its shortest PA (PA3) followed by com-pressing along with the longest PA (PA1) during differentia-tion. Such anisotropic compaction can also be observed byprojecting the pathways onto the radius of gyration (Rg) andan anisotropy measure, the aspheric quantity ∆ calculated us-ing the inertia tensor [45]. Deviation of ∆ from 0, which corre-sponds to a perfect sphere, measures the extent of anisotropy.As shown in Figure 4B, Left panel, Rg decreases with ∆ in-creasing first and then followed by decreasing due to theanisotropic compaction. Interestingly, the chromosome in re-programming apparently undergoes distinct pathways, whichare not the reversal of those in differentiation. The pathwaysprojected on PA1 and PA3 indicate isotropic expansions dur-ing the IMR90 reprogramming to the ESC (Figure 4A, Middlepanel). In particular, there are chromosome structures that ex-hibit bigger sizes than those in the ESC being observed, asevidence of over-expansion. Such over-expansion followed byslight compaction in the chromosome during reprogrammingis also confirmed from the pathways projected onto Rg and ∆

(Figure 4B, Middle panel).We then used the fraction of “native” contacts Q, which was

widely used in protein folding [46], as the order parameter.We practically used the averaged pairwise distances obtainedfrom the maximum entropy principle simulations for the ESCand IMR90 as the references for native distances to calculate Q(Figure S3 and S4). Furthermore, we divided Q into differentcategories varied by genomic distances between the contactingpairs. From Figure 4C and 4D (Left panels), we can see thatduring differentiation, the chromosome gradually loses andgains the contacts formed in the ESC and IMR90, respectively.The local contacts (0-2Mb) form faster than non-local ones(10-15Mb) at the early stages of the differentiation process.On the other hand, we observed that the over-expansion occursin both local and non-local contacts of the chromosome duringreprogramming, shown by a simultaneous decrease of QESC

and QIMR90 during reprogramming (Figure 4C and 4D, Right

panels). Intriguingly, the most significant over-expansion inthe chromosome was observed at different stages at local (R4)and non-local contacts (R5). This implies that the chromosomehas different expanding rates at different structural levels. Byexamining the average pathways for differentiation and repro-gramming, we found that the two pathways never overlap, indi-cating that the chromosomal structural rearrangements in thesetwo processes are highly irreversible (Figure 4, Right panels).

2.4 Rearrangements and transformations ofchromosome compartmental structuresduring cell differentiation and reprogrm-ming

The transcriptional activity has been found to be strongly cor-related with the spatial distribution of the gene regions [47, 48].To see how the chromosome dynamically manages the spatialre-positioning of active and inactive loci, we calculated the ra-dial density of chromosomal loci and examined its change dur-ing differentiation and reprogramming (Figure 5). Consistentwith previous simulations [49, 33, 50] and experiments [16],we found that in the IMR90, loci in compartment A tends to liecloser to the surface of the chromosome than the ones in com-partment B (Figure 5A, red lines in Middle and Right panels).In contrast, there is not much difference on the spatial posi-tioning of loci in compartment A and B in the ESC and the locialmost exhibit uniform spatial distribution in the chromosome(Figure 5A, blue lines in Middle and Right panels). Duringdifferentiation, the profile of radial density moves towards theinterior of the chromosome along with specific re-positioningof active and inactive loci towards the surface and interior ofthe chromosome, respectively (Figure 5A and 5B). In furtherdetail, for the regions with compartment A to B switching, themost probable state in the destined spatial distribution still al-lows a slight deviation from the interior (Figure 5C, Left Most),compared to the regions with stable compartment B to B (Fig-ure 5C, Right Most), which undergoes the most significant spa-tial re-positioning towards the interior. For the regions withcompartment B to A switching, the peak in the profile of radialdensity moves from the interior towards the surface of chro-mosome (Figure 5C, Second Left), but is still closer to the in-terior than the regions with stable compartment A to A (Fig-ure 5C, Second Right). Therefore, we assume that the spatialre-positioning of chromosomal loci strongly depends on thecompartment states during differentiation. The chromosomalregions maintaining the stable compartment states move moresignificantly in terms of the compartment A towards the sur-face of the chromosome and the compartment B towards theinterior of the chromosome than the ones that have compart-ment A/B switching.

On the other hand, we observed a completely different wayof chromosome rearranging its spatial distribution during re-programming, which is not the reversal of that in differenti-ation (Figure 5D). Notably, all loci in the chromosome seg-ment, except the ones with stable compartment B to B, haveto spatially move to the surface of the chromosome at the earlystage of reprogramming, as an indication of chromosome over-expansion, followed by a slight re-positioning towards the in-terior of the chromosome (Figure 5E and 5F). Therefore, the

4

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted July 17, 2020. . https://doi.org/10.1101/2020.07.16.207662doi: bioRxiv preprint

Page 5: Microscopic Chromosomal Structural and …...2020/07/16  · experiments. Our model can be extended to gain insights into the chromosome pathological disorganization in cancer cells

whole process appears to be highly homogenous for chromoso-mal loci, independent of the compartment states of the chromo-some during reprogramming. This eventually leads to nearlyuniform spatial distribution of chromosomal loci in the ESC.

2.5 Bifurcation of chromosomal structural re-arrangement paths during differentiation

We analyzed the experimental Hi-C data of four humanearly ESC-derived cell lineages, including trophoblast-like cell(TB), mesendoderm (ME), neural progenitor cell (NP) andmesenchymal stem cell (MS) [22]. As shown in Figure 6A,these cells represent the early ESC developmental stages andhave deterministic pathways in differentiation to become ei-ther the extraembryo or to the three germ layers of the embryo[51]. We calculated the compartment profiles of the four celllines from the experimental Hi-C data [22] and then incorpo-rated them into the principal component analysis (PCA) on thetrajectories of the chromosome compartment profiles evolvingduring differentiation in our simulations (Figure 6B). Interest-ingly, we found that except the ME, the other three experimen-tal cell lines are located far from the ESC to the IMR90 differ-entiation pathway. This in general indicates that differentiationof the ESC to the IMR90 does not go through the TB, NP andMS cell lines at the chromosomal level. It is worth noting thatthe ME cell line is very close to the ESC [52] and there are alsohigh similarities in Hi-C data between the ME and ESC, mak-ing the ME undistinguishable with the ESC in the PCA plot.Further clustering of the chromosome compartment profiles ofthe experimental cell lines along with the reduced 7 stages ob-tained from the differentiation simulations, shows that the ME,TB, NP cells are located at the early stage of differentiation,while the MS is close to the late differentiation process (Figure6B, hierarchical clustering tree plot). This is consistent withthe fact that the MS exhibits many characteristics of the cellsat the relatively late developmental stages [51].

Further projecting the pathways onto contact probability atvarious pairwise contact ranges of the chromosome clearlyshows that the TB, NP and MS are neither on the pathways ofthe ESC differentiating to the IMR90, nor on the pathways ofthe IMR90 reprogramming to the ESC (Figure 6C). A similarobservation can also be obtained by performing PCA on com-partment profiles (Figure S11). Such non-overlapped chro-mosomal structural rearrangement in differentiation and repro-gramming was observed in recent Hi-C experiments on mouse[53].

Since the compartment profiles of the chromosome changesignificantly during differentiation, we grouped the chromoso-mal loci based on the patterns that they adopt for compartment-switching and then examined the change of their compartmen-tal values along the differentiation times (Figure 6D). There are4 different groups of loci present with compartment-switchingof A to B, B to A and stable compartment of A to A, B toB. Interestingly, for loci that do not undergo compartment-switching, they are not constantly stable, as apparent fluctu-ations were observed. This generally indicates that differenti-ation has extensive impacts spreading over the chromosomerather than only gradually switching the relevant compart-ments. As the compartment-switching of A to B contributes

to the most significant chromosomal structural rearrangementthat potentially impacts the genomic function and guides thedirection of differentiation, we further investigated the kinet-ics of the compartment-switching of A to B (Figure 6E). Wefound that a large number of relevant loci undergo a similartimescale 4∼ 5τ to perform the switching and a small numberof loci is associated with a slower switching timescale ∼ 10τ .Similar phenomena were found in the experiments where thecompartment switches universally and gradually to the differ-entiated cells, while loci may be heterogeneous for participat-ing in chromosomal structural rearrangement during differen-tiation [53].

3 DiscussionThe landscape-switching model developed here and then ap-plied to the cell differentiation and reprogramming processesproduced the large-scale structural rearrangements occurringin a long human chromosome segment. We found that most ofthe TAD boundaries conserve during these two processes, inconcordance with many experimental findings that TADs areconserved between different cell types [10, 22, 54]. Never-theless, slight structural changes of TADs were observed witha rapid increase and decrease of TAD size during differentia-tion and reprogramming, as evidence to show that the chromo-some structure and dynamics regulate the transcription activityduring these two processes [55]. On the other hand, the rela-tively slow and notable changes in chromosome compartmentswere observed in terms of switching compartment A/B duringdifferentiation and reprogramming, recapitulating the findingsexisting in Hi-C data [22] and the analysis that epigenomes dif-fer significantly for various cell types [56, 51]. Interestingly,we found distinct scenarios for the chromosome to perform thecompartment-switching during differentiation and reprogram-ming. Compartment-switching undergoes monotonic changeduring differentiation, though a small degree of fluctuationsmay be encountered due to cell stochasticity [57]. In con-trast, during reprogramming, compartment profiles show aber-rant changes in extensive regions, where compartmental valuesmay be significantly different from the differentiated cell or thedestined ESC.

By projecting the trajectories onto several order parameters,we obtained the chromosomal structural rearrangement path-ways during differentiation and reprogramming. We attributedthe aberrant changes in the compartment profiles during repro-gramming to the over-expansion that occurs at both local andnon-local contact ranges in the chromosome, though under dif-ferent timescales. Such chromosome structure with spatial re-positioning of the chromosomal loci universally to the borderof chromosome territory (Figure 5D-F), is followed by slightcompaction. Interestingly, the recent time series Hi-C experi-ment with a focus on the cell-cycle mitotic exit process identi-fied a short-lived intermediate state at telophase, where the Hi-C properties cannot be derived from the combinations of thoseat the prometaphase and interphase [58]. This implies that thestructure of the chromosome in such intermediate state doesnot lay in between the condensed cylinder in prometaphase ordecondensed fractal globule in interphase [40]. Further analy-sis of the chromatin-protein binding experiments showed that

5

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted July 17, 2020. . https://doi.org/10.1101/2020.07.16.207662doi: bioRxiv preprint

Page 6: Microscopic Chromosomal Structural and …...2020/07/16  · experiments. Our model can be extended to gain insights into the chromosome pathological disorganization in cancer cells

the chromosome is free of SMC complexes at such intermedi-ate state. In our recent study on the cell cycle, we establisheda correspondence of the over-expanded chromosome observedin the simulations to the intermediate state in the experiments[28]. The over-expanded chromosome topologically increasesthe accessible surface area for the loading of the SMC com-plexes, thus it may promote the chromosomal structural rear-rangement in favor of an efficient mitotic exit process [59].Likewise, we suggest a similar biological role of the over-expanded chromosome in facilitating the reprogramming pro-cess by opening the structure for protein binding. Differentfrom cell differentiation, cell reprogramming is quite ineffi-cient [60, 61]. From computational efforts, here we can pro-pose a strategy to accelerate the reprogramming by openingthe chromosome structure through chromosome remodeling,which has been found to play an important role in reprogram-ming of the somatic cells to pluripotency [62, 63].

The chromosomal structural rearrangement pathways ob-served in our simulations show high stochasticity and ir-reversibility for the differentiation and reprogramming pro-cesses. This finding is consistent with increasing experimen-tal evidence that reprogramming proceeds in a way that devi-ates largely from the simple reversal of differentiation [64, 53].Recent experiments interestingly showed that the mouse ESCcan differentiate to the same destined state via multiple paths[65], as evidence of stochasticity. From a physical perspective,these facts underscore the prevalence of non-equilibrium andstochastic characteristics in cell differentiation and reprogram-ming, where the cell is constantly under intrinsic and extrinsicfluctuations from the underlying biochemical reactions involv-ing a large number of molecules and the external stimuli inmaking cell-fate decision [66]. Therefore, the chromosomalstructural rearrangements occur in a highly stochastic mannerduring differentiation and reprogramming. Still, the averagedirreversible pathways can be determined to some extent, drivenby the destined states.

By developing a theoretical framework, we previously in-vestigated the landscape and path of cell-fate decision-makingprocesses based on the gene regulatory network [67, 68]. Theresulting probability landscape provides a quantitative basisfor the original pictorial Waddington’s epigenetic landscape[21]. This can be used to guide the dynamics of cell devel-opment. Furthermore, it was shown that the cell-fate decision-making processes are additionally determined by the curl force(flux), which originates from the broken detailed balance in thenon-equilibrium dynamics [69]. The detailed balance breakingleads to the non-equilibrium thermodynamic cost for main-taining the functional cell states, supplied by the underlyingmolecular processes (e. g., ATP hydrolysis). As the epigeneticinformation is encoded in genome architecture [70, 50], wecan suggest a landscape based on our simulation findings atthe chromosomal level (Figure 7). We showed that from thechromosome structural perspective, the differentiation pathsfrom the ESC to the IMR90 are well-defined and bifurcate at avery early stage from the paths differentiating to the other cells,consistent with our theoretical predictions [67]. The pathwaysdo not necessarily follow the steepest descent gradient paths(along the valleys) because of the non-equilibrium effects.Thus, they may bypass the MS, which is a metastable attractoron the landscape. The deviation is controlled by the directions

and magnitudes of the flux, which can drive the developmentalpaths in practice to go through distinct differentiated interme-diate states in reaching the destination [65]. As the efficiencyof reprogramming from the somatic cells to the iPSCs is gener-ally lower than 1% [60, 61], the terminally differentiated cellsare stable and likely located at the minima of the landscapeand reprogramming thus externally requires the introductionsof reprogramming factors into the system. In addition, repro-gramming undergoes different paths that do not overlap withthe differentiation ones, due to the non-equilibrium effects, inline with the theoretical predictions [67]. Chromosomes on thedifferentiation and reprogramming paths located at the samelevel as the same layer on the landscape should be structurallydifferent. This is also observed experimentally [53], when themouse B cell was reprogrammed to a state which resemblesthe epiblast, but is very different at the genomic and geneticlevel in terms of Hi-C and RNA-seq data.

As evidenced by the experiments [71, 72], the gene regu-latory network that determines the cell development processshows multi-stability, corresponding to different cell fates. Be-sides, the transitions between these cell fates in the cell devel-opment have been found to occur abruptly, similar to a switchbetween different gene expression states [73, 74, 75]. Recenttheoretical studies showed that the simple circuitry of bi-stableswitches between two gene expression states, which corre-spond to distinct cell fates, is able to capture many character-istics of the cell development in reality [76, 77, 78, 79]. Over-all, our landscape-switching model is in accordance with theswitching between the bi-stable cell fates in cell development.However, the model has two prominent limitations worth not-ing. First, cell development has been simplified as a bi-stablesystem, while the intermediate cell states, often recognized asthe multipotent progenitor cells in differentiation, may existduring the process. Second, the model implements the instan-taneous switch to trigger the transition between different cellfates, whereas the real biological responses in both cell differ-entiation and reprogramming should occur with finite reactionrates. The most plausible solution for overcoming these limita-tions is to add more intermediate cell fates into the model whenthe experimental data are available, so the model will eventu-ally have the stepwise switches between the multi-stable cellstates during cell differentiation and reprogramming. Besides,our model does not consider the cell cycle dynamics, due tothe lack of experimental Hi-C data and the knowledge of thequantitative relation between the cell cycle and the cell devel-opment. In reality, there are usually many cell cycles duringcell differentiation and reprogramming. Since we are mainlyinterested in the development process in this study, for simplic-ity we focused on the slow developmental process while treat-ing the fast cell-cycle dynamics as an averaged background.Therefore, the model provided a prediction of the chromosomestructural evolution during cell differentiation and reprogram-ming at the interphase, which accounts for most of the time inthe cell cycle. Further improvements in implementing the cell-cycle dynamics into cell development should be considered inthe future.

Our computational predictions can be potentially assessedvia the targeted experiments. In this regard, it would behighly useful to apply the time-series Hi-C experiments, whichwere recently developed and applied to the cell cycle process

6

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted July 17, 2020. . https://doi.org/10.1101/2020.07.16.207662doi: bioRxiv preprint

Page 7: Microscopic Chromosomal Structural and …...2020/07/16  · experiments. Our model can be extended to gain insights into the chromosome pathological disorganization in cancer cells

[80, 81, 58, 82], to obtain the time-evolving Hi-C maps duringdifferentiation and reprogramming. Cell differentiation and re-programming depend on both the intrinsic and extrinsic fac-tors, so the processes may vary under different conditions. Theswitching implementation imposes an instantaneous change ofthe conditions to the desired ones, so our observations maycorrespond to the extremes, where the transitions between cellfates are strongly favored. With rapid developments in improv-ing the reprogramming efficiency [83, 84, 85, 86], we antici-pate that the predictions present here can be tested by the ex-periments in the future.

4 ConclusionsIn summary, we explicitly considered the non-equilibrium dy-namics and investigated chromosomal structural rearrange-ment during cell differentiation and reprogramming througha landscape approach. We simplified the non-equilibriumeffects as an instantaneous energy excitation followed bythe subsequent relaxation, so the kinetics was driven bythe landscape-switching, which resonates with the cell-fateswitching in multi-stable cell development. Our approach pro-vides a physical and structural evolution picture of the cell-fatedecision-making processes for differentiation and reprogram-ming through the chromosome structural dynamics. The pre-dictions made in this study are in need of assessments by futureexperiments. Our model can be extended to gain insights intothe chromosome pathological disorganization in cancer cellsand cell senescence.

5 Model and Simulation Section

5.1 Hi-C data processingWe used human embryonic stem cell (ESC) and human fetallung fibroblast cell (IMR90), a terminally differentiated cell,as the two start/end points to investigate the differentiation andreprogramming processes. Hi-C data of these two cell lineswere measured by Dixon et al. [10] and are publicly avail-able on GEO with accession number GSE35156. Hi-C datafor the four cell lines (TB, ME, NP and MS) during the earlyESC development used for comparisons in our study can bedownloaded on GEO with accession number GSE52457 [22].Hi-C data were handled by the Hi-C Pro software through thestandard pipelines [87]. To best minimize the effects of the ex-perimental errors existing in Hi-C experiments, all the replicasfor each cell line were collected and used for generating theHi-C contact map at a resolution of 100kb.

5.2 Maximum entropy principle simulationsWe used a generic polymer model as the basic background[36], which is made up of bonded potentials to maintain thechain connection and soft-core nonbonded potentials to allowchain crossing as an outcome of topoisomerase [43]. Topoiso-merases, which are found abundant in cell nucleus [88], showwide participation in most of the reactions that involve double-stranded DNA, including transcription and recombination [89],which are essential for cell differentiation and reprogramming.

As noted elsewhere [33], the underlying kinetic behaviors ofthe chromosome loci are not affected by the soft-core poten-tials when the chromosome is within one cell state. Therefore,we applied the soft-core potentials throughout the simulationsto keep the effects of the topoisomerase being constantly ac-tive.

In order to best reproduce the Hi-C map for the ESC andIMR90, we implemented the experimental restraints into thegeneric polymer model by means of maximum entropy princi-ple [31, 32]. The potential of the polymer model for the ESCor IMR90 can be written as:

VESC(IMR90) =VPolymer +VHi−C

, where VPolymer is the potential of the generic polymer modeland the restraint potential from the Hi-C data (VHi−C) is lin-early added with VPolymer as requested by the maximum en-tropy principle [90]. Hi-C maps were normalized to be contactprobability maps fi, j, where i and j are chromosomal loci, byassuming the adjacent beads always form contacts fi,i±1 ≡ 1[31]. Therefore VHi−C has an expression:

VHi−C = αi, jPi, j

, where Pi, j is the calculated contact probability using stepfunction and αi, j is the parameter that is determined by themaximum entropy principle through multiple rounds of itera-tion to ultimately match Pi, j to fi, j. Details can be found inprevious studies [31, 32].

5.3 Landscape-switching model

A prominent outcome of the maximum entropy principle sim-ulation is the potential VESC(IMR90), which can generate a chro-mosome ensemble in matching the Hi-C map and reproducethe kinetic essence of chromosome dynamics in experiments[34, 33]. We observed the sub-diffusivity of the genomic lociin the chromosome at both the ESC and IMR90 (Figure S6).The diffusion exponents, which are characterized in the rela-tion of mean square displace (MSD) and time t with MSD∼ tβ ,are found to be β ∼ 0.4, in line with previous experimental ob-servations [91, 92].

In our previous work on the cell cycle [28], we developedan energy excitation-relaxation landscape-switching model toapproximately and explicitly include the non-equilibrium ef-fects in simulations to describe the chromosome conforma-tional transition between the interphase and mitotic phase forthe cell-cycle dynamics. Here, we extended our model to theESC differentiation and the IMR90 reprogramming. In prac-tice, the non-equilibrium simulations were performed in threesteps:(i) Chromosome was initially simulated under the potentialVESC (preparing for differentiation) or VIMR90 (preparing forreprogramming);(ii) A sudden switch of the potential from VESC to VIMR90 (trig-gered to differentiation) or VIMR90 to VESC (triggered to repro-gramming) was implemented;(iii) Chromosome was relaxed under the new potential VIMR90(differentiation) or VESC (reprogramming) for a long period oftime.

7

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted July 17, 2020. . https://doi.org/10.1101/2020.07.16.207662doi: bioRxiv preprint

Page 8: Microscopic Chromosomal Structural and …...2020/07/16  · experiments. Our model can be extended to gain insights into the chromosome pathological disorganization in cancer cells

5.4 Identifications of compartments andboundaries of TADs

To identify the compartments, we first calculated the enhancedcontact probability map (Figure S9), which is the observed ver-sus expected contact map, at a reduced resolution level (1Mb)[9, 93]. Then we used Iterative Correction and Eigenvector de-composition (ICE) method to normalize the enhanced contactprobability map [94] and subsequently convert it into a Pearsoncorrelation matrix. Finally, we followed the previous sugges-tion to perform the principal component analysis (PCA) on thismatrix, and the first PC (PC1) was assigned as the compart-ment profiles [9]. The direction of the PC1 values is arbitrary,and we set the positive and negative PC1 values with gene den-sity (positive to gene-rich and negative to gene-poor). This isdone on the IMR90 data. Then, the direction of the PC1 val-ues during differentiation and reprogramming was determinedaccording to the correlation coefficient with the PC1 values ofthe IMR90 Hi-C data.

The boundaries of TADs were identified by the insulationscore proposed by Crane et al. [44]. We used the same sizeof sliding square (500×500kb) within the original application[44] to calculate the insulation score. The local minima ofthe insulation score represent the areas of high local insula-tion, which may correspond to the boundaries of TADs. Wethen used a delta vector that measures the slope of the insula-tion vector to quantitatively determine the boundaries of TADs[44]. Practically, the delta vector with crossing the horizon-tal 0 and difference between the local left maximum and lo-cal right minimum higher than 0.1 signified the boundary ofa TAD. To calculate the boundary matching population, weallowed 100kb noise error in accounting as an exact match-ing. We found that the average numbers of beads in formingthe TADs for the chromosome focused in our study identifiedfrom the Hi-C maps are about 9 and 10 in the ESC and IMR90,respectively. Forming the TADs by such relatively low num-bers of beads could be contributed by the thermal fluctuations,which can transiently induce the contacts formed by the beadsclose in sequence. Therefore, we evaluated the TAD forma-tion in our model for the chromosome ensemble at the ESCand IMR90. This was undertaken by comparing to the poly-mer ensemble generated by the simulations with applying onlythe generic polymer potential VPolymer. We found that the insu-lation score of the polymer ensemble constantly remains zero,corresponding to no TAD formation (Figure S5). The result in-dicated that the TADs are robustly formed in our chromosomemodel under the Hi-C constraint potentials VESC(IMR90).

5.5 Statistical Analysis

The hierarchical clustering on the chromosome ensemble gen-erated by the maximum entropy principle simulations was per-formed on the randomly selected 2400 chromosome structures.The cut-off criteria were chosen to be the up-limit of structuraldifference for chromosome conformational diffusion (FigureS3 and S4). By these criteria, chromosomes within one clusterare structurally similar, as shown by the contact maps. Twostructures from each cluster, whose population is higher than0.3%, were chosen. These structures were used as the ini-tial structures for the landscape-switching simulations. There

are 354 and 226 simulation trajectories for differentiation andreprogramming, respectively. All the trajectories were col-lected and used for calculating the contact probability, com-partment profiles, insulation scores, radial density and the av-eraged pathways. The changes of TAD size and compartmen-tal values for A to B switching with the time were presented asmean ± standard deviation.

Supporting InformationSupporting Information is available on bioRxiv.org.

AcknowledgementsWe thank Shufan Ma for help in preparing the figures. We

acknowledge the support from the National Science Founda-tion PHY-76066. The authors would like to thank Stony BrookResearch Computing and Cyberinfrastructure, and the Insti-tute for Advanced Computational Science at Stony Brook Uni-versity for access to the high-performance SeaWulf computingsystem, which was made possible by a $1.4M National ScienceFoundation grant (#1531492).

References[1] Smadar Ben-Tabou de Leon and Eric H Davidson. Gene

regulation: gene control network in development. Annu.Rev. Biophys. Biomol. Struct., 36:191–212, 2007.

[2] En Li. Chromatin modification and epigenetic repro-gramming in mammalian development. Nat. Rev. Genet.,3(9):662–673, 2002.

[3] Wolf Reik. Stability and flexibility of epigeneticgene regulation in mammalian development. Nature,447(7143):425–432, 2007.

[4] Eran Meshorer and Tom Misteli. Chromatin in pluripo-tent embryonic stem cells and differentiation. Nat. Rev.Mol. Cell Biol., 7(7):540–546, 2006.

[5] Christian Lanctot, Thierry Cheutin, Marion Cremer, Gi-acomo Cavalli, and Thomas Cremer. Dynamic genomearchitecture in the nuclear space: regulation of gene ex-pression in three dimensions. Nat. Rev. Genet., 8(2):104,2007.

[6] Giacomo Cavalli and Tom Misteli. Functional impli-cations of genome topology. Nat. Struct. Mol. Biol.,20(3):290, 2013.

[7] David U Gorkin, Danny Leung, and Bing Ren. The3d genome in transcriptional regulation and pluripotency.Cell Stem Cell, 14(6):762–775, 2014.

[8] Job Dekker, Karsten Rippe, Martijn Dekker, and NancyKleckner. Capturing chromosome conformation. Sci-ence, 295(5558):1306–1311, 2002.

[9] Erez Lieberman-Aiden, Nynke L Van Berkum, LouiseWilliams, Maxim Imakaev, Tobias Ragoczy, AgnesTelling, Ido Amit, Bryan R Lajoie, Peter J Sabo,Michael O Dorschner, et al. Comprehensive mapping oflong-range interactions reveals folding principles of thehuman genome. Science, 326(5950):289–293, 2009.

8

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted July 17, 2020. . https://doi.org/10.1101/2020.07.16.207662doi: bioRxiv preprint

Page 9: Microscopic Chromosomal Structural and …...2020/07/16  · experiments. Our model can be extended to gain insights into the chromosome pathological disorganization in cancer cells

[10] Jesse R Dixon, Siddarth Selvaraj, Feng Yue, AudreyKim, Yan Li, Yin Shen, Ming Hu, Jun S Liu, and BingRen. Topological domains in mammalian genomes iden-tified by analysis of chromatin interactions. Nature,485(7398):376, 2012.

[11] Elphege P Nora, Bryan R Lajoie, Edda G Schulz, LucaGiorgetti, Ikuhiro Okamoto, Nicolas Servant, Tristan Pi-olot, Nynke L van Berkum, Johannes Meisig, John Sedat,et al. Spatial partitioning of the regulatory landscape ofthe x-inactivation centre. Nature, 485(7398):381, 2012.

[12] Tom Sexton, Eitan Yaffe, Ephraim Kenigsberg, FredericBantignies, Benjamin Leblanc, Michael Hoichman,Hugues Parrinello, Amos Tanay, and Giacomo Cavalli.Three-dimensional folding and functional organizationprinciples of the drosophila genome. Cell, 148(3):458–472, 2012.

[13] Orsolya Symmons, Veli Vural Uslu, Taro Tsujimura,Sandra Ruf, Sonya Nassari, Wibke Schwarzer, LaurenceEttwiller, and Francois Spitz. Functional and topolog-ical characteristics of mammalian regulatory domains.Genome Res., 24(3):390–400, 2014.

[14] Boyan Bonev, Netta Mendelson Cohen, Quentin Sz-abo, Lauriane Fritsch, Giorgio L Papadopoulos, YanivLubling, Xiaole Xu, Xiaodan Lv, Jean-Philippe Hugnot,Amos Tanay, et al. Multiscale 3d genome rewiring duringmouse neural development. Cell, 171(3):557–572, 2017.

[15] Suhas SP Rao, Miriam H Huntley, Neva C Durand,Elena K Stamenova, Ivan D Bochkov, James T Robinson,Adrian L Sanborn, Ido Machol, Arina D Omer, Eric SLander, et al. A 3d map of the human genome at kilobaseresolution reveals principles of chromatin looping. Cell,159(7):1665–1680, 2014.

[16] Alistair N Boettiger, Bogdan Bintu, Jeffrey R Moffitt,Siyuan Wang, Brian J Beliveau, Geoffrey Fudenberg,Maxim Imakaev, Leonid A Mirny, Chao-ting Wu, and Xi-aowei Zhuang. Super-resolution imaging reveals distinctchromatin folding for different epigenetic states. Nature,529(7586):418, 2016.

[17] Benjamin D Pope, Tyrone Ryba, Vishnu Dileep, FengYue, Weisheng Wu, Olgert Denas, Daniel L Vera, YanliWang, R Scott Hansen, Theresa K Canfield, et al.Topologically associating domains are stable units ofreplication-timing regulation. Nature, 515(7527):402,2014.

[18] David L Spector. The dynamics of chromosome or-ganization and gene regulation. Annu. Rev. Biochem.,72(1):573–608, 2003.

[19] Robert Schneider and Rudolf Grosschedl. Dynamics andinterplay of nuclear architecture, genome organization,and gene expression. Genes Dev., 21(23):3027–3043,2007.

[20] Andrea Smallwood and Bing Ren. Genome organiza-tion and long-range regulation of gene expression by en-hancers. Curr. Opin. Cell Biol., 25(3):387–394, 2013.

[21] Conrad Hal Waddington. The strategy of the genes. Allenand Unwin, London, 1957.

[22] Jesse R Dixon, Inkyung Jung, Siddarth Selvaraj, YinShen, Jessica E Antosiewicz-Bourget, Ah Young Lee,Zhen Ye, Audrey Kim, Nisha Rajagopal, Wei Xie, et al.Chromatin architecture reorganization during stem celldifferentiation. Nature, 518(7539):331, 2015.

[23] Joseph D Bryngelson, Jose Nelson Onuchic, Nicholas DSocci, and Peter G Wolynes. Funnels, pathways, and theenergy landscape of protein folding: a synthesis. Pro-teins: Struct., Funct., Bioinf., 21(3):167–195, 1995.

[24] Kazutoshi Takahashi and Shinya Yamanaka. Induction ofpluripotent stem cells from mouse embryonic and adultfibroblast cultures by defined factors. Cell, 126(4):663–676, 2006.

[25] Kazutoshi Takahashi, Koji Tanabe, Mari Ohnuki,Megumi Narita, Tomoko Ichisaka, Kiichiro Tomoda, andShinya Yamanaka. Induction of pluripotent stem cellsfrom adult human fibroblasts by defined factors. Cell,131(5):861–872, 2007.

[26] Konrad Hochedlinger and Kathrin Plath. Epigenetic re-programming and induced pluripotency. Development,136(4):509–523, 2009.

[27] Anna Mattout, Alva Biran, and Eran Meshorer. Globalepigenetic changes during somatic cell reprogramming toips cells. J. Mol. Cell Biol., 3(6):341–350, 2011.

[28] Xiakun Chu and Jin Wang. Conformational state switch-ing and pathways of chromosome dynamics in cell cycle.bioRxiv, page DOI 10.1101/2019.12.20.885335, 2019.

[29] Jiho Choi, Soohyun Lee, William Mallard, KendellClement, Guidantonio Malagoli Tagliazucchi, HotaeLim, In Young Choi, Francesco Ferrari, Alexander MTsankov, Ramona Pop, et al. A comparison of geneti-cally matched cell lines reveals the equivalence of humanipscs and escs. Nat. Biotechnol., 33(11):1173, 2015.

[30] Xiaoping He, Yang Cao, Lihua Wang, Yingli Han, Xi-uying Zhong, Guixiang Zhou, Yongping Cai, HuafengZhang, and Ping Gao. Human fibroblast reprogrammingto pluripotent stem cells regulated by the mir19a/b-ptenaxis. PloS one, 9(4), 2014.

[31] Bin Zhang and Peter G Wolynes. Topology, structures,and energy landscapes of human chromosomes. Proc.Natl. Acad. Sci. U. S. A., 112(19):6062–6067, 2015.

[32] Bin Zhang and Peter G Wolynes. Shape transitions andchiral symmetry breaking in the energy landscape of themitotic chromosome. Phys. Rev. Lett., 116(24):248101,2016.

[33] Lei Liu, Guang Shi, D Thirumalai, and ChangbongHyeon. Chain organization of human interphase chromo-some determines the spatiotemporal dynamics of chro-matin loci. PLoS Comput. Biol., 14(12):e1006617, 2018.

9

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted July 17, 2020. . https://doi.org/10.1101/2020.07.16.207662doi: bioRxiv preprint

Page 10: Microscopic Chromosomal Structural and …...2020/07/16  · experiments. Our model can be extended to gain insights into the chromosome pathological disorganization in cancer cells

[34] Michele Di Pierro, Davit A Potoyan, Peter G Wolynes,and Jose N Onuchic. Anomalous diffusion, spatial co-herence, and viscoelasticity from the energy landscape ofhuman chromosomes. Proc. Natl. Acad. Sci. U. S. A.,115(30):7753–7758, 2018.

[35] Kazuhiro Maeshima, Satoru Ide, Kayo Hibino, andMasaki Sasai. Liquid-like behavior of chromatin. Curr.Opin. Genet. Dev., 37:36–45, 2016.

[36] Angelo Rosa and Ralf Everaers. Structure and dynam-ics of interphase chromosomes. PLoS Comput. Biol.,4(8):e1000153, 2008.

[37] Shenshen Wang and Peter G Wolynes. Communication:Effective temperature and glassy dynamics of active mat-ter. J. Chem. Phys., 135(5):051101–051101, 2011.

[38] S Wang and PG Wolynes. Tensegrity and motor-driveneffective interactions in a model cytoskeleton. J. Chem.Phys., 136(14):145102, 2012.

[39] Bin Zhang and Peter G Wolynes. Genomic energy land-scapes. Biophys. J., 112(3):427–433, 2017.

[40] Leonid A Mirny. The fractal globule as a model ofchromatin architecture in the cell. Chromosome Res.,19(1):37–51, 2011.

[41] Jesse R Dixon, David U Gorkin, and Bing Ren. Chro-matin domains: the unit of chromosome organization.Mol. Cell, 62(5):668–680, 2016.

[42] Kashif Ahmed, Hesam Dehghani, Peter Rugg-Gunn,Eden Fussner, Janet Rossant, and David P Bazett-Jones.Global chromatin architecture reflects pluripotency andlineage commitment in the early mouse embryo. PloSone, 5(5), 2010.

[43] Natalia Naumova, Maxim Imakaev, Geoffrey Fudenberg,Ye Zhan, Bryan R Lajoie, Leonid A Mirny, and JobDekker. Organization of the mitotic chromosome. Sci-ence, 342(6161):948–953, 2013.

[44] Emily Crane, Qian Bian, Rachel Patton McCord,Bryan R Lajoie, Bayly S Wheeler, Edward J Ral-ston, Satoru Uzawa, Job Dekker, and Barbara JMeyer. Condensin-driven remodelling of x chromo-some topology during dosage compensation. Nature,523(7559):240, 2015.

[45] Ruxandra I Dima and D Thirumalai. Asymmetry in theshapes of folded and denatured states of proteins. J. Phys.Chem. B, 108(21):6564–6570, 2004.

[46] Samuel S Cho, Yaakov Levy, and Peter G Wolynes. Pversus q: Structural reaction coordinates capture proteinfolding on smooth landscapes. Proc. Natl. Acad. Sci. U.S. A., 103(3):586–591, 2006.

[47] Tom Misteli. Beyond the sequence: cellular organizationof genome function. Cell, 128(4):787–800, 2007.

[48] Wendy A Bickmore and Bas van Steensel. Genome ar-chitecture: domain organization of interphase chromo-somes. Cell, 152(6):1270–1284, 2013.

[49] Michele Di Pierro, Bin Zhang, Erez Lieberman Aiden,Peter G Wolynes, and Jose N Onuchic. Transferablemodel for chromosome architecture. Proc. Natl. Acad.Sci. U. S. A., 113(43):12168–12173, 2016.

[50] Yifeng Qi and Bin Zhang. Predicting three-dimensionalgenome organization with chromatin states. PLoS Com-put. Biol., 15(6):e1007024, 2019.

[51] Wei Xie, Matthew D Schultz, Ryan Lister, ZhonggangHou, Nisha Rajagopal, Pradipta Ray, John W Whitaker,Shulan Tian, R David Hawkins, Danny Leung, et al.Epigenomic analysis of multilineage differentiation ofhuman embryonic stem cells. Cell, 153(5):1134–1148,2013.

[52] Pengzhi Yu, Guangjin Pan, Junying Yu, and James AThomson. Fgf2 sustains nanog and switches the outcomeof bmp4-induced human embryonic stem cell differenti-ation. Cell Stem Cell, 8(3):326–334, 2011.

[53] Hisashi Miura, Saori Takahashi, Rawin Poonperm, AkieTanigawa, Shin-ichiro Takebayashi, and Ichiro Hiratani.Single-cell dna replication profiling identifies spatiotem-poral developmental dynamics of chromosome organiza-tion. Nat. Genet., 51(9):1356–1368, 2019.

[54] Anthony D Schmitt, Ming Hu, Inkyung Jung, Zheng Xu,Yunjiang Qiu, Catherine L Tan, Yun Li, Shin Lin, YiingLin, Cathy L Barr, et al. A compendium of chromatincontact maps reveals spatially active regions in the humangenome. Cell Rep., 17(8):2042–2059, 2016.

[55] Quentin Szabo, Frederic Bantignies, and Giacomo Cav-alli. Principles of genome folding into topologically as-sociating domains. Sci. Adv., 5(4):eaaw1668, 2019.

[56] R David Hawkins, Gary C Hon, Leonard K Lee, QueM-inh Ngo, Ryan Lister, Mattia Pelizzola, Lee E Edsall,Samantha Kuan, Ying Luu, Sarit Klugman, et al. Dis-tinct epigenomic landscapes of pluripotent and lineage-committed human cells. Cell Stem Cell, 6(5):479–491,2010.

[57] Takashi Nagano, Yaniv Lubling, Csilla Varnai, CarmelDudley, Wing Leung, Yael Baran, Netta Mendelson Co-hen, Steven Wingett, Peter Fraser, and Amos Tanay. Cell-cycle dynamics of chromosomal organization at single-cell resolution. Nature, 547(7661):61, 2017.

[58] Kristin Abramo, Anne-Laure Valton, Sergey V Venev,Hakan Ozadam, A Nicole Fox, and Job Dekker. A chro-mosome folding intermediate at the condensin-to-cohesintransition during telophase. Nat. Cell Biol., 21(11):1393–1402, 2019.

[59] Ning Qing Liu and Elzo de Wit. A transient absenceof smc-mediated loops after mitosis. Nat. Cell Biol.,21(11):1303–1304, 2019.

10

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted July 17, 2020. . https://doi.org/10.1101/2020.07.16.207662doi: bioRxiv preprint

Page 11: Microscopic Chromosomal Structural and …...2020/07/16  · experiments. Our model can be extended to gain insights into the chromosome pathological disorganization in cancer cells

[60] Jacob H Hanna, Krishanu Saha, and Rudolf Jaenisch.Pluripotency and cellular reprogramming: facts, hy-potheses, unresolved issues. Cell, 143(4):508–525, 2010.

[61] Matthias Stadtfeld and Konrad Hochedlinger. Inducedpluripotency: history, mechanisms, and applications.Genes Dev., 24(20):2239–2263, 2010.

[62] Effie Apostolou and Konrad Hochedlinger. Chro-matin dynamics during cellular reprogramming. Nature,502(7472):462–471, 2013.

[63] Bernadett Papp and Kathrin Plath. Epigenetics of repro-gramming to induced pluripotency. Cell, 152(6):1324–1343, 2013.

[64] Kazutoshi Takahashi and Shinya Yamanaka. A devel-opmental framework for induced pluripotency. Develop-ment, 142(19):3274–3285, 2015.

[65] James Alexander Briggs, Victor C Li, Seungkyu Lee,Clifford J Woolf, Allon Klein, and Marc W Kirschner.Mouse embryonic stem cells can differentiate via multi-ple paths to the same state. eLife, 6:e26945, 2017.

[66] Peter S Swain, Michael B Elowitz, and Eric D Sig-gia. Intrinsic and extrinsic contributions to stochastic-ity in gene expression. Proc. Natl. Acad. Sci. U. S. A.,99(20):12795–12800, 2002.

[67] Jin Wang, Kun Zhang, Li Xu, and Erkang Wang. Quanti-fying the waddington landscape and biological paths fordevelopment and differentiation. Proc. Natl. Acad. Sci.U. S. A., 108(20):8257–8262, 2011.

[68] Chunhe Li and Jin Wang. Quantifying waddington land-scapes and paths of non-adiabatic cell fate decisions fordifferentiation, reprogramming and transdifferentiation.J. R. Soc. Interface, 10(89):20130787, 2013.

[69] Jin Wang, Li Xu, and Erkang Wang. Potential landscapeand flux framework of nonequilibrium networks: Robust-ness, dissipation, and coherence of biochemical oscilla-tions. Proc. Natl. Acad. Sci. U. S. A., 105(34):12271–12276, 2008.

[70] Michele Di Pierro, Ryan R Cheng, Erez LiebermanAiden, Peter G Wolynes, and Jose N Onuchic. De novoprediction of human chromosome structures: Epigeneticmarking patterns encode genome architecture. Proc.Natl. Acad. Sci. U. S. A., 114(46):12126–12131, 2017.

[71] Sui Huang, Gabriel Eichler, Yaneer Bar-Yam, and Don-ald E Ingber. Cell fates as high-dimensional attractorstates of a complex gene regulatory network. Phys. Rev.Lett., 94(12):128701, 2005.

[72] Hannah H Chang, Philmo Y Oh, Donald E Ingber, andSui Huang. Multistable and multistep dynamics in neu-trophil differentiation. BMC Cell Biol., 7(1):11, 2006.

[73] James E Ferrell and Eric M Machleder. The biochem-ical basis of an all-or-none cell fate switch in xenopusoocytes. Science, 280(5365):895–898, 1998.

[74] Wen Xiong and James E Ferrell. A positive-feedback-based bistable memory modulethat governs a cell fate de-cision. Nature, 426(6965):460–465, 2003.

[75] Hannah H Chang, Martin Hemberg, Mauricio Barahona,Donald E Ingber, and Sui Huang. Transcriptome-widenoise controls lineage choice in mammalian progenitorcells. Nature, 453(7194):544–547, 2008.

[76] Masaki Sasai and Peter G Wolynes. Stochastic gene ex-pression as a many-body problem. Proc. Natl. Acad. Sci.U. S. A., 100(5):2374–2379, 2003.

[77] Jose EM Hornos, Daniel Schultz, Guilherme CP Inno-centini, JAMW Wang, Aleksandra M Walczak, Jose NOnuchic, and Peter G Wolynes. Self-regulating gene: anexact solution. Phys. Rev. E, 72(5):051907, 2005.

[78] Daniel Schultz, Jose N Onuchic, and Peter G Wolynes.Understanding stochastic simulations of the smallest ge-netic networks. J. Chem. Phys., 126(24):245102, 2007.

[79] Haidong Feng, Bo Han, and Jin Wang. Adiabatic andnon-adiabatic non-equilibrium stochastic dynamics ofsingle regulating genes. J. Phys. Chem. B, 115(5):1254–1261, 2011.

[80] Haiming Chen, Jie Chen, Lindsey A Muir, Scott Ron-quist, Walter Meixner, Mats Ljungman, Thomas Ried,Stephen Smale, and Indika Rajapakse. Functional orga-nization of the human 4d nucleome. Proc. Natl. Acad.Sci. U. S. A., 112(26):8002–8007, 2015.

[81] Johan H Gibcus, Kumiko Samejima, AntonGoloborodko, Itaru Samejima, Natalia Naumova,Johannes Nuebler, Masato T Kanemaki, Linfeng Xie,James R Paulson, William C Earnshaw, et al. Apathway for mitotic chromosome formation. Science,359(6376):eaao6135, 2018.

[82] Haoyue Zhang, Daniel J Emerson, Thomas G Gilgenast,Katelyn R Titus, Yemin Lan, Peng Huang, Di Zhang,Hongxin Wang, Cheryl A Keller, Belinda Giardine, et al.Chromatin structure dynamics during the mitosis-to-g1phase transition. Nature, 576(7785):158–162, 2019.

[83] Jacob Hanna, Krishanu Saha, Bernardo Pando, JeroenVan Zon, Christopher J Lengner, Menno P Creyghton,Alexander van Oudenaarden, and Rudolf Jaenisch. Di-rect cell reprogramming is a stochastic process amenableto acceleration. Nature, 462(7273):595–601, 2009.

[84] Yoach Rais, Asaf Zviran, Shay Geula, Ohad Gafni, EladChomsky, Sergey Viukov, Abed AlFatah Mansour, In-bal Caspi, Vladislav Krupalnik, Mirie Zerbib, et al.Deterministic direct reprogramming of somatic cells topluripotency. Nature, 502(7469):65–70, 2013.

[85] Tyson Ruetz, Ulrich Pfisterer, Bruno Di Stefano, JamesAshmore, Meryam Beniazza, Tian V Tian, Daniel F Kae-mena, Luca Tosti, Wenfang Tan, Jonathan R Manning,et al. Constitutively active smad2/3 are broad-scope po-tentiators of transcription-factor-mediated cellular repro-gramming. Cell Stem Cell, 21(6):791–805, 2017.

11

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted July 17, 2020. . https://doi.org/10.1101/2020.07.16.207662doi: bioRxiv preprint

Page 12: Microscopic Chromosomal Structural and …...2020/07/16  · experiments. Our model can be extended to gain insights into the chromosome pathological disorganization in cancer cells

[86] Junsang Yoo, Junyeop Kim, Jeong Hun Lee, Hyein Kim,Sung Joo Jang, Hyo Hyun Seo, Seung Taek Oh, Se-ung Jae Hyeon, Hoon Ryu, Jongpil Kim, et al. Accel-eration of somatic cell reprogramming into the inducedpluripotent stem cell using a mycosporine-like aminoacid, porphyra 334. Sci. Rep., 10(1):1–10, 2020.

[87] Nicolas Servant, Nelle Varoquaux, Bryan R Lajoie,Eric Viara, Chong-Jian Chen, Jean-Philippe Vert, EdithHeard, Job Dekker, and Emmanuel Barillot. Hic-pro: anoptimized and flexible pipeline for hi-c data processing.Genome Biol., 16(1):259, 2015.

[88] Jason R Swedlow, John W Sedat, and David A Agard.Multiple chromosomal populations of topoisomerase iidetected in vivo by time-lapse, three-dimensional wide-field microscopy. Cell, 73(1):97–108, 1993.

[89] James C Wang. Recent studies of dna topoisomerases.Biochim. Biophys. Acta, 909(1):1–9, 1987.

[90] Andrea Cesari, Sabine Reißer, and Giovanni Bussi. Usingthe maximum entropy principle to combine simulationsand solution experiments. Computation, 6(1):15, 2018.

[91] I Bronstein, Y Israel, E Kepten, S Mai, Yaron Shav-Tal,E Barkai, and Y Garini. Transient anomalous diffusion oftelomeres in the nucleus of mammalian cells. Phys. Rev.Lett., 103(1):018102, 2009.

[92] Soya Shinkai, Tadasu Nozaki, Kazuhiro Maeshima, andYuichi Togashi. Dynamic nucleosome movement pro-vides structural information of topological chromatindomains in living human cells. PLoS Comput. Biol.,12(10):e1005136, 2016.

[93] Takashi Nagano, Yaniv Lubling, Tim J Stevens, StefanSchoenfelder, Eitan Yaffe, Wendy Dean, Ernest D Laue,Amos Tanay, and Peter Fraser. Single-cell hi-c revealscell-to-cell variability in chromosome structure. Nature,502(7469):59, 2013.

[94] Maxim Imakaev, Geoffrey Fudenberg, Rachel Patton Mc-Cord, Natalia Naumova, Anton Goloborodko, Bryan RLajoie, Job Dekker, and Leonid A Mirny. Iterative cor-rection of hi-c data reveals hallmarks of chromosome or-ganization. Nat. Methods, 9(10):999, 2012.

12

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted July 17, 2020. . https://doi.org/10.1101/2020.07.16.207662doi: bioRxiv preprint

Page 13: Microscopic Chromosomal Structural and …...2020/07/16  · experiments. Our model can be extended to gain insights into the chromosome pathological disorganization in cancer cells

Figure 1: Structural differences of the chromosome segment in the ESC and IMR90. The chromosome segment focused in our study isfrom the long arm of chromosome 14 with a range of 20.5-106.1 Mb. Hi-C maps of the chromosome for ESC (Left) and IMR90 (Right) arenormalized from 0 to 1. The compartment profiles are shown close to the Hi-C maps and colored by the compartment states with A and Bcorresponding to blue and yellow, respectively. The ideograms of chromosome 14 are shown. The bands within the chromosome segmentused in our study (denoted by the dashed rectangle) are colored by the compartment states. Both the compartment profiles and ideograms arealigned to the corresponding Hi-C maps (Left as ESC and Bottom as IMR90).

13

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted July 17, 2020. . https://doi.org/10.1101/2020.07.16.207662doi: bioRxiv preprint

Page 14: Microscopic Chromosomal Structural and …...2020/07/16  · experiments. Our model can be extended to gain insights into the chromosome pathological disorganization in cancer cells

Figure 2: Chromosomal structural rearrangement during the ESC differentiation (Upper) and the IMR90 reprogramming (Lower). (A)Differences and hierarchical clustering of contact probability maps of the chromosome among each time frame t = I,J during differentiationvaried by total, local (< 2Mb) and non-local (> 2Mb) contact ranges. The contact map difference ∆PI,J is calculated by: ∑i, j |Pt=I

i, j −Pt=J

i, j |/∑i, j Pt=Ii, j , where Pt=I(J)

i, j is the contact probability between chromosomal loci i and j at the differentiation time t = I or J. ∆PI,J

is further normalized for achieving a better visualization. (B) Reduced 7 stages (“D1”-“D7”) for the differentiation process based on thecombination and comparison of the hierarchical clustering trees of local and non-local ∆PI,J established in (A). The chromosome within onestage thus possesses a relatively similar contact probability map. (C) The change of the contact probability P(s) versus genomic distance s inthe chromosome during differentiation with the 7 stages indicated at the bottom. (D) Hi-C heat (contact probability) maps of chromosome forthe 7 stages during differentiation. (E) Representative structures of the chromosome for the 7 stages during differentiation. (F-J) are similarwith (A-E) except for the process of IMR90 reprogramming to ESC. Another reduced 7 stages during the reprogramming (“R1”-“R7”) aredetermined in (G).

14

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted July 17, 2020. . https://doi.org/10.1101/2020.07.16.207662doi: bioRxiv preprint

Page 15: Microscopic Chromosomal Structural and …...2020/07/16  · experiments. Our model can be extended to gain insights into the chromosome pathological disorganization in cancer cells

Figure 3: Extensive switching of compartments A/B and conserved TADs in the chromosome during differentiation and reprogramming.(A) Compartment profiles of the chromosome for the 7 stages during differentiation. The values of compartment profiles are defined as thefirst principal component (PC1) obtained by the principal component analysis (PCA) of the contact probability maps (Details in “Model andSimulation Section”). Compartments A and B are shown blue (positive) and yellow (negative) along with the whole range of the chromosomesegment, respectively. (B) Hi-C heat (contact probability) maps of the two local regions of the chromosome that have compartment-switchingindicated in (A) for illustrating the structural changes of TADs at the 7 stages during differentiation. The boundaries of TADs are determinedby the insulation score [44] and shown with vertical lines in the heat maps. (C) An illustration of TAD boundary overlapping between Hi-Cdata (ESC or IMR90) and simulation processing state during differentiation and reprogramming (Left). The numbers of TAD boundariesdetected by insulation score for experimental Hi-C (ESC or IMR90), simulation processing state, and the overlap between processing stateand Hi-C (ESC or IMR90) are denoted as NESC(IMR90)

Hi−C , NPS and NESC(IMR90)ol , respectively. Insulation scores obtained from the experimental

Hi-C data show high correlations between the ESC and IMR90 (Right). (D) The change of Pearson’s correlation coefficient of the insulationscore obtained from the simulation to the one from experimental Hi-C of the ESC and IMR90 during differentiation. (E) The change of TADboundary overlapping population from the simulation processing state to the Hi-C of the ESC and IMR90 (calculated by NESC

ol /NESCHi−C and

NIMR90ol /NIMR90

Hi−C ) and from the Hi-C of the ESC and IMR90 to the simulation processing state (indicated by superscript + and calculated byNESC

ol /NPS and NIMR90ol /NPS) during differentiation. (F) The change of the TAD size during differentiation. The data are fitted to exponential

function with the relaxation time shown. Shadow region represents the standard deviations of the TAD size at the corresponding time. (G-K)are similar to (A, B, D, E, F) except for reprogramming. The time shown in (K) is the relaxation time after passing the over-expansion stage.

15

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted July 17, 2020. . https://doi.org/10.1101/2020.07.16.207662doi: bioRxiv preprint

Page 16: Microscopic Chromosomal Structural and …...2020/07/16  · experiments. Our model can be extended to gain insights into the chromosome pathological disorganization in cancer cells

Figure 4: The pathways of chromosomal structural rearrangement during differentiation and reprogramming. The pathways are projected ontothe order parameters of the chromosome with all trajectories presented for differentiation (Left) and reprogramming (Middle). The trajectoriesare colored by time in the logarithmic scale. The 7 stages are indicated by triangles (“D1”-“D7” in differentiation, Left) and diamonds (“R1”-“R7” in reprogramming, Right) on the averaged pathways shown as black lines. The averaged pathways for differentiation and reprogrammingare additionally shown and colored by time (Right). The ESC and IMR90 are placed as the blue and red points, respectively. The arrowsindicate the directions of differentiation (black) and reprogramming (grey). The pathways are projected onto (A) the extensions of the longestand the shortest principal axes (PA1 and PA3), (B) the radius of gyration (Rg) and the aspheric quantity (∆), contact similarity in terms of thefraction of native contact Q to ESC and IMR90 at (C) local range (0-2Mb), and (D) long-range (10-15Mb).

16

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted July 17, 2020. . https://doi.org/10.1101/2020.07.16.207662doi: bioRxiv preprint

Page 17: Microscopic Chromosomal Structural and …...2020/07/16  · experiments. Our model can be extended to gain insights into the chromosome pathological disorganization in cancer cells

Figure 5: The change of the radial density in the chromosome during differentiation and reprogramming. (A) The change of the radial densityprofile for the whole loci (Left), loci in compartment A (Middle) and loci in compartment B (Right) in the chromosome during differentiation.The profile at each state is colored by the processing time in the logarithmic scale, while the ones at the ESC and IMR90 are colored in blueand red, respectively. (B) Structural illustrations of the chromosome with loci colored by compartment states during differentiation. (C) Thechange of the radial density profile for particular groups of loci in the chromosome during differentiation. The chromosomal loci are classifiedinto 4 groups based on their associated compartment-switching patterns adapted during the differentiation, including dynamical switching ofthe compartment “A to B” and “B to A”, as well as the stable maintenance of the compartment “A to A” and “B to B” (Left to Right). Theprobability density is normalized by the number of loci involved. The lines with arrows illustrate the directions of changes in the radial densityprofile. (D-F) as (A-C) except for reprogramming. There are two types of lines with arrows in each figure for reprogramming. The black linesrepresent the expansion of the chromosome at the first stage, while the grey lines represent the compaction of the chromosome at the secondstage.

17

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted July 17, 2020. . https://doi.org/10.1101/2020.07.16.207662doi: bioRxiv preprint

Page 18: Microscopic Chromosomal Structural and …...2020/07/16  · experiments. Our model can be extended to gain insights into the chromosome pathological disorganization in cancer cells

Figure 6: Bifurcation of the chromosomal structural rearrangement during differentiation. (A) Scheme illustrating the ESC differentiationto the extraembryo and three germ layers in the embryo along with the experimental cell lines that are used for comparisons in our study.(B) Principal component analysis (PCA) of chromosome compartment profiles during differentiation projected on the first two principalcomponents (PCs) with populations indicated. Insert shows the hierarchical clustering tree of chromosome compartment profiles for the 7stages in differentiation from the simulations and the experimental cell lines. (C) Change of the contact probability in the chromosome duringdifferentiation and reprogramming varied by different contact ranges. Time is in logarithmic scale and arrows indicated the directions ofdifferentiation (black) and reprogramming (grey). Experimental cell lines are calculated from Hi-C data indicated as points, in the samecolor scheme with those in (B). (D) Compartmental value of individual chromosomal locus changing with respect to differentiation time.Chromosomal loci are classified into 4 groups based on their types of compartment-switching in differentiation: A to B, B to A, A to A and Bto B (Bottom to Top). (E) k-means clustering (k=4) of compartmental values changing with respect to time for loci that undergo compartment-switching from A to B, corresponding to the group at the bottom in (D). The averaged compartmental value of loci changing with time in eachcluster is fitted to exponential function and the corresponding relaxation time is shown. The percentage is the population of the correspondingcluster. The standard deviation is shown with the shadow region.

Figure 7: A schematic Waddington landscape for cell differentiation and reprogramming gained from a perspective of chromosomal structuralrearrangement. The balls are attractors on the landscape and colored in the same scheme as Figure 6 to represent the cell lines used in ourstudy. The paths are shown for illustrating the differentiation (Left) and reprogramming (Right) processes.

18

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted July 17, 2020. . https://doi.org/10.1101/2020.07.16.207662doi: bioRxiv preprint

Page 19: Microscopic Chromosomal Structural and …...2020/07/16  · experiments. Our model can be extended to gain insights into the chromosome pathological disorganization in cancer cells

Table of Contents

ToC text: Chromosomes are the structural scaffolds for gene functions, which determine the cell developmental process. A non-equilibriumlandscape model is developed to identify the chromosomal structural and dynamical origins of cell differentiation and reprogramming. Theirreversible pathways in differentiation and reprogramming show distinct dynamical rearrangement for the chromosome structure to facilitatethe formation of specific gene expression patterns and associated functions.ToC keyword: Irreversible chromosome dynamics

19

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted July 17, 2020. . https://doi.org/10.1101/2020.07.16.207662doi: bioRxiv preprint