bioRxiv preprint doi: https://doi.org/10.1101/716472; this version posted July 26, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available for use under a CC0 license.


Functional innovation in the evolution of the calcium-dependent system of the eukaryotic endoplasmic reticulum

Daniel E. Schäffer1,2†, Lakshminarayan M. Iyer1, A. Maxwell Burroughs1, L. Aravind1*

1 National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA

2 Science, Mathematics, and Computer Science Magnet Program, Montgomery Blair High School, Silver Spring, MD, USA

* Correspondence: L. Aravind, [email protected]

†School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA

Keywords: calcium binding, calcium stores, calmodulin, channels, eukaryote origins, endomembranes, SOCE pathway, wolframin.

Abstract

The origin of eukaryotes was marked by the emergence of several novel subcellular systems. One such is the calcium (Ca2+)-stores system of the endoplasmic reticulum, which profoundly influences diverse aspects of cellular function including signal transduction, motility, division, and biomineralization. We use comparative genomics and sensitive sequence and structure analyses to investigate the evolution of this system. Our findings reconstruct the core form of the Ca2+-stores system in the last eukaryotic common ancestor as having at least 15 proteins that constituted a basic system for facilitating both Ca2+ flux across endomembranes and Ca2+-dependent signaling. We present evidence that the key EF-hand Ca2+-binding components had their origins in a likely bacterial symbiont other than the mitochondrial progenitor, whereas the protein phosphatase subunit of the ancestral calcineurin complex was likely inherited from the asgardarchaeal progenitor of the stem eukaryote. This further points to the potential origin of the eukaryotes in a Ca2+-rich biomineralized environment such as stromatolites. We further show that throughout eukaryotic evolution there were several acquisitions from bacteria of key components of the Ca2+-stores system, even though no prokaryotic lineage possesses a comparable system. Further, using quantitative measures derived from comparative genomics we show that there were several rounds of lineage-specific gene expansions, innovations of novel gene families, and gene losses correlated with biological innovation such as the biomineralized molluscan shells, coccolithophores, and animal motility. The burst of innovation of new genes in animals included the wolframin protein associated with Wolfram syndrome in humans. We show for the first time that it contains previously unidentified Sel1, EF-hand, and OB-fold domains, which might have key roles in its biochemistry.

1 Introduction

The emergence of a conserved endomembrane system marks the seminal transition in cell structure that differentiates eukaryotes from their prokaryotic progenitors (Jekely, 2007). This event saw the emergence of a diversity of eukaryotic systems and organelles such as the nucleus, the endoplasmic reticulum (ER), vesicular trafficking, and several novel signaling systems that are uniquely associated with this sub-cellular environment. A major eukaryotic innovation in this regard is the intracellular ER-dependent calcium (Ca2+)-stores system that regulates the cytosolic concentration of Ca2+ (Ashby and Tepikin, 2001). Although Ca2+ ions are maintained at a 4000 to 10,000-fold higher concentration in the ER lumen as compared to the cytoplasm (Woo et al., 2018), upon appropriate stimulus they are released into the cytoplasm by Ca2+-release channels such as the inositol trisphosphate receptors (IP3Rs) and the ryanodine receptors (RyRs). The process is then reversed and Ca2+ is pumped back into the ER by the ATP-dependent SERCA (sarcoplasmic/endoplasmic reticulum calcium ATPase) pumps, members of the P-type ATPase superfamily (Ashby and Tepikin, 2001; Altshuler et al., 2012). In addition to the above core components that mediate the flux of Ca2+ from and into the ER-dependent stores, several other proteins have been linked to the regulation of this process and the transmission of Ca2+-dependent signals, including: 1) chaperones such as calreticulin, calnexin, and calsequestrin in the ER lumen (Kozlov et al., 2010); 2) diverse EF-hand proteins such as calmodulin (and its relatives), calcineurin B, and sorcin that bind Ca2+ and regulate the response to the Ca2+ flux (Denessiouk et al., 2014); 3) channel proteins such as the voltage gated calcium channels (VGCC) and trimeric intracellular cation channels (TRIC) that influence the flow of Ca2+ (Lanner et al., 2010; Zhou et al., 2014); and 4) protein kinases (calcium/calmodulin-dependent kinases (CaMKs)) and protein phosphatases (calcineurin A) that mediate the Ca2+-dependent signaling response (Berridge, 2012). Together, the Ca2+-stores system and intracellular Ca2+-dependent signaling apparatus regulate a variety of cellular functions required for eukaryotic life, such as transcription, cellular motility, cell growth, and cell division (Clapham, 2007; Berridge, 2012).

Comparative evolutionary analyses of the proteins in the ER Ca2+-stores and -signaling system have revealed that some components were either present in the last eukaryotic common ancestor (LECA) (e.g. calmodulin and SERCA) or derived early in the evolution of the eukaryotes (e.g. IP3R) (Nolan et al., 1994; Moreno and Docampo, 2003; Reiner et al., 2003; Prole and Taylor, 2011; Plattner and Verkhratsky, 2013; Verkhratsky and Parpura, 2014; Perez-Gordones et al., 2015). Other proteins show a patchier distribution in lineages outside of metazoans (e.g. calreticulin and calnexin) (Moreno and Docampo, 2003; Banerjee et al., 2007), or were reconstructed to have been derived in lineages closely related to the metazoans (e.g. RyR, which diverged from the ancestor of IP3R at the base of filozoans) (Alzayady et al., 2015). A substantial number of the components that have been studied in this system are primarily found in the metazoans, with no identifiable homologs outside of metazoa (Cai et al., 2015). Most studies have focused on animal proteins of these systems, highlighting the general lack of knowledge regarding the regulation of ER Ca2+-stores and the potential diversity in the regulatory systems present in other eukaryotes. To our knowledge, a systematic assessment of the evolutionary origins of the entire Ca2+-stores system and its regulatory components, as currently understood, has yet to be attempted.

Given our long-term interest in the origin and evolution of the eukaryotic subcellular systems, we conducted a comprehensive analysis of the core and regulatory components of the Ca2+-stores system, analyzing their known and predicted interactions and inferring the evolutionary depth of various components. We show that an ancient core of at least 15 protein families was already in place at the stem of the eukaryotic lineage. Of these, a subset of proteins is of recognizable bacterial ancestry, although there is no evidence of a bacterial Ca2+-stores system resembling those in eukaryotes. We also show that gene loss and lineage-specific expansions of these components shaped the system in different eukaryotic lineages, and sometimes correspond to recognized adaptive features unique to particular organisms or lineages. Further, we conducted a systematic domain analysis of the proteins in the system, uncovering three novel unreported domains in the enigmatic wolframin protein. These provide further testable hypotheses on the functions of wolframin in the context of the Ca2+-stores system and in protecting cells against the response stresses that impinge on the ER.

2 Methods

2.1 Sequence analysis

Iterative sequence profile searches were performed using the PSI-BLAST program (RRID: SCR_001010) (Altschul et al., 1997) against a curated database of 236 eukaryotic proteomes retrieved from the National Center for Biotechnology Information (NCBI), with search parameters varying based on the query sequence and its composition (see Supplementary Figure S1). The program HHpred (RRID: SCR_010276) (Soding, 2005; Alva et al., 2016) was used for profile-profile comparisons. The BLASTCLUST program [1] (RRID: SCR_016641) was used to cluster protein sequences based on BLAST similarity scores. Support for inclusion of a protein in an orthologous cluster involved reciprocal BLAST searches, conservation of domain architectures, and, when required, construction of phylogenetic trees with FastTree 2.1.3 (RRID: SCR_015501) (Price et al., 2010) with default parameters. The trees were visualized using FigTree [2] (RRID: SCR_008515). Selected taxonomic absences were further investigated with targeted BLASTP (RRID: SCR_001010) (Altschul et al., 1990) and TBLASTN (RRID: SCR_011822) (Gertz et al., 2006) searches against NCBI's non-redundant (nr) and nucleotide (nt) databases (Benson et al., 2013), respectively. Multiple sequence alignments were constructed using the MUSCLE (Edgar, 2004) and GISMO (Neuwald and Altschul, 2016) programs with default parameters. Alignments were manually adjusted using BLAST high-scoring pair (HSP) results as guides. Secondary structure predictions were performed with the Jpred 4 program (RRID: SCR_016504) with default settings (Drozdetskiy et al., 2015). EMBOSS (RRID: SCR_008493) pepwheel [3] was used to generate renderings of amino acid positions on the circumference of an α-helix.
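The reciprocal-search criterion mentioned above can be illustrated with a minimal Python sketch that finds reciprocal best hits between two proteomes from precomputed similarity scores. This is a sketch under assumptions rather than the authors' procedure: the score tables, the protein identifiers, and the function names `best_hits` and `reciprocal_best_hits` are hypothetical, and the additional evidence used in the study (domain architectures, phylogenetic trees) is not modeled. Python is used here only for illustration.

```python
def best_hits(scores):
    """Map each query to its highest-scoring subject.

    `scores` maps (query, subject) pairs to a similarity score
    (for example a BLAST bit score); higher is better.
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs that are each other's best hit."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Hypothetical toy scores, for illustration only.
a_vs_b = {("A1", "B1"): 250.0, ("A1", "B2"): 90.0, ("A2", "B2"): 40.0}
b_vs_a = {("B1", "A1"): 245.0, ("B2", "A1"): 95.0, ("B2", "A2"): 38.0}
print(reciprocal_best_hits(a_vs_b, b_vs_a))
```

In this toy case only A1 and B1 are retained: A2's best hit is B2, but B2's best hit is A1, so that pair fails the reciprocity check.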

2.2 Protein network construction

Protein-protein interactions were extracted from published data sources, updating any outdated gene/protein names and making substantial efforts to disambiguate between paralogs (Supplementary Data). High-throughput/predicted protein-protein interactions were extracted from the FunCoup database (Ogris et al., 2018). Networks were visualized using the R-language implementations of the igraph and qgraph packages. For network rendering, the Fruchterman-Reingold force-directed algorithm was used (Fruchterman and Reingold, 1991).

2.3 Comparative genome analyses

For quantitative analysis, phyletic patterns and paralog counts for different proteins/families were first obtained using a combination of sequence similarity searches as outlined above. We filtered the counts to exclude multiple identical sequences annotated with the same gene name, using the latest genome assemblies for each of these organisms as available in NCBI GenBank (RRID: SCR_002760) as anchors. Proteomes with identifiable quality issues were removed from downstream analyses (e.g. genomes containing sequences with ambiguous strain assignment), leaving counts for 216 organisms.

[1] ftp://ftp.ncbi.nih.gov/blast/documents/blastclust.html
[2] http://tree.bio.ed.ac.uk/software/figtree
[3] http://www.bioinformatics.nl/cgi-bin/emboss/pepwheel

These counts and their phyletic patterns defined two sets of vectors, namely the distribution by organism for a given protein and the complement of proteins for a given organism. These vectors for the protein families and organisms were used to compute the inter-protein or inter-organism Canberra distances (Lance and Williams, 1966), a measure best suited for such integer data in the form of presences and absences. The Canberra distance between two vectors $\vec{p}$ and $\vec{q}$ of length $n$ is defined as:

$$d(\vec{p}, \vec{q}) = \sum_{i=1}^{n} \frac{|p_i - q_i|}{|p_i| + |q_i|}$$
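As a worked illustration of the Canberra distance, the following Python sketch sums |p_i - q_i| / (|p_i| + |q_i|) over all positions of two count vectors and applies it to both kinds of vector described above. It is a minimal sketch under stated assumptions: the count table, the family and organism names, and the convention of skipping positions where both entries are zero are illustrative choices, not details taken from the study.

```python
def canberra(p, q):
    """Canberra distance between two equal-length count vectors.

    Positions where both entries are zero are skipped (treated as 0);
    this convention is an assumption made for the sketch.
    """
    if len(p) != len(q):
        raise ValueError("vectors must have the same length")
    total = 0.0
    for p_i, q_i in zip(p, q):
        denom = abs(p_i) + abs(q_i)
        if denom:
            total += abs(p_i - q_i) / denom
    return total


# Hypothetical paralog counts: rows are protein families, columns are organisms.
organisms = ["org1", "org2", "org3", "org4"]
counts = {
    "familyA": [1, 2, 0, 4],
    "familyB": [1, 0, 0, 2],
}

# Inter-protein distance: compare two rows (distribution by organism).
d_proteins = canberra(counts["familyA"], counts["familyB"])

# Inter-organism distance: compare two columns (protein complements).
columns = {
    name: [row[i] for row in counts.values()]
    for i, name in enumerate(organisms)
}
d_organisms = canberra(columns["org1"], columns["org4"])
print(d_proteins, d_organisms)
```

Because each term is normalized by the sum of the two entries, a difference between 0 and 2 copies contributes more than a difference between 2 and 4 copies, which is why the measure emphasizes presence/absence differences.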

These distances were used to cluster the protein families and organisms through agglomerative hierarchical clustering using Ward's method (Kaufman and Rousseeuw, 1990). The results were rendered as dendrograms. Ward's method takes the distance between two clusters, A and B, to be the amount by which the sum of squares from the center of the cluster will increase when they are merged. Ward's method then tries to keep this growth as small as possible. It tends to merge smaller clusters that are at the same distance from each other as larger ones, a behavior useful in lumping "stragglers" in terms of both organisms and proteins with correlated phyletic patterns.

The protein complement vectors for organisms were also used to perform principal component analysis to detect spatial clustering upon reducing dimensionality. The variables were scaled to have unit variance for this analysis. Similarly, a linear discriminant analysis was performed on these vectors using representatives of the major eukaryotic evolutionary lineages (see Supplementary Table S1 for list) as the prior groups for classification. This was then used on our complete phyletic pattern data for classification of the organisms based on their protein complements.

Organism polydomain scores were calculated as follows: if $N_{p,o}$ counts the number of paralogs of some protein domain family $p$ in some organism $o$, $P$ is the set of all proteins studied, and $O$ is the set of all organisms studied, then the polydomain score for an organism $o \in O$ is defined as:

[equation garbled in source]

where the mean is taken of a per-protein term over all proteins $p \in P$, and that per-protein term, which involves a sum and a double sum of the paralog counts, is defined as:

[equation garbled in source]

Computations and visualizations were performed using the R language.
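The Ward criterion described in this section can be illustrated with a short Python sketch that computes the increase in the within-cluster sum of squares caused by merging two clusters. This is a sketch under assumptions: it uses hypothetical two-dimensional points with Euclidean geometry purely to show the criterion, whereas the study applied Ward's method to Canberra distances in R; the function names are also hypothetical.

```python
def centroid(points):
    """Coordinate-wise mean of a non-empty list of equal-length points."""
    n = len(points)
    return [sum(coords) / n for coords in zip(*points)]


def sum_of_squares(points):
    """Sum of squared Euclidean distances from each point to the centroid."""
    c = centroid(points)
    return sum((x - m) ** 2 for pt in points for x, m in zip(pt, c))


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    return sum_of_squares(a + b) - sum_of_squares(a) - sum_of_squares(b)


# Hypothetical 2-D points, for illustration only.
big = [[0.0, 0.0], [0.0, 2.0], [2.0, 0.0], [2.0, 2.0]]        # centroid (1, 1)
other_big = [[0.0, 4.0], [0.0, 6.0], [2.0, 4.0], [2.0, 6.0]]  # centroid (1, 5)
straggler = [[1.0, 5.0]]                                      # singleton at (1, 5)

cost_small = ward_cost(big, straggler)   # small cluster joining a large one
cost_large = ward_cost(big, other_big)   # two large clusters, same centroid distance
print(cost_small, cost_large)
```

Both candidate merges involve centroids that are the same distance apart, yet absorbing the singleton raises the sum of squares far less than fusing the two four-point clusters. This is the behavior noted above, in which smaller clusters ("stragglers") are merged preferentially.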

3 Results

3.1 Protein-protein interaction network for the ER-dependent calcium stores regulatory/signaling apparatus

Metazoan Ca2+ stores have been extensively studied, leading to the identification of several proteins directly or indirectly involved in calcium transport or signaling and regulation of these processes. In order to apprehend the global structure of this system, we constructed a protein-protein interaction (PPI) network for individual components, centering the network on the three families of ER Ca2+ channels and pumps (SERCA, IP3R, RyR) (see Methods). The resulting network totaled 173 protein nodes and 761 interaction edges (Figure 1A, Supplementary Data). The distribution of the number of connections per node (degree distribution) in this network displays an inverse (rectangular hyperbolic) relationship (R² = 0.88; Figure 1C). This is a notable departure from typical PPI networks, which tend to show power-law degree distributions (Bader and Hogue, 2002; Rodrigues et al., 2011). To better understand this pattern, we studied its most densely connected subnetworks by searching for cliques, where every node is connected to every other node in the subnetwork. The largest cliques in this network have ten nodes. As the degree distribution graph shows an inflection around degree 6, we merged all cliques of size 6 or greater, resulting in a subnetwork of 46 nodes comprising close to 50% of the edges of the overall network (Figure 1B). This suggests that the inverse relationship of the degree distribution is a consequence of the presence of a core of several highly connected nodes (6 or more edges), which is in contrast to other system-specific PPI networks showing a power-law degree distribution, as in the case of the ubiquitin network (Venancio et al., 2009).

Analysis of the proteins in this highly connected sub-network suggests that the 46 proteins can be broadly classified into 5 groups: 1) the channels and ATP-dependent pumps which constitute the core Ca2+ transport system for ER stores; 2) EF-hand domain-containing proteins such as calmodulin and sorcin that bind Ca2+ and consequently interact with and regulate the biochemistry of numerous other proteins; 3) proteins involved in folding and stability of other proteins, such as chaperones, redox proteins, and protein disulfide isomerases; 4) components of the protein phosphorylation response that is downstream of Ca2+ flux into the cytoplasm; and 5) proteins linking the network to other major functional systems, such as Bcl2, which is involved in the apoptosis response in animals, and beclin 1, which is involved in autophagy.
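The clique-merging step used to extract the densely connected sub-network can be sketched in Python as follows. This is a minimal sketch under assumptions: it enumerates maximal cliques with the basic Bron-Kerbosch recursion and takes the union of those above a size threshold, the toy edge list and the function names are hypothetical, and the study's own analysis was carried out in R on the 173-node network with a threshold of 6.

```python
def maximal_cliques(adj):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion.

    `adj` maps each node to the set of its neighbours (undirected graph).
    """
    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidates, x: already-processed nodes
        if not p and not x:
            cliques.append(r)
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


def dense_core(edges, min_size):
    """Union of all maximal cliques with at least `min_size` nodes,
    together with the degree of every node in the full network."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    core = set()
    for clique in maximal_cliques(adj):
        if len(clique) >= min_size:
            core |= clique
    degrees = {node: len(nbrs) for node, nbrs in adj.items()}
    return core, degrees


# Hypothetical toy network: a 4-clique (a, b, c, d), a triangle (d, e, f),
# and a pendant node g attached to f.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"), ("c", "d"),
         ("d", "e"), ("d", "f"), ("e", "f"), ("f", "g")]
core, degrees = dense_core(edges, min_size=3)
print(sorted(core), degrees["d"])
```

In the toy network the pendant node g is excluded from the core because its only clique (with f) falls below the threshold, while node d, which bridges the two larger cliques, has the highest degree.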

3.2 Phyletic distribution of key proteins and implications for the evolution of the ER Ca2+-stores and -signaling pathway

This densely connected sub-network invites questions about its evolutionary origin, particularly given that the wet-lab results that inform the connections are predominantly drawn from mammalian studies (see Methods). To understand better the emergence of this sub-network and the conservation of its nodes across eukaryotes, we systematically analyzed their phyletic patterns (Figure 2A, Supplementary Figure S1 and Supplementary Data). A list of the 34 proteins and protein families studied, as well as their domain architectures, is in Supplementary Table S2. Further, Ward clustering analysis of the core components based on their phyletic patterns (see Methods) revealed the presence of 5 distinct clusters (Figure 2C). These clusters appear to have a broad evolutionary basis with distinct clusters accommodating proteins that could be inferred as having been in the LECA (e.g. cluster 1) and those that emerged in a metazoan-specific expansion of the ER Ca2+-stores system (e.g. clusters 4 and 5).

3.2.1 The core LECA complement of the Ca2+-stores system

At least 15 proteins of the ER Ca2+-stores and signaling system are found across all or most eukaryotic lineages, suggesting that they were present in the LECA. These proteins include key components of the dense sub-network, such as 1) the SERCA Ca2+ pumps; 2) the Ca2+-binding EF-hand proteins like calmodulin; 3) chaperones involved in protein folding that are mostly found in the ER and that sometimes act as either Ca2+ binding proteins (e.g. calreticulin, calnexin) or as regulators of other components of the Ca2+-stores and -signaling system (e.g. ERp57/PDIA3, calstabin, TMX1, ERdj5/DNAJC10); and 4) core enzymes of the Ca2+-dependent phosphorylation-based signaling system including the CaMKs and the calcineurin A phosphatase. This set of components is likely to have comprised the minimal ER Ca2+-stores and -signaling system in the LECA and suggests that there were already sub-systems in place to mediate: 1) the dynamic transport of Ca2+ ions across the emerging eukaryotic intracellular membrane system and 2) the transmission of signals affecting a wide range of subcellular processes based on the sensing of Ca2+ ions.

Notably, the IP3R channels are absent in lineages that are often considered the basal-most eukaryotes, namely the parabasalids and diplomonads; however, they are present in some other early-branching eukaryotes such as kinetoplastids. Their absence in certain extant eukaryotes (Figure 2A) suggests that they can be dispensable, or that their role can be taken up by other channels in eukaryotes that lack them. A comparable phyletic pattern is also seen for certain other key components of the densely connected sub-network, namely the ERdj5 and calnexin chaperones.

3.2.2 Components with clearly identifiable prokaryotic origins

Deeper sequence-based homology searches revealed that at least three protein families of the ER Ca2+-stores system have a clearly identifiable bacterial provenance, namely the P-type ATPase pump SERCA, sarcalumenin, and calmodulin and related EF-hand proteins. Phylogenetic analyses suggest that the P-type ATPases SERCA and plasma membrane calcium-transporting ATPase (PMCA) were both present in the LECA. They are most closely related to bacterial P-type ATPases (Plattner and Verkhratsky, 2015), which commonly associate with transporters (e.g. Na+-Ca2+ antiporters), ion exchangers (Na+-H+ exchangers), permeases, and other distinct P-type ATPases in conserved gene-neighborhoods (Supplementary Figure S2), suggesting a role in maintenance of ionic homeostasis even in bacteria.

The GTPases EHD and sarcalumenin, whose GTPase domains belong to the dynamin family (Leipe et al., 2002), also show clear bacterial origins based on their phyletic patterns and phylogenetic affinities. The closest bacterial homologs possess a pair of transmembrane helices C-terminal to the GTPase domain, suggesting a possible role in membrane dynamics (Figure 3B, Supplementary Data). Phylogenetic trees show that although they are related to the dynamins, the progenitor of the eukaryotic sarcalumenin and EHD was acquired independently of the dynamins via a separate transfer from a proteobacterial lineage to eukaryotes early in their evolution (Leipe et al., 2002). This gave rise to the EHD domain-containing EHD clade of GTPases, which are involved in regulating vesicular trafficking and membrane/Golgi reorganization. A further secondary transfer, likely from the kinetoplastid lineage to the metazoans, gave rise to sarcalumenin proper, which has characterized roles in the Ca2+-stores system (Figure 3A). This raises the possibility that in other eukaryotes, EHD performs additional roles in the Ca2+-stores system overlapping with metazoan sarcalumenin.

Our analysis revealed that the bacterial calmodulin-like EF-hands show a great diversity of domain-architectural associations. Versions closest to the eukaryotic ones are found in actinobacteria, cyanobacteria, proteobacteria, and verrucomicrobiae. These prokaryotic calmodulin-like EF-hand domains are found fused to a variety of other domains (Figure 3E), such as a 7-transmembrane domain (7TM) (cyanobacteria and to a lesser extent actinobacteria), the prokaryotic Tic110-like α-helical domain (cyanobacteria), heme-oxygenases (actinobacteria), the nitric oxide synthase and NADPH oxidase with ferredoxin and nucleotide-binding domains (cyanobacteria and δ-proteobacteria), as well as cNMP binding, thioredoxin, cytochrome, and sulfatase domains (verrucomicrobiae). Several of these architectural associations, such as the fusions to the sulfatases and the redox-regulator domains thioredoxin, cytochromes, glyoxylases, and NADPH oxidases, are also observed in eukaryotes. However, our analysis showed that these fusions appear to be independently derived in the two superkingdoms. These diverse associations suggest that even in bacteria the calmodulin-like EF-hands function in the context of membrane-associated signaling and redox reactions, possibly regulating them in a Ca2+-flux-dependent manner (Zhou et al., 2006; Dominguez et al., 2015). We infer that a version of these was transferred to the stem eukaryote, probably from the cyanobacteria or the actinobacteria, and had already triplicated by the LECA, giving rise to the ancestral versions of calmodulin and calcineurin B, which function as part of the Ca2+-stores system, and the centrins, which were recruited for a eukaryote-specific role in cell division (Dantas et al., 2012).


Among components with indirect regulatory roles in the ER Ca2+-dependent system is the peptidyl-254 prolyl cis-trans isomerase (PPIase) calstabin (FKBP1A/B in humans). Phyletic pattern analysis 255 indicates that calstabin was present in the LECA, which in turn likely acquired it from the α-256 proteobacterial mitochondrial progenitor. This is also supported by the evidence from extant 257 pathogenic/symbiotic bacteria wherein the bacterial FKBP-like PPIases play a role in establishing 258 associations with eukaryotic hosts (Unal and Steinert, 2014). In the stem eukaryotes, the ancestral 259 PPIase acquired from the bacterial source underwent a large radiation resulting in diverse PPIases 260 that acquired a wide range of substrate proteins in several eukaryote-specific pathways and function 261 in several cellular compartments. Calstabin is one of the paralogs that arose as part of this radiation 262 and appears to have been dedicated to the ER Ca2+-stores system. 263 264 Beyond the above-mentioned, several other which are inferred to be part of the LECA complement of 265 the ER Ca2+-stores system are likely of bacterial origin. However, they do not have obvious bacterial 266 orthologs and might have diverged considerably from their bacterial precursors in the stem eukaryote 267 itself. These include the TRIC-like channels (Silverio and Saier, 2011), the CaMKs, and the 268 thioredoxin domains found in ERp57, ERdj5, and TMX1. 269 270 In contrast to the several components of bacterial provenance, the large eukaryotic assemblage of 271 calcineurin-like protein phosphatases, which includes the Ca2+-stores regulator calcineurin A, are 272 specifically related to an archaeal clade to the exclusion of all other members of the superfamily 273 (Figure 3A, Supplementary Data). 
Notably, these close relatives are present in several Asgardarchaea, suggesting the eukaryotes may have directly inherited the ancestral version of these phosphoesterases from their archaeal progenitor (Zaremba-Niedzwiedzka et al., 2017). Strikingly, the archaeal calcineurin-like phosphatases occur in a conserved operon (Figure 3A) with proteins combining a zinc ribbon fused to a phosphopeptide-recognition FHA domain and a vWA domain fused to a β-barrel-like domain. This suggests a possible analogous role for these calcineurin-like phosphatases in transducing a signal through protein substrate dephosphorylation.

Thus, the core, ancestrally-conserved components of the ER-dependent stores system predominantly descend from the bacteria, although at least one component was inherited from the archaea. While the roles for some of these domains in possible bacterial Ca2+-dependent systems are apparent, there is no evidence that these versions function in a coordinated fashion in any single bacterial species or clade. Further, it is also clear that there was likely more than one bacterial source for the proteins: components such as SERCA and calstabin, whose closest relatives are proteobacterial, are likely to have been acquired from the mitochondrial ancestor, whereas calmodulin is likely to have been derived from a cyanobacterium or actinobacterium. Thus, the ER-dependent Ca2+-stores network was assembled in the stem eukaryote from diverse components drawn from different prokaryotic lineages. This assembly of the system in eukaryotes is strikingly illustrated by the case of the calcineurin complex. Here, the protein phosphatase component has a clear-cut origin from the archaeal precursor of the eukaryotes, whereas the Ca2+-binding calcineurin B component was acquired from a bacterial source.
It was the combination of these proteins with very distinct ancestries that allowed the emergence of a Ca2+-signaling system.

3.2.3 Lineage-specific expansions and gene loss shape the Ca2+ response across eukaryotes

In order to obtain some insights regarding the major developments in the evolution of the eukaryotic Ca2+-stores system, we systematically assembled protein complement vectors for each of the organisms in the curated proteome dataset (see Methods). After computing the pairwise Canberra distance between these vectors, we performed clustering using the Ward algorithm (see Methods). The resulting clusters largely recapitulate eukaryote phylogeny, with animals, fungi, plants, kinetoplastids and apicomplexans forming distinct monotypic clusters (Figure 2B). We then used principal component and linear discriminant analysis (see Methods) on these vectors to obtain a global quantitative view of the diversification of the Ca2+-stores system. Plotting the first two principal components/discriminants reveals discernable spatial separation of the metazoans from their nearest sister group, the fungi (Figure 2D-E). Further, diverse photosynthetic eukaryotes tended to be spatially colocalized. These observations together indicate that most major eukaryotic lineages have likely evolved distinguishing machineries around the core Ca2+-stores system inherited from the LECA. Hence, investigations on individual lineages are likely to unearth novel, lineage-specific regulatory devices, which potentially reflect adaptations to their respective environments and provide hints about the biological contexts in which they were found.

We also defined a measure, the polydomain score (see Methods), which captures the overall amplification of the Ca2+-stores system of an organism in terms of the contribution of the different constituent protein families of the system (Figure 2F). This score allowed us to capture some of the key differences in the reconstructed Ca2+-stores system between organisms. In particular, this allowed us to identify some of the distinctions between Ca2+-regulatory systems that may reflect differential strategies for the incorporation of calcium into exo/endo-skeletal structures in eukaryotes. Comparison of polydomain scores within metazoa (Figure 2G) showed a striking reduction of the Ca2+-stores system in arthropods as well as in some molluscs and other marine lineages, consistent with their distinct grouping in the PCA/LDA plots.
This appears to be generally related to the lower dependence on biomineralized structures in these organisms: arthropods utilize chitin as the primary exoskeletal material as opposed to calcareous structures, and the molluscan lineages showing this pattern have lost their calcified shells. Similarly, fungi, which use chitin as a central cell wall component, show lower polydomain scores relative to sister eukaryotic lineages. Conversely, we noted elevated polydomain scores in molluscs with calcareous shells. Likewise, ciliates, which have evolved a distinct extension to the ancestral eukaryotic Ca2+-stores system in the form of the alveoli (Plattner, 2017), also have elevated polydomain scores. Ca2+-based signaling has been shown to be important for defensive trichocyst exocytosis, ciliary action, and other pathways in ciliates (Plattner, 2015). These observations suggest a linkage between strategies for structural and signaling usage of calcium and the regulation of Ca2+ distribution between intracellular compartments.

In qualitative terms, the distinct spatial positioning of different eukaryotic lineages in the PCA/LDA plots and the difference in their polydomain scores could be explained on the basis of lineage-specific expansions (LSEs), gene losses, and domain architectural diversity between lineages (Lespinet et al., 2002). One of the proteins most frequently displaying LSEs is the CaMK: expansions have occurred independently in animals, plants, oomycetes and alveolates (Figure 3C, Supplementary Data). Ciliates in particular show a dramatic expansion of the family with over one hundred copies in Paramecium and Stentor.
Other proteins that show LSEs in specific lineages (Supplementary Data) include calmodulin (in metazoans, certain fungal lineages, and plants), calstabin (independently in different stramenopile lineages and haptophytes), PP2R3-like proteins (in kinetoplastids and Trichomonas), calcineurin A (in ciliates and Entamoeba), calnexins (in Trichomonas and diatoms), ORAI (in Emiliania, Figure 3D), and SERCA (in certain fungi). The metazoans also display LSEs for several protein families that appear to have specifically emerged in metazoa (see below).

Some of these LSEs have clear biological correlates that were also suggested by the polydomain scores. For example, LSEs of the calmodulin family in the shelled molluscs (20-46 copies) might correspond to or be involved in the regulation of Ca2+ concentration during the biomineralization processes involved in the formation of calcareous shells. Similarly, the development of the distinct set of Ca2+ stores in the form of the alveoli, which play several key ciliate-specific roles (Plattner, 2015; 2017), might explain the expansions in these organisms. Thus, the very large LSEs of the CaMK and calcineurin A-like phosphoesterase likely reflect the many diverse pathways into which Ca2+-based signaling is incorporated in these organisms. Similarly, ORAI family LSEs in Emiliania might correspond to the need to regulate Ca2+ transport for mineralizing the calcareous coccoliths (Yin et al., 2018). Here, as in ciliates, these expansions might have accompanied the emergence of a distinct Ca2+-stores system associated with the nuclear envelope (an extension of the ER), which is close in proximity to the envelope of the coccolith and the site of biomineralization (Brownlee et al., 2015).

In contrast, we observed extensive gene losses of proteins including ERdj5, sarcalumenin, IP3R, and ORAI across several or all fungi and losses of at least 10 conserved proteins in Entamoeba (Figure 2A). However, experimental studies have shown that both fungi and Entamoeba have Ca2+ transients and associated signaling systems (Makioka et al., 2002; Kim et al., 2012); hence, despite these losses, the evidence favors these organisms retaining at least a limited Ca2+-store-dependent signaling network. The CREC (calumenin, reticulocalbin, and Cab45) family displays an unusual phyletic pattern of particular note, being found only in metazoans and land plants (Figure 2A). Barring the unusual possibility of lateral transfer between these two lineages, this would imply extensive loss of these proteins across most other major eukaryotic lineages. Within metazoa, several proteins display unusual loss patterns, including sorcin, calsequestrin, and SelN (Figure 2A).

Notably, the land plants and their sister group the green algae have entirely lost ancient core Ca2+-dependent signaling proteins such as calcineurin A and HOMER.
The IP3R channels are retained by the chlorophytes but lost entirely by the land plants, while the ORAI channels are present in the basal land plants Physcomitrella and Selaginella but are not observed in crown land plants (Edel et al., 2017). Land plants have also lost the TRIC channels, but the basal streptophyte Chara braunii retains a copy. The differential retention and expansion of various ancestral Ca2+ components (Figure 2A) in the land plants might be seen as adaptations to a sessile lifestyle no longer requiring Ca2+-signaling components associated with active cell or organismal motility. However, the paradoxical LSEs of the calmodulin-like and EF-hand-fused CaMK-like proteins (CDPK family) (Edel et al., 2017) in land plants relative to their chlorophyte sister-group (Figure 2A, 3C) might reflect the presence of distinct Ca2+-related homeostasis systems. This might have emerged as a Ca2+ sequestration mechanism relevant in the Ca2+-poor freshwater ecosystems wherein the land plants originated from algal progenitors (Delwiche and Cooper, 2015).

Little in the way of domain architectural diversity is observed in the core conserved proteins of the animal Ca2+-stores system. A notable exception is seen in the CaMKs: metazoan CaMKs show a higher architectural complexity via their fusion to several distinct globular domains. These domains act as adaptors in interactions with Ca2+-binding regulatory domains, effectively broadening the total range of signaling pathways in which the CaMKs participate in metazoa (Wang et al., 2015). In contrast, land plants display lower architectural complexity with CaMK orthologs directly fused to Ca2+-binding domains (Klimecka and Muszynska, 2007).

3.2.4 Accretion of novel Ca2+ signaling pathway components in the Metazoa

Case by case examination revealed that the distinct position of the metazoans in the PCA/LDA plots relative to other eukaryotes (Figure 2D-E) is due to a major accretion of novel signaling and flux-related Ca2+-stores components at the base of the metazoan lineage (Figure 2A). Our analyses suggest three distinct origins of these proteins: 1) proteins that emerged as paralogs of domains already present in the LECA and functioning in the context of Ca2+-signaling, including the EF-hand domains found in the STIM1/2, calumenin, SelN, S100, sorcin, and NCS1 proteins, the thioredoxin domain found in the ERp44 and calsequestrin proteins, and the RyR channel proteins; 2) proteins or component domains which are involved in a broader range of functions but have been recruited to roles in Ca2+-stores regulation specifically in the animal lineages, such as the SAM domain in the STIM1/2 proteins, the PDZ domain in the neurabin protein, and the TRPC channels; 3) proteins containing domains which appear to be either novel metazoan innovations, such as the KRAP domain of the Tespa1 protein, or whose origins have yet to be traced, such as VGCC and wolframin. Further, the origin of certain metazoan-specific proteins such as phospholamban, which inhibits the SERCA ATPase, remains difficult to trace because of their small size and highly-biased composition as membrane proteins. Inspection of the neighbors of these proteins in the constructed network suggests that in almost all instances, these proteins were added to the Ca2+ signaling network via interactions with one or more of the ancient components of the system (Figure 1A).

This sudden accretion of Ca2+-stores components likely coincided with emergence of well-studied aspects of differentiated metazoan tissues, such as muscle contraction and neurotransmitter release (Zucchi and Ronca-Testoni, 1997; Clapham, 2007). Other additions to the network likely arose via interface with other pathways like apoptosis and autophagy in the context of ER stress response (Smaili et al., 2013). Notable in this regard is the metazoan-specific Bcl2 family of membrane-associated proteins associated with regulation of apoptosis. They appear to have been derived via rapid divergence in metazoans from pore-forming toxin domains of ultimately bacterial provenance (Peng et al., 2009), which are found in pathogenic bacteria and fungi (Aravind et al., 2012).

Still other proteins were recruited to the system as part of newly-emergent regulatory subnetworks, such as the Store-Operated Calcium Entry (SOCE) pathway for re-filling the ER from extracellular Ca2+ stores (Prakriya and Lewis, 2015; Ong et al., 2016). Of the proteins identified in this pathway, the ORAI Ca2+ channels are present in the early-branching kinetoplastid lineage but were lost in several later-branching lineages (Figure 2A), while other components like the ORAI-regulating STIM1/2 proteins and the TRPC channels emerged around the metazoan accretion event, suggesting the core SOCE pathway came together at or near the base of animals. In the course of this analysis, we identified a common origin for the ORAI and Jiraiya/TMEM221 ER channels, the latter of which are characterized in BMP signaling (Aramaki et al., 2010). The Jiraiya channels are observed in animals but are absent in earlier-branching eukaryotic lineages, suggesting they emerged from a duplication of an ORAI channel early in metazoan evolution (Figure 3D, Supplementary Data). Jiraiya channel domains lack the Ca2+ binding residues seen in ORAI channels, suggesting they are unlikely to directly bind Ca2+. However, it is possible that Jiraiya/TMEM221 physically associates with components of the Ca2+-stores system to regulate them.
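
The comparison of protein complement vectors that underlies the clustering and PCA/LDA results discussed in sections 3.2.3 and 3.2.4 can be illustrated with a minimal sketch of the pairwise Canberra distance step. The organism names and copy-number values below are invented toy data, not values from this study, and the function names are hypothetical; the actual vector construction and the Ward clustering are as described in Methods and are not reproduced here.

```python
from itertools import combinations


def canberra(u, v):
    """Canberra distance: sum over positions of |a - b| / (|a| + |b|).

    Positions where both entries are zero contribute nothing
    (the 0/0 term is treated as 0).
    """
    total = 0.0
    for a, b in zip(u, v):
        denom = abs(a) + abs(b)
        if denom:
            total += abs(a - b) / denom
    return total


def pairwise_canberra(vectors):
    """Return {(name_i, name_j): distance} for every unordered pair."""
    return {
        (a, b): canberra(vectors[a], vectors[b])
        for a, b in combinations(sorted(vectors), 2)
    }


# Hypothetical per-organism copy numbers for four protein families
# (toy values for illustration only).
complement = {
    "organism_A": [3, 1, 0, 2],
    "organism_B": [3, 1, 0, 1],
    "organism_C": [0, 5, 2, 0],
}

distances = pairwise_canberra(complement)
closest_pair = min(distances, key=distances.get)
print(closest_pair, round(distances[closest_pair], 3))
```

Because each term is scaled by the sum of the two entries, a family present in one organism and absent in the other contributes the maximum value of 1 regardless of copy number, which makes the measure sensitive to presence/absence differences between complements.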

3.3 Domain architectural anatomy and functional analysis of the wolframin protein

As noted above, one of the uniquely metazoan proteins in the Ca2+-stores system is wolframin, a transmembrane protein localized to the ER membrane (Hildebrand et al., 2008; Rigoli et al., 2011; Qian et al., 2015) (Figure 2A, Supplementary Data). Wolframin, along with the structurally-unrelated and more widely phyletically distributed (Figure 2A) Wolfram syndrome 2 (WFS2) protein, is implicated in Wolfram syndrome (Inoue et al., 1998; Strom et al., 1998; Amr et al., 2007; Urano, 2016). Experimental studies have attributed biological roles to unannotated regions upstream and downstream of the TM region in wolframin, and our analyses revealed four uncharacterized globular regions within them (Figure 4A). The cytosolic N-terminal region was identified through iterative database searches (see Methods) as containing Sel1-like repeats (SLRs; query: NP_005996, hit: OYV16035, iteration 2, e-value: 5x10^-16), which are α-helical superstructure forming repeats structurally comparable to the tetratricopeptide repeats (TPRs) (Ponting et al., 1999; Karpenahalli et al., 2007). Profile-profile searches (see Methods) unified the remaining two globular regions respectively with various EF-hand domains (e.g., query: XP_011608878, hit: 1SNL_A, p-value: 2.3x10^-4) and OB fold-containing domains (e.g., query: XP_017330582, hit: 2FXQ_A, p-value: 1.3x10^-4).

The wolframin SLRs possess several conserved residues seen in the classical SLRs (Figure 4C, Supplementary Table S4) (Mittl and Schneider-Brachert, 2007), and additionally display a truncated first loop when compared to known SLRs (Figure 4C). The SLRs of wolframin appear most closely-related to bacterial versions, suggesting a possible horizontal transfer at the base of animals (Ponting et al., 1999); however, the short length and rapid divergence of such repeats complicates definitive ascertainment of such evolutionary relationships. SLRs and related α/α repeats are often involved in coordinating interactions within protein complexes (Karpenahalli et al., 2007; Mittl and Schneider-Brachert, 2007) and have been characterized specifically in ER stress and misfolded protein degradation responses (Jeong et al., 2016) and in mediating interactions of membrane-associated protein complexes (Mittl and Schneider-Brachert, 2007). As the wolframin SLRs roughly correspond to an experimentally-determined calmodulin-binding region (Yurimoto et al., 2009), they might specifically mediate that protein interaction.

The EF-hand region of wolframin contains the two characteristic copies of the bihelical repeat that form the basic EF-hand unit. However, it lacks the well-characterized Ca2+-binding DxDxDG motif or any comparable residue conservation (Figure 4D) (Gifford et al., 2007; Denessiouk et al., 2014). It also features striking loop length variability in between the two helix-loop-helix motifs (Figure 4D).
Such “inactive” EF-hand units typically dimerize with other EF-hand proteins (Kawasaki et al., 1998; Gifford et al., 2007), suggesting that wolframin could be self-dimerizing or that its EF-hand could interact at the ER membrane with calmodulin and/or a distinct EF-hand protein.

The C-terminal OB fold domain (Figure 4B,E) lacks the conserved polar residues typical of nucleic acid-binding OB-fold domains (Watson et al., 2007; Guardino et al., 2009). Further, the localization of this region to the ER lumen suggests that it is unlikely to be involved in nucleic acid binding (Arcus, 2002; Flynn and Zou, 2010; Krishna et al., 2010). Alternatively, this OB-fold domain could mediate protein-protein interactions, as has been observed for other members of the fold (Flynn and Zou, 2010); this would also be consistent with previous experimental studies implicating this region of wolframin in binding the pre-folded form of ATP1B1 (Zatyka et al., 2008). Strikingly, in the region N-terminal to the OB fold domain we observe a further globular region containing a set of six absolutely-conserved cysteines (Figure 4E). This region is not unifiable with any known domains and could conceivably represent an extension to the core OB fold domain. These conserved cysteine residues could contribute to disulfide-bond-mediated cross-binding regulation, a well-studied regulatory mechanism of Ca2+-stores regulation (Ushioda et al., 2016). Additionally, C-terminal to the OB-fold, wolframin has a hydrophobic helix that might be involved in intra- or inter-molecular packing interactions (Figure 4E).

The wolframin TM region, located in the central region of the protein (Figure 4A), consists of nine transmembrane helices (Hofmann et al., 2003; Rigoli et al., 2011; Qian et al., 2015).
Despite extensive studies on the TM region, its precise role in affecting Ca2+ flow across the ER membrane remains the subject of some debate (Osman et al., 2003; Aloi et al., 2012; Zatyka et al., 2015; Cagalinec et al., 2016). Inspection of a multiple sequence alignment of the wolframin transmembrane region (Supplementary Figure S3) revealed a concentration of polar residues which are spatially alignable in helices 4 and 5 (Supplementary Figure S4). This is reminiscent of membrane associated polar residue configurations seen in proteins that allow transmembrane flux of ions. Hence, it would be of interest to investigate if these residues might play a role in ion transport by wolframin.
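
The absence of the DxDxDG motif from the wolframin EF-hand region, noted earlier in this section, is the kind of observation that can be checked with a simple pattern scan. The sketch below is only an illustration under stated assumptions: it reads "DxDxDG" literally as a regular expression, whereas real EF-hand loops tolerate substitutions and are better assessed with profiles and alignments such as those used in this study. The function name and both sequences are hypothetical toy examples and are not taken from wolframin or any alignment reported here.

```python
import re

# Literal reading of DxDxDG: Asp, any residue, Asp, any residue, Asp, Gly.
# The lookahead lets overlapping occurrences be reported as well.
DXDXDG = re.compile(r"(?=(D.D.DG))")


def find_dxdxdg(sequence):
    """Return 0-based start positions of every DxDxDG occurrence."""
    return [m.start() for m in DXDXDG.finditer(sequence.upper())]


# Toy sequences invented for illustration only.
canonical_like = "AEEFKDKDGDGTISLKE"  # contains D-K-D-G-D-G
motif_free = "AEEFKSKNGQGTISLKE"      # same length, no aspartates

print(find_dxdxdg(canonical_like))
print(find_dxdxdg(motif_free))
```

A strict scan of this kind can only flag candidates; a loop that fails the literal pattern may still bind Ca2+ through variant residues, so a negative result is best interpreted alongside the residue conservation across homologs.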

4 Discussion

4.1 Evolutionary and functional considerations

4.1.1 Early and later landmarks in the evolution of the ER Ca2+-stores system

The ER Ca2+-stores system displays several parallels in its evolutionary history to other endomembrane-dependent systems such as the nuclear membrane and vesicular trafficking systems (Mans et al., 2004; Jekely, 2008). As in the case of these systems, dedicated ER Ca2+-stores systems are absent in the prokaryotes, despite the presence of Ca2+ transients and Ca2+-dependent signaling pathways in them (Dominguez et al., 2015). However, several of the more ancient individual components of eukaryotic Ca2+-stores systems are of clear-cut prokaryotic origin. Critically, we show here that not all of these proteins originated from the proto-mitochondrion; notably, calmodulin is of likely cyanobacterial or actinobacterial provenance and the calcineurin-like phosphoesterases originally descended from the archaea. The former observation adds to accumulating evidence of LECA acquiring bacterial contributions from non-α-proteobacterial lineages. It is therefore possible that LECA had a more extensive set of associated symbionts than what was fixed as the mitochondrion in eukaryotic evolution (Burroughs et al., 2017; Verma et al., 2018).

The cyanobacterial origin of calmodulin and the role for the closely-related and early-branching centrin family of EF-hand proteins in microtubule dynamics during cell division suggest that the LECA already had a strong Ca2+ dependency. This raises the possibility that the prokaryotic ancestors of the eukaryotes might have existed in a calcium-rich environment such as the biomineralized structures (e.g. stromatolites) formed by cyanobacteria (Bosak et al., 2013). This is consistent with the diversity of domain architectures for calmodulin-like proteins with ramifications into various functional systems in the cyanobacteria that we reported here.
This diversified pool of Ca2+-binding domains could have contributed raw materials needed during the initial emergence of Ca2+ flux-based signaling and regulation across endomembranes in eukaryotes. This emergence of intracellular Ca2+ gradients was likely fixed by the myriad advantages it bestowed in the stem eukaryotes, including increased signaling capacity and flexibility for processes like growth/proliferation, secretion, and motility.

The ancestral eukaryotic ER-dependent Ca2+-stores system likely consisted of a combination of the SERCA pump, a cation channel, EF-hand-containing proteins, and phosphorylation enzymes. Early in eukaryotic evolution, chaperone domains associated with protein folding and thioredoxin fold domains were added to the system, likely recruited from roles as general regulatory domains, some of which could also bind Ca2+. Association with thioredoxin fold domains, involved in disulfide-bond isomerization, is of note as this points to early emergence of a link between redox-dependent folding of cysteine-rich proteins and Ca2+ concentrations. The striking presence of cysteine-rich domains associated with cyanobacterial calmodulin homologs (Figure 3E; DES, LMI, and LA unpublished observations) suggests that such a connection might have emerged even before the origin of the eukaryotic Ca2+-stores system. These functional links might have persisted until later in eukaryotic evolution as hinted by the cysteine-rich domain present in wolframin. This led to the basic system as reconstructed in the LECA (Figure 2A).


Waves of additional accretion events added components to the Ca2+-stores system at distinct points in eukaryotic evolution, often appearing to correlate with adaptations to distinct lifestyles, such as the evolution of motile multicellular forms, loss of motility (the crown plant lineage), or the evolution of calcium-rich biomineralized skeletons and shells (Figure 2A). Strikingly, we appear to observe some correlation between the loss-and-gain patterns of the regulatory components of Ca2+-stores systems and the degree of structural utilization of calcium. For example, a relative dearth of these components is observed in the arthropod and fungal lineages, which use chitin-based structural components as opposed to the calcium-based counterparts used by vertebrates and molluscs (Figure 2F-G). While the data in this study necessarily rely on experimental findings primarily from animals, variations in presence/absence across lineages provide insight into the dynamic evolution of Ca2+ stores regulation and suggest there are further complexities to explore in more poorly-characterized eukaryotes.

4.1.2 Wolframin domain architecture and interactors

Even within animal Ca2+-stores systems, several proteins with important regulatory roles remain poorly-understood in terms of their functional mechanisms. Wolframin is such a protein; its domain composition has eluded researchers for over two decades (Inoue et al., 1998; Strom et al., 1998; Osman et al., 2003). Assignment of domains at the N- and C-termini of the central TM region (Figure 4A), as well as the positioning of wolframin in the assembled interaction network (Figure 1A) (Zatyka et al., 2015), supports a role in coordinating interactions on both sides of the ER membrane. These are likely to take the form of PPIs with Ca2+-binding proteins or through disulfide bond interactions (see above).

However, outside of a possible bacterial origin for the SLR region (see above, Figure 4C), the precise evolutionary origins of the remaining domains comprising wolframin remain mostly unclear. It appears likely these domains were derived from paralogs of existing EF-hand and OB fold domains, both of which had already undergone extensive domain radiations in the eukaryotes prior to the emergence of the animals, and then assembled and recruited to a Ca2+-stores regulatory role at the ER (Lespinet et al., 2002). Their rapid divergence, evident from the lack of recognizable relationships to known families with their respective folds, could have resulted from the extraordinary selective pressures occurring with the major burst of evolutionary changes during the emergence of the metazoan lineage.

Despite observations that genic disruptions contribute to similar though not identical phenotypes (Urano, 2016), WFS1 and WFS2 are structurally unrelated.
Our interaction network analysis further failed to uncover any shared interactors (Figure 1A), although both have been linked in the past to calpain activity (Lu et al., 2014), mitochondrial dysfunction (Chang et al., 2012; Cagalinec et al., 2016), and apoptosis (Yamada et al., 2006; Chang et al., 2010). Recent research on WFS2 has particularly focused on its role in calcium stores regulation at the intersection of the ER and mitochondrial membranes (Rouzier et al., 2017). We believe that our identification of the constituent domains of wolframin reported herein might help clarify its function better through targeted deletion and mutagenesis of these domains.

5 Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

6 Author Contributions

Conceptualization, DS, AMB, LA; formal analysis, DS, AMB, LMI; analytical tools, DS, LA; project administration, AMB, LMI, LA; visualization, DS; writing (original draft), DS, AMB; writing (review and editing), AMB, LMI, LA.

7 Funding

DES, LMI, AMB, and LA are supported by the Intramural Research Program of the NIH, National Library of Medicine.

8 Acknowledgments

This is a short text to acknowledge the contributions of specific colleagues, institutions, or agencies that aided the efforts of the authors.

9 Data Availability Statement

All datasets analyzed for this study are included in the manuscript and the supplementary files.

10 Figure legends

Figure 1. Protein-protein interaction network of the ER-dependent Ca2+-stores and -signaling system. (A) The network. Edges are classified as follows: interactions reported in the literature in which each partner gene either has no other paralog or has been specifically identified (solid black), reported interactions where the specific paralog for at least one of the two genes is unclear (dashed gray), and interactions found in FunCoup 4.0 with P ≥ 0.9 (light pink). Medium- and large-sized nodes represent the proteins whose phyletic distribution was selected for detailed analysis and are colored based on phylogeny as follows: pan-eukaryotic, green; metazoans and close relatives, yellow; metazoan-specific, orange; chordate-specific, red. Nodes are labeled by HUGO nomenclature to capture paralog-specific interactions. The borders of nodes involved in selected intracellular systems are colored based on the key given in the lower left. (B) The highly-connected subnetwork. Edge coloring has no significance; nodes are colored and sized as in (A). (C) A histogram of the degrees of the nodes of the network shown in (A). A curve of best fit is shown in gray.
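As an illustration of how a degree histogram such as the one in panel (C) can be tabulated, the following is a minimal Python sketch. The edge list and node names are hypothetical toy values, not taken from the network in (A), and the function name `degree_histogram` is an assumption made for illustration rather than the authors' code.

```python
from collections import Counter


def degree_histogram(edges):
    """Map each node degree to the number of nodes having that degree.

    `edges` is an iterable of (node_a, node_b) pairs for an undirected
    network; duplicate edges and self-loops are ignored.
    """
    # Collapse duplicates such as ("B", "C") and ("C", "B") into one edge.
    unique_edges = {frozenset(pair) for pair in edges if pair[0] != pair[1]}
    degree = Counter()
    for edge in unique_edges:
        for node in edge:
            degree[node] += 1
    # Count how many nodes share each degree value.
    return dict(Counter(degree.values()))


# Hypothetical toy network (placeholder names, not real proteins).
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("C", "B")]
histogram = degree_histogram(toy_edges)
```

Here node A has degree 3, B and C have degree 2, and D has degree 1, so the histogram has one node each at degrees 1 and 3 and two nodes at degree 2.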

Figure 2. Comparative genome analyses. (A) A visualization of the number of paralogs of the 34 key proteins/protein families (vertical axis) across 216 eukaryotes (horizontal axis). Each circle represents the number of paralogs of one protein/family in one organism; the radius of each circle is scaled by the hyperbolic arcsine of the represented number of paralogs. The proteins and protein families shown can be broadly divided into two categories: those that are largely pan-eukaryotic (calmodulin to ORAI) and those that originate in the metazoans or their close relatives (STIM to S100). Eukaryotic clades are color-coded and labeled below the plot. Where needed, protein/protein family names are supplemented in parentheses with human gene names from Figure 1, where node labels distinguish between paralogs. (B) A dendrogram of the 216 eukaryotes based on the counts of their paralogs of the 34 proteins/families. Tip labels are colored by clade based on the coloring scheme shown in the upper left. (C) A dendrogram of the 34 proteins/families based on the counts of their paralogs. Five major clusters are numbered to their left. (D-E) Plots of the first two (D) principal components and (E) discriminants of each of the 216 eukaryotes resulting from principal component and linear discriminant analyses on the paralog count dataset. Color keys linking colors to high-level clades are given in the upper right and lower right, respectively, of the plots. (F-G) Histograms of the polydomain scores of (F) all 216 eukaryotes, shown on a base-10 log scale, and (G) 50 metazoans, shown on a linear scale. Contributions of some clades to each bar of the histograms are shown through coloring given by the keys in the upper right of the histograms.
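The hyperbolic arcsine scaling of circle radii described for panel (A) can be sketched as follows. This is a minimal illustration, not the authors' plotting code: the function name `circle_radius` and the `scale` constant are hypothetical. The hyperbolic arcsine, asinh(n) = ln(n + sqrt(n^2 + 1)), is zero at n = 0 and grows roughly logarithmically for large n, which keeps organisms with no paralogs representable while compressing very large expansions.

```python
import math


def circle_radius(paralog_count, scale=1.0):
    """Return a radius proportional to asinh(paralog_count).

    `scale` is a hypothetical plotting constant (an assumption for this
    sketch); the legend only states that radii follow the hyperbolic arcsine.
    """
    if paralog_count < 0:
        raise ValueError("paralog count must be non-negative")
    return scale * math.asinh(paralog_count)


# Hypothetical paralog counts spanning several orders of magnitude.
counts = [0, 1, 5, 50, 500]
radii = [circle_radius(n) for n in counts]
```

Unlike a plain logarithm, this transform is defined at zero, so absent families map to a radius of zero rather than requiring a pseudocount.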

Figure 3. (A-D) Stylized phylogenetic trees showing (A) sarcalumenin, EHD, and related bacterial dynamins; (B) a partial tree of the calcineurin-like superfamily, containing the classical calcineurin A phosphatases with their immediate eukaryotic relatives and newly-recognized archaeal orthologs, along with the related MRE11/rad32/sbcD-like phosphoesterases as an outgroup; (C) the expansions of the calcium/calmodulin-dependent kinase in eukaryotes; and (D) ORAI and Jiraiya. Collapsed groups are colored as follows: universal distribution, yellow; bacterial, blue; archaeal, red; pan-eukaryotic, dark green; metazoan, light green; restricted to non-metazoan eukaryotes, blue-green. Branches with bootstrap support of greater than 85% are marked with a black circle. Genome contextual associations pertinent to a given clade are provided within the context of the trees. Conserved gene neighborhoods are depicted as boxed arrows and protein domain architectures as boxes linked in the same polypeptide. (E) Domain architectures of proteins containing calmodulin-like EF-hands. Each EF-hand is a dyad of EF repeats. Long (>200 residue) regions without an annotated domain are collapsed using “//”.

Figure 4. Structural and sequence overview of wolframin. (A) Domain architecture of wolframin. (B) Topology diagram of the wolframin OB-fold. The labeled secondary structure elements correspond with the labeled secondary structure elements and coloring in the OB-fold alignment in part E. The β-strand shaded in gray is a possible sixth strand stacking with the core OB-fold domain barrel. (C-E) Multiple sequence alignments of the (C) Sel1-like repeats, (D) EF-hand, and (E) cysteine-rich and OB-fold domains of wolframin. Sequences are labeled to the left of the alignments by organism abbreviation (see Supplementary Table S3) and NCBI GenBank accession number. Secondary structure is provided above the alignments; green arrows represent strands and red cylinders represent helices. The two EF motifs are marked with blue arrows above the secondary structure line. The boundaries of the core OB-fold and the cysteine-rich region (E) are marked with pentagonal arrows above the secondary structure line and pointing towards the center of the domain. The six secondary structure elements that comprise the OB-fold are labeled with blue text in the secondary structure line. Conserved cysteines are marked with an asterisk. A 90% consensus line is provided below the alignments; the coloring and abbreviations used are: h (hydrophobic), l (aliphatic), and a (aromatic) are shown on a yellow background; o (alcohol) is shown in salmon font; p (polar) is shown in blue font; + (positively charged), - (negatively charged), and c (charged) are shown in pink font; s (small) is shown in green font; u (tiny) is shown on a green background; b (big) is shaded gray.

11 References

Aloi, C., Salina, A., Pasquali, L., Lugani, F., Perri, K., Russo, C., et al. (2012). Wolfram syndrome: new mutations, different phenotype. PLoS One 7(1), e29150. doi: 10.1371/journal.pone.0029150.

Altschul, S.F., Gish, W., Miller, W., Myers, E.W., and Lipman, D.J. (1990). Basic local alignment search tool. J Mol Biol 215(3), 403-410. doi: 10.1016/s0022-2836(05)80360-2.

Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W., et al. (1997). Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25(17), 3389-3402.

Altshuler, I., Vaillant, J.J., Xu, S., and Cristescu, M.E. (2012). The evolutionary history of sarco(endo)plasmic calcium ATPase (SERCA). PLoS One 7(12), e52617. doi: 10.1371/journal.pone.0052617.

Alva, V., Nam, S.Z., Soding, J., and Lupas, A.N. (2016). The MPI bioinformatics Toolkit as an integrative platform for advanced protein sequence and structure analysis. Nucleic Acids Res 44(W1), W410-415. doi: 10.1093/nar/gkw348.

Alzayady, K.J., Sebe-Pedros, A., Chandrasekhar, R., Wang, L., Ruiz-Trillo, I., and Yule, D.I. (2015). Tracing the Evolutionary History of Inositol, 1, 4, 5-Trisphosphate Receptor: Insights from Analyses of Capsaspora owczarzaki Ca2+ Release Channel Orthologs. Mol Biol Evol 32(9), 2236-2253. doi: 10.1093/molbev/msv098.

Amr, S., Heisey, C., Zhang, M., Xia, X.J., Shows, K.H., Ajlouni, K., et al. (2007). A homozygous mutation in a novel zinc-finger protein, ERIS, is responsible for Wolfram syndrome 2. Am J Hum Genet 81(4), 673-683. doi: 10.1086/520961.

Aramaki, T., Sasai, N., Yakura, R., and Sasai, Y. (2010). Jiraiya attenuates BMP signaling by interfering with type II BMP receptors in neuroectodermal patterning. Dev Cell 19(4), 547-561. doi: 10.1016/j.devcel.2010.09.001.

Aravind, L., Anantharaman, V., Zhang, D., de Souza, R.F., and Iyer, L.M. (2012). Gene flow and biological conflict systems in the origin and evolution of eukaryotes. Front Cell Infect Microbiol 2, 89. doi: 10.3389/fcimb.2012.00089.

Arcus, V. (2002). OB-fold domains: a snapshot of the evolution of sequence, structure and function. Curr Opin Struct Biol 12(6), 794-801.

Ashby, M.C., and Tepikin, A.V. (2001). ER calcium and the functions of intracellular organelles. Semin Cell Dev Biol 12(1), 11-17. doi: 10.1006/scdb.2000.0212.

Bader, G.D., and Hogue, C.W. (2002). Analyzing yeast protein-protein interaction data obtained from different sources. Nat Biotechnol 20(10), 991-997. doi: 10.1038/nbt1002-991.

Banerjee, S., Vishwanath, P., Cui, J., Kelleher, D.J., Gilmore, R., Robbins, P.W., et al. (2007). The evolution of N-glycan-dependent endoplasmic reticulum quality control factors for glycoprotein folding and degradation. Proc Natl Acad Sci U S A 104(28), 11676-11681. doi: 10.1073/pnas.0704862104.

Benson, D.A., Cavanaugh, M., Clark, K., Karsch-Mizrachi, I., Lipman, D.J., Ostell, J., et al. (2013). GenBank. Nucleic Acids Res 41(Database issue), D36-42. doi: 10.1093/nar/gks1195.

Berridge, M.J. (2012). Calcium signalling remodelling and disease. Biochem Soc Trans 40(2), 297-309. doi: 10.1042/BST20110766.

Bosak, T., Knoll, A.H., and Petroff, A.P. (2013). The Meaning of Stromatolites. Annual Review of Earth and Planetary Sciences 41(1), 21-44. doi: 10.1146/annurev-earth-042711-105327.

Brownlee, C., Wheeler, G.L., and Taylor, A.R. (2015). Coccolithophore biomineralization: New questions, new answers. Semin Cell Dev Biol 46, 11-16. doi: 10.1016/j.semcdb.2015.10.027.

Burroughs, A.M., Kaur, G., Zhang, D., and Aravind, L. (2017). Novel clades of the HU/IHF superfamily point to unexpected roles in the eukaryotic centrosome, chromosome partitioning, and biologic conflicts. Cell Cycle 16(11), 1093-1103. doi: 10.1080/15384101.2017.1315494.

Cagalinec, M., Liiv, M., Hodurova, Z., Hickey, M.A., Vaarmann, A., Mandel, M., et al. (2016). Role of Mitochondrial Dynamics in Neuronal Development: Mechanism for Wolfram Syndrome. PLoS Biol 14(7), e1002511. doi: 10.1371/journal.pbio.1002511.

Cai, X., Wang, X., Patel, S., and Clapham, D.E. (2015). Insights into the early evolution of animal calcium signaling machinery: a unicellular point of view. Cell Calcium 57(3), 166-173. doi: 10.1016/j.ceca.2014.11.007.

Chang, N.C., Nguyen, M., Germain, M., and Shore, G.C. (2010). Antagonism of Beclin 1-dependent autophagy by BCL-2 at the endoplasmic reticulum requires NAF-1. EMBO J 29(3), 606-618. doi: 10.1038/emboj.2009.369.

Chang, N.C., Nguyen, M., and Shore, G.C. (2012). BCL2-CISD2: An ER complex at the nexus of autophagy and calcium homeostasis? Autophagy 8(5), 856-857. doi: 10.4161/auto.20054.

Clapham, D.E. (2007). Calcium signaling. Cell 131(6), 1047-1058. doi: 10.1016/j.cell.2007.11.028.

Dantas, T.J., Daly, O.M., and Morrison, C.G. (2012). Such small hands: the roles of centrins/caltractins in the centriole and in genome maintenance. Cell Mol Life Sci 69(18), 2979-2997. doi: 10.1007/s00018-012-0961-1.

Delwiche, C.F., and Cooper, E.D. (2015). The Evolutionary Origin of a Terrestrial Flora. Curr Biol 25(19), R899-910. doi: 10.1016/j.cub.2015.08.029.

Denessiouk, K., Permyakov, S., Denesyuk, A., Permyakov, E., and Johnson, M.S. (2014). Two structural motifs within canonical EF-hand calcium-binding domains identify five different classes of calcium buffers and sensors. PLoS One 9(10), e109287. doi: 10.1371/journal.pone.0109287.

Dominguez, D.C., Guragain, M., and Patrauchan, M. (2015). Calcium binding proteins and calcium signaling in prokaryotes. Cell Calcium 57(3), 151-165. doi: 10.1016/j.ceca.2014.12.006.

Drozdetskiy, A., Cole, C., Procter, J., and Barton, G.J. (2015). JPred4: a protein secondary structure prediction server. Nucleic Acids Res 43(W1), W389-394. doi: 10.1093/nar/gkv332.

Edel, K.H., Marchadier, E., Brownlee, C., Kudla, J., and Hetherington, A.M. (2017). The Evolution of Calcium-Based Signalling in Plants. Curr Biol 27(13), R667-R679. doi: 10.1016/j.cub.2017.05.020.

Edgar, R.C. (2004). MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res 32(5), 1792-1797. doi: 10.1093/nar/gkh340.

Flynn, R.L., and Zou, L. (2010). Oligonucleotide/oligosaccharide-binding fold proteins: a growing family of genome guardians. Crit Rev Biochem Mol Biol 45(4), 266-275. doi: 10.3109/10409238.2010.488216.

Fruchterman, T.M.J., and Reingold, E.M. (1991). Graph drawing by force-directed placement. Software: Practice and Experience 21(11), 1129-1164. doi: 10.1002/spe.4380211102.

Gertz, E.M., Yu, Y.K., Agarwala, R., Schaffer, A.A., and Altschul, S.F. (2006). Composition-based statistics and translated nucleotide searches: improving the TBLASTN module of BLAST. BMC Biol 4, 41. doi: 10.1186/1741-7007-4-41.

Gifford, J.L., Walsh, M.P., and Vogel, H.J. (2007). Structures and metal-ion-binding properties of the Ca2+-binding helix-loop-helix EF-hand motifs. Biochem J 405(2), 199-221. doi: 10.1042/BJ20070255.

Guardino, K.M., Sheftic, S.R., Slattery, R.E., and Alexandrescu, A.T. (2009). Relative stabilities of conserved and non-conserved structures in the OB-fold superfamily. Int J Mol Sci 10(5), 2412-2430. doi: 10.3390/ijms10052412.

Hildebrand, M.S., Sorensen, J.L., Jensen, M., Kimberling, W.J., and Smith, R.J. (2008). Autoimmune disease in a DFNA6/14/38 family carrying a novel missense mutation in WFS1. Am J Med Genet A 146A(17), 2258-2265. doi: 10.1002/ajmg.a.32449.

Hofmann, S., Philbrook, C., Gerbitz, K.D., and Bauer, M.F. (2003). Wolfram syndrome: structural and functional analyses of mutant and wild-type wolframin, the WFS1 gene product. Hum Mol Genet 12(16), 2003-2012.

Inoue, H., Tanizawa, Y., Wasson, J., Behn, P., Kalidas, K., Bernal-Mizrachi, E., et al. (1998). A gene encoding a transmembrane protein is mutated in patients with diabetes mellitus and optic atrophy (Wolfram syndrome). Nat Genet 20(2), 143-148. doi: 10.1038/2441.

Jekely, G. (2007). Origin of eukaryotic endomembranes: a critical evaluation of different model scenarios. Adv Exp Med Biol 607, 38-51. doi: 10.1007/978-0-387-74021-8_3.

Jekely, G. (2008). Origin of the nucleus and Ran-dependent transport to safeguard ribosome biogenesis in a chimeric cell. Biol Direct 3, 31. doi: 10.1186/1745-6150-3-31.

Jeong, H., Sim, H.J., Song, E.K., Lee, H., Ha, S.C., Jun, Y., et al. (2016). Crystal structure of SEL1L: Insight into the roles of SLR motifs in ERAD pathway. Sci Rep 6, 20261. doi: 10.1038/srep20261.

Karpenahalli, M.R., Lupas, A.N., and Soding, J. (2007). TPRpred: a tool for prediction of TPR-, PPR- and SEL1-like repeats from protein sequences. BMC Bioinformatics 8, 2. doi: 10.1186/1471-2105-8-2.

Kaufman, L., and Rousseeuw, P.J. (1990). Finding Groups in Data: An Introduction to Cluster Analysis. New York: John Wiley & Sons.

Kawasaki, H., Nakayama, S., and Kretsinger, R.H. (1998). Classification and evolution of EF-hand proteins. Biometals 11(4), 277-295.

Kim, H.S., Czymmek, K.J., Patel, A., Modla, S., Nohe, A., Duncan, R., et al. (2012). Expression of the Cameleon calcium biosensor in fungi reveals distinct Ca(2+) signatures associated with polarized growth, development, and pathogenesis. Fungal Genet Biol 49(8), 589-601. doi: 10.1016/j.fgb.2012.05.011.

Klimecka, M., and Muszynska, G. (2007). Structure and functions of plant calcium-dependent protein kinases. Acta Biochim Pol 54(2), 219-233.

Kozlov, G., Bastos-Aristizabal, S., Maattanen, P., Rosenauer, A., Zheng, F., Killikelly, A., et al. (2010). Structural basis of cyclophilin B binding by the calnexin/calreticulin P-domain. J Biol Chem 285(46), 35551-35557. doi: 10.1074/jbc.M110.160101.

Krishna, S.S., Aravind, L., Bakolitsa, C., Caruthers, J., Carlton, D., Miller, M.D., et al. (2010). The structure of SSO2064, the first representative of Pfam family PF01796, reveals a novel two-domain zinc-ribbon OB-fold architecture with a potential acyl-CoA-binding role. Acta Crystallogr Sect F Struct Biol Cryst Commun 66(Pt 10), 1160-1166. doi: 10.1107/s1744309110002514.

Lance, G.N., and Williams, W.T. (1966). Computer Programs for Hierarchical Polythetic Classification (“Similarity Analyses”). The Computer Journal 9(1), 60-64. doi: 10.1093/comjnl/9.1.60.

Lanner, J.T., Georgiou, D.K., Joshi, A.D., and Hamilton, S.L. (2010). Ryanodine receptors: structure, expression, molecular details, and function in calcium release. Cold Spring Harb Perspect Biol 2(11), a003996. doi: 10.1101/cshperspect.a003996.

Leipe, D.D., Wolf, Y.I., Koonin, E.V., and Aravind, L. (2002). Classification and evolution of P-loop GTPases and related ATPases. J Mol Biol 317(1), 41-72. doi: 10.1006/jmbi.2001.5378.

Lespinet, O., Wolf, Y.I., Koonin, E.V., and Aravind, L. (2002). The role of lineage-specific gene family expansion in the evolution of eukaryotes. Genome Res 12(7), 1048-1059. doi: 10.1101/gr.174302.

Lu, S., Kanekura, K., Hara, T., Mahadevan, J., Spears, L.D., Oslowski, C.M., et al. (2014). A calcium-dependent protease as a potential therapeutic target for Wolfram syndrome. Proc Natl Acad Sci U S A 111(49), E5292-5301. doi: 10.1073/pnas.1421055111.

Makioka, A., Kumagai, M., Kobayashi, S., and Takeuchi, T. (2002). Possible role of calcium ions, calcium channels and calmodulin in excystation and metacystic development of Entamoeba invadens. Parasitol Res 88(9), 837-843. doi: 10.1007/s00436-002-0676-6.

Mans, B.J., Anantharaman, V., Aravind, L., and Koonin, E.V. (2004). Comparative genomics, evolution and origins of the nuclear envelope and nuclear pore complex. Cell Cycle 3(12), 1612-1637. doi: 10.4161/cc.3.12.1345.

Mittl, P.R., and Schneider-Brachert, W. (2007). Sel1-like repeat proteins in signal transduction. Cell Signal 19(1), 20-31. doi: 10.1016/j.cellsig.2006.05.034.

Moreno, S.N., and Docampo, R. (2003). Calcium regulation in protozoan parasites. Curr Opin Microbiol 6(4), 359-364.

Neuwald, A.F., and Altschul, S.F. (2016). Bayesian Top-Down Protein Sequence Alignment with Inferred Position-Specific Gap Penalties. PLoS Comput Biol 12(5), e1004936. doi: 10.1371/journal.pcbi.1004936.

Nolan, D.P., Reverlard, P., and Pays, E. (1994). Overexpression and characterization of a gene for a Ca(2+)-ATPase of the endoplasmic reticulum in Trypanosoma brucei. J Biol Chem 269(42), 26045-26051.

Ogris, C., Guala, D., and Sonnhammer, E.L.L. (2018). FunCoup 4: new species, data, and visualization. Nucleic Acids Res 46(D1), D601-D607. doi: 10.1093/nar/gkx1138.

Ong, H.L., de Souza, L.B., and Ambudkar, I.S. (2016). Role of TRPC Channels in Store-Operated Calcium Entry. Adv Exp Med Biol 898, 87-109. doi: 10.1007/978-3-319-26974-0_5.

Osman, A.A., Saito, M., Makepeace, C., Permutt, M.A., Schlesinger, P., and Mueckler, M. (2003). Wolframin expression induces novel ion channel activity in endoplasmic reticulum membranes and increases intracellular calcium. J Biol Chem 278(52), 52755-52762. doi: 10.1074/jbc.M310331200.

Peng, J., Ding, J., Tan, C., Baggenstoss, B., Zhang, Z., Lapolla, S.M., et al. (2009). Oligomerization of membrane-bound Bcl-2 is involved in its pore formation induced by tBid. Apoptosis 14(10), 1145-1153. doi: 10.1007/s10495-009-0389-8.

Perez-Gordones, M.C., Serrano, M.L., Rojas, H., Martinez, J.C., Uzcanga, G., and Mendoza, M. (2015). Presence of a thapsigargin-sensitive calcium pump in Trypanosoma evansi: Immunological, physiological, molecular and structural evidences. Exp Parasitol 159, 107-117. doi: 10.1016/j.exppara.2015.08.017.

Plattner, H. (2015). Molecular aspects of calcium signalling at the crossroads of unikont and bikont eukaryote evolution--the ciliated protozoan Paramecium in focus. Cell Calcium 57(3), 174-185. doi: 10.1016/j.ceca.2014.12.002.

Plattner, H. (2017). Signalling in ciliates: long- and short-range signals and molecular determinants for cellular dynamics. Biol Rev Camb Philos Soc 92(1), 60-107. doi: 10.1111/brv.12218.

Plattner, H., and Verkhratsky, A. (2013). Ca2+ signalling early in evolution--all but primitive. J Cell Sci 126(Pt 10), 2141-2150. doi: 10.1242/jcs.127449.

Plattner, H., and Verkhratsky, A. (2015). The ancient roots of calcium signalling evolutionary tree. Cell Calcium 57(3), 123-132. doi: 10.1016/j.ceca.2014.12.004.

Ponting, C.P., Aravind, L., Schultz, J., Bork, P., and Koonin, E.V. (1999). Eukaryotic signalling domain homologues in archaea and bacteria. Ancient ancestry and horizontal gene transfer. J Mol Biol 289(4), 729-745. doi: 10.1006/jmbi.1999.2827.

Prakriya, M., and Lewis, R.S. (2015). Store-Operated Calcium Channels. Physiol Rev 95(4), 1383-1436. doi: 10.1152/physrev.00020.2014.

Price, M.N., Dehal, P.S., and Arkin, A.P. (2010). FastTree 2--approximately maximum-likelihood trees for large alignments. PLoS One 5(3), e9490. doi: 10.1371/journal.pone.0009490.

Prole, D.L., and Taylor, C.W. (2011). Identification of intracellular and plasma membrane calcium channel homologues in pathogenic parasites. PLoS One 6(10), e26218. doi: 10.1371/journal.pone.0026218.

Qian, X., Qin, L., Xing, G., and Cao, X. (2015). Phenotype Prediction of Pathogenic Nonsynonymous Single Nucleotide Polymorphisms in WFS1. Sci Rep 5, 14731. doi: 10.1038/srep14731.

Reiner, D.S., Hetsko, M.L., Meszaros, J.G., Sun, C.H., Morrison, H.G., Brunton, L.L., et al. (2003). Calcium signaling in excystation of the early diverging eukaryote, Giardia lamblia. J Biol Chem 278(4), 2533-2540. doi: 10.1074/jbc.M208033200.

Rigoli, L., Lombardo, F., and Di Bella, C. (2011). Wolfram syndrome and WFS1 gene. Clin Genet 79(2), 103-117. doi: 10.1111/j.1399-0004.2010.01522.x.

Rodrigues, F.A., Costa Lda, F., and Barbieri, A.L. (2011). Resilience of protein-protein interaction networks as determined by their large-scale topological features. Mol Biosyst 7(4), 1263-1269. doi: 10.1039/c0mb00256a.

Rouzier, C., Moore, D., Delorme, C., Lacas-Gervais, S., Ait-El-Mkadem, S., Fragaki, K., et al. (2017). A novel CISD2 mutation associated with a classical Wolfram syndrome phenotype alters Ca2+ homeostasis and ER-mitochondria interactions. Hum Mol Genet 26(9), 1599-1611. doi: 10.1093/hmg/ddx060.

Silverio, A.L., and Saier, M.H., Jr. (2011). Bioinformatic characterization of the trimeric intracellular cation-specific channel protein family. J Membr Biol 241(2), 77-101. doi: 10.1007/s00232-011-9364-8.

Smaili, S.S., Pereira, G.J., Costa, M.M., Rocha, K.K., Rodrigues, L., do Carmo, L.G., et al. (2013). The role of calcium stores in apoptosis and autophagy. Curr Mol Med 13(2), 252-265.

Soding, J. (2005). Protein homology detection by HMM-HMM comparison. Bioinformatics 21(7), 951-960. doi: 10.1093/bioinformatics/bti125.

Strom, T.M., Hortnagel, K., Hofmann, S., Gekeler, F., Scharfe, C., Rabl, W., et al. (1998). Diabetes insipidus, diabetes mellitus, optic atrophy and deafness (DIDMOAD) caused by mutations in a novel gene (wolframin) coding for a predicted transmembrane protein. Hum Mol Genet 7(13), 2021-2028.

Unal, C.M., and Steinert, M. (2014). Microbial peptidyl-prolyl cis/trans isomerases (PPIases): 905 virulence factors and potential alternative drug targets. Microbiol Mol Biol Rev 78(3), 544-906 571. doi: 10.1128/mmbr.00015-14. 907

Urano, F. (2016). Wolfram Syndrome: Diagnosis, Management, and Treatment. Curr Diab Rep 908 16(1), 6. doi: 10.1007/s11892-015-0702-6. 909

Ushioda, R., Miyamoto, A., Inoue, M., Watanabe, S., Okumura, M., Maegawa, K.I., et al. (2016). 910 Redox-assisted regulation of Ca2+ homeostasis in the endoplasmic reticulum by disulfide 911 reductase ERdj5. Proc Natl Acad Sci U S A 113(41), E6055-E6063. doi: 912 10.1073/pnas.1605818113. 913

Venancio, T.M., Balaji, S., Iyer, L.M., and Aravind, L. (2009). Reconstructing the ubiquitin network: 914 cross-talk with other systems and identification of novel functions. Genome Biol 10(3), R33. 915 doi: 10.1186/gb-2009-10-3-r33. 916

Verkhratsky, A., and Parpura, V. (2014). Calcium signalling and calcium channels: evolution and 917 general principles. Eur J Pharmacol 739, 1-3. doi: 10.1016/j.ejphar.2013.11.013. 918

Verma, R., Reichermeier, K.M., Burroughs, A.M., Oania, R.S., Reitsma, J.M., Aravind, L., et al. (2018). Vms1 and ANKZF1 peptidyl-tRNA hydrolases release nascent chains from stalled ribosomes. Nature 557(7705), 446-451. doi: 10.1038/s41586-018-0022-5.

Wang, Y.Y., Zhao, R., and Zhe, H. (2015). The emerging role of CaMKII in cancer. Oncotarget 6(14), 11725-11734. doi: 10.18632/oncotarget.3955.

Watson, E., Matousek, W.M., Irimies, E.L., and Alexandrescu, A.T. (2007). Partially folded states of staphylococcal nuclease highlight the conserved structural hierarchy of OB-fold proteins. Biochemistry 46(33), 9484-9494. doi: 10.1021/bi700532j.

Woo, J.S., Srikanth, S., and Gwack, Y. (2018). "Modulation of Orai1 and STIM1 by Cellular Factors," in Calcium Entry Channels in Non-Excitable Cells, eds. J.A. Kozak & J.W. Putney, Jr. (Boca Raton (FL): CRC Press/Taylor & Francis), 73-92.

Yamada, T., Ishihara, H., Tamura, A., Takahashi, R., Yamaguchi, S., Takei, D., et al. (2006). WFS1-deficiency increases endoplasmic reticulum stress, impairs cell cycle progression and triggers the apoptotic pathway specifically in pancreatic beta-cells. Hum Mol Genet 15(10), 1600-1609. doi: 10.1093/hmg/ddl081.

Yin, X., Ziegler, A., Kelm, K., Hoffmann, R., Watermeyer, P., Alexa, P., et al. (2018). Formation and mosaicity of coccolith segment calcite of the marine algae Emiliania huxleyi. J Phycol 54(1), 85-104. doi: 10.1111/jpy.12604.

Yurimoto, S., Hatano, N., Tsuchiya, M., Kato, K., Fujimoto, T., Masaki, T., et al. (2009). Identification and characterization of wolframin, the product of the wolfram syndrome gene (WFS1), as a novel calmodulin-binding protein. Biochemistry 48(18), 3946-3955. doi: 10.1021/bi900260y.

Zaremba-Niedzwiedzka, K., Caceres, E.F., Saw, J.H., Backstrom, D., Juzokaite, L., Vancaester, E., et al. (2017). Asgard archaea illuminate the origin of eukaryotic cellular complexity. Nature 541(7637), 353-358. doi: 10.1038/nature21031.

Zatyka, M., Da Silva Xavier, G., Bellomo, E.A., Leadbeater, W., Astuti, D., Smith, J., et al. (2015). Sarco(endo)plasmic reticulum ATPase is a molecular partner of Wolfram syndrome 1 protein, which negatively regulates its expression. Hum Mol Genet 24(3), 814-827. doi: 10.1093/hmg/ddu499.

Zatyka, M., Ricketts, C., da Silva Xavier, G., Minton, J., Fenton, S., Hofmann-Thiel, S., et al. (2008). Sodium-potassium ATPase 1 subunit is a molecular partner of Wolframin, an endoplasmic reticulum protein involved in ER stress. Hum Mol Genet 17(2), 190-200. doi: 10.1093/hmg/ddm296.


Zhou, X., Lin, P., Yamazaki, D., Park, K.H., Komazaki, S., Chen, S.R., et al. (2014). Trimeric intracellular cation channels and sarcoplasmic/endoplasmic reticulum calcium homeostasis. Circ Res 114(4), 706-716. doi: 10.1161/circresaha.114.301816.

Zhou, Y., Yang, W., Kirberger, M., Lee, H.W., Ayalasomayajula, G., and Yang, J.J. (2006). Prediction of EF-hand calcium-binding proteins and analysis of bacterial EF-hand proteins. Proteins 65(3), 643-655. doi: 10.1002/prot.21139.

Zucchi, R., and Ronca-Testoni, S. (1997). The sarcoplasmic reticulum Ca2+ channel/ryanodine receptor: modulation by endogenous effectors, drugs and disease states. Pharmacol Rev 49(1), 1-51.


[Figure, panels A-E (image not reproduced; the caption is not recoverable from the extracted text). Labels recovered from the panels:
Phylogenetic trees (scale bars 0.6 or 0.8): ORAI (alveolate, oomycete, metazoan, and Viridiplantae ORAI; Emiliania ORAI expansion; metazoan Jiraiya); dynamins (closest bacterial dynamins, other bacterial dynamins; metazoan, fungal, Viridiplantae, and stramenopile EHD; metazoan and kinetoplastid sarcalumenin); calcineurin-like phosphatases (archaeal calcineurin-like phosphatases; BSU-like phosphatases; protein phosphatases 1, 2, 3 (calcineurin A), 4, 5, 6, and 7; PP4/6-like; PP5/7-like; parabasalid and diplomonad expansions; MRE11/rad32/sbcD-like phosphatases); CAMK (metazoan CAMK1-like and CAMK2-like, plant CDPK with land plant expansion, fungal CMK1-like, stramenopile CAMK, CHEK2, alveolate expansion).
Domain architectures of EF-hand proteins: AJC61110 (Streptomyces sp. 769), NP_001171708 and NP_008819 (Homo sapiens), ABW28148 (Acaryochloris marina MBIC11017), AKE65596 (Microcystis aeruginosa NIES-2549), AFZ12720 (Crinalium epipsammum PCC 9333), AHF92076 (Opitutaceae bacterium TAV5), XP_013754210 (Thecamonas trahens ATCC 50062), OYW31404 (Chthoniobacter sp. 12-60-6), XP_005787975 (Emiliania huxleyi CCMP1516), OYW74718 (Verrucomicrobia bacterium 12-59-8), EJK50743 (Thalassiosira oceanica), AKJ65436 (Kiritimatiella glycovorans), AHF04338 (Marichromatium purpuratum 984), ANZ34757 (Lentzea guizhouensis). Domains shown alongside EF-hand: heme oxygenase, NAD- and FAD-dependent oxidoreductase, ferric reductase TM region, Tic110-like, Fer4_5, NO synthase, Fer2_BFD, cNMP-binding, ion channel, sulfatase, DUF4994, thioredoxin, CytC, glyoxalase, protein kinase, AB hydrolase, dynamin, and TM segments.
Metallosphaera sedula proteins AKV76728, AKV77601, and AKV76729, labeled ZNR+linker+FHA, calcineurin, and VWA+β-barrel+α-helical.
Color key: universal distribution, archaeal, pan-eukaryotic, other eukaryotic clade, bacterial, metazoan.]


[Figure, panels A-E (image not reproduced; the caption is not recoverable from the extracted text). Multiple sequence alignment blocks of wolframin orthologs, each with a secondary-structure line and a 90% consensus line. Sequences shown: Hsapiens NP_005996.2, Mmusculus NP_035846.1, Ggallus XP_420803.2, Drerio XP_695252.5, Dmelanogaster NP_651079.2, Lgigantea XP_009046103.1, Obimaculoides XP_014767880.1, Hrobusta XP_009012051.1, Epallida KXJ15729.1, Lpolyphemus XP_013784410.1, Dpulex EFX90302.1, Skowalevskii XP_002741405.1, Tadhaerens XP_002114744.1, Aqueenslandica XP_011409218.2. Annotated features: EF motif 1, EF motif 2, the core OB-fold with elements S1-S5 and H1, and the C-rich domain; asterisks mark selected alignment columns. A domain diagram of NP_005996 from Homo sapiens is labeled with Sel1 repeats, EF-hand, TM region, OB-fold, and C-rich domain, with N and C termini indicated.]
