Modeling the evolutionary loss of erythroid genes by Antarctic icefishes: analysis of the hemogen gene using transgenic and mutant zebrafish

by Michael J. Peters

B.S. in Biology, University of New Hampshire

A dissertation submitted to

The Faculty of the College of Science of Northeastern University

in partial fulfillment of the requirements for the degree of Doctor of Philosophy

June 4, 2018

Dissertation directed by

H. William Detrich, III Professor of Marine Molecular Biology and Biochemistry

Dedication

For my Oma Oswald, who started this journey with me.

Acknowledgments

I would like to thank my advisor, Dr. H. William Detrich, III, for encouraging me to be

innovative and to pursue cutting-edge research. I thank the members of my committee,

Drs. Erin Cram, Rebeca Rosengaus, Steven Vollmer, and Leonard Zon for their many

helpful suggestions. I enjoyed working alongside Sandra Parker and Carmen

Elenberger and appreciate their support. I also enjoyed working with many students

including Caroline Benavides, Carolyn Dubnik, Carmen Elenberger, Laura Goetz, Urjeet

Khanwalkar, Ben Moran, Alessia Santilli, Eileen Sheehan, Margaret Streeter, Kathleen

Shusdock, and Sierra Smith. I especially thank Jonah Levin, who joined the lab as a

high school student and has since continued working with me. I thank Corey Allard and

Drs. Donald Yergeau and Jeffrey Grim for collecting samples that I used in my studies. I

owe thanks to the staff of the Marine Science Center for their support, including Roberto

Valdez, Sonya Simpson, Heather Sears, David Dawson, and Ryan Hill. I thank Drs.

Joseph Ayers and Justin Ries for use of their facilities and Drs. Isaac Westfield and

Ryan Myers for help with scanning electron microscopy. I thank Drs. Camille Berthelot,

Melody Clark, James Monaghan, and Leonard Zon for valuable discussions and

providing important datasets and materials. I thank Dr. Jill de Jong and her lab for

inspiring my interest in zebrafish research. I was pushed to my best by my fellow

graduate students at Northeastern University. I am especially grateful to my Mom and

Dad for their unending support and for reading every word I write. I am grateful to my

siblings Sarah, Katie, Ryan, and Zachary for inspiring me with their passions and

positive energy.

Abstract of Dissertation

The Antarctic icefishes (Channichthyidae) are the only vertebrate taxon whose

species do not produce red blood cells, thereby providing a natural mutant model to

study the regulators of blood development and disease. To identify novel regulators of

erythropoiesis, I compared RNA-Seq transcriptomes from red- and white-blooded

notothenioids. I find that both icefishes and their sister taxon, the dragonfishes

(Bathydraconidae), model beta-spectrin mutated spherocytic anemia. Icefishes appear

to have evolved morph-biased changes in expression of hematopoietic regulatory

genes, including down-regulation of the histone acetyltransferase p300 and

overexpression of histone deacetylase 1b. In icefishes, I characterize a frameshift

mutation that truncates the P300-binding domain of Hemogen, an important erythroid

transcription factor. Tol2 and CRISPR/Cas9-generated transgenic zebrafish lines reveal

that hemogen is expressed in hematopoietic, renal, neural, and reproductive tissues. I

find that two conserved non-coding elements differentially contribute to hemogen

expression in primitive and definitive waves of hematopoiesis. CRISPR-generated

mutant zebrafish lines, which replicate the C-terminal mutation in icefish hemogen,

show severe anemia and growth defects. Furthermore, I show that the function of

zebrafish Hemogen is dependent on acidic residues within the TAD. Thus, Antarctic

icefishes evolved an intricate system for repression of erythropoiesis that is caused in

part by the loss of Hemogen function.

Table of Contents

Dedication

Acknowledgments

Abstract of Dissertation

Table of Contents

List of Figures

List of Tables

List of Symbols

List of Genes

Chapter 1: Morph-biased gene expression and sequence divergence typifies disease-like traits of Antarctic icefishes

Chapter 2: Divergent Hemogen genes of teleosts and mammals share conserved roles in erythropoiesis: Analysis using transgenic and mutant zebrafish

Chapter 3: Erythroid gene discovery using the erythrocyte-null Antarctic icefishes

Conclusion

References

List of Figures

Introduction

Figure 1 Erythropoiesis in the zebrafish

Figure 2 Comparison of red- and white-blooded notothenioids

Chapter 1

Figure 1 Peripheral blood smears from Antarctic notothenioid fishes

Figure 2 Expression profile heatmap of differentially expressed genes

Figure 3 Gene ontology enrichment for differentially expressed genes

Figure 4 Association networks for DE genes in the icefish head kidney

Figure 5 Three tissue-specific clusters of hematopoietic genes are differentially expressed in the head kidneys of Ps. georgianus and P. charcoti

Figure 6 Differential expression of hematopoietic regulators

Figure 7 Deleterious substitutions in erythroid genes from icefishes

Figure 8 The dragonfish P. charcoti is a natural mutant model for beta-spectrin mutated spherocytic anemia

Figure 9 Functional mutations occur in the interaction domains of Hemogen, Gata1, and P300

Figure 10 Whole protein acetylation in the head kidneys of red- and white-blooded notothenioids

Chapter 2

Figure 1 Zebrafish si:dkey-25o16.2 and human hemogen are orthologous and encode related proteins that differ in size

Figure 2 hemogen expression in zebrafish embryos

Figure 3 Alternative promoters drive hemogen expression in hematopoietic and nonhematopoietic tissues in zebrafish

Figure 4 Conserved elements in the zebrafish hemogen promoter are predicted targets for transcription factors

Figure 5 Gata1 binds distal and proximal promoter elements to regulate hemogen expression in zebrafish

Figure 6 Promoter elements have distinct roles in driving hematopoietic, renal, and testicular expression of hemogen in transgenic Tg(hemgn:mCherry) zebrafish

Figure 7 Morpholino targeting of hemogen inhibits erythropoiesis in embryonic zebrafish

Figure 8 CRISPR/Cas9 mutagenesis of the third exon of zebrafish hemogen impairs primitive and definitive erythropoiesis

Chapter 3

Figure 1 The erythroid gene hemogen is mutated in Antarctic icefishes

Figure 2 hemogen is expressed in hematopoietic, renal, and neural tissues in red-blooded notothenioids

Figure 3 A truncated isoform of hemogen is highly expressed in icefishes and is translated

Figure 4 Overexpression of icefish hemogen in zebrafish blocks primitive erythropoiesis

Figure 5 A novel MABP-containing protein (mabpcp) is an RBC-specific gene that was lost in icefishes

Figure 6 Modeling a truncated cd33-related Siglec (cd33rSig) from icefishes in mutant zebrafish

List of Tables

Chapter 1

Table 1 Hematopoietic genes are differentially expressed in the icefish head kidney

Table 2 GO enrichment of genes under different selective pressures in two red- and two white-blooded notothenioids

Table 3 GO enrichment for genes with deleterious substitutions found in two white-blooded icefishes but not in two red-blooded notothenioids

Table 4 Table of primers

Table 5 Icefishes have predicted deleterious substitutions in targets of human diseases

Chapter 2

Table S1 Sequences of primer and oligonucleotides used in experiments

Chapter 3

Table S1 Primer Sequences

Table S2 Oligos for CRISPR gRNA template

List of Symbols

AGM Aorta gonad mesonephros

ALL Acute lymphocytic leukemia

AML Acute myeloid leukemia

ATP Adenosine triphosphate

B-ALL B-cell acute lymphoblastic leukemia

BWS Beckwith-Wiedemann syndrome

bZIP Basic leucine zipper domain

CBF-AML Core binding factor acute myeloid leukemia

CC Coiled coil domain

CHT Caudal hematopoietic tissue

Ce Corpus cerebelli

CL-XPosure Clear-blue X-ray film

CLL Chronic lymphocytic leukemia

CML Chronic myeloid leukemia

CMP Common myeloid progenitor

COFS Cerebro oculo facio skeletal syndrome

CRISPR Clustered Regularly Interspaced Short Palindromic Repeats

CT domain C-terminal cystine knot-like domain

CT-ZF C-terminal zinc finger

CV Caudal vein

Cyto Cytoplasmic domain

C2H2 Cys2-His2 zinc finger

DA Dorsal aorta

DBA Diamond-Blackfan anemia

DED1 Death effector domain

DLBCL Diffuse large B-cell lymphoma

DS-AMKL Acute megakaryoblastic leukemia in Down syndrome

ECL Enhanced chemiluminescence

EGFP Enhanced green fluorescent protein

G Glomerulus

HCP Hereditary Coproporphyria

HDR Homology directed repair

HK Head kidney

HNSCC Head and neck squamous cell carcinoma

HRP Horseradish peroxidase

HS Hereditary spherocytic anemia

ICM Intermediate cell mass

Ig Immunoglobulin

IgG Immunoglobulin G

ITIM Immunoreceptor tyrosine-based inhibition motif

LDS Lithium dodecyl sulfate

MAE Myoclonic astatic epilepsy

MABP MVB12-associated beta prism domain

MHB Midbrain-hindbrain-boundary

MO Medulla oblongata

MPN Myeloproliferative neoplasms

NH2-terminus Amino-terminus

NHEJ Non-homologous end joining

NLS Nuclear localization signal

PBI Peripheral blood island

PD Pronephric ducts

PHD Plant homeodomain

ProE Proerythroblast

PTK Protein tyrosine kinase

PVDF Polyvinylidene fluoride

RING-finger Really interesting new gene finger domain

SCN Severe congenital neutropenia

SDS-PAGE Sodium dodecyl sulfate-polyacrylamide gel electrophoresis

Se Sertoli cells

Siglec Sialic acid-binding immunoglobulin-type lectin

SNF Sucrose non-fermentable

ST Seminiferous tubules

TAD Transactivation domain

TALEN Transcription activator-like effector nuclease

T-ALL T-cell acute lymphoblastic leukemia

T-CLL T-cell chronic lymphoblastic leukemia

TBST Tris-buffered saline and Tween-20

TFBS Transcription factor binding site

TK Trunk kidney

Tr Transmembrane domain

UDP Uridine diphosphate

WISH Whole-mount in situ hybridization

Zn Zinc

List of Genes

Add1 Adducin-1

Add2 Adducin-2

AKT2 RAC-beta serine/threonine protein kinase 2

Ank1 Ankyrin-1

Anxa2 Annexin A2

Asxl2 Additional sex combs like 2, transcriptional regulator

Band3/Slc4a1 Band 3 anion transport protein

Bcl11a B-cell lymphoma/leukemia 11a

Bcl2l1/BclxL BCL2-like 1 gene/B-cell lymphoma-extra large 1 gene

Blvrb Biliverdin reductase B

Brca1 Breast cancer 1

Card11 Caspase recruitment domain family member 11

Casp8 Caspase 8

Cdkn1c Cyclin dependent kinase inhibitor 1c

Cd33rSig CD33 related siglec

Chd2 Chromodomain helicase DNA binding protein 2

Cox15 Cytochrome C oxidase assembly homolog

Cpox Coproporphyrinogen oxidase

Edag Erythroid differentiation associated gene

Eml1 Echinoderm microtubule associated protein like 1

Epor Erythropoietin receptor

Ercc1 Excision repair 1

Ero1lα Endoplasmic reticulum oxidoreductase 1 alpha

Fam161al Family with sequence similarity 161, member A-like

Fech Ferrochelatase

Fes Feline sarcoma oncogene

Fgl1 Fibrinogen-like protein 1

Flt1 Fms related tyrosine kinase 1

Foxo1 Forkhead box protein O1

Foxp1 Forkhead box protein P1

Gata1 GATA-binding factor 1

Gapdh Glyceraldehyde 3-phosphate dehydrogenase

Gfi1 Growth factor independent 1 transcriptional repressor

Gfi1b Growth factor independent 1 transcriptional repressor B

G6PD Glucose-6-phosphate dehydrogenase

Hbba Hemoglobin, beta adult major chain

Hbbe1 Hemoglobin beta embryonic 1.1

Hdac1b Histone deacetylase 1b

Hemgn Hemogen

Hpx Hemopexin

Ikzf1 Ikaros family zinc finger 1

Ikzf2 Ikaros family zinc finger 2

Il7r Interleukin-7 receptor

Klf1 Krüppel-like factor 1

Ldb1 LIM domain-binding protein 1

Lmo2 LIM domain only 2

Mabpcp1/Dkey:30j10.5 MABP-containing protein 1

Mate1 Multidrug and toxin extrusion protein

Nfe2 Nuclear factor, erythroid derived 2

Nfe2l1 Nuclear factor, erythroid derived 2 like 1

Ntrk1 Neurotrophic receptor tyrosine kinase 1

Numa1 Nuclear mitotic apparatus protein 1

Pphln1 Periphilin 1

Pu.1/Spi1 Spi-1 proto-oncogene

P300 Histone acetyltransferase p300

Rela NFκB p65 subunit/RELA proto-oncogene

Runx1 Runt related transcription factor 1

Sgk1 Serum and glucocorticoid-regulated kinase 1

Slc25a39 Solute carrier family 25 member 39

Sptb Spectrin beta, erythrocytic

Stat5b Signal transducer and activator of transcription 5B

Tal1 T-cell acute lymphocytic leukemia protein 1

Tf Transferrin

Tfrc Transferrin receptor C

Tgfβ Transforming growth factor beta 1

Trim16l Tripartite motif containing 16 like

Ugt1a1 UDP glucuronosyltransferase 1 family, polypeptide A1

Zfp64 Zinc finger protein 64

Introduction

1. Ontogeny of hematopoiesis in vertebrates

Hematopoiesis, the production of all blood lineages from pluripotent

hematopoietic stem cells, is a complex developmental process essential to vertebrate

life. Comparison of hematopoiesis in different animal models (e.g. humans, chickens,

mice, zebrafish) has revealed fundamental features and key molecular regulators of

blood development and disease (Detrich, 1999; Paw and Zon, 2000; Zon, 1995).

Several hematopoietic processes and genes originated early in evolution and are

conserved in invertebrates, including both chordates (Pascual-Anaya et al., 2013) and

non-chordates (Evans et al., 2003). In all vertebrates, blood production occurs in two

distinct waves, termed primitive and definitive hematopoiesis (Maximow, 1909).

Primitive hematopoiesis occurs in the yolk sac blood islands in embryos of all

vertebrates except fishes (Zon, 1995). In this first wave, committed progenitors

differentiate into nucleated erythrocytes that supply the early embryo with oxygen as

they complete maturation in circulation (Palis, 2014). Definitive hematopoiesis is defined

as the production of blood lineages from pluripotent hematopoietic stem cells (HSCs),

which originate in the aorta gonad mesonephros (AGM). In humans and other

mammals, HSCs seed and develop in the fetal liver for a short time before colonizing

the bone marrow and thymus (Palis, 2014).

Even though many aspects of blood development are highly conserved across

vertebrate taxa, the evolution of diverse forms of animals was coincident with alterations

to the ontogeny of hematopoiesis. In all vertebrates, primitive hematopoiesis initiates in

lateral plate mesoderm (LPM) from cells called “hemangioblasts,” which are capable of

forming both blood and vasculature (Sabin, 2002). In sharks and in some teleosts,

primitive hematopoiesis is extraembryonic and commences directly on the yolk sac

(Zon, 1995). However, in most teleost fishes, the first wave of hematopoiesis is

intraembryonic (Oellacher, 1872). The zebrafish has provided an excellent model to

study teleost blood development (Fig. 1A). Primitive erythroid progenitors are produced

in the intermediate cell mass (ICM) (Detrich et al., 1995). Subsequently, a committed

population of definitive erythroid-myeloid progenitors (EMPs) is found in the peripheral

blood island (PBI), intermixed with primitive erythrocytes (Bertrand et al., 2007a; Detrich

et al., 1995). Concurrently, primitive myelopoiesis initiates from anterior lateral plate

mesoderm (ALM) (Bennett et al., 2001). Primitive erythrocytes migrate from the ICM

and are pulled into circulation by the heart through the ducts of Cuvier (Detrich et al.,

1995). At 30 hpf, the first definitive HSCs are produced from hemogenic endothelial

cells from the ventral wall of the dorsal aorta in a region called the aorta gonad

mesonephros (AGM) (Thompson et al., 1998). The HSCs migrate via circulation to

colonize a temporary site of definitive hematopoiesis in the caudal hematopoietic tissue

(CHT) and to seed the thymus, the major site of adult T-cell production (Murayama et

al., 2006). From the CHT, HSCs “crawl” along the pronephric ducts to colonize the

pronephric head kidney, the site analogous to human bone marrow (Bertrand et al.,

2008). Here, lymphoid and myeloid cells are produced within the hematopoietic stem

cell niche between renal tubules. The primary juvenile and adult hematopoietic organs

(kidney, spleen, liver, bone marrow) vary between different species of amphibians,

reptiles, and fishes (Zon, 1995).

2. Regulators of erythroid differentiation and function

The path to becoming a red blood cell is determined by signaling molecules,

transcription factors, structural proteins, and other factors. Stages of erythroid

differentiation (progenitors in Fig. 1B) are generally characterized by condensation of

the nucleus, a decrease in cell size, and an accumulation of hemoglobin during terminal

differentiation. Each stage is also defined by a unique gene expression profile. As stem

cells mature and lose their pluripotency, factors involved in self-renewal (e.g. myb) are

down-regulated while factors that are critical for heme synthesis and erythroid

differentiation (e.g. gata1) are up-regulated (Hattangadi et al., 2011). Extrinsic factors

within erythroblast islands play an early role in the activation of erythroid differentiation

and these include cytokines and interactions with receptors on neighboring cells or with

the extracellular matrix. The major activator of erythropoiesis is the hormone

erythropoietin (Epo), which binds to the erythropoietin receptor (EPOR) on BFU-E and

CFU-E progenitors (D'Andrea et al., 1989; Krantz, 1991; Lin et al., 1996) and signals

up-regulation of erythroid genes and anti-apoptotic factors (Dolznig et al., 2002).

Erythroid transcription factors are also expressed and interact within complexes to

regulate cell differentiation. The Gata1 transcription factor, a master regulator of

erythropoiesis, colocalizes with other nuclear proteins (e.g. Scl/Tal1, Ldb1, Lmo2, Klf1)

at promoters and enhancers to activate erythroid gene transcription via long-range

chromatin interactions (Love et al., 2014; Tijssen et al., 2011). Gata1 also functions as a

transcriptional activator or repressor by recruiting the Hdac1-containing NuRD

(Nucleosome Remodeling Deacetylase) complex (Miccio et al., 2010). During terminal

differentiation, these cofactors up-regulate expression of erythroid genes including anti-

apoptotic molecules like Bcl-xL, which protects erythroid cells from BAX-induced cell

death (Dolznig et al., 2002; Rhodes et al., 2005).

3. Gene editing and zebrafish mutant models

The zebrafish has provided an excellent model to study the genetic regulators of

hematopoiesis (Kafina and Paw, 2018). Zebrafish spawn 100-200 translucent eggs

whose development may be easily studied following fertilization. The generation of

transgenic zebrafish using the Tol2 system has allowed researchers to track gene

expression and tissue development using fluorescent reporters (Kawakami, 2016).

Forward mutant screens have used chemical mutagens like N-ethyl-N-nitrosourea

(ENU) or retroviral insertion of DNA to generate zebrafish lines with developmental

abnormalities (Detrich, 1999; Frame, 2017). A series of zebrafish blood mutants were

found to have defective erythropoiesis (e.g. vlad tepes m651, riesling, zinfandel) (Ransom

et al., 1996; Weinstein et al., 1996) and impaired HSC specification (e.g. hi1618 and

hi2335) (Amsterdam et al., 2004). These methods have generally not been site-specific.

The advent of CRISPR technologies (Clustered Regularly Interspaced Short

Palindromic Repeats) provides a highly efficient method for targeted gene disruption

using a small guide RNA (sgRNA) and the Cas9 endonuclease (Ata, 2016; Jinek et al.,

2012). Indeed, this technology permits precise modification of genes of interest to

generate mutant zebrafish lines that model human diseases (Frame, 2017). Further

improvements of these gene editing technologies have enhanced our ability to

manipulate and study the genome (Carroll, 2017; Sertori et al., 2016).

4. Defective erythropoiesis in notothenioid fishes

The only vertebrates that do not produce erythrocytes are the family of Antarctic

icefishes (Channichthyidae), a monophyletic clade of the suborder Notothenioidei

(Cocca et al., 1995b; di Prisco et al., 2002; Near et al., 2006b; Zhao et al., 1998b).

Notothenioid fishes adaptively radiated in the Southern Ocean ~34 million years ago as it

began cooling to the freezing point of seawater (–1.86 °C) (Colombo et al., 2015;

Matschiner et al., 2011; Rutschmann et al., 2011) and other fish taxa became locally

extinct (Eastman, 1993). The decline in temperature was likely caused by the

separation of the Antarctic continent and formation of the Antarctic Circumpolar Current

(Kennett, 1977). Today, notothenioids are the most speciose of Antarctic fishes,

comprising ~130 species grouped into 8 families (Eastman and Eakin, 2000; Near et al.,

2003).

Notothenioids share several synapomorphic traits that allowed for their evolution

in a harsh environment. Almost all members of the clade possess antifreeze

glycoproteins (AFGPs) that keep their blood from freezing (Chen et al., 1997; Deng et

al., 2010). Cold adaptation may also explain the convergent evolution of different XY

sex-chromosome systems in several Antarctic notothenioids, which may avoid skewed

sex ratios otherwise induced by temperature-dependent sex determination (Ghigliotti et

al., 2016). All notothenioids lack a gas bladder, the organ through which most teleosts

regulate their buoyancy (DeVries and Eastman, 1978). Nevertheless, the evolution of

decreased bone mineralization and accumulation of lipids to enhance buoyancy

facilitated the radiation of notothenioid species to occupy diverse niches in the water

column (Albertson et al., 2010). All notothenioids display hematopoietic phenotypes with

reduced hematocrits and hemoglobin concentrations compared to temperate fishes

(Wells et al., 1980). The complete loss of red blood cells by Antarctic icefishes (Fig. 2)

provides a unique opportunity to study the genetic regulators of erythropoiesis.

Chapter 1: Morph-biased gene expression and sequence divergence typifies disease-like traits of Antarctic icefishes

Key words: icefish, Antarctic, erythropoiesis, anemia, mutant, gene expression

Abstract

The family of white-blooded Antarctic icefishes (Channichthyidae) displays

several unusual traits that are reminiscent of human diseases, most notably their severe

anemia. Because most molecular-genetic pathways are shared among vertebrates, the

mutations that cause the icefish traits may affect the same pathways that are targeted in human

diseases. Here I performed a comparative transcriptomics study to identify changes in

gene expression and coding mutations that are morph-biased for red- and white-

blooded notothenioid fishes. I show that both the profoundly anemic icefishes and their

red-blooded sister clade, the dragonfishes, possess predicted deleterious mutations in

genes that are associated with hemolytic anemias in humans (e.g. g6pd, sptb).

However, erythropoiesis in icefishes appears to be stalled early in erythroid

differentiation prior to potential hemolysis. I hypothesize that this may be an adaptation

to abrogate the production of erythrocytes in an environment, the cold, oxygen-rich

Southern Ocean, in which their utility is marginal. Moreover, I propose the icefishes

have shut down terminal erythroid differentiation through decreased expression of

positive regulators of erythroid differentiation and increased expression of pluripotency

markers typically expressed in leukemias. I show that mutations in the transcription

factor hemogen, combined with overexpression of hdac1b and decreased expression of

p300b, are likely to cause an imbalance in regulatory acetylation of Gata1 that

downregulates its activity in icefish hematopoietic tissues. Together, these changes

suggest that Antarctic notothenioids have evolved an intricate repression of

erythropoiesis.

Introduction

Fish are proven models for studying blood development as they share the same

set of hematopoietic cell lineages as mammals. One exception is the family of Antarctic

icefishes (Channichthyidae), which is the only vertebrate clade that has lost the ability to

produce red blood cells (Cocca et al., 1995a; di Prisco et al., 2002; Near et al., 2006b;

Zhao et al., 1998b). Thus, icefishes provide a natural mutant model for human anemias

(Albertson et al., 2009).

The icefish clade belongs to the Antarctic notothenioid suborder, which radiated

adaptively as the Southern Ocean began cooling ~34 million years ago (Mya) to the

freezing point of seawater (–1.86 °C) today; other fish groups became locally extinct

(Eastman, 1993). The notothenioid radiation produced ~136 species belonging to eight

recognized families, among which there are highly divergent phenotypes (Colombo et

al., 2015; Matschiner et al., 2015).

The 16 species of Antarctic icefishes are unique among vertebrates in that they

neither produce the oxygen-carrying pigment hemoglobin nor do they produce typical

mature erythrocytes (Cocca et al., 1995b; di Prisco et al., 2002; Near et al., 2006b;

Zhao et al., 1998b). The high oxygen content in the Southern Ocean facilitated the loss

of erythrocytes in icefishes, but their severe anemia was clearly disaptive (Montgomery

and Clements, 2000), as the icefish co-evolved expanded hearts and highly

vascularized tissues, possibly as a consequence of elevated systemic nitric oxide (NO)

levels (Sidell and O'Brien, 2006). Red-blooded Antarctic notothenioids also have

decreased hematocrits and reduced hemoglobin concentrations compared to temperate

teleost species (Eastman, 1993; Wells et al., 1980). There is evidence for temperature-

sensitive phenotypic plasticity of erythropoiesis in temperate teleosts; erythropoiesis is

repressed by cold exposure (Kulkeaw et al., 2010; Maekawa et al., 2012). Thus, genetic

fixation of cold-induced anemia in ancestral notothenioids may be an example of West-

Eberhard’s theory of “Increased genetic divergence due to phenotype fixation” (West-

Eberhard, 1989, 2005).

Significant mutations in coding sequences, including the deletion of globin genes,

have probably made permanent the anemia of icefishes (Cocca et al., 1995b; Near et

al., 2006b). However, one cannot rule out mutations of erythroid gene regulatory

elements as causes of icefish anemia; indeed, such changes might have initiated red

cell loss. The relaxed selection pressure accompanying regulatory mutations may

explain the high mutation rates observed in morph-biased genes when they become

expressed below a functional level (Helantera and Uller, 2014; Leichty et al., 2012).

Thus, reduction of hemoglobin levels, as seen in more derived notothenioid clades [i.e.,

the family Bathydraconidae (dragonfishes)] due to deletion of globin gene regulatory

elements, may have led ultimately to the evolutionary loss of the globin locus in

Antarctic icefishes (Lau et al., 2012; Near et al., 2006b; Zhao et al., 1998b). However,

the loss of β-globin may be just one of many erythroid specific gene mutations that

occurred in icefishes.

In this study, I compared multi-tissue RNA-Seq transcriptomes (Berthelot et al.,

2018. Manuscript in preparation) to interrogate morph-biased gene expression and

coding sequence divergence between the derived sister lineages of dragonfishes

(Bathydraconidae) and icefishes (Channichthyidae). The goal was to discover the

genetic determinants of icefish traits, specifically changes to hematopoietic genes that

may contribute to their anemia. The analysis revealed tissue-specific, differential

expression for genetic pathways that regulate development of blood, brain, muscle,

gonad and kidney. In hematopoietic tissues, decreased expression was observed for

both well-known and previously uncharacterized erythroid genes – several of these

genes contained predicted deleterious substitutions or frameshift mutations. I found that

the dragonfish, Parachaenichthys charcoti, was a natural mutant model for hereditary

spherocytic anemia. The icefish, Chaenocephalus aceratus, has been shown to be

blocked in erythroid differentiation (Yergeau et al., 2005; Yergeau et al., 2006.

Manuscript in preparation). I suggest that the block to differentiation may be caused by

increased expression of pluripotency factors and decreased expression of erythroid

differentiators. Specifically, the block in erythroid differentiation may be caused by

increased expression of hdac1b and by deleterious mutations in the interaction domains

of Hemogen, P300b, and Gata1, which would together promote deacetylation and

deactivation of Gata1. Together, these changes show that notothenioid evolution led

ultimately to an intricate repression of the erythropoietic pathway.

Results

Erythrocyte morphology in notothenioid fishes

All Antarctic notothenioid fishes are anemic, with reduced hematocrits and low

hemoglobin concentrations, compared to temperate fishes (Wells et al., 1980). This is

most apparent in dragonfishes (Bathydraconidae), the sister lineage to icefishes, which

display a more severe anemia (Kunzmann et al., 1991) than other red-blooded

notothenioids. Yet, very little is known about the process of erythroid differentiation in

notothenioids. Two icefishes, C. aceratus and Dacodraco hunteri, were proposed to

have very rare, senile erythrocytes (Barber et al., 1981). Subsequently, it was

discovered that C. aceratus produces erythroid progenitors but shows a clear block to

terminal erythroid differentiation (Yergeau et al., 2005; Yergeau et al., 2006. Manuscript

in preparation). Nonetheless, the blood cell phenotypes of most notothenioid species

have not been characterized.

I examined the morphology and frequency of hematopoietic cell types in

peripheral blood smears from six species from three families of Antarctic notothenioid

fishes. The red blood cells of two nototheniids, Notothenia coriiceps and Gobionotothen

gibberifrons, were oval shaped, measuring 11.7±0.30 µm (n = 12) and 12.1±0.33 µm (n

= 12) on the longest axis, and were morphologically similar to erythrocytes seen in

temperate teleosts (Fig. 1A,B). Strikingly, erythrocytes of the dragonfish, P. charcoti,

were spherical in shape and significantly smaller (mean diameter 8.7±0.19 µm; n = 14;

Student’s t test, P = 6.1E-06) than those of N. coriiceps. They also had a reduced

surface area of 48.7±3.1 µm² compared to the oval erythrocytes from N. coriiceps at

77.6±4.1 µm² (Student’s t test, P = 0.003, n = 6) (Fig. 1C). Interestingly, the spherocytic

erythrocytes of P. charcoti were morphologically similar to the erythrocyte-like cells that

have been described in the icefish Channichthys rhinoceratus (Hureau, 1966; Spillman

and Hureau, 1967). Spherocytic morphology of erythrocytes is typically caused by

defects in erythroid membrane cytoskeletal proteins.
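The size comparisons above rest on two-sample Student's t tests. As a minimal sketch of that calculation, assuming the pooled-variance (equal-variance) form of the test, the statistic and its degrees of freedom can be computed as follows. The function name `student_t` and the measurements are illustrative placeholders, not the thesis data.

```python
import math
from statistics import mean, variance


def student_t(sample_a, sample_b):
    """Return (t, degrees_of_freedom) for a pooled-variance two-sample t test."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(sample_a)
                  + (n_b - 1) * variance(sample_b)) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(sample_a) - mean(sample_b)) / standard_error, df


# Hypothetical long-axis measurements in micrometres (placeholders only).
oval_cells = [11.2, 12.4, 11.9, 11.5, 12.0]
spherical_cells = [8.4, 9.1, 8.6, 8.8, 8.9]
t_stat, df = student_t(oval_cells, spherical_cells)
```

The P value then follows from the t distribution with `df` degrees of freedom, which requires a statistics package and is omitted here.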

Among the icefishes that I analyzed, blood cell morphologies and frequencies

varied between species (Fig. 1D-F). Mature erythrocytes were not apparent in the

peripheral blood of Pseudochaenichthys georgianus (Ps. georgianus) or C. aceratus,

but there were a number of circulating erythroblasts in the blood of C. aceratus (5.7% of

peripheral blood cells, n = 72, Fig. 1E). By comparison, the blood of N. coriiceps did not

contain circulating erythroblasts and reticulocytes were present at a low frequency

(1.8%, n = 111, Fig. 1A). The peripheral blood of Champsocephalus gunnari and

Chionodraco rastrospinosus contained abundant myeloid cell-types (56.3%, n = 98, and

19.7%, n = 65, respectively, Fig. 1). Some myeloid cell-types in C. rastrospinosus had

clear cytoplasm and condensed nuclei in contrast to the polymorphonuclear leukocytes

of C. aceratus (Fig. 1D,E) and were similar to the putative erythrocytes of the icefish, C.

rhinoceratus (Hureau, 1966; Spillman and Hureau, 1967). Thus, blood cell

morphologies are highly variable among notothenioid fishes even between species

within the icefish clade.

Figure 1. Peripheral blood smears from Antarctic notothenioid fishes. Light

micrographs of Giemsa stained blood cells from N. coriiceps (A), G. gibberifrons (B), P.

charcoti (C), C. rastrospinosus (D), C. aceratus (E), and C. gunnari (F). Note the oval

erythrocytes in two nototheniids (A,B) and spherocytic erythrocytes of the dragonfish

(C,D). Abbreviations: E, eosinophil; Er, erythrocyte; L, lymphocyte; Mon, monocyte; N,

neutrophil; O, orthochromatophilic normoblast; ProE, proerythroblast; T, thrombocyte.

Scale bars = 10 µm (A-F).

Comparative transcriptomics reveals tissue-specific, differentially expressed

genes between an icefish and dragonfish

To identify genes that are differentially expressed in icefish tissues, I performed an in

silico comparison of RNA-Seq expression profiles between the transcriptomes of the

icefish, Ps. georgianus, and red-blooded dragonfish, P. charcoti (Berthelot et al., 2018.

Manuscript in preparation). More than 18,700 orthologous genes were identified in the

multi-tissue transcriptomes of these species using OMA stand-alone and a pipeline that

has been described previously (Altenhoff et al., 2013; Altenhoff et al., 2011; Sharma et

al., 2014). For each tissue, the TPM (transcripts per million) normalized expression

values were strongly correlated between tissue replicates within each species (0.78 < R

< 0.98). Expression was correlated for most interspecies comparisons (0.51 < R < 0.92)

except at sites of hematopoiesis and in the liver (Fig. S1-2). Differential expression (DE)

analysis was conducted on the orthologs using edgeR v3.10.2 to normalize and detect

significant differences between read counts in each tissue from Ps. georgianus and P.

charcoti with 2-4 biological replicates per sample. From the list of 18,781 orthologs,

differentially expressed transcripts (significance criterion P ≤ 1E-05, FDR ≤ 0.001) were

identified in brain (2,005), head kidney (2,317), liver (1,960), ovary (784), spleen

(3,108), trunk kidney (2,688), pectoral muscle (2,129), white muscle (1,447), and heart

ventricle (2,005). Significant differential expression of 295 genes (39 up-regulated, 256

down-regulated) was observed in every tissue comparison. Tissue-wide down-

regulation of a gene may hint at a significant genomic alteration or it may occur due to

an incorrect orthology call. Therefore, the orthologies for specific genes of interest were

confirmed by genomic synteny and/or by mapping to established zebrafish and

stickleback orthologs. Subsequently, I performed hierarchical clustering of TPM-

normalized expression values for DE genes across tissues from both species (Fig. 2).

This facilitated the isolation of specific clusters of DE genes (cut at 56% height of the

dendrogram) with the highest expression in each tissue. Tissue-specific genes were

differentially expressed in brain (432), head kidney (255), liver (102), ovary (138),

spleen (323), trunk kidney (402), pectoral muscle (139), white muscle (56), and heart

ventricle (171). In the head kidney, 152 DE genes were determined to be blood-specific

genes.
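The replicate and interspecies comparisons above rely on TPM-normalized values and their correlation. The sketch below illustrates both steps on made-up counts. The function names, the transcript lengths, and the choice to correlate log2(TPM + 1) values with Pearson's R are assumptions for illustration, since the text does not specify the transformation or the type of correlation coefficient.

```python
import math


def tpm(counts, lengths_kb):
    """Convert raw read counts to transcripts per million (TPM)."""
    # Reads per kilobase for each transcript, then rescale so the sum is 1e6.
    rates = [count / length for count, length in zip(counts, lengths_kb)]
    total = sum(rates)
    return [rate / total * 1e6 for rate in rates]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical transcript lengths (kb) and read counts for two replicates.
lengths_kb = [1.5, 0.8, 2.2, 1.0]
replicate_1 = tpm([300, 80, 1100, 10], lengths_kb)
replicate_2 = tpm([280, 95, 1000, 14], lengths_kb)

# Correlate on a log scale so highly expressed genes do not dominate.
log_1 = [math.log2(v + 1) for v in replicate_1]
log_2 = [math.log2(v + 1) for v in replicate_2]
r = pearson_r(log_1, log_2)
```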

Figure 2. Multi-tissue expression profile heatmap of genes that are differentially

expressed in head kidney from Ps. georgianus and P. charcoti. Differentially

expressed genes were identified in the head kidney RNA-Seq transcriptomes from Ps.

georgianus and P. charcoti (Fisher’s exact test, P < 1E-05, FDR < 0.001). For genes

that are differentially expressed in the icefish head kidney, the relative transcript

abundances in each tissue from both species are shown as the log2 transformed values

of transcripts per million (TPM). Hierarchical clustering identified groups of genes with

similar expression profiles. Tissue-specific clusters of genes with the highest expression

in each tissue were isolated by cutting the dendrogram at a height of 56% (dashed line).

Abbreviations: H. Kidney, head kidney; T. Kidney, trunk kidney; Pectoral, pectoral

muscle; W. muscle, white muscle.

Gene ontology enrichment of DE genes highlights tissue-specific molecular

processes that underlie icefish phenotypes

The differential expression of molecular processes in each tissue between P.

charcoti and Ps. georgianus may represent lineage-specific adaptations to temperature,

oxidative stress, and/or functional loss of red blood cells. To evaluate these possibilities,

gene ontology (GO) enrichment was used to assess the tissue-specific biological

functions of each DE gene cluster and to identify specific genetic pathways that may be

unique to icefish development (Fig. 3).
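GO enrichment of this kind is commonly scored with a one-sided Fisher's exact test, which asks whether a gene cluster contains more members of a GO term than expected by chance. The sketch below shows that calculation as a hypergeometric upper tail. The function name and all counts are hypothetical, and the code is not a reproduction of the STRING implementation.

```python
from math import comb


def enrichment_p(hits_in_set, set_size, hits_in_background, background_size):
    """One-sided Fisher's exact test: P(X >= hits_in_set) under the hypergeometric null."""
    max_hits = min(set_size, hits_in_background)
    total = comb(background_size, set_size)
    tail = sum(
        comb(hits_in_background, k)
        * comb(background_size - hits_in_background, set_size - k)
        for k in range(hits_in_set, max_hits + 1)
    )
    return tail / total


# Hypothetical counts: 12 of 200 cluster genes carry a GO term that
# annotates 300 of 18,000 background genes.
p_value = enrichment_p(12, 200, 300, 18000)
```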

At sites of hematopoiesis, in the head kidney (n = 234, Fig. 3A) and spleen (n =

296, Fig. 3B), many of the genetic pathways that control cell survival, proliferation and

differentiation were enriched. More genes involved in the immune system were

differentially expressed in the spleen (n = 68; sum of up- and downregulated genes)

compared to head kidney (n = 27), particularly for genes involved in lymphopoiesis. In

both hematopoietic tissues, the widespread decrease in expression for erythroid genes

highlights the loss of mature erythrocytes in icefishes (Fig. S3).

With the highest number of tissue-specific DE genes (n = 398), the icefish brain

primarily exhibited down-regulation of regulators of nervous system development and

function (Fig. 3C). Decreased expression was observed for several factors involved in

glutamate receptor signaling (e.g. GRM8, GRIK5, GRIA3, GRIA1), which is consistent

with the contraction of this gene family observed in notothenioids (Shin et al., 2014).

Reduced glutamate signaling might inhibit neuronal cell function but may represent an

adaptation to prevent excessive generation of reactive oxygen species (ROS)

(Reynolds and Hastings, 1995; Willard and Koochekpour, 2013).

In the trunk kidney (n = 372), gene expression changes were observed for

several metabolic processes (Fig. 3D). For example, the differential expression of

many lipid metabolism genes (n = 37, e.g. Fabp1) is consistent with the elevated

polyunsaturated fatty acid (PUFA) levels in icefish mitochondrial membranes (O'Brien

and Mueller, 2010).

Tissue-specific DE genes in pectoral red muscle (n = 128, Fig. 3E) and in trunk

white muscle (n = 50, Fig. 3F) were involved in striated muscle development and

function. In most cases, the same processes were strictly up-regulated in pectoral

muscle of the icefish but down-regulated in white muscle compared to the dragonfish.

This may highlight the increased hypertrophy and loss of hyperplasia in icefish pectoral

muscle (Archer and Johnston, 1987). Icefishes generally use their pectoral muscles for

labriform swimming (Archer and Johnston, 1987), whereas Parachaenichthys species

use sub-carangiform swimming and have a heavily muscled trunk (Kuhn et al., 2010).

More genes encoding mitochondrial proteins were differentially expressed in icefish

pectoral muscle, reflecting its higher concentration of mitochondria (Archer and

Johnston, 1991; Lin et al., 1974).

Figure 3. Gene ontology (GO) enrichment for tissue-specific, differentially

expressed genes between Ps. georgianus and P. charcoti. Enriched GO groups

were identified using STRING (Fisher’s exact test, P < 0.05, FDR < 0.01). Graphs show

numbers of up-regulated (red) and down-regulated (blue) genes in Ps. georgianus head

kidney (A), spleen (B), brain (C), trunk kidney (D), pectoral muscle (E), and white

muscle (F) for different biological processes.
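The caption quotes both a P value and an FDR threshold. As a sketch of how the two are applied together, the example below adjusts a list of P values with the Benjamini-Hochberg procedure and keeps terms that pass both cutoffs. Benjamini-Hochberg is an assumption, since the correction method is not named here, and the GO term names and P values are placeholders.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical enrichment results: (GO term, raw P value).
results = [("term_a", 0.0004), ("term_b", 0.03), ("term_c", 0.002), ("term_d", 0.6)]
fdr = benjamini_hochberg([p for _, p in results])
significant = [
    term for (term, p), q in zip(results, fdr)
    if p < 0.05 and q < 0.01
]
```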

Interaction network for differentially expressed, tissue-specific hematopoietic

regulators in the icefish head kidney

To identify the pathways that control blood development in icefishes, I created a

gene association network for tissue-specific, differentially expressed genes in the icefish

head kidney, the major site of adult definitive hematopoiesis in teleosts (Fig. 4). The

network was created with STRING (Jensen et al., 2009) using annotations of the human

orthologs (see Methods). For genes of interest, orthology was verified by comparative

synteny of the sequenced genomes for N. coriiceps and H. sapiens. In the association

network, K-means clustering (n = 11, Fig. 4) revealed sets of genes that were grouped

consistently with ten GO biological functions (Fig. 4). The network highlights the loss of

expression for groups of genes involved in the erythroid skeletal membrane, heme

synthesis, erythroid transcriptional regulation, chromatin regulation, apoptosis, lipid

metabolism, and in the regulation of adenylate cyclase activity. It also shows a cluster of

signaling molecules, including many with increased expression in the icefish head

kidney (Fig. 4).

Central nodes linking these clusters included tspo (25), hdac1 (18), akt (16), rela

(15), foxo1 (13), gata1 (13), tk2 (13), tfrc (12), fech (10), and ntrk1 (10), all of which

were down-regulated with the exceptions of hdac1 and ntrk1 (Fig. 4). The cluster

centering on gata1 highlights the interactions between down-regulated Gata1

transcriptional targets and Gata1 co-factors that cooperate to drive erythropoiesis

(Ferreira et al., 2005). Previous studies have emphasized the correlation between node

centrality and essential function (Batada et al., 2006; He and Zhang, 2006; Jeong et al.,

2001; Raman et al., 2014; Song and Singh, 2013). Thus, differential expression of these

central nodes should highlight the major genetic changes that contribute to the

hematopoietic defects of icefishes.
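Node centrality in an association network can be summarized most simply as degree, the number of distinct partners each gene has. The sketch below ranks genes by degree, assuming that the numbers in parentheses above are connection counts of this kind. The edge list is a small hypothetical example that borrows gene names from the text; it is not the STRING network of Fig. 4.

```python
def node_degrees(edges):
    """Count distinct interaction partners per gene in an undirected network."""
    neighbors = {}
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    return {gene: len(partners) for gene, partners in neighbors.items()}


# Hypothetical edges (duplicates are counted once because partners are sets).
edges = [
    ("gata1", "klf1"),
    ("gata1", "tal1"),
    ("gata1", "hdac1"),
    ("hdac1", "rela"),
    ("hdac1", "gata1"),
]
degrees = node_degrees(edges)
# Genes ordered from most to least connected, ties broken alphabetically.
hubs = sorted(degrees, key=lambda gene: (-degrees[gene], gene))
```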

Figure 4. Association networks between tissue-specific DE genes in the icefish

head kidney showing both up-regulated and down-regulated genes. Gene

association networks were created for tissue-specific DE genes in the head kidney with

STRING (Jensen et al., 2009) using annotations from the human orthologs. Colors

represent K-means clusters of gene nodes (n = 11). Genes were generally clustered by

their biological functions.

Decreased expression of genes involved in erythroid differentiation in icefishes

The most obvious phenotype that differentiates the icefishes from other

notothenioids is their lack of red blood cells (RBC, erythrocytes). To distinguish the loss

of erythroid-specific genes from early-acting regulators of the hematopoietic stem cell

niche, I examined two clusters of down-regulated genes (C1, C2) in the icefish that had

strong tissue-specific expression in P. charcoti head kidney or peripheral blood (Fig.

5A,B). While many erythroid genes function in erythroid progenitors of the head kidney

(Orkin and Zon, 1997), the high concentration of RBC in the peripheral blood of P.

charcoti allowed me to detect both early and late erythroid markers.

Down-regulated genes that were specific to the head kidney included the

hematopoietic stem cell (HSC) markers myb and runx1, both of which are required for

definitive, but not primitive, hematopoiesis (Sood et al., 2010; Soza-Ried et al., 2010).

Decreased expression was also observed for several other genes that play critical roles

in HSC maintenance and differentiation, including relA/p65 (Stein and Baldwin, 2013)

and caspase 3 (Janzen et al., 2006).

Of the down-regulated genes in the icefish head kidney, 149 were blood-specific

markers (Fig. 5A,B, Table 1). The decreased expression of many RBC-specific genes

reflects the loss of erythrocytes in icefishes and included genes encoding globins, heme

biosynthetic enzymes and erythrocyte membrane proteins (n = 27; e.g. hb1, blvrb,

band3, band4.1, alas2, fech, add2, ank1, sptb). The list also included erythroid factors

that are more highly expressed in immature erythroblasts (Kingsley et al., 2013). These

genes regulate erythroid lineage commitment and/or terminal differentiation in the head

kidney (n = 51; e.g. gata1, hemogen, klf1, scl/tal1, gfi1b, ldb1, epor, tfrc, tgm2).

Additionally, I identified 31 novel blood-specific genes that have not been previously

associated with erythropoiesis (data not shown).

Figure 5. Three tissue-specific clusters of hematopoietic genes are differentially

expressed (DE) in the head kidneys of Ps. georgianus and P. charcoti. (A) Heat

map of relative gene expression for tissue-specific DE genes in the icefish head kidney.

Expression was normalized to transcripts per million (TPM). Three clusters show tissue-

specific genes in (C1) dragonfish blood, (C2) dragonfish head kidney, and (C3) icefish

head kidney. (B) Expression profiles for genes in each tissue-specific cluster.

Abbreviations: HK, head kidney; TK, trunk kidney; Liv, liver; Pec, pectoral; WM, white

muscle.

Increased expression of pluripotency markers highlights mechanisms of

erythroid inhibition in icefishes

The decreased expression for many erythroid genes in icefish head kidney is in

part caused by the loss of mature erythrocytes by this group. Thus, the genes with

increased expression may portray the cell lineages and developmental processes that

predominate in the icefish head kidney. My results show that icefish kidney expressed

a number of hematopoietic regulatory genes at elevated levels (Fig. 4-5), including cbfb,

ntrk1/trka, gas6, and dock1. Several of the overexpressed genes in the icefish head

kidney (Table 1) are proto-oncogenes (e.g. flt1/vegfr1, bcr, ntrk1/trka) or leukemia

markers (e.g. hdac1b, dock1, gas6) that are frequently associated with leukemogenesis

(Bradbury et al., 2005; Collins et al., 1987; Dirks et al., 1999; Lee et al., 2017; Wang et

al., 2003).

Signaling pathways that drive hematopoietic proliferation (Van Etten, 2007)

showed altered expression in icefish kidney (Fig. 4). Specifically, the interaction

network of hematopoietic DE genes was enriched for the PI3K-Akt-mTOR signaling

pathway that promotes cell survival and proliferation (Ghosh and Kapur, 2017). This

included up-regulated oncogenes like sgk1 (Orlacchio et al., 2017) and down-regulated

genes, such as the tumor suppressor foxo1 and others (akt2, casp3, catalase) (Fig. 4).

Aberrant cell signaling promotes carcinogenesis (Martin, 2003) and mutations in

regulators of PI3K-Akt-mTOR signaling frequently activate this pathway in leukemias

(Fransecky et al., 2015; Park et al., 2010). In agreement with previous findings, I found

increased expression of TGF-beta signaling molecules (Xu et al., 2015), which may be

due to extensive duplications of genes in this pathway in Antarctic notothenioids (Chen

et al., 2008). TGF-β signaling has been shown to activate AKT signaling in many normal

and leukemic cell types (Drabsch and ten Dijke, 2012; Naka et al., 2010). Thus,

erythroid differentiation may stall in icefish due to increased expression of signaling

molecules and pluripotency genes that may mark a proliferative cell-type.

Figure 6. Differential expression of hematopoietic regulators between

red- and white-blooded notothenioids. Relative expression of hdac1b, p300b, gata1,

and spi1b determined by qPCR in head kidneys from two red-blooded (N. coriiceps, P.

charcoti) and two white-blooded (C. aceratus, Ps. georgianus) notothenioids. Target

gene expression was normalized to beta-actin and error bars represent standard

deviation (n.s., not significant; Student’s t test, P > 0.05).

Confirmation of differential expression of hematopoietic regulatory genes across

notothenioid clades

Genes with morph-biased expression may show high variation even between

individuals within a species (Helantera and Uller, 2014). To assess whether differential

expression of hematopoietic genes was a consistent feature of the red- and white-

blooded notothenioids, I employed qRT-PCR to examine kidney expression of

hematopoietic regulatory genes across four representative notothenioid species: the

icefishes Ps. georgianus and C. aceratus, the nototheniid N. coriiceps, and the

dragonfish P. charcoti. The head kidneys of both icefishes showed significant down-

regulation of gata1a and p300b (Student’s t test, P < 0.05; Fig. 6). By contrast,

expression of hdac1b was found to be significantly up-regulated in the head kidneys of

both icefishes (Student’s t test, P < 0.05). Expression of the myeloid marker, pu.1/spi1b,

did not differ significantly between the four species (Fig. 6), consistent with the

comparable numbers of myeloid cells in the head kidneys of red- and white-blooded

notothenioids. In the RNA-Seq transcriptome, I showed that p300b was down-regulated

in all tissues of Ps. georgianus compared to P. charcoti. In contrast, I found that hdac1b

was specifically up-regulated in the icefish head kidney and spleen but not in non-

hematopoietic tissues (significance criterion P ≤ 1E-05, FDR ≤ 0.001). These findings

suggest that erythropoietic regulatory proteins (e.g., Gata1) may be differentially

acetylated, and hence differentially active, in icefish head kidney.
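Relative expression from qRT-PCR is often calculated with the 2^(-ΔΔCt) method, in which the target gene is first normalized to a reference gene and then compared with a calibrator sample. The sketch below assumes that method and 100% amplification efficiency. The text states only that expression was normalized to beta-actin, so the formula, the function name, and the Ct values are illustrative assumptions.

```python
def relative_expression(ct_target, ct_reference, ct_target_cal, ct_reference_cal):
    """Fold change of a target gene versus a calibrator sample (2^-ddCt)."""
    # Normalize the target to the reference gene within each sample.
    delta_sample = ct_target - ct_reference
    delta_calibrator = ct_target_cal - ct_reference_cal
    # Each cycle of difference corresponds to a two-fold change.
    return 2 ** -(delta_sample - delta_calibrator)


# Hypothetical Ct values: target and beta-actin in an icefish sample,
# followed by the same pair in a red-blooded calibrator sample.
fold_change = relative_expression(25.0, 18.0, 24.0, 18.0)
```

A fold change below 1 indicates lower expression than in the calibrator, and a value above 1 indicates higher expression.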

Figure 7. Predicted deleterious substitutions and frameshifts in blood genes from

icefishes. Provean was used to predict deleterious substitutions (Provean score < -3)

(Choi and Chan, 2015) that were shared by three white-blooded icefishes but which did

not occur in red-blooded notothenioids. (A) The CD33-related Siglec contained a F753*

frameshift mutation that truncated the transmembrane and cytoplasmic regions in the

icefishes, N. ionah, C. aceratus and Ps. georgianus. Numbers indicate length in amino

acids. (B) Predicted deleterious substitutions in icefish Glucose-6-phosphate

dehydrogenase (G6pd). Lines represent the NADP binding sites. White boxes indicate

the dimer interface. Capital letters represent beta-turns and lowercase letters are alpha

helices. Icefish mutation highlighted in yellow is mutated in human G6PD-deficiency. (C)

Predicted deleterious substitutions in the Transferrin receptor (Tfrc). (D) Predicted

deleterious substitutions in Hemopexin (Hpx). Residue highlighted in yellow is involved

in binding heme. Abbreviations: Cyto, cytoplasmic; C2-set, immunoglobulin c2-set

(constant) domain; Ig, immunoglobulin-like domain; ITIM, immunoreceptor tyrosine-

based inhibitory motif; v-set, immunoglobulin v-set (variable) domain; PA, protease-

associated domain; R, Hpx repeat; S, signal peptide; Tr, transmembrane.

Figure 8. Erythroid beta-spectrin is mutated in the dragonfish P. charcoti.

Scanning electron micrographs of erythrocytes from (A) N. coriiceps, a nototheniid, and

from (B-C) P. charcoti, a dragonfish. (D) Light micrographs of Giemsa stained triton-

insoluble erythrocyte membrane skeletons from N. coriiceps (Nc) and P. charcoti (Pc).

Flash-frozen blood samples were treated with 1% Triton-X, spread on glass coverslips,

and fixed with 4% PFA. (E) Predicted deleterious mutations in dragonfish (bold italic) and

icefish (Roman case) Erythroid beta-spectrins. Residues highlighted by yellow boxes

are also mutated in hereditary spherocytic anemia in humans. Scale bars = 10 µm (A,B)

and 5 µm (C,D).

Icefish-specific deleterious substitutions and frameshift mutations occur in

common targets of disease

Deleterious point mutations or frameshifts in erythroid-specific functional domains

are likely to contribute to the profound anemia of icefishes. I identified frameshift

mutations in 16 genes (Table 5) from three icefish species (Neopagetopsis ionah, C.

aceratus, Ps. georgianus) after alignment to the orthologs from red-blooded

notothenioids (P. charcoti, N. coriiceps, H. antarcticus). One blood-specific gene, a

Cd33-related Siglec (Sialic-acid-binding immunoglobulin-like lectin) contained a C-

terminal frameshift that removed its cytoplasmic immunoreceptor tyrosine-based

inhibition motif (Fig. 7A). Loss of CD33 causes slight erythropoietic defects in mutant

mice (Brinkman-Van der Linden et al., 2003).

I identified all nonsynonymous substitutions in 7,049 orthologs that differed

between red- and white-blooded notothenioid lineages and then used Provean to

predict whether these were potentially deleterious mutations. Genes with potentially

deleterious substitutions (Provean-score < -3.0) were significantly enriched for

hematopoietic factors (n = 11; P < 5.380e-2), including regulators of heme metabolism

(n = 6; P < 2.890e-5) and myeloerythroid differentiation (n = 9; P < 1.940e-2) (Table 3,

Table 5, Fig. 7). The mutations were found in important functional domains that have

been associated with human diseases (Table 5). Mutated residues in G6PD (glucose-6-

phosphate dehydrogenase, Fig. 7) and Erythroid beta-spectrin (Fig. 8) are also mutated

in hemolytic anemias in humans (Barisic et al., 2005; Landrum et al., 2016).


The dragonfish P. charcoti is a natural mutant model for spherocytic anemia

The red blood cells of the dragonfish, P. charcoti, were morphologically different

from erythrocytes seen in other red-blooded notothenioids. I employed scanning

electron microscopy to compare erythrocytes from P. charcoti and N. coriiceps. A

nuclear bulge was apparent in erythrocytes from N. coriiceps but not from P. charcoti

(Fig. 8A-C). This indicated a spherocytic morphology for the dragonfish RBCs, which

may result from loss of incorporation of cytoskeletal proteins. Loss of membrane

proteins was evidenced by the size difference between triton-insoluble erythroid

membrane skeletons from P. charcoti (5.1±0.18 µm, n = 10) and N. coriiceps (12.4±0.72

µm, n = 9) (Student’s t test, P = 1.8E-05; Fig. 8D). In the dragonfish, these features may

result from seven deleterious substitutions found in erythroid β-spectrin including an

R1037S mutation in the 7th spectrin repeat, which corresponds to an R1035W

substitution (rs143827332) that has been associated with hereditary spherocytic anemia

(HS) in humans (Landrum et al., 2016). Accumulation of deleterious substitutions in

erythroid β-spectrin from dragonfishes and icefishes may initially have been caused by

membrane instability at cold temperatures (Lomako et al., 2015) or may have been

caused by relaxed selection on erythrocyte markers as a result of anemia.


Regulators of heme metabolism are under different selection in red- and white-

blooded notothenioids

Morph-biased gene expression is associated with increased rates of mutation,

due to relaxed selection upon genes that are expressed by few individuals of the

population or due to loss of function as a result of neutral selection (Helantera and Uller,

2014; Leichty et al., 2012). To determine the selective pressures on icefish coding

sequences, I compared the rate of non-synonymous to synonymous substitutions (ω,

dN/dS) between orthologous genes from two red-blooded species (P. charcoti and

Harpagifer antarcticus) and two white-blooded species (N. ionah and Ps. georgianus).

To search for genes with variable dN/dS ratios between the notothenioid lineages, I

employed a likelihood ratio test (P < 0.05) to compare a 2-ratio and 1-ratio (null) model

for each set of gene alignments. Most hematopoietic regulators (e.g., gata1, spi1b/pu.1)

were under equally strong purifying selection in red- and white-blooded notothenioids

and must be functional in some cell lineages in icefishes (data not shown). However,

several erythroid genes had significantly higher dN/dS ratios in icefishes including three

genes (tfrc, hpx, slc25a39) involved in heme metabolism (Likelihood ratio test, P <

0.05; Table 2). The increased nonsynonymous substitution rate of the icefish transferrin

receptor illustrates continued genetic drift in a gene that is highly polymorphic in

notothenioids (Trinchella et al., 2008). Adaptive changes to hematopoietic pathways

may serve to combat the negative side-effects of anemia in icefishes. It has been

suggested that stable serotransferrin expression may scavenge free ferric iron (Fe³⁺)

in icefish tissues (Kuhn et al., 2016). Likewise, the strong up-regulation of hemopexin


(hpx) in the icefish, the significant positive selection acting on its sequence, and putative

functional mutations that remove (H36P, H37G, H79R, H333Q, H364A) or introduce

(Q132H, Q175H, Y246H, D362H, D395H, N434H) histidine residues that may bind

heme indicate that Hemopexin function may be adapted to enhance heme scavenging

in the plasma in response to the loss of hemoglobin formation.


Figure 9. Functional mutations occur in the interaction domains of Gata1,

Hemogen, and P300. Deleterious mutations were discovered in the interaction

domains (brackets) of Gata1, P300, and Hemogen from white-blooded icefishes (N.

ionah, Ps. georgianus) but not in red-blooded notothenioids (P. charcoti, H. antarcticus).

Deleterious mutations were predicted with Provean (Choi and Chan, 2015). (A) Icefish

Gata1 contains a deleterious N319S substitution in the C-terminal zinc finger (C-ZF),

which binds Hemogen and P300 (Zheng et al., 2014). (B) Icefish P300 contains three

deleterious substitutions in the Gata1 binding region (Blobel et al., 1998) which overlaps

with the acetyl transferase domain (spans Br, P, KAT, Z, and CH) (Bordoli et al., 2001).

(C) Icefish Hemogen contains a P174fs frameshift mutation that truncates the C-

terminal domain that is responsible for binding of P300 (Zheng et al., 2014). (D) Model

for molecular repression of Gata1 function and erythroid gene transcription caused by

mutations (marked with X) and by dysregulation of gene expression (arrowheads).


Icefish-specific deleterious mutations in the interaction domains of Gata1,

Hemogen, and P300b

The top candidate pathways for the block of erythroid differentiation in icefishes

involve Gata1, which is considered the master regulator of erythropoiesis in vertebrates

(Ferreira et al., 2005; Suzuki et al., 2011). Icefish Gata1 and several of its co-factors

(P300, Hemogen, Runx1, Spi1, Gfi1b, Klf1) contained deleterious mutations that may

affect Gata1 activity. In both N. ionah and Ps. georgianus, Gata1 contained a single

deleterious N319S substitution in the C-terminal zinc finger (C-ZF) (Fig. 9A), a domain

that is required for DNA-binding and for promoting erythroid differentiation (Omichinski

et al., 1993). The Gata1 C-ZF domain is bound by the erythroid transcription factor,

Hemogen (Zheng et al., 2014), and by the histone acetyl-transferases CBP (Creb-

binding protein) and P300 (Boyes et al., 1998). The erythroid transcription factor,

Hemogen, can recruit P300 to promote acetylation of Gata1 in an immediately adjacent

basic domain, leading to enhanced erythroid gene transcription (Zheng et al., 2014).

Previously, I characterized a C-terminal deletion in icefish Hemogen (see

Chapter 4), a defect that introduces a frameshift and premature stop codon in some

icefish species (Fig. 9A). This frameshift removes a C-terminal transactivation domain

motif that may be required for binding of P300. Icefish P300b also contained seven

deleterious substitutions, including three in the TAZ2/CH3 domain (Transcription

Adaptor putative Zinc Finger/cysteine-histidine), the domain that binds Gata1 (Blobel et

al., 1998). Thus, all of the interaction domains of Gata1, Hemogen, and P300b contain

predicted deleterious mutations. In contrast, Hdac1b was highly conserved between


red- and white-blooded species and did not contain any predicted deleterious mutations.

The mutations in Gata1, Hemogen, and P300 may contribute to the differential

expression of all identifiable transcriptional targets of Gata1 (n = 161) and Hemogen (n

= 367) in the icefish, Ps. georgianus.

The up-regulation of hdac1b and down-regulation of p300b may contribute to a

homeostatic imbalance in erythroid-specific acetylation in icefishes. Thus, I employed

Western blotting to assess whole-protein acetylation status in the head kidneys of red-

and white-blooded notothenioid fishes (Fig. 10). Acetylation of most proteins was

comparable in C. aceratus and N. coriiceps. Normal acetylation of most proteins may be

compensated by p300 paralogs (e.g. p300a, cbp, cbp-like) in notothenioids. However,

changes in acetylation for specific targets of Hdac1b and p300b could not be ruled out.

In mice, mutations in the KIX domain of P300 cause severe anemia and erythroid cell

defects (Kasper et al., 2002) whereas Hdac1 is inactivated during terminal

differentiation by P300 (Yang et al., 2012). Thus, the differential expression of Gata1

targets in icefishes may result from (1) decreased expression of Hemogen, P300b, and

Gata1, (2) deleterious mutations in the interaction domains of all three proteins, and

(3) overexpression of Hdac1b (Fig. 9B).


Figure 10. Protein acetylation in the head kidneys of red- and white-blooded

notothenioids. (A) Protein acetylation was detected in purified protein from head

kidneys of N. coriiceps and C. aceratus. Separated proteins were probed with anti-

acetylated lysine antibody (Santa Cruz Biotechnology, AKL5C1). Signals were

normalized to Ponceau stained bands and calculated as a fold change relative to N.

coriiceps. Arrows mark proteins with increased acetylation in the icefish (> 1.5 fold

change).


Discussion

Molecular repression of erythropoiesis in Antarctic icefishes

Hemolysis of mature erythrocytes may cause the reduced hematocrits observed

in red-blooded Antarctic notothenioids and may have instigated the evolutionary loss of

red blood cells in icefishes. The dragonfish, P. charcoti, possesses spherocytic

erythrocytes, a feature that may be caused by deleterious mutations that occur in the

functional domains of erythroid β-spectrin including specific residues that are mutated in

hereditary spherocytic anemia (HS) in humans. In both dragonfishes and icefishes, the

accumulation of deleterious substitutions in β-spectrin may have been caused by cold-

induced membrane instability (Lomako et al., 2015) or by relaxed selection due to the

loss of red blood cell function.

Icefishes may have evolved a molecular repression of erythroid differentiation to

avoid the consequences of hemolytic anemias. I identified changes in expression of

hematopoietic regulators in icefishes including overexpression of pluripotency genes

and decreased expression of genes that promote erythroid differentiation. Icefish

hematopoiesis may be disrupted by an acetylation imbalance caused by decreased

expression of the p300 acetyltransferase in all tissues and hematopoietic-specific

overexpression of hdac1b. Histone acetyltransferases (HATs) and histone deacetylases

(HDACs) control gene expression through acetylation and deacetylation of histones and

transcription factors (De Ruijter et al., 2003; Eberharter and Becker, 2002; Vo and

Goodman, 2001). Loss of P300 is embryonic lethal and mutations in the KIX domain of

P300 cause severe anemia and erythroid cell defects in mice (Kasper et al., 2002).


Hdac1 activity also plays a critical role in early erythroid proliferation (Heideman et al.,

2014) but is inactivated by P300 during terminal differentiation (Yang et al., 2012).

Furthermore, Antarctic icefishes contain predicted deleterious mutations in the

interaction domains of Hemogen, P300, and Gata1. Truncation of the C-terminal

transactivation domain in icefish Hemogen may prevent it from recruiting the P300

acetyltransferase to Gata1 (Zheng et al., 2014). Hdac1 facilitates Gata1-mediated

transcriptional repression by the NuRD complex (Hong et al., 2005; Snow and Orkin,

2009). During terminal differentiation, P300 acetylates and inactivates Hdac1 and

converts this complex to an activator (Yang et al., 2012). Thus, in icefishes, Gata1 may

function solely as a transcriptional repressor due to Hdac1b overexpression. Loss of

Gata1 expression and function in icefishes may prevent formation of active chromatin

hubs (ACH), which are thought to play a global role in erythroid gene transcription

(Schoenfelder et al., 2010). Specifically, the loss of chromatin looping by the LCR (locus

control region) (Krivega and Dean, 2016) may have contributed to the deletion of globin

promoter elements in dragonfishes (Lau et al., 2012) and the complete loss of globin

genes in icefishes (Cocca et al., 1995a).


Methods

Transcriptome assembly and orthology assignment

Transcript sequences from multi-tissue transcriptomes were previously

generated for two red-blooded (H. antarcticus, P. charcoti) and two white-blooded (Ps.

georgianus, N. ionah) notothenioid species (Berthelot et al., 2018. Manuscript in

preparation). Whole-tissue transcriptomes were assembled with Trinity using default

parameters (Haas et al., 2013). For comparisons between transcriptomes, orthologous

relationships were determined as previously described (Sharma et al., 2014). Briefly,

CD-HIT was used to eliminate gene duplicates (95% similarity) and TransDecoder was

used to identify putative open reading frames (Fu et al., 2012; Haas et al., 2013). The

longest ORF was chosen for each Trinity subcomponent to produce unique proteins by

use of usegalaxy.org (Blankenberg et al., 2010; Giardine et al., 2005; Goecks et al.,

2010). OMA stand-alone v.0.99t (Altenhoff et al., 2013; Altenhoff et al., 2011) identified

7,297 orthologs that were shared in the transcriptomes of all four species (Ps.

georgianus, P. charcoti, N. ionah, H. antarcticus) and 18,781 orthologous groups that

were shared between Ps. georgianus and P. charcoti. Confirmation of orthologous pairs

was done with Blast v2.2.30+ (Altschul et al., 1990; Altschul et al., 1997). A list of

52,959 shared genes was identified in Ps. georgianus and P. charcoti assemblies by

reciprocal BLAST hit (E value < 10⁻⁸⁰) criteria. From this list, 19,665 genes from Ps.

georgianus mapped (E value < 10⁻⁴⁰) to the published genome (60.9%) of Notothenia

coriiceps (Shin et al., 2014). Transcriptomes were also mapped (E value < 10⁻¹⁰) to the

Swiss-Prot database for human proteins Release 2015_9 (UniProt, 2015). Reciprocal

Blast confirmed 15,606 of the 18,781 orthologous genes from the OMA analysis.


Mapping (E value < 10⁻⁴⁰) to known zebrafish and stickleback orthologs in ENSEMBL

v74 (Herrero et al., 2016) confirmed 9,735 genes. Gene association networks and gene

ontology enrichment were analyzed by STRING v10 (Jensen et al., 2009) based on the

Swiss-Prot annotations (The UniProt, 2017). Association networks were edited using

Inkscape (www.inkscape.org).
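
The reciprocal-best-hit criterion used above to confirm orthologous pairs can be illustrated with a minimal Python sketch. It assumes that BLAST output has already been parsed into (query, subject, E value) tuples. The function names, sequence identifiers, and E values are hypothetical and are not taken from the original analysis, which was run with Blast v2.2.30+.

```python
def best_hits(hits, max_evalue):
    """Return {query: subject}, keeping the lowest-E-value hit per query."""
    best = {}
    for query, subject, evalue in hits:
        if evalue >= max_evalue:
            continue  # hit does not pass the E-value threshold
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a, max_evalue=1e-80):
    """Pairs (a, b) where b is the best hit of a and a is the best hit of b."""
    forward = best_hits(a_vs_b, max_evalue)
    reverse = best_hits(b_vs_a, max_evalue)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Toy, made-up hits (identifiers are hypothetical)
a_vs_b = [
    ("psg_1", "pch_1", 1e-120),
    ("psg_1", "pch_2", 1e-90),
    ("psg_2", "pch_2", 1e-50),  # fails the 1e-80 threshold
]
b_vs_a = [
    ("pch_1", "psg_1", 1e-118),
    ("pch_2", "psg_1", 1e-95),
]

pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(pairs)
```

In this toy input only one pair survives, because the second query has no hit below the threshold and the second subject points back to a different query.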

Expression Analysis

Expression of pairwise-orthologs shared by Ps. georgianus and P. charcoti was

directly compared between tissues from each species using the Trinity pipeline (Haas et

al., 2013). Briefly, reads from each tissue were aligned to the respective assembly using

bowtie v1.1.1, and abundance estimation was carried out with RSEM v1.2.21

(Langmead et al., 2009; Li and Dewey, 2011). Differential expression (DE) analysis was

performed on Trinity components (gene level) with edgeR v3.10.2 to normalize and

detect significant differences between Ps. georgianus and P. charcoti read counts in

each tissue with 2-4 biological replicates per sample (Robinson et al., 2010). Differential

expression was considered statistically significant with an exact test P-value < 1E-05

and an FDR < 0.05. Expression profiles of DE genes were normalized across samples

to transcripts per million (TPM) (Haas et al., 2013). Hierarchical clustering was

performed with Gene-E to group similarly expressed genes and generate expression

profile heat maps (http://www.broadinstitute.org). Clusters were cut at a dendrogram

height of 56%.
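
As an illustration of the normalization and significance filter described above, the following Python sketch computes transcripts per million (TPM) from read counts and transcript lengths and then applies the two thresholds. It is a simplified stand-in for RSEM and edgeR, not a reimplementation of them. It uses raw transcript lengths rather than effective lengths and treats both thresholds as upper bounds, and all names and numbers are hypothetical.

```python
def tpm(counts, lengths):
    """Transcripts per million from read counts and transcript lengths (bp)."""
    rates = [count / length for count, length in zip(counts, lengths)]
    total = sum(rates)
    return [rate / total * 1e6 for rate in rates]


def is_differentially_expressed(p_value, fdr, p_cutoff=1e-5, fdr_cutoff=0.05):
    """Apply both significance criteria: P-value and false discovery rate."""
    return p_value < p_cutoff and fdr < fdr_cutoff


# Toy, made-up counts and lengths for three transcripts
values = tpm(counts=[100, 200, 700], lengths=[1000, 2000, 1000])
print([round(v, 1) for v in values])

# Toy, made-up test statistics
print(is_differentially_expressed(p_value=6.5e-14, fdr=1e-10))
print(is_differentially_expressed(p_value=5.0e-05, fdr=1e-03))
```

Because TPM divides each length-normalized rate by the sum over all transcripts, the values for one sample always sum to one million, which is what makes them comparable across samples.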


Quantitative RT-PCR

Total RNA was purified from flash frozen tissues in TRIzol using the RiboPure kit

(Ambion). RNA was reverse transcribed with a polyT(23) primer using Protoscript II RT-

PCR kit (Invitrogen). Target genes were amplified from cDNA in triplicate by quantitative

PCR (Table 4). Standard curves were generated in QuantStudio v3 (Thermo Fisher

Scientific) to confirm the efficiency of all primers. One or two biological replicates were used

per notothenioid species. Beta-actin expression was used to normalize expression of

target genes for comparison by the ΔΔCt method. Statistical comparisons were carried

out between red- and white-blooded lineages using a Student’s t test (P < 0.05).
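
The ΔΔCt comparison can be sketched in a few lines of Python. The example assumes a primer efficiency of approximately 100%, so that each cycle corresponds to a doubling of product. The Ct values are made up for illustration, and the function name is hypothetical.

```python
def fold_change_ddct(ct_target_sample, ct_ref_sample,
                     ct_target_control, ct_ref_control):
    """Relative expression by the delta-delta-Ct method (2 ** -ddCt)."""
    # Normalize the target gene to the reference gene (e.g. beta-actin)
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    # Compare the sample (e.g. icefish) to the control (e.g. red-blooded species)
    ddct = delta_sample - delta_control
    return 2 ** (-ddct)


# Made-up mean Ct values from triplicate reactions
fold = fold_change_ddct(
    ct_target_sample=26.0, ct_ref_sample=18.0,    # sample: delta Ct = 8
    ct_target_control=24.0, ct_ref_control=18.0,  # control: delta Ct = 6
)
print(fold)  # 0.25, i.e. four-fold lower expression in the sample
```

A ΔΔCt of 2 cycles therefore corresponds to a four-fold reduction in normalized expression.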

Determination and comparison of dN/dS ratios

Sequence analysis was conducted on the set of 7,297 orthologs that were

shared by two red-blooded (P. charcoti, H. antarcticus) and two white-blooded (Ps.

georgianus, N. ionah) notothenioids. First, ratios of nonsynonymous to synonymous

substitution rates (dN/dS) were determined to identify genes that are under different

selection in each ecotype. Coding and peptide sequences were aligned with T-Coffee

v11.00.8 and back-translated with ParaAT (Notredame et al., 2000; Zhang et al.,

2012b). Substitution rates were determined in PAML v4.8 with codeml (Yang, 2007).

This method does not consider gaps in alignments when estimating substitution rates.

Extremely high dN/dS ratios (>10) may be due to high sequence similarity or short

sequence length and were discarded (Mugal et al., 2014). To test for genes with

variable dN/dS ratios between notothenioid lineages, a likelihood ratio test (P < 0.05)


was carried out to compare a 2-ratio and 1-ratio (null) model for each set of alignments.
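
The model comparison can be illustrated as follows. The two-ratio model has one more free ω parameter than the one-ratio null, so twice the difference in log-likelihood is conventionally compared to a chi-square distribution with one degree of freedom. The Python sketch below assumes that convention and uses made-up log-likelihood values. It does not call PAML, and the function names are hypothetical.

```python
import math


def lrt_one_df(lnl_null, lnl_alt):
    """Likelihood ratio test for nested models that differ by one parameter."""
    stat = max(2.0 * (lnl_alt - lnl_null), 0.0)
    # Chi-square survival function for 1 df: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


def usable_omega(omega, max_omega=10.0):
    """Discard extremely high dN/dS estimates, as described in the text."""
    return omega <= max_omega


# Made-up log-likelihoods for one gene alignment
stat, p_value = lrt_one_df(lnl_null=-2050.0, lnl_alt=-2046.5)
print(f"2*dlnL = {stat:.2f}, P = {p_value:.4f}, significant = {p_value < 0.05}")
```

With one degree of freedom the chi-square tail probability reduces to the complementary error function, so the sketch needs only the standard library.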

Prediction of deleterious missense mutations

Sequence alignments were scanned for mutations that may have altered the

function of the encoded protein. PAML was used to identify all nonsynonymous

substitutions that supported the division between the red- and white-blooded lineage

branches. The program Provean was used to predict whether these missense mutations

have a neutral or deleterious effect on protein function (Choi and Chan, 2015). The

program was run on all sequence alignments using a custom shell script. Provean

works under the assumption that substitutions in evolutionarily conserved protein

domains are likely to have deleterious effects. A Provean score < -3 was used as a

threshold to predict deleterious mutations because it had a higher specificity than the

default score (< -2.5) and was shown to accurately predict ~84% of deleterious

mutations. Protein domain diagrams were created with Geneious version R10

(http://www.geneious.com) (Kearse et al., 2012).
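
The effect of the stricter cutoff can be shown with a short Python sketch. The GATA1 and ERCC1 scores are taken from Table 5. The third entry is a hypothetical score added only to show a substitution that passes the default cutoff of -2.5 but not the cutoff of -3 used here. The function name is an assumption and is not part of Provean.

```python
DEFAULT_CUTOFF = -2.5  # Provean default threshold
STRICT_CUTOFF = -3.0   # threshold used in this analysis


def classify(score, cutoff=STRICT_CUTOFF):
    """Label a substitution from its Provean score."""
    return "deleterious" if score < cutoff else "neutral"


scores = {
    ("GATA1", "N319S"): -4.756,         # Table 5
    ("ERCC1", "R414Y"): -3.021,         # Table 5
    ("HYPOTHETICAL", "A100V"): -2.700,  # made-up borderline example
}

for (gene, mutation), score in scores.items():
    print(gene, mutation, classify(score), classify(score, DEFAULT_CUTOFF))
```

The borderline example is called deleterious under the default cutoff but neutral under the stricter one, which is the trade of sensitivity for specificity described above.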

Identification of frameshift mutations

Frameshift mutations were determined by a Blastx search of the icefish coding

sequences (Ps. georgianus, N. ionah) to the translated protein databases for two red-

blooded species (P. charcoti, N. coriiceps) (Shin et al., 2014). As a requisite, the same

mutation(s) in both icefishes could not occur in either red-blooded species. Mutations

were also checked manually by aligning the genes to the reference genome for N.


coriiceps (Shin et al., 2014). Mutated genes were isolated and sequenced from genomic

DNA purified from N. coriiceps and another icefish, C. aceratus.
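
The selection rule for icefish-specific frameshifts (present in both icefishes, absent from both red-blooded species) amounts to a set intersection followed by a set difference. The Python sketch below assumes that frameshift calls have already been extracted from the Blastx alignments as (gene, mutation) pairs. The gene names and mutations are placeholders, not results.

```python
def lineage_specific_frameshifts(icefish_calls, red_blooded_calls):
    """Frameshifts shared by all icefishes and absent from all red-blooded species."""
    shared_in_icefishes = set.intersection(
        *(set(calls) for calls in icefish_calls.values())
    )
    seen_in_red_blooded = set().union(
        *(set(calls) for calls in red_blooded_calls.values())
    )
    return shared_in_icefishes - seen_in_red_blooded


# Placeholder calls (hypothetical genes and mutations)
icefish_calls = {
    "Ps. georgianus": {("geneA", "P10fs"), ("geneB", "L20fs")},
    "N. ionah": {("geneA", "P10fs"), ("geneB", "L20fs"), ("geneC", "G30fs")},
}
red_blooded_calls = {
    "P. charcoti": {("geneB", "L20fs")},
    "N. coriiceps": set(),
}

candidates = lineage_specific_frameshifts(icefish_calls, red_blooded_calls)
print(candidates)
```

Here geneC is dropped because it is found in only one icefish, and geneB is dropped because the same frameshift also occurs in a red-blooded species.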

Cloning of genes and cDNAs from Antarctic fish tissues

Genomic DNA (gDNA) was isolated from flash frozen tissues using the HotShot

protocol (Truett et al., 2000). Target genes were amplified from gDNA and cDNA by

PCR with 1 µM primers (Table 4); the amplification program was 35 cycles of 98°C for

10 s, 57°C for 10 s, and 72°C for 30 s. PCR products were cloned into the pGEM-T

Easy vector (Promega, A1360), plasmids were transformed into 5-α competent cells

(New England Biolabs, C2987H), recombinant plasmids were identified by blue/white

screening and purified with the Wizard Plus SV Miniprep Kit (Promega A1330), and

inserts were sequenced by GeneWiz.

Imaging

Peripheral blood smears were prepared from N. coriiceps and P. charcoti on

glass slides and fixed in 4% paraformaldehyde (PFA) (Yergeau et al., 2005). Cells were

stained with Giemsa according to the manufacturer’s instructions (Sigma Aldrich).

Triton-insoluble erythrocyte membrane skeletons were prepared from flash frozen

peripheral blood samples from N. coriiceps and P. charcoti. Briefly, cells were

resuspended in 1% Triton-X, spread on glass coverslips and fixed with 4% PFA. Blood

smears and triton-shell spreads were imaged with a Nikon Eclipse E800 microscope

using a Photometrics Scientific CoolSNAP EZ camera. Morphological measurements of

cells were made using NIKON NIS-Elements AR 4.20 software. For scanning electron

microscopy, peripheral blood smears were sputter-coated for 5 s with gold and imaged


directly by Scanning Electron Microscopy at the Marine Science Center of Northeastern

University.

Western blotting for acetylated lysine

Total protein was prepared for sodium dodecyl sulfate polyacrylamide gel

electrophoresis (SDS-PAGE) from flash frozen notothenioid tissues by homogenization

in lithium dodecyl sulfate (LDS) Bolt buffer (Life Technologies, B007) and NuPAGE

reducing agent (Life Technologies, NP0009) using a pestle and microcentrifuge tube

(USA Scientific, 1415-5390). Samples were boiled for 3 min and centrifuged at top

speed in an Eppendorf 5417R centrifuge for 2 min. Aliquots (15 µg) were

electrophoresed on a 4-12% SDS polyacrylamide gel, and the separated proteins were

transferred to a polyvinylidene difluoride (PVDF) membrane by use of the iBlot system

(Life Technologies, IB21001). Membranes were blocked in maleic acid blocking buffer

(2% Roche blocking reagent, 2% BSA, 0.2% heat treated goat serum, 0.1% Tween-20)

for 1 h at room temperature and then incubated overnight at 4°C with 1:1000 mouse

anti-acetylated lysine (Santa Cruz Biotechnology, AKL5C1). Membranes were washed

in TBST (0.1 M Tris, 0.1 M NaCl, 0.1% Tween-20) and incubated for 2 h with

horseradish peroxidase (HRP)-conjugated goat anti-mouse IgG (H&L) (Aviva,

OARA04973). Bound antibodies were detected with the Amersham ECL Western

Blotting Analysis System (GE Healthcare, RPN2106) on CL-XPosure film (Thermo

Scientific, 34091).


Figure S1. Box plot of RNA-Seq expression profiles in notothenioid tissues.

Values are normalized to transcripts per million (TPM) and log2 transformed.

Abbreviations: HK, head kidney; T. kidney, trunk kidney; Pectoral, pectoral muscle; W.

muscle, white muscle; Gonads, ovary; Ventricle, heart ventricle.


Figure S2. Heatmap of Pearson’s correlation coefficients after comparison of

RNA-Seq gene expression in P. charcoti and Ps. georgianus tissues. Heat map

color represents the Pearson’s correlation coefficient for total gene expression from

each tissue comparison. (A) Gene expression correlation coefficients cluster by tissue

type between Ps. georgianus and P. charcoti when all genes are assessed. (B) Gene

expression correlation coefficients do not cluster between Ps. georgianus and P. charcoti

for differentially expressed genes in head kidney.


Figure S3. Down-regulation of genes in the icefish head kidney for gene ontology

(GO) groups related to erythropoiesis. GO enrichment was determined using

STRING (Fisher’s exact test, P < 0.05, FDR < 0.01).


Table 1. Hematopoietic genes are differentially expressed in the icefish head kidney

GO:0030097 hemopoiesis 20 1.00E+00 1.76E-02

GeneID     logFC    Pvalue      Human homolog

Up in Icefish
GAS6       4.30     6.55E-14    growth arrest-specific protein 6
NTRK1      3.53     3.63E-06    high affinity nerve growth factor receptor
BAX        2.89     9.88E-08    apoptosis regulator BAX
DOCK1      2.66     1.80E-07    dedicator of cytokinesis protein 1
CBFB       2.02     5.88E-05    core-binding factor subunit beta

Down in Icefish
MYB        -2.03    4.24E-05    transcriptional activator Myb
CBFA2T3    -2.50    1.31E-07    protein CBFA2T3
CASP3      -2.55    9.78E-10    caspase-3
GATA1      -2.60    2.53E-07    GATA-1
CD28       -3.33    3.45E-05    T-cell-specific surface glycoprotein CD28
FECH       -3.37    6.30E-09    ferrochelatase
ALAS2      -4.53    6.72E-14    5-aminolevulinate synthase
KLF2       -4.59    1.05E-20    Krueppel-like factor 2
KLF1       -4.70    4.05E-16    Krueppel-like factor 1
TFRC       -4.88    1.93E-15    transferrin receptor protein 1
GFI1B      -2.02    0.000101    zinc finger protein Gfi-1b

Table 2. GO enrichment of genes under different selective pressures in two red- and

two white-blooded notothenioids

GO Enrichment for genes under Different Selection

GO:0030097 hemopoiesis 11 5.66E-02

GO:0034101 Erythrocyte homeostasis 7 6.49E-05

GO:0042168 Heme metabolic process 4 4.10E-04

GO:0033572 Transferrin transport 3 8.50E-03


Table 3. GO enrichment for genes with deleterious substitutions found in two white-

blooded icefishes but not in two red-blooded notothenioids

GO Enrichment for Deleterious Mutations

GO.0030099 myeloid cell differentiation 31 0.000286

GO.0030097 hemopoiesis 56 0.00165

GO.0006915 apoptotic process 94 0.0074

GO.0055114 oxidation-reduction process 85 0.0231

GO.0007010 cytoskeleton organization 72 0.0284

5220 Chronic myeloid leukemia 11 0.0473

4380 Osteoclast differentiation 19 0.008

Table 4. Table of Primers

Gene Oligo       Sequence (5’ – 3’)          Gene                       Method

Sptb_F           CCAGGCCTTCATGGCTGAG         sptb (Notothenioid)        PCR gDNA
Sptb_R           CGCACCTGGTTCTCCGTC          sptb (Notothenioid)        PCR gDNA
Sptb_R_exon19    GATGCTTCTTGAGCAAGATG        sptb (Notothenioid)        PCR gDNA
Hdac1b_F         GAGGAGGCCTTCTACACCAC        hdac1b (Notothenioid)      qPCR
Hdac1b_R         CGACTCGTCGTCAATACCGT        hdac1b (Notothenioid)      qPCR
Spi1_F           GGATCCAAACCTTGGGGCAC        spi1b (Notothenioid)       qPCR
Spi1_R           GTGGATACACAGGCCGAGG         spi1b (Notothenioid)       qPCR
Gata1a_F         CCACAGCCGAGCGCCTCC          gata1a (Notothenioid)      qPCR
Gata1a_R         GCCCCGTCCAGCAGCTGC          gata1a (Notothenioid)      qPCR
SGK1_F           CTGAAGCCTGAGAACATCC         sgk1 (Notothenioid)        qPCR
SGK1_R           CCATAGAGCATCTCGTAGAG        sgk1 (Notothenioid)        qPCR
PML-L_F          TGACCTGGAGGCCACTGG          pml-like (Notothenioid)    qPCR
PML-L_R          CCTGCAGGTCAGACCCG           pml-like (Notothenioid)    qPCR
P300b_F          CCCGAGAAACGGAAGCTGAT        p300b (Notothenioid)       qPCR
P300b_R          TTTTTCAGCGGCAGGCAAAC        p300b (Notothenioid)       qPCR
ZFP64_F          GCCTTACACTGTGAGGAGG         zfp64 (Notothenioid)       qPCR
ZFP64_R          AACTCCTCATTGTGGGAGG         zfp64 (Notothenioid)       qPCR
ERO1α_F          GCAGGTGCTTCTGTCAG           ero1α (Notothenioid)       qPCR
ERO1α_R          GTTTGGAGAAGAGCTGGTTG        ero1α (Notothenioid)       qPCR
Fam161a_F        TTTAAGGCGAGACCCATG          fam161a (Notothenioid)     qPCR
Fam161a_R        CACCATCTCAATGGAAACC         fam161a (Notothenioid)     qPCR
CD33rSig_F       CTGCTCATTAGAGATTGATGA       cd33rSig (Notothenioid)    qPCR
CD33rSig_R       GAAGGTTATTGTGGAGGTC         cd33rSig (Notothenioid)    qPCR
Bact_F           CAGATCATGTTCGAGACCTTCAAC    beta-actin (Notothenioid)  qPCR
Bact_R           TCACCRGARTCCATGACGATA       beta-actin (Notothenioid)  qPCR


Table 5. Icefishes have predicted deleterious substitutions in targets of human diseases

Gene GO Mutation Provean Domain Associated Diseases Reference

ADD1 MH A246T -3.692 Aldolase II, NH-head domain HS Robledo et al. 2008,Anong et al. 2009

ANXA2 MH Y113F -3.105 Annexin domain AML, B-ALL Olwill et al. 2005

ASXL2 M G1082R -3.637 Proximate to PHD CBF-AML Jean-Baptiste et al. 2014

BCL11A H E14G -3.353 AML, CML Yin et al. 2016

CARD11 H E258G -6.062 GBP_C "guanylate-binding protein” DLBCL Lenz et al. 2008

CASP8 MH C22R -3.965 DED1 domain HNSCC Ando et al. 2013

CDKN1C MH G202E -4.246 BWS Yatsuki et al. 2013

CHD2 H N502I -5.68 SNF2 domain CLL, MAE Rodriguez et al. 2015, Trivisano et al. 2015

EML1 H Y36D -3.204 T-ALL De Keersmaecker et al. 2005

ERCC1 H R414Y -3.021 COFS, ALL Jaspers et al. 2007, Wang et al. 2006

FOXP1 MH V77F -3.126 B-ALL Put et al. 2011

G6PD MH P495N -4.957 G6PD deficiency, hemolytic anemia Beutler et al. ,

GATA1 MHN N319S -4.756 CT-Zn finger Thrombocytopenia, DS-AMKL, DBA Nichols et al. 2000, Crispino 2005

GFI1 MH C217Y -6.278 3rd C2H2 Zn finger AML, CLL, bleeding disorder, SCN Moroy et al. 2015

GFI1B MH E136A -3.387 1st C2H2 Zn finger Macrothrombocytopenia Kitamura et al. 2016

IKZF1 M P145A -4.715 Proximate to Zn finger 1 B-ALL, ALL Kastner et al 2013

NFE2 MH A380V -3.633 Coiled-coil, bZIP DNA binding MPN Jutzi et al. 2013,Shyu et al. 2006

RUNX1 MN L342H -5.353 TAD domain AML, B-ALL, CML Gaidzik et al. 2011

SPI1 MH V56P -3.414 Acidic TAD domain AML Mueller et al. 2002, Lamandin et al. 2002

STAT5B HN P198A -3.563 All-alpha domain Lymphomas Kucuk et al. 2015

TF MH E382A -4.57 C-lobe, disulfide bond Atransferrinemia, Alzheimer's Lee et al. 2001, Giambattistelli et al. 2012

TFRC MH E140V -3.759 ZN-peptidase transferrin receptor Iron deficiency anemia Roetto et al. 2001, Jabara et al. 2016

BLVRB P G145R -3.626 BVR/FR (Flavin reductase) domain Thrombopoiesis WU et al. 2016, O'Brien et al. 2015

COX15 P L235F -3.623 Transmembrane region Leigh syndrome, cardiomyopathy Antonicka et al. 2003, Bugiani et al. 2005

CPOX P P364H -8.838 Coproporphyrinogen III oxidase HCP, harderoporphyria Martasek et al. 1994, Schmitt et al. 2005

HPX P G52E -7.765 Hpx repeat 1 Diabetic macular edema Mehta et al. 2015, Hernandez et al. 2013

NFE2L1 P S301F -4.455 - Cancer, neurodegenerative disease Han et al. 2012, Taniguchi et al. 2017

SLC25A39 P L230F -3.055 1st solcar repeat Anemia, epilepsy Nilsson et al. 2009, Slabbaert et al. 2016

UGT1A1 P R332G -4.98 UDP-glucoronosyltransferase 1-1 Crigler-Najjar (CN), Gilbert (GILBS) Servedio 2005

IKAROS2 G108D -6.144 Zn finger 1 ALL Zhang et al. 2007,Chen et al. 2013

FES R585C -5.823 Catalytic domain of PTK, ATP binding AML Cheng et al. 2001,Sangrar et al. 2005

FLT1 C186R -4.337 CT domain AML, CML Choi et al. 2005, Fragoso et al. 2006

Gene logFC GO Mutation Domain Associated Diseases Reference

PPHLN1 0.57 L265fs Intrahepatic cholangiocarcinoma Sia et al. 2014

PAPPS2 na A43fs Brachyolmia, Prostate cancer Miyake et al. 2012, Ibeawuchi et al. 2015

MATE1 -1.18 F467fs Environmental toxin clearance, CML Chen et al. 2009

ZFP64 0.46 L615fs Amyotrophic lateral sclerosis Schymick et al. 2007

ERO1la -5.06 T37fs Adenocarcinoma Endoh et al. 2004

FAM161Al 3.07 H291fs C-terminus Retinitis pigmentosa 28 Karlstetter et al. 2014, Van Schil et al. 2015

CD33l -1.46 F753fs ITIM domain Alzheimer's, AML (expression) Stefania De Propris et al. 2011

NUMA1 -0.09 E895fs Osteosarcoma, AML Kovac et al. 2015, Strehl et al. 2012

Chapter 2: Divergent Hemogen genes of teleosts and mammals share conserved

roles in erythropoiesis: Analysis using transgenic and mutant zebrafish

Michael J. Peters1, Sandra K. Parker1, Jeffrey Grim1,2, Corey A. H. Allard1,3, Jonah

Levin1,4, H. William Detrich, III1*

1Department of Marine and Environmental Sciences, Northeastern University, Nahant,

MA 01908, USA

2Present address: Department of Biology, The University of Tampa, Tampa, FL 33606,

USA

3Present address: Department of Biochemistry and Cell Biology, Geisel School of

Medicine at Dartmouth College, Hanover, NH 03755, USA

4Present address: Department of Biochemistry, McGill University, Montreal, Quebec

H3G1Y6, CA

Published:

Peters MJ, Parker SK, Grim J, Allard CAH, Levin J, Detrich HW III. 2018. Divergent

Hemogen genes of teleosts and mammals share conserved roles in erythropoiesis:

Analysis using transgenic and mutant zebrafish. Biology Open bio.035576 doi:

10.1242/bio.035576

Summary Statement

Transgenic and mutant zebrafish lines were created to characterize the

expression and functions of Hemogen, a transcription factor involved in the formation of

red blood cells and other processes.

ABSTRACT

Hemogen is a vertebrate transcription factor that performs important functions in

erythropoiesis and testicular development and may contribute to neoplasia. Here we

identify zebrafish Hemogen and show that it is considerably smaller (~22 kDa) than its

human ortholog (~55 kDa), a striking difference that is explained by an underlying

modular structure. We demonstrate that Hemogens are largely composed of 21-25

amino acid repeats, some of which may function as transactivation domains (TADs).

Hemogen expression in embryonic and adult zebrafish is detected in hematopoietic,

renal, neural, and gonadal tissues. Using Tol2- and CRISPR/Cas9-generated

transgenic zebrafish, we show that Hemogen expression is controlled by two Gata1-

dependent regulatory sequences that act alone and together to control spatial and

temporal expression during development. Partial depletion of Hemogen in embryos by

morpholino knock-down reduces the number of erythrocytes in circulation.

CRISPR/Cas9-generated zebrafish lines containing either a frameshift mutation or an

in-frame deletion in a putative, C-terminal TAD display anemia and embryonic tail

defects. This work expands our understanding of Hemogen and provides mutant

zebrafish lines for future study of the mechanism of this important transcription factor.

INTRODUCTION

Hemogen (Hemgn) is a vertebrate transcription factor that is expressed in

mammalian hematopoietic progenitors (Lu et al., 2001; Yang et al., 2001) and has been

implicated in erythroid differentiation and survival (Li et al., 2004). Originally identified in

mice and subsequently described in humans as EDAG (Erythrocyte Differentiation

Associated Gene), Hemogen has also been implicated in testis development in

mammals and chickens (Nakata et al., 2013; Yang et al., 2003), and in osteogenesis in

rats (Kruger et al., 2002; Kruger et al., 2005). Here we analyze the developmental roles

of teleost Hemogen using the zebrafish model system and its powerful suite of reverse-

genetic technologies.

Teleost Hemogen was discovered using a subtractive hybridization screen

designed to isolate novel erythropoietic genes from fish belonging to the largely

Antarctic suborder Notothenioidei (Detrich and Yergeau, 2004; Yergeau et al., 2005).

Sixteen species belonging to the icefish family (Channichthyidae) are unique among

vertebrates because they are white-blooded ‒ they fail to execute the erythroid genetic

program or produce hemoglobin (Cocca et al., 1995a; Near et al., 2006a; Zhao et al.,

1998a). Forty-five candidate erythropoietic cDNAs were recovered using

representational difference analysis (Hubank and Schatz, 1999) applied to kidney

marrow transcriptomes of two notothenioid species, one red-blooded and the other

white-blooded (Detrich and Yergeau, 2004; Yergeau et al., 2005). One of the unknown

genes, clone Rda130, was similar to mammalian Hemogen and was expressed only by

the red-blooded notothenioid.

Although Hemogen is clearly involved in hematopoiesis, its mechanism remains

incompletely understood. In human cell lines, Hemogen activates erythroid gene

transcription in part by recruiting the histone acetyltransferase P300 to acetylate Gata1

(Zheng et al., 2014). Like Gata1, Hemogen protects erythroid cells from apoptosis by

upregulating anti-apoptotic factors (e.g., Nf-κB, Bcl-xL) that are critical for terminal

differentiation (Li et al., 2004; Rhodes et al., 2005; Zhang et al., 2012a).

The regulation of Hemogen expression is of interest because the gene is frequently

overexpressed in patients with a variety of cancers and leukemias (An et al., 2005; Forbes et

al., 2017; Li et al., 2004). This putative oncogene, which is located in a human

chromosomal region (9q22) of leukemia-associated breakpoints, has been linked to

proliferation and survival of leukemic cells and to induction of tumor formation in mice

(Chen et al., 2016; Lu et al., 2002). Thus, somatic mutations in Hemogen or its

regulators may contribute to neoplasia.

The zebrafish is a well-established model organism for studying hematopoiesis in

vertebrates because it produces the same blood lineages as mammals (de Jong and

Zon, 2005; Paffett-Lugassy and Zon, 2005). In zebrafish, erythropoiesis occurs in

sequential waves at unique anatomical locations in embryos and adults that correspond

to analogous sites in mammals (Galloway and Zon, 2003). Many of the molecular

players that orchestrate the erythroid program appear to be conserved between

zebrafish and mammals, but relatively few have been functionally characterized in

zebrafish. Nevertheless, mutant zebrafish models accurately phenocopy human blood

diseases caused by mutations in major erythroid factors, such as Gata1 (Lyons et al.,

2002) and Erythroid beta-spectrin (Liao et al., 2000).

The purpose of this study is to characterize the regulation of Hemogen

expression and the function of the Hemogen protein in zebrafish. We identify the

zebrafish Hemogen ortholog, which, despite being only 40% as large as the human

protein, contains similarly arranged functional motifs. Hemogen is expressed in blood,

testis, ovaries, kidney, and the central nervous system in zebrafish. Two tissue-specific,

alternative Hemogen promoters are associated with conserved noncoding elements

(CNEs) and have distinct regulatory functions in primitive and definitive hematopoiesis

and other processes. By analysis of morphant and mutant zebrafish, we show that

Hemogen is required for normal erythropoiesis and that this role depends in part on a

cluster of acidic residues within a putative, C-terminal transactivation domain (TAD).

Figure 1. Zebrafish Si:dkey-25o16.2 and human Hemogen are orthologous and

encode related proteins that differ in size. (A) Structure of the zebrafish Hemogen-

like gene, Si:dkey-25o16.2. Two conserved noncoding elements (C1 and C2; black

boxes) were identified in a 2-kb segment proximal to the start codon (see Results, Figs.

4-6). Coding exons, white boxes; noncoding exons, gray boxes. Numbers indicate

length in bp. (B) Synteny of loci for zebrafish Si:dkey-25o16.2 on chromosome 1 and

Hemogen on human chromosome 9 (region q22). Transcriptional orientations indicated

by arrows. (C) Alternative splicing of zebrafish Hemogen-like transcripts showing

sequenced regions. Introns are shown as chevrons. Transcripts 1 and 2 differ by

retention of 12 bp of intron (red). (D) Modular structures of zebrafish and human

Hemogen proteins each encoded by four exons (numbered boxes). Locations of

truncating mutations found in some human cancers (Forbes et al., 2017) are indicated

by asterisks. Predicted regions and motifs: green, coiled coil; blue, nuclear localization

signal; red, four residues introduced by alternative splicing; yellow, tandem peptide

repeats; brown, acidic repeat with transactivation domain (TAD) motif; gray, no

prediction. (E) Three-dimensional ab initio models of Hemogens. The ribbon diagram of

the zebrafish protein, color-coded as in panel D, is superimposed on the gray, space-

filling model for the human protein (See Materials and Methods).

RESULTS

Teleosts contain a single Hemogen-like gene that is syntenic with human Hemogen

Chromosomal synteny is an important criterion when assigning gene

relationships across divergent taxa. Despite the whole-genome duplication (WGD) that

coincided with the separation of teleosts from more basal ray-finned fishes and

tetrapods (Postlethwait et al., 2000), the sequenced genomes of nearly all fishes retain

a single Hemogen-like gene. We cloned zebrafish Hemogen-like cDNAs and found that

they corresponded to the predicted gene Si:dkey-25o16.2 on chromosome 1 of the

zebrafish genome (Howe et al., 2013). When we compared the synteny of the putative

teleost and mammalian orthologs, represented in Figure 1B by zebrafish Si:dkey-

25o16.2 (chromosome Dr1) and human Hemogen (chromosome Hs9), we found that

the flanking genes and their transcriptional orientations were conserved, which strongly

supported Si:dkey-25o16.2 as the zebrafish Hemogen ortholog.
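
The synteny criterion applied above can be illustrated with a short sketch. The flanking gene names and strands below are hypothetical placeholders, not the actual neighbors of Si:dkey-25o16.2 or human Hemogen; the comparison simply asks whether gene order and relative transcriptional orientation match between two loci, allowing the whole block to be read from either strand.

```python
# Hypothetical loci: ordered (gene, strand) pairs. The names flankA, flankB,
# and flankC are placeholders, not real annotations.
zebrafish_locus = [("flankA", "+"), ("hemgn", "-"), ("flankB", "+"), ("flankC", "-")]
human_locus = [("flankC", "+"), ("flankB", "-"), ("hemgn", "+"), ("flankA", "-")]

def flip(locus):
    """Return the locus as it would be read from the opposite DNA strand."""
    swap = {"+": "-", "-": "+"}
    return [(gene, swap[strand]) for gene, strand in reversed(locus)]

def is_syntenic(locus_a, locus_b):
    """True if gene order and relative orientation match in either direction."""
    return locus_a == locus_b or locus_a == flip(locus_b)

print(is_syntenic(zebrafish_locus, human_locus))
```

A real comparison would first map each gene to its ortholog group, because orthologs often carry different names in different genome annotations.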

Structure of the zebrafish Hemogen gene

The basic structure of the Hemogen gene of teleosts and mammals was also

found to be highly conserved – four coding exons were separated by three introns (Fig.

1A), and two introns were found in the 5’-UTR. Two transcription start sites were

predicted to occur within 2 kb upstream of the Hemogen start codon in zebrafish (Fig.

1A) ‒ these appear to correspond to the hematopoietic- and testis-specific Hemogen

promoters (noncoding exons 1H and 1T, respectively) described for mammals (Yang et

al., 2003). Alignment of Hemogen genes from 10 teleost species (Yates et al., 2016)

revealed two conserved non-coding elements, CNE1 and CNE2, that overlapped with

zebrafish exons 1T and 1H, respectively (Fig. 1A). We hypothesized that these

elements function individually or together to regulate transcription of Hemogen.

Transcription of the zebrafish Hemogen gene yields multiple mRNA isoforms

We confirmed transcription from both promoters in zebrafish by isolating and

sequencing four splicing variants (Fig. 1C). Three isoforms were transcribed from the

proximal promoter (exon 1H, Fig. 1A,C), each containing the same 5’-untranslated

region (5’-UTR). Alternative splicing of the second coding exon produced transcripts 1

and 2, which differ by four additional codons in the latter (Fig. 1C, red); the shorter

version has not been described in mammals. Transcript 3 retained the entire third intron

(156 bp), which introduced a premature translation-termination codon. A fourth isoform

was transcribed from the distal promoter (1T) located ~1.65 kb upstream of the

translation start codon (Fig. 1A,C). Splicing of exons 1T and 1H to form the 5’-UTR of

transcript 4 made use of canonical donor (AT-GT) and acceptor (AG-TT) splice sites.
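
The retained segments described above (12 bp in transcript 2 and 156 bp in transcript 3) are both multiples of three, so neither shifts the reading frame; the premature termination codon of transcript 3 presumably lies in frame within the retained intron. The sketch below illustrates that reasoning. The toy exon and intron sequences are hypothetical and are not taken from the zebrafish gene.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def preserves_frame(inserted_bp):
    """An insertion keeps the downstream frame if its length is a multiple of 3."""
    return inserted_bp % 3 == 0

def first_in_frame_stop(cds):
    """Return the 0-based codon index of the first in-frame stop codon, or None."""
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3] in STOP_CODONS:
            return i // 3
    return None

# Lengths taken from the text: 12 bp (transcript 2) and 156 bp (transcript 3).
print(preserves_frame(12), 12 // 3)    # frame kept, four extra codons
print(preserves_frame(156), 156 // 3)  # frame kept, 52 retained codons

# Toy sequences (hypothetical): an exon, a retained "intron" carrying an
# in-frame TGA, and a downstream exon.
exon_a, intron, exon_b = "ATGGCTGAA", "GTAAGTTGACAG", "CCAGAATTT"
print(first_in_frame_stop(exon_a + exon_b))           # spliced transcript
print(first_in_frame_stop(exon_a + intron + exon_b))  # intron retained
```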

Teleost and mammalian Hemogen proteins differ markedly in size but share structural

motifs

Teleost Hemogen-like genes encoded shorter proteins (194-289 amino acids)

than the annotated Hemogen genes of mammals (417-827 amino acids), and the

overall amino acid sequence similarity between teleost and mammalian orthologs was

modest (18%-38%). Despite this heterogeneity in length and sequence, Hemogens of

teleost fish and mammals shared predicted structural motifs, as shown in Figure 1D,E

for zebrafish (198 aa, 22 kDa) and human (484 aa, 55 kDa) orthologs, respectively.

Their N-termini (zebrafish residues 1-74, human 1-78) were substantially conserved

(51% sequence similarity; Fig. S1) and contained two predicted coiled-coil (CC) forming

alpha-helices, the second of which was a putative nuclear localization signal (NLS)

(Yang et al., 2001) (Fig. 1D; Fig. S1). By contrast, their C-termini (zebrafish residues 75-

198, human 79-484) were weakly conserved in sequence (13% similarity), but both

were rich in Pro and Glu residues (Figs. S1-S2), consistent with intrinsic disorder of

these regions (Dyson and Wright, 2005). Furthermore, the C-termini shared modular

structures – each was built of several 21-25 amino acid motifs, three in zebrafish and

nine in humans, with distinct but related consensus sequences

(PEXXXIAEXXXXXQEVXPQXXLVP and YSXEXYQEXAEPEDXSPETYQEIPX,

respectively) (Fig. 1D,E, Figs. S1-S2). Thus, the size heterogeneity between zebrafish

and human Hemogens was largely attributable to the number of repetitive segments

contained within each.
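
As a rough illustration of how repeat copies might be counted, the two consensus sequences quoted above can be turned into regular expressions by treating each X as a wildcard. This is only a sketch: the toy protein is hypothetical, and real repeats are degenerate, so exact consensus matching would likely undercount them compared with a profile-based search.

```python
import re

# Consensus sequences quoted in the text; X stands for any residue.
ZEBRAFISH_CONSENSUS = "PEXXXIAEXXXXXQEVXPQXXLVP"
HUMAN_CONSENSUS = "YSXEXYQEXAEPEDXSPETYQEIPX"

def consensus_to_regex(consensus):
    """Compile a consensus string into a regex, treating X as a wildcard."""
    return re.compile(consensus.replace("X", "."))

def count_repeats(protein, consensus):
    """Count non-overlapping exact matches to the consensus in a protein."""
    return len(consensus_to_regex(consensus).findall(protein))

# Toy protein (hypothetical): two copies of a repeat that fits the zebrafish
# consensus, separated by a short linker.
repeat = "PEAAAIAEAAAAAQEVAPQAALVP"
toy_protein = "MSK" + repeat + "GG" + repeat

print(len(ZEBRAFISH_CONSENSUS), len(HUMAN_CONSENSUS))
print(count_repeats(toy_protein, ZEBRAFISH_CONSENSUS))
```

Both consensus strings fall within the 21-25 residue range given for the repeats.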

Within the C-termini of teleost Hemogens, we identified a conserved acidic region

(zebrafish residues 119-169, 35-49% similarity across 10 species) that was similar to an

acidic region of the mouse protein (Yang et al., 2001). Given the transactivation

functions of Hemogen in humans (Zheng et al., 2014), we investigated whether the

zebrafish and human proteins possessed TAD motifs based on the consensus

sequences ϕϕxxϕ or ϕxxϕϕ, where ϕ is a bulky hydrophobic residue (Dyson and Wright,

2016). The acidic C-termini of both Hemogens contained one TAD motif. Four additional

TAD motifs were distributed in other regions of the human protein (Fig. S1).
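
The motif search described above can be sketched with a regular expression. The set of residues treated as bulky hydrophobic (ϕ) is an assumption made for this example, since the text does not enumerate them, and the test peptide is hypothetical. Because five-residue patterns of this kind occur often by chance, hits are best read as candidates rather than as demonstrated TADs.

```python
import re

# Assumption: residues counted as "bulky hydrophobic" (phi) for this sketch.
PHI = "FILMVWY"

# A lookahead lets overlapping motif instances be reported.
TAD_PATTERN = re.compile(
    rf"(?=([{PHI}]{{2}}..[{PHI}]|[{PHI}]..[{PHI}]{{2}}))"
)

def find_tad_motifs(protein):
    """Return (1-based start, 5-residue match) for each phi-phi-x-x-phi
    or phi-x-x-phi-phi hit."""
    return [(m.start() + 1, m.group(1)) for m in TAD_PATTERN.finditer(protein)]

# Toy acidic peptide (hypothetical, not the zebrafish sequence).
print(find_tad_motifs("DEEDLLEDFEDE"))
```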

To assess the three-dimensional conformations of zebrafish and human

Hemogens, although in a static context, we generated ab initio tertiary structural models

with I-TASSER (Yang et al., 2015) using the best of ten predicted templates (Fig. 1E, see

Materials and Methods). The structures for zebrafish and human Hemogens had

template modeling scores (TM-scores) of 0.45 and 0.55, respectively, where a TM-score

> 0.3 indicates a model that differs significantly (P < 0.001) from random structures (Xu and

Zhang, 2010). When the two models were superimposed, amino acid sequences shared

by human and zebrafish Hemogens showed 98% coincidence and a TM-score of 0.71.

The N-termini of the zebrafish and human Hemogens presented exposed CC domains

that may serve as binding sites for Gata1 (Zheng et al., 2014). The “disordered” C-

termini of Hemogens from zebrafish and humans comprised two distinct

elements: proline-rich repeats (yellow) and an acidic, C-terminal repeat containing the

TAD motif (maroon) (Fig. 1E, Fig. S1). The former may coalesce as rigid linkers to

extend the TAD motif to binding partners. These features are common to transcription

factors, as epitomized by the structure of p53 (Wells et al., 2008).
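
For readers unfamiliar with the TM-score, the sketch below evaluates its standard definition for a single superposition: each aligned residue pair contributes 1/(1 + (d/d0)^2), where d0 depends on the length of the reference protein, and the sum is divided by that length. It is meant only to show why the score lies between 0 and 1 and is insensitive to protein size. The input distances are hypothetical, and this is not how the values quoted above were obtained; the scores for individual models are estimates reported by the modeling software.

```python
def tm_score(distances, target_length):
    """TM-score for one superposition.

    distances: distances (in angstroms) between aligned residue pairs.
    target_length: number of residues in the reference protein.
    """
    # Length-dependent scale d0 from the TM-score definition (for L > 15).
    d0 = 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length

# Toy input (hypothetical): a 198-residue reference in which every residue
# is aligned and perfectly superimposed.
print(tm_score([0.0] * 198, 198))
```

The full definition takes the maximum of this quantity over all possible superpositions; unaligned residues contribute nothing, which lowers the score.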

Figure 2. Hemogen expression in zebrafish embryos. (A-H) Wild-type embryos,

WISH. (A) Epiboly at 9 hpf. Hemogen expression was not detected. (B) 10-somite

stage. Hemogen transcripts along the lateral plate mesoderm (LPM). (C) 20 hpf.

Hemogen staining in the intermediate cell mass (ICM) and posterior blood island (PBI).

The inset shows a sense probe control. (D) 33 hpf. Hemogen-positive primitive

erythrocytes of the peripheral blood (PB) exited the Ducts of Cuvier (DC) onto the yolk.

Staining at the midbrain-hindbrain boundary (MHB) was observed. (E) 144 hpf.

Hemogen expression in the caudal hematopoietic tissue (CHT) and pronephric kidney

(PK), and in erythrocytes in the heart (H). The asterisk indicates the plane of the cross

section in panel F. (F) 144 hpf. Cross section of embryo in panel E showing heavily

stained pronephric ducts. (G) 48 hpf. Lateral aspect of tail. Hemogen transcripts in the

CHT and pronephric tubule duct (PD). (H) Kidney touch print from adult fish. Hemogen

expression was observed in proerythroblasts (ProE) and normoblasts (N) but not in

erythrocytes (E). (I) 48 hpf. View of circulating EGFP+ erythrocytes in the dorsal aorta

(DA) of Tg(Lcr:EGFP)cz3325Tg zebrafish after staining for Hemogen protein by indirect

immunofluorescence. Hemogen (red signal) accumulated in nuclei (Nu) of erythrocytes

whereas the cytoplasm (C) was marked by EGFP. Other abbreviations: CV, caudal

vein; DA, dorsal aorta; G, gut; M, myotomes; NC, notochord; PK, pronephric kidney; SB,

swim bladder; SC, spinal cord. Scale bars: (A-F) 250 µm; (G) 1 mm; (H) 100 µm; (I) 50

µm.

Hemogen expression tracks the ontogenetic progression of hematopoiesis in zebrafish

The spatial and temporal patterns of Hemogen expression were evaluated in

zebrafish between 2 and 144 hours post fertilization (hpf) by whole-mount in situ

hybridization (WISH) (Fig. 2A-H). Hemogen transcripts were not apparent prior to

somitogenesis (Fig. 2A) but first appeared at the 10-somite stage in punctate,

intersomitic foci in the lateral plate mesoderm (LPM; Fig. 2B). By 20 hpf, Hemogen was

expressed throughout the intermediate cell mass (ICM) and posterior blood island (PBI)

(Fig. 2C), the sites of primitive hematopoiesis (Bertrand et al., 2007b; Davidson and

Zon, 2004). Primitive erythrocytes expressed Hemogen as they entered circulation at 33

hpf (Fig. 2D).

Definitive hematopoiesis in zebrafish embryos commences in the

aorta-gonad-mesonephros (AGM) region at 30 hpf with the emergence of hematopoietic stem and

progenitor cells (HSPCs) that subsequently seed the caudal hematopoietic tissue (CHT)

and the thymus (Murayama et al., 2006). By 144 hpf, HSPCs migrate from the CHT to

establish a niche associated with the pronephric glomeruli (Bertrand et al., 2008).

Although we did not detect Hemogen mRNA in the AGM (Fig. 2D), we observed strong

expression in cells of the CHT at 48 and 144 hpf (Fig. 2E,G) and in the region of the

pronephric glomeruli at 144 hpf (Fig. 2E,F). In the adult zebrafish kidney, Hemogen was

strongly expressed in progenitor cells in the interstitial hematopoietic stem cell niche

between pronephric tubules (Fig. 2H). Hemogen expression was robust in progenitors

but absent in mature erythrocytes (Fig. 2H), whereas an anti-sense riboprobe for βe1-

globin hybridized exclusively to mature erythrocytes but not to progenitor cells (data not

shown).

Hemogen has been shown to function as a nuclear transcription factor in

mammals (Zheng et al., 2014). To determine whether Hemogen is likely to play

the same role in zebrafish, we examined Tg(Lcr:EGFP)cz3325Tg embryos at 48 hpf by

indirect immunofluorescence microscopy using an antibody specific for Hemogen.

Tg(Lcr:EGFP)cz3325Tg zebrafish have been used to track both primitive and definitive

erythrocytes (Ganis et al., 2012). Figure 2I shows that Hemogen accumulated in the

nuclei (red signal) of GFP-labeled circulating erythrocytes in the dorsal aorta; thus, its

role in transcription is likely to be conserved in zebrafish.

Alternative promoters regulate Hemogen expression in zebrafish hematopoietic and

reproductive tissues

We also detected Hemogen expression in the hindbrain and in the

pronephric tubules of embryonic zebrafish between 30 and 48 hpf (Fig. 3A,B) and in

adult zebrafish reproductive tissues (Fig. 3G-H). The alternative Hemogen promoters

found in zebrafish probably correspond to the hematopoietic and testis-specific

Hemogen promoters of mammals (Yang et al., 2003). To quantify relative levels of

transcription from each promoter in zebrafish (Fig. 3I), we performed qRT-PCR on total

RNA from adult peripheral blood, testis, and ovaries (Fig. 3J) using primer pairs specific

for exons 1H and 1T. Because all of exon 1H was included in transcripts initiated from

exon 1T, one must infer transcription from the proximal promoter by difference.

Transcription from the proximal promoter was greatest in peripheral blood; the presence

of transcripts from this promoter in testis and ovarian tissue may be due to

contaminating blood RNA. The distal promoter was highly active in both peripheral

blood and in testes but not in ovaries.
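
The inference by difference described above can be made concrete with a small calculation. The sketch assumes the common 2^-ΔCt convention with roughly 100% amplification efficiency for both primer pairs, which the text does not state, and the Ct values are hypothetical.

```python
def relative_quantity(ct_target, ct_reference):
    """Target abundance relative to the reference gene (e.g. beta-actin),
    assuming about 100% PCR efficiency."""
    return 2.0 ** -(ct_target - ct_reference)

def proximal_by_difference(ct_1h, ct_1t, ct_actin):
    """Estimate proximal-promoter transcripts as [1H signal] - [1T signal].

    The 1H primers amplify transcripts from both promoters, whereas the 1T
    primers amplify only transcripts initiated at the distal promoter.
    """
    total = relative_quantity(ct_1h, ct_actin)
    distal = relative_quantity(ct_1t, ct_actin)
    return max(total - distal, 0.0), distal

# Hypothetical Ct values for one blood sample (illustration only).
proximal, distal = proximal_by_difference(ct_1h=22.0, ct_1t=24.0, ct_actin=18.0)
print(proximal, distal)
```

Dividing each value by the corresponding ovary value would then express it relative to ovaries, as in Figure 3J. The subtraction is only meaningful if the two primer pairs amplify with similar efficiency.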

Figure 3. Alternative promoters drive Hemogen expression in hematopoietic and

nonhematopoietic tissues in zebrafish. WISH of wild-type embryos (A-B) and adult

tissues (C-H). (A) 48 hpf. Hemogen expression in the pronephric kidney glomeruli (PG),

pronephric tubule duct (PD), caudal hematopoietic tissue (CHT), and brain (Br). (B) 48

hpf. Section showing strong Hemogen expression in the hindbrain (HB) but at low levels

in the midbrain (MB). (C) Dorsal and (D) ventral views of the adult zebrafish brain after

staining for Hemogen transcripts. (E) Schematic drawing of the dorsal view. Hemogen

was highly expressed at the midbrain-hindbrain boundary within the eminentia

granularis (EG), in the crista cerebellaris (CC), and in the hypothalamus (Hy). The

asterisk indicates the plane of the cross section in panel F. (F) Section of the hindbrain

showing Hemogen expression in the periventricular gray zone (PGZ). (G) Hemogen

was expressed by Sertoli cells (Se) between the seminiferous tubules (ST) of the testes.

(H) Hemogen was expressed in early (I-III) but not late (IV) stage oocytes. Transcripts

accumulated around the germinal vesicle (GV). (I) Schematic of the Hemogen

noncoding exons 1T and 1H (gray) upstream of the first coding exon (white); bent

arrows, transcription initiation sites. Arrowheads mark primer binding sites for qPCR

amplification of transcripts initiated from exons 1T or 1H. (J) Expression of transcripts

from alternative promoters determined by qRT-PCR using RNA from blood, testes, and

ovaries of adult TU zebrafish. Expression in three biological replicates was normalized

to β-actin and calculated relative to ovaries. Error bars represent the standard deviation.

Transcription initiated from 1H must be inferred by difference [1H – 1T] because the 1H

primers also amplified 1T transcripts. Other abbreviations: Ce, corpus cerebelli; MO,

medulla oblongata; OB, olfactory bulb; OT, optic tectum; SR, superior raphe; Te,

telencephalon; TS, torus semicircularis. Scale bars = 250 µm (A, B, F-H); 1 mm (C,D).

Hemogen CNEs are predicted targets for transcription factors that regulate

erythropoiesis and spermatogenesis

In teleosts, we identified two evolutionarily conserved non-coding elements,

CNE1 and CNE2, that were tightly associated with exons 1T and 1H, respectively (Fig.

1A, Fig. 4A). These elements may function as core promoters and/or enhancers to

regulate transcription of the different Hemogen isoforms in zebrafish. To identify

potential regulators of Hemogen transcription, we used ConTra v2 (Broos et al., 2011)

to predict transcription factor binding motifs in the aligned Hemogen CNEs from two

mammals and nine teleosts (Yates et al., 2016) (Fig. 4B,C). Each CNE contained

binding motifs for transcription factors involved in erythropoiesis and/or

spermatogenesis.

In zebrafish CNE2, two Gata1 binding sites, located at +59 and +127 bp

relative to the transcription start site, aligned with Gata1 sites known to be

active in the mammalian Hemogen promoter (Fig. 4C) (Yang et al., 2006). Each Gata

motif was paired with a predicted E-box; this motif in Hemogen CNE2 is a known target

of the Ldb1-erythroid-complex recruited by Scl (Soler et al., 2010). CNE2 also contained

binding sites for Klf4, a driver of zebrafish primitive erythropoiesis (Gardiner et al.,

2007), for Myb, a regulator of zebrafish definitive hematopoiesis (Soza-Ried et al.,

2010), and for HoxB4, a regulator of Hemogen expression in mammalian hematopoietic

stem cells (Jiang et al., 2010).
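
A simplified version of this kind of motif scan is sketched below. It is not the ConTra procedure, which scores position weight matrices across an alignment; here the textbook consensus strings WGATAR (searched on both strands) and CANNTG stand in for the Gata and E-box motifs, and positions are reported as 0-based offsets from an assumed transcription start site. The sequence is a hypothetical toy example, not zebrafish CNE2.

```python
import re

# Assumption: simple consensus strings stand in for position weight matrices.
GATA_MOTIF = re.compile(r"(?=([AT]GATA[AG]|[CT]TATC[AT]))")  # WGATAR, either strand
EBOX_MOTIF = re.compile(r"(?=(CA..TG))")                     # CANNTG

def scan(sequence, tss_index):
    """Return (motif name, offset from the TSS, matched sequence) tuples."""
    hits = []
    for name, pattern in (("GATA", GATA_MOTIF), ("E-box", EBOX_MOTIF)):
        for m in pattern.finditer(sequence.upper()):
            hits.append((name, m.start() - tss_index, m.group(1)))
    return sorted(hits, key=lambda hit: hit[1])

# Toy sequence (hypothetical); the TSS is assumed to sit at index 4.
toy = "CCGGACTTCAGCTGCCAGATAAGG"
for hit in scan(toy, tss_index=4):
    print(hit)
```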

The distal CNE1 of teleosts possessed a similar suite of transcription factor

binding motifs in roughly the same arrangement as the proximal CNE but with the

notable addition of binding sites for Sox9 and the Androgen receptor (Fig. 4B), both of

which play roles in zebrafish spermatogenesis (Hossain et al., 2008; Rodriguez-Mari et

al., 2005). CNE1, like CNE2, contained pairs of E-box and Gata motifs downstream of

the zebrafish transcription start site (+15 and +48 bp, respectively). CNE1 may function

as an enhancer for the Hemogen gene and/or act as the core promoter for exon 1T.

Figure 4. Conserved elements in the zebrafish Hemogen promoter are predicted

targets for transcription factors. (A) Schematic of the zebrafish Hemogen gene.

CNEs, black; coding exons, white; noncoding exons, gray; transcription initiation sites,

bent arrows. Numbers indicate length in bp. (B, C) Sequence alignments of CNE1 and

CNE2, respectively, from 9 teleost species, mice, and humans. ConTra software (Broos

et al., 2011) predicted transcription factor binding sites for the Androgen receptor (light

green), Brca1 (cyan), Foxl2 (pink), Gata1 (dark blue), Gfi1 (orange), HoxB4 (sky blue),

Hnf1a (dark green), Klf4 (yellow), Myb (dark gray), P300 (red), Sox9 (purple),

Scl/Lmo2/Ldb1 complex (light gray). Splice donor sites are highlighted black. Species

abbreviations: Dr, Danio rerio; Ca, Cynoglossus semilaevis; Gm, Gadus morhua; Ga,

Gasterosteus aculeatus; Ol, Oryzias latipes; Xm, Xiphophorus maculatus; On,

Oreochromis niloticus; Tr, Takifugu rubripes; Tn, Tetraodon nigroviridis; Mm, Mus

musculus; Hs, Homo sapiens.

Hematopoietic and neural expression of Hemogen in zebrafish is dependent on Gata1

binding to the promoter CNEs

In mammals, transcription of Hemogen from the proximal promoter is tightly

regulated by Gata1 in hematopoietic cells (Yang et al., 2006). To investigate whether

Gata1 regulates Hemogen in zebrafish, we analyzed a Gata1 ChIP-seq dataset that

was generated to assess Gata1 activity in adult zebrafish erythrocytes (Yang et al.,

2016). Figure 5A shows that Gata1 bound to CNE1 and CNE2 at sites overlapping their

Gata motifs (red lines), which indicates strongly that Gata1 is required for transcription

of Hemogen in zebrafish. Corroboration that CNE1 and CNE2 were active chromatin

regions was provided by ATAC-seq and DNase I hypersensitive site analysis (Yang et

al., 2016) (Fig. 5A). Our data reveal that Gata motifs in CNE1, like those in CNE2, are

important regulators of Hemogen expression in zebrafish erythrocytes.

We performed WISH to compare the expression of Hemogen and Embryonic

beta-globin (βe1-globin) in embryos produced by the Gata1-null mutant, vlad tepes

(vltm651) (Lyons et al., 2002). At 33 hpf, Hemogen was expressed normally in circulating

blood cells and in the hindbrain of wild-type siblings (Fig. 5B), and βe1-globin was

abundant in the blood (Fig. 5B, inset). Homozygous vltm651 mutant siblings, by contrast,

failed to express Hemogen in the blood and brain (Fig. 5C). This result mimicked the

loss of βe1-globin in vltm651 mutants, with the exception that βe1-globin expression

persisted in the PBI (Fig. 5C, inset), as has been demonstrated for α1-globin, Scl and

Gata1 (Jin et al., 2009).

Figure 5. Gata1 binds distal and proximal promoter elements to regulate

Hemogen expression in zebrafish. (A) Gata1 ChIP-sequencing showing enriched

binding of Gata1 at CNE1 and CNE2 (C1 and C2, red lines) in the Hemogen promoter

in adult zebrafish red blood cells (Yang et al., 2016). DNase-sequencing and ATAC-

sequencing showing colocalization of the active chromatin regions (Yang et al., 2016).

(B) Hemogen expression by WISH of wild-type (n = 16/21) and (C) homozygous mutant

(n = 5/21) siblings (33 hpf) from in-crossed Gata1+/- vltm651 mutants. Insets show βe1-

globin expression in mutant (n = 4/10) and wild-type (n = 6/10) siblings. Scale bar = 250

µm (C).
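The sibling counts given in the caption can be compared with the 3:1 ratio expected when a recessive allele segregates in a heterozygous in-cross. The sketch below is illustrative only: this goodness-of-fit test is not an analysis reported in the chapter, and the function name is hypothetical.

```python
import math

def chi_square_gof(observed, expected_ratios):
    """Chi-square goodness-of-fit statistic and p-value for two classes (df = 1)."""
    total = sum(observed)
    ratio_sum = sum(expected_ratios)
    expected = [total * r / ratio_sum for r in expected_ratios]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # With one degree of freedom, the chi-square survival function
    # reduces to erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Counts from the Fig. 5 caption: 16 wild-type-like and 5 mutant siblings of 21.
stat, p = chi_square_gof([16, 5], [3, 1])
print(f"chi-square = {stat:.3f}, p = {p:.2f}")
```

A large p-value here would mean only that the observed counts are compatible with Mendelian segregation; with 21 embryos the test has little power to detect modest deviations.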

Tg(Hemgn:mCherry) zebrafish reveal the functions of the two Hemogen promoters

To determine the tissue-specific regulatory profiles of the two Hemogen

promoters, we generated transgenic zebrafish embryos

[Tg(Hemgn:mCherry,myl7:EGFP)] in which the mCherry reporter was controlled by the

putative promoter elements (Fig. 6). The dual promoter, P1 (2,248 bp), spanned the

upstream, non-coding region to the Hemogen start codon and contained both CNEs.

Transgenic fish were outcrossed to wild-type TU zebrafish, and offspring with the

strongest mCherry expression were selected as founders. In the early embryo, the P1

transgene drove expression of mCherry in primitive blood cells of the ICM and the PBI

(20 hpf; Fig. 6B) and in primitive erythrocytes in circulation (Movie 1). Between 2 and 8 dpf,

mCherry was expressed strongly throughout the pronephric ducts (Fig. 6C) and was

present in the proximal convoluted tubule at 72 hpf (Fig. 6D). In adult transgenic fish,

the head and trunk kidneys were positive for the reporter (Fig. 6H), as were Sertoli cells

surrounding the seminiferous tubules of the testes (Fig. 6I). Therefore, the ~2.2 kb P1

transgene contained all of the regulatory elements necessary to recapitulate Hemogen

expression (Fig. 6B-I). We note that the dual promoter did not confer detectable ovarian

or neural expression, which may require more distal sequences.

We found that the same expression profile was driven by the endogenous

Hemogen promoter in embryonic zebrafish by using CRISPR/Cas9 technology to insert

the mCherry gene (containing a polyadenylation motif) two codons downstream of, and

in frame with, the Hemogen start codon (see Methods; Fig. S3A,C). Homology-directed

integration of the transgene, confirmed by sequencing of the locus, produced mCherry+

cells in the CHT and in the kidney in 10% (n = 15/150) of embryos at 3 dpf (Fig. S3B)

and at a lower frequency in circulating RBC (n = 3/150, data not shown).

To characterize hematopoietic cell lineages that express Hemogen, the P1

reporter plasmid was injected into embryos of Tg(CD41:EGFP)Ia2Tg or

Tg(Lcr:EGFP)cz3325Tg zebrafish, which have been used to track hematopoietic

progenitors (Lin et al., 2005) and primitive and definitive erythrocytes (Ganis et al.,

2012), respectively. We did not observe mCherry expression in the AGM, in the thymus,

or in CD41+ HSPCs colonizing the thymus or pronephros (Bertrand et al., 2008).

However, the reporter was strongly expressed in a subset of LCR+ erythroid and

CD41+ myeloid-biased progenitors in the CHT (Fig. 6E,F), a tissue that supports

myelopoiesis (Gekas and Graf, 2013; Medvinsky et al., 2011). This lends support to

previous findings that Hemogen is a marker and promoter of myeloerythroid, but not

lymphoid, lineages (Li et al., 2007; Lu et al., 2001). Maturing mCherry+ primitive

progenitors peaked in brightness just prior to leaving the caudal plexus and entering

circulation at 72 hpf (observed by time-lapse imaging; data not shown). However,

mature definitive erythrocytes expressed little mCherry in adult transgenics (Fig. 6G),

which supports prior observations that Hemogen expression is limited to primitive

erythrocytes and immature definitive progenitors (Lu et al., 2001).

Figure 6. Promoter elements have distinct roles in driving hematopoietic, renal,

and testicular expression of Hemogen in transgenic Tg(Hemgn:mCherry)

zebrafish. (A) Schematic of the zebrafish Hemogen gene. CNEs, black; coding exons,

white; transcription initiation sites, bent arrows. Three Tg(Hemgn:mCherry,myl7:EGFP)

transgenes driven by portions of the Hemogen promoter were introduced into one-cell

TU embryos by Tol2 transposase-mediated insertion. Numbers indicate length of

promoter elements and arrows show gene direction. (B) 20 hpf. P1 transgene

expression in the peripheral blood island (PBI). (C) 72 hpf. P1 transgene expression in

the pronephric ducts (PD). (D) 5 dpf. P1 transgene expression in the proximal

convoluted tubule (PCT). (E,F) 72 hpf. Colocalization of mCherry and EGFP in

progenitors in the CHT of Tg(Hemgn-P1:mCherry,Lcr:GFP) or Tg(Hemgn-

P1:mCherry,CD41:EGFP) zebrafish. (G) Transgene expression in mature erythrocytes

from adult zebrafish. (H) Transgene expression in adult head kidney (HK), trunk kidney

(TK), and tail kidney (T) near the EGFP+ heart (H). (I) Transgene expression in adult

Sertoli cells (Se) that surround the seminiferous tubules (ST). (J) Proportion of embryos

expressing transgenes P1, P2, or P3 in ICM, kidney, CHT, and circulating primitive

erythrocytes (RBC). Scale bars = 100 µm (B,D-F,I); 500 µm (C,H); 25 µm (G).

Hemogen promoters have different functions in primitive and definitive erythropoiesis in

zebrafish

We evaluated the separate and combined contributions of the two Hemogen

promoters, including CNE1 or CNE2, to the observed tissue-expression profiles by

injecting wild-type embryos with one of three Tg(Hemgn:mCherry,myl7:EGFP) reporter

constructs in which mCherry expression was driven: 1) by the dual promoter (P1); 2) by

a 2-kb fragment (P2) containing the distal promoter including CNE1; or 3) by a 188-bp

fragment (P3) containing the proximal promoter including CNE2 (Fig. 6A). Transgenic

embryos were screened for EGFP+ hearts, and mCherry transcription was confirmed by

RT-PCR and sequencing.

mCherry fluorescence was examined in four cell types: 1) erythroid progenitors in

the ICM at 1 dpf; 2) primitive erythrocytes in the peripheral blood at 3 dpf; 3) erythroid

progenitors in the CHT at 3 dpf; and 4) renal cells of the kidney tubules at 3 dpf. Fig. 6J

shows that the dual promoter (P1) supported strong expression of the mCherry reporter

in erythroid cells of the ICM and peripheral blood (RBC), in the CHT, and in renal cells

of the kidney. By contrast, the distal promoter (P2 construct) containing CNE1 failed to

drive reporter expression in these tissues. Finally, the proximal promoter (P3 construct)

containing CNE2 alone produced strong expression of the reporter in the CHT and in

kidney cells but was not active in cells of the ICM and peripheral blood. Together, these

results indicate that the proximal promoter containing CNE2 is necessary and sufficient

to drive expression in definitive hematopoiesis and in the kidney, whereas the full 2.2-kb

sequence including both promoters and CNEs is required in primitive erythropoiesis.
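The logic of this three-construct comparison can be written out as a small lookup table. The sketch below is a simplification under stated assumptions: it reduces the proportions plotted in Fig. 6J to the on/off calls made in the text, and the dictionary and function names are hypothetical.

```python
# Reporter activity as described in the text for Fig. 6J
# (True = strong mCherry expression reported, False = no expression reported).
ACTIVITY = {
    "P1": {"ICM": True, "RBC": True, "CHT": True, "kidney": True},
    "P2": {"ICM": False, "RBC": False, "CHT": False, "kidney": False},
    "P3": {"ICM": False, "RBC": False, "CHT": True, "kidney": True},
}

def classify_tissue(tissue):
    """Name the smallest construct that drives the reporter in a tissue."""
    if ACTIVITY["P3"][tissue]:
        return "proximal promoter (CNE2) sufficient"
    if ACTIVITY["P2"][tissue]:
        return "distal promoter (CNE1) sufficient"
    if ACTIVITY["P1"][tissue]:
        return "both promoters required"
    return "not driven by any construct"

summary = {tissue: classify_tissue(tissue) for tissue in ACTIVITY["P1"]}
for tissue, call in summary.items():
    print(f"{tissue}: {call}")
```

Read this way, the CHT and kidney fall into the first class and the ICM and circulating primitive erythrocytes into the third, which matches the conclusion drawn in the paragraph above.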

Figure 7. Morpholino targeting of Hemogen inhibits erythropoiesis in embryonic

zebrafish. Embryos were injected with 2-4 ng antisense MO targeted to the first 25

coding nucleotides of Hemogen. (A-B) O-dianisidine staining of erythrocytes was

decreased in morphants (MO) relative to wild-type embryos (WT) or embryos rescued

with 500 pg synthetic Hemgn mRNA (zHem) at 24 hpf. (ANOVA, Tukey post hoc test, P

< 0.001). Live wild-type (C), Hem1 MO-injected (D), and Hem1mm mismatch MO-

injected (E) Tg(Lcr:EGFP)cz3325Tg embryos at 20 hpf. Morphants showed decreased

EGFP expression in the ICM compared to the wild-type and mismatch MO controls. Live

wild-type (F), Hem1 MO-injected (G), and Hem1mm MO-injected (H) embryos at 72 hpf.

Morphant embryos have fewer EGFP+ cells in circulation compared to the two controls.

The dorsal aortas of embryos (insets above F-H) were magnified 20x to permit

quantitation of EGFP+ erythrocytes. Background red (D,G) and green (E,H)

fluorescence was generated by the fluorescent labels on the MOs. (I) In vivo flow

quantitation of EGFP+ erythrocyte concentrations between 3-6 dpf in Hem1-injected (n

= 9,7,7,7), Hem1mm-injected (n = 13,14,11,11), and uninjected (n = 5,10,10,9)

embryos. Data shown as means ± s.e.m. (* P ≤ 0.05, ** P ≤ 0.001, ANOVA, Tukey-

Kramer post hoc test). Scale bars = 500 µm (A-F); 100 µm (inset).

Morpholino knock-down of Hemogen protein expression partially disrupts erythropoiesis

in zebrafish

To perturb Hemogen function in zebrafish, we first injected wild-type zebrafish

embryos at the one-cell stage with an antisense morpholino oligonucleotide (MO), Hem1,

targeted to the translation start codon of the Hemogen transcript. MO

treatment significantly reduced Hemogen protein levels by 19% at 33 hpf (Student’s t-

test, P < 0.05; Fig. S4A,B) and steady-state levels of βe1-globin mRNA at 3 dpf

(Student’s t-test, P < 0.05; Fig. S4C). At 24 hpf, 61% of morphants were anemic

compared to 35% of uninjected zebrafish (Fig. 7A,B). Red cell levels were restored to

wild-type by co-injection of the MO with 500 pg of synthetic zebrafish Hemogen mRNA

containing silent mutations in the MO target site. Both the uninjected and rescue

treatments differed significantly from the MO treatment (ANOVA, Tukey post hoc test, P

< 0.001; Fig. 7A).

We used Tg(Lcr:EGFP) cz3325Tg zebrafish to visualize the red blood cell population

in Hem1-treated morphants from 0-6 dpf. Control embryos were injected with a 5-bp

mismatch MO (Hem1mm) or were uninjected. At 20 hpf, EGFP+ erythrocytes appeared

to be reduced in the ICM/PBI of 75% of Hem1 morphants (n = 56) but not in mismatch

or uninjected control embryos (n = 14 and 63, respectively) (Fig. 7C-E). At 2 dpf,

morphant embryos had few erythrocytes in circulation compared to controls (Fig. 7F-H,

Movie 2). Using quantitative in vivo flow analysis (Fig. 7I), we found that morphant

embryos at 3-5 dpf had fewer than 50% as many circulating EGFP+ erythrocytes as the

uninjected and Hem1mm-injected controls, whereas the controls did not differ

statistically from each other (ANOVA, Tukey-Kramer post hoc test, P < 0.05).

A conserved C-terminal domain in Hemogen is required for hematopoiesis and prevents

apoptosis in embryonic tissues

The function of the putative C-terminal transactivation domain of zebrafish

Hemogen was investigated using CRISPR/Cas9 mutagenesis. We generated zebrafish

lines with mutations in the conserved region near the end of the third coding exon of

Hemogen, immediately downstream of the TAD motif (Fig. 8A-D, Fig. S1). Founders

(F0) were out-crossed to wild-type TU zebrafish and mutant alleles were genotyped in

the F1 generation by high resolution melting analysis and by sequencing the locus (Fig.

8E, Fig. S1). One line, Hemgnnuz2, had a 5-bp deletion (Δ5) that produced a frameshift

mutation, thereby introducing a premature stop codon (Fig. 8E, Fig. S1). PolyA-tailed

transcripts of the Δ5 allele were detected at equivalent steady-state levels relative to the

wild-type allele in peripheral blood from individual adult heterozygotes (Fig. 8F).

Western blot analysis revealed, however, that truncated Hemogen protein was almost

undetectable in peripheral blood from single heterozygous adults (data not shown) and

in pooled 33-hpf embryos from a heterozygous in-cross (Fig. 8G). Therefore, if the

truncated Hemgnnuz2 transcripts were translated, then the protein must have been

rapidly degraded. The second line, Hemgnnuz4, contained an in-frame 12-bp deletion

(Δ12), which deleted an acidic cluster (EEED) in the last repeat that is conserved in

teleost species (Fig. S1). Hemogen protein was detected in the blood of homozygous

Δ12 adults by Western Blot (data not shown).
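The different consequences of the two deletions follow from their lengths relative to the three-base codon. The sketch below illustrates that arithmetic only; the function name is hypothetical, and it does not model where the premature stop codon falls in the Δ5 allele.

```python
def classify_deletion(deleted_bp):
    """Classify a coding-sequence deletion by its effect on the reading frame.

    Returns (kind, net_codons_removed); net_codons_removed is None for a frameshift.
    """
    if deleted_bp <= 0:
        raise ValueError("deleted_bp must be a positive number of base pairs")
    if deleted_bp % 3 == 0:
        # A multiple of three removes whole codons and preserves the frame.
        return "in-frame", deleted_bp // 3
    # Any other length shifts the downstream reading frame.
    return "frameshift", None

for allele, size in [("nuz2 (Δ5)", 5), ("nuz4 (Δ12)", 12)]:
    kind, codons = classify_deletion(size)
    detail = f", net loss of {codons} residues" if codons is not None else ""
    print(f"{allele}: {kind}{detail}")
```

The four-residue loss computed for the 12-bp deletion agrees with the removal of the EEED cluster described above.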

To evaluate the effects of the mutant Hemogen alleles on erythropoiesis during

development, we examined embryos from mutant crosses by microscopy and

genotyped them between 20 and 48 hpf (Fig. 8A-C). Mutant genotypes were recovered

near the expected Mendelian ratios (Fig. S5A), but homozygous Δ5 Hemgnnuz2 mutants

could not be raised to adulthood. To classify the mutants, we assessed the relative

numbers of blood cells and relative concentrations of hemoglobin beginning at 2 dpf

(Ransom et al., 1996). Embryos from a heterozygous in-cross were scored for

hypochromic blood (paler blood) and decreased numbers of circulating cells on the yolk

sac and in the vasculature. Erythrocyte levels were reduced to about 25-75% of normal

levels in frameshift Hemgnnuz2/+ mutants (n = 8) at 24 hpf compared to wild-type siblings

(n = 7) (Fig. 8C). At 48 hpf, 59% of heterozygous (n = 49) and 50% of homozygous (n =

12) Hemgnnuz2 mutants had reduced numbers of circulating erythrocytes (Fig. 8H, Movie

3) and homozygotes could be distinguished by their more severe anemia. Comparable

numbers of anemic individuals were observed for heterozygotes and homozygotes of

the Δ12 Hemgnnuz4 allele – 64% (n = 25) and 60% (n = 10), respectively (Fig. 8H). In all

cases, the proportion of anemic mutant embryos was significantly different from that for

wild-type (* P ≤ 0.05, ** P ≤ 0.005, Chi square).

Erythrocyte levels were partially suppressed in adult heterozygotes.

Hemgnnuz2/+ and Hemgnnuz4/+ adults gave average erythrocyte counts of 2.2 ± 1.0 × 10^6

cells µl^-1 and 2.1 ± 0.8 × 10^6 cells µl^-1, respectively, whereas wild-type zebrafish had 4.3

± 1.0 × 10^6 cells µl^-1 (Fig. 8J,K). Homozygous Δ12 Hemgnnuz4 adults gave average erythrocyte

counts of 2.2 ± 1.2 × 10^6 cells µl^-1 (Fig. 8K). Taken together, the erythroid defects of

embryonic and adult zebrafish carrying the CRISPR-generated mutant alleles support

the conclusion that the conserved C-terminus of Hemogen functions as a TAD, but the

mechanism of action of these mutations remains to be determined.

Both the Δ5 and Δ12 mutant Hemogen alleles also caused mild to severe

developmental defects in the notochord and the trunk of heterozygotes and

homozygotes (Fig. 8A-B, Fig. S5B). Embryos had kinked notochords and exhibited

increased cellular refractility consistent with apoptotic cell death. Elevated apoptotic cell

death was apparent in Hemgnnuz2/+ mutants as detected by staining with acridine orange

(Fig. S5C). Apoptosis occurred throughout the embryo, including sites of embryonic

hematopoiesis. Nevertheless, viable heterozygotes for both alleles could be raised to

adulthood; they were slightly smaller than wild-type siblings (Fig. 8I). Impaired growth

was significant in homozygous Δ12 Hemgnnuz4 adult mutants (Student’s t test, P = 0.04,

N = 3, Fig. S5D,E).

Figure 8. CRISPR/Cas9 mutagenesis of the third exon of zebrafish Hemogen

reduces primitive and definitive erythropoiesis. Embryos were injected with Cas9

mRNA and a guide RNA to establish lines with mutations in exon three of zebrafish

Hemogen. (A) 20 hpf. Representative wild-type and mutant siblings with notochord

defects (arrow). (B) 48 hpf. Mutant Δ12 embryos with an in-frame deletion showing

kinked notochords (arrow). (C) 24 hpf. Wild-type and Δ5/+ mutant embryos stained with

diaminofluorene. Production of erythrocytes was reduced in heterozygotes. (D)

Schematic of CRISPR/Cas9 target in the third exon (red arrowhead) of zebrafish

Hemogen. (E) Sequences of founder mutations aligned at the CRISPR target site: Δ5

(Hemgnnuz2); Δ12 (Hemgnnuz4). The sequence traces show the Δ5 and Δ12 mutant

alleles. PAM, blue and underlined; Δ, deletions (highlighted in red). (F) Relative

expression of wild-type and Δ5 transcripts in blood from single adult, heterozygous

Hemgnnuz2/+ mutants determined by qRT-PCR with allele specific primers. Three

biological replicates were normalized to β-actin. Error bars represent the standard

deviation. (G) Western blot of Hemogen in pooled 33 hpf wild-type embryos or pooled

embryos from a Δ5 Hemgnnuz2/+ heterozygous in-cross. We calculated that the protein

would run 6.5 kDa above its molecular weight at 28.5 kDa because of its high acidic

composition (Guan et al., 2015). Arrows show the calculated sizes of wild-type and

truncated alleles. (H) Proportion of genotyped mutants and wild-type sibling embryos at

2 dpf that were anemic (black) or phenotypically normal (white) (* P ≤ 0.05, ** P ≤ 0.005,

Chi square). (I) Wild-type and mutant zebrafish heterozygous for the Δ5 and Δ12

alleles. (J) Red blood cells from adult Hemgnnuz2/+ mutant zebrafish and wild-type

siblings. (K) Erythrocyte counts in adult heterozygous Hemgnnuz2 (Δ5, n = 12),

heterozygous Hemgnnuz4 (Δ12, n = 4) mutants, homozygous Hemgnnuz4 (Δ12, n = 2)

mutants, and wild-type (n = 9) siblings (* P ≤ 0.05, ANOVA, Tukey post hoc test). Scale

bars = 500 µm (A-C); 50 mm (I); 20 µm (J).

DISCUSSION

The zebrafish is a compelling model for understanding the pleiotropic functions of

Hemogen in the context of vertebrate development. Our results show that zebrafish

Hemogen is considerably smaller than its human ortholog, a distinction true for teleost

and mammalian Hemogens in general. Hemogen is expressed in multiple zebrafish

tissues from the early embryo to the adult under the control of at least two promoters.

Both primitive and definitive erythropoiesis are affected by depletion of Hemogen and by

targeted mutation of a putative, C-terminal TAD. The transgenic and mutant zebrafish

lines that we have generated will contribute to a mechanistic understanding of this

important transcription factor.

Hemogen ‒ small or large, it’s built of related modules and has a conserved role in

erythropoiesis

We show that the divergent Hemogens of zebrafish and human are largely, but

not entirely, built of 21-25 residue repeats; the number of repeats largely determines

protein size. The repeat consensus sequences are distinct, but they appear to have

evolved from an 8-10 amino acid core motif (Fig. S2). Although all repeats are acidic

(Fig. S2), the terminal repeat of each Hemogen is particularly so (> 38% Asp and Glu

for zebrafish, > 29% for human), and these repeats contain TAD motifs. Together, these

features suggest that Hemogens possess flexible, intrinsically disordered TADs, as is

true of many transcription factors (e.g., p53, HIF-1α, and NF-κB). The multivalent

structure of Hemogen provides opportunities for cooperative binding to single or multiple

protein partners, including P300 (Zheng et al., 2014).

Hemogen interacts with a variety of proteins to stimulate the transcription of

genes involved in terminal erythroid differentiation and other processes. In humans,

Hemogen contributes to transcription of erythroid genes in part by recruiting P300 to

acetylate and activate Gata1 (Zheng et al., 2014). Our results show that frameshift (Δ5)

and in-frame deletion (Δ12) alleles of Hemogen adjacent to the zebrafish TAD motif cause

significant reductions of erythrocyte levels in embryos and adults. The Δ12 allele may

be hypomorphic, but we have not determined whether the protein that is expressed has

reduced activity.

Hemogen – targeted mutation of the acidic C-terminus impairs erythropoiesis, but not

completely

Our CRISPR-generated zebrafish mutant lines show that frameshift (Δ5) and

in-frame deletion (Δ12) alleles of Hemogen caused a decrease in erythrocyte levels in embryos

and adults. However, these phenotypes were incompletely penetrant ‒ in both

heterozygous and homozygous Hemogen mutants the proportion of anemic embryos

was 50-65%, compared to 20% for wild-types. If Hemogen were essential for

erythropoiesis, one would anticipate an erythroid-null phenotype for homozygous

mutants, as observed for the Gata1 mutant, vlad tepesm651 (Lyons et al., 2002). Rather,

the Hemogen phenotype resembles the variable reduction of red cells in zebrafish

zinfandel (zinte207) mutants that harbor a mutation in a regulatory region at the globin

locus (Ransom et al., 1996), a known target of both Hemogen and Gata1 transcription

factors (Zheng et al., 2014). Loss of Hemogen in zebrafish contributes to decreased

expression of Embryonic beta-globin (Fig. S4), which may explain the hypochromic

state of Hemogen mutants.

The most plausible explanation for the incomplete penetrance of anemia in

Hemogen mutants is the phenomenon of genetic compensation, which may occur when

genes are knocked out as opposed to knocked down (El-Brolosy and Stainier, 2017;

Rossi et al., 2015). Although the mechanisms are poorly understood, genetic

compensation entails changes in gene expression (e.g., upregulation of paralogous

genes or functionally related genes) that at least partially offset the phenotype caused

by the mutant protein. Compensation through elevated expression of other erythroid co-

activators is an attractive possibility that might maintain erythrocyte production in

Hemogen mutants. The functional loss of Hemogen could be mitigated by Gata1

homodimerization and/or by direct recruitment of CBP/P300, both of which enhance

Gata1 activity (Ferreira et al., 2005; Nishikawa et al., 2003).

Similar design and regulation of Hemogen and Gata1 genes

Comparison of the expression of Hemogen and of Gata1 throughout zebrafish

development reveals a remarkable degree of overlap in tissue and cellular specificity.

For example, Gata1 mRNA appears in cells of the LPM at the two-somite stage (Detrich

et al., 1995), immediately prior to the onset of Hemogen expression at ten somites.

Furthermore, Hemogen and Gata1 are co-expressed in primitive erythrocytes and

definitive hematopoietic progenitors (Ferreira et al., 2005; Lu et al., 2001), in Sertoli

cells (Nakata et al., 2013; Wakabayashi et al., 2003), and at the midbrain-hindbrain

boundary (Volkmann et al., 2008). Interestingly, both Hemogen and Gata1 genes

possess hematopoietic- and testis-specific promoters (Wakabayashi et al., 2003). The

temporal and spatial co-incidence of Hemogen and Gata1 expression almost certainly

results from their similar regulatory architectures and also through regulatory crosstalk.

Our results and studies conducted by others (Ding et al., 2010; Yang et al., 2006; Zheng

et al., 2014) indicate that reciprocal transcriptional activation of Hemogen and Gata1

may form a positive feedback loop that drives erythropoiesis.

Strikingly, the two CNEs of Hemogen are organized like, and have the same

functions as, the distal and proximal enhancers of the Gata1 gene (McDevitt et al.,

1997; Onodera et al., 1997; Suzuki et al., 2009). The proximal Gata1 promoter functions

exclusively in definitive erythropoiesis (McDevitt et al., 1997), as does CNE2 of

zebrafish Hemogen. In contrast, transcription of Gata1 in primitive erythrocytes requires

both the proximal promoter and a distal enhancer comparable to Hemogen CNE1

(McDevitt et al., 1997). Fig. S6 presents a model for the transition from primitive to

definitive hematopoiesis based on chromatin looping at the Hemogen locus. We propose

that the transition from primitive to definitive erythropoiesis involves a switch from a loop

conformation to a linear conformation, mediated by the Gata1/Ldb1-complex at

erythroid transcription factories (Osborne et al., 2004; Schoenfelder et al., 2010). This

model may also apply to the Gata1 enhancer, which is another known target of the

Ldb1-complex (Love et al., 2014). The zebrafish lines produced in this study may help

clarify the cell-specific Hemogen expression profile driven by different Gata1-containing

complexes and the functions of Hemogen in different cell types.

MATERIALS AND METHODS

Fish husbandry

Wild-type (SAT, AB, TU) zebrafish (Danio rerio), the transgenic lines

Tg(Lcr:EGFP)cz3325Tg (Ganis et al., 2012) and Tg(CD41:EGFP)Ia2Tg (Traver et al., 2003),

and the mutant vlad tepesm651 (Lyons et al., 2002) were all generously provided by Dr.

Leonard I. Zon (Howard Hughes Medical Institute and Harvard Medical School, Boston).

Animal procedures were carried out in full accordance with established standards set

forth in the Guide for the Care and Use of Laboratory Animals (8th Edition). The animal

care and use protocol for live zebrafish embryos was reviewed and approved by

Northeastern University’s Institutional Animal Care and Use Committee (Protocol No.

15-0207R). The animal care and use program at Northeastern University has been

continuously accredited by AAALAC Int. since July 22, 1987, and maintains the Public

Health Service Policy Assurance number A3155-01.

Cloning and sequence analysis of zebrafish Hemogen cDNAs

Total RNA was isolated from wild-type AB zebrafish embryos and adult tissues

(kidney, blood, brain, ovary, intestine) using TRI reagent (Sigma, T9424) and the

Ribopure Kit (Ambion, AM1924). Total cDNA was produced from mRNA using M-MuLV

reverse transcriptase (NEB, M0253S) and an oligo(dT)23 primer. Hemogen cDNA was

amplified by PCR from total cDNA with 1 µM primers (Table S1) – the amplification

program was 35 cycles of 98°C for 10 s, 57°C for 10 s, and 72°C for 30 s. PCR

products were cloned into the pGEM-T Easy vector (Promega, A1360), plasmids were

transformed into 5-α competent cells (New England Biolabs, C2987H), recombinant

plasmids were identified by blue/white screening and purified with the Wizard Plus SV

Miniprep Kit (Promega, A1330), and inserts were sequenced by GeneWiz.

Bioinformatic comparison of vertebrate Hemogen genes and Hemogen proteins

We utilized the murine gene nomenclature for comparing orthologs from different

vertebrate species. We used Blast+ (Altschul et al., 1990) to identify Hemogen in the

zebrafish genome (assembly GRCz11) (Howe et al., 2013). Chromosomal synteny

comparisons were performed using the Synteny Database with a sliding window of 200

genes (Catchen et al., 2009) and Ensembl Genomes v74 (Kersey et al., 2016).

Hemogen promoter alignments were obtained from whole genome alignments for 10

teleost species (ENSEMBL v74) (Yates et al., 2016). Transcription factor binding motifs

were predicted using the program ConTra with the default similarity matrix of 0.75

(Broos et al., 2011). Transcription start sites were predicted using NNPP v2.2 with a

score cutoff of 0.98 (Reese, 2001).

Protein domains in zebrafish were identified using annotated human Hemogen

(Yang et al., 2001), or they were predicted using HHpred (Soding et al., 2005) and the

Conserved Domain Database (CDD) (Marchler-Bauer et al., 2015). Peptide repeats

were predicted with RADAR (Heger and Holm, 2000). The 9aaTAD Prediction Tool was

first used to predict transactivation domain (TAD) motifs, starting with low stringency

DFx repeats (Piskacek et al., 2016). These were then culled by ϕϕxxϕ or ϕxxϕϕ criteria,

where ϕ is a bulky hydrophobic residue (Dyson and Wright, 2016). We refer to the latter

five amino acid consensus sequences as “TAD motifs,” in contrast to larger, functionally

defined “transactivation domains” (TADs). Ab initio tertiary structure models were

created for zebrafish and human Hemogen proteins with I-Tasser (Yang et al., 2015)

based on the X-ray structure for the secretory component of Immunoglobulin A

(PDB:3chnS), which was the best of ten predicted structural templates determined by

LOMETS (Wu and Zhang, 2007). The 3D models were superimposed using TM-align

(Zhang and Skolnick, 2005) and Geneious version R10 (Kearse et al., 2012).
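
The culling of candidate TAD motifs by the ϕϕxxϕ or ϕxxϕϕ criteria described above can be illustrated with a short pattern scan. This is only a sketch, not the 9aaTAD Prediction Tool or the procedure actually used: the residue set treated as "bulky hydrophobic" (L, I, V, M, F, W, Y), the function name, and the example peptide are all assumptions made for illustration.

```python
import re

# Assumption: "bulky hydrophobic" is taken to mean L, I, V, M, F, W or Y.
# The source does not list the residues, so adjust this set as needed.
BULKY_HYDROPHOBIC = "LIVMFWY"


def find_tad_motifs(protein: str) -> list[tuple[int, str]]:
    """Return (1-based start, 5-mer) for each window matching ϕϕxxϕ or ϕxxϕϕ."""
    phi = f"[{BULKY_HYDROPHOBIC}]"
    # The lookahead reports overlapping five-residue windows as well.
    pattern = re.compile(f"(?=({phi}{phi}..{phi}|{phi}..{phi}{phi}))")
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(protein.upper())]


# Hypothetical peptide, not a real Hemogen sequence.
hits = find_tad_motifs("AEDLLSEFPQDWEELAK")
print(hits)
```

A scan like this only flags five-residue windows that satisfy the consensus; it says nothing about whether a window functions as a transactivation domain.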

MO knock-down of Hemogen in zebrafish and rescue of the morphant phenotype

The antisense MO Hem1 (5’-TCTCTTTCTCCAACGGGTCTTCCAT-3’), which

targets the first 25 base pairs of the zebrafish Hemogen open reading frame, was

designed according to the manufacturer’s instructions (Gene Tools, LLC). The control

MO (Hem1mm; 5’-TCTgTTTgTCCAtCGGcTCTTCgAT-3’) targets the same sequence

but contains five mismatched bases (lowercase) to prevent efficient binding to Hemogen mRNA.

MOs were labeled with lissamine or fluorescein so that the quality of injections could be

monitored by fluorescence microscopy. MOs were injected (2-8 ng) into embryos at the

single-cell stage using a PLI-100 Picoinjector (Medical Systems Corporation, 65-0001)

and a micromanipulator (Narishige, MN-151). Injected embryos were sampled from 0-6

dpf for subsequent analyses.
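
As a quick consistency check on the two MO sequences given above, the sketch below reverse-complements Hem1 to recover the sense-strand sequence it would pair with and lists the positions at which the mismatch control differs. The helper names are illustrative and are not part of any Gene Tools software.

```python
HEM1 = "TCTCTTTCTCCAACGGGTCTTCCAT"      # antisense MO, as given in the text
HEM1_MM = "TCTgTTTgTCCAtCGGcTCTTCgAT"   # mismatch control, as given in the text


def reverse_complement(seq: str) -> str:
    """Reverse-complement a DNA sequence (returned in upper case)."""
    complement = str.maketrans("ACGT", "TGCA")
    return seq.upper().translate(complement)[::-1]


def mismatch_positions(a: str, b: str) -> list[int]:
    """1-based positions at which two equal-length sequences differ (case-insensitive)."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [i for i, (x, y) in enumerate(zip(a.upper(), b.upper()), start=1) if x != y]


target_sense = reverse_complement(HEM1)          # sense strand the MO would pair with
mismatches = mismatch_positions(HEM1, HEM1_MM)   # positions altered in the control
print(target_sense, mismatches)
```

The recovered sense sequence begins with ATG, consistent with a MO that covers the start of the open reading frame, and the control differs from Hem1 at five positions.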

Rescue of the morphant phenotype was tested by co-injection of the Hem1 MO

with 500 pg synthetic zebrafish Hemogen mRNA transcribed from a zebrafish Hemogen

cDNA cloned into pGEM-T Easy (Promega). Primers (Table S1) introduced five silent

mutations within the MO target site. The clone was digested with SpeI, and mRNA was

transcribed, capped, and polyadenylated in vitro using the mMessage T7 kit (Ambion,

AM1340) and the Poly(A) Tailing Kit (Ambion, AM1350). mRNA was purified with the

MEGAclear kit (Ambion).

In situ hybridization

The spatial and temporal patterns of expression of selected genes were analyzed

by whole-mount in situ hybridization (WISH) of zebrafish embryos following standard

protocols (Jacobs et al., 2011). These methods were adapted to evaluate Hemogen

expression in tissues, peripheral blood smears, and pronephric kidney prints prepared

from euthanized adult fish [200 mg L⁻¹ tricaine methanesulfonate (MS222; Sigma-Aldrich,

886862)] (Detrich and Yergeau, 2004; Gupta and Mullins, 2010). For sectioning,

embryos and tissues were embedded in a solution containing 0.25 g gelatin, 30 g

albumin, 22 g sucrose, 2.5% glutaraldehyde (v/v) per 100 ml phosphate buffered saline

(PBS). Sections were cut with a vibrating blade microtome (Leica, VT1000S).

Digoxigenin-labeled antisense and sense RNA probes were transcribed from zebrafish

cDNA clones using the DIG RNA Labeling Kit (Roche Diagnostics, 11175025910).

Indirect Immunofluorescence

Zebrafish embryos were fixed in 4% paraformaldehyde (PFA) at 48 hpf. Embryos

were incubated with 1:1000 rabbit anti-Hemogen primary antibody (Aviva,

ARP57794_P050) followed by 1:1000 goat anti-rabbit IgG Alexafluor 488 secondary

antibody (Life Technologies, A11034) as previously described (Westerfield, 2000). The

specificity of the Hemogen antibody was validated both by Clontech and by our

laboratory by Western blotting of zebrafish protein extracts.

Hemoglobin staining

To detect red blood cells in circulation, embryos were stained with o-dianisidine (Iuchi

and Yamamoto, 1983) or diaminofluorene (McGuckin et al., 2003).

Western blotting

Total embryonic protein was prepared for sodium dodecyl sulfate polyacrylamide

gel electrophoresis (SDS-PAGE) from dechorionated, 33-hpf embryos (n = 80) by

homogenization in lithium dodecyl sulfate (LDS) Bolt buffer (Life Technologies, B007)

and NuPAGE reducing agent (Life Technologies, NP0009) using a pestle and

microcentrifuge tube (USA Scientific, 1415-5390). Samples were boiled for 3 min and

centrifuged at maximum speed for 2 min. Aliquots (15 µg) were electrophoresed

on a 4-12% SDS polyacrylamide gel, and the separated proteins were transferred to a

polyvinylidene difluoride (PVDF) membrane with the iBlot system (Life Technologies,

IB21001). Membranes were blocked in maleic acid blocking buffer (2% Roche blocking

reagent, 2% BSA, 0.2% heat treated goat serum, 0.1% Tween-20) for 1 hour at room

temperature and then incubated overnight at 4°C with 1:1000 rabbit anti-Hemogen

(Aviva, ARP57794_P050) or with 1:1000 mouse anti-GAPDH (Aviva, OAE00006)

antibodies. Membranes were washed in tris-buffered saline and Tween 20 (TBST) and

incubated for 2 h with horseradish peroxidase (HRP)-conjugated goat anti-rabbit IgG

(H&L) (Aviva, ASP00001) or HRP-conjugated goat anti-mouse IgG (H&L) (Aviva,

OARA04973), respectively. Bound antibodies were detected with the Amersham ECL

Western Blotting Analysis System (GE Healthcare, RPN2106) on CL-XPosure film

(Thermo Scientific, 34091).

Tol2 generation of Tg(Hemgn:mCherry) zebrafish

To identify the regulatory elements that drive Hemogen expression in zebrafish,

three different Tg(Hemgn:mCherry) reporter plasmids were created using Gateway

Cloning Technology (Invitrogen, 11791020) (Hartley et al., 2000). First, the proximal

Hemogen promoter (~2.2 kb) was amplified from wild-type SAT zebrafish using 1 µM

primers (Table S1). The promoter sequence spanned the upstream, non-coding region

before, but not including, the Hemogen translation start codon. The promoter was

cloned between KpnI/SpeI restriction sites in the p5e-MCS vector (Tol2kit, #228) using

the Tol2kit vector system (Kwan et al., 2007) to generate the entry clone, p5e-Hemgn-1.

The resulting plasmid was digested with NaeI/KpnI or NaeI/SpeI to remove each of two

conserved non-coding elements (CNE1 or CNE2) from the promoter. Each new

construct was blunt-ended with Q5 Hot Start High-Fidelity 2x Master Mix (NEB) and

religated with T4 DNA Ligase (NEB) to create p5e-Hemgn-2 and p5e-Hemgn-3. Each of

the three entry clones was cloned upstream of the mCherry gene within the

pDestTol2CG2 destination vector (Tol2kit, #395). The pCS2FA-transposase clone

(Tol2kit, #396) was digested with PmeI, and Tol2 transposase mRNA was transcribed,

capped, and polyadenylated in vitro using the mMessage SP6 kit (Ambion, AM1340)

and the Poly(A) Tailing Kit (Ambion, AM1350). mRNA was purified by precipitation using

2.5 M LiCl. Transposase mRNA (37 ng µL⁻¹) and each of the

Tg(Hemgn:mCherry,myl7:EGFP) expression clones (25 ng µL⁻¹) were co-injected into

one-cell wild-type zebrafish embryos. Founders were raised and out-crossed to wild-

type TU zebrafish for two generations.

CRISPR/Cas9 generation of transgenic and mutant zebrafish

Optimal targets for CRISPR-Cas9 mutagenesis were identified within the first and

third exons of zebrafish Hemogen using the program CHOPCHOP (Labun et al., 2016;

Montague et al., 2014). The templates for multiple small guide RNAs were produced by

a cloning-free method as previously described (Table S1) (Hruscha et al., 2013; Talbot

and Amacher, 2014). Guide RNAs were transcribed with the T7 MaxiScript Kit (Ambion,

AM1312) and purified by LiCl precipitation.
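
The oligonucleotides listed in Table S1 suggest how the cloning-free templates fit together: each gene-specific oligo shares a 5′ stretch that includes T7 promoter sequence and a 3′ stretch that is complementary to the end of the common "sgRNA" oligo. The sketch below checks that relationship for the exon 1a oligo. The sequences are copied from Table S1, but the roles assigned to each part and all function names are assumptions made for illustration.

```python
# Sequences copied from Table S1; the roles assigned to each part are assumptions.
SHARED_5P = "GAAATTAATACGACTCACTATA"   # 5' bases common to the gene-specific oligos
SHARED_3P = "GTTTTAGAGCTAGAAATAGC"     # 3' bases common to the gene-specific oligos
COMMON_OLIGO = "AAAAGCACCGACTCGGTGCCACTTTTTCAAGTTGATAACGGACTAGCCTTATTTTAACTTGCTATTTCTAGCTCTAAAAC"
EX1A_OLIGO = "GAAATTAATACGACTCACTATAGGTGGGATCTCTTTCTCCAAGTTTTAGAGCTAGAAATAGC"


def reverse_complement(seq: str) -> str:
    """Reverse-complement a DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def guide_insert(oligo: str) -> str:
    """Return the bases lying between the shared 5' and 3' stretches."""
    if not (oligo.startswith(SHARED_5P) and oligo.endswith(SHARED_3P)):
        raise ValueError("oligo lacks the expected shared 5' or 3' sequence")
    return oligo[len(SHARED_5P):len(oligo) - len(SHARED_3P)]


# The 3' end of the gene-specific oligo should pair with the 3' end of the common oligo.
anneals = COMMON_OLIGO.endswith(reverse_complement(SHARED_3P))
print(guide_insert(EX1A_OLIGO), anneals)
```

Whether the two leading G residues of the extracted insert belong to the genomic target or were added for T7 transcription is not stated in the text, so the sketch reports them together.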

A donor construct for homology directed repair was created containing the

mCherry gene and polyadenylation signal flanked by 199 bp and 253 bp homology arms

that were PCR amplified from the sequence surrounding exon 1 of Hemogen from wild-

type AB zebrafish (Table S1). The homology arms and mCherry gene were PCR

amplified with primers that added AvrII and ClaI restriction sites, ligated, and cloned into

the pGEM-T Easy vector (Promega). Tg(Lcr:EGFP)cz3325Tg embryos were co-injected at

the single-cell stage with EcoRI-linearized donor plasmid (25 ng µL⁻¹), two exon 1-

targeting guide RNAs (150 ng µL⁻¹), and Cas9 mRNA (300 ng µL⁻¹) (Trilink). Embryos

were checked for fluorescence between 1 and 3 dpf. To confirm integration, the locus

was PCR amplified with internal and external primers (Table S1) and cloned into the

pGEM-T Easy vector for sequencing.

Wild-type (TU) embryos were co-injected with a guide RNA (150 ng µL⁻¹) targeting

exon 3, Cas9 mRNA (300 ng µL⁻¹), and mCherry mRNA (30 ng µL⁻¹) to identify successful

injections. Embryos were raised and adults were tail-clipped for haplotyping by high-

resolution melting analysis (HRMA) as previously described (Talbot and Amacher,

2014). PCR amplification was run using 1 µM primers (Table S1) with PowerUp SYBR

MasterMix (Applied Biosystems, A25742) on a QuantStudio 3 Real-time PCR system

(ThermoFisher, A28137). Founder mutants were outcrossed to wild-type (TU) fish. The

offspring were raised and mutations were characterized by HRMA and sequencing of

the locus.

Imaging of zebrafish embryos

Fixed embryos were mounted in 80% glycerol and imaged with a dissecting

microscope (Nikon, SMZ-U) and a CCD digital camera (Diagnostic Instruments,

SPOT32). Live embryos were embedded in 0.1% agarose in embryo medium (EB) with

0.01% tricaine and imaged with an epifluorescence-equipped microscope (Nikon,

Eclipse E800). Movies (0.01 sec interval) and time-lapse images (1 min interval) were

obtained using a Photometrics Scientific CoolSNAP EZ camera and NIKON NIS-

Elements AR 4.20 software. Methods for in vivo flow analyses were adapted to quantify

fluorescently labeled red blood cells in MO-injected Tg(Lcr:EGFP)cz3325Tg zebrafish

(Schwerte et al., 2003; Zeng et al., 2012). Briefly, 100-frame videos were recorded with a

500 μs exposure time and no delay. The field of view (20x) was centered on the dorsal

aorta adjacent to the cloaca. The summed maximum intensity images of all frames were

used to create “casts” of the dorsal aorta and the average volume was calculated

assuming cylindrical vasculature. EGFP+ cells were converted to binary objects (6.66

µm diameter, contrast 180) and counted within the region of interest.
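
The cell-density calculation implied by the cylindrical-vessel assumption can be sketched as follows. The function names, units, and example numbers are hypothetical and are not values from this study; the actual measurements were made in NIS-Elements.

```python
import math


def cylinder_volume_nl(diameter_um: float, length_um: float) -> float:
    """Volume of a cylindrical vessel segment in nanolitres (1 nL = 1e6 µm^3)."""
    radius_um = diameter_um / 2.0
    return math.pi * radius_um ** 2 * length_um / 1e6


def cells_per_nl(cell_count: int, diameter_um: float, length_um: float) -> float:
    """Counted cells divided by the volume of the imaged vessel segment."""
    return cell_count / cylinder_volume_nl(diameter_um, length_um)


# Hypothetical example: 10 EGFP+ cells in a 400 µm segment of a 20 µm wide vessel.
density = cells_per_nl(10, diameter_um=20.0, length_um=400.0)
print(round(density, 1))
```

Normalizing counts to vessel volume in this way allows embryos with different aorta diameters to be compared on the same scale.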

qRT-PCR

RNA was purified from adult zebrafish tissues or 10-30 pooled embryos at 3 or 4

dpf in TRI Reagent (Sigma-Aldrich, T9424) using the PureLink RNA purification Kit (Ambion).

DNase-treated RNA was reverse transcribed with an oligo(dT)23 primer using the Protoscript II

RT-PCR kit (New England Biolabs, M0368S). Target genes were amplified in triplicate

from cDNA by qRT-PCR with 1 µM primers (Table S1). Standard curves were

generated to confirm primer efficiencies. Target gene expression was normalized to

beta-actin for comparison by the ΔΔCt method. Three or four biological replicates were

used for each treatment for statistical comparisons.
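
For readers unfamiliar with the ΔΔCt calculation, a minimal sketch follows. It uses the common 2^-ΔΔCt formulation, which assumes amplification efficiencies close to 100% for both the target and the reference gene; the function name and the Ct values are hypothetical and are not data from this study.

```python
from statistics import mean


def fold_change_ddct(target_treated, reference_treated, target_control, reference_control):
    """Relative expression by 2^-ΔΔCt from lists of technical-replicate Ct values."""
    # ΔCt: target Ct minus reference-gene Ct within each sample.
    d_ct_treated = mean(target_treated) - mean(reference_treated)
    d_ct_control = mean(target_control) - mean(reference_control)
    # ΔΔCt: treated ΔCt minus control ΔCt.
    dd_ct = d_ct_treated - d_ct_control
    return 2.0 ** (-dd_ct)


# Hypothetical Ct triplicates (target gene vs. a reference gene such as beta-actin).
fold = fold_change_ddct(
    target_treated=[26.0, 26.0, 26.0],
    reference_treated=[18.0, 18.0, 18.0],
    target_control=[24.0, 24.0, 24.0],
    reference_control=[18.0, 18.0, 18.0],
)
print(fold)
```

In this example the target gene is detected two cycles later in the treated sample after normalization, which corresponds to a four-fold reduction in relative expression.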

Statistical analyses

Data were analyzed as means ± s.e.m. or means ± s.d. as noted. Statistical tests

applied to the results are provided with each experiment. Differences with a p-value ≤

0.05 were considered significant.

GenBank accession numbers

Zebrafish Hemgn isoform 1, JZ970258; zebrafish Hemgn isoform 2, JZ970260;

zebrafish Hemgn isoform 3, JZ970259; and zebrafish Hemgn isoform 4, JZ970257.

Zebrafish ZFIN IDs

Transgenic construct Tg(hemgn:mCherry,myl7:EGFP), ZDB-TGCONSTRCT-170726-1;

zebrafish line nuz1Tg, ZDB-ALT-170726-1; zebrafish line hemgnnuz2, ZDB-ALT-170726-2;

zebrafish line hemgnnuz3, ZDB-ALT-170726-3; zebrafish line hemgnnuz4, ZDB-ALT-170726-4.

Acknowledgements

We thank Dr. Leonard Zon and Christian Lawrence at Children’s Hospital in Boston for

providing zebrafish and plasmids. We thank Dr. John Postlethwait, Dr. Leonard Zon,

and Christopher Wells for helpful discussion. We thank Dr. Johanna Farkas and Carly

Ching for their technical contributions. We thank Dr. Leonard Zon, Dr. Yi Zhou, and

colleagues at Boston Children's Hospital, Stem Cell and Regenerative Biology

Department, Harvard Medical School and Harvard University for providing ATAC-seq,

ChIP-seq, and DNase I-seq datasets.

Funding

This research was supported by a Graduate Research Grant from the College of

Science and the Office of the Vice Provost of Graduate Studies at Northeastern

University awarded to MJP and by NSF grants PLR-1247510 and PLR-1444167

awarded to HWD. This is contribution number 380 from the Northeastern University

Marine Science Center.

Authors’ Contributions

MJP designed and carried out experiments, created expression plasmids, did in vivo

flow analysis, generated zebrafish lines, and analyzed and interpreted results. SKP

created expression plasmids and contributed to MO-knockdown and WISH experiments

and analyses. JL constructed expression plasmids, helped create transgenic zebrafish,

and did immunofluorescence microscopy. JG and CAHA designed and carried out MO

experiments and rescues. MJP and CAHA isolated transcripts. HWD conceived the

study and participated in its design and interpretation. MJP and HWD drafted and

revised the manuscript. All authors reviewed the manuscript.

Author Disclosure Statement

No competing financial interests exist.

Supplemental Figure 1. Alignment of the amino acid sequences of wild-type and

mutant Hemogens in zebrafish with the orthologous proteins from other

vertebrate species. Transactivation domain (TAD) motifs are boxed and were identified

in human and zebrafish Hemogens by the criteria ϕϕxxϕ or ϕxxϕϕ, where ϕ is a bulky hydrophobic

residue. Alleles are shown for Hemgnnuz2 (Δ5) and Hemgnnuz4 (Δ12) mutant zebrafish

lines. Predicted motifs: green, coiled coil; blue, nuclear localization signal; maroon, four

residues introduced by alternative splicing; yellow, tandem peptide repeats; box, TAD

motif; purple, TAD motif conserved in teleosts; bold italic, acidic region; red,

frameshifted residues; red dashes, deletion. Species abbreviations: H. sapiens, Homo

sapiens; M. musculus, Mus musculus; G. aculeatus, Gasterosteus aculeatus; G. gallus,

Gallus gallus; C. milii, Callorhinchus milii; D. rerio, Danio rerio.

Supplemental Figure 2. Analysis of peptide repeats. (A) Alignment of predicted

tandem peptide repeats from zebrafish and human Hemogens. Conserved residues are

shaded black. Each predicted repeat in human Hemogen can be divided into two more

repeats. Conserved regions of peptide repeats between zebrafish and human

Hemogens are boxed. Repeats are most similar within species but repeats 1 and 3 are

similar between human and zebrafish Hemogens (marked with asterisks). (B) Amino

acid composition of zebrafish Hemogen. The repeat region is enriched for glutamic acid

and proline. The acidic C-terminal repeat is enriched for glutamic acid and aspartic acid.

Supplemental Figure 3. CRISPR/Cas9-mediated replacement of zebrafish

Hemogen with the mCherry transgene recapitulates endogenous Hemogen

expression in zebrafish. (A) Schematic showing insertion of the mCherry transgene at

the CRISPR target site within exon 1 of zebrafish Hemogen. Integration of the mCherry

transgene was confirmed by sequencing the locus with internal and external primers

(arrows; Table S1). (B) 3 dpf. Representative image of the tail segment. mCherry+

mutant cells were present in the CHT and the pronephric duct (PD) and at a low

frequency in circulation in the dorsal aorta (DA, dashed outline) (n = 15 embryos). (C)

Sequence across the insertion, showing part of the Hemogen promoter (blue), the first 7

codons of the mCherry transgene (red), and a linker sequence (black). Scale Bar = 100

µm (B).

Supplemental Figure 4. (A) Representative Western blot of Hemogen from pooled

morphants (MO) or wild-type (WT) embryos at 33 hpf. (B) Average Hemogen protein

expression from three experiments. GAPDH served as the internal control. (*, P ≤ 0.05,

Student’s t test). (C) Relative βe1-globin expression in pooled morphant or wild-type

embryos at 3 dpf as determined by qRT-PCR. Four samples of 10 pooled embryos were

amplified per treatment. Signals were normalized to β-actin and shown relative to wild-

type. Error bars represent the standard deviation (*, P ≤ 0.05, Student’s t test).

Supplemental Figure 5. Hemogen mutant zebrafish have increased cell death

during embryonic development. (A) Genotypic ratios of 2 dpf embryos produced from

heterozygous incrosses of Hemgnnuz3 (Δ5) or Hemgnnuz4 (Δ12) mutants. (B) Proportion

of genotyped mutants and wild-type sibling embryos at 2 dpf that were apoptotic (black)

or phenotypically normal (white) (* P ≤ 0.05, ** P ≤ 0.005, Chi square). (C) Acridine

orange staining for apoptotic cells is increased in the bodies and in the peripheral blood

island (outlined) in 20 hpf heterozygous Hemgnnuz2 mutant zebrafish (n = 3) compared

to wild-type siblings (n = 3). (D) Comparison of adult wild-type and homozygous

Hemgnnuz4 (Δ12) mutant. (E) Average body length of adult wild-type and Hemgnnuz4

(Δ12) mutants. Error bars represent standard error (*, P ≤ 0.05, Student’s t test). Scale

bars = 100 µm (E), 50 mm (D)

Supplemental Figure 6. Proposed models for regulation of Hemogen expression

by promoter elements. (A) Linear, two-promoter model. (B) Chromatin looping model.

Movie 1. Circulating erythrocytes in Tol2-generated transgenic Tg(Hemgn-

1:mCherry,myl7:EGFP) zebrafish at 2 dpf. 4x magnification.

Movie 2. Comparison of circulating EGFP+ erythrocytes in the dorsal aorta of Hemogen

morphant and wild-type Tg(Lcr:EGFP)cz3325Tg zebrafish embryos at 3 dpf. The dorsal

aorta is highlighted, and EGFP+ erythrocytes are marked with a dot. 20x magnification.

Movie 3. Comparison of circulating erythrocytes in Hemgnnuz2/+ zebrafish embryos (Δ5

frameshift), Hemgnnuz4/+ embryos (Δ12 deletion) and wild-type siblings at 24 hpf. 10x

magnification.

Table S1. Sequences of primer and guide oligonucleotides used in experiments

Oligo             Sequence (5’ – 3’)

Method: RT-PCR of Hemgn
hemgn F1          CTTTCTTCTGTGAGTATTGTGC
hemgn F2          GAGAAAGAGATCCCACCAACTG
hemgn F3          GACATGATTGTGAACACGCCC
hemgn R1          TTGTTTCCATAGTAAGGAGGTG
hemgn R2          TCTGAGTCGCCGCCGAATTCC

Method: qRT-PCR of Hemgn
hemgn F1          GACATGATTGTGAACACGCCC
hemgn F2          CTGTGAGTATTGTGCCAAGTCC
hemgn R           TCTGAGTCGCCGCCGAATTCC

Method: PCR Promoter
hemgn-Kpn1F       ATCATGGGTACCCACATCCAGAAATGAGACAT
hemgn-Spe1-R      ATCATGACTAGTTTTGTAGTCCTGTCACATGA

Method: RT-PCR transgene
hemgn F           CTGTGAGTATTGTGCCAAGTCC
mCherry R         GAACTCCTTGATGATGGCC

Method: zHemgn rescue cDNA
hemgnMM F4        ACCATGGAGGATCCGCTGGAGAAAGAGA
hemgn R1          TTGTTTCCATAGTAAGGAGGTG

Method: qRT-PCR of Morphants
βe1-globin F      TCGCCAAGGCTGACTACGA
βe1-globin R      CGGCATTGTAGGTTTCCAA
β-actin F         CGAGCAGGAGATGGGAACC
β-actin R         CAACGGAAACGCTCATTGC

Method: Constructing Donor Plasmid
LarmF             CTGTGAGTATTGTGCCAAGTCC
LarmR-AvrII-R     ATCATGCCTAGGGTCTTCCATTTTGTAGTCC
Rarm-ClaI-F       ATGTACATCGATCCTTAGCATTAAACATCAATCAC
RarmR             CCATGCCTAGTGTCAGGATC
mCherry-AvrII-F   ATCATGCCTAGGATGGTGAGCAAGGGCG
PolyA-ClaI-R      ATGTACATCGATCTTGTTTATTGCAGCTTATAATGGTTAC

Method: PCR Knock-in
hemgn F           GCTCGCTTGTTGTTTACTCT
mCherry R         GAACTCCTTGATGATGGCC

Method: CRISPR template
sgRNA             AAAAGCACCGACTCGGTGCCACTTTTTCAAGTTGATAACGGACTAGCCTTATTTTAACTTGCTATTTCTAGCTCTAAAAC
sgRNA Hemgn ex1a  GAAATTAATACGACTCACTATAGGTGGGATCTCTTTCTCCAAGTTTTAGAGCTAGAAATAGC
sgRNA Hemgn ex1b  GAAATTAATACGACTCACTATAGGAATAAAAGATTCAGATGAGTTTTAGAGCTAGAAATAGC
sgRNA Hemgn ex3   GAAATTAATACGACTCACTATAGGATCTGGGGCCAGATGAGGGTTTTAGAGCTAGAAATAGC

Method: HRMA
hemgn ex3 F       GGTGCCTGAAGAAGCAATAAGTG
hemgn ex3 R       CATTCATGAACAAGACGTTTCAGC
hemgn ex1 F       GCATGAATGTAAGCGGGC
hemgn ex1 R       GTGATTGATGTTTAATGCTAAGG

Method: qRT-PCR of mutant alleles
hemgn WT F        TGAGGATCTGGGGCCAGATG
hemgn Δ5 F        GGATCTGGGGCCAGGAG
hemgn Δ12 F       AGGATCTGGGGCCAGATATGC
hemgn Both F      GATTGAGGATCTGGGGCCAG
hemgn Both R      GGTGCTGGAGCAAACATTGG

Chapter 3: Erythroid gene discovery using the erythrocyte-null Antarctic icefishes

Abstract

The molecular regulators of erythropoiesis have been carefully studied in

different animal models. However, only the most important erythroid genes or the most

highly expressed markers have been well characterized. The complete loss of functional

red blood cells in Antarctic icefishes provides an opportunity to discover and

characterize new erythroid genes. I previously identified 31 novel erythroid-specific

genes by transcriptomic comparison of hematopoietic tissues from red- and white-

blooded notothenioids. Here, I characterize the loss of the erythroid gene hemogen and

two novel blood genes, mabcp1 and cd33rSig, from icefishes.

My studies reveal a truncating frameshift mutation in hemogen from icefishes,

which may alter its function in hematopoietic cells, kidney, brain, and testis where it is

expressed. This defect may have resulted from overexpression of a short hemogen

isoform (hemgn-s) that is specific to icefishes. I show that overexpression of icefish

hemogen in zebrafish embryos impairs primitive erythropoiesis. Finally, I characterize

the loss of two more erythroid genes, a novel mabp (MVB12-associated β-prism)-

containing protein with the second highest expression in notothenioid red blood cells

and the teleost ortholog of cd33, which is truncated in icefishes.

Introduction

The “white-blooded” Antarctic icefishes (Channichthyidae) are the only

vertebrates that produce neither hemoglobin nor typical mature erythrocytes (Cocca et

al., 1995b; Near et al., 2006b; Zhao et al., 1998b). Loss of the globin genes in icefishes

(Cocca et al., 1995b) may be one of several erythropoietic defects that contributed to

their anemia. The defective molecular pathways in icefishes may reveal novel regulators

of erythroid development and disease (Albertson et al., 2009).

The study of blood development has helped uncover fundamental aspects of cell

differentiation and function including transcriptional regulation, chromatin regulation,

heme synthesis, the cytoskeleton, and cell survival. The process of hematopoiesis

provided the first model of cell differentiation from pluripotent stem cells (Till and

McCulloch, 1980). Erythrocytes also provided the first model to study the cytoskeleton

and important membrane-associated proteins (e.g. Spectrin, Actin, Band3, Band 4.1,

Band7) (Steck, 1974).

The severe anemia of Antarctic icefishes makes these animals the ideal “mutant

model” for erythroid gene discovery (Detrich and Yergeau, 2004). I identified 31 novel

erythroid genes by transcriptomic comparisons of red- and white-blooded notothenioid

fishes (See Chapter 1). In the current study, I characterize three blood-specific genes

that were mutated in Antarctic icefishes. I evaluate the functions of each gene using the

zebrafish model. This study provides a first analysis of three novel erythroid genes that

were lost by Antarctic icefishes.

Results

hemogen (hemgn)

Isolation of the mutated hemogen gene from Antarctic icefishes

Teleost hemogen was originally discovered as a candidate erythroid gene that

was strongly down-regulated in Antarctic icefishes (Detrich and Yergeau, 2004;

Yergeau et al., 2005). In all vertebrates, hemogen is found as a single-copy, four-exon

gene (Fig. 1A) at a locus that is highly syntenic in fish and mammals (Fig. S1).

To characterize hemogen in notothenioids, I isolated and sequenced the

genomic locus from N. coriiceps and C. aceratus. Alignments of the two orthologs

revealed a 90-bp deletion and 1-bp insertion in exon 3 of icefish hemogen that

introduced a frameshift and premature stop codon (Fig. 1B-D). The same frameshift

mutation was found in hemogen in the RNA-Seq transcriptome for N. ionah. However,

the ortholog from the icefish Ps. georgianus contained only a 90-bp in-frame deletion,

which removed 30 amino acids (176P_206P). Thus, hemogen from icefishes first

acquired a 90-bp deletion, which was followed by a 1-bp insertion in some species.

Strikingly, the hemogen gene from the dragonfish, P. charcoti, contained a 6-bp

deletion at the same site as the icefish mutation, which removes two amino acids

(191V_192Pdel). Thus, mutations accumulated in the third exon of hemogen in a region

that may constitute a functional domain with an erythropoietic role. Furthermore, these

data show that loss of residues in this domain began prior to the divergence of

dragonfishes and icefishes.
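
The reading-frame consequences described above follow from simple arithmetic on the net length change of the coding sequence. The sketch below makes that reasoning explicit; the function name is illustrative, and the classification considers net length only, ignoring where the changes fall relative to codon boundaries or to each other.

```python
def indel_consequence(deleted_bp: int, inserted_bp: int = 0) -> str:
    """Classify a coding-sequence indel by its net length change."""
    net = inserted_bp - deleted_bp
    # A net change that is not a multiple of three shifts the reading frame.
    if net % 3 != 0:
        return "frameshift"
    codons = abs(net) // 3
    direction = "removed" if net < 0 else "added"
    return f"in-frame, {codons} codons {direction}"


print(indel_consequence(90, 1))   # 90-bp deletion plus 1-bp insertion
print(indel_consequence(90))      # 90-bp deletion alone
print(indel_consequence(6))       # 6-bp deletion
```

A net loss of 89 bp is not divisible by three, so the combined mutation shifts the frame, whereas the 90-bp and 6-bp deletions on their own remove 30 and 2 codons, respectively.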

Figure 1. The erythroid gene hemogen is mutated in Antarctic icefishes. (A) Gene

structure of zebrafish hemogen on chromosome 1 from assembly GRCz11 (O'Leary et

al., 2016). Coding exons, green boxes; introns, green lines. CRISPR/Cas9 targets are

highlighted red. RNA-Sequencing shows strong expression of hemogen in zebrafish

blood and other tissues. Values are Log2 transformed RPKM (reads per kilobase of

transcript per million mapped reads). (B) Protein domains of Hemogen from Notothenia

coriiceps. In the icefish, C. aceratus, a frameshift mutation occurs at a putative

transactivation domain (TAD). Numbers indicate length in amino acids. Abbreviations:

CC, coiled-coil domain; NLS, nuclear localization signal; R, peptide repeats; 4AA, four

amino acids introduced by alternative splicing. (C) 3D model of Hemogen from N.

coriiceps (color coded as in panel B) generated with I-Tasser (Yang et al., 2015) using

the human secretory component of immunoglobulin A as a template (PDB:3CHN:S). Red

and yellow regions mark the deleted and frameshifted sequences in the icefish,

respectively. (D) Exon structures of hemogen genes from N. coriiceps and C. aceratus.

The frameshift mutation occurs in exon 3 (red arrow) in icefish hemogen.
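
The expression values in panel A follow the RPKM definition given in the legend. A minimal sketch of that calculation is shown below. The function names, the pseudocount of 1 added before the log2 transform, and the example counts are illustrative assumptions, not the pipeline used for this figure:

```python
import math


def rpkm(read_count: float, transcript_length_bp: float, total_mapped_reads: float) -> float:
    """Reads per kilobase of transcript per million mapped reads."""
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("transcript length and library size must be positive")
    kilobases = transcript_length_bp / 1e3
    millions_of_reads = total_mapped_reads / 1e6
    return read_count / (kilobases * millions_of_reads)


def log2_rpkm(value: float, pseudocount: float = 1.0) -> float:
    """Log2 transform; the pseudocount (an assumption) keeps zero expression finite."""
    return math.log2(value + pseudocount)


# Hypothetical example: 630 reads on a 1,000-bp transcript in a 10-million-read library.
example = rpkm(630, 1_000, 10_000_000)
print(example, log2_rpkm(example))
```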

The deleted domain in icefish Hemogen contains a transactivation domain motif

Teleost Hemogens possess conserved functional domains at the N- and C-

termini separated by a linker formed by proline-rich peptide repeats (Peters et al.,

2018). The N-terminus contains predicted coiled-coil forming nuclear localization

signals. In most teleosts, the C-terminus contains a conserved transactivation domain

(TAD) motif with the consensus sequence φφxφ, where φ is a strong hydrophobic

residue. Residues within this conserved region of Hemogen were found to be critical for

normal erythropoiesis in zebrafish (Peters et al., 2018). However, notothenioid

Hemogens lack this specific TAD motif. Instead, notothenioid Hemogens possess a

TAD motif within the peptide repeat that was deleted in icefishes (Fig. 1B).

To predict the tertiary structure of Hemogen from N. coriiceps, ab initio, 3D

models were created with I-Tasser using the X-ray structure for the secretory

component of immunoglobulin g (PDB 3CHN:S) as a template (See Methods, Fig. 1C).

The structures had TM-scores of 0.48±0.15 (TM-score > 0.3 is significant). Mutations in

Hemogens from icefishes remove the proline-rich linker (highlighted red, Fig. 1C)

including the TAD motif. I predict that loss of the proline-linker and C-terminal globular

domain including the putative TAD may disrupt Hemogen activation of erythropoiesis in

C. aceratus.

Figure 2. hemogen is expressed in hematopoietic, renal, and neural tissues in

embryonic and adult notothenioids. (A) 1 month post fertilization. Whole-mount in

situ hybridization (WISH) of hemogen in embryos of the red-blooded nototheniid, N.

coriiceps. Transcripts were detected with an anti-sense riboprobe synthesized from C.

aceratus hemogen cDNA. (B) 1 week. WISH of hemogen transcripts in the brain of

embryos from the icefish, C. aceratus. (C) Northern blot detection of hemogen

transcripts in tissues from N. coriiceps using an antisense riboprobe for C. aceratus

hemogen. Four alternative transcripts were detected in different tissues. (D) Western

blot detection of Hemogen protein in spleen from adult individuals of N. coriiceps (Ncor)

and C. aceratus (Cace). Specific bands were detected in both species at the molecular

weight for full-length Hemogen (~36 kDa). (E) In situ hybridization of hemogen

transcripts in spleen prints from C. aceratus with a digoxigenin-labeled antisense

riboprobe for C. aceratus hemogen. Cytoplasmic staining was seen in different

hematopoietic cell types in the icefish.

In situ hybridization of hemogen in notothenioid embryos and in spleen from

icefishes

To determine the tissues that express hemogen in Antarctic notothenioid fishes

and that may be affected by the mutation in icefishes, I employed whole-mount in situ

hybridization to detect hemogen transcripts in embryos from N. coriiceps and C.

aceratus. In N. coriiceps embryos, at the onset of blood circulation (~1 month post

fertilization), hemogen was detected in the pronephric tubules, in the brain, and in

circulating blood cells in the vasculature and on the yolk sac (Fig. 2A). The same

expression profile is found in zebrafish embryos at 48 hpf (Peters et al., 2018). In C.

aceratus embryos, prior to the onset of circulation (~1 week post fertilization), hemogen

expression was detected in the brain but not in blood cells of the intermediate cell mass

(ICM) (Fig. 2B). In adult spleen prints from C. aceratus, hemogen transcripts were

detected in different hematopoietic cell types (Fig. 2E).

Tissue-specific isoforms of hemogen are expressed in red- and white-blooded

notothenioids

To characterize the hemogen transcripts that were expressed in notothenioids, I

performed Northern blotting of hemogen transcripts in tissues from N. coriiceps (see

Methods). Multiple alternative isoforms of hemogen were detected in different tissues

from N. coriiceps. The first transcript (~1.35 kb) was highly expressed in blood and head

kidney (Fig. 2C) and corresponded to the expected size (1,331 bp) of N. coriiceps

hemogen (XM_010775526.1). A second transcript was found at ~1.8 kb and was

expressed in all tissues. This isoform may correspond to the 1,777-bp hemogen

transcript that retains intron 2. A third transcript was detected at 1.25 kb in testis and

ovary (Fig. 2C), and this may correspond to the testis-specific isoform found in zebrafish

(Peters et al., 2018) and in mammals (Yang et al., 2003). A fourth transcript was

detected at 900 bp in brain from N. coriiceps (Fig. 2C). Our laboratory isolated a fifth

short isoform (hemgn-s) by RT-PCR that occurred in icefishes but not in red-blooded

notothenioids (data not shown). This transcript splices around the deleted region in

icefish hemogen but produces the same frameshift and premature stop codon.

Normal and short Hemogen protein variants are expressed in icefishes

To identify Hemogen protein in notothenioids, I ran Western blots on different

tissues from N. coriiceps and C. aceratus using an antibody directed against the

conserved N-terminal region of human Hemogen (see Methods). Hemogen was

specifically detected at 36 kDa in both spleen and brain from N. coriiceps and in spleen

from C. aceratus (Fig. 2D). The absence of a size difference for Hemogens from N.

coriiceps and C. aceratus could not be explained. I also detected a short Hemogen

isoform at ~12 kDa (Fig. 3A) that was only expressed by icefishes and likely

corresponded to the short isoform (hemgn-s). I employed quantitative PCR to measure

expression of the normal and short isoforms of hemogen in head kidneys from red- and

white-blooded notothenioids (Fig. 3B). Expression of normal hemogen was significantly

down-regulated in icefish head kidney compared to that of red-blooded species (~35

fold change, Fig. 3B). By contrast, expression of the truncated isoform was significantly

higher in icefishes (~2,634 fold change, Fig. 3B). Therefore, down-regulation of normal

hemogen in icefishes was associated with up-regulation of the truncated isoform

(hemgn-s).
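
One common way to derive fold changes of this kind from qPCR data normalized to a reference gene is the 2^-ΔΔCt calculation. The text does not state which calculation was used, so the sketch below is an assumption, and the Ct values in the example are hypothetical rather than measured:

```python
def fold_change(ct_target_sample: float, ct_ref_sample: float,
                ct_target_control: float, ct_ref_control: float) -> float:
    """Relative expression (sample vs. control) by the 2^-ddCt method.

    Assumes roughly 100% amplification efficiency for both primer pairs.
    """
    # Normalize the target gene to the reference gene (e.g. beta-actin) in each group.
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_control = ct_target_control - ct_ref_control
    # Compare the normalized values between groups.
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: the target amplifies 5 cycles later in the sample group.
ratio = fold_change(25.0, 18.0, 20.0, 18.0)
print(f"relative expression = {ratio:.5f} ({1 / ratio:.0f}-fold lower)")
```

Because each cycle represents an approximate doubling, a ΔΔCt of 5 corresponds to a 32-fold difference.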

Figure 3. The short isoform of hemogen is overexpressed in icefishes and is

translated into a truncated protein. (A) Western blot detection of a short variant of

Hemogen (Hemgn-s, 12 kDa). The Hemgn-s protein was expressed in head kidney

(HK) and spleen of the icefish, C. aceratus (Ca), but not in the red-blooded nototheniid,

N. coriiceps, nor in the peripheral blood (PB) of either species. (B) Relative expression

of full-length hemogen (hemgn) and short hemogen (hemgn-s) isoforms in head kidney

from notothenioid fishes detected by quantitative PCR. Target gene expression was

normalized to beta-actin and error bars represent the standard deviation of one or two

biological replicates. Significant differences were seen between red- and white-blooded

phenotypes (Student’s t-test, P < 0.05,*).

Overexpression of icefish hemogen disrupts erythropoiesis in zebrafish embryos

To characterize the developmental abnormalities caused by icefish Hemogen,

zebrafish embryos were injected with synthetic, icefish hemogen mRNAs (200 pg) with

the endogenous Kozak sequence. Blood production was assessed by o-dianisidine

staining at 48 hpf and injected embryos were compared with wild-type siblings or

embryos injected with mCherry mRNA (Fig. 4). Red blood cell production was reduced

in ~40% of embryos injected with hemogen mRNA from the icefish (Ca-Hemgn)

compared to 10% of embryos injected with mCherry mRNA and 2% of uninjected

zebrafish (Fig. 4A,B). Thus, icefish Hemogen inhibits primitive erythropoiesis and may

function as a dominant negative allele.
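
Differences in the proportion of affected embryos between two groups can be assessed with a chi-square test on a 2 x 2 table. The sketch below is a minimal standard-library version without continuity correction. The embryo counts are hypothetical (only percentages are reported here), and the function name is an assumption:

```python
import math


def chi_square_2x2(affected_a: int, total_a: int,
                   affected_b: int, total_b: int) -> tuple[float, float]:
    """Pearson chi-square test (1 degree of freedom) comparing two proportions."""
    a, b = affected_a, total_a - affected_a  # group A: affected, unaffected
    c, d = affected_b, total_b - affected_b  # group B: affected, unaffected
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("every row and column of the table must be non-empty")
    chi2 = n * (a * d - b * c) ** 2 / denominator
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: 40 of 100 injected embryos vs. 10 of 100 controls affected.
chi2, p = chi_square_2x2(40, 100, 10, 100)
print(f"chi2 = {chi2:.1f}, P = {p:.2e}")
```

With small clutches, Fisher's exact test or a continuity-corrected statistic may be more appropriate than this uncorrected form.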

Figure 4. Overexpression of icefish hemogen in zebrafish blocks primitive

erythropoiesis. (A) 48 hpf. TU embryos were injected with a mix of synthetic,

polyadenylated hemogen mRNA from C. aceratus (Ca-Hemgn) and mCherry mRNA.

Controls were uninjected TU wild-type (WT) siblings or embryos injected with mCherry

mRNA alone. O-dianisidine staining of erythrocytes was reduced in Ca-Hemgn injected

embryos. (B) Graph showing the proportion of Ca-Hemgn injected or control embryos

that had decreased erythrocyte production (*, P < 0.01; **, P < 0.001; chi-square test of

proportions).

Figure 5. A novel MABP-containing protein (mabpcp) is a RBC-specific gene in

notothenioid fishes. (A) Gene structure of the zebrafish mabpcp ortholog (si:dkey-

30j10.5) on chromosome 3 from assembly GRCz11 (O'Leary et al., 2016).

CRISPR/Cas9 targets in the gene are highlighted red. RNA-Sequencing shows strong,

specific expression of si:dkey-30j10.5 in zebrafish blood. Values are the log2

transformed RPKM. (B) Protein domains of Mabpcp from Notothenia coriiceps.

Numbers indicate length in amino acids. Two gaps (marked X) were found in the

assembled transcript from both icefishes, C. aceratus and Ps. georgianus.

Abbreviations: non-cyto, non-cytoplasmic domain; S, signal peptide; MABP, MVB12-

Associated β-prism domain. (C) Color-coded, 3D model of MABP-containing protein

from Notothenia coriiceps (LOC104952319) designed using I-tasser (Yang et al., 2015).

The model is superimposed on the MABP domain from human MVB12B (PDB:

3TOW:A). Yellow arrows, beta sheet; red, gaps in icefish sequence; blue, lipid-binding

residues; white, no prediction. (D) 24 hpf. Whole-mount in situ hybridization of dkey-

30j10.5 in wild-type zebrafish. Sense probe control shown as inset. (E) 20 hpf. Wild-type

TU and CRISPR-injected sibling embryos. Mutants were injected with Cas9 and a

gRNA targeting dkey-30j10.5 as in Panel A. (F) CRISPR target sequences in zebrafish

dkey-30j10.5. Protospacer-adjacent motif (PAM, underlined and highlighted blue).

Abbreviations: MB, midbrain; HB, hindbrain; Vent, brain ventricle

MABP-CONTAINING PROTEIN (MABPCP)

I identified a novel RBC-specific gene in the RNA-Seq transcriptomes of

notothenioid fishes, which had the second highest expression level in P. charcoti

peripheral blood (63,503 TPM) and head kidney (36,283 TPM), second only to beta-

globin (267,053 TPM and 76,609 TPM) and 11 times higher than all other blood-specific

genes. The two-exon gene mostly encodes an MVB12-Associated β-prism domain

(MABP, InterPro IPR023341) that is related to the MABP domain of the DENND4c

protein (DENN domain-containing protein 4C) found in all vertebrates. Thus, we named

the notothenioid protein MABP-containing protein (MABPCP). The icefish ortholog was

fragmented and not strongly expressed (< 0.39 TPM, Fig. 5B) in the transcriptomes of

both Ps. georgianus and N. ionah. In the icefish ortholog, two gaps were present in the

assembled contig at residues that correspond to W115 and Y176 (Fig. 5B). The gaps in

the assembly may have been caused by repetitive sequences and/or significant

genomic alterations at the gap loci.

The MABP domain is a lipid-binding structure that localizes proteins to

membranes and which has been implicated in endocytic transport (de Souza and

Aravind, 2010). A variety of MABP-containing proteins are found in eukaryotes and also

in bacteria (de Souza and Aravind, 2010) but no direct ortholog of notothenioid

MABPCP occurs in mammals. In teleosts, including notothenioid fishes, multiple

duplicated paralogs are found as one or two-exon genes. The nototheniid N. coriiceps

contains the RBC-specific protein and two more paralogs, LOC104952319 and

LOC104956446 – the latter was specifically expressed in the trunk kidneys of P.

charcoti and Ps. georgianus.

si:dkey-30j10.5 (Acc. XM_001335220.7) was identified as the MABPCP ortholog in

zebrafish, and was also found to have strong, specific expression in peripheral blood

(Fig. 5A). si:dkey-30j10.5 occurs as a single-copy, two-exon gene located on chromosome

3 at a locus that corresponds to a well-known erythroid gene cluster on human chromosome 17

(Fig. S1). Thus, the loss of red blood cells in icefishes was coincident with the loss of a

teleost-specific erythroid gene.

Protein domain and structure of MABPCP from notothenioids

Ab initio, tertiary structure models were created for notothenioid MABPCP (Fig.

5C) using I-Tasser based on the solved X-ray structure for the MABP domain of human

Multivesicular body subunit 12B (MVB12B, PDB 3TOW:A) (Boura and Hurley, 2012).

The model and its template had a template modeling score (TM-score) of 0.34±0.11

(a TM-score > 0.3 corresponds to P < 0.001). In the structure, the MABP domain from notothenioid

MABPCP contains the same hydrophobic β2-β3 loop seen in MVB12B that has been

predicted to insert itself within lipid membranes (Boura and Hurley, 2012) (Fig. 5C). The

gaps in icefish MABPCP occur at electropositive residues that anchor this domain to

membranes (Boura and Hurley, 2012) (Fig. 5B,C). Thus, changes to this domain in

icefishes may disrupt MABPCP binding to cell membranes.
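
For reference, the TM-score (Zhang and Skolnick) sums a length-normalized closeness term over aligned residue pairs. The sketch below scores a single, already-superimposed alignment. The full TM-score takes the maximum over superpositions, which is not implemented here, and the function names and example distances are hypothetical:

```python
def d0_for_length(target_length: int) -> float:
    """Length-dependent distance scale d0 = 1.24 * cbrt(L - 15) - 1.8 (in angstroms)."""
    if target_length <= 15:
        raise ValueError("d0 is undefined for targets of 15 residues or fewer")
    d0 = 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8
    if d0 <= 0:
        raise ValueError("target is too short for a meaningful d0")
    return d0


def tm_score(distances: list[float], target_length: int) -> float:
    """TM-score of one superposition from the distances between aligned C-alpha pairs."""
    d0 = d0_for_length(target_length)
    # Each aligned pair contributes between 0 and 1; unaligned residues contribute 0.
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length


# Hypothetical example: 100-residue target with 50 perfectly superimposed residues.
print(tm_score([0.0] * 50, 100))
```

Normalizing by the target length rather than the number of aligned residues is what allows scores to be compared between proteins of different sizes.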

Expression profile of mabp-containing protein in zebrafish embryos

The spatiotemporal expression profile of zebrafish mabpcp (si:dkey-30j10.5) was

evaluated in wild-type TU embryos by whole mount in situ hybridization (WISH) at 20

hpf (Fig. 5D). Sibling embryos were fixed and hybridized with sense or anti-sense

digoxigenin-labeled riboprobes targeting a portion of the si:dkey-30j10.5 gene (See

Methods). Faint expression was specifically detected in the central nervous system with

an anti-sense probe (Fig. 5D) but not with a sense probe (Fig. 5D inset). The highest

expression occurred in the midbrain and hindbrain, specifically in cells associated with

the brain ventricles (Fig. 5D). The intermediate cell mass (ICM) and peripheral blood

island (PBI) did not show strong expression, which indicates si:dkey-30j10.5 is not

associated with primitive erythropoiesis in embryos.

CRISPR/Cas9 targeting of mabp-containing protein in zebrafish

To analyze the developmental role of mabpcp, we employed CRISPR/Cas9 gene

editing to target zebrafish dkey-30j10.5 (Fig. 5A,E,F). Wild-type TU zebrafish embryos

were co-injected with a guide RNA (100 ng µl⁻¹), Cas9 mRNA (1000 ng µl⁻¹), and

mCherry mRNA (100 ng µl⁻¹) (See Methods and Table 2). At 20 hpf, 88% of injected

mutants had shortened tails and severe deformities compared to uninjected, wild-type

siblings (n = 25, Fig. 5E) - these deformities are common phenotypes produced from

off-target effects. Furthermore, the deformities (Fig. 5E) were not restricted to the sites

of dkey-30j10.5 expression in embryonic zebrafish (Fig. 5D), which also suggested they

were non-specific phenotypes.

Several MABP-domain containing proteins have been found in eukaryotes and

bacteria and were shown to bind cell membranes (Allaire et al., 2010; Boura and Hurley,

2012; Denef et al., 2008; Rosado et al., 2007). In the dragonfish, P. charcoti, mabpcp is

the second highest expressed gene in red blood cells. This protein may have a function

in the ESCRT machinery (endosomal sorting complexes required for transport) as do

other MABP-containing proteins (Boura and Hurley, 2012). The ESCRT pathway plays

an important role in mitochondrial removal during erythroid maturation and defects in

this process can cause anemia (Mortensen et al., 2010).

Figure 6. Modeling a truncated CD33-related Siglec (CD33rSig) from icefishes in

mutant zebrafish. (A) Gene structures of the zebrafish cd33rSig paralogs, dkey-

238d18.10 and LOC101884840, on chromosome 15 from assembly GRCz11 (O'Leary

et al., 2016). CRISPR targets are highlighted red. RNA-Sequencing shows specific

expression of dkey-238d18.10 in blood and LOC101884840 in brain. Values are the

log2 transformed RPKM. (B) Protein domains of the CD33rSig ortholog from N.

coriiceps. The F753* mutation truncates CD33rSig in both icefishes, C. aceratus and

Ps. georgianus. Numbers indicate length in amino acids. (C) 20 hpf. Wild-type TU

embryo (D) 20 hpf. CRISPR-injected embryo targeting dkey-238d18.10. Note the

enlarged peripheral blood island (PBI, outlined) in the mutant. (E) 20 hpf. CRISPR-

injected embryo targeting LOC101884840. (F) 3D model of CD33rSig from N. coriiceps

created with I-tasser (Yang et al., 2015) using human CD33 as a template (PDB:

5IHB:A). Red marks the truncated region in icefish CD33rSig. (G) High resolution

melting curve showing decreased melting of the mutant (red) dkey-238d18.10 alleles

compared to wild-type (blue). Inset shows the difference curves for several mutants. (H)

Sequence TRACE result of dkey-238d18.10 from CRISPR-injected TU zebrafish.

Frameshifts occur at the protospacer-adjacent motif (PAM, underlined). Abbreviations:

S, signal peptide; Ig, immunoglobulin-like domain; C2-set, immunoglobulin c2-set

(constant) domain; v-set, immunoglobulin v-set (variable) domain; Tr, transmembrane;

cyto, cytoplasmic; ITIM?, putative immunoreceptor tyrosine-based inhibitory motif.

CD33-RELATED SIGLEC (CD33rSig)

One of the blood-specific genes was identified as a member of the multi-gene

family of CD33-related Sialic-acid-binding immunoglobulin-like lectins (CD33rSiglec)

and had strong, specific expression in peripheral blood cells from P. charcoti (42.04

TPM). This CD33rSiglec was found as a single copy gene and is orthologous to

LOC104953882 from the genome of the red-blooded nototheniid, N. coriiceps (Shin et

al., 2014). In the transcriptomes of three icefishes (N. ionah, Ps. georgianus, and C.

aceratus), the corresponding orthologs contain a 1-bp deletion in exon 9 that

introduces a C-terminal frameshift and an immediate premature

translation termination codon (Fig. 6B).

The family of Siglecs is made up of diverse cell surface receptors that are

expressed on the membranes of different hematopoietic lineages (Crocker et al., 2007).

In humans, CD33 (Siglec-3) is restricted to myeloid lineages and is over-expressed in

acute myeloid leukemias (De Propris et al., 2011). No obvious cd33 ortholog is found in

teleosts but tandem duplication of an ancestral CD33-like gene produced numerous

CD33rSiglec-extended (CD33e) genes in fishes that are conserved with mammalian

CD33rSiglecs, Siglec-4/MAG (myelin-associated glycoprotein), and Siglec-2/CD22 (Cao

et al., 2009). A recent study identified the same CD33rSig in another teleost

(rock bream, Oplegnathus fasciatus), and this gene was designated as the functional

ortholog of mammalian CD33 (Jeswin et al., 2018). As in notothenioid fishes, rock

bream CD33 was specifically expressed in leukocytes of the peripheral blood and was

found to be up-regulated during the immune response (Jeswin et al., 2018). Thus,

Antarctic icefishes appear to have lost the functional ortholog of CD33.

In the zebrafish genome assembly GRCz10 (Howe et al., 2013), a tandem pair of

paralogs, si:dkey-238d18.10 and LOC101884840, are both related to N. coriiceps

LOC104953882 and are located at the CD33rSiglec cluster on chromosome 15 (Fig.

6A,S1). This locus shares synteny with the genomic loci of both MAG and CD33 on

human chromosome 19 (Fig. S1). High, specific expression in blood is seen for

zebrafish si:dkey-238d18.10 but not for LOC101884840 (Fig. 6A). Together these data

indicate that notothenioid LOC104953882 and zebrafish si:dkey-238d18.10 are

candidates for the functional orthologs of the myeloid Siglec, CD33.

Protein structures and domains of CD33rSig from notothenioids

Ab initio, tertiary structure models were created for the notothenioid CD33rSiglec

(LOC104953882) with I-Tasser using the X-ray structure for human CD33 (PDB 5IHB:A)

as a template (Dodd, 2016, to be published). The structures for the zebrafish and

human proteins had a template modeling score (TM-score) of 0.726 (a TM-score > 0.3 corresponds to

P < 0.001) (Xu and Zhang, 2010). The frameshift mutation and premature stop codon in

icefish CD33rSiglec occur in the most C-terminal C2-set immunoglobulin domain,

which removes the transmembrane and cytoplasmic domains from this cell surface

protein (Fig. 6B,F). In red-blooded species, the cytoplasmic domain contains two

tyrosine residues that likely function as immunoreceptor tyrosine-based inhibitory (ITIM)

or activator (ITAM) motifs (Fig. 6B,F), which carry out the function of most CD33rSiglecs

(Paul et al., 2000). The loss of this inhibitory domain in icefishes may affect immunity,

hematopoietic proliferation and cell survival (Nguyen et al., 2006; Varki and Angata,

2006; Vitale et al., 2001).

CRISPR/Cas9 targeting of CD33rSig in zebrafish

To determine the developmental role of cd33rSiglec in fishes, I employed

CRISPR/Cas9 gene editing to target the zebrafish paralogs, si:dkey-238d18.10 and

LOC101884840 (Fig. 6A). Wild-type TU zebrafish embryos were co-injected with a

guide RNA (100 ng µl⁻¹), Cas9 mRNA (1000 ng µl⁻¹), and mCherry mRNA (100 ng µl⁻¹)

(See Methods and Table 2). Mutant and wild-type siblings were imaged and genotyped

by high-resolution melting analysis (HRMA) (Fig. 6G) and by sequencing of the locus

(Fig. 6H). At 20 hpf, enlarged peripheral blood islands (PBI) were seen in 33% of

injected mutant si:dkey-238d18.10 zebrafish (n = 12) (Fig. 6D). Additionally, 33% of

mutants had shortened tails with a reduced PBI, and 25% were deformed due to cell

death (Fig. 6D). While some of these phenotypes may result from off-target effects, the

uncontrolled proliferation of PBI blood cells in si:dkey-238d18.10 mutant zebrafish is an

uncommon trait and may evidence a role for this CD33rSiglec in primitive

hematopoiesis. The phenotype of si:dkey-238d18.10 mutants contrasted sharply with

that of LOC101884840 mutants (Fig. 6D). Shortened tails occurred in 71% of

LOC101884840 mutants and 57% were developmentally delayed (n = 7, Fig. 6E).

In mammals, truncation of CD33 and loss of the cytoplasmic ITIM have been

shown to prevent internalization of the receptor thereby increasing CD33 expression on

the cell surface (Walter et al., 2008). CD33 is highly expressed in normal and malignant

myeloid lineages and is a common target for immunotherapy – internalization of a drug-

linked CD33 antibody (Gemtuzumab ozogamicin) promotes cell death (Geiger and

Rubnitz, 2015). The functions of CD33 in myeloid cells are not well understood

(Ulyanova et al., 1999). Many Siglecs are involved in erythroblast island formation

(Rhodes et al., 2008) and CD33 knockout mice display a slight erythroid defect

(Brinkman-Van der Linden et al., 2003). The loss of cd33 may affect the survival of

myeloerythroid progenitors in icefishes.

Methods

Fish husbandry

Wild-type (SAT, AB, TU) and transgenic Tg(lcr:egfp) zebrafish (Danio rerio) were

generously provided by Dr. Leonard I. Zon (Howard Hughes Medical Institute and

Harvard Medical School, Boston). Animal procedures were carried out in full

accordance with established standards set forth in the Guide for the Care and Use of

Laboratory Animals (8th Edition). The animal care and use protocol for live zebrafish

embryos was reviewed and approved by Northeastern University’s Institutional Animal

Care and Use Committee (Protocol No. 15-0207R). The animal care and use program

at Northeastern University has been continuously accredited by AAALAC Int. since July

22, 1987, and maintains the Public Health Service Policy Assurance number A3155-01.

Cloning and sequence analysis of zebrafish and notothenioid cDNAs

Total RNA was isolated from wild-type AB zebrafish embryos or flash frozen

tissues from Antarctic notothenioid species using TRI reagent (Sigma, T9424) and the

Ribopure Kit (Ambion, AM1924). Total cDNA was produced from mRNA using M-MuLV

reverse transcriptase (NEB, M0253S) and an oligo(dT)23 primer. cDNAs were amplified

by PCR from total cDNA with 1 µM primers (Table S1) – the amplification program was

35 cycles of 98°C for 10 s, 60°C for 10 s, and 72°C for 30 s. PCR products were cloned

into the pGEM-T Easy vector (Promega, A1360), plasmids were transformed into 5-α

competent cells (New England Biolabs, C2987H), recombinant plasmids were identified

by blue/white screening and purified with the Wizard Plus SV Miniprep Kit (Promega

A1330), and inserts were sequenced by GeneWiz.

Comparison of vertebrate genes

We used BLAST+ (Altschul et al., 1990) to identify genes from Antarctic

notothenioid fishes in the zebrafish genome (assembly GRCz11) (Howe et al., 2013).

Chromosomal synteny comparisons were performed using the Synteny Database with a

sliding window of 200 genes (Catchen et al., 2009). Protein domains were predicted

using InterProScan (Jones et al., 2014). Ab initio tertiary structure models were created

for proteins from N. coriiceps including Hemogen, LOC104953882, and

LOC104952319. 3D models were created with I-Tasser (Yang et al., 2015) based on

the X-ray structures for the secretory component of immunoglobulin G (PDB:3CHN:S),

CD33 (PDB:5IHB:A), and the MABP domain of MVB12B (PDB:3TOW:A), respectively.

The 3D models were superimposed using TM-align (Zhang and Skolnick, 2005) or

Geneious version R10 (Kearse et al., 2012).

In situ hybridization

The spatial and temporal patterns of expression of selected genes were analyzed

by whole-mount in situ hybridization (WISH) of zebrafish and notothenioid embryos

following standard protocols (Jacobs et al., 2011). These methods were adapted to

evaluate Hemogen expression in spleen prints prepared from adult notothenioid fishes

after they were euthanized in 200 mg L-1 tricaine methane sulfonate (MS222; Sigma-

Aldrich, 886862) (Detrich and Yergeau, 2004; Gupta and Mullins, 2010). Digoxigenin-

labeled antisense and sense RNA probes were transcribed from cDNA clones using the

DIG RNA Labeling Kit (Roche Diagnostics, 11175025910).

Northern Blotting

Total mRNA was purified from flash frozen tissues using TRI reagent (Sigma,

T9424) and the Ribopure Kit (Ambion, AM1924). mRNA (5 µg) was electrophoresed in a

denaturing gel. Replicate lanes were cut out and stained with ethidium bromide to

assess RNA quality. Separated mRNAs were transferred overnight to nylon paper by

upward capillary transfer. Blots were hybridized with digoxigenin-labeled antisense or

sense RNA probes following a standard protocol (Alwine et al., 1977).

Western blotting

Total protein was prepared for sodium dodecyl sulfate polyacrylamide gel

electrophoresis (SDS-PAGE) from flash frozen notothenioid tissues or fresh zebrafish

tissues by homogenization in lithium dodecyl sulfate (LDS) Bolt buffer (Life

Technologies, B007) and NuPAGE reducing agent (Life Technologies, NP0009) using a

pestle and microcentrifuge tube (USA Scientific, 1415-5390). Samples were boiled for 3

min and centrifuged at top speed in an Eppendorf 5417R centrifuge for 2 min. Aliquots

(15 µg) were electrophoresed on a 4-12% SDS polyacrylamide gel, and the separated

proteins were transferred to a polyvinylidene difluoride (PVDF) membrane with the iBlot

system (Life Technologies, IB21001). Membranes were blocked in maleic acid blocking

buffer (2% Roche blocking reagent, 2% BSA, 0.2% heat treated goat serum, 0.1%

Tween-20) for 1 hour at room temperature and then incubated overnight at 4°C with

1:1000 rabbit anti-Hemogen (Aviva, ARP57794_P050) or with 1:1000 mouse anti-

GAPDH (Aviva, OAE00006) antibodies. Membranes were washed in TBST (0.1 M Tris,

0.1 M NaCl, 0.1% Tween-20) and incubated for 2 h with horseradish peroxidase (HRP)-

conjugated goat anti-rabbit IgG (H&L) (Aviva, ASP00001) or HRP-conjugated goat anti-

mouse IgG (H&L) (Aviva, OARA04973), respectively. Bound antibodies were detected

with the Amersham ECL Western Blotting Analysis System (GE Healthcare, RPN2106)

on CL-XPosure film (Thermo Scientific, 34091).

Overexpression of icefish hemogen in zebrafish

The native notothenioid hemogen Kozak sequence was added to hemogen

cDNA clones from N. coriiceps and C. aceratus by PCR using 1 µM primers (Table S1).

The amplification program was 35 cycles of 98°C for 10 s, 60°C for 10 s, and 72°C for

30 s. PCR products were cloned into the pGEM-T Easy vector (Promega, A1360),

plasmids were transformed into 5-α competent cells (New England Biolabs, C2987H),

recombinant plasmids were identified by blue/white screening and purified with the

Wizard Plus SV Miniprep Kit (Promega, A1330), and inserts were sequenced by

Genewiz. Sense mRNAs were transcribed, capped, and polyadenylated in vitro using

the mMessage SP6 kit (Ambion, AM1340) and the Poly(A) Tailing Kit (Ambion,

AM1350). mRNA was purified by precipitation using 2.5 M LiCl. Hemogen mRNAs (30

ng µL⁻¹) and mCherry mRNA (30 ng µL⁻¹) were co-injected into one-cell-stage, wild-type SAT

zebrafish embryos. Treated and control embryos were stained with o-dianisidine using

previously established methods (Yergeau et al., 2005) and imaged between 20 and 48 hpf.
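
As a worked illustration of how a co-injection mix of this kind could be assembled, the short Python sketch below applies C1V1 = C2V2 to each component. Only the 30 ng µL⁻¹ final concentrations come from the text; the stock concentrations, the 10 µL final volume, and the function name `mix_volumes` are hypothetical and chosen purely for illustration.

```python
def mix_volumes(stock_ng_per_ul, final_ng_per_ul, final_volume_ul):
    """Volume (µL) of each RNA stock, plus water, for one injection mix.

    Applies C1 * V1 = C2 * V2 to every component.
    """
    volumes = {}
    for name, final_conc in final_ng_per_ul.items():
        stock_conc = stock_ng_per_ul[name]
        if stock_conc < final_conc:
            raise ValueError(f"{name}: stock is more dilute than the final concentration")
        volumes[name] = final_conc * final_volume_ul / stock_conc
    water = final_volume_ul - sum(volumes.values())
    if water < 0:
        raise ValueError("stocks are too dilute to fit in the final volume")
    volumes["water"] = water
    return volumes


# Final concentrations are from the text; the stock concentrations and the
# 10 µL final volume are hypothetical.
stocks = {"hemogen mRNA": 300.0, "mCherry mRNA": 150.0}
finals = {"hemogen mRNA": 30.0, "mCherry mRNA": 30.0}
mix = mix_volumes(stocks, finals, final_volume_ul=10.0)
for component, volume in mix.items():
    print(f"{component}: {volume:.2f} µL")
```

The function raises an error when a stock is too dilute to reach its final concentration, which is the usual practical failure when two RNAs must share one small injection volume.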

CRISPR/Cas9 generation of mutant zebrafish

Optimal targets for CRISPR-Cas9 mutagenesis were identified in zebrafish

si:dkey-30j10.5, si:dkey-238d18.10/CD33, LOC101884840, and tyrosinase using the

program CHOPCHOP (Labun et al., 2016; Montague et al., 2014). The templates for

multiple small guide RNAs (Table S2) were produced by a cloning-free method as

previously described (Hruscha et al., 2013; Talbot and Amacher, 2014). Guide RNAs

were transcribed with the T7 MaxiScript Kit (Ambion, AM1312) and purified by LiCl

precipitation.

Wild-type (TU) embryos were co-injected with a guide RNA (150 ng µL⁻¹), Cas9

mRNA (300 ng µL⁻¹), and mCherry mRNA (ng µL⁻¹), the latter to identify successful injections.

Embryos were raised and adults were tail-clipped and genotyped by high-resolution

melting analysis (HRMA) as previously described (Talbot and Amacher, 2014). PCR

amplification was run using 1 µM primers (Table S1) with PowerUp SYBR MasterMix

(Applied Biosystems, A25742) on a QuantStudio 3 Real-time PCR system

(ThermoFisher, A28137). PCR amplicons were sequenced by Genewiz.

Imaging

Fixed zebrafish or notothenioid embryos were mounted in 80% glycerol and

imaged with a dissecting microscope (Nikon, SMZ-U) and a CCD digital camera

(Diagnostic Instruments, SPOT32). Live zebrafish embryos were embedded in 0.1%

agarose in embryo medium (EB) with 0.01% tricaine and imaged with an

epifluorescence-equipped microscope (Nikon, Eclipse E800) using a Photometrics

Scientific CoolSNAP EZ camera and Nikon NIS-Elements AR 4.20 software.

Quantitative PCR

RNA was purified from flash frozen notothenioid tissues or fresh zebrafish

tissues in TRI reagent (Sigma-Aldrich, T9424) using the PureLink RNA Purification Kit

(Ambion). DNase-treated RNA was reverse transcribed with a polyT(23) primer using the

ProtoScript II RT-PCR kit (New England Biolabs, M0368S). Target genes were amplified

in triplicate from cDNA by qRT-PCR with 1 µM primers (Table S1). Standard curves

were generated to confirm primer efficiencies. Target gene expression was normalized

to beta-actin for comparison by the ΔΔCt method. Three or four biological replicates

were used for each treatment for statistical comparisons.
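
The two calculations named above, primer efficiency from a standard curve and relative expression by the ΔΔCt method, can be sketched in Python as follows. This is a minimal illustration, not the analysis script used in this work: the function names and all Ct values are made up, and the 2^(−ΔΔCt) form assumes that target and reference primers both amplify with efficiencies close to 100%.

```python
from statistics import mean


def primer_efficiency(log10_input, ct):
    """Slope and efficiency from a standard curve of Ct against log10(template).

    Efficiency = 10 ** (-1 / slope) - 1, so a slope near -3.32 gives ~1.0 (100%).
    """
    x_bar, y_bar = mean(log10_input), mean(ct)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(log10_input, ct))
    sxx = sum((x - x_bar) ** 2 for x in log10_input)
    slope = sxy / sxx
    return slope, 10 ** (-1 / slope) - 1


def fold_change(target_ct_treated, actin_ct_treated, target_ct_control, actin_ct_control):
    """Relative expression by 2 ** (-ddCt); each argument is a list of replicate Cts."""
    d_ct_treated = mean(target_ct_treated) - mean(actin_ct_treated)
    d_ct_control = mean(target_ct_control) - mean(actin_ct_control)
    dd_ct = d_ct_treated - d_ct_control
    return 2 ** (-dd_ct)


# Made-up values for illustration only.
slope, efficiency = primer_efficiency([0, -1, -2, -3], [15.0, 18.32, 21.64, 24.96])
ratio = fold_change([26.0, 26.0, 26.0], [18.0, 18.0, 18.0],
                    [24.0, 24.0, 24.0], [18.0, 18.0, 18.0])
print(f"slope = {slope:.2f}, efficiency = {efficiency:.1%}")
print(f"fold change = {ratio:.2f}")
```

In this made-up example the target Ct rises by two cycles relative to beta-actin, which corresponds to a four-fold reduction in expression.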

Statistical analyses

Data are displayed as means ± s.e.m. or means ± s.d., as noted. Differences with

a p-value ≤ 0.05 were considered significant for all statistical tests.
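
For reference, the two summary statistics named above differ only by a factor of √n. The Python sketch below computes both for a single group of replicates; the function name `summarize` and the replicate values are hypothetical, and no particular significance test is implied.

```python
import math
import statistics


def summarize(values):
    """Return (mean, s.d., s.e.m.) for one group of replicate measurements."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two replicates are needed")
    m = statistics.mean(values)
    sd = statistics.stdev(values)   # sample s.d. (n - 1 denominator)
    sem = sd / math.sqrt(n)         # standard error of the mean
    return m, sd, sem


# Made-up replicate values for illustration only.
replicates = [1.0, 2.0, 3.0, 4.0]
m, sd, sem = summarize(replicates)
print(f"mean ± s.d.   = {m:.2f} ± {sd:.2f}")
print(f"mean ± s.e.m. = {m:.2f} ± {sem:.2f}")
```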

Figure S1. Synteny maps comparing the chromosomal loci of novel RBC-specific

genes in zebrafish and humans. (A) Syntenic Hemogen loci on zebrafish

chromosome 1 and human chromosomal region 9q22.33. (B) Synteny of loci for

zebrafish dkey-30j10.5 on chromosome 3 and the corresponding region on human

chromosome 17. No direct ortholog was identified in humans. (C) Synteny of loci for

zebrafish dkey-238d18.10 and LOC101884840 paralogs on chromosome 15 and

human CD33 on chromosome 19.

Table S1. Primer Sequences

Oligo            Sequence (5'–3')                         Gene                       Method
Ncor130For       GGAGGAGACATTTCAAC                        hemgn (Notothenioid)       PCR gDNA
NcHemgn_R2       CTAACAGGATGCACACTAACC                    hemgn (Notothenioid)       PCR gDNA
NcHemgnR1050     AGATACCCGTCATTCAGGA                      hemgn (Notothenioid)       PCR gDNA
NcHemgnR2.2      CCTCAGAAGATCCCTGTCAC                     hemgn (Notothenioid)       PCR gDNA
NcHemgnR2.1      CACGTAACCGGCGACGGATC                     hemgn (Notothenioid)       PCR gDNA
NcHemgn5utrF     ATGCCCTCACACAACTTGAC                     hemgn (Notothenioid)       PCR gDNA
Icefish_For5utr  GTGTCCCCGAGGTTATAATAC                    hemgn (Notothenioid)       PCR gDNA
30j10_F          CCAGCACTGCGGTTCAG                        dkey:30j10.5 (Zebrafish)   RT-PCR
30j10_R          GAGATATGGAAAAAGGTCTGGAGG                 dkey:30j10.5 (Zebrafish)   RT-PCR
30j10_F3         GACCAGGATCAGTTTTCATTC                    dkey:30j10.5 (Zebrafish)   RT-PCR
30j10_R_mp1      AGATTCTTCTTGACCTGCTCGT                   dkey:30j10.5 (Zebrafish)   RT-PCR
30j10_F          CCACCACTAAAGATGAGGAGGA                   dkey:30j10.5 (Zebrafish)   HRMA
30j10_R          CCACAGATTGATTTTGTCTCCA                   dkey:30j10.5 (Zebrafish)   HRMA
Dkey238F         GTGCACTATTATTTGCACGCTC                   dkey:238 (Zebrafish)       HRMA
Dkey238R         CCCGATTTAAACCAGAAAGTGT                   dkey:238 (Zebrafish)       HRMA
LOC101884840F    CCACAGCTGCAATTTACAGAAC                   LOC101884840 (Zebrafish)   HRMA
LOC101884840R    CTGATACCACACAACTCTGCGT                   LOC101884840 (Zebrafish)   HRMA
Hemgn_F_kozak    AATTCATAGCAGGACTCAGAATGGAGGAGACATTTCAAC  hemgn (Notothenioid)       PCR gDNA
NcHemgn_R2       CTAACAGGATGCACACTAACC                    hemgn (Notothenioid)       PCR gDNA
CD33rSig_F       CTGCTCATTAGAGATTGATGA                    cd33rSig (Notothenioid)    PCR gDNA
CD33rSig_R       GAAGGTTATTGTGGAGGTC                      cd33rSig (Notothenioid)    PCR gDNA
Drhemex3Fb       CCTCAAGAGGAGTTTTTGATTGAGG                hemgn (Zebrafish)          PCR
β-globin_F       TCGCCAAGGCTGACTACGA                      beta-globin (Zebrafish)    qPCR
β-globin_R       CGGCATTGTAGGTTTCCAA                      beta-globin (Zebrafish)    qPCR
β-actin_F        CGAGCAGGAGATGGGAACC                      beta-actin (Zebrafish)     qPCR
β-actin_R        CAACGGAAACGCTCATTGC                      beta-actin (Zebrafish)     qPCR
β-act_F          CAGATCATGTTCGAGACCTTCAAC                 beta-actin (Notothenioid)  qPCR
β-act_R          TCACCRGARTCCATGACGATA                    beta-actin (Notothenioid)  qPCR
Hemgn_short_F    GACTAACCAGTGGGTTTTAAGCC                  hemgn (Notothenioid)       qPCR
NcHemgn_R2       CTAACAGGATGCACACTAACC                    hemgn (Notothenioid)       qPCR

Table S2. Oligos for CRISPR gRNA template

Oligo              Sequence (5'–3')                                                                  Gene                        Method
30j10_gRNA1        GAAATTAATACGACTCACTATAGGAAAGATCGCGTCTTCCTCGTTTTAGAGCTAGAAATAGC                    dkey:30j10.5 (Zebrafish)    CRISPR
30j10_gRNA2        GAAATTAATACGACTCACTATAGGAGGCAGCTGGGTACGAGCGTTTTAGAGCTAGAAATAGC                    dkey:30j10.5 (Zebrafish)    CRISPR
LOC101884840_gRNA  GAAATTAATACGACTCACTATAGGAACCTTGGAGGCCGTGAAGTTTTAGAGCTAGAAATAGC                    LOC101884840 (Zebrafish)    CRISPR
Dkey238_gRNA       GAAATTAATACGACTCACTATAGGTTGGACTCTCTTTCTGACGTTTTAGAGCTAGAAATAGC                    dkey:238d18.10 (Zebrafish)  CRISPR
gRNA_common        AAAAGCACCGACTCGGTGCCACTTTTTCAAGTTGATAACGGACTAGCCTTATTTTAACTTGCTATTTCTAGCTCTAAAAC                              CRISPR template
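
The layout of the oligos in Table S2 can be checked with a few lines of Python. The sketch below assumes, from inspection of the sequences rather than from any stated design rule, that each gene-specific oligo consists of a shared 5' segment containing T7 promoter sequence, a 20-nt target-specific region, and a shared 20-nt 3' tail that is complementary to the 3' end of gRNA_common. The constant and function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Sequences copied from Table S2.
GENE_SPECIFIC = {
    "30j10_gRNA1": "GAAATTAATACGACTCACTATAGGAAAGATCGCGTCTTCCTCGTTTTAGAGCTAGAAATAGC",
    "30j10_gRNA2": "GAAATTAATACGACTCACTATAGGAGGCAGCTGGGTACGAGCGTTTTAGAGCTAGAAATAGC",
    "LOC101884840_gRNA": "GAAATTAATACGACTCACTATAGGAACCTTGGAGGCCGTGAAGTTTTAGAGCTAGAAATAGC",
    "Dkey238_gRNA": "GAAATTAATACGACTCACTATAGGTTGGACTCTCTTTCTGACGTTTTAGAGCTAGAAATAGC",
}
GRNA_COMMON = ("AAAAGCACCGACTCGGTGCCACTTTTTCAAGTTGATAACGGACTAGCCTTATTTTAACTTGC"
               "TATTTCTAGCTCTAAAAC")

# Shared segments inferred by inspecting the oligos (an assumption).
T7_PREFIX = "GAAATTAATACGACTCACTATA"
SCAFFOLD_OVERLAP = "GTTTTAGAGCTAGAAATAGC"


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def target_region(oligo):
    """Segment between the shared 5' prefix and the shared 3' tail."""
    if not (oligo.startswith(T7_PREFIX) and oligo.endswith(SCAFFOLD_OVERLAP)):
        raise ValueError("oligo does not match the assumed layout")
    return oligo[len(T7_PREFIX):-len(SCAFFOLD_OVERLAP)]


def can_anneal(oligo, common=GRNA_COMMON):
    """True if the oligo's 3' tail is complementary to the 3' end of the common oligo."""
    tail = oligo[-len(SCAFFOLD_OVERLAP):]
    return common.endswith(reverse_complement(tail))


for name, oligo in GENE_SPECIFIC.items():
    print(name, target_region(oligo), can_anneal(oligo))
```

A check of this kind is a quick way to catch a transcription error in an oligo order, because any change in the 3' tail would remove the overlap on which the cloning-free template method depends.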

Conclusion

The evolution of Antarctic notothenioid fishes appears to have involved a gradual

reduction in the activity of the erythropoietic pathway (Eastman, 1993; Lau et al., 2012;

Wells et al., 1980), which culminated in the complete loss of production of typical

mature red blood cells in the derived monophyletic clade of icefishes (Channichthyidae).

Mutations in key erythroid genes or genetic pathways may have instigated the

evolutionary loss of red blood cells. Genes that are targets in human anemias are prime

candidates, but there may also be unknown genes whose mutation caused or

contributed to icefish anemia. As an example of the former, I found that the Antarctic

dragonfish P. charcoti (dragonfishes are the sister clade to the icefishes) produces

abnormal spherocytic erythrocytes and has mutations in erythroid beta-spectrin that are

identical to several that cause human hereditary spherocytic anemia. Moreover, some

mutations in icefishes, like the deletions of the adult alpha and beta globin genes

(Cocca et al., 1995b; di Prisco et al., 2002; Zhao et al., 1998b), may be a consequence

of relaxed selection due to the loss of globin transcriptional regulation (Lau et al., 2012)

or due to the loss of globin-expressing erythrocytes. Alternatively, one may speculate

that some regulator(s) of globin transcription (e.g., Lcr, Gata1, Hemgn) became

functionally compromised, and this led to the loss of globin expression and the

subsequent deletion of the locus.

Chapter 2

Teleost hemogen was first discovered as a marker that was strongly expressed

in the hematopoietic tissues of red-blooded notothenioids but not in the derived lineage

of white-blooded Antarctic icefishes (Detrich and Yergeau, 2004; Yergeau et al., 2005);

this finding implicated teleost hemogen in erythropoiesis. In my preliminary research, I

identified a frameshift mutation in the icefish hemogen gene, a defect that truncates the

putative transactivation domain (TAD) of the encoded transcription factor. The hemogen

gene is present in the genomes of all vertebrates except the superclass of jawless

fishes (Agnatha), and is found as a highly conserved, single-copy, four-exon gene.

Except in a few species, the hemogen gene has been preserved in the same state for

over 450 million years and is likely to be a gene that is crucial for vertebrate

development. In support of this, I found that the expression pattern of hemogen was

conserved between fishes and mammals. In zebrafish, hemogen expression is driven

by Gata1 in differentiating erythrocytes, in Sertoli cells of the testis, in the brain, and in

renal cells of the kidney. Two conserved non-coding elements function individually and

together to regulate hemogen expression in primitive and definitive waves of blood

development in zebrafish.

In icefish Hemogen, deletion of the putative TAD is likely to impair the

recruitment of P300 to the erythroid transcription factor, Gata1 (Zheng et al., 2014). To

determine the effects of the icefish hemogen mutation on erythroid development, I used

the CRISPR/Cas9 gene editing system to generate hemogen mutant zebrafish that

recapitulate the icefish mutation. I showed that the frameshift mutation in zebrafish

hemogen was a dominant-negative allele, which caused partial anemia in embryos and

adults. Therefore, intact Hemogen appears to be required for erythropoiesis and may

contribute to the anemia of Antarctic icefishes.

The transgenic zebrafish lines produced in this study provide the first in vivo

animal models to analyze the function of Hemogen during embryonic and adult

development. These zebrafish models may be useful in identifying causes and

treatments of human blood diseases that have been associated with hemogen

overexpression.

Chapter 1

To discern the mutation events that led to the deletion of the globin genes and

the loss of red blood cells in icefishes, I performed comparative transcriptomics of red-

and white-blooded notothenioid species. I show that the mutation in icefish hemogen

was one of several genetic defects in a shared molecular pathway that may contribute

to an intricate repression of erythropoiesis. Notably, icefish erythropoiesis may be

blocked by an acetylation imbalance caused by down-regulation of P300 and

overexpression of Hdac1b, two proteins that are known to regulate the activity of Gata1

(Boyes et al., 1998). Furthermore, both P300 and Gata1 contain predicted deleterious

substitutions in the domains that bind Hemogen, which suggests that this activating

complex was lost by icefishes. These mutations and the truncation of the Hemogen

TAD in icefishes may hint at the loss of a multi-protein complex formed by Hemogen,

Gata1, and other cofactors that bind the globin locus control region (Lcr).

Chapter 3

In red-blooded Antarctic notothenioid fishes, hemogen is expressed in the same

tissues at sites of embryonic and adult hematopoiesis, in the brain, and in renal cells of

the kidney. Despite their severe anemia, icefish embryos produce hemgn+ primitive

erythroid cells in lateral plate mesoderm and express the truncated allele at very low

levels compared to red-blooded species. Overexpression of icefish hemogen severely

impairs erythropoiesis in a zebrafish model, which indicates that this truncated hemogen

is a dominant-negative allele. Interestingly, icefishes produce another truncated isoform

of hemogen (hemgn-s) through alternative splicing, which is not seen in red-blooded

species. It is likely that this truncated Hemogen isoform also operates as a

dominant-negative protein to disrupt erythropoiesis in icefishes. This unique mechanism of

evolution is strikingly similar to splicing mutations that cause some human blood

diseases (Conboy, 2017). One might speculate that overexpression of the hemgn-s

isoform may have pre-empted the permanent deletion and truncation of the Hemogen

TAD in icefishes.

The evolutionary loss of red blood cells in Antarctic icefishes facilitated the

discovery of 31 novel erythroid genes. In icefishes, three blood-specific genes (hemgn,

cd33rsig, and mabp-like) contain nonsense mutations that disrupt important functional

domains. The hemgnuz2 and hemgnnuz4 mutant zebrafish lines produced in this study

demonstrate a critical role for the Hemogen C-terminal TAD in erythropoiesis.

Generation of stable mutant zebrafish lines for the cd33rsig and mabp-like alleles from

icefishes may also reveal novel roles for these genes in erythropoiesis.

References

Albertson, R.C., Cresko, W., Detrich, H.W., 3rd, Postlethwait, J.H., 2009. Evolutionary mutant models for human disease. Trends Genet 25, 74-81.

Albertson, R.C., Yan, Y.L., Titus, T.A., Pisano, E., Vacchi, M., Yelick, P.C., Detrich, H.W., 3rd, Postlethwait, J.H., 2010. Molecular pedomorphism underlies craniofacial skeletal evolution in Antarctic notothenioid fishes. BMC evolutionary biology 10, 4.

Allaire, P.D., Marat, A.L., Dall'Armi, C., Di Paolo, G., McPherson, P.S., Ritter, B., 2010. The Connecdenn DENN domain: a GEF for Rab35 mediating cargo-specific exit from early endosomes. Mol Cell 37, 370-382.

Altenhoff, A.M., Gil, M., Gonnet, G.H., Dessimoz, C., 2013. Inferring hierarchical orthologous groups from orthologous gene pairs. PLoS One 8, e53786.

Altenhoff, A.M., Schneider, A., Gonnet, G.H., Dessimoz, C., 2011. OMA 2011: orthology inference among 1000 complete genomes. Nucleic Acids Res 39, D289-294.

Altschul, S.F., Gish, W., Miller, W., Myers, E.W., Lipman, D.J., 1990. Basic local alignment search tool. Journal of molecular biology 215, 403-410.

Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W., Lipman, D.J., 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25, 3389-3402.

Alwine, J.C., Kemp, D.J., Stark, G.R., 1977. Method for detection of specific RNAs in agarose gels by transfer to diazobenzyloxymethyl-paper and hybridization with DNA probes. Proc Natl Acad Sci U S A 74, 5350-5354.

Amsterdam, A., Nissen, R.M., Sun, Z., Swindell, E.C., Farrington, S., Hopkins, N., 2004. Identification of 315 genes essential for early zebrafish development. Proc Natl Acad Sci U S A 101, 12792-12797.

An, L.L., Li, G., Wu, K.F., Ma, X.T., Zheng, G.G., Qiu, L.G., Song, Y.H., 2005. High expression of EDAG and its significance in AML. Leukemia 19, 1499-1502.

Archer, S.D., Johnston, I.A., 1987. Kinematics of labriform and subcarangiform swimming in the Antarctic fish Notothenia neglecta. J. Exp. Biol. 143, 195-210.

Archer, S.D., Johnston, I.A., 1991. Density of cristae and distribution of mitochondria in the slow muscle fibres of Antarctic fish. Physiol. Zool. 64, 242-258.

Ata, H., Clark, K.J., Ekker, S.C., 2016. The zebrafish genome editing toolkit.

Barber, D.L., Westerman, J.E.M., White, M.G., 1981. The blood cells of the Antarctic icefish Chaenocephalus aceratus Lönnberg: light and electron microscopic observations. J Fish Biol 19, 11-28.

Barisic, M., Korac, J., Pavlinac, I., Krzelj, V., Marusic, E., Vulliamy, T., Terzic, J., 2005. Characterization of G6PD deficiency in southern Croatia: description of a new variant, G6PD Split. J Hum Genet 50, 547-549.

Batada, N.N., Hurst, L.D., Tyers, M., 2006. Evolutionary and physiological importance of hub proteins. PLoS Comput Biol 2, e88.

Bennett, C.M., Kanki, J.P., Rhodes, J., Liu, T.X., Paw, B.H., Kieran, M.W., Langenau, D.M., Delahaye-Brown, A., Zon, L.I., Fleming, M.D., Look, A.T., 2001. Myelopoiesis in the zebrafish, Danio rerio. Blood 98, 643-651.

Berthelot, C., J., C., Desvignes, T., Detrich III, H.W., Flicek, P., Peck, L.S., Peters, M., Postlethwait, J.H., Clark, M.S., 2018. Manuscript in preparation. Adaptation of proteins to the cold in Antarctic fish: A role for Methionine?

Bertrand, J.Y., Kim, A.D., Teng, S., Traver, D., 2008. CD41+ cmyb+ precursors colonize the zebrafish pronephros by a novel migration route to initiate adult hematopoiesis. Development 135, 1853-1862.

Bertrand, J.Y., Kim, A.D., Violette, E.P., Stachura, D.L., Cisson, J.L., Traver, D., 2007a. Definitive hematopoiesis initiates through a committed erythromyeloid progenitor in the zebrafish embryo. Development 134, 4147-4156.

Bertrand, J.Y., Kim, A.D., Violette, E.P., Stachura, D.L., Cisson, J.L., Traver, D., 2007b. Definitive hematopoiesis initiates through a committed erythromyeloid progenitor in the zebrafish embryo. Development.

Blankenberg, D., Von Kuster, G., Coraor, N., Ananda, G., Lazarus, R., Mangan, M., Nekrutenko, A., Taylor, J., 2010. Galaxy: a web-based genome analysis tool for experimentalists. Current protocols in molecular biology Chapter 19, Unit 19 10 11-21.

Blobel, G.A., Nakajima, T., Eckner, R., Montminy, M., Orkin, S.H., 1998. CREB-binding protein cooperates with transcription factor GATA-1 and is required for erythroid differentiation. Proc Natl Acad Sci U S A 95, 2061-2066.

Bordoli, L., Husser, S., Luthi, U., Netsch, M., Osmani, H., Eckner, R., 2001. Functional analysis of the p300 acetyltransferase domain: the PHD finger of p300 but not of CBP is dispensable for enzymatic activity. Nucleic Acids Res 29, 4462-4471.

Boura, E., Hurley, J.H., 2012. Structural basis for membrane targeting by the MVB12-associated beta-prism domain of the human ESCRT-I MVB12 subunit. Proc Natl Acad Sci U S A 109, 1901-1906.

Boyes, J., Byfield, P., Nakatani, Y., Ogryzko, V., 1998. Regulation of activity of the transcription factor GATA-1 by acetylation. Nature 396, 594-598.

Bradbury, C.A., Khanim, F.L., Hayden, R., Bunce, C.M., White, D.A., Drayson, M.T., Craddock, C., Turner, B.M., 2005. Histone deacetylases in acute myeloid leukaemia show a distinctive pattern of expression that changes selectively in response to deacetylase inhibitors. Leukemia 19, 1751-1759.

Brinkman-Van der Linden, E.C., Angata, T., Reynolds, S.A., Powell, L.D., Hedrick, S.M., Varki, A., 2003. CD33/Siglec-3 binding specificity, expression pattern, and consequences of gene deletion in mice. Molecular and cellular biology 23, 4199-4206.

Broos, S., Hulpiau, P., Galle, J., Hooghe, B., Van Roy, F., De Bleser, P., 2011. ConTra v2: a tool to identify transcription factor binding sites across species, update 2011. Nucleic Acids Res 39, W74-78.

Cao, H., de Bono, B., Belov, K., Wong, E.S., Trowsdale, J., Barrow, A.D., 2009. Comparative genomics indicates the mammalian CD33rSiglec locus evolved by an ancient large-scale inverse duplication and suggests all Siglecs share a common ancestral region. Immunogenetics 61, 401-417.

Carroll, D., 2017. Genome Editing: Past, Present, and Future. Yale J Biol Med 90, 653-659.

Catchen, J.M., Conery, J.S., Postlethwait, J.H., 2009. Automated identification of conserved synteny after whole-genome duplication. Genome research 19, 1497-1505.

Chen, D.L., Hu, Z.Q., Zheng, X.F., Wang, X.Y., Xu, Y.Z., Li, W.Q., Fang, H.S., Kan, L., Wang, S.Y., 2016. EDAG-1 promotes proliferation and invasion of human thyroid cancer cells by activating MAPK/Erk and AKT signal pathways. Cancer biology & therapy 17, 414-421.

Chen, L., DeVries, A.L., Cheng, C.H., 1997. Evolution of antifreeze glycoprotein gene from a trypsinogen gene in Antarctic notothenioid fish. Proc Natl Acad Sci U S A 94, 3811-3816.

Chen, Z., Cheng, C.H., Zhang, J., Cao, L., Chen, L., Zhou, L., Jin, Y., Ye, H., Deng, C., Dai, Z., Xu, Q., Hu, P., Sun, S., Shen, Y., Chen, L., 2008. Transcriptomic and genomic evolution under constant cold in Antarctic notothenioid fish. Proc Natl Acad Sci U S A 105, 12944-12949.

Choi, Y., Chan, A.P., 2015. PROVEAN web server: a tool to predict the functional effect of amino acid substitutions and indels. Bioinformatics 31, 2745-2747.

Cocca, E., Ratnayake-Lecamwasam, M., Parker, S.K., Camardella, L., Ciaramella, M., di Prisco, G., Detrich, H.W., 1995a. Genomic remnants of α-globin genes in the hemoglobinless antarctic icefishes. Proc. Natl. Acad. Sci. U. S. A 92, 1817-1821.

Cocca, E., Ratnayake-Lecamwasam, M., Parker, S.K., Camardella, L., Ciaramella, M., di Prisco, G., Detrich, H.W., 3rd, 1995b. Genomic remnants of alpha-globin genes in the hemoglobinless antarctic icefishes. Proc Natl Acad Sci U S A 92, 1817-1821.

Collins, S., Coleman, H., Groudine, M., 1987. Expression of bcr and bcr-abl fusion transcripts in normal and leukemic cells. Molecular and cellular biology 7, 2870-2876.

Colombo, M., Damerau, M., Hanel, R., Salzburger, W., Matschiner, M., 2015. Diversity and disparity through time in the adaptive radiation of Antarctic notothenioid fishes. J Evol Biol 28, 376-394.

Conboy, J.G., 2017. RNA splicing during terminal erythropoiesis. Curr Opin Hematol 24, 215-221.

Crocker, P.R., Paulson, J.C., Varki, A., 2007. Siglecs and their roles in the immune system. Nat Rev Immunol 7, 255-266.

D'Andrea, A.D., Lodish, H.F., Wong, G.G., 1989. Expression cloning of the murine erythropoietin receptor. Cell 57, 277-285.

Davidson, A.J., Zon, L.I., 2004. The 'definitive' (and 'primitive') guide to zebrafish hematopoiesis. Oncogene 23, 7233-7246.

de Jong, J.L., Zon, L.I., 2005. Use of the Zebrafish to Study Primitive and Definitive Hematopoiesis. Annu. Rev. Genet 39, 481-501.

De Propris, M.S., Raponi, S., Diverio, D., Milani, M.L., Meloni, G., Falini, B., Foa, R., Guarini, A., 2011. High CD33 expression levels in acute myeloid leukemia cells carrying the nucleophosmin (NPM1) mutation. Haematologica 96, 1548-1551.

De Ruijter, A.J.M., Van Gennip, A.H., Caron, H.N., Kemp, S., Van Kuilenburg, A.B.P., 2003. Histone deacetylases (HDACs): characterization of the classical HDAC family. Biochem J 370, 737-749.

de Souza, R.F., Aravind, L., 2010. UMA and MABP domains throw light on receptor endocytosis and selection of endosomal cargoes. Bioinformatics 26, 1477-1480.

Denef, N., Chen, Y., Weeks, S.D., Barcelo, G., Schupbach, T., 2008. Crag regulates epithelial architecture and polarized deposition of basement membrane proteins in Drosophila. Dev Cell 14, 354-364.

Deng, C., Cheng, C.H., Ye, H., He, X., Chen, L., 2010. Evolution of an antifreeze protein by neofunctionalization under escape from adaptive conflict. Proc Natl Acad Sci U S A 107, 21593-21598.

Detrich, H.W., 3rd, Kieran, M.W., Chan, F.Y., Barone, L.M., Yee, K., Rundstadler, J.A., Pratt, S., Ransom, D., Zon, L.I., 1995. Intraembryonic hematopoietic cell migration during vertebrate development. Proc Natl Acad Sci U S A 92, 10713-10717.

Detrich, H.W., III., Westerfield, M., Zon, LI., 1999. Overview of the Zebrafish System. In Methods in Cell Biology, Overview of the Zebrafish system, Vol. 59. (Detrich, H. W., III, Westerfield, M., & Zon, L. I., Eds.), Elsevier Academic Press, San Diego, pp. 3-10

Detrich, H.W., Yergeau, D.A., 2004. Comparative genomics in erythropoietic gene discovery: synergisms between the Antarctic icefishes and the zebrafish, in: Detrich, H.W., Westerfield, M., Zon, L.I. (Eds.), Methods in Cell Biology, The Zebrafish, 2nd edition: Genetics, Genomics, and Informatics, Vol. 77. Elsevier Academic Press, San Diego, pp. 475-503.

DeVries, A.L., Eastman, J.T., 1978. Lipid sacs as a buoyancy adaptation in an Antarctic fish. Nature 271, 352-353.

di Prisco, G., Cocca, E., Parker, S., Detrich, H., 2002. Tracking the evolutionary loss of hemoglobin expression by the white-blooded Antarctic icefishes. Gene 295, 185-191.

Ding, Y.L., Xu, C.W., Wang, Z.D., Zhan, Y.Q., Li, W., Xu, W.X., Yu, M., Ge, C.H., Li, C.Y., Yang, X.M., 2010. Over-expression of EDAG in the myeloid cell line 32D: induction of GATA-1 expression and erythroid/megakaryocytic phenotype. Journal of cellular biochemistry 110, 866-874.

Dirks, W., Rome, D., Ringel, F., Jager, K., MacLeod, R.A., Drexler, H.G., 1999. Expression of the growth arrest-specific gene 6 (GAS6) in leukemia and lymphoma cell lines. Leuk Res 23, 643-651.

Dodd, R.B., Meadows, W., Qamar, S., Johnson, C.M., Kronenberg-Versteeg, D., St George-Hyslop, P., 2016, to be published. Structure of ligand bound CD33 receptor associated with Alzheimer's disease.

Dolznig, H., Habermann, B., Stangl, K., Deiner, E.M., Moriggl, R., Beug, H., Mullner, E.W., 2002. Apoptosis protection by the Epo target Bcl-X(L) allows factor-independent differentiation of primary erythroblasts. Curr Biol 12, 1076-1085.

Drabsch, Y., ten Dijke, P., 2012. TGF-beta signalling and its role in cancer progression and metastasis. Cancer Metastasis Rev 31, 553-568.

Dyson, H.J., Wright, P.E., 2005. Intrinsically unstructured proteins and their functions. Nat Rev Mol Cell Bio 6, 197-208.

Dyson, H.J., Wright, P.E., 2016. Role of Intrinsic Protein Disorder in the Function and Interactions of the Transcriptional Coactivators CREB-binding Protein (CBP) and p300. The Journal of biological chemistry 291, 6714-6722.

Eastman, J.T., 1993. Antarctic fish biology : evolution in a unique environment. Academic Press, San Diego.

Eastman, J.T., Eakin, R.R., 2000. An updated species list for notothenioid fish (Perciformes; Notothenioidei), with comments on Antarctic species. Archive of Fishery and Marine Research 48, 11-20.

Eberharter, A., Becker, P.B., 2002. Histone acetylation: a switch between repressive and permissive chromatin. Second in review series on chromatin dynamics. EMBO Rep 3, 224-229.

El-Brolosy, M.A., Stainier, D.Y.R., 2017. Genetic compensation: A phenomenon in search of mechanisms. PLoS Genet 13, e1006780.

Evans, C.J., Hartenstein, V., Banerjee, U., 2003. Thicker than blood: conserved mechanisms in Drosophila and vertebrate hematopoiesis. Dev Cell 5, 673-690.

Ferreira, R., Ohneda, K., Yamamoto, M., Philipsen, S., 2005. GATA1 function, a paradigm for transcription factors in hematopoiesis. Molecular and cellular biology 25, 1215-1227.

Forbes, S.A., Beare, D., Boutselakis, H., Bamford, S., Bindal, N., Tate, J., Cole, C.G., Ward, S., Dawson, E., Ponting, L., Stefancsik, R., Harsha, B., Kok, C.Y., Jia, M., Jubb, H., Sondka, Z., Thompson, S., De, T., Campbell, P.J., 2017. COSMIC: somatic cancer genetics at high-resolution. Nucleic Acids Res 45, D777-D783.

Frame, J.M., Lim, S.-E., North, T.E., 2017. Hematopoietic stem cell development: using the zebrafish to identify extrinsic and intrinsic mechanisms regulating hematopoiesis. In Methods in Cell Biology, The Zebrafish: Disease Models and Chemical Screens, 4th Edition. Vol. 138. (Detrich, H. W., III, Westerfield, M., & Zon, L. I., Eds.), Elsevier Academic Press, San Diego, pp. 165-184

Fransecky, L., Mochmann, L.H., Baldus, C.D., 2015. Outlook on PI3K/AKT/mTOR inhibition in acute leukemia. Mol Cell Ther 3, 2.

Fu, L., Niu, B., Zhu, Z., Wu, S., Li, W., 2012. CD-HIT: accelerated for clustering the next-generation sequencing data. Bioinformatics 28, 3150-3152.

Galloway, J.L., Zon, L.I., 2003. Ontogeny of hematopoiesis: examining the emergence of hematopoietic cells in the vertebrate embryo. Current topics in developmental biology 53, 139-158.

Ganis, J.J., Hsia, N., Trompouki, E., de Jong, J.L., DiBiase, A., Lambert, J.S., Jia, Z., Sabo, P.J., Weaver, M., Sandstrom, R., Stamatoyannopoulos, J.A., Zhou, Y., Zon, L.I., 2012. Zebrafish globin switching occurs in two developmental stages and is controlled by the LCR. Dev Biol 366, 185-194.

Gardiner, M.R., Gongora, M.M., Grimmond, S.M., Perkins, A.C., 2007. A global role for zebrafish klf4 in embryonic erythropoiesis. Mech Dev 124, 762-774.

Geiger, T.L., Rubnitz, J.E., 2015. New approaches for the immunotherapy of acute myeloid leukemia. Discov Med 19, 275-284.

Gekas, C., Graf, T., 2013. CD41 expression marks myeloid-biased adult hematopoietic stem cells and increases with age. Blood 121, 4463-4472.

Ghigliotti, L., Cheng, C.H., Pisano, E., 2016. Sex determination in Antarctic notothenioid fish: chromosomal clues and evolutionary hypotheses. Polar Biology 39.

Ghosh, J., Kapur, R., 2017. Role of mTORC1-S6K1 signaling pathway in regulation of hematopoietic stem cell and acute myeloid leukemia. Exp Hematol 50, 13-21.

Giardine, B., Riemer, C., Hardison, R.C., Burhans, R., Elnitski, L., Shah, P., Zhang, Y., Blankenberg, D., Albert, I., Taylor, J., Miller, W., Kent, W.J., Nekrutenko, A., 2005. Galaxy: a platform for interactive large-scale genome analysis. Genome research 15, 1451-1455.

Goecks, J., Nekrutenko, A., Taylor, J., The Galaxy Team, 2010. Galaxy: a comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences. Genome Biol 11, R86.

Guan, Y., Zhu, Q., Huang, D., Zhao, S., Jan Lo, L., Peng, J., 2015. An equation to estimate the difference between theoretically predicted and SDS PAGE-displayed molecular weights for an acidic peptide. Sci Rep 5, 13370.

Gupta, T., Mullins, M.C., 2010. Dissection of organs from the adult zebrafish. Journal of visualized experiments : JoVE 37, e1717.

Haas, B.J., Papanicolaou, A., Yassour, M., Grabherr, M., Blood, P.D., Bowden, J., Couger, M.B., Eccles, D., Li, B., Lieber, M., Macmanes, M.D., Ott, M., Orvis, J., Pochet, N., Strozzi, F., Weeks, N., Westerman, R., William, T., Dewey, C.N., Henschel, R., Leduc, R.D., Friedman, N., Regev, A., 2013. De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis. Nature protocols 8, 1494-1512.

Hattangadi, S.M., Wong, P., Zhang, L., Flygare, J., Lodish, H.F., 2011. From stem cell to red cell: regulation of erythropoiesis at multiple levels by multiple proteins, RNAs, and chromatin modifications. Blood 118, 6258-6268.

He, X., Zhang, J., 2006. Why do hubs tend to be essential in protein networks? PLoS Genet 2, e88.

Heger, A., Holm, L., 2000. Rapid automatic detection and alignment of repeats in protein sequences. Proteins 41, 224-237.

Heideman, M.R., Lancini, C., Proost, N., Yanover, E., Jacobs, H., Dannenberg, J.H., 2014. Sin3a-associated Hdac1 and Hdac2 are essential for hematopoietic stem cell homeostasis and contribute differentially to hematopoiesis. Haematologica 99, 1292-1303.

Helantera, H., Uller, T., 2014. Neutral and adaptive explanations for an association between caste-biased gene expression and rate of sequence evolution. Frontiers in genetics 5, 297.

Herrero, J., Muffato, M., Beal, K., Fitzgerald, S., Gordon, L., Pignatelli, M., Vilella, A.J., Searle, S.M., Amode, R., Brent, S., Spooner, W., Kulesha, E., Yates, A., Flicek, P., 2016. Ensembl comparative genomics resources. Database : the journal of biological databases and curation 2016.

Hong, W., Nakazawa, M., Chen, Y.Y., Kori, R., Vakoc, C.R., Rakowski, C., Blobel, G.A., 2005. FOG-1 recruits the NuRD repressor complex to mediate transcriptional repression by GATA-1. EMBO J 24, 2367-2378.

Hossain, M.S., Larsson, A., Scherbak, N., Olsson, P.E., Orban, L., 2008. Zebrafish androgen receptor: isolation, molecular, and biochemical characterization. Biol Reprod 78, 361-369.

Howe, K., Clark, M.D., Torroja, C.F., Torrance, J., Berthelot, C., Muffato, M., Collins, J.E., Humphray, S., McLaren, K., Matthews, L., McLaren, S., Sealy, I., Caccamo, M., Churcher, C., Scott, C., Barrett, J.C., Koch, R., Rauch, G.J., White, S., Chow, W., Kilian, B., Quintais, L.T., Guerra-Assuncao, J.A., Zhou, Y., Gu, Y., Yen, J., Vogel, J.H., Eyre, T., Redmond, S., Banerjee, R., Chi, J., Fu, B., Langley, E., Maguire, S.F., Laird, G.K., Lloyd, D., Kenyon, E., Donaldson, S., Sehra, H., Almeida-King, J., Loveland, J., Trevanion, S., Jones, M., Quail, M., Willey, D., Hunt, A., Burton, J., Sims, S., McLay, K., Plumb, B., Davis, J., Clee, C., Oliver, K., Clark, R., Riddle, C., Elliot, D., Threadgold, G., Harden, G., Ware, D., Begum, S., Mortimore, B., Kerry, G., Heath, P., Phillimore, B., Tracey, A., Corby, N., Dunn, M., Johnson, C., Wood, J., Clark, S., Pelan, S., Griffiths, G., Smith, M., Glithero, R., Howden, P., Barker, N., Lloyd, C., Stevens, C., Harley, J., Holt, K., Panagiotidis, G., Lovell, J., Beasley, H., Henderson, C., Gordon, D., Auger, K., Wright, D., Collins, J., Raisen, C., Dyer, L., Leung, K., Robertson, L., Ambridge, K., Leongamornlert, D., McGuire, S., Gilderthorp, R., Griffiths, C., Manthravadi, D., Nichol, S., Barker, G., Whitehead, S., Kay, M., Brown, J., Murnane, C., Gray, E., Humphries, M., Sycamore, N., Barker, D., Saunders, D., Wallis, J., Babbage, A., Hammond, S., Mashreghi-Mohammadi, M., Barr, L., Martin, S., Wray, P., Ellington, A., Matthews, N., Ellwood, M., Woodmansey, R., Clark, G., Cooper, J., Tromans, A., Grafham, D., Skuce, C., Pandian, R., Andrews, R., Harrison, E., Kimberley, A., Garnett, J., Fosker, N., Hall, R., Garner, P., Kelly, D., Bird, C., Palmer, S., Gehring, I., Berger, A., Dooley, C.M., Ersan-Urun, Z., Eser, C., Geiger, H., Geisler, M., Karotki, L., Kirn, A., Konantz, J., Konantz, M., Oberlander, M., Rudolph-Geiger, S., Teucke, M., Lanz, C., Raddatz, G., Osoegawa, K., Zhu, B., Rapp, A., Widaa, S., Langford, C., Yang, F., Schuster, S.C., Carter, N.P., Harrow, J., Ning, Z., Herrero, J., Searle, S.M., Enright, A., Geisler, R., Plasterk, R.H., Lee, C., Westerfield, M., de Jong, P.J., Zon, L.I., Postlethwait, J.H., Nusslein-Volhard, C., Hubbard, T.J., Roest Crollius, H., Rogers, J., Stemple, D.L., 2013. The zebrafish reference genome sequence and its relationship to the human genome. Nature 496, 498-503.

Hruscha, A., Krawitz, P., Rechenberg, A., Heinrich, V., Hecht, J., Haass, C., Schmid, B., 2013. Efficient CRISPR/Cas9 genome editing with low off-target effects in zebrafish. Development 140, 4982-4987.

Hubank, M., Schatz, D.G., 1999. cDNA representational difference analysis: a sensitive and flexible method for identification of differentially expressed genes. Methods Enzymol 303, 325-349.

Hureau, J.C., 1966. Biologie de Chaenichthys rhinoceratus Richardson, et problème du sang incolore des Chaenichthyidae, poissons des mers australes. Bulletin de la Societe Zoologique de France 91, 735-751.

Iuchi, I., Yamamoto, M., 1983. Erythropoiesis in the developing rainbow trout, Salmo gairdneri irideus: histochemical and immunochemical detection of erythropoietic organs. J Exp Zool 226, 409-417.

Jacobs, N.L., Albertson, R.C., Wiles, J.R., 2011. Using whole mount in situ hybridization to link molecular and organismal biology. Journal of visualized experiments : JoVE 49, e2533.

Janzen, V., Fleming, H.E., Waring, M.T., Milne, C.D., Scadden, D.T., 2006. Multifunctional role of caspase-3 in regulating hematopoietic stem cells. Blood 108, 258a-258a.

Jensen, L.J., Kuhn, M., Stark, M., Chaffron, S., Creevey, C., Muller, J., Doerks, T., Julien, P., Roth, A., Simonovic, M., Bork, P., von Mering, C., 2009. STRING 8--a global view on proteins and their functional interactions in 630 organisms. Nucleic Acids Res 37, D412-416.

Jeong, H., Mason, S.P., Barabasi, A.L., Oltvai, Z.N., 2001. Lethality and centrality in protein networks. Nature 411, 41-42.

Jeswin, J., Joo, M.S., Jeong, J.M., Bae, J.S., Choi, K.M., Cho, D.H., Park, S.I., Park, C.I., 2018. The first report of siglec-3/CD33 gene in a teleost (rock bream, Oplegnathus fasciatus): An analysis of its spatial expression during stimulation to red seabream iridovirus (RSIV) and two bacterial pathogens. Dev Comp Immunol 84, 117-122.

Jiang, J., Yu, H., Shou, Y., Neale, G., Zhou, S., Lu, T., Sorrentino, B.P., 2010. Hemgn is a direct transcriptional target of HOXB4 and induces expansion of murine myeloid progenitor cells. Blood 116, 711-719.

Jin, H., Sood, R., Xu, J., Zhen, F., English, M.A., Liu, P.P., Wen, Z., 2009. Definitive hematopoietic stem/progenitor cells manifest distinct differentiation output in the zebrafish VDA and PBI. Development 136, 647-654.

Jinek, M., Chylinski, K., Fonfara, I., Hauer, M., Doudna, J.A., Charpentier, E., 2012. A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity. Science 337, 816-821.

Jones, P., Binns, D., Chang, H.Y., Fraser, M., Li, W., McAnulla, C., McWilliam, H., Maslen, J., Mitchell, A., Nuka, G., Pesseat, S., Quinn, A.F., Sangrador-Vegas, A., Scheremetjew, M., Yong, S.Y., Lopez, R., Hunter, S., 2014. InterProScan 5: genome-scale protein function classification. Bioinformatics 30, 1236-1240.

Kafina, M.D., Paw, B.H., 2018. Using the Zebrafish as an Approach to Examine the Mechanisms of Vertebrate Erythropoiesis. Methods Mol Biol 1698, 11-36.

Kasper, L.H., Boussouar, F., Ney, P.A., Jackson, C.W., Rehg, J., van Deursen, J.M., Brindle, P.K., 2002. A transcription-factor-binding surface of coactivator p300 is required for haematopoiesis. Nature 419, 738-743.

Kawakami, K., Asakawa, K., Muto, A., Wada, H., 2016. Tol2-mediated transgenesis, gene trapping, enhancer trapping, and Gal4-UAS system. In Methods in Cell Biology, The Zebrafish: Genetics, Genomics, and Transcriptomics, 4th Edition, Vol. 135. (Detrich, H. W., III, Westerfield, M., & Zon, L. I., Eds.), Elsevier Academic Press, San Diego, pp. 19-36.

Kearse, M., Moir, R., Wilson, A., Stones-Havas, S., Cheung, M., Sturrock, S., Buxton, S., Cooper, A., Markowitz, S., Duran, C., Thierer, T., Ashton, B., Meintjes, P., Drummond, A., 2012. Geneious Basic: an integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics 28, 1647-1649.

Kennett, J., 1977. Cenozoic evolution of Antarctic glaciation, the circum‐Antarctic Ocean, and their impact on global paleoceanography. Journal of Geophysical Research 82, 3843-3860.

Kersey, P.J., Allen, J.E., Armean, I., Boddu, S., Bolt, B.J., Carvalho-Silva, D., Christensen, M., Davis, P., Falin, L.J., Grabmueller, C., Humphrey, J., Kerhornou, A., Khobova, J., Aranganathan, N.K., Langridge, N., Lowy, E., McDowall, M.D., Maheswari, U., Nuhn, M., Ong, C.K., Overduin, B., Paulini, M., Pedro, H., Perry, E., Spudich, G., Tapanari, E., Walts, B., Williams, G., Tello-Ruiz, M., Stein, J., Wei, S., Ware, D., Bolser, D.M., Howe, K.L., Kulesha, E., Lawson, D., Maslen, G., Staines, D.M., 2016. Ensembl Genomes 2016: more genomes, more complexity. Nucleic Acids Res 44, D574-580.

Kingsley, P.D., Greenfest-Allen, E., Frame, J.M., Bushnell, T.P., Malik, J., McGrath, K.E., Stoeckert, C.J., Palis, J., 2013. Ontogeny of erythroid gene expression. Blood 121, e5-e13.

Krantz, S.B., 1991. Erythropoietin. Blood 77, 419-434.

Krivega, I., Dean, A., 2016. Chromatin looping as a target for altering erythroid gene expression. Ann N Y Acad Sci 1368, 31-39.

Kruger, A., Ellerstrom, C., Lundmark, C., Christersson, C., Wurtz, T., 2002. RP59, a marker for osteoblast recruitment, is also detected in primitive mesenchymal cells, erythroid cells, and megakaryocytes. Developmental dynamics : an official publication of the American Association of Anatomists 223, 414-418.

Kruger, A., Somogyi, E., Christersson, C., Lundmark, C., Hultenby, K., Wurtz, T., 2005. Rat enamel contains RP59: a new context for a protein from osteogenic and haematopoietic precursor cells. Cell Tissue Res 320, 141-148.

Kuhn, D.E., O'Brien, K.M., Crockett, E.L., 2016. Expansion of capacities for iron transport and sequestration reflects plasma volumes and heart mass among white-blooded notothenioid fishes. Am J Physiol Regul Integr Comp Physiol 311, 649-657.

Kuhn, K.L., Near, T.J., Detrich, H.W., 3rd, Eastman, J.T., 2010. Biology of the Antarctic dragonfish Vomeridens infuscipinnis (Notothenioidei: Bathydraconidae). Antarctic Science 23, 18-26.

Kulkeaw, K., Ishitani, T., Kanemaru, T., Fucharoen, S., Sugiyama, D., 2010. Cold exposure down-regulates zebrafish hematopoiesis. Biochem Biophys Res Commun 394, 859-864.

Kunzmann, A., Caruso, C., di Prisco, G., 1991. Hematological studies on a high-Antarctic fish - Bathydraco marri Norman. J Exp Mar Biol Ecol 152, 243-255.

Kwan, K.M., Fujimoto, E., Grabher, C., Mangum, B.D., Hardy, M.E., Campbell, D.S., Parant, J.M., Yost, H.J., Kanki, J.P., Chien, C.B., 2007. The Tol2kit: a multisite gateway-based construction kit for Tol2 transposon transgenesis constructs. Developmental dynamics : an official publication of the American Association of Anatomists 236, 3088-3099.

Labun, K., Montague, T.G., Gagnon, J.A., Thyme, S.B., Valen, E., 2016. CHOPCHOP v2: a web tool for the next generation of CRISPR genome engineering. Nucleic Acids Res 44, W272-276.

Landrum, M.J., Lee, J.M., Benson, M., Brown, G., Chao, C., Chitipiralla, S., Gu, B., Hart, J., Hoffman, D., Hoover, J., Jang, W., Katz, K., Ovetsky, M., Riley, G., Sethi, A., Tully, R., Villamarin-Salomon, R., Rubinstein, W., Maglott, D.R., 2016. ClinVar: public archive of interpretations of clinically relevant variants. Nucleic Acids Res 44, D862-868.

Langmead, B., Trapnell, C., Pop, M., Salzberg, S.L., 2009. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol 10, R25.

Lau, Y.T., Parker, S.K., Near, T.J., Detrich, H.W., 3rd, 2012. Evolution and function of the globin intergenic regulatory regions of the antarctic dragonfishes (Notothenioidei: Bathydraconidae). Molecular biology and evolution 29, 1071-1080.

Lee, S.H., Chiu, Y.C., Li, Y.H., Lin, C.C., Hou, H.A., Chou, W.C., Tien, H.F., 2017. High expression of dedicator of cytokinesis 1 (DOCK1) confers poor prognosis in acute myeloid leukemia. Oncotarget 8, 72250-72259.

Leichty, A.R., Pfennig, D.W., Jones, C.D., Pfennig, K.S., 2012. Relaxed genetic constraint is ancestral to the evolution of phenotypic plasticity. Integrative and comparative biology 52, 16-30.

Li, B., Dewey, C.N., 2011. RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome. BMC bioinformatics 12, 323.

Li, C.Y., Zhan, Y.Q., Li, W., Xu, C.W., Xu, W.X., Yu, D.H., Peng, R.Y., Cui, Y.F., Yang, X., Hou, N., Li, Y.H., Dong, B., Sun, H.B., Yang, X.M., 2007. Overexpression of a hematopoietic transcriptional regulator EDAG induces myelopoiesis and suppresses lymphopoiesis in transgenic mice. Leukemia 21, 2277-2286.

Li, C.Y., Zhan, Y.Q., Xu, C.W., Xu, W.X., Wang, S.Y., Lv, J., Zhou, Y., Yue, P.B., Chen, B., Yang, X.M., 2004. EDAG regulates the proliferation and differentiation of hematopoietic cells and resists cell apoptosis through the activation of nuclear factor-kappa B. Cell death and differentiation 11, 1299-1308.

Liao, E.C., Paw, B.H., Peters, L.L., Zapata, A., Pratt, S.J., Do, C.P., Lieschke, G., Zon, L.I., 2000. Hereditary spherocytosis in zebrafish riesling illustrates evolution of erythroid beta-spectrin structure, and function in red cell morphogenesis and membrane stability. Development 127, 5123-5132.

Lin, C.S., Lim, S.K., D'Agati, V., Costantini, F., 1996. Differential effects of an erythropoietin receptor gene disruption on primitive and definitive erythropoiesis. Genes Dev 10, 154-164.

Lin, H.F., Traver, D., Zhu, H., Dooley, K., Paw, B.H., Zon, L.I., Handin, R.I., 2005. Analysis of thrombocyte development in CD41-GFP transgenic zebrafish. Blood 106, 3803-3810.

Lin, Y., Dobbs, G.H., 3rd, Devries, A.L., 1974. Oxygen consumption and lipid content in red and white muscles of Antarctic fishes. J Exp Zool 189, 379-386.

Lomako, V.V., Shilo, A.V., Kovalenko, I.F., Babiichuk, G.A., 2015. [Erythrocytes of hetero- and homoiothermal animals in natural and artificial hypothermia]. Zh Evol Biokhim Fiziol 51, 52-59.

Love, P.E., Warzecha, C., Li, L., 2014. Ldb1 complexes: the new master regulators of erythroid gene transcription. Trends Genet 30, 1-9.

Lu, J., Xu, W.X., Wang, S.Y., Jiang, Y., Li, C.Y., Cai, W.M., Yang, X.M., 2002. [Overexpression of EDAG-1 in NIH3T3 cells leads to malignant transformation]. Sheng Wu Hua Xue Yu Sheng Wu Wu Li Xue Bao (Shanghai) 34, 95-98.

Lu, J., Xu, W.X., Wang, S.Y., Zhan, Y.Q., Jiang, Y., Cai, W.M., Yang, X.M., 2001. Isolation and Characterization of EDAG-1, A Novel Gene Related to Regulation in Hematopoietic System. Sheng Wu Hua Xue Yu Sheng Wu Wu Li Xue Bao (Shanghai) 33, 641-646.

Lyons, S.E., Lawson, N.D., Lei, L., Bennett, P.E., Weinstein, B.M., Liu, P.P., 2002. A nonsense mutation in zebrafish gata1 causes the bloodless phenotype in vlad tepes. Proc Natl Acad Sci U S A 99, 5454-5459.

Maekawa, S., Iemura, H., Kuramochi, Y., Nogawa-Kosaka, N., Nishikawa, H., Okui, T., Aizawa, Y., Kato, T., 2012. Hepatic confinement of newly produced erythrocytes caused by low-temperature exposure in Xenopus laevis. The Journal of experimental biology 215, 3087-3095.

Marchler-Bauer, A., Derbyshire, M.K., Gonzales, N.R., Lu, S., Chitsaz, F., Geer, L.Y., Geer, R.C., He, J., Gwadz, M., Hurwitz, D.I., Lanczycki, C.J., Lu, F., Marchler, G.H., Song, J.S., Thanki, N., Wang, Z., Yamashita, R.A., Zhang, D., Zheng, C., Bryant, S.H., 2015. CDD: NCBI's conserved domain database. Nucleic Acids Res 43, D222-226.

Martin, G.S., 2003. Cell signaling and cancer. Cancer Cell 4, 167-174.

Matschiner, M., Colombo, M., Damerau, M., Ceballos, S., Hanel, R., Salzburger, W., 2015. The Adaptive Radiation of Notothenioid Fishes in the Waters of Antarctica. Springer International Publishing Switzerland.

Matschiner, M., Hanel, R., Salzburger, W., 2011. On the origin and trigger of the notothenioid adaptive radiation. PLoS One 6, e18911.

Maximow, A., 1909. Der Lymphozyt als gemeinsame Stammzelle der verschiedenen Blutelemente in der embryonalen Entwicklung und im postfetalen Leben der Säugetiere. Fol. Haematol. 8, 125-134.

McDevitt, M.A., Fujiwara, Y., Shivdasani, R.A., Orkin, S.H., 1997. An upstream, DNase I hypersensitive region of the hematopoietic-expressed transcription factor GATA-1 gene confers developmental specificity in transgenic mice. Proc Natl Acad Sci U S A 94, 7976-7981.

McGuckin, C.P., Forraz, N., Liu, W.M., 2003. Diaminofluorene stain detects erythroid differentiation in immature haemopoietic cells treated with EPO, IL-3, SCF, TGFbeta1, MIP-1alpha and IFNgamma. European journal of haematology 70, 106-114.

Medvinsky, A., Rybtsov, S., Taoudi, S., 2011. Embryonic origin of the adult hematopoietic system: advances and questions. Development 138, 1017-1031.

Miccio, A., Wang, Y., Hong, W., Gregory, G.D., Wang, H., Yu, X., Choi, J.K., Shelat, S., Tong, W., Poncz, M., Blobel, G.A., 2010. NuRD mediates activating and repressive functions of GATA-1 and FOG-1 during blood development. EMBO J 29, 442-456.

Montague, T.G., Cruz, J.M., Gagnon, J.A., Church, G.M., Valen, E., 2014. CHOPCHOP: a CRISPR/Cas9 and TALEN web tool for genome editing. Nucleic Acids Res 42, W401-407.

Montgomery, J., Clements, K., 2000. Disaptation and recovery in the evolution of Antarctic fishes. Trends in ecology & evolution 15, 267-271.

Mortensen, M., Ferguson, D.J., Edelmann, M., Kessler, B., Morten, K.J., Komatsu, M., Simon, A.K., 2010. Loss of autophagy in erythroid cells leads to defective removal of mitochondria and severe anemia in vivo. Proc Natl Acad Sci U S A 107, 832-837.

Mugal, C.F., Wolf, J.B., Kaj, I., 2014. Why time matters: codon evolution and the temporal dynamics of dN/dS. Molecular biology and evolution 31, 212-231.

Murayama, E., Kissa, K., Zapata, A., Mordelet, E., Briolat, V., Lin, H.F., Handin, R.I., Herbomel, P., 2006. Tracing hematopoietic precursor migration to successive hematopoietic organs during zebrafish development. Immunity 25, 963-975.

Naka, K., Hoshii, T., Muraguchi, T., Tadokoro, Y., Ooshio, T., Kondo, Y., Nakao, S., Motoyama, N., Hirao, A., 2010. TGF-beta-FOXO signalling maintains leukaemia-initiating cells in chronic myeloid leukaemia. Nature 463, 676-680.

Nakata, T., Ishiguro, M., Aduma, N., Izumi, H., Kuroiwa, A., 2013. Chicken hemogen homolog is involved in the chicken-specific sex-determining mechanism. Proc Natl Acad Sci U S A 110, 3417-3422.

Near, T.J., Parker, S.K., Detrich, H.W., 2006a. A genomic fossil reveals key steps in hemoglobin loss by the antarctic icefishes. Mol. Biol. Evol 23, 2008-2016.

Near, T.J., Parker, S.K., Detrich, H.W., 3rd, 2006b. A genomic fossil reveals key steps in hemoglobin loss by the antarctic icefishes. Molecular biology and evolution 23, 2008-2016.

Near, T.J., Pesavento, J.J., Cheng, C.H., 2003. Mitochondrial DNA, morphology, and the phylogenetic relationships of Antarctic icefishes (Notothenioidei: Channichthyidae). Mol Phylogenet Evol 28, 87-98.

Nguyen, D.H., Ball, E.D., Varki, A., 2006. Myeloid precursors and acute myeloid leukemia cells express multiple CD33-related Siglecs. Exp Hematol 34, 728-735.

Nishikawa, K., Kobayashi, M., Masumi, A., Lyons, S.E., Weinstein, B.M., Liu, P.P., Yamamoto, M., 2003. Self-association of Gata1 enhances transcriptional activity in vivo in zebra fish embryos. Mol Cell Biol 23, 8295-8305.

Notredame, C., Higgins, D.G., Heringa, J., 2000. T-Coffee: A novel method for fast and accurate multiple sequence alignment. Journal of molecular biology 302, 205-217.

O'Brien, K.M., Mueller, I.A., 2010. The unique mitochondrial form and function of Antarctic channichthyid icefishes. Integrative and comparative biology 50, 993-1008.

O'Leary, N.A., Wright, M.W., Brister, J.R., Ciufo, S., Haddad, D., McVeigh, R., Rajput, B., Robbertse, B., Smith-White, B., Ako-Adjei, D., Astashyn, A., Badretdin, A., Bao, Y., Blinkova, O., Brover, V., Chetvernin, V., Choi, J., Cox, E., Ermolaeva, O., Farrell, C.M., Goldfarb, T., Gupta, T., Haft, D., Hatcher, E., Hlavina, W., Joardar, V.S., Kodali, V.K., Li, W., Maglott, D., Masterson, P., McGarvey, K.M., Murphy, M.R., O'Neill, K., Pujar, S., Rangwala, S.H., Rausch, D., Riddick, L.D., Schoch, C., Shkeda, A., Storz, S.S., Sun, H., Thibaud-Nissen, F., Tolstoy, I., Tully, R.E., Vatsan, A.R., Wallin, C., Webb, D., Wu, W., Landrum, M.J., Kimchi, A., Tatusova, T., DiCuccio, M., Kitts, P., Murphy, T.D., Pruitt, K.D., 2016. Reference sequence (RefSeq) database at NCBI: current status, taxonomic expansion, and functional annotation. Nucleic Acids Res 44, D733-745.

Oellacher, J., 1872. Beiträge zur Entwicklungsgeschichte der Knochenfische nach Beobachtungen am Bachforelleneie. Z. Wiss. Zool. 23, 373-421.

Omichinski, J.G., Trainor, C., Evans, T., Gronenborn, A.M., Clore, G.M., Felsenfeld, G., 1993. A small single-"finger" peptide from the erythroid transcription factor GATA-1 binds specifically to DNA as a zinc or iron complex. Proc Natl Acad Sci U S A 90, 1676-1680.

Onodera, K., Takahashi, S., Nishimura, S., Ohta, J., Motohashi, H., Yomogida, K., Hayashi, N., Engel, J.D., Yamamoto, M., 1997. GATA-1 transcription is controlled by distinct regulatory mechanisms during primitive and definitive erythropoiesis. Proc Natl Acad Sci U S A 94, 4487-4492.

Orkin, S.H., Zon, L.I., 1997. Genetics of erythropoiesis: induced mutations in mice and zebrafish. Annu Rev Genet 31, 33-60.

Orlacchio, A., Ranieri, M., Brave, M., Arciuch, V.A., Forde, T., De Martino, D., Anderson, K.E., Hawkins, P., Di Cristofano, A., 2017. SGK1 Is a Critical Component of an AKT-Independent Pathway Essential for PI3K-Mediated Tumor Development and Maintenance. Cancer Res 77, 6914-6926.

Osborne, C.S., Chakalova, L., Brown, K.E., Carter, D., Horton, A., Debrand, E., Goyenechea, B., Mitchell, J.A., Lopes, S., Reik, W., Fraser, P., 2004. Active genes dynamically colocalize to shared sites of ongoing transcription. Nat Genet 36, 1065-1071.

Paffett-Lugassy, N.N., Zon, L.I., 2005. Analysis of hematopoietic development in the zebrafish. Methods Mol. Med 105, 171-198.

Palis, J., 2014. Primitive and definitive erythropoiesis in mammals. Front Physiol 5, 3.

Park, S., Chapuis, N., Tamburini, J., Bardet, V., Cornillet-Lefebvre, P., Willems, L., Green, A., Mayeux, P., Lacombe, C., Bouscary, D., 2010. Role of the PI3K/AKT and mTOR signaling pathways in acute myeloid leukemia. Haematologica 95, 819-828.

Pascual-Anaya, J., Albuixech-Crespo, B., Somorjai, I.M., Carmona, R., Oisi, Y., Alvarez, S., Kuratani, S., Munoz-Chapuli, R., Garcia-Fernandez, J., 2013. The evolutionary origins of chordate hematopoiesis and vertebrate endothelia. Dev Biol 375, 182-192.

Paul, S.P., Taylor, L.S., Stansbury, E.K., McVicar, D.W., 2000. Myeloid specific human CD33 is an inhibitory receptor with differential ITIM function in recruiting the phosphatases SHP-1 and SHP-2. Blood 96, 483-490.

Paw, B.H., Zon, L.I., 2000. Zebrafish: a genetic approach in studying hematopoiesis. Curr Opin Hematol 7, 79-84.

Peters, M.J., Parker, S.K., Grim, J., Allard, C.A.H., Levin, J., Detrich, H.W., 3rd, 2018. Divergent Hemogen genes of teleosts and mammals share conserved roles in erythropoiesis: Analysis using transgenic and mutant zebrafish. Biol Open.

Piskacek, M., Havelka, M., Rezacova, M., Knight, A., 2016. The 9aaTAD Transactivation Domains: From Gal4 to p53. PLoS One 11, e0162842.

Postlethwait, J.H., Woods, I.G., Ngo-Hazelett, P., Yan, Y.L., Kelly, P.D., Chu, F., Huang, H., Hill-Force, A., Talbot, W.S., 2000. Zebrafish comparative genomics and the origins of vertebrate chromosomes. Genome research 10, 1890-1902.

Raman, K., Damaraju, N., Joshi, G.K., 2014. The organisational structure of protein networks: revisiting the centrality-lethality hypothesis. Syst Synth Biol 8, 73-81.

Ransom, D.G., Haffter, P., Odenthal, J., Brownlie, A., Vogelsang, E., Kelsh, R.N., Brand, M., van Eeden, F.J., Furutani-Seiki, M., Granato, M., Hammerschmidt, M., Heisenberg, C.P., Jiang, Y.J., Kane, D.A., Mullins, M.C., Nusslein-Volhard, C., 1996. Characterization of zebrafish mutants with defects in embryonic hematopoiesis. Development 123, 311-319.

Reese, M.G., 2001. Application of a time-delay neural network to promoter annotation in the Drosophila melanogaster genome. Computers & chemistry 26, 51-56.

Reynolds, I.J., Hastings, T.G., 1995. Glutamate induces the production of reactive oxygen species in cultured forebrain neurons following NMDA receptor activation. The Journal of neuroscience : the official journal of the Society for Neuroscience 15, 3318-3327.

Rhodes, M.M., Kopsombut, P., Bondurant, M.C., Price, J.O., Koury, M.J., 2005. Bcl-x(L) prevents apoptosis of late-stage erythroblasts but does not mediate the antiapoptotic effect of erythropoietin. Blood 106, 1857-1863.

Rhodes, M.M., Kopsombut, P., Bondurant, M.C., Price, J.O., Koury, M.J., 2008. Adherence to macrophages in erythroblastic islands enhances erythroblast proliferation and increases erythrocyte production by a different mechanism than erythropoietin. Blood 111, 1700-1708.

Robinson, M.D., McCarthy, D.J., Smyth, G.K., 2010. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139-140.

Rodriguez-Mari, A., Yan, Y.L., Bremiller, R.A., Wilson, C., Canestro, C., Postlethwait, J.H., 2005. Characterization and expression pattern of zebrafish Anti-Mullerian hormone (Amh) relative to sox9a, sox9b, and cyp19a1a, during gonad development. Gene Expr Patterns 5, 655-667.

Rosado, C.J., Buckle, A.M., Law, R.H., Butcher, R.E., Kan, W.T., Bird, C.H., Ung, K., Browne, K.A., Baran, K., Bashtannyk-Puhalovich, T.A., Faux, N.G., Wong, W., Porter, C.J., Pike, R.N., Ellisdon, A.M., Pearce, M.C., Bottomley, S.P., Emsley, J., Smith, A.I., Rossjohn, J., Hartland, E.L., Voskoboinik, I., Trapani, J.A., Bird, P.I., Dunstone, M.A., Whisstock, J.C., 2007. A common fold mediates vertebrate defense and bacterial attack. Science 317, 1548-1551.

Rossi, A., Kontarakis, Z., Gerri, C., Nolte, H., Holper, S., Kruger, M., Stainier, D.Y., 2015. Genetic compensation induced by deleterious mutations but not gene knockdowns. Nature 524, 230-233.

Rutschmann, S., Matschiner, M., Damerau, M., Muschick, M., Lehmann, M.F., Hanel, R., Salzburger, W., 2011. Parallel ecological diversification in Antarctic notothenioid fishes as evidence for adaptive radiation. Mol Ecol 20, 4707-4721.

Sabin, F.R., 2002. Preliminary note on the differentiation of angioblasts and the method by which they produce blood-vessels, blood-plasma and red blood-cells as seen in the living chick. 1917. J Hematother Stem Cell Res 11, 5-7.

Schoenfelder, S., Sexton, T., Chakalova, L., Cope, N.F., Horton, A., Andrews, S., Kurukuti, S., Mitchell, J.A., Umlauf, D., Dimitrova, D.S., Eskiw, C.H., Luo, Y., Wei, C.L., Ruan, Y., Bieker, J.J., Fraser, P., 2010. Preferential associations between co-regulated genes reveal a transcriptional interactome in erythroid cells. Nat Genet 42, 53-61.

Schwerte, T., Uberbacher, D., Pelster, B., 2003. Non-invasive imaging of blood cell concentration and blood distribution in zebrafish Danio rerio incubated in hypoxic conditions in vivo. The Journal of experimental biology 206, 1299-1307.

Sertori, R., Trengove, M., Basheer, F., Ward, A.C., Liongue, C., 2016. Genome editing in zebrafish: a practical overview. Brief Funct Genomics 15, 322-330.

Sharma, P.P., Kaluziak, S.T., Perez-Porro, A.R., Gonzalez, V.L., Hormiga, G., Wheeler, W.C., Giribet, G., 2014. Phylogenomic interrogation of arachnida reveals systemic conflicts in phylogenetic signal. Molecular biology and evolution 31, 2963-2984.

Shin, S.C., Ahn, D.H., Kim, S.J., Pyo, C.W., Lee, H., Kim, M.K., Lee, J., Lee, J.E., Detrich, H.W., Postlethwait, J.H., Edwards, D., Lee, S.G., Lee, J.H., Park, H., 2014. The genome sequence of the Antarctic bullhead notothen reveals evolutionary adaptations to a cold environment. Genome Biol 15, 468.

Sidell, B.D., O'Brien, K.M., 2006. When bad things happen to good fish: the loss of hemoglobin and myoglobin expression in Antarctic icefishes. The Journal of experimental biology 209, 1791-1802.

Snow, J.W., Orkin, S.H., 2009. Translational isoforms of FOG1 regulate GATA1-interacting complexes. The Journal of biological chemistry 284, 29310-29319.

Soding, J., Biegert, A., Lupas, A.N., 2005. The HHpred interactive server for protein homology detection and structure prediction. Nucleic Acids Res 33, W244-248.

Soler, E., Andrieu-Soler, C., de Boer, E., Bryne, J.C., Thongjuea, S., Stadhouders, R., Palstra, R.J., Stevens, M., Kockx, C., van Ijcken, W., Hou, J., Steinhoff, C., Rijkers, E., Lenhard, B., Grosveld, F., 2010. The genome-wide dynamics of the binding of Ldb1 complexes during erythroid differentiation. Genes Dev 24, 277-289.

Song, J., Singh, M., 2013. From hub proteins to hub modules: the relationship between essentiality and centrality in the yeast interactome at different scales of organization. PLoS Comput Biol 9, e1002910.

Sood, R., English, M.A., Belele, C.L., Jin, H., Bishop, K., Haskins, R., McKinney, M.C., Chahal, J., Weinstein, B.M., Wen, Z., Liu, P.P., 2010. Development of multilineage adult hematopoiesis in the zebrafish with a runx1 truncation mutation. Blood 115, 2806-2809.

Soza-Ried, C., Hess, I., Netuschil, N., Schorpp, M., Boehm, T., 2010. Essential role of c-myb in definitive hematopoiesis is evolutionarily conserved. Proc Natl Acad Sci U S A 107, 17304-17308.

Spillman, J., Hureau, J.C., 1967. Observations sur les éléments figurés du sang incolore de Chaenichthys rhinoceratus Richardson, poisson téléostéen antarctique (Chaenichthyidae). Bulletin du Museum National d'Histoire Naturelle 38, 779-783.

Steck, T.L., 1974. The organization of proteins in the human red blood cell membrane. A review. J Cell Biol 62, 1-19.

Stein, S.J., Baldwin, A.S., 2013. Deletion of the NF-kappaB subunit p65/RelA in the hematopoietic compartment leads to defects in hematopoietic stem cell function. Blood 121, 5015-5024.

Suzuki, M., Moriguchi, T., Ohneda, K., Yamamoto, M., 2009. Differential contribution of the Gata1 gene hematopoietic enhancer to erythroid differentiation. Molecular and cellular biology 29, 1163-1175.

Suzuki, M., Shimizu, R., Yamamoto, M., 2011. Transcriptional regulation by GATA1 and GATA2 during erythropoiesis. Int J Hematol 93, 150-155.

Talbot, J.C., Amacher, S.L., 2014. A streamlined CRISPR pipeline to reliably generate zebrafish frameshifting alleles. Zebrafish 11, 583-585.

The UniProt Consortium, 2017. UniProt: the universal protein knowledgebase. Nucleic Acids Res 45, D158-D169.

Thompson, M.A., Ransom, D.G., Pratt, S.J., MacLennan, H., Kieran, M.W., Detrich, H.W., 3rd, Vail, B., Huber, T.L., Paw, B., Brownlie, A.J., Oates, A.C., Fritz, A., Gates, M.A., Amores, A., Bahary, N., Talbot, W.S., Her, H., Beier, D.R., Postlethwait, J.H., Zon, L.I., 1998. The cloche and spadetail genes differentially affect hematopoiesis and vasculogenesis. Dev Biol 197, 248-269.

Tijssen, M.R., Cvejic, A., Joshi, A., Hannah, R.L., Ferreira, R., Forrai, A., Bellissimo, D.C., Oram, S.H., Smethurst, P.A., Wilson, N.K., Wang, X., Ottersbach, K., Stemple, D.L., Green, A.R., Ouwehand, W.H., Gottgens, B., 2011. Genome-wide analysis of simultaneous GATA1/2, RUNX1, FLI1, and SCL binding in megakaryocytes identifies hematopoietic regulators. Dev Cell 20, 597-609.

Till, J.E., McCulloch, E.A., 1980. Hemopoietic stem cell differentiation. Biochim Biophys Acta 605, 431-459.

Traver, D., Paw, B.H., Poss, K.D., Penberthy, W.T., Lin, S., Zon, L.I., 2003. Transplantation and in vivo imaging of multilineage engraftment in zebrafish bloodless mutants. Nat Immunol 4, 1238-1246.

Trinchella, F., Parisi, E., Scudiero, R., 2008. Evolutionary analysis of the transferrin gene in Antarctic Notothenioidei: A history of adaptive evolution and functional divergence. Mar Genomics 1, 95-101.

Truett, G.E., Heeger, P., Mynatt, R.L., Truett, A.A., Walker, J.A., Warman, M.L., 2000. Preparation of PCR-quality mouse genomic DNA with hot sodium hydroxide and tris (HotSHOT). Biotechniques 29, 52, 54.

Ulyanova, T., Blasioli, J., Woodford-Thomas, T.A., Thomas, M.L., 1999. The sialoadhesin CD33 is a myeloid-specific inhibitory receptor. Eur J Immunol 29, 3440-3449.

UniProt, C., 2015. UniProt: a hub for protein information. Nucleic Acids Res 43, D204-212.

Van Etten, R.A., 2007. Oncogenic signaling: new insights and controversies from chronic myeloid leukemia. J Exp Med 204, 461-465.

Varki, A., Angata, T., 2006. Siglecs--the major subfamily of I-type lectins. Glycobiology 16, 1R-27R.

Vitale, C., Romagnani, C., Puccetti, A., Olive, D., Costello, R., Chiossone, L., Pitto, A., Bacigalupo, A., Moretta, L., Mingari, M.C., 2001. Surface expression and function of p75/AIRM-1 or CD33 in acute myeloid leukemias: engagement of CD33 induces apoptosis of leukemic cells. Proc Natl Acad Sci U S A 98, 5764-5769.

Vo, N., Goodman, R.H., 2001. CREB-binding protein and p300 in transcriptional regulation. The Journal of biological chemistry 276, 13505-13508.

Volkmann, K., Rieger, S., Babaryka, A., Koster, R.W., 2008. The zebrafish cerebellar rhombic lip is spatially patterned in producing granule cell populations of different functional compartments. Dev Biol 313, 167-180.

Wakabayashi, J., Yomogida, K., Nakajima, O., Yoh, K., Takahashi, S., Engel, J.D., Ohneda, K., Yamamoto, M., 2003. GATA-1 testis activation region is essential for Sertoli cell-specific expression of GATA-1 gene in transgenic mouse. Genes to cells : devoted to molecular & cellular mechanisms 8, 619-630.

Walter, R.B., Hausermann, P., Raden, B.W., Teckchandani, A.M., Kamikura, D.M., Bernstein, I.D., Cooper, J.A., 2008. Phosphorylated ITIMs enable ubiquitylation of an inhibitory cell surface receptor. Traffic 9, 267-279.

Wang, Y., Xiao, Z.J., Liu, P., Yang, C., Yang, R.C., Cai, Y.L., Han, Z.C., 2003. [Expression of vascular endothelial growth factor and its receptors KDR and Flt1 in acute myeloid leukemia]. Zhonghua Xue Ye Xue Za Zhi 24, 249-252.

Weinstein, B.M., Schier, A.F., Abdelilah, S., Malicki, J., Solnica-Krezel, L., Stemple, D.L., Stainier, D.Y., Zwartkruis, F., Driever, W., Fishman, M.C., 1996. Hematopoietic mutations in the zebrafish. Development 123, 303-309.

Wells, M., Tidow, H., Rutherford, T.J., Markwick, P., Jensen, M.R., Mylonas, E., Svergun, D.I., Blackledge, M., Fersht, A.R., 2008. Structure of tumor suppressor p53 and its intrinsically disordered N-terminal transactivation domain. Proc Natl Acad Sci U S A 105, 5762-5767.

Wells, R.M.G., Ashby, M.D., Duncan, S.J., Macdonald, J.A., 1980. Comparative study of the erythrocytes and hemoglobins in nototheniid fishes from Antarctica. J Fish Biol 17, 517-527.

West-Eberhard, M.J., 1989. Phenotypic plasticity and the origins of diversity. Annu. Rev. Ecol. Syst. 20, 249-278.

West-Eberhard, M.J., 2005. Developmental plasticity and the origin of species differences. P Natl Acad Sci USA 102, 6543-6549.

Westerfield, M., 2000. The zebrafish book. A guide for the laboratory use of zebrafish (Danio rerio), 4th ed. Univ. of Oregon Press, Eugene.

Willard, S.S., Koochekpour, S., 2013. Glutamate, glutamate receptors, and downstream signaling pathways. International journal of biological sciences 9, 948-959.

Wu, S., Zhang, Y., 2007. LOMETS: a local meta-threading-server for protein structure prediction. Nucleic Acids Res 35, 3375-3382.

Xu, J., Zhang, Y., 2010. How significant is a protein structure similarity with TM-score = 0.5? Bioinformatics 26, 889-895.

Xu, Q., Cai, C., Hu, X., Liu, Y., Guo, Y., Hu, P., Chen, Z., Peng, S., Zhang, D., Jiang, S., Wu, Z., Chan, J., Chen, L., 2015. Evolutionary suppression of erythropoiesis via the modulation of TGF-beta signalling in an Antarctic icefish. Mol Ecol 24, 4664-4678.

Yang, J., Yan, R., Roy, A., Xu, D., Poisson, J., Zhang, Y., 2015. The I-TASSER Suite: protein structure and function prediction. Nature methods 12, 7-8.

Yang, L.V., Heng, H.H., Wan, J., Southwood, C.M., Gow, A., Li, L., 2003. Alternative promoters and polyadenylation regulate tissue-specific expression of Hemogen isoforms during hematopoiesis and spermatogenesis. Developmental dynamics : an official publication of the American Association of Anatomists 228, 606-616.

Yang, L.V., Nicholson, R.H., Kaplan, J., Galy, A., Li, L., 2001. Hemogen is a novel nuclear factor specifically expressed in mouse hematopoietic development and its human homologue EDAG maps to chromosome 9q22, a region containing breakpoints of hematological neoplasms. Mech Dev 104, 105-111.

Yang, L.V., Wan, J., Ge, Y., Fu, Z., Kim, S.Y., Fujiwara, Y., Taub, J.W., Matherly, L.H., Eliason, J., Li, L., 2006. The GATA site-dependent hemogen promoter is transcriptionally regulated by GATA1 in hematopoietic and leukemia cells. Leukemia 20, 417-425.

Yang, S., Ott, C.J., Rossmann, M.P., Superdock, M., Zon, L.I., Zhou, Y., 2016. Chromatin immunoprecipitation and an open chromatin assay in zebrafish erythrocytes. Method Cell Biol 135, 387-412.

Yang, T., Jian, W., Luo, Y., Fu, X., Noguchi, C., Bungert, J., Huang, S., Qiu, Y., 2012. Acetylation of histone deacetylase 1 regulates NuRD corepressor complex activity. The Journal of biological chemistry 287, 40279-40291.

Yang, Z., 2007. PAML 4: phylogenetic analysis by maximum likelihood. Molecular biology and evolution 24, 1586-1591.

Yates, A., Akanni, W., Amode, M.R., Barrell, D., Billis, K., Carvalho-Silva, D., Cummins, C., Clapham, P., Fitzgerald, S., Gil, L., Giron, C.G., Gordon, L., Hourlier, T., Hunt, S.E., Janacek, S.H., Johnson, N., Juettemann, T., Keenan, S., Lavidas, I., Martin, F.J., Maurel, T., McLaren, W., Murphy, D.N., Nag, R., Nuhn, M., Parker, A., Patricio, M., Pignatelli, M., Rahtz, M., Riat, H.S., Sheppard, D., Taylor, K., Thormann, A., Vullo, A., Wilder, S.P., Zadissa, A., Birney, E., Harrow, J., Muffato, M., Perry, E., Ruffier, M., Spudich, G., Trevanion, S.J., Cunningham, F., Aken, B.L., Zerbino, D.R., Flicek, P., 2016. Ensembl 2016. Nucleic Acids Res 44, D710-716.

Yergeau, D.A., Cornell, C.N., Parker, S.K., Zhou, Y., Detrich, H.W., 2005. bloodthirsty, an RBCC/TRIM gene required for erythropoiesis in zebrafish. Dev. Biol 283, 97-112.

Yergeau, D.A., Wingert, R.A., Zon, L.I., Detrich, H.W., III, 2006. Hematopoietic tissues of the erythrocyte-null Antarctic icefishes contain proerythroblasts that fail to complete terminal differentiation. Manuscript in preparation.

Zeng, Y., Xu, J., Li, D., Li, L., Wen, Z., Qu, J.Y., 2012. Label-free in vivo flow cytometry in zebrafish using two-photon autofluorescence imaging. Optics letters 37, 2490-2492.

Zhang, M.J., Ding, Y.L., Xu, C.W., Yang, Y., Lian, W.X., Zhan, Y.Q., Li, W., Xu, W.X., Yu, M., Ge, C.H., Ning, H.M., Li, C.Y., Yang, X.M., 2012a. Erythroid differentiation-associated gene interacts with NPM1 (nucleophosmin/B23) and increases its protein stability, resisting cell apoptosis. The FEBS journal 279, 2848-2862.

Zhang, Y., Skolnick, J., 2005. TM-align: a protein structure alignment algorithm based on the TM-score. Nucleic Acids Res 33, 2302-2309.

Zhang, Z., Xiao, J., Wu, J., Zhang, H., Liu, G., Wang, X., Dai, L., 2012b. ParaAT: a parallel tool for constructing multiple protein-coding DNA alignments. Biochem Biophys Res Commun 419, 779-781.

Zhao, Y., Ratnayake-Lecamwasam, M., Parker, S.K., Cocca, E., Camardella, L., di Prisco, G., Detrich, H.W., 1998a. The major adult α-globin gene of Antarctic teleosts and its remnants in the hemoglobinless icefishes: calibration of the mutational clock for nuclear genes. J. Biol. Chem 273, 14745-14752.

Zhao, Y., Ratnayake-Lecamwasam, M., Parker, S.K., Cocca, E., Camardella, L., di Prisco, G., Detrich, H.W., 3rd, 1998b. The major adult alpha-globin gene of antarctic teleosts and its remnants in the hemoglobinless icefishes. Calibration of the mutational clock for nuclear genes. The Journal of biological chemistry 273, 14745-14752.

Zheng, W.W., Dong, X.M., Yin, R.H., Xu, F.F., Ning, H.M., Zhang, M.J., Xu, C.W., Yang, Y., Ding, Y.L., Wang, Z.D., Zhao, W.B., Tang, L.J., Chen, H., Wang, X.H., Zhan, Y.Q., Yu, M., Ge, C.H., Li, C.Y., Yang, X.M., 2014. EDAG positively regulates erythroid differentiation and modifies GATA1 acetylation through recruiting p300. Stem Cells 32, 2278-2289.

Zon, L.I., 1995. Developmental biology of hematopoiesis. Blood 86, 2876-2891.