Coronavirus Evolution and Immune Evasion

by

Aidan Tomlinson

A thesis submitted in conformity with the requirements for the degree of Master of Science

Department of Biochemistry University of Toronto

© Copyright Aidan Tomlinson 2018

Coronavirus Evolution and Immune Evasion

Aidan Tomlinson

Master of Science

Department of Biochemistry University of Toronto

2018

Abstract

Coronaviruses are emerging pathogens that threaten human health and prosperity. Each

year, hundreds of millions of people are infected with continually circulating coronaviruses that

cause the common cold in healthy individuals and kill the most vulnerable of us. Coronaviruses

adapt to environmental change at a remarkable rate. The HCoV-229E coronavirus has adapted

and evolved over the last 50 years by mutating residues in the receptor-binding loops of the

receptor-binding domain (RBD) of its spike (S) protein. The RBD sequences phylogenetically

segregate into Classes whose viruses have successively replaced one another in the human

population. These Classes possess different receptor (human aminopeptidase N, hAPN) and antibody binding

characteristics, and the crystal structures of RBD-hAPN complexes have been solved. Structural

insights into the ever-changing RBD and its interaction with hAPN show that the use of loops

lacking secondary structure facilitates tremendous structural variability, a trait that likely enables changes

in viral fitness and immune evasion.

Acknowledgments

Because this opus bears but my name, I must reveal and identify the many other

fingerprints scattered throughout its pages in the hopes that the tangible and intangible aid I’ve

received over the last few years is not forgotten.

First and foremost, this work would not have been completed without the rarely

complimentary, but always complementary, guidance of Dr. James Rini. A great teacher and a

better learner, he raises an excellent standard that inspires those around him. I would also like to

thank my committee members Dr. Jean-Philippe Julien and Dr. Scott Gray-Owen for their

tremendous help with the formation of this thesis and for helping nurture my scientific

communication skills.

A very special thanks must be given to Dr. Alan Wong who welcomed me into this

project with open arms. He has been a mentor and a role model. His prior work laid a solid and

level foundation which was ripe to build upon.

Dongxia Zhou and Malathy Satkunarajah deserve all the thanks in the world for their

indispensable and expert technical help. Dr. Zhijie Li, the handiest and most knowledgeable man

in the world, must be thanked for illuminating conversations and his ubiquitous 3D-printed

creations. I must also thank Kristina Han for moral support in the lab and on the softball field

and Nathan Doner for late night commiseration and debate.

Last but far from least, I need to thank my parents and sister along with the rest of my

family and friends for making my studies and work all worthwhile and for keeping me sharp

even when I’m away from the lab.

Table of Contents

Acknowledgments
Table of Contents
List of Tables
List of Figures

Chapter 1. Introduction
    1.1 Diseases Caused by Coronaviruses
    1.2 Life Cycle of Coronaviruses
    1.3 Fusion Proteins
    1.4 Anatomy and Function of the Spike Protein
    1.5 Evolution of Coronavirus Diversity
    1.6 Binding of Receptor by Coronaviral RBDs
    1.7 The Engine of Genetic Diversity
    1.8 Environmental Pressures Promote Coronaviral Adaptation
    1.9 HCoV-229E as a Model for Coronavirus Adaptation and Evolution
    1.10 Rationale

Chapter 2. Results
    2.1 Biophysical Characterization of HCoV-229E Spike Protein RBD Interaction with Its Receptor and Neutralizing Antibodies
        2.1.1 HCoV-229E RBDs Cluster into Six Phylogenetic Classes.
        2.1.2 Variation in Receptor-Binding Loops Changes Receptor-Binding Affinity and Binding Kinetics.
        2.1.3 Structure-Function Analysis of the Class 1 RBD interaction with hAPN.
        2.1.4 The Six Classes of RBD Share a Conserved Binding site on hAPN.
        2.1.5 HCoV-229E Classes Differ in Their Ability to be Bound by a Neutralizing Antibody.
    2.2 Structural Biology of the Evolving RBD-Receptor Interaction
        2.2.1 Crystallization of hAPN Requires Deglycosylation
        2.2.2 HCoV-229E S Protein RBD Classes 3, 4, and 5 Crystal Complexes with Their Receptor hAPN.
        2.2.3 HCoV-229E Class 1 RBD-hAPN Complex Provides a Foundation for Comparison.
        2.2.4 HCoV-229E Class 3 RBD-hAPN Crystal Complex Shows a Markedly Different Interaction When Compared to the Class 1 RBD.
        2.2.5 The HCoV-229E Class 4 RBD-hAPN Interaction Maintains Many Features of the Class 3 Interaction.
        2.2.6 HCoV-229E Class 5 RBD-hAPN Interaction Shows a Moderately Changed Loop 1.

Chapter 3. Discussion
    3.1 A Ladder-Like Phylogeny and Immune Evasion
    3.2 The HCoV-229E RBD Affinity for hAPN has Changed Over Time.
    3.3 Crystal Structures and Mutagenesis Shed Light on the RBD-hAPN Interaction.
    3.4 hAPN Mutants Lead to Reduction in RBD Binding Affinity
    3.5 The Use of Loops as Receptor-Binding Motifs Enables HCoV-229E Adaptation and Evolution.
    3.6 Bats are Unique and Potent Agents of Viral Spread

Chapter 4. Future Directions
    4.1 Short Term Goals
        4.1.1 Immediate Experiments
        4.1.2 Recent Surveillance Has Revealed Additional Human and Animal HCoV-229E RBD Classes
    4.2 Long Term Goals
        4.2.1 Mutations in the Receptor-Binding Loops May Impact Spike Protein Conformational Dynamics
        4.2.2 A Cell-Based Assay to Test Viral Fitness is Within Reach

Chapter 5. Methods
    5.1 Sequence Comparison of HCoV-229E S-protein RBD
    5.2 Protein Expression and Purification
    5.3 Surface Plasmon Resonance Assays
    5.4 Deglycosylation of hAPN
    5.5 Protein Crystallization
    5.6 Data Collection and Structure Determination

Bibliography

List of Tables

Table 1. Coronavirus’s diverse host range and receptor usage.

Table 2. Surface plasmon resonance binding kinetics for the RBD-hAPN interaction for each of the six Classes.

Table 3. RBD mutants, their contribution to buried surface area, and their effect on the RBD-hAPN interaction.

Table 4. hAPN mutants, their contribution to buried surface area, and their effect on the RBD-hAPN interaction.

Table 5. Data collection and refinement statistics.

Table 6. Buried surface area analysis of RBD Classes 1, 3, 4, and 5 in complex with hAPN.

Table 7. Hydrogen bond and salt bridge analysis of four crystal complexes.

List of Figures

Figure 1. The life cycle of the coronavirus.

Figure 2. Fusion proteins facilitate fusion between viral envelopes and cellular membranes.

Figure 3. The cryo-EM structure of the MHV spike protein.

Figure 4. A dynamic RBD allows receptor binding in the standing conformation.

Figure 5. Phylogenetic analysis of coronaviruses reveals four genera.

Figure 6. The spike protein RBDs of HCoV-229E and HCoV-NL63 show structural similarity.

Figure 7. Phylogenetic analysis of HCoV-229E S-protein RBDs reveals six distinct classes.

Figure 8. Variation in HCoV-229E S-protein RBD is localized in the three receptor binding loops.

Figure 9. HCoV-229E S protein RBD variation.

Figure 10. Global fitting of surface plasmon resonance data for the Class 1-6 RBD-hAPN interaction.

Figure 11. Surface plasmon resonance data and global fitting for RBD mutants and their interaction with hAPN.

Figure 12. RBD Mutants show reduced or abrogated binding to hAPN.

Figure 13. Surface plasmon resonance data for mutant hAPN and RBD interaction.

Figure 14. N-linked glycan at H-site produces a steric clash with a docked RBD.

Figure 15. Introduction of an N-linked glycan to the H-site on hAPN prohibits binding of all six Classes of RBD.

Figure 16. The 9.8.E12 antibody binds the HCoV-229E Class 1 RBD.

Figure 17. The 9.8.E12 antibody competes with hAPN for RBD binding.

Figure 18. The 9.8.E12 antibody binds only the Class 1 RBD. Surface plasmon resonance data for the class 1-6 RBD interaction with the 9.8.E12 antibody.

Figure 19. The 9.8.E12 antibody binds both loop 1 mutants.

Figure 20. hAPN harbors 10 N-linked glycans.

Figure 21. Deglycosylation of hAPN by EndoH yields a heterogenous product.

Figure 22. Deglycosylation of hAPN by EndoA yields a homogenous product.

Figure 23. Superimposition of Class 1, 3, 4, and 5 RBD.

Figure 24. Interface details of HCoV-229E S protein Class 1, 3, 4, and 5 in complex with hAPN.

Figure 25. R316 of the Class 3 RBD fills the volume vacated by the kinked backbone.

Figure 26. N406 of the Class 3 RBD provides supporting hydrogen bonds to N319.

Figure 27. A two residue deletion in loop 2 of the Class 3 RBD helps position R357 more favorably.

Figure 28. L402 and L405 replace W404 in more recently isolated RBD sequences.

Figure 29. Mutations not located at the site of hAPN interaction cluster at the top and sides of the RBD.

Figure 30. Phylogenetic analysis of animal HCoV-229E-related RBDs reveal discrete classes.

Figure 31. Examination of the H-site of bat and camel APN shows moderate diversity.

Figure 32. Camel and bat RBDs express well in mammalian tissue culture.

Figure 32. Camel and bat RBDs express well in mammalian tissue culture. ............................... 59

Chapter 1

Introduction

1.1 Diseases Caused by Coronaviruses

Research into coronaviruses is of great importance because of the diseases they cause in

humans, domesticated mammals, and other animals. The coronaviruses that circulate in humans

cause mild respiratory infections and are responsible for 10-30% of common cold cases across

the globe (Zumla et al. 2016). These ailments are normally cleared by individuals without

complication, but may worsen and even lead to death in the very young, the very old, and the

immunocompromised (Desforge et al. 2014). With even conservative estimates placing the

number of annual coronaviral infections in the hundreds of millions, coronaviruses have a

tremendous impact on human health and are responsible for the loss of billions of person-hours

in the economy (Fendrick et al. 2003). The four known circulating human coronaviruses are

HCoV-229E, HCoV-NL63, HCoV-OC43, and HCoV-HKU1. Each virus is able to infect an

individual multiple times throughout their life and it appears that immunological protection

provided after a first infection is not permanent, perhaps one reason why the common cold is so

common (Gaunt et al. 2010).

Possibly of greater concern than the common-cold-causing coronaviruses is the looming

threat of cross-species transmission by viruses that are currently circulating in animals. The

SARS-CoV and MERS-CoV epidemics are two examples that have occurred within the last 15

years. They possess mortality rates of 10% and 30%, respectively, and both have jumped species

from bats to humans (Li et al. 2006, Chan et al. 2015). While neither virus is capable of efficient

human-to-human transmission and only several thousand cases were logged (Richard et al.

2017), their mortality rates are of great concern.

1.2 Life Cycle of Coronaviruses

Coronaviruses are enveloped viruses that possess a continuous positive-sense RNA genome of

around 30 kilobases, the largest of any RNA virus (Masters 1999). In order to replicate, viruses

must enter a target cell, produce their proteins, replicate their genome and package it, and bud

from the cell to begin the cycle anew.

In order to enter a target cell, a coronavirus must first bind a receptor on the outside of

the cell and fuse the viral and host cell membranes. Coronaviruses accomplish both of these

crucial steps via the S protein (Lin et al. 2011). Among coronaviruses, the S protein is able to

recognize either proteinaceous or carbohydrate receptors in a process called receptor

engagement. This engagement may be aided by attachment factors such as host lectins that

interact with high mannose glycans on the S protein (Hofmann et al. 2006). In a process that is

not well understood, receptor binding results in S-protein conformational changes that lead to

insertion of the S-protein fusion peptide into the host cell membrane (Walls et al. 2016a).

Changes in pH and/or processing by host proteases may also be involved in triggering these

conformational changes (Belouzard et al. 2012). Further conformational changes in the S-protein

bring the viral and cellular membranes together in a process that leads to fusion. Depending on

the coronavirus, the virus can fuse with the plasma membrane at the cell surface or with an

endosomal membrane after endocytosis (Belouzard et al. 2012).

Once inside the cytoplasm, the coronavirus’s positive sense RNA genome serves as an

mRNA for the translation of the viral replicase polyprotein using the host's ribosomes. The

polyprotein contains an RNA-dependent RNA polymerase that then generates negative sense

copies of the viral genome. The polymerase uses these negative sense copies to produce positive

sense copies that get packaged into new viral particles. The polymerase also uses the negative

sense copies of the viral genome to produce positive sense subgenomic RNAs. These

subgenomic RNAs serve as mRNAs that are used by host ribosomes to translate the viral

structural proteins S, envelope (E), membrane (M), nucleocapsid (N), and for some

coronaviruses the hemagglutinin-esterase (HE), that are required for the production of new viral

progeny.
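The strand relationships described above can be illustrated with a minimal Python sketch. It assumes only standard Watson-Crick base pairing; the sequence is a hypothetical toy fragment, not real coronavirus sequence, and the function name `complementary_strand` is introduced here purely for illustration. Copying the positive-sense strand yields the negative-sense strand, and copying that strand again regenerates the positive-sense sequence.

```python
# Toy illustration only: the sequence is hypothetical, not a real viral genome.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def complementary_strand(rna: str) -> str:
    """Return the strand templated by `rna`, written 5'->3' (reverse complement)."""
    return rna.translate(COMPLEMENT)[::-1]


positive_sense = "AUGGCUUACGGA"  # hypothetical positive-sense fragment (5'->3')
negative_sense = complementary_strand(positive_sense)  # template made by the RdRp
progeny = complementary_strand(negative_sense)  # positive-sense copy for packaging

print(negative_sense)
print(progeny == positive_sense)
```

The sketch ignores everything beyond base pairing (for example, discontinuous synthesis of subgenomic RNAs); it only shows why a negative-sense intermediate is sufficient to regenerate positive-sense genomes.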

The S, E, and M proteins are targeted to the endoplasmic reticulum at the start of

translation and are eventually trafficked to the ER-Golgi intermediate compartment (ERGIC).

These proteins encounter the viral genome encapsidated by the N protein, both of which are on

the cytoplasmic side of the ERGIC membrane, and together bud into the lumen of the

ERGIC (Haan et al. 2005). In this process, a piece of the ERGIC membrane becomes the

viral membrane. The M and E proteins work in concert to create these virions, and expression of

just these proteins is sufficient to create virus-like particles in vitro (Bos et al. 1996). The

enveloped virions continue through the secretory pathway until they are secreted from the cell by

the normal process of exocytosis (Figure 1). While transiting through the secretory pathway, the

viral membrane proteins of these nascent virions can be further processed by Golgi-resident

glycosyltransferases and proteases.

1.3 Fusion Proteins

For all enveloped viruses, fusion of the host and viral membranes is required for

infection. This fusion is thermodynamically favored, but there is a large energy barrier that must

first be overcome (Chernomordik et al. 2003). Enveloped viruses employ fusion proteins to

lower the energy barrier in an ATP-independent manner (Colman et al. 2003). There are three classes

of fusion proteins termed class I, II, and III. These three classes of proteins show great structural

diversity but they have all converged on a common mechanism of action (Figure 2). In short, a

trigger leads to conformational changes in the fusion protein that expose a short, hydrophobic

region called the fusion peptide. The fusion peptide is inserted into the host cell membrane and

subsequent large-scale conformational changes in the fusion protein bring the two membranes to

be fused into close apposition (Harrison 2008). The two membranes, now in close contact, can

mix outer leaflets in an intermediate event termed hemifusion. Hemifusion is followed by the

creation of a small pore that enlarges as the two membranes fuse (White et al. 2008). At least

three different fusion protein triggering mechanisms have been identified: i) receptor binding, ii)

low pH, and iii) proteolytic cleavage, and others likely exist (Kielian et al. 2014). The influenza

hemagglutinin protein is triggered by the low pH of cellular endosomes (Earp et al. 2005). The

herpes virus gB protein fuses after binding a cellular receptor, and may do so at a neutral,

extracellular pH (Milne et al. 2005). Among coronaviruses, proteolytic processing and low pH

both serve as fusion triggers (Belouzard et al. 2012).

Figure 1. The life cycle of the coronavirus. The coronavirus docks with a cell by recognizing a receptor (1). Fusion of membranes allows the viral genome to be deposited and cytoplasmic ribosomes translate the genome (2). The first protein product is the RNA-dependent RNA polymerase (RdRp), which transcribes the genomic RNA into a negative sense strand (3). This negative sense strand is transcribed by the RdRp into subgenomic RNAs which encode the structural proteins S, E, M, and N (4). Full-length genomes are also transcribed from the negative sense strand. The S, E, and M structural protein transcripts are translated on membrane-bound ribosomes and targeted to the endoplasmic reticulum-Golgi intermediate compartment where they join with N protein covered full-length genomes to form virions by budding into the lumen of the ERGIC (5 and 6). Secretory vesicles containing the nascent virions (6) fuse with the cell membrane leading to virus secretion (7).

1.4 Anatomy and Function of the Spike Protein

The coronavirus S protein is a class I fusion protein. It forms spike-like protrusions that

decorate the outside of the viral membrane, forming the crown-like shape for which the

coronavirus is named. The S protein is a homotrimer of about 400 kD and consists of three main

elements: a short intracellular tail, a single-pass transmembrane domain, and an ectodomain that

accounts for most of its mass (Li 2016). The S protein is highly glycosylated: it contains many

N-linked glycosylation sequons, and as many as 31 N-linked glycans per protomer have been

observed via cryo-electron microscopy (cryo-EM). Extensive glycosylation is thought to shield

the S protein from the host immune response, as it does for the gp120/gp41 envelope protein of HIV (Walls et

al. 2016b). Indeed, the S protein is a major target of the host immune response and it elicits

neutralizing antibodies (Du et al. 2016). Recent cryo-EM studies have

greatly advanced our understanding of the structure of the S protein and how it mediates

membrane fusion. The S protein ectodomain is composed of two regions, S1 and S2 (Figure 3).

The S1 region is N-terminal and distal to the viral membrane, while the C-terminal

S2 region is membrane proximal. The S1 and S2 regions contribute almost equally

to the surface area buried on trimer formation (Kirchdoerfer et al. 2016). The S1 region is

composed of up to five domains: D0 (so named because it is not present in all S proteins), D1,

D2, D3, and D4. D0 and D1 both resemble galectins and adopt a canonical beta-sandwich fold

and it has been proposed that they have been acquired from their hosts by gene transfer (Li

2016). Among coronaviruses, both the D1 and D2 domains have been found to bind receptor

(protein or carbohydrate) as discussed in more detail below. The D3 and D4 domains link the S1

and S2 regions. Key elements present in the S2 region are the fusion peptide, used to catalyze

membrane fusion, the central helix, important for trimerization, and the heptad repeat regions 1

and 2 (HR1, HR2), amphipathic sequences key in the α-helical rearrangements that promote

fusion. As noted above, S proteins have the dual responsibilities of recognizing a receptor on the

host cell and mediating fusion of the viral and host membranes. In order to mediate these two

processes, labor is divided between the S1 and S2 regions. The S1 region is responsible for

receptor recognition, and the S2 region is responsible for membrane fusion.

Figure 2. Fusion proteins facilitate fusion between viral envelopes and cellular membranes. An enveloped virus diffuses into the area of a target cell (1). The fusion (F) protein, represented as a trimer, recognizes and binds to its cellular receptor in a process called receptor engagement (2). A trigger causes conformational changes that lead to the insertion of a fusion peptide into the target cell’s membrane (3). Further conformational changes lead to formation of a six-helix bundle, which juxtaposes the two membranes and leads to fusion, and the beginning of infection (4). A class I fusion protein is used as an example, but classes II and III perform their function with similar large-scale rearrangements that result in a related “hairpin” protein structure and membrane fusion.

The S1 region, tasked with recognizing a cellular receptor, must contain a receptor-binding

domain (RBD). Coronaviruses are unusual in that two separate domains may act as the

RBD: some viruses utilize D1 while others utilize D2. The D1 domain recognizes sugars in the

coronaviruses TGEV/PEDV, BCoV/OC43, and IBV. The lab strain MHV uses D1 to recognize a

proteinaceous receptor, CEACAM1. The D2 domain acts as the RBD in SARS-CoV, MERS-

CoV, HCoV-NL63, and HCoV-229E to name a few. In all known cases, D2 binds protein

receptors.

Figure 3. The cryo-EM structure of the MHV spike protein. A) The cryo-EM structure of the spike protein trimer of MHV. B) Protomer of spike protein colored by region. The N-terminal S1 region is in red and the C-terminal S2 region is in beige.

Figure 4. A dynamic RBD allows receptor binding in the standing conformation. A) The cryo-EM structure of a MERS-CoV spike protein monomer. The lying RBD can pivot into the standing form seen. B) Pivoting allows binding of the MERS-CoV receptor, DPP4, because it prevents a steric clash. MERS-CoV S protein trimer is in red, DPP4 in blue, and possible steric clash in purple.

The RBD is dynamic and exists in at least two prefusion conformations as determined by

cryo-EM studies (Yuan et al. 2017). The RBD either lies flat, pointing into the trimer interface

(“lying” state), or pivots ~90° upwards, orienting itself parallel to the trimer axis (“standing” state; Figure 4). In

the lying state, the surface used by the RBD to bind the host receptor (sometimes called the

receptor-binding motif (RBM)) is buried. Indeed, superimposing crystal structures of

coronavirus RBDs in complex with their receptors onto the cryo-EM structures of the

S protein trimer has shown that receptor binding can only occur in the "standing" conformation

(Figure 4). Although an RBD in the lying conformation is not able to bind receptor, it has been

suggested that this structural arrangement might serve to shield the RBD from a neutralizing

antibody response (Walls et al. 2016b). After binding to its receptor, the RBD is caught in the

standing state and this has been hypothesized to promote the cascade of conformational changes

required for membrane fusion (Gui et al. 2016, Yuan et al. 2017). RBD dynamics may thus play

a role in immune evasion and the triggering of membrane fusion, two important features of

infection and virulence.

The S protein’s ultimate function is that of a fusion protein and the machinery necessary

to fuse membranes is contained in the S2 region. The coronavirus S protein must be processed by

host proteases in order to become activated. It may harbor two or more protease cleavage sites, one

at the S1/S2 boundary and another in the S2 region, upstream from the all-important fusion

peptide (termed the S2’ cleavage site). Cleavage at the first site occurs in the secretory pathway

during the biosynthesis of viral progeny in an infected cell, while cleavage at the second site

occurs during entry into a new host cell prior to membrane fusion. In some coronaviruses like

MERS-CoV, both receptor binding and cleavage at the S1/S2 site are required for cleavage at the

S2’ site. S2’ cleavage occurs for all coronaviruses and is required for fusion activity. S proteins

that do not contain an amino acid sequence recognized by host proteases will not be able to

infect that host and, as such, protease specificity is a determinant of host range (Yang et al.

2015). Upon triggering, the HR1 motif of the S2 region rearranges to form a 3-helix bundle that

extends the central 3-helix bundle found in the prefusion trimer (one helix from each monomer).

Further rearrangements bring this nascent 3-helix bundle into contact with the HR2 3-helix

bundle that exists in the prefusion trimer prior to rearrangement (one helix from each monomer).

The net result is the formation of a 6-helix bundle from two 3-helix bundles, a process that drives

membrane fusion. Indeed, the formation of a 6-helix bundle in this way is characteristic of class I

viral membrane fusion proteins, and is similar to cellular proteins involved in the secretory and

endocytic pathways (e.g. SNAREs) (Martens et al. 2008, White et al. 2008).

1.5 Evolution of Coronavirus Diversity

Coronaviruses, and in fact all RNA viruses, are recognized for the remarkable speed with

which they evolve. Mutations are rapidly produced and fixed in a population and evolutionary

processes that take place over millions of years in eukaryotes take place over decades in RNA

viruses (Holmes 2011). The last common ancestor of coronaviruses is believed to have existed

about 10,000 years ago (Woo et al. 2012) and diversification since then has been massive.

Indeed, many viral species that infect a wide range of hosts via many different receptors now

exist (Table 1). Comparison of both gene sequence information and viral protein structures allows

us to elucidate evolutionary relationships that may aid in our understanding of the processes of

zoonosis, adaptation in a new host, and immune evasion.

To study the evolutionary relationships between the extant coronaviruses, phylogenetic

analysis of certain gene and protein sequences can be performed. The polymerase gene that

encodes the RNA-dependent RNA polymerase (RdRp) is essential for coronaviruses and it is the

most conserved region of their entire genome (Snijder et al. 2003). Based on the RdRp,

phylogenetic analysis has separated coronaviruses into four genera: alpha, beta, gamma, and

delta (Figure 5). Alpha and beta coronaviruses infect mammals while gamma and delta

coronaviruses infect mostly avian hosts. Viral surveillance studies of bats, alpacas, and camels

have revealed that viruses that are very closely related to HCoV-229E circulate in these

organisms (Crossley et al. 2012, Corman et al. 2015, Corman et al. 2016, Lin et al. 2017). In

vitro studies have shown that these camel viruses are able to infect human cells, and while this

does not guarantee that direct transfer between species is possible, it shows that some barriers to

zoonotic transmission are low (Corman et al. 2016).

1.6 Binding of Receptor by Coronaviral RBDs

Among coronaviruses, both the D1 and D2 domains have been found to mediate receptor

binding. Moreover, both proteins and carbohydrates can serve as receptors. The crystal structures

of the RBDs (i.e. D1 or D2) of the coronaviruses HCoV-229E, HCoV-NL63, SARS-CoV,

MERS-CoV, MHV, and PRCoV have all been solved in complex with their receptors, making

structural analysis possible. Here we will look at the tremendous variation seen between the four

coronaviruses that use D2 as the RBD.

HCoV-NL63 and SARS-CoV are alpha and beta coronaviruses, respectively, and they

share only 10% amino acid sequence identity in the S1 region of their S proteins (Li 2011). This

lack of primary sequence identity is common when comparing S1 regions across different

coronavirus genera. The NL63 and SARS RBDs both recognize the same receptor, angiotensin

converting enzyme 2 (ACE2), and bind to it at overlapping sites (Li 2011). The

HCoV-NL63 RBD uses three short loops, L1-L3, of 11, 11, and three amino acids in length,

respectively, to bind ACE2. These loops are supported by an 8-stranded β-structural domain. The

RBM of SARS-CoV is one continuous segment of 70 amino acids and contains two β-strands.

Figure 5. Phylogenetic analysis of coronaviruses reveals four genera. An unrooted phylogenetic tree of coronaviruses based on the gene sequence of the RNA-dependent RNA polymerase. The alpha, beta, gamma, and delta genera are shown in red, blue, green, and lavender respectively. Numbers indicate bootstrap value (N=1000). Adapted from de Groot et al. (2013).

Table 1. Coronavirus’s diverse host range and receptor usage.

Coronavirus                     Host     Receptor

Alphacoronavirus
HCoV-229E                       Human    Aminopeptidase N (APN)
HCoV-NL63                       Human    Angiotensin converting enzyme 2 (ACE2)
PRCoV                           Pig      Aminopeptidase N (APN)
229E-like                       Bat      Unknown
229E-like                       Camel    Aminopeptidase N (APN)
229E-like                       Alpaca   Unknown
NL63-like                       Bat      Unknown

Betacoronavirus
HCoV-OC43                       Human    9-O-Acetyl-N-acetylneuraminic acid (Neu5,9Ac2)
HCoV-HKU1                       Human    Unknown
BCoV                            Cow      9-O-Acetyl-N-acetylneuraminic acid (Neu5,9Ac2)
MHV                             Mouse    Carcinoembryonic antigen-related cell adhesion molecule 1 (CEACAM-1)
SARS-CoV                        Human    Angiotensin converting enzyme 2 (ACE2)
SARS-related CoV                Bat      Angiotensin converting enzyme 2 (ACE2)
MERS-CoV                        Human    Dipeptidyl peptidase-4 (DPP4)
MERS-related CoV                Camel    Dipeptidyl peptidase-4 (DPP4)
BatCoV-HKU4                     Bat      Dipeptidyl peptidase-4 (DPP4)

Gammacoronavirus
IBV                             Bird     α2,3-linked Neu5Ac
Beluga Whale Coronavirus SW1    Whale    Unknown

Deltacoronavirus
Bulbul Coronavirus HKU11        Bird     Unknown
Porcine Coronavirus HKU15       Pig      Unknown

HCoV-NL63’s closest relative in the alpha coronavirus genus is HCoV-229E and their last

common ancestor is believed to have existed more than 1000 years ago (Woo et al. 2012). Since

this time, both viruses have diverged and bind different human receptors. HCoV-229E binds

human aminopeptidase N (APN) and as stated above, HCoV-NL63 binds ACE2. Both RBDs

have a similar core fold composed of β-strands and both utilize three extended loops to bind their

receptors (Wong et al. 2017, Wu et al. 2009) (Figure 6). The RBDs share only 45% amino acid

sequence identity and their receptor binding loops vary in length, sequence and the extent to

which they mediate the interactions with their respective receptors (Wong et al. 2017).

Figure 6. The spike protein RBDs of HCoV-229E and HCoV-NL63 show structural similarity. The NL63 (left) and 229E (right) RBDs are displayed and their receptor binding loops are labeled 1-3. The structural similarity is striking and there is only a root-mean-square deviation (RMSD) of 1.2 Å between the two RBDs.

The alpha coronavirus, porcine respiratory coronavirus (PRCoV), also uses APN as its

receptor but, as the name would indicate, in a porcine host. This virus’s RBD also has high

structural similarity to the RBDs of HCoV-NL63 and HCoV-229E and it also binds its receptor

through the use of three extended loops (Reguera et al. 2012). Interestingly, despite being close

relatives and utilizing the same receptor, the RBDs of HCoV-229E and PRCoV bind to different

sites on the APN molecule, sites termed the P-site and H-site for pig and human, respectively

(Wong et al. 2017). HCoV-229E is unable to bind pAPN at the H-site because of the presence of

an N-linked glycan on pAPN at residue Asn286. PRCoV is unable to bind hAPN at the P-site

because of steric clashes between hAPN Arg741 and Ser302, Pro307, and Ile308 of the PRCoV S

protein RBD. The H- and P-sites share only about 60% identity between the APNs of mouse, pig,

and human. Strikingly, once APN is mutated to remove residues whose side chains lead to a

clash, alpha coronaviruses from different species are able to bind, indicating that the receptor-

binding loops are able to accommodate the remaining structural differences (Wong et al. 2017).

Structural variation in the receptor binding loops of these coronavirus S proteins has led to the

receptor usage and host specificity observed.

1.7 The Engine of Genetic Diversity

All of the protein variation that enables the receptor binding diversity seen in

coronaviruses is the result of genetic diversity. RNA viruses replicate with mutation rates

between 10⁻⁶ and 10⁻⁴ substitutions per nucleotide, rates orders of magnitude higher than those of DNA

polymerases (Sanjuán 2016). Due to the optimized nature of the “wild type” virus, most

mutations are deleterious or lethal (Vishner 2016). However, mutations that result in increased

viral fitness are amplified by means of natural selection (Smith 2017, Darwin 1859). The engine

of genetic diversity that enables such selection is the RNA-dependent RNA polymerase (RdRp).

This protein is tasked with producing copies of the RNA genome during viral replication.

Coronaviral RdRps lack intrinsic proofreading ability, a characteristic known to lower replication

fidelity (Steinhauer et al. 1992). It is this low-fidelity that leads directly to diverse populations. A

simple illustration of how much diversity is created by the RdRp is shown as follows: the RdRp

error rate is 1 × 10⁻⁵ mutations per nucleotide, coronavirus genomes are 3 × 10⁴ nucleotides long,

and viral loads are approximately 1 × 10⁷ viruses per mL of tissue. This leads to 3 × 10⁶ mutations per

mL of tissue, enough diversity to cover the genome about 100 times over in a small sample of

just one infected individual. This thought experiment has recently been supplemented with actual

experimental evidence that vast coronavirus genome variation exists within a single organism

(Briese et al. 2014). Some have argued that RNA viruses have evolved an optimal mutation rate

and that higher or lower rates decrease viral fitness. With higher mutation rates, an “error

threshold” is reached where viable viral genomes can no longer be produced (Holmes 2003).

With lower mutation rates, viral populations do not harbor enough diversity to overcome

environmental changes (Coffey et al. 2011, Smith 2014).
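The thought experiment in this section is simple multiplication, and it can be reproduced in a few lines. The sketch below uses only the three round numbers quoted in the text (error rate, genome length, viral load); the variable names are illustrative and the output is an order-of-magnitude estimate, not a measurement.

```python
# Back-of-the-envelope estimate of the diversity generated by the RdRp.
# All three inputs are the round numbers quoted in the text.
error_rate = 1e-5       # mutations per nucleotide
genome_length = 3e4     # nucleotides per coronavirus genome
viral_load = 1e7        # viruses per mL of tissue

# Expected number of mutations carried by one newly copied genome.
mutations_per_genome = error_rate * genome_length

# Expected number of mutations summed over every genome in 1 mL of tissue.
mutations_per_ml = mutations_per_genome * viral_load

# Average number of times each genome position is hit in that sample.
genome_coverage = mutations_per_ml / genome_length

print(f"mutations per genome copy:     {mutations_per_genome:.2f}")
print(f"mutations per mL of tissue:    {mutations_per_ml:.1e}")
print(f"average coverage per position: {genome_coverage:.0f}x")
```

The estimate treats every mutation as independent and evenly distributed over the genome, which is a simplification; it says nothing about which of those mutants are viable.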

1.8 Environmental Pressures Promote Coronaviral Adaptation

Diversity increases the probability that a virus will be able to survive under changing

environmental pressures, but what such pressures exist in nature? Host defenses such as an

antibody response or the introduction of an antiviral pharmaceutical provide two common

examples (Coffin et al. 2013). Additionally, a virus that has been introduced into a new species

faces a number of host specific factors that act as barriers to efficient infection (Parrish et al.

2008).

For coronaviruses, the S protein is the major target of neutralizing antibodies (Du et al.

2016). Antibodies can neutralize coronaviral infection via two methods: i) binding to the RBD to

sterically prevent interaction with the host receptor, and ii) binding to the fusion machinery in a

manner that either stabilizes the pre-fusion conformation or that sterically prevents the required

post-fusion conformation from being realized (Zeng et al. 2006, Corti et al. 2013). Viruses carrying

S protein mutations that ablate binding of neutralizing antibodies would see a marked increase in fitness

compared to their neutralized peers, as they would be the only ones able to enter a host cell and

replicate. Culturing coronaviruses in the presence of neutralizing antibodies quickly leads to

mutants that are able to escape recognition by such antibodies (Tang et al. 2014), confirming that

immune pressure leads to S protein diversity.

Hosts naturally elicit a polyclonal antibody response with several antibodies able to bind

unique epitopes on a viral antigen and competitively inhibit receptor binding. Avoiding such a

robust response via point mutation may be impossible, as a single side chain difference may not

abrogate binding of multiple antibodies (Tang et al. 2014). One means by which a virus can

overcome such a polyclonal response is to increase the affinity of the RBD-receptor interaction

so that this interaction is more favorable than the antibody-RBD interaction. Increasing the

affinity of the receptor interaction via point mutations in the HA molecule has been shown to

enable influenza A to overcome a polyclonal antibody response (Hensley et al. 2009).

1.9 HCoV-229E as a Model for Coronavirus Adaptation and Evolution

HCoV-229E is a good model for studying the adaptation and evolution of coronaviruses

for several reasons. It was the first coronavirus to have its genome sequenced and multiple whole

genome sequences are available (Farsani et al. 2012). Additionally, more than 50 sequences of

the S protein RBD spanning decades and continents have been deposited and changes in primary

sequence are observed. The cellular receptor for HCoV-229E is known and the crystal structure

of the HCoV-229E RBD in complex with hAPN has been solved (Wong et al. 2017). HCoV-

229E elicits a neutralizing antibody response and yet prior infection does not lead to lasting

immunity. As such, viral adaptation and immune evasion can be studied with this system.

Phylogenetic analysis of the 52 available HCoV-229E RBD sequences segregates them into six

distinct groups (Figure 7). Each new group has successively replaced the previous one leading to

a “ladder-like” phylogeny. Such a phylogeny is representative of a protein under selective

pressure to evade the host immune response (Holmes 2011).

Analysis of the x-ray crystal structure of the HCoV-229E RBD-hAPN complex, in

conjunction with the available sequence data, has shown that the sequence variation observed is

highly skewed to the three receptor-binding loops (Figure 8). This is a striking observation that

raises many questions. Does such sequence diversity lead to different receptor usage? Does this

diversity modulate receptor binding affinity? Are these mutations the result of immune evasion?

The implications of these mutational differences are examined in this thesis.

Figure 7. Phylogenetic analysis of HCoV-229E S-protein RBDs reveals six distinct classes. An unrooted phylogenetic tree of 52 amino acid sequences of the HCoV-229E spike protein RBD (residues 293-435). The sequences cluster in six groups termed classes. These classes are presented along with the timeframe for which they were found.

1.10 Rationale

Viruses have an enormous impact on human health and prosperity. Each year, viruses

that circulate amongst humans are responsible for millions of infections and deaths throughout

the world. In addition to this present threat, animal viruses capable of crossing species barriers

and spreading to humans lurk on the horizon. Recent coronavirus epidemics that arose from cross-

species transmission events, such as those caused by SARS-CoV and MERS-CoV, have had

high mortality rates of 10% and 30%, respectively, and as of now there are no approved vaccines

or antiviral therapies to combat these viruses. Future coronavirus outbreaks are likely to occur

and further research is needed to prevent or treat disease caused by coronaviruses.

HCoV-229E is a human coronavirus that is thought to have originated in bats. First

identified 50 years ago, it circulates globally and is responsible for a modest percentage of

common colds. Exactly how the zoonosis of this virus occurred and how it was able to establish

itself in the human population is unknown and it is our hope that studying it can provide

mechanistic insights into these processes.

A crucial step in viral entry is membrane fusion and for coronaviruses this is mediated by

the spike (S) protein. Host antibodies that can bind to the S protein and block entry can

neutralize a possible infection. Avoiding this neutralization as well as optimizing existing

receptor binding characteristics are selective pressures thought to drive the evolution of the S

protein. Phylogenetic analysis of HCoV-229E S protein RBD sequences shows that they group

into six distinct classes. By studying amino acid changes in these six classes and how they

influence interactions with neutralizing antibodies as well as the host cell receptor, we hope to

gain insight into how and why coronaviruses evolve.

Figure 8. Variation in HCoV-229E S-protein RBD is localized in the three receptor binding loops. An alignment of the 51 amino acid sequences of the HCoV-229E S-protein RBD. Conservation is shown by a period, and differences in residues are shown as their one letter abbreviation. Most variation is observed in the three receptor binding loop regions, highlighted in red at the top. Residues that directly interact with hAPN in the Class 1 structure are highlighted in orange.

Chapter 2

Results

2.1 Biophysical Characterization of HCoV-229E Spike Protein RBD Interaction with Its Receptor and Neutralizing Antibodies

2.1.1 HCoV-229E RBDs Cluster into Six Phylogenetic Classes.

The HCoV-229E S protein RBD sequence, previously defined as residues 293-435, was

used to query the coronavirus sequence database and 51 additional RBD sequences were

obtained from both patient samples and lab-strain sources (Wong et al. 2017). Alignment and

construction of a phylogenetic tree indicated that the sequences formed six different Classes.

Each Class of RBD sequence was found in samples collected over a 3 to 7 year period (Figure

7). Moreover, the analysis shows that each RBD Class is replaced in the human population by

the next Class and this ladder-like phylogeny (Grenfell 2004) has continued until the present day.

Further analysis of the sequences shows that a large majority of the variation among Classes

maps to the three receptor binding loops of the RBD (Figure 9). G311, G313, N319, R359 and

the cysteine residues C317 and C320 that form the loop 1 disulfide bond are the only loop

residues conserved in all six RBD Classes. These conserved residues account for only 45% of the

Class 1 RBD surface area buried upon binding hAPN. The variation observed at the site-of-

interaction likely changes receptor binding characteristics and this was further investigated.
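One common way to quantify where an alignment varies is the Shannon entropy of each column. The sketch below is a minimal illustration of that idea and is not necessarily the scoring used for Figure 9, whose caption cites Pei et al. (2001). The toy alignment is hypothetical and is not HCoV-229E data.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (bits) of one alignment column; 0 means fully conserved."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def alignment_entropy(sequences):
    """Per-column entropy for pre-aligned sequences of equal length."""
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("sequences must be aligned to equal length")
    return [column_entropy(col) for col in zip(*sequences)]


# Hypothetical toy alignment, for illustration only (NOT real RBD sequences).
toy_alignment = [
    "GCFNCT",
    "GCYNCT",
    "GCHNCS",
    "GCFNCT",
]

entropies = alignment_entropy(toy_alignment)
variable_positions = [i for i, h in enumerate(entropies) if h > 0]

for position, h in enumerate(entropies):
    print(f"column {position}: {h:.2f} bits")
print("variable columns:", variable_positions)
```

Applied to a real alignment, columns with zero entropy would correspond to the strictly conserved residues, and high-entropy columns would be expected to cluster in the receptor-binding loops.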

2.1.2 Variation in Receptor-Binding Loops Changes Receptor-Binding Affinity and Binding Kinetics.

A representative sequence from each of the six Classes was selected for characterization.

To facilitate comparison, the six RBDs were synthesized such that residues outside of the loop

regions correspond to those of the Class 1 RBD. Differences in interaction with the receptor,

hAPN, were then tested and binding affinity (Kd), on-rate (kon), and off-rate (koff) were measured

using a surface plasmon resonance (SPR) assay (Figure 10). All six Classes of RBD were found

to bind hAPN and the kinetic details of these interactions are found in Table 2. The binding

affinity covers a 16-fold range with Class 1 binding the weakest (Kd of 434 nM) and Class 5

binding the strongest (Kd of 27.0 nM). With the exception of Class 4, there is a general trend

toward increased affinity over time since the 1970s. Furthermore, while the on-rate of the

interactions remained similar and spans only a 2.2-fold range, the off-rate of binding spans a

12-fold range. Based on a linear regression analysis, the off-rate alone accounts for 90% of the

variation seen in binding affinity and follows a similar pattern of near constant decrease over

time.

Figure 9. HCoV-229E S protein RBD variation. The crystal structure of the Class 1 HCoV-229E RBD is displayed and the variation observed in the 52 RBD sequences is displayed. Blue indicates no variation, white indicates moderate variation, and red indicates residues where the most variation occurs (Pei et al. 2001).

Figure 10. Global fitting of surface plasmon resonance data for the Class 1-6 RBD-hAPN interaction. SPR data for the interaction between the six classes of RBD and hAPN. Duplicates for each class are shown (left and right columns). Response units are plotted against time. Raw data is shown in black and the global fit is shown in red. Injection series are 2X dilutions starting from top points of 4.98, 3.78, 2.34, 2.04, 1.01, and 1.38 µM for Classes 1-6 respectively.

Table 2. Surface plasmon resonance binding kinetics for the RBD-hAPN interaction for each of the six Classes. N=2

Class   kon (×10⁵ M⁻¹ s⁻¹)   koff (s⁻¹)     Kd (nM)
1       3.6 ± 0.5            0.16 ± 0.02    434 ± 63
2       3.3 ± 0.5            0.08 ± 0.02    246 ± 19
3       7.3 ± 1.4            0.08 ± 0.02    113 ± 2.3
4       3.6 ± 0.5            0.10 ± 0.01    261 ± 24
5       4.8 ± 1.1            0.01 ± 0.01    27.0 ± 1.7
6       8.5 ± 0.6            0.03 ± 0.01    37.4 ± 3.5
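For a 1:1 interaction the three quantities in Table 2 are linked by Kd = koff / kon, so the table can be checked for internal consistency. The sketch below does this with the rounded table values, and also computes the fold range in Kd and a simple R² between koff and Kd. The regression used in the thesis is not described in detail here (for example, whether values were log-transformed), so the R² computed below is an illustration under that stated uncertainty and is not a reproduction of the reported figure.

```python
# Rate constants and fitted Kd values transcribed from Table 2.
# kon is in M^-1 s^-1, koff in s^-1, and Kd in nM.
TABLE_2 = {
    1: (3.6e5, 0.16, 434.0),
    2: (3.3e5, 0.08, 246.0),
    3: (7.3e5, 0.08, 113.0),
    4: (3.6e5, 0.10, 261.0),
    5: (4.8e5, 0.01, 27.0),
    6: (8.5e5, 0.03, 37.4),
}


def kd_from_rates(kon, koff):
    """Kd in nM for a 1:1 interaction, from kon (M^-1 s^-1) and koff (s^-1)."""
    return koff / kon * 1e9


def r_squared(xs, ys):
    """Squared Pearson correlation, i.e. R^2 of a simple linear regression."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy ** 2 / (sxx * syy)


kons = [row[0] for row in TABLE_2.values()]
koffs = [row[1] for row in TABLE_2.values()]
kds = [row[2] for row in TABLE_2.values()]

# Compare the ratio of the rounded rate constants with the fitted Kd.
for cls, (kon, koff, kd_fit) in TABLE_2.items():
    print(f"Class {cls}: koff/kon = {kd_from_rates(kon, koff):6.1f} nM, "
          f"fitted Kd = {kd_fit:6.1f} nM")

kd_fold_range = max(kds) / min(kds)
r2_koff_vs_kd = r_squared(koffs, kds)
r2_kon_vs_kd = r_squared(kons, kds)

print(f"Kd fold range:     {kd_fold_range:.1f}")
print(f"R^2 (koff vs Kd):  {r2_koff_vs_kd:.2f}")
print(f"R^2 (kon vs Kd):   {r2_kon_vs_kd:.2f}")
```

Because the table reports rounded means, the ratio koff / kon only approximates each fitted Kd, and fold ranges computed from the rounded rate constants need not match those quoted in the text exactly.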

2.1.3 Structure-Function Analysis of the Class 1 RBD interaction with hAPN.

Structure-function analysis of the Class 1 RBD interaction with hAPN was informed by

our group’s previously obtained crystal structure (Wong et al. 2017). Residues thought to be key

for the interaction on both the RBD and hAPN molecules were identified and mutants were

produced to confirm their importance. The RBD mutants produced were F318A, N319A,

W404A, and the double mutant C317S/C320S. Binding affinity and kinetics were measured in

much the same way as was done for the six RBD Classes (Figure 11). The N319A, W404A, and

C317S/C320S mutant RBDs were unable to interact with hAPN at the maximum achievable

concentration (Table 3). The F318A mutant showed a greatly reduced affinity (Kd of 5.8 µM

compared to 434 nM for WT Class 1 RBD) and therefore we can conclude that in all cases these

residues were important in complex formation.

The hAPN mutants produced were D288A, Y289A, V290G, I309A, and L318A. As

expected, SPR-based analysis showed reduced binding when compared to wild-type hAPN

(Figure 13). The reduction in affinity ranged from 10-fold for the D288A mutant to 30-fold for

the V290G mutant (Table 4). Unlike the RBD point mutants that were produced, no single

mutation on hAPN was able to completely eliminate the interaction. One interpretation of this

outcome is that the receptor-binding loops possess enough structural plasticity to accommodate

changes on the surface of their binding partner.

Figure 11. Surface plasmon resonance data and global fitting for RBD mutants and their interaction with hAPN. SPR data for the selected RBD mutants and their interaction with hAPN. Raw data is shown in black and the global fit for the F318A mutant is shown in red. The highest concentrations used in the two F318A mutant titrations are 17.2 (left) and 11.4 µM (right).

Figure 12. RBD Mutants show reduced or abrogated binding to hAPN. The side chains of RBD residues that were mutated are shown on the crystal structure of the Class 1 RBD and hAPN complex. The RBD is in brown and APN in green.

Table 3. RBD mutants, their contribution to buried surface area, and their effect on the RBD-hAPN interaction.

Residue Number   Amino acid          % Buried Surface Area   Mutated amino acid   Affinity Reduction
318              Phenylalanine       15                      Alanine              13X
319              Asparagine          9                       Alanine              No binding observed at 25 µM
404              Tryptophan          10                      Alanine              No binding observed at 2.2 µM
317/320          Cysteine/Cysteine   12                      Serine/Serine        No binding observed at 15 µM

Figure 13. Surface plasmon resonance data for mutant hAPN and RBD interaction. In all cases the mutant hAPN was covalently linked to the CM-5 dextran-coated gold chip and the WT Class 1 HCoV-229E RBD was injected. (A) hAPN D288A, (B) hAPN I309A, (C) hAPN V290G, (D) hAPN L318A and (E) hAPN Y289A. The raw data is plotted in black and the global fit is in red. The analyte solutions for the D288A, I309A, V290G, L318A, and Y289A titrations were obtained by 2-fold serial dilution starting at maximum concentrations of 25 µM, 4.1 µM, 25 µM, 4.1 µM and 24 µM, respectively.

Table 4. hAPN mutants, their contribution to buried surface area, and their effect on the RBD-hAPN interaction.

Residue Number   Amino acid      % Buried Surface Area   Mutated amino acid   Affinity Reduction
288              Aspartic acid   13                      Alanine              10X
289              Tyrosine        12                      Alanine              18X
290              Valine          7                       Glycine              30X
309              Isoleucine      3                       Alanine              25X
318              Leucine         5                       Alanine              12X
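The fold reductions listed in Tables 3 and 4 can be placed on an energy scale with the standard relation ΔΔG = RT ln(Kd,mutant / Kd,WT). The sketch below applies it to the measurable mutants. The temperature of 25 °C is an assumption made for illustration, since the assay temperature is not stated in this section, and the function name is hypothetical.

```python
import math

GAS_CONSTANT = 1.987e-3   # kcal / (mol K)
TEMPERATURE = 298.15      # K; ASSUMED (25 °C), not stated in the text


def ddg_from_fold(fold_reduction, temperature=TEMPERATURE):
    """Change in binding free energy (kcal/mol) for a given fold loss in affinity."""
    return GAS_CONSTANT * temperature * math.log(fold_reduction)


# Fold reductions in affinity as listed in Tables 3 and 4.
fold_reductions = {
    "RBD F318A": 13,
    "hAPN D288A": 10,
    "hAPN Y289A": 18,
    "hAPN V290G": 30,
    "hAPN I309A": 25,
    "hAPN L318A": 12,
}

for mutant, fold in fold_reductions.items():
    print(f"{mutant}: {fold}X weaker, ddG ~ {ddg_from_fold(fold):.2f} kcal/mol")
```

Mutants for which no binding was observed cannot be treated this way, because only a lower bound on their Kd is available.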

2.1.4 The Six Classes of RBD Share a Conserved Binding site on hAPN.

As previously noted, PRCoV also uses APN as its cellular receptor. However, it binds at

a site on porcine APN (the P-site) that is completely distinct from where the Class 1 RBD binds

on hAPN (the H-site) (Wong et al. 2017). Porcine APN (pAPN) residue N286 is located in the

H-site and is glycosylated. This bulky glycan leads to steric interference with the RBD and

prohibits docking (Figure 14). The corresponding residue on hAPN is E291 and the triple hAPN

mutant E291N/K292E/Q293T was produced to introduce an N-glycan sequon into the H-site on

hAPN. This mutant was then used as a prospective binding partner in an SPR assay in order to

determine whether all six Classes of HCoV-229E S protein RBD bind hAPN at the H-site

(Figure 15). Classes 1 through 6 of the RBD fail to show any binding to this triple mutant,

suggesting that the variation observed in the receptor-binding loops does not lead to a different

binding site on hAPN.
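The triple mutation works because it turns residues 291-293 from E-K-Q into N-E-T, which matches the N-X-S/T sequon for N-linked glycosylation (X being any residue except proline). A minimal sketch of a sequon search is shown below. The function name is illustrative, and the three-residue strings are taken only from the mutation names, not from a full hAPN sequence.

```python
import re

# N-X-S/T sequon, where X is any residue except proline.
# The lookahead allows overlapping sequons to be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(sequence):
    """Return 0-based start indices of N-X-S/T sequons in a protein sequence."""
    return [match.start() for match in SEQUON.finditer(sequence)]


wild_type = "EKQ"   # hAPN residues 291-293, read from the mutation names
mutant = "NET"      # the same positions after E291N/K292E/Q293T

print("wild type sequons:", find_sequons(wild_type))
print("mutant sequons:   ", find_sequons(mutant))
```

A sequon is necessary but not sufficient for glycosylation, which is why direct confirmation that the engineered site is occupied remains important.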

Introducing mutations into a wild type protein can destabilize its fold and change its

structure in more ways than intended. To determine whether or not the triple mutant hAPN was

folded properly, it was crystallized and the structure was solved (Table 5). This mutant was

almost identical to the WT hAPN structure with an RMSD of 0.174 Å across 892 Cα atoms.

Solving the crystal structure of the triple mutant had the added benefit of confirming that the site

was glycosylated during expression. Electron density for the asparagine-linked N-

acetylglucosamine moiety of the N-glycan was observed and, therefore, we can confidently

conclude that the RBDs were unable to bind the mutant hAPN because of the presence of a

glycan in the H-site.
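The RMSD quoted for the mutant and WT structures is the root-mean-square distance between paired Cα atoms after the two structures have been superimposed. The sketch below shows only the final distance calculation and assumes the coordinates are already superimposed and paired; a real comparison would first perform an optimal superposition. The coordinates are made-up toy values.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between paired, already superimposed (x, y, z) coordinates."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Hypothetical toy coordinates in Å (NOT taken from the hAPN structures).
structure_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
structure_b = [(0.0, 0.0, 0.2), (1.0, 0.0, -0.2)]

print(f"RMSD = {rmsd(structure_a, structure_b):.3f} Å")
```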

2.1.5 HCoV-229E Classes Differ in Their Ability to be Bound by a Neutralizing Antibody.

A monoclonal antibody (9.8.E12) was generated against the HCoV-229E lab strain

whose S protein contains the Class 1 RBD. This antibody was previously demonstrated to

neutralize the lab strain in a cell-based assay (Wong et al. 2017). Antibodies can neutralize

infection through at least two routes (competing for receptor binding or stabilizing the pre-fusion

conformation of the S protein), and to investigate which mechanism of neutralization was occurring,

an SPR binding assay was again employed.

The 9.8.E12 antibody was shown to bind the Class 1 RBD with a dissociation constant

(Kd) of 66 nM (Figure 16). To test whether this antibody neutralizes by blocking receptor

binding, a competition assay was conducted (Figure 17). A 200 nM concentration of the Class 1

RBD produced a plateau signal of 15 RUs on an hAPN coupled SPR chip. The same solution

with 2.0 µM 9.8.E12 antibody added produced no discernible signal. Together these results

indicate that the 9.8.E12 antibody binds the RBD and that this interaction prevents binding to

hAPN.

Figure 14. N-linked glycan at H-site produces a steric clash with a docked RBD. The superimposition of the crystal structures of the Class 1 RBD:hAPN complex and pAPN reveals that the glycan at N286 would prohibit binding of the RBD at the H-site. The RBD is seen in brown, hAPN in green, and pAPN in lavender. The steric clash can be observed at the displayed disulfide bond.

The 9.8.E12 antibody, now shown to neutralize through binding of the HCoV-229E S

protein RBD, was tested for its ability to bind the RBDs of the other classes. The Class 1 through

6 RBDs were flowed over an antibody-coupled chip at a concentration of 1.0 µM, 15 times the

Kd of the interaction between the Class 1 RBD and 9.8.E12 (Figure 18). No signal was observed

for Classes 2 through 6, an outcome indicating that this antibody is specific to the Class 1 RBD.

Because these RBDs vary only in the loop region, this assay also shows that the antibody

recognizes an epitope present on the receptor-binding loops and not in the β-sandwich region of

the RBD. This is strong evidence that the receptor-binding loops of HCoV-229E elicit a

neutralizing antibody response and that mutations in these loops may be sufficient to prevent

antibody binding.
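The choice of 1.0 µM for this screen can be rationalized with the equilibrium occupancy of a simple 1:1 binding model, fraction bound = C / (C + Kd). The sketch below is an illustration under that model, using the Kd of 66 nM reported for the Class 1 RBD; it assumes the injected RBD is in excess over the immobilized antibody.

```python
def fraction_bound(conc_nM, kd_nM):
    """Equilibrium fractional occupancy for simple 1:1 (Langmuir) binding."""
    return conc_nM / (conc_nM + kd_nM)


# 1.0 µM RBD against the Class 1 Kd of 66 nM reported in the text.
occupancy_class1 = fraction_bound(1000.0, 66.0)

# Injecting at a concentration equal to Kd gives half-maximal occupancy.
occupancy_at_kd = fraction_bound(66.0, 66.0)

print(f"occupancy at 1.0 µM: {occupancy_class1:.1%}")
print(f"occupancy at Kd:     {occupancy_at_kd:.1%}")
```

Under this model, an RBD that binds with an affinity similar to that of Class 1 would nearly saturate the chip at 1.0 µM, so a flat sensorgram points to a much weaker interaction or none at all.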

Several Class 1 RBD mutants previously produced were also tested for cross-reactivity

with 9.8.E12. Full titrations of the F318A and N319A mutants were performed and their binding

kinetics and affinity are shown in Figure 19. Both mutants showed a moderately reduced binding

affinity when compared to the WT Class 1 RBD but binding was not completely abrogated. This

suggests that these residues are not key to the RBD-antibody interaction.

Figure 15. Introduction of an N-linked glycan to the H-site on hAPN prohibits binding of all six Classes of RBD. SPR data for the six classes of RBD interacting with the E291N/K292E/Q293T triple mutant. All RBDs were injected at their respective Kd with WT hAPN. All sensorgrams show no signal.

Figure 16. The 9.8.E12 antibody binds the HCoV-229E Class 1 RBD. Surface plasmon resonance data for the interaction of the 9.8.E12 antibody with the Class 1 RBD. Raw data is shown in black while the global fit is shown in red. Interaction kinetics and affinity are shown in the top right. The antibody was covalently attached to the chip while the RBD was injected. Solutions of RBD for injection were obtained starting from a maximum concentration of 1.18 µM and conducting a 2-fold dilution.

Figure 17. The 9.8.E12 antibody competes with hAPN for RBD binding. Surface plasmon resonance data for a Class 1 RBD interaction with hAPN. hAPN is covalently linked to the chip and the Class 1 RBD was injected. 200 nM RBD produces a plateau of 15 RUs, while 200 nM RBD in solution with 2.0 µM 9.8.E12 antibody shows no binding to hAPN.

Figure 18. The 9.8.E12 antibody binds only the Class 1 RBD. Surface plasmon resonance data for the Class 1-6 RBD interaction with the 9.8.E12 antibody. The 9.8.E12 antibody is covalently linked to the chip and the Class 1-6 RBDs were injected at 1.0 µM. 1.0 µM Class 1 RBD produces a plateau of 180 RUs, while 1.0 µM of the Class 2-6 RBDs shows no signal, indicating no binding.

Figure 19. The 9.8.E12 antibody binds both loop 1 mutants. Surface plasmon resonance data for the F318A and N319A RBD mutant interaction with the 9.8.E12 antibody (top and bottom respectively). The 9.8.E12 antibody is covalently linked to the chip and RBD mutants were injected. Maximum concentrations of 0.839 µM and 1.18 µM, respectively, were serially diluted by 2-fold to obtain the lower sensorgrams.

2.2 Structural Biology of the Evolving RBD-Receptor Interaction

2.2.1 Crystallization of hAPN Requires Deglycosylation

hAPN is a glycoprotein with 10 N-linked glycosylation sequons (NXS/T) and the

previously solved crystal structure of hAPN shows that all 10 sites are glycosylated (Figure 20).

Glycoproteins present a special problem for crystallographers as the chemical and

conformational heterogeneity of their attached glycans can hinder crystal formation. Chemical

heterogeneity is reduced by expression in a GnT1(-/-) cell line, leading to glycoproteins whose N-

glycans do not get processed beyond the Man5GlcNAc2 intermediate. These glycans are still

large and highly flexible. In order to reduce the size and heterogeneity of these N-glycans, an

endoglycosidase was employed. Endo-β-N-acetylglucosaminidase A and H (EndoA and EndoH)

both recognize high-mannose type glycans and are able to enzymatically remove most of the

sugar moiety leaving only the asparagine-linked N-acetylglucosamine residue. Previous work

has indicated that hAPN will not crystallize without further enzymatic treatment with Jack Bean

α-Mannosidase, an observation suggesting that not all 10 N-glycan sites are susceptible to EndoA/H

cleavage. hAPN crystallizes in its apo form following either route of deglycosylation (either

EndoH or EndoA treatment followed by the α-mannosidase in both cases).

Figure 20. hAPN harbors 10 N-linked glycans. The sequence position and glycosylation potential of the 10 N-linked glycan motifs on the hAPN construct. The hAPN structure previously solved shows evidence of glycosylation at all 10 sites, regardless of two motifs falling below the arbitrary threshold of the NetNGlyc 1.0 server (Gupta et al. 2004, Wong et al. 2012).

Although EndoH and EndoA have a very similar specificity, they are different enzymes with

different molecular weights. As such, their ability to access a particular N-glycan on a given

substrate might be different. This phenomenon was observed with hAPN. SDS-PAGE with

Coomassie staining showed sharp bands of lowered MW for both EndoH and EndoA treated

hAPN, an indication of quantitative deglycosylation. However, after failed attempts to crystallize

the hAPN:RBD complex, both digests were investigated using MALDI-TOF mass spectrometry.

As seen in Figure 21, hAPN appears as a gaussian distribution centered around 116 M/Z. After

48 hours of EndoH treatment, a large shift of around 8 M/Z units is observed but the distribution

loses its gaussian character by gaining a shoulder at 109.1 M/Z, an indication of heterogeneity.

Figure 21. Deglycosylation of hAPN by EndoH yields a heterogeneous product. Shown are two MALDI-TOF curves displaying mass/charge ratio and arbitrary response units for hAPN. The red curve corresponds to the untreated sample and the green curve to the sample after EndoH treatment for 48 hours.

Subsequent treatment with α-Mannosidase fails to shift the hAPN further, indicating that there is

no further effect and that all possible glycan substrates have been removed. As mentioned earlier,

hAPN crystallizes in its apo form after this treatment, but crystals of hAPN in complex with

several RBDs could not be obtained.

EndoA treatment appears to have a different effect on hAPN than EndoH. As seen in

Figure 22, two days of EndoA digest shifts hAPN only about 4 M/Z units and the curve remains

gaussian in shape. This tells us that EndoA is able to cleave fewer glycans than EndoH, but that

it does so in a more quantitative manner. Subsequent α-mannosidase treatment shifts hAPN about 1

M/Z unit further. The two deglycosylation strategies yield hAPN samples that differ with regard

to what carbohydrate structures remain. Since these samples have similar masses, the differences

are missed when only using SDS-PAGE for analysis.

Figure 22. Deglycosylation of hAPN by EndoA yields a homogeneous product. Shown are two MALDI-TOF curves displaying mass/charge ratio and arbitrary response units for hAPN. The yellow curve corresponds to hAPN after 48 hours of EndoA digestion and the blue curve to the same sample after an additional 48 hours of α-Mannosidase treatment.
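The mass shifts described above can be turned into a rough estimate of how many glycans each enzyme trimmed. The following is only a sketch under stated assumptions: that the M/Z axis is in thousands for singly charged ions (so a shift of 8 units is about 8,000 Da), that every glycan is the Man5GlcNAc2 intermediate, and that cleavage leaves a single GlcNAc on the asparagine. The average residue masses and the function name are illustrative and are not taken from this work.

```python
# Average residue masses (Da) of sugars within a glycan chain (assumed textbook values).
HEXOSE_RESIDUE_DA = 162.14   # mannose
HEXNAC_RESIDUE_DA = 203.19   # N-acetylglucosamine

# Cleavage between the two core GlcNAc residues of Man5GlcNAc2 removes
# five mannose residues and one GlcNAc residue from the protein.
MASS_LOST_PER_GLYCAN_DA = 5 * HEXOSE_RESIDUE_DA + HEXNAC_RESIDUE_DA


def glycans_trimmed(mass_shift_da: float) -> float:
    """Estimate the number of Man5GlcNAc2 glycans trimmed for a given mass shift in Da."""
    return mass_shift_da / MASS_LOST_PER_GLYCAN_DA


if __name__ == "__main__":
    # Mass shifts assume the M/Z units in the text are thousands (an assumption).
    for enzyme, shift_da in [("EndoH", 8000.0), ("EndoA", 4000.0)]:
        print(f"{enzyme}: about {glycans_trimmed(shift_da):.1f} glycans trimmed")
```

Under these assumptions, shifts of about 8 and 4 units would correspond to roughly eight and four trimmed glycans, respectively, which is in line with the conclusion that EndoA cleaves fewer glycans than EndoH.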

2.2.2 HCoV-229E S Protein RBD Classes 3, 4, and 5 Crystal Complexes with Their Receptor hAPN.

Following deglycosylation of hAPN with EndoA and α-Mannosidase, diffracting crystals

of RBD Classes 3, 4, and 5 in complex with their receptor hAPN were obtained. These three

crystal structures were solved and statistics can be found in Table 5. All three crystals belong to

space group P21 and exhibit very similar packing of the crystal lattice. The unit cell is

nearly identical for the Class 4 and 5 structures. The Class 3 structure has a c axis about 6 Å

longer than that of the Class 4 and 5 structures. The asymmetric unit (ASU) contains an hAPN

dimer with an RBD bound to each monomer. The space group, unit cell and contents of the

asymmetric unit are different for the previously reported Class 1 RBD complex. Despite

crystallizing under similar conditions, the Class 1 complex crystallizes in a P3121 space group

and the ASU contains three hAPN monomers and the three RBDs that interact with them.
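The unit cell comparison can be made more concrete by computing cell volumes from the parameters listed in Table 5. The following is a minimal sketch that uses the standard volume formula for a general unit cell; the function and dictionary names are illustrative and were not part of the original analysis.

```python
import math


def unit_cell_volume(a, b, c, alpha_deg, beta_deg, gamma_deg):
    """Volume (cubic Å) of a general unit cell from edge lengths (Å) and angles (degrees)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha_deg, beta_deg, gamma_deg))
    return a * b * c * math.sqrt(1 - ca**2 - cb**2 - cg**2 + 2 * ca * cb * cg)


# Unit cell parameters (a, b, c, alpha, beta, gamma) as listed in Table 5.
CELLS = {
    "Class 3": (99.32, 98.52, 153.62, 90, 104.437, 90),
    "Class 4": (99.158, 98.572, 147.778, 90, 104.417, 90),
    "Class 5": (99.5109, 98.6599, 147.45, 90, 104.53, 90),
}

if __name__ == "__main__":
    for name, cell in CELLS.items():
        print(f"{name}: {unit_cell_volume(*cell):,.0f} cubic Å")
```

For a monoclinic cell such as these, the expression reduces to V = abc sin β, so the longer c axis of the Class 3 crystal translates directly into a larger cell volume.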

As expected, the overall “backbone” structures of the RBDs are remarkably similar. The

Class 3, 4, and 5 RBDs all maintain the same beta-sandwich domain and vary only in the loop

regions (Figure 23). The crystal structures confirm that they all bind hAPN in a fashion very

similar to that observed for the Class 1 RBD. In all cases, the RBD surface area buried on

complex formation is very similar. The largest difference is between Class 1 with a buried

surface area (BSA) of 513 Å² and Class 3 with a BSA of 537 Å², a difference of only 4.7%

(Table 6). A view of the four interfaces can be found in Figure 24. hAPN can exist in a fully

“open” form where its catalytic site is solvent exposed, a more compact fully “closed” form

where the catalytic site is not exposed, or the spectrum of conformations between these terminal

states (Wong et al. 2012). The previously obtained Class 1 RBD structure contained hAPNs that

were all in the “closed” conformation. In the three other structures, at least one of the hAPNs in

the asymmetric unit is partially open and this change may contribute to crystallization in

different space groups.

2.2.3 HCoV-229E Class 1 RBD-hAPN Complex Provides a Foundation for Comparison.

The HCoV-229E Class 1 RBD-hAPN crystal structure shows that the interaction is

mediated exclusively by loop 1 (residues 308-325), loop 2 (residues 352-359), and loop 3

(residues 404-408) of the RBD. Loop 1 is the largest by residue count and by the buried surface

area (BSA) created on complex formation with hAPN. It accounts for 68% of the total BSA and

makes several notable interactions with hAPN. The GGG motif present at residues 313-315

accounts for 25% of the total BSA and it contains backbone NH groups that act as hydrogen

bond donors to the side chain carboxyl oxygens of hAPN residue D288 and the backbone

carbonyl oxygen of hAPN residue Y289 (Table 7). As demonstrated by the SPR experiments

described above, mutation of either of these two hAPN residues leads to greatly reduced binding

(Table 4). RBD residue N319 is central to the interaction and its side chain makes two hydrogen

bonds to the backbone NH and carbonyl oxygen of hAPN residue E291. Also of importance in

loop 1 is the disulfide bond between C317 and C320. This disulfide makes stacking interactions

with hAPN and it likely structures loop 1. The C317S/C320S RBD double mutant cannot bind

hAPN (Table 3). Loop 2 accounts for the smallest fraction of the BSA at just 9%. It contains

residue R359, which forms a salt bridge with D315 of hAPN. No mutational analysis was performed

on any loop 2 residues. Loop 3 contributes the remaining 23% of the BSA for the RBD-hAPN

interaction. W404 of this loop is important as its bulky side chain makes both intra-RBD

interactions and interactions with the side chains of residues V290 and L318 on hAPN.

Mutation of this residue to alanine eliminates binding. S407 and K408 of loop 3 participate in

hydrogen bonds with K292 and E291 of hAPN, respectively.

Figure 23. Superimposition of the Class 1, 3, 4, and 5 RBDs. Shown is the superimposition of all four RBD structures. The RMSD for any pairwise comparison is less than 1.1 Å across all shared α-carbons. Dotted lines represent unbuilt regions.

Table 5. Data collection and refinement statistics.

| Statistic | Class 3 RBD:hAPN | Class 4 RBD:hAPN | Class 5 RBD:hAPN | hAPN Glycosylation Mutant |
| --- | --- | --- | --- | --- |
| Resolution range (Å) | 44.3 - 3.1 (3.2 - 3.1) | 48.0 - 2.75 (2.85 - 2.75) | 47.58 - 2.5 (2.589 - 2.5) | 29.2 - 2.0 (2.07 - 2.0) |
| Space group | P 1 21 1 | P 1 21 1 | P 1 21 1 | P 64 |
| Unit cell a, b, c (Å); α, β, γ (°) | 99.32 98.52 153.62 90 104.437 90 | 99.158 98.572 147.778 90 104.417 90 | 99.5109 98.6599 147.45 90 104.53 90 | 159.53 159.53 115.854 90 90 120 |
| Total reflections | 189210 (20035) | 272415 (27367) | 351849 (36261) | 1312614 (122519) |
| Unique reflections | 52290 (5206) | 71808 (7122) | 95596 (9507) | 112832 (11236) |
| Multiplicity | 3.6 (3.8) | 3.8 (3.8) | 3.7 (3.8) | 11.6 (10.9) |
| Completeness (%) | 99.77 (99.81) | 99.84 (99.89) | 99.88 (99.95) | 99.96 (100.00) |
| Mean I/sigma(I) | 12.10 (3.36) | 16.18 (3.87) | 10.37 (3.91) | 18.19 (2.12) |
| Wilson B-factor | 86.9 | 48.97 | 40.22 | 37.02 |
| R-merge | 0.06664 (0.427) | 0.06827 (0.4231) | 0.07703 (0.2766) | 0.1019 (0.9583) |
| R-meas | 0.07817 (0.497) | 0.07962 (0.4921) | 0.09019 (0.3218) | 0.1066 (1.006) |
| R-pim | 0.04057 (0.2532) | 0.04078 (0.2503) | 0.04633 (0.1628) | 0.03103 (0.3037) |
| CC1/2 | 0.998 (0.934) | 0.997 (0.908) | 0.995 (0.942) | 0.999 (0.784) |
| CC* | 0.999 (0.983) | 0.999 (0.975) | 0.999 (0.985) | 1 (0.938) |
| Reflections used in refinement | 52263 (5203) | 71791 (7121) | 95567 (9507) | 112828 (11238) |
| Reflections used for R-free | 2613 (260) | 1073 (107) | 4868 (472) | 1288 (127) |
| R-work | 0.2162 (0.2987) | 0.2049 (0.2907) | 0.2096 (0.3224) | 0.1805 (0.2832) |
| R-free | 0.2618 (0.3405) | 0.2450 (0.2891) | 0.2327 (0.3365) | 0.2023 (0.2925) |
| CC(work) | 0.922 (0.879) | 0.940 (0.830) | 0.943 (0.764) | 0.959 (0.855) |
| CC(free) | 0.882 (0.795) | 0.914 (0.858) | 0.931 (0.709) | 0.956 (0.787) |
| Number of non-hydrogen atoms | 16136 | 16541 | 16521 | 8104 |
| macromolecules | 15951 | 15985 | 16028 | 7281 |
| ligands | 181 | 324 | 290 | 130 |
| solvent | 180 | 232 | 203 | 693 |
| Protein residues | 1991 | 1991 | 2012 | 892 |
| RMS(bonds) | 0.002 | 0.002 | 0.003 | 0.006 |
| RMS(angles) | 0.5 | 0.52 | 0.59 | 0.71 |
| Ramachandran favored (%) | 95.13 | 96.09 | 96.50 | 97.64 |
| Ramachandran allowed (%) | 4.82 | 3.91 | 3.5 | 2.36 |
| Ramachandran outliers (%) | 0.05 | 0 | 0 | 0 |
| Rotamer outliers (%) | 0.63 | 0.28 | 0.34 | 0.75 |
| Clashscore | 0.85 | 0.53 | 0.44 | 1.37 |
| Average B-factor | 92.81 | 48.89 | 41.06 | 41.06 |

Statistics for the highest-resolution shell are shown in parentheses.

2.2.4 HCoV-229E Class 3 RBD-hAPN Crystal Complex Shows a Markedly Different Interaction When Compared to the Class 1 RBD.

The Class 3 RBD-hAPN crystal structure shows several important differences when

compared to the Class 1 structure. In loop 1, S312 and G313 contribute far less BSA to the

interaction compared to their Class 1 counterparts. The GGG motif at residues 313-315 is

changed to GVG and the backbone that lies flat against hAPN in Class 1 to satisfy hydrogen

bonds has been kinked up and away from hAPN. This leads to the loss of a hydrogen bond

involving RBD residue 313 that is seen in the Class 1 RBD. However, the Class 3 RBD residue

R316, whose side chain occupies the volume vacated by the backbone kink (Figure 25), makes a

highly favorable salt bridge with hAPN residue D288. The Class 3 residue R316 accounts for

three times the BSA of Class 1 residue K316 and the bulkier side chain of V314 compared to

G314 adds 9 Å² to the interface. The C317/C320 disulfide bond is positioned and oriented in a

nearly identical manner and makes the same interactions in both complexes. The same is

true of F318 and N319. The dual hydrogen bonds made by N319 are slightly longer in the Class

3 structure, but the residue is supported intramolecularly by N406. N406 makes hydrogen bonds

with both the backbone and side chain of N319 (interactions not observed in the Class 1

complex) and this highly networked and orienting interaction likely promotes complex formation

(Figure 26).

Figure 24. Interface details of HCoV-229E S protein Class 1, 3, 4, and 5 RBDs in complex with hAPN. Shown are key residues present in the interface between hAPN and the Class 1 RBD (A), the Class 3 RBD (B), the Class 4 RBD (C), and the Class 5 RBD (D).

Loop 2 contains a two-amino-acid deletion in Class 3 relative to Class 1. Residues

V353 and Y354 are lost, and this leads to a tighter turn that repositions R359. R359 in Class 1

makes a hydrogen bond with hAPN residue D315 and accounts for 7.5% of the total BSA. In

the Class 3 RBD, the equivalent residue R357 makes two additional hydrogen bonds and nearly doubles its contribution

to the BSA of the interaction with hAPN (Figure 27).

Loop 3 also shows major changes in the Class 3 complex relative to that of Class 1 and

these changes are likely to affect binding. Class 1 RBD residue W404, a residue that makes both

intra- and intermolecular interactions, as discussed above, is mutated to L402 in the Class 3

RBD. This side chain is able to accomplish the intramolecular apolar packing that W404

participated in, but it makes no contact with hAPN in the Class 3 complex. Class 1 residue S407

has mutated to L405 in Class 3 and a hydrogen bond to hAPN is lost in the process. However,

L405 makes a larger contribution to the BSA than does W404 in their respective complexes.

Together, L402 and L405 of the Class 3 RBD make a key contribution to the Class 3 complex

(Figure 28). The remaining differences between the Class 1 and Class 3 RBDs are peripheral

to the receptor interaction (Figure 29).

Table 6. Buried surface area analysis of RBD Classes 1, 3, 4, and 5 in complex with hAPN.

Table 7. Hydrogen bond and salt bridge analysis of four crystal complexes.

2.2.5 The HCoV-229E Class 4 RBD-hAPN Interaction Maintains Many Features of the Class 3 Interaction.

Several subtle differences exist between the Class 3 and 4 RBD-hAPN complexes. Loop

1 is almost identical as the GVG motif is maintained along with the contributions of crucial

residues R316, N319, and the disulfide bond. F318 has mutated to Y318 in a conservative

manner. This bulky residue still contributes over 12% of the total BSA and participates in

hydrogen bonds with Y289 on hAPN. The additional hydroxyl group of tyrosine compared to

phenylalanine is presented toward the solvent and this increase in polar character may be

favorable. All interactions highlighted for the Class 3 structure are maintained in the Class 4

structure and no additional mutational differences exist at the protein-protein interface. However,

there are differences in the loops and supporting residues that are not central to the interaction

(Figure 27). Residues N307, R311, Q349, K354, D356, M399, N404, and H408 do not

contribute to the buried surface but nonetheless differ between Class 3 and 4.

Figure 25. R316 of the Class 3 RBD fills the volume vacated by the kinked backbone. Shown is the Class 1 RBD in brown, the Class 3 RBD in green, and hAPN in black. The Class 1 RBD makes hydrogen bonds with hAPN residue D288 using residues G313 and G315 (dotted lines). Class 3 loses these interactions and compensates using R316.

2.2.6 HCoV-229E Class 5 RBD-hAPN Interaction Shows a Moderately Changed Loop 1.

The Class 5 RBD binds hAPN with the strongest affinity of any tested construct (Kd of

27.0 nM). This is nearly 10-fold stronger than the Class 4 RBD (Kd of 261 nM). The major

difference between the Class 4 and Class 5 RBD-hAPN complexes occurs in loop 1. The GVG

motif from residues 313-315 has been mutated to GPG. The addition of this proline has changed

the backbone conformation of this region of loop 1. This is the second class in which the middle

residue of the Class 1 GGG motif has been changed. Proline residues have stricter

Ramachandran requirements than glycine and valine residues and the transition from valine to

proline here likely reduces loop flexibility. Additional mutational differences are not at the site

of interaction and are unlikely to affect binding. The other residues that are changed in the Class

5 RBD relative to that of Class 4 are exposed to solvent (Figure 29). Antibodies that bind to

these locations on the RBD may sterically occlude binding to hAPN and changes to these

residues may facilitate immune evasion.

Figure 26. N406 of the Class 3 RBD provides supporting hydrogen bonds to N319. Shown is the Class 3 RBD (green) in complex with hAPN (black). N319 makes two hydrogen bonds with the E291 backbone and this residue has been shown to be essential to the interaction. It is oriented into position by two hydrogen bonds provided by N406.

Figure 27. A two-residue deletion in loop 2 of the Class 3 RBD helps position R357 more favorably. Shown are the Class 1 RBD in brown and the Class 3 RBD in green. hAPN is in black. The loss of two residues moves R357 (Class 3) away from the position of R359 (Class 1), bringing the salt bridge partners closer together in a more favorable arrangement. Class 3 interactions are shown with dotted lines and Class 1 interactions are shown with dashed lines.

Figure 28. L402 and L405 replace W404 in more recently isolated RBD sequences. Shown is the class 1 RBD in brown and Class 5 in blue. hAPN is in black. The volume occupied by W404 in the Class 1 structure is supplanted by two leucine residues, L402 and L405. This feature is shared by the Class 3 and 4 structures and the Class 5 is displayed as it is the highest resolution structure obtained (2.5 Å).

Figure 29. Mutations not located at the site of hAPN interaction cluster at the top and sides of the RBD. The HCoV-229E RBD overlaid on the NL63-CoV trimer and colored. Residues not located at the RBD-hAPN interface that have changed from the previous structure are shown in red. Class 3 changes from Class 1 in panels A and B, Class 4 changes from Class 3 in C and D, and Class 5 changes from Class 4 in panels E and F.

Chapter 3

Discussion

3.1 A Ladder-Like Phylogeny and Immune Evasion

Phylogenetic analysis of the RBD sequences of HCoV-229E viruses sampled over the

past 50 years shows that they segregate into six classes. Moreover, the six classes have been

found to successively replace each other in the human population over this time period. This

“ladder-like” phylogeny has been observed in the Influenza H3N2 HA1 protein and in intra-host

sampling of the HIV-1 E protein. In both cases, it has been attributed to the selection of

phenotypes that are able to escape an immune response (Grenfell 2004). The vast majority of the

mutational differences observed between the HCoV-229E S protein RBD classes are found in

the three receptor-binding loops (Figure 9). Loops of this kind are known to be highly

immunogenic (Corti et al. 2013) and they elicit an antibody response in the case of the HIV-1 E

protein and the TGEV S protein (Kim et al. 2003, Reguera et al. 2012). We showed that the

receptor-binding loops of the HCoV-229E S protein RBD are the site of binding of the

neutralizing antibody, 9.8.E12, and that loop variation can abrogate antibody binding. It follows

that loop mutations would facilitate immune evasion.

HCoV-229E is a pandemic virus and surveillance indicates that it circulates on all

continents except Antarctica. The constructed phylogeny shows these viruses all share a common

lineage (Figure 7). The emergence of viral classes can be explained as follows: an HCoV-229E

strain containing an S protein with the Class 1 RBD propagates through individuals during the

cold season. This infection is cleared and the persons that were infected now possess neutralizing

antibodies, an asset that prevents additional infections from viruses with identical or very similar

RBDs. The virus continues infecting individuals the following season but this time has fewer

potential targets. This continues for several years until a certain percentage of the population is

protected, typically around 90% (Fine 1993). At this time, a viral variant able to escape this

"herd immunity" is at a competitive advantage. The new variant may be fixed in the population

at this time or it may require further optimization before a new RBD class emerges. This model

is supported by studies that show periods of low HCoV-229E circulation following years of

high circulation (Cabeca et al. 2013), and the observation that HCoV-229E infection

does not necessarily provide an individual with protection from future infections (Reed 1984). It

is certainly possible that the HCoV-229E receptor-binding loop variation, and the emergence of

new RBD classes, is strictly the consequence of the abrogation of neutralizing antibody binding.

However, it is also possible that other driving forces for loop variation exist as discussed below.

3.2 The HCoV-229E RBD Affinity for hAPN has Changed Over Time.

The receptor-binding affinity and kinetics of the six HCoV-229E S protein RBDs were

measured and several interesting patterns emerged. Firstly, the affinity of the RBDs for their

cellular receptor hAPN has shown a tendency to increase over the past 50 years. The Class 1

RBD (from a sample first isolated in 1967) has an affinity of 434 nM, while the Class 5 and 6

RBDs (from samples isolated in the 2000s and 2010s) have an affinity of 27 and 37 nM,

respectively (Table 2). This increase in affinity appears to have been selected for over time and

the directionality suggests that optimization is occurring. Enveloped viruses possess receptor-

binding proteins that bind their receptors with a wide range of affinity: influenza binds in the

millimolar range, and HIV binds in the nanomolar range, for instance (Skehel et al. 2000,

Ugolini et al. 1999). For a given virus and cell-surface receptor density, there is an affinity

threshold where membrane fusion is achieved and above which fusion is not improved

(Hasegawa et al. 2007). Why then is a 16-fold increase in affinity observed between the Class 1

and Class 5/6 RBDs, when this threshold has apparently been met by viruses with the Class 1

RBD? The answer may be related to the need to evade a polyclonal antibody response in the case

of a host infection. As mentioned in the previous section, mutations in the receptor-binding loops

are able to prevent a monoclonal antibody from binding, but the true, in vivo immune response to

viral infection would be polyclonal. Coronaviruses like SARS-CoV and other enveloped viruses

like influenza A have surface glycoproteins with multiple epitopes (He et al. 2005, Hensley et al.

2009). Evading a polyclonal antibody response through the abrogation of antibody binding might

require mutations at several distinct sites, an unlikely event. However, for neutralizing antibodies

that compete for receptor binding, an increase in receptor binding affinity is thought to be an

additional route to immune evasion and one that would work even for a polyclonal antibody

response (Hensley et al. 2009). Our 9.8.E12 antibody binds the Class 1 RBD with an affinity of

66 nM, an affinity seven-fold higher than the Class 1 RBD-hAPN interaction. An increase in

receptor binding affinity will help the receptor outcompete neutralizing antibodies for interaction

with the RBD. Indeed, the Class 5 and 6 RBDs bind hAPN two to three-fold tighter than the

9.8.E12 antibody binds the Class 1 RBD.
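The competition argument above can be illustrated with a textbook equilibrium model in which receptor and antibody compete for a single site on the RBD. This is a simplified sketch, not an analysis performed in this work. It assumes both ligands are in excess over the RBD, it uses hypothetical equal concentrations of 100 nM, and it assumes the 9.8.E12 affinity measured for the Class 1 RBD (66 nM) applies unchanged to the other classes, which the antibody-binding data do not establish. All names are illustrative.

```python
def fraction_receptor_bound(receptor_nm, kd_receptor_nm, antibody_nm, kd_antibody_nm):
    """Fraction of RBD occupied by receptor when receptor and antibody compete for one site.

    Simple equilibrium competition; assumes both ligands are in excess over the RBD.
    """
    r = receptor_nm / kd_receptor_nm
    a = antibody_nm / kd_antibody_nm
    return r / (1.0 + r + a)


# Kd values (nM) for hAPN binding as quoted in the text.
KD_HAPN_NM = {"Class 1": 434.0, "Class 5": 27.0, "Class 6": 37.0}
# Antibody Kd for the Class 1 RBD; assumed (not shown) to hold for the other classes.
KD_ANTIBODY_NM = 66.0
# Hypothetical, equal concentrations of receptor and antibody.
RECEPTOR_NM = ANTIBODY_NM = 100.0

if __name__ == "__main__":
    for rbd_class, kd in KD_HAPN_NM.items():
        bound = fraction_receptor_bound(RECEPTOR_NM, kd, ANTIBODY_NM, KD_ANTIBODY_NM)
        print(f"{rbd_class}: {bound:.2f} of RBD bound to hAPN")
```

In this model, lowering the receptor Kd raises the receptor-bound fraction at any fixed antibody concentration, which is the qualitative point made above.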

Another interesting aspect of the increased affinity of the RBD-hAPN interaction is that it

appears to be almost exclusively the result of a slowing of the off-rate (koff). The affinity of an

interaction, or its Kd, is the quotient of koff and kon and changes in either rate will change the

affinity. Biochemically, the change from faster to slower off-rates has interesting implications at

the site of receptor engagement, the cell surface. As previously introduced, after a coronavirus

localizes to the cell surface by binding a receptor, it needs triggers to activate the fusion

machinery and it needs the fusion process to progress to completion, two actions assisted by

additional time. For HCoV-229E, fusion activation is achieved by cell-surface transmembrane

serine protease cleavage of the S-protein (Bertram et al. 2013). Additional time attached to the

cell surface would allow for diffusion of such proteases to the proper location and for enzymatic

cleavage to occur. Viruses with slower off-rates may be at a fitness advantage compared to those

with faster off-rates. Similarly, more than one S-protein subunit may need to engage hAPN

molecules to trigger fusion and slower off-rates would facilitate the engagement of a second

subunit before the dissociation of the first one occurs.
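The relationship between the rate constants and the affinity can be sketched as follows. The rate constants below are hypothetical values chosen only for illustration and are not the measured SPR values; the function names are likewise illustrative. The sketch shows that, with kon held constant, a 16-fold slower koff gives a 16-fold lower Kd and a 16-fold longer complex half-life.

```python
import math


def dissociation_constant(k_off_per_s, k_on_per_m_per_s):
    """Kd (M) as the quotient of the off-rate and the on-rate."""
    return k_off_per_s / k_on_per_m_per_s


def half_life_s(k_off_per_s):
    """Half-life (s) of a 1:1 complex undergoing first-order dissociation."""
    return math.log(2) / k_off_per_s


# Hypothetical rate constants for illustration only (not measured values).
K_ON = 1.0e5                   # M^-1 s^-1, held constant
K_OFF_FAST = 4.0e-2            # s^-1
K_OFF_SLOW = K_OFF_FAST / 16   # a 16-fold slower off-rate

if __name__ == "__main__":
    for label, k_off in [("fast off-rate", K_OFF_FAST), ("slow off-rate", K_OFF_SLOW)]:
        kd_nm = dissociation_constant(k_off, K_ON) * 1e9
        print(f"{label}: Kd = {kd_nm:.0f} nM, half-life = {half_life_s(k_off):.0f} s")
```

The longer half-life is the quantity relevant to the argument above, since it sets how long a bound virus remains attached while proteases act and additional subunits engage.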

3.3 Crystal Structures and Mutagenesis Shed Light on the RBD-hAPN Interaction.

Crystal structures of protein complexes provide a wealth of information that can be

further explored through mutagenesis and binding studies. Our group’s previously obtained

crystal structure of the Class 1 RBD in complex with hAPN informed the design of several new

RBD and hAPN constructs that were aimed at elucidating and confirming the importance of

several key residues. RBD residues F318, N319, W404, and the disulfide bond formed by C317

and C320 were selected for study and the appropriate mutants were produced.

The F318A mutant led to an RBD that bound hAPN with a 13-fold reduction in affinity

(Table 3), a ∆∆G of 1.5 kcal/mol; this residue is near the center of the RBD-hAPN interacting

surface, the classical definition of a “hot-spot residue” (Li et al. 2004). This residue is a tyrosine

in the Class 2 RBD, it returns to a phenylalanine in the Class 3 RBD, and again appears as a

tyrosine in the Class 4, 5, and 6 RBDs. The residue’s aromatic ring is positioned nearly

identically in all cases and differs only in the presence of the tyrosine hydroxyl group which

points toward solvent (Figure 24). This switching of similar residues may be indicative of a

neutral mutation, but may also provide an example of how a change to an interface residue will

maintain receptor binding, while altering a potential antibody binding epitope (Shiroishi et al.

2006). This hypothesis could be investigated with a suite of antibodies raised against the Class 1

RBD and a F318Y mutant.
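The conversion from a fold change in affinity to a ∆∆G value follows from ∆∆G = RT ln(Kd,mutant / Kd,wild-type). The sketch below is illustrative: the temperature of 298.15 K is an assumption, since the measurement temperature is not restated here, and the function name is hypothetical.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # kcal mol^-1 K^-1


def ddg_from_fold_change(fold_change, temperature_k=298.15):
    """Change in binding free energy (kcal/mol) for a given fold change in Kd.

    Uses ddG = R * T * ln(Kd_mutant / Kd_wild_type); the default temperature is an assumption.
    """
    return GAS_CONSTANT_KCAL * temperature_k * math.log(fold_change)


if __name__ == "__main__":
    print(f"13-fold weaker binding: {ddg_from_fold_change(13.0):.2f} kcal/mol")
```

At the assumed temperature, a 13-fold loss of affinity corresponds to about 1.5 kcal/mol, in line with the value quoted above for F318A.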

The N319A mutant showed no binding at the maximum achievable concentration of 25

µM, an indication that it too is a hot-spot residue. Both the NH and carbonyl oxygen of the side

chain of this asparagine residue make hydrogen bonds with hAPN that are maintained in the

crystal structures of the Class 3, 4, and 5 RBDs. It is likely that this same interaction is present

between hAPN and the Class 2 and 6 RBDs as the residue is one of the six that are conserved in

all 52 deposited sequences. It is noteworthy that this residue forms an important bond with the

backbone of hAPN and not a side chain. Overall protein structure is likely to be maintained

between homologous proteins from different species, while individual residues are more likely to

vary (Sitbon et al. 2007). This feature may have important implications when cross-species

transmission is considered, as less dependence on side chains may allow for more potential hosts.

Mutation of residue 404 from tryptophan to alanine completely abrogates binding. W404

contributes a large hydrophobic surface for apolar packing both within the RBD and across the

RBD-hAPN interface. It follows that a mutation from a large aromatic residue to the much

smaller alanine residue would disrupt the interaction. In all of the other classes, a leucine is

found at this position and interestingly they all bind with higher affinity. In the Class 3, 4, and 5

structures, this leucine (L402 due to a two amino acid deletion) is accompanied by another

leucine at position 405 (L405). The two hydrophobic side chains interact and occupy the same

volume as the tryptophan residue in Class 1 (Figure 28). The Class 2 and 6 RBDs have an

isoleucine and a histidine at these positions, respectively, showing a need for a large and possibly

branched side chain.

3.4 hAPN Mutants Lead to Reduction in RBD Binding Affinity

The hAPN mutants produced all interacted with the Class 1 RBD at least 10-fold more weakly

than the wild-type hAPN did. Overall, this is unsurprising as the mutants selected were thought

to be key to the interaction. The reduction in affinity of mutants does shed some light on the

contributions a particular residue makes to the interaction. The side chain carboxylic group of

D288 makes two hydrogen bonds with backbone amides in the RBD receptor-binding loop 1.

Despite the assumed significance of these two hydrogen bonds, the D288A mutant sees the

smallest reduction in affinity, only 10-fold. In contrast, the I309A mutant has a 25-fold reduction

in affinity.

No hAPN mutant produced completely abolished binding of the Class 1 RBD. This might

suggest that the receptor-binding loops on the RBD are structurally malleable enough to

accommodate changes that occur on the surface of its binding partner. Indeed, these mutant

APNs can be viewed as "homologous receptors" in species closely related to humans. It has been

suggested that differences in receptor sequence are the largest barrier to zoonotic transmission

(Bae et al. 2011) and receptor binding loop plasticity is one means by which differences in

receptor sequence/structure can be overcome (Wong et al. 2017).

3.5 The Use of Loops as Receptor-Binding Motifs Enables HCoV-229E Adaptation and Evolution.

The HCoV-229E RBD utilizes extended loops and not more constrained secondary

structure elements to bind its receptor hAPN. Loop regions in proteins are more likely to tolerate

insertions, deletions, and substitutions when compared to more ordered regions (Chaux et al.

2007, Touriki et al. 2008). In fact, one analysis of yeast orthologs shows that loop regions are 14

times more able to accommodate insertions and deletions than are secondary structure elements

(Tath-Petraczy et al. 2013). Of the 52 RBD sequences available, substitutions, insertions, and

deletions in the receptor binding loops are all observed (Figure 8). This ability to withstand

change enables the RBD to probe more sequence space than would otherwise be allowed and this

diversity has important implications. Perhaps most importantly, it allows the virus to alter its

antigenic surface thereby providing a facile means of abrogating the binding of loop-binding

neutralizing antibodies. As discussed above, the use of loops to bind receptor, and the variability

that they can accommodate, also provides a route to changing receptor binding affinity, another

determinant of viral fitness.

RNA viruses are best described as populations (Lauring et al. 2013) and viral populations

with diversity in their receptor binding loops would also be expected to be well suited to

acquiring new receptor interactions by chance. Among relatives of HCoV-229E that use loops to

bind their receptors, PRCoV binds APN at the P site and not the H site, and HCoV-NL63 binds

ACE2 (Reguera et al. 2012, Wu et al. 2009). These examples provide evidence that receptor

binding loops have provided a route for the acquisition of new receptors or new receptor

interactions. During cross-species transmission, the use of non-conserved receptor interactions in

the new species is not only likely to be rare, as it involves completely new binding modes, but

it will also likely require rounds of viral replication and optimization to produce a sufficiently

fit virus (Dai et al. 2013, Wong et al. 2017). It follows that the ability of receptor binding loops

to sustain mutational change will facilitate both the acquisition and optimization of non-

conserved receptor interactions. The use of conserved receptor interactions between

homologous receptors in the old and new species is comparatively more likely. There are many

barriers to cross-species transmission and these all need to be overcome in order for cross-

species transmission to occur (Plowright et al. 2017). The use of a homologous receptor in the

new host reduces one of the barriers to cross-species transmission and again loop variation and

plasticity will facilitate the process.

HCoV-229E likely crossed the species barrier from bats to humans in the recent past

(Corman et al. 2015) and it is certainly possible that optimization of receptor interactions in

humans is still ongoing. In the Class 1 RBD loop 1, a GGG motif is likely to give this loop

flexibility and likely increased binding promiscuity. Too much flexibility completely eliminates

binding as the loss of the disulfide bond in the C317S/C320S double mutant RBD showed (Table

3). As the HCoV-229E RBD has evolved in the human population, the middle residue in this

GXG motif has changed from glycine in the Class 1 and 2 RBDs to valine in the Class 3 and 4

RBDs to proline in the Class 5 and 6 RBDs (Figure 8). A valine at this position would reduce

loop flexibility and a proline would reduce it even further. There are very few differences in the

RBD-hAPN interface between the Class 4 and 5 RBDs except for this proline and these two

RBDs show the largest difference in binding affinity for consecutive classes (nearly 10-fold). It

is plausible that RBDs with more flexible loops are more promiscuous (thereby promoting new

receptor interactions) and that they bind their receptor less tightly, while less flexible loops are

optimized for one specific interaction. Investigation of how RBDs with mutations designed to

modulate loop flexibility interact with homologous receptors of other species could help confirm

this idea.

3.6 Bats are Unique and Potent Agents of Viral Spread

Bats are known to be a vast reservoir for many virus types and transmission from bats to

humans is thought to be responsible for recent outbreaks of Ebola virus, Hendra virus, Nipah

virus, and the coronaviruses responsible for SARS and MERS (Smith et al. 2013). Bats are the

second largest group of mammals by species number and they exist on all continents except

Antarctica (Nowak 1994). Their ability to travel by powered flight increases their potential to

spread disease and their cohabitation with many related species in a single cave has the potential

to lead to vast viral diversity (Willoughby et al. 2017). Bats also demonstrate unique immune

features that lead to infections without discernible symptoms and they may be able to tolerate

high levels of infection without a large impact on fecundity or longevity (Brook et al. 2015).

Moreover, the barriers to cross-species transmission are lowest among closely related species

(Parrish et al. 2008). Taken together, these factors would explain how viral transmission among

bats has led to the vast viral reservoir that exists. Indeed, recent studies have shown that close

relatives to the human coronaviruses HCoV-229E and HCoV-NL63 are currently circulating in

bats and that the most recent common ancestor of bat and human 229E is thought to have existed

just 200 years ago (Pfefferle et al. 2009). The potential for other bat viruses to infect humans and sustain

human-to-human transmission clearly exists.

Chapter 4

Future Directions

4.1 Short Term Goals

4.1.1 Immediate Experiments

With the completion of several additional experiments, a fuller understanding of the

HCoV-229E S protein RBD and its interactions with hAPN, and with the neutralizing antibody

9.8.E12, can be obtained. Four of the six identified Classes have been crystallized with their

receptor, hAPN, all under very similar crystallization conditions. Crystallization trials with the

Class 2 RBD have not yielded crystals, and the Class 6-hAPN crystals obtained have diffracted to

worse than 10 Å resolution. These two classes are good candidates for further optimization, such

as an additive screen. Additives alter the solubility and crystallization properties of the protein

complex and may yield crystals of the Class 2 complex and better-diffracting crystals of the

Class 6 complex. Another structural goal not reached in this project was the determination of the

structure of the Class 1 RBD in complex with the 9.8.E12 antibody. The only data obtained

regarding this antibody’s epitope are that it involves the receptor binding loops and that RBD

residues F318 and N319 are not key to the interaction (Figure 19). Inspection of the Class 1

RBD-hAPN and Class 1 RBD-9.8.E12 structures paired with our knowledge of the Class 2 RBD

sequence will shed light on exactly how RBD mutations were able to abrogate the binding of

antibodies like 9.8.E12. If crystals cannot be obtained, a structure of the HCoV-229E S-protein

ectodomain trimer in complex with the 9.8.E12 Fab might be tackled by cryo-EM analysis.

One unexplored aspect of the HCoV-229E S protein RBD-hAPN interaction is whether

or not the observed variation outside of the binding interface has any effect on binding affinity or

kinetics. RBD variation outside of the interface is able to change properties such as surface

charge, hydrophobicity, and possibly the structure of the receptor binding loops. Indeed, residues

that do not appear to interact in any way can show surprising compensatory or antagonistic

epistatic effects (Holmes 2011, Duan et al. 2014). Mutating these variable residues one at a time,

followed by binding analysis at each step, would be one means by which a role for epistatic

interactions could be tested.
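
One way to quantify such epistasis from binding data is a double-mutant cycle: if two substitutions act independently, their free-energy effects on binding add, and any deviation from additivity is the epistatic term. The sketch below assumes K_D values measured at 298 K; all K_D values and names are hypothetical placeholders rather than measurements from this work.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def binding_dg(kd_molar: float, temperature_k: float = 298.0) -> float:
    """Binding free energy (kcal/mol) from a dissociation constant in molar."""
    return R_KCAL * temperature_k * math.log(kd_molar)


def epistasis(kd_wt: float, kd_a: float, kd_b: float, kd_ab: float,
              temperature_k: float = 298.0) -> float:
    """Deviation from additivity (kcal/mol) in a double-mutant cycle.

    Zero means the two substitutions act independently; a negative value
    means the double mutant binds more tightly than additivity predicts.
    """
    dg_wt = binding_dg(kd_wt, temperature_k)
    ddg_a = binding_dg(kd_a, temperature_k) - dg_wt
    ddg_b = binding_dg(kd_b, temperature_k) - dg_wt
    ddg_ab = binding_dg(kd_ab, temperature_k) - dg_wt
    return ddg_ab - (ddg_a + ddg_b)


# Hypothetical K_D values (molar), for illustration only.
additive = epistasis(kd_wt=1e-6, kd_a=2e-6, kd_b=5e-6, kd_ab=1e-5)
compensatory = epistasis(kd_wt=1e-6, kd_a=2e-6, kd_b=5e-6, kd_ab=1e-6)
print(f"additive case: {additive:.2f} kcal/mol")
print(f"compensatory case: {compensatory:.2f} kcal/mol")
```

In the first hypothetical case the single-mutant effects multiply (2-fold and 5-fold give 10-fold), so the epistatic term is zero; in the second the double mutant restores wild-type affinity, giving a negative, compensatory term.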

4.1.2 Recent Surveillance Has Revealed Additional Human and Animal HCoV-229E RBD Classes

Since this project began in 2015, surveillance of bat and camel populations has shown

that close relatives of HCoV-229E circulate in both of these animals (Corman et al. 2015,

Corman et al. 2016). Phylogenetic analysis of the S protein RBDs of these bat and camel viruses

reveals the existence of at least three bat classes and one camel class (Figure 30). The three bat

classes branch from the phylogenetic tree earlier and appear to be more ancient than the human

or camel RBDs and this relationship is corroborated by a phylogenetic analysis of the RdRp gene

(Corman et al. 2015). The camel RBD is very close in sequence to human Class 2 and this

HCoV-229E related camel virus is able to infect HEK-293 cells that express hAPN (Corman et

al. 2016). Camel APN shares only 11 of the 17 interfacial residues hAPN uses to interact with

the human Class 1 RBD (Figure 1), but the camel RBD is still able to bind and facilitate

infection. No such data exist for the HCoV-229E related bat viruses. Determining the binding

characteristics between human APN and these animal RBDs, and solving the crystal structures of

any complexes that form, could shed light on how zoonosis is accomplished. To this end, I have

produced stable HEK-293S cell lines expressing the three bat and one camel RBDs. Western

blotting indicates that these proteins are expressed at levels comparable to those of their human relatives

(Figure 32A). Purification of the “bat 3” RBD proceeded without problems and both a size

exclusion chromatogram and SDS-PAGE gel indicate a clean, monodisperse sample (Figure

32B).

In addition to the new classes of animal RBDs that have recently been observed, new

HCoV-229E S protein RBD sequences have been deposited and they appear to form a new class.

Studying the receptor binding characteristics and crystal structure of this “Class 7” RBD would

shed further light on the role played by loop variation and immune evasion and add an important

data point to either help confirm or reject the observed trend of increased RBD-hAPN affinity

over time.

Currently, the primary sequences of APN from 71 different species are known. Of most

importance to the future of this project are the sequences from dromedary camels and

Hipposideros bats (accession codes XP_010985051.1 and XP_019495552.1, respectively).

Dromedary camels and Hipposideros bats are known reservoirs of HCoV-229E related viruses

that ostensibly use APN as a receptor. Even though many species of bats share the same habitat

and are in close contact with one another, to date only bats of the genus Hipposideros are known

to harbor HCoV-229E-like viruses (Corman et al. 2015). Only recently has the APN sequence of

a Hipposideros bat been deposited (Dong et al. 2016), enabling relevant structural studies.

Figure 30. Phylogenetic analysis of animal HCoV-229E-related RBDs reveals discrete classes. Shown here is the phylogenetic analysis of bat and camel HCoV-229E-like S protein RBDs. Much like the human RBDs, which are labeled with Roman numerals, they segregate into distinct classes and appear to be closely related to the human virus RBD sequences.

Figure 31. Examination of the H-site of bat and camel APN shows moderate diversity. The sequences of Hipposideros bat APN (bAPN), human APN (hAPN), and dromedary camel APN (dAPN) are shown. Residues that comprise the H-site are highlighted in green. Conserved residues are indicated by an asterisk, semi-conserved residues with dots, and non-conserved residues with blank space. More than 35% of the interfacial residues are not conserved. HCoV-229E-related camel viruses can infect human cells despite the differences at this site, and studies regarding the bat viruses will be informative.

Figure 32. Camel and bat RBDs express well in mammalian tissue culture. A) A Western blot shows that all four animal RBD constructs (bands boxed by dashed line) are expressed well. B) A Superdex-200 chromatogram of the Bat 3 RBD and SDS-PAGE show a clean, monodisperse sample.

4.2 Long Term Goals

4.2.1 Mutations in the Receptor-Binding Loops May Impact Spike Protein Conformational Dynamics

Recent advances in cryo-electron microscopy have enabled visualization of the

full-length S protein trimer of the coronaviruses NL63, MHV, SARS, and MERS (Gui et al. 2016,

Kirchdoerfer et al. 2016, Walls et al. 2016, Yuan et al. 2017). These structures show a high

degree of tertiary structure conservation, so the HCoV-229E S protein trimer is likely to be

similar. The RBD of the S protein is dynamic and in certain conformations the RBD’s receptor

binding loops are inaccessible and buried in the trimer interface (Figure 4). Loop residues make

both intra- and intersubunit contacts and loop variation might affect the equilibrium between the

inaccessible “lying” and accessible “standing” states. The exact biological implications of such

an equilibrium are currently unknown but the relevance to both receptor binding and immune

evasion is apparent. Hiding the highly antigenic receptor binding loops would be a route to

immune evasion but at the same time it prevents receptor binding. The equilibrium ratio of the

lying and standing forms of the trimer may therefore be an important determinant of fitness.

Cryo-EM analysis has been used to study these various states for SARS- and MERS-CoV (Yuan

et al. 2017) and we could use the same approach to determine whether or not the ratio of lying

and standing forms differs among the various HCoV-229E classes.
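
If particle populations from cryo-EM 3D classification are taken as a rough proxy for the solution equilibrium, the ratio of standing to lying forms can be expressed as an apparent equilibrium constant and free-energy difference. The sketch below makes that assumption explicitly; the particle counts are hypothetical, the temperature is an assumed value, and classification bias means any such number would be an estimate at best.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def standing_equilibrium(n_standing: int, n_lying: int,
                         temperature_k: float = 298.0):
    """Return (fraction standing, apparent K_eq, apparent dG in kcal/mol)
    for the lying -> standing transition, from classified particle counts."""
    if n_standing <= 0 or n_lying <= 0:
        raise ValueError("both particle counts must be positive")
    fraction = n_standing / (n_standing + n_lying)
    k_eq = n_standing / n_lying
    dg = -R_KCAL * temperature_k * math.log(k_eq)
    return fraction, k_eq, dg


# Hypothetical particle counts for one RBD class, for illustration only.
fraction, k_eq, dg = standing_equilibrium(n_standing=2500, n_lying=7500)
print(f"fraction standing = {fraction:.2f}, K_eq = {k_eq:.2f}, dG = {dg:.2f} kcal/mol")
```

Comparing the apparent free-energy difference across the six classes, rather than raw fractions, would place any shift in the equilibrium on the same energetic scale as the measured binding affinities.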

The expression of the soluble ectodomain of the coronavirus S protein trimer has been

aided by several approaches. Firstly, a T4 fibritin trimerization motif can be introduced at the

C-terminus of the S-protein ectodomain (Kirchdoerfer et al. 2016). This “foldon” domain acts

synergistically with the natural tendency of the S-protein to form trimers (Papanikolopoulou et

al. 2003). Secondly, a double proline mutation introduced into the loop between the HR1 region

and the central helix of the S-protein can increase expression levels by up to 50-fold, likely

because it prevents the alpha-helical rearrangements that lead to the post-fusion conformation of

the S protein. These mutations have greatly facilitated cryo-EM analysis (Pallesen et al. 2017).

We will use these approaches to stabilize the HCoV-229E trimer for each of the six classes.

Once this is done, we will use cryo-EM to determine whether loop variation influences the ratio of the

lying and standing forms.

4.2.2 A Cell-Based Assay to Test Viral Fitness is Within Reach

Most of the ideas stemming from this thesis are the result of structural or biophysical

analysis and do not have a cell-based assay to support them. While cell-based infectivity assays

have been conducted using the lab strain HCoV-229E P100E isolate (Wong et al. 2017), no

assays on viruses with S proteins containing RBDs other than Class 1 have been attempted. Our

collaborators are currently working on a BacMid system to create HCoV-229E viruses with

custom genomes and thus viruses corresponding to each of the RBD classes. This will allow for

the testing of the fitness of viral variants in cell-based assays examining, for example, the effect

of receptor binding affinity on fitness.

Chapter 5

Methods

5.1 Sequence Comparison of HCoV-229E S-protein RBD

The protein sequence of the lab strain HCoV-229E P100E isolate RBD (residues 293–

435) was used as a template to search the non-redundant protein sequence database using Blastp

(Camacho et al. 2008). Sequences were compiled on December 1, 2016; additional sequences have

since become available. In total, 52 sequences were obtained, with the GenBank Identifier numbers:

NP_073551.1, AAK32188.1, AAK32189.1, AAK32190.1, AAK32191.1, CAA71056.1,

CAA71146.1, CAA71147.1, ADK37701.1, ADK37702.1, ADK37704.1, BAL45637.1,

BAL45638.1, BAL45639.1, BAL45640.1, BAL45641.1, AAQ89995.1, AAQ89999.1,

AAQ90002.1, AAQ90004.1, AAQ90005.1, AAQ90006.1, AAQ90008.1, AFI49431.1,

AFR45554.1, AFR79250.1, AFR79257.1, AGT21338.1, AGT21345.1, AGT21353.1,

AGT21367.1, AGW80932.1, AIG96686.1, ABB90506.1, ABB90507.1, ABB90508.1,

ABB90509.1, ABB90510.1, ABB90513.1, ABB90514.1, ABB90515.1, ABB90516.1,

ABB90519.1, ABB90520.1, ABB90522.1, ABB90523.1, ABB90526.1, ABB90527.1,

ABB90528.1, ABB90529.1, ABB90530.1, AOG74783.1. The 52 sequences were then aligned

using Muscle (Edgar 2004). The protein-coding regions of the eight sequences for which

the entire genome was reported (GenBank Identifier numbers: NC_002645.1, JX503060.1,

JX503061.1, KF514433.1, KF514430.1, KF514432.1, AF304460.1, and KU291448.1) were

aligned using Muscle. The sequence AAK32191.1 was chosen as the representative of Class 1

and the loop sequences of ABB90507.1, ABB90514.1, ABB90519.1, ABB90523.1, and

AFR45554.1 were combined with the non-loop sequences of AAK32191.1 to generate the RBDs

of Classes 2–6, respectively.
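
The construction of the Class 2–6 RBDs amounts to grafting the loop segments of a donor sequence onto the Class 1 scaffold. The following minimal sketch illustrates the idea on aligned sequences; the loop column ranges and the toy sequences are hypothetical placeholders, since the exact loop boundaries are defined elsewhere in this thesis, and the function is an illustration rather than the procedure actually used.

```python
def graft_loops(scaffold: str, donor: str,
                loop_ranges: list[tuple[int, int]]) -> str:
    """Replace loop columns of an aligned scaffold with those of an aligned donor.

    Both sequences must come from the same alignment (equal length, with '-'
    for gaps). Ranges are 1-based, inclusive alignment columns. Gap characters
    are stripped from the returned chimeric sequence.
    """
    if len(scaffold) != len(donor):
        raise ValueError("sequences must be aligned to the same length")
    chimera = list(scaffold)
    for start, end in loop_ranges:
        if not 1 <= start <= end <= len(scaffold):
            raise ValueError(f"invalid loop range: {start}-{end}")
        chimera[start - 1:end] = donor[start - 1:end]
    return "".join(chimera).replace("-", "")


# Toy aligned sequences and loop ranges, for illustration only.
toy_scaffold = "AAAAAAAAAA"
toy_donor = "CCCCCCCCCC"
toy_chimera = graft_loops(toy_scaffold, toy_donor, [(2, 3), (7, 8)])
print(toy_chimera)
```

Working in alignment columns rather than residue numbers lets the same ranges be reused for every donor, including donors whose loops differ in length from the scaffold.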

5.2 Protein Expression and Purification

The soluble ectodomain of hAPN and related mutants (residues 66–967) were expressed

in and purified from stably transfected HEK293S GnT1(-/-) cells (ATCC CRL-3022) as described

previously (Wong et al. 2012). These cells produce glycoproteins containing only high mannose

N-linked glycans (Chang et al. 2007). The six classes of HCoV-229E S-protein RBD and related

mutants were also expressed in and purified from stably transfected HEK293S GnT1(-/-) cells. Point

mutations were generated using the InFusion HD Site-Directed Mutagenesis protocol (Clontech).

In all cases, the target proteins were secreted as N-terminal protein-A fusion proteins with a

Tobacco Etch Virus (TEV) protease cleavage site following the protein-A tag. Harvested media

was concentrated 10-fold and purified by IgG affinity chromatography (IgG Sepharose, GE). The

bound proteins were liberated by on-column TEV protease cleavage and hAPN was further

purified by anion exchange chromatography (HiTrap-Q) while the HCoV-229E RBDs were

further purified by cation exchange chromatography (HiTrap-SP).

5.3 Surface Plasmon Resonance Assays

Surface plasmon resonance assays were performed on the Biacore-X system using CM-5

dextran chips (GE) covalently coupled to the ligand via amine coupling. The surface of the chip

was activated by 0.05 M EDC and 0.2 M NHS, injected with a flow rate of 10 µL/min for seven

minutes. 50 µg/mL of ligand in 50 mM sodium acetate pH 4.95 was injected until an appropriate

amount was covalently linked. For experiments using hAPN as a ligand, 400 RUs were

immobilized, while for the experiments using the 9.8.E12 antibody, 1900 RUs were

immobilized. 1.0 M ethanolamine injected at 10 µL/min for seven minutes was used to cap

unreacted CM-dextran residues and decrease the overall charge of the matrix. The running and

injection buffers were matched and in all cases consisted of 150 mM NaCl, 0.01% Tween-20, 0.1

mg/ml BSA, and 10 mM HEPES at pH 7.5. Response unit (RU) values were measured as a

function of analyte concentration at 298 K. Kinetic analysis was performed using the global

fitting feature of Scrubber 2 (BioLogic Software) assuming a 1:1 binding model.
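
For reference, the 1:1 binding model assumed in the kinetic analysis predicts an exponential approach to equilibrium during analyte injection and a single-exponential decay during dissociation. The sketch below evaluates that textbook model; it is not Scrubber's implementation, it ignores mass transport and bulk refractive index effects, and the rate constants and Rmax used are hypothetical.

```python
import math


def langmuir_response(t: float, conc: float, ka: float, kd: float,
                      rmax: float, t_off: float) -> float:
    """Response (RU) at time t (s) for a 1:1 interaction.

    Analyte at concentration conc (M) is injected from t = 0 to t = t_off,
    after which buffer alone flows. ka is in 1/(M*s) and kd in 1/s.
    """
    k_obs = ka * conc + kd
    r_eq = rmax * ka * conc / k_obs  # equilibrium response at this concentration
    if t <= t_off:
        return r_eq * (1.0 - math.exp(-k_obs * t))
    r_end = r_eq * (1.0 - math.exp(-k_obs * t_off))
    return r_end * math.exp(-kd * (t - t_off))


# Hypothetical parameters: K_D = kd / ka = 100 nM, analyte injected at K_D.
ka, kd, rmax = 1e5, 1e-2, 100.0
plateau = langmuir_response(t=1e4, conc=1e-7, ka=ka, kd=kd, rmax=rmax, t_off=1e4)
half_life = math.log(2) / kd
after_one_half_life = langmuir_response(t=1e4 + half_life, conc=1e-7, ka=ka,
                                        kd=kd, rmax=rmax, t_off=1e4)
print(f"plateau = {plateau:.1f} RU, after one half-life = {after_one_half_life:.1f} RU")
```

Global fitting adjusts a single ka, kd, and Rmax so that this model reproduces the sensorgrams at all analyte concentrations simultaneously; at an analyte concentration equal to K_D the plateau is half of Rmax.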

5.4 Deglycosylation of hAPN

10.0 mg of purified hAPN was deglycosylated by treatment with 0.5 mg endo-β-N-

acetylglucosaminidase A. The reaction solution was 10 mL in volume and the buffer consisted of

100 mM NaCl and 10 mM MES pH 6.5. After 48 hours, the pH was dropped to 5.0 using 100

mM NaAc pH 4.9 and 20 µL of Jack Bean α-mannosidase (Sigma) was introduced. This enzyme

requires Zn²⁺ as a cofactor and ZnSO₄ was added to a final concentration of 1 mM. After a further 48

hours, the reaction was complete as determined by SDS-PAGE and MALDI-TOF analysis.

5.5 Protein Crystallization

Crystals of the Class 3, 4, and 5 S protein RBDs in complex with the hAPN ectodomain

were obtained via the same general method. The RBDs and hAPN were mixed in a 1.2:1 ratio

and the resulting complex was purified by gel filtration using a Superdex 200 column (GE) with

a buffer consisting of 50 mM NaCl and 10 mM HEPES at pH 7.4. The purified complexes were

concentrated to 9.5 mg/ml. 1 µg/ml of endo-β-N-acetylglucosaminidase H was introduced into

this solution to remove N-linked glycans still present on the RBDs. Mixing protein solution with

precipitant consisting of 9% PEG 8000, 1 mM GSSG, 1 mM GSH, 5% glycerol, and 100 mM

MES, pH 6.5 at 298 K in a ratio of 1:1 in hanging drops of 1 µl yielded crystals of low quality

after 48 hours. These crystals were harvested, washed in precipitant solution, and used for

seeding which resulted in higher quality crystals used for diffraction experiments and data

collection.

5.6 Data Collection and Structure Determination

Diffraction data for the Class 3 and 4 RBD complexes were collected at the Canadian

Light Source, Saskatoon, Saskatchewan (Beamline CMCF-08ID-1) at a wavelength of 0.9795 Å.

Data for the Class 5 RBD complex were collected at the Advanced Photon Source. Use of the

Advanced Photon Source at the Argonne National Laboratory was supported by the U.S.

Department of Energy, Office of Science, Office of Basic Energy Sciences, under Contract No.

DE-AC02-06CH11357. Data were merged, processed, and scaled using HKL-2000 (Otwinowski

et al. 1997). 5% of the data set was used for the calculation of Rfree for the Class 4 and 5 RBD

complex structures, while 1% of the data were excluded for the Rfree calculation for the Class 3

structure. Phases were obtained by molecular replacement with Phaser in Phenix (Bunkóczi et al.

2013), using hAPN as the search model (PDB ID: 4FYQ). Manual building of the HCoV-229E

RBD and hAPN was performed using COOT (Emsley et al. 2010). Refinement was carried out using Phenix.refine

(Afonine et al. 2012). Data collection and refinement statistics are found in Table 5.
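
The role of the excluded reflections can be illustrated with the standard R-factor definition, R = Σ| |Fobs| − |Fcalc| | / Σ|Fobs|, evaluated separately over the working set and the free set. The sketch below is a simplified illustration with synthetic amplitudes; in practice the free-set flags and scaling are handled by the crystallographic software, and the function names and random seed here are assumptions.

```python
import random


def split_free_set(n_reflections: int, free_fraction: float = 0.05,
                   seed: int = 0) -> set[int]:
    """Randomly flag a fraction of reflection indices as the free (test) set."""
    rng = random.Random(seed)
    n_free = round(n_reflections * free_fraction)
    return set(rng.sample(range(n_reflections), n_free))


def r_factor(f_obs: list[float], f_calc: list[float], indices) -> float:
    """R = sum| |Fobs| - |Fcalc| | / sum|Fobs| over the given reflections.

    Assumes f_calc has already been placed on the same scale as f_obs.
    """
    indices = list(indices)
    numerator = sum(abs(abs(f_obs[i]) - abs(f_calc[i])) for i in indices)
    denominator = sum(abs(f_obs[i]) for i in indices)
    return numerator / denominator


# Synthetic amplitudes, for illustration only.
n = 1000
f_obs = [100.0] * n
f_calc = [90.0] * n
free = split_free_set(n, free_fraction=0.05)
work = set(range(n)) - free
r_work = r_factor(f_obs, f_calc, work)
r_free = r_factor(f_obs, f_calc, free)
print(f"{len(free)} free reflections, Rwork = {r_work:.3f}, Rfree = {r_free:.3f}")
```

Because the free reflections are never used in refinement, a growing gap between Rfree and Rwork is a warning sign of overfitting; a smaller free set, as used for the Class 3 structure, leaves more data for refinement at the cost of a noisier Rfree estimate.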

Bibliography

Afonine, P., et al. “Towards automated crystallographic structure refinement with phenix.refine.” (2012). Acta Cryst. D68, 352-367.

Anthony, Simon J., et al. “Global Patterns in Coronavirus Diversity.” Virus Evolution, vol. 3, no. 1, 2017, doi:10.1093/ve/vex012.

Bae, Se-Eun, and Hyeon Son. “Classification of Viral Zoonosis through Receptor Pattern Analysis.” BMC Bioinformatics, vol. 12, no. 1, 2011, p. 96., doi:10.1186/1471-2105-12-96.

Belay, Ermias D., and Stephan S. Monroe. “Low-Incidence, High-Consequence Pathogens.” Emerging Infectious Diseases, vol. 20, no. 2, 2014, pp. 319–321., doi:10.3201/eid2002.131748.

Belouzard, Sandrine, et al. “Mechanisms of Coronavirus Cell Entry Mediated by the Viral Spike Protein.” Viruses, vol. 4, no. 6, 2012, pp. 1011–1033., doi:10.3390/v4061011.

Bertram, S., et al. “TMPRSS2 Activates the Human Coronavirus 229E for Cathepsin-Independent Host Cell Entry and Is Expressed in Viral Target Cells in the Respiratory Epithelium.” Journal of Virology, vol. 87, no. 11, 2013, pp. 6150–6160., doi:10.1128/jvi.03372-12.

Boni, Maciej F. “Vaccination and Antigenic Drift in Influenza.” Vaccine, vol. 26, 2008, doi:10.1016/j.vaccine.2008.04.011.

Bos, Evelyne C.W., et al. “The Production of Recombinant Infectious DI-Particles of a Murine Coronavirus in the Absence of Helper Virus.” Virology, vol. 218, no. 1, 1996, pp. 52–60., doi:10.1006/viro.1996.0165.

Briese, T., et al. “Middle East Respiratory Syndrome Coronavirus Quasispecies That Include Homologues of Human Isolates Revealed through Whole-Genome Analysis and Virus Cultured from Dromedary Camels in Saudi Arabia.” MBio, vol. 5, no. 3, 2014, doi:10.1128/mbio.01146-14.

Brook, Cara E., and Andrew P. Dobson. “Bats as ‘Special’ Reservoirs for Emerging Zoonotic Pathogens.” Trends in Microbiology, vol. 23, no. 3, 2015, pp. 172–180., doi:10.1016/j.tim.2014.12.004.

Bunkóczi, G., et al. “Phaser.MRage: automated molecular replacement” Acta Crystallogr D Biol Crystallogr 69, 2276-86 (2013).

Cabeca, Tatiane K., et al. “Epidemiological and Clinical Features of Human Coronavirus Infections among Different Subsets of Patients.” Influenza and Other Respiratory Viruses, vol. 7, no. 6, 2013, pp. 1040–1047., doi:10.1111/irv.12101.

Calisher, C. H., et al. “Bats: Important Reservoir Hosts of Emerging Viruses.” Clinical Microbiology Reviews, vol. 19, no. 3, 2006, pp. 531–545., doi:10.1128/cmr.00017-06.

Callebaut, P., et al. “An Adenovirus Recombinant Expressing the Spike Glycoprotein of Porcine Respiratory Coronavirus Is Immunogenic in Swine.” Journal of General Virology, vol. 77, no. 2, Jan. 1996, pp. 309–313., doi:10.1099/0022-1317-77-2-309.

Camacho C., Coulouris G., Avagyan V., Ma N., Papadopoulos J., Bealer K., & Madden T.L. (2008) “BLAST+: architecture and applications.” BMC Bioinformatics 10:421.

Chan, Jasper F. W., et al. “Middle East Respiratory Syndrome Coronavirus: Another Zoonotic Betacoronavirus Causing SARS-Like Disease.” Clinical Microbiology Reviews, vol. 28, no. 2, 2015, pp. 465–522., doi:10.1128/cmr.00102-14.

Chan, W.-E., et al. “Functional Characterization of Heptad Repeat 1 and 2 Mutants of the Spike Protein of Severe Acute Respiratory Syndrome Coronavirus.” Journal of Virology, vol. 80, no. 7, 2006, pp. 3225–3237., doi:10.1128/jvi.80.7.3225-3237.2006.

Chang VT, Crispin M, Aricescu AR, et al. “Glycoprotein Structural Genomics: Solving the Glycosylation Problem.” Structure. 2007;15(3):267-273. doi:10.1016/j.str.2007.01.011.

Chaux, Nicole De La, et al. “DNA Indels in Coding Regions Reveal Selective Constraints on Protein Evolution in the Human Lineage.” BMC Evolutionary Biology, vol. 7, no. 1, 2007, p. 191., doi:10.1186/1471-2148-7-191.

Chowell, Gerardo, et al. “Transmission Characteristics of MERS and SARS in the Healthcare Setting: a Comparative Study.” BMC Medicine, vol. 13, no. 1, 2015, doi:10.1186/s12916-015-0450-0.

Chernomordik, Leonid V., and Michael M. Kozlov. “Protein-Lipid Interplay in Fusion and Fission of Biological Membranes.” Annual Review of Biochemistry, vol. 72, no. 1, 2003, pp. 175–207., doi:10.1146/annurev.biochem.72.121801.16150

Coffey L.L., Beeharry Y., Borderia A.V., Blanc H., Vignuzzi M. “Arbovirus high fidelity variant loses fitness in mosquitoes and mice.” Proceedings of the National Academy of Sciences 2011, 108:16038-16043.

Coffin, J., and R. Swanstrom. “HIV Pathogenesis: Dynamics and Genetics of Viral Populations and Infected Cells.” Cold Spring Harbor Perspectives in Medicine, vol. 3, no. 1, Jan. 2013, doi:10.1101/cshperspect.a012526.

Colman, Peter M. and Lawrence, Michael C. “The Structural Biology of Type I Viral Membrane Fusion.” Nature Reviews Molecular Cell Biology, vol. 4, no. 4, 2003, pp. 309–319., doi:10.1038/nrm1076.

Corman, Victor M., et al. “Evidence for an Ancestral Association of Human Coronavirus 229E with Bats.” Journal of Virology, vol. 89, no. 23, 2015, pp. 11858–11870., doi:10.1128/jvi.01755-15.

Corman, Victor M., et al. “Link of a Ubiquitous Human Coronavirus to Dromedary Camels.” Proceedings of the National Academy of Sciences, vol. 113, no. 35, 2016, pp. 9864–9869., doi:10.1073/pnas.1604472113.

Corti, Davide, and Antonio Lanzavecchia. “Broadly Neutralizing Antiviral Antibodies.” Annual Review of Immunology, vol. 31, no. 1, 2013, pp. 705–742., doi:10.1146/annurev-immunol-032712-095916.

Crossley, Beate, et al. “Identification and Characterization of a Novel Alpaca Respiratory Coronavirus Most Closely Related to the Human Coronavirus 229E.” Viruses, vol. 4, no. 12, 2012, pp. 3689–3700., doi:10.3390/v4123689.

Dai, H.-S., et al. “Directed Evolution of a Virus Exclusively Utilizing Human Epidermal Growth Factor Receptor as the Entry Receptor.” Journal of Virology, vol. 87, no. 20, 2013, pp. 11231–11243., doi:10.1128/jvi.01054-13.

Darwin, Charles. “On the Origin of Species by Means of Natural Selection, Or, the Preservation of Favoured Races in the Struggle for Life.” London: J. Murray, 1859. Print.

de Groot, R. J., et al. “Middle East Respiratory Syndrome Coronavirus (MERS-CoV): Announcement of the Coronavirus Study Group.” Journal of Virology, vol. 87, no. 14, 2013, pp. 7790–7792., doi:10.1128/jvi.01244-13.

Desforges, M. et al. “Neuroinvasive and neurotropic human respiratory coronaviruses: potential neurovirulent agents in humans.” Advances in Experimental Medicine and Biology 807, 75–96 (2014).

Dong, Dong, et al. “The Genomes of Two Bat Species with Long Constant Frequency Echolocation Calls.” Molecular Biology and Evolution, vol. 34, no. 1, 2016, pp. 20–34., doi:10.1093/molbev/msw231.

Du L, He Y, Zhou Y, Liu S, Zheng B-J, Jiang S. “The spike protein of SARS-CoV — a target for vaccine and therapeutic development.” Nature Reviews Microbiology. 2009;7(3):226-236. doi:10.1038/nrmicro2090.

Du, Lanying, et al. “Introduction of Neutralizing Immunogenicity Index to the Rational Design of MERS Coronavirus Subunit Vaccines.” Nature Communications, vol. 7, 2016, p. 13473., doi:10.1038/ncomms13473.

Duan, Susu, et al. “Epistatic Interactions between Neuraminidase Mutations Facilitated the Emergence of the Oseltamivir-Resistant H1N1 Influenza Viruses.” Nature Communications, vol. 5, 2014, p. 5029., doi:10.1038/ncomms6029.

Earp, L. J., et al. “The Many Mechanisms of Viral Membrane Fusion Proteins.” Current Topics in Microbiology and Immunology Membrane Trafficking in Viral Replication, 2005 pp. 25–66., doi:10.1007/3-540-26764-6_2.

Edgar, R. C. “MUSCLE: multiple sequence alignment with high accuracy and high throughput.” Nucleic Acids Research. 32, 1792–1797 (2004).

Emsley, P., et al. “Features and Development of Coot.” Acta Crystallographica Section D Biological Crystallography, vol. 66, no. 4, 2010, pp. 486–501., doi:10.1107/s0907444910007493.

Farsani, Seyed Mohammad Jazaeri, et al. “The First Complete Genome Sequences of Clinical Isolates of Human Coronavirus 229E.” Virus Genes, vol. 45, no. 3, 2012, pp. 433–439., doi:10.1007/s11262-012-0807-9.

Fehr, Anthony R., and Stanley Perlman. “Coronaviruses: An Overview of Their Replication and Pathogenesis.” Coronaviruses Methods in Molecular Biology, 2015, pp. 1–23., doi:10.1007/978-1-4939-2438-7_1.

Fendrick, A. Mark, et al. “The Economic Burden of Non-Influenza-Related Viral Respiratory Tract Infection in the United States.” Archives of Internal Medicine, vol. 163, no. 4, 2003, p. 487., doi:10.1001/archinte.163.4.487.

Fine, Paul E. M. “Herd Immunity: History, Theory, Practice.” Epidemiologic Reviews, vol. 15, no. 2, 1993, pp. 265–302., doi:10.1093/oxfordjournals.epirev.a036121.

Gaunt, E. R. et al. “Epidemiology and clinical presentations of the four human coronaviruses 229E, HKU1, NL63, and OC43 detected over 3 years using a novel multiplex real-time PCR method.” J. Clin. Microbiol. 48, 2940–2947 (2010).

Gortazar, Christian, et al. “Crossing the Interspecies Barrier: Opening the Door to Zoonotic Pathogens.” PLoS Pathogens, vol. 10, no. 6, 2014, doi:10.1371/journal.ppat.1004129.

Grenfell, B. T. “Unifying the Epidemiological and Evolutionary Dynamics of Pathogens.” Science, vol. 303, no. 5656, 2004, pp. 327–332., doi:10.1126/science.1090727.

Gui, Miao, et al. “Cryo-Electron Microscopy Structures of the SARS-CoV Spike Glycoprotein Reveal a Prerequisite Conformational State for Receptor Binding.” Cell Research, vol. 27, no. 1, 2016, pp. 119–129., doi:10.1038/cr.2016.152.

Gupta, R & Jung, E & Brunak, Søren. (2004). “Prediction of N-glycosylation sites in human proteins.” 46. 203-206.

Haan, Cornelis A.M. De, and Peter J.M. Rottier. “Molecular Interactions in the Assembly of Coronaviruses.” Advances in Virus Research Virus Structure and Assembly, 2005, pp. 165–230., doi:10.1016/s0065-3527(05)64006-7.

Harrison, Stephen C. “Viral Membrane Fusion.” Nature Structural & Molecular Biology, vol. 15, no. 7, 2008, pp. 690–698., doi:10.1038/nsmb.1456.

Hasegawa, K., et al. “Affinity Thresholds for Membrane Fusion Triggering by Viral Glycoproteins.” Journal of Virology, vol. 81, no. 23, 2007, pp. 13149–13157., doi:10.1128/jvi.01415-07.

He, Y., et al. “Receptor-Binding Domain of Severe Acute Respiratory Syndrome Coronavirus Spike Protein Contains Multiple Conformation-Dependent Epitopes That Induce Highly Potent Neutralizing Antibodies.” The Journal of Immunology, vol. 174, no. 8, 2005, pp. 4908–4915., doi:10.4049/jimmunol.174.8.4908.

Hensley, S. E., et al. “Hemagglutinin Receptor Binding Avidity Drives Influenza A Virus Antigenic Drift.” Science, vol. 326, no. 5953, 2009, pp. 734–736., doi:10.1126/science.1178258.

Hofmann, Heike, et al. “Attachment Factor and Receptor Engagement of Sars Coronavirus and Human Coronavirus NL63.” Advances in Experimental Medicine and Biology The Nidoviruses, 2006, pp. 219–227., doi:10.1007/978-0-387-33012-9_37.

Holm L, Sander C. 1998. “Touring protein fold space with Dali/FSSP.” Nucleic Acids Research. 26:316–319

Holmes, Edward C. “Error Thresholds and the Constraints to RNA Virus Evolution.” Trends in Microbiology, vol. 11, no. 12, 2003, pp. 543–546., doi:10.1016/j.tim.2003.10.006.

Holmes, Edward C. “The Evolution and Emergence of RNA Viruses.” Oxford University Press, 2011.

Imamura, Hiroshi, and Shinya Honda. “Calibration-Free Concentration Analysis for an Analyte Prone to Self-Association.” Analytical Biochemistry, vol. 516, 2017, pp. 61–64., doi:10.1016/j.ab.2016.10.013.

Keele, B. F., et al. “Identification and Characterization of Transmitted and Early Founder Virus Envelopes in Primary HIV-1 Infection.” Proceedings of the National Academy of Sciences, vol. 105, no. 21, 2008, pp. 7552–7557., doi:10.1073/pnas.0802203105.

Kielian, Margaret, and Rey, Félix A. “Virus Membrane-Fusion Proteins: More than One Way to Make a Hairpin.” Nature Reviews Microbiology, vol. 4, no. 1, 2006, pp. 67–76., doi:10.1038/nrmicro1326.

Kielian, Margaret. “Mechanisms of Virus Membrane Fusion Proteins.” Annual Review of Virology, vol. 1, no. 1, Mar. 2014, pp. 171–189., doi:10.1146/annurev-virology-031413-085521.

Kim, Young B., et al. “Immunogenicity and Ability of Variable Loop-Deleted Human Immunodeficiency Virus Type 1 Envelope Glycoproteins to Elicit Neutralizing Antibodies.” Virology, vol. 305, no. 1, 2003, pp. 124–137., doi:10.1006/viro.2002.1727.

Kirchdoerfer, R.n., et al. “Prefusion Structure of a Human Coronavirus Spike Protein.” Feb. 2016, doi:10.2210/pdb5i08/pdb.

Koonin, Eugene V, and Valerian V Dolja. “Expanding Networks of RNA Virus Evolution.” BMC Biology, vol. 10, no. 1, 2012, p. 54., doi:10.1186/1741-7007-10-54.

Kryazhimskiy, Sergey, et al. “Prevalence of Epistasis in the Evolution of Influenza A Surface Proteins.” PLoS Genetics, vol. 7, no. 2, 2011, doi:10.1371/journal.pgen.1001301.

Li, F. “Evidence for a Common Evolutionary Origin of Coronavirus Spike Protein Receptor-Binding Subunits.” Journal of Virology, vol. 86, no. 5, 2011, pp. 2856–2858., doi:10.1128/jvi.06882-11.

Li, Fang. “Structure, Function, and Evolution of Coronavirus Spike Proteins.” Annual Review of Virology, vol. 3, no. 1, 2016, pp. 237–261., doi:10.1146/annurev-virology-110615-042301.

Li, W., et al. “Animal Origins of the Severe Acute Respiratory Syndrome Coronavirus: Insight from ACE2-S-Protein Interactions.” Journal of Virology, vol. 80, no. 9, Dec. 2006, pp. 4211–4219., doi:10.1128/jvi.80.9.4211-4219.2006.

Li, Xiang, et al. “Protein-Protein Interactions: Hot Spots and Structurally Conserved Residues Often Locate in Complemented Pockets That Pre-Organized in the Unbound States: Implications for Docking.” Journal of Molecular Biology, vol. 344, no. 3, 2004, pp. 781–795., doi:10.1016/j.jmb.2004.09.051.

Li, Wenhui, et al. “Receptor and Viral Determinants of SARS-Coronavirus Adaptation to Human ACE2.” The EMBO Journal, vol. 24, no. 8, 2005, pp. 1634–1643., doi:10.1038/sj.emboj.7600640.

Li, Z., et al. “Simple PiggyBac Transposon-Based Mammalian Cell Expression System for Inducible Protein Production.” Proceedings of the National Academy of Sciences, vol. 110, no. 13, 2013, pp. 5004–5009., doi:10.1073/pnas.1218620110.

Lin, Han-Xin, et al. “Characterization of the Spike Protein of Human Coronavirus NL63 in Receptor Binding and Pseudotype Virus Entry.” Virus Research, vol. 160, no. 1-2, 2011, pp. 283–293., doi:10.1016/j.virusres.2011.06.029.

Lin, Xian-Dan, et al. “Extensive Diversity of Coronaviruses in Bats from China.” Virology, vol. 507, 2017, pp. 1–10., doi:10.1016/j.virol.2017.03.019.

Masters, Paul S. “Reverse Genetics of the Largest RNA Viruses.” Advances in Virus Research, vol. 53, 1999, pp. 245–264., doi:10.1016/s0065-3527(08)60351-6.

Millet, Jean Kaoru, and Gary R. Whittaker. “Host Cell Proteases: Critical Determinants of Coronavirus Tropism and Pathogenesis.” Virus Research, vol. 202, 2015, pp. 120–134., doi:10.1016/j.virusres.2014.11.021.

Milne, R. S. B., et al. “Glycoprotein D Receptor-Dependent, Low-pH-Independent Endocytic Entry of Herpes Simplex Virus Type 1.” Journal of Virology, vol. 79, no. 11, Dec. 2005, pp. 6655–6663., doi:10.1128/jvi.79.11.6655-6663.2005.

Nowak, Ronald M. “Walker's Bats of the World.” Johns Hopkins University Press, 1994.

Otwinowski, Z., and W. Minor. “Processing of X-Ray Diffraction Data Collected in Oscillation Mode.” Methods in Enzymology, vol. 276: Macromolecular Crystallography, part A, edited by C. W. Carter, Jr. and R. M. Sweet, Academic Press (New York), 1997, pp. 307–326.

Pei, J., and N. V. Grishin. “AL2CO: Calculation of Positional Conservation in a Protein Sequence Alignment.” Bioinformatics, vol. 17, no. 8, 2001, pp. 700–712., doi:10.1093/bioinformatics/17.8.700.

Pallesen, Jesper, et al. “Immunogenicity and Structures of a Rationally Designed Prefusion MERS-CoV Spike Antigen.” Proceedings of the National Academy of Sciences, vol. 114, no. 35, 2017, doi:10.1073/pnas.1707304114.

Papanikolopoulou, Katerina, et al. “Formation of Highly Stable Chimeric Trimers by Fusion of an Adenovirus Fiber Shaft Fragment with the Foldon Domain of Bacteriophage T4 Fibritin.” Journal of Biological Chemistry, vol. 279, no. 10, 2003, pp. 8991–8998., doi:10.1074/jbc.m311791200.

Parrish, C. R., et al. “Cross-Species Virus Transmission and the Emergence of New Epidemic Diseases.” Microbiology and Molecular Biology Reviews, vol. 72, no. 3, Jan. 2008, pp. 457–470., doi:10.1128/mmbr.00004-08.

Perelson, Alan S. “Modelling Viral And Immune System Dynamics.” Nature Reviews Immunology, vol. 2, no. 1, 2002, pp. 28–36., doi:10.1038/nri700.

Pettersen, E. F., et al. “UCSF Chimera: A Visualization System for Exploratory Research and Analysis.” Journal of Computational Chemistry, vol. 25, 2004, pp. 1605–1612.

Pfefferle, Susanne, et al. “Distant Relatives of Severe Acute Respiratory Syndrome Coronavirus and Close Relatives of Human Coronavirus 229E in Bats, Ghana.” Emerging Infectious Diseases, vol. 15, no. 9, 2009, pp. 1377–1384., doi:10.3201/eid1509.090224.

Plowright, Raina K., et al. “Pathways to Zoonotic Spillover.” Nature Reviews Microbiology, vol. 15, no. 8, 2017, pp. 502–510., doi:10.1038/nrmicro.2017.45.

Reed, Sylvia E. “The Behaviour of Recent Isolates of Human Respiratory Coronavirus in Vitro and in Volunteers: Evidence of Heterogeneity among 229E-Related Strains.” Journal of Medical Virology, vol. 13, no. 2, 1984, pp. 179–192., doi:10.1002/jmv.1890130208.

Reguera, Juan, et al. “Structural Bases of Coronavirus Attachment to Host Aminopeptidase N and Its Inhibition by Neutralizing Antibodies.” PLoS Pathogens, vol. 8, no. 8, Feb. 2012, doi:10.1371/journal.ppat.1002859.

Richard, Mathilde, et al. “Factors Determining Human-to-Human Transmissibility of Zoonotic Pathogens via Contact.” Current Opinion in Virology, vol. 22, 2017, pp. 7–12., doi:10.1016/j.coviro.2016.11.004.

Sanjuan, R., et al. “Epistasis and the Adaptability of an RNA Virus.” Genetics, vol. 170, no. 3, 2005, pp. 1001–1008., doi:10.1534/genetics.105.040741.

Sanjuan, R. “Viral Mutation Rates.” Virus Evolution: Current Research and Future Directions, 2016, pp. 1–28., doi:10.21775/9781910190234.01.

Schuck, Peter, and Huaying Zhao. “The Role of Mass Transport Limitation and Surface Heterogeneity in the Biophysical Characterization of Macromolecular Binding Processes by SPR Biosensing.” Methods in Molecular Biology Surface Plasmon Resonance, 2010, pp. 15–54., doi:10.1007/978-1-60761-670-2_2.

Shiroishi, Mitsunori, et al. “Structural Consequences of Mutations in Interfacial Tyr Residues of a Protein Antigen-Antibody Complex.” Journal of Biological Chemistry, vol. 282, no. 9, 2006, pp. 6783–6791., doi:10.1074/jbc.m605197200.

Sitbon, Einat, and Shmuel Pietrokovski. “Occurrence of Protein Structure Elements in Conserved Sequence Regions.” BMC Structural Biology, vol. 7, no. 1, 2007, p. 3., doi:10.1186/1472-6807-7-3.

Smith EC, Sexton NR, Denison MR. “Thinking outside the triangle: replication fidelity of the largest RNA viruses.” The Annual Review of Virology. 2014; 1: 111–132. https://doi.org/10.1146/annurev-virology-031413-085507 PMID: 2695871712.

Smith, Everett Clinton. “The Not-so-Infinite Malleability of RNA Viruses: Viral and Cellular Determinants of RNA Virus Mutation Rates.” PLOS Pathogens, vol. 13, no. 4, 2017, doi:10.1371/journal.ppat.1006254.

Smith, Ina, and Lin-Fa Wang. “Bats and Their Virome: an Important Source of Emerging Viruses Capable of Infecting Humans.” Current Opinion in Virology, vol. 3, no. 1, 2013, pp. 84–91., doi:10.1016/j.coviro.2012.11.006.

Snijder, Eric J., et al. “Unique and Conserved Features of Genome and Proteome of SARS-Coronavirus, an Early Split-off From the Coronavirus Group 2 Lineage.” Journal of Molecular Biology, vol. 331, no. 5, 2003, pp. 991–1004., doi:10.1016/s0022-2836(03)00865-9.

Söllner, Thomas, et al. “A Protein Assembly-Disassembly Pathway in Vitro That May Correspond to Sequential Steps of Synaptic Vesicle Docking, Activation, and Fusion.” Cell, vol. 75, no. 3, 1993, pp. 409–418., doi:10.1016/0092-8674(93)90376-2.

Skehel, John J., and Don C. Wiley. “Receptor Binding and Membrane Fusion in Virus Entry: The Influenza Hemagglutinin.” Annual Review of Biochemistry, vol. 69, no. 1, 2000, pp. 531–569., doi:10.1146/annurev.biochem.69.1.531.

Steinhauer, David A., et al. “Lack of Evidence for Proofreading Mechanisms Associated with an RNA Virus Polymerase.” Gene, vol. 122, no. 2, 1992, pp. 281–288., doi:10.1016/0378-1119(92)90216-c.

Tang, X.-C., et al. “Identification of Human Neutralizing Antibodies against MERS-CoV and Their Role in Virus Adaptive Evolution.” Proceedings of the National Academy of Sciences, vol. 111, no. 19, 2014, doi:10.1073/pnas.1402074111.

Tóth-Petróczy, Ágnes, and Dan S. Tawfik. “Protein Insertions and Deletions Enabled by Neutral Roaming in Sequence Space.” Molecular Biology and Evolution, vol. 30, no. 4, 2013, pp. 761–771., doi:10.1093/molbev/mst003.

Tokuriki, Nobuhiko, et al. “How Protein Stability and New Functions Trade Off.” PLoS Computational Biology, vol. 4, no. 2, 2008, doi:10.1371/journal.pcbi.1000002.

Tusell, S. M., et al. “Mutational Analysis of Aminopeptidase N, a Receptor for Several Group 1 Coronaviruses, Identifies Key Determinants of Viral Host Range.” Journal of Virology, vol. 81, no. 3, Aug. 2006, pp. 1261–1273., doi:10.1128/jvi.01510-06.

Ugolini, Sophie, et al. “HIV-1 Attachment: Another Look.” Trends in Microbiology, vol. 7, no. 4, 1999, pp. 144–149., doi:10.1016/s0966-842x(99)01474-2.

Visher, Elisa, et al. “The Mutational Robustness of Influenza A Virus.” PLOS Pathogens, vol. 12, no. 8, 2016, doi:10.1371/journal.ppat.1005856.

Volz, Erik M., et al. “Viral Phylodynamics.” PLoS Computational Biology, vol. 9, no. 3, 2013, doi:10.1371/journal.pcbi.1002947.

Walls, A.C., et al. “Cryo-Electron Microscopy Structure of a Coronavirus Spike Glycoprotein Trimer.” Mar. 2016a, doi:10.2210/pdb3jcl/pdb.

Walls, A.C., et al. “Glycan Shield and Epitope Masking of a Coronavirus Spike Protein Observed by Cryo-Electron Microscopy.” 2016b, doi:10.2210/pdb5szs/pdb.

White JM, Delos SE, Brecher M, Schornberg K. Structures and Mechanisms of Viral Membrane Fusion Proteins: Multiple Variations on a Common Theme. Critical reviews in biochemistry and molecular biology. 2008;43(3):189-219. doi:10.1080/10409230802058320.

Willoughby, Anna, et al. “A Comparative Analysis of Viral Richness and Viral Sharing in Cave-Roosting Bats.” Diversity, vol. 9, no. 3, 2017, p. 35., doi:10.3390/d9030035.

Wong, Alan H. M., et al. “The X-Ray Crystal Structure of Human Aminopeptidase N Reveals a Novel Dimer and the Basis for Peptide Processing.” Journal of Biological Chemistry, vol. 287, no. 44, 2012, pp. 36804–36813., doi:10.1074/jbc.m112.398842.

Wong, Alan H. M., Tomlinson, Aidan C.A., et al. “Receptor-Binding Loops in Alphacoronavirus Adaptation and Evolution.” Nature Communications, vol. 8, no. 1, 2017, doi:10.1038/s41467-017-01706-x.

Woo, Patrick C. Y., et al. “Coronavirus Diversity, Phylogeny and Interspecies Jumping.” Experimental Biology and Medicine, vol. 234, no. 10, 2009, pp. 1117–1127., doi:10.3181/0903-mr-94.

Woo, P. C. Y., et al. “Discovery of Seven Novel Mammalian and Avian Coronaviruses in the Genus Deltacoronavirus Supports Bat Coronaviruses as the Gene Source of Alphacoronavirus and Betacoronavirus and Avian Coronaviruses as the Gene Source of Gammacoronavirus and Deltacoronavirus.” Journal of Virology, vol. 86, no. 7, 2012, pp. 3995–4008., doi:10.1128/jvi.06540-11.

Wu, K., et al. “Crystal Structure of NL63 Respiratory Coronavirus Receptor-Binding Domain Complexed with Its Human Receptor.” 2009, doi:10.2210/pdb3kbh/pdb.

Yang, Yang, et al. “Two Mutations Were Critical for Bat-to-Human Transmission of Middle East Respiratory Syndrome Coronavirus.” Journal of Virology, vol. 89, no. 17, Oct. 2015, pp. 9119–9123., doi:10.1128/jvi.01279-15.

Yuan, Yuan, et al. “Cryo-EM Structures of MERS-CoV and SARS-CoV Spike Glycoproteins Reveal the Dynamic Receptor Binding Domains.” Nature Communications, vol. 8, Oct. 2017, p. 15092., doi:10.1038/ncomms15092.

Zeng, Fanya, et al. “Quantitative Comparison of the Efficiency of Antibodies against S1 and S2 Subunit of SARS Coronavirus Spike Protein in Virus Neutralization and Blocking of Receptor Binding: Implications for the Functional Roles of S2 Subunit.” FEBS Letters, vol. 580, no. 24, Dec. 2006, pp. 5612–5620., doi:10.1016/j.febslet.2006.08.085.

Zumla, A., Chan, J.F., Azhar, E.I., Hui, D.S. & Yuen, K.Y. “Coronaviruses: drug discovery and therapeutic options.” Nature Reviews Drug Discovery 15, 327–347 (2016)
