Coronavirus Evolution and Immune Evasion
by
Aidan Tomlinson
A thesis submitted in conformity with the requirements for the degree of Master of Science
Department of Biochemistry University of Toronto
© Copyright Aidan Tomlinson 2018
Abstract
Coronaviruses are emerging pathogens that threaten human health and prosperity. Each
year, hundreds of millions of people are infected with continually circulating coronaviruses that
cause the common cold in healthy individuals and kill the most vulnerable of us. Coronaviruses
adapt to environmental change at a remarkable rate. The HCoV-229E coronavirus has adapted
and evolved over the last 50 years by mutating residues in the receptor-binding loops of the
receptor binding domain (RBD) of its spike (S) protein. These sequences phylogenetically
segregate into Classes whose viruses have successively replaced one another in the human
population. These Classes possess different receptor (hAPN) and antibody binding
characteristics, and the crystal structures of RBD-hAPN complexes have been solved. Structural
insights into the ever-changing RBD and its interaction with hAPN show that the use of loops
lacking secondary structure facilitates tremendous structural variability, a trait that likely enables
changes in viral fitness and immune evasion.
Acknowledgments
Because this opus bears but my name, I must reveal and identify the many other
fingerprints scattered throughout its pages in the hopes that the tangible and intangible aid I’ve
received the last few years does not go forgotten.
First and foremost, this work would not have been completed without the rarely
complimentary, but always complementary, guidance of Dr. James Rini. A great teacher and a
better learner, he raises an excellent standard that inspires those around him. I would also like to
thank my committee members Dr. Jean-Phillipe Julien and Dr. Scott Gray-Owen for their
tremendous help with the formation of this thesis and for helping nurture my scientific
communication skills.
A very special thanks must be given to Dr. Alan Wong who welcomed me into this
project with open arms. He has been a mentor and a role model. His prior work laid a solid and
level foundation which was ripe to build upon.
Dongxia Zhou and Malathy Satkunarajah deserve all the thanks in the world for their
indispensable and expert technical help. Dr. Zhijie Li, the handiest and most knowledgeable man
in the world, must be thanked for illuminating conversations and his ubiquitous 3D-printed
creations. I must also thank Kristina Han for moral support in the lab and on the softball field
and Nathan Doner for late night commiseration and debate.
Last but far from least, I need to thank my parents and sister along with the rest of my
family and friends for making my studies and work all worthwhile and for keeping me sharp
even when I’m away from the lab.
Table of Contents
Acknowledgments
Table of Contents
List of Tables
List of Figures
Chapter 1
Introduction
1.1 Diseases Caused by Coronaviruses
1.2 Life Cycle of Coronaviruses
1.3 Fusion Proteins
1.4 Anatomy and Function of the Spike Protein
1.5 Evolution of Coronavirus Diversity
1.6 Binding of Receptor by Coronaviral RBDs
1.7 The Engine of Genetic Diversity
1.8 Environmental Pressures Promote Coronaviral Adaptation
1.9 HCoV-229E as a Model for Coronavirus Adaptation and Evolution
1.10 Rationale
Chapter 2
Results
2.1 Biophysical Characterization of HCoV-229E Spike Protein RBD Interaction with Its Receptor and Neutralizing Antibodies
2.1.1 HCoV-229E RBDs Cluster into Six Phylogenetic Classes.
2.1.2 Variation in Receptor-Binding Loops Changes Receptor-Binding Affinity and Binding Kinetics.
2.1.3 Structure-Function Analysis of the Class 1 RBD interaction with hAPN.
2.1.4 The Six Classes of RBD Share a Conserved Binding site on hAPN.
2.1.5 HCoV-229E Classes Differ in Their Ability to be Bound by a Neutralizing Antibody.
2.2 Structural Biology of the Evolving RBD-Receptor Interaction
2.2.1 Crystallization of hAPN Requires Deglycosylation
2.2.2 HCoV-229E S Protein RBD Classes 3, 4, and 5 Crystal Complexes with Their Receptor hAPN.
2.2.3 HCoV-229E Class 1 RBD-hAPN Complex Provides a Foundation for Comparison.
2.2.4 HCoV-229E Class 3 RBD-hAPN Crystal Complex Shows a Markedly Different Interaction When Compared to the Class 1 RBD.
2.2.5 The HCoV-229E Class 4 RBD-hAPN Interaction Maintains Many Features of the Class 3 Interaction.
2.2.6 HCoV-229E Class 5 RBD-hAPN Interaction Shows a Moderately Changed Loop 1.
Chapter 3
Discussion
3.1 A Ladder-Like Phylogeny and Immune Evasion
3.2 The HCoV-229E RBD Affinity for hAPN has Changed Over Time.
3.3 Crystal Structures and Mutagenesis Shed Light on the RBD-hAPN Interaction.
3.4 hAPN Mutants Lead to Reduction in RBD Binding Affinity
3.5 The Use of Loops as Receptor-Binding Motifs Enables HCoV-229E Adaptation and Evolution.
3.6 Bats are Unique and Potent Agents of Viral Spread
Chapter 4
Future Directions
4.1 Short Term Goals
4.1.1 Immediate Experiments
4.1.2 Recent Surveillance Has Revealed Additional Human and Animal HCoV-229E RBD Classes
4.2 Long Term Goals
4.2.1 Mutations in the Receptor-Binding Loops May Impact Spike Protein Conformational Dynamics
4.2.2 A Cell-Based Assay to Test Viral Fitness is Within Reach
Chapter 5
Methods
5.1 Sequence Comparison of HCoV-229E S-protein RBD
5.2 Protein Expression and Purification
5.3 Surface Plasmon Resonance Assays
5.4 Deglycosylation of hAPN
5.5 Protein Crystallization
5.6 Data Collection and Structure Determination
Bibliography
List of Tables
Table 1. Coronavirus’s diverse host range and receptor usage.
Table 2. Surface plasmon resonance binding kinetics for the RBD-hAPN interaction for each of the six Classes.
Table 3. RBD mutants, their contribution to buried surface area, and their effect on the RBD-hAPN interaction.
Table 4. hAPN mutants, their contribution to buried surface area, and their effect on the RBD-hAPN interaction.
Table 5. Data collection and refinement statistics.
Table 6. Buried surface area analysis of RBD Classes 1, 3, 4, and 5 in complex with hAPN.
Table 7. Hydrogen bond and salt bridge analysis of four crystal complexes.
List of Figures
Figure 1. The life cycle of the coronavirus.
Figure 2. Fusion proteins facilitate fusion between viral envelopes and cellular membranes.
Figure 3. The cryo-EM structure of the MHV spike protein.
Figure 4. A dynamic RBD allows receptor binding in the standing conformation.
Figure 5. Phylogenetic analysis of coronaviruses reveals four genera.
Figure 6. The spike protein RBDs of HCoV-229E and HCoV-NL63 show structural similarity.
Figure 7. Phylogenetic analysis of HCoV-229E S-protein RBDs reveals six distinct classes.
Figure 8. Variation in HCoV-229E S-protein RBD is localized in the three receptor binding loops.
Figure 9. HCoV-229E S protein RBD variation.
Figure 10. Global fitting of surface plasmon resonance data for the Class 1-6 RBD-hAPN interaction.
Figure 11. Surface plasmon resonance data and global fitting for RBD mutants and their interaction with hAPN.
Figure 12. RBD Mutants show reduced or abrogated binding to hAPN.
Figure 13. Surface plasmon resonance data for mutant hAPN and RBD interaction.
Figure 14. N-linked glycan at H-site produces a steric clash with a docked RBD.
Figure 15. Introduction of an N-linked glycan to the H-site on hAPN prohibits binding of all six Classes of RBD.
Figure 16. The 9.8.E12 antibody binds the HCoV-229E Class 1 RBD.
Figure 17. The 9.8.E12 antibody competes with hAPN for RBD binding.
Figure 18. The 9.8.E12 antibody binds only the Class 1 RBD. Surface plasmon resonance data for the class 1-6 RBD interaction with the 9.8.E12 antibody.
Figure 19. The 9.8.E12 antibody binds both loop 1 mutants.
Figure 20. hAPN harbors 10 N-linked glycans.
Figure 21. Deglycosylation of hAPN by EndoH yields a heterogenous product.
Figure 22. Deglycosylation of hAPN by EndoA yields a homogenous product.
Figure 23. Superimposition of Class 1, 3, 4, and 5 RBD.
Figure 24. Interface details of HCoV-229E S protein Class 1, 3, 4, and 5 in complex with hAPN.
Figure 25. R316 of the Class 3 RBD fills the volume vacated by the kinked backbone.
Figure 26. N406 of the Class 3 RBD provides supporting hydrogen bonds to N319.
Figure 27. A two residue deletion in loop 2 of the Class 3 RBD helps position R357 more favorably.
Figure 28. L402 and L405 replace W404 in more recently isolated RBD sequences.
Figure 29. Mutations not located at the site of hAPN interaction cluster at the top and sides of the RBD.
Figure 30. Phylogenetic analysis of animal HCoV-229E-related RBDs reveals discrete classes.
Figure 31. Examination of the H-site of bat and camel APN shows moderate diversity.
Figure 32. Camel and bat RBDs express well in mammalian tissue culture.
Chapter 1
Introduction
1.1 Diseases Caused by Coronaviruses
Research into coronaviruses is of great importance because of the diseases they cause in
humans, domesticated mammals, and other animals. The coronaviruses that circulate in humans
cause mild respiratory infections and are responsible for 10-30% of common cold cases across
the globe (Zumla et al. 2016). These ailments are normally cleared by individuals without
complication, but may worsen and even lead to death in the very young, the very old, and the
immunocompromised (Desforge et al. 2014). With even conservative estimates placing the
number of annual coronaviral infections in the hundreds of millions, coronaviruses have a
tremendous impact on human health and are responsible for the loss of billions of person-hours
in the economy (Fendrick et al. 2003). The four known circulating human coronaviruses are
HCoV-229E, HCoV-NL63, HCoV-OC43, and HCoV-HKU1. Each virus is able to infect an
individual multiple times throughout their life, and it appears that immunological protection
provided after a first infection is not permanent, perhaps one reason why the common cold is so
common (Gaunt et al. 2010).
Possibly of greater concern than the coronaviruses that cause the common cold is the looming
threat of cross-species transmission by viruses that are currently circulating in animals. The
SARS-CoV and MERS-CoV epidemics are two examples that have occurred within the last 15
years. These viruses have mortality rates of 10% and 30%, respectively, and both have jumped species
from bats to humans (Li et al. 2006, Chan et al. 2015). While neither virus is capable of efficient
human-to-human transmission and only several thousand cases were logged (Richard et al.
2017), their mortality rates are of great concern.
1.2 Life Cycle of Coronaviruses
Coronaviruses are enveloped and possess a continuous positive sense RNA genome of
around 30 kilobases, the largest of any RNA virus (Masters 1999). In order to replicate, viruses
must enter a target cell, produce their proteins, replicate their genome and package it, and bud
from the cell to begin the cycle anew.
In order to enter a target cell, a coronavirus must first bind a receptor on the outside of
the cell and fuse the viral and host cell membranes. Coronaviruses accomplish both of these
crucial steps via the S protein (Lin et al. 2011). Among coronaviruses, the S protein is able to
recognize either proteinaceous or carbohydrate receptors in a process called receptor
engagement. This engagement may be aided by attachment factors such as host lectins that
interact with high mannose glycans on the S protein (Hofmann et al. 2006). In a process that is
not well understood, receptor binding results in S-protein conformational changes that lead to
insertion of the S-protein fusion peptide into the host cell membrane (Walls et al. 2016a).
Changes in pH and/or processing by host proteases may also be involved in triggering these
conformational changes (Belouzard et al. 2012). Further conformational changes in the S-protein
bring the viral and cellular membranes together in a process that leads to fusion. Depending on
the coronavirus, the virus can fuse with the plasma membrane at the cell surface or with an
endosomal membrane after endocytosis (Belouzard et al. 2012).
Once inside the cytoplasm, the coronavirus’s positive sense RNA genome serves as an
mRNA for the translation of the viral replicase polyprotein using the host's ribosomes. The
polyprotein contains an RNA-dependent RNA polymerase that then generates negative sense
copies of the viral genome. The polymerase uses these negative sense copies to produce positive
sense copies that get packaged into new viral particles. The polymerase also uses the negative
sense copies of the viral genome to produce positive sense subgenomic RNAs. These
subgenomic RNAs serve as mRNAs that are used by host ribosomes to translate the viral
structural proteins S, envelope (E), membrane (M), nucleocapsid (N), and for some
coronaviruses the hemagglutinin-esterase (HE), that are required for the production of new viral
progeny.
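The strand relationships described above can be illustrated with a minimal sketch. It assumes only standard Watson-Crick base pairing; the 9-nucleotide fragment and the function name `reverse_complement` are hypothetical and chosen purely for illustration, not taken from any coronavirus genome or from this work.

```python
# Illustrative sketch only: a toy RNA fragment, not a real coronavirus sequence.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence, written 5'->3'."""
    return rna.upper().translate(COMPLEMENT)[::-1]


# Hypothetical positive sense fragment (reads as mRNA).
positive_sense = "AUGGCUUAA"

# The polymerase copies it into a complementary negative sense strand...
negative_sense = reverse_complement(positive_sense)

# ...and copying the negative sense strand regenerates the positive sense sequence.
regenerated = reverse_complement(negative_sense)

print(negative_sense)
print(regenerated == positive_sense)
```

Copying twice returns the original sequence, which is why the negative sense strand can serve as a template for both new genomes and the positive sense subgenomic RNAs.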
The S, E, and M proteins are targeted to the endoplasmic reticulum at the start of
translation and are eventually trafficked to the ER-Golgi intermediate compartment (ERGIC).
These proteins encounter the viral genome encapsidated by the N protein (both of which are on
the cytoplasmic side of the ERGIC membrane) and together bud into the lumen of the ERGIC
(Haan et al. 2005). In this process, a piece of the ERGIC membrane becomes the
viral membrane. The M and E proteins work in concert to create these virions, and expression of
just these proteins is sufficient to create virus-like particles in vitro (Bos et al. 1996). The
enveloped virions continue through the secretory pathway until they are secreted from the cell by
the normal process of exocytosis (Figure 1). While transiting through the secretory pathway, the
viral membrane proteins of these nascent virions can be further processed by Golgi-resident
glycosyltransferases and proteases.
1.3 Fusion Proteins
For all enveloped viruses, fusion of the host and viral membranes is required for
infection. This fusion is thermodynamically favored, but there is a large energy barrier that must
first be overcome (Chernomordik et al. 2003). Enveloped viruses employ fusion proteins to
lower the energy barrier in an ATP-independent manner (Colman et al. 2003). There are three classes
of fusion proteins, termed class I, II, and III. These three classes of proteins show great structural
diversity but they have all converged on a common mechanism of action (Figure 2). In short, a
trigger leads to conformational changes in the fusion protein that expose a short, hydrophobic
region called the fusion peptide. The fusion peptide is inserted into the host cell membrane and
subsequent large-scale conformational changes in the fusion protein bring the two membranes to
be fused into close apposition (Harrison 2008). The two membranes, now in close contact, can
mix outer leaflets in an intermediate event termed hemifusion. Hemifusion is followed by the
creation of a small pore that enlarges as the two membranes fuse (White et al. 2008). At least
three different fusion protein triggering mechanisms have been identified: i) receptor binding, ii)
low pH and iii) proteolytic cleavage, and others likely exist (Kielian et al. 2014). The influenza
hemagglutinin protein is triggered by the low pH of cellular endosomes (Earp et al. 2005). The
herpes virus gB protein fuses after binding a cellular receptor, and may do so at a neutral,
extracellular pH (Milne et al. 2005). Among coronaviruses, proteolytic processing and low pH
both serve as fusion triggers (Belouzard et al. 2012).
Figure 1. The life cycle of the coronavirus. The coronavirus docks with a cell by recognizing a receptor (1). Fusion of membranes allows the viral genome to be deposited and cytoplasmic ribosomes translate the genome (2). The first protein product is the RNA-dependent RNA polymerase (RdRp), which transcribes the genomic RNA into a negative sense strand (3). This negative sense strand is transcribed by the RdRp into subgenomic RNAs which encode the structural proteins S, E, M, and N (4). Full-length genomes are also transcribed from the negative sense strand. The S, E, and M structural protein transcripts are translated on membrane-bound ribosomes and targeted to the endoplasmic reticulum-Golgi intermediate compartment where they join with N protein-covered full-length genomes to form virions by budding into the lumen of the ERGIC (5 and 6). Secretory vesicles containing the nascent virions (6) fuse with the cell membrane leading to virus secretion (7).
1.4 Anatomy and Function of the Spike Protein
The coronavirus S protein is a class I fusion protein. It forms spike-like protrusions that
decorate the outside of the viral membrane, forming the crown-like shape for which the
coronavirus is named. The S protein is a homotrimer of about 400 kD and consists of three main
elements: a short intracellular tail, a single-pass transmembrane domain, and an ectodomain that
accounts for most of its mass (Li 2016). The S protein is highly glycosylated: it contains many
N-linked glycosylation sequons, and as many as 31 N-linked glycans per protomer have been
observed via cryo-electron microscopy (cryo-EM). Extensive glycosylation is thought to shield
the S protein from the host immune response as it does for the gp120/41 protein of HIV (Walls et
al. 2016b). Indeed, the S protein is a major target of the host immune response and it elicits
neutralizing antibodies (Du et al. 2016). Recent studies utilizing cryo-electron microscopy have
greatly advanced our understanding of the structure of the S-protein and how it mediates
membrane fusion. The S protein ectodomain is comprised of two regions, S1 and S2 (Figure 3).
The S1 region is N-terminal and is distally located from the viral membrane, while the C-
terminal S2 region is membrane proximal. Both the S1 and S2 regions contribute almost equally
to the surface area buried on trimer formation (Kirchdoerfer et al. 2016). The S1 region is
comprised of up to five domains: D0 (so named because it is not present in all S proteins), D1,
D2, D3, and D4. D0 and D1 both resemble galectins and adopt a canonical beta-sandwich fold,
and it has been proposed that they were acquired from their hosts by gene transfer (Li
2016). Among coronaviruses, both the D1 and D2 domains have been found to bind receptor
(protein or carbohydrate) as discussed in more detail below. The D3 and D4 domains link the S1
and S2 regions. Key elements present in the S2 region are the fusion peptide, used to catalyze
membrane fusion, the central helix, important for trimerization, and the heptad repeat regions 1
and 2 (HR1, HR2), amphipathic sequences key in the α-helical rearrangements that promote
fusion. As noted above, S proteins have the dual responsibilities of recognizing a receptor on the
host cell and mediating fusion of the viral and host membranes. In order to mediate these two
processes, labor is divided between the S1 and S2 regions. The S1 region is responsible for
receptor recognition, and the S2 region is responsible for membrane fusion.
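To make the notion of an N-linked glycosylation sequon concrete, the sketch below scans a protein sequence for the consensus motif N-X-S/T, where X is any residue except proline. The function name `find_sequons` and the toy sequence are hypothetical and are not drawn from any S protein; note also that the presence of a sequon marks only a potential glycosylation site, not a confirmed glycan.

```python
import re

# Consensus sequon: Asn, then any residue except Pro, then Ser or Thr.
# The lookahead lets overlapping sequons (e.g. N-N-S-T) each be reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(protein: str) -> list[int]:
    """Return the 1-based position of the Asn in every N-X-S/T sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(protein.upper())]


# Hypothetical toy sequence for illustration only.
toy_sequence = "MNGTANPSKNNSTQ"

# N2 (N-G-T), N10 (N-N-S) and N11 (N-S-T) match; N6 is followed by Pro and does not.
print(find_sequons(toy_sequence))
```

Applied to a real S protein sequence, a scan of this kind would give an upper bound on the number of N-linked glycans per protomer, which could then be compared with the number actually observed by cryo-EM.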
The S1 region, tasked with recognizing a cellular receptor, must contain a receptor
binding domain (RBD). Coronaviruses are unusual in that two separate domains may act as the
RBD: some viruses utilize D1 while others utilize D2. The D1 domain recognizes sugar in the
coronaviruses TGEV/PEDV, BCoV/OC43, and IBV. The lab strain MHV uses D1 to recognize a
proteinaceous receptor, CEACAM1. The D2 domain acts as the RBD in SARS-CoV, MERS-CoV,
HCoV-NL63, and HCoV-229E, to name a few. In all known cases, D2 binds protein
receptors.
Figure 2. Fusion proteins facilitate fusion between viral envelopes and cellular membranes. An enveloped virus diffuses into the area of a target cell (1). The fusion (F) protein, represented as a trimer, recognizes and binds to its cellular receptor in a process called receptor engagement (2). A trigger causes conformational changes that lead to the insertion of a fusion peptide into the target cell’s membrane (3). Further conformational changes lead to formation of a six-helix bundle, which juxtaposes the two membranes and leads to fusion and the beginning of infection (4). A class I fusion protein is used as an example, but classes II and III perform their function with similar large-scale rearrangements that result in a related “hairpin” protein structure and membrane fusion.
The RBD is dynamic and exists in at least two prefusion conformations as determined by
cryo-EM studies (Yuan et al. 2017). The RBD either lies flat, pointing into the trimer interface
(“lying” state), or it pivots ~90° upwards, orienting itself parallel to the trimer axis (“standing”
state) (Figure 4). In the lying state, the surface used by the RBD to bind the host receptor
(sometimes called the receptor binding motif (RBM)) is buried. Indeed, superimposition of crystal
structures of coronavirus RBDs in complex with their receptors onto the cryo-EM structures of the
S protein trimer has shown that receptor binding can only occur in the "standing" conformation
(Figure 4). Although an RBD in the lying conformation is not able to bind receptor, it has been
suggested that this structural arrangement might serve to shield the RBD from a neutralizing
antibody response (Walls et al. 2016b). After binding to its receptor, the RBD is caught in the
standing state and this has been hypothesized to promote the cascade of conformational changes
required for membrane fusion (Gui et al. 2016, Yuan et al. 2017). RBD dynamics may thus play
a role in immune evasion and the triggering of membrane fusion, two important features of
infection and virulence.
Figure 3. The cryo-EM structure of the MHV spike protein. A) The cryo-EM structure of the spike protein trimer of MHV. B) Protomer of spike protein colored by region. The N-terminal S1 domain is in red and the C-terminal S2 domain is in beige.
Figure 4. A dynamic RBD allows receptor binding in the standing conformation. A) The cryo-EM structure of a MERS-CoV spike protein monomer. The lying RBD can pivot into the standing form seen. B) Pivoting allows binding of the MERS-CoV receptor, DPP4, because it prevents a steric clash. MERS-CoV S protein trimer is in red, DPP4 in blue, and possible steric clash in purple.
The S protein’s ultimate function is that of a fusion protein and the machinery necessary
to fuse membranes is contained in the S2 region. The coronavirus S protein must be processed by
host proteases in order to become activated. It may harbor two or more protease cleavage sites, one
at the S1/S2 boundary and another in the S2 region, upstream from the all-important fusion
peptide (termed the S2’ cleavage site). Cleavage at the first site occurs in the secretory pathway
during the biosynthesis of viral progeny in an infected cell, while cleavage at the second site
occurs during entry into a new host cell prior to membrane fusion. In some coronaviruses like
MERS-CoV, both receptor binding and cleavage at the S1/S2 site are required for cleavage at the
S2’ site. S2’ cleavage occurs for all coronaviruses and is required for fusion activity. S proteins
that do not contain an amino acid sequence recognized by host proteases will not be able to
infect that host and, as such, protease specificity is a determinant of host range (Yang et al.
2015). Upon triggering, the HR1 motif of the S2 region rearranges to form a 3-helix bundle that
extends the central 3-helix bundle found in the prefusion trimer (one helix from each monomer).
Further rearrangements bring this nascent 3-helix bundle into contact with the HR2 3-helix
bundle that exists in the prefusion trimer prior to rearrangement (one helix from each monomer).
The net result is the formation of a 6-helix bundle from two 3-helix bundles, a process that drives
membrane fusion. Indeed, the formation of a 6-helix bundle in this way is characteristic of
class I viral fusion proteins, and is similar to the mechanism used by cellular proteins involved in
the secretory and endocytic pathways (e.g. SNAREs) (Martens et al. 2008, White et al. 2008).
1.5 Evolution of Coronavirus Diversity
Coronaviruses, and in fact all RNA viruses, are recognized for the remarkable speed with
which they evolve. Mutations are rapidly produced and fixed in a population and evolutionary
processes that take place over millions of years in eukaryotes take place over decades in RNA
viruses (Holmes 2011). The last common ancestor of coronaviruses is believed to have existed
about 10,000 years ago (Woo et al. 2012) and diversification since then has been massive.
Indeed, many viral species that infect a wide range of hosts via many different receptors now
exist (Table 1). Comparison of both gene sequence information and viral protein structures allows
us to elucidate evolutionary relationships that may aid in our understanding of the processes of
zoonosis, adaptation in a new host, and immune evasion.
To study the evolutionary relationships between the extant coronaviruses, phylogenetic
analysis of certain gene and protein sequences can be performed. The polymerase gene that
encodes the RNA-dependent RNA polymerase (RdRp) is essential for coronaviruses and it is the
most conserved region of their entire genome (Snijder et al. 2003). Based on the RdRp,
phylogenetic analysis has separated coronaviruses into four genera: alpha, beta, gamma, and
delta (Figure 5). Alpha and beta coronaviruses infect mammals while gamma and delta
coronaviruses infect mostly avian hosts. Viral surveillance studies of bats, alpacas, and camels
have revealed that viruses that are very closely related to HCoV-229E circulate in these
organisms (Crossley et al. 2012, Corman et al. 2015, Corman et al. 2016, Lin et al. 2017). In
vitro studies have shown that these camel viruses are able to infect human cells, and while this
does not guarantee that direct transfer between species is possible, it shows that some barriers to
zoonotic transmission are low (Corman et al. 2016).
1.6 Binding of Receptor by Coronaviral RBDs
Among coronaviruses, both the D1 and D2 domains have been found to mediate receptor
binding. Moreover, both proteins and carbohydrates can serve as receptors. The crystal structures
of the RBDs (i.e. D1 or D2) of the coronaviruses HCoV-229E, HCoV-NL63, SARS-CoV,
MERS-CoV, MHV, and PRCoV have all been solved in complex with their receptors, making
structural analysis possible. Here we will look at the tremendous variation seen between the four
coronaviruses that use D2 as the RBD.
HCoV-NL63 and SARS-CoV are alpha and beta coronaviruses, respectively, and they
share only 10% amino acid sequence identity in the S1 region of their S proteins (Li 2011). This
lack of primary sequence identity is common when comparing S1 regions across different
coronavirus genera. The HCoV-NL63 and SARS-CoV RBDs both recognize the same receptor, angiotensin
converting enzyme 2 (ACE2), and bind to it at overlapping sites (Li 2011). The
HCoV-NL63 RBD uses three short loops, L1-L3, of 11, 11, and 3 amino acids in length,
respectively, to bind ACE2. These loops are supported by an 8-stranded β-structural domain. The
RBM of SARS-CoV is one continuous segment of 70 amino acids and contains two β-strands.
Figure 5. Phylogenetic analysis of coronaviruses reveals four genera. An unrooted phylogenetic tree of coronaviruses based on the gene sequence of the RNA-dependent RNA polymerase. The alpha, beta, gamma, and delta genera are shown in red, blue, green, and lavender respectively. Numbers indicate bootstrap value (N=1000). Adapted from de Groot et al. (2013).
Table 1. Diverse host range and receptor usage of coronaviruses.
Coronavirus | Host | Receptor
Alphacoronavirus
HCoV-229E | Human | Aminopeptidase N (APN)
HCoV-NL63 | Human | Angiotensin converting enzyme 2 (ACE2)
PRCoV | Pig | Aminopeptidase N (APN)
229E-like | Bat | Unknown
229E-like | Camel | Aminopeptidase N (APN)
229E-like | Alpaca | Unknown
NL63-like | Bat | Unknown
Betacoronavirus
HCoV-OC43 | Human | 9-O-Acetyl-N-acetylneuraminic acid (Neu5,9Ac2)
HCoV-HKU1 | Human | Unknown
BCoV | Cow | 9-O-Acetyl-N-acetylneuraminic acid (Neu5,9Ac2)
MHV | Mouse | Carcinoembryonic antigen-related cell adhesion molecule 1 (CEACAM-1)
SARS-CoV | Human | Angiotensin converting enzyme 2 (ACE2)
SARS-related CoV | Bat | Angiotensin converting enzyme 2 (ACE2)
MERS-CoV | Human | Dipeptidyl peptidase-4 (DPP4)
MERS-related CoV | Camel | Dipeptidyl peptidase-4 (DPP4)
BatCoV-HKU4 | Bat | Dipeptidyl peptidase-4 (DPP4)
Gammacoronavirus
IBV | Bird | α2,3-linked Neu5Ac
Beluga Whale Coronavirus SW1 | Whale | Unknown
Deltacoronavirus
Bulbul Coronavirus HKU11 | Bird | Unknown
Porcine Coronavirus HKU15 | Pig | Unknown
HCoV-NL63’s closest relative in the alpha coronavirus genus is HCoV-229E and their last
common ancestor is believed to have existed more than 1000 years ago (Woo et al. 2012). Since
this time, both viruses have diverged and bind different human receptors. HCoV-229E binds
human aminopeptidase N (APN) and as stated above, HCoV-NL63 binds ACE2. Both RBDs
have a similar core fold composed of β-strands and both utilize three extended loops to bind their
receptors (Wong et al. 2017, Wu et al. 2009) (Figure 6). The RBDs share only 45% amino acid
sequence identity and their receptor binding loops vary in length, sequence and the extent to
which they mediate the interactions with their respective receptors (Wong et al. 2017).
The alpha coronavirus, porcine respiratory coronavirus (PRCoV), also uses APN as its
receptor but, as the name would indicate, in a porcine host. This virus’s RBD also has high
structural similarity to the RBDs of HCoV-NL63 and HCoV-229E and it also binds its receptor
through the use of three extended loops (Reguera et al. 2012). Interestingly, despite being close
relatives and utilizing the same receptor, the RBDs of HCoV-229E and PRCoV bind to different
Figure 6. The spike protein RBDs of HCoV-229E and HCoV-NL63 show structural similarity. The NL63 (left) and 229E (right) RBDs are displayed and their receptor binding loops are labeled 1-3. The structural similarity is striking and there is only a root-mean-square deviation (RMSD) of 1.2 Å between the two RBDs.
sites on the APN molecule, sites termed the P-site and H-site for pig and human, respectively
(Wong et al. 2017). HCoV-229E is unable to bind pAPN at the H-site because of the presence of
an N-linked glycan on pAPN at residue Asn286. PRCoV is unable to bind hAPN at the P-site
because of steric clashes between hAPN Arg741 and Ser302, Pro307, and Ile308 of the PRCoV S
protein RBD. The H- and P-sites share only about 60% identity among the APNs of mouse, pig,
and human. Strikingly, once APN is mutated to remove residues whose side chains lead to a
clash, alpha coronaviruses from different species are able to bind, indicating that the receptor-
binding loops are able to accommodate the remaining structural differences (Wong et al. 2017).
Structural variation in the receptor binding loops of these coronavirus S proteins has led to the
receptor usage and host specificity observed.
1.7 The Engine of Genetic Diversity
All of the protein variation that enables the receptor binding diversity seen in
coronaviruses is the result of genetic diversity. RNA viruses replicate with mutation rates
between 10^-6 and 10^-4 substitutions per nucleotide, a rate orders of magnitude higher than that of DNA
polymerases (Sanjuán 2016). Due to the optimized nature of the “wild type” virus, most
mutations are deleterious or lethal (Vishner 2016). However, mutations that result in increased
viral fitness are amplified by means of natural selection (Smith 2017, Darwin 1859). The engine
of genetic diversity that enables such selection is the RNA-dependent RNA polymerase (RdRp).
This protein is tasked with producing copies of the RNA genome during viral replication.
Coronaviral RdRps lack intrinsic proofreading ability, a characteristic known to lower replication
fidelity (Steinhauer et al. 1992). It is this low fidelity that leads directly to diverse populations. A
simple illustration of how much diversity is created by the RdRp is as follows: the RdRp
error rate is 1 × 10^-5 mutations per nucleotide, coronavirus genomes are 3 × 10^4 nucleotides long,
and viral loads are approximately 1 × 10^7 viruses per mL of tissue. This leads to 3 × 10^6 mutations per
mL of tissue, enough diversity to cover the genome about 100 times over in a small sample from
just one infected individual. This thought experiment has recently been supplemented with actual
experimental evidence that vast coronavirus genome variation exists within a single organism
(Briese et al. 2014). Some have argued that RNA viruses have evolved an optimal mutation rate
and that higher or lower rates decrease viral fitness. With higher mutation rates, an “error
threshold” is reached where viable viral genomes can no longer be produced (Holmes 2003).
With lower mutation rates, viral populations do not harbor enough diversity to overcome
environmental changes (Coffey et al. 2011, Smith 2014).
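The diversity estimate given above can be reproduced with a few lines of arithmetic. The sketch below is only a restatement of the thought experiment; it uses the three figures quoted in the text, and the variable names are illustrative choices rather than anything taken from the cited work.

```python
# Back-of-the-envelope estimate using the figures quoted in the text.
error_rate = 1e-5      # RdRp mutations per nucleotide copied
genome_length = 3e4    # nucleotides per coronavirus genome
viral_load = 1e7       # viruses per mL of tissue

# Expected number of mutations introduced per genome copy.
mutations_per_genome = error_rate * genome_length

# Expected number of mutations present in 1 mL of infected tissue.
mutations_per_ml = mutations_per_genome * viral_load

# How many times, on average, each genome position could be hit.
genome_coverage = mutations_per_ml / genome_length

print(f"mutations per genome copy:  {mutations_per_genome:.1f}")
print(f"mutations per mL of tissue: {mutations_per_ml:.1e}")
print(f"fold coverage of the genome: {genome_coverage:.0f}")
```

The estimate treats every mutation as independent and uniformly distributed along the genome, which is a simplification; it is meant to convey scale rather than to model a real viral population.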
1.8 Environmental Pressures Promote Coronaviral Adaptation
Diversity increases the probability that a virus will be able to survive under changing
environmental pressures, but what such pressures exist in nature? Host defenses such as an
antibody response or the introduction of an antiviral pharmaceutical provide two common
examples (Coffin et al. 2013). Additionally, a virus that has been introduced into a new species
faces a number of host specific factors that act as barriers to efficient infection (Parrish et al.
2008).
For coronaviruses, the S protein is the major target of neutralizing antibodies (Du et al.
2016). Antibodies can neutralize coronaviral infection via two methods: i) binding to the RBD to
sterically prevent interaction with the host receptor, and ii) binding to the fusion machinery in a
manner that either stabilizes the pre-fusion conformation or that sterically prevents the required
post-fusion conformation from being realized (Zeng et al. 2006, Corti et al. 2013). Viruses carrying
S protein mutations that ablate binding of neutralizing antibodies would see a marked increase in fitness
compared to their neutralized peers, as they would be the only ones able to enter a host cell and
replicate. Culturing coronaviruses in the presence of neutralizing antibodies quickly leads to
mutants that are able to escape recognition by such antibodies (Tang et al. 2014), confirming that
immune pressure leads to S protein diversity.
Hosts naturally elicit a polyclonal antibody response with several antibodies able to bind
unique epitopes on a viral antigen and competitively inhibit receptor binding. Avoiding such a
robust response via point mutation may be impossible, as a single side chain difference may not
abrogate binding of multiple antibodies (Tang et al. 2014). One means by which a virus can
overcome such a polyclonal response is to increase the affinity of the RBD-receptor interaction
so that this interaction is more favorable than the antibody-RBD interaction. Increasing the
affinity of the receptor interaction via point mutations in the HA molecule has been shown to
enable influenza A to overcome a polyclonal antibody response (Hensley et al. 2009).
1.9 HCoV-229E as a Model for Coronavirus Adaptation and Evolution
HCoV-229E is a good model for studying the adaptation and evolution of coronaviruses
for several reasons. It was the first coronavirus to have its genome sequenced and multiple whole
genome sequences are available (Farsani et al. 2012). Additionally, more than 50 sequences of
the S protein RBD spanning decades and continents have been deposited and changes in primary
sequence are observed. The cellular receptor for HCoV-229E is known and the crystal structure
of the HCoV-229E RBD in complex with hAPN has been solved (Wong et al. 2017). HCoV-
229E elicits a neutralizing antibody response and yet prior infection does not lead to lasting
immunity. As such, viral adaptation and immune evasion can be studied with this system.
Phylogenetic analysis of the 52 available HCoV-229E RBD sequences segregates them into six
distinct groups (Figure 7). Each new group has successively replaced the previous one leading to
a “ladder-like” phylogeny. Such a phylogeny is representative of a protein under selective
pressure to evade the host immune response (Holmes 2011).
Analysis of the x-ray crystal structure of the HCoV-229E RBD-hAPN complex, in
conjunction with the available sequence data, has shown that the sequence variation observed is
highly skewed to the three receptor-binding loops (Figure 8). This is a striking observation that
raises many questions. Does such sequence diversity lead to different receptor usage? Does this
diversity modulate receptor binding affinity? Are these mutations the result of immune evasion?
The implications of these mutational differences are examined in this thesis.
Figure 7. Phylogenetic analysis of HCoV-229E S-protein RBDs reveals six distinct classes. An unrooted phylogenetic tree of 52 amino acid sequences of the HCoV-229E spike protein RBD (residues 293-435). The sequences cluster in six groups termed classes. These classes are presented along with the timeframe for which they were found.
1.10 Rationale
Viruses have an enormous impact on human health and prosperity. Each year, viruses
that circulate amongst humans are responsible for millions of infections and deaths throughout
the world. In addition to this present threat, animal viruses capable of crossing species barriers
and spreading to humans lurk on the horizon. Recent coronavirus epidemics arising from
cross-species transmission events, such as those caused by SARS-CoV and MERS-CoV, have had
high mortality rates of 10% and 30%, respectively, and as of now there are no approved vaccines
or antiviral therapies to combat these viruses. Future coronavirus outbreaks are likely to occur
and further research is needed to prevent or treat disease caused by coronaviruses.
HCoV-229E is a human coronavirus that is thought to have originated in bats. First
identified 50 years ago, it circulates globally and is responsible for a modest percentage of the
common cold. Exactly how the zoonosis of this virus occurred and how it was able to establish
itself in the human population is unknown and it is our hope that studying it can provide
mechanistic insights into these processes.
Two crucial steps in viral entry are receptor binding and membrane fusion, and for coronaviruses both are mediated by
the spike (S) protein. Host antibodies that bind to the S protein and prevent either step can
neutralize a possible infection. Avoiding this neutralization as well as optimizing existing
receptor binding characteristics are selective pressures thought to drive the evolution of the S
protein. Phylogenetic analysis of HCoV-229E S protein RBD sequences shows that they group
into six distinct classes. By studying amino acid changes in these six classes and how they
influence interactions with neutralizing antibodies as well as the host cell receptor, we hope to
gain insight into how and why coronaviruses evolve.
Figure 8. Variation in HCoV-229E S-protein RBD is localized in the three receptor binding loops. An alignment of the 51 amino acid sequences of the HCoV-229E S-protein RBD. Conservation is shown by a period, and differences in residues are shown as their one letter abbreviation. Most variation is observed in the three receptor binding loop regions, highlighted in red at the top. Residues that directly interact with hAPN in the Class 1 structure are highlighted in orange.
Chapter 2
Results
2.1 Biophysical Characterization of HCoV-229E Spike Protein RBD Interaction with Its Receptor and Neutralizing Antibodies
2.1.1 HCoV-229E RBDs Cluster into Six Phylogenetic Classes.
The HCoV-229E S protein RBD sequence, previously defined as residues 293-435, was
used to query the coronavirus sequence database and 51 additional RBD sequences were
obtained from both patient samples and lab-strain sources (Wong et al. 2017). Alignment and
construction of a phylogenetic tree indicated that the sequences formed six different Classes.
Each Class of RBD sequence was found in samples collected over a 3 to 7 year period (Figure
7). Moreover, the analysis shows that each RBD Class is replaced in the human population by
the next Class and this ladder-like phylogeny (Grenfell 2004) has continued until the present day.
Further analysis of the sequences shows that a large majority of the variation among Classes
maps to the three receptor binding loops of the RBD (Figure 9). G311, G313, N319, R359 and
the cysteine residues C317 and C320 that form the loop 1 disulfide bond are the only loop
residues conserved in all six RBD Classes. These conserved residues account for only 45% of the
Class 1 RBD surface area buried upon binding hAPN. The variation observed at the site-of-
interaction likely changes receptor binding characteristics and this was further investigated.
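One way to quantify where variation falls in an alignment is to score each column for residue diversity. The sketch below uses Shannon entropy as that score; this is an assumed, commonly used measure and not necessarily the one behind Figures 8 and 9. The four short sequences are hypothetical and serve only to show the calculation.

```python
from collections import Counter
from math import log2


def column_entropy(column):
    """Shannon entropy (bits) of the residues found in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * log2(n / total) for n in counts.values())


def alignment_variability(sequences):
    """Return one entropy score per column of a gap-free, equal-length alignment."""
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("sequences must be aligned to equal length")
    return [column_entropy(col) for col in zip(*sequences)]


# Hypothetical toy alignment: only the third column varies.
toy_alignment = ["GCFNG", "GCYNG", "GCHNG", "GCFNG"]
scores = alignment_variability(toy_alignment)

for position, score in enumerate(scores, start=1):
    print(f"column {position}: {score:.2f} bits")
```

Conserved columns score zero and the score rises with the number and evenness of the residues observed, so plotting it along the RBD sequence would be expected to highlight the three receptor-binding loops.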
2.1.2 Variation in Receptor-Binding Loops Changes Receptor-Binding Affinity and Binding Kinetics.
A representative sequence from each of the six Classes was selected for characterization.
To facilitate comparison, the six RBDs were synthesized such that residues outside of the loop
regions correspond to those of the Class 1 RBD. Differences in interaction with the receptor,
hAPN, were then tested and binding affinity (Kd), on-rate (kon), and off-rate (koff) were measured
using a surface plasmon resonance (SPR) assay (Figure 10). All six Classes of RBD were found
to bind hAPN and the kinetic details of these interactions are found in Table 2. The binding
affinity covers a 16-fold range with Class 1 binding the weakest (Kd of 434 nM) and Class 5
binding the strongest (Kd of 27.0 nM). With the exception of Class 4, there is a general trend
toward increased affinity over time since the 1970s. Furthermore, while the on-rate of the
Figure 9. HCoV-229E S protein RBD variation. The crystal structure of the Class 1 HCoV-229E RBD is displayed and the variation observed in the 52 RBD sequences is displayed. Blue indicates no variation, white indicates moderate variation, and red indicates residues where the most variation occurs (Pei et al. 2001).
Figure 10. Global fitting of surface plasmon resonance data for the Class 1-6 RBD-hAPN interaction. SPR data for the interaction between the six classes of RBD and hAPN. Duplicates for each class are shown (left and right columns). Response units are plotted against time. Raw data is shown in black and the global fit is shown in red. Injection series are 2X dilutions starting from top points of 4.98, 3.78, 2.34, 2.04, 1.01, and 1.38 µM for Classes 1-6 respectively.
interactions remained similar, spanning only a 2.2-fold range, the off-rate of binding spans a
12-fold range. Based on a linear regression analysis, the off-rate alone accounts for 90% of the
variation seen in binding affinity and follows a similar pattern of near-constant decrease over
time.
Table 2. Surface plasmon resonance binding kinetics for the RBD-hAPN interaction for each of the six Classes (N=2).
Class | kon (×10^5 M^-1 s^-1) | koff (s^-1) | Kd (nM)
1 | 3.6 ± 0.5 | 0.16 ± 0.02 | 434 ± 63
2 | 3.3 ± 0.5 | 0.08 ± 0.02 | 246 ± 19
3 | 7.3 ± 1.4 | 0.08 ± 0.02 | 113 ± 2.3
4 | 3.6 ± 0.5 | 0.10 ± 0.01 | 261 ± 24
5 | 4.8 ± 1.1 | 0.01 ± 0.01 | 27.0 ± 1.7
6 | 8.5 ± 0.6 | 0.03 ± 0.01 | 37.4 ± 3.5
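The relationships among the values in Table 2 can be checked with a short script. For a simple 1:1 interaction the dissociation constant is Kd = koff / kon, so the ratio of the tabulated rate constants should roughly reproduce the fitted Kd; because the table reports rounded means, the agreement is only approximate. The regression shown is an ordinary least-squares fit of Kd against koff, which is an assumption about the form of the analysis, since the text does not state how the regression was set up.

```python
# Mean values transcribed from Table 2 (uncertainties omitted).
# kon in M^-1 s^-1, koff in s^-1, fitted Kd in nM.
TABLE_2 = {
    1: (3.6e5, 0.16, 434.0),
    2: (3.3e5, 0.08, 246.0),
    3: (7.3e5, 0.08, 113.0),
    4: (3.6e5, 0.10, 261.0),
    5: (4.8e5, 0.01, 27.0),
    6: (8.5e5, 0.03, 37.4),
}


def kd_from_rates(kon, koff):
    """Kd in nM for a 1:1 interaction, using Kd = koff / kon."""
    return koff / kon * 1e9


def r_squared(xs, ys):
    """Coefficient of determination for a least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy ** 2 / (sxx * syy)


koffs = [row[1] for row in TABLE_2.values()]
kds = [row[2] for row in TABLE_2.values()]

kd_fold_range = max(kds) / min(kds)
r2_kd_vs_koff = r_squared(koffs, kds)

for cls, (kon, koff, kd) in TABLE_2.items():
    ratio = kd_from_rates(kon, koff)
    print(f"Class {cls}: koff/kon = {ratio:.0f} nM, fitted Kd = {kd} nM")
print(f"Kd fold range: {kd_fold_range:.1f}")
print(f"R^2 of Kd against koff: {r2_kd_vs_koff:.2f}")
```

Fold ranges computed from the rounded rate constants in the table need not match the figures quoted in the text exactly, since those were presumably derived from the unrounded fits.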
2.1.3 Structure-Function Analysis of the Class 1 RBD Interaction with hAPN.
Structure-function analysis of the Class 1 RBD interaction with hAPN was informed by
our group’s previously obtained crystal structure (Wong et al. 2017). Residues thought to be key
for the interaction on both the RBD and hAPN molecules were identified and mutants were
produced to confirm their importance. The RBD mutants produced were F318A, N319A,
W404A, and the double mutant C317S/C320S. Binding affinity and kinetics were measured in
much the same way as was done for the six RBD Classes (Figure 11). The N319A, W404A, and
C317S/C320S mutant RBDs were unable to interact with hAPN at the maximum achievable
concentration (Table 3). The F318A mutant showed a greatly reduced affinity (Kd of 5.8 µM
compared to 434 nM for WT Class 1 RBD) and therefore we can conclude that in all cases these
residues are important for complex formation.
The hAPN mutants produced were D288A, Y289A, V290G, I309A, and L318A. As
expected, SPR-based analysis showed reduced binding when compared to wild-type hAPN
(Figure 13). The reduction in affinity ranged from 10-fold for the D288A mutant to 30-fold for
Figure 11. Surface plasmon resonance data and global fitting for RBD mutants and their interaction with hAPN. SPR data for the selected RBD mutants and their interaction with hAPN. Raw data is shown in black and the global fit for the F318A mutant is shown in red. The highest concentrations used in the two F318A mutant titrations are 17.2 µM (left) and 11.4 µM (right).
Figure 12. RBD Mutants show reduced or abrogated binding to hAPN. The side chains of RBD residues that were mutated are shown on the crystal structure of the Class 1 RBD and hAPN complex. The RBD is in brown and APN in green.
Table 3. RBD mutants, their contribution to buried surface area, and their effect on the RBD-hAPN interaction.
Residue Number | Amino acid | % Buried Surface Area | Mutated amino acid | Affinity Reduction
318 | Phenylalanine | 15 | Alanine | 13X
319 | Asparagine | 9 | Alanine | No binding observed at 25 µM
404 | Tryptophan | 10 | Alanine | No binding observed at 2.2 µM
317/320 | Cysteine/Cysteine | 12 | Serine/Serine | No binding observed at 15 µM
Figure 13. Surface plasmon resonance data for mutant hAPN and RBD interaction. In all cases the mutant hAPN was covalently linked to the CM-5 dextran-coated gold chip and the WT Class 1 HCoV-229E RBD was injected. (A) hAPN D288A, (B) hAPN I309A, (C) hAPN V290G, (D) hAPN L318A and (E) hAPN Y289A. The raw data is plotted in black and the global fit is in red. The analyte solutions for the D288A, I309A, V290G, L318A, and Y289A titrations were obtained by 2-fold serial dilution starting at maximum concentrations of 25 µM, 4.1 µM, 25 µM, 4.1 µM and 24 µM, respectively.
Table 4. hAPN mutants, their contribution to buried surface area, and their effect on the RBD-hAPN interaction.
Residue Number | Amino acid | % Buried Surface Area | Mutated amino acid | Affinity Reduction
288 | Aspartic acid | 13 | Alanine | 10X
289 | Tyrosine | 12 | Alanine | 18X
290 | Valine | 7 | Glycine | 30X
309 | Isoleucine | 3 | Alanine | 25X
318 | Leucine | 5 | Alanine | 12X
the V290G mutant (Table 4). Unlike the RBD point mutants that were produced, no single
mutation on hAPN was able to completely eliminate the interaction. One interpretation of this
outcome is that the receptor-binding loops possess enough structural plasticity to accommodate
changes on the surface of their binding partner.
2.1.4 The Six Classes of RBD Share a Conserved Binding Site on hAPN.
As previously noted, PRCoV also uses APN as its cellular receptor. However, it binds at
a site on porcine APN (the P-site) that is completely distinct from where the Class 1 RBD binds
on hAPN (the H-site) (Wong et al. 2017). Porcine APN (pAPN) residue N286 is located in the
H-site and is glycosylated. This bulky glycan leads to steric interference with the RBD and
prohibits docking (Figure 14). The corresponding residue on hAPN is E291 and the triple hAPN
mutant E291N/K292E/Q293T was produced to introduce an N-glycan sequon into the H-site on
hAPN. This mutant was then used as a prospective binding partner in an SPR assay in order to
determine whether all six Classes of HCoV-229E S protein RBD bind hAPN at the H-site
(Figure 15). Classes 1 through 6 of the RBD fail to show any binding to this triple mutant,
suggesting that the variation observed in the receptor-binding loops does not lead to a different
binding site on hAPN.
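The effect of the triple mutation on the sequence can be illustrated by scanning for N-glycan sequons. The sketch below assumes the usual N-X-S/T definition with proline excluded at the X position (the exclusion is a widely used convention, not something stated in this text), and the two alanines flanking residues 291-293 are hypothetical placeholders rather than the real hAPN sequence.

```python
import re

# N-X-S/T sequon, with X being any residue except proline (assumed convention).
# The lookahead lets overlapping sequons be reported independently.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(sequence):
    """Return 0-based indices of asparagines that start an N-X-S/T sequon."""
    return [match.start() for match in SEQUON.finditer(sequence)]


# Hypothetical flanking residues (A) around hAPN positions 291-293.
wild_type = "AAEKQAA"   # E291, K292, Q293
mutant = "AANETAA"      # E291N / K292E / Q293T

print("wild type sequons:", find_sequons(wild_type))
print("mutant sequons:   ", find_sequons(mutant))
```

A sequon is necessary but not sufficient for glycosylation, which is why the crystal structure described below was needed to confirm that the introduced site was actually occupied.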
Introducing mutations into a wild type protein can destabilize its fold and change its
structure in more ways than intended. To determine whether or not the triple mutant hAPN was
folded properly, it was crystallized and the structure was solved (Table 5). This mutant was
almost identical to the WT hAPN structure with an RMSD of 0.174 Å across 892 Cα atoms.
Solving the crystal structure of the triple mutant had the added benefit of confirming that the site
was glycosylated during expression. Electron density for the asparagine-linked N-
acetylglucosamine moiety of the N-glycan was observed and, therefore, we can confidently
conclude that the RBDs were unable to bind the mutant hAPN because of the presence of a
glycan in the H-site.
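The RMSD quoted above is the root-mean-square distance between matched Cα atoms of the two structures. The sketch below shows only that final calculation and assumes the two coordinate sets have already been optimally superimposed and paired atom for atom; in practice a crystallographic or structural-alignment program performs the superposition first. The coordinates are hypothetical.

```python
from math import sqrt


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two pre-superimposed, paired coordinate lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return sqrt(total / len(coords_a))


# Hypothetical Cα positions (in Å) for two matched atoms in each structure.
wild_type_ca = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
mutant_ca = [(0.0, 0.0, 0.1), (1.0, 0.1, 0.0)]

print(f"RMSD: {rmsd(wild_type_ca, mutant_ca):.3f} Å")
```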
2.1.5 HCoV-229E Classes Differ in Their Ability to be Bound by a Neutralizing Antibody.
A monoclonal antibody (9.8.E12) was generated against the HCoV-229E lab strain
whose S protein contains the Class 1 RBD. This antibody was previously demonstrated to
neutralize the lab strain in a cell-based assay (Wong et al. 2017). Antibodies can neutralize
infection through at least two routes (competing for receptor binding or stabilizing the pre-fusion
conformation of the S protein), and to investigate which mechanism of neutralization was occurring,
an SPR binding assay was again employed.
The 9.8.E12 antibody was shown to bind the Class 1 RBD with a dissociation constant
(Kd) of 66 nM (Figure 16). To test whether this antibody neutralizes by blocking receptor
binding, a competition assay was conducted (Figure 17). A 200 nM concentration of the Class 1
RBD produced a plateau signal of 15 RUs on an hAPN coupled SPR chip. The same solution
with 2.0 µM 9.8.E12 antibody added produced no discernible signal. Together these results
Figure 14. N-linked glycan at H-site produces a steric clash with a docked RBD. The superimposition of the crystal structures of the Class 1 RBD:hAPN complex and pAPN reveals that the glycan at N286 would prohibit binding of the RBD at the H-site. The RBD is shown in brown, hAPN in green, and pAPN in lavender. The steric clash can be observed at the displayed disulfide bond.
indicate that the 9.8.E12 antibody binds the RBD and that this interaction prevents binding to
hAPN.
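A rough equilibrium calculation suggests why so little signal would be expected in the competition experiment. The sketch below assumes simple 1:1 binding in solution, that the 66 nM Kd measured on the chip also applies in solution, and that the antibody concentration can be treated as the concentration of binding sites (an IgG has two, so this is conservative). It does not model the hAPN on the chip surface.

```python
from math import sqrt


def complex_concentration(total_a, total_b, kd):
    """Equilibrium concentration of the A:B complex for 1:1 binding (same units throughout)."""
    s = total_a + total_b + kd
    return (s - sqrt(s * s - 4.0 * total_a * total_b)) / 2.0


rbd_total_nm = 200.0        # Class 1 RBD in the injected solution
antibody_total_nm = 2000.0  # 9.8.E12 antibody, treated as one site per molecule
kd_nm = 66.0                # RBD-antibody Kd from the SPR titration

bound_nm = complex_concentration(rbd_total_nm, antibody_total_nm, kd_nm)
free_fraction = (rbd_total_nm - bound_nm) / rbd_total_nm

print(f"RBD bound to antibody: {bound_nm:.1f} nM")
print(f"fraction of RBD left free: {free_fraction:.3f}")
```

Under these assumptions only a few percent of the RBD remains free to bind hAPN, which is consistent with the loss of the 15 RU plateau.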
The 9.8.E12 antibody, now shown to neutralize through binding of the HCoV-229E S
protein RBD, was tested for its ability to bind the RBDs of the other classes. The Class 1 through
6 RBDs were flowed over an antibody-coupled chip at a concentration of 1.0 µM, 15 times the
Kd of the interaction between the Class 1 RBD and 9.8.E12 (Figure 18). No signal was observed
for Classes 2 through 6, an outcome indicating that this antibody is specific to the Class 1 RBD.
Because these RBDs vary only in the loop region, this assay also shows that the antibody
recognizes an epitope present on the receptor-binding loops and not in the β-sandwich region of
the RBD. This is strong evidence that the receptor-binding loops of HCoV-229E elicit a
neutralizing antibody response and that mutations in these loops may be sufficient to prevent
antibody binding.
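The choice of 1.0 µM can be put in context with the standard 1:1 binding isotherm, in which the fraction of immobilized antibody occupied at equilibrium is C / (C + Kd). The sketch below applies that formula to the Class 1 interaction and to a few weaker affinities; the weaker Kd values are hypothetical and are included only to show how much affinity would have to be lost before the signal disappears.

```python
def fractional_occupancy(concentration, kd):
    """Equilibrium fraction of sites occupied for 1:1 binding (same units for both inputs)."""
    return concentration / (concentration + kd)


analyte_nm = 1000.0   # RBD concentration injected over the antibody chip
class1_kd_nm = 66.0   # measured Kd for the Class 1 RBD

class1_occupancy = fractional_occupancy(analyte_nm, class1_kd_nm)
print(f"Class 1 (Kd 66 nM): {class1_occupancy:.2f}")

# Hypothetical weaker affinities, for illustration only.
for fold_weaker in (10, 100, 1000):
    kd = class1_kd_nm * fold_weaker
    print(f"{fold_weaker}x weaker (Kd {kd:.0f} nM): {fractional_occupancy(analyte_nm, kd):.2f}")
```

Because an RBD binding even tenfold more weakly than Class 1 would still occupy a large share of the sites at this concentration, the complete absence of signal for Classes 2 through 6 points to a much larger loss of affinity.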
Several Class 1 RBD mutants previously produced were also tested for cross-reactivity
with 9.8.E12. Full titrations of the F318A and N319A mutants were performed and their binding
Figure 15. Introduction of an N-linked glycan to the H-site on hAPN prohibits binding of all six Classes of RBD. SPR data for the six classes of RBD interacting with the E291N/K292E/Q293T triple mutant. All RBDs were injected at their respective Kd with WT hAPN. All sensorgrams show no signal.
kinetics and affinity are shown in Figure 19. Both mutants showed a moderately reduced binding
affinity when compared to the WT Class 1 RBD but binding was not completely abrogated. This
suggests that these residues are not key to the RBD-antibody interaction.
Figure 16. The 9.8.E12 antibody binds the HCoV-229E Class 1 RBD. Surface plasmon resonance data for the interaction of the 9.8.E12 antibody with the Class 1 RBD. Raw data is shown in black while the global fit is shown in red. Interaction kinetics and affinity are shown in the top right. The antibody was covalently attached to the chip while the RBD was injected. Solutions of RBD for injection were obtained starting from a maximum concentration of 1.18 µM and conducting a 2-fold dilution.
Figure 17. The 9.8.E12 antibody competes with hAPN for RBD binding. Surface plasmon resonance data for a Class 1 RBD interaction with hAPN. hAPN is covalently linked to the chip and the Class 1 RBD was injected. 200 nM RBD produces a plateau of 15 RUs, while 200 nM RBD in solution with 2.0 µM 9.8.E12 antibody shows no binding to hAPN.
Figure 18. The 9.8.E12 antibody binds only the Class 1 RBD. Surface plasmon resonance data for the Class 1-6 RBD interaction with the 9.8.E12 antibody. The 9.8.E12 antibody is covalently linked to the chip and each of the Class 1-6 RBDs was injected at 1.0 µM. The Class 1 RBD at 1.0 µM produces a plateau of 180 RUs, while the Class 2-6 RBDs at 1.0 µM show no signal, indicating no binding.
Figure 19. The 9.8.E12 antibody binds both loop 1 mutants. Surface plasmon resonance data for the F318A and N319A RBD mutant interaction with the 9.8.E12 antibody (top and bottom respectively). The 9.8.E12 antibody is covalently linked to the chip and RBD mutants were injected. Maximum concentrations of 0.839 µM and 1.18 µM, respectively, were serially diluted by 2-fold to obtain the lower sensorgrams.
2.2 Structural Biology of the Evolving RBD-Receptor Interaction
2.2.1 Crystallization of hAPN Requires Deglycosylation
hAPN is a glycoprotein with 10 N-linked glycosylation sequons (NXS/T) and the
previously solved crystal structure of hAPN shows that all 10 sites are glycosylated (Figure 20).
Glycoproteins present a special problem for crystallographers as the chemical and
conformational heterogeneity of their attached glycans can hinder crystal formation. Chemical
heterogeneity is reduced by expression in a GnT1(-/-) cell line, leading to glycoproteins whose N-
glycans do not get processed beyond the Man5GlcNAc2 intermediate. These glycans are still
large and highly flexible. In order to reduce the size and heterogeneity of these N-glycans, an
endoglycosidase was employed. Endo-β-N-acetylglucosaminidase A and H (EndoA and EndoH)
both recognize high-mannose type glycans and are able to enzymatically remove most of the
sugar moiety, leaving only the asparagine-linked N-acetylglucosamine residue. Previous work
has indicated that hAPN will not crystallize without further enzymatic treatment with Jack Bean
α-Mannosidase, an observation suggesting that not all 10 N-glycan sites are susceptible to EndoA/H
cleavage. hAPN crystallizes in its apo form following either route of deglycosylation (either
EndoH or EndoA treatment followed by the α-mannosidase in both cases).
Figure 20. hAPN harbors 10 N-linked glycans. The sequence position and glycosylation potential of the 10 N-linked glycan motifs on the hAPN construct. The hAPN structure previously solved shows evidence of glycosylation at all 10 sites, even though two motifs fall below the arbitrary threshold of the NetNGlyc 1.0 server (Gupta et al. 2004, Wong et al. 2012).
Although EndoH and EndoA have a very similar specificity, they are different enzymes with
different molecular weights. As such, their ability to access a particular N-glycan on a given
substrate might be different. This phenomenon was observed with hAPN. SDS-PAGE with
Coomassie staining showed sharp bands of lowered MW for both EndoH and EndoA treated
hAPN, an indication of quantitative deglycosylation. However, after failed attempts to crystallize
the hAPN:RBD complex, both digests were investigated using MALDI-TOF mass spectrometry.
As seen in Figure 21, hAPN appears as a Gaussian distribution centered around 116 m/z. After
48 hours of EndoH treatment, a large shift of around 8 m/z units is observed but the distribution
loses its Gaussian character by gaining a shoulder at 109.1 m/z, an indication of heterogeneity.
Figure 21. Deglycosylation of hAPN by EndoH yields a heterogeneous product. Shown are two MALDI-TOF curves displaying mass/charge ratio and arbitrary response units for hAPN. The red curve corresponds to the untreated sample and the green curve to that of the sample after EndoH treatment for 48 hours.
Subsequent treatment with α-Mannosidase fails to shift the hAPN further, indicating that there is
no further effect and that all possible glycan substrates have been removed. As mentioned earlier,
hAPN crystallizes in its apo form after this treatment, but crystals of hAPN in complex with
several RBDs could not be obtained.
EndoA treatment appears to have a different effect on hAPN than EndoH. As seen in
Figure 22, two days of EndoA digestion shifts hAPN only about 4 M/Z units and the curve remains
Gaussian in shape. This tells us that EndoA is able to cleave fewer glycans than EndoH, but that
it does so in a more quantitative manner. Subsequent α-mannosidase treatment shifts hAPN about 1
M/Z unit further. The two deglycosylation strategies yield hAPN samples that differ with regard
to which carbohydrate structures remain. Since these samples have similar masses, the differences
are missed when only SDS-PAGE is used for analysis.
Figure 22. Deglycosylation of hAPN by EndoA yields a homogeneous product. Shown are two MALDI-TOF curves displaying mass/charge ratio and arbitrary response units for hAPN. The yellow curve corresponds to 48 hours of EndoA digestion and the blue curve to the same hAPN sample after an additional 48 hours of α-Mannosidase treatment.
2.2.2 HCoV-229E S Protein RBD Classes 3, 4, and 5 Crystal Complexes with Their Receptor hAPN.
Following deglycosylation of hAPN with EndoA and α-Mannosidase, diffracting crystals
of RBD Classes 3, 4, and 5 in complex with their receptor hAPN were obtained. These three
crystal structures were solved and statistics can be found in Table 5. All three crystals belong to
space group P21 and exhibit very similar packing of the crystal lattice. The unit cell is
nearly identical for the Class 4 and 5 structures. The Class 3 structure has a c axis about 6 Å
longer than those of the Class 4 and 5 structures. The asymmetric unit (ASU) contains an hAPN
dimer with an RBD bound to each monomer. The space group, unit cell and contents of the
asymmetric unit are different for the previously reported Class 1 RBD complex. Despite
crystallizing under similar conditions, the Class 1 complex crystallizes in a P3121 space group
and the ASU contains three hAPN monomers and the three RBDs that interact with them.
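The unit cell comparison above can be made concrete with a short calculation. The sketch below is illustrative only: it takes the a, b, c and β values listed in Table 5 for the three P21 crystals and applies the standard monoclinic volume formula V = a·b·c·sin(β). The function and variable names are hypothetical and are not part of the original analysis.

```python
import math

def monoclinic_volume(a, b, c, beta_deg):
    """Volume (in cubic angstroms) of a monoclinic cell, where alpha = gamma = 90 degrees."""
    return a * b * c * math.sin(math.radians(beta_deg))

# Unit cell parameters (a, b, c in angstroms; beta in degrees) as listed in Table 5.
cells = {
    "Class 3": (99.32, 98.52, 153.62, 104.437),
    "Class 4": (99.158, 98.572, 147.778, 104.417),
    "Class 5": (99.5109, 98.6599, 147.45, 104.53),
}

volumes = {name: monoclinic_volume(*params) for name, params in cells.items()}

for name, volume in volumes.items():
    # Report each volume relative to the Class 4 cell for easy comparison.
    ratio = volume / volumes["Class 4"]
    print(f"{name}: V = {volume:,.0f} cubic angstroms ({ratio:.3f} x Class 4)")
```

Under these inputs the Class 3 cell comes out a few percent larger than the Class 4 and 5 cells, which is what the longer c axis would lead one to expect.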
As expected, the overall “backbone” structures of the RBDs are remarkably similar. The
Class 3, 4, and 5 RBDs all maintain the same beta-sandwich domain and vary only in the loop
regions (Figure 23). The crystal structures confirm that they all bind hAPN in a fashion very
similar to that observed for the Class 1 RBD. In all cases, the RBD surface area buried on
complex formation is very similar. The largest difference is between Class 1 with a buried
surface area (BSA) of 513 Å2 and Class 3 with a BSA of 537 Å2, a difference of only 4.7%
(Table 6). A view of the four interfaces can be found in Figure 24. hAPN can exist in a fully
“open” form where its catalytic site is solvent exposed, a more compact fully “closed” form
where the catalytic site is not exposed, or the spectrum of conformations between these terminal
states (Wong et al. 2012). The previously obtained Class 1 RBD structure contained hAPNs that
were all in the “closed” conformation. In the three other structures, at least one of the hAPNs in
the asymmetric unit is partially open and this change may contribute to crystallization in
different space groups.
2.2.3 HCoV-229E Class 1 RBD-hAPN Complex Provides a Foundation for Comparison.
The HCoV-229E Class 1 RBD-hAPN crystal structure shows that the interaction is
mediated exclusively by loop 1 (residues 308-325), loop 2 (residues 352-359), and loop 3
(residues 404-408) of the RBD. Loop 1 is the largest by residue count and by the buried surface
area (BSA) created on complex formation with hAPN. It accounts for 68% of the total BSA and
makes several notable interactions with hAPN. The GGG motif present at residues 313-315
accounts for 25% of the total BSA and it contains backbone NH groups that act as hydrogen
bond donors to the side chain carboxyl oxygens of hAPN residue D288 and the backbone
carbonyl oxygen of hAPN residue Y289 (Table 7). As demonstrated by the SPR experiments
described above, mutation of either of these two hAPN residues leads to greatly reduced binding
(Table 4). RBD residue N319 is central to the interaction and its side chain makes two hydrogen
bonds to the backbone NH and carbonyl oxygen of hAPN residue E291. Also of importance in
loop 1 is the disulfide bond between C317 and C320. This disulfide makes stacking interactions
with hAPN and it likely structures loop 1. The C317S/C320S RBD double mutant cannot bind
hAPN (Table 3). Loop 2 accounts for the smallest fraction of the BSA at just 9%. It contains
residue R359 that forms a salt bridge with D315 of hAPN. No mutagenic analysis was performed
with any loop 2 residues. Loop 3 contributes the remaining 23% of the BSA for the RBD-hAPN
interaction. W404 of this loop is important as its bulky side chain makes both intra-RBD
interactions as well as interactions with the sidechains of residues V290 and L318 on hAPN.
Mutation of this residue to alanine eliminates binding. S407 and K408 of loop 3 participate in
hydrogen bonds with K292 and E291 of hAPN respectively.
Figure 23. Superimposition of Class 1, 3, 4, and 5 RBD. Shown is the superimposition of all four RBD structures. The RMSD for any pairwise comparison is less than 1.1 Å across all shared α-carbons. Dotted lines represent unbuilt regions.
Table 5. Data collection and refinement statistics.

| | Class 3 RBD:hAPN | Class 4 RBD:hAPN | Class 5 RBD:hAPN | hAPN Glycosylation Mutant |
|---|---|---|---|---|
| Resolution range (Å) | 44.3 - 3.1 (3.2 - 3.1) | 48.0 - 2.75 (2.85 - 2.75) | 47.58 - 2.5 (2.589 - 2.5) | 29.2 - 2.0 (2.07 - 2.0) |
| Space group | P 1 21 1 | P 1 21 1 | P 1 21 1 | P 64 |
| Unit cell a, b, c (Å); α, β, γ (°) | 99.32 98.52 153.62 90 104.437 90 | 99.158 98.572 147.778 90 104.417 90 | 99.5109 98.6599 147.45 90 104.53 90 | 159.53 159.53 115.854 90 90 120 |
| Total reflections | 189210 (20035) | 272415 (27367) | 351849 (36261) | 1312614 (122519) |
| Unique reflections | 52290 (5206) | 71808 (7122) | 95596 (9507) | 112832 (11236) |
| Multiplicity | 3.6 (3.8) | 3.8 (3.8) | 3.7 (3.8) | 11.6 (10.9) |
| Completeness (%) | 99.77 (99.81) | 99.84 (99.89) | 99.88 (99.95) | 99.96 (100.00) |
| Mean I/sigma(I) | 12.10 (3.36) | 16.18 (3.87) | 10.37 (3.91) | 18.19 (2.12) |
| Wilson B-factor | 86.9 | 48.97 | 40.22 | 37.02 |
| R-merge | 0.06664 (0.427) | 0.06827 (0.4231) | 0.07703 (0.2766) | 0.1019 (0.9583) |
| R-meas | 0.07817 (0.497) | 0.07962 (0.4921) | 0.09019 (0.3218) | 0.1066 (1.006) |
| R-pim | 0.04057 (0.2532) | 0.04078 (0.2503) | 0.04633 (0.1628) | 0.03103 (0.3037) |
| CC1/2 | 0.998 (0.934) | 0.997 (0.908) | 0.995 (0.942) | 0.999 (0.784) |
| CC* | 0.999 (0.983) | 0.999 (0.975) | 0.999 (0.985) | 1 (0.938) |
| Reflections used in refinement | 52263 (5203) | 71791 (7121) | 95567 (9507) | 112828 (11238) |
| Reflections used for R-free | 2613 (260) | 1073 (107) | 4868 (472) | 1288 (127) |
| R-work | 0.2162 (0.2987) | 0.2049 (0.2907) | 0.2096 (0.3224) | 0.1805 (0.2832) |
| R-free | 0.2618 (0.3405) | 0.2450 (0.2891) | 0.2327 (0.3365) | 0.2023 (0.2925) |
| CC(work) | 0.922 (0.879) | 0.940 (0.830) | 0.943 (0.764) | 0.959 (0.855) |
| CC(free) | 0.882 (0.795) | 0.914 (0.858) | 0.931 (0.709) | 0.956 (0.787) |
| Number of non-hydrogen atoms | 16136 | 16541 | 16521 | 8104 |
| macromolecules | 15951 | 15985 | 16028 | 7281 |
| ligands | 181 | 324 | 290 | 130 |
| solvent | 180 | 232 | 203 | 693 |
| Protein residues | 1991 | 1991 | 2012 | 892 |
| RMS(bonds) | 0.002 | 0.002 | 0.003 | 0.006 |
| RMS(angles) | 0.5 | 0.52 | 0.59 | 0.71 |
| Ramachandran favored (%) | 95.13 | 96.09 | 96.50 | 97.64 |
| Ramachandran allowed (%) | 4.82 | 3.91 | 3.5 | 2.36 |
| Ramachandran outliers (%) | 0.05 | 0 | 0 | 0 |
| Rotamer outliers (%) | 0.63 | 0.28 | 0.34 | 0.75 |
| Clashscore | 0.85 | 0.53 | 0.44 | 1.37 |
| Average B-factor | 92.81 | 48.89 | 41.06 | 41.06 |

Statistics for the highest-resolution shell are shown in parentheses.
2.2.4 HCoV-229E Class 3 RBD-hAPN Crystal Complex Shows a Markedly Different Interaction When Compared to the Class 1 RBD.
The Class 3 RBD-hAPN crystal structure shows several important differences when
compared to the Class 1 structure. In loop 1, S312 and G313 contribute far less BSA to the
interaction compared to their Class 1 counterparts. The GGG motif at residues 313-315 is
changed to GVG and the backbone, which lies flat against hAPN in Class 1 to satisfy hydrogen
bonds, has been kinked up and away from hAPN. This leads to the loss of a hydrogen bond
involving RBD residue 313 that is seen in the Class 1 RBD. However, the Class 3 RBD residue
R316, whose side chain occupies the volume vacated by the backbone kink (Figure 25), makes a
highly favorable salt bridge with hAPN residue D288. The Class 3 residue R316 accounts for
three times the BSA of Class 1 residue K316 and the bulkier side chain of V314 compared to
G314 adds 9 Å2 to the interface. The C317/C320 disulfide bond is positioned and oriented in a
near identical manner and makes all the same interactions in both complexes. The same is
true of F318 and N319. The dual hydrogen bonds made by N319 are slightly longer in the Class
3 structure, but the residue is supported intramolecularly by N406. N406 makes hydrogen bonds
with both the backbone and side chain of N319 (interactions not observed in the Class 1
complex) and this highly networked and orienting interaction likely promotes complex formation
(Figure 26).
Figure 24. Interface details of HCoV-229E S protein Class 1, 3, 4, and 5 in complex with hAPN. Shown are key residues present in the interface between hAPN and the Class 1 RBD (A), the Class 3 RBD (B), the Class 4 RBD (C), and the Class 5 RBD (D).
Loop 2 contains a 2 amino acid deletion in Class 3 relative to that of Class 1. Residues
V353 and Y354 are lost, and this leads to a tighter turn that repositions R359. R359 in Class 1
makes a hydrogen bond with hAPN residue D315 and it accounted for 7.5% of the total BSA. In
the Class 3 RBD, R357 makes two additional hydrogen bonds and nearly doubles its contribution
to the BSA of the interaction with hAPN (Figure 27).
Loop 3 also shows major changes in the Class 3 complex relative to that of Class 1 and
these changes are likely to affect binding. Class 1 RBD residue W404, a residue that makes both
intra- and intermolecular interactions, as discussed above, is mutated to L402 in the Class 3
RBD. This side chain is able to accomplish the intramolecular apolar packing that W404
participated in, but it makes no contact with hAPN in the Class 3 complex. Class 1 residue S407
has mutated to L405 in Class 3 and a hydrogen bond to hAPN is lost in the process. However,
L405 makes a larger contribution to the BSA than does W404 in their respective complexes.
Together, L402 and L405 of the Class 3 RBD make a key contribution to the Class 3 complex
(Figure 28). The remaining differences between the Class 1 and Class 3 RBDs are peripheral
to the receptor interaction (Figure 29).
Table 6. Buried surface area analysis of RBD Classes 1, 3, 4, and 5 in complex with hAPN.
Table 7. Hydrogen bond and salt bridge analysis of four crystal complexes.
2.2.5 The HCoV-229E Class 4 RBD-hAPN Interaction Maintains Many Features of the Class 3 Interaction.
Several subtle differences exist between the Class 3 and 4 RBD-hAPN complexes. Loop
1 is almost identical as the GVG motif is maintained along with the contributions of crucial
residues R316, N319, and the disulfide bond. F318 has mutated to Y318 in a conservative
manner. This bulky residue still contributes over 12% of the total BSA and participates in
hydrogen bonds with Y289 on hAPN. The additional hydroxyl group of tyrosine compared to
phenylalanine is presented toward the solvent and this increase in polar character may be
favorable. All interactions highlighted for the Class 3 structure are maintained in the Class 4
structure and no additional mutational differences exist at the protein-protein interface. However,
there are differences in the loops and supporting residues that are not central to the interaction
(Figure 27). Residues N307, R311, Q349, K354, D356, M399, N404, and H408 do not
contribute to the buried surface but nonetheless differ between Class 3 and 4.
Figure 25. R316 of the Class 3 RBD fills the volume vacated by the kinked backbone. Shown is the Class 1 RBD in brown, the Class 3 RBD in green, and hAPN in black. The Class 1 RBD makes hydrogen bonds with hAPN residue D288 using residues G313 and G315 (dotted line). Class 3 loses these interactions and compensates using R316.
2.2.6 HCoV-229E Class 5 RBD-hAPN Interaction Shows a Moderately Changed Loop 1.
The Class 5 RBD binds hAPN with the highest affinity of any tested construct (Kd of
27.0 nM). This is nearly 10-fold tighter than the Class 4 RBD (Kd of 261 nM). The major
difference between the Class 4 and Class 5 RBD-hAPN complexes occurs in loop 1. The GVG
motif from residues 313-315 has been mutated to GPG. The addition of this proline has changed
the backbone conformation of this region of loop 1. This is the second class in which the middle
residue of the Class 1 GGG motif has been changed. Proline residues have stricter
Ramachandran requirements than glycine and valine residues and the transition from valine to
proline here likely reduces loop flexibility. Additional mutational differences are not at the site
of interaction and are unlikely to affect binding. The other residues that are changed in the Class
5 RBD relative to that of Class 4 are exposed to solvent (Figure 29). Antibodies that bind to
these locations on the RBD may sterically occlude binding to hAPN and changes to these
residues may facilitate immune evasion.
Figure 26. N406 of the Class 3 RBD provides supporting hydrogen bonds to N319. Shown is the Class 3 RBD (green) in complex with hAPN (black). N319 makes two hydrogen bonds with the E291 backbone and this residue has been shown to be essential to the interaction. It is oriented into position by two hydrogen bonds provided by N406.
Figure 27. A two residue deletion in loop 2 of the Class 3 RBD helps position R357 more favorably. Shown is the Class 1 RBD in brown and Class 3 in green. hAPN is in black. The loss of two residues moves R357 (Class 3) away from the position of R359 (Class 1), bringing the salt bridge partners closer together and making the interaction more favorable. Class 3 interactions are shown with a dotted line and Class 1 interactions are shown with a dashed line.
Figure 28. L402 and L405 replace W404 in more recently isolated RBD sequences. Shown is the Class 1 RBD in brown and Class 5 in blue. hAPN is in black. The volume occupied by W404 in the Class 1 structure is supplanted by two leucine residues, L402 and L405. This feature is shared by the Class 3 and 4 structures and the Class 5 is displayed as it is the highest resolution structure obtained (2.5 Å).
Figure 29. Mutations not located at the site of hAPN interaction cluster at the top and sides of the RBD. The HCoV-229E RBD overlaid on the NL63-CoV trimer and colored. Residues not located at the RBD-hAPN interface that have changed from the previous structure are shown in red. Class 3 changes from Class 1 in panels A and B, Class 4 changes from Class 3 in C and D, and Class 5 changes from Class 4 in panels E and F.
Chapter 3
Discussion
3.1 A Ladder-Like Phylogeny and Immune Evasion
Phylogenetic analysis of the RBD sequences of HCoV-229E viruses sampled over the
past 50 years shows that they segregate into six classes. Moreover, the six classes have been
found to successively replace each other in the human population over this time period. This
“ladder-like” phylogeny has been observed in the Influenza H3N2 HA1 protein and in intra-host
sampling of the HIV-1 Env protein. In both cases, it has been attributed to the selection of
phenotypes that are able to escape an immune response (Grenfell 2004). The vast majority of the
mutational differences observed between the HCoV-229E S protein RBD classes are found in
the three receptor binding loops (Figure 9). Loops of this kind are known to be highly
immunogenic (Corti et al. 2013) and they elicit an antibody response in the case of the HIV-1 Env
protein and the TGEV S protein (Kim et al. 2003, Reguera et al. 2012). We showed that the
receptor-binding loops of the HCoV-229E S protein RBD are the site of binding of the
neutralizing antibody, 9.8.E12, and that loop variation can abrogate antibody binding. It follows
that loop mutations would facilitate immune evasion.
HCoV-229E is a pandemic virus and surveillance indicates that it circulates on all
continents except Antarctica. The constructed phylogeny shows these viruses all share a common
lineage (Figure 7). The emergence of viral classes can be explained as follows: an HCoV-229E
strain containing an S protein with the Class 1 RBD propagates through individuals during the
cold season. This infection is cleared and the persons that were infected now possess neutralizing
antibodies, an asset that prevents additional infections from viruses with identical or very similar
RBDs. The virus continues infecting individuals the following season but this time has fewer
potential targets. This continues for several years until a certain percentage of the population is
protected, typically around 90% (Fine 1993). At this time, a viral variant able to escape this
"herd immunity" is at a competitive advantage. The new variant may be fixed in the population
at this time or it may require further optimization before a new RBD class emerges. This model
is supported by studies that show periods of low HCoV-229E circulation following years of
high circulation (Cabeca et al. 2013), and the observation that HCoV-229E infection
does not necessarily provide an individual with protection from future infections (Reed 1984). It
is certainly possible that the HCoV-229E receptor-binding loop variation, and the emergence of
new RBD classes, is strictly the consequence of the abrogation of neutralizing antibody binding.
However, it is also possible that other driving forces for loop variation exist as discussed below.
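The protected fraction quoted above can be related to transmissibility through the standard textbook threshold relation. This is offered only as an illustrative aside: the basic reproduction number R_0 is a symbol introduced here as an assumption and is not estimated for HCoV-229E anywhere in this work.

```latex
% Herd immunity threshold p_c for a pathogen with basic reproduction number R_0
% (simple homogeneous-mixing model; R_0 is an assumed symbol, not a measured value).
p_c = 1 - \frac{1}{R_0}
\qquad\Longrightarrow\qquad
p_c \approx 0.9 \;\text{ corresponds to }\; R_0 \approx 10
```

In this simple model, a threshold near 90% would imply a highly transmissible virus, so the figure is best read as an upper-range illustration rather than a measured property of HCoV-229E.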
3.2 The HCoV-229E RBD Affinity for hAPN has Changed Over Time.
The receptor-binding affinity and kinetics of the six HCoV-229E S protein RBDs were
measured and several interesting patterns emerged. Firstly, the affinity of the RBDs for their
cellular receptor hAPN has shown a tendency to increase over the past 50 years. The Class 1
RBD (from a sample first isolated in 1967) has an affinity of 434 nM, while the Class 5 and 6
RBDs (from samples isolated in the 2000s and 2010s) have an affinity of 27 and 37 nM,
respectively (Table 2). This increase in affinity appears to have been selected for over time and
the directionality suggests that optimization is occurring. Enveloped viruses possess receptor-
binding proteins that bind their receptors with a wide range of affinity: influenza binds in the
millimolar range, and HIV binds in the nanomolar range, for instance (Skehel et al. 2000,
Ugolini et al. 1999). For a given virus and cell-surface receptor density, there is an affinity
threshold where membrane fusion is achieved and above which fusion is not improved
(Hasegawa et al. 2007). Why then is a 16-fold increase in affinity observed between the Class 1
and Class 5/6 RBDs, when this threshold has apparently been met by viruses with the Class 1
RBD? The answer may be related to the need to evade a polyclonal antibody response in the case
of a host infection. As mentioned in the previous section, mutations in the receptor-binding loops
are able to prevent a monoclonal antibody from binding, but the true, in vivo immune response to
viral infection would be polyclonal. Coronaviruses like SARS-CoV and other enveloped viruses
like influenza A have surface glycoproteins with multiple epitopes (He et al. 2005, Hensley et al.
2009). Evading a polyclonal antibody response through the abrogation of antibody binding might
require mutations at several distinct sites - an unlikely event. However, for neutralizing antibodies
that compete for receptor binding, an increase in receptor binding affinity is thought to be an
additional route to immune evasion and one that would work even for a polyclonal antibody
response (Hensley et al. 2009). Our 9.8.E12 antibody binds the Class 1 RBD with an affinity of
66 nM, an affinity seven-fold higher than the Class 1 RBD-hAPN interaction. An increase in
receptor binding affinity will help the receptor outcompete neutralizing antibodies for interaction
with the RBD. Indeed, the Class 5 and 6 RBDs bind hAPN two to three-fold tighter than the
9.8.E12 antibody binds the Class 1 RBD.
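The competition argument can be illustrated with a minimal equilibrium sketch. It assumes a single RBD site that binds either receptor or antibody, with both ligands in excess so their free concentrations are constant. The Kd values are those reported in the text (434 nM and 27 nM for hAPN, 66 nM for 9.8.E12 against the Class 1 RBD). The 100 nM concentrations are hypothetical, as is the assumption that the antibody binds both RBDs equally well, which is made only to isolate the effect of receptor affinity. Receptors on a cell surface are not in free solution, so the numbers are qualitative at best.

```python
def fraction_receptor_bound(receptor_nm, kd_receptor_nm, antibody_nm, kd_antibody_nm):
    """Fraction of RBD occupied by receptor when receptor and antibody
    compete for the same site (simple mutually exclusive binding model)."""
    receptor_term = receptor_nm / kd_receptor_nm
    antibody_term = antibody_nm / kd_antibody_nm
    return receptor_term / (1.0 + receptor_term + antibody_term)

# Kd values taken from the text (nM).
KD_HAPN = {"Class 1": 434.0, "Class 5": 27.0}
KD_ANTIBODY = 66.0  # 9.8.E12 vs Class 1 RBD; ASSUMED identical for Class 5 here

# Hypothetical effective concentrations (nM), chosen only for illustration.
RECEPTOR_NM = 100.0
ANTIBODY_NM = 100.0

occupancy = {
    rbd_class: fraction_receptor_bound(RECEPTOR_NM, kd, ANTIBODY_NM, KD_ANTIBODY)
    for rbd_class, kd in KD_HAPN.items()
}

for rbd_class, fraction in occupancy.items():
    print(f"{rbd_class}: {fraction:.1%} of RBD bound to receptor")
```

With these assumed concentrations, the tighter-binding Class 5 RBD is predominantly receptor-bound while the Class 1 RBD is predominantly antibody-bound, which is the qualitative point made above.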
Another interesting aspect of the increased affinity of the RBD-hAPN interaction is that it
appears to be almost exclusively the result of a slowing of the off-rate (koff). The affinity of an
interaction, or its Kd, is the quotient of koff and kon and changes in either rate will change the
affinity. Biochemically, the change from faster to slower off-rates has interesting implications at
the site of receptor engagement, the cell surface. As previously introduced, after a coronavirus
localizes to the cell surface by binding a receptor, it needs triggers to activate the fusion
machinery and it needs the fusion process to progress to completion, two actions assisted by
additional time. For HCoV-229E, fusion activation is achieved by cell-surface transmembrane
serine protease cleavage of the S-protein (Bertram et al. 2013). Additional time attached to the
cell surface would allow for diffusion of such proteases to the proper location and for enzymatic
cleavage to occur. Viruses with slower off-rates may be at a fitness advantage compared to those
with faster off-rates. Similarly, more than one S-protein subunit may need to engage hAPN
molecules to trigger fusion and slower off-rates would facilitate the engagement of a second
subunit before the dissociation of the first one occurs.
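The kinetic argument can be summarized with the standard relations for a simple 1:1 interaction. The 1:1 model and the symbols for half-life and bound fraction are assumptions introduced here for illustration; no measured rate constants are implied.

```latex
% Equilibrium dissociation constant as the quotient of the rate constants
K_d = \frac{k_{\mathrm{off}}}{k_{\mathrm{on}}}

% Half-life of a single RBD-hAPN complex, and the probability that a complex
% formed at t = 0 is still intact at time t (first-order dissociation)
t_{1/2} = \frac{\ln 2}{k_{\mathrm{off}}},
\qquad
P_{\mathrm{bound}}(t) = e^{-k_{\mathrm{off}}\, t}
```

If the on-rate is unchanged, an n-fold decrease in the off-rate lowers Kd n-fold and lengthens the half-life of the complex n-fold, giving proteases and additional S-protein subunits more time to act before dissociation.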
3.3 Crystal Structures and Mutagenesis Shed Light on the RBD-hAPN Interaction.
Crystal structures of protein complexes provide a wealth of information that can be
further explored through mutagenesis and binding studies. Our group’s previously obtained
crystal structure of the Class 1 RBD in complex with hAPN informed the design of several new
RBD and hAPN constructs that were aimed at elucidating and confirming the importance of
several key residues. RBD residues F318, N319, W404, and the disulfide bond formed by C317
and C320 were selected for study and the appropriate mutants were produced.
The F318A mutant led to an RBD that bound hAPN with a 13-fold reduction in affinity
(Table 3), a ∆∆G of 1.5 kcal/mol; this residue is near the center of the RBD-hAPN interacting
surface, the classical definition of a “hot-spot residue” (Li et al. 2004). This residue is a tyrosine
in the Class 2 RBD, it returns to a phenylalanine in the Class 3 RBD, and again appears as a
tyrosine in the Class 4, 5, and 6 RBDs. The residue’s aromatic ring is positioned nearly
identically in all cases and differs only in the presence of the tyrosine hydroxyl group which
points toward solvent (Figure 24). This switching of similar residues may be indicative of a
neutral mutation, but may also provide an example of how a change to an interface residue will
maintain receptor binding, while altering a potential antibody binding epitope (Shiroishi et al.
2006). This hypothesis could be investigated with a suite of antibodies raised against the Class 1
RBD and a F318Y mutant.
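The conversion from a fold change in affinity to a ∆∆G value follows ∆∆G = RT ln(Kd,mutant / Kd,wild-type). The sketch below reproduces the figure quoted for F318A. The temperature of 298.15 K is an assumption, since the measurement temperature is not stated in this section, and the function name is hypothetical.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # kcal/(mol*K)

def ddg_from_fold_change(fold_change, temperature_k=298.15):
    """Binding free energy change (kcal/mol) for a given fold reduction in
    affinity, i.e. Kd(mutant) / Kd(wild-type). Temperature is an assumption."""
    return GAS_CONSTANT_KCAL * temperature_k * math.log(fold_change)

# 13-fold reduction in affinity reported for the F318A mutant (Table 3).
ddg_f318a = ddg_from_fold_change(13.0)
print(f"F318A: ddG = {ddg_f318a:.1f} kcal/mol")
```

At the assumed temperature a 13-fold loss of affinity corresponds to roughly 1.5 kcal/mol, in agreement with the value given in the text.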
The N319A mutant showed no binding at the maximum achievable concentration of 25
µM, an indication that it too is a hot-spot residue. Both the NH and carbonyl oxygen of the side
chain of this asparagine residue make hydrogen bonds with hAPN that are maintained in the
crystal structures of the Class 3, 4, and 5 RBDs. It is likely that this same interaction is present
between hAPN and the Class 2 and 6 RBDs as the residue is one of the six that are conserved in
all 52 deposited sequences. It is noteworthy that this residue forms an important bond with the
backbone of hAPN and not a side chain. Overall protein structure is likely to be maintained
between homologous proteins from different species, while individual residues are more likely to
vary (Sitbon et al. 2007). This feature may have important implications when cross-species
transmission is considered, as less dependence on side chains may allow for more potential hosts.
Mutation of residue 404 from tryptophan to alanine completely abrogates binding. W404
contributes a large hydrophobic surface for apolar packing both within the RBD and across the
RBD-hAPN interface. It follows that a mutation from a large aromatic residue to the much
smaller alanine residue would disrupt the interaction. In all of the other classes, a leucine is
found at this position and interestingly they all bind with higher affinity. In the Class 3, 4, and 5
structures, this leucine (L402 due to a two amino acid deletion) is accompanied by another
leucine at position 405 (L405). The two hydrophobic side chains interact and occupy the same
volume as the tryptophan residue in Class 1 (Figure 28). The Class 2 and 6 RBDs have an
isoleucine and a histidine at these positions, respectively, showing a need for a large and possibly
branched side chain.
3.4 hAPN Mutants Lead to Reduction in RBD Binding Affinity
The hAPN mutants produced all bound the Class 1 RBD at least 10-fold more weakly
than wild-type hAPN did. Overall, this is unsurprising as the mutants selected were thought
to be key to the interaction. The reduction in affinity of mutants does shed some light on the
contributions a particular residue makes to the interaction. The side chain carboxylic group of
D288 makes two hydrogen bonds with backbone amides in the RBD receptor-binding loop 1.
Despite the assumed significance of these two hydrogen bonds, the D288A mutant sees the
smallest reduction in affinity, only 10-fold. In contrast, the I309A mutant has a 25-fold reduction
in affinity.
No hAPN mutant produced completely abolished binding of the Class 1 RBD. This might
suggest that the receptor-binding loops on the RBD are structurally malleable enough to
accommodate changes that occur on the surface of its binding partner. Indeed, these mutant
APNs can be viewed as "homologous receptors" in species closely related to humans. It has been
suggested that differences in receptor sequence are the largest barrier to zoonotic transmission
(Bae et al. 2011) and receptor binding loop plasticity is one means by which differences in
receptor sequence/structure can be overcome (Wong et al. 2017).
3.5 The Use of Loops as Receptor-Binding Motifs Enables HCoV-229E Adaptation and Evolution.
The HCoV-229E RBD utilizes extended loops and not more constrained secondary
structure elements to bind its receptor hAPN. Loop regions in proteins are more likely to tolerate
insertions, deletions, and substitutions when compared to more ordered regions (Chaux et al.
2007, Touriki et al. 2008). In fact, one analysis of yeast orthologs shows that loop regions are 14
times more able to accommodate insertions and deletions than are secondary structure elements
(Tath-Petraczy et al. 2013). Of the 52 RBD sequences available, substitutions, insertions, and
deletions in the receptor binding loops are all observed (Figure 8). This ability to withstand
change enables the RBD to probe more sequence space than would otherwise be allowed and this
diversity has important implications. Perhaps most importantly, it allows the virus to alter its
antigenic surface thereby providing a facile means of abrogating the binding of loop-binding
neutralizing antibodies. As discussed above, the use of loops to bind receptor, and the variability
that they can accommodate, also provides a route to changing receptor binding affinity - another
determinant of viral fitness.
RNA viruses are best described as populations (Lauring et al. 2013) and viral populations
with diversity in their receptor binding loops would also be expected to be well suited to
acquiring new receptor interactions by chance. Among relatives of HCoV-229E that use loops to
bind their receptors, PRCoV binds APN at the P site and not the H site, and HCoV-NL63 binds
ACE2 (Reguera et al. 2012, Wu et al. 2009). These examples provide evidence that receptor
binding loops have provided a route for the acquisition of new receptors or new receptor
interactions. During cross-species transmission, the use of non-conserved receptor interactions in
the new species is not only likely to be rare, as it involves completely new binding modes, but
it will also likely require rounds of viral replication and optimization to produce a sufficiently
fit virus (Dai et al. 2013, Wong et al. 2017). It follows that the ability of receptor binding loops
to sustain mutational change will facilitate both the acquisition and optimization of
non-conserved receptor interactions. The use of conserved receptor interactions between
homologous receptors in the old and new species is comparatively more likely. There are many
barriers to cross-species transmission and these all need to be overcome in order for cross-
species transmission to occur (Plowright et al. 2017). The use of a homologous receptor in the
new host reduces one of the barriers to cross-species transmission and again loop variation and
plasticity will facilitate the process.
HCoV-229E likely crossed the species barrier from bats to humans in the recent past
(Corman et al. 2015) and it is certainly possible that optimization of receptor interactions in
humans is still ongoing. In the Class 1 RBD loop 1, a GGG motif is likely to give this loop
flexibility and likely increased binding promiscuity. Too much flexibility completely eliminates
binding as the loss of the disulfide bond in the C317S/C320S double mutant RBD showed (Table
3). As the HCoV-229E RBD has evolved in the human population, the middle residue in this
GXG motif has changed from glycine in the Class 1 and 2 RBDs to valine in the Class 3 and 4
RBDs to proline in the Class 5 and 6 RBDs (Figure 8). A valine at this position would reduce
loop flexibility and a proline would reduce it even further. There are very few differences in the
RBD-hAPN interface between the Class 4 and 5 RBDs except for this proline and these two
RBDs show the largest difference in binding affinity for consecutive classes (nearly 10-fold). It
is plausible that RBDs with more flexible loops are more promiscuous (thereby promoting new
receptor interactions) and that they bind their receptor less tightly, while less flexible loops are
optimized for one specific interaction. Investigation of how RBDs with mutations designed to
modulate loop flexibility interact with homologous receptors of other species could help confirm
this idea.
3.6 Bats are Unique and Potent Agents of Viral Spread
Bats are known to be a vast reservoir for many virus types and transmission from bats to
humans is thought to be responsible for recent outbreaks of Ebola virus, Hendra virus, Nipah
virus, and the coronaviruses responsible for SARS and MERS (Smith et al. 2013). Bats are the
second largest group of mammals by species number and they exist on all continents except
Antarctica (Nowak 1994). Their ability to travel by powered flight increases their potential to
spread disease and their cohabitation with many related species in a single cave has the potential
to lead to vast viral diversity (Willoughby et al. 2017). Bats also demonstrate unique immune
features that lead to infections without discernible symptoms and they may be able to tolerate
high levels of infection without a large impact on fecundity or longevity (Brook et al. 2015).
Moreover, the barriers to cross-species transmission are lowest among closely related species
(Parrish et al. 2008). Taken together, these factors would explain how viral transmission among
bats has led to the vast viral reservoir that exists. Indeed, recent studies have shown that close
relatives to the human coronaviruses HCoV-229E and HCoV-NL63 are currently circulating in
bats and that the most recent common ancestor of bat and human 229E is thought to have existed
just 200 years ago (Pfefferle et al. 2009). The potential for other bat viruses to infect humans and sustain
human-to-human transmission clearly exists.
Chapter 4
Future Directions
4.1 Short Term Goals
4.1.1 Immediate Experiments
With the completion of several additional experiments, a fuller understanding of the
HCoV-229E S protein RBD and its interactions with hAPN, and with the neutralizing antibody
9.8.E12, can be obtained. Four of the six identified Classes have been crystallized with their
receptor, hAPN, all under very similar crystallization conditions. Crystallization trials with the
Class 2 RBD have not yielded crystals, and the Class 6-hAPN crystals obtained have diffracted to
worse than 10 Å resolution. These two classes are good candidates for further optimization, such
as an additive screen. Additives alter the solubility and crystallization properties
of the protein complex and may yield crystals of the Class 2 complex and higher-quality crystals of the Class 6
complex. Another structural goal not reached in this project was the
determination of the structure of the Class 1 RBD in complex with the 9.8.E12 antibody. The only information obtained
regarding this antibody’s epitope is that it involves the receptor binding loops and that RBD
residues F318 and N319 are not key to the interaction (Figure 19). Inspection of the Class 1
RBD-hAPN and Class 1 RBD-9.8.E12 structures paired with our knowledge of the Class 2 RBD
sequence will shed light on exactly how RBD mutations were able to abrogate the binding of
antibodies like 9.8.E12. If crystals cannot be obtained, a structure of the HCoV-229E S-protein
ectodomain trimer in complex with the 9.8.E12 Fab might be pursued by cryo-EM analysis.
One unexplored aspect of the HCoV-229E S protein RBD-hAPN interaction is whether
or not the observed variation outside of the binding interface has any effect on binding affinity or
kinetics. RBD variation outside of the interface is able to change properties such as surface
charge, hydrophobicity, and possibly the structure of the receptor binding loops. Indeed, residues
that do not appear to interact in any way can show surprising compensatory or antagonistic
epistatic effects (Holmes 2011, Duan et al. 2014). Mutating these variable residues one at a time,
followed by binding analysis at each step, would be one means by which a role for epistatic
interactions could be tested.
4.1.2 Recent Surveillance Has Revealed Additional Human and Animal HCoV-229E RBD Classes
Since this project began in 2015, surveillance of bat and camel populations has shown
that close relatives of HCoV-229E circulate in both of these animals (Corman et al. 2015,
Corman et al. 2016). Phylogenetic analysis of the S protein RBDs of these bat and camel viruses
reveals the existence of at least three bat classes and one camel class (Figure 30). The three bat
classes branch from the phylogenetic tree earlier and appear to be more ancient than the human
or camel RBDs and this relationship is corroborated by a phylogenetic analysis of the RdRp gene
(Corman et al. 2015). The camel RBD is very close in sequence to human Class 2 and this
HCoV-229E related camel virus is able to infect HEK-293 cells that express hAPN (Corman et
al. 2016). Camel APN shares only 11 of the 17 interfacial residues hAPN uses to interact with
the human Class 1 RBD (Figure 1), but the camel RBD is still able to bind and facilitate
infection. No such data exist for the HCoV-229E related bat viruses. Determining the binding
characteristics between human APN and these animal RBDs and, if they interact, solving the crystal structures of the complexes
could shed light on how zoonosis is accomplished. To this end, I have
produced stable HEK-293S cell lines expressing the three bat and one camel RBDs. Western
blotting indicates that these proteins are expressing at levels comparable to their human relatives
(Figure 32A). Purification of the “bat 3” RBD proceeded without problems and both a size
exclusion chromatogram and SDS-PAGE gel indicate a clean, monodisperse sample (Figure
32B).
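The interface comparison quoted above (11 of 17 interfacial residues shared between camel APN and hAPN) reduces to a simple per-position count. The following is a minimal sketch of that arithmetic; the function name is hypothetical and the residue strings are placeholders chosen only to reproduce the 11-of-17 count, not the real H-site sequences.

```python
def interface_conservation(ref_residues, other_residues):
    """Compare two equal-length lists of residues at aligned interface positions.

    Returns (n_shared, n_total, percent_not_conserved).
    """
    if len(ref_residues) != len(other_residues):
        raise ValueError("interface residue lists must be the same length")
    n_total = len(ref_residues)
    # Count positions where both receptors carry the identical residue
    n_shared = sum(1 for a, b in zip(ref_residues, other_residues) if a == b)
    percent_not_conserved = 100.0 * (n_total - n_shared) / n_total
    return n_shared, n_total, percent_not_conserved


# Placeholder residues (NOT real sequence data): 17 positions, 11 identical
human = list("ABCDEFGHIJKLMNOPQ")
camel = list("ABCDEFGHIJKXXXXXX")
shared, total, pct = interface_conservation(human, camel)
print(f"{shared}/{total} shared; {pct:.1f}% not conserved")
```

With real data, the two lists would be taken from the interface columns of an APN alignment, and the same function could be applied to the bat receptor.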
In addition to the new classes of animal RBDs that have recently been observed, new
HCoV-229E S protein RBD sequences have been deposited and they appear to form a new class.
Studying the receptor binding characteristics and crystal structure of this “Class 7” RBD would
shed further light on the role played by loop variation and immune evasion and add an important
data point to either help confirm or reject the observed trend of increased RBD-hAPN affinity
over time.
Currently, the primary sequences of APN from 71 different species are known. Most
important to the future of this project are the sequences from dromedary camels and
Hipposideros bats (accession codes XP_010985051.1 and XP_019495552.1, respectively).
Dromedary camels and Hipposideros bats are known reservoirs of HCoV-229E related viruses
that ostensibly use APN as a receptor. Even though many species of bats share the same habitat
and are in close contact with one another, to date only bats of the genus Hipposideros are known
to harbor HCoV-229E-like viruses (Corman et al. 2015). Only recently has the APN sequence of
a Hipposideros bat been deposited (Dong et al. 2016), enabling relevant structural studies.
Figure 30. Phylogenetic analysis of animal HCoV-229E-related RBDs reveals discrete classes. Shown here is the phylogenetic analysis of bat and camel HCoV-229E-like S protein RBDs. Much like the human RBDs, which are labelled with Roman numerals, they segregate into distinct classes and appear to be closely related to the human virus RBD sequences.
Figure 31. Examination of the H-site of bat and camel APN shows moderate diversity. The sequences of Hipposideros bat APN (bAPN), human APN (hAPN), and dromedary camel APN (dAPN) are shown. Residues that comprise the H-site are highlighted in green. Conserved residues are indicated by an asterisk, semi-conserved residues with dots, and non-conserved residues with blank space. More than 35% of the interfacial residues are not conserved. HCoV-229E-related camel viruses can infect human cells despite the differences at this site, and studies regarding the bat viruses will be informative.
Figure 32. Camel and bat RBDs express well in mammalian tissue culture. A) A Western blot shows that all four animal RBD constructs (bands boxed by dashed line) are expressed well. B) A Superdex-200 chromatogram of the Bat 3 RBD and SDS-PAGE show a clean, monodisperse sample.
4.2 Long Term Goals
4.2.1 Mutations in the Receptor-Binding Loops May Impact Spike Protein Conformational Dynamics
Recent advances in cryo-electron microscopy have enabled visualization of the full-
length S protein trimer of the coronaviruses NL63, MHV, SARS, and MERS (Gui et al. 2016,
Kirchdoerfer et al. 2016, Walls et al. 2016, Yuan et al. 2017). These structures show a high
degree of tertiary structure conservation and it follows that the HCoV-229E S protein trimer is
similar. The RBD of the S protein is dynamic and in certain conformations the RBD’s receptor
binding loops are inaccessible and buried in the trimer interface (Figure 4). Loop residues make
both intra- and intersubunit contacts and loop variation might affect the equilibrium between the
inaccessible “lying” and accessible “standing” states. The exact biological implications of such
an equilibrium are currently unknown but the relevance to both receptor binding and immune
evasion is apparent. Hiding the highly antigenic receptor binding loops would be a route to
immune evasion but at the same time it prevents receptor binding. The equilibrium ratio of the
lying and standing forms of the trimer may therefore be an important determinant of fitness.
Cryo-EM analysis has been used to study these various states for SARS- and MERS-CoV (Yuan
et al. 2017) and we could use the same approach to determine whether or not the ratio of lying
and standing forms differs among the various HCoV-229E classes.
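As a sketch of how such a ratio could be interpreted, assume a simple two-state equilibrium for each RBD. This is an assumption made here for illustration and is not established by the cited structures; the symbols K_eq and f_standing are introduced only for this purpose.

```latex
K_{\mathrm{eq}} = \frac{[\text{standing}]}{[\text{lying}]}, \qquad
f_{\text{standing}} = \frac{K_{\mathrm{eq}}}{1 + K_{\mathrm{eq}}}, \qquad
\Delta G^{\circ} = -RT \ln K_{\mathrm{eq}}
```

Under this assumption, f_standing could be estimated from the proportion of classified cryo-EM particles in each state, and a loop mutation that shifts the free energy difference by RT ln 10 (about 1.4 kcal/mol at 298 K) would change the standing-to-lying ratio tenfold.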
The expression of the soluble ectodomain of the coronavirus S protein trimer has been
aided by several approaches. First, a T4 fibritin trimerization motif can be introduced at the
C-terminus of the ectodomain of the S protein (Kirchdoerfer et al. 2016). This “foldon” domain acts
synergistically with the natural tendency of the S protein to form trimers (Papanikolopoulou et
al. 2003). Second, a double proline mutation introduced into the loop between the HR1 region and the
central helix of the S protein can increase S protein expression levels by up to 50-fold, likely
because it prevents the alpha-helical rearrangements that lead to the post-fusion conformation of
the S protein. These mutations have greatly facilitated cryo-EM analysis (Pallesen et al. 2017).
We will use these approaches to stabilize the HCoV-229E trimer for each of the six classes.
Once this is done, we will use cryo-EM to determine whether loop variation influences the ratio of the
lying and standing forms.
4.2.2 A Cell-Based Assay to Test Viral Fitness is Within Reach
Most of the ideas stemming from this thesis are the result of structural or biophysical
analysis and do not have a cell-based assay to support them. While cell-based infectivity assays
have been conducted using the lab strain HCoV-229E P100E isolate (Wong et al. 2017), no
assays on viruses with S proteins containing RBDs other than Class 1 have been attempted. Our
collaborators are currently working on a BacMid system to create HCoV-229E viruses with
custom genomes and thus viruses corresponding to each of the RBD classes. This will allow for
the testing of the fitness of viral variants in cell-based assays examining, for example, the effect
of receptor binding affinity on fitness.
Chapter 5
Methods
5.1 Sequence Comparison of HCoV-229E S-protein RBD
The protein sequence of the lab strain HCoV-229E P100E isolate RBD (residues 293–
435) was used as a template to search the non-redundant protein sequence database using Blastp
(Camacho et al. 2008). Sequences were compiled on December 1, 2016; additional sequences have
since become available. A total of 52 sequences were obtained with the GenBank Identifier numbers:
NP_073551.1, AAK32188.1, AAK32189.1, AAK32190.1, AAK32191.1, CAA71056.1,
CAA71146.1, CAA71147.1, ADK37701.1, ADK37702.1, ADK37704.1, BAL45637.1,
BAL45638.1, BAL45639.1, BAL45640.1, BAL45641.1, AAQ89995.1, AAQ89999.1,
AAQ90002.1, AAQ90004.1, AAQ90005.1, AAQ90006.1, AAQ90008.1, AFI49431.1,
AFR45554.1, AFR79250.1, AFR79257.1, AGT21338.1, AGT21345.1, AGT21353.1,
AGT21367.1, AGW80932.1, AIG96686.1, ABB90506.1, ABB90507.1, ABB90508.1,
ABB90509.1, ABB90510.1, ABB90513.1, ABB90514.1, ABB90515.1, ABB90516.1,
ABB90519.1, ABB90520.1, ABB90522.1, ABB90523.1, ABB90526.1, ABB90527.1,
ABB90528.1, ABB90529.1, ABB90530.1, AOG74783.1. The 52 sequences were then aligned
using Muscle (Edgar 2004). The protein-coding regions of the eight sequences for which
the entire genome was reported (GenBank Identifier numbers: NC_002645.1, JX503060.1,
JX503061.1, KF514433.1, KF514430.1, KF514432.1, AF304460.1, and KU291448.1) were
aligned using Muscle. The sequence AAK32191.1 was chosen as the representative of Class 1
and the loop sequences of ABB90507.1, ABB90514.1, ABB90519.1, ABB90523.1, and
AFR45554.1 were combined with the non-loop sequences of AAK32191.1 to generate the RBDs
of Classes 2–6, respectively.
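The loop-grafting step described above can be sketched in a few lines. This is an illustrative sketch only: it assumes the donor and scaffold sequences are aligned without gaps, and the function name, toy sequences, and loop ranges are all hypothetical rather than the actual RBD sequences or loop boundaries used in this work.

```python
def graft_loops(scaffold, donor, loop_ranges):
    """Replace loop segments of the scaffold with the donor's residues.

    Both sequences must be aligned to the same length. Each range is
    (start, end), 0-based and end-exclusive.
    """
    if len(scaffold) != len(donor):
        raise ValueError("sequences must be aligned to the same length")
    chimera = list(scaffold)
    for start, end in loop_ranges:
        # Copy the donor residues at the same aligned positions
        chimera[start:end] = donor[start:end]
    return "".join(chimera)


# Toy aligned sequences and loop ranges (hypothetical, not real RBD data)
scaffold = "AAAAAAAAAAAA"   # stands in for the Class 1 non-loop framework
donor = "CCCCCCCCCCCC"      # stands in for a later-class sequence
loops = [(2, 4), (7, 10)]   # stands in for the receptor-binding loops
chimera = graft_loops(scaffold, donor, loops)
print(chimera)
```

Keeping the non-loop framework constant in this way means that any difference measured between classes can be attributed to the loop residues alone.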
5.2 Protein Expression and Purification
The soluble ectodomain of hAPN and related mutants (residues 66–967) were expressed
in and purified from stably transfected HEK293S GnT1(-/-) cells (ATCC CRL-3022) as described
previously (Wong et al. 2012). These cells produce glycoproteins containing only high mannose
N-linked glycans (Chang et al. 2007). The six classes of HCoV-229E S-protein RBD and related
mutants were also expressed and purified from stably transfected HEK293S GnT1(-/-) cells. Point
mutations were generated using the InFusion HD Site-Directed Mutagenesis protocol (Clontech).
In all cases, the target proteins were secreted as N-terminal protein-A fusion proteins with a
Tobacco Etch Virus (TEV) protease cleavage site following the protein-A tag. Harvested medium
was concentrated 10-fold and the target proteins were purified by IgG affinity chromatography (IgG Sepharose, GE). The
bound proteins were liberated by on-column TEV protease cleavage and hAPN was further
purified by anion exchange chromatography (HiTrap-Q) while the HCoV-229E RBDs were
further purified by cation exchange chromatography (HiTrap-SP).
5.3 Surface Plasmon Resonance Assays
Surface plasmon resonance assays were performed on the Biacore-X system using CM-5
dextran chips (GE) covalently coupled to the ligand via amine coupling. The surface of the chip
was activated by 0.05 M EDC and 0.2 M NHS, injected with a flow rate of 10 µL/min for seven
minutes. 50 µg/mL of ligand in 50 mM sodium acetate pH 4.95 was injected until an appropriate
amount was covalently linked. For experiments using hAPN as a ligand, 400 RUs were
immobilized, while for the experiments using the 9.8.E12 antibody, 1900 RUs were
immobilized. 1.0 M ethanolamine injected at 10 µL/min for seven minutes was used to cap
unreacted CM-dextran residues and decrease the overall charge of the matrix. The running and
injection buffers were matched and in all cases consisted of 150 mM NaCl, 0.01% Tween-20, 0.1
mg/ml BSA, and 10 mM HEPES at pH 7.5. Response unit (RU) values were measured as a
function of analyte concentration at 298 K. Kinetic analysis was performed using the global
fitting feature of Scrubber 2 (BioLogic Software) assuming a 1:1 binding model.
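For reference, the 1:1 binding model assumed in the kinetic fits can be sketched as follows. This is a minimal illustration of the standard 1:1 interaction equations and not the Scrubber implementation; the function names and all parameter values are hypothetical and are not measured values from this work.

```python
import math


def association_response(t, conc, ka, kd, rmax):
    """Response (RU) at time t (s) during analyte injection, 1:1 model.

    Solves dR/dt = ka*C*(Rmax - R) - kd*R with R(0) = 0.
    """
    kobs = ka * conc + kd           # observed rate constant (1/s)
    req = rmax * ka * conc / kobs   # steady-state response at this concentration
    return req * (1.0 - math.exp(-kobs * t))


def dissociation_response(t, r0, kd):
    """Response (RU) at time t (s) after the injection ends, starting from r0."""
    return r0 * math.exp(-kd * t)


def equilibrium_constant(ka, kd):
    """Equilibrium dissociation constant KD (M) from the two rate constants."""
    return kd / ka


# Hypothetical parameters for illustration only
ka = 1.0e5     # association rate constant, 1/(M s)
kd = 1.0e-2    # dissociation rate constant, 1/s
rmax = 100.0   # maximum response, RU
conc = 1.0e-7  # analyte concentration, M (equal to KD for these values)

print(equilibrium_constant(ka, kd))
print(association_response(1000.0, conc, ka, kd, rmax))
```

In a global fit, a single pair of ka and kd values is fitted simultaneously to the sensorgrams recorded at every analyte concentration, which constrains the rate constants more tightly than fitting each curve separately.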
5.4 Deglycosylation of hAPN
10.0 mg of purified hAPN was deglycosylated by treatment with 0.5 mg endo-β-N-
acetylglucosaminidase A. The reaction solution was 10 mL in volume and the buffer consisted of
100 mM NaCl and 10 mM MES pH 6.5. After 48 hours, the pH was dropped to 5.0 using 100
mM NaAc pH 4.9 and 20 µL of Jack Bean α-mannosidase (Sigma) was introduced. This enzyme
requires Zn2+ as a cofactor and ZnSO4 was added to a final concentration of 1 mM. After 48
hours, the reaction was complete as determined by SDS-PAGE and MALDI-TOF analysis.
5.5 Protein Crystallization
Crystals of the Class 3, 4, and 5 S protein RBDs in complex with the hAPN ectodomain
were obtained via the same general method. The RBDs and hAPN were mixed in a 1.2:1 ratio
and the resulting complex was purified by gel filtration using a Superdex 200 column (GE) with
a buffer consisting of 50 mM NaCl and 10 mM HEPES at pH 7.4. The purified complexes were
concentrated to 9.5 mg/ml. 1 µg/ml of endo-β-N-acetylglucosaminidase H was introduced into
this solution to remove N-linked glycans still present on the RBDs. Mixing the protein solution with
precipitant consisting of 9% PEG 8000, 1 mM GSSG, 1 mM GSH, 5% glycerol, and 100 mM
MES, pH 6.5 at 298 K in a ratio of 1:1 in hanging drops of 1 µl yielded crystals of low quality
after 48 hours. These crystals were harvested, washed in precipitant solution, and used for
seeding which resulted in higher quality crystals used for diffraction experiments and data
collection.
5.6 Data Collection and Structure Determination
Diffraction data for the Class 3 and 4 RBD complexes were collected at the Canadian
Light Source, Saskatoon, Saskatchewan (Beamline CMCF-08ID-1) at a wavelength of 0.9795 Å.
Data for the Class 5 RBD complex were collected at the Advanced Photon Source. Use of the
Advanced Photon Source at the Argonne National Laboratory was supported by the U. S.
Department of Energy, Office of Science, Office of Basic Energy Sciences, under Contract No.
DE-AC02-06CH11357. Data were merged, processed, and scaled using HKL-2000 (Otwinowski
and Minor 1997). Five percent of the data set was used for the calculation of Rfree for the Class 4 and 5 RBD
complex structures, while 1% of the data was excluded for the Rfree calculation for the Class 3
structure. Phases were obtained by molecular replacement with hAPN (PDB ID: 4FYQ) as the search
model, using Phaser in Phenix (Bunkóczi et al. 2013). Manual building of the HCoV-229E
RBD and hAPN was performed using Coot (Emsley et al. 2010). Refinement was carried out using Phenix.refine
(Afonine et al. 2012). Data collection and refinement statistics are found in Table 5.
Bibliography
Afonine, P., et al. “Towards automated crystallographic structure refinement with phenix.refine.” (2012). Acta Cryst. D68, 352-367.
Anthony, Simon J., et al. “Global Patterns in Coronavirus Diversity.” Virus Evolution, vol. 3, no. 1, 2017, doi:10.1093/ve/vex012.
Bae, Se-Eun, and Hyeon Son. “Classification of Viral Zoonosis through Receptor Pattern Analysis.” BMC Bioinformatics, vol. 12, no. 1, 2011, p. 96., doi:10.1186/1471-2105-12-96.
Belay, Ermias D., and Stephan S. Monroe. “Low-Incidence, High-Consequence Pathogens.” Emerging Infectious Diseases, vol. 20, no. 2, 2014, pp. 319–321., doi:10.3201/eid2002.131748.
Belouzard, Sandrine, et al. “Mechanisms of Coronavirus Cell Entry Mediated by the Viral Spike Protein.” Viruses, vol. 4, no. 6, 2012, pp. 1011–1033., doi:10.3390/v4061011.
Bertram, S., et al. “TMPRSS2 Activates the Human Coronavirus 229E for Cathepsin-
Independent Host Cell Entry and Is Expressed in Viral Target Cells in the Respiratory Epithelium.” Journal of Virology, vol. 87, no. 11, 2013, pp. 6150–6160., doi:10.1128/jvi.03372-12.
Boni, Maciej F. “Vaccination and Antigenic Drift in Influenza.” Vaccine, vol. 26, 2008, doi:10.1016/j.vaccine.2008.04.011.
Bos, Evelyne C.W., et al. “The Production of Recombinant Infectious DI-Particles of a Murine Coronavirus in the Absence of Helper Virus.” Virology, vol. 218, no. 1, 1996, pp. 52–60., doi:10.1006/viro.1996.0165.
Briese, T., et al. “Middle East Respiratory Syndrome Coronavirus Quasispecies That Include
Homologues of Human Isolates Revealed through Whole-Genome Analysis and Virus Cultured from Dromedary Camels in Saudi Arabia.” MBio, vol. 5, no. 3, 2014, doi:10.1128/mbio.01146-14.
Brook, Cara E., and Andrew P. Dobson. “Bats as ‘Special’ Reservoirs for Emerging Zoonotic Pathogens.” Trends in Microbiology, vol. 23, no. 3, 2015, pp. 172–180., doi:10.1016/j.tim.2014.12.004.
Bunkóczi, G., et al. “Phaser.MRage: automated molecular replacement” Acta Crystallogr D Biol Crystallogr 69, 2276-86 (2013).
Cabeca, Tatiane K., et al. “Epidemiological and Clinical Features of Human Coronavirus Infections among Different Subsets of Patients.” Influenza and Other Respiratory Viruses, vol. 7, no. 6, 2013, pp. 1040–1047., doi:10.1111/irv.12101.
Calisher, C. H., et al. “Bats: Important Reservoir Hosts of Emerging Viruses.” Clinical Microbiology Reviews, vol. 19, no. 3, 2006, pp. 531–545., doi:10.1128/cmr.00017-06.
Callebaut, P., et al. “An Adenovirus Recombinant Expressing the Spike Glycoprotein of Porcine
Respiratory Coronavirus Is Immunogenic in Swine.” Journal of General Virology, vol. 77, no. 2, Jan. 1996, pp. 309–313., doi:10.1099/0022-1317-77-2-309.
Camacho C., Coulouris G., Avagyan V., Ma N., Papadopoulos J., Bealer K., & Madden T.L.
(2008) "BLAST+: architecture and applications." BMC Bioinformatics 10:421.
Chan, Jasper F. W., et al. “Middle East Respiratory Syndrome Coronavirus: Another Zoonotic Betacoronavirus Causing SARS-Like Disease.” Clinical Microbiology Reviews, vol. 28, no. 2, 2015, pp. 465–522., doi:10.1128/cmr.00102-14.
Chan, W.-E., et al. “Functional Characterization of Heptad Repeat 1 and 2 Mutants of the Spike Protein of Severe Acute Respiratory Syndrome Coronavirus.” Journal of Virology, vol. 80, no. 7, 2006, pp. 3225–3237., doi:10.1128/jvi.80.7.3225-3237.2006.
Chang VT, Crispin M, Aricescu AR, et al. “Glycoprotein Structural Genomics: Solving the Glycosylation Problem.” Structure. 2007;15(3):267-273. doi:10.1016/j.str.2007.01.011.
Chaux, Nicole De La, et al. “DNA Indels in Coding Regions Reveal Selective Constraints on
Protein Evolution in the Human Lineage.” BMC Evolutionary Biology, vol. 7, no. 1, 2007, p. 191., doi:10.1186/1471-2148-7-191.
Chowell, Gerardo, et al. “Transmission Characteristics of MERS and SARS in the Healthcare
Setting: a Comparative Study.” BMC Medicine, vol. 13, no. 1, 2015, doi:10.1186/s12916-015-0450-0.
Chernomordik, Leonid V., and Michael M. Kozlov. “Protein-Lipid Interplay in Fusion and Fission of Biological Membranes.” Annual Review of Biochemistry, vol. 72, no. 1, 2003, pp. 175–207., doi:10.1146/annurev.biochem.72.121801.16150
Coffey L.L., Beeharry Y., Borderia A.V., Blanc H., Vignuzzi M. “Arbovirus high fidelity variant loses fitness in mosquitoes and mice.” Proceedings of the National Academy of Sciences 2011, 108:16038-16043.
Coffin, J., and R. Swanstrom. “HIV Pathogenesis: Dynamics and Genetics of Viral Populations and Infected Cells.” Cold Spring Harbor Perspectives in Medicine, vol. 3, no. 1, Jan. 2013, doi:10.1101/cshperspect.a012526.
Colman, Peter M. and Lawrence, Michael C. “The Structural Biology of Type I Viral Membrane Fusion.” Nature Reviews Molecular Cell Biology, vol. 4, no. 4, 2003, pp. 309–319., doi:10.1038/nrm1076.
Corman, Victor M., et al. “Evidence for an Ancestral Association of Human Coronavirus 229E with Bats.” Journal of Virology, vol. 89, no. 23, 2015, pp. 11858–11870., doi:10.1128/jvi.01755-15.
Corman, Victor M., et al. “Link of a Ubiquitous Human Coronavirus to Dromedary Camels.” Proceedings of the National Academy of Sciences, vol. 113, no. 35, 2016, pp. 9864–9869., doi:10.1073/pnas.1604472113.
Corti, Davide, and Antonio Lanzavecchia. “Broadly Neutralizing Antiviral Antibodies.” Annual Review of Immunology, vol. 31, no. 1, 2013, pp. 705–742., doi:10.1146/annurev-immunol-032712-095916.
Crossley, Beate, et al. “Identification and Characterization of a Novel Alpaca Respiratory Coronavirus Most Closely Related to the Human Coronavirus 229E.” Viruses, vol. 4, no. 12, 2012, pp. 3689–3700., doi:10.3390/v4123689.
Dai, H.-S., et al. “Directed Evolution of a Virus Exclusively Utilizing Human Epidermal Growth Factor Receptor as the Entry Receptor.” Journal of Virology, vol. 87, no. 20, 2013, pp. 11231–11243., doi:10.1128/jvi.01054-13.
Darwin, Charles. “On the Origin of Species by Means of Natural Selection, Or, the Preservation of Favoured Races in the Struggle for Life.” London: J. Murray, 1859. Print.
de Groot, R. J., et al. “Middle East Respiratory Syndrome Coronavirus (MERS-CoV): Announcement of the Coronavirus Study Group.” Journal of Virology, vol. 87, no. 14, 2013, pp. 7790–7792., doi:10.1128/jvi.01244-13.
Desforges, M. et al. “Neuroinvasive and neurotropic human respiratory coronaviruses: potential neurovirulent agents in humans”. Advances in Experimental Medicine and Biology 807, 75–96 (2014).
Dong, Dong, et al. “The Genomes of Two Bat Species with Long Constant Frequency Echolocation Calls.” Molecular Biology and Evolution, vol. 34, no. 1, 2016, pp. 20–34., doi:10.1093/molbev/msw231.
Du L, He Y, Zhou Y, Liu S, Zheng B-J, Jiang S. “The spike protein of SARS-CoV — a target for vaccine and therapeutic development.” Nature Reviews Microbiology. 2009;7(3):226-236. doi:10.1038/nrmicro2090.
Du, Lanying, et al. “Introduction of Neutralizing Immunogenicity Index to the Rational Design of MERS Coronavirus Subunit Vaccines.” Nature Communications, vol. 7, 2016, p. 13473., doi:10.1038/ncomms13473.
Duan, Susu, et al. “Epistatic Interactions between Neuraminidase Mutations Facilitated the Emergence of the Oseltamivir-Resistant H1N1 Influenza Viruses.” Nature Communications, vol. 5, 2014, p. 5029., doi:10.1038/ncomms6029.
Earp, L. J., et al. “The Many Mechanisms of Viral Membrane Fusion Proteins.” Current Topics in Microbiology and Immunology Membrane Trafficking in Viral Replication, 2005 pp. 25–66., doi:10.1007/3-540-26764-6_2.
Edgar, R. C. “MUSCLE: multiple sequence alignment with high accuracy and high throughput.” Nucleic Acids Research. 32, 1792–1797 (2004).
Emsley, P., et al. “Features and Development of Coot.” Acta Crystallographica Section D
Biological Crystallography, vol. 66, no. 4, 2010, pp. 486–501., doi:10.1107/s0907444910007493.
Farsani, Seyed Mohammad Jazaeri, et al. “The First Complete Genome Sequences of Clinical
Isolates of Human Coronavirus 229E.” Virus Genes, vol. 45, no. 3, 2012, pp. 433–439., doi:10.1007/s11262-012-0807-9.
Fehr, Anthony R., and Stanley Perlman. “Coronaviruses: An Overview of Their Replication and Pathogenesis.” Coronaviruses Methods in Molecular Biology, 2015, pp. 1–23., doi:10.1007/978-1-4939-2438-7_1.
Fendrick, A. Mark, et al. “The Economic Burden of Non-Influenza-Related Viral Respiratory Tract Infection in the United States.” Archives of Internal Medicine, vol. 163, no. 4, 2003, p. 487., doi:10.1001/archinte.163.4.487.
Fine, Paul E. M. “Herd Immunity: History, Theory, Practice.” Epidemiologic Reviews, vol. 15, no. 2, 1993, pp. 265–302., doi:10.1093/oxfordjournals.epirev.a036121.
Gaunt, E. R. et al. “Epidemiology and clinical presentations of the four human coronaviruses
229E, HKU1, NL63, and OC43 detected over 3 years using a novel multiplex real-time PCR method.” J. Clin. Microbiol. 48, 2940–2947 (2010).
Gortazar, Christian, et al. “Crossing the Interspecies Barrier: Opening the Door to Zoonotic
Pathogens.” PLoS Pathogens, vol. 10, no. 6, 2014, doi:10.1371/journal.ppat.1004129.
Grenfell, B. T. “Unifying the Epidemiological and Evolutionary Dynamics of Pathogens.” Science, vol. 303, no. 5656, 2004, pp. 327–332., doi:10.1126/science.1090727.
Gui, Miao, et al. “Cryo-Electron Microscopy Structures of the SARS-CoV Spike Glycoprotein Reveal a Prerequisite Conformational State for Receptor Binding.” Cell Research, vol. 27, no. 1, 2016, pp. 119–129., doi:10.1038/cr.2016.152.
Gupta, R & Jung, E & Brunak, Søren. (2004). “Prediction of N-glycosylation sites in human proteins.” 46. 203-206.
Haan, Cornelis A.M. De, and Peter J.M. Rottier. “Molecular Interactions in the Assembly of Coronaviruses.” Advances in Virus Research Virus Structure and Assembly, 2005, pp. 165–230., doi:10.1016/s0065-3527(05)64006-7.
Harrison, Stephen C. “Viral Membrane Fusion.” Nature Structural & Molecular Biology, vol. 15,
no. 7, 2008, pp. 690–698., doi:10.1038/nsmb.1456.
Hasegawa, K., et al. “Affinity Thresholds for Membrane Fusion Triggering by Viral Glycoproteins.” Journal of Virology, vol. 81, no. 23, 2007, pp. 13149–13157., doi:10.1128/jvi.01415-07.
He, Y., et al. “Receptor-Binding Domain of Severe Acute Respiratory Syndrome Coronavirus
Spike Protein Contains Multiple Conformation-Dependent Epitopes That Induce Highly Potent Neutralizing Antibodies.” The Journal of Immunology, vol. 174, no. 8, 2005, pp. 4908–4915., doi:10.4049/jimmunol.174.8.4908.
Hensley, S. E., et al. “Hemagglutinin Receptor Binding Avidity Drives Influenza A Virus Antigenic Drift.” Science, vol. 326, no. 5953, 2009, pp. 734–736., doi:10.1126/science.1178258.
Hofmann, Heike, et al. “Attachment Factor and Receptor Engagement of Sars Coronavirus and Human Coronavirus NL63.” Advances in Experimental Medicine and Biology The Nidoviruses, 2006, pp. 219–227., doi:10.1007/978-0-387-33012-9_37.
Holm L, Sander C. 1998. “Touring protein fold space with Dali/FSSP.” Nucleic Acids Research. 26:316–319
Holmes, Edward C. “Error Thresholds and the Constraints to RNA Virus Evolution.” Trends in
Microbiology, vol. 11, no. 12, 2003, pp. 543–546., doi:10.1016/j.tim.2003.10.006.
Holmes, Edward C. “The Evolution and Emergence of RNA Viruses.” Oxford University Press, 2011.
Imamura, Hiroshi, and Shinya Honda. “Calibration-Free Concentration Analysis for an Analyte
Prone to Self-Association.” Analytical Biochemistry, vol. 516, 2017, pp. 61–64., doi:10.1016/j.ab.2016.10.013.
Keele, B. F., et al. “Identification and Characterization of Transmitted and Early Founder Virus
Envelopes in Primary HIV-1 Infection.” Proceedings of the National Academy of Sciences, vol. 105, no. 21, 2008, pp. 7552–7557., doi:10.1073/pnas.0802203105.
Kielian, Margaret, and Rey, Félix A. “Virus Membrane-Fusion Proteins: More than One Way to
Make a Hairpin.” Nature Reviews Microbiology, vol. 4, no. 1, 2006, pp. 67–76., doi:10.1038/nrmicro1326.
Kielian, Margaret. “Mechanisms of Virus Membrane Fusion Proteins.” Annual Review of
Virology, vol. 1, no. 1, Mar. 2014, pp. 171–189., doi:10.1146/annurev-virology-031413-085521.
Kim, Young B., et al. “Immunogenicity and Ability of Variable Loop-Deleted Human Immunodeficiency Virus Type 1 Envelope Glycoproteins to Elicit Neutralizing Antibodies.” Virology, vol. 305, no. 1, 2003, pp. 124–137., doi:10.1006/viro.2002.1727.
Kirchdoerfer, R.N., et al. “Prefusion Structure of a Human Coronavirus Spike Protein.” Feb. 2016, doi:10.2210/pdb5i08/pdb.
Koonin, Eugene V, and Valerian V Dolja. “Expanding Networks of RNA Virus
Evolution.” BMC Biology, vol. 10, no. 1, 2012, p. 54., doi:10.1186/1741-7007-10-54.
Kryazhimskiy, Sergey, et al. “Prevalence of Epistasis in the Evolution of Influenza A Surface Proteins.” PLoS Genetics, vol. 7, no. 2, 2011, doi:10.1371/journal.pgen.1001301.
Li, F. “Evidence for a Common Evolutionary Origin of Coronavirus Spike Protein Receptor-
Binding Subunits.” Journal of Virology, vol. 86, no. 5, 2011, pp. 2856–2858., doi:10.1128/jvi.06882-11.
Li, Fang. “Structure, Function, and Evolution of Coronavirus Spike Proteins.” Annual Review of
Virology, vol. 3, no. 1, 2016, pp. 237–261., doi:10.1146/annurev-virology-110615-042301.
Li, W., et al. “Animal Origins of the Severe Acute Respiratory Syndrome Coronavirus: Insight
from ACE2-S-Protein Interactions.” Journal of Virology, vol. 80, no. 9, Dec. 2006, pp. 4211–4219., doi:10.1128/jvi.80.9.4211-4219.2006.
Li, Xiang, et al. “Protein-Protein Interactions: Hot Spots and Structurally Conserved Residues
Often Locate in Complemented Pockets That Pre-Organized in the Unbound States: Implications for Docking.” Journal of Molecular Biology, vol. 344, no. 3, 2004, pp. 781–795., doi:10.1016/j.jmb.2004.09.051.
Li, Wenhui, et al. “Receptor and Viral Determinants of SARS-Coronavirus Adaptation to Human ACE2.” The EMBO Journal, vol. 24, no. 8, 2005, pp. 1634–1643., doi:10.1038/sj.emboj.7600640.
Li, Z., et al. “Simple PiggyBac Transposon-Based Mammalian Cell Expression System for Inducible Protein Production.” Proceedings of the National Academy of Sciences, vol. 110, no. 13, 2013, pp. 5004–5009., doi:10.1073/pnas.1218620110.
Lin, Han-Xin, et al. “Characterization of the Spike Protein of Human Coronavirus NL63 in Receptor Binding and Pseudotype Virus Entry.” Virus Research, vol. 160, no. 1-2, 2011, pp. 283–293., doi:10.1016/j.virusres.2011.06.029.
Lin, Xian-Dan, et al. “Extensive Diversity of Coronaviruses in Bats from China.” Virology, vol. 507, 2017, pp. 1–10., doi:10.1016/j.virol.2017.03.019.
Masters, Paul S. “Reverse Genetics of The Largest RNA Viruses.” Advances in Virus Research, vol. 53, 1999, pp. 245–264., doi:10.1016/s0065-3527(08)60351-6.
Millet, Jean Kaoru, and Gary R. Whittaker. “Host Cell Proteases: Critical Determinants of
Coronavirus Tropism and Pathogenesis.” Virus Research, vol. 202, 2015, pp. 120–134., doi:10.1016/j.virusres.2014.11.021.
Milne, R. S. B., et al. “Glycoprotein D Receptor-Dependent, Low-PH-Independent Endocytic
Entry of Herpes Simplex Virus Type 1.” Journal of Virology, vol. 79, no. 11, Dec. 2005, pp. 6655–6663., doi:10.1128/jvi.79.11.6655-6663.2005.
Nowak, Ronald M. “Walker's Bats of the World.” Johns Hopkins University Press, 1994.
Otwinowski, Z. and Minor, W., “Processing of X-ray Diffraction Data Collected in Oscillation Mode”, Methods in Enzymology, Volume 276: Macromolecular Crystallography, part A, p. 307-326, 1997, C.W. Carter, Jr. & R. M. Sweet, Eds., Academic Press (New York).
Pei, J., and N. V. Grishin. “AL2CO: Calculation of Positional Conservation in a Protein
Sequence Alignment.” Bioinformatics, vol. 17, no. 8, 2001, pp. 700–712., doi:10.1093/bioinformatics/17.8.700.
Pallesen, Jesper, et al. “Immunogenicity and Structures of a Rationally Designed Prefusion
MERS-CoV Spike Antigen.” Proceedings of the National Academy of Sciences, vol. 114, no. 35, 2017, doi:10.1073/pnas.1707304114.
Papanikolopoulou, Katerina, et al. “Formation of Highly Stable Chimeric Trimers by Fusion of
an Adenovirus Fiber Shaft Fragment with the Foldon Domain of Bacteriophage T4 Fibritin.” Journal of Biological Chemistry, vol. 279, no. 10, 2003, pp. 8991–8998., doi:10.1074/jbc.m311791200.
Parrish, C. R., et al. “Cross-Species Virus Transmission and the Emergence of New Epidemic Diseases.” Microbiology and Molecular Biology Reviews, vol. 72, no. 3, Jan. 2008, pp. 457–470., doi:10.1128/mmbr.00004-08.
Perelson, Alan S. “Modelling Viral And Immune System Dynamics.” Nature Reviews Immunology, vol. 2, no. 1, 2002, pp. 28–36., doi:10.1038/nri700.
Pettersen, E. F., et al. “UCSF Chimera--a visualization system for exploratory research and analysis.” Journal of Computational Chemistry 25, 1605–1612 (2004).
Pfefferle, Susanne, et al. “Distant Relatives of Severe Acute Respiratory Syndrome Coronavirus and Close Relatives of Human Coronavirus 229E in Bats, Ghana.” Emerging Infectious Diseases, vol. 15, no. 9, 2009, pp. 1377–1384., doi:10.3201/eid1509.090224.
Plowright, Raina K., et al. “Pathways to Zoonotic Spillover.” Nature Reviews Microbiology, vol. 15, no. 8, 2017, pp. 502–510., doi:10.1038/nrmicro.2017.45.
Reed, Sylvia E. “The Behaviour of Recent Isolates of Human Respiratory Coronavirus in Vitro and in Volunteers: Evidence of Heterogeneity among 229E-Related Strains.” Journal of Medical Virology, vol. 13, no. 2, 1984, pp. 179–192., doi:10.1002/jmv.1890130208.
Reguera, Juan, et al. “Structural Bases of Coronavirus Attachment to Host Aminopeptidase N and Its Inhibition by Neutralizing Antibodies.” PLoS Pathogens, vol. 8, no. 8, Feb. 2012, doi:10.1371/journal.ppat.1002859.
Richard, Mathilde, et al. “Factors Determining Human-to-Human Transmissibility of Zoonotic Pathogens via Contact.” Current Opinion in Virology, vol. 22, 2017, pp. 7–12., doi:10.1016/j.coviro.2016.11.004.
Sanjuan, R., et al. “Epistasis and the Adaptability of an RNA Virus.” Genetics, vol. 170, no. 3, 2005, pp. 1001–1008., doi:10.1534/genetics.105.040741.
Sanjuan, R. “Viral Mutation Rates.” Virus Evolution: Current Research and Future Directions, 2016, pp. 1–28., doi:10.21775/9781910190234.01.
Schuck, Peter, and Huaying Zhao. “The Role of Mass Transport Limitation and Surface Heterogeneity in the Biophysical Characterization of Macromolecular Binding Processes by SPR Biosensing.” Methods in Molecular Biology Surface Plasmon Resonance, 2010, pp. 15–54., doi:10.1007/978-1-60761-670-2_2.
Shiroishi, Mitsunori, et al. “Structural Consequences of Mutations in Interfacial Tyr Residues of a Protein Antigen-Antibody Complex.” Journal of Biological Chemistry, vol. 282, no. 9, 2006, pp. 6783–6791., doi:10.1074/jbc.m605197200.
Sitbon, Einat, and Shmuel Pietrokovski. “Occurrence of Protein Structure Elements in Conserved Sequence Regions.” BMC Structural Biology, vol. 7, no. 1, 2007, p. 3., doi:10.1186/1472-6807-7-3.
Smith EC, Sexton NR, Denison MR. “Thinking outside the triangle: replication fidelity of the largest RNA viruses.” The Annual Review of Virology. 2014; 1: 111–132. https://doi.org/10.1146/annurev-virology-031413-085507 PMID: 26958717.
Smith, Everett Clinton. “The Not-so-Infinite Malleability of RNA Viruses: Viral and Cellular Determinants of RNA Virus Mutation Rates.” PLOS Pathogens, vol. 13, no. 4, 2017, doi:10.1371/journal.ppat.1006254.
Smith, Ina, and Lin-Fa Wang. “Bats and Their Virome: an Important Source of Emerging Viruses Capable of Infecting Humans.” Current Opinion in Virology, vol. 3, no. 1, 2013, pp. 84–91., doi:10.1016/j.coviro.2012.11.006.
Snijder, Eric J., et al. “Unique and Conserved Features of Genome and Proteome of SARS-Coronavirus, an Early Split-off From the Coronavirus Group 2 Lineage.” Journal of Molecular Biology, vol. 331, no. 5, 2003, pp. 991–1004., doi:10.1016/s0022-2836(03)00865-9.
Söllner, Thomas, et al. “A Protein Assembly-Disassembly Pathway in Vitro That May Correspond to Sequential Steps of Synaptic Vesicle Docking, Activation, and Fusion.” Cell, vol. 75, no. 3, 1993, pp. 409–418., doi:10.1016/0092-8674(93)90376-2.
Skehel, John J., and Don C. Wiley. “Receptor Binding and Membrane Fusion in Virus Entry: The Influenza Hemagglutinin.” Annual Review of Biochemistry, vol. 69, no. 1, 2000, pp. 531–569., doi:10.1146/annurev.biochem.69.1.531.
Steinhauer, David A., et al. “Lack of Evidence for Proofreading Mechanisms Associated with an RNA Virus Polymerase.” Gene, vol. 122, no. 2, 1992, pp. 281–288., doi:10.1016/0378-1119(92)90216-c.
Tang, X.-C., et al. “Identification of Human Neutralizing Antibodies against MERS-CoV and Their Role in Virus Adaptive Evolution.” Proceedings of the National Academy of Sciences, vol. 111, no. 19, 2014, doi:10.1073/pnas.1402074111.
Tóth-Petróczy, Ágnes, and Dan S. Tawfik. “Protein Insertions and Deletions Enabled by Neutral Roaming in Sequence Space.” Molecular Biology and Evolution, vol. 30, no. 4, 2013, pp. 761–771., doi:10.1093/molbev/mst003.
Tokuriki, Nobuhiko, et al. “How Protein Stability and New Functions Trade Off.” PLoS Computational Biology, vol. 4, no. 2, 2008, doi:10.1371/journal.pcbi.1000002.
Tusell, S. M., et al. “Mutational Analysis of Aminopeptidase N, a Receptor for Several Group 1 Coronaviruses, Identifies Key Determinants of Viral Host Range.” Journal of Virology, vol. 81, no. 3, Aug. 2006, pp. 1261–1273., doi:10.1128/jvi.01510-06.
Ugolini, Sophie, et al. “HIV-1 Attachment: Another Look.” Trends in Microbiology, vol. 7, no. 4, 1999, pp. 144–149., doi:10.1016/s0966-842x(99)01474-2.
Visher, Elisa, et al. “The Mutational Robustness of Influenza A Virus.” PLOS Pathogens, vol. 12, no. 8, 2016, doi:10.1371/journal.ppat.1005856.
Volz, Erik M., et al. “Viral Phylodynamics.” PLoS Computational Biology, vol. 9, no. 3, 2013, doi:10.1371/journal.pcbi.1002947.
Walls, A.C., et al. “Cryo-Electron Microscopy Structure of a Coronavirus Spike Glycoprotein Trimer.” Mar. 2016a, doi:10.2210/pdb3jcl/pdb.
Walls, A.C., et al. “Glycan Shield and Epitope Masking of a Coronavirus Spike Protein Observed by Cryo-Electron Microscopy.” 2016b, doi:10.2210/pdb5szs/pdb.
White JM, Delos SE, Brecher M, Schornberg K. Structures and Mechanisms of Viral Membrane Fusion Proteins: Multiple Variations on a Common Theme. Critical reviews in biochemistry and molecular biology. 2008;43(3):189-219. doi:10.1080/10409230802058320.
Willoughby, Anna, et al. “A Comparative Analysis of Viral Richness and Viral Sharing in Cave-Roosting Bats.” Diversity, vol. 9, no. 3, 2017, p. 35., doi:10.3390/d9030035.
Wong, Alan H. M., et al. “The X-Ray Crystal Structure of Human Aminopeptidase N Reveals a Novel Dimer and the Basis for Peptide Processing.” Journal of Biological Chemistry, vol. 287, no. 44, 2012, pp. 36804–36813., doi:10.1074/jbc.m112.398842.
Wong, Alan H. M., Tomlinson, Aidan C.A., et al. “Receptor-Binding Loops in Alphacoronavirus Adaptation and Evolution.” Nature Communications, vol. 8, no. 1, 2017, doi:10.1038/s41467-017-01706-x.
Woo, Patrick C. Y., et al. “Coronavirus Diversity, Phylogeny and Interspecies Jumping.” Experimental Biology and Medicine, vol. 234, no. 10, 2009, pp. 1117–1127., doi:10.3181/0903-mr-94.
Woo, P. C. Y., et al. “Discovery of Seven Novel Mammalian and Avian Coronaviruses in the Genus Deltacoronavirus Supports Bat Coronaviruses as the Gene Source of Alphacoronavirus and Betacoronavirus and Avian Coronaviruses as the Gene Source of Gammacoronavirus and Deltacoronavirus.” Journal of Virology, vol. 86, no. 7, 2012, pp. 3995–4008., doi:10.1128/jvi.06540-11.
Wu, K., et al. “Crystal Structure of NL63 Respiratory Coronavirus Receptor-Binding Domain Complexed with Its Human Receptor.” 2009, doi:10.2210/pdb3kbh/pdb.
Yang, Yang, et al. “Two Mutations Were Critical for Bat-to-Human Transmission of Middle East Respiratory Syndrome Coronavirus.” Journal of Virology, vol. 89, no. 17, Oct. 2015, pp. 9119–9123., doi:10.1128/jvi.01279-15.
Yuan, Yuan, et al. “Cryo-EM Structures of MERS-CoV and SARS-CoV Spike Glycoproteins Reveal the Dynamic Receptor Binding Domains.” Nature Communications, vol. 8, Oct. 2017, p. 15092., doi:10.1038/ncomms15092.
Zeng, Fanya, et al. “Quantitative Comparison of the Efficiency of Antibodies against S1 and S2 Subunit of SARS Coronavirus Spike Protein in Virus Neutralization and Blocking of Receptor Binding: Implications for the Functional Roles of S2 Subunit.” FEBS Letters, vol. 580, no. 24, Dec. 2006, pp. 5612–5620., doi:10.1016/j.febslet.2006.08.085.
Zumla, A., Chan, J.F., Azhar, E.I., Hui, D.S. & Yuen, K.Y. “Coronaviruses: drug discovery and therapeutic options.” Nature Reviews Drug Discovery 15, 327–347 (2016).