proteins - uni-duesseldorf.de · proteins structure o function o bioinformatics protein rigidity...
TRANSCRIPT
proteinsSTRUCTURE O FUNCTION O BIOINFORMATICS
Protein rigidity and thermophilic adaptationSebastian Radestock and Holger Gohlke*
Mathematisch-Naturwissenschaftliche Fakultat, Institut fur Pharmazeutische und Medizinische Chemie,
Heinrich-Heine-Universitat Dusseldorf, Germany
INTRODUCTION
In the recent years, there has been considerable effort to under-
stand the structural determinants of stability and activity of proteins
from thermophilic and hyperthermophilic organisms (henceforth
referred to together as thermophilic proteins) with optimal growth
temperatures of more than 608C.1,2 These proteins are required to
retain their native structures even under extreme environmental
conditions, when homologues from mesophilic organisms already
denature.1,2 Consequently, thermophilic proteins often have a
substantially higher intrinsic thermostability than their mesophilic
counterparts, while retaining the basic fold characteristics, and in
the case of enzymes, the catalytic site properties of the particular
protein family.1,3
Because of their adaptation to high temperature, thermophilic
enzymes are interesting for the biotechnological industry as bio-
catalysts.4–6 However, the identification of suitable thermophilic
enzymes with a specific temperature-dependent activity profile by
screening is tedious.6,7 As a valuable alternative, mesophilic
enzymes can be re-engineered to become more thermostable.8,9
Thus, there is great interest in the biotechnology industry for having
tools in hand that aid in developing thermostable enzymes by
protein engineering. In addition to stability, optimizing enzyme
activity for elevated target temperatures is mandatory. This requires
both a thorough understanding of the structural determinants of
thermostability and temperature-dependent activity.10
Numerous studies have been performed to identify principles of
thermostability and, subsequently, to apply those in rational or
data-driven protein engineering.8,11 These include a large and
growing number of comparative studies using mesophilic and ther-
mophilic protein homologues.12 The influence of various sequence
and structure determinants of thermophilic adaptation has been
studied.13–21 Often, thermophilic adaptation comes along with a
better packing of hydrophobic interactions and an increased density
of salt-bridges or charge-assisted hydrogen bonds.19,20,22 In many
cases, a combination of different sequence or structure features was
found to be responsible for an increased thermostability.10,21,23
Sebastian Radestock’s current address is Computational Structural Biology Group, Max Planck
Institute of Biophysics, Max-von-Laue-Str. 3, 60438 Frankfurt am Main, Germany.
Additional Supporting Information may be found in the online version of this article.
*Correspondence to: Holger Gohlke, Universitatsstr. 1, 40225 Dusseldorf, Germany.
E-mail: [email protected].
Received 5 August 2010; Revised 28 September 2010; Accepted 7 November 2010
Published online 19 November 2010 in Wiley Online Library (wileyonlinelibrary.com).
DOI: 10.1002/prot.22946
ABSTRACT
We probe the hypothesis of corresponding states,
according to which homologues from mesophilic
and thermophilic organisms are in corresponding
states of similar rigidity and flexibility at their
respective optimal temperatures. For this, the
local distribution of flexible and rigid regions in
19 pairs of homologous proteins from meso- and
thermophilic organisms is analyzed and related
to activity characteristics of the enzymes by con-
straint network analysis (CNA). Two pairs of
enzymes are considered in more detail: 3-isopro-
pylmalate dehydrogenase and thermolysin-like
protease. By comparing microscopic stability fea-
tures of homologues with the help of stability
maps, introduced for the first time, we show that
adaptive mutations in enzymes from thermo-
philic organisms maintain the balance between
overall rigidity, important for thermostability,
and local flexibility, important for activity, at the
appropriate working temperature. Thermophilic
adaptation in general leads to an increase of
structural rigidity but conserves the distribution
of functionally important flexible regions
between homologues. This finding provides direct
evidence for the hypothesis of corresponding
states. CNA thereby implicitly captures and uni-
fies many different mechanisms that contribute
to increased thermostability and to activity at
high temperatures. This allows to qualitatively
relate changes in the flexibility of active site
regions, induced either by a temperature change
or by the introduction of mutations, to experi-
mentally observed losses of the enzyme function.
As for applications, the results demonstrate that
exploiting the principle of corresponding states
not only allows for successful thermostability
optimization but also for guiding experiments
in order to improve enzyme activity in protein
engineering.
Proteins 2011; 79:1089–1108.VVC 2010 Wiley-Liss, Inc.
Key words: rigidity theory; thermostability; flexibil-
ity; protein engineering; network; enzyme activity.
VVC 2010 WILEY-LISS, INC. PROTEINS 1089
These structural changes contribute to the improvement
of the underlying network of noncovalent interactions
within the structure, presumably leading to an increase
in mechanical rigidity of the structure and a decrease
in flexibility.1
Motional properties of proteins with different thermal
stabilities have been compared using a variety of different
experimental techniques, including hydrogen/deuterium
(H/D) exchange, neutron scattering, and nuclear mag-
netic resonance (NMR) spectroscopy.24 As an intriguing
conclusion, the local distribution of flexible or rigid
regions in a thermophilic protein is similar to that of a
mesophilic homologue, provided that the proteins are
compared at their respective optimal temperatures. This
has led to the hypothesis that homologous mesophilic
and thermophilic proteins are in corresponding states of
similar rigidity and flexibility at their respective optimal
temperature.1,25 This should be especially true for
enzymes, where stability characteristics of catalytic and
binding site residues are an important factor in deter-
mining the activity.3 In fact, many thermophilic enzymes
have a maximal activity at elevated temperature, but are
far less active at ambient temperatures, where the struc-
tures are more rigid.26–29 Thus, rigidity is an important
factor for the integrity of the native folded structure,
whereas a certain degree of flexibility is required for
activity, although the general validity of this relation is
being discussed.30
The experimental characterization and quantification of
protein flexibility and rigidity are challenging, especially
in regard to the diverse types and timescales of motions
that are of interest.24,31 Moreover, changes in flexibility
due to sequence variations can be limited to a small part
of the structure only, which makes them difficult to
detect.32 Computational approaches are interesting alter-
natives. As an example, molecular dynamics (MD) simu-
lations provide information on the motional properties of
atoms in a protein structure. By performing simulations
at different temperatures or by running nonequilibrium
simulations, motional properties of the structure during
thermal unfolding can be analyzed.33–36 By comparing
equilibrium and nonequilibrium MD simulations, Creveld
et al.37 studied the unfolding of cutinase and distin-
guished two types of flexible regions: those that promote
unfolding and those that are important for enzyme activ-
ity. Thereby, unfolding regions were predicted where
mutations modulate thermostability and do not interfere
with enzyme activity. Despite these successes, such simula-
tions are computationally expensive, and it is not clear
whether force fields and solvent models are adequate for
use at high temperatures. Without doing expensive MD
simulations, moving protein parts can also be predicted
from a single protein conformation using methods related
to normal mode analysis (NMA).38 Using such an
approach, Su et al.39 studied changes in the fluctuation of
residues upon protein unfolding.
Both MD and NMA studies report on the mobility of
atoms by analyzing atomic fluctuations or the correlation
of movements, rather than flexibility and rigidity charac-
teristics of structural regions. Here, flexibility denotes the
possibility of motion within the region, that is, the region
can be deformed. In contrast, no relative motion is
allowed within rigid (structurally stable) regions, and
such regions can only move as a rigid body with six
degrees of freedom. Thus, flexibility and its opposite, ri-
gidity, are static properties that describe the possibility of
motion.40,41 The distinction between mobility and flexibil-
ity/rigidity becomes obvious in the case of a mobile rigid
body, such as a moving helix or domain.
In a previous study, we showed for a dataset of 20
pairs of mesophilic and thermophilic protein homologues
that protein thermostability can be predicted by directly
characterizing the mechanical rigidity of a protein struc-
ture during a thermal-unfolding simulation.42 The
approach is referred to as constraint network analysis
(CNA) and based on a graph-theoretical method that
determines rigidity and flexibility within a protein struc-
ture in atomic resolution.43 We showed that the
approach is helpful for understanding and exploiting the
relationship between microscopic structure and macro-
scopic stability and, thus, can be used in data-driven
protein engineering to increase protein thermostability by
introducing mutations at regions that are crucial for
macroscopic stability. These regions are referred to as
unfolding regions or weak spots.
Here, we go beyond the previous study, which related
protein thermal stability and rigidity by analyzing the
local distribution of flexible or rigid regions and relating
it to activity characteristics of enzyme structures. We
show that CNA implicitly captures and unifies many
different mechanisms that contribute to increased ther-
mostability and to activity at high temperatures. Two
pairs of enzymes from our data set are considered in
more detail: 3-isopropylmalate dehydrogenase (IPMDH)
and thermolysin-like protease (TLP). By comparing mi-
croscopic stability features of the meso- and thermophilic
protein homologues, we show that adaptive mutations of
thermophilic enzymes maintain the balance between
overall rigidity, important for thermostability, and local
flexibility, important for activity, at the respective tem-
perature at which the protein functions. Thus, direct
evidence is offered that supports the hypothesis of corre-
sponding states. To the best of our knowledge, the pre-
sent study is the first one probing this hypothesis for a
large data set by computational means. Moreover, unlike
other computational studies that probed this hypo-
thesis,35,36 protein rigidity and flexibility are characte-
rized directly. Our results demonstrate that exploiting the
principle of corresponding states not only allows for
successfully optimizing thermostability but also for guid-
ing experiments in order to improve enzyme activity in
protein engineering.
S. Radestock and H. Gohlke
1090 PROTEINS
THEORY
In the following, we will briefly introduce the theoreti-
cal background of CNA. With this method, the mecha-
nical rigidity of a protein structure is characterized by
simulating the thermal unfolding of a protein.
Unfolding simulations on constraint networkrepresentations of proteins
Protein stability is determined by the large number of
interactions that contribute to the native-folded state.44
Thus, to theoretically study protein stability features, pro-
teins can be described as molecular networks.45,46 In the
present study, we go beyond describing proteins as topo-
logical networks. Instead, protein structures are repre-
sented as constraint networks, in particular molecular
frameworks.47 These networks are a subtype of 3D-net-
works in which vertices represent atoms and edges repre-
sent covalent and noncovalent bond and angle constraints.
As noncovalent bond constraints, hydrogen bonds, inclu-
ding salt bridges, and hydrophobic interactions are used.
The rigidity and flexibility within a constraint network
can be fully characterized using constraint counting.48 As
a result, the protein-structure network is decomposed into
rigid clusters and flexible regions. A rigid cluster is a set of
atoms that move together as a rigid body, that is, no inter-
nal movement is allowed. Otherwise, an atom is part of a
flexible region. The rigid cluster decomposition can be
calculated using the FIRST (Floppy Inclusion and Rigid
Substructure Topography) software.43
Thermal unfolding of the protein structure can now be
simulated by gradually removing noncovalent bond
constraints from the constraint network.42,49,50 Thus,
starting from a rigid network with a large number of
constraints, a flexible network is obtained with only a
few constraints left. Here, hydrogen bonds (including salt
bridges) are removed from the network in the order of
increasing strength, that is, only hydrogen bonds with an
energy Ehb � Ecut are retained for a certain network state.
This follows the idea that stronger hydrogen bonds will
break at higher temperatures than weaker ones.51 The
number of hydrophobic contacts is kept constant during
the thermal unfolding, because the strength of hydro-
phobic interactions remains constant or even increases
with increasing temperature.52 A new network is gene-
rated for each hydrogen bond that is removed, and the
rigid cluster decomposition is performed on this network.
Identifying the folded–unfolded transition
During the thermal-unfolding simulation, a phase
transition from a largely rigid to a largely flexible net-
work is observed. At the transition, the percolating rigid
cluster (the giant cluster) suddenly breaks and stops
dominating the system. The transition defines the rigidity
percolation threshold.53 Both proteins and glasses are
similar in that such a transition can be observed.50
However, the percolation behavior of protein networks is
usually more complex, and several transitions can be
observed. This is related to the fact that protein struc-
tures are modular, as they are assembled from secondary
structure elements, subdomains, and domains, which
often break away from the giant cluster as a whole. As
every protein fold has its unique number and pattern
of spatially arranged modules, a percolation behavior
characteristic for each fold is observed.
To describe the general percolation behavior of a
network, the microstructure of the network, that is, prop-
erties of the set of clusters generated by the bond-dilution
process are analyzed.54 Here, we choose the fraction of
the network belonging to the percolating rigid cluster as
an order parameter for characterizing the rigidity transi-
tion (the rigidity-order parameter, P1). The rigidity-order
parameter denotes the probability that an atom belongs to
the giant cluster and is zero in the floppy phase. Monitor-
ing the decay of the giant cluster by the rigidity-order
parameter provides a global and intuitive description of
the rigidity within the protein structure during thermal
unfolding. Although the percolation behavior of protein
networks is complex, the curves of the rigidity-order
parameter are similar to curves observed for network
models of glasses and amorphous solids.54,55
The first transition observed describes the break down
of the completely rigid network into a number of rigid
clusters. The last transition describes the loss of the
remaining piece of rigidity and the onset of a network
that is completely flexible, that is, rigid clusters cease to
exist. The first transition is referred to as rigidity percola-
tion transition, because the network is unable to transmit
stress afterward. However, the last transition is biologi-
cally most relevant, because it corresponds to the folded–
unfolded transition in experimental protein unfolding.42
To identify the last steps of the folded–unfolded transi-
tion that is relevant from a biological point of view,42 a
parameter for analyzing macroscopic properties of the net-
work associated with the rigid cluster-size distribution is
used.56 In this context, Andraud et al.57 introduced the
cluster configuration entropy (H) as a morphological
descriptor for heterogeneous materials. H has been
adapted from Shannon’s information theory and, thus, is
a measure of the degree of disorder in the realization of
a given state: as long as the giant cluster dominates the
system, H is low because of the limited number of possi-
ble ways to configure a system with a very large cluster.
Originally, H was defined as a function of the probability
that an atom is part of a cluster of size s (s-cluster).57
This s-cluster configuration entropy definition has been
shown to be particularly useful for identifying early steps
of the transition of protein networks between largely
rigid states and partially flexible states.
However, we are interested in identifying the late steps
of the transition between partially rigid states, where a
Protein Rigidity and Thermophilic Adaptation
PROTEINS 1091
rigid core in the structure network is still present, and
largely flexible states, where the core is broken. This is in
accordance with the idea that the core then represents
the folding nucleus that dominates the transition state in
experimental protein (un)folding. To identify these steps
of the transition, we used a configuration entropy
H ¼ �X
s
ws lnws; ð1Þ
where the probability ws that an arbitrarily occupied site
belongs to a s2-cluster is given by
ws ¼ s2nsPs
s2nsð2Þ
and the normalized cluster number ns is
ns ¼ number of clusters of size s
N: ð3Þ
N equals the total number of atoms
Our definition of the configuration entropy is useful
for identifying the temperature at which the giant cluster
finally stops to dominate the system, that is, where a
transition takes place that corresponds to the folded–
unfolded transition in protein unfolding. The tempera-
ture at which this transition occurs is referred to as the
phase-transition temperature (Tp), which can be related
to the experimental melting temperature (Tm). See
Methods section for a description as to how the phase-
transition temperature is determined.
RESULTS AND DISCUSSION
Data set of mesophilic and thermophilichomologues
The hypotheses underlying the present study are that
thermostability comes along with an increased mechani-
cal rigidity of the structure and that mesophilic and ther-
mophilic proteins are in corresponding states of similar
rigidity and flexibility at their respective optimal tempe-
rature. To probe these hypotheses, a data set containing
homologous protein structures from mesophilic and ther-
mophilic organisms was constructed (Table S1).42 The
structures in the data set fulfill stringent quality criteria
based on previous results that indicate that a good struc-
tural quality is necessary for successful rigidity analysis.43
The data set consists of 19 pairs of protein structures
from 19 different protein families (Table S1). The corre-
sponding mesophilic and thermophilic proteins within
each family are highly similar, with sequence identities
varying from 37 to 88% and backbone root mean square
deviation (rmsd) values between 0.6 and 2.5 A. Higher
rmsd values were only obtained for proteins of families 5
(elongation factor TU) and 11 (phosphoglycerate kinase)
due to differences in the configuration of domains.
If considered separately, the domains share a high-struc-
tural similarity (rmsd < 2.5 A).
The high sequential and structural similarities point to
close relationships with respect to the three-dimensional
topology of the molecules. The presence of conserved
catalytic residues suggests that the homologues are closely
related, if not identical, from a mechanistic point of
view, too. Sequence alignments of 3-IPMDH and TLP are
shown in the Supporting Information (Figs. S1 and S2),
where catalytic and other functionally important residues
are highlighted.
Ideally, the melting temperature (Tm) of a protein
should be used as a descriptor of thermostability. Unfortu-
nately, only a few proteins in our data set have melting
temperatures currently available, because the Tm of many
proteins cannot be determined experimentally. The
limited number of proteins with known Tm, thus, forced
us to use the optimal growth temperature of the corre-
sponding organism (Tog) as a descriptor of thermostability.
Optimal growth temperatures have previously been used in
studies comparing meso- and thermophilic proteins.16
Macroscopic stability is characterizedby the final decay of a rigid core
All protein structures were subjected to thermal-
unfolding simulations using CNA. CNA is based on a
rigidity analysis of protein constraint networks, as imple-
mented in the FIRST software.43 This method is remark-
ably fast, and unfolding trajectories can be produced
within a few minutes. Using CNA, global (macroscopic)
stability is characterized by monitoring the size of the
largest rigid cluster (the giant cluster) and the rigid
cluster-size distribution within the network as the
thermal-unfolding simulation proceeds. These properties
are described by the rigidity-order parameter (P1) and
the s2-cluster configuration entropy (H), respectively.
From these quantities, the temperature of a biologically
relevant phase transition (Tp) is obtained that relates to
the protein’s melting temperature (Tm).42 Here, only
differences in the phase-transition temperatures of
homologous proteins are considered.42
The rigidity-order parameter plots of two representa-
tive pairs of proteins from our data set, IPMDH and
TLP, are shown in Figure 1(a,b). As described in the
Theory section, the curves characterize the general perco-
lation behavior of the constraint networks during unfold-
ing. In respect to the pattern of transitions, the rigidity-
order parameter curves of thermophilic proteins are
almost identical to their mesophilic counterparts. Never-
theless, the rigidity-order parameter curves of thermo-
philic proteins are shifted to higher temperatures (except
in low-temperature regions). From a global point of
view, these observations provide initial support to the
S. Radestock and H. Gohlke
1092 PROTEINS
hypothesis of corresponding states, according to which
mesophilic and thermophilic enzymes are in correspon-
ding states of similar rigidity and flexibility at their
respective optimal temperature.
The s2-cluster configuration entropy plots of the two
representative pairs of proteins from our data set are
shown in Figure 1(c,d). The cluster configuration entropy
is used to detect the rigid-to-flexible phase transition
that is biologically relevant, that is, where the last rigid
region integrating multiple secondary structure elements
breaks, and the structure is no longer dominated by a
rigid cluster. This melting is associated with small
changes in the topology of the constraint network, which
have a large effect on the rigid cluster-size distribution.
Accordingly, the curve shows a steep increase at the
phase-transition temperature, where the relevant transi-
tion takes place. As can be seen from Figure 1(c,d), the
thermophilic protein has a higher phase-transition
temperature compared to its mesophilic homologue.
Results for all pairs of protein structures from the data
set are given in Table I. Notably, for approximately two
thirds of the 19 different protein families, the Tp of the
thermophilic protein is predicted to be higher than that
of the mesophilic counterpart, as exemplarily demon-
strated for IPMDH and TLP. This shows that the rigid
Figure 1Rigidity-order parameter (a, b) and cluster configuration entropy (c, d) versus temperature for the mesophilic and the thermophilic form of 3-
isopropylmalate dehydrogenase (IPMDH) (a, c), and thermolysin-like protease (TLP) (b, d). The solid line denotes the parameter for the
mesophilic protein, and the dashed line denotes the parameter for the thermophilic protein.
Table IDifferences in experimental Tog values versus differences in computed
phase-transition temperatures (Tp) between a mesophilic protein and its
thermophilic homologue.
Protein family DToga DTp
b
No. Name
1 Pyrrolidone carboxyl peptidase 53.0 18.02 Aspartate aminotransferase-like 35.5 27.23 b-Glucanase 20.0 211.34 Xylose isomerase 34.5 11.55 Elongation factor TU 35.5 9.76 Signal recognition particle (SRP) 34.0 15.37 Glyceraldehyde-3-phosphate dehydrogenase (GAPDH) 15.5 20.48 Methylguanin-DNA methyltransferase 58.0 10.99 Carboxypeptidase 18.0 7.210 3-Isopropylmalate dehydrogenase (IPMDH) 38.0 15.311 Phosphoglycerate kinase 43.0 1.112 Cyclodextrin glycosyltransferase 25.0 29.813 Carbonic anhydrase 13.0 4.914 Cellulase 20.0 216.815 Thermolysin-like protease (TLP) 25.0 23.616 Subtilase 34.0 2.817 Endocellulase 20.0 225.918 Histidine carrier protein 18.0 16.219 Adenylate kinase, zinc finger 18.0 24.7
aOptimal growth temperature difference of the corresponding species, in K.bIn K. DTp is marked in bold if the phase-transition temperature of the thermo-
philic protein was predicted to be higher than that of the mesophilic counterpart.
Protein Rigidity and Thermophilic Adaptation
PROTEINS 1093
protein core of a thermophilic protein is more resistant
to high temperatures than that of a mesophilic protein,
which lends immediate support to the hypothesis that
thermostability is linked to an enhanced structural rigi-
dity of the folded native state. Values in Table I differ
slightly from previously published ones,42 because, in
the present study, a modified, fully automated method
for identifying the phase-transition temperature was
used (see Methods section). This method yields phase-
transition temperatures that are determined in a more
consistent manner across the whole data set.
In approximately one third of the investigated pairs of
homologues, a higher phase-transition temperature for
the thermophilic protein could not be found. The follow-
ing reasons may account for this. First, in a few cases,
mesophilic proteins have been reported to be as thermo-
stable (if not more thermostable) than their thermophilic
counterpart.58 Second, extrinsic factors such as glycosyla-
tion, salt, pressure effects, or solvent viscosity may
further influence thermostability.59 As only for a few
proteins in our data set melting temperatures are avail-
able, let alone data on the dependency of Tm on the sol-
vent viscosity, we are currently unable to investigate this
solvent influence any further. Finally, rigidity analysis is
sensitive with respect to deficiencies in the structures43;
in particular, as only one static representation per struc-
ture is analyzed. Analyzing multiple representations
instead, for example, generated by MD simulations, and
averaging over the results is expected to alleviate deficien-
cies in the constraint model and, hence, led to more
robust results.60
Identifying sites that are importantfor macroscopic stability
To understand protein structure adaptations that lead
to higher thermostability, we next consider microscopic
properties of the protein structures in the context of the
macroscopic percolation behavior of the constraint
networks. Here, the unfolding simulations of IPMDH
and TLP are considered in detail, and local structural
regions are identified at which the unfolding starts. The
identified unfolding regions are compared to experimen-
tal data.
3-Isopropylmalate dehydrogenase
3-Isopropylmalate dehydrogenase (IPMDH) is an
enzyme in the leucine biosynthesis pathway that catalyzes
the dehydrogenation and concomitant decarboxylation of
3-isopropylmalate (IPM) substrate, using nicotinamide
adenine dinucleotide (NAD1) as a cofactor. In the pres-
ent study, IPMDH from E. coli and T. thermophilus is
compared. IPMDH is a functional dimer, composed of
two identical subunits (SU), each with �350 amino acid
residues.61 The polypeptide chain of one subunit is
folded into two domains with similar topologies based
on parallel a/b motifs: domain 1 contains the N- and
the C-terminus, and domain 2 contains the subunit
interface. The active site is located in a cleft between the
two domains and is made up of residues from both sub-
units.62
In Figure 1(c), cluster configuration entropy plots are
shown for mesophilic and thermophilic IPMDH. Meso-
philic and thermophilic IPMDH show a corresponding
percolation behavior. The biologically relevant transitions,
corresponding to the folded–unfolded transitions in pro-
tein unfolding, can be identified at 340 and 355 K for
the E. coli and the T. thermophilus IPMDH, respectively.
The difference in experimental optimal growth tempera-
tures of the corresponding species is 38 K (Table I). The
rigid-cluster decomposition of the networks before and
after the phase transition is shown in Figure 2. The
events that take place at the unfolding transitions in
E. coli and T. thermophilus IPMDH are quite similar:
before the transition, the giant cluster extends over the
subunit-interface and the domain-interfaces, although
only a few residues of domain 1 are incorporated by the
giant cluster [Fig. 2(a,c)].
After the transition, the giant cluster does not cover
the subunit-interface anymore [Fig. 2(b,d)]. Likewise, the
rigid connection between domains 1 and 2 is lost. The
close correspondence of the rigid cluster distribution in
the constraint networks of the homologues before and
after the phase transition is a striking result of our analy-
sis. Particularly, the location of the giant cluster is similar
within the pair of homologues. As for the size of the
giant cluster, the one in the thermophilic structure just
before the phase transition is larger than the one in the
mesophilic protein. This is in agreement with the experi-
ment, where it has been shown that the folding inter-
mediate of the E. coli structure only retains 50% of
the original secondary structure, whereas that of the
T. thermophilus IPMDH retains 90%.64,65
By comparing the structure networks before and after
the phase transition, we can identify microscopic struc-
tural features, where the unfolding starts, so-called
unfolding regions or weak spots. The decay of the giant
cluster in IPMDH initially causes residues at the domain
interface, in domain 1, and at the subunit interface
(domain 2) to become flexible. In Figure 2(b), these
regions are indicated by arrows. The unfolding regions
identified by the thermal-unfolding simulation of E. coli
IPMDH have been compared to experiment (Table II).
Encouragingly, sites within the predicted regions have
also been successfully targeted by experiment to increase
thermostability of various mesophilic66–71 and chimeric
IPMDH.72–77 The experimental examination of thermo-
philic IPMDH has further revealed that most of the sites
that are important for thermostability are at the subunit-
interface,70,71,78 the domain-interface,73,79–81 and in
domain 1.82,83 These regions are also detected
as unfolding regions in our simulation for both the
S. Radestock and H. Gohlke
1094 PROTEINS
mesophilic and the thermophilic protein (Table II).
Overall, the agreement between the sites predicted by
CNA and the experimentally identified unfolding regions
is good. However, weak agreement is found for the
domain 1 region, of which only a few residues are part
of the giant cluster, although mutations in this domain
have been shown to influence protein stability.82,83
Thermolysin-like protease
Thermolysin-like proteases (TLP) are members of the
peptidase family M4 of which thermolysin is the proto-
type.84 Thermolysin originates from B. thermoproteo-
lyticus. TLP are proteases and require a Zn21 ion for
their activity. They contain multiple Ca21 ions, which
Table IIComparison of predicted with experimentally verified unfolding sites for mesophilic
3-ispropylmalate-dehydrogenase (IPMDH)
Structure regionResidues in predictedunfolding regionsa
Experimentally verifiedthermostabilizing mutationsa,b
I Subunit interface 151–163, 165–167, 173, 185–200,215, 219–224, 241–248, 255, 256
191, 194, 200, 204, 256, 259
II Domain interface 105–112, 133–143, 174–175, 301–305c 98, 112, 114–116, 118, 120,131, 140, 182, 271, 295, 311
III Domain 1 — 20, 24, 42, 57, 60–62, 84, 86,92, 350, 356–360
aNumbering in the alignment, as shown in Figure S1.bSee text for details. According to ref. 66–74,77,79,81–83.cPreviously,42 these residues have been classified as belonging to the domain 1 region.
Figure 2Snapshots from the thermal unfolding simulation of meso- (a, b) and thermophilic (c, d) 3-isopropylmalatdehydrogenase (IPMDH) just before
(a, c) and after the phase transitions (b, d) at 340 and 355 K, respectively. The rigid cluster decomposition of the network is shown. The giant
cluster is shown in blue. Other clusters are shaded black. Arrows in (b) indicate potential unfolding nuclei. Roman numbers refer to the numbering
of the unfolding nuclei in Table II. The subunits and domains are marked. The locations of the active sites are indicated by asterisks. Figures were
generated using PyMOL.63 [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary.com.]
Protein Rigidity and Thermophilic Adaptation
PROTEINS 1095
have been reported to be important for stability.85 TLP
consists of an N-terminal (mainly b-sheet) and a C-termi-
nal (mainly a-helical) domain. Both domains are con-
nected by a central a-helix (residues 139–154), which is
located at the bottom of the active site cleft and contains
several of the catalytically important residues. In the pre-
sent study, mesophilic TLP from B. cereus and thermo-
philic thermolysin is compared. The structures share a
high sequence and structure similarity (Table S1). The two
enzymes differ in their thermostability and in the level of
enzyme activity, when measured at the same temperature.
The simulated thermal-unfolding behavior of B. cereus
TLP and thermolysin is similar, as expressed by a corre-
sponding percolation behavior. The analysis of the cluster
configuration entropy plot revealed that the biologically
relevant transitions that correspond to the folded–
unfolded transitions can be identified at 350 and 373 K,
respectively [Fig. 1(d)]. The difference in experimental
optimal growth temperatures of the corresponding species
is 25 K (Table I). In Figure 3, the rigid cluster decomposi-
tion of the networks of B. cereus TLP and thermolysin,
respectively, is shown directly before and after the phase
transition. Before the transition, the giant cluster domi-
nates the system in both cases. Moreover, the giant cluster
is located in the same region of the proteins: it extends
over the N-terminal domain and comprises the b-sheet
region and an a-helix in the N-terminal domain (residues
67–89). The size of the giant cluster in thermolysin differs
from that reported previously.42 This is because, in the
present study, a modified method to determine the phase-
transition temperature was applied (see Methods section).
In addition to these common features, some differences
also exist. The networks also contain several smaller rigid
clusters, mainly located in the C-terminal domain, but
also in the N-terminal one. The size and number of these
clusters vary, with more and larger rigid clusters existing
in B. cereus TLP compared to thermolysin.
After the phase transition, the giant cluster decays into
smaller rigid clusters and regions that are flexible.
Interestingly, the decay of the giant cluster is in the N-termi-
nal domain, where the main site of proteolytic degradation
is assumed. Thus, the thermal-unfolding simulation predicts
that the global unfolding starts at similar sites as observed
for local-unfolding events that lead to kinetic degradation.
Comparing the constraint networks before and after
the phase transition allows identifying the microscopic
structural origins of the thermostability of thermolysin.
For both mesophilic TLP and thermolysin, similar
unfolding regions can be identified. The first region
comprises almost all residues of the b-sheet in the N-ter-
minal domain, the second one a Ca21-binding site (resi-
dues 56, 58, 60, and 68), the hydrophobic region around
Figure 3Snapshots from the thermal unfolding simulation of meso- (a, b) and thermophilic (c, d) thermolysin-like protease (TLP) just before (a, c) and
after the phase transitions (b, d) at 350 and 373 K, respectively. The rigid cluster decomposition of the network is shown. The giant cluster is
shown in blue. Other clusters are shaded black. Arrows in (b) indicate potential unfolding nuclei. Roman numbers refer to the numbering of the
unfolding nuclei in Table III. The N- and C-terminal domains are marked. The location of the active site is indicated by an asterisk. [Color figure
can be viewed in the online issue, which is available at wileyonlinelibrary.com.]
S. Radestock and H. Gohlke
1096 PROTEINS
Phe 63, and the N-terminal residues of the a-helix in the
N-terminal domain (residues 67–89). The predicted
unfolding regions are in good agreement with sites where
stabilizing mutations have successfully been introduced
experimentally into TLP to increase thermostability
(Table III).86–94 This holds true in particular for
the unfolding region comprised residues 56–70 in the N-
terminal domain (region II). Little correspondence, how-
ever, was found for the residues of the b-sheet region in
the N-terminal domain (region I). No unfolding regions
were found in the C-terminal domain. This is in agree-
ment with the experimental studies showing that muta-
tions at various sites in the C-terminal domain only have
a marginal effect on thermal stability.93,95–99
For both IPMDH and TLP, we found a strong agree-
ment between the events that take place at the unfolding
transition in the mesophilic enzyme and its thermophilic
homologue. In particular, the location and the size of the
giant cluster were comparable. In both the mesophilic
and the thermophilic enzyme, corresponding unfolding
regions were identified. The only difference in the global-
unfolding behavior between the mesophilic and the ther-
mophilic protein was the higher thermoresistance of the
unfolding regions. Apparently, thermostability is deter-
mined by the temperature dependence of local changes
in flexibility and rigidity of the systems.
Microscopic stability is determinedby the emergence of flexible regions
By analyzing the decay of the giant cluster, we identi-
fied the above unfolding regions that are important for
macroscopic stability, because their breaking determined
the global folding–unfolding transition. This demon-
strates that, within a rigid cluster, a hierarchy of regions
of varying stability can exist.50,100 Revealing such hier-
archies in detail provides information about microscopic
stability features, which can be associated with local
unfolding events. To characterize microscopic stability
features from the thermal-unfolding simulation, ‘‘stability
maps’’ are introduced here. For this, ‘‘rigid contacts’’
between two residues, represented by their Ca atoms, are
identified. A ‘‘rigid contact’’ exists if two residues belong
to the same rigid cluster. During a thermal-unfolding
simulation, ‘‘stability maps’’ are then constructed in that,
for each residue pair, Ecut is identified at which a ‘‘rigid
contact’’ between two residues is lost. [Note that, by Eq.
(4), Ecut also relates to a temperature at which the ‘‘rigid
contact’’ is lost.] That way, a contact’s stability relates to
the microscopic stability in the network, and, taken
together, the microscopic stabilities of all residue–residue
contacts result in a stability map. Stability maps are
comparable to cooperativity correlation plots used in
related studies.101,102 However, stability maps provide
direct information on the distribution of rigidity and flexi-
bility within the system, and they also identify regions that
are flexibly or rigidly correlated across the structure. In
turn, cooperativity correlation plots are calculated from an
ensemble of realizations of a given state of the protein, and
they primarily identify regions that are correlated across
the entire ensemble of these realizations.101,102
In Figures 4 and 5, the stability maps for IPMDH and
TLP are shown, respectively. The upper and lower tri-
angles of the matrices display the stability maps for the
mesophilic and the thermophilic homologues, with corre-
sponding rows and columns determined by a structural
alignment. Gaps in the structural alignment are colored in
gray. Relative microscopic stability values are color coded
in that contacts that are stable at temperatures above the
phase-transition temperature are shown in red, whereas
contacts that have already been broken when this tempera-
ture is reached are shown in white. For both IPMDH and
TLP, the size and the localization of the giant cluster at
temperatures slightly below the respective phase-transition
temperature are similar for the mesophilic and the ther-
mophilic proteins. Moreover, the size and the localization
of many other rigid clusters are corresponding.
Histograms of the differences between microscopic
stability values of rigid contacts in thermophilic IPMDH
and TLP and microscopic stability values of correspon-
ding rigid contacts in the mesophilic proteins are shown
in Figure S3(a,b), respectively. A negative stability diffe-
rence indicates that a rigid contact in the thermophilic
protein is stronger than that in the mesophilic protein.
For both IPMDH and TLP, stability differences mostly
range from 0 to 22.0 kcal mol21, with �55 and 75%,
Table IIIComparison of predicted with experimentally verified unfolding sites for mesophilic thermolysin-like
protease (TLP)
Structure regionResidues in predictedunfolding regionsa
Experimentally verifiedthermostabilizing mutationsa,b
I b-Sheet in the N-terminal domain 21–24, 29, 31–34, 39–42, 44,101–107, 114–115, 122–123
19, 38, 104, 120, 134, 142, 183
II Ca21-binding site, hydrophobic regionaround Phe 63, and N-terminus of thea-helix in the N-terminal domain
54, 56–62, 68–70 4, 8, 57, 59, 61, 64, 66–70
aNumbering in the alignment, as shown in Figure S2.bSee text for details. According to ref. 86–94.
Protein Rigidity and Thermophilic Adaptation
PROTEINS 1097
respectively, of the contacts in the range of 0 and
21.0 kcal mol21. Thus, it becomes apparent that, for the
thermophilic protein, the majority of rigid contacts are
stronger when compared with the mesophilic homologue,
but that the increase in strength per contact is moderate.
To finally quantify the relation of microstability features
of those pairs of meso- and thermophilic homologous
proteins in our data set for which the phase-transition
temperature of the thermophilic protein was predicted to
be higher than that of the mesophilic counterpart, corre-
lation coefficients were calculated for the respective stabi-
lity maps (Table IV). These values range from 0.25 to
0.90, with an average value of 0.53 � 0.19, showing weak
to strong correlations in all cases. These correlations are
significant with P-value � 0.05. For example, scatter plots
of the microscopic stability values of rigid contacts in
mesophilic versus thermophilic IPMDH and TLP, respec-
tively, are shown in Figure S4(a,b).
Overall, these results demonstrate that microscopic
stability features are conserved between mesophilic and
thermophilic homologues. In agreement with what has
been found when analyzing the general percolation beha-
vior based on global parameters, the results demonstrate
that mesophilic and thermophilic proteins are in corre-
sponding states of similar rigidity and flexibility, even at
the microscopic level. Such a conservation of microscopic
stability features strongly suggests that the features are
important for enzyme function and activity.1,24,103
Relating microscopic stability andenzyme activity
Experimentally, the interplay between flexibility, stabi-
lity, and activity has been studied by measuring kinetic
parameters that describe enzyme function and catalysis
(Km and kcat).3 However, these parameters do not
provide insights at an atomic level into the importance
of flexibility and rigidity for enzyme activity. A
more direct view on protein flexibility and microstability
is obtained by biophysical techniques that analyze
Figure 4Combined stability map for E. coli and T. thermophilus IPMDH subunits (SU) 1 and 2. In the upper (lower) half of the matrix, stability
information for the mesophilic (thermophilic) protein is shown. Relative microscopic stability values are color coded: contacts that are stable at
temperatures above the phase-transition temperature are shown in red, and instable contacts are shown in white. Gaps in the alignment of both
structures or residues that do not form any rigid contacts are shown in gray. Catalytic residues are highlighted by black arrows, and other
functionally important residues are highlighted by gray arrows. Contacts between the residues in the giant cluster (at a temperature just below Tp)
are surrounded by a solid line. [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary.com.]
S. Radestock and H. Gohlke
1098 PROTEINS
dynamical properties of the structure, such as H/D
exchange, neutron scattering, and NMR spectroscopy.24
However, even then, it remains difficult to quantify the
influence of flexibility and rigidity due to the different
types of motions and timescales detected. Such difficulties
may explain why, in some cases, thermophilic enzymes
were unexpectedly found by H/D exchange to display a
similar,104,105 or even increased, flexibility when com-
pared with their mesophilic homologues.58,106–108
Using CNA, no assumption needs to be made with
respect to timescales and types of motions. This method
allows studying rigidity and flexibility in atomic resolu-
tion and thus characterizing microscopic stability fea-
tures. Here, we analyze how adaptive mutations of ther-
mophilic enzymes maintain the balance between rigidity,
important for macroscopic stability, and local flexibility,
important for activity, regardless of the temperature at
which the protein functions. The role of local flexibility
is most apparent for regions involved in substrate bind-
ing and product release.
Figure 5Combined stability map for B. cereus thermolysin-like protease (TLP) and thermolysin. In the upper (lower) half of the matrix, stability
information for the mesophilic (thermophilic) protein is shown. Relative microscopic stability values are color coded: contacts that are stable at
temperatures above the phase-transition temperature are shown in red, and instable contacts are shown in white. Gaps in the alignment of both
structures or residues that do not form any rigid contacts are shown in gray. Catalytic residues are highlighted by black arrows, and other
functionally important residues are highlighted by gray arrows. Contacts between the residues in the giant cluster (at a temperature just below
Tp) are surrounded by a solid line. Contacts between the residues belonging to the a-helix in the N-terminus are surrounded by a dashed line.
The latter region is shown enlarged. [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary.com.]
Table IVCorrelation between the stability maps from mesophilic
and thermophilic homologues
Protein familya rb
1 0.464 0.305 0.486 0.558 0.699 0.3510 0.6511 0.6612 0.3313 0.5315 0.7416 0.2518 0.90Mean 0.53 � 0.19
aNumbers refer to family numbers in Table I. Only those families were considered
for which the phase-transition temperature of the thermophilic protein was pre-
dicted to be higher than that of the mesophilic counterpart.bPearson’s correlation coefficient.
Protein Rigidity and Thermophilic Adaptation
PROTEINS 1099
3-Isopropylmalate dehydrogenase
The active site of 3-isopropylmalate dehydrogenase
(IPMDH) is located in a cleft between the two subunits
and is made up of residues from both subunits.62 The
catalytic residues Lys 195 and Asp 227 belong to one
subunit, whereas Tyr 145 belongs to the other (the num-
bering refers to the alignment shown in the Supporting
Information). The catalytic residues are indicated by
black arrows in Figure 4. The catalytic residues are
located in domain 2 of the enzyme, which has an impor-
tant role for macroscopic stability (see above). As illus-
trated in Figure 6(a,d) for meso- and thermophilic
IPMDH, respectively, the active site is in the immediate
vicinity of the giant cluster at the working temperature
(see Methods section) of the corresponding enzyme (337
and 352 K, respectively), indicating that the catalytic resi-
dues are in a relatively stable region of the protein.
Examining the stability map in Figure 4 reveals that
contacts between the catalytic residues and their neigh-
boring residues are relatively unstable when compared
with contacts between residues in the giant cluster. (In
Fig. 4, contacts between residues of the giant cluster are
surrounded by a solid line.) Thus, the catalytic residues
are not part of the giant cluster, but located in a locally
flexible region, allowing a certain degree of flexibility for
them, which is anticipated to be necessary for catalysis.
Still, because of the vicinity to the giant cluster, a certain
degree of global stability is retained.
As shown for the thermophilic enzyme, this view is
corroborated in that the microstability values of the
active site show that these residues become part of the
giant cluster as early as at temperatures 7–10 K below
the respective phase-transition temperature [Figs. 4 and
6(c)]. Thus, at the working temperature of the meso-
philic enzyme (337 K), the active site of the thermophilic
homologue is rigidified. It is highly likely that this is the
Figure 6Active site of meso- (a, b) and thermophilic (c, d) 3-isopropylmalate dehydrogenase (IPMDH) at the working temperature of the mesophilic
enzyme (337 K) (a, c) and the working temperature of the thermophilic enzyme (352 K) (b, d). The rigid cluster decomposition of the network is
shown. The giant cluster is shown in blue. Other clusters are shaded black. Catalytic residues are shown in red, and other functionally important
residues are shown in green.
S. Radestock and H. Gohlke
1100 PROTEINS
reason why the T. thermophilus IPMDH has been found
to be only marginally active then.29 In turn, at the work-
ing temperature of the thermophilic enzyme (352 K), the
active site of the mesophilic homologue is identified to
be completely flexible and, hence, has lost all structural
stability [Fig. 6(b)].
Aside from the microscopic stability of the catalytic
residues, the rigidity and flexibility characteristics of resi-
dues involved in substrate and cofactor binding are
expected to have a prominent role for the activity of
IPMDH, even more so, because, upon binding of IPM
and NAD1, IPMDH shows an induced fit of the binding
site. Especially, the cofactor binding triggers conforma-
tional changes in the immediate vicinity of both the
adenosine- and the nicotinamide-binding sites: the Ca
atoms of residues 14–18, 284–299, and 334–347 in
domain 1 and 261–266 at the domain interface move by
up to 2.5 A to bind the adenine moiety (the numbering
refers to the alignment shown in Fig. S1).62,109 In turn,
residues 77–91 in domain 1 move at the nicotinamide-
binding site.109 These residues are indicated by gray
arrows in Figure 4. All these residues are located in loops,
with the exception of residues 16–18 and 90–91, which
are located at the amino terminal ends of a-helices. Thelongest of the moving segments comprises residues 284–
299. This loop closes over an analog of NAD1, ADP-
ribose, when bound to IPMDH, allowing Asp 298 and
Leu 290 to bind the adenine moiety.62
As a prerequisite for these movements, contacts
between residues in the moving regions and neighboring
regions need to be unstable. This is shown in the stability
map (see Fig. 4), where it can be seen that the microscopic
stabilities of these residues are in fact low: residues 14–18
and 77–91 become rigidified (i.e., they are integrated into
the giant cluster) only at temperatures 10–20 K below the
phase-transition temperature [Fig. 6(c)]. At this stage, the
regions cannot move toward the binding site anymore,
and the closed conformation of IPMDH cannot be
adopted, which may be another reason why the thermo-
philic enzyme is only marginally active at the temperature
of maximal activity of the mesophilic enzyme [Fig. 6(a)].
Notably, cold-active forms of T. thermophilus IPMDH
have been engineered by introducing mutations into
exactly these fragments in order to improve binding by
favoring the closed conformation.110–113
Thermolysin-like protease
The active site of TLP is located in a cleft between the
N- and the C-terminal domain. TLP binds a single-cata-
lytic Zn21 ion. Two mechanisms of action have been
proposed for TLP: one in which Glu 144 acts as a proton
acceptor in catalysis,114,115 and a second in which His
232 acts as a general base instead of the glutamate (the
numbering refers to the alignment shown in Fig.
S2).116,117 In either mechanism, residues Glu 144, His
232, Tyr 158, and a Zn21 bound water play important
roles during catalysis.84 The Zn21 is coordinated by two
histidine residues (His 143 and His 147), a glutamate
(Glu 167), and a water molecule. Most of the catalytic
and Zn21-coordinating residues belong to the central a-helix (residues 139–154) that represents the bottom of
the binding site.84 The catalytic and Zn21 coordinating
residues are indicated by black arrows in Figure 5.
For TLPs, a hinge-bending motion has been postulated
to be important for substrate binding and catalysis. In
this context, the central a-helix (residues 139–154) has
been identified as a flexible joint.118 Particularly, residues
Gly 136 and Gly 137 at the N-terminus of this a-helixare important.119 In addition, Gly 79 in the center of an
a-helix in the N-terminal domain (residues 67–89) has
been found to be important for hinge bending.119–121
These residues are indicated by gray arrows in Figure 5.
Mutations of residue 79 influenced substrate binding by
altering mobility in the binding cleft. Mutations of
residues 136 and 137 affected motions directly involved
in catalysis and, hence, influenced kcat.
The microscopic stabilities of the functionally impor-
tant residues 79, 136, and 137 can be determined from the
unfolding simulation. To this aim, the stability map of
meso- and thermophilic TLP in Figure 5 is examined. It is
expected that contacts between the functionally important
residues and their neighboring residues are unstable, thus
allowing for a local flexibility. For Gly 79, this can be
observed primarily in the case of mesophilic TLP: contacts
between residue 79 and other residues of the long a-helixin the N-terminal domain (residues 67–89) are relatively
unstable compared to mutual contacts of other residues
within the a-helix. (In Fig. 5, contacts between these resi-
dues are surrounded by a black-dashed line, and this
region is shown enlarged.) The result shows that the influ-
ence of residue 79 is limited to a local region, most likely
because Gly 79 acts as a a-helix breaker.The importance of residues 136 and 137 as hinge resi-
dues can also be understood from examining the stability
map. Contacts between the residues of the giant cluster
and contacts between residues of a large cluster in the C-
terminal domain 138-208 are relatively stable, whereas
contacts between residues 125–137 are relatively unstable.
At temperatures just below the respective phase-transition
temperature, residues 125–137 form a flexible region that
assures that the giant cluster does not extend into the C-
terminal domain and does not fuse with residues 138–
208 forming the other cluster. That way, residues 125–
137 can act as a hinge. As shown in Figure 7(a,d) for
meso- and thermophilic TLP, respectively, some of the
catalytic and Zn21 coordinating residues belong to the
large cluster in the C-terminal domain at the working
temperature (see Methods section) of the corresponding
enzyme (342 and 364 K, respectively). Thus, these resi-
dues are in a locally rigid region, which can nevertheless
move globally due to the hinge. The global mobility and
Protein Rigidity and Thermophilic Adaptation
PROTEINS 1101
local rigidity characteristics have been confirmed
by recent H/D exchange experiments that were per-
formed to study the conformational dynamics of free
thermolysin.122
As shown for thermolysin, at temperatures 20–30 K
below the phase-transition temperature, the global mobi-
lity is lost, because the hinge region rigidifies [Fig. 7(c)].
This happens because contacts are then established
between residues of the hinge (125-136) and residues of
the giant cluster in the C-terminal domain (137-208).
This result explains experimental findings that show that
thermolysin is only marginally active at a temperature
that corresponds to the Tog of the mesophilic enzyme, 25
K below its own Tog.123 In turn, at a temperature where
the thermophilic enzyme is predicted to be fully active
(364 K), the active site of the mesophilic homologue is
completely flexible [Fig. 7(b)].
The importance of the flexibility of the hinge region
has also been confirmed by mutational studies.90,124
Mutating Leu 145 by Ser increased kcat and, thus,
improved the enzyme’s activity but did not influence the
thermostability of the protein.124 On the other hand,
mutating Gly 120 to Glu improved the activity, too, but
had a negative effect on macroscopic stability.90 These
observations can be understood with the help of the
CNA results: whereas the mutation of residue 145 was
performed in a region that is not important for macro-
scopic stability, the mutation of residue 120 targets an
identified unfolding region. Apparently, activity could be
modulated independently from stability when a mutation
was introduced at a site that does not correspond to an
unfolding region. It is encouraging to note that CNA
allows understanding this complex relationship between
activity and thermostability.
Figure 7Active site of meso- (a, b) and thermophilic (c, d) thermolysin-like protease (TLP) at the working temperature of the mesophilic enzyme (342 K)
(a, c) and the working temperature of the thermophilic enzyme (364 K) (b, d). The rigid cluster decomposition of the network is shown. The giant
cluster is shown in blue. Other clusters are shaded black. Catalytic residues are shown in red, and other functionally important residues are shown
in green.
S. Radestock and H. Gohlke
1102 PROTEINS
CONCLUSIONS
On the basis of the constraint network representations
of proteins, and by applying rigidity theory, we analyzed
the local distribution of flexible and rigid regions in 19
pairs of homologous proteins from meso- and thermo-
philic organisms and related it to activity characteristics
of the enzyme structures. The present study thus extends
a previous one by us, which related protein structural ri-
gidity and thermostability.42
From a biological point of view, the most intriguing
result of our study is that adaptive mutations of thermo-
philic enzymes maintain the balance between overall ri-
gidity, important for thermostability, and local flexibility,
important for activity, at the respective temperature at
which the protein functions: our results demonstrate that
thermophilic adaptation leads to an increase of structural
rigidity in general, but that the distribution of function-
ally important flexible regions is conserved between
meso- und thermophilic homologues. This finding pro-
vides direct evidence for the hypothesis of corresponding
states. To the best of our knowledge, this study is the
first one probing this hypothesis for a large data set by
computational means and by directly characterizing pro-
tein rigidity and flexibility. Our results demonstrate that
exploiting the principle of corresponding states not only
allows for successful thermostability optimization but
also for guiding experiments in order to improve enzyme
activity in protein engineering.
In more detail, our results show that specific regions
are crucial for macroscopic stability of protein structures.
When these unfolding nuclei become flexible, the struc-
tural integrity of the native folded state is lost, and the
protein unfolds. Notably, unfolding nuclei identified for
IPMDH and TLP are in good agreement with sites that
are known to be important for thermostability and where
thermostabilizing mutations have been introduced. Thus,
we suggest that the unfolding regions represent sites
where mutations should be introduced that might modu-
late a protein’s thermostability. Detailed analysis of the
flexibility in the IPMDH and TLP substrate-binding
regions with the help of ‘‘stability maps,’’ introduced here
for the first time, showed that enzyme function requires
certain regions of the protein to be flexible at the work-
ing temperature of the respective enzyme. We were able
to qualitatively relate changes in the flexibility of those
regions, induced either by a temperature change or by
the introduction of mutations, to experimentally
observed losses of the enzymes’ functions.
From a methodological point of view, it is important
to note that protein thermostability can be predicted by
characterizing the mechanical rigidity of a protein struc-
ture during a thermal-unfolding simulation. CNA is
based on a graph-theoretical method that determines ri-
gidity and flexibility within a protein structure in atomic
resolution. The approach is extremely fast and computa-
tionally inexpensive. Moreover, no assumption on the
type and the timescale of motion is made. Remarkably,
CNA implicitly captures and unifies many different
mechanisms that contribute to increased thermostability
and to activity at high temperatures. Thus, CNA provides
an appealing alternative to MD simulations or experi-
ments. Finally, the approach is entirely based on the
native topology of a protein. This can be rationalized in
that we are identifying when the giant rigid cluster
breaks, which is controlled by native contacts.
Network representations of proteins have already been
used to predict flexibility and stability characteristics of
protein structures. Vendruscolo et al.125,126 showed that
protein structure networks are ‘‘small world’’ networks.
In such networks, atoms or residues can be identified
that determine the topology of the network.45 Heringa
and Argos127 used protein structure networks to identify
clusters of tightly packed side chains. Later on, they
showed that these clusters were important for protein
thermostability.128 Brinda et al.129,130 used a similar
approach to identify differences between meso- and ther-
mophilic proteins. Robinson-Rechavi et al.19 used such
an approach to identify structural properties of thermo-
philic proteins, too. In the present study, we go beyond
analyzing topological properties of protein structure net-
works. Instead, protein structures are represented as con-
straint networks (molecular frameworks), where rigid
and flexible regions can be identified.42
A shortcoming of our method is the fact that a struc-
ture of high quality is needed for analysis. Moreover, ri-
gidity analysis is sensitive with respect to deficiencies in
the structures, and extrinsic factors such as glycosylation,
salt, pressure effects, or solvent viscosity may further
influence thermostability. Thus, future work is needed to
make the approach viable also in those cases where only
low-resolution structures or homology models are avail-
able and/or to include such extrinsic factors. An
approach to reduce sensitivity toward a single structure
has been developed by Jacobs et al.101,102 Using the dis-
tance constraint model (DCM), a large number of com-
binations of constraints is generated as an input to rigid-
ity analysis, and the results are then averaged. However,
the DCM requires a fit to experimental data, which limits
the general applicability of the method.
Compared to other computational methods for protein
engineering,131–133 CNA provides an interesting alterna-
tive. Most importantly, we expect CNA to be a valuable
tool for predicting ‘‘weak spots’’ in a protein structure,
where mutations could improve microscopic stability and
consequently macroscopic stability, ultimately leading to
an increased thermostability. In addition, CNA provides
detailed information about regions that should not be
stabilized in order to not interfere with enzyme activity.
Such information will be very valuable, because even in
an elaborate study describing the successful design of a
thermostable enzyme,131 only coarse assumptions have
Protein Rigidity and Thermophilic Adaptation
PROTEINS 1103
been made so as to not influence the active site’s stability
during the design process.
METHODS
Data set
A detailed description of the data set selection is given
in Ref. 42. Here, we summarize the procedure. Crystallo-
graphic models of homologous pairs of mesophilic and
thermophilic protein structures were collected from the
Protein Data Bank (PDB)134 using data sets from the
recent literature as a starting point.15,130 Only homolo-
gous pairs were considered that fulfilled the following
criteria: (i) the resolution is �2.6 A; (ii) structures were
excluded in cases where the crystallographic model did
not correspond to the biological unit; (iii) the quality of
the structures was checked using the PDBREPORT data-
base,135 and structures with the qualification ‘‘bad’’ were
excluded. Excluded structures were replaced by other
available structures of the same protein if possible.
For enzymes, information about catalytic residues was
taken from the Catalytic Site Atlas.136 To ensure struc-
tural homology, only sequences sharing �35% residue
identity were considered, as were only pairs of structures
that shared the same oligomeric and ligand-binding state
and had a similar resolution. In some cases, it was not
possible to find structures that met all of the above crite-
ria. In the case of the thermophilic protein of family 10
(IPMDH), a mutant was taken, because the wild-type
enzyme was only available as a monomer from the PDB.
The biological unit, however, is a homodimer. Never-
theless, the mutant shows only slight differences in ther-
mostability and no difference in enzyme activity when
compared with the wild-type protein.80 For further
details on other protein families, see ref. 42.
The final data set contained 19 homologous pairs of
mesophilic and thermophilic protein structures (Table
S1). The data set differs from a previously published
one42 in that every protein family is now being repre-
sented by only one pair of structures. Structural similar-
ity was determined using the DaliLite server.137 Where
necessary, the crystallographic model was modified by
flipping Asn, Gln, or His side chains using REDUCE.138
The optimal growth temperature (Tog) was assigned to
each protein by considering the living environment
temperature or the average of a range of habitat tempera-
tures of the corresponding species.21,139 Melting
temperatures were taken from the literature and the Pro-
Therm database.140
Constraint network analysis
Network construction
To construct the constraint network, only the protein
structure was used. Water, solvent molecules, buffer ions,
substrates, and cofactor molecules were removed from
the crystallographic models. Metal ions were retained
when they were part of the structure. Bonds between
ions and protein atoms were treated as covalent bonds
and inserted manually. The constraint network of the
covalent and noncovalent bonds present within the pro-
tein was constructed using the FIRST software (version
6.2).43 Hydrogen bonds (including salt bridges) were
modeled by a bond between the hydrogen and the
acceptor atom and two additional angular constraints
associated with these atoms.43 Hydrogen-bond energies
Ehb were calculated from an empirical potential based on
donor-hydrogen-acceptor atom geometries as given in
the crystallographic model.51 The position of the hydro-
gen atoms was used as predicted by the REDUCE soft-
ware.138 Hydrophobic interactions between carbon or
sulfur atoms were taken into account if the distance
between these atoms was less than the sum of their van
der Waals radii (C: 1.7 A, S: 1.8 A) plus 0.25 A.50
Rigid cluster decomposition
Rigidity within the constraint network can be fully char-
acterized using constraint counting.48 To this aim, in
FIRST, the ‘‘pebble game’’ algorithm is implemented.43
This algorithm determines for every bond whether it is
part of a rigid cluster or a flexible region.48,141,142 A
rigid cluster is defined as a set of atoms that move together
as a rigid body in any floppy motion. Otherwise, an atom
is part of a flexible region. The size of a rigid cluster was
determined as the number of atoms that form the cluster.
Cluster size was further normalized by the total number of
atoms within the network. By definition, the giant cluster
is the largest cluster present in a given network.
Thermal-unfolding simulation
By gradually removing noncovalent bond constraints
from the network, one can go from a rigid network with a
large number of bonds to a flexible network with only a
few bonds. Knowing the energies Ehb of all potential
hydrogen bonds (including salt bridges) in the network,
the energy values were used for excluding hydrogen bonds
from the network with Ehb > Ecut. A new network was gen-
erated for every single hydrogen bond that was removed.
The networks were then decomposed into rigid clusters, as
described earlier. As the energy of a hydrogen bond relates
to a temperature T at which the bond breaks43,51 (see also
below), thermal unfolding can be simulated that way.50
The number of hydrophobic contacts was kept fixed
during the thermal-unfolding simulation.
Network parameter evaluation
Global changes in the rigidity characteristics of the
network during the thermal-unfolding simulation were
characterized using the rigidity-order parameter (P1)
S. Radestock and H. Gohlke
1104 PROTEINS
and the s2-cluster configuration entropy (H) as defined
in the Theory section. A phase transition was identified
as the most prominent change in a H versus Ecut plot. To
do so, a spline was fitted to the entropy values. Ecut,p is
the hydrogen bond energy cutoff where the first-order
derivative of the spline reaches its maximum.
To determine a phase-transition temperature Tp from
Ecut,p, a relation T 5 f(Ecut) needs to be established. For
this, we calibrated Ecut 5 Ecut,p values with respect to ex-
perimental melting temperatures Tm, considering only
those pairs of meso- and thermophilic proteins for which
Ecut,p of the mesophilic protein was correctly predicted to
be larger than Ecut,p of the thermophilic protein. For
those proteins, for which only optimal growth tempera-
tures of the corresponding species (Tog) were available,
Tm values were estimated by assuming that they are 25 K
above the Tog value.143 Optimal growth temperatures
have previously been used in studies comparing meso-
and thermophilic proteins.16 This resulted in the follow-
ing relation:
T ¼ �20K=ðkcalmol�1Þ�Ecut þ 300 K ð4Þ
Considering that, at Ecut, only hydrogen bonds with
Ehb � Ecut are retained in the network, Eq. (4) also
provides a mapping of the procedure for removing
hydrogen bonds onto a scale of temperature. In contrast
to previous studies43,60 where the hydrogen bond
energy was related to thermal fluctuations in terms of
kT, Eq. (4) provides a less steep relation of hydrogen
bond energy with temperature. We note that Eq. (4)
works well for determining differences in Tp for pairs of
homologous proteins and to relate these to differences
in experimentally determined melting temperatures (Tm)
or optimal growth temperatures of the corresponding
species (Tog). However, the relation does not allow to
compare Tp values between nonhomologous proteins.
This is because the size of a constraint network and
the topology of the underlying protein influence the
absolute Tp value.
Local stability characteristics were identified based on
a stability map. At a given temperature, rigidity analysis
labels portions of the protein as rigid or flexible. A stabi-
lity map shows how pairs of residues at a given tempera-
ture are related to each other in terms of their rigidity
characteristics. On the map, a color is assigned to a pair
of residues according to the stability of their ‘‘contact.’’
Here, two residues form a ‘‘contact’’ if both are part of
the same rigid cluster. The contact stability is then
defined as the Ecut value at which the contact breaks
during the thermal-unfolding simulation. Stability maps
of the same size can be compared by measuring Pearson’s
correlation of the two matrices. For pairs of homologous
proteins, corresponding residue positions in the sequence
were identified based on structural alignments obtained
from the DaliLite server.137
For comparing a pair of homologous proteins in their
respective functional states, working temperatures for the
enzymes were determined, such that (i) these tempera-
tures are lower by the same amount than the respective
Tp values and (ii) structural elements that need to move
for the enzymes to be functional come out as flexible in
the rigidity analysis. This leads to working temperatures
3 K below the respective Tp values in the case of IPMDH
and 8 K in the case of TLP.
ACKNOWLEDGMENTS
We are grateful to Daniel Cashman, Doris Klein,
Prakash Rathi (Heinrich Heine-University, Dusseldorf),
and Benjamin Radestock (Johannes Gutenberg-Univer-
sitat, Mainz) for critically reading the manuscript.
REFERENCES
1. Jaenicke R, Bohm G. The stability of proteins in extreme environ-
ments. Curr Opin Struct Biol 1998;8:738–748.
2. Sterner R, Liebl W. Thermophilic adaptation of proteins. Crit Rev
Biochem Mol Biol 2001;36:39–106.
3. Fields PA. Review: protein function at thermal extremes: balancing
stability and flexibility. Comp Biochem Physiol A: Comp Physiol
2001;129:417–431.
4. Demirjian DC, Moris-Varas F, Cassidy CS. Enzymes from extremo-
philes. Curr Opin Chem Biol 2001;5:144–151.
5. Atomi H. Recent progress towards the application of hyperthermo-
philes and their enzymes. Curr Opin Chem Biol 2005;9:166–173.
6. Turner P, Mamo G, Karlsson EN. Potential and utilization of ther-
mophiles and thermostable enzymes in biorefining. Microb Cell
Fact 2007;6:1–23.
7. Bommarius AS, Broering JM, Chaparro-Riggers JF, Polizzi KM.
High-throughput screening for enhanced protein stability. Curr
Opin Biotechnol 2006;17:606–610.
8. Eijsink VGH, Bjork A, Gaseidnes S, Sirevag R, Synstad B, van den
Burg B, Vriend G. Rational engineering of enzyme stability. J Bio-
technol 2004;113:105–120.
9. Eijsink VGH, Gaseidnes S, Borchert TV, van den Burg B. Directed
evolution of enzyme stability. Biomol Eng 2005;22:21–30.
10. Russell RJM, Taylor GL. Engineering thermostability—lessons from
thermophilic proteins. Curr Opin Biotechnol 1995;6:370–374.
11. Chaparro-Riggers JF, Polizzi KM, Bommarius AS. Better library
design: data-driven protein engineering. Biotechnol J 2007;2:180–
191.
12. Razvi A, Scholtz JM. Lessons in stability from thermophilic pro-
teins. Protein Sci 2006;15:1569–1578.
13. Berezovsky IN, Shakhnovich EI. Physics and evolution of thermo-
philic adaptation. Proc Natl Acad Sci USA 2005;102:12742–12747.
14. Zeldovich KB, Berezovsky IN, Shakhnovich EI. Protein and DNA
sequence determinants of thermophilic adaptation. PloS Comput
Biol 2007;3:62–72.
15. Sadeghi M, Naderi-Manesh H, Zarrabi M, Ranjbar B. Effective
factors in thermostability of thermophilic proteins. Biophys Chem
2006;119:256–270.
16. Gromiha MM, Oobatake M, Sarai A. Important amino acid pro-
perties for enhanced thermostability from mesophilic to thermo-
philic proteins. Biophys Chem 1999;82:51–67.
17. Kumar S, Tsai CJ, Nussinov R. Factors enhancing protein thermo-
stability. Protein Eng 2000;13:179–191.
18. Gianese G, Bossa F, Pascarella S. Comparative structural analysis of
psychrophilic and meso- and thermophilic enzymes. Proteins
2002;47:236–249.
Protein Rigidity and Thermophilic Adaptation
PROTEINS 1105
19. Robinson-Rechavi M, Alibes A, Godzik A. Contribution of electro-
static interactions, compactness and quaternary structure to
protein thermostability: lessons from structural genomics of Ther-
motoga maritima. J Mol Biol 2006;356:547–557.
20. Szilagyi A, Zavodszky P. Structural differences between mesophilic,
moderately thermophilic and extremely thermophilic protein sub-
units: results of a comprehensive survey. Structure 2000;8:493–504.
21. Querol E, Perez-Pons JA, Mozo-Villarias A. Analysis of protein
conformational characteristics related to thermostability. Prot Eng
1996;9:265–271.
22. Vogt G, Argos P. Protein thermal stability: hydrogen bonds or
internal packing? Fold Des 1997;2:S40–S46.
23. Vieille C, Zeikus JG. Thermozymes: identifying molecular determi-
nants of protein structural and functional stability. Trends Biotech-
nol 1996;14:183–190.
24. Sterner R, Brunner E. The relationship between catalytic activity,
structural flexibility and conformational stability as deduced from
the analysis of mesophilic-thermophilic enzyme pairs and protein
engineering studies. In: Robb F, Antranikian G, Grogan D,
Driessen A, editors. Thermophiles: biology and technology at high
temperatures. London: CRC Press; 2008. pp 25–38.
25. Jaenicke R. Protein stability and molecular adaptation to extreme
conditions. Eur J Biochem 1991;202:715–728.
26. Tang KES, Dill KA. Native protein fluctuations: the confor-
mational-motion temperature and the inverse correlation of pro-
tein flexibility with protein stability. J Biomol Struct Dyn
1998;16:397–411.
27. Vihinen M. Relationship of protein flexibility to thermostability.
Prot Eng 1987;1:477–480.
28. Wrba A, Schweiger A, Schultes V, Jaenicke R, Zavodszky P.
Extremely thermostable D-glyceraldehyde-3-phosphate dehydrogen-
ase from the eubacterium Thermotoga maritima. Biochemistry
1990;29:7584–7592.
29. Zavodszky P, Kardos J, Svingor A, Petsko GA. Adjustment of
conformational flexibility is a key event in the thermal adaptation
of proteins. Proc Natl Acad Sci USA 1998;95:7406–7411.
30. Jaenicke R. Do ultrastable proteins from hyperthermophiles have
high or low conformational rigidity? Proc Natl Acad Sci USA
2000;97:2962–2964.
31. Kamerzell TJ, Middaugh CR. The complex inter-relationships
between protein flexibility and stability. J Pharm Sci 2008;97:3494–
3517.
32. Georlette D, Blaise V, Collins T, D’Amico S, Gratia E, Hoyoux A,
Marx JC, Sonan G, Feller G, Gerday C. Some like it cold: biocatal-
ysis at low temperatures. FEMS Microbiol Rev 2004;28:25–42.
33. van Gunsteren WF, Hunenberger PH, Kovacs H, Mark AE, Schiffer
CA. Investigation of protein unfolding and stability by computer
simulation. In: Dobson CM, Fersht AR, editors. Protein folding.
Cambridge: Cambridge University Press; 1995. pp 49–60.
34. Purmonen M, Valjakka J, Takkinen K, Laitinen T, Rouvinen J. Mo-
lecular dynamics studies on the thermostability of family 11 xyla-
nases. Protein Eng Des Sel 2007;20:551–559.
35. Sham YY, Ma B, Tsai CJ, Nussinov R. Thermal unfolding molecu-
lar dynamics simulation of Escherichia coli dihydrofolate reductase:
thermal stability of protein domains and unfolding pathway.
Proteins 2002;46:308–320.
36. Pang JY, Allemann RK. Molecular dynamics simulation of thermal
unfolding of Thermatoga maritima DHFR. Phys Chem Chem Phys
2007;9:711–718.
37. Creveld LD, Amadei A, van Schaik RC, Pepermans HAM, de Vlieg
J, Berendsen HJC. Identification of functional and unfolding
motions of cutinase as obtained from molecular dynamics com-
puter simulations. Proteins 1998;33:253–264.
38. Case DA. Normal mode analysis of protein dynamics. Curr Opin
Struct Biol 1994;4:285–290.
39. Su JG, Li CH, Hao R, Chen WZ, Wang CX. Protein unfolding beha-
vior studied by elastic network model. Biophys J 2008;94:4586–4596.
40. Ahmed A, Gohlke H. Multiscale modeling of macromolecular
conformational changes combining concepts from rigidity and
elastic network theory. Proteins 2006;63:1038–1051.
41. Gohlke H, Thorpe MF. A natural coarse graining for simulating
large biomolecular motion. Biophys J 2006;91:2115–2120.
42. Radestock S, Gohlke H. Exploiting the link between protein rigi-
dity and thermostability for data-driven protein engineering. Eng
Life Sci 2008;8:507–522.
43. Jacobs DJ, Rader AJ, Kuhn LA, Thorpe MF. Protein flexibility pre-
dictions using graph theory. Proteins 2001;44:150–165.
44. Dill KA. Dominant forces in protein folding. Biochemistry
1990;29:7133–7155.
45. Bode C, Kovacs IA, Szalay MS, Palotai R, Korcsmaros T, Csermely P.
Network analysis of protein dynamics. FEBS Lett 2007;581:2776–2782.
46. Greene LH, Higman VA. Uncovering network systems within
protein structures. J Mol Biol 2003;334:781–791.
47. Whiteley W. Counting out to the flexibility of molecules. Phys Biol
2005;2:S116–S126.
48. Jacobs DJ. Generic rigidity in three-dimensional bond-bending
networks. J Phys A: Math Gen 1998;31:6653–6668.
49. Hespenheide BM, Rader AJ, Thorpe MF, Kuhn LA. Identifying
protein folding cores from the evolution of flexible regions during
unfolding. J Mol Graph Mod 2002;21:195–207.
50. Rader AJ, Hespenheide BM, Kuhn LA, Thorpe MF. Protein unfol-
ding: rigidity lost. Proc Natl Acad Sci USA 2002;99:3540–3545.
51. Dahiyat BI, Gordon DB, Mayo SL. Automated design of the
surface positions of protein helices. Protein Sci 1997;6:1333–1337.
52. Makhatadze GI, Privalov PL. Energetics of protein structure. Adv
Protein Chem 1995;47:307–425.
53. Thorpe MF. Continuous deformations in random networks.
J Non-Cryst Solids 1983;57:355–370.
54. Stauffer D, Aharony A. Introduction to percolation theory.
London: Taylor and Francis; 1994.
55. Albert R, Barabasi AL. Statistical mechanics of complex networks.
Rev Mod Phys 2002;74:47–97.
56. Tsang IR, Tsang IJ. Cluster size diversity, percolation, and complex
systems. Phys Rev E 1999;60:2684–2698.
57. Andraud C, Beghdadi A, Lafait J. Entropic analysis of random
morphologies. Phys A 1994;207:208–212.
58. Fitter J, Herrmann R, Dencher NA, Blume A, Hauss T. Activity
and stability of a thermostable a-amylase compared to its meso-
philic homologue: mechanisms of thermal adaptation. Biochemi-
stry 2001;40:10723–10731.
59. Vieille C, Zeikus GJ. Hyperthermophilic enzymes: sources, uses,
and molecular mechanisms for thermostability. Microbiol Mol Biol
Rev 2001;65:1–43.
60. Gohlke H, Kuhn LA, Case DA. Change in protein flexibility upon
complex formation: analysis of Ras-Raf using molecular dynamics
and a molecular framework approach. Proteins 2004;56:322–337.
61. Wallon G, Kryger G, Lovett ST, Oshima T, Ringe D, Petsko GA.
Crystal structures of Escherichia coli and Salmonella typhimurium
3-isopropylmalate dehydrogenase and comparison with their ther-
mophilic counterpart from Thermus thermophilus. J Mol Biol
1997;266:1016–1031.
62. Kadono S, Sakurai M, Moriyama H, Sato M, Hayashi Y, Oshima
T, Tanaka N. Ligand-induced changes in the conformation of 3-
isopropylmalate dehydrogenase from Thermus thermophilus. J
Biochem 1995;118:745–752.
63. DeLano WL. The PyMOL molecular graphics system. Palo Alto,
CA: DeLano Scientific; 2002.
64. Motono C, Oshima T, Yamagishi A. High thermal stability of 3-
isopropylmalate dehydrogenase from Thermus thermophilus result-
ing from low delta Cp of unfolding. Protein Eng 2001;14:961–966.
65. Motono C, Yamagishi A, Oshima T. Urea-induced unfolding and
conformational stability of 3-isopropylmalate dehydrogenase from
the thermophile Thermus thermophilus and its mesophilic counter-
part from Escherichia coli. Biochemistry 1999;38:1332–1337.
S. Radestock and H. Gohlke
1106 PROTEINS
66. Akanuma S, Yamagishi A, Tanaka N, Oshima T. Serial increase in
the thermal stability of 3-isopropylmalate dehydrogenase from
Bacillus subtilis by experimental evolution. Prot Sci 1998;7:698–705.
67. Tamakoshi M, Nakano Y, Kakizawa S, Yamagishi A, Oshima T.
Selection of stabilized 3-isopropylmalate dehydrogenase of Saccha-
romyces cerevisiae using the host-vector system of an extreme ther-
mophile, Thermus thermophilus. Extremophiles 2001;5:17–22.
68. Ohkuri T, Yamagishi A. Increased thermal stability against irrevers-
ible inactivation of 3-isopropylmalate dehydrogenase induced by
decreased van der Waals volume at the subunit interface. Prot Eng
2003;16:615–621.
69. Aoshima M, Oshima T. Stabilization of Escherichia coli isopropyl-
malate dehydrogenase by single amino acid substitution. Prot Eng
1997;10:249–254.
70. Kirino H, Aoki M, Aoshima M, Hayashi Y, Ohba M, Yamagishi A,
Wakagi T, Oshima T. Hydrophobic interaction at the subunit inter-
face contributes to the thermostability of 3-isopropylmalate de-
hydrogenase from an extreme thermophile, Thermus thermophilus.
Eur J Biochem 1994;220:275–281.
71. Nemeth A, Svingor A, Pocsik M, Dobo J, Magyar C, Szilagyi A,
Gal P, Zavodszky P. Mirror image mutations reveal the significance
of an intersubunit ion cluster in the stability of 3-isopropylmalate
dehydrogenase. FEBS Lett 2000;468:48–52.
72. Hori T, Moriyama H, Kawaguchi J, Hayashi-Iwasaki Y, Oshima T,
Tanaka N. The initial step of the thermal unfolding of 3-isopropyl-
malate dehydrogenase detected by the temperature-jump Laue
method. Prot Eng 2000;13:527–533.
73. Kotsuka T, Akanuma S, Tomuro M, Yamagishi A, Oshima T.
Further stabilization of 3-isopropylmalate dehydrogenase of an
extreme thermophile, Thermus thermophilus, by a suppressor
mutation method. J Bacteriol 1996;178:723–727.
74. Numata K, Hayashi-Iwasaki Y, Kawaguchi J, Sakurai M, Moriyama
H, Tanaka N, Oshima T. Thermostabilization of a chimeric enzyme
by residue substitutions: four amino acid residues in loop regions are
responsible for the thermostability of Thermus thermophilus isopro-
pylmalate dehydrogenase. Biochim Biophys Acta 2001;1545:174–183.
75. Numata K, Hayashiiwasaki Y, Yutani K, Oshima T. Studies on
interdomain interaction of 3-isopropylmalate dehydrogenase from
an extreme thermophile, Thermus thermophilus, by constructing
chimeric enzymes. Extremophiles 1999;3:259–262.
76. Numata K, Muro M, Akutsu N, Nosoh Y, Yamagishi A, Oshima T.
Thermal stability of chimeric isopropylmalate dehydrogenase genes
constructed from a thermophile and a mesophile. Prot Eng
1995;8:39–43.
77. Sakurai M, Moriyama H, Onodera K, Kadono S, Numata K,
Hayashi Y, Kawaguchi J, Yamagishi A, Oshima T, Tanaka N. The
crystal-structure of thermostable mutants of chimeric 3-isopropyl-
malate dehydrogenase, 2T2M6T. Prot Eng 1995;8:763–767.
78. Imada K, Sato M, Tanaka N, Katsube Y, Matsuura Y, Oshima T. 3-
Dimensional structure of a highly thermostable enzyme, 3-isopro-
pylmalate dehydrogenase of Thermus thermophilus at 2.2 A resolu-
tion. J Mol Biol 1991;222:725–738.
79. Akanuma S, Qu CX, Yamagishi A, Tanaka N, Oshima T. Effect of
polar side chains at position 172 on thermal stability of 3-isopro-
pylmalate dehydrogenase from Thermus thermophilus. FEBS Lett
1997;410:141–144.
80. Qu CX, Akanuma S, Moriyama H, Tanaka N, Oshima T. A muta-
tion at the interface between domains causes rearrangement
of domains in 3-isopropylmalate dehydrogenase. Prot Eng
1997;10:45–52.
81. Tamakoshi M, Yamagishi A, Oshima T. Screening of stable proteins
in an extreme thermophile, Thermus thermophilus. Mol Microbiol
1995;16:1031–1036.
82. Akanuma S, Yamagishi A, Tanaka N, Oshima T. Spontaneous
tandem sequence duplications reverse the thermal stability of
carboxyl-terminal modified 3-isopropylmalate dehydrogenase.
J Bacteriol 1996;178:6300–6304.
83. Nurachman Z, Akanuma S, Sato T, Oshima T, Tanaka N. Crystal
structures of 3-isopropylmalate dehydrogenases with mutations at
the C-terminus: crystallographic analyses of structure-stability rela-
tionships. Prot Eng 2000;13:253–258.
84. Adekoya OA, Sylte I. The thermolysin family (M4) of enzymes:
therapeutic and biotechnological potential. Chem Biol Drug Des
2009;73:7–16.
85. Corbett RJ, Roche RS. The unfolding mechanism of thermolysin.
Biopolymers 1983;22:101–105.
86. Imanaka T, Shibazaki M, Takagi M. A new way of enhancing the
thermostability of proteases. Nature 1986;324:695–697.
87. Mansfeld J, Vriend G, Dijkstra BW, Veltman OR, VandenBurg B,
Venema G, UlbrichHofmann R, Eijsink VGH. Extreme stabilization
of a thermolysin-like protease by an engineered disulfide bond.
J Biol Chem 1997;272:11152–11156.
88. van den Burg B, Vriend G, Veltman OR, Venema G, Eunsink
VGH. Engineering an enzyme to resist boiling. Proc Natl Acad Sci
USA 1998;95:2056–2060.
89. Hardy F, Vriend G, Veltman OR, Vandervinne B, Venema G,
Eijsink VGH. Stabilization of Bacillus stearothermophilus neutral
protease by introduction of prolines. FEBS Lett 1993;317:89–92.
90. Kidokoro S, Miki Y, Endo K, Wada A, Nagao H, Miyake T,
Aoyama A, Yoneya T, Kai K, Ooe S. Remarkable activity enhance-
ment of thermolysin mutants. FEBS Lett 1995;367:73–76.
91. van den Burg B, Enequist HG, Vanderhaar ME, Eijsink VGH, Stulp
BK, Venema G. A highly thermostable neutral protease from
Bacillus caldolyticus—cloning and expression of the gene in Bacillus
subtilis and characterization of the gene product. J Bacteriol
1991;173:4107–4115.
92. Veltman OR, Vriend G, Hardy F, Mansfeld J, van den Burg B,
Venema G, Eijsink VGH. Mutational analysis of a surface area that
is critical for the thermal stability of thermolysin-like proteases.
Eur J Biochem 1997;248:433–440.
93. Veltman OR, Vriend G, Middelhoven PJ, van den Burg B, Venema
G, Eijsink VGH. Analysis of structural determinants of the stability
of thermolysin-like proteases by molecular modelling and site-
directed mutagenesis. Prot Eng 1996;9:1181–1189.
94. van den Burg B, Dijkstra BW, Vriend G, Vandervinne B, Venema
G, Eijsink VGH. Protein stabilization by hydrophobic interactions
at the surface. Eur J Biochem 1994;220:981–985.
95. Eijsink VGH, Dijkstra BW, Vriend G, Vanderzee JR, Veltman OR,
Vandervinne B, Vandenburg B, Kempe S, Venema G. The effect of
cavity-filling mutations on the thermostability of Bacillus stearo-
thermophilus neutral protease. Prot Eng 1992;5:421–426.
96. Eijsink VGH, Vriend G, Vandenburg B, Vanderzee JR, Veltman OR,
Stulp BK, Venema G. Introduction of a stabilizing 10-residue b-hair-pin in Bacillus subtilis neutral protease. Prot Eng 1992;5:157–163.
97. Eijsink VGH, Vriend G, Vandenburg B, Vanderzee JR, Venema G.
Increasing the thermostability of a neutral protease by replacing
positively charged amino-acids in the N-terminal turn of alpha-
helices. Prot Eng 1992;5:165–170.
98. Eijsink VGH, Vriend G, Vandervinne B, Hazes B, Vandenburg B,
Venema G. Effects of changing the interaction between subdomains
on the thermostability of Bacillus neutral proteases. Proteins
1992;14:224–236.
99. Margarit I, Campagnoli S, Frigerio F, Grandi G, Defilippis V, Fontana
A. Cumulative stabilizing effects of glycine to alanine substitutions in
Bacillus subtilis neutral protease. Prot Eng 1992;5:543–550.
100. Sitharam M, Agbandje-McKenna M. Modeling virus self-assembly
pathways: avoiding dynamics using geometric constraint decompo-
sition. J Comput Biol 2006;13:1232–1265.
101. Livesay DR, Jacobs DJ. Conserved quantitative stability/flexibility
relationships (QSFR) in an orthologous RNase H pair. Proteins
2006;62:130–143.
102. Livesay DR, Huynh DH, Dallakyan S, Jacobs DJ. Hydrogen bond
networks determine emergent mechanical and thermodynamic
properties across a protein family. Chem Cent J 2008;2:1–20.
Protein Rigidity and Thermophilic Adaptation
PROTEINS 1107
103. Somero GN. Temperature adaptation of enzymes—biological opti-
mization through structure-function compromises. Annu Rev Ecol
Syst 1978;9:1–29.
104. Hollien J, Marqusee S. Structural distribution of stability in a ther-
mophilic enzyme. Proc Natl Acad Sci USA 1999;96:13674–13678.
105. Hollien J, Marqusee S. Comparison of the folding processes of
T. thermophilus and E. coli ribonucleases H. J Mol Biol
2002;316:327–340.
106. Fitter J, Heberle J. Structural equilibrium fluctuations in meso-
philic and thermophilic alpha-amylase. Biophys J 2000;79:1629–
1636.
107. Hernandez G, Jenney FE, Adams MWW, LeMaster DM. Milli-
second time scale conformational flexibility in a hyperthermophile
protein at ambient temperature. Proc Natl Acad Sci USA
2000;97:3166–3170.
108. Hernandez G, LeMaster DM. Reduced temperature dependence of
collective conformational opening in a hyperthermophile rub-
redoxin. Biochemistry 2001;40:14384–14391.
109. Hurley JH, Dean AM. Structure of 3-isopropylmalate dehydro-
genase in complex with NAD1—ligand-induced loop closing and
mechanism for cofactor specificity. Structure 1994;2:1007–1016.
110. Hirose R, Suzuki T, Moriyama H, Sato T, Yamagishi A, Oshima T,
Tanaka N. Crystal structures of mutants of Thermus thermophilus
IPMDH adapted to low temperatures. Prot Eng 2001;14:81–84.
111. Suzuki T, Yasugi M, Arisaka F, Oshima T, Yamagishi A. Cold-adap-
tation mechanism of mutant enzymes of 3-isopropylmalate dehy-
drogenase from Thermus thermophilus. Prot Eng 2002;15:471–476.
112. Yasugi M, Amino M, Suzuki T, Oshima T, Yamagishi A. Cold
adaptation of the thermophilic enzyme 3-isopropylmalate dehydro-
genase. J Biochem 2001;129:477–484.
113. Yasugi M, Suzuki T, Yamagishi A, Oshima T. Analysis of the effect
of accumulation of amino acid replacements on activity of 3-iso-
propylmalate dehydrogenase from Thermus thermophilus. Prot Eng
2001;14:601–607.
114. Hangauer DG, Monzingo AF, Matthews BW. An interactive
computer-graphics study of thermolysin-catalyzed peptide cleavage
and inhibition by N-carboxymethyl dipeptides. Biochemistry
1984;23:5730–5741.
115. Matthews BW. Structural basis of the action of thermolysin and
related zinc peptidases. Acc Chem Res 1988;21:333–340.
116. Mock WL, Aksamawati M. Binding to thermolysin of phenolate-
containing inhibitors necessitates a revised mechanism of catalysis.
Biochem J 1994;302:57–68.
117. Mock WL, Stanford DJ. Arazoformyl dipeptide substrates for ther-
molysin. Confirmation of a reverse protonation catalytic mecha-
nism. Biochemistry 1996;35:7369–7377.
118. van Aalten DMF, Amadei A, Linssen ABM, Eijsink VGH, Vriend
G, Berendsen HJC. The essential dynamics of thermolysin—confir-
mation of the hinge-bending motion and comparison of simula-
tions in vacuum and water. Proteins 1995;22:45–54.
119. Veltman OR, Eijsink VGH, Vriend G, de Kreij A, Venema G, van
den Burg B. Probing catalytic hinge bending motions in thermoly-
sin-like proteases by glycine to alanine mutations. Biochemistry
1998;37:5305–5311.
120. Holland DR, Hausrath AC, Juers D, Matthews BW. Structural-ana-
lysis of zinc substitutions in the active-site of thermolysin. Prot Sci
1995;4:1955–1965.
121. Holland DR, Tronrud DE, Pley HW, Flaherty KM, Stark W, Janso-
nius JN, Mckay DB, Matthews BW. Structural comparison suggests
that thermolysin and related neutral proteases undergo hinge-
bending motion during catalysis. Biochemistry 1992;31:11310–
11316.
122. Liu YH, Konermann L. Conformational dynamics of free and cata-
lytically active thermolysin are indistinguishable by hydrogen/deu-
terium exchange mass spectrometry. Biochemistry 2008;47:6342–
6351.
123. de Kreij A, van den Burg B, Venema G, Vriend G, Eijsink VG,
Nielsen JE. The effects of modifying the surface charge on the cata-
lytic activity of a thermolysin-like protease. J Biol Chem
2002;277:15432–15438.
124. Yasukawa K, Inouye K. Improving the activity and stability of ther-
molysin by site-directed mutagenesis. Biochim Biophys Acta
2007;1774:1281–1288.
125. Vendruscolo M, Dokholyan NV, Paci E, Karplus M. Small-world
view of the amino acids that play a key role in protein folding.
Phys Rev E 2002;65:1–4.
126. Dokholyan NV, Li L, Ding F, Shakhnovich EI. Topological determi-
nants of protein folding. Proc Natl Acad Sci USA 2002;99:8637–
8641.
127. Heringa J, Argos P. Side-chain clusters in protein structures and
their role in protein folding. J Mol Biol 1991;220:151–171.
128. Heringa J, Argos P, Egmond MR, Devlieg J. Increasing thermal sta-
bility of subtilisin from mutations suggested by strongly interacting
side-chain clusters. Prot Eng 1995;8:21–30.
129. Kannan N, Vishveshwara S. Identification of side-chain clusters in
protein structures by a graph spectral method. J Mol Biol
1999;292:441–464.
130. Brinda KV, Vishveshwara S. A network representation of protein
structures: implications for protein stability. Biophys J
2005;89:4159–4170.
131. Korkegian A, Black ME, Baker D, Stoddard BL. Computational
thermostabilization of an enzyme. Science 2005;308:857–860.
132. Kuhlman B, Baker D. Native protein sequences are close to optimal
for their structures. Proc Natl Acad Sci USA 2000;97:10383–10388.
133. Malakauskas SM, Mayo SL. Design, structure and stability of a
hyperthermophilic protein variant. Nat Struct Biol 1998;5:470–475.
134. Bourne PE, Addess KJ, Bluhm WF, Chen L, Deshpande N, Feng
ZK, Fleri W, Green R, Merino-Ott JC, Townsend-Merino W,
Weissig H, Westbrook J, Berman HM. The distribution and query
systems of the RCSB protein data bank. Nucleic Acids Res
2004;32:D223–D225.
135. Hooft RWW, Vriend G, Sander C, Abola EE. Errors in protein
structures. Nature 1996;381:272.
136. Porter CT, Bartlett GJ, Thornton JM. The Catalytic Site Atlas: a
resource of catalytic sites and residues identified in enzymes using
structural data. Nucleic Acids Res 2004;32:D129–D133.
137. Holm L, Park J. DaliLite workbench for protein structure compari-
son. Bioinformatics 2000;16:566–567.
138. Word JM, Lovell SC, Richardson JS, Richardson DC. Asparagine
and glutamine: using hydrogen atom contacts in the choice of
side-chain amide orientation. J Mol Biol 1999;285:1735–1747.
139. Vogt G, Woell S, Argos P. Protein thermal stability, hydrogen
bonds, and ion pairs. J Mol Biol 1997;269:631–643.
140. Bava KA, Gromiha MM, Uedaira H, Kitajima K, Sarai A.
ProTherm, version 4.0: thermodynamic database for proteins and
mutants. Nucleic Acids Res 2004;32:D120–D121.
141. Jacobs DJ, Thorpe MF. Generic rigidity percolation—the pebble
game. Phys Rev Lett 1995;75:4051–4054.
142. Jacobs DJ, Hendrickson B. An algorithm for two-dimensional
rigidity percolation: the pebble game. J Comput Phys
1997;137:346–365.
143. Dehouck Y, Folch B, Rooman M. Revisiting the correlation
between proteins’ thermoresistance and organisms’ thermophilicity.
Prot Eng Des Sel 2008;21:275–278.
S. Radestock and H. Gohlke
1108 PROTEINS