
Review

An improved model of association for VH–VL immunoglobulin domains: Asymmetries between VH and VL in the packing of some interface residues

Enrique Vargas-Madrazo1* and Enrique Paz-García2

1Instituto de Investigaciones Biológicas, Universidad Veracruzana, Xalapa, Veracruz, México
2Facultad de Biología, Universidad Veracruzana, Xalapa, Veracruz, México

The antibody-binding site is formed as a result of the association between VH and VL domains. Several studies have shown that this association plays an important role in the mechanism of antigen–antibody interaction (Stanfield et al. Structure 1: 83–93, 1993). Considering this, we propose that variations in the VH–VL association are part of the diversification strategy of the antibody repertoires. Previously, a model of association for VH–VL domains based on geometrical characteristics of the packing at the interface was developed by Chothia et al. (J. Mol. Biol. 186: 651–663, 1985). This model includes a common association form for antibodies and a three-layer structure for the interface. In the present work, a complementary model is introduced to account for the general geometrical restrictions of the VH–VL interface, and particular arrangements related to the chemical properties or the side-chain orientations of participating residues. Groups of residues assume common side-chain orientations, which are apparently related to particular functions of different interface zones. Analyses of amino acid usage and network are in agreement with the side-chain orientation patterns. Based on these observations, a three-zone model has evolved to illuminate geometrical and functional restrictions acting over the VH–VL interface. Additionally, this study has revealed the asymmetrical relationships between VH and VL residues important for the association of the two domains. Copyright © 2003 John Wiley & Sons, Ltd.

Keywords: antibody engineering; domain association; contact analysis; amino acid usage; diversification strategy

Received 9 October 2002; revised 8 January 2003; accepted 10 January 2003

INTRODUCTION

The capacity of the immune system to mount an adequate response to diverse antigenic challenges depends mainly on its propensity to generate specific receptors (Igs and TCRs) of adequate diversity and affinity (Berek and Milstein, 1987). Specifically, as part of the humoral component of the immune response, B-lymphocytes of vertebrates have developed an arsenal of mechanisms that make this diversification of specificities possible (Tonegawa, 1983). In terms of the antibody-binding site structure, these mechanisms generate the following events: (i) substitution of residues; (ii) insertion and deletion of segments; and (iii) variations in the types of association between VH and VL domains (Wu and Kabat, 1970; Padlan, 1977; Novotny et al., 1983; Stevens et al., 1988). It is possible to understand the last component of diversification by considering the antibody-binding site as a result of the quaternary interaction between VH and VL. The recognition properties of the antibody-binding site can be modified by altering the association of VH–VL domains, a mechanism that has been proposed by several authors (Davies and Metzger, 1983; Chang et al., 1985; Chothia et al., 1985; Colman, 1988; Stevens et al., 1988; Stanfield et al., 1993; Chatellier et al., 1996; Khalifa et al., 2000). Modifications of the VH–VL association can be achieved through the substitution of residues located at the interior of the interface, generating small re-adjustments in the relative disposition between the two domains. These alterations do not have to be of sufficient magnitude to disturb the global stability of the quaternary interaction between VH and VL (Chatellier et al., 1996; Banfield et al., 1997; Jäger and Plückthun, 1997; Khalifa et al., 2000). Changes in the VH–VL association can modify the relative positions of the hypervariable loops, which in turn can alter the general shape of the antigen-binding site, as well as the disposition of side-chains that interact directly with the epitope (Chang et al., 1985; Colman, 1988; Stanfield et al., 1993). By this mechanism it is possible to generate variations of the binding site within a basic motif, thus contributing to modifications in the recognition properties of the antibody (Chang et al., 1985; Banfield et al., 1997; Chatellier et al., 1996; Khalifa et al., 2000).

JOURNAL OF MOLECULAR RECOGNITION
J. Mol. Recognit. 2003; 16: 113–120
Published online in Wiley InterScience (www.interscience.wiley.com). DOI:10.1002/jmr.613

*Correspondence to: E. Vargas-Madrazo, Apartado Postal 495, Xalapa, Ver., 91000, México.
E-mail: [email protected]
Contract/grant sponsor: Proyectos Estratégicos 1999–2000 UV.
Contract/grant sponsor: SNI-CONACYT.

Abbreviations used: CDR, complementarity-determining region; H1, H2 and H3, first, second and third hypervariable loop of heavy chain; Igs, immunoglobulins; L1, L2 and L3, first, second and third hypervariable loop of light chain; TCR, T-cell receptor; VH, variable heavy domain; VL, variable light domain.

The perception of the importance of the VH–VL association in the diversification of the antibody repertoires impelled several investigators to characterize the general properties of the VH–VL interface (Davies et al., 1975; Amzel and Poljak, 1979; Davies and Metzger, 1983; Novotny et al., 1983). These studies showed the presence of highly conserved zones, suggesting a common association form for the antibodies, as well as the contributions that certain hypervariable residues make to the interface. In 1985 Chothia et al. proposed a model of association which considers the general aspects of the interface geometry and the packing of residues involved in that interaction (Novotny et al., 1983; Novotny and Haber, 1985; Chothia et al., 1985). The analysis of this geometry allowed Chothia et al. (1985) to propose a three-layer packing model for the VH–VL interface. According to this model, an inner layer is formed by side-chains from the twisted ends of external strands that converge at the center of the interface. In addition, there are two external layers, one composed of the central segment of external strands and the second encompassing the remaining residues of internal strands (Chothia et al., 1985). This model permits the identification of the main attributes of the general geometry characterizing the VH–VL association.

However, the study of Chothia et al. was based on only three crystallographic structures. This small database precludes analyses of contact frequencies and patterns of side-chain orientation. Such analyses are useful for describing networks of interactions between diverse residues and their relative contributions to the interface. A comparison of this information with amino acid usage by position helps to define the arrangements of residues consistent with the geometric restrictions of the VH–VL association and functional aspects pertinent to the recognition mechanism (Davies et al., 1975; Stevens et al., 1988; Davies and Metzger, 1983; Colman, 1988; Stanfield et al., 1993).

Hypermutation patterns (Tomlinson et al., 1996; Wiens et al., 1998; Ramírez-Benítez and Almagro, 2001) and protein engineering of antibodies (Wedemayer et al., 1997; Daugherty et al., 2000) and other molecules (Clackson and Wells, 1995) have suggested that it is not energetically efficient to replace residues in contact with the ligand because of the high risk of radical alterations in the ligand–receptor complementarity. Consequently, the most suitable strategy is to modify side chains in the periphery or ‘vernier’ zone to generate a ‘fine-tuning’ effect by making more limited changes in the complementarity (Foote and Winter, 1992; Schillbach et al., 1993). Substitution of hypervariable residues involved in the VH–VL interface thus appears to be a suitable mechanism to contribute to this strategy of ‘fine-tuning’ (Schiffer et al., 1985, 1988; Colman, 1988; Foote and Winter, 1992). It is therefore fundamental to establish a detailed model of the VH–VL association that facilitates interpretation of substitution patterns and their effects on the recognition properties of antibodies (Stevens et al., 1988).

In the present report, we describe a model based on: (i) analyses of 23 crystallographic structures of antibodies; (ii) analyses of contact frequencies and orientations of side-chains located at the VH–VL interface; and (iii) the amino acid usage by position. Four principal concepts have emerged from this work: (1) not all of the residues predicted by Chothia et al. (1985) in the VH–VL association are truly involved; (2) residue L96, one of the key residues in the inner layer of the Chothia model, rotates and retreats from this layer in almost all antibodies examined; (3) H3 participates actively in the interface and contributes ‘extra’ residues; and (4) some hypervariable residues converge near the antigen-binding site in what we have designated the proximate zone.

MATERIALS AND METHODS

Structural data

Atomic coordinates of 23 human and murine antibody fragments were obtained from the Protein Data Bank (PDB; www.rcsb.org/pdb; see Table 1). We selected structures determined at medium and high resolution (≤3.0 Å), with R-factors no greater than 22% and with H3 loops of short, medium and large length (Wu et al., 1993). Thirteen samples were unliganded fragments and 10 were present in complexes with ligands. Among the light chains, there were six lambda- and 17 kappa-type molecules.

Table 1.

Antibody | PDB code | Resolution (Å) | R-value (%) | H3 length
N10 | 1NSN | 2.9 | 19 | 4
50.1 | 1GGI | 2.8 | 18 | 5
CHA255 | 1IND | 2.2 | 18 | 5
ANO2 | 1BAF | 2.9 | 19 | 6
D44.1 | 1MLB | 2.1 | 18 | 7
4-4-20 | 1FLR | 1.85 | 18 | 7
D1.3 | 1FDL | 2.5 | 18 | 8
YST9.1 | 1MAM | 2.45 | 21 | 8
SE155-4 | 1MFE | 2.0 | 16 | 9
J539 | 2FBJ | 1.95 | 19 | 9
17E8 | 1EAP | 2.5 | 18 | 10
R6.5 | 1RMF | 2.81 | 18 | 10
McPC603 | 1MCP | 2.7 | 22 | 11
N1G9 | 1NGP | 2.4 | 19 | 11
HIL | 8FAB | 1.8 | 17 | 12
POT | 1IGM | 2.3 | 20 | 12
40-50 | 1IBG | 2.7 | 20 | 13
F9.13.7 | 1FBI | 3.0 | 19 | 13
HC19 | 1GIG | 2.3 | 17 | 14
H52 | 1FGV | 1.9 | 18 | 15
OPG2 | 1OPG | 2.0 | 16 | 15
R454511 | 1IKF | 2.5 | 16 | 17
KOL | 2FB4 | 1.9 | 18 | 17

Contact analyses and orientations of side-chains at the VH–VL interface

The inter-atomic contact analysis at the interface was based on the residues previously identified by Chothia et al. (1985). We added neighboring residues that could make contact with the opposite domain. The contact criterion was set at a maximum distance of 4.1 Å between two atoms and computed with the program InsightII (version 2000). The frequency of participation in at least one interdomain contact was calculated for each residue over the set of 23 antibody fragments (see Table 2). For example, residue H95 is involved in contacts in four structures (17%). In all four structures, H95 makes contact with L96 (100%); with L91 in two proteins (50%); and with L89 in one structure (25%).
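The two percentages quoted for H95 follow from a simple tally, which the minimal sketch below reproduces. It is not the InsightII procedure used in the study: the function names and the per-structure partner sets are hypothetical, chosen only so that the totals match the H95 example (four contacting structures out of 23, with L96, L91 and L89 seen in four, two and one of them).

```python
from collections import Counter

CUTOFF = 4.1  # Å, the maximum inter-atomic distance stated in the text


def in_contact(atoms_a, atoms_b, cutoff=CUTOFF):
    """Return True if any atom of residue A lies within `cutoff` of any atom of residue B.

    Atoms are (x, y, z) tuples; squared distances avoid a square root per pair.
    """
    cutoff_sq = cutoff * cutoff
    return any(
        sum((p - q) ** 2 for p, q in zip(a, b)) <= cutoff_sq
        for a in atoms_a
        for b in atoms_b
    )


def contact_statistics(partners_per_structure, n_structures):
    """Summarise interdomain contacts of one residue.

    partners_per_structure: one set of partner labels for each structure in
    which the residue makes at least one interdomain contact.
    Returns (participation %, {partner: % of the contacting structures}).
    """
    n_contacting = len(partners_per_structure)
    if n_contacting == 0:
        return 0.0, {}
    participation = 100.0 * n_contacting / n_structures
    counts = Counter(p for partners in partners_per_structure for p in partners)
    partner_pct = {p: 100.0 * c / n_contacting for p, c in counts.items()}
    return participation, partner_pct


# Hypothetical assignment of partners to structures, consistent with the H95 totals.
h95_partners = [{"L96", "L91", "L89"}, {"L96", "L91"}, {"L96"}, {"L96"}]
participation, partner_pct = contact_statistics(h95_partners, n_structures=23)
print(round(participation), partner_pct)
```

Note that the partner percentages are taken relative to the structures in which the residue makes a contact, not relative to all 23 structures, which is how the H95 example reads.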

Table 2.

Position | Contact percentage (a) | Contacts with residues of the opposite domain (b) (%) | Amino acid usage, human | Amino acid usage, mouse | Secondary structure (Chothia and Lesk, 1987) (c) | Functional segments (Kabat and Wu, 1991) (d)

Light chain (VL)
34 | 56 (13) | H:103-3(23); 100, 100a (15); 97, 98, 100d, 100e, 100b, 100f, 101(8) | A46,N17,S15,H6,Y4,D3 | H34,N26,A19,Y7,E6,S5 | β-Strand | CDR1
36 | 87 (20) | H:103(75); 103-3(55); 100(10); 100c, 101(5) | Y92,F5 | Y79,F9,V6,L4 | β-Strand | FR2
38 | 100 (23) | H:39(100); 91(74); 45(4) | Q92,H4 | Q90,E6 | β-Strand | FR2
44 | 100 (23) | H:103(83); 45(52); 91(30); 39(4) | P99 | P84,V7,F6 | β-Strand | FR2
46 | 35 (8) | H:101(50); 103, 96, 99, 100b, 100d, 100f (12) | L85,V5,R3,T2 | L63,R20,G6,P4,T3 | β-Strand | FR2
87 | 87 (20) | H:45(65); 39(65) | Y93,F6 | Y72,F26 | β-Strand | FR3
89 | 30 (7) | H:103-3(71); 95, 100, 100c(14) | Q71,M6,S5,A4,L3,G3,C2,N2 | Q58,A9,L8,F7,S6,H5 | β-Strand | CDR3
91 | 26 (6) | H:95(33); 47, 97, 100a, 100d (16) | Y51,W12,S10,R8,A6,H2,G2,F2 | W28,S17,G17,Y13,H7,D4,F4,R2 | Loop | CDR3
96 | 87 (20) | H:47(80); 103-3(40); 35(30); 95(20); 100h(10); 96, 97, 100, 101 (5) | Y17,L16,W15,V10,R9,F6,A5,P5,I4,S3,G3,Q2 | L31,W21,Y20,R9,F8,P6 | Loop | CDR3
98 | 87 (20) | H:45(65); 37(35); 103-3(20); 47, 103(20); 100(5) | F99 | F99 | β-Strand | FR4

Heavy chain (VH)
35 | 26 (6) | L:96(100) | S40,H21,N16,G13,T4,Y2,A2 | H37,N26,S17,E8,Y2,D2 | β-Strand | CDR1
37 | 30 (7) | L:98(100) | V66,I30 | V89,I8 | β-Strand | FR2
39 | 100 (23) | L:38(100); 87(57); 44(4) | Q97 | Q95,K3 | β-Strand | FR2
45 | 91 (21) | L:44(57); 98, 87(62); 38(5) | L96 | L98 | β-Strand | FR2
47 | 78 (18) | L:96(94); 95(22); 98(17); 94(5) | W96 | W94,L3,Y2 | β-Strand | FR2
91 | 87 (20) | L:38(85); 44(35) | Y92,F6 | Y80,F18 | β-Strand | FR3
93 | 0 | - | A84,T5,V5,E2 | A85,T6,V3,M2 | β-Strand | FR3
95 | 17 (4) | L:96(100); 91(50); 89(25) | D20,G18,A8,E7,R7,V6,S5,L4,P4,N3,Q3,T3,H3 | Y16,D15,S15,G1R8,E5,N4,L4,H4,P3,W3,A2 | β-Strand | CDR3
103-3 (e) | 83 (19) | L:36(58); 96(42); 89(26); 98(21); 34(16); 46(5); 39(5) | F55,M20,L7,Y3,V2 | F66,M26,L3 | Loop | CDR3
103 | 96 (22) | L:44(82); 36(72); 98(18); 46(5) | W97 | W99 | β-Strand | FR4

(a) The absolute values are given within parentheses.
(b) ‘Extra residues’ are reported in bold. See Materials and methods for details.
(c) The type of secondary structure in which the residue is located is reported.
(d) Location of the residue with respect to functional sub-segments (CDR or FR) is indicated.
(e) See Materials and methods for an explanation for position H103-3.

Relative orientations of side-chains were determined for all residues (H35, H47, H95, H103-3, L34, L46, L91, L96) that might be suitable for inclusion in the ‘proximate zone’ (Table 3). Classification of these residues was carried out by the following procedure: (1) for each residue a plane was passed through its α-carbon perpendicular to the VH–VL pseudo-symmetry axis; (2) if the side-chain of the residue coincided approximately with this plane, its direction was considered neutral; (3) when the side-chain was clearly above the plane of its α-carbon, it was considered in the ‘up-direction’, pointing toward the antigen-binding site; (4) if the side-chain was located below the plane, the orientation was classified as a ‘down-direction’ toward the inner layer of the interface; and (5) when the side chain coincided with the plane but retreated from the interface due to the disposition of the main chain, it was assigned to a separate class. Each of the four classes is listed as a separate row in Table 3. Plate 2 presents examples of the orientations of analyzed side-chains.
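The plane-based classification can be illustrated with a short sketch. The text does not say how 'approximately' or 'clearly above' were judged, so the centroid-based height and the 1.0 Å tolerance used here are assumptions, as is the function name. The fifth case (a side chain in the plane that retreats from the interface) depends on the main-chain disposition and is not modelled.

```python
import math


def classify_orientation(ca, side_chain_atoms, axis, tolerance=1.0):
    """Classify a side chain as 'up', 'down' or 'neutral'.

    ca: (x, y, z) of the alpha-carbon.
    side_chain_atoms: list of (x, y, z) side-chain atom positions.
    axis: vector along the VH-VL pseudo-symmetry axis, pointing toward
          the antigen-binding site (it need not be normalised).
    tolerance: hypothetical half-thickness (in Å) of the 'neutral' slab.
    """
    norm = math.sqrt(sum(c * c for c in axis))
    unit = [c / norm for c in axis]
    n = len(side_chain_atoms)
    centroid = [sum(atom[i] for atom in side_chain_atoms) / n for i in range(3)]
    # Signed height of the side-chain centroid above the plane that passes
    # through the alpha-carbon and is perpendicular to the axis.
    height = sum((centroid[i] - ca[i]) * unit[i] for i in range(3))
    if height > tolerance:
        return "up"
    if height < -tolerance:
        return "down"
    return "neutral"


# Toy coordinates with the axis along +z.
print(classify_orientation((0, 0, 0), [(0, 0, 2), (0, 0, 3)], (0, 0, 1)))
print(classify_orientation((0, 0, 0), [(1, 0, -2)], (0, 0, 1)))
print(classify_orientation((0, 0, 0), [(2, 0, 0.3)], (0, 0, 1)))
```

Using the centroid rather than a single terminal atom is a design choice that makes the result less sensitive to the conformation of long side chains. A visual inspection, as implied by the reference to Plate 2, may classify borderline cases differently.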

Alignments and frequency of amino acid usage

The frequency of amino acid usage for each position in the interface was calculated from a compendium of immunoglobulin sequences (Kabat and Wu, 1991). Together, the data selected consisted of the complete sequences of 1404 human and 2465 murine light chains, plus 1976 human and 4073 murine heavy chains. As illustrated below, manual corrections to Kabat’s assignments were made when necessary, to meet three-dimensional requirements for uniform alignment of residues from different proteins (Chothia and Lesk, 1987; Honegger and Plückthun, 2001). For example, there can be as many as 11 insertions between residues 100 and 101 in Kabat’s numbering scheme for the heavy chain. These residues belong to CDR3 and are designated 100a to 100k if the insertion includes all 11 residues. Irrespective of the length of this insertion, its last residue is linked to the remainder of VH at the position three residues before the highly conserved tryptophan H103. Both this residue, which is renumbered H103-3, and tryptophan H103 play central roles in the model of Chothia et al. (1985).

A conflict was found with the residue of VH that appears three positions in the N-terminal direction with respect to the highly conserved residue H103 (Trp in 99% of the sequences analyzed; Plate 1). According to the model of Chothia et al. (1985), this residue is one of the positions that arch toward the center of the interface as a result of the presence of the β-bulge (Plate 1). Nevertheless, following Kabat’s numbering scheme, the same number is not always assigned to this residue (Kabat and Wu, 1991). This is because the insertions in HCDR3 are located between positions 100 and 101. They are introduced in the alignment (without considering structural information) beginning from the N-terminal portion (Table 4).

In the model of Chothia et al. (1985), a residue located at this region plays a central role in the VH–VL interaction. Our structural analysis indicates that in all the cases (data not shown) the residue previous to H101 in the N-terminal direction is H100k. The calculations of frequency of amino acid usage for each position and the handling of the alignments were made (particularly in HCDR3) using the program VIR-I (Almagro et al., 1995).
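The relabelling can be sketched as follows for the residues running from H100 up to, but not including, H101. This is an illustration of the rule as described in the text and in the Table 4 example, not the VIR-I implementation: the function names are hypothetical, and labelling a lone residue as 100k when there are no insertions is an assumption.

```python
INSERTION_LETTERS = "abcdefghijk"  # Kabat allows up to 11 insertions after H100


def kabat_labels(n_residues):
    """Kabat labels for the residues from H100 up to (not including) H101.

    Insertions are filled from the N-terminal side: 100, 100a, 100b, ...
    """
    if not 1 <= n_residues <= len(INSERTION_LETTERS) + 1:
        raise ValueError("expected between 1 and 12 residues")
    return ["100"] + ["100" + INSERTION_LETTERS[i] for i in range(n_residues - 1)]


def revised_labels(n_residues):
    """Same residues, but the one preceding H101 is always labelled 100k (H103-3)."""
    labels = kabat_labels(n_residues)
    labels[-1] = "100k"  # assumption: applied even when there are no insertions
    return labels


# Example from Table 4: residues S, S, F between H99 and H101.
segment = "SSF"
print(dict(zip(kabat_labels(len(segment)), segment)))
print(dict(zip(revised_labels(len(segment)), segment)))
```

With a fixed label for the residue three positions before Trp H103, amino acid usage at that structural position can be tabulated across sequences whose H3 loops differ in length.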

RESULTS AND DISCUSSION

According to the three-layer model proposed by Chothia et al. (1985), there are 20 residues involved in the VH–VL interface: VH 35, 37, 39, 45, 47, 91, 93, 95, 101 and 103; and VL 34, 36, 38, 44, 46, 87, 89, 91, 96 and 98 (Plate 1). In Table 2, analyses of contact frequency and of amino acid usage are reported for the positions involved in the VH–VL interface. Contact residues fall into two sub-groups: (i) residues H39, H45, H47, H91, H103-3, H103, L34, L36, L38, L44, L87, L96 and L98 participate in contacts at the interface with values near 100%; and (ii) H35, H37, H93, H95, L46, L89 and L91 make contact in a circumstantial or almost null way. These results suggest that contribution to the interface is different for each residue, thus validating a residue-by-residue analysis.
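The two sub-groups can be recovered from the absolute contact counts given in parentheses in Table 2. The sketch below is only an illustration: the text does not state a numerical cutoff, so the 50% threshold is an assumption that happens to separate the two sets, and the names are hypothetical.

```python
N_STRUCTURES = 23

# Number of structures (out of 23) in which each position makes an
# interdomain contact, taken from the parenthesised values in Table 2.
CONTACT_COUNTS = {
    "L34": 13, "L36": 20, "L38": 23, "L44": 23, "L46": 8,
    "L87": 20, "L89": 7, "L91": 6, "L96": 20, "L98": 20,
    "H35": 6, "H37": 7, "H39": 23, "H45": 21, "H47": 18,
    "H91": 20, "H93": 0, "H95": 4, "H103-3": 19, "H103": 22,
}


def split_by_contact(counts, n_structures=N_STRUCTURES, threshold=0.5):
    """Split positions into frequent and circumstantial contact residues.

    threshold is a hypothetical cutoff on the fraction of structures.
    """
    frequent, circumstantial = [], []
    for position, count in counts.items():
        if count / n_structures >= threshold:
            frequent.append(position)
        else:
            circumstantial.append(position)
    return frequent, circumstantial


frequent, circumstantial = split_by_contact(CONTACT_COUNTS)
print("frequent:", frequent)
print("circumstantial:", circumstantial)
```

L34, with 13 of 23 structures, lies closest to the assumed cutoff, so its assignment is the one most sensitive to the choice of threshold.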

Residues of the interface with low contact percentage

The characteristic arrangement of β-strands at the VH–VL interface (Chothia et al., 1983) results in the convergence of residues H45, H47, H103-3, H103, L46, L96 and L98 in the external strands toward the center of the interface (Plate 1). Consequently, the central region of the β-sheets twists away, causing displacement of residues H35, H37, H93, H95, L34, L36, L89 and L91. The main chains of the central β-strands are separated by an average distance of 14 Å (Chothia et al., 1985). This arrangement of displaced residues of the central strands helps explain why they form circumstantial or null contacts (see Plate 1 and Table 2). However, it should be noted that displaced residues L34 and L36 in VL have high contact frequencies (56 and 87%, respectively; see Table 2). Thus, two residues of VL in the medium zone of the central strands participate in the formation of the interface, whereas the displacement is effective for all residues of the central strands of VH.

Table 3.

Orientation (a) | H35 | H47 | H95 | H103-3 | L34 | L46 | L91 | L96
Up | 87 (b) | 100 | 70 | 4 | 78 | 91 | 70 | 87
Down | 13 | 0 | 26 | 83 | 22 | 9 | 26 | 13
[symbol illegible: neutral or retreated] | 0 | 0 | 4 | 0 | 0 | 0 | 4 | 0
[symbol illegible: neutral or retreated] | 0 | 0 | 0 | 13 | 0 | 0 | 0 | 0

(a) See Materials and methods for notation.
(b) Values are reported as percentages. See Materials and methods for details.

Table 4.

Position | 100 | 100a | 100b | 100c | 100d | … | 100k | 101 | 102 | 103
Kabat’s scheme | S | S | F | - | - | … | - | D | Y | W
New scheme | S | S | - | - | - | … | F | D | Y | W

[Plate 1 and Plate 2: captions not recoverable.]

The detailed analysis of the orientation of side-chains in the external strands provided an explanation for the asymmetry of packing between VH and VL in the inner layer. According to the three-layer model of Chothia et al. (1985), residues L96 and H103-3 are oriented toward the inner layer of the interface (Plate 1, also see Fig. 4 in Chothia et al., 1985). However, in 20 of the 23 (87%) structures analyzed in the present study, the side-chain of L96 was rotated away from the inner layer of the interface; i.e. L96 pointed upward toward the antigen-binding site (Plate 2 and Fig. 1; Table 3).

The retreat of L96 from the inner layer predisposes its side chain to other interactions. Within VL, for example, L96 makes contact with L89 and L91 in 79 and 53%, respectively, of the antibodies analyzed (results not shown). These residues lie outside the inner layer and point toward the antigen-binding site (Plates 1 and 2; Table 3). In contrast, L96 makes contact with L98, a residue in the inner layer (Plate 1), in only 5% of the antibodies.

In the case of H103-3, the side-chain points toward the central zone in 83% of the antibodies (Plate 1, Table 3) and interacts with the Trp H103 of the inner layer in 63% of the proteins (data not shown). Plate 2 illustrates an orientation of Phe H103-3 near the center of the interface and separated from residues pointing toward the binding site.

Analyses of amino acid usage in positions contiguous to L96 and H103-3 reinforce the three-dimensional patterns previously described. Residue L89 (adjacent to L96 in Plate 1) is located in the twisted region of the central strands displaced from the interface (Chothia et al., 1985). This position is occupied by Gln in a high percentage of the sequences analyzed (Table 2). Gln is a relatively bulky amino acid, which prevents L96 from orienting toward the inner layer (Fig. 1). Consequently, L96 cannot play as preponderant a role in the inner layer as its H103-3 counterpart in VH. To compensate, L34 and L36 interact with VH in a considerable percentage (56 and 87%, respectively) of the antibodies. In yet another reciprocal response, the VH equivalents (H35 and H37) to L34 and L36 do not participate in the interface (Plate 1 and Table 2).

Residue L89 does not make contact with VH as frequently (30%) as L34 and L36. It seems likely that L89 retreats from the interface in some molecules because of the severe twisting of the β-strands in the zone (Chothia et al., 1985). The VH equivalent of L89 is residue H93, which is Ala in more than 80% of the sequences of human and murine H chains (third column of Table 2). With such a small side chain, it is not surprising that Ala H93 fails to make contacts with VL in any of the structures analyzed. On the other hand, the presence of alanine in position H93 allows the large side chain in position H103-3 (usually a Phe) to participate in the inner layer of the interface in nearly all antibodies examined (Tables 2 and 3). This combination of structural features sterically restricts residues H35 and H37 from participating in the interface (Figure 2). The steric hindrance is reflected in the relatively low percentages of antibodies having H35 (26%) and H37 (30%) as interface contact residues (Table 2).

The differences cited above are consistent with the amino acid usage for the constituents arrayed along the VH–VL interface. In the central strands of VL, locations of L34, L36 and L89 are predominantly occupied by medium-sized and large residues in both human and murine antibodies. The only exception is residue L34, which is Ala in 46% of human sequences (fourth and fifth columns of Table 2). This pattern contrasts with the corresponding positions of VH, in which H35, H37 and H93 are mainly small and medium-sized residues. As an exception, residue H35 is His in 37% of the murine sequences. These results add to the explanations for the differences in participation of VH and VL constituents at the interface.

Two residues (H35 and L34) in the central portion of the inner strands appear to be linked with the ‘proximate zone’ toward the antigen-binding site. Analyses of the side-chain orientations overwhelmingly support this view: in 87 and 78% of the antibodies, H35 and L34 point toward the antigen-binding site (Table 3). Moreover, the residues in contact with H35 and L34 (i.e. L96, H101, H100a, H100b, etc.) are also oriented toward the antigen-binding site (Table 2).

The principal difference in ‘circumstantial’ and ‘active’ interdomain packing interaction appears to be one of degree, but these differences are substantial in some members of the test series. For example, L46 is deemed to participate circumstantially, since it only makes interdomain contacts in 35% of the analyzed antibodies, whereas H47 participates actively (78%). A structural explanation for the difference can be formulated by considering the specific environments of the two residues. Face-to-face packing of VH and VL domains places H47 directly opposite L96 (Plate 2 and Fig. 1). With similar orientations, the two side chains of H47 and L96 are in contact in most of the antibodies (94%). On the other hand, H103-3 is located in front of L46, but its side chain extends in a different direction: H103-3 stretches toward the inner layer in 83% of the cases, and L46 points upward to the antigen-binding site in 91% (Table 3). Hence, the side chains of H103-3 and L46 are not in favorable positions to interact.

[Figure: caption not recoverable.]

Chothia et al. (1985) proposed that H47 and L46 are involved in the interface, together with the more obvious constituents of the N-terminal portions of external strands converging in the inner layer (i.e. H45, H103-3, H103, L44, L96 and L98). There are two structural reasons for excluding H47 and L46 from the inner layer: (i) the β-strands containing H47 and L46 are not markedly twisted toward the inner layer; and (ii) the above analyses show that the H47 and L46 side chains retreat from the inner layer and go upward in 100 and 91% of the antibodies, respectively (Table 3 and Plate 2).

These findings suggest that H47 and L46 can be assigned to the ‘proximate zone’, along with H35, H95, L34, L91 and L96. Such assignments are supported by present and past observations: (1) the side chains in this set follow a pattern of pointing upward toward the antigen-binding site; (2) these residues pack closely within their own domain and with residues from the complementary domain; (3) most of these residues are in or near the hypervariable loops designated as complementarity-determining regions.

From the earliest days of three-dimensional structural studies of antibody fragments, reviews of VH–VL association have emphasized the important roles of hypervariable residues at the interface (Davies et al., 1975; Davies and Metzger, 1983). Our results indicate that these residues form a continuous and interconnected zone that is situated at the base of the antigen-binding site (Plates 1 and 2). In a total of 40 contacts between VH–VL constituents, 27 correspond to hypervariable-hypervariable pairs of residues and 13 to hypervariable-conserved residues (Table 2). Most of the interactions (22) between hypervariable pairs are located in the ‘proximate zone.’ Five of the residues of the ‘proximate zone’ (H35, H95, L34, L91 and L96) are highly hypervariable (Vargas-Madrazo et al., 1994) and have side-chains varying significantly in volume (Table 2). This tabulation clearly indicates the importance of the ‘proximate zone’. In particular, the hypervariable residues of this zone can be regarded as indicators of past and future sources of diversification in the VH–VL association.

The present work also suggests that it would be helpful to redefine the ‘inner layer’ originally described by Chothia et al. (1985). As summarized above, some residues fall outside the general packing patterns established when the databases were much smaller for sequences and three-dimensional structures of antibody fragments.

Residues with high contact percentage at the interface

Table 2 lists the residues that make contact with the opposite domain in a high proportion of human and murine antibodies, e.g. H39, H45, H47, H91, H103-3, H103, L34, L36, L38, L44, L87, L96 and L98. Some of these residues (H45, H47, H103-3, L36, L44, L46 and L96) were considered in the previous section.

Except for H47, L34 and L96, the residues of high contact percentage (H39, H45, H91, H103-3, H103, L36, L38, L44, L87 and L98) are involved in the central or remote regions of the interface, as defined by increasing distances from the antigen-binding site (Plate 1). A common feature is the usage of the same amino acid in a given location of the human and murine sequences (Table 2). The location (Plate 1), the distance removed from the antigen-binding site and the side-chain orientation were determined for each amino acid residue in the series (results not shown). This analysis clearly shows that 10 residues (H37, H45, H91, H103-3, H103, L36, L44, L87, L89 and L98) form an interconnected hydrophobic region which we have designated the ‘central zone’ (Plate 1). This zone accounts for a considerable proportion of the accessible surface area involved in the VH–VL association (Novotny et al., 1983; our unpublished results).

Asymmetries in the packing patterns between VH and VL are found in this central zone, despite its highly ordered and conserved nature. Unlike the three-layer model of Chothia et al. (1985), L96 in our proposal lies outside the central zone. By extending toward the antigen-binding site, L96 makes it possible for L36 and L89 in the middle of the inner strands to participate actively in the central zone. Again, this feature is incompatible with the three-layer model, in which the central portions of the inner β-strands do not contribute to the inner layer.

L89 and H103-3 are two residues in the central zone that have not yet been discussed with regard to their potential contributions to the variability of the VH–VL association (see Table 2). Studies of VH–VL association have shown that steric effects are critical in this interaction (Novotny et al., 1983; Chatellier et al., 1996). Over the large number of sequences examined, both L89 and H103-3 exhibit some degree of variability. In position L89, for example, substitutions include a wide range of shapes and volumes that would influence the steric relationships in the VH–VL association, e.g. Gln, Met, Ala, Leu, Gly, Ser and Asn (Table 2). It may be significant that L89 exerts only a ‘circumstantial’ effect on the interdomain pairing.

Whereas L89 participates only residually in the interface, H103-3 plays a prominent role in the association and is also less likely to be altered substantially; its variability is limited to Phe, Leu, Met and Tyr. The side chains present at these two positions nevertheless imply considerable variations of volume (H103-3: Phe, Leu, Met, Tyr; and L89: Gln, Met, Ala, Leu, Gly, Ser and Asn; Table 2). The H103-3 location at the C-terminal part of the recombination region between D and JH mini-gene products (still in HCDR3) makes it susceptible to alterations by nucleotide insertions and deletions (Tonegawa, 1983). Together, these observations suggest that somatic diversification of HCDR3 can fulfil preponderant roles in the alteration of both VH–VL association and recognition properties of antibody repertoires (Hamel et al., 1984; Stevens et al., 1988; Pokkuluri et al., 1998).

Finally, residues H39 and L38 stabilize the remote part of the interface by forming an invariant network of hydrogen bonds across the pseudo two-fold axis between VH and VL (Plate 1). Both the H39 and the L38 positions contain Gln in more than 90% of the human and murine sequences. This anchorage motif can act as a counterbalance to the possible movements at the ‘proximate zone’ (Novotny and Haber, 1985; Colman, 1988; Stanfield et al., 1993).

Copyright © 2003 John Wiley & Sons, Ltd. J. Mol. Recognit. 2003; 16: 113–120

H3 and the ‘proximate zone’ to the antigen-binding site

Of the 20 residues of the three-layer model of Chothia et al. (1985), 19 are in fact participating in the interface (H93 is excluded). Additionally, extra residues participate circumstantially in the interface (residues in bold in the third column of Table 2). These ‘extra residues’ are {96, 97, 98, 99, 100, 100a–100f and 101} for VH, and {94 and 95} for VL, all corresponding to the third hypervariable region (CDR3) of V domains.

To study the participation of HCDR3 at the interface, the database was adjusted to consider antibodies with a wide-ranging diversity for the lengths of this loop (see Materials and methods). HCDR3 occupies a central position at the interface and appears in the form of a protuberance (Morea et al., 1998). The participation of ‘extra residues’ in VH takes place when the antibody contains a CDR3 loop of medium or large length. These extra residues typically interact with hypervariable residues from the ‘proximate’ zone of VL (L34, L46, L91 and L96). In the case of VL, the residues contributed by LCDR3 to the interface are located at the C-terminal portion of the loop, immediately adjacent to L96. L96 makes contacts with H47 in a high percentage (80%) of antibodies. Similar interactions occur when there are extra residues in LCDR3 (e.g. L94 and L95 in Table 2). When extra residues from VH and/or VL are present at the interface, it is reasonable to assume that they will significantly alter the form of the VH–VL association.

CONCLUSIONS

In analyses of the VH–VL interface, the present study has considered the geometric arrangements of β-strands, the amino acid sequences of large numbers of human and murine antibodies, interdomain interactions between individual side chains and their patterns of variability. We have identified zones or patches in which groups of residues share common patterns. The structural integrity of these zones is maintained by sets of coordinated side-chain orientations, as well as by networks of interatomic contacts among residues that form the zones. Features described in this report are complementary to those derived from the three-layer model proposed by Chothia et al. (1985).
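For readers who wish to apply the zone assignments programmatically, the residue positions named in the preceding sections can be collected into a simple lookup table. The sketch below is only an illustration: the positions are those listed in the text for the ‘proximate zone’ and the ‘central zone’, plus H39 and L38 for the remote part of the interface, while the label "remote", the dictionary layout and the function names `zone_of` and `by_domain` are assumptions made for this example.

```python
from typing import Dict, List, Optional

# Zone membership as listed in the text. Position labels follow the paper's
# numbering (e.g. "H103-3"). The key "remote" and the layout of this table
# are illustrative assumptions, not part of the published model.
ZONES: Dict[str, set] = {
    "proximate": {"H35", "H47", "H95", "L34", "L46", "L91", "L96"},
    "central": {"H37", "H45", "H91", "H103-3", "H103",
                "L36", "L44", "L87", "L89", "L98"},
    "remote": {"H39", "L38"},
}


def zone_of(position: str) -> Optional[str]:
    """Return the zone containing a position, or None if it is not listed."""
    for zone, members in ZONES.items():
        if position in members:
            return zone
    return None


def by_domain(zone: str) -> Dict[str, List[str]]:
    """Split a zone's positions into VH ('H...') and VL ('L...') members."""
    members = ZONES[zone]
    return {
        "VH": sorted(p for p in members if p.startswith("H")),
        "VL": sorted(p for p in members if p.startswith("L")),
    }


if __name__ == "__main__":
    for name in ZONES:
        split = by_domain(name)
        print(name, "VH:", split["VH"], "VL:", split["VL"])
```

A call such as `zone_of("L96")` returns `"proximate"`, reflecting the reassignment of L96 away from the inner layer of the three-layer model, whereas a position that is not listed, such as H93, returns `None`.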

Our model of VH–VL association provides plausible explanations for certain aspects of diversification strategies developed evolutionarily or somatically by the immune systems of vertebrates (Vargas-Madrazo et al., 1995). For practical applications, the model should prove useful in the structure-based design and development of synthetic antibodies.

Acknowledgements

We thank Tania Romo and Aldo Segura for technical assistance and Pat Reidy for helping to prepare the English version of the article. This work was supported by Proyectos Estrategicos 1999–2000 UV and SNI-CONACyT to EVM.
