Intro to Protein Structure

Upload: farhad-hossain

Post on 04-Apr-2018


  • 7/30/2019 Intro to Protien Steucture


    Copyright 2000-2010 Mark Brandt, Ph.D. 27

    Introduction to Protein Structure

    Proteins are large heteropolymers usually comprised of 50-2500 monomer units, although larger proteins are observed (see footnote 7). The monomer units of proteins are amino acids. The primary structure is the sequence of monomer units in the protein. Secondary structure is made up from regular structures within proteins. The tertiary structure is the overall three-dimensional structure of the protein. Quaternary structure is the organization of multisubunit proteins (proteins comprised of single polypeptide chains do not have quaternary structure).

    A peptide is a molecule containing two or more amino acid residues linked by peptide bonds. The term "peptide" typically refers to molecules comprised of relatively small numbers of amino acid residues. This is sometimes made explicit in the term "oligopeptide": "oligo-" means "short", and thus an oligopeptide is a short polymer of amino acids.

    A polypeptide is a longer polymer of amino acids. The term "protein" is often used interchangeably with "polypeptide". In some cases, however, the term "polypeptide" is used to refer to a single chain peptide part of a molecule that either also contains non-amino acid groups as part of the overall protein or which is not folded into a defined structure.

    You may have noticed the term "amino acid residue" in the preceding paragraphs. In a strict sense, proteins do not contain actual amino acids; as will be seen, the amino and carboxylic acid functional groups of amino acids incorporated into proteins are transformed into amide linkages, which lack the ionizable groups. Therefore the monomer units within proteins are properly referred to as "amino acid residues" or simply "residues".

    Primary Structure

    The primary structure of a protein is the sequence of amino acids that comprises the protein. The three-dimensional structure of a protein (and therefore, all of its functions) is dependent on the primary structure.

    The primary structure is always written from the amino-terminus (the residue with the free α-amino group) to the carboxy-terminus. This convention has several reasons, including the fact that the protein biosynthetic machinery produces proteins from N-terminus to C-terminus. Thus, anyone seeing a peptide written as GRSKKNS (or the equivalent sequence: Gly-Arg-Ser-Lys-Lys-Asn-Ser) immediately knows that this peptide has an N-terminal glycine and a C-terminal serine.
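The N-to-C writing convention can be illustrated with a short sketch. The function names below are hypothetical (they are not from the original notes); the one-letter to three-letter mapping is the standard one for the 20 common amino acids.

```python
# Standard one-letter -> three-letter codes for the 20 common amino acids.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def to_three_letter(sequence):
    """Convert a one-letter sequence (written N-terminus first) to three-letter form."""
    return "-".join(THREE_LETTER[residue] for residue in sequence.upper())


def termini(sequence):
    """Return (N-terminal residue, C-terminal residue) for a sequence written N to C."""
    sequence = sequence.upper()
    return THREE_LETTER[sequence[0]], THREE_LETTER[sequence[-1]]


if __name__ == "__main__":
    print(to_three_letter("GRSKKNS"))
    print(termini("GRSKKNS"))
```

Because the sequence is always written N-terminus first, the first and last characters are enough to identify the two termini.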

    The peptide bond

    Peptides and proteins are linear polymers of amino acids. The amino acids are connected by an amide linkage called a peptide bond. In principle, the peptide bond is formed by removing an H2O molecule while linking the α-amino group of one amino acid to the α-carboxylic acid of another.

    7. The largest known polypeptide, a mouse muscle protein called titin, has 35,213 amino acid residues (the human version of titin is smaller, with only 34,350 residues).

    [Figure: formation of a peptide bond between two amino acids, with release of H2O]

    Proteins are polymers of amino acids linked by these peptide bonds:

    Structural Features of the Peptide Bond

    In a peptide bond, the amide oxygen and amide proton in the bond are almost invariably in the trans configuration (as shown below). This decreases steric conflicts between these atoms, and between the α-carbons and their substituents.

    In the peptide bond, the π-electrons from the carbonyl are delocalized between the oxygen and the nitrogen. This means that the peptide bond has ~40% double bond character. This partial double bond character is evident in the shortened bond length of the C-N bond. The length of a normal C-N single bond is 1.45 Å and a C=N double bond is 1.25 Å, while the peptide C-N bond length is 1.33 Å.

    [Figure: peptide backbone with the peptide bonds marked "Fixed" and the rotatable φ and ψ angles indicated]

    Because of its partial double bond character, rotation around the N-C bond is severely restricted. The peptide bond allows rotation about the bonds from the α-carbon, but not the amide C-N bond. Only the φ and ψ torsion angles can vary. In addition, the six atoms in the peptide bond (the two α-carbons, the amide C and O, and the amide N and H) are coplanar.

    Finally, the peptide bond has a dipole, with the O having a partial negative charge, and the amide N having a partial positive charge. This allows the peptide bond to participate in electrostatic interactions, and contributes to the hydrogen bond strength between the carbonyl and the amide proton.

    [Figure: peptide bond dipole, with a partial negative charge on the O and a partial positive charge on the amide N]

    Peptide bond and protein structure

    The peptide bond contains three sets of torsion angles (also known as dihedral angles). The least variable of these torsion angles is the ω angle, which is the dihedral angle around the amide bond. As discussed above, this angle is fixed by the requirement for orbital overlap between the carbonyl double bond and the amide N lone pair orbital. Steric considerations strongly favor the trans configuration (i.e. an ω angle of 180°), because of steric hindrance between the alpha carbons of adjacent amino acid residues (see footnote 8). This means that nearly all peptide bonds in a protein will have an ω angle of 180°.

    [Figure: trans and cis peptide bonds; observed ω angles for the peptide bond are 180° (trans) and 0° (cis)]

    In considering peptide structures, it is usually much more important to look at the backbone angles that can vary more widely. These angles are the φ (= phi, Cα-N(amide)) and ψ (= psi, Cα-C(amide)) angles. By definition, the fully extended conformation corresponds to 180° for both φ and ψ. (Note that -180° = +180°.) Numeric values of angles increase in the clockwise direction when looking away from the α-carbon.

    [Figure: projections along the backbone bonds of the tetrahedral Cα, with substituents labeled (C(amide), N(amide), R, H, O)]

    By definition, φ = 0° when the C(amide)-N(amide) and C(amide)-Cα bonds are in the same plane, and ψ = 0° when the N(amide)-C(amide) and N(amide)-Cα bonds are in the same plane. The (+) direction is clockwise while looking away from the Cα.
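A torsion angle such as φ, ψ, or ω is defined by four consecutive atoms. The sketch below shows one common way to compute a signed dihedral angle from Cartesian coordinates, using the usual convention that a clockwise rotation of the far bond (viewed along the central bond) is positive. The function name and the test coordinates are illustrative assumptions, not part of the original notes.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) about the p1-p2 bond.

    0 means p0 and p3 are eclipsed (cis); +/-180 means they are trans.
    """
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal to the plane containing p0, p1, p2
    n2 = _cross(b2, b3)  # normal to the plane containing p1, p2, p3
    x = _dot(n1, n2)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    # Hypothetical coordinates: central bond along z, first atom on +x.
    print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1)))   # cis
    print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1)))  # trans
```

For φ the four atoms would be C(amide) of the previous residue, N, Cα, and C(amide); for ψ they would be N, Cα, C(amide), and N of the next residue.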

    The torsion angles that the atoms of the peptide bond can assume are limited by steric constraints. Some φ/ψ pairs will result in atoms being closer than allowed by the van der Waals radii of the atoms, and are therefore sterically forbidden (for example: 0°:0°, 180°:0°, and 0°:180° are forbidden because of backbone atom clashes). For tetrahedral carbons, the substituents are typically found in staggered conformations (see figure, above). Peptide bonds are more complicated, because while the α-carbon is tetrahedral, the two other backbone atom types are not. However, the same principle applies: the preferred conformations for peptide bond atoms have the substituent atoms at maximal distances from one another.

    8. Proline is an exception; the tertiary amide formed by an X-Pro peptide bond does not have a strong energetic preference for cis or trans configurations. As a result, both are observed in proteins. Interconversion between cis and trans X-Pro peptide bonds is slow unless catalyzed by an enzyme (proline isomerase), which plays a role in folding of many proteins.

    A ψ angle of 180° results in an alignment of the amide N with the carbonyl oxygen from the same residue. This is allowed, although not especially favored. A ψ angle of 0° places the amide N from one residue very close to the amide N from the previous residue; this results in a steric clash (as well as an unfavorable electrostatic interaction, because both amide nitrogens have partial positive charges).

    The residue side-chains also impose steric constraints. Glycine, because of its small side chain, has a much larger range of possible φ/ψ pairs than any other residue. Proline has a very limited range of φ angles because its side-chain is covalently bonded to its amide N. Most other residues are limited to relatively few φ/ψ pairs (although more than proline). This is especially true for the β-branched residues threonine, valine, and isoleucine, which are the most restricted, because these residues have more steric bulk due to the presence of two groups attached to their β-carbon.

    Secondary Structure (α-helix and β-sheet)

    Secondary structural elements are repeated regular structures within proteins. In these repeated structures, several successive residues exhibit the same φ/ψ pairs. The two most common types of secondary structural elements are α-helices and β-sheets.

    The α-helix

    Examining an α-helix is easiest with a molecular model of a short polypeptide. With a model: start at the top with the amino terminus, and rotate the chain clockwise as you move downward. The carbonyl oxygens point downward. The carbonyl oxygen from α-carbon #1 forms an H-bond to the amide HN that follows three intervening carbonyl oxygens; this is the amide proton from the 5th residue. More generally, the carbonyl of residue i accepts an H-bond from the HN of residue i+4.

    In looking at a model, removing the side chains after the β-carbon (i.e. constructing polyalanine) makes the demonstration easier, because the extra atoms may be heavy enough to distort the chain under gravity (which is not a problem with real proteins). As it happens, alanine is frequently found in α-helices.

    An α-helix is a right-handed helix: looking end on, the rotation is clockwise while moving away from the observer. This is true looking at the helix from either the N-terminal or the C-terminal end.

    Note that the first 4 nitrogens (counting the one from the initial residue thatcontains the i=1 carbonyl) cannot form H-bonds within the helix.
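The i to i+4 hydrogen-bonding pattern can be enumerated directly. The following is a minimal sketch for an idealized, isolated helix with residues numbered from 1; the function name is hypothetical.

```python
def helix_hbonds(n_residues):
    """Enumerate backbone H-bonds in an idealized alpha-helix.

    Returns (pairs, free_nh, free_co), where each pair (i, j) means the
    carbonyl of residue i accepts an H-bond from the amide HN of residue j.
    Residues are numbered from 1.
    """
    pairs = [(i, i + 4) for i in range(1, n_residues - 3)]
    bonded_co = {acceptor for acceptor, _ in pairs}
    bonded_nh = {donor for _, donor in pairs}
    residues = range(1, n_residues + 1)
    free_nh = [i for i in residues if i not in bonded_nh]
    free_co = [i for i in residues if i not in bonded_co]
    return pairs, free_nh, free_co


if __name__ == "__main__":
    pairs, free_nh, free_co = helix_hbonds(10)
    print("H-bonds (C=O of i, HN of i+4):", pairs)
    print("Amide nitrogens without a helix partner:", free_nh)
    print("Carbonyls without a helix partner:", free_co)
```

For a ten-residue helix this leaves the amide nitrogens of residues 1 to 4 and the carbonyls of residues 7 to 10 without partners inside the helix.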

    Properties of an α-helix

    1) The α-helix satisfies the backbone H-bonds (except for the first 4 amide N and last 4 carbonyl groups).

    2) There is no hole in the middle of the helix: there is no room. If you look down the axis of a space-filling model of a helix, you will not see any empty space. The figures below show the same α-helix.

    [Figure: the same α-helix shown from N-terminus to C-terminus, and looking down the helix axis from the N-terminus as a licorice model and as a space-filling model]

    3) The side chains extend outwards and slightly (~30°) toward the N-terminus of the helix.

    4) An α-helix has a dipole, with the partial positive charge toward the N-terminus. This is true because all of the partial charges of the peptide bonds are in alignment.

    [Figure: α-helix from N-terminus to C-terminus with positions 1-5 labeled, showing the carbonyl of residue #1, the nitrogen of residue #5, and the helix dipole]

    5) An α-helix has 3.6 residues/turn. Since a circle contains 360°, each residue is rotated by 360°/3.6 = 100°.

    6) An α-helix has a 5.4 Å pitch/turn. This means that as the chain moves along the axis of the helix, the helix becomes 5.4 Å longer for each full turn. This corresponds to 1.5 Å/residue. (A C-C single bond length is ~1.5 Å; this 1.5 Å per residue is thus a very compact structure.)

    7) The backbone of the helix is ~6 Å in diameter (ignoring side chains).

    8) The backbone torsion angles are φ = -57° and ψ = -47°. Note that the φ and ψ angles in real helices are not exact, and may vary by up to 10° from these ideal values and occasionally vary by even larger amounts.
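The per-residue rotation and rise quoted in the list follow from two numbers, the residues per turn and the pitch. A small sketch of that arithmetic (the constant and function names are illustrative):

```python
RESIDUES_PER_TURN = 3.6   # alpha-helix, from the list above
PITCH_ANGSTROM = 5.4      # advance along the helix axis per full turn


def rotation_per_residue(residues_per_turn=RESIDUES_PER_TURN):
    """Rotation about the helix axis contributed by each residue, in degrees."""
    return 360.0 / residues_per_turn


def rise_per_residue(pitch=PITCH_ANGSTROM, residues_per_turn=RESIDUES_PER_TURN):
    """Advance along the helix axis contributed by each residue, in angstroms."""
    return pitch / residues_per_turn


if __name__ == "__main__":
    print(f"rotation per residue: {rotation_per_residue():.1f} degrees")
    print(f"rise per residue:     {rise_per_residue():.2f} angstroms")
```

The same two functions work for any regular helix once its residues per turn and pitch are known.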

    Two-dimensional representations of α-helices

    Drawing a three-dimensional helix on paper is difficult. Two types of two-dimensional representations are commonly used to simplify the analysis of helical segments of proteins.

    The two-dimensional representations are somewhat stylized, but show the major features more clearly than an attempt at depicting the three-dimensional structure accurately in two dimensions.

    Numbering the helix beginning with residue #0 makes some calculations a little easier. Note that distances along helices are measured from the first residue, and so in effect do not count the first residue. (The length of a four-residue helix is 4.5 Å, not 6 Å.)

    The first type of representation is a Helical Wheel diagram. In this diagram, the representation involves looking down the helix axis, and plotting the rotational angle around the helix for each residue. This representation is conceptually easily grasped, but tends to obscure the distance along the helix; residues 0 and 18 are exactly aligned on this diagram, but are actually separated in space by 27 Å.

    Helical Wheel

    Residue #0 = 0° (by definition)
    #1 = 100°
    #2 = 200°
    #3 = 300°
    #4 = 400° = 40°
    #5 = 140°
    #6 = 240°
    #7 = 340°
    #8 = 440° = 80°
    #9 = 900° (from first) = 180°

    These angles can be plotted on a circle. Doing so results in a representation that corresponds to the view looking down the long axis of the helix. (Note that the rotation is clockwise as the residue number increases.)

    [Figure: Helical Wheel, with residues 1-9 plotted around a circle]

    Note that residues 0, 3, 4, 7, and 8 are on the same face of the helix.
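The helical wheel angles are simply multiples of 100° reduced modulo 360°. The sketch below reproduces the list and the same-face observation; the 80° cutoff used to decide what counts as "the same face" is an assumption chosen for illustration, not a value from the notes.

```python
def wheel_angle(residue, step_degrees=100):
    """Angle of a residue on the helical wheel, with residue #0 at 0 degrees."""
    return (residue * step_degrees) % 360


def angular_separation(i, j, step_degrees=100):
    """Smallest angle (0-180 degrees) between two residues on the wheel."""
    difference = abs(wheel_angle(i, step_degrees) - wheel_angle(j, step_degrees))
    return min(difference, 360 - difference)


def same_face(reference, n_residues, max_separation=80):
    """Residues within max_separation degrees of the reference residue.

    The 80-degree default is an illustrative assumption.
    """
    return [
        i for i in range(n_residues)
        if angular_separation(reference, i) <= max_separation
    ]


if __name__ == "__main__":
    for i in range(10):
        print(f"#{i} = {wheel_angle(i)} degrees")
    print("same face as #0:", same_face(0, 10))
```

With that cutoff, residues 0, 3, 4, 7, and 8 come out on the same face, matching the note above.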

    Helical Net

    A separate representation involves a hypothetical razor cut lengthwise along the helix axis, followed by unpeeling the surface of the helix. This results in a plot of distance along the helix versus distance around the helix called a Helical Net.

    While it is sometimes more difficult to see the helix faces in a helical net (where vertical bands within the net are on the same face of the helix), the helical net does emphasize the distance along the helix, which is ignored in the helical wheel. Compare the two representations, and note the relative positions of different residues. In the helical wheel, residues 1 and 8 are quite close together; the helical net shows that these residues exhibit a 10.5 Å vertical separation (although they are still present on the same face of the helix).

    Both helical wheel and helical net analysis types are useful for analyzing the helixfor patterns in residue properties around the circumference of the helix. The helicalwheel is somewhat easier to interpret, but leaves out the sometimes-crucial distanceinformation.
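A helical net adds the axial coordinate that the wheel discards. As a rough sketch, assuming the ideal values of 100° and 1.5 Å per residue and residue #0 at the origin (function names are illustrative):

```python
def net_position(residue, step_degrees=100, rise=1.5):
    """Return (angle around the helix in degrees, distance along the axis in angstroms)."""
    return (residue * step_degrees) % 360, residue * rise


def axial_separation(i, j, rise=1.5):
    """Distance along the helix axis between residues i and j, in angstroms."""
    return abs(j - i) * rise


if __name__ == "__main__":
    for i in range(10):
        angle, height = net_position(i)
        print(f"#{i}: {angle:3d} degrees, {height:4.1f} angstroms")
    print("residues 1 and 8:", axial_separation(1, 8), "angstroms apart along the axis")
```

Residues 1 and 8 sit only 20° apart around the helix but 10.5 Å apart along it, which is exactly the information the wheel hides.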

    A common reason for using the helical wheel or helical net is to look for helical regions that exist at protein surfaces. The interior of a protein is typically non-polar, while the surrounding solvent is polar. A helix that has its axis along the border of this region would be expected to have a corresponding, amphipathic, distribution of polar and non-polar residues. (Amphipathic, meaning "hating both", refers to the presence of both polar and non-polar groups in the helix.) Note, however, that residues 1 and 8 are much closer in position around the helix than are residues 1 and 2. A simple examination of the linear sequence does not reveal amphipathic tendencies; in contrast, helical net, and especially helical wheel diagrams, show amphipathic helical character clearly.
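One way to put a number on what the helical wheel shows by eye is to sum a unit vector for each residue at its wheel angle, counting non-polar residues as +1 and all others as -1. This is a crude two-valued variant of the hydrophobic moment idea, which the original notes do not discuss; the residue classification and the function name below are assumptions made for illustration only.

```python
import math

# Assumption: an illustrative set of non-polar residues (one-letter codes).
# Everything else is treated as polar in this sketch.
NONPOLAR = set("AVLIMFWC")


def polarity_moment(sequence, step_degrees=100.0):
    """Mean length of the summed +1/-1 vectors around the helical wheel.

    Values near 0 suggest polar and non-polar residues are spread evenly
    around the helix; larger values suggest they cluster on opposite faces.
    """
    if not sequence:
        return 0.0
    x = y = 0.0
    for i, residue in enumerate(sequence.upper()):
        weight = 1.0 if residue in NONPOLAR else -1.0
        angle = math.radians(i * step_degrees)
        x += weight * math.cos(angle)
        y += weight * math.sin(angle)
    return math.hypot(x, y) / len(sequence)


if __name__ == "__main__":
    # Hypothetical sequences: one built to place Leu on a single face,
    # one that simply alternates Leu and Lys along the chain.
    one_face = "".join(
        "L" if math.cos(math.radians(i * 100)) > 0 else "K" for i in range(18)
    )
    print(one_face, round(polarity_moment(one_face), 2))
    print("LK" * 9, round(polarity_moment("LK" * 9), 2))
```

The alternating sequence looks regular in linear form yet scores close to zero, while the one-face design scores much higher, which is the point made above about linear sequences hiding amphipathic character.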

    [Figure: amphipathic helix at a protein surface, with polar side-chains facing the polar environment and non-polar side-chains facing the non-polar environment]

    [Figure: Helical Net, plotting position along the helix (0-20) against angle relative to the first residue (0-360 degrees) for residues 1-9]

    The 3-10 helix

    In some proteins, the backbone forms a variant on the α-helix instead. The 3-10 helix (for 3 residues/turn, 10 backbone atoms/turn) has a pitch of 6 Å (2 Å/residue). The residues in a 3-10 helix have slightly different φ and ψ angles to yield a helix that is less tightly wound and more extended than an α-helix.

    Feature             α-helix    3-10 helix
    φ                   -57°       -49°
    ψ                   -47°       -26°
    Rise per residue    1.5 Å      2 Å

    Amino acids in helices

    Most amino acids are known to be present in helices within proteins. The side-chains of helical residues can interact, at least to some degree, but with rare exceptions (see below) the side-chains do not participate in the hydrogen-bonding pattern of the helical backbone.

    There are a few amino acids that are relatively rare in helical structures.

    1) While proline has limited conformational flexibility, it is capable of forming φ angles close to -57°. However, proline lacks an amide proton, and therefore cannot participate in helix H-bonds. Proline is often found at the end of a helix (especially at the N-terminus, where the lack of the amide proton is not a problem, but also at the C-terminus, where it helps terminate the helix).

    [Figure: proline residue within a peptide, showing that it has no amide proton]

    2) Glycine has a very small side-chain, and therefore has a large conformational flexibility. This means that the ordered structure of a helix is entropically disfavored for glycine. Glycine is capable of forming the appropriate φ/ψ angles for an α-helix, but tends to allow too much flexibility to maintain the stable helical structure.

    Proline and glycine are sometimes referred to as "helix breakers" because they tend to have destabilizing effects on helices. Proline and glycine also allow kinks in a helix, where the direction of the helix axis changes. Note the hydrogen bond donor in the drawing. This hydrogen bond donor from outside the helix is necessary to satisfy the hydrogen bonding requirements of the P(-4) residue (i.e. the position four residues closer to the N-terminus than the proline). The P(-4) residue has a backbone carbonyl that cannot hydrogen bond to the proline (as shown above, proline lacks an amide proton). The kink in the helix is a consequence of the missing H-bond from the proline and the likely distortion of the helix to allow room for a hydrogen bond donor from outside the helix.

    [Figure: kinked helix at a proline residue, with a hydrogen bond donor from outside the helix]

    3) Charged residues can either stabilize or destabilize a helical structure. Forexample, glutamate residues usually favor helix formation. However, toomany glutamates in the absence of positively charged residues can result insufficient electrostatic repulsion to disrupt helix formation.

    The β-sheet

    A second widely observed method for satisfying backbone atom hydrogen bonding requirements is called a β-sheet. A β-sheet is a much more extended form of secondary structure than an α-helix.

    An antiparallel β-sheet is a secondary structural element comprised of two (or more) strands leading in opposite directions. In an antiparallel β-sheet, the backbone amide proton and carbonyl from one residue form hydrogen bonds with the corresponding groups from a single residue in another strand.

    Note the unsatisfied hydrogen bonds on each edge of the two-stranded sheet shown in the figure below. These could be satisfied by the presence of additional β-sheet strands (which allows β-sheets to form rather large structures). However, half of the backbone hydrogen bonds at the edge of any planar β-sheet must be formed using groups from outside the secondary structure.

    [Figure: two-stranded antiparallel β-sheet; note the unsatisfied H-bonds on each edge]

    A parallel β-sheet is comprised of two (or more) strands leading in the same direction. Note that a residue from one strand forms hydrogen bonds to groups from two residues of the other strand. Note also that the hydrogen bonds are somewhat distorted (this distortion is somewhat exaggerated in the diagram). Parallel β-sheets tend to be less stable than their antiparallel counterparts, and are less commonly observed. Most parallel β-sheets are formed from multiple strands.

    [Figure: two-stranded parallel β-sheet; note the unsatisfied H-bonds on each edge]

    Note also that forming a parallel β-sheet requires strands from different parts of the protein, whereas an antiparallel sheet can form from a single sequence, assuming a turn in the middle (as shown in the diagram below).

    The φ and ψ angles required to form a β-sheet are quite different from those used for forming an α-helix. These torsion angles result in a much more extended structure (the rise/residue for a β-sheet is about 3.5 Å).

    Angle    Antiparallel    Parallel
    φ        -139°           -119°
    ψ        +135°           +113°

    Both parallel and antiparallel β-sheets put the residue side-chains on alternating sides of the H-bond/backbone plane. Generating a β-sheet with specific properties requires taking this alternating side-chain pattern into account.

    β-turn

    The final type of secondary structural element observed in large varieties of proteins is a β-turn (also called a β-bend, a tight turn, or merely a turn). A β-turn is a commonly used motif that allows a rapid reversal of the peptide strand direction.

    [Figure: β-turn with residues 1-4 labeled, showing the H-bond between residue #1 and residue #4]

    In a β-turn, the carbonyl oxygen of residue #1 H-bonds to the amide proton of residue #4. In these structures, residue #2 is often glycine (because the glycine side-chain is small enough to avoid significant steric hindrance at the angles required for the structure). Residue #3 is often proline (which readily assumes the required φ and ψ angles).

    Other types of secondary structure

    Some other regular, repeating sets of φ and ψ angles are observed in a few types of proteins. Some fibrous proteins such as α-keratin have α-helices wrapped about one another in a structure called a coiled-coil.

    An even more prevalent protein is a structural protein called collagen; collagen is comprised of three intertwined chains arranged in a helix. The collagen triple helix residues exhibit φ and ψ angles of -51° and +153°, respectively. Most animals have large amounts of a few types of collagen.

    Secondary structure        φ        ψ        Rise per residue
    α-helix                    -57°     -47°     1.5 Å (3.6 residues per turn)
    Coiled-coil                -64°     -42°     1.5 Å
    β-sheet (antiparallel)     -139°    +135°    3.5 Å
    β-sheet (parallel)         -119°    +113°    3.5 Å
    Collagen                   -51°     +153°    3 Å
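The rise per residue makes it easy to compare how far the same number of residues extends in each type of structure. A minimal sketch using the table values (the dictionary keys and function names are illustrative):

```python
# Rise per residue in angstroms, taken from the table above.
RISE_PER_RESIDUE = {
    "alpha-helix": 1.5,
    "coiled-coil": 1.5,
    "beta-sheet (antiparallel)": 3.5,
    "beta-sheet (parallel)": 3.5,
    "collagen": 3.0,
}


def added_length(structure, n_residues):
    """Length (angstroms) contributed by n additional residues of a structure."""
    return RISE_PER_RESIDUE[structure] * n_residues


def extension_ratio(structure_a, structure_b):
    """How many times longer structure_a is than structure_b, per residue."""
    return RISE_PER_RESIDUE[structure_a] / RISE_PER_RESIDUE[structure_b]


if __name__ == "__main__":
    for name in RISE_PER_RESIDUE:
        print(f"{name:26s} 10 residues add {added_length(name, 10):5.1f} angstroms")
    ratio = extension_ratio("beta-sheet (parallel)", "alpha-helix")
    print(f"a beta-strand is about {ratio:.1f} times as extended as an alpha-helix")
```

These are idealized figures; real segments deviate from them in the same way that real φ and ψ angles deviate from the ideal values.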

    A few proteins have a left-handed helix. This is a much less stable structure thanthe standard helix.

    Finally, most proteins have regions in which the φ and ψ angles are not repeating. These regions are sometimes referred to as "random coil" although their structures are not actually random. The non-repeating structures may be considered secondary structure, in spite of their irregular nature.

    Ramachandran diagram

    A Ramachandran diagram is a plot of the φ and ψ angles for each residue in a protein. The diagram shows several features. One obvious feature is the limited φ/ψ angle pairs present; many of the pairs are sterically impossible, or so strained as to be extremely rare.

    [Figure: Ramachandran Plot, with both axes running from -180 to 180 degrees, marking the right-handed helix, parallel sheet, antiparallel sheet, left-handed helix, and the residues of an example protein]

    Another obvious feature is the presence of many points in small regions of the diagram. These correspond to many residues with very similar φ/ψ angle pairs, and are usually indications that secondary structural elements are present. The example protein shown contains large amounts of α-helix; the region near φ = -57°, ψ = -47° is therefore heavily populated. Note, however, that not all of the residues have identical angles (hence the scatter in this region). This is a general phenomenon: few residues have the exact angles of a perfect secondary structural element.

    A Ramachandran diagram can be generated for any protein for which structural data are available. The collagen triple helix, the α-helix, β-sheets, and less regular structures all have specific locations on the diagram.
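Because each regular structure occupies its own location on the diagram, a measured φ/ψ pair can be compared with the ideal values quoted in these notes. The sketch below simply reports the nearest ideal point; it is an illustration of the idea under that assumption, not an established secondary-structure assignment method, and the names are hypothetical.

```python
import math

# Ideal (phi, psi) pairs in degrees, as quoted in the tables in these notes.
IDEAL_ANGLES = {
    "alpha-helix": (-57.0, -47.0),
    "3-10 helix": (-49.0, -26.0),
    "coiled-coil": (-64.0, -42.0),
    "beta-sheet (antiparallel)": (-139.0, 135.0),
    "beta-sheet (parallel)": (-119.0, 113.0),
    "collagen": (-51.0, 153.0),
}


def angle_difference(a, b):
    """Signed difference a - b in degrees, wrapped into [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0


def nearest_ideal(phi, psi):
    """Return (name, distance in degrees) of the closest ideal phi/psi pair."""
    def distance(name):
        ideal_phi, ideal_psi = IDEAL_ANGLES[name]
        return math.hypot(
            angle_difference(phi, ideal_phi), angle_difference(psi, ideal_psi)
        )

    best = min(IDEAL_ANGLES, key=distance)
    return best, distance(best)


if __name__ == "__main__":
    # Hypothetical residue angles, chosen only to exercise the function.
    for phi, psi in [(-60.0, -45.0), (-140.0, 130.0), (-50.0, 150.0)]:
        name, dist = nearest_ideal(phi, psi)
        print(f"phi={phi:7.1f} psi={psi:7.1f} -> {name} ({dist:.1f} degrees away)")
```

The wrap-around in `angle_difference` matters because -180° and +180° are the same angle, so points near the edges of the plot are neighbors.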


    Summary

    The amide linkage of the peptide bond has a partial double bond character that greatly restricts free rotation for the ω angle (the dihedral angle for the C(amide)-N(amide) bond). This means that, in general, only the φ (Cα-N(amide)) and ψ (Cα-C(amide)) angles can vary.

    Steric constraints limit possible φ/ψ pairs. This is most true for β-branched residues, and least true for glycine, but all residues experience some degree of constraint.

    The steric constraints favor certain types of φ/ψ pairs. Regions consisting of the same set of φ/ψ pairs exhibit regular structures known as secondary structure.

    The two most common types of secondary structure are α-helices and β-sheets.

    α-Helices are compact structures, with each additional residue contributing 1.5 Å to the length of the structure. The backbone rotates 100° clockwise for each additional residue. Side-chains in an α-helix point outwards from the structure.

    β-Sheets are extended structures; each additional residue contributes 3.5 Å to the structure. The two types of β-sheet differ by the direction of subsequent strands.

    Both α-helices and β-sheets satisfy the backbone atom hydrogen bonding requirements; this hydrogen-bonding pattern is a major driving force for secondary structure formation.

    Ramachandran diagrams are a commonly used method for examining the patterns of φ/ψ pairs present within proteins.