folds and motifs using scop, cath, ce to find structural homologs. what class/architecture, etc does...

47
Folds and motifs •Using SCOP, CATH, CE to find structural homologs. •What class/architecture, etc does your molecule fall in? •Use CE to identify structural homologs for your protein of interest. •Motifs: classic turn types. Extended turn types. •Use Rasmol to find the glycines in your structure. That are the phi angles? What type of turn is it?

Upload: dayna-powers

Post on 27-Dec-2015

216 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Folds and motifs Using SCOP, CATH, CE to find structural homologs. What class/architecture, etc does your molecule fall in? Use CE to identify structural

Folds and motifs

•Using SCOP, CATH, CE to find structural homologs.

•What class/architecture, etc does your molecule fall in?

•Use CE to identify structural homologs for your protein of interest.

•Motifs: classic turn types. Extended turn types.

•Use Rasmol to find the glycines in your structure. That are the phi angles? What type of turn is it?

Page 2: Folds and motifs Using SCOP, CATH, CE to find structural homologs. What class/architecture, etc does your molecule fall in? Use CE to identify structural

The SCOP database

• Contains information about classification of protein structures and within that classification, their sequences

• http://scop.berkeley.edu

Page 3: Folds and motifs Using SCOP, CATH, CE to find structural homologs. What class/architecture, etc does your molecule fall in? Use CE to identify structural

SCOP classification heirarchy

(1) class

(2) fold

(3) superfamily

(4) family

(5) protein

(6) species

global characteristics (no evolutionary relation)Similar “topology” . Distant evolutionary cousins?Clear structural homologyClear sequence

homologyfunctionally identical

unique sequences

Page 4: Folds and motifs Using SCOP, CATH, CE to find structural homologs. What class/architecture, etc does your molecule fall in? Use CE to identify structural

protein classes

1. all 2. all 5. multidomain (21)6. membrane (21)7. small (10)8. coiled coil (4)9. low-resolution (4)10. peptides (61)11. designed proteins (17)

number of sub-categories

possibly not complete, or erroneous

Page 5: Folds and motifs Using SCOP, CATH, CE to find structural homologs. What class/architecture, etc does your molecule fall in? Use CE to identify structural

class: proteinsFolds:

TIM-barrel (22)

swivelling beta/beta/alpha domain (5)

spoIIaa-like (2)

flavodoxin-like (10)

restriction endonuclease-like (2)

ribokinase-like (2)

chelatase-like (2)

Mainly parallel beta sheets (beta-alpha-beta units)

Many folds have historical names. “TIM” barrel was first seen in TIM. These classifications are done by eye, mostly.

Page 6: Folds and motifs Using SCOP, CATH, CE to find structural homologs. What class/architecture, etc does your molecule fall in? Use CE to identify structural

fold: flavodoxin-like

Superfamilies:

1.Catalase, C-terminal domain (1)

2.CheY-like (1)

3.Succinyl-CoA synthetase domains (1)

4.Flavoproteins (3)

5.Cobalamin (vitamin B12)-binding domain (1)

6.Ornithine decarboxylase N-terminal "wing" domain (1)

7.Cutinase-like (1)

8.Esterase/acetylhydrolase (2)

9.Formate/glycerate dehydrogenase catalytic domain-like (3)

10.Type II 3-dehydroquinate dehydratase (1)

3 layers, ; parallel beta-sheet of 5 strand, order 21345

Note the term: “layers”

These are not domains. No implication of structural independence.

Note how beta sheets are described: number of strands, order (N->C)

Page 7: Folds and motifs Using SCOP, CATH, CE to find structural homologs. What class/architecture, etc does your molecule fall in? Use CE to identify structural

fold-level similaritycommon topological features

catalase flavodoxin

At the fold level, a common core of secondary structure is conserved. Outer secondary structure units may not be conserved.

Page 8: Folds and motifs Using SCOP, CATH, CE to find structural homologs. What class/architecture, etc does your molecule fall in? Use CE to identify structural

Superfamily:Flavoproteins

Flavodoxin-related (7)

NADPH-cytochrome p450 reductase, N-terminal domain

Quinone reductase

These molecules do not superimpose well, but side-by-side you can easily see the similar topology. Sec struct’s align 1-to-1, mostly.

Page 9: Folds and motifs Using SCOP, CATH, CE to find structural homologs. What class/architecture, etc does your molecule fall in? Use CE to identify structural

Family: quinone reductasesbinds FAD

NADPH quinone reductase

quinone reductase type 2

Proteins:

Different members of the same family superimpose well. At this level, a structure may be used as a molecular replacement model.

Page 10: Folds and motifs Using SCOP, CATH, CE to find structural homologs. What class/architecture, etc does your molecule fall in? Use CE to identify structural

CATH

• Class• Architecture• Topology• Homology

http://www.biochem.ucl.ac.uk/bsm/cath_new/index.html

Page 11: Folds and motifs Using SCOP, CATH, CE to find structural homologs. What class/architecture, etc does your molecule fall in? Use CE to identify structural

Remote homologs are more likely than close

ones

0

0.02

0.04

0.06

0.08

0.1

0.12

0.14

0.16

0.18

0 20 40 60 80 100

percent identity for structural homologs

likelihood

Page 12: Folds and motifs Using SCOP, CATH, CE to find structural homologs. What class/architecture, etc does your molecule fall in? Use CE to identify structural

Structural conservation is

stronger than sequence conservation

The existence of large numbers of remote homologs shows us that true structural similarity is hard to see in the amino acid sequence.

Page 13: Folds and motifs Using SCOP, CATH, CE to find structural homologs. What class/architecture, etc does your molecule fall in? Use CE to identify structural

beta-catenin versus clathrin (from FSSP database)

Structural homology is sometimes difficult to see

Page 14: Folds and motifs Using SCOP, CATH, CE to find structural homologs. What class/architecture, etc does your molecule fall in? Use CE to identify structural

Rasmol practice

Using your protein of interest:

Find all glycines.

select gly

Label each alpha-carbon by number

label %r

Measure the phi-angle by eye (align N and CA. R-handed is positive) Write down the residue number and phi angle.

Page 15: Folds and motifs Using SCOP, CATH, CE to find structural homologs. What class/architecture, etc does your molecule fall in? Use CE to identify structural

Rasmol exercises

(1) Color the ribbon drawing of 1qajA.pdb (it’s in your home directory) from the N-terminus (blue) to C-terminus (red)

(2) Divide 1qajA.pdb into domains of different color.

(3) In 1qajA.pdb, display only residues 66-87. What are the first and last residues with helical backbone angles?

(4) Find all of the atoms that H-bond to the sidechain of Ser92.

Page 16: Folds and motifs Using SCOP, CATH, CE to find structural homologs. What class/architecture, etc does your molecule fall in? Use CE to identify structural

Secondary structure

Alpha helix

Right-handed Overall dipole N+->C- 3.6 residues/turn i->i+4 H-bonds

Page 17: Folds and motifs Using SCOP, CATH, CE to find structural homologs. What class/architecture, etc does your molecule fall in? Use CE to identify structural

3 types of Alpha helix

ae

b

f c

g

dTwo ways to display position of sidechain on a helix.

non-polar

amphipathic

polar

For amphipathic and non-polar, sidechains line up in a favorable way.

Page 18: Folds and motifs Using SCOP, CATH, CE to find structural homologs. What class/architecture, etc does your molecule fall in? Use CE to identify structural

Helix caps

Please go to http://isites.bio.rpi.edu/

Click on:Glycine alpha (Schellman) C-cap Type 1

Glycine alpha (Schellman) C-cap Type 2

Proline alpha C-cap

Frayed alpha helix

Serine alpha N-cap (capping box)

Exercise: Find a helix cap in 1QAJ. What type is it?

Capping motifs stabilize helices.

Page 19: Folds and motifs Using SCOP, CATH, CE to find structural homologs. What class/architecture, etc does your molecule fall in? Use CE to identify structural

Proline helix C-capProline helix C-cap

Sequence pattern=...nppnnpp[HNYF]P[DE]n

structural variability

Locations of non-polar (magenta) and polar (green) sidechains

note: hydrophobic cluster

Pro blocks helixD or E stabilizes tight turn

“n”=non-polar“p”=polar[...]=alternative aa’s

Page 20: Folds and motifs Using SCOP, CATH, CE to find structural homologs. What class/architecture, etc does your molecule fall in? Use CE to identify structural

Two Helix motifs

corner(helix-turn-helix)

EF-hand

(binds Ca2+)

Page 21: Folds and motifs Using SCOP, CATH, CE to find structural homologs. What class/architecture, etc does your molecule fall in? Use CE to identify structural

corner motif

motif pattern (summary):

nnpp[nH]Gn[PDS][Px]pn

Exercise: Find a corner motif in myoglobin.

“n”=non-polar“p”=polar[...]=alternative aa’s

red=favorableblue = unfavorablegreen=neutral

backbone angles:green=psired=phi

I-sites motif(Also described by A. Efimov)

Page 22: Folds and motifs Using SCOP, CATH, CE to find structural homologs. What class/architecture, etc does your molecule fall in? Use CE to identify structural

Find a corner in 1DWT

Download 1dwt from www.rcsb.orgOpen using Rasmol

Rasmol commands:

Display->Cartoons

Color->structure

select alpha

label %m

color labels yellow

Find the corner pattern (sequence and structure).

Page 23: Folds and motifs Using SCOP, CATH, CE to find structural homologs. What class/architecture, etc does your molecule fall in? Use CE to identify structural

Can you see the motif in the sequence?

1 GLSDGEWQQV LNVWGKVEAD IAGHGQEVLI RLFTGHPETL EKFDKFKHLK HHHHHHH HHHHHHHHHT HHHHHHHHHH HHHHHTHHHH HTTTTTTT

51 TEAEMKASED LKKHGTVVLT ALGGILKKKG HHEAELKPLA QSHATKHKIP SHHHHHTTHH HHHHHHHHHH HHHHHHHTTT HHHHHHHH HHHHHTS

101 IKYLEFISDA IIHVLHSKHP GDFGADAQGA MTKALELFRN DIAAKYKELG HHHHHHHHHH HHHHHHHHST TSS HHHHHH HHHHHHHHHH HHHHHHHHHT

151 FQG

Page 24: Folds and motifs Using SCOP, CATH, CE to find structural homologs. What class/architecture, etc does your molecule fall in? Use CE to identify structural

EF-hand

Exercise: Rasmol 1CLL. Follow instructions on next page

Page 25: Folds and motifs Using SCOP, CATH, CE to find structural homologs. What class/architecture, etc does your molecule fall in? Use CE to identify structural

1CLL exercise. How does the EF-hand bind

calcium?restrict hetero and CA (comments in blue...) spacefill (this displays only calciums)click to get residue # (x = residue number)restrict within (10., x)center selected (so you can rotate it easily)Display->ball-and-stick (from menu item. you see atoms and bonds)select protein and within (10., x)Display->sticks (now protein part has think bonds)select hoh and within (10., x) ( selects nearby waters)spacefillcolor cyan (or your favorite water color...)select within (3., x) ( atoms within bonding distance)label ( these are the atoms bonded to calcium x)color labels yellow (..so you can see them)

Page 26: Folds and motifs Using SCOP, CATH, CE to find structural homologs. What class/architecture, etc does your molecule fall in? Use CE to identify structural

Calcium causes a conformational change

in the EF hand.red=1CLL

green=1cfd

Page 27: Folds and motifs Using SCOP, CATH, CE to find structural homologs. What class/architecture, etc does your molecule fall in? Use CE to identify structural

Create a Rasmol scriptWrite a file using jot or vi (for example: vi script.ras)Put keyboard commands in the script.Start RasmolInvoke the script using the ‘script’ command(for example: script script.ras)

Useful keyboard commands for scripts:

load xxx.pdb (loads a new PDB file)zap (closes PDB file, blanks the display)pause (stops the script until you hit a key)

Page 28: Folds and motifs Using SCOP, CATH, CE to find structural homologs. What class/architecture, etc does your molecule fall in? Use CE to identify structural

Create a Rasmol script to survey many

structuresList the pdb files in the xtal directory (ls xtal/*.pdb)

Create a script file (myscript.ras) as follows:

load xtal/file1.pdb (use one of the PDB files listed)backbone 0.6color blueselect hohspacefill 1.0color redset unitcell onpausezapload xtal/file2.pdb

(repeat as before for each file listed)

Run the script: script myscript.ras

Next, change the script and run it again (i.e. change the colors to green and magenta)

Page 29: Folds and motifs Using SCOP, CATH, CE to find structural homologs. What class/architecture, etc does your molecule fall in? Use CE to identify structural

Nested scripts

Tired of having to edit the script 10 times over??Write a “nested” script.

Create a file called “mynestedscript.ras”Put in all the commands except the ‘load’ line. Just one copy.The last line is ‘zap’.

Create a file called “mymainscript.ras” containing:

load file1.pdbscript mynestedscript.rasload file2.pdbscript mynestedscript.rasetc. etc. etc

Page 30: Folds and motifs Using SCOP, CATH, CE to find structural homologs. What class/architecture, etc does your molecule fall in? Use CE to identify structural

Further exercises: just a few examples!

Select atoms by temperature factor: select temperature > 5000 (Rasmol uses B*100, so this selects all atoms with B > 50.00)

Find crystal contacts: select :B & within (8., :A)(This gets all atoms of chain B that are close to chain A. Your chain IDs may vary.)

Can you determine the crystal form (triclinic, orthorhombic, etc) just by looking at the unit cell? (set unitcell on)

Survey the B-factors of waters in several structures using a nested script. (Use: color temperature)

Use logic operators! Find alpha carbons within 10.A or residue x but not within 6.A. (select within (10.,x) and not within (6., x) )

Practice stereo viewing. (stereo -7 for walleyed viewing)

Page 31: Folds and motifs Using SCOP, CATH, CE to find structural homologs. What class/architecture, etc does your molecule fall in? Use CE to identify structural

Proline helix C-capProline helix C-cap

Sequence pattern=...nppnnpp[HNYF]P[DE]n

structural variability

Locations of non-polar (magenta) and polar (green) sidechains

note: hydrophobic cluster

Pro blocks helixD or E stabilizes tight turn

“n”=non-polar“p”=polar[...]=alternative aa’s

Page 32: Folds and motifs Using SCOP, CATH, CE to find structural homologs. What class/architecture, etc does your molecule fall in? Use CE to identify structural

Two Helix motifs

corner(helix-turn-helix)

EF-hand

(binds Ca2+)

Page 33: Folds and motifs Using SCOP, CATH, CE to find structural homologs. What class/architecture, etc does your molecule fall in? Use CE to identify structural

beta-strandAntiparallel:

Parallel:

middle strand

edge strand

Page 34: Folds and motifs Using SCOP, CATH, CE to find structural homologs. What class/architecture, etc does your molecule fall in? Use CE to identify structural

Sequence motifs for beta strands

Amphipathic

Found at the edges of a sheet, or when one side of the sheet is exposed to solvent (i.e. 2-layer proteins).

Found in the buried middle strands of sheets in 3-layer proteins.

Hydrophobic

Note preference for beta-branched aa’s: I,V,T

Page 35: Folds and motifs Using SCOP, CATH, CE to find structural homologs. What class/architecture, etc does your molecule fall in? Use CE to identify structural

beta hairpins

Two adjacent antiparallel beta strands = a beta hairpin

Shown are “tight turns”, 2 residues in the loop region (shaded). Hairpins can have as many as 20 residues in the loop region.

Page 36: Folds and motifs Using SCOP, CATH, CE to find structural homologs. What class/architecture, etc does your molecule fall in? Use CE to identify structural

hairpin motifs

“Serine beta-hairpin” (my naming). A specific pattern (DPESG) forms an incomplete alpha-helical turn 4-residues long.

“Extended Type-1 hairpin”. A type-1 “tight turn” has only 2 residues in the turn. This motif, more common than the tight turn, has an additional Pro or polar sidechain. Pattern: PDG.

Page 37: Folds and motifs Using SCOP, CATH, CE to find structural homologs. What class/architecture, etc does your molecule fall in? Use CE to identify structural

diverging turn

“Diverging turns” have a Type-2 beta turn and two strands that do not pair. The consensus sequence pattern is PDG. The residue before G can be anything polar, but not a D or an N.

Exercise: Find the diverging turn in 1CSK.

Page 38: Folds and motifs Using SCOP, CATH, CE to find structural homologs. What class/architecture, etc does your molecule fall in? Use CE to identify structural

Greek key motif

2 3 4 1

beta meander

One of the most common arrangements of four strands.

Two permutations. 2341 and 3214

Exercise: Download 2PLT. Find the Greek key motifs.

Page 39: Folds and motifs Using SCOP, CATH, CE to find structural homologs. What class/architecture, etc does your molecule fall in? Use CE to identify structural

motifWhen a helix occurs between two strands, they are often paired in a parallel sheet.

The cross-over from one strand to the next is almost always right-handed. It is not known why this is true.

(NOTE: the cross-over is always right-handed even when the two strands are not paired!!)

Page 40: Folds and motifs Using SCOP, CATH, CE to find structural homologs. What class/architecture, etc does your molecule fall in? Use CE to identify structural

TOPS diagramsbeta strand pointing up

beta strand pointing down

alpha helix

NC motif

Note R-handed crossover!

Page 41: Folds and motifs Using SCOP, CATH, CE to find structural homologs. What class/architecture, etc does your molecule fall in? Use CE to identify structural

motif1 2 3

12 3

31 2

Only 3 are possible.

(with R-handed crossovers)

Page 42: Folds and motifs Using SCOP, CATH, CE to find structural homologs. What class/architecture, etc does your molecule fall in? Use CE to identify structural

Efimov’s theory

A. Efimov showed that almost all protein structures can be classified as being one of 7 trees, each starting with a motif and “growing” by one secondary structure unit at time.

Does this recapitulate evolution? the folding pathway?

Page 43: Folds and motifs Using SCOP, CATH, CE to find structural homologs. What class/architecture, etc does your molecule fall in? Use CE to identify structural

Domains

Proteins fold heirarchically.

secondary structure --> motifs --> subdomains-->domains --> multidomain proteins --> complexes

Most domains fold independently of the rest of the chain.

Page 44: Folds and motifs Using SCOP, CATH, CE to find structural homologs. What class/architecture, etc does your molecule fall in? Use CE to identify structural

All alpha folds

3-helix bundle 1CUK 156-203

4-helix bundle 1NFN

Globin 1DWT

...many many other

Page 45: Folds and motifs Using SCOP, CATH, CE to find structural homologs. What class/architecture, etc does your molecule fall in? Use CE to identify structural

alpha/beta folds

“TIM barrel” 8TIM

“Rossman fold” 1HET 176-318

many others...

Exercise: in 1HET, find the “crevice” between two motifs where NAD binding commonly occurs.

Exercise: Draw TOPS diagram for 1HET 176-318.

Page 46: Folds and motifs Using SCOP, CATH, CE to find structural homologs. What class/architecture, etc does your molecule fall in? Use CE to identify structural

all beta folds

“Jelly roll” 1JMU 371-500:D, 2PLT

barrels 1EMC

propellers

beta-helixExercise: Draw TOPS diagrams for the two domains above.

Page 47: Folds and motifs Using SCOP, CATH, CE to find structural homologs. What class/architecture, etc does your molecule fall in? Use CE to identify structural

Term projects

Look at the description online.

Choose a molecule to study. Check with me before you start.

Look at “Molecule of the Month” at RCSB.org for inspiration. But don’t pick one of these molecules.