Find A Homolog in Protein Structure Database?


Upload: tosca

Post on 13-Jan-2016


DESCRIPTION

Find A Homolog in Protein Structure Database? YES: Homology Modeling. NO: Secondary Structure Prediction. Homology Modeling from Swiss-Model: malate dehydrogenase (14 MDH) sequence.

TRANSCRIPT

Page 1: Find A Homolog in Protein Structure Database ?

Find A Homolog in Protein Structure Database?

YES: Homology Modeling

NO: Secondary Structure Prediction

Page 2: Find A Homolog in Protein Structure Database ?

Malate dehydrogenase (14 MDH) sequence

SEPIRVLVTGAAGQIAYSLLYSIGNGSVFGKDQPIILVLLDITPMMGVLD

GVLMELQDCALPLLKDVIATDKEEIAFKDLDVAILVGSMPRRDGMERKDL

LKANVKIFKCQGAALDKYAKKSVKVIVVGNPANTNCLTASKSAPSIPKEN

FSCLTRLDHNRAKAQIALKLGVTSDDVKNVIIWGNHSSTQYPDVNHAKVK

LQAKEVGVYEAVKDDSWLKGEFITTVQQRGAAVIKARKLSSAMSAAKAIC

DHVRDIWFGTPEGEFVSMGIISDGNSYGVPDDLLYSFPVTIKDKTWKIVE

GLPINDFSREKMDLTAKELAEEKETAFEFLSSA

Page 3: Find A Homolog in Protein Structure Database ?

Finding Appropriate Template from Structure Database

Sequences Producing High-scoring Segment Pairs:

Sequence   High Score   Smallest Poisson Probability P(N)   N
14MDH      1692         2.9e-230                            1
11BMD      610          1.9e-81                             1
21BDM      604          1.2e-80                             1
11BDM      295          1.2e-70                             4
11LLC      68           0.0012                              3
15LDH      79           0.0014                              1

11BMD: malate dehydrogenase from Thermus flavus (PDB entry 1BMD)
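As a minimal sketch of how a template might be picked from a hit list like this one: the code below hard-codes the six hits from the slide, drops the query's own entry (14MDH), and keeps the highest-scoring remaining hit. The `Hit` type and the `pick_template` helper are illustrative names, not part of the search program or of Swiss-Model, and score alone is only one possible selection criterion.

```python
from typing import NamedTuple, Optional, Sequence


class Hit(NamedTuple):
    name: str
    score: int
    p_value: float
    n: int


# Hits transcribed from the search output on this slide.
HITS = [
    Hit("14MDH", 1692, 2.9e-230, 1),
    Hit("11BMD", 610, 1.9e-81, 1),
    Hit("21BDM", 604, 1.2e-80, 1),
    Hit("11BDM", 295, 1.2e-70, 4),
    Hit("11LLC", 68, 0.0012, 3),
    Hit("15LDH", 79, 0.0014, 1),
]


def pick_template(hits: Sequence[Hit], query_name: str) -> Optional[Hit]:
    """Return the best-scoring hit that is not the query itself (hypothetical rule)."""
    candidates = [h for h in hits if h.name != query_name]
    return max(candidates, key=lambda h: h.score) if candidates else None


template = pick_template(HITS, "14MDH")
print(template.name, template.score)
```

Under this rule the selected template is 11BMD, which matches the template used on the following slides.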

Page 4: Find A Homolog in Protein Structure Database ?

Using Magic Fit to Align Two Sequences

14MDH   1 SEPIRVLVTG AAGQIAYSLL YSIGNGSVFG KDQPIILVLL DITPMMGVLD
11BMD   1 KAPVRVAVTG AAGQIGYSLL FRIAAGEMLG KDQPVILQLL EIPQAMKALE

14MDH  51 GVLMELQDCA LPLLKDVIAT DKEEIAFKDL DVAILVGSMP RRDGMERKDL
11BMD  51 GVVMELEDCA FPLLAGLEAT DDPDVAFKDA DYALLVGAAP RKAGMERRDL

14MDH 101 LKANVKIFKC QGAALDKYAK KSVKVIVVGN PANTNCLTAS KSAPSIPKEN
11BMD 101 LQVNGKIFTE QGRALAEVAK KDVKVLVVGN PANTNALIAY KNAPGLNPRN

14MDH 151 FSCLTRLDHN RAKAQIALKL GVTSDDVKNV IIWGNHSSTQ YPDVNHAKVK
11BMD 151 FTAMTRLDHN RAKAQLAKKT GTGVDRIRRM TVWGNHSSTM FPDLFHAEVD

14MDH 201 LQAKEVGVYE AVKDDSWLKG EFITTVQQRG AAVIKARKLS SAMSAAKAIC
11BMD 201 GRP----ALE LVDME-WYEK VFIPTVAQRG AAIIQARGAS SAASAANAAI

14MDH 251 DHVRDIW-FG TPEGEFVSMG IISDGNSYGV PDDLLYSFPV TIKDKTWKIV
11BMD 246 EHIRD-WALG TPEGDWVSMA VPSQGE-YGI PEGIVYSFPV TAKDGAYRVV

14MDH 300 EGLPINDFSR EKMDLTAKEL AEEKETAFEF LSSA
11BMD 294 EGLEINEFAR KRMEITAQEL LDEMEQVKAL GLI

Length = 326, Score = 610 (278.7 bits), Expect = 1.9e-81, P = 1.9e-81, Identities = 178 / 326 (54.6%)
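The identity figure above is a count of matching positions divided by a length. A minimal sketch of that calculation follows, assuming two equal-length aligned strings with `-` as the gap character. The denominator used here (columns where neither sequence has a gap) is only one of several conventions, so it need not reproduce the search program's own number; `percent_identity` is an illustrative helper name.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent of gap-free aligned columns in which both residues match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where both sequences carry a residue.
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


# First ten aligned columns of 14MDH and 11BMD from this slide.
print(percent_identity("SEPIRVLVTG", "KAPVRVAVTG"))

# Reported figure for the full alignment: 178 identities over a length of 326.
print(round(100.0 * 178 / 326, 1))
```

The first ten columns give 6 matches out of 10, and 178 / 326 rounds to the 54.6% stated on the slide.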

Page 5: Find A Homolog in Protein Structure Database ?

Modifying Sequence Alignment

14MDH   1 SEPIRVLVTG AAGQIAYSLL YSIGNGSVFG KDQPIILVLL DITPMMGVLD
11BMD   1 KAPVRVAVTG AAGQIGYSLL FRIAAGEMLG KDQPVILQLL EIPQAMKALE

14MDH  51 GVLMELQDCA LPLLKDVIAT DKEEIAFKDL DVAILVGSMP RRDGMERKDL
11BMD  51 GVVMELEDCA FPLLAGLEAT DDPDVAFKDA DYALLVGAAP RKAGMERRDL

14MDH 101 LKANVKIFKC QGAALDKYAK KSVKVIVVGN PANTNCLTAS KSAPSIPKEN
11BMD 101 LQVNGKIFTE QGRALAEVAK KDVKVLVVGN PANTNALIAY KNAPGLNPRN

14MDH 151 FSCLTRLDHN RAKAQIALKL GVTSDDVKNV IIWGNHSSTQ YPDVNHAKVK
11BMD 151 FTAMTRLDHN RAKAQLAKKT GTGVDRIRRM TVWGNHSSTM FPDLFHAEVD

14MDH 201 LQAKEVGVYE AVKDDSWLKG EFITTVQQRG AAVIKARKLS SAMSAAKAIC
11BMD 201 GRP----ALE LVDME-WYEK VFIPTVAQRG AAIIQARGAS SAASAANAAI

14MDH 251 DHVRDIWFGT PEGEFVSMGI ISDGNSYGVP DDLLYSFPVT IKDKTWKIVE
11BMD 246 EHIRDWALGT PEGDWVSMAV PSQGE-YGIP EGIVYSFPVT AKDGAYRVVE

14MDH 301 GLPINDFSRE KMDLTAKELA EEKETAFEFL SSA
11BMD 295 GLEINEFARK RMEITAQELL DEMEQVKALG LI

Page 6: Find A Homolog in Protein Structure Database ?

Obtaining Atomic Coordinates of The Model

ATOM   1  C    ACE  A  0   11.590  2.938  35.017  1.00  45.90  14B  5
ATOM   2  O    ACE  A  0   12.581  2.371  35.517  1.00  28.75  14B  6
ATOM   3  CH3  ACE  A  0   10.179  2.477  35.417  1.00  36.75  14B  7
ATOM   4  N    SER  A  1   11.648  3.946  34.081  1.00  49.10  14   341
ATOM   5  CA   SER  A  1   12.901  4.557  33.573  1.00  52.42  14   342
ATOM   6  C    SER  A  1   12.733  5.624  32.482  1.00  48.48  14   343
ATOM   7  O    SER  A  1   13.238  5.432  31.363  1.00  57.03  14   344
ATOM   8  CB   SER  A  1   13.990  3.553  33.162  1.00  41.45  14   345
ATOM   9  OG   SER  A  1   15.105  3.679  34.039  1.00  42.59  14   346
ATOM  10  N    GLU  A  2   12.073  6.774  32.772  1.00  37.72  14   347
ATOM  11  CA   GLU  A  2   11.948  7.788  31.721  1.00  20.88  14   348
ATOM  12  C    GLU  A  2   12.042  9.235  32.169  1.00  28.31  14   349
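A minimal sketch of reading one of these ATOM records, assuming the fields are separated by whitespace as they are printed on this slide. Real PDB files are fixed-width and are more safely parsed by column position; the `Atom` type and `parse_atom_line` helper are illustrative names, and the trailing fields after the B-factor are ignored.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    serial: int
    name: str
    res_name: str
    chain: str
    res_seq: int
    x: float
    y: float
    z: float
    occupancy: float
    b_factor: float


def parse_atom_line(line: str) -> Atom:
    """Parse a whitespace-separated ATOM record (assumed layout, see lead-in)."""
    fields = line.split()
    if len(fields) < 11 or fields[0] != "ATOM":
        raise ValueError(f"not an ATOM record: {line!r}")
    # Fields 6..10 are x, y, z, occupancy and temperature factor.
    numbers = [float(value) for value in fields[6:11]]
    return Atom(int(fields[1]), fields[2], fields[3], fields[4],
                int(fields[5]), *numbers)


atom = parse_atom_line("ATOM 5 CA SER A 1 12.901 4.557 33.573 1.00 52.42 14 342")
print(atom.name, atom.res_name, atom.res_seq, (atom.x, atom.y, atom.z))
```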

Page 7: Find A Homolog in Protein Structure Database ?

Building The Model

Page 8: Find A Homolog in Protein Structure Database ?

The First Model

14 MDH

11 BMD

Page 9: Find A Homolog in Protein Structure Database ?

Refining The Model

14 MDH

11 BMD

Page 10: Find A Homolog in Protein Structure Database ?

The Refined Model

Page 11: Find A Homolog in Protein Structure Database ?

Models & Real Structure

First Model

Refined Model

Real 14 MDH Structure

Page 12: Find A Homolog in Protein Structure Database ?

Comparison of Backbone Structures

Yellow: real 14 MDH structure

Blue: refined model

Green: 11BMD (template)

Page 13: Find A Homolog in Protein Structure Database ?

3-D Structure Docked with Substrate

14MDH

11BMD (template)

In presence of reduced NAD (NADH)

In presence of oxidized NAD (NAD+)

Page 14: Find A Homolog in Protein Structure Database ?

Secondary Structure Prediction Attributes

• Deduces the most likely position of alpha-helices and beta-strands

• Confirms structural or functional relationships when sequence similarity is weak

• Determines guidelines for rational selection of specific mutants for further laboratory study

Page 15: Find A Homolog in Protein Structure Database ?

Alpha helices have a periodicity of 3.6 residues per turn. In a helix with one face buried in the protein core and the other exposed to solvent, the residues at positions i, i+3, i+4 and i+7 lie on the same face of the helix.
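The face-sharing pattern follows from the periodicity: 3.6 residues per turn means each residue is rotated 100 degrees around the helix axis from the previous one. A minimal sketch of that arithmetic, with `wheel_angle` as an illustrative helper name and an ideal helix assumed:

```python
RESIDUES_PER_TURN = 3.6
DEGREES_PER_RESIDUE = 360.0 / RESIDUES_PER_TURN  # 100 degrees per residue


def wheel_angle(offset: int) -> float:
    """Angle around the helix axis relative to residue i, mapped to (-180, 180]."""
    angle = (offset * DEGREES_PER_RESIDUE) % 360.0
    return angle - 360.0 if angle > 180.0 else angle


for offset in range(8):
    print(f"i+{offset}: {wheel_angle(offset):+.0f} degrees")
```

Offsets 3, 4 and 7 land at -60, +40 and -20 degrees from residue i, so they cluster on one face, while offsets 2, 5 and 6 land at -160, +140 and -120 degrees, on the opposite side.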

Page 16: Find A Homolog in Protein Structure Database ?

Beta strands that are half buried in the protein core will tend to have hydrophobic residues at positions i, i+2, i+4, i+6, etc., and polar residues at positions i+1, i+3, i+5, etc.
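A minimal sketch of scoring a segment for this alternating pattern. The hydrophobic residue set is an assumption made for illustration (classifications differ between scales), `alternation_score` is a hypothetical helper, and both example peptides are invented rather than taken from a real protein.

```python
HYDROPHOBIC = set("AVILMFWC")  # assumed classification, for illustration only


def alternation_score(segment: str) -> float:
    """Fraction of residues that fit an alternating hydrophobic/polar pattern.

    Both phases are tried (hydrophobic at even offsets, or at odd offsets)
    and the better fit is returned.
    """
    if not segment:
        return 0.0
    flags = [residue in HYDROPHOBIC for residue in segment.upper()]
    # Count positions that agree with "hydrophobic at even offsets".
    even_phase = sum(flag == (i % 2 == 0) for i, flag in enumerate(flags))
    # Every position that disagrees with the even phase agrees with the odd one.
    odd_phase = len(flags) - even_phase
    return max(even_phase, odd_phase) / len(flags)


print(alternation_score("VTVKVSI"))  # invented alternating segment
print(alternation_score("VVIVLAV"))  # invented all-hydrophobic segment
```

A perfectly alternating segment scores 1.0, while an unbroken hydrophobic run scores close to 0.5 because it fits neither phase.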

Page 17: Find A Homolog in Protein Structure Database ?

Beta strands that are completely buried usually contain a run of hydrophobic residues, since both faces are buried in the protein core.

Page 18: Find A Homolog in Protein Structure Database ?

Other Important Secondary Structures

Loop regions

– Often join combinations of α-helices and β-sheets

– May participate in forming active sites/binding sites

– Usually found on exterior of proteins (H-bond with solvent, H2O)

– Rich in charged and polar hydrophilic residues

– Usually have irregular structure

– Insertions and deletions are most likely to occur in these regions

Hairpin

– Generally 2 to 5 residues long

– 70% are shorter than 7 residues

– Type I: residue 2 is always G

– Type II: residue 1 is always G

Page 19: Find A Homolog in Protein Structure Database ?

Example for Secondary Structure Prediction

Flavodoxin Chain A (FCA) Sequence

KIGLFYGTQTGVTQTIAESIQQEFGGESIVDLNDIANADASDLNA

YDYLIIGCPTWNVGELQSDWEGIYDDLDSVNFQGKKVAYFGAG

DQVGYSDNFQDAMGILEEKISSLGSQTVGYWPIEGYDFNESKAV

RNNQFVGLAIDEDNQPDLTKNRIKTWVSQLKSEFGL

Page 20: Find A Homolog in Protein Structure Database ?

FCA Secondary Structure

1   AKIGLFYGTQ TGVTQTIAES IQQEFGGESI VDLNDIANAD ASDLNAYDYL
      EEEEEE S SSHHHHHHHH HHHHHTTTTT EEEEEGGGTT GGGGGGSEE

51  IIGCPTWNVG ELQSDWEGIY DDLDSVNFQG KKVAYFGAGD QVGYSDNFQD
    EEEE EETTT EE HHHHHHH GGGGGS  TT  EEEEEEE   TTTTTTTTTH

101 AMGILEEKIS SLGSQTVGYW PIEGYDFNES KAVRNNQFVG LAIDEDNQPD
    HHHHHHHHHH HTT EE   E ESTT   S   TTEETTEESS EEE TTTTHH

151 LTKNRIKTWV SQLKSEFGL
    HHTHHHHHHH HHHH HHTTT

The assignments are:

• Helix: H = helix, G = 3-10 helix, I = pi helix

• Beta: B = residue in isolated beta bridge, E = extended beta strand

• Turns and Bends: T = hydrogen bonded turn, S = bend
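A minimal sketch of applying this key: each one-letter code is mapped to the class it is listed under, and consecutive residues of the same class are merged into runs. The function names are illustrative, any character outside the key (including blanks) is treated as unassigned, and positions are counted from the start of the input string rather than as residue numbers.

```python
from itertools import groupby

# Grouping taken from the key on this slide.
CLASS_OF = {
    "H": "helix", "G": "helix", "I": "helix",
    "B": "beta", "E": "beta",
    "T": "turn/bend", "S": "turn/bend",
}


def classify(ss: str) -> list:
    """Map each assignment code to its class; unknown codes become 'other'."""
    return [CLASS_OF.get(code, "other") for code in ss]


def runs(ss: str) -> list:
    """Return (start, end, class) runs, 1-based inclusive, skipping 'other'."""
    found = []
    position = 1
    for label, group in groupby(classify(ss)):
        length = len(list(group))
        if label != "other":
            found.append((position, position + length - 1, label))
        position += length
    return found


# Twenty consecutive codes from the first row of assignments on this slide.
print(runs("SSHHHHHHHHHHHHHTTTTT"))
```

For this input the result is a two-residue bend, a thirteen-residue helix, and a five-residue turn.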

Page 21: Find A Homolog in Protein Structure Database ?

Diagram of FCA Secondary Structure

Summary: 3 sheets, 11 strands, 8 helices, 20 beta turns, 2 beta hairpins

Page 22: Find A Homolog in Protein Structure Database ?

3-D Structure of FCA Docked with Substrate (flavin)