Fisica Computazionale applicata alle Macromolecole


Post on 24-Jan-2016


Page 1: Fisica Computazionale applicata alle Macromolecole

Fisica Computazionale applicata alle Macromolecole

Pier Luigi Martelli

Università di Bologna
[email protected]

051 2094005
338 3991609

Protein structure prediction

Page 2: Fisica Computazionale applicata alle Macromolecole

3D structure prediction of proteins

[Figure: prediction methods arranged along a homology axis, Homology (%) from 0 to 100. New folds: ab initio prediction. Existing folds: threading and building by homology.]

Page 3: Fisica Computazionale applicata alle Macromolecole
Page 4: Fisica Computazionale applicata alle Macromolecole

“Comparative modelling” of proteins

From: Martí-Renom et al. (2000) Annu. Rev. Biophys. Biomol. Struct. 29:291

Page 5: Fisica Computazionale applicata alle Macromolecole

“Comparative modelling” of proteins

From: Sanchez et al. (2000) Nature Struct. Biol. (Suppl) 7:986

Homology modelling

Reliable models are available for only 45% of the proteins in Swiss-Prot (MODBASE)

http://alto.compbio.ucsf.edu/modbase

Is it possible to lower the sequence identity threshold?

On a large scale?

Page 6: Fisica Computazionale applicata alle Macromolecole

Comparative Modelling

Selection of Templates

Alignment of the Target sequence with the Template

Modelling of the Target on the Template

Evaluation of the Model

Page 7: Fisica Computazionale applicata alle Macromolecole

THE TEMPLATE: 1f13

Page 8: Fisica Computazionale applicata alle Macromolecole

TGL3   MAALGVQSINWQKAFNRQAHHTDKFSSQELILRRGQNFQVLMIMNKGLGSNERLEFIDTT 60
1F13A  VHLFKERWDTNKVDHHTDKYENNKLIVRRGQSFYVQIDFSRPYDPRRDLFRVEYVIGRYP 60

TGL3   GPYPSESAMTKAVFPLSNGSSGGWSAVLQASNGNTLTISISSPASAPIGRYTMALQIFSQ 120
1F13A  QENKGTYIPVPIVSELQSGKWGAKIVMREDRSVRLSIQSSPKCIVGKFRMYVAVWTPYGV 120

TGL3   GGISSVKLGTFILLFNPWLNVDSVFMGNHAEREEYVQEDAGIIFVGSTNRIGMIGWNFGQ 180
1F13A  LRTSRNPETDTYILFNPWCEDDAVYLDNEKEREEYVLNDIGVIFYGEVNDIKTRSWSYGQ 180

TGL3   FEEDILSICLSILDRSLNFRRDAATDVASRNDPKYVGRVLSAMINSNDDNGVLAGNWSGT 240
1F13A  FEDGILDTCLYVMDR-------AQMDLSGRGNPIKVSRVGSAMVNAKDDEGVLVGSWDNI 233

TGL3   YTGGRDPRSWDGSVEILKNWKKSGFSPVRYGQCWVFAGTLNTALRSLGIPSRVITNFNSA 300
1F13A  YAYGVPPSAWTGSVDILLEYRSSENPVRYGQCWVFAGVFNTFLRCLGIPARIVTNYFSAH 293

TGL3   HDTDRNLSVDVYYDPMGNPLDKGSDSVWNFHVWNEGWFVRSDLGPPYGGWQVLDATPQER 360
1F13A  DNDANLQMDIFLEEDGNVNSKLTKDSVWNYHCWNEAWMTRPDLPVGFGGWQAVDSTPQEN 353

Sequence alignment of TGL3_HUMAN with 1f13

Page 9: Fisica Computazionale applicata alle Macromolecole

TGL3   SQGVFQCGPASVIGVREGDVQLNFDMPFIFAEVNADRITWLYDNTTGKQWKNSVNSHTIG 420
1F13A  SDGMYRCGPASVQAIKHGHVCFQFDAPFVFAEVNSDLIYITAKKDGTHVVENVDATHIGK 413

TGL3   RYISTKAVGSNARMDVTDKYKYPEGSDQERQVFQKALGKLKPNTPFAATSSMGLETEEQE 480
1F13A  LIVTKQIGGDGMMDITDTYKFQEGQEEERLALETALMYGAKKPLNT--------EGVMKS 465

TGL3   PSIIGKLKVAGMLAVGKEVNLVLLLKNLSRDTKTVTVNMTAWTIIYNGTLVHEVWKDSAT 540
1F13A  RSNVDMDFEVENAVLGKDFKLSITFRNNSHNRYTITAYLSANITFYTGVPKAEFKKETFD 525

TGL3   MSLDPEEEAEHPIKISYAQYERYLKSDNMIRITAVCKVPDESEVVVERDIILDNPTLTLE 600
1F13A  VTLEPLSFKKEAVLIQAGEYMGQLLEQASLHFFVTARINETRDVLAKQKSTVLTIPEIII 585

TGL3   VLNEARVRKPVNVQMLFSNPLDEPVRDCVLMVEGSGLLLGNLKIDVPTLGPKERSRVRFD 660
1F13A  KVRGTQVVGSDMTVTVEFTNPLKETLRNVWVHLDGPGVTRPMKKMFREIRPNSTVQWEEV 645

TGL3   ILPSRSGTKQLLADFSCNKFPAIKAMLSIDVAE 693
1F13A  CRPWVSGHRKLIASMSSDSLRHVYGELDVQIQR 678

sequence identity 34%
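A percentage such as the one quoted above is obtained by counting identical residues over the aligned columns. The following is a minimal sketch, assuming that columns containing a gap are excluded from the count (other conventions, such as dividing by the length of the shorter sequence, are also in use). The function name and the toy alignment are hypothetical and are not taken from the slides.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the alignment columns where neither row has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # assumption: gapped columns are not counted
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Toy alignment (hypothetical, not the TGL3/1f13 alignment)
toy_target = "ACDE-GHIK"
toy_template = "ACDFYGH-K"
print(f"{percent_identity(toy_target, toy_template):.1f}%")
```

The choice of denominator changes the reported value, so it should be stated whenever identities from different sources are compared.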

Page 11: Fisica Computazionale applicata alle Macromolecole

THE TARGET: TGL3_HUMAN

Page 13: Fisica Computazionale applicata alle Macromolecole
Page 14: Fisica Computazionale applicata alle Macromolecole
Page 15: Fisica Computazionale applicata alle Macromolecole
Page 18: Fisica Computazionale applicata alle Macromolecole

Modelling at low sequence identity

• Template selection based on experimental data

Experimental determination of the function, or of the presence of metals or prosthetic groups, greatly reduces the number of possible folds

Page 19: Fisica Computazionale applicata alle Macromolecole

Modelling at low sequence identity

• Template selection based on experimental data

• Multiple alignment of proteins of the same family

Identifying the most conserved residues pins down some residues that are important within the family and whose position must be maintained

Page 20: Fisica Computazionale applicata alle Macromolecole

Modelling at low sequence identity

• Template selection based on experimental data

• Multiple alignment of proteins of the same family

• Use of predictors (secondary structure, solvent accessibility, cysteine bonding state, transmembrane segments, ...)

TARGET    PDDAEMQGTIRSLDENVRSKAKDYMRRIVSSICGIYGATCEVKFMEDVYPTTVNN-----
TEMPLATE  PASATLNADVRYARNEDFDAAMKTLEERAQQKKLP---EADVKVIVTR-----GRPAFNA

TARGET    PEVTDEVMKILSSISTV------VETEPVLGAEDFSRFLQKAPGTYFFLGTRNEKKGCIY
TEMPLATE  GEGGKKLVDKAVAYYKEAGGTLGVEERTGGGTDAAYAALSG---KPVIES--LGLPGFGY

Predicting structural features of the target helps its alignment with the template

Legend: α-helix, β-strand

Page 21: Fisica Computazionale applicata alle Macromolecole

Alcohol dehydrogenase from Sulfolobus solfataricus

Experimental data

• Contains 2 zinc atoms per monomer
• Active as a tetramer

Structures available in the database

• Alcohol dehydrogenases with 2 zinc atoms, dimeric: 2OHX (horse liver), ID: 24%
• Alcohol dehydrogenases with 1 zinc atom, tetrameric: 1YKF (Thermoanaerobacter brockii), ID: 23%

The monomers are similar (RMSD < 0.2 nm). Differences in:

• the loop that coordinates the second zinc atom
• the tetramerization regions

Page 22: Fisica Computazionale applicata alle Macromolecole

Alignment of 87 ADHs with 2 zinc atoms per monomer

[Figure: multiple sequence alignment, columns 1 to 347, shown for 15 of the sequences: ADH1_SULSO (target), ADHE_HORSE, ADHS_HORSE, ADH_GADCA, ADH7_HUMAN, ADHX_HUMAN, ADHB_HUMAN, ADH1_PEA, ADH3_ECOLI, ADH3_SOLTU, ADH2_BACST, ADH1_ZYMMO, ADHP_ECOLI, ADH2_EMENI, ADH_MYCTU]

Page 23: Fisica Computazionale applicata alle Macromolecole

• 38 residues are conserved in more than 90% of the sequences

• 12 residues are always conserved

These include the residues involved in coordinating the two metal centres
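Counts like these come from scanning the columns of the multiple alignment. The sketch below is one simple way to do it, assuming conservation is measured as the fraction of sequences that carry the most common residue in a column, with gaps counting against conservation. The function names and the toy alignment are hypothetical.

```python
from collections import Counter


def column_conservation(alignment, gap="-"):
    """For each column, fraction of sequences carrying the most common residue."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    fractions = []
    for col in range(length):
        residues = [row[col] for row in alignment if row[col] != gap]
        if not residues:
            fractions.append(0.0)
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        # Assumption: divide by the total number of sequences, gaps included.
        fractions.append(top_count / len(alignment))
    return fractions


def count_above(alignment, fraction):
    """Number of columns conserved in MORE than `fraction` of the sequences."""
    return sum(1 for f in column_conservation(alignment) if f > fraction)


def count_invariant(alignment):
    """Number of columns in which every sequence carries the same residue."""
    return sum(1 for f in column_conservation(alignment) if f == 1.0)


# Toy alignment (hypothetical)
toy_msa = ["GHEAC", "GHEGC", "GHDAC", "GHE-C"]
print(count_above(toy_msa, 0.9), count_invariant(toy_msa))
```

Invariant columns found this way are natural anchor points for the target-template alignment, for example the metal-coordinating residues.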

Page 24: Fisica Computazionale applicata alle Macromolecole

Alignment of 24 tetrameric ADHs

[Figure: multiple sequence alignment, columns 1 to 347, shown for 15 of the sequences: ADH1_SULSO (target), ADH_CLOBE, ADH_THEBR, ADH1_SOLTU, ADH2_LYCES, ADH1_ASPFL, ADH1_EMENI, ADH1_KLULA, ADH1_KLUMA, ADH1_YEAST, ADH1_CANAL, ADH1_PICST, ADH_SCHPO, ADH2_EMENI, ADH_ALCEU]

Page 25: Fisica Computazionale applicata alle Macromolecole

Alignment between the target and two templates

Rows: target, ADH with 2 Zn atoms, tetrameric ADH

Legend: α-helix, β-strand

The alignment takes into account: conserved positions, secondary structure, solvent accessibility.

Page 26: Fisica Computazionale applicata alle Macromolecole

Model of the monomer

[Figure labels: structural zinc, catalytic zinc, coenzyme-binding domain, catalytic domain]

Page 27: Fisica Computazionale applicata alle Macromolecole

Model of the tetramer

Casadio R, Martelli PL, Giordano A, Rossi M, Raia CA. A low-resolution 3D model of the tetrameric alcohol dehydrogenase from Sulfolobus solfataricus. Protein Eng 15:215-223 (2002)

Page 28: Fisica Computazionale applicata alle Macromolecole

Model vs. X-ray structure (1JVB)

RMSD = 0.25 nm

Casadio et al., Protein Eng 15:215 (March 2002)
Esposito et al., JMB 318:463 (April 2002)

Confirmation: the structure of the protein has been solved
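An RMSD such as the one quoted above is the root of the mean squared distance between equivalent atoms. The sketch below assumes that the two coordinate sets are already optimally superimposed and matched atom by atom (the superposition step itself, for example with the Kabsch algorithm, is not shown). Coordinates are taken to be in nm; the function name and toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) points, same units as input.

    Assumption: the two sets are already superimposed and in the same atom order.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy example (hypothetical coordinates in nm): every atom displaced by 0.3 nm
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
xray = [(0.0, 0.0, 0.3), (1.0, 0.0, 0.3)]
print(round(rmsd(model, xray), 3))
```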

Page 29: Fisica Computazionale applicata alle Macromolecole

Carboxypeptidase from Sulfolobus solfataricus

Experimental data

• Contains 1 zinc atom per monomer
• Active in oligomeric form; the number of monomers is unknown

Structures available in the database

• Carboxypeptidase with 1 zinc atom: 1OBR (Thermoactinomyces vulgaris), ID: 16%, symmetry compatible with hexamers
• Carboxypeptidase with 2 zinc atoms: 1CG2 (Pseudomonas spirullum), ID: 21%, symmetry compatible with tetramers

Page 30: Fisica Computazionale applicata alle Macromolecole

Structural superposition of the catalytic domains

RMSD = 0.25 nm

[Figure labels: 1OBR: His69, Glu72, His204; 1CG2: His90, Asp119, Glu178]

Page 31: Fisica Computazionale applicata alle Macromolecole

Alignment between the target and 1OBR

Legend: α-helix, β-strand

The alignment takes into account: zinc ligands, secondary structure, solvent accessibility.

Page 32: Fisica Computazionale applicata alle Macromolecole

Model of CPSso based on 1OBR

[Figure labels: His 108, Glu 327, His 245, Asp 109, zinc, water. Legend: residues that coordinate the zinc; residue that coordinates the water]

Page 33: Fisica Computazionale applicata alle Macromolecole

Alignment between the target and 1CG2

Legend: α-helix, β-strand

The alignment takes into account: zinc ligands, secondary structure, solvent accessibility.

Page 34: Fisica Computazionale applicata alle Macromolecole

Model of CPSso based on 1CG2

[Figure labels: His 168, Asp 109, His 108, Glu 142, zinc, water. Legend: residues that coordinate the zinc; residue that coordinates the water]

Page 35: Fisica Computazionale applicata alle Macromolecole

Residues that coordinate the zinc

His 168, Asp 109, His 108 (model based on 1CG2)
His 245, Asp 109, His 108 (model based on 1OBR)

Site-directed mutagenesis

H108A  Inactive
D109L  Inactive
H245A  Active
H168A  Inactive
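The mutagenesis data can be compared with the two models in a few lines. The sketch below assumes a deliberately simple rule that is not stated on the slides: replacing a zinc ligand abolishes activity, while replacing a residue that is not a ligand leaves the enzyme active. The residue numbers and activities are those listed above; the function and variable names are hypothetical.

```python
# Predicted zinc-coordinating residues in the two CPSso models.
predicted_ligands = {
    "model based on 1OBR": {108, 109, 245},
    "model based on 1CG2": {108, 109, 168},
}

# Mutant name -> (mutated position, is the mutant still active?)
mutants = {
    "H108A": (108, False),
    "D109L": (109, False),
    "H245A": (245, True),
    "H168A": (168, False),
}


def contradictions(ligands, mutant_data):
    """List the mutants that disagree with a model under the assumed rule."""
    found = []
    for name, (position, active) in mutant_data.items():
        is_ligand = position in ligands
        if is_ligand and active:
            found.append(name)  # a ligand was removed, yet the enzyme still works
        elif not is_ligand and not active:
            found.append(name)  # a non-ligand was removed, yet activity is lost
    return found


for model_name, ligands in predicted_ligands.items():
    print(model_name, contradictions(ligands, mutants))
```

Under this rule the 1OBR-based ligand set is contradicted by H245A and H168A, while the 1CG2-based set is not contradicted by any of the four mutants. The second half of the rule is the weaker assumption, since a mutation can inactivate an enzyme for reasons unrelated to metal binding.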

Page 36: Fisica Computazionale applicata alle Macromolecole

Aggregates

Model based on 1OBR: 6-mer symmetry

Model based on 1CG2: 4-mer symmetry

Page 37: Fisica Computazionale applicata alle Macromolecole

Small-Angle X-ray Scattering

Occhipinti E, Martelli PL, Spinozzi F, Corsi F, Formantici C, Molteni L, Amenitsch H, Mariani P, Tortora P, Casadio R. 3D structure of Sulfolobus solfataricus carboxypeptidase developed by molecular modeling is confirmed by site-directed mutagenesis and small angle X-ray scattering. Biophys J 85:1165-1175 (2003)

Page 38: Fisica Computazionale applicata alle Macromolecole

Conclusions

Modelling at low sequence identity can give good results if all the available information (both experimental and derived from predictions) is used for template selection and for the alignment.

These procedures are STILL largely not automated

Page 39: Fisica Computazionale applicata alle Macromolecole

A low-resolution 3D model of VDAC (the sequence from Neurospora crassa)

Page 40: Fisica Computazionale applicata alle Macromolecole

2omf_.seq/ AEIYNKDGNK VDLYGKAVGL HYFSKGNGEN SYGGNGDMTY ARLGFKGETQ
2omf_.str/ CCCCCCCCEE EEEEEEEEEE EEECCCCCCC CCCCCCCCCE EEEEEEEEEE
protx.str/ *******CCC CCCCEEEEEE EEEC****** ********CE EEEEEEEECC
protx.seq/ *******KGY NFGLWKLDLK TKTS****** ********SG IEFNTAGHSN

2omf_.seq/ I*NSDLTGYG QWEYNFQGNN SEGADAQTGN KTRLAFAGLK YADVGSFDYG
2omf_.str/ C*CCCEEEEE EEEEEEECCC CCCCCCCCCC EEEEEEEEEE ECCCEEEEEE
protx.str/ CCCCCEEEEE EEEEEEC*** ********** EEEEEEEEEC CCCCCEEEEE
protx.seq/ QESGKVFGSL ETKYKVK*** ********** DYGLTLTEKW NTDNTLFTEV

2omf_.seq/ RNYGVVYDAL GYTDMLPEFG GDTAYSDDFF VGRVGGVATY RNSNFFGLVD
2omf_.str/ ECCCCCCCCC CCCCCCCCCC CCCCCCCCCC CCCCCCEEEE EECCCCCCCC
protx.str/ EEEECC**** ********** ********** **CCEEEEEE EEECCCCCCC
protx.seq/ AVQDQL**** ********** ********** **LEGLKLSL EGNFAPQSGN

2omf_.seq/ GLNFAVQYLG KNER****** *********D TARRSNGDGV GGSISYEYE*
2omf_.str/ CEEEEEEEEC CCCC****** *********C CCCCCCCCEE EEEEEEEEC*
protx.str/ EEEEEEEEEE EEEECCCCCC CCCCCCCEEE EEEEEEEEEE EEEEEEECCC
protx.seq/ KNGKFKVAYG HENVKADSDV NIDLKGPLIN ASAVLGYQGW LAGYQTAFDT

2omf_.seq/ **GFGIVGAY GAADRTNLQE AQPLGNGKKA EQWATGLKYD ANNIYLAANY
2omf_.str/ **CEEEEEEE EEEECCCCCC CCCCCCCCEE EEEEEEEEEE ECCEEEEEEE
protx.str/ CCEEEEEEEE EEEEEEEEEE EEECCCCCCC EEEEEEEEEE CEEEEEEEEE
protx.seq/ QQSKLTTNNF ALGYTTKDFV LHTAVNDGQE FSGSIFQRTS DKLDVGVQLS

2omf_.seq/ GETRNATPIT NKFTNTSGFA NKTQDVLLVA QYQFDFGLRP SIAYTKSKAK
2omf_.str/ EEEECCCCCC CCCCCCCCCC CEEEEEEEEE EEECCCCEEE EEEEEEEEEE
protx.str/ EEECC***** ********** *CCCEEEEEE EEECCCCEEE EEEEEEC***
protx.seq/ WASGT***** ********** *SNTKFAIGA KYQLDDDARV RAKVNNA***

2omf_.seq/ DVEGIGDVDL VNYFEVGATY YFNKNMSTYV DYIINQIDSD NKLGVGSDDT
2omf_.str/ CCCCCCCEEE EEEEEEEEEE ECCCCEEEEE EEEEECCCCC CCCCCCCCCE
protx.str/ *********E EEEEEEEEEE EC***EEEEE EEEEECCC** *****CCCCE
protx.seq/ *********S QVGLGYQQKL RT***GVTLT LSTLVDGK** *****NFNAG

2omf_.seq/ VAVGIVYQF* ***
2omf_.str/ EEEEEEEEE* ***
protx.str/ EEEEEEEEEE EC*
protx.seq/ GHKIGVGLEL EA*

Structural alignment of VDAC with the template

Prediction with HMM
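One quick way to inspect an alignment of this kind is to measure how often the template's observed secondary structure and the target's predicted secondary structure agree at aligned positions. The sketch below assumes that `*` marks a gap and that spaces are only visual separators; the function name is hypothetical, and the two strings are the first `2omf_.str/` and `protx.str/` rows of the alignment.

```python
def ss_agreement(template_ss: str, target_ss: str, gap: str = "*") -> float:
    """Fraction of aligned, ungapped positions with the same secondary-structure state."""
    a = template_ss.replace(" ", "")
    b = target_ss.replace(" ", "")
    if len(a) != len(b):
        raise ValueError("the two rows must have the same aligned length")
    compared = 0
    same = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue  # skip positions where either row has a gap
        compared += 1
        if x == y:
            same += 1
    return same / compared if compared else 0.0


# First block of the alignment: template structure vs. predicted target structure
template_row = "CCCCCCCCEE EEEEEEEEEE EEECCCCCCC CCCCCCCCCE EEEEEEEEEE"
target_row = "*******CCC CCCCEEEEEE EEEC****** ********CE EEEEEEEECC"
print(f"{ss_agreement(template_row, target_row):.2f}")
```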

Page 41: Fisica Computazionale applicata alle Macromolecole

A low-resolution 3D model of VDAC: location of mutated residues

Casadio et al., FEBS Lett 520:1-7 (2002)

Page 42: Fisica Computazionale applicata alle Macromolecole

Threading

Thread the sequence ...ACDGGTKLMAG... into:

Model 1 → Score 1
Model 2 → Score 2
Model 3 → Score 3

The best-scoring model is chosen as the candidate fold for the sequence
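The selection step can be illustrated with a toy scoring function. This is only a sketch under strong assumptions: each candidate fold is reduced to a string of buried (`B`) and exposed (`E`) positions, the sequence is threaded without gaps, and the score simply rewards hydrophobic residues in buried positions and other residues in exposed positions. Real threading programs use gapped alignments and statistical potentials; the hydrophobic set, the environment strings and the function names are all hypothetical.

```python
HYDROPHOBIC = set("AVLIMFWC")  # assumption: a simple hydrophobic alphabet


def thread_score(sequence: str, environments: str) -> int:
    """Ungapped toy score: +1 for a compatible residue/environment pair, -1 otherwise."""
    if len(sequence) != len(environments):
        raise ValueError("sequence and fold model must have the same length")
    score = 0
    for residue, env in zip(sequence, environments):
        buried = env == "B"
        hydrophobic = residue in HYDROPHOBIC
        score += 1 if buried == hydrophobic else -1
    return score


def best_fold(sequence: str, models: dict):
    """Score the sequence against every model and return (best name, all scores)."""
    scores = {name: thread_score(sequence, env) for name, env in models.items()}
    return max(scores, key=scores.get), scores


sequence = "ACDGGTKLMAG"  # the fragment shown on the slide
models = {
    "Model 1": "BBEEEEEBBBE",
    "Model 2": "EEEEEEEEEEE",
    "Model 3": "BEBEBEBEBEB",
}
print(best_fold(sequence, models))
```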

Page 43: Fisica Computazionale applicata alle Macromolecole

THREADING SERVERS

TOPITS (PredictProtein), Burkhard Rost (Columbia Univ.)
http://cubic.bioc.columbia.edu/predictprotein/

FRSVR, David Eisenberg (UCLA)
http://fold.doe-mbi.ucla.edu/

3DPSSM, Michael Sternberg (Imperial Cancer Res. Fund)
http://www.sbg.bio.ic.ac.uk/~3dpssm/

GenTHREADER, David Jones (Brunel Univ.)
http://bioinf.cs.ucl.ac.uk/psipred/

Page 44: Fisica Computazionale applicata alle Macromolecole
Page 45: Fisica Computazionale applicata alle Macromolecole

FoRc

HoMo

1D

….the art of being humble

Page 46: Fisica Computazionale applicata alle Macromolecole

Ab initio methods:

•Knowledge based potentials

•Contact map predictions

Page 47: Fisica Computazionale applicata alle Macromolecole

Prediction of Contact Maps

Contact definition

[Figure labels: F 156, V 238, I 240, I 269, V 271, F 297, V 299]

Page 48: Fisica Computazionale applicata alle Macromolecole

Contact definition:

•C-C distance < 0.8 nm

•Sequence gap > 7 residues
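The two criteria above translate directly into a computation over residue coordinates. The sketch below assumes one representative point per residue, given in nm (which atom is used is left to the caller), and treats "sequence gap > 7" as a strict inequality on the difference of residue indices. The function names and the toy hairpin coordinates are hypothetical.

```python
import math


def contact_pairs(coords, max_distance_nm=0.8, min_gap=7):
    """Return pairs (i, j), i < j, with j - i > min_gap and distance < max_distance_nm."""
    pairs = []
    n = len(coords)
    for i in range(n):
        # Start at i + min_gap + 1 so that the sequence separation exceeds min_gap.
        for j in range(i + min_gap + 1, n):
            if math.dist(coords[i], coords[j]) < max_distance_nm:
                pairs.append((i, j))
    return pairs


def contact_map(coords, max_distance_nm=0.8, min_gap=7):
    """Symmetric boolean matrix built from the contact pairs."""
    n = len(coords)
    matrix = [[False] * n for _ in range(n)]
    for i, j in contact_pairs(coords, max_distance_nm, min_gap):
        matrix[i][j] = matrix[j][i] = True
    return matrix


# Toy hairpin (hypothetical): five points going out, five coming back 0.5 nm away
hairpin = [(0.38 * i, 0.0, 0.0) for i in range(5)]
hairpin += [(0.38 * (4 - i), 0.5, 0.0) for i in range(5)]
print(contact_pairs(hairpin))
```

In the toy hairpin only the pairs separated by more than 7 positions are reported, even though many closer neighbours along the chain are also within 0.8 nm.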

Page 49: Fisica Computazionale applicata alle Macromolecole

Computation of Contact Maps

From 3D Structure to Contact Map

[Figure: 3D structure with labelled residues F 156, V 238, I 240, I 269, V 271, F 297, V 299, and a contact map whose two axes carry the sequence TTCCPSIVARSNFNVCRLPGTPEAICATYTGCIIIPGATCPGDYAN]

Page 50: Fisica Computazionale applicata alle Macromolecole

We can build the correct structure from the correct contact map

[Figure labels: contact map, MARC, model, 1QHJ (1.9 Å), N and C termini; RMSD = 2.5 Å]

Page 51: Fisica Computazionale applicata alle Macromolecole

(A) An alignment of 5 (hypothetical) sequences as they are represented in an HSSP file (Sander and Schneider, 1991). i and j stand for the positions of the two residues making or not making contact (A and D in the leading sequence, or sequence 1). (B) Single sequence coding. The position representing the couple (AD) in the vector is set to 1.0, while the other positions are set to 0. (C) Multiple sequence coding. For each sequence in the alignment (1 to 5 in the scheme in A), the couple of residues in positions i and j is counted. The final input coding, representing the frequency of each couple in the alignment, is normalized to the number of sequences.

Representation of the input coding based on ordered couples.
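The two codings described in the caption can be sketched as follows. The vector length is an assumption, since the slide does not state it: here a couple and its symmetric (AD and DA) share one position, which gives 210 positions for 20 residue types. Couples containing a gap or an unknown symbol are skipped but still count in the normalization, which is also an assumption. All names and the toy alignment are hypothetical.

```python
from itertools import combinations_with_replacement

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # alphabetical one-letter codes
COUPLES = list(combinations_with_replacement(AMINO_ACIDS, 2))  # 210 couples
COUPLE_INDEX = {couple: k for k, couple in enumerate(COUPLES)}


def couple_position(a: str, b: str) -> int:
    """Position of the couple (a, b) in the input vector; (a, b) and (b, a) coincide."""
    return COUPLE_INDEX[tuple(sorted((a, b)))]


def encode_single(sequence: str, i: int, j: int):
    """Single sequence coding: 1.0 at the position of the couple, 0 elsewhere."""
    vector = [0.0] * len(COUPLES)
    vector[couple_position(sequence[i], sequence[j])] = 1.0
    return vector


def encode_multiple(alignment, i: int, j: int):
    """Multiple sequence coding: couple frequencies, normalized to the number of sequences."""
    vector = [0.0] * len(COUPLES)
    for sequence in alignment:
        a, b = sequence[i], sequence[j]
        if a in AMINO_ACIDS and b in AMINO_ACIDS:  # skip gaps and unknown symbols
            vector[couple_position(a, b)] += 1.0
    return [count / len(alignment) for count in vector]


# Toy alignment of 5 sequences (hypothetical); the leading sequence has A at i and D at j
toy_alignment = ["GAKLD", "GAKLD", "GSKLD", "GAKLE", "G-KLD"]
i, j = 1, 4
profile = encode_multiple(toy_alignment, i, j)
print(profile[couple_position("A", "D")])
```

With the multiple sequence coding the network input carries evolutionary information: couples that recur at positions i and j across the family receive a larger weight than couples seen in a single sequence.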

Page 52: Fisica Computazionale applicata alle Macromolecole

The neural network architecture for prediction of contact maps

Page 53: Fisica Computazionale applicata alle Macromolecole

T0087: 310 residues A=20 % (FR/NF)

N

C

Page 54: Fisica Computazionale applicata alle Macromolecole

T0110: 128 residues A=30% (NF)

N

C

Page 55: Fisica Computazionale applicata alle Macromolecole
Page 56: Fisica Computazionale applicata alle Macromolecole

Model Accuracy Evaluation

EVA: Evaluation of Automatic protein structure prediction
[ Burkhard Rost, Andrej Sali, http://maple.bioc.columbia.edu/eva/ ]

CASP: Community Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction
http://PredictionCenter.llnl.gov/casp5/

3D-Crunch: Very Large Scale Protein Modelling Project
http://www.expasy.org/swissmod/SM_LikelyPrecision.html

Bioinformatics I

Page 57: Fisica Computazionale applicata alle Macromolecole

Protein Structure Resources

PDB http://www.pdb.org
PDB – Protein Data Bank of experimentally solved structures (RCSB)

CATH http://www.biochem.ucl.ac.uk/bsm/cath/
Hierarchical classification of protein domain structures

SCOP http://scop.mrc-lmb.cam.ac.uk/scop/
Alexey Murzin’s Structural Classification of Proteins

DALI http://www2.ebi.ac.uk/dali/
Liisa Holm and Chris Sander’s protein structure comparison server

SS-Prediction and Fold Recognition

PHD http://cubic.bioc.columbia.edu/predictprotein/
Burkhard Rost’s Secondary Structure and Solvent Accessibility Prediction Server

3DPSSM http://www.sbg.bio.ic.ac.uk/~3dpssm/
Fold Recognition Server using 1D and 3D Sequence Profiles coupled with Secondary Structure and Solvation Potential Information

Page 58: Fisica Computazionale applicata alle Macromolecole

Protein Homology Modeling Resources

SWISS MODEL: http://www.expasy.ch/swissmod/

Deep View - SPDBV:
homepage: http://www.expasy.ch/spdbv/
tutorials: http://www.usm.maine.edu/~rhodes/SPVTut/

http://www.bbsrc.ac.uk/molbiol/

WhatIf: http://www.cmbi.kun.nl/whatif/
Gert Vriend’s protein structure modelling and analysis program WhatIf

Modeller: http://guitar.rockefeller.edu/modeller/
Andrej Sali's homology protein structure modelling by satisfaction of spatial restraints

FAMS: http://physchem.pharm.kitasato-u.ac.jp/FAMS/fams.html
Full Automatic Modelling System (FAMS); Kitasato University; Tokyo, Japan

3D-JIGSAW: http://www.bmm.icnet.uk/people/paulb/3dj/form.html
Comparative Modelling Server; Imperial Cancer Research Fund; London, UK

CPHmodels: http://www.cbs.dtu.dk/services/CPHmodels/
Centre for Biological Sequence Analysis; The Technical University of Denmark; Denmark

SDSC1: http://cl.sdsc.edu/hm.html
SDSC Structure Homology Modelling Server; San Diego Supercomputing Centre