GBCB 5874: Problem Solving in GBCB Homology Modeling (Comparative Structure Modeling)

Post on 26-Jan-2020

Page 1: Homology Modeling (Comparative Structure Modeling)chekhov.cs.vt.edu/2007/presentations/homology_modeling.pdf · Basis of Homology Modeling •3D structures conserved to greater extent

GBCB 5874: Problem Solving in GBCB

Homology Modeling (Comparative Structure Modeling)

Aims of Structural Genomics

• High-throughput 3D structure determination and analysis
• To determine or predict the 3D structures of all the proteins encoded in the genome
• Up to 40% of the known protein sequences have at least one segment related to one or more structures

=> Determine all of the folds
=> Use homology modeling to predict 3D structures

Growth in the PDB

What is Homology?

• Homology: having a common evolutionary origin
• Cannot be partial (two proteins either share an ancestor or they do not)
• Assertion of homology is a hypothesis
• Hypothesis usually based on extent of sequence similarity between proteins, though similar functions should be demonstrated

Some Definitions

• Homologues (homologs): proteins that are evolutionarily related
• Orthologues (orthologs): homologues from different organisms (separated by a speciation event)
• Paralogues (paralogs): homologues from the same organism (separated by a gene duplication event)

Basis of Homology Modeling

• 3D structures are conserved to a greater extent than primary structures
• Develop models of protein structure based on structures of homologues
• Using a known structure as a “template”, calculate a 3D model of a protein for which only the sequence is known (the “target”)

Steps in Homology Modeling

Template Selection

• Identify protein structures related to the target and select those to be used as templates
• Involves searching a sequence or structure database (e.g., with BLAST at NCBI)
• Involves a certain amount of sequence alignment

Aligning Sequences

• Critical step in homology modeling
• Many options to consider
• Factors to consider
  – Which algorithm to use
  – Which scoring method to apply
  – Whether and how to assign gap penalties
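
The three factors above (algorithm, scoring method, gap penalties) can be made concrete with a minimal sketch of global alignment by dynamic programming (Needleman-Wunsch). The function name, the simple match/mismatch scores, and the linear gap penalty are illustrative assumptions, not the settings of any particular alignment program.

```python
def global_align_score(a, b, match=1, mismatch=-1, gap=-2):
    """Optimal global alignment score (Needleman-Wunsch, linear gap penalty).

    match, mismatch and gap are hypothetical values chosen for illustration.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] holds the best score for aligning a[:i] with b[:j]
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap          # a[:i] aligned against gaps only
    for j in range(1, cols):
        score[0][j] = j * gap          # b[:j] aligned against gaps only
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
    return score[-1][-1]


print(global_align_score("ACDE", "ACE"))
```

Replacing the match/mismatch rule with a lookup in a substitution matrix, or changing `gap`, changes which alignment is considered optimal, which is why these choices matter for the final model.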

Scoring Alignments

• Need some method of scoring to find the optimal alignment
• Four general types of scoring have been applied
  – Identity: considers only identical residues
  – Genetic code: considers the number of base changes in DNA or RNA to interconvert codons for the amino acids
  – Chemical similarity: considers physico-chemical properties
  – Observed substitutions: considers substitution frequencies observed in alignments of sequences (*used the most*)
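
As a rough illustration of how two of these scoring types differ, the sketch below scores one fixed alignment by identity and by chemical similarity. The residue grouping in `GROUPS` is one plausible assumption among many published groupings, and the function names are hypothetical.

```python
# Hypothetical physico-chemical grouping, for illustration only.
GROUPS = ["AVLIM", "FWY", "KRH", "DE", "STNQ", "C", "G", "P"]
GROUP_OF = {res: idx for idx, members in enumerate(GROUPS) for res in members}


def _aligned_pairs(x, y):
    """Residue pairs from two aligned strings, skipping columns with a gap."""
    if len(x) != len(y):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(p, q) for p, q in zip(x, y) if p != "-" and q != "-"]
    if not pairs:
        raise ValueError("no aligned residue pairs")
    return pairs


def percent_identity(x, y):
    """Identity scoring: only identical residues count."""
    pairs = _aligned_pairs(x, y)
    return 100.0 * sum(p == q for p, q in pairs) / len(pairs)


def percent_similarity(x, y):
    """Chemical-similarity scoring: residues in the same group count."""
    pairs = _aligned_pairs(x, y)
    similar = sum(
        p == q or (p in GROUP_OF and GROUP_OF[p] == GROUP_OF.get(q))
        for p, q in pairs
    )
    return 100.0 * similar / len(pairs)


print(percent_identity("KVLDE-A", "RVIDEGA"))
print(percent_similarity("KVLDE-A", "RVIDEGA"))
```

The same alignment scores higher under chemical similarity than under identity because conservative replacements such as K/R and L/I are counted as matches.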

Scoring Matrices

• PAM40 - short highly similar sequences
• PAM160 - detecting members of protein family
• PAM250 - longer more divergent sequences
• BLOSUM90 - short highly similar sequences
• BLOSUM80 - detecting members of protein family
• BLOSUM62 - most effective in finding all potential similarities
• BLOSUM30 - longer more divergent sequences

Log-Odds Matrix

$$S_{i,j} = \log\left[\frac{q_{i,j}}{p_i\,p_j}\right]$$

$q_{i,j}$ = frequency of substitution between residues $i$ and $j$
$p_i\,p_j$ = probability of occurrence of residues $i$ and $j$ in proteins
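
A small worked example of the log-odds formula, as a sketch only: the frequencies are made-up numbers, and the base-2 logarithm with a scale factor of 2 (half-bit units) is an assumed convention rather than something stated on the slide.

```python
import math


def log_odds(q_ij, p_i, p_j, base=2.0, scale=2.0):
    """Return scale * log_base(q_ij / (p_i * p_j)).

    q_ij     : observed frequency of the i/j pair in alignments
    p_i, p_j : background frequencies of residues i and j
    base and scale are assumptions (half-bit units), not from the slide.
    """
    return scale * math.log(q_ij / (p_i * p_j), base)


# Hypothetical numbers: the pair is seen 8 times more often than chance.
print(round(log_odds(0.02, 0.05, 0.05)))
# A pair seen less often than chance gets a negative score.
print(log_odds(0.001, 0.05, 0.05) < 0)
```

A positive score means the pair occurs in real alignments more often than expected by chance, zero means it occurs at the chance rate, and a negative score means it is disfavored.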

Building the 3D Model

• Rigid body assembly
  – Rigid bodies from aligned sequences
  – Core region, loops, and side chains
• Satisfaction of spatial restraints
  – Generate restraints from templates
  – Assume distances and angles between aligned template and target are similar
  – Minimize violations of all restraints using distance geometry or optimization techniques (i.e., force field) to satisfy spatial restraints
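
The idea of minimizing restraint violations can be sketched with a toy example: three points are moved by plain gradient descent until their pairwise distances match target distances. This is a minimal illustration under stated assumptions, not how any specific modeling program works; the function names, step size, iteration count, and target distances are all hypothetical.

```python
import math


def violation(coords, restraints):
    """Sum of squared differences between actual and target distances."""
    return sum(
        (math.dist(coords[i], coords[j]) - target) ** 2
        for i, j, target in restraints
    )


def minimize_restraints(coords, restraints, step=0.05, iterations=2000):
    """Gradient descent on the violation function (toy optimizer)."""
    coords = [list(p) for p in coords]
    for _ in range(iterations):
        grads = [[0.0, 0.0, 0.0] for _ in coords]
        for i, j, target in restraints:
            d = math.dist(coords[i], coords[j])
            if d == 0.0:
                continue  # direction undefined for coincident points
            factor = 2.0 * (d - target) / d
            for k in range(3):
                g = factor * (coords[i][k] - coords[j][k])
                grads[i][k] += g
                grads[j][k] -= g
        for p, g in zip(coords, grads):
            for k in range(3):
                p[k] -= step * g[k]
    return coords


# Hypothetical restraints (i, j, target distance) "derived from a template".
restraints = [(0, 1, 3.8), (1, 2, 3.8), (0, 2, 5.5)]
start = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 1.0, 0.0)]
final = minimize_restraints(start, restraints)
print(violation(start, restraints), violation(final, restraints))
```

Real restraint sets also include angles, dihedrals, and stereochemical terms, and production tools use more robust optimizers, but the objective has the same shape: a sum of penalties for departing from template-derived values.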

Evaluation of Model Quality

• Check for proper protein stereochemistry
  – ProCheck (http://biotech.ebi.ac.uk:8400/cgi-bin/sendquery)
    • Ramachandran plot, bond-length, …
  – Whatif (http://www.cmbi.kun.nl/gv/servers/WIWWWI)
    • Packing quality
  – Both web-servers
• Fitness of sequence to structure
  – ProsaII (http://lore.came.sbg.ac.at/Services/prosa.html)
    • Program runs on Linux and Unix
  – Verify3D (http://www.doe-mbi.ucla.edu/Services/Verify_3D/)
    • Web-server

Evaluating the 3D Model: Procheck

• Ramachandran plot
• Planar peptide bonds
• Side chain conformations that correspond to those in rotamer library
• Hydrogen bonding
• No bad atom-atom contacts
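
A Ramachandran plot is built from the backbone dihedral angles phi and psi of each residue. The sketch below shows one common way to compute a dihedral angle from four atomic positions; it is not Procheck's own code, and the helper names and test coordinates are illustrative assumptions. For residue i, phi uses the atoms C(i-1), N(i), CA(i), C(i), and psi uses N(i), CA(i), C(i), N(i+1).

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 bond."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)  # normals of the two planes
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Made-up coordinates: the fourth point is rotated a quarter turn
# about the central bond relative to the first point.
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

Plotting the (phi, psi) pair of every residue and checking how many fall in the sterically favored regions gives the Ramachandran-based quality check listed above.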

Evaluating the 3D Model: 3D-Profiler (Verify 3D)

• Based on statistical preferences of each of the 20 amino acids for particular environments within a protein
• Residue positions characterized by environment
• Preferred environments defined by three parameters
  – Area of each residue that is buried
  – Fraction of side-chain area that is covered by polar atoms (i.e., O and N)
  – Local secondary structure

Refining the 3D Model

• Molecular dynamics (MD) and energy minimization
• Application of restraints based on experimental data (e.g., NMR, fluorescence)

Applications of the Model