INFORMATION SYSTEM IN BIOLOGY
HOMOLOGY MODELING IN BIOLOGY AND MEDICINE
Virginie Lafleur EPFL
01/12/2004
Plan
What is Homology Modeling
  Goals
  Background in biology
How to construct a model
  Checks
  Steps
Programs of Homology Modeling
  Quick presentation
  Modeller
The goal of Homology Modeling
Use homologous sequences to construct a model of the 3D structure
Analyze relationships between DNA sequence and the 3D structure of proteins
Know a protein's 3D structure to understand its interactions with other molecules
Support computer-aided drug design, mutagenesis and protein engineering
Homologous sequences
[Diagram: three routes to homologous genes, starting from gene A in a DNA sequence]
Paralogy: gene A is duplicated within a genome, and mutation turns one copy into gene A'
Orthology: speciation leaves gene A in species 1 and in species 2
Xenology: a copy of gene A is transferred to the species from another organism
Levels of structure
Primary structure
>1ubq_ mol: protein length:76 Ubiquitin
MQIFVKTLTGKTITLEVEPSDTIENVKAKIQDKEGIPPDQQRLIFAGKQLEDGRTLSDYNIQKESTLHLVLRLRGG
Secondary structure
Tertiary structure
Quaternary structure
Several subunits not part of the same polypeptide chain
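The primary structure above is given as a FASTA record (a `>` header line followed by the sequence). A minimal sketch of reading such a record, assuming plain FASTA text; the function name `parse_fasta` is illustrative and not part of the original slides:

```python
def parse_fasta(text):
    """Return a list of (header, sequence) pairs from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence lines may be wrapped
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


UBIQUITIN = """>1ubq_ mol: protein length:76 Ubiquitin
MQIFVKTLTGKTITLEVEPSDTIENVKAKIQDKEGIPPDQQRLIFAGKQLEDGRTLSDYNIQKESTLHLVLRLRGG
"""

records = parse_fasta(UBIQUITIN)
header, sequence = records[0]
print(header)
print(len(sequence), "residues")
```

The sequence length can be compared with the `length:76` field of the header as a quick sanity check.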
Why is it important to determine protein structure?
Diverse structures serve diverse functions:
  Enzymatic activity
  Storage
  Transport
  Immune response
Structure implies function: example of a denaturation experiment
  Denaturation
  Refolding
How to determine protein structure?
By experimentation
  X-ray crystallography
  NMR (nuclear magnetic resonance spectroscopy)
Today, sequence analysis has exploded
  We have the data
  We need to construct 3D models
The idea: use a similar structure to identify constraints and build the corresponding fold
  Homology Modeling
Where to find the data?
Protein Data Bank (PDB)
http://www.rcsb.org/pdb/
> 10,000 structures of proteins
Each text file contains coordinates for each heavy (non-hydrogen) atom, from the first residue to the last
ATOM      1  N   SER A   2  29.089   9.397  51.904  1.00 81.75
ATOM      2  CA  SER A   2  27.883  10.162  52.185  1.00 79.71
ATOM      3  C   SER A   2  26.659   9.634  51.463  1.00 82.64
ATOM      4  O   SER A   2  26.718   8.686  50.686  1.00 81.02
ATOM      5  CB  SER A   2  28.039  11.660  51.932  1.00 75.59
ATOM      6  OG  SER A   2  27.582  12.038  50.639  1.00 43.28
-------
ATOM   1737  CD1 ILE A 229  39.535  21.584  52.346  1.00 41.62
TER    1738      ILE A 229
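To make the record layout concrete, here is a minimal sketch that reads the ATOM lines shown above into named fields. It splits on whitespace, which is an assumption that works for this excerpt; real PDB files are fixed-column, so fields can touch and a column-based parser is safer in practice. The names `Atom` and `parse_atom_line` are illustrative.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    serial: int       # atom serial number
    name: str         # atom name, e.g. "CA" for the alpha carbon
    res_name: str     # residue name, e.g. "SER"
    chain: str        # chain identifier
    res_seq: int      # residue sequence number
    x: float
    y: float
    z: float
    occupancy: float
    b_factor: float   # temperature factor


def parse_atom_line(line):
    """Parse one whitespace-separated ATOM record (simplified, see lead-in)."""
    fields = line.split()
    if len(fields) != 11 or fields[0] != "ATOM":
        raise ValueError(f"not a simple ATOM record: {line!r}")
    return Atom(int(fields[1]), fields[2], fields[3], fields[4], int(fields[5]),
                float(fields[6]), float(fields[7]), float(fields[8]),
                float(fields[9]), float(fields[10]))


EXCERPT = """\
ATOM 1 N SER A 2 29.089 9.397 51.904 1.00 81.75
ATOM 2 CA SER A 2 27.883 10.162 52.185 1.00 79.71
ATOM 3 C SER A 2 26.659 9.634 51.463 1.00 82.64
ATOM 4 O SER A 2 26.718 8.686 50.686 1.00 81.02
ATOM 5 CB SER A 2 28.039 11.660 51.932 1.00 75.59
ATOM 6 OG SER A 2 27.582 12.038 50.639 1.00 43.28
"""

atoms = [parse_atom_line(l) for l in EXCERPT.splitlines() if l.startswith("ATOM")]
alpha_carbons = [a for a in atoms if a.name == "CA"]
for atom in alpha_carbons:
    print(atom.res_name, atom.res_seq, atom.x, atom.y, atom.z)
```

Keeping only the `CA` atoms gives the Cα trace used later for viewing and for comparing templates.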
The way to visualize the protein
It is impractical to read this text file without a graphics viewer such as RASMOL
http://www.bernstein-plus-sons.com/software/rasmol
Different ways to visualize:
  Coloring: by structure
  All-atom model, in ball-and-stick representation
  Space-filling model
  Cα trace
Structure and homologous sequences
With at least 30% identity between two sequences, a definite correlation exists between sequence and structure
In particular, homologous sequences show very similar structures, with strong conservation in secondary structural elements
Some folds are preferred by vastly different sequences to conserve the structure of the active site
On the other hand, some proteins adopt very similar structures, with no obvious sequence similarity
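The 30% identity rule of thumb can be checked on an existing pairwise alignment. The sketch below counts identical residues over the aligned, non-gap positions; this choice of denominator is an assumption, since several definitions of percent identity are in use, and the function name and example sequences are illustrative.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over positions where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # gap positions are not counted
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical aligned fragments, for illustration only.
target = "MQIF-VKTLT"
template = "MKIFAVRTLS"
identity = percent_identity(target, template)
print(f"{identity:.1f}% identity; above the 30% threshold: {identity >= 30.0}")
```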
Why homology modeling?
Other ways to construct a 3D model: prediction methods
  Ab initio
  Threading
But: expensive in time and computation
The solution of homology modeling
  Start from a known 3D structure in each protein family
  Construct the model from this known structure: the template structure
Before building a model…
Elements of sequence analysis, essential for building a molecular model, will be considered
Multiple sequence alignment
Alignment checks
Protein domain,…
Some problems have to be solved
Homologous sequences are identified using database search methods (BLAST)
To build a model, we need an alignment of the complete protein sequences collected from the database searches
Identical residues must be lined up
The rest should be arranged based on:
  observed substitutions in protein families
  chemical similarity
  charge similarity
Where none of the 19 other residues is suitable, the alignment skips that position with a 'gap' (insertion/deletion region)
Tools: CLUSTALW/CLUSTALX, MAXHOM, MALIGN (MSA methods)
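The arrangement rules above can be sketched as a small pairwise global alignment using Needleman-Wunsch dynamic programming, a standard technique that the slides do not name. The similarity groups, score values and gap penalty below are illustrative assumptions, not a published substitution matrix, and tools such as CLUSTALW use more elaborate schemes.

```python
# Toy similarity groups (assumption): residues in the same set count as "similar".
GROUPS = [set("AVLIMC"), set("FWY"), set("STNQ"), set("KRH"), set("DE"), set("GP")]


def score(a, b, match=2, similar=1, mismatch=-1):
    """Score one residue pair: identical > chemically similar > dissimilar."""
    if a == b:
        return match
    if any(a in g and b in g for g in GROUPS):
        return similar
    return mismatch


def global_align(s, t, gap=-2):
    """Return (aligned_s, aligned_t, total_score) for a global alignment."""
    n, m = len(s), len(t)
    # dp[i][j] is the best score for aligning s[:i] with t[:j].
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(dp[i - 1][j - 1] + score(s[i - 1], t[j - 1]),
                           dp[i - 1][j] + gap,   # gap in t
                           dp[i][j - 1] + gap)   # gap in s
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_s, out_t = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + score(s[i - 1], t[j - 1]):
            out_s.append(s[i - 1]); out_t.append(t[j - 1]); i -= 1; j -= 1
        elif i > 0 and dp[i][j] == dp[i - 1][j] + gap:
            out_s.append(s[i - 1]); out_t.append("-"); i -= 1
        else:
            out_s.append("-"); out_t.append(t[j - 1]); j -= 1
    return "".join(reversed(out_s)), "".join(reversed(out_t)), dp[n][m]


aligned_s, aligned_t, total = global_align("MQIFVK", "MQFVK")
print(aligned_s)
print(aligned_t)
print("score:", total)
```

Identical residues are rewarded most, similar residues less, and a gap is opened only where no pairing scores better than the gap penalty.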
After alignment, check the result
The function of a protein depends on the location in space of a few key residues
Some residues are critical for the stability of the protein fold or for the formation of functional quaternary structures
Residues conserved across all sequences usually indicate a conserved structural or functional role, especially buried charges
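One simple way to run this check is to list the alignment columns where every sequence carries the same residue. A minimal sketch, assuming a gapped multiple alignment given as equal-length strings; the function name and the toy alignment are illustrative.

```python
def conserved_columns(alignment, gap="-"):
    """Return (column_index, residue) for columns identical in every sequence."""
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("alignment must be non-empty with equal-length rows")
    conserved = []
    for index, column in enumerate(zip(*alignment)):
        residues = set(column)
        # A column is conserved when it holds a single residue type and no gap.
        if len(residues) == 1 and gap not in residues:
            conserved.append((index, column[0]))
    return conserved


# Hypothetical aligned fragments, for illustration only.
toy_alignment = [
    "MKCDE",
    "MRCDE",
    "M-CNE",
]
print(conserved_columns(toy_alignment))
```

Columns reported here are candidates for key structural or functional residues and deserve a closer look in the template structure.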
Checking protein domain
A polypeptide sequence can contain several compact globular regions that fold independently: domains
A domain is a compact unit of protein structure, usually associated with a function
It is important to know which domains make up a given protein
The 3D model of a protein is composed of these modular elements, usually constructed individually and then assembled together
How to construct a homology model?
Find homologous sequences
Select the template sequence of known structure
Align the template and the target sequences
Build the model
The most important step: find a homologous structure
The criteria:
  Alignment score and E-value (discard hits with low scores or high E-values, > 0.005)
  Domain coverage (at least 60% of the domain)
  Gaps (the fewer the gaps, the better the structural model)
  For small proteins, a specific search (disulfide bonds)
No structure found: use prediction methods (secondary and tertiary structure prediction)
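These criteria can be applied as a simple filter over search hits. In the sketch below, the E-value and coverage thresholds come from the slide; the `Hit` record, its field names, the `min_score` parameter (the slide gives no numeric score cutoff) and the example hits are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    template_id: str
    score: float            # alignment score from the database search
    e_value: float
    domain_coverage: float  # fraction of the domain covered, 0.0 to 1.0
    gaps: int               # number of gaps in the alignment


def select_templates(hits, max_e_value=0.005, min_coverage=0.60, min_score=0.0):
    """Keep hits that pass the criteria, ordered with the fewest gaps first."""
    kept = [h for h in hits
            if h.e_value <= max_e_value
            and h.domain_coverage >= min_coverage
            and h.score >= min_score]
    # Fewer gaps first, then lower E-value, then higher score.
    return sorted(kept, key=lambda h: (h.gaps, h.e_value, -h.score))


# Made-up hits, for illustration only.
hits = [
    Hit("tmplA", 250.0, 1e-40, 0.95, 2),
    Hit("tmplB", 40.0, 0.5, 0.90, 0),     # rejected: E-value too high
    Hit("tmplC", 180.0, 1e-20, 0.40, 1),  # rejected: coverage below 60%
    Hit("tmplD", 220.0, 1e-35, 0.80, 0),
]
for hit in select_templates(hits):
    print(hit.template_id, hit.e_value, hit.domain_coverage, hit.gaps)
```

If the returned list is empty, the slide's fallback applies: no template was found and a prediction method is needed instead.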
Selection of template sequence
Single structural homologue: only one choice for template selection
Several equally good structural homologues are identified: how many and which one(s) should we choose?
Refine the template choice by looking at:
  a simple phylogenetic tree (shows the most similar structure)
  completeness of structural information (view the PDB entry in RASMOL and verify that the structure is complete)
  X-ray and NMR entries
One or many templates?
When we have selected many templates with the same quality and similarity
Compare the 3D structures to check the unique information each template provides
Structure alignment of Cα atoms
If 2 templates are very close, keep only one
Keep templates that provide new information
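The comparison above can be sketched with a root-mean-square deviation (RMSD) over paired Cα atoms. The sketch assumes the templates are already superimposed and that residues are paired one-to-one; a real comparison would first compute an optimal superposition. The 1.0 Å cutoff, the function names and the coordinates are illustrative assumptions.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) Cα coordinates."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equally long")
    total = sum((xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
                for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))


def drop_redundant(templates, cutoff=1.0):
    """Keep a template only if it differs from every kept one by more than cutoff."""
    kept = {}
    for name, coords in templates.items():
        if all(ca_rmsd(coords, other) > cutoff for other in kept.values()):
            kept[name] = coords
    return list(kept)


# Made-up two-residue Cα traces, for illustration only.
templates = {
    "tmplA": [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)],
    "tmplB": [(0.0, 0.0, 0.1), (3.8, 0.0, 0.1)],  # nearly identical to tmplA
    "tmplC": [(0.0, 0.0, 0.0), (0.0, 3.8, 0.0)],  # different conformation
}
print(drop_redundant(templates))
```

Templates are visited in insertion order, so listing the preferred template first decides which member of a close pair is kept.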
Align the template and the target sequences
With high homology (> 40% identity), the alignment is consistent and any method can be used
In other cases, using multiple sequences improves the quality of the alignment
Some checks are needed to increase confidence in the model
Residue conservation checks (pattern and function)
Visual inspection of indel regions (RASMOL)
Illustration of the building
And finally… Build the model
Now is the time to use a program
Input: the target and template sequences and their alignment
Output: the 3D structure that satisfies the constraints
Which program to choose?
WHATIF (1990)
SWISSMODEL (1993)
MODELLER (1994)
ICM (1994)
CPH Models (1997)
SDSC1 (2000)
3D-JIGSAW (2001)
A short presentation of Modeller
[Šali & Blundell, 1993]
This is one of the best available modeling programs
It is written in Fortran 90
A graphical interface to MODELLER is commercially available from Accelrys, as part of Discovery Studio Modeling 1.1.
Advantages of Modeller
Implements comparative protein structure modelling by satisfaction of spatial restraints (2,3)
Can perform many additional tasks, including:
  de novo modelling of loops in protein structures
  optimization of various models of protein structure with respect to a flexibly defined objective function
  multiple alignment of protein sequences and/or structures
  searching sequence databases
  comparison of protein structures
Optimization with iteration
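As a toy illustration of iterative optimization against spatial restraints, the sketch below moves three points by gradient descent until their pairwise distances match target values. It is not Modeller's actual objective function or optimizer: the quadratic penalty, the step size, the iteration count and the restraint values are all assumptions chosen for illustration.

```python
import math


def objective(coords, restraints):
    """Sum of squared violations of distance restraints (i, j, target_distance)."""
    return sum((math.dist(coords[i], coords[j]) - target) ** 2
               for i, j, target in restraints)


def optimize(coords, restraints, step=0.05, iterations=500):
    """Plain gradient descent on the objective; returns the refined coordinates."""
    points = [list(p) for p in coords]
    for _ in range(iterations):
        grads = [[0.0, 0.0, 0.0] for _ in points]
        for i, j, target in restraints:
            d = math.dist(points[i], points[j])
            if d == 0.0:
                continue  # direction undefined for coincident points
            factor = 2.0 * (d - target) / d
            for k in range(3):
                g = factor * (points[i][k] - points[j][k])
                grads[i][k] += g
                grads[j][k] -= g
        # Move every point a small step against its gradient.
        for p, g in zip(points, grads):
            for k in range(3):
                p[k] -= step * g[k]
    return [tuple(p) for p in points]


# Three points with made-up target distances (in Å) between them.
start = [(0.0, 0.0, 0.0), (1.0, 0.5, 0.0), (2.0, 0.0, 0.3)]
restraints = [(0, 1, 3.8), (1, 2, 3.8), (0, 2, 6.0)]
final = optimize(start, restraints)
print("objective before:", objective(start, restraints))
print("objective after: ", objective(final, restraints))
```

Each iteration lowers the total restraint violation a little, which is the general idea behind refining a model until it satisfies the restraints derived from the templates.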
To conclude
Protein structure determines function
Knowing protein structure is important for applications in biology and medicine
Homology modeling:
  From a known structure in a protein family
  Build a model of a homologous sequence