IN SILICO FUNCTIONAL ASSIGNMENT TO Rv3430c, A Mycobacterium tuberculosis
GENE OVEREXPRESSED DURING INFECTION IN BLOOD FROM HIV+/- SUBJECTS
Panjab University, Chandigarh
Presented by: Rajni Devi
M.Sc. Biochemistry
Introduction
Data were retrieved from a transcriptomics experiment conducted by Ryndak et al. (2014) on M. tuberculosis genes in HIV+ and HIV- patients.
The set of genes that were significantly overexpressed during pathogenesis (HIV+/HIV-) was analysed.
Several ‘Rv’ genes with no known function showed a significant increase in expression in HIV+ and HIV- patients.
Rv3430c was selected for analysis. It showed a 5-fold and a 3-fold increase in expression in HIV+ and HIV- infection, respectively.
Objectives
Amino acid sequence based functional assignment.
Build a 3D computational model of Rv3430c.
Comparison of the Rv3430c model with other known protein structures.
Identification of active site residues in Rv3430c.
Analysis of the binding pocket.
Methodology
Screening of genes (UniProt)
Rv3430c sequence-based analysis (Pfam, SCOP)
Structure visualization and analysis (ICM)
Structure validation (VADAR)
Search for homologs (BLAST)
3-D structure prediction (LOMETS)
Structure superpositioning (Dali Server)
Active site & binding pocket analysis (ICM Browser)
RESULTS
Rv3430c Protein Sequence
The Rv3430c protein sequence is 387 amino acids long. This amino acid sequence was used in all further analysis to predict the function of the Rv3430c gene.
Structural and Evolutionary relationship amongst known proteins
SCOP predicted that Rv3430c belongs to the ‘Ribonuclease H-like’ superfamily, whose members contain a ribonuclease H-like domain.
Visualization of Rv3430c protein structure
The N-terminal domain (right side) has an HHCC motif. The catalytic core domain (central) is the integrase domain and contains a DDE motif. The C-terminal domain (left side) binds DNA non-specifically and contains an SH3 motif.
Search for structure homologs
Dali Server returned a large number of structures with coordinates transformed according to the submitted model. Of these, the 8 structures with the lowest RMSD were selected for superposition.
Structure (colour)   RMSD (Å)   Seq. identity (%)   Organism
4e7h_A (red)         1.4        14                  Human spumaretrovirus
4eb2_A (blue)        1.4        14                  Human spumaretrovirus
3oS0_A (white)       1.3        15                  Simian foamy virus
1cxu_A (orange)      2.1        21                  Avian sarcoma virus
1asv_A (cyan)        2.3        20                  Avian sarcoma virus IN
1c0m_A (pink)        2.4        20                  Rous sarcoma virus
1asu_A (green)       2.6        19                  Avian sarcoma virus
1biu_A (maroon)      2.8        19                  Human immunodeficiency virus 1
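The selection by RMSD can be sketched in a few lines of Python. The records below are the hits listed above; the function name `rank_by_rmsd` and the optional cutoff are illustrative assumptions, not part of the original workflow.

```python
# Dali hits as listed above: (chain, RMSD in angstroms, % sequence identity, organism)
hits = [
    ("4e7h_A", 1.4, 14, "Human spumaretrovirus"),
    ("4eb2_A", 1.4, 14, "Human spumaretrovirus"),
    ("3oS0_A", 1.3, 15, "Simian foamy virus"),
    ("1cxu_A", 2.1, 21, "Avian sarcoma virus"),
    ("1asv_A", 2.3, 20, "Avian sarcoma virus"),
    ("1c0m_A", 2.4, 20, "Rous sarcoma virus"),
    ("1asu_A", 2.6, 19, "Avian sarcoma virus"),
    ("1biu_A", 2.8, 19, "Human immunodeficiency virus 1"),
]


def rank_by_rmsd(records, max_rmsd=None):
    """Sort records by ascending RMSD, optionally dropping those above max_rmsd."""
    kept = [r for r in records if max_rmsd is None or r[1] <= max_rmsd]
    return sorted(kept, key=lambda r: r[1])


ranked = rank_by_rmsd(hits)
# The 2.0 angstrom cutoff is a hypothetical value chosen only for illustration.
closest = rank_by_rmsd(hits, max_rmsd=2.0)
```

With this data the foamy virus integrase structures rank first, even though their sequence identity to Rv3430c is the lowest in the set.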
Structure Superposition
Rv3430c is shown in yellow. The C-terminal region, which is important for DNA binding and catalysis, superposed well with the other integrases in spite of no significant sequence identity.
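How well two structures superpose is usually summarised as the RMSD over equivalent atoms. A minimal sketch of that calculation follows, assuming the two coordinate sets are already superposed and paired one-to-one (the superposition itself was done by Dali Server and is not reproduced here). The coordinates are hypothetical toy values, not taken from the Rv3430c model.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) points that are already superposed."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        # Squared distance between the paired atoms
        total += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical toy coordinates: every atom is displaced by 1 unit along y
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
template = [(0.0, 1.0, 0.0), (1.0, 1.0, 0.0), (2.0, 1.0, 0.0)]
value = rmsd(model, template)
```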
Structure based Sequence Alignment
Arrows and cylinders depict β-strands and α-helices, respectively. The DDE region (shaded cyan) and the active site residues (shaded yellow) are conserved in all the integrases.
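Percent sequence identity can be read off an alignment like this one. The sketch below assumes identity is counted over columns where both sequences have a residue; other definitions use a different denominator, and the original slides do not say which one was used. The two aligned strings are hypothetical toy sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


# Hypothetical toy alignment: 4 identical residues out of 5 compared columns
identity = percent_identity("MKT-DE", "MRTADE")
```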
Active site amino acid residues analysis
The active site residues were Asp145, Asp208, Ser232, Tyr234, His235, Asn240, and Glu244. These residues were conserved in all the other integrases. This suggests that Rv3430c is probably an ‘Integrase’.
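A conservation check of this kind can be sketched as a lookup over alignment columns. The alignment below is a hypothetical toy example with a D, D, E pattern; it is not the real Rv3430c alignment, and the column indices are 0-based positions in the toy alignment rather than the residue numbers quoted above.

```python
def conserved_columns(alignment, columns):
    """Map each 0-based column to the residue shared by every sequence, or None if it varies."""
    result = {}
    for col in columns:
        residues = {seq[col] for seq in alignment.values()}
        if len(residues) == 1 and "-" not in residues:
            result[col] = next(iter(residues))
        else:
            result[col] = None
    return result


# Hypothetical toy alignment (names and sequences are illustrative only)
alignment = {
    "query": "ADKDLE",
    "hit1": "GDRDME",
    "hit2": "SDKDAE",
}
report = conserved_columns(alignment, [0, 1, 3, 5])
```

Columns 1, 3 and 5 come back as D, D and E, while column 0 is reported as not conserved.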
Manual Docking with DNA
Binding pocket for DNA in A: 3oS0_A, B: 4e7h_A, C: 4be2_A and D: Rv3430c. The active site is conserved in all four structures in spite of the low sequence identity.
Summary
• This work was aimed at assigning a function to an unknown M. tuberculosis gene
whose expression goes up during infection in HIV+/- subjects.
• Rv3430c was subjected to various bioinformatic analyses.
• Pfam predicted the gene product to be an Integrase.
• The structural analysis showed that Rv3430c has a typical conserved DDE region,
common to other integrases.
• Based on the active site analysis and comparison with known structures, it was found
that in spite of no significant sequence identity, the active site residues are conserved.
• This study shows that Rv3430c is likely to be an Integrase.