Bayesian Refinement of Protein Functional Site Matching Kanti V Mardia, Vysaul B Nyirongo*, Peter J Green, Nicola D Gold, David R Westhead Presented by Deephan, Mohan

Results
Conclusion
Disclaimer: Contrary to the assumption made by the authors, the paper presenter does have a thorough understanding of all the concepts in advanced statistics, graph theory and structural genomics discussed in the paper.
Motivation
Shape analysis of Proteins
Infer functional relationship of proteins
Classification of Binding Patterns
Objective:
Matching functional sites: comparing amino acid configurations (Cα and Cβ atoms)
Each functional site is represented as a graph
Based on the joint posterior distribution
Posterior is proportional to the product of the prior density and the likelihood
Biologically speaking,
Likelihood: related to the matches between functional sites
Graphs G1 and G2
Vertex sets
V1 = {Xj, j = 1, 2, ..., m} , V2 = {Yk, k = 1, 2, ..., n}
Xj, Yk: coordinates of the amino acids at the jth and kth positions of X and Y
x1j, y1k: Cα coordinates for X and Y
x2j, y2k: Cβ coordinates for X and Y
x1 = {x1j : j = 1, ..., m}, x2 = {x2j : j = 1, ..., m}
y1 = {y1k : k = 1, ..., n}, y2 = {y2k : k = 1, ..., n}
Hv = G1 v G2
VH = V1 × V2
An edge between two vertices vh = (Xj, Yk) and vh' = (Xj', Yk') ∈ VH exists for j ≠ j' and k ≠ k' when
1. the absolute difference between the Cα distances |x1j - x1j'| and |y1k - y1k'|, and
2. the absolute difference between the Cβ distances |x2j - x2j'| and |y2k - y2k'|
are both less than 1.5 Å (the matching distance threshold).
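The edge rule above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `product_graph_edges` and the use of index pairs (j, k) as vertices are assumptions made here, while the 1.5 Å threshold is the one stated on the slide.

```python
from itertools import combinations, product
from math import dist

THRESHOLD = 1.5  # matching distance threshold in angstroms (from the slide)


def product_graph_edges(x1, x2, y1, y2, threshold=THRESHOLD):
    """Return the edges of the product graph for sites X and Y.

    x1, x2: C-alpha and C-beta coordinates of site X (lists of 3-tuples).
    y1, y2: C-alpha and C-beta coordinates of site Y (lists of 3-tuples).
    A vertex is an index pair (j, k). Two vertices (j, k) and (j2, k2) are
    joined when j != j2, k != k2, and the C-alpha and C-beta distance
    differences are both below the threshold.
    """
    vertices = list(product(range(len(x1)), range(len(y1))))
    edges = []
    for (j, k), (j2, k2) in combinations(vertices, 2):
        if j == j2 or k == k2:
            continue  # an amino acid cannot be matched twice
        d_alpha = abs(dist(x1[j], x1[j2]) - dist(y1[k], y1[k2]))
        d_beta = abs(dist(x2[j], x2[j2]) - dist(y2[k], y2[k2]))
        if d_alpha < threshold and d_beta < threshold:
            edges.append(((j, k), (j2, k2)))
    return edges
```

Two vertices end up connected only when pairing j with k and j' with k' preserves both inter-residue distances to within the threshold.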
Bayesian Alignment
The matching between the amino acids of X and Y is represented by a matrix M:
Mjk = 1 if the jth amino acid of X corresponds to the kth amino acid of Y
Mjk = 0 otherwise
The transformation that brings the configurations into alignment is given by
xij = A yik + τ for Mjk = 1, i = 1, 2
A – Rotation Matrix, τ – Translation vector
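To make the notation concrete, here is a small sketch of the matching matrix M and the rigid transformation x = A y + τ. The helper names `matching_matrix` and `transform` are hypothetical, and A is simply passed in as a 3x3 nested list; nothing here estimates A or τ.

```python
def matching_matrix(pairs, m, n):
    """Build the m-by-n 0/1 matrix M from a list of matched (j, k) index pairs."""
    M = [[0] * n for _ in range(m)]
    for j, k in pairs:
        M[j][k] = 1
    return M


def transform(A, tau, y):
    """Apply x = A y + tau to one 3-D point y (A is a 3x3 nested list)."""
    return tuple(
        sum(A[r][c] * y[c] for c in range(3)) + tau[r] for r in range(3)
    )
```

For example, with A a 90-degree rotation about the z-axis and τ = (1, 2, 3), the point (1, 0, 0) is first rotated to (0, 1, 0) and then shifted.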
Bayesian Modeling (contd)
Joint Posterior Distribution:
p(A), p(τ) and p(σ) denote the prior distributions for A, τ and σ
|A| - Jacobian of the transformation
The model assumes Gaussian noise N(0, σ²) in the atomic positions x1j and y1k
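The role of the noise assumption can be illustrated with a simplified sketch. If x1j and y1k each carry independent N(0, σ²) noise per coordinate and A is a rotation, the residual x1j - A y1k - τ of a matched pair has variance 2σ² per coordinate. The function below evaluates only this Gaussian factor for the matched pairs; it is not the full joint posterior, and it leaves out the priors and any terms for unmatched points. The name `matched_log_likelihood` is an assumption of this sketch.

```python
from math import log, pi


def matched_log_likelihood(x, y, M, A, tau, sigma):
    """Sum of Gaussian log-densities of the residuals over matched pairs.

    x, y: lists of 3-D points; M: 0/1 matching matrix (M[j][k]);
    A: 3x3 rotation as a nested list; tau: translation; sigma: noise s.d.
    """
    var = 2.0 * sigma ** 2  # residual variance per coordinate
    total = 0.0
    for j, row in enumerate(M):
        for k, matched in enumerate(row):
            if not matched:
                continue
            ay = [
                sum(A[r][c] * y[k][c] for c in range(3)) + tau[r]
                for r in range(3)
            ]
            sq = sum((x[j][r] - ay[r]) ** 2 for r in range(3))
            total += -1.5 * log(2.0 * pi * var) - sq / (2.0 * var)
    return total
```

A closer superposition of the matched pairs gives a larger value, which is what lets the likelihood reward good matches.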
Bayesian Modeling (contd)
Side-chain orientation:
Extending the model by taking into account the relative orientation of Cα and Cβ in matching amino acids
MCMC Refinement Step
Markov Chain Monte Carlo (MCMC) – used to sample from the full joint posterior distribution p(M, A, τ, σ | x1, y1, x2, y2)
p(M, A, τ, σ | x1, y1, x2, y2) – a function of the RMSD and of the angle measuring the orientation difference between matched amino acids
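As a reminder of what sampling by MCMC involves, here is a generic random-walk Metropolis sketch. It is not the sampler used in the paper: it only updates a continuous parameter vector and does not touch the discrete matching M. The function name `metropolis` and its arguments are assumptions of this sketch.

```python
import math
import random


def metropolis(log_target, start, step, n_samples, rng):
    """Random-walk Metropolis sampler for a continuous parameter vector.

    log_target: function returning the log of an unnormalised density.
    start: initial parameter values; step: proposal standard deviation.
    rng: a random.Random instance, passed in for reproducibility.
    """
    current = list(start)
    current_lp = log_target(current)
    samples = []
    for _ in range(n_samples):
        proposal = [v + rng.gauss(0.0, step) for v in current]
        proposal_lp = log_target(proposal)
        # Accept with probability min(1, target(proposal) / target(current)).
        if rng.random() < math.exp(min(0.0, proposal_lp - current_lp)):
            current, current_lp = proposal, proposal_lp
        samples.append(list(current))
    return samples
```

In the alignment setting, `log_target` would be the log of the joint posterior evaluated at the continuous parameters, with separate moves needed to add, remove or switch matches.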
RMSD – Root Mean Square Deviation
Matches with a lower RMSD over a larger number of matching residues are more statistically significant
MCMC refinement reduced the RMSD and increased the number of matching residues
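For reference, the RMSD between two sets of already matched and superposed points can be computed as below. This is a minimal sketch; the function name `rmsd` is chosen here, and the superposition step (applying A and τ) is assumed to have been done beforehand.

```python
from math import dist, sqrt


def rmsd(points_a, points_b):
    """Root mean square deviation between paired 3-D points."""
    if len(points_a) != len(points_b) or not points_a:
        raise ValueError("need two non-empty point lists of equal length")
    squared = sum(dist(a, b) ** 2 for a, b in zip(points_a, points_b))
    return sqrt(squared / len(points_a))
```

Because the mean is taken over the matched pairs, RMSD values are best compared alongside the number of matches.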
Decision tree for refining the graph solution by the MCMC method. Boxes with curved corners show processes and their output, while boxes with sharp corners are branching conditions. The procedure starts with the graph solution MG, whose RMSD and number of matches are denoted by RMSDG and LG respectively. MCMC is repeated until the MCMC solution MB is better. The RMSD and number of matches for MB are denoted by RMSDB and LB respectively. MB and MG are compared using 1) their RMSDs and numbers of matches or 2) the P-values for MG and MB, denoted by PG and PB respectively.
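The figure itself is not reproduced in this transcript, so the exact branching conditions are unknown. The sketch below is one plausible reading of the caption, with loudly stated assumptions: MB counts as better when it has no fewer matches and no higher RMSD (and improves on at least one), otherwise the smaller P-value wins when both are available. The names `is_better`, `refine`, `run_mcmc` and the `max_rounds` cap are all hypothetical.

```python
def is_better(rmsd_b, l_b, rmsd_g, l_g, p_b=None, p_g=None):
    """Return True when the MCMC solution MB is judged better than MG.

    Assumed rule 1: MB has no fewer matches and no higher RMSD, and is
    strictly better on at least one of the two.
    Assumed rule 2: otherwise, if both P-values are given, the smaller wins.
    """
    no_worse = l_b >= l_g and rmsd_b <= rmsd_g
    strictly = l_b > l_g or rmsd_b < rmsd_g
    if no_worse and strictly:
        return True
    if p_b is not None and p_g is not None:
        return p_b < p_g
    return False


def refine(graph_solution, run_mcmc, max_rounds=5):
    """Re-run MCMC from the graph solution until a better solution appears.

    Solutions are dicts with keys "rmsd", "matches" and optionally "p".
    run_mcmc is a caller-supplied function (hypothetical here); max_rounds
    is an assumed cap so the loop always terminates.
    """
    for _ in range(max_rounds):
        mcmc_solution = run_mcmc(graph_solution)
        if is_better(
            mcmc_solution["rmsd"], mcmc_solution["matches"],
            graph_solution["rmsd"], graph_solution["matches"],
            mcmc_solution.get("p"), graph_solution.get("p"),
        ):
            return mcmc_solution
    return graph_solution
```

If no MCMC run beats the graph solution within the cap, the graph solution is returned unchanged.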
Results
Each study was performed with and without considering the physico-chemical properties of the amino acids.
Case 1: Site 1hdx_1 matched against its own SCOP family
125 of 145 sites produced significant matches, rising to 131 of 145 after refinement
RMSD improved

Case 2: 17-β hydroxysteroid dehydrogenase and its family
After the MCMC refinement step, significant matches increased from 248 to 318 of 326 sites
Increased number of matching residues at a similar RMSD
RMSD improved in a minority of the sites
Matching sites increased from 200 to 324
Case 4: Alcohol dehydrogenase and FAD/NAD(P)-binding domain
12 sites improved after MCMC refinement
Computationally expensive