supplementary material 1 - amazon s3 · 2015-12-04 · supplementary material 3: haddock: haddock...

Supplementary Material 1:

UniProt: The UniProt (Universal Protein Resource) database (http://www.uniprot.org/) provides a free, comprehensive online resource of classified and annotated protein sequences [1]. It is developed by the UniProt Consortium, which comprises groups from the European Bioinformatics Institute (EBI), the SIB Swiss Institute of Bioinformatics (SIB) and the Protein Information Resource (PIR). The database is updated every month, and sequences are submitted in FASTA format.

PDBsum: PDBsum is a graphical database (http://www.ebi.ac.uk/pdbsum/) that provides pictorial information in both 2D and 3D formats [2]. It also reports protein chains, ligands, protein-protein interaction diagrams, and the numbers of helices, beta turns and gamma turns. In addition, it provides wiring diagrams and topology diagrams of the query protein, along with information about protein-protein interfaces and residue-residue interactions. It is developed by the European Bioinformatics Institute (EBI). More information about the server can be accessed from http://www.ebi.ac.uk/thornton-srv/databases/cgi-bin/pdbsum/GetPage.pl?pdbcode=n/a&template=doc_about.html.

ProtParam: ProtParam is a web application that calculates physico-chemical properties from an amino acid sequence [3]. The website can be accessed from http://web.expasy.org/protparam/. The properties computed are molecular weight, theoretical pI, amino acid composition, atomic composition, extinction coefficient, estimated half-life, instability index, aliphatic index and grand average of hydropathicity (GRAVY). The protein under investigation can be specified as an accession number or as a raw sequence. Documentation of the parameters can be accessed through http://web.expasy.org/protparam/protparam-doc.html.
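Two of the sequence-derived properties listed for ProtParam can be illustrated with a short sketch. The code below is not the ProtParam implementation; it assumes the standard Kyte-Doolittle hydropathy scale for GRAVY and the usual aliphatic index formula (mole percent of Ala, plus 2.9 times Val, plus 3.9 times Ile and Leu). The function names and the toy peptide are hypothetical.

```python
from collections import Counter

# Kyte-Doolittle hydropathy scale (one value per standard amino acid).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Grand average of hydropathicity: mean hydropathy value per residue."""
    seq = sequence.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(sequence: str) -> float:
    """Aliphatic index from the mole percentages of Ala, Val, Ile and Leu."""
    seq = sequence.upper()
    counts = Counter(seq)

    def mole_percent(aa: str) -> float:
        return 100.0 * counts[aa] / len(seq)

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


# Hypothetical toy peptide, not one of the proteins studied here.
toy_sequence = "MKVLAAGIVE"
print(round(gravy(toy_sequence), 3), round(aliphatic_index(toy_sequence), 1))
```

A positive GRAVY value indicates an overall hydrophobic sequence, and a higher aliphatic index reflects a larger relative volume of aliphatic side chains.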

Pfam: Pfam is a comprehensive database (http://pfam.xfam.org/) of protein domains and families, developed by the Wellcome Trust Sanger Institute, UK; University of Helsinki, Finland; University of Oxford, UK; Stockholm Bioinformatics Centre, Sweden; and Janelia Farm Research Campus, USA [4]. The current release notes can be accessed from ftp://ftp.sanger.ac.uk/pub/databases/Pfam/current_release/relnotes.txt. It uses UniProt as its reference sequence database. Pfam relies on algorithms such as jackhmmer [5] and Hidden Markov Models [6] to produce its results. Domain and family identification is done semi-automatically based on expert knowledge, sequence similarity, other protein family databases and the ability of profile HMMs to correctly identify and align the sequences. The data are constantly shared with other databases such as Structural Classification of Proteins (SCOP) [7] and the CATH protein structure classification [8].

InterProScan: InterProScan (http://www.ebi.ac.uk/interpro/) is another resource that provides broad information about protein domains and families [9]. The results obtained from Pfam were cross-checked against the results of InterProScan. Sequences (amino acid or nucleic acid) submitted to InterProScan are matched against the signatures from several different databases. Sequences are submitted in FASTA format. More information can be accessed from http://www.ebi.ac.uk/interpro/about.html.

Supplementary Material 2:

I-TASSER: Iterative Threading ASSEmbly Refinement (I-TASSER) (http://zhanglab.ccmb.med.umich.edu/I-TASSER/) is an algorithm for predicting three-dimensional protein structure from amino acid sequences [10]. It identifies structure templates from the Protein Data Bank by fold recognition. Full-length structure models are created by excising continuous fragments from the threading alignments and reassembling them using replica-exchange Monte Carlo simulations [11]. Amino acid sequences are submitted as input by the users. More information about I-TASSER can be accessed from http://zhanglab.ccmb.med.umich.edu/I-TASSER/about.html. The I-TASSER server was ranked the No. 1 server in the Critical Assessment of Techniques for Protein Structure Prediction (CASP) experiments 7, 8, 9 and 10. I-TASSER also uses LOMETS V3.0 for protein structure prediction; its documentation can be accessed from http://zhanglab.ccmb.med.umich.edu/LOMETS/readme.txt. The best model for energy minimization was chosen based on the maximum C-score and the maximum number of decoys. The C-score is a confidence score for estimating the quality of models predicted by I-TASSER. It is calculated from the significance of the threading template alignments and the convergence parameters of the structure assembly simulations. The C-score is typically in the range [-5, 2], where a higher C-score signifies a model with higher confidence and vice versa. A higher cluster density means the structure occurs more often in the simulation trajectory and therefore signifies a better-quality model.

Discovery Studio: Discovery Studio is a client-server software suite developed and distributed by Accelrys (http://accelrys.com/products/discovery-studio/). It is a well-known collection of algorithms for computational chemistry, computational biology, cheminformatics, molecular simulations and quantum mechanics. It incorporates programs such as CHARMM [12], MODELLER [13], DELPHI [14] and ZDOCK [15]. All thirty peptides were prepared using the Build and Edit Protein module of Discovery Studio 3.1, in which the build action was used to create and grow chains of amino acids as desired. The generated peptides were minimized with the CHARMM force field, using a spherical cutoff for electrostatics and the Smart Minimizer algorithm with a maximum of 200 steps.

Swiss-PDB Viewer: Swiss-PDB Viewer is an application for analyzing H-bonds, angles and distances between atoms in proteins and for energy-minimizing structures; it can be accessed from http://spdbv.vital-it.ch/. The three-dimensional models generated by the I-TASSER web application were further subjected to energy minimization using the steepest descent technique to eliminate bad contacts between protein atoms. Swiss-PDB Viewer uses the GROMOS 43B1 force field [16], which is mainly used to repair distorted geometries by removing internal constraints. Energy minimization preferences were set to 1000 steps of steepest descent, while the cutoff value was set to 0.500 Å. The delta E cutoff value was maintained at 0.030 kJ/mol and the force acting on any atom was set to the default value of 10.000. The energy minimization module in the Tools tab was used to start the process. The minimized model was selected for molecular dynamics simulation studies.
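The stopping logic described above (a fixed maximum number of steepest descent steps plus an energy-change cutoff) can be sketched on a toy function. This is an illustration of the general technique only, not the Swiss-PDB Viewer or GROMOS implementation; the harmonic "energy", its gradient and the step size are hypothetical, while the 1000-step limit and the 0.030 cutoff mirror the settings quoted in the text.

```python
def steepest_descent(energy, gradient, x0, step=0.002,
                     max_steps=1000, delta_e_cutoff=0.030):
    """Minimize `energy` by repeatedly stepping against its gradient.

    Stops after `max_steps`, or earlier once the energy change between
    two consecutive steps drops below `delta_e_cutoff`.
    Returns (coordinates, final energy, number of steps taken).
    """
    x = list(x0)
    e_prev = energy(x)
    for n in range(1, max_steps + 1):
        g = gradient(x)
        x = [xi - step * gi for xi, gi in zip(x, g)]
        e_new = energy(x)
        if abs(e_prev - e_new) < delta_e_cutoff:
            return x, e_new, n
        e_prev = e_new
    return x, e_prev, max_steps


# Hypothetical toy "energy": a harmonic well with its minimum at (1, -2).
def toy_energy(x):
    return 50.0 * ((x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2)


def toy_gradient(x):
    return [100.0 * (x[0] - 1.0), 100.0 * (x[1] + 2.0)]


coords, final_energy, steps_used = steepest_descent(
    toy_energy, toy_gradient, [3.0, 0.0])
print(coords, final_energy, steps_used)
```

In a real force field the energy surface is far from harmonic, so the energy-change cutoff typically ends the run well before the minimum is reached exactly; the step limit is a safeguard for structures that converge slowly.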

GROMACS: The GROMACS 4.5.4 package [17] and the Amber99sb-ILDN force field [18] were used to examine the stability of the modeled proteins. The protein models were solvated with the SPC/E water model in a triclinic box extending 0.9 nm from the molecule to the edge of the box. Periodic boundary conditions were applied in all directions and the total charge was adjusted to zero. A maximum of 50,000 energy minimization steps was carried out for the protein models using a steepest descent algorithm with a tolerance of 1000 kJ mol-1 nm-1. Subsequently, 50,000 steps of a conjugate gradient algorithm were used to minimize the protein models with a tolerance of 1000 kJ mol-1 nm-1. The solvated and minimized system was considered reasonable in terms of geometry and solvent orientation and was used for the further simulation steps. All bond angles were controlled with the LINCS algorithm [19], while the SETTLE algorithm [20] was used to constrain the geometry of the water molecules. Temperature was maintained (300 K) by the V-rescale weak coupling method, while the Parrinello-Rahman method [21] was used to preserve the pressure (1 atm) of the system. Position-restrained (PR) MD for both NVT (constant number of particles, volume and temperature) and NPT (constant number of particles, pressure and temperature) was carried out for 100 ps. This pre-equilibrated system was then used in the 3000 ps (3 ns) production molecular dynamics simulation (MDS) with a time-step of 2 fs. Structural coordinates were saved every 2 ps and analyzed using the analysis tools in the GROMACS package. The lowest potential energy conformations were selected from the 3 ns MDS trajectory for further protein-protein and protein-peptide interaction studies. The refined models were validated using the Structural Analysis and Verification Server (SAVES). The above protocol was also used for molecular dynamics simulation studies of the protein-peptide-protein complexes, which also supported the stability of the designed peptides. More information can be accessed from http://www.gromacs.org/.
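The run lengths quoted in this protocol translate into step and frame counts by simple arithmetic, sketched below. The mapping onto GROMACS .mdp option names is an assumption made for illustration; the authors' actual parameter files are not part of this supplement, and the sketch assumes the 100 ps applies to each of the NVT and NPT runs.

```python
# Values taken from the protocol described above.
timestep_fs = 2          # integration time-step
production_ps = 3000     # production run length (3 ns)
save_interval_ps = 2     # coordinates written every 2 ps
equilibration_ps = 100   # assumed to apply to each of the NVT and NPT runs

FS_PER_PS = 1000

production_steps = production_ps * FS_PER_PS // timestep_fs
steps_between_frames = save_interval_ps * FS_PER_PS // timestep_fs
saved_frames = production_steps // steps_between_frames
equilibration_steps = equilibration_ps * FS_PER_PS // timestep_fs

# Assumed mapping onto .mdp option names (illustrative, not the authors' file).
mdp_sketch = {
    "integrator": "md",
    "dt": timestep_fs / FS_PER_PS,   # time-step in ps
    "nsteps": production_steps,
    "nstxout": steps_between_frames,
    "tcoupl": "V-rescale",
    "ref_t": 300,                    # K
    "pcoupl": "Parrinello-Rahman",
    "constraint_algorithm": "lincs",
}

for key, value in mdp_sketch.items():
    print(f"{key:22s} = {value}")
print("saved frames:", saved_frames)
print("equilibration steps per run:", equilibration_steps)
```

Under these settings the 3 ns production run corresponds to 1,500,000 integration steps, with one frame written every 1000 steps.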

SAVES: The Structural Analysis and Verification Server (SAVES) is a protein structure validation server (http://nihserver.mbi.ucla.edu/SAVES/). It combines several programs, such as PROCHECK [22], ERRAT [23] and VERIFY_3D [24], to assess a structure. PROCHECK checks the stereochemical quality of a protein structure and its overall geometry. ERRAT analyzes the non-bonded interactions between different atom types, while VERIFY_3D checks the compatibility of a 3D model with its amino acid sequence. The proteins under investigation were submitted as pdb files for structure validation and verification. The parameters and workings of the SAVES server can be accessed from http://nihserver.mbi.ucla.edu/SAVES/Info.php.

Supplementary Material 3:

HADDOCK: HADDOCK is a user-friendly and popular web server that has been used extensively for biomolecular docking [25]. It is one of the few docking platforms that explicitly take flexibility into account in both the side chains and the backbone of the proteins. It uses both NMR and non-NMR experimental information to guide the docking process. The HADDOCK web server (http://haddock.science.uu.nl/services/HADDOCK/haddock.php) offers different levels of service to users, which can be accessed through a simple registration process. The pdb files of the proteins under consideration were submitted as input and the domain regions were provided as active site residues. The default parameters of the web server were used for both protein-protein and protein-peptide docking; they can be accessed from http://haddock.science.uu.nl/services/HADDOCK/settings.html.

LIGPLOT: LIGPLOT is a program that generates schematic diagrams of protein-ligand and protein-protein interactions [26]; it can be accessed from https://www.ebi.ac.uk/thornton-srv/software/LIGPLOT/. Hydrogen bond formation between the proteins under investigation was analysed using the DIMPLOT module of LIGPLOT. The maximum H-A distance for hydrogen bond formation was kept at 2.70 Å and the maximum D-A distance at 3.35 Å, where H = hydrogen, A = acceptor and D = donor.
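The two distance cutoffs above can be expressed as a simple geometric test. The sketch below applies only those two criteria and is not the DIMPLOT implementation, which may apply further checks not described here; the function name and the example coordinates are hypothetical.

```python
import math

MAX_H_A = 2.70  # Å, maximum hydrogen-acceptor distance quoted above
MAX_D_A = 3.35  # Å, maximum donor-acceptor distance quoted above


def is_hydrogen_bond(donor, hydrogen, acceptor):
    """Return True if both distance criteria are satisfied.

    Each argument is an (x, y, z) coordinate tuple in Å.
    """
    h_a = math.dist(hydrogen, acceptor)
    d_a = math.dist(donor, acceptor)
    return h_a <= MAX_H_A and d_a <= MAX_D_A


# Hypothetical coordinates: a donor, its hydrogen, and two candidate acceptors.
donor = (0.0, 0.0, 0.0)
hydrogen = (1.0, 0.0, 0.0)
near_acceptor = (2.9, 0.0, 0.0)   # H-A = 1.9 Å, D-A = 2.9 Å
far_acceptor = (4.0, 0.0, 0.0)    # H-A = 3.0 Å, D-A = 4.0 Å

print(is_hydrogen_bond(donor, hydrogen, near_acceptor))
print(is_hydrogen_bond(donor, hydrogen, far_acceptor))
```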

PROPKA: The PROPKA web server (http://propka.ki.ku.dk/) estimates the pKa values of amino acids as they exist within proteins. PROPKA 3.1 was also used to check the stability of the protein-peptide complex [27]. The pdb structure file was provided as input and the results were calculated using the default values of the server.

PISA: The Protein Interfaces, Surfaces and Assemblies (PISA) web server (http://www.ebi.ac.uk/pdbe/pisa/pistart.html) was used for salt bridge analysis [28]. The number of salt bridges was then used to assess the likely stability of the interface. PISA applies a distance cutoff of 4 Å for a salt bridge to form. The pdb structure files of the proteins under consideration were submitted as input. Further details can be found at http://www.ebi.ac.uk/msd-srv/prot_int/pistart.html. The European Bioinformatics Institute (EBI) is responsible for maintaining the web server.
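The 4 Å criterion can be sketched as a filter over residue pairs. This is an illustrative simplification rather than the PISA algorithm: it assumes a salt bridge is any acidic/basic residue pair (ASP or GLU with ARG or LYS, the residue types that appear in Table 10) within the cutoff. The first three pairs are taken from complex A in Table 10; the last one is a hypothetical pair added to show a rejection.

```python
SALT_BRIDGE_CUTOFF = 4.0  # Å, distance quoted above

ACIDIC = {"ASP", "GLU"}
BASIC = {"ARG", "LYS"}


def is_salt_bridge(residue_a, residue_b, distance):
    """True for an oppositely charged residue pair within the cutoff."""
    opposite_charges = (
        (residue_a in ACIDIC and residue_b in BASIC)
        or (residue_a in BASIC and residue_b in ACIDIC)
    )
    return opposite_charges and distance <= SALT_BRIDGE_CUTOFF


pairs = [
    ("ARG", "ASP", 2.77),  # A:ARG 13  - B:ASP 122 (Table 10, complex A)
    ("GLU", "ARG", 3.90),  # A:GLU 149 - B:ARG 120 (Table 10, complex A)
    ("ASP", "LYS", 2.60),  # A:ASP 50  - B:LYS 173 (Table 10, complex A)
    ("ASP", "LYS", 4.60),  # hypothetical pair beyond the cutoff
]

salt_bridges = [p for p in pairs if is_salt_bridge(*p)]
print(len(salt_bridges), "of", len(pairs), "pairs pass the cutoff")
```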

DrugScorePPI: DrugScorePPI is a knowledge-based web server for computational alanine scanning in protein-protein interfaces [29]. It uses a QSAR approach with respect to experimental binding free energy differences between wild-type proteins and alanine mutants for protein-protein complex formation. The server automatically scans for the interface residues of a given biomolecular complex; it can be accessed from http://cpclab.uni-duesseldorf.de/dsppi/. It first calculates ΔGWT (wild type), then mutates one of the interface residues to alanine and calculates ΔGMUT (mutant), which allows ΔΔG (the change in binding free energy) to be calculated by subtracting ΔGWT from ΔGMUT. This procedure is repeated until ΔΔG has been calculated for all the interface residues. The input was provided as a pdb file.
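The per-residue loop described above reduces to ΔΔG = ΔGMUT - ΔGWT for each interface residue. The sketch below shows only that bookkeeping; it does not reproduce the DrugScorePPI scoring function, and the residue labels and energy values are hypothetical numbers chosen for illustration.

```python
def alanine_scan(dg_wild_type, dg_mutants):
    """Return ddG = dG_mut - dG_wt for every mutated interface residue.

    dg_wild_type: binding free energy of the wild-type complex.
    dg_mutants:   mapping of residue label -> binding free energy of the
                  complex with that residue mutated to alanine.
    """
    return {residue: dg_mut - dg_wild_type
            for residue, dg_mut in dg_mutants.items()}


# Hypothetical values in kcal/mol (not results from this study).
dg_wt = -12.0
dg_mut = {
    "A:ASP16": -10.5,
    "A:GLU149": -11.2,
    "A:SER20": -12.1,
}

ddg = alanine_scan(dg_wt, dg_mut)

# Residues whose mutation weakens binding the most come first.
ranked = sorted(ddg, key=ddg.get, reverse=True)
for residue in ranked:
    print(f"{residue:10s} ddG = {ddg[residue]:+.2f}")
```

With this sign convention, a positive ΔΔG means the alanine mutant binds less favourably than the wild type, so the residue contributes to the stability of the interface.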

References

1. Bairoch, A., et al., The Universal Protein Resource (UniProt). Nucleic Acids Res, 2005. 33(Database issue): p. D154-9.

2. Laskowski, R.A., V.V. Chistyakov, and J.M. Thornton, PDBsum more: new summaries and analyses of the known 3D structures of proteins and nucleic acids. Nucleic Acids Research, 2005. 33(suppl 1): p. D266-D268.

3. Gasteiger, E., et al., Protein identification and analysis tools on the ExPASy server, in The proteomics protocols handbook. 2005, Springer. p. 571-607.

4. Finn, R.D., et al., Pfam: clans, web tools and services. Nucleic Acids Research, 2006. 34(Database issue): p. D247-51.

5. Johnson, L.S., S.R. Eddy, and E. Portugaly, Hidden Markov model speed heuristic and iterative HMM search procedure. BMC bioinformatics, 2010. 11(1): p. 431.

6. Baum, L.E., et al., A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chains. The annals of mathematical statistics, 1970: p. 164-171.

7. Murzin, A.G., et al., SCOP: a structural classification of proteins database for the investigation of sequences and structures. Journal of molecular biology, 1995. 247(4): p. 536-540.

8. Pearl, F.M.G., et al., The CATH database: an extended protein family resource for structural and functional genomics. Nucleic acids research, 2003. 31(1): p. 452-455.

9. Quevillon, E., et al., InterProScan: protein domains identifier. Nucleic Acids Res, 2005. 33(Web Server issue): p. W116-20.

10. Roy, A., A. Kucukural, and Y. Zhang, I-TASSER: a unified platform for automated protein structure and function prediction. Nature protocols, 2010. 5(4): p. 725-738.

11. Andrieu, C., et al., An introduction to MCMC for machine learning. Machine learning, 2003. 50(1-2): p. 5-43.

12. Brooks, B.R., et al., CHARMM: the biomolecular simulation program. Journal of computational chemistry, 2009. 30(10): p. 1545-1614.

13. Eswar, N., et al., Comparative protein structure modeling using Modeller. Current protocols in bioinformatics, 2006: p. 5.6. 1-5.6. 30.

14. Rocchia, W., E. Alexov, and B. Honig, Extending the applicability of the nonlinear Poisson-Boltzmann equation: Multiple dielectric constants and multivalent ions. The Journal of Physical Chemistry B, 2001. 105(28): p. 6507-6514.

15. Chen, R., L. Li, and Z. Weng, ZDOCK: An initial-stage protein-docking algorithm. Proteins: Structure, Function, and Bioinformatics, 2003. 52(1): p. 80-87.

16. Scott, W.R., et al., The GROMOS biomolecular simulation program package. The Journal of Physical Chemistry A, 1999. 103(19): p. 3596-3607.

17. Pronk, S., et al., GROMACS 4.5: a high-throughput and highly parallel open source molecular simulation toolkit. Bioinformatics, 2013. 29(7): p. 845-854.

18. Lindorff-Larsen, K., et al., Improved side-chain torsion potentials for the Amber ff99SB protein force field. Proteins-Structure Function and Bioinformatics, 2010. 78(8): p. 1950-1958.

19. Hess, B., et al., LINCS: a linear constraint solver for molecular simulations. Journal of computational chemistry, 1997. 18(12): p. 1463-1472.

20. Miyamoto, S. and P.A. Kollman, SETTLE: an analytical version of the SHAKE and RATTLE algorithm for rigid water models. Journal of computational chemistry, 1992. 13(8): p. 952-962.

21. Martoňák, R., A. Laio, and M. Parrinello, Predicting crystal structures: the Parrinello-Rahman method revisited. Physical review letters, 2003. 90(7): p. 075503.

22. Laskowski, R.A., et al., PROCHECK: a program to check the stereochemical quality of protein structures. Journal of applied crystallography, 1993. 26(2): p. 283-291.

23. Colovos, C. and T.O. Yeates, Verification of protein structures: patterns of nonbonded atomic interactions. Protein Science, 1993. 2(9): p. 1511-1519.

24. Bowie, J.U., R. Luthy, and D. Eisenberg, A method to identify protein sequences that fold into a known three-dimensional structure. Science, 1991. 253(5016): p. 164-170.

25. de Vries, S.J., M. van Dijk, and A.M. Bonvin, The HADDOCK web server for data-driven biomolecular docking. Nature Protocols, 2010. 5(5): p. 883-97.

26. Laskowski, R.A. and M.B. Swindells, LigPlot+: Multiple Ligand-Protein Interaction Diagrams for Drug Discovery. Journal of Chemical Information and Modeling, 2011. 51(10): p. 2778-2786.

27. Li, H., A.D. Robertson, and J.H. Jensen, Very fast empirical prediction and rationalization of protein pKa values. Proteins: Structure, Function, and Bioinformatics, 2005. 61(4): p. 704-721.

28. Krissinel, E. and K. Henrick, Inference of macromolecular assemblies from crystalline state. Journal of molecular biology, 2007. 372(3): p. 774-797.

29. Kruger, D.M. and H. Gohlke, DrugScorePPI webserver: fast and accurate in silico alanine scanning for scoring protein-protein interactions. Nucleic Acids Research, 2010. 38(Web Server issue): p. W480-6.

Supplementary Material 4:

TABLES:

Table 6: Domain function screening for the proteins under consideration

Tools in Use | AtPOT1b | AtTRB1 | AtTRB2 | AtTRB3
Pfam | POT1 domain (13-143) | Myb DNA binding domain (5-55); Linker Histone family (123-178) | Myb DNA binding domain (5-55); Histone H1/H5 domain (125-182) | Myb DNA binding domain (5-55)
InterProScan | Telo_bind domain (13-143) | SANT/Myb domain (5-55); Histone H1/H5 domain (123-182) | SANT/Myb domain (5-55); Histone H1/H5 domain (125-182) | SANT/Myb domain (5-55); Histone H1/H5 domain (122-180)

Table 7: List of top ten templates used by I-TASSER for three-dimensional (3D) structure prediction

Protein Name | Templates
POT1b | 2i0qA, 1jb7A, 1xjvA, 3kjpA
TRB1 | 2osxA, 2lsoA, 1hstA, 4fsxA, 2juhA, 1h89A, 1hstA, 1h88C, 3hfwA
TRB2 | 4fxgB, 2lsoA, 1hstA, 4fsxA, 2juhA, 1x58A, 1h88C, 4fxgB
TRB3 | 4fxgB, 1hstA, 1zrtD, 2juhA, 1x58A, 1h88C, 4fxgB, 2lsoA

Table 8: I-TASSER scores to identify the best model generated

Protein Name | I-TASSER Model | C-score | No. of decoys | Cluster density
POT1b | Model 1* | -0.31 | 2089 | 0.1410
POT1b | Model 2 | -2.20 | 314 | 0.0212
POT1b | Model 3 | -1.00 | 1049 | 0.0708
POT1b | Model 4 | -3.27 | 108 | 0.0073
POT1b | Model 5 | -3.69 | 71 | 0.0048
TRB1 | Model 1* | -3.24 | 703 | 0.0242
TRB1 | Model 2 | -3.37 | 621 | 0.0214
TRB1 | Model 3 | -3.66 | 461 | 0.0159
TRB1 | Model 4 | -3.98 | 335 | 0.0115
TRB1 | Model 5 | -4.19 | 272 | 0.0094
TRB2 | Model 1* | -2.44 | 2315 | 0.0533
TRB2 | Model 2 | -4.67 | 251 | 0.0058
TRB2 | Model 3 | -4.73 | 235 | 0.0054
TRB2 | Model 4 | -4.90 | 199 | 0.0046
TRB2 | Model 5 | -5.00 | 175 | 0.0040
TRB3 | Model 1* | -2.20 | 2336 | 0.0696
TRB3 | Model 2 | -3.54 | 611 | 0.0182
TRB3 | Model 3 | -4.69 | 193 | 0.0058
TRB3 | Model 4 | -4.82 | 169 | 0.0050
TRB3 | Model 5 | -4.86 | 162 | 0.0048

* represents the best models generated by the I-TASSER server.

The number of decoys for the best models ranged from 703 to 2336, as shown in Table 2. The template modelling score (TM-score) was used to assess the structural similarity between the models and templates. The TM-scores for the best models were 0.67±0.13, 0.35±0.12, 0.43±0.14 and 0.45±0.15 for the proteins AtPOT1b, AtTRB1, AtTRB2 and AtTRB3, respectively. The number of decoys is directly proportional to the cluster density: the more decoys, the higher the density, which indirectly reflects the stability of the structure. Likewise, the higher (less negative) the C-score, the larger the number of decoys, which also supports the choice of the best structure. Based on this reasoning, the best models were identified and taken forward for energy minimization.
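The selection rule can be checked directly against the values in Table 8. The short sketch below picks the model with the highest C-score for each protein and tests whether ranking by C-score, by number of decoys and by cluster density gives the same order; the data are copied from Table 8, while the helper names are hypothetical.

```python
# (C-score, number of decoys, cluster density) for Models 1-5, from Table 8.
TABLE_8 = {
    "POT1b": [(-0.31, 2089, 0.1410), (-2.20, 314, 0.0212),
              (-1.00, 1049, 0.0708), (-3.27, 108, 0.0073),
              (-3.69, 71, 0.0048)],
    "TRB1": [(-3.24, 703, 0.0242), (-3.37, 621, 0.0214),
             (-3.66, 461, 0.0159), (-3.98, 335, 0.0115),
             (-4.19, 272, 0.0094)],
    "TRB2": [(-2.44, 2315, 0.0533), (-4.67, 251, 0.0058),
             (-4.73, 235, 0.0054), (-4.90, 199, 0.0046),
             (-5.00, 175, 0.0040)],
    "TRB3": [(-2.20, 2336, 0.0696), (-3.54, 611, 0.0182),
             (-4.69, 193, 0.0058), (-4.82, 169, 0.0050),
             (-4.86, 162, 0.0048)],
}


def best_model(models):
    """Return (model number, C-score, decoys, density) with the highest C-score."""
    index = max(range(len(models)), key=lambda i: models[i][0])
    return (index + 1, *models[index])


def rankings_agree(models):
    """True if C-score, decoys and cluster density rank the models identically."""
    def order(column):
        return sorted(range(len(models)),
                      key=lambda i: models[i][column], reverse=True)
    return order(0) == order(1) == order(2)


for protein, models in TABLE_8.items():
    number, c_score, decoys, density = best_model(models)
    print(protein, f"Model {number}", c_score, decoys, density,
          rankings_agree(models))
```

For every protein the three criteria agree, so choosing by maximum C-score, maximum decoys or maximum cluster density selects the same model.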

Table 9: ProMotif results for all proteins from PDBsum server

Protein Name | No. of Sheets | No. of Beta Hairpins | No. of Psi loops | No. of Beta bulges | No. of strands | No. of helices | No. of Helix-Helix Interactions | No. of Beta turns | No. of Gamma turns
AtPOT1B | 7 | 7 | 1 | 5 | 19 | 7 | 1 | 60 | 10
AtTRB1 | None | None | None | None | None | 15 | 27 | 33 | 11
AtTRB2 | 1 | 1 | None | None | 2 | 13 | 18 | 28 | 8
AtTRB3 | 1 | 1 | None | 1 | 2 | 13 | 12 | 31 | 7

Table 10: Salt Bridge Interactions of the three complexes as detected through PISA

A: AtPOT1b Dist. (Å) AtTRB1   B: AtPOT1b Dist. (Å) AtTRB2   C: AtPOT1b Dist. (Å) AtTRB3

A:ARG 13 2.77 B:ASP 122 A:LYS 10 2.88 B:GLU 153 A:ASP 16 3.64 B:LYS 135

A:ARG 282 3.72 B:ASP 149 A:LYS 10 2.79 B:GLU 254 A:ASP 16 2.65 B:ARG 136

A:ARG 282 3.03 B:ASP 149 A:ASP 8 3.81 B:ARG 293 A:ASP 16 3.85 B:ARG 136

A:GLU 149 3.9 B:ARG 120 A:ASP 8 3.77 B:ARG 293 A:ASP 50 2.65 B:LYS 135

A:GLU 149 2.99 B:ARG 120 A:ASP 8 2.67 B:ARG 293 A:ASP 50 2.82 B:LYS 135

A:ASP 116 2.71 B:ARG 159 A:ASP 16 2.67 B:ARG 163 A:GLU 149 2.73 B:ARG 161

A:ASP 116 3.26 B:ARG 159 A:ASP 16 3.56 B:ARG 163 A:GLU 149 3.55 B:ARG 161

A:ASP 116 3.67 B:ARG 159 A:ASP 16 3.34 B:ARG 163 A:GLU 149 3.5 B:ARG 161

A:ASP 116 2.62 B:ARG 159 A:ASP 16 2.67 B:ARG 163 A:GLU 149 2.7 B:ARG 161

A:GLU 141 2.78 B:LYS 164 A:GLU 141 2.6 B:LYS 127

A:GLU 141 2.71 B:LYS 164

A:GLU 130 2.59 B:LYS 166

A:ASP 50 2.6 B:LYS 173

A:ASP 16 2.67 B:LYS 176

A:ASP 16 2.76 B:LYS 176

Table 11: Propensity of important interacting residues between AtPOT1b and AtTRB1-3

AtPOT1b (Total no. of amino acids) (Chain A) | AtTRB1-3 (Total no. of amino acids) (Chain B)
6 Arginine | 3 Aspartic acid / 1 Glycine / 1 Threonine / 1 Lysine
6 Asparagine | 2 Arginine / 1 Asparagine / 1 Leucine / 1 Aspartic acid / 1 Serine
10 Glutamic acid | 4 Arginine / 2 Lysine / 1 Serine / 1 Tryptophan / 1 Tyrosine / 1 Glutamine
4 Serine | 1 Serine / 1 Aspartic acid / 1 Arginine / 1 Asparagine
2 Tryptophan | 1 Threonine / 1 Aspartic acid
14 Aspartic acid | 5 Lysine / 7 Arginine / 2 Asparagine

Numbers denote how many times each amino acid is involved in hydrogen bond formation.

FIGURE LEGENDS:

Figure 8: Two- and three-dimensional structures of the best I-TASSER models after 3 ns simulation. The 2D figures were generated with the PDBsum server, while the 3D figures were prepared using Chimera software. A, E represent AtPOT1b; B, F represent AtTRB1; C, G represent AtTRB2; D, H represent AtTRB3. Oval dashed lines mark the specific regions of the proteins that interact with each other for all three proteins, while the positions of the interacting residues can be inferred from the 2D figures.

Figure 9 A-D: Ramachandran plots of the four proteins as produced by the PROCHECK server. AtPOT1b, AtTRB1, AtTRB2 and AtTRB3 are represented by panels A, B, C and D, respectively. Most favored regions are colored red, additional allowed regions yellow, generously allowed regions light yellow and disallowed regions white.

Figure 10 A: DIMPLOT result of the AtPOT1b-AtTRB1 interaction. Blue labels represent interacting residues of AtPOT1b, while red labels represent those of AtTRB1.

Figure 10 B: DIMPLOT result of the AtPOT1b-AtTRB2 interaction. Blue labels represent interacting residues of AtPOT1b, while red labels represent those of AtTRB2.

Figure 10 C: DIMPLOT result of the AtPOT1b-AtTRB3 interaction. Blue labels represent interacting residues of AtPOT1b, while red labels represent those of AtTRB3.

Figure 11 A-B: PROPKA results of AtPOT1b with and without peptide. 'A' represents AtPOT1b without peptide, while 'B' represents AtPOT1b with peptide. Unbound AtPOT1b requires very high energy for stability (23.5 kcal mol-1), while peptide-bound AtPOT1b requires very low energy for stability (-44.4 kcal mol-1), suggesting that the binding of the peptide to AtPOT1b is highly stable.

Figure 12: Distribution of amino acids of the Protection of Telomeres 1 (POT1) protein in different organisms. The amino acid frequencies of the POT1 protein show that leucine is the most abundant residue in all organisms.

Figure 13: Distribution of amino acids of the Telomerase Reverse Transcriptase (TERT) protein in different organisms. The amino acid frequencies of the TERT protein show that leucine is the most abundant residue in all organisms.
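The percentage composition reported for Figures 12 and 13 can be computed from a sequence with a simple residue count. The sketch below is a generic illustration and not the tool used by the authors; the toy sequence is hypothetical. The residue order follows the column order used in the Figure 12 and 13 tables (Ala, Cys, Asp, ..., Tyr).

```python
from collections import Counter

# One-letter codes in the column order used for Figures 12 and 13.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition_percent(sequence):
    """Return the percentage of each standard amino acid in `sequence`."""
    seq = sequence.upper()
    counts = Counter(seq)
    return {aa: 100.0 * counts[aa] / len(seq) for aa in AMINO_ACIDS}


# Hypothetical toy sequence, not a POT1 or TERT sequence.
toy_sequence = "MLLKALLE"
composition = composition_percent(toy_sequence)
most_abundant = max(composition, key=composition.get)
print(most_abundant, composition[most_abundant])
```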

FIGURES:

Figure 8:

Figure 9:

Figure 10 A:

Figure 10 B:

Figure 10 C:

Figure 11:

Figure 12:

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Total

Arabidopsis_thaliana 4.185 3.5242 5.5066 6.6079 5.9471 4.4053 2.4229 6.8282 6.3877 9.0308 3.0837 4.185 4.6256 2.8634 6.8282 7.9295 4.185 6.8282 1.9824 2.6432 454

Arabidopsis_lyrata 4.4444 3.5556 4.8889 6.2222 5.5556 4.4444 2.8889 6.8889 6.2222 9.3333 2.8889 4.4444 4.4444 3.3333 6 7.7778 4.6667 7.3333 2 2.6667 450

Olimarabidopsis_pumila 4.3796 3.4063 5.1095 5.3528 5.8394 5.1095 2.6764 6.8127 6.326 9.7324 2.4331 3.6496 4.8662 3.4063 6.5693 8.2725 3.8929 7.5426 1.7032 2.9197 411

Lepidium_alyssoides 4.3796 3.4063 5.3528 5.5961 5.8394 4.8662 2.4331 6.8127 6.0827 9.7324 2.4331 4.1363 4.8662 3.4063 6.8127 7.7859 3.8929 7.5426 1.7032 2.9197 411

Brassica_oleracea 4.8889 3.1111 6.2222 5.3333 7.1111 4.6667 2.2222 6.6667 6 8.8889 2.2222 4.6667 5.3333 2.8889 6.2222 8.4444 4.6667 6.4444 1.7778 2.2222 450

Neslia_paniculata 4.6569 3.6765 4.1667 6.1275 5.1471 4.4118 2.9412 6.6176 6.1275 10.539 2.6961 4.4118 4.902 3.4314 6.1275 7.598 4.4118 7.8431 1.4706 2.6961 408

Boechera_platysperma 4.3902 3.1707 4.3902 6.3415 6.0976 5.3659 2.9268 7.0732 6.8293 9.7561 2.1951 3.9024 4.878 3.4146 6.3415 7.561 4.3902 6.8293 1.4634 2.6829 410

Cardaminopsis_arenosa 4.8469 3.3163 4.0816 5.8673 5.8673 5.3571 2.8061 6.3776 6.8878 11.48 2.2959 3.8265 4.5918 2.8061 5.8673 8.9286 3.8265 7.1429 1.2755 2.551 392

Arabidopsis_neglecta 5.102 2.8061 4.3367 5.6122 6.1224 5.3571 3.0612 6.3776 6.8878 11.48 2.2959 3.8265 4.5918 2.8061 6.1224 8.4184 3.5714 7.398 1.2755 2.551 392

Turritis_glabra 4.6036 3.5806 4.6036 5.6266 5.8824 5.6266 3.0691 6.6496 6.9054 10.742 2.046 3.3248 4.8593 2.3018 6.3939 7.1611 4.3478 8.1841 1.2788 2.8133 391

Cardamine_pulchella 3.8929 3.4063 4.6229 7.2993 5.8394 5.8394 2.4331 5.8394 6.0827 10.706 1.9465 4.1363 5.1095 3.4063 6.326 7.2993 5.1095 7.056 1.2165 2.4331 411

Pachycladon_stellatum 4.6341 3.4146 5.122 6.0976 6.8293 5.6098 2.9268 7.3171 7.0732 10 1.4634 2.9268 5.3659 3.1707 5.3659 6.5854 4.6341 6.5854 1.9512 2.9268 410

Lepidium_draba 4.6154 3.3333 4.8718 5.641 6.9231 5.641 3.3333 6.6667 6.6667 9.7436 2.0513 2.5641 4.6154 2.8205 6.1538 8.7179 4.359 7.1795 1.5385 2.5641 390

Matthiola_integrifolia 5.102 3.5714 5.102 5.6122 7.1429 4.8469 3.3163 6.8878 7.1429 9.949 1.5306 3.3163 4.5918 3.3163 6.3776 7.6531 4.5918 6.3776 1.5306 2.0408 392

Euclidium_syriacum 4.6036 3.3248 5.8824 4.8593 7.4169 5.1151 3.3248 6.3939 7.6726 8.6957 2.046 4.6036 4.6036 2.8133 5.3708 7.1611 5.3708 7.6726 1.5345 1.5345 391

Avg. 4.5757 3.375 4.9651 5.89 6.2307 5.0949 2.8395 6.6851 6.6039 9.9627 2.2554 3.878 4.8191 3.0829 6.1983 7.8209 4.3972 7.1881 1.5901 2.5475 410.87

Figure 13:

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Total

Homo_sapiens 8.7456 2.5618 3.0035 3.9753 4.1519 6.6254 3.0035 2.0318 3.5336 12.986 1.0601 1.8551 7.6855 4.1519 11.042 6.6254 5.1237 7.7739 1.5901 2.4735 1132

Mus_musculus 5.8824 3.1194 3.2086 3.4759 4.902 4.8128 2.9412 3.1194 4.6346 13.369 2.139 2.7629 6.0606 5.4367 8.6453 8.7344 5.3476 6.8627 1.426 3.1194 1122

Rattus_norvegicus 6.4889 3.0222 3.1111 3.2889 4.9778 5.6889 2.8444 2.8444 5.1556 13.067 1.9556 2.6667 6.4 5.1556 7.9111 8.6222 5.4222 7.0222 1.4222 2.9333 1125

Arabidopsis_thaliana 2.7605 3.2057 4.9866 4.3633 4.8085 4.0071 3.2057 5.4319 8.1033 10.864 1.6919 5.0757 4.1852 4.1852 7.3909 10.062 3.9181 6.5895 1.6028 3.5619 1123

Oryza_sativa 4.9245 4.7657 4.448 3.4948 4.2097 4.6863 3.2566 5.8777 7.3074 9.2931 1.9063 5.4011 3.6537 3.4154 7.1485 11.676 3.5743 5.56 1.1914 4.2097 1259

Tetrahymena_thermophila 1.5219 1.4324 4.3868 6.0877 7.0725 2.7753 0.8953 9.3107 11.638 10.027 1.6115 9.8478 2.1486 9.3107 2.6858 5.6401 3.6705 4.2077 0.6267 5.103 1117

Bos_taurus 10.756 2.7556 2.9333 3.5556 4.1778 8.2667 2.6667 1.3333 2.8444 13.6 0.8889 1.8667 7.7333 4.5333 12.089 5.6889 3.7333 7.2889 1.1556 2.1333 1125

Canis_familiaris 10.508 3.0276 2.7605 3.3838 4.0962 6.9457 3.0276 2.0481 3.1167 13.802 1.1576 2.1371 7.6581 4.3633 10.686 6.0552 4.3633 6.9457 1.2467 2.6714 1123

Oxytricha_trifallax 3.6219 1.8551 3.9753 6.0954 7.9505 3.1802 1.4134 7.6855 12.014 8.8339 3.0035 9.4523 2.3852 6.3604 3.4452 5.3004 4.1519 4.2403 0.7951 4.2403 1132

Avg. 6.1221 2.8856 3.6557 4.1821 5.1375 5.2154 2.5931 4.4258 6.4925 11.727 1.7157 4.572 5.3032 5.1862 7.8865 7.6526 4.3576 6.2683 1.2283 3.3925 1139.8