Shape determination of proteins in solution using high throughput computing Donna Lammie Structural Biophysics Group School of Optometry and Vision Sciences Cardiff University

Post on 22-Feb-2016

Page 1: Shape determination of proteins in solution using high throughput computing Donna Lammie

Shape determination of proteins in solution using high throughput computing

Donna Lammie

Structural Biophysics Group

School of Optometry and Vision Sciences

Cardiff University

Page 2: Shape determination of proteins in solution using high throughput computing Donna Lammie

Outline of Talk

Setting the scene

Data collection

Data analysis software

High throughput computing through Condor

Page 3: Shape determination of proteins in solution using high throughput computing Donna Lammie

Proteins are large organic compounds made of amino acids arranged in a linear chain and joined together by peptide bonds.

Each protein has a unique, genetically defined amino acid sequence which determines its specific shape and function.

Proteins can work together to achieve a particular function, and they often associate to form stable complexes.

Page 4: Shape determination of proteins in solution using high throughput computing Donna Lammie

Roles and Functions

Enzymes

Structural

Hormones

Immunoglobulins

Involved in oxygen transport

Muscle contraction

Cell signalling

Page 5: Shape determination of proteins in solution using high throughput computing Donna Lammie

Techniques to investigate protein structures:

X-ray crystallography is the science of determining the arrangement of atoms within a crystal from the manner in which a beam of X-rays is scattered from the electrons within the crystal.

Restrictions:

X-ray crystallography requires good quality crystals

Therefore, a significant fraction of proteins cannot be analysed

Page 6: Shape determination of proteins in solution using high throughput computing Donna Lammie

Structure of haemoglobin, the iron-containing oxygen-transport metalloprotein in the red blood cells of vertebrates and other animals.

Page 7: Shape determination of proteins in solution using high throughput computing Donna Lammie

Importance of understanding structure of proteins

• how individual components fit together to build complex systems

• structure - function

Possibility of manipulation

• Drug design

• Drug therapies

Page 8: Shape determination of proteins in solution using high throughput computing Donna Lammie

Why scatter solutions?

Main advantage - the possibility to study the structure and structural dynamics of native particles in physiological solutions.

• Broad range of sizes and conditions

• Shape

• Complexes

Page 9: Shape determination of proteins in solution using high throughput computing Donna Lammie

Research Interests:

* Structural organisation at the nanometer length scale.

* Systems in solution / in situ

X-ray Scattering

* ideal for investigating the structure and organisation of particles/molecules in a system.

* provides information about size, shape and arrangement of particles/molecules.

Page 10: Shape determination of proteins in solution using high throughput computing Donna Lammie

X-rays interact with molecules and are deflected.

We can interpret the deflection.

Incident X-rays

Sample

X-ray Diffraction (high angles): diffraction from repeating structure (crystal lattice)

SAXS (small angles): scattering from particles or changes in electron density

Page 11: Shape determination of proteins in solution using high throughput computing Donna Lammie

The shape and distribution of the scattering provides information such as size, shape and arrangement of the scattering particles.

X-Rays

Sample

Scattered X-Rays

Detector

Small angle X-ray scattering image

Page 12: Shape determination of proteins in solution using high throughput computing Donna Lammie

Synchrotron sources

Page 13: Shape determination of proteins in solution using high throughput computing Donna Lammie

The two-dimensional data was converted into one-dimensional linear profiles.
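The conversion from a two-dimensional image to a one-dimensional profile can be illustrated with a minimal radial-averaging sketch. This is an assumption about the general approach, not the reduction software used for this work: the function name, the beam-centre arguments and the bin width are all hypothetical, and real pipelines also apply masks and detector corrections and convert pixel radius to scattering vector.

```python
import math


def radial_average(image, centre_row, centre_col, bin_width=1.0):
    """Average pixel intensities in rings around the beam centre.

    Returns a list of (mean_radius_in_pixels, mean_intensity) tuples,
    one per non-empty ring, ordered from the centre outwards.
    """
    sums = {}    # ring index -> summed intensity
    radii = {}   # ring index -> summed pixel radius
    counts = {}  # ring index -> number of pixels in the ring
    for r, row in enumerate(image):
        for c, value in enumerate(row):
            radius = math.hypot(r - centre_row, c - centre_col)
            k = int(radius // bin_width)
            sums[k] = sums.get(k, 0.0) + value
            radii[k] = radii.get(k, 0.0) + radius
            counts[k] = counts.get(k, 0) + 1
    return [(radii[k] / counts[k], sums[k] / counts[k]) for k in sorted(counts)]


# Toy 5x5 "detector image" whose intensity depends only on distance from the centre.
image = [[1.0 / (1.0 + math.hypot(r - 2, c - 2)) for c in range(5)] for r in range(5)]
profile = radial_average(image, centre_row=2, centre_col=2)
```

Each entry of `profile` is one point of the linear profile: the mean radius of a ring and the mean intensity inside it.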

Background correction: the buffer scattering was subtracted from the protein scattering using PRIMUS.

Konarev, P.V., Volkov, V.V., Sokolova, A.V., Koch, M.H.J. and Svergun, D.I. (2003) J. Appl. Cryst, 36, 1277-1282.
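The idea behind buffer subtraction can be sketched in a few lines. This is not PRIMUS code or its API; it is a minimal illustration that assumes both profiles are sampled on the same grid and that the errors are independent. The `scale` parameter and the tuple layout are hypothetical.

```python
import math


def subtract_buffer(protein, buffer_profile, scale=1.0):
    """Subtract a buffer profile from a protein profile point by point.

    Each profile is a list of (q, intensity, sigma) tuples on the same q grid.
    Errors are combined in quadrature, assuming independent measurements.
    """
    if len(protein) != len(buffer_profile):
        raise ValueError("profiles must have the same number of points")
    result = []
    for (q_p, i_p, s_p), (q_b, i_b, s_b) in zip(protein, buffer_profile):
        if not math.isclose(q_p, q_b):
            raise ValueError("profiles must share the same q grid")
        intensity = i_p - scale * i_b
        sigma = math.sqrt(s_p ** 2 + (scale * s_b) ** 2)
        result.append((q_p, intensity, sigma))
    return result


# Made-up numbers, for illustration only.
protein = [(0.01, 12.0, 0.3), (0.02, 9.0, 0.3)]
buffer_profile = [(0.01, 2.0, 0.4), (0.02, 2.0, 0.4)]
corrected = subtract_buffer(protein, buffer_profile)
```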

Page 14: Shape determination of proteins in solution using high throughput computing Donna Lammie

Buffer (background)

Protein

Subtracted/corrected data

Page 15: Shape determination of proteins in solution using high throughput computing Donna Lammie

GNOM was used to estimate the particle distance distribution function, p(r), from the experimental scattering data.

GNOM output is entered into DAMMIN and GASBOR.

Semenyuk, A.V. and Svergun, D.I. (1991) J. Appl. Cryst, 24, 537-540.
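The distance distribution function (written p(r) here) and the scattering intensity are linked by the standard relation I(q) = 4π ∫ p(r) sin(qr)/(qr) dr, integrated from 0 to the maximum particle dimension. GNOM solves the harder inverse problem with regularisation; the sketch below only evaluates the forward direction numerically, to show what the relation means. The function name and the example p(r) are hypothetical.

```python
import math


def intensity_from_pr(r_values, p_values, q):
    """Evaluate I(q) = 4*pi * integral of p(r)*sin(qr)/(qr) dr by the trapezoid rule."""
    def integrand(r, p):
        x = q * r
        sinc = 1.0 if x == 0.0 else math.sin(x) / x
        return p * sinc

    total = 0.0
    for k in range(len(r_values) - 1):
        dr = r_values[k + 1] - r_values[k]
        left = integrand(r_values[k], p_values[k])
        right = integrand(r_values[k + 1], p_values[k + 1])
        total += 0.5 * dr * (left + right)
    return 4.0 * math.pi * total


# Hypothetical p(r): a smooth bump on [0, d_max] that is zero at both ends.
d_max = 50.0
r_values = [d_max * k / 200 for k in range(201)]
p_values = [r * r * (d_max - r) ** 2 for r in r_values]
curve = [(q, intensity_from_pr(r_values, p_values, q)) for q in (0.0, 0.05, 0.1, 0.2)]
```

Because p(r) is non-negative here and sin(qr)/(qr) never exceeds 1, the intensity is largest at q = 0, where it reduces to 4π times the area under p(r).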

Page 16: Shape determination of proteins in solution using high throughput computing Donna Lammie

Size and shape of molecules in solution can be extracted from the scattering pattern using a series of computer algorithms.

DAMMIN uses an ab initio method to build models of the protein shape by simulated annealing using a single-phase dummy-atoms model (Svergun, 1999).

GASBOR uses similar parameters to DAMMIN; however, instead of the dummy-atom model, an ensemble of dummy residues is used to form a chain-compatible model (Svergun et al., 2001).

Svergun, D.I. (1999) Biophys. J., 76, 2879-2886.

Svergun, D.I., Petoukhov, M.V. and Koch, M.H.J. (2001) Biophys. J., 80, 2946-2953.
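Simulated annealing itself can be shown on a toy problem. The sketch below is not DAMMIN's algorithm or code: it assumes a one-dimensional lattice of "dummy atoms" that are switched on or off, and it fits a histogram of pair separations as a crude stand-in for the scattering data. The misfit function, cooling schedule and all parameter values are hypothetical.

```python
import math
import random


def pair_histogram(state):
    """Count pairs of occupied sites at each separation (a 1D stand-in for p(r))."""
    occupied = [i for i, bit in enumerate(state) if bit]
    hist = [0] * len(state)
    for a in range(len(occupied)):
        for b in range(a + 1, len(occupied)):
            hist[occupied[b] - occupied[a]] += 1
    return hist


def misfit(state, target_hist):
    """Sum of squared differences between model and target histograms."""
    return sum((h - t) ** 2 for h, t in zip(pair_histogram(state), target_hist))


def anneal(target_hist, n_sites, steps=5000, t_start=5.0, cooling=0.999, seed=0):
    """Metropolis search with geometric cooling; returns the best state seen."""
    rng = random.Random(seed)
    state = [rng.randint(0, 1) for _ in range(n_sites)]
    energy = misfit(state, target_hist)
    start_energy = energy
    best_state, best_energy = state[:], energy
    temperature = t_start
    for _ in range(steps):
        site = rng.randrange(n_sites)
        state[site] ^= 1  # propose: flip one dummy atom on or off
        new_energy = misfit(state, target_hist)
        delta = new_energy - energy
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            energy = new_energy  # accept the move
            if energy < best_energy:
                best_state, best_energy = state[:], energy
        else:
            state[site] ^= 1  # reject: undo the flip
        temperature *= cooling
    return best_state, best_energy, start_energy


true_shape = [0, 0, 1, 1, 1, 1, 0, 1, 1, 0, 0, 0]
target = pair_histogram(true_shape)
model, final_misfit, initial_misfit = anneal(target, n_sites=len(true_shape))
```

The pair histogram is unchanged when the shape is shifted or mirrored, so different seeds can return different but equally good models. That ambiguity is one reason a single run is not treated as the final answer.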

Page 17: Shape determination of proteins in solution using high throughput computing Donna Lammie
Page 18: Shape determination of proteins in solution using high throughput computing Donna Lammie
Page 19: Shape determination of proteins in solution using high throughput computing Donna Lammie
Page 20: Shape determination of proteins in solution using high throughput computing Donna Lammie

START

FINISH

Page 21: Shape determination of proteins in solution using high throughput computing Donna Lammie

Output files from DAMMIN and GASBOR are entered into a series of programs (DAMAVER), which align the models and produce an average of the models.

In order to obtain a reliable representation of the protein shape, DAMMIN and GASBOR need to be repeated a number of times and averaged.

The greater the number of repetitions, the more accurate the averaged model.
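The averaging step can be pictured as an occupancy vote across repeated runs. The sketch below is a simplification and not DAMAVER's procedure: it assumes the models have already been aligned onto a common lattice, and the function name and the 50% threshold are hypothetical.

```python
from collections import Counter


def average_models(models, min_fraction=0.5):
    """Keep lattice points occupied in at least `min_fraction` of the models.

    Each model is a set of (x, y, z) integer lattice points, assumed to be
    already superimposed on a common frame.
    """
    counts = Counter(point for model in models for point in set(model))
    threshold = min_fraction * len(models)
    return {point for point, n in counts.items() if n >= threshold}


# Three made-up runs that agree on a core and differ at the edges.
runs = [
    {(0, 0, 0), (1, 0, 0), (2, 0, 0)},
    {(0, 0, 0), (1, 0, 0), (1, 1, 0)},
    {(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)},
]
consensus = average_models(runs)
```

Points that appear in only one run are dropped, so features that recur across independent runs dominate the averaged shape.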

Page 22: Shape determination of proteins in solution using high throughput computing Donna Lammie

The average shape of 20 independent simulations produced from DAMMIN.

Transglutaminases are a family of enzymes that are capable of introducing isopeptide bonds in or between polypeptide chains.

Page 23: Shape determination of proteins in solution using high throughput computing Donna Lammie

Using Condor, DAMMIN and GASBOR can be run multiple times for the same protein, and also multiple times for a number of different proteins simultaneously.

Before Condor, 20 repeat runs of approximately 36 min each would have taken approximately 12 h on one PC.

Using Condor, 20 repeat runs were performed in approximately 36 min.

This represents a significant gain in both performance and accessibility.
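The timing figures follow from simple arithmetic, sketched below. It assumes that each of the 20 runs gets its own machine at the same time and ignores queueing and file-transfer overheads; the variable names are illustrative.

```python
runs = 20
minutes_per_run = 36

serial_minutes = runs * minutes_per_run   # one PC, one run after another
parallel_minutes = minutes_per_run        # one run per machine, all at once

serial_hours = serial_minutes / 60
speedup = serial_minutes / parallel_minutes
```

With these numbers `serial_hours` is 12.0 and `speedup` is 20.0, matching the figures quoted on the slide.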

Page 24: Shape determination of proteins in solution using high throughput computing Donna Lammie

A Submit Script Generator (SSG) was developed by James Osborne to assist running DAMMIN and GASBOR using the Condor toolkit.

The SSG asks the user only once for the necessary information to prepare and submit multiple jobs to Condor, thereby reducing the time taken to submit and process multiple proteins.
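What a submit script generator produces can be sketched as follows. This is not the SSG itself: the function, the executable name `dammin.exe`, the input file names and the job names are all hypothetical, and how DAMMIN receives its parameters or how each repeat's output files are kept apart is left out. Only standard Condor submit commands are used.

```python
def make_submit_description(executable, input_file, repeats, job_name):
    """Return the text of a Condor submit description for `repeats` identical jobs."""
    lines = [
        "universe                = vanilla",
        f"executable              = {executable}",
        f"transfer_input_files    = {input_file}",
        "should_transfer_files   = YES",
        "when_to_transfer_output = ON_EXIT",
        # $(Process) is expanded by Condor to 0, 1, 2, ... for each queued job.
        f"output                  = {job_name}_$(Process).out",
        f"error                   = {job_name}_$(Process).err",
        f"log                     = {job_name}.log",
        f"queue {repeats}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical GNOM output files for two proteins, 20 repeats each.
proteins = ["proteinA_gnom.out", "proteinB_gnom.out"]
scripts = {
    name: make_submit_description("dammin.exe", name, 20, name.split("_")[0])
    for name in proteins
}
```

Asking for the executable and the number of repeats once and then looping over the input files is what removes the repetitive manual work.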

Page 25: Shape determination of proteins in solution using high throughput computing Donna Lammie

Running Dammin on Condor
========================

1) put your gnom.out files into the input directory
2) double click on make.bat

   This runs makesubmit.exe which will ask you some questions

3) double click on submit.bat

   The jobs are submitted
   You can check the progress of your jobs using "condor_q"
   When your jobs are finished

4) copy the input and output directories somewhere safe
5) double click on clean.bat
6) Go to 1

Page 26: Shape determination of proteins in solution using high throughput computing Donna Lammie
Page 27: Shape determination of proteins in solution using high throughput computing Donna Lammie
Page 28: Shape determination of proteins in solution using high throughput computing Donna Lammie

Condor pool layout (diagram):

Central Manager: master, collector, negotiator

Submit Nodes (30 workstations): master, schedd, shadow

Execute Nodes (1600 workstations): master, startd, starter
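The pool layout on this slide can be written down as a small data structure, which makes the division of labour between machine roles explicit. The dictionary layout and the helper function are hypothetical; the daemon names are those listed on the slide.

```python
POOL = {
    "central manager": {"daemons": ("master", "collector", "negotiator")},
    "submit node": {"daemons": ("master", "schedd", "shadow")},
    "execute node": {"daemons": ("master", "startd", "starter")},
}


def roles_running(daemon):
    """Return the roles on which a given daemon runs, in sorted order."""
    return sorted(role for role, info in POOL.items() if daemon in info["daemons"])


everywhere = roles_running("master")
queue_holders = roles_running("schedd")
```

The master daemon appears on every role, while the schedd, which holds the job queue, runs only on submit nodes.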

Page 29: Shape determination of proteins in solution using high throughput computing Donna Lammie

Run > cmd > condor_q

Page 30: Shape determination of proteins in solution using high throughput computing Donna Lammie

Summary

User friendly

Easy to use

Overall, Condor has proved invaluable to our research since the work is completed rapidly and efficiently.

Page 31: Shape determination of proteins in solution using high throughput computing Donna Lammie

Related Publications

Lammie D., Osborne J., Aeschlimann D., Wess T.J. (2007) Rapid shape determination of tissue transglutaminase using high-throughput computing. Acta Crystallographica Section D: Biological Crystallography, 63: 1022-1024.

Mankelow T.J., Burton N., Stefansdottir F. O., Spring F. A., Parsons S. F., Pedersen J. S., Oliveira C. L., Lammie D., Wess T., Mohandas N., Chasis J. A., Brady R.L., Anstee D.J. (2007) The Laminin 511/521-binding site on the Lutheran blood group glycoprotein is located at the flexible junction of Ig domains 2 and 3. Blood, 110: 3398-406.

Dyksterhuis L. B., Baldock C., Lammie D., Wess T. J., Weiss A.S. (2007) Domains 17-27 of tropoelastin contain key regions of contact for coacervation and contain an unusual turn-containing crosslinking domain. Matrix Biology, 26: 125-135.

Baldock C., Siegler V., Bax D. V., Cain S. A., Mellody K. T., Marson A., Haston J. L., Berry R., Wang M.C., Grossmann J. G., Roessle M., Kielty C. M., Wess T. J. (2006) Nanostructure of fibrillin-1 reveals compact conformation of EGF arrays and mechanism for extensibility. Proc Natl Acad Sci U S A, 103:11922-7.

Page 32: Shape determination of proteins in solution using high throughput computing Donna Lammie

Acknowledgements

Cardiff University:

Tim Wess

Daniel Aeschlimann

James Osborne