protein structure. why study protein structure? studying the structure model allows better...
Post on 22-Dec-2015
Protein Structure
Why study protein structure?
Studying the structure model allows better understanding of the structure-function relationship, and is an important starting point for many kinds of research
Structure determination
Crystallography:
• A solution of protein molecules is assembled into a periodic lattice
• The crystal is bombarded with X-ray beams
• The collision of the beams with the electrons of the atoms creates a diffraction pattern
• The diffraction pattern is transformed into an electron density map of the protein, from which the 3D locations of the atoms can be deduced
Structure determination
Nuclear magnetic resonance (NMR):
• A solution of the protein is placed in a magnetic field
• Spins align parallel or anti-parallel to the field
• RF pulses of electromagnetic energy shift spins from their alignment
• When the radiation stops, the spins re-align while emitting the energy they absorbed
• The emission spectrum contains information about the identity of the nuclei and their immediate environment
• The result is an ensemble of models rather than a single one
PDB: Protein Data Bank
http://www.rcsb.org
PDB model
Defines the 3D coordinates (x,y,z) of each of the atoms in one or more molecules (i.e., a complex)
There are models of proteins, protein complexes, proteins and DNA, protein segments, etc.
The models also include the positions of ligand molecules, solvent molecules, metal ions, etc.
PDB ID: one digit followed by three alphanumeric characters (e.g., 1a14)
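The ID format can be checked with a short pattern. This is a minimal sketch that assumes only the classic four-character IDs described on this slide; the function name `looks_like_pdb_id` is made up for the illustration.

```python
import re

# Assumption: classic four-character IDs only (one digit, then three
# letters or digits), as described on the slide.
PDB_ID_PATTERN = re.compile(r"[0-9][A-Za-z0-9]{3}")


def looks_like_pdb_id(text: str) -> bool:
    """Return True if `text` has the shape of a four-character PDB ID."""
    return PDB_ID_PATTERN.fullmatch(text) is not None


print(looks_like_pdb_id("1a14"))  # True
print(looks_like_pdb_id("a114"))  # False (does not start with a digit)
```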
The PDB file format
ATOM records: usually protein or DNA
HETATM records: usually ligand, ion, water
Fields in each record: atom number, atom identity, residue identity, chain, residue number, atom coordinates (X, Y, Z), occupancy, temperature factor
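ATOM and HETATM records are fixed-width lines, so each field can be read by column position. The sketch below follows the standard PDB column layout. The names `AtomRecord` and `parse_atom_line` are made up for the illustration, and the sample record is built in code (with invented values) so that the column widths are explicit.

```python
from typing import NamedTuple


class AtomRecord(NamedTuple):
    record: str        # "ATOM" or "HETATM"
    serial: int        # atom number
    atom_name: str     # atom identity, e.g. "CA"
    res_name: str      # residue identity, e.g. "MET"
    chain: str         # chain identifier
    res_seq: int       # residue number
    x: float
    y: float
    z: float
    occupancy: float
    temp_factor: float


def parse_atom_line(line: str) -> AtomRecord:
    """Slice one ATOM/HETATM line by its fixed columns (0-based slices)."""
    return AtomRecord(
        record=line[0:6].strip(),       # columns 1-6
        serial=int(line[6:11]),         # columns 7-11
        atom_name=line[12:16].strip(),  # columns 13-16
        res_name=line[17:20].strip(),   # columns 18-20
        chain=line[21],                 # column 22
        res_seq=int(line[22:26]),       # columns 23-26
        x=float(line[30:38]),           # columns 31-38
        y=float(line[38:46]),           # columns 39-46
        z=float(line[46:54]),           # columns 47-54
        occupancy=float(line[54:60]),   # columns 55-60
        temp_factor=float(line[60:66]), # columns 61-66
    )


# Illustrative record (values invented), assembled field by field.
sample = (
    "{:<6}{:>5} {:<4} {:>3} {}{:>4}    {:8.3f}{:8.3f}{:8.3f}{:6.2f}{:6.2f}"
    .format("ATOM", 1, " N", "MET", "A", 1, 27.340, 24.430, 2.614, 1.00, 9.67)
)

atom = parse_atom_line(sample)
print(atom.atom_name, atom.res_name, atom.chain, atom.res_seq)
print(atom.x, atom.y, atom.z, atom.occupancy, atom.temp_factor)
```

Splitting on whitespace is tempting but unreliable here, because adjacent fields can touch when the numbers are wide.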
Background and motivation
DNA methylation at CpG sites has a central role in imprinting (plants and mammals), but it is not clear how the imprinting machinery recognizes its target genes
The Dnmt3 protein family (a, b, L) are de novo methyltransferases
Dnmt3a and Dnmt3L KO mice show altered sex-specific de novo methylation in germ cells, indicating that these proteins are both required for the methylation of most imprinted loci in germ cells
Goal:
Conduct a structural and biochemical study of a homogeneous complex of Dnmt3L and Dnmt3a
Methods
Dnmt3a2, the shorter isoform of Dnmt3a and the predominant form in embryonic stem cells, was selected
For crystallographic reasons, the study focused on a stable complex of the C-terminal domains of both proteins (Dnmt3a2-C and Dnmt3L-C) that retains substantial methyltransferase activity
Results
The complex is a tetramer: Dnmt3L–Dnmt3a–Dnmt3a–Dnmt3L
Mutagenesis at positions in both interfaces (a-a and a-L) indicates that these interfaces are essential for catalysis
Dnmt3a-Dnmt3a dimerization brings two active sites together
Dimeric Dnmt3a could methylate two CpGs separated by one helical turn in one binding event
[Figure labels: Dnmt3a, Dnmt3L]
“We observed a highly significant correlation of methylation status at distances of eight to ten base pairs between two CpG sites”
Distribution of CpG sites among 12 known maternally imprinted genes, indicated to be Dnmt3a-Dnmt3l targets
Protein visualization
Visualization tools (working on PC):
RasMol / RasTop
SwissPDBviewer (sPDBv)
Protein Explorer (via the web)
And many more…
RasTop / RasMol
http://www.geneinfinity.org/rastop/
RasTop - main menu
Open file
Close file
RasTop - display
Wireframe: lines between atoms
Sphere VDW: inflates each atom according to its VDW radius
Command editor
More on RasTop
RasMol manual
Using RasTop Commands
Structure alignment
Essential for:
• Protein classification
• Detection of conserved protein folding cores
• Detection of similarities between domains
• Detection of similarities in functional binding sites
• Evolutionary conservation
• Construction of nonredundant databases
Pairwise structure alignment
Outline: given two protein structures, find the transformation that produces the best superimposition of one protein onto the other
Computationally: find the rotations and translations of one point set (atoms of protein A) that produce “large” superimpositions on the other point set (atoms of protein B)
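A toy sketch of this search: try a few candidate rigid transformations of point set A and keep the one that puts the most atoms close to atoms of point set B. Everything here is a simplifying assumption made for the illustration: the rotation is about the z axis only, the translation is taken as known, the candidate list is tiny, and the 0.5 cutoff is arbitrary. Real alignment programs search over full 3D rotations and translations.

```python
import math


def transform(points, angle, shift):
    """Rotate each point about the z axis by `angle` (radians), then translate."""
    c, s = math.cos(angle), math.sin(angle)
    moved = []
    for x, y, z in points:
        moved.append((c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2]))
    return moved


def match_size(moved, target, cutoff=0.5):
    """Count atoms of `moved` lying within `cutoff` of some atom of `target`."""
    return sum(1 for p in moved if any(math.dist(p, q) <= cutoff for q in target))


# Toy coordinates: protein B is protein A rotated by 90 degrees and shifted.
protein_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (2.0, 1.0, 0.0)]
shift = (5.0, 0.0, 0.0)
protein_b = transform(protein_a, math.pi / 2, shift)

# Try each candidate rotation and keep the one with the largest match.
candidates = [0.0, math.pi / 4, math.pi / 2]
best_angle = max(candidates, key=lambda a: match_size(transform(protein_a, a, shift), protein_b))
best_match = match_size(transform(protein_a, best_angle, shift), protein_b)

print(round(math.degrees(best_angle)), best_match)  # 90 4
```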
RMSD
Root Mean Square Deviation
The square root of the mean squared distance between the matched superimposed atoms
Usually computed over backbone Cα atoms
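For N matched atom pairs at distances d_1 ... d_N after superimposition, RMSD = sqrt((d_1^2 + ... + d_N^2) / N). A minimal sketch, assuming the two coordinate lists are already superimposed and listed in matching order; the coordinates are toy values.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally long lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        # Squared distance between the i-th matched pair of atoms.
        total += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(total / len(coords_a))


# Toy C-alpha coordinates: every atom of b is displaced by 1 along z.
ca_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
ca_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (2.0, 0.0, 1.0)]
print(rmsd(ca_a, ca_b))  # 1.0
```

Because the distances are squared before averaging, a few badly matched atoms raise the RMSD more than they would raise a plain average distance.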
http://bioinfo3d.cs.tau.ac.il/c_alpha_match
Matches C-alpha atoms
Rigid pairwise alignment
Sequence order independent
Input: two PDB files or PDB IDs with specific chains
Output: a set of high scoring conformations
The superimposed structures may be viewed in a PDB viewer
BOBWHITE QUAIL LYSOZYME
HEN EGG WHITE LYSOZYME
Results
Ranking criteria:
1. Match size
2. RMSD
C-alpha correspondence
Aligned PDB file
http://bioinfo3d.cs.tau.ac.il/FlexProt
Flexible structural alignment
The first structure is assumed to be rigid, while in the second structure potential flexible regions (hinges) are automatically detected
Input: two PDB IDs (specific chain)
Output: list of alignments ranked according to the number of hinges
Results
Result with 0 hinges:
Result with one hinge:
http://bioinfo3d.cs.tau.ac.il/MultiProt
Multiple structural alignments of protein structures
Finds the common geometrical cores between the input molecules
Does not require that all the input molecules participate in the alignment
Actually, it efficiently detects high scoring partial multiple alignments for all possible numbers of molecules from the input
The final structural alignment can either preserve the sequence order (like sequence alignment) or be sequence order independent
Results
DALI - Distance matrix ALIgnment
http://ekhidna.biocenter.helsinki.fi/dali_server/
Concept: “Similar 3D structures have similar inter-residue distances”
DALI Algorithm
Generates an inter-residue distance matrix for each protein
The distance matrix contains all pairwise distances (symmetrical)
Dij = distance between C-alpha i and C-alpha j in the same protein
Compares the two distance matrices for a pair of proteins to be aligned
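The distance-matrix idea can be sketched in a few lines. This is not the DALI algorithm itself: it assumes both chains have the same length and that residue i of one corresponds to residue i of the other, whereas DALI has to search for the matching parts of the two matrices. The coordinates and the function names are invented for the illustration.

```python
import math


def distance_matrix(ca_coords):
    """Dij = distance between C-alpha i and C-alpha j of the same protein."""
    n = len(ca_coords)
    return [[math.dist(ca_coords[i], ca_coords[j]) for j in range(n)] for i in range(n)]


def mean_abs_difference(d_a, d_b):
    """Average |Dij(A) - Dij(B)| over the upper triangle (matrices are symmetric)."""
    n = len(d_a)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    return sum(abs(d_a[i][j] - d_b[i][j]) for i, j in pairs) / len(pairs)


# Toy C-alpha traces: `moved` is `straight` rotated and translated,
# `bent` has a different shape.
straight = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
moved = [(1.0, 1.0, 1.0), (1.0, 4.8, 1.0), (1.0, 8.6, 1.0)]
bent = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0)]

d_straight = distance_matrix(straight)
print(mean_abs_difference(d_straight, distance_matrix(moved)))  # close to 0
print(mean_abs_difference(d_straight, distance_matrix(bent)))   # clearly above 0
```

Internal distances do not change when a structure is rotated or translated, so the two matrices can be compared without superimposing the structures first.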
DALI Services
DALI server: used by crystallographers to compare a newly solved structure against structures in the PDB
DALI database: contains all-vs.-all PDB 3D structure comparisons, making it possible to find structural neighbors of structures that are already in the PDB
Pairwise server: pairwise comparison of two structures
DaliLite: a standalone version of DALI
DALI Database
http://ekhidna.biocenter.helsinki.fi/dali/start
Non-redundant chains: no two chains are more than 90% identical in sequence
[Figure labels: number of structurally aligned residues; number of residues in the protein; sequence identity of aligned positions]
Supplementary
Secondary structure elements
A line connecting the C-alpha atoms
RasTop - Display
Emphasizes secondary structure
RasTop - Display
Remove rendering
Clears the view from previous actions performed
Each amino acid gets its own color
A display in which each atom type has a specific color
How fixed the atom's position is in the structure: red - high freedom, blue - very fixed
Secondary structure: red - alpha helix, yellow - beta sheet, blue - loops and turns, white - random coil
RasTop - Color
RasTop - Labels
Labels the whole molecule, or selected segments, with the name and number of each amino acid.
The Select Command
Select defines the region of the molecule to act on: the set of atoms that will be modified by the chain of commands that follows.
The parameter of the select command is an “atom expression”.
The atom expression uniquely defines an arbitrary group of atoms within a molecule. It can be:
Primitive expressions,
Predefined sets,
Within expressions,
or a logical combination of all of the above.
In order to display only what we selected, use the command: Edit => restrict
The primitive expressions allow selecting by:
Atom number - select atomno=102
Residue – select Val52 (select resno=52 or select 52)
Chain id – select :a
List of residue numbers – select 14,92,46
Range of atom numbers – select atomno>=35
A wildcard can be used to specify a whole field:
* Any number of characters. Atom or residue type – select *.sg (this will select all sulphur atoms in cysteine side chains)
? Single character wildcard – select ser.c? (this will select all carbons in all serine residues)
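The two wildcards can be tried out with a toy matcher. This is not RasMol's parser: it only mimics the `*` and `?` semantics using Python's `fnmatch` module, on a small hand-made atom list. The `select` function and the `residue.atom` splitting are assumptions made for the illustration.

```python
from fnmatch import fnmatchcase

# Hand-made (residue name, atom name) pairs, for illustration only.
atoms = [
    ("CYS", "SG"), ("CYS", "CA"),
    ("SER", "CA"), ("SER", "CB"), ("SER", "OG"),
    ("MET", "SD"),
]


def select(pattern):
    """Match a 'residue.atom' pattern, where * is any run of characters
    and ? is exactly one character."""
    res_pattern, atom_pattern = pattern.upper().split(".")
    return [
        (res, atom)
        for res, atom in atoms
        if fnmatchcase(res, res_pattern) and fnmatchcase(atom, atom_pattern)
    ]


print(select("*.sg"))    # [('CYS', 'SG')]
print(select("ser.c?"))  # [('SER', 'CA'), ('SER', 'CB')]
```

In this toy version `c?` matches only two-character atom names that start with C. An atom named just `C` would not match, because `?` stands for exactly one character.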
The predefined sets are groups of atoms that have fixed names:
select helix
select hoh (water molecules)
select protein
There is a list with the predefined sets in the Rasmol reference card (google it)
Boolean Expressions
And – what is common to both conditions
Or – the union of two conditions (both this and that)
Not – what to exclude
Examples:
select tyr and :a → all tyr in chain ‘a’
select tyr or :a → all tyr in the molecule and all of chain ‘a’
select not (tyr,:a) → the whole molecule except tyr and chain ‘a’
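The three operators behave like set operations on atoms. A minimal sketch with Python sets, using an invented four-atom molecule; it illustrates the logic only and says nothing about how RasMol evaluates expressions internally.

```python
# Invented toy molecule: one entry per atom.
atoms = [
    {"serial": 1, "res": "TYR", "chain": "A"},
    {"serial": 2, "res": "TYR", "chain": "B"},
    {"serial": 3, "res": "GLY", "chain": "A"},
    {"serial": 4, "res": "GLY", "chain": "B"},
]

everything = {a["serial"] for a in atoms}
tyr = {a["serial"] for a in atoms if a["res"] == "TYR"}
chain_a = {a["serial"] for a in atoms if a["chain"] == "A"}

# select tyr and :a   -> intersection
print(sorted(tyr & chain_a))                 # [1]
# select tyr or :a    -> union
print(sorted(tyr | chain_a))                 # [1, 2, 3]
# select not (tyr,:a) -> everything outside the union (the comma acts as "or")
print(sorted(everything - (tyr | chain_a)))  # [4]
```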
Using Select Expression
…or via the command (Edit => Command)
1. Spacefill
2. Color picker
3. Make sure you’re on atoms!