roadmap the topics: basic concepts of molecular biology more on perl overview of the field ...
Post on 21-Dec-2015
![Page 1: Roadmap The topics: basic concepts of molecular biology more on Perl overview of the field biological databases and database searching sequence](https://reader030.vdocuments.us/reader030/viewer/2022033101/56649d615503460f94a4390f/html5/thumbnails/1.jpg)
Roadmap
The topics:
basic concepts of molecular biology
more on Perl
overview of the field
biological databases and database searching
sequence alignments
phylogenetics
structure prediction
microarray data analysis
![Page 2: Roadmap The topics: basic concepts of molecular biology more on Perl overview of the field biological databases and database searching sequence](https://reader030.vdocuments.us/reader030/viewer/2022033101/56649d615503460f94a4390f/html5/thumbnails/2.jpg)
Protein Synthesis
the national health museum
![Page 3: Roadmap The topics: basic concepts of molecular biology more on Perl overview of the field biological databases and database searching sequence](https://reader030.vdocuments.us/reader030/viewer/2022033101/56649d615503460f94a4390f/html5/thumbnails/3.jpg)
Proteins
![Page 4: Roadmap The topics: basic concepts of molecular biology more on Perl overview of the field biological databases and database searching sequence](https://reader030.vdocuments.us/reader030/viewer/2022033101/56649d615503460f94a4390f/html5/thumbnails/4.jpg)
Proteins
Proteins perform a vast array of biological functions, including:
Transport: hemoglobin (carries O2 from the lungs to the tissues)
Mechanical support: collagen
Storage: ferritin (stores iron)
Regulation: repressor proteins (gene expression)
Antibodies: immunoglobulin
Catalysis: SOD (superoxide dismutase)
…
Misfolding: mad cow disease, Alzheimer's disease, …
![Page 5: Roadmap The topics: basic concepts of molecular biology more on Perl overview of the field biological databases and database searching sequence](https://reader030.vdocuments.us/reader030/viewer/2022033101/56649d615503460f94a4390f/html5/thumbnails/5.jpg)
Amino acid composition
Basic amino acid structure: the side chain, R, varies for each of the 20 amino acids.
[Figure: a central carbon bonded to H, the amino group (NH2), the carboxyl group (COOH), and the side chain (R)]
![Page 6: Roadmap The topics: basic concepts of molecular biology more on Perl overview of the field biological databases and database searching sequence](https://reader030.vdocuments.us/reader030/viewer/2022033101/56649d615503460f94a4390f/html5/thumbnails/6.jpg)
The Peptide Bond
Dehydration synthesis
Polypeptide with repeating backbone: N–C–C–N–C–C
![Page 7: Roadmap The topics: basic concepts of molecular biology more on Perl overview of the field biological databases and database searching sequence](https://reader030.vdocuments.us/reader030/viewer/2022033101/56649d615503460f94a4390f/html5/thumbnails/7.jpg)
Side chain properties
What gives amino acids different properties?
Carbon does not readily form hydrogen bonds with water: hydrophobic
O and N are generally more likely than C to hydrogen-bond to water: hydrophilic
The amino acids form three general groups:
Hydrophobic
Polar
Charged (positive/basic and negative/acidic)
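The three groups can be sketched as a lookup table. This is a minimal sketch that assumes one common textbook grouping; sources disagree on borderline residues such as G, C, Y, H, and W, and the assignments shown in the slide figures are not recoverable from this transcript. The function name `group_counts` and the example fragment are illustrative.

```python
# One common grouping of the 20 amino acids (an assumption, not the slides' own table).
HYDROPHOBIC = set("AVLIMFWPG")
POLAR = set("STCYNQ")
POSITIVE = set("KRH")   # basic side chains
NEGATIVE = set("DE")    # acidic side chains


def group_counts(sequence):
    """Count how many residues of a one-letter sequence fall in each group."""
    counts = {"hydrophobic": 0, "polar": 0, "positive": 0, "negative": 0}
    for residue in sequence.upper():
        if residue in HYDROPHOBIC:
            counts["hydrophobic"] += 1
        elif residue in POLAR:
            counts["polar"] += 1
        elif residue in POSITIVE:
            counts["positive"] += 1
        elif residue in NEGATIVE:
            counts["negative"] += 1
        else:
            raise ValueError(f"unknown residue: {residue!r}")
    return counts


example = group_counts("AGVGTVPMTAYGNDIQYYGQVT")
print(example)
```

Under this grouping the 22-residue example is mostly hydrophobic and polar, with a single acidic residue.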
![Page 8: Roadmap The topics: basic concepts of molecular biology more on Perl overview of the field biological databases and database searching sequence](https://reader030.vdocuments.us/reader030/viewer/2022033101/56649d615503460f94a4390f/html5/thumbnails/8.jpg)
The Hydrophobic Amino Acids
Proline severely limits allowable conformations!
![Page 9: Roadmap The topics: basic concepts of molecular biology more on Perl overview of the field biological databases and database searching sequence](https://reader030.vdocuments.us/reader030/viewer/2022033101/56649d615503460f94a4390f/html5/thumbnails/9.jpg)
The Charged Amino Acids
Krane & Raymer
![Page 10: Roadmap The topics: basic concepts of molecular biology more on Perl overview of the field biological databases and database searching sequence](https://reader030.vdocuments.us/reader030/viewer/2022033101/56649d615503460f94a4390f/html5/thumbnails/10.jpg)
The Polar Amino Acids
Krane & Raymer
![Page 11: Roadmap The topics: basic concepts of molecular biology more on Perl overview of the field biological databases and database searching sequence](https://reader030.vdocuments.us/reader030/viewer/2022033101/56649d615503460f94a4390f/html5/thumbnails/11.jpg)
More Polar Amino Acids
![Page 12: Roadmap The topics: basic concepts of molecular biology more on Perl overview of the field biological databases and database searching sequence](https://reader030.vdocuments.us/reader030/viewer/2022033101/56649d615503460f94a4390f/html5/thumbnails/12.jpg)
Peptidyl polymers
A few amino acids in a chain are called a polypeptide.
A protein is usually composed of 50 to 400+ amino acids.
![Page 13: Roadmap The topics: basic concepts of molecular biology more on Perl overview of the field biological databases and database searching sequence](https://reader030.vdocuments.us/reader030/viewer/2022033101/56649d615503460f94a4390f/html5/thumbnails/13.jpg)
Primary & Secondary Structure
Primary structure = the linear sequence of amino acids comprising a protein:
AGVGTVPMTAYGNDIQYYGQVT…
Secondary structure
Regular patterns of hydrogen bonding in proteins result in two patterns that emerge in nearly every known protein structure: the α-helix and the β-sheet
The location and direction of these periodic, repeating structures is known as the secondary structure of the protein
![Page 14: Roadmap The topics: basic concepts of molecular biology more on Perl overview of the field biological databases and database searching sequence](https://reader030.vdocuments.us/reader030/viewer/2022033101/56649d615503460f94a4390f/html5/thumbnails/14.jpg)
Levels of Protein Structure
Secondary structure elements combine to form tertiary structure
Quaternary structure occurs in multi-enzyme complexes
  Many proteins are active only as homodimers, homotetramers, etc.
![Page 15: Roadmap The topics: basic concepts of molecular biology more on Perl overview of the field biological databases and database searching sequence](https://reader030.vdocuments.us/reader030/viewer/2022033101/56649d615503460f94a4390f/html5/thumbnails/15.jpg)
Dihedral angles
![Page 16: Roadmap The topics: basic concepts of molecular biology more on Perl overview of the field biological databases and database searching sequence](https://reader030.vdocuments.us/reader030/viewer/2022033101/56649d615503460f94a4390f/html5/thumbnails/16.jpg)
α Helix
Most abundant secondary structure
3.6 amino acids per turn
Hydrogen bond formed between every fourth residue
Avg length: 10 amino acids, or 3 turns
Varies from 5 to 40 amino acids
![Page 17: Roadmap The topics: basic concepts of molecular biology more on Perl overview of the field biological databases and database searching sequence](https://reader030.vdocuments.us/reader030/viewer/2022033101/56649d615503460f94a4390f/html5/thumbnails/17.jpg)
α Helix
Normally found on the surface of protein cores
Interact with aqueous environment
Inner-facing side has hydrophobic amino acids
Outer-facing side has hydrophilic amino acids
Every third amino acid tends to be hydrophobic
Pattern can be detected computationally
Rich in alanine (A), glutamic acid (E), leucine (L), and methionine (M)
Poor in proline (P), glycine (G), tyrosine (Y), and serine (S)
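One way to detect the hydrophobic periodicity of a helix computationally is the hydrophobic moment. The slides do not name a method, so this is a sketch under stated assumptions: it uses the Kyte-Doolittle hydropathy scale and a rotation of 100° per residue (360° / 3.6 residues per turn). The function name and the two test sequences are illustrative.

```python
import math

# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydrophobic_moment(sequence, degrees_per_residue=100.0):
    """Mean hydrophobic moment of a segment viewed down the helix axis.

    Each residue is a vector of length equal to its hydropathy, rotated by
    degrees_per_residue from the previous one. A large result means the
    hydrophobic residues cluster on one face of the helix.
    """
    delta = math.radians(degrees_per_residue)
    cos_sum = sum(KYTE_DOOLITTLE[aa] * math.cos(i * delta)
                  for i, aa in enumerate(sequence))
    sin_sum = sum(KYTE_DOOLITTLE[aa] * math.sin(i * delta)
                  for i, aa in enumerate(sequence))
    return math.hypot(cos_sum, sin_sum) / len(sequence)


# Artificial 18-residue helix: leucine on one face, lysine on the other.
amphipathic = "".join(
    "L" if math.cos(math.radians(100 * i)) > 0 else "K" for i in range(18)
)
# Same length, but with no hydrophobic/hydrophilic pattern at all.
uniform = "L" * 18

print(amphipathic, mean_hydrophobic_moment(amphipathic))
print(uniform, mean_hydrophobic_moment(uniform))
```

The patterned sequence gives a large moment, while the uniform one cancels out to essentially zero, which is the signal a sliding-window scan along a real sequence would look for.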
![Page 18: Roadmap The topics: basic concepts of molecular biology more on Perl overview of the field biological databases and database searching sequence](https://reader030.vdocuments.us/reader030/viewer/2022033101/56649d615503460f94a4390f/html5/thumbnails/18.jpg)
β Sheet
![Page 19: Roadmap The topics: basic concepts of molecular biology more on Perl overview of the field biological databases and database searching sequence](https://reader030.vdocuments.us/reader030/viewer/2022033101/56649d615503460f94a4390f/html5/thumbnails/19.jpg)
β Sheet
Hydrogen bonds between 5-10 consecutive amino acids in one portion of the chain with another 5-10 farther down the chain
Interacting regions may be adjacent with a short loop, or far apart with other structures in between
Directions:
  Same: Parallel Sheet
  Opposite: Anti-parallel Sheet
  Mixed: Mixed Sheet
Alpha carbons (and R side groups) alternate above & below the sheet
Prediction difficult, due to wide range of φ and ψ angles
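The φ and ψ angles are backbone dihedral angles: φ is defined by the atoms C(i-1), N(i), Cα(i), C(i), and ψ by N(i), Cα(i), C(i), N(i+1). A minimal sketch of computing a dihedral angle from four 3-D points follows; the function and helper names are illustrative, and the coordinates are toy values rather than real protein atoms.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_degrees(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)  # unit vector along the central bond
    # Project the outer bonds onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


# Toy geometry: the two outer bonds are perpendicular when viewed down the middle bond.
angle = dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
print(angle)
```

Applying this to every residue of a solved structure and plotting φ against ψ gives a Ramachandran plot.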
![Page 20: Roadmap The topics: basic concepts of molecular biology more on Perl overview of the field biological databases and database searching sequence](https://reader030.vdocuments.us/reader030/viewer/2022033101/56649d615503460f94a4390f/html5/thumbnails/20.jpg)
Ramachandran Plot (alpha)
![Page 21: Roadmap The topics: basic concepts of molecular biology more on Perl overview of the field biological databases and database searching sequence](https://reader030.vdocuments.us/reader030/viewer/2022033101/56649d615503460f94a4390f/html5/thumbnails/21.jpg)
Ramachandran Plot (beta)
![Page 22: Roadmap The topics: basic concepts of molecular biology more on Perl overview of the field biological databases and database searching sequence](https://reader030.vdocuments.us/reader030/viewer/2022033101/56649d615503460f94a4390f/html5/thumbnails/22.jpg)
Ramachandran Plot
![Page 23: Roadmap The topics: basic concepts of molecular biology more on Perl overview of the field biological databases and database searching sequence](https://reader030.vdocuments.us/reader030/viewer/2022033101/56649d615503460f94a4390f/html5/thumbnails/23.jpg)
Helices and Sheets
![Page 24: Roadmap The topics: basic concepts of molecular biology more on Perl overview of the field biological databases and database searching sequence](https://reader030.vdocuments.us/reader030/viewer/2022033101/56649d615503460f94a4390f/html5/thumbnails/24.jpg)
Loop
Regions between α helices and β sheets
Various lengths and three-dimensional configurations
Located on surface of the structure
Hairpin loops: complete turn in the polypeptide chain (anti-parallel β sheets)
More variable sequence structure
Tend to have charged and polar amino acids
![Page 25: Roadmap The topics: basic concepts of molecular biology more on Perl overview of the field biological databases and database searching sequence](https://reader030.vdocuments.us/reader030/viewer/2022033101/56649d615503460f94a4390f/html5/thumbnails/25.jpg)
Coil
Region of secondary structure that is not a helix, sheet, or loop
![Page 26: Roadmap The topics: basic concepts of molecular biology more on Perl overview of the field biological databases and database searching sequence](https://reader030.vdocuments.us/reader030/viewer/2022033101/56649d615503460f94a4390f/html5/thumbnails/26.jpg)
Determining Protein Structure
There are on the order of 100,000 distinct proteins in the human proteome.
Two methods for revealing the positions of atoms in 3-D:
X-ray crystallography
  X-ray diffraction pattern + mathematical construction
  Good protein crystal needed, good resolution of diffraction needed
Nuclear magnetic resonance (NMR)
  Small proteins only (< 250 residues)
  Inter-proton distances + geometric constraints
![Page 27: Roadmap The topics: basic concepts of molecular biology more on Perl overview of the field biological databases and database searching sequence](https://reader030.vdocuments.us/reader030/viewer/2022033101/56649d615503460f94a4390f/html5/thumbnails/27.jpg)
Bovine Ribonuclease
Christian Anfinsen, 1957.
![Page 28: Roadmap The topics: basic concepts of molecular biology more on Perl overview of the field biological databases and database searching sequence](https://reader030.vdocuments.us/reader030/viewer/2022033101/56649d615503460f94a4390f/html5/thumbnails/28.jpg)
Disulfide Bonds
Two cysteines in close proximity will form a covalent bond
Disulfide bond, disulfide bridge, or dicysteine bond.
Significantly stabilizes tertiary structure.
![Page 29: Roadmap The topics: basic concepts of molecular biology more on Perl overview of the field biological databases and database searching sequence](https://reader030.vdocuments.us/reader030/viewer/2022033101/56649d615503460f94a4390f/html5/thumbnails/29.jpg)
![Page 30: Roadmap The topics: basic concepts of molecular biology more on Perl overview of the field biological databases and database searching sequence](https://reader030.vdocuments.us/reader030/viewer/2022033101/56649d615503460f94a4390f/html5/thumbnails/30.jpg)
Principles that govern the folding of protein chains - Christian Anfinsen, Science 1973
![Page 31: Roadmap The topics: basic concepts of molecular biology more on Perl overview of the field biological databases and database searching sequence](https://reader030.vdocuments.us/reader030/viewer/2022033101/56649d615503460f94a4390f/html5/thumbnails/31.jpg)
Ribonuclease
![Page 32: Roadmap The topics: basic concepts of molecular biology more on Perl overview of the field biological databases and database searching sequence](https://reader030.vdocuments.us/reader030/viewer/2022033101/56649d615503460f94a4390f/html5/thumbnails/32.jpg)
Disulfide Bonds

| # of cysteines | # of S-S bonds | # of combinations |
|---|---|---|
| 4 | 2 | 3 |
| 6 | 3 | 15 |
| 8 | 4 | 105 |
| 10 | 5 | 945 |
| 12 | 6 | 10395 |
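The combination counts in the disulfide table follow a simple rule: the first cysteine can pair with any of the other 2n - 1, the next unpaired one with any of the remaining 2n - 3, and so on, giving (2n - 1) × (2n - 3) × … × 3 × 1 ways to pair 2n cysteines. A minimal sketch, with an illustrative function name:

```python
def disulfide_pairings(n_cysteines):
    """Number of ways to pair all cysteines into S-S bonds."""
    if n_cysteines < 0 or n_cysteines % 2:
        raise ValueError("need an even, non-negative number of cysteines")
    count = 1
    # Multiply the odd numbers (n - 1), (n - 3), ..., 3, 1.
    for choices in range(n_cysteines - 1, 0, -2):
        count *= choices
    return count


for n in (4, 6, 8, 10, 12):
    print(n, n // 2, disulfide_pairings(n))
```

The loop reproduces the tabulated values 3, 15, 105, 945, and 10395.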
![Page 33: Roadmap The topics: basic concepts of molecular biology more on Perl overview of the field biological databases and database searching sequence](https://reader030.vdocuments.us/reader030/viewer/2022033101/56649d615503460f94a4390f/html5/thumbnails/33.jpg)
Levinthal's paradox
How do proteins find the right conformation out of the practically endless number of three-dimensional forms that they could randomly fold into?
Consider a 100-residue protein. If each residue can take only 3 positions, there are 3^100 ≈ 5 × 10^47 possible conformations.
If it takes 10^-13 s to convert from one structure to another, an exhaustive search would take about 1.6 × 10^27 years!
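The Levinthal estimate can be checked with a few lines of arithmetic. This sketch takes the slide's inputs (100 residues, 3 positions each, 10^-13 s per conversion) as defaults; the year length of 365.25 days is an assumption of the sketch, and the function name is illustrative.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumed year length


def levinthal_estimate(residues=100, states_per_residue=3, seconds_per_step=1e-13):
    """Return (number of conformations, years needed to visit each one once)."""
    conformations = states_per_residue ** residues
    years = conformations * seconds_per_step / SECONDS_PER_YEAR
    return conformations, years


conformations, years = levinthal_estimate()
print(f"{conformations:.1e} conformations, {years:.1e} years")
```

Both results agree with the slide's rounded figures of roughly 5 × 10^47 conformations and 1.6 × 10^27 years.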
![Page 34: Roadmap The topics: basic concepts of molecular biology more on Perl overview of the field biological databases and database searching sequence](https://reader030.vdocuments.us/reader030/viewer/2022033101/56649d615503460f94a4390f/html5/thumbnails/34.jpg)
Current Opinion in Structural Biology, 2004, 14, 70-75
![Page 35: Roadmap The topics: basic concepts of molecular biology more on Perl overview of the field biological databases and database searching sequence](https://reader030.vdocuments.us/reader030/viewer/2022033101/56649d615503460f94a4390f/html5/thumbnails/35.jpg)
What determines fold?
Anfinsen's experiments in 1957 demonstrated that proteins can fold spontaneously into their native conformations under physiological conditions. This implies that primary structure does indeed determine folding or 3-D structure.
Exceptions exist
  Chaperone proteins assist folding
  Abnormally folded prion proteins can catalyze misfolding of normal prion proteins that then aggregate
![Page 36: Roadmap The topics: basic concepts of molecular biology more on Perl overview of the field biological databases and database searching sequence](https://reader030.vdocuments.us/reader030/viewer/2022033101/56649d615503460f94a4390f/html5/thumbnails/36.jpg)
Other factors
Physical properties of a protein that influence stability and therefore determine its fold:
Rigidity of backbone
Amino acid interaction with water
  Hydropathy index for side chains
Interactions among amino acids
  Electrostatic interactions
  Hydrogen, disulphide bonds
  Volume constraints
![Page 37: Roadmap The topics: basic concepts of molecular biology more on Perl overview of the field biological databases and database searching sequence](https://reader030.vdocuments.us/reader030/viewer/2022033101/56649d615503460f94a4390f/html5/thumbnails/37.jpg)
Understand protein folding
Structure: Given a sequence, what tertiary structure does it adopt?
  Global optimization, Monte Carlo, molecular dynamics, coarse-grained dynamics, etc.
Thermodynamics: Under mutation, does the free energy of the native state change relative to the native sequence?
  MC, MD, free energy methods, etc.
Kinetics: How fast does the protein fold? Does a different sequence fold faster, and why?
  Lattice Monte Carlo, molecular dynamics, coarse-grained dynamics
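The slides name lattice Monte Carlo without describing it. As a rough illustration, the sketch below assumes the standard 2-D HP toy model, where residues are H (hydrophobic) or P (polar), sit on grid points, and each non-bonded H-H contact lowers the energy by 1, together with the Metropolis acceptance rule. The model choice, function names, and example conformations are assumptions of this sketch, not taken from the slides.

```python
import math


def hp_energy(sequence, coords):
    """Energy of a 2-D HP-lattice conformation: -1 per non-bonded H-H contact.

    coords holds one (x, y) grid point per residue. Chain connectivity is
    assumed, not checked; only site overlap is rejected.
    """
    if len(sequence) != len(coords) or len(set(coords)) != len(coords):
        raise ValueError("need one distinct lattice site per residue")
    energy = 0
    for i in range(len(sequence)):
        for j in range(i + 2, len(sequence)):  # skip chain neighbours
            dx = abs(coords[i][0] - coords[j][0])
            dy = abs(coords[i][1] - coords[j][1])
            if sequence[i] == "H" and sequence[j] == "H" and dx + dy == 1:
                energy -= 1
    return energy


def metropolis_accept_probability(delta_e, temperature):
    """Metropolis rule: always accept downhill moves, uphill with exp(-dE/T)."""
    if delta_e <= 0:
        return 1.0
    return math.exp(-delta_e / temperature)


# A 4-residue chain folded into a square versus stretched out in a line.
folded = hp_energy("HPPH", [(0, 0), (1, 0), (1, 1), (0, 1)])
extended = hp_energy("HPPH", [(0, 0), (1, 0), (2, 0), (3, 0)])
print(folded, extended, metropolis_accept_probability(folded - extended, 1.0))
```

A full simulation would repeatedly propose small changes to the conformation and accept or reject each one with this probability.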
![Page 38: Roadmap The topics: basic concepts of molecular biology more on Perl overview of the field biological databases and database searching sequence](https://reader030.vdocuments.us/reader030/viewer/2022033101/56649d615503460f94a4390f/html5/thumbnails/38.jpg)
CASP changed the landscape
Critical Assessment of Structure Prediction competition. Even-numbered years since 1994
  Solved but unpublished structures are posted in May, predictions due in September
  Various categories
    Relation to existing structures, ab initio, homology, fold, etc.
    Partial vs. fully automated approaches
  Produces lots of information about what aspects of the problems are hard, and ends arguments about test sets.
Results show steady improvement, and the value of integrative approaches.