lecture 1-3 topic 1

Upload: anthony-tee

Post on 03-Apr-2018


  • 7/28/2019 Lecture 1-3 Topic 1


    5/27/2013


    TOPIC 1

    GENOME STUDIES

    Dr Choo QC (TOPIC 1) 1

    Genome

    The genome is the total genetic information possessed by an organism, present in every cell, tissue, and organ of the body.

    Every cell contains a complete copy of the instructions, written in the four-letter language of DNA (i.e. A, C, T, G).

    If the genome (DNA molecule) of a typical bacterium were extended, it would be about 2 mm in length. In comparison, the diameter of the bacterium itself is only about 0.001 mm.
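
    The length comparison above can be checked with a rough calculation. The sketch below assumes a rise of 0.34 nm per base pair (the usual textbook value for B-form DNA, not stated on the slide) and uses the 4.6 Mb E. coli genome size from the table in this topic. It gives a length of the same order as the 2 mm quoted.

```python
# Rise per base pair in B-form DNA, in nanometres.
# Assumption: textbook value, not taken from the slides.
RISE_PER_BP_NM = 0.34


def contour_length_mm(base_pairs: int) -> float:
    """Length of a fully extended double helix, in millimetres."""
    return base_pairs * RISE_PER_BP_NM * 1e-6  # convert nm to mm


ecoli_bp = 4_600_000       # E. coli genome size used in this topic
cell_diameter_mm = 0.001   # bacterial cell diameter quoted on the slide

length_mm = contour_length_mm(ecoli_bp)
print(f"extended genome length: {length_mm:.2f} mm")
print(f"genome length / cell diameter: {length_mm / cell_diameter_mm:.0f}")
```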


    Genome

    The amount of protein sequence information in a cell cannot be easily estimated from its genome size because:

    (a) Not all DNA codes for proteins

    - Introns

    - Regulatory regions (e.g. promoters)

    (b) Some genes exist in multiple copies

    (c) Alternative splicing can produce more than one protein from a single gene

    Genome of Selected Organisms

    Organism                    Number of genes    Number of base pairs
                                (approximate)      (× 10^6)

    Escherichia coli            4500               4.6

    Saccharomyces cerevisiae    6000               12.1

    Caenorhabditis elegans      19000              95.5

    Arabidopsis thaliana        25000              117

    Drosophila melanogaster     13000              180

    Homo sapiens                25000              3200
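
    One way to read the table is as gene density (genes per million base pairs). The sketch below is only an illustration using a few rows copied from the table; the function name is made up for this example. It shows that the bacterial genome is far more densely packed with genes than the human genome.

```python
# (organism, approximate number of genes, genome size in Mb),
# copied from selected rows of the table above.
genomes = [
    ("Escherichia coli", 4500, 4.6),
    ("Saccharomyces cerevisiae", 6000, 12.1),
    ("Caenorhabditis elegans", 19000, 95.5),
    ("Homo sapiens", 25000, 3200),
]


def genes_per_mb(genes: int, size_mb: float) -> float:
    """Gene density: number of genes per million base pairs."""
    return genes / size_mb


for name, genes, size_mb in genomes:
    print(f"{name:26s} {genes_per_mb(genes, size_mb):8.1f} genes/Mb")
```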


    Genomics

    Genomics - the study of genomes

    Involves large data sets (e.g. 3 billion base pairs for the human genome)

    Uses high-throughput methods (fast methods for data collection)

    Genomics

    Genomics studies include:

    DNA sequencing

    Genomic library constructions

    PCR amplification and cloning

    Hybridization techniques


    Genomics

    Genome variation within a population

    Transcriptional control of genes

    The proteome (the complete protein content of a cell or organism at a given time)

    Histones: DNA-binding proteins

    Chromatin: DNA-histone complexes

    Nucleosome: eight histone proteins form the core octamer; linker histones act as a clamp that prevents the coiled DNA from detaching from the histone core

    Genes

    An ordered sequence of nucleotides that encodes a specific product

    Physical and functional units of heredity

    Genes

    May be turned on or off (by their regulatory mechanisms) in response to:

    - The environment, e.g. concentration of nutrients and stress

    - Development of the organism

    Bacterial genomes may also have operons: contiguous clusters of several genes whose products catalyze successive steps of a biochemical pathway

    Genes

    Protein-coding sequences make up only about 2% of the human genome

    The remainder consists of non-coding regions

    Function: providing chromosomal structural integrity and regulation - where, when, and in what quantity proteins are made

    The human genome is estimated to contain ~25,000 genes

    [Figure: chromosome 22 (09_25_Chromosome22.jpg)]

    There are 23 chapters, called CHROMOSOMES

    All the chapters are bound together; this is called FOLDINGS

    Each chapter contains several thousand stories, called GENES

    Each story is made up of paragraphs, called EXONS, which are interrupted by advertisements called INTRONS

    Each paragraph is made up of words, called CODONS

    Each word is written in letters called BASES, which are glued together with BONDS

    And this is what makes up the GENOME

    Proteins

    Large, complex molecules made up of chains of small chemical compounds called amino acids

    Perform most life functions

    Make up the majority of cellular structures

    Proteins

    A nucleotide sequence can be translated into an amino acid sequence using the universal genetic code

    The chemical properties that distinguish the 20 different amino acids cause the protein chains to fold up into specific three-dimensional structures that define their particular functions in the cell

    Why Study Proteins?

    Genomes: information

    Proteins: action

    Proteins rely on their regular three-dimensional structure for function. They must have the right shape and chemistry to carry out their biological role

    This means bringing amino acids together not only in a particular sequence, but also in a particular spatial relationship

    Amino Acids

    Monomeric building blocks of proteins

    Joined by a covalent bond (the peptide bond)

    Twenty different amino acids

    - Same general structure

    - Differ in side chain (R group)

    - All organisms have the same set of 20

    Different activities and shapes of proteins are due to different amino acid sequences

    [Figure: dehydration synthesis joins the carboxyl group of one amino acid to the amino group of the next, forming a PEPTIDE BOND]

    What Amino Acids Look Like

    [Figure: general structure of an amino acid. A central carbon carries a hydrogen (H), an amino group, a carboxyl group and a side chain (R)]

    A different side chain (R group) gives different chemical and physical properties

    Three Classes of Amino Acids

    - Classification based on polarity

    (A) Nonpolar (hydrophobic): G, A, V, L, I, M, F, W, P

    (B) Polar uncharged: S, T, C, Y, N, Q

    (C) Charged: acidic (D, E) and basic (K, R, H)
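
    The classification above can be written as a small lookup table. This is a minimal sketch using the one-letter codes from the slide; the names `CLASSES`, `classify` and `composition` are made up for the example, and the test sequence VHLTPEEK is the first eight residues shown in the sickle cell example of this topic.

```python
from collections import Counter

# One-letter amino acid codes grouped as on the slide.
CLASSES = {
    "nonpolar": "GAVLIMFWP",
    "polar uncharged": "STCYNQ",
    "acidic": "DE",
    "basic": "KRH",
}

# Invert the grouping: residue letter -> class name.
RESIDUE_CLASS = {aa: name for name, letters in CLASSES.items() for aa in letters}


def classify(residue: str) -> str:
    """Return the polarity class of a one-letter amino acid code."""
    return RESIDUE_CLASS[residue.upper()]


def composition(sequence: str) -> Counter:
    """Count how many residues of each class a sequence contains."""
    return Counter(classify(aa) for aa in sequence)


print(composition("VHLTPEEK"))
```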

    THREE-DIMENSIONAL STRUCTURES OF PROTEINS

    Primary Structure

    Determines the 2°, 3° and 4° structures

    EXAMPLE: Sickle cell anemia - a single amino acid change in hemoglobin is related to the disease

    Sickle Cell Hemoglobin (HbS)

    Position          1   2   3   4   5   6   7   8

    Normal mRNA      GUG CAC CUG ACU CCU GAG GAG AAG

    Normal protein   val his leu thr pro GLU glu lys

    Mutation (in DNA)

    Mutant mRNA      GUG CAC CUG ACU CCU GUG GAG AAG

    Mutant protein   val his leu thr pro VAL glu lys

    NOTE: Glu is a negatively charged amino acid and it is replaced by Val, which has no charge
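
    The effect of the mutation can be reproduced by translating both mRNA fragments with the standard genetic code. The sketch below builds the codon table from the conventional 64-letter string (bases ordered U, C, A, G); the helper names `translate` and `differences` are made up for this illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order U, C, A, G.
# '*' marks a stop codon.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(mrna: str) -> str:
    """Translate an mRNA (spaces allowed between codons) into one-letter amino acids."""
    seq = mrna.replace(" ", "").upper()
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


def differences(a: str, b: str):
    """List (position, residue_in_a, residue_in_b) where two proteins differ (1-based)."""
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b), start=1) if x != y]


normal = translate("GUG CAC CUG ACU CCU GAG GAG AAG")
mutant = translate("GUG CAC CUG ACU CCU GUG GAG AAG")
print(normal, mutant, differences(normal, mutant))
```

    Only position 6 differs (E, glutamate, becomes V, valine), matching the table above.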

    Sickle Cell Hemoglobin (HbS)

    Caused by a point mutation (a single A to T change)

    Amino acid substitution (Glu to Val) at the 6th position of β-globin's 146-amino-acid chain

    Low-oxygen conditions cause Hb to aggregate into rod-shaped polymers

    This distorts the shape of the RBC into a sickle shape

    Primary Structure

    The amino acid sequence of the polypeptide chain

    Primary structure determines final shape and function

    Secondary Structure

    Repeated coiling or folding of the polypeptide by hydrogen bonding

    Local description of structure

    Major types:

    - α-helix

    - β-sheets

    - Loops and turns

    [Figure: 2° structure: (i) α-helix, (ii) β-sheet, (iii) loops and turns]

    α-Helix

    The carbonyl oxygen (C=O) of residue n hydrogen-bonds with the amide hydrogen (N-H) of the residue four positions further along the chain (n + 4)

    A common secondary structure in both fibrous and globular proteins

    Side chain groups point outwards from the helix

    Amino acids with bulky side chains are less common in the α-helix

    Glycine and proline destabilize the α-helix

    β-Strand and β-Sheet

    Strands may be parallel or antiparallel

    Antiparallel β-sheets are more stable

    Side chains point alternately above and below the plane of the β-sheet

    β-sheets are common motifs in proteins

    Loops and turns = non-repetitive structure

    Turns

    - Loops with < 5 amino acids are called turns

    - Allow the peptide chain to reverse direction

    - Proline and glycine are prevalent in β-turns

    Loops

    - Usually contain hydrophilic residues

    - Connect α-helices and β-sheets

    3° Structure

    Third level of protein organization

    3-D arrangement

    Types of tertiary interactions

    (A) Bonds: covalent, ionic, hydrogen

    (B) Hydrophobic interactions

    4° Structure

    Describes the organization of subunits in a protein with multiple subunits

    Subunits are held together by non-covalent interactions

    [Figure label: α2β2]

    Proteins

    Much new protein sequence data is now determined by translation of DNA sequences, rather than by direct sequencing of proteins (an expensive procedure)

    However, one should remember that a protein sequence translated from the genome sequence is hypothetical until it is verified experimentally

    Proteomics

    Proteome - the complete set of proteins produced within a cell

    Proteomics - the study of the proteome

    The proteome of an organism changes in response to environmental stimuli and conditions (such as heat shock or growth)

    The rate of synthesis of different proteins varies among tissues, cell types and states of activity

    Picking out genes from genomes

    Bioinformatics software can assist scientists in finding novel genes in a genome

    The software identifies open reading frames (ORFs): regions of DNA sequence that begin with an initiation codon (ATG) and end with a stop codon

    An ORF is a potential protein-coding region
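
    The ORF definition above can be turned into a very small scanner. This is a simplified sketch, not how any particular gene-finding package works: it assumes the standard stop codons TAA, TAG and TGA, scans only the three forward reading frames (a real tool would also scan the reverse complement), and uses a made-up toy sequence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 2):
    """Return (start, end) index pairs of ORFs on the forward strand.

    An ORF starts at an ATG and runs in-frame to the first stop codon.
    `end` is exclusive and includes the stop codon; `min_codons` counts
    the start and stop codons as well.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first in-frame start codon
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None                   # look for the next ORF
    return orfs


toy = "CCATGAAATGA"  # hypothetical sequence for illustration
for start, end in find_orfs(toy):
    print(start, end, toy[start:end])
```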

    Genome Sequencing Projects

    1977 First viral genome

    Sanger et al. sequence bacteriophage φX174

    This virus has 5386 base pairs (encoding 11 genes). Note: accession J02482

    1981 Human mitochondrial genome

    16,500 base pairs

    1986 Chloroplast genome

    156,000 base pairs (most are 120 kb to 200 kb)

    1995 First genome of a free-living organism, the bacterium Haemophilus influenzae

    1997 More bacteria and archaea

    Escherichia coli: 4.6 Mb; 4200 proteins (38% of unknown function)

    1998 First multicellular organism

    The nematode Caenorhabditis elegans: 97 Mb; 19,000 genes

    1999 First human chromosome

    Chromosome 22 (49 Mb, 673 genes)

    Significance and Importance of Genome Studies

    Human Genome Project (and others)

    Potential benefits

    (A) Molecular medicine:

    Improved, more accurate diagnosis of disease

    Earlier detection of genetic predispositions to disease

    - Ability to assess risk for certain diseases, e.g. cancer, Type II diabetes, heart disease

    Drugs designed to target specific gene products that cause disease

    Gene therapy and control systems for drugs

    - Replacement of defective genes for certain diseases

    Pharmacogenomics: "custom drugs"

    - Drug therapy based on genotype

    Human Genome Project (and others)

    (B) Archaeology, anthropology, evolution, and human migration

    Study evolution through germline mutations in lineages

    Study migration of different population groups based on female genetic inheritance

    Study mutations on the Y chromosome to trace the lineage and migration of males

    Compare breakpoints in the evolution of mutations with ages of populations and historical events

    Human Genome Project (and others)

    (C) DNA forensics (identification)

    Identify potential suspects whose DNA may match evidence left at crime scenes

    Exonerate persons wrongly accused of crimes

    Identify crime and catastrophe victims

    Establish paternity and other family relationships

    Identify endangered and protected species as an aid to wildlife officials (could be used for prosecuting poachers)

    Detect bacteria and other organisms that may pollute air, water, soil, and food

    Determine pedigree for seed or livestock breeds

    Human Genome Project (and others)

    (D) Agriculture, livestock breeding, and bioprocessing

    Disease-, insect-, and drought-resistant crops

    Healthier, more productive, disease-resistant farm animals

    More nutritious produce

    Biopesticides

    Edible vaccines incorporated into food products

    New environmental cleanup uses for plants

    Genomes of Prokaryotes

    Gene Regulation in Bacteria

    The Operon

    The Operator

    Operons in E. coli

    (a) Lac operon (Inducible Operon)

    (a) Lactose absent, repressor active, operon off

    [Figure labels: regulatory gene (lacI), promoter, operator, lacZ, DNA, mRNA (5' to 3'), protein, active repressor, RNA polymerase, no RNA made]

    (b) Lactose present, repressor inactive, operon on

    [Figure labels: lacI, lac operon (lacZ, lacY, lacA), DNA, mRNA (5' to 3'), RNA polymerase, allolactose (inducer), inactive repressor, proteins: β-galactosidase, permease, transacetylase]

    (b) Trp operon (Repressible Operon)

    A repressible operon that is on by default

    Turns off only in the presence of its end product, the amino acid tryptophan

    Encodes the enzymes that synthesize tryptophan

    The structural genes within the tryptophan operon code for repressible enzymes

    (a) Tryptophan absent, repressor inactive, operon on

    (b) Tryptophan present, repressor active, operon off

    [Figure labels: DNA, mRNA, protein, tryptophan (corepressor), active repressor, no RNA made]

    Inducible & Repressible Enzyme

    Inducible enzymes usually function in catabolic pathways - synthesis is induced by a chemical signal

    Repressible enzymes usually function in anabolic pathways - synthesis is repressed by high levels of the end-product

    Regulation of the trp and lac operons involves negative control of genes because the operons are switched off by the active form of the repressor
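
    The negative-control logic described above can be summarized as two tiny truth tables. This is a deliberately simplified sketch that models only the repressor; it ignores other layers of regulation that these slides do not cover, and the function names are made up for the illustration.

```python
def lac_operon_on(lactose_present: bool) -> bool:
    """Inducible operon: the repressor is active unless the inducer is present."""
    # Allolactose (made when lactose is present) inactivates the repressor.
    repressor_active = not lactose_present
    return not repressor_active


def trp_operon_on(tryptophan_present: bool) -> bool:
    """Repressible operon: the repressor is active only with its corepressor."""
    # Tryptophan (the end product) acts as corepressor and activates the repressor.
    repressor_active = tryptophan_present
    return not repressor_active


for present in (False, True):
    print(f"lactose={present}: lac on={lac_operon_on(present)}; "
          f"tryptophan={present}: trp on={trp_operon_on(present)}")
```

    In both cases the operon is off exactly when the repressor is active, which is what makes both examples of negative control.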


    Genome of prokaryotes

    Large circular, double-stranded DNA

    Usually < 5 Mbp

    May contain plasmids

    Environment-specific genes are carried on plasmids and other types of mobile genetic elements

    Genomes of prokaryotes

    The protein-coding regions of bacterial genomes:

    Do not contain introns

    Are partially organized into operons

    - Genes that are located alongside one another are transcribed into a single mRNA molecule, under common transcriptional control

    Genomes of prokaryotes

    In bacteria, the genes of many operons code for proteins with related functions

    For instance, successive genes in the trp operon of E. coli code for proteins that catalyze successive steps in the biosynthesis of tryptophan

    Genome of Escherichia coli

    Contains 4,639,221 bp in a single circular DNA molecule, with no plasmids

    Relatively gene dense

    Genes coding for proteins or structural RNAs occupy ~89% of the sequence

    Average size of an ORF is 317 amino acids

    Genome of Escherichia coli

    Most of the transcription units contain only one gene, but E. coli also has operons, in which a set of genes is grouped in one place

    It is estimated that the E. coli genome contains 2584 operons

    Operons vary in size, although few contain more than five genes

    Genes within operons tend to have related functions

    Genome of Escherichia coli

    The largest class of proteins is the enzymes: approximately 30% of total genes

    Many enzymatic functions are shared by more than one protein; such proteins may have arisen by duplication, or may differ in specificity, regulation or intracellular location

    Genome of Escherichia coli

    4288 protein-coding genes

    122 structural RNA genes

    Non-coding repeat sequences

    Regulatory elements

    Transposable elements

    Prophage remnants

    Patches of unusual composition - likely to be foreign elements introduced by horizontal gene transfer
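
    The figures quoted for E. coli in this topic can be cross-checked against each other. The sketch below multiplies the number of protein-coding genes (4288) by the average ORF size (317 amino acids, three bases each) and divides by the genome size (4,639,221 bp). It is a rough estimate only: it ignores stop codons and RNA genes, and assumes genes do not overlap.

```python
GENOME_BP = 4_639_221        # genome size quoted in this topic
PROTEIN_GENES = 4288         # protein-coding genes quoted in this topic
AVG_ORF_AA = 317             # average ORF size in amino acids
BASES_PER_CODON = 3


def protein_coding_fraction(genes: int, avg_aa: int, genome_bp: int) -> float:
    """Approximate fraction of the genome covered by protein-coding sequence."""
    return genes * avg_aa * BASES_PER_CODON / genome_bp


fraction = protein_coding_fraction(PROTEIN_GENES, AVG_ORF_AA, GENOME_BP)
print(f"protein-coding fraction: {fraction:.1%}")
```

    The result is a little under 90%, in line with the ~89% quoted for protein and structural RNA genes combined.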
