![Page 1: Genes: Regulation and Structure Many slides from various sources, including S. Batzoglou,](https://reader034.vdocuments.us/reader034/viewer/2022051401/56649ea35503460f94ba7a1c/html5/thumbnails/1.jpg)
Genes: Regulation and Structure
Many slides from various sources, including S. Batzoglou,
![Page 2: Genes: Regulation and Structure Many slides from various sources, including S. Batzoglou,](https://reader034.vdocuments.us/reader034/viewer/2022051401/56649ea35503460f94ba7a1c/html5/thumbnails/2.jpg)
Cells respond to environment
Heat
Food supply
Responds to environmental conditions
Various external messages
![Page 3: Genes: Regulation and Structure Many slides from various sources, including S. Batzoglou,](https://reader034.vdocuments.us/reader034/viewer/2022051401/56649ea35503460f94ba7a1c/html5/thumbnails/3.jpg)
Genome is fixed – Cells are dynamic
• A genome is static
  Every cell in our body has a copy of the same genome
• A cell is dynamic
  Responds to external conditions
  Most cells follow a cell cycle of division
• Cells differentiate during development
![Page 4: Genes: Regulation and Structure Many slides from various sources, including S. Batzoglou,](https://reader034.vdocuments.us/reader034/viewer/2022051401/56649ea35503460f94ba7a1c/html5/thumbnails/4.jpg)
Gene regulation
• Gene regulation is responsible for the cell's dynamic behavior
• Gene expression varies according to:
  Cell type
  Cell cycle
  External conditions
  Location
![Page 5: Genes: Regulation and Structure Many slides from various sources, including S. Batzoglou,](https://reader034.vdocuments.us/reader034/viewer/2022051401/56649ea35503460f94ba7a1c/html5/thumbnails/5.jpg)
Where gene regulation takes place
• Opening of chromatin
• Transcription
• Translation
• Protein stability
• Protein modifications
![Page 6: Genes: Regulation and Structure Many slides from various sources, including S. Batzoglou,](https://reader034.vdocuments.us/reader034/viewer/2022051401/56649ea35503460f94ba7a1c/html5/thumbnails/6.jpg)
Transcriptional Regulation
• Strongest regulation happens during transcription
• Best place to regulate: No energy wasted making intermediate products
• However, slowest response time
After a receptor notices a change:
1. Cascade message to nucleus
2. Open chromatin & bind transcription factors
3. Recruit RNA polymerase and transcribe
4. Splice mRNA and send to cytoplasm
5. Translate into protein
![Page 7: Genes: Regulation and Structure Many slides from various sources, including S. Batzoglou,](https://reader034.vdocuments.us/reader034/viewer/2022051401/56649ea35503460f94ba7a1c/html5/thumbnails/7.jpg)
Transcription Factors Binding to DNA
Transcription regulation:
Certain transcription factors bind DNA
Binding recognizes DNA substrings:
Regulatory motifs
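Treating a regulatory motif as a DNA substring suggests the simplest possible scan: look for exact occurrences on both strands. The sketch below assumes exact matching only (real binding sites are degenerate), and the `promoter` string is a made-up example, not a real sequence.

```python
def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[b] for b in reversed(seq))


def find_motif(seq, motif):
    """Return (position, strand) for every exact occurrence of motif.

    Positions are 0-based and refer to the forward strand.
    """
    hits = []
    rc = reverse_complement(motif)
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        if window == motif:
            hits.append((i, "+"))
        if window == rc and rc != motif:
            hits.append((i, "-"))
    return hits


# Hypothetical promoter fragment, invented for illustration only.
promoter = "GGCCAATCTGATTATAAAGGCATTGGCC"
print(find_motif(promoter, "CCAAT"))
print(find_motif(promoter, "TATAAA"))
```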
![Page 8: Genes: Regulation and Structure Many slides from various sources, including S. Batzoglou,](https://reader034.vdocuments.us/reader034/viewer/2022051401/56649ea35503460f94ba7a1c/html5/thumbnails/8.jpg)
Promoter and Enhancers
• Promoter necessary to start transcription
• Enhancers can affect transcription from afar
![Page 9: Genes: Regulation and Structure Many slides from various sources, including S. Batzoglou,](https://reader034.vdocuments.us/reader034/viewer/2022051401/56649ea35503460f94ba7a1c/html5/thumbnails/9.jpg)
Regulation of Genes
Gene
Regulatory Element
RNA polymerase (Protein)
Transcription Factor (Protein)
DNA
![Page 10: Genes: Regulation and Structure Many slides from various sources, including S. Batzoglou,](https://reader034.vdocuments.us/reader034/viewer/2022051401/56649ea35503460f94ba7a1c/html5/thumbnails/10.jpg)
Regulation of Genes
Gene
RNA polymerase
Transcription Factor (Protein)
Regulatory Element
DNA
![Page 11: Genes: Regulation and Structure Many slides from various sources, including S. Batzoglou,](https://reader034.vdocuments.us/reader034/viewer/2022051401/56649ea35503460f94ba7a1c/html5/thumbnails/11.jpg)
Regulation of Genes
Gene
RNA polymerase
Transcription Factor
Regulatory Element
DNA
New protein
![Page 12: Genes: Regulation and Structure Many slides from various sources, including S. Batzoglou,](https://reader034.vdocuments.us/reader034/viewer/2022051401/56649ea35503460f94ba7a1c/html5/thumbnails/12.jpg)
Example: A human heat shock protein
• TATA box: positioning transcription start
• TATA, CCAAT: constitutive transcription
• GRE: glucocorticoid response
• MRE: metal response
• HSE: heat shock element
[Figure: promoter of heat shock gene hsp70, drawn from position -158 to 0 next to the GENE; labeled elements: TATA, SP1, CCAAT, AP2, HSE, AP2, CCAAT, SP1]
![Page 13: Genes: Regulation and Structure Many slides from various sources, including S. Batzoglou,](https://reader034.vdocuments.us/reader034/viewer/2022051401/56649ea35503460f94ba7a1c/html5/thumbnails/13.jpg)
Gene expression
DNA: CCTGAGCCAACTATTGATGAA
  ↓ transcription
RNA: CCUGAGCCAACUAUUGAUGAA
  ↓ translation
Protein: PEPTIDE
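The slide's example can be reproduced in a few lines. This is a minimal sketch: `transcribe` simply rewrites the coding strand with U in place of T (a simplification of what RNA polymerase does with the template strand), and the codon table is the standard genetic code built from a compact string. The function names are illustrative.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(dna):
    """Coding-strand DNA -> mRNA (T replaced by U)."""
    return dna.upper().replace("T", "U")


def translate(rna):
    """Translate mRNA codon by codon, stopping at the first stop codon."""
    dna = rna.upper().replace("U", "T")
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":  # stop codon
            break
        protein.append(aa)
    return "".join(protein)


rna = transcribe("CCTGAGCCAACTATTGATGAA")
print(rna)
print(translate(rna))
```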
![Page 14: Genes: Regulation and Structure Many slides from various sources, including S. Batzoglou,](https://reader034.vdocuments.us/reader034/viewer/2022051401/56649ea35503460f94ba7a1c/html5/thumbnails/14.jpg)
The Genetic Code
![Page 15: Genes: Regulation and Structure Many slides from various sources, including S. Batzoglou,](https://reader034.vdocuments.us/reader034/viewer/2022051401/56649ea35503460f94ba7a1c/html5/thumbnails/15.jpg)
Eukaryotes vs Prokaryotes
• Eukaryotic cells are characterized by membrane-bound compartments, which are absent in prokaryotes.
• “Typical” human & bacterial cells drawn to scale.
BIOS Scientific Publishers Ltd, 1999
Brown Fig 2.1
![Page 16: Genes: Regulation and Structure Many slides from various sources, including S. Batzoglou,](https://reader034.vdocuments.us/reader034/viewer/2022051401/56649ea35503460f94ba7a1c/html5/thumbnails/16.jpg)
Prokaryotic genes – searching for ORFs.
- Small genomes have high gene density
  Haemophilus influenzae: 85% genic
- No introns
- Operons
  One transcript, many genes
- Open reading frame (ORF): a contiguous set of codons that starts with a Met codon and ends with a stop codon.
![Page 17: Genes: Regulation and Structure Many slides from various sources, including S. Batzoglou,](https://reader034.vdocuments.us/reader034/viewer/2022051401/56649ea35503460f94ba7a1c/html5/thumbnails/17.jpg)
Example of ORFs.
There are six possible reading frames in each sequence: three for each direction of transcription.
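A six-frame ORF scan can be sketched as follows. It assumes the definition used on the previous slide (ATG through the first in-frame stop), reports only ORFs that are closed by a stop codon, and uses an invented test sequence; function names are illustrative.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq):
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def orfs_in_frame(seq, frame):
    """Yield (start, end) of each ATG...stop ORF in one reading frame.

    Coordinates are 0-based on `seq`; `end` is exclusive and includes the stop.
    """
    start = None
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if start is None and codon == "ATG":
            start = i
        elif start is not None and codon in STOP_CODONS:
            yield start, i + 3
            start = None


def six_frame_orfs(seq):
    """Return (strand, frame, orf_sequence) for all six reading frames."""
    seq = seq.upper()
    found = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            for start, end in orfs_in_frame(s, frame):
                found.append((strand, frame, s[start:end]))
    return found


# Invented sequence with one ORF on each strand.
print(six_frame_orfs("CCATGAAATGACCTTAGGGCAT"))
```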
![Page 18: Genes: Regulation and Structure Many slides from various sources, including S. Batzoglou,](https://reader034.vdocuments.us/reader034/viewer/2022051401/56649ea35503460f94ba7a1c/html5/thumbnails/18.jpg)
Eukaryotes vs Prokaryotes
• Eukaryotic cells are characterized by membrane-bound compartments, which are absent in prokaryotes.
• “Typical” human & bacterial cells drawn to scale.
BIOS Scientific Publishers Ltd, 1999
Brown Fig 2.1
![Page 19: Genes: Regulation and Structure Many slides from various sources, including S. Batzoglou,](https://reader034.vdocuments.us/reader034/viewer/2022051401/56649ea35503460f94ba7a1c/html5/thumbnails/19.jpg)
Gene structure
exon1 intron1 exon2 intron2 exon3
transcription
splicing
translation
exon = protein-coding
intron = non-coding
Codon: A triplet of nucleotides that is converted to one amino acid
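Splicing can be illustrated as joining exon intervals and discarding what lies between them. The sketch below assumes exon coordinates are already known (0-based, end exclusive); the pre-mRNA string is invented, with exons in upper case and introns in lower case purely for readability.

```python
def splice(pre_mrna, exons):
    """Concatenate the exon intervals [(start, end), ...] in genomic order."""
    return "".join(pre_mrna[start:end] for start, end in sorted(exons))


def introns_from_exons(exons):
    """Return the intervals between consecutive exons."""
    exons = sorted(exons)
    return [(prev_end, next_start)
            for (_, prev_end), (next_start, _) in zip(exons, exons[1:])]


# Invented example: three exons separated by two introns.
pre_mrna = "ATGGCC" + "gtaagtttcag" + "AAATTT" + "gtgagccctag" + "GGGTAA"
exons = [(0, 6), (17, 23), (34, 40)]

print(splice(pre_mrna, exons))
print(introns_from_exons(exons))
```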
![Page 20: Genes: Regulation and Structure Many slides from various sources, including S. Batzoglou,](https://reader034.vdocuments.us/reader034/viewer/2022051401/56649ea35503460f94ba7a1c/html5/thumbnails/20.jpg)
Gene structure
exon1 intron1 exon2 intron2 exon3
transcription
splicing
translation
exon = coding
intron = non-coding
![Page 21: Genes: Regulation and Structure Many slides from various sources, including S. Batzoglou,](https://reader034.vdocuments.us/reader034/viewer/2022051401/56649ea35503460f94ba7a1c/html5/thumbnails/21.jpg)
Finding genes
Start codon: ATG
5’ 3’
Exon 1 Intron 1 Exon 2 Intron 2 Exon 3
Stop codon: TAG/TGA/TAA
Splice sites
![Page 22: Genes: Regulation and Structure Many slides from various sources, including S. Batzoglou,](https://reader034.vdocuments.us/reader034/viewer/2022051401/56649ea35503460f94ba7a1c/html5/thumbnails/22.jpg)
![Page 23: Genes: Regulation and Structure Many slides from various sources, including S. Batzoglou,](https://reader034.vdocuments.us/reader034/viewer/2022051401/56649ea35503460f94ba7a1c/html5/thumbnails/23.jpg)
![Page 24: Genes: Regulation and Structure Many slides from various sources, including S. Batzoglou,](https://reader034.vdocuments.us/reader034/viewer/2022051401/56649ea35503460f94ba7a1c/html5/thumbnails/24.jpg)
[Figure: gene sequence with highlighted signals: atg, tga, ggtgag (three times), caggtg, cagatg, cagttg, caggccggtgag]
![Page 25: Genes: Regulation and Structure Many slides from various sources, including S. Batzoglou,](https://reader034.vdocuments.us/reader034/viewer/2022051401/56649ea35503460f94ba7a1c/html5/thumbnails/25.jpg)
0. We can sequence the mRNA
• Expressed Sequence Tag (EST) sequencing is expensive
• It has a nonzero false positive rate (aberrant splicing)
• The method sequences all RNAs, not just those that code for proteins
• This is difficult for rare genes (those that are expressed rarely or in low quantities)
• Still this is an invaluable source of information (when available)
![Page 26: Genes: Regulation and Structure Many slides from various sources, including S. Batzoglou,](https://reader034.vdocuments.us/reader034/viewer/2022051401/56649ea35503460f94ba7a1c/html5/thumbnails/26.jpg)
Biology of Splicing
(http://genes.mit.edu/chris/)
![Page 27: Genes: Regulation and Structure Many slides from various sources, including S. Batzoglou,](https://reader034.vdocuments.us/reader034/viewer/2022051401/56649ea35503460f94ba7a1c/html5/thumbnails/27.jpg)
1. Consensus splice sites
(http://www-lmmb.ncifcrf.gov/~toms/sequencelogo.html)
Donor: 7.9 bits
Acceptor: 9.4 bits
(Stephens & Schneider, 1996)
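Bit values like these are usually totals of per-position information content, as drawn in a sequence logo. A minimal sketch of that calculation is given below, assuming a uniform background (2 bits maximum per DNA position) and ignoring small-sample corrections; the example columns are the percentages for positions -2 to 2 of the donor-site table at the end of this transcript, so the printed total covers only those five positions.

```python
import math


def column_information(counts):
    """Information content (bits) of one alignment column: 2 - entropy."""
    total = sum(counts)
    entropy = 0.0
    for c in counts:
        if c > 0:
            p = c / total
            entropy -= p * math.log2(p)
    return 2.0 - entropy


def site_information(columns):
    """Sum the per-column information over all positions of a site."""
    return sum(column_information(col) for col in columns)


# Columns are (A, C, G, T) percentages for donor positions -2, -1, 0, 1, 2.
donor_columns = [
    (60, 15, 12, 13),
    (9, 5, 78, 8),
    (0, 0, 99, 1),
    (1, 1, 0, 98),
    (54, 2, 41, 3),
]
for col in donor_columns:
    print(col, round(column_information(col), 2))
print("total over these positions:", round(site_information(donor_columns), 2))
```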
![Page 28: Genes: Regulation and Structure Many slides from various sources, including S. Batzoglou,](https://reader034.vdocuments.us/reader034/viewer/2022051401/56649ea35503460f94ba7a1c/html5/thumbnails/28.jpg)
2. Recognize “coding bias”
• Each exon can be in one of three frames
  ag-gattacagattacagattaca-gtaag Frame 0
  ag-gattacagattacagattaca-gtaag Frame 1
  ag-gattacagattacagattaca-gtaag Frame 2
  Frame of next exon depends on how many nucleotides are left over from previous exon
• Codons “tag”, “tga”, and “taa” are STOP
  No STOP codon appears in-frame until the end of the gene
  A stretch with no in-frame STOP is called an open reading frame (ORF)
• Different codons appear with different frequencies: coding bias
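The two bookkeeping ideas on this slide, exon frame and in-frame stops, can be sketched briefly. Here the "frame" of an exon is taken to be the number of nucleotides left over from the preceding exons (total preceding coding length modulo 3); that convention and the function names are assumptions made for illustration.

```python
STOP_CODONS = {"TAG", "TGA", "TAA"}


def exon_phases(exon_lengths):
    """For each exon, the number of leftover nucleotides carried into it."""
    phases = []
    consumed = 0
    for length in exon_lengths:
        phases.append(consumed % 3)
        consumed += length
    return phases


def first_inframe_stop(seq, frame=0):
    """Index of the first stop codon in the given frame, or None if open."""
    seq = seq.upper()
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None


print(exon_phases([7, 5, 6]))
# The slide's "gattaca" repeat is open in all three frames.
print([first_inframe_stop("gattacagattaca", f) for f in range(3)])
```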
![Page 29: Genes: Regulation and Structure Many slides from various sources, including S. Batzoglou,](https://reader034.vdocuments.us/reader034/viewer/2022051401/56649ea35503460f94ba7a1c/html5/thumbnails/29.jpg)
2. Recognize “coding bias”
Amino Acid      SLC   DNA codons
Isoleucine      I     ATT, ATC, ATA
Leucine         L     CTT, CTC, CTA, CTG, TTA, TTG
Valine          V     GTT, GTC, GTA, GTG
Phenylalanine   F     TTT, TTC
Methionine      M     ATG
Cysteine        C     TGT, TGC
Alanine         A     GCT, GCC, GCA, GCG
Glycine         G     GGT, GGC, GGA, GGG
Proline         P     CCT, CCC, CCA, CCG
Threonine       T     ACT, ACC, ACA, ACG
Serine          S     TCT, TCC, TCA, TCG, AGT, AGC
Tyrosine        Y     TAT, TAC
Tryptophan      W     TGG
Glutamine       Q     CAA, CAG
Asparagine      N     AAT, AAC
Histidine       H     CAT, CAC
Glutamic acid   E     GAA, GAG
Aspartic acid   D     GAT, GAC
Lysine          K     AAA, AAG
Arginine        R     CGT, CGC, CGA, CGG, AGA, AGG
Stop codons     Stop  TAA, TAG, TGA
Can map 61 non-stop codons to frequencies & take log-odds ratios
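One way to turn codon frequencies into a coding score is sketched below. It assumes in-frame training sequences, add-one pseudocounts, and a uniform background over the 61 sense codons; the tiny training sequence is invented and far too small for real use.

```python
import math
from collections import Counter
from itertools import product

STOP_CODONS = {"TAA", "TAG", "TGA"}
SENSE_CODONS = [
    "".join(c) for c in product("ACGT", repeat=3)
    if "".join(c) not in STOP_CODONS
]


def codon_frequencies(in_frame_seqs, pseudocount=1.0):
    """Estimate sense-codon frequencies from in-frame coding sequences."""
    counts = Counter({codon: pseudocount for codon in SENSE_CODONS})
    for seq in in_frame_seqs:
        for i in range(0, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if codon in counts:  # skips stop codons and non-ACGT characters
                counts[codon] += 1
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()}


def log_odds_score(seq, coding_freq, background_freq):
    """Sum of log2(coding / background) over the codons of seq (frame 0)."""
    score = 0.0
    for i in range(0, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon in coding_freq:
            score += math.log2(coding_freq[codon] / background_freq[codon])
    return score


coding = codon_frequencies(["GCCGCCGCCAAG"])       # invented training data
background = {c: 1 / 61 for c in SENSE_CODONS}     # assumed uniform background
print(log_odds_score("GCCGCC", coding, background))
print(log_odds_score("TTTTTT", coding, background))
```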
![Page 30: Genes: Regulation and Structure Many slides from various sources, including S. Batzoglou,](https://reader034.vdocuments.us/reader034/viewer/2022051401/56649ea35503460f94ba7a1c/html5/thumbnails/30.jpg)
3. Genes are “conserved”
![Page 31: Genes: Regulation and Structure Many slides from various sources, including S. Batzoglou,](https://reader034.vdocuments.us/reader034/viewer/2022051401/56649ea35503460f94ba7a1c/html5/thumbnails/31.jpg)
Approaches to gene finding
• Homology: Procrustes
• Ab initio: Genscan, Genie, GeneID
• Comparative: TBLASTX, Rosetta
• Hybrids: GenomeScan, GenieEST, Twinscan, SLAM…
![Page 32: Genes: Regulation and Structure Many slides from various sources, including S. Batzoglou,](https://reader034.vdocuments.us/reader034/viewer/2022051401/56649ea35503460f94ba7a1c/html5/thumbnails/32.jpg)
HMMs for single species gene finding: Generalized HMMs
![Page 33: Genes: Regulation and Structure Many slides from various sources, including S. Batzoglou,](https://reader034.vdocuments.us/reader034/viewer/2022051401/56649ea35503460f94ba7a1c/html5/thumbnails/33.jpg)
HMMs for gene finding
GTCAGAGTAGCAAAGTAGACACTCCAGTAACGC
intergene exon intron exon intron exon intergene
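Labeling each base with a hidden state such as intergene, exon, or intron is a decoding problem, commonly solved with the Viterbi algorithm. The sketch below is a toy three-state HMM: every probability in it is invented for illustration and none comes from the slides or from any real gene finder.

```python
import math

STATES = ["intergene", "exon", "intron"]
# All parameters below are made-up illustration values.
START = {"intergene": 1.0, "exon": 0.0, "intron": 0.0}
TRANS = {
    "intergene": {"intergene": 0.90, "exon": 0.10, "intron": 0.00},
    "exon":      {"intergene": 0.05, "exon": 0.85, "intron": 0.10},
    "intron":    {"intergene": 0.00, "exon": 0.10, "intron": 0.90},
}
EMIT = {
    "intergene": {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
    "exon":      {"A": 0.20, "C": 0.30, "G": 0.30, "T": 0.20},
    "intron":    {"A": 0.30, "C": 0.20, "G": 0.20, "T": 0.30},
}


def log(p):
    return math.log(p) if p > 0 else float("-inf")


def viterbi(seq):
    """Return the most probable state path for seq under the toy model."""
    scores = {s: log(START[s]) + log(EMIT[s][seq[0]]) for s in STATES}
    backpointers = []
    for base in seq[1:]:
        new_scores, ptr = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda r: scores[r] + log(TRANS[r][s]))
            new_scores[s] = scores[prev] + log(TRANS[prev][s]) + log(EMIT[s][base])
            ptr[s] = prev
        scores = new_scores
        backpointers.append(ptr)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: scores[s])
    path = [state]
    for ptr in reversed(backpointers):
        state = ptr[state]
        path.append(state)
    return path[::-1]


seq = "GTCAGAGTAGCAAAGTAGACACTCCAGTAACGC"
for base, state in zip(seq, viterbi(seq)):
    print(base, state)
```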
![Page 34: Genes: Regulation and Structure Many slides from various sources, including S. Batzoglou,](https://reader034.vdocuments.us/reader034/viewer/2022051401/56649ea35503460f94ba7a1c/html5/thumbnails/34.jpg)
GHMM for gene finding
[Figure: DNA sequence divided into segments labeled Exon1, Exon2, Exon3, with the length of a segment marked as its duration]
![Page 35: Genes: Regulation and Structure Many slides from various sources, including S. Batzoglou,](https://reader034.vdocuments.us/reader034/viewer/2022051401/56649ea35503460f94ba7a1c/html5/thumbnails/35.jpg)
Observed duration times
![Page 36: Genes: Regulation and Structure Many slides from various sources, including S. Batzoglou,](https://reader034.vdocuments.us/reader034/viewer/2022051401/56649ea35503460f94ba7a1c/html5/thumbnails/36.jpg)
Better way to do it: negative binomial
• EasyGene: prokaryotic gene-finder (Larsen TS, Krogh A)
• Negative binomial with n = 3
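To see why a negative binomial can fit durations better, compare it with the geometric distribution that a single self-looping HMM state implies. The sketch assumes the usual construction in which n identical states are chained, so the duration is a sum of n geometric waiting times; the value of `p` is arbitrary.

```python
from math import comb


def geometric_pmf(d, p):
    """P(duration = d) for one self-looping state with exit probability p."""
    return (1 - p) ** (d - 1) * p if d >= 1 else 0.0


def negative_binomial_pmf(d, p, n=3):
    """P(duration = d) when n such states are chained in series."""
    if d < n:
        return 0.0
    return comb(d - 1, n - 1) * p ** n * (1 - p) ** (d - n)


p = 0.1  # arbitrary illustration value
for d in (1, 3, 10, 20, 30, 50):
    print(d, round(geometric_pmf(d, p), 4), round(negative_binomial_pmf(d, p), 4))
```

The geometric distribution always peaks at the shortest duration, while the chained version puts its mass at intermediate lengths.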
![Page 37: Genes: Regulation and Structure Many slides from various sources, including S. Batzoglou,](https://reader034.vdocuments.us/reader034/viewer/2022051401/56649ea35503460f94ba7a1c/html5/thumbnails/37.jpg)
Splice Site Models
• WMM: weight matrix model = PSSM (Staden 1984)
• WAM: weight array model = 1st order Markov (Zhang & Marr 1993)
• MDD: maximal dependence decomposition (Burge & Karlin 1997), a decision-tree-like algorithm that takes significant pairwise dependencies into account
![Page 38: Genes: Regulation and Structure Many slides from various sources, including S. Batzoglou,](https://reader034.vdocuments.us/reader034/viewer/2022051401/56649ea35503460f94ba7a1c/html5/thumbnails/38.jpg)
Splice site detection
Donor site (5’ to 3’)

Position   -8   …   -2   -1    0    1    2   …   17
A          26   …   60    9    0    1   54   …   21
C          26   …   15    5    0    1    2   …   27
G          25   …   12   78   99    0   41   …   27
T          23   …   13    8    1   98    3   …   25
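A weight matrix like this one scores a candidate site by summing per-position log-odds. The sketch below uses only the five columns shown in full (positions -2 to 2) and leaves the elided columns out; the uniform background and the pseudocount are assumptions added to keep the logarithm finite.

```python
import math

POSITIONS = [-2, -1, 0, 1, 2]
# Percentages copied from the visible columns of the donor-site table.
PERCENT = {
    "A": [60, 9, 0, 1, 54],
    "C": [15, 5, 0, 1, 2],
    "G": [12, 78, 99, 0, 41],
    "T": [13, 8, 1, 98, 3],
}
BACKGROUND = 0.25   # assumed uniform base composition
PSEUDOCOUNT = 0.5   # assumed, in percentage points


def wmm_score(site):
    """Log2-odds score of a 5-mer covering donor positions -2..2."""
    site = site.upper()
    if len(site) != len(POSITIONS):
        raise ValueError("site must cover exactly positions -2..2")
    score = 0.0
    for i, base in enumerate(site):
        p = (PERCENT[base][i] + PSEUDOCOUNT) / (100 + 4 * PSEUDOCOUNT)
        score += math.log2(p / BACKGROUND)
    return score


def consensus():
    """Most frequent base at each of the five positions."""
    return "".join(max("ACGT", key=lambda b: PERCENT[b][i])
                   for i in range(len(POSITIONS)))


print(consensus(), round(wmm_score(consensus()), 2))
print("CCCCC", round(wmm_score("CCCCC"), 2))
```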