cmb2003 lecture 12 2013
Dianne Ford
DNA microarrays and high-throughput sequencing approaches
for analysing patterns of gene expression
Functional genomics
• Experimental methods of identifying the function and expression pattern of genomic sequences
– Bioinformatics (CMB2005)
– Genome projects – Professor Morgan
– Mouse knockout models (MMed lecture 11)
– DNA microarrays and high-throughput (“next generation”) sequencing
• Detect patterns of gene expression
– E.g. compare different tissues (see Mortazavi A et al (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods. 5:621-8)
– E.g. compare normal and abnormal states (e.g. cancer)
– E.g. compare effects in specific tissue of pharmaceutical/dietary intervention (see Lagouge et al (2006) Resveratrol improves mitochondrial function and protects against metabolic disease by activating Sirt1 and PGC-1alpha. Cell 127: 1109-1122)
– E.g. compare same tissue/cell line before and after a specific treatment (e.g. nutrient starvation; oxidative stress)
So what is a DNA microarray?
• A glass slide (like a microscope slide) spotted at high density with individual DNA sequences (up to approximately 40,000), each of which corresponds to a known gene product
– Oligonucleotides (50-70 mers)
– PCR products
Why is that useful?
• Incubate microarray with fluorescently-labelled cDNA from the tissue/cell line of interest.
• Labelled cDNA will hybridise (stick) only to DNA spots on the array to which it is complementary.
• Detect which (known) spots the labelled cDNA has hybridised to, in order to determine which genes are expressed.
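The hybridisation rule above can be sketched as a toy model. This is only an illustration: the probe names and sequences are hypothetical, the probes are far shorter than the 50-70 mers used on real arrays, and only perfect matches are counted as "sticking".

```python
# Toy model of hybridisation: a labelled cDNA "sticks" to a spot only if it
# contains the reverse complement of that spot's probe (perfect match only).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def hybridises(probe: str, cdna: str) -> bool:
    """True if the cDNA contains a stretch complementary to the probe."""
    return reverse_complement(probe) in cdna


def expressed_genes(probes: dict, cdnas: list) -> list:
    """Names of the spots that at least one labelled cDNA hybridises to."""
    return sorted(
        name
        for name, probe in probes.items()
        if any(hybridises(probe, cdna) for cdna in cdnas)
    )


# Hypothetical array with two spots, and one labelled cDNA from the sample.
probes = {"geneX": "ATGGCCAAG", "geneY": "TTTCCCGGG"}
labelled_cdna = ["GGCTTGGCCATAA"]

print(expressed_genes(probes, labelled_cdna))
```

Only the spot whose probe is complementary to the labelled cDNA is reported, which is how the position of a fluorescent spot identifies an expressed gene.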
Sounds too easy!
• True – that is an over-simplification
– It is more usual to hybridise cDNA samples prepared separately from two different tissues/cell types or from the same tissue in two different states (e.g. normal and diseased) or from the same tissue/cell type before and after a specific treatment.
– By incorporating a different coloured fluorescent dye into each sample, the relative level of expression of each gene on the array in each of the two samples can be compared.
• Other platforms (e.g. Affymetrix) use just a single dye and samples are hybridised to separate arrays, which are then compared.
Principle of analysis of relative levels of gene expression by DNA microarray hybridisation
RNA isolated from sample A
RNA isolated from sample B
Reverse transcribe (to produce cDNA), incorporating green fluorescent dye
Reverse transcribe (to produce cDNA), incorporating red fluorescent dye
Mix, hybridise to microarray and scan
Expressed in neither sample
Expressed only in sample A
Expressed only in sample B
Expressed equally in both samples
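The four outcomes listed above can be sketched as a simple classification of the scanned intensities for one spot. Sample A is taken as the green channel and sample B as the red channel, as in the scheme above; the background threshold and the pseudocount are arbitrary assumptions, not values from the lecture.

```python
import math


def classify_spot(green: float, red: float, background: float = 100.0) -> str:
    """Classify one spot from its green (sample A) and red (sample B) signal.

    `background` is an assumed cut-off below which a channel counts as "off".
    """
    in_a = green > background
    in_b = red > background
    if not in_a and not in_b:
        return "expressed in neither sample"
    if in_a and not in_b:
        return "expressed only in sample A"
    if in_b and not in_a:
        return "expressed only in sample B"
    return "expressed in both samples"


def log2_ratio(green: float, red: float, pseudocount: float = 1.0) -> float:
    """log2(B/A): 0 means equal expression, +1 means twofold higher in B."""
    return math.log2((red + pseudocount) / (green + pseudocount))


print(classify_spot(green=5000, red=40))
print(log2_ratio(green=999, red=999))
```

For spots detected in both samples, the log ratio is what distinguishes "expressed equally" (close to 0) from higher expression in one sample than the other.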
The image generated by a microarray experiment actually looks like this:
An example of a microarray experiment
• Middle-aged (1 y) male mice provided with standard diet or high-fat (60% energy) diet
• Resveratrol added to the diet of half of the mice on each diet
• Resveratrol shifted physiology of mice on high-fat diet towards that of mice on standard diet and increased survival significantly
• Microarray analysis of gene expression in liver showed resveratrol opposed effects of high-calorie diet on 144 out of 153 significantly-altered pathways
Lagouge et al (2006) Cell 127: 1109-1122
Parametric analysis of gene-set enrichment (PAGE) comparing every pathway significantly upregulated (red) or downregulated (blue) by either the HC diet or resveratrol (153 in total, with 144 showing opposing effects).
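A minimal sketch of the kind of calculation behind a PAGE comparison. It assumes the usual PAGE statistic, a z-score comparing the mean fold change of a gene set with the mean and standard deviation of all genes on the array; the fold-change values and function names are hypothetical and this is not the published analysis.

```python
import math
import statistics


def page_z_score(all_fold_changes: list, gene_set_fold_changes: list) -> float:
    """z = (mean of gene set - mean of all genes) * sqrt(set size) / SD of all genes."""
    mu = statistics.mean(all_fold_changes)
    sigma = statistics.pstdev(all_fold_changes)
    set_mean = statistics.mean(gene_set_fold_changes)
    set_size = len(gene_set_fold_changes)
    return (set_mean - mu) * math.sqrt(set_size) / sigma


def opposing(z_diet: float, z_resveratrol: float) -> bool:
    """True if the pathway moves in opposite directions in the two comparisons."""
    return z_diet * z_resveratrol < 0


# Hypothetical log fold changes for every gene and for one pathway's genes.
all_genes = [-2, -1, 0, 1, 2]
pathway_genes = [1, 2]

print(page_z_score(all_genes, pathway_genes))
print(opposing(1.5, -2.0))
```

A positive z-score corresponds to an upregulated pathway (red in the figure) and a negative one to a downregulated pathway (blue); a pathway counts as "opposing" when the two comparisons give scores of opposite sign.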
Next generation DNA sequencing: extension of the approach to “counting” transcript numbers in an mRNA sample
• Also known as “massively parallel” DNA sequencing
• Different commercial platforms are available
– E.g. Illumina Solexa Genome Analyzer
• Achieves parallel (simultaneous) short (35-75 bp) sequencing of hundreds of millions of random fragments of DNA (or, for determining transcript (mRNA) numbers, of cDNA).
– Fragments for sequencing are arrayed randomly in clusters of around 10^3-10^6, produced by “bridge amplification” of single fragments that bind to a solid support (flow cell) covered with oligonucleotides that pair with adapter oligonucleotides ligated to each end of the fragmented DNA (or cDNA copies of short (e.g. approx. 200 base) mRNA fragments).
– Then uses “DNA sequencing by synthesis” technology
» All 4 nucleotides are added together, with DNA polymerase; each carries a base-unique fluorescent label and has its 3’ OH group blocked chemically, so only one base is incorporated at a time.
» Flow cell is imaged by sophisticated optics after laser excitation.
» The large number of copies sequenced in each cluster is required to generate a sufficiently strong signal for detection.
» Then the 3’ blocking group is removed chemically and the next round proceeds.
• Sequence reads are aligned against a reference genome.
• Depending on initial sample preparation, gives information on genomic sequence variations, splice variants or transcript (mRNA) numbers.
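The cycle of incorporate, image, unblock can be sketched as a loop that reads one base per round, followed by a naive alignment of the resulting read. This is a toy model only: the sequences are hypothetical, the template is written 3'->5' so that the read comes out 5'->3', and real aligners tolerate mismatches rather than requiring an exact match.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def sequence_by_synthesis(template: str, cycles: int) -> str:
    """Read `cycles` bases from a template strand written 3'->5'.

    Each round stands for: add all four blocked, labelled nucleotides,
    incorporate exactly one, image the label, then remove the block.
    """
    read = []
    for position in range(min(cycles, len(template))):
        incorporated = COMPLEMENT[template[position]]  # only one base can be added
        read.append(incorporated)  # the colour imaged in this round gives the base
        # the 3' blocking group is removed here, so the next round can proceed
    return "".join(read)


def align(read: str, reference: str) -> int:
    """Position of the first exact match in the reference, or -1 if none."""
    return reference.find(read)


reference = "ACGTTAGCCGATAGGCTA"  # hypothetical reference sequence
template = "ATCGGCTA"  # hypothetical fragment bound to the flow cell

read = sequence_by_synthesis(template, cycles=6)
print(read, align(read, reference))
```

Because every copy in a cluster incorporates the same base in the same round, the cluster as a whole emits one colour per cycle, which is why the signal is strong enough to image.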
New generation (Solexa) sequencing: step 1
OR cDNA sample (copy of mRNA, so representative of number of copies of each mRNA in the sample)
New generation (Solexa) sequencing: step 2
New generation (Solexa) sequencing: step 3
New generation (Solexa) sequencing: step 4
New generation (Solexa) sequencing: step 5
New generation (Solexa) sequencing: step 6
New generation (Solexa) sequencing: step 1 (reminder)
OR cDNA sample (copy of mRNA, so representative of number of copies of each mRNA in the sample)
New generation (Solexa) sequencing: Use for RNA-seq to determine transcript (mRNA) copy number
Mortazavi A et al (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods. 5:621-8
– mRNA fragmented by Mg-catalysed hydrolysis, to give fragments of approximately 200 bases
– Fragments copied into cDNA by random-primed reverse transcription; then Solexa sequencing
RPKM = Reads Per Kilobase of transcript per Million mapped reads
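The definition above translates directly into a calculation: divide the number of reads mapped to a transcript by the transcript length in kilobases and by the total number of mapped reads in millions. A minimal sketch, with hypothetical numbers and parameter names chosen for illustration:

```python
def rpkm(
    reads_mapped_to_transcript: int,
    transcript_length_bases: int,
    total_mapped_reads: int,
) -> float:
    """Reads Per Kilobase of transcript per Million mapped reads."""
    kilobases = transcript_length_bases / 1_000
    millions_of_reads = total_mapped_reads / 1_000_000
    return reads_mapped_to_transcript / (kilobases * millions_of_reads)


# Hypothetical example: 500 reads on a 2 kb transcript, 10 million mapped reads.
print(rpkm(500, 2_000, 10_000_000))
```

Dividing by transcript length corrects for long transcripts yielding more fragments, and dividing by total mapped reads allows samples sequenced to different depths to be compared.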
Example of the use of sequence data from multiple genomes: deducing gene (protein) function
– Functionally-linked proteins should have homologues in all organisms with that function
• E.g. Flagella proteins should be only in bacteria with flagella
Flagella? No Yes Yes No Yes No
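The reasoning above amounts to comparing presence/absence patterns across genomes. A minimal sketch: the flagella row is taken from the slide (No Yes Yes No Yes No), while the protein names and their profiles are hypothetical.

```python
# Presence (True) or absence (False) of flagella in six genomes, in the order
# given on the slide: No Yes Yes No Yes No.
flagella = (False, True, True, False, True, False)

# Hypothetical proteins: whether a homologue is found in each of the six genomes.
profiles = {
    "proteinP": (False, True, True, False, True, False),
    "proteinQ": (True, True, True, True, True, True),
    "proteinR": (False, True, False, False, True, False),
}


def candidates(trait: tuple, protein_profiles: dict) -> list:
    """Proteins whose presence/absence pattern matches the trait exactly."""
    return sorted(
        name for name, profile in protein_profiles.items() if profile == trait
    )


print(candidates(flagella, profiles))
```

A protein found in every genome (proteinQ here) says nothing about flagella, whereas one found only in the flagellated organisms is a candidate for a flagella-related function.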
References
Mardis ER (2008) Next-generation DNA sequencing methods. Annu Rev Genomics Hum Genet. 9:387-402
Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods. 5:621-8
Noordewier M & Warren P (2001) Gene expression microarrays and the integration of biological knowledge. Trends in Biotechnology 19:412-415
Lagouge et al (2006) Resveratrol improves mitochondrial function and protects against metabolic disease by activating Sirt1 and PGC-1alpha. Cell 127: 1109-1122