Gene Expression Analysis
Konstantin Tretyakov
BIIT Research group (http://biit.cs.ut.ee)
Science
November 2011
Observation Analysis Generalization
Why?
Contemporary Science
Observation Analysis Generalization
Data mining,
Data analysis, Statistical analysis,
Pattern discovery, Statistical learning,
Machine learning, Predictive analytics,
Business intelligence, Data-driven statistics,
Inductive reasoning, Pattern analysis,
Knowledge discovery from databases,
Analytical processing,
…
Contemporary Bioinformatics
Observation Analysis Generalization
Bioinformatics
Nucleotide sequences
Molecular interactions
Gene expression
Heredity
Drug effects
…
Bioconductor
Why measure gene expression?
How to measure expression?
DNA
RNA
Protein
RNA
cDNA
Microarray
Microarray technology
cDNA
Oligonucleotide
cDNA: 2-dye chip
Normalization of 2-dye chips
Plot: log(R/G) (vertical axis) against log(R) + log(G) (horizontal axis)
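The two plotted quantities can be computed directly from the red (R) and green (G) channel intensities. The sketch below is only an illustration: the base-2 logarithm, the made-up intensities, and the global median centering step are assumptions, not necessarily the normalization method used in the lecture.

```python
import math
from statistics import median


def ma_values(red, green):
    """Return (M, A) lists with M = log2(R/G) and A = log2(R) + log2(G)."""
    m = [math.log2(r / g) for r, g in zip(red, green)]
    a = [math.log2(r) + math.log2(g) for r, g in zip(red, green)]
    return m, a


def median_center(m):
    """Shift the log-ratios so that their median becomes zero.

    This is a simple global normalization, chosen here for illustration only.
    """
    shift = median(m)
    return [x - shift for x in m]


# Made-up intensities for four spots: the red channel is uniformly twice as bright.
red = [200.0, 400.0, 800.0, 1600.0]
green = [100.0, 200.0, 400.0, 800.0]

m, a = ma_values(red, green)
m_norm = median_center(m)
print(m)       # raw log-ratios, one per spot
print(m_norm)  # log-ratios after median centering
```

Median centering removes only a constant dye bias. A bias that changes with intensity would need an intensity-dependent correction instead.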
Affymetrix GeneChip
$100-850
Millions of probes per cell
500 000 cells per chip
Each probe = 25 bp
10-20 probes, each 25 base pairs long
Gene
+ as many “mismatch” probes
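A "mismatch" (MM) probe is a copy of a perfect-match (PM) probe in which the middle base is swapped for its complement. The sketch below shows the idea for a 25-base probe. The function name `mismatch_probe` and the example sequence are hypothetical, and the sequence is not a real probe.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def mismatch_probe(perfect_match):
    """Return the mismatch partner of a 25-base perfect-match probe.

    The 13th (middle) base is replaced by its complement; all other
    bases are left unchanged.
    """
    if len(perfect_match) != 25:
        raise ValueError("expected a 25-base probe")
    middle = 12  # zero-based index of the 13th base
    return (
        perfect_match[:middle]
        + COMPLEMENT[perfect_match[middle]]
        + perfect_match[middle + 1:]
    )


pm = "ACGTACGTACGTACGTACGTACGTA"  # made-up sequence for illustration
mm = mismatch_probe(pm)
print(pm)
print(mm)
```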
.DAT
.CEL: Probe1 = 2.0, Probe2 = 1.0, Probe3 = 1.5, …
.CSV: Gene1 = -1, Gene2 = 0.2, Gene3 = 1.2, …
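Going from probe-level values to gene-level values means collapsing each gene's probes into a single number. A minimal sketch follows, assuming a hypothetical probe-to-gene mapping and using the median of log2 intensities as the summary. This choice is an illustration only, not the algorithm used by the Affymetrix software, and the intensities are made up.

```python
import math
from statistics import median


def summarize(probe_values, probe_to_gene):
    """Collapse probe intensities into one log2 value per gene (median of its probes)."""
    by_gene = {}
    for probe, value in probe_values.items():
        by_gene.setdefault(probe_to_gene[probe], []).append(math.log2(value))
    return {gene: median(values) for gene, values in by_gene.items()}


# Made-up probe intensities and a hypothetical probe-to-gene assignment.
probe_values = {
    "Probe1": 2.0, "Probe2": 1.0,
    "Probe3": 4.0, "Probe4": 8.0, "Probe5": 16.0,
}
probe_to_gene = {
    "Probe1": "Gene1", "Probe2": "Gene1",
    "Probe3": "Gene2", "Probe4": "Gene2", "Probe5": "Gene2",
}

gene_values = summarize(probe_values, probe_to_gene)
print(gene_values)
```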
Gene expression databases
Gene Expression Omnibus: http://www.ncbi.nlm.nih.gov/geo/
ArrayExpress: http://www.ebi.ac.uk/arrayexpress/
ArrayExpress
AffyBatch
Dirty data
Normalization
Do we trust the measured data?
What to do with PM/MM?
How to match Probeset to a Gene?
How to normalize multiple chips?
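For the last question, one widely used option is quantile normalization, which forces every chip to share the same distribution of values. The sketch below is a minimal pure-Python version under simplifying assumptions (equal-length chips, ties broken by position). It is not necessarily the approach taken in the lecture, and the values are made up.

```python
def quantile_normalize(chips):
    """Quantile-normalize a list of chips, each a list of values of equal length.

    The k-th smallest value on each chip is replaced by the mean of the
    k-th smallest values across all chips.
    """
    n = len(chips[0])
    if any(len(chip) != n for chip in chips):
        raise ValueError("all chips must have the same number of values")
    sorted_chips = [sorted(chip) for chip in chips]
    reference = [sum(column) / len(chips) for column in zip(*sorted_chips)]
    normalized = []
    for chip in chips:
        # Indices of the chip's values, ordered from smallest to largest.
        order = sorted(range(n), key=lambda i: chip[i])
        result = [0.0] * n
        for rank, index in enumerate(order):
            result[index] = reference[rank]
        normalized.append(result)
    return normalized


# Two made-up chips with three measurements each.
chips = [
    [5.0, 2.0, 3.0],
    [4.0, 1.0, 6.0],
]
normalized = quantile_normalize(chips)
print(normalized)
```

After this step the sorted values of every chip are identical, so differences between chips reflect only the ranking of measurements within each chip.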
Gene expression data
Expression data
Experiments
Genes
G(i,j)
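The expression data can be held as a plain matrix. The sketch below assumes that i indexes genes (rows) and j indexes experiments (columns). The gene names, experiment names, and values are made up for illustration.

```python
genes = ["GeneA", "GeneB", "GeneC"]  # hypothetical gene names (rows)
experiments = ["exp1", "exp2"]       # hypothetical experiment names (columns)

# G[i][j] is the expression of gene i in experiment j (made-up values).
G = [
    [-1.0, 0.5],
    [0.2, 0.1],
    [1.2, -0.3],
]


def gene_profile(matrix, i):
    """Return the expression of gene i across all experiments (one row)."""
    return list(matrix[i])


def experiment_profile(matrix, j):
    """Return the expression of all genes in experiment j (one column)."""
    return [row[j] for row in matrix]


print(genes[2], gene_profile(G, 2))
print(experiments[0], experiment_profile(G, 0))
```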
Comprehensive Identification of Cell Cycle-regulated Genes of the Yeast Saccharomyces cerevisiae by Microarray Hybridization. Paul Spellman et al. (1998)