dna microarraydhmg.amu.edu.pl/uploads/microarrays-application-april17...dna microarray...
TRANSCRIPT
DNA MICROARRAY“Gene Expression Profiling:
from gene function to personalized medicine”
Hans Bluyssen
17-03-2018
Genome & Genes
Gene
RNA
Protein
transcription
translation
3’ 5’
+1
promoter
start site
coding sequence
mRNA
RNA polymerase
generation of mRNA from genomic DNA
RNA Transcription
The Transcriptome
Transcriptional Program & Cellular Fate
Breast Cancer cells Red Blood cells Sperm cells
Astrocytes Smooth Muscle cells Neuronal cells
Parallel monitoring of relative levels of thousands of mRNA
species at one time point or condition:
expression profiling
DNA Microarray
Transcriptomics
• Put a large number (~30K) of cDNA sequences or synthetic DNA oligomers onto a glass slide (or other substrate) in known locations on a grid.
• Label an RNA sample and hybridize
• Measure amounts of RNA bound to each square in the grid
• Make comparisons
– Cancerous vs. normal tissue
– Treated vs. untreated
– Time course
• Many applications in both basic and clinical research
DNA Microarrays: Basics
DNA microarrays are ordered assemblies of
DNA sequences immobilized on a solid support
(such as chemically modified glass).
What is a DNA microarray?
The DNA sequences (e.g. PCR products or oligos)
correspond to the transcribed regions of genes.
5’ start
genomic DNA
exon 1 exon 2 exon 3
…ATTTCAGGCGCATGCTCGG… gene X
gene Y
gene Z
What is a DNA microarray?
Each gene represented by single long oligo (60 - 70-mer)
Oligo’s spotted by robot
Long oligo arrays
Each gene represented by 20 different short (25-mer)
oligonucleotides and 20 mismatch controls
Oligonucleotides synthesized on chip by photolithographic
masking
Affymetrix chips
probe pair Mismatch probe cells
(12-20/gene)
Affymetrix chips
3-micron silica beads
that self assemble in
microwells on fiber
optic bundles or planar
silica slides.
When randomly
assembled the beads
have a uniform spacing
of ~5.7 microns.
ILLUMINA
… and BEADS!
Optical fiber with wells…
ILLUMINA
N=20
ILLUMINA
ILLUMINA
Oligo’s synthesized on chip by ink-jet printing
Agilent arrays
Each gene represented by single long oligo
(60 - 70-mer)
Agilent arrays
1.
Sample A cDNA A Cy3
Sample B cDNA B Cy5one microarray
2.
Measurements are relative i.e. a change is
measured, not an absolute amount
Dual color
Microarray hybridizations
Scan 1 Scan 2
Image 1 Image 2
Combined Image
(ratio cy5:cy3)
Statistical analysis
X Y Z
Hybridize
Wash
cy5 cy3
equal expression
higher expression in Cy3
higher expression in Cy5
Overview of data capture
– two different mRNA populations, labeled with different fluors
– excited by a laser
– each fluor excites at a different wavelength, which is captured using a photo detector attached to a filter tuned to the particular fluor
Microarray: Workflow
Controlcells
Experimentalcells
Cy3
Targets
Cy3
ILLUMINA
Microarray: Workflow
Microarray vs NG-Sequencing
1. Find the genes that change expression between experimental and control samples
2. Classify samples based on a gene expression profile
3. Find patterns: Groups of biologically related genes that change expression together across samples/treatments
4. Correlate expression profile to disease state, diagnosis/prognosis or treatment
Goals of a Microarray Experiment
• Type I: (n = 2)
– How is this gene expressed in target 1 as compared to target 2?
– Which genes show up/down regulation between the two targets?
• Type II: (n > 2)
– How does the expression of gene A vary over time, tissues, or treatments?
– Do any of the expression profiles exhibit similar patterns of expression?
Microarray Experiment Design
Aging & Increased Atherosclerosis
Endothelial Cells in Culture
Young Old
Aim: Identify known and novel genes involved in aging ECs
Gene expression?
Growing Cells
Grow HUVECs in culture
Cells were obtained from umbellical vein.Cultered for different passages (1-8).Two independent exp. (11-06-02, 13-02-03)
Isolate total RNA
Microarray analysis (fluor flip)
Run 1: #1 vs #8 (11-06-02)Run 2: #1 vs #8, #1 vs #4 (13-02-03)
Wavelength (nm)
Sample 260 280 ratio amount of RNA (ng)
1 mm 0.047 0.025 1.9 206.8
0.5 mm 0.025 0.014 1.8 11028S
18S
RNA yield & quality
19200, 70-mer oligos
-16752 genes of H. sapiens,
-2448 control spots (external controls for ratios
and normalization,
background controls).
Contains:
Human 19K Oligo Microarray (Holstege et al., Genomics lab, UMCU, 2004)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
1 EC1 empty 336 337 338 339 340 341 342 343 EC8 344 345 346 347 348 349 empty empty EC1
2 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335
3 299 300 EC2 301 302 303 304 305 306 307 EC16 308 309 310 311 312 313 EC5 314 315
4 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298
5 262 263 264 265 EC12 266 267 268 269 270 EC10 271 272 273 274 EC14 275 276 277 278
6 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261
7 225 226 227 228 229 230 EC4 231 232 233 EC9 234 235 EC7 236 237 238 239 240 241
8 207 208 209 210 211 212 213 empty 214 215 216 217 buffer 218 219 220 221 222 223 224
9 189 190 191 192 193 194 195 196 empty 197 198 EC17 199 200 201 202 203 204 205 206
10 EC11 176 EC3 177 EC15 178 EC6 179 EC13 empty EC18 180 181 182 183 184 185 186 187 188
11 163 164 165 166 167 168 169 170 171 EC18 empty EC13 172 EC6 173 EC15 174 EC3 175 EC11
12 145 146 147 148 149 150 151 152 EC17 153 154 empty 155 156 157 158 159 160 161 162
13 127 128 129 130 131 132 133 buffer 134 135 136 137 empty 138 139 140 141 142 143 144
14 110 111 112 113 114 115 EC7 116 117 EC9 118 119 120 EC4 121 122 123 124 125 126
15 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109
16 73 74 75 76 EC14 77 78 79 80 EC10 81 82 83 84 85 EC12 86 87 88 89
17 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72
18 36 37 EC5 38 39 40 41 42 43 EC16 44 45 46 47 48 49 50 EC2 51 52
19 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35
20 EC1 empty 1 2 3 4 5 6 7 EC8 8 9 10 11 12 13 14 15 empty EC1
1 2 3 4
9
10
11
12
5
6
7
8
1
2
3
4
Barcode
45 46 47 48
41 42 43 44
37 38 39 40
33 34 35 36
29 30 31 32
25 26 27 28
21 22 23 24
17 18 19 20
13 14 15 16
9 10 11 12
5 6 7 8
1 2 3 4
Cy3 Scan
Overlay Cy3 and Cy5 scans
equal expression
higher expression in Cy3
higher expression in Cy5
• Quantitate each spot
• Export a table of fluorescent intensities
for each gene in the array
• Subtract background
• Normalize
• Analyze data further
Data Acquisition
#1 #8
cDNA cDNA
Label mix
IMAGENE
Chip
Imaging analysis
Gridding
Data analysis: Image analysis
Measured differences may be due to variation in
processing, different labels used, different
channels measured etc.
Sample A cDNA A Cy3
Sample B cDNA B Cy5
Requires normalisation between channels
one microarray
Dual color experiments
• Can control for many of the experimental sources of variability
• Bring each image to the same average brightness
• Can use simple math or fancy -
– divide by the mean (whole chip or by sectors)
– House keeping genes (assumed constants?)
– External controls (constants)
– LOESS (locally weighted regression)
– Dye-swap (fluor-flip)
Normalization
#1 vs #8 UPREGULATED
24-01-03: FALSE 4779 (5005); upregulated 223
11-06-02: FALSE 1222; upregulated 373
Mean > 0.7: 125
#1 vs #8 DOWNREGULATED
24-01-03: downregulated 202
11-06-02: downregulated 351
Mean < -0.7: 118
Regulated Genes
Gene_Symbol/name GB_accession/nameDescription Fflip #1 vs #8 (240103)Fflip #1 vs #8 (110602)>0.5 Mean
KIAA177 AB029000 KIAA1077 protein 2,513731797 2,162643424 0 2,338188
MGP X53331 Matrix Gla protein 1,924562345 1,06359492 0 1,494079
CLU M25915 Clusterin (complement lysis inhibitor, SP-40,40, sulfated glycoprotein 2, testosterone-repressed pro1,120027922 1,422007634 0 1,271018
MLLT4 AL161973 Myeloid/lymphoid or mixed-lineage leukemia (trithorax (Drosophila) homolog); translocated to, 40,699761417 1,788515835 0 1,244139
FABP4 J02874 Fatty acid binding protein 4, adipocyte 1,033984581 1,224226734 0 1,129106
LMO2 X61118 LIM domain only 2 (rhombotin-like 1) 1,608864882 0,632131603 0 1,120498
CD34 S53911 CD34 antigen 0,625034943 1,491856907 0 1,058446
PLA2G6 AF064594 Phospholipase A2, group VI (cytosolic, calcium-independent)1,120023278 0,923718755 0 1,021871
FLJ22344 NM_024717Hypothetical protein FLJ22344 0,676278219 1,284356776 0 0,980317
IFITM1 J04164 Interferon induced transmembrane protein 1 (9-27)1,321185077 0,612176398 0 0,966681
FADS2 AF126799 Fatty acid desaturase 2 0,703455048 1,19919164 0 0,951323
GNG11 U31384 Guanine nucleotide binding protein 11 0,912465229 0,980640855 0 0,946553
ALDH1A1 M31994 Aldehyde dehydrogenase 1 family, member A11,18344417 0,657748227 0 0,920596
FLJ1111 AK001972 Hypothetical protein FLJ11110 0,995447178 0,799882598 0 0,897665
KIAA692 AB014592 KIAA0692 protein 0,78229028 0,99391436 0 0,888102
COL6A1 X15880 Collagen, type VI, alpha 1 1,159896438 0,599391575 0 0,879644
VWF X04385 Von Willebrand factor 0,938009512 0,777968076 0 0,857989
TROAP U04810 Trophinin associated protein (tastin) 0,737655495 0,939420669 0 0,838538
DKFZP434C211 AL110244 DKFZP434C211 protein 0,593289953 1,058910073 0 0,8261
LIPA X76488 Lipase A, lysosomal acid, cholesterol esterase (Wolman disease)1,07536012 0,566963312 0 0,821162
ARMET M83751 Arginine-rich, mutated in early stage tumors0,717870343 0,875637643 0 0,796754
GCL M81637 Grancalcin 0,811293852 0,777934191 0 0,794614
VCAM1 M73255 Vascular cell adhesion molecule 1 0,654017353 0,926043108 0 0,79003
IRF6 AL022398 Interferon regulatory factor 6 0,969052784 0,604807534 0 0,78693
HEPH AB014598 Hephaestin 0,978667308 0,594992788 0 0,78683
AL022163 Human DNA sequence from clone 551E13 on chromosome Xp11.2-11.3 Contains farnesyl pyrophosphate synth0,707438024 0,841306538 0 0,774372
TAS2R1 AF227129 Taste receptor, type 2, member 1 0,6107639 0,919104935 0 0,764934
PRCP L13977 Prolylcarboxypeptidase (angiotensinase C)0,567628921 0,946099969 0 0,756864
DLK1 U15979 Delta-like homolog (Drosophila) 0,976160319 0,529537279 0 0,752849
DACH AJ005670 Dachshund (Drosophila) homolog 0,831559463 0,667854409 0 0,749707
HMGCR NM_0008593-hydroxy-3-methylglutaryl-Coenzyme A reductase0,900287358 0,565012227 0 0,73265
AL137512 Homo sapiens mRNA; cDNA DKFZp564E0178 (from clone DKFZp564E0178); partial cds0,713181012 0,679221354 0 0,696201
AL137578 Homo sapiens mRNA; cDNA DKFZp564N2464 (from clone DKFZp564N2464)0,643705912 0,722626994 0 0,683166
DHCR7 AF034544 7-dehydrocholesterol reductase 0,77803299 0,552501793 0 0,665267
OR1G1 U53583 Olfactory receptor, family 1, subfamily G, member 10,794092501 0,51583863 0 0,654966
HSPB1 Z23090 Heat shock 27kD protein 1 0,68780744 0,616494742 0 0,652151
HHEX L16499 Hematopoietically expressed homeobox 0,735995153 0,567180638 0 0,651588
STAT1 M97935 Signal transducer and activator of transcription 1, 91kD0,765382029 0,529963687 0 0,647673
Log2 ratio’s
Up-regulated #1 vs #8, >0.7
Gene_Symbol/name GB_accession/nameDescription Fflip #1 vs #8 (240103)Fflip #1 vs #8 (110602)<-0.5 Mean
TGFBI M77349 Transforming growth factor, beta-induced, 68kD-2.33833722 -1.27362528 2 -1.8059812
SGK AJ000512 Serum/glucocorticoid regulated kinase -0.82421584 -2.58943878 2 -1.7068273
PLAU X02419 Plasminogen activator, urokinase -1.6055826 -1.11066064 2 -1.3581216
KEL M64934 Kell blood group -1.31760599 -1.34347572 2 -1.3305409
MMP1 M13509 Matrix metalloproteinase 1 (interstitial collagenase)-0.95061809 -1.58462396 2 -1.267621
AF098641 Homo sapiens CD44 isoform RC (CD44) mRNA, complete cds-1.26297713 -1.20426938 2 -1.2336233
AL136803 Homo sapiens mRNA; cDNA DKFZp434M1820 (from clone DKFZp434M1820); complete cds-0.50233598 -1.92845647 2 -1.2153962
CDKN1A U03106 Cyclin-dependent kinase inhibitor 1A (p21, Cip1)-1.25234383 -0.99838852 2 -1.1253662
XM_007600Homo sapiens alanyl (membrane) aminopeptidase (aminopeptidase N, aminopeptidase M, microsomal aminop-1.30118944 -0.91301997 2 -1.1071047
SPUVE AF015287 Protease, serine, 23 -1.19481818 -0.83773478 2 -1.0162765
AB033089 Homo sapiens mRNA for KIAA1263 protein, partial cds-1.3513174 -0.66492176 2 -1.0081196
PTPRO NM_002848Protein tyrosine phosphatase, receptor type, O-1.12796975 -0.76306015 2 -0.9455149
CGA S70585 Glycoprotein hormones, alpha polypeptide-0.76538247 -1.11468868 2 -0.9400356
CDKN1A Z85996 Cyclin-dependent kinase inhibitor 1A (p21, Cip1)-0.96468421 -0.88752639 2 -0.9261053
ARHB NM_004040Ras homolog gene family, member B -1.24914781 -0.56511585 2 -0.9071318
CAPN2 M23254 Calpain 2, (m/II) large subunit -0.67063344 -1.12861276 2 -0.8996231
SPOCK AF231124 Sparc/osteonectin, cwcv and kazal-like domains proteoglycan (testican)-0.76625413 -1.03297294 2 -0.8996135
ITGB5 X53002 Integrin, beta 5 -0.85883624 -0.93320134 2 -0.8960188
TEM8 AK025429 Tumor endothelial marker 8 -1.1064934 -0.56344376 2 -0.8349686
TADG12 NM_022364Serine protease TADG12 -1.05323657 -0.58052308 2 -0.8168798
FBN2 U03272 Fibrillin 2 (congenital contractural arachnodactyly)-0.99638042 -0.61295108 2 -0.8046657
SMURF2 NM_022739E3 ubiquitin ligase SMURF2 -0.77041438 -0.8217453 2 -0.7960798
AC004528 Homo sapiens chromosome 19, cosmid R32184-0.93405079 -0.64952944 2 -0.7917901
GRB14 L76687 Growth factor receptor-bound protein 14-0.96972574 -0.59117799 2 -0.7804519
FACL1 L09229 Fatty-acid-Coenzyme A ligase, long-chain 1-0.97376797 -0.55574601 2 -0.764757
HMOX1 Z82244 Heme oxygenase (decycling) 1 -0.96184824 -0.5646887 2 -0.7632685
KIAA1151 AB032977 KIAA1151 protein -0.56521377 -0.95738307 2 -0.7612984
CACNA1I AF142567 Calcium channel, voltage-dependent, alpha 1I subunit-0.55503678 -0.93969802 2 -0.7473674
TFEC D43945 Transcription factor EC -0.74754013 -0.74422819 2 -0.7458842
KIAA193 D83777 KIAA0193 gene product -0.91757685 -0.56213729 2 -0.7398571
EPHX1 NM_000120Epoxide hydrolase 1, microsomal (xenobiotic)-0.87217861 -0.60636676 2 -0.7392727
PRO2714 NM_018534Hypothetical protein PRO2714 -0.89962074 -0.51623757 2 -0.7079292
ZNF28 AC003973 Zinc finger protein 208 -0.67517041 -0.7261952 2 -0.7006828
RAC2 M64595 Ras-related C3 botulinum toxin substrate 2 (rho family, small GTP binding protein Rac2)-0.68866119 -0.71043787 2 -0.6995495
AJ276395 Homo sapiens mRNA for MSF-FN70 (FN gene)-0.54169472 -0.84441272 2 -0.6930537
AL034369 Human DNA sequence from clone 149D17 on chromosome Xq22.2-23. Contains part of a PLRP2 (PNLIPRP2, Pa-0.60013725 -0.77680433 2 -0.6884708
TRIP4 L40371 Thyroid hormone receptor interactor 4 -0.76642149 -0.60545613 2 -0.6859388
HCLS1 X16663 Hematopoietic cell-specific Lyn substrate 1-0.50070018 -0.8492648 2 -0.6749825
TIMP2 NM_003255Tissue inhibitor of metalloproteinase 2 -0.695665 -0.64558501 2 -0.670625
DKFZp586H623 AL096739 Hypothetical protein DKFZp586H0623 -0.57019613 -0.77100085 2 -0.6705985
KIAA273 D87463 KIAA0273 gene product -0.78756329 -0.53861163 2 -0.6630875
PTGS1 M59979 Prostaglandin-endoperoxide synthase 1 (prostaglandin G/H synthase and cyclooxygenase)-0.68414877 -0.63149194 2 -0.6578204
NFIX L31881 Nuclear factor I/X (CCAAT-binding transcription factor)-0.77217229 -0.54139827 2 -0.6567853
AL163202 Homo sapiens chromosome 21 segment HS21C002-0.74568656 -0.56590513 2 -0.6557958
Down-regulated #1 vs #8, <-0.7
Log2 ratio’s
• Biologists got together a few years ago and developed a sensible system called Genome Ontology (GO) Database
• 3 hierarchical sets of terminology
– Biological Process
– Cellular Component (location within cell)
– Molecular Function
• about 1000 categories of functions
Couple Gene List to Biological function:
Gene Ontology
ECMuPA, tPA Matrix Gla proteinFBN2 ADAM3AMMP1 ADAM12TIMP2 TFPI2COL6A2 COL3A1OPTC COL6A1KERA COL8A1
Cell AdhesionCD44 VWFCDH2 CD34TGFBI VCAM1ITGB5 PCDH1Neuropilin2 GJA4Claudin4 CTNNB1
TROAP
ApoptosisBCL2L1 SSTR3SGK CRDF14PAWR CASP1BOK TNFF8
CLUTNFRSF1B
REDOXHMOX1 NDUFA6TXNRD1 SEPP1
GSTT1CSR1
Cell growthP21 NDRG2CCDN1 ASCL2CCDN3 BMP2CSH2 LMO2IFRD2 IGFBP2QSCN6 IGF2NPPB
Transcr factorsId2b LMO2TTF1 HHEXTRIP4 MEOX2ZNF79 GFI1ZNF43 STAT1HES6 LBX1NFIX HNF3GTFEC IRF6EPAS1
InflammationTSA192 IFITM1FCGR3B NK4FCGR2B SCYA19TNFAIP6 CXCR4TNFSF4 CCR1TLR4 SCYA27
“senescence biological profile in vitro”
Overlap with cardiovascular disease in vivo
Genotype vs Phenotype
Senescent vascular cells in
human atheroma
Senescence-associated b-gal activity
Senescence in vitro Atherosclerosis in vivo
• Type I: (n = 2)
– How is this gene expressed in target 1 as compared to target 2?
– Which genes show up/down regulation between the two targets?
• Type II: (n > 2)
– How does the expression of gene A vary over time, tissues, or treatments?
– Do any of the expression profiles exhibit similar patterns of expression?
Microarray Experiment Design
Iyer et al. (1999) Science, 283: 83
The transcriptional program in the
response of human fibroblasts to serum
Identify genes with similar expression
Grouping unknown genes with known genes
may provide insight into function of unknown
genes
Cluster genes by similar changes - only really
meaningful across multiple treatments or time
points
Cluster samples by similar gene expression
profiles
>8 >6 >4 >2 >8>6>4>20 1 3 6 12 24 72hours:
genes
microarrays
Down Up
0 vs 1
0 vs 3
0 vs 6
etc..
Basics of Data Filtering and Visualization
0 1 3 6 12 24 72hours:
4,124 genes
spot-filtered
clustered
0 1 3 6 12 24 72hours:
464 genes
spot-filtered
ratio-filtered
clustered
0 1 3 6 12 24 72hours:
4,416 genes
Grouping genes: clustering
Microarray
(Clustering)
8600 cDNA clones
• Coordinated gene expression
• Differential gene expression
Iyer et al. (1999) Science, 283: 83
Biological information
Serum treatment ‘in vitro’
Wound healing ‘in vivo’
Expression
signatures
Couple expression to GO
GeneSpring
Visualisation
Expression Landscape of cell-cycle
regulated genes in yeast
GenMapp: Biological Pathways
Microarray and cancer
•Identification of prognostic biomarkers specific
to onset and progression
•Disease classification
•Development of drug resistance
•Risk of relapse assessment
•Metastasis
•Response to treatment
•Survival
Ross DT, et. al., Nature Genetics, (24): 2000, 227-235.
Aim:
to explore the variation in gene expression
of ~ 8,000 genes among 60 human cancer
cell lines (spanning 9 distinct tissues)
Variation in Gene Expression Patterns
in Human Cancer Cell Lines
1,161 genes: 60 cell lines
•Relationship between expression profile and tissue of origin•Recognize previously incorrect classified outliers•Recognize relationships to tumors in vivo
Hierarchical Clustering of Gene Expression Patterns
Groups Cell Lines According to Tissue of Origin
Alizadeh AA, et. al., Nature, (403): 2000,503-511.
Aim:
to determine whether gene expression profiling
could subdivide DLBCL – a clinically heterogeneous
diagnostic category – into molecularly distinct diseases
with more homogeneous clinical behaviors
Only 40% of patients respond well to therapy
“Lymphochip”: -17.856 cDNA clones
-lymphoid cell origin
-cancer + immunology
Distinct Types of Diffuse Large B-cell
Lymphoma (DLBCL)
45 DLBCL biopsies
Different B-cell differentiation stage
Set of ~3000 genes
Clustering Identifies 2 Major
Subgroups of DLBCL
a. Kaplan-Meier plot of overall survival of DLBCL patients grouped on the basis of gene expression profiling.
b. Kaplan-Meier plot of overall survival of DLBCL patients grouped according to the International Prognostic Index.
c. Kaplan-Meier plot of overall survival of low clinical risk DLBCL patients grouped on the basis of gene expression profiles.
DLBCL Subgroups Define
Prognostic Categories
Aim:
To classify breast carcinoma’s based on
expression profiling and to correlate these
to clinical outcome
2001, PNAS
Differential expressed genes: 476
• tumor properties
• patient outcome 85 biopsy samples
Clustering Identifies novel and existing
Subgroups of Breast cancer
Sorlie et al. PNAS 2003
ER++, PR++, G1,2 HER2 ISH pos “triple neg,” CK5/6+
overall survival: relapse-free survival:
Molecular classes are
predictive of outcome
Oncologists would like to use arrays to predict
whether or not a cancer is going to spread in the
body, how likely it will respond to a certain type of
treatment, and how long the patient will probably
survive.
It would be useful if the gene expression signatures
could distinguish between subtypes of tumours that
standard methods, such as histological pathology
from a biopsy, fail to discriminate, and that require
different treatments.
BioArray News (2, no. 35, 2002)
Arrays Hold Promise for Cancer Diagnostics
Gene expression profiling predicts
clinical outcome of breast cancer
Van ‘t Veer, et. al., Nature, (415): 2002,530-536.
Aim:
to determine whether gene expression profiling could
predict disease outcome and provide a strategy to select
patients who would benefit from adjuvant therapy
(metastasis)
Breast Cancer – Survival Pre-menopausal
patients, lymph node negative
~30% die <10 year
~70% survive >10 year
traditional diagnostics
Everyone receives
chemotherapy...!time (years)
su
rviv
al
time (years)
su
rviv
al
Current adjuvant treatment selection criteria:
• NIH (US) consensus criteria: > 95%• St Gallen (EU) consensus criteria: > 80%
receive adjuvant chemo- and hormonal therapy
As only 30% of these patients develop distantmetastases, some 50-65% of patients areover-treated with adjuvant (chemo)therapy
Breast Cancer – Survival Pre-menopausal
patients, lymph node negative
Identification of gene expression
changes in breast cancer
• analyse 98 breast tumors • 34 metastases-
positive <5 year– bad prognosis
• 44 metastases-negative >5 year– good prognosis
• 18 BRCA1 +• 2 BRCA2 +
‘spora
dic
’
Van’t veer, et. al.
98 breast tumors
analysed
34 ‘bad’ vs.
44 ‘good’
18 BRCA1 +
2 BRCA2 +
microarray with
24.000 genes
5.000 genes showed
expressional changes
in tumors
Different classes of
breast tumors...!
70-gene prognosis classifier for predicting risk
of distant metastasis within 5 yearsVan’t veer, et. al.
Supervisedclustering
Poor prognosis
Good prognosis
Microarray classification vs.
NIH classification
Classification of 158
breast cancer
tumors
Less unnecessary
chemo-therapy
Identification of
genes playing a role
in breast cancerClassical
NIH classification
59%
74%
5 % low risk
95 % high risk
96%
50%
Classification based
on microarray
39 % low risk
61 % high risk
Microarray to be used as
routine clinical screen
The Netherlands Cancer Institute in Amsterdam is the first institution in the world
to use microarray techniques for the routine prognostic screening of cancer
patients. Aiming for a June 2003 start date, the center will use a panoply of 70
genes to assess the tumor profile of breast cancer patients and to determine
which women will receive adjuvant treatment after surgery.
by C. M. Schubert
Nature Medicine
9, 9, 2003.
“Though each tumor is molecularly unique,
there exist common transcriptional cassettes
that underlie biological and clinical properties
of tumors that may be of diagnostic,
prognostic and therapeutic significance”.
Also true for other complex diseases
Expression profiling &
clinical application