Normalization for cDNA Microarray Data. Yee Hwa Yang, Sandrine Dudoit, Percy Luu and Terry Speed. SPIE BIOS 2001, San Jose, CA, January 22, 2001


Page 1

Normalization for cDNA Microarray Data

Yee Hwa Yang, Sandrine Dudoit, Percy Luu and Terry Speed.

SPIE BIOS 2001, San Jose, CA

January 22, 2001

Page 2

Normalization issues

Within-slide
– What genes to use
– Location
– Scale

Paired-slides (dye swap)
– Self-normalization

Between slides

Page 3

Within-Slide Normalization

—Normalization balances red and green intensities.

—Imbalances can be caused by
    – Different incorporation of dyes
    – Different amounts of mRNA
    – Different scanning parameters

—In practice, we usually need to increase the red intensity a bit to balance the green

Page 4

Methods

log2(R/G) -> log2(R/G) - c = log2(R/(kG)), where c = log2(k)

Standard Practice (in most software)

c is a constant such that normalized log-ratios have zero mean or median.

Our Preference:

c is a function of overall spot intensity and print-tip-group.
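The standard-practice version, where c is a single constant chosen so that the normalized log-ratios have zero median, can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' software: the function names and the toy intensities are assumptions made for the example, and it assumes strictly positive R and G values.

```python
import math
from statistics import median


def log_ratios(red, green):
    """Return M = log2(R/G) for every spot."""
    return [math.log2(r / g) for r, g in zip(red, green)]


def median_normalize(red, green):
    """Shift all log-ratios by one constant c so that their median is zero."""
    m = log_ratios(red, green)
    c = median(m)      # additive constant on the log scale
    k = 2.0 ** c       # equivalent multiplicative factor applied to G
    return [mi - c for mi in m], c, k


# Toy intensities (hypothetical): most spots have R/G = 1/2, one has R/G = 2.
red = [100.0, 200.0, 400.0, 800.0, 1600.0]
green = [200.0, 400.0, 800.0, 1600.0, 800.0]
normalized, c, k = median_normalize(red, green)
print(c, k, normalized)
```

Subtracting c from log2(R/G) gives the same values as computing log2(R/(kG)) with k = 2^c, which is why the slide writes the adjustment both ways.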

What genes to use?
— All genes on the array
— Constantly expressed genes (housekeeping)
— Controls
    – Spiked controls (e.g. plant genes)
    – Genomic DNA titration series
— Other set of genes

Page 5

Experiment

mRNA samples
R = Apo A1 KO mouse liver
G = Control mouse liver
(All C57Bl/6)

Probes: ~6,000 cDNAs, including 200 related to lipid metabolism.

KO #8

Page 6

M vs. A

M = log2(R / G)
A = log2(R*G) / 2
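The two coordinates of the M vs. A plot can be computed directly from the spot intensities. The sketch below is illustrative only; the function name `ma_values` and the example intensities are assumptions, and the intensities are taken to be strictly positive.

```python
import math


def ma_values(red, green):
    """Return (M, A): M = log2(R/G) is the log-ratio, A = log2(R*G)/2 the mean log-intensity."""
    m = [math.log2(r / g) for r, g in zip(red, green)]
    a = [math.log2(r * g) / 2 for r, g in zip(red, green)]
    return m, a


# Two hypothetical spots with the same overall intensity but opposite ratios.
m, a = ma_values([1024.0, 256.0], [256.0, 1024.0])
print(m, a)
```

Both spots share the same A because A depends only on the product R*G, while M changes sign when the two channels are exchanged.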

Page 7

Normalization - Median

—Assumption: Changes roughly symmetric

—First panel: smooth density of log2G and log2R.

—Second panel: M vs. A plot with median set to zero

Page 8

Normalization - lowess

— Global lowess

— Assumption: changes roughly symmetric at all intensities.
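An intensity-dependent adjustment of this kind replaces the constant c with a smooth curve c(A) fitted to the M vs. A scatter, and then subtracts it. The sketch below is a simplified stand-in for lowess: it uses tricube-weighted local linear regression but omits the robustness iterations of the full lowess procedure. The function names and the `frac` parameter (fraction of spots used in each local fit) are assumptions for illustration, not settings taken from the talk.

```python
import math


def tricube(u):
    """Tricube kernel weight for a scaled distance u."""
    u = abs(u)
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def local_linear_fit(a_values, m_values, frac=0.5):
    """Estimate c(A) at every spot by tricube-weighted local linear regression."""
    n = len(a_values)
    k = min(n, max(3, math.ceil(frac * n)))   # neighbours used per local fit
    fitted = []
    for a0 in a_values:
        dists = sorted(abs(a - a0) for a in a_values)
        h = max(dists[k - 1], 1e-12)          # bandwidth: k-th nearest distance
        w = [tricube((a - a0) / h) for a in a_values]
        sw = sum(w)                           # > 0: the spot itself has weight 1
        mean_a = sum(wi * a for wi, a in zip(w, a_values)) / sw
        mean_m = sum(wi * m for wi, m in zip(w, m_values)) / sw
        sxx = sum(wi * (a - mean_a) ** 2 for wi, a in zip(w, a_values))
        sxy = sum(wi * (a - mean_a) * (m - mean_m)
                  for wi, a, m in zip(w, a_values, m_values))
        slope = sxy / sxx if sxx > 1e-12 else 0.0   # fall back to a local mean
        fitted.append(mean_m + slope * (a0 - mean_a))
    return fitted


def lowess_normalize(a_values, m_values, frac=0.5):
    """Subtract the fitted curve c(A) from every log-ratio M."""
    fit = local_linear_fit(a_values, m_values, frac)
    return [m - c for m, c in zip(m_values, fit)]


# Hypothetical data with a purely intensity-dependent dye bias: M = 0.2*A - 2.
a = [6.0 + i for i in range(10)]
m = [0.2 * ai - 2.0 for ai in a]
normalized = lowess_normalize(a, m)
print(normalized)
```

Because the toy bias is an exact straight line in A, the local fits reproduce it and the normalized log-ratios come out at zero up to rounding. On real data the curve follows the bulk of the spots, which is where the symmetry assumption matters.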

Page 9

Normalisation - print-tip-group

Assumption: For every print-tip-group, changes roughly symmetric at all intensities.
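The grouping step can be sketched as follows: spots are split by print-tip-group, a curve is fitted within each group, and that group's curve is subtracted from its own spots. To keep the example short, the default `smoother` here is a flat line at the group median, which is a deliberate simplification and not the lowess fit the slides describe. A lowess-style function with the same signature could be passed instead. All names are hypothetical.

```python
from collections import defaultdict
from statistics import median


def median_curve(a_values, m_values):
    """Stand-in smoother: a flat curve at the median of M (ignores A)."""
    c = median(m_values)
    return [c for _ in a_values]


def normalize_by_group(a_values, m_values, groups, smoother=median_curve):
    """Subtract a separately fitted curve within each print-tip-group."""
    index_by_group = defaultdict(list)
    for idx, g in enumerate(groups):
        index_by_group[g].append(idx)

    normalized = [0.0] * len(m_values)
    for idxs in index_by_group.values():
        a_g = [a_values[i] for i in idxs]
        m_g = [m_values[i] for i in idxs]
        fit = smoother(a_g, m_g)              # one fitted value per spot in the group
        for i, f in zip(idxs, fit):
            normalized[i] = m_values[i] - f
    return normalized


# Hypothetical spots from two print tips with different offsets.
a = [8.0, 9.0, 10.0, 8.0, 9.0, 10.0]
m = [0.5, 0.75, 1.0, -1.0, -0.5, -0.25]
tips = [1, 1, 1, 2, 2, 2]
result = normalize_by_group(a, m, tips)
print(result)
```

Keeping the smoother as a parameter separates the per-group bookkeeping from the choice of curve, so the same wrapper serves for a constant, a line, or a lowess fit.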

Page 10

M vs. A - after print-tip-group normalization

Page 11

Effects of Location Normalisation

Panels: before normalisation; after print-tip-group normalisation.

Page 12

Within print-tip-group box plots for print-tip-group normalized M

Page 13

Taking scale into account

Assumptions:

– All print-tip-groups have the same spread.

The true log-ratio is $\mu_{ij}$, where $i$ indexes print-tip-groups and $j$ indexes spots.

The observed log-ratio is $M_{ij}$, where

$$M_{ij} = a_i \mu_{ij}$$

A robust estimate of $a_i$ is

$$\hat{a}_i = \frac{\mathrm{MAD}_i}{\sqrt[I]{\prod_{i=1}^{I} \mathrm{MAD}_i}}$$

where

$$\mathrm{MAD}_i = \mathrm{median}_j \{ |M_{ij} - \mathrm{median}_j(M_{ij})| \}$$
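A sketch of the scale adjustment, under the assumption that each group's location-normalized log-ratios are divided by that group's estimated scale factor (MAD of the group divided by the geometric mean of all group MADs). The function names and the dictionary layout are hypothetical, and every group is assumed to have a non-zero MAD.

```python
import math
from statistics import median


def mad(values):
    """Median absolute deviation from the median."""
    center = median(values)
    return median([abs(v - center) for v in values])


def scale_normalize(groups):
    """groups maps a print-tip-group label to its list of log-ratios M.

    Returns the rescaled groups and the estimated scale factor per group.
    """
    mads = {g: mad(vals) for g, vals in groups.items()}
    # Geometric mean of the MADs, computed on the log scale.
    geo_mean = math.exp(sum(math.log(x) for x in mads.values()) / len(mads))
    factors = {g: mads[g] / geo_mean for g in mads}
    scaled = {g: [v / factors[g] for v in vals] for g, vals in groups.items()}
    return scaled, factors


# Two hypothetical print-tip-groups with MADs of 1 and 4.
groups = {1: [-1.0, 0.0, 1.0], 2: [-4.0, 0.0, 4.0]}
scaled, factors = scale_normalize(groups)
print(factors, scaled)
```

After rescaling, both groups have the same MAD, equal to the geometric mean of the original MADs (2 in this toy case). The factors multiply to 1, so the overall scale of the slide is left unchanged.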

Page 14

Effect of location + scale normalization

Page 15

Effect of location + scale normalization

Page 16

Comparing different normalisation methods

Page 17

Follow-up Experiment

— 50 distinct clones with largest absolute t-statistics from the first experiment.

— 72 other clones.

— Spot each clone 8 times.

— Two hybridizations:

Slide 1, treatment (ttt) -> red, control (ctl) -> green.

Slide 2, treatment (ttt) -> green, control (ctl) -> red.

Page 18

Follow-up Experiment

Page 19

Paired-slides: dye swap

— Slide 1, M = log2 (R/G) - c

— Slide 2, M’ = log2 (R’/G’) - c’

Combine by subtracting the normalized log-ratios:

[ (log2 (R/G) - c) - (log2 (R’/G’) - c’) ] / 2

= [ log2 (R/G) + log2 (G’/R’) ] / 2

= [ log2 (RG’/GR’) ] / 2

provided c = c’.

Assumption: the separate normalizations are the same.
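A small numerical check of the cancellation, written as a sketch with hypothetical names: `red1`/`green1` stand for R and G on slide 1, and `red2`/`green2` for R’ and G’ on the dye-swapped slide 2. The toy spot has a true treatment/control ratio of 4 and, on both slides, a dye bias that doubles the red channel.

```python
import math


def self_normalize(red1, green1, red2, green2):
    """Combine a dye-swap pair as log2(R*G' / (G*R')) / 2 for every spot."""
    return [0.5 * math.log2((r1 * g2) / (g1 * r2))
            for r1, g1, r2, g2 in zip(red1, green1, red2, green2)]


# Slide 1: treatment in red.   Observed R/G   = 4 * 2     = 8.
# Slide 2: treatment in green. Observed R'/G' = (1/4) * 2 = 1/2.
combined = self_normalize([800.0], [100.0], [100.0], [200.0])
print(combined)
```

No constant c is estimated anywhere in this calculation. The shared bias drops out of the difference, and the result recovers log2(4) = 2, but only because the bias was taken to be the same on both slides, as the stated assumption requires.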

Page 20

Verify Assumption

Page 21

Result of Self-Normalization

Plot of (M - M’)/2 vs. (A + A’)/2

Page 22

Summary

Case 1: A few genes that are likely to change

Within-slide:
– Location: print-tip-group lowess normalization.
– Scale: for all print-tip-groups, adjust MAD to equal the geometric mean of MAD for all print-tip-groups.

Between slides (experiments):
– An extension of within-slide scale normalization (future work).

Case 2: Many genes changing (paired-slides)
– Self-normalization: taking the difference of the two log-ratios.
– Check using controls or known information.

Page 23

http://www.stat.berkeley.edu/users/terry/zarray/Html/

Technical Reports from Terry’s group:

http://www.stat.Berkeley.EDU/users/terry/zarray/Html/papersindex.html

— Comparison of Discrimination Methods for the Classification of Tumors Using Gene Expression Data

— Statistical methods for identifying differentially expressed genes in replicated cDNA microarray experiments.

— Comparison of methods for image analysis on cDNA microarray data.

— Normalization for cDNA Microarray Data

Statistical software R

http://lib.stat.cmu.edu/R/CRAN/

Page 24

Acknowledgments

Terry Speed

Sandrine Dudoit

Natalie Roberts

Ben Bolstad

Matt Callow (LBL)

John Ngai’s Lab (UCB)

Percy Luu

Dave Lin

Vivian Pang

Elva Diaz