Department of Statistics, University of California, Berkeley, and Division of Genetics and Bioinformatics, The Walter and Eliza Hall Institute of Medical Research

Image Analysis on cDNA Microarray Data: Demo of Spot

Jean Yang

October 24, 2000

Genetics & Bioinformatics Meetings

[Figure: cDNA microarray experiment workflow]

1. cDNA clones (probes)
2. PCR product amplification and purification
3. Printing onto the microarray (0.1 nl/spot)
4. Hybridise the mRNA target to the microarray
5. Scanning: excitation by laser 1 and laser 2, then emission
6. Analysis: overlay images and normalise

Scanner

[Figure: scanner schematic. Labelled components: laser, beam-splitter, objective lens, glass slide, dye, detector lens, pinhole, PMT.]

Scanner Process

Dye → Photons → Electrons → Signal

• Laser: excitation
• PMT: amplification
• A/D convertor: filtering, time-space averaging

How to adjust for PMT?

Scan  Cy3  Cy5
1     600  600
2     650  600
3     650  650
4     700  650
5     650  700
6     700  700
7     750  750

[Figure annotations: saturated; very weak]

After normalisation

In addition, the ranking of the genes remains largely unchanged.
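One way to see why the ranking can survive a change of PMT setting is sketched below. It assumes, purely for illustration, that a PMT change acts as a constant multiplicative gain on each channel and that normalisation means centring log2(R/G) at the slide median; the talk does not state either. The intensities are synthetic and the gains (1.8 and 1.2) are hypothetical.

```python
import numpy as np

def median_normalise(red, green):
    """Return log2(R/G) centred so that the slide median is zero."""
    m = np.log2(red / green)
    return m - np.median(m)

def ranks(values):
    """Rank of each value (0 = smallest)."""
    order = np.argsort(values, kind="stable")
    result = np.empty(len(values), dtype=int)
    result[order] = np.arange(len(values))
    return result

# Synthetic intensities, for illustration only (not data from the talk).
rng = np.random.default_rng(0)
green = rng.uniform(200.0, 5000.0, size=500)
red = green * 2.0 ** rng.normal(0.0, 1.0, size=500)

# Hypothetical per-channel gains standing in for a change of PMT setting.
m_scan1 = median_normalise(red, green)
m_scan2 = median_normalise(1.8 * red, 1.2 * green)

max_difference = float(np.max(np.abs(m_scan1 - m_scan2)))
same_ranking = bool(np.array_equal(ranks(m_scan1), ranks(m_scan2)))
print(max_difference < 1e-9, same_ranking)
```

Under a pure gain model the log-ratios shift by a constant, which median centring removes. Real scans also saturate at the top and lose weak spots at the bottom, so the agreement is only approximate in practice.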

Practical Problems 1

• Comet tails
• Likely caused by insufficiently rapid immersion of the slides in the succinic anhydride blocking solution.

Practical Problems 2

Practical Problems 3

High background
• 2 likely causes:
– Insufficient blocking.
– Precipitation of the labeled probe.

Weak signals

Practical Problems 4

Spot overlap
Likely cause: too much rehydration during post-processing.

Steps in Image Processing

1. Addressing: locate centers.

2. Segmentation: classify each pixel as either signal or background (using seeded region growing).

3. Information extraction: for each spot of the array, calculate signal intensity pairs, background, and quality measures.

Addressing

This is the process of assigning coordinates to each of the spots.

Automating this part of the procedure permits high-throughput analysis.

4 by 4 grids, 19 by 21 spots per grid
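As a rough illustration of what addressing has to produce, the sketch below generates nominal spot centres for the layout on this slide. The origin, spot pitch, and grid pitch are hypothetical pixel values, and reading "19 by 21" as 19 rows by 21 columns is an assumption.

```python
import numpy as np

def spot_centres(grid_rows=4, grid_cols=4, spot_rows=19, spot_cols=21,
                 origin=(100.0, 100.0), spot_pitch=15.0,
                 grid_pitch=(320.0, 350.0)):
    """Nominal (y, x) centre of every spot, in pixels.

    All geometry arguments are hypothetical; a real array needs values
    taken from the arrayer settings and refined against the image.
    Returns an array of shape (grid_rows, grid_cols, spot_rows, spot_cols, 2).
    """
    grid_y = origin[0] + grid_pitch[0] * np.arange(grid_rows)
    grid_x = origin[1] + grid_pitch[1] * np.arange(grid_cols)
    spot_y = spot_pitch * np.arange(spot_rows)
    spot_x = spot_pitch * np.arange(spot_cols)
    # Broadcast grid offsets against within-grid offsets.
    y = grid_y[:, None, None, None] + spot_y[None, None, :, None]
    x = grid_x[None, :, None, None] + spot_x[None, None, None, :]
    y, x = np.broadcast_arrays(y, x)
    return np.stack([y, x], axis=-1)

centres = spot_centres()
n_spots = centres[..., 0].size
print(centres.shape, n_spots)
```

With 4 by 4 grids of 19 by 21 spots this gives 6384 nominal centres, which a later step would still have to align with the scanned image.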

Addressing

4 by 4 grids

Within the same batch of print runs: estimate the translation of the grids.

Other problems:
– Mis-registration
– Rotation
– Skew in the array
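The talk does not say how the translation is estimated. One common approach, used here only as a stand-in, is to cross-correlate the scanned image with a template of the expected spot layout and take the location of the correlation peak. The function name `estimate_translation` and the toy 4 by 4 template are hypothetical, and the sketch handles integer translation only, not rotation or skew.

```python
import numpy as np

def estimate_translation(image, template):
    """Integer (dy, dx) such that rolling `template` by it best matches `image`.

    Uses circular cross-correlation computed with the FFT.
    """
    f_image = np.fft.fft2(image - image.mean())
    f_template = np.fft.fft2(template - template.mean())
    corr = np.fft.ifft2(f_image * np.conj(f_template)).real
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    # Convert wrapped indices to signed shifts.
    if dy > image.shape[0] // 2:
        dy -= image.shape[0]
    if dx > image.shape[1] // 2:
        dx -= image.shape[1]
    return int(dy), int(dx)

# Toy template: a 4 by 4 block of 3x3-pixel spots (illustrative only).
template = np.zeros((64, 64))
for row in (10, 20, 30, 40):
    for col in (10, 20, 30, 40):
        template[row - 1:row + 2, col - 1:col + 2] = 1.0

image = np.roll(template, (3, -2), axis=(0, 1))
shift = estimate_translation(image, template)
print(shift)
```

Because a spot grid is nearly periodic, shifts of a whole spot pitch score almost as well as the true one, so the search would normally be restricted to small offsets around the nominal position.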

Segmentation methods

• Fixed circles
• Adaptive circle
• Adaptive shape
– Edge detection.
– Seeded region growing (R. Adams and L. Bischof, 1994): regions grow outwards from the seed points preferentially according to the difference between a pixel’s value and the running mean of values in an adjoining region.
• Histogram methods
– Adaptive threshold.
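The seeded region growing idea described above can be sketched as follows. This is a simplified version, not the implementation used in Spot: it uses 4-connectivity, and a pixel's priority is the difference from the region mean at the time the pixel is queued rather than being re-evaluated as regions change. The function name and the toy image are hypothetical.

```python
import heapq
import numpy as np

def seeded_region_growing(image, seeds):
    """Grow labelled regions from seeds.

    `seeds` is an integer array: 0 = unlabelled, k > 0 = seed of region k.
    Pixels closest in value to a neighbouring region's running mean are
    absorbed first.
    """
    labels = seeds.copy()
    n_labels = int(labels.max())
    sums = np.zeros(n_labels + 1)
    counts = np.zeros(n_labels + 1)
    for k in range(1, n_labels + 1):
        mask = labels == k
        sums[k] = image[mask].sum()
        counts[k] = mask.sum()

    heap = []

    def push_neighbours(r, c, k):
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            rr, cc = r + dr, c + dc
            inside = 0 <= rr < image.shape[0] and 0 <= cc < image.shape[1]
            if inside and labels[rr, cc] == 0:
                delta = abs(image[rr, cc] - sums[k] / counts[k])
                heapq.heappush(heap, (float(delta), int(rr), int(cc), int(k)))

    for r, c in zip(*np.nonzero(labels)):
        push_neighbours(r, c, labels[r, c])

    while heap:
        _, r, c, k = heapq.heappop(heap)
        if labels[r, c] != 0:
            continue  # already claimed by some region
        labels[r, c] = k
        sums[k] += image[r, c]   # update the running mean of region k
        counts[k] += 1
        push_neighbours(r, c, k)
    return labels

# Toy image: one bright 3x3 spot on a flat background.
image = np.full((9, 9), 10.0)
image[3:6, 3:6] = 100.0
seeds = np.zeros((9, 9), dtype=int)
seeds[4, 4] = 1   # foreground seed
seeds[0, 0] = 2   # background seed
labels = seeded_region_growing(image, seeds)
print((labels == 1).sum(), (labels == 2).sum())
```

On this toy image the foreground region stops exactly at the spot boundary, whatever the spot's shape, which is the property that distinguishes adaptive-shape methods from circle-based ones.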

Seeds

Limitation of circular segmentation

– Small spot
– Not circular

Results from SRG

Information Extraction

• Spot intensities
– mean (pixel intensities).
– median (pixel intensities).
• Background values
– Local
– Morphological opening
– Constant (global)
– None
• Quality information

Take the average
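A minimal sketch of two of the quantities listed here: foreground mean and median over a spot mask, and a background estimate from a morphological opening (an erosion followed by a dilation with a square window). The window size, function names, and toy image are hypothetical, and this is not Spot's implementation. The window has to be wider than the largest spot for the opening to remove the spots.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def _window_filter(image, size, reducer, pad_value):
    """Apply `reducer` (np.min or np.max) over each size x size window."""
    pad = size // 2
    padded = np.pad(image.astype(float), pad, mode="constant",
                    constant_values=pad_value)
    windows = sliding_window_view(padded, (size, size))
    return reducer(windows, axis=(-2, -1))

def morphological_opening(image, size):
    """Grey-scale opening with a square window of odd width `size`."""
    eroded = _window_filter(image, size, np.min, np.inf)
    return _window_filter(eroded, size, np.max, -np.inf)

def spot_summary(image, spot_mask, background):
    """Foreground mean/median and background level for one spot."""
    pixels = image[spot_mask]
    bg = float(np.median(background[spot_mask]))
    return {
        "mean": float(pixels.mean()),
        "median": float(np.median(pixels)),
        "background": bg,
        "median_minus_background": float(np.median(pixels)) - bg,
    }

# Toy image: flat background of 50 with one 3x3 spot of 500.
image = np.full((15, 15), 50.0)
image[6:9, 6:9] = 500.0
spot_mask = image > 100.0

background = morphological_opening(image, size=7)
summary = spot_summary(image, spot_mask, background)
print(summary)
```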

Local Backgrounds

Statistical Software - R

Who are we comparing?

• Spot (SRG)
– valley
– morph
• ScanAlyze (fixed circle)
• GenePix (adaptive circle)
• QuantArray
– Fixed circle
– Adaptive (Chen’s method)
– Histogram

How are we comparing?

• Foreground and Background Intensities

• M vs A plot

• Within slide variability

• Between slide variability

• Ability to differentiate important genes from noise
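For the M vs A plot, the usual convention is M = log2(R/G) and A = (log2 R + log2 G)/2, where R and G are the red and green intensities of a spot. A minimal sketch with made-up intensities:

```python
import numpy as np

def ma_values(red, green):
    """Return (M, A): log-ratio and average log-intensity per spot."""
    log_r = np.log2(red)
    log_g = np.log2(green)
    m = log_r - log_g
    a = 0.5 * (log_r + log_g)
    return m, a

# Illustrative intensities only (not data from the talk).
red = np.array([1024.0, 256.0, 512.0])
green = np.array([256.0, 256.0, 2048.0])
m, a = ma_values(red, green)
print(m, a)
```

Plotting M against A shows intensity-dependent bias in the log-ratios more clearly than plotting log R against log G.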

Foreground and Background comparison

Does the image analysis matter?

[Figure panels: Spot.nbg, Spot.morph, Spot.valley, ScanAlyze]

Background makes a difference

Background method   Segmentation method   Exp1   Exp2
No background       S.nbg                    6      6
                    GP.nbg                   7      6
                    SA.nbg                   6      6
                    QA.fix.nbg               7      6
                    QA.hist.nbg              7      6
                    QA.adp.nbg              14     14
Local surrounding   S.valley                17     21
                    GP                      11     11
                    SA                      12     14
                    QA.fix                  18     23
                    QA.hist                  9      8
                    QA.adp                  27     26
Others              S.morph                  9      9
                    S.const                 14     14

Medians of the SD of log2(R/G) for 8 replicated spots, multiplied by 100 and rounded to the nearest integer.
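The summary statistic described in the caption can be computed along these lines. The sketch assumes the log-ratios are arranged as one row per gene and one column per replicate spot, and that the SD is the sample SD (divisor n - 1); neither detail is given on the slide. The numbers are made up.

```python
import numpy as np

def replicate_sd_score(log_ratios):
    """Median over genes of the SD across replicates, times 100, rounded.

    `log_ratios` has shape (n_genes, n_replicates) and holds log2(R/G).
    """
    sd = np.std(log_ratios, axis=1, ddof=1)  # sample SD: an assumption
    return int(round(100 * float(np.median(sd))))

# Three hypothetical genes, 8 replicate spots each.
log_ratios = np.array([
    [0.1, -0.1] * 4,
    [0.0] * 8,
    [0.2, -0.2] * 4,
])
score = replicate_sd_score(log_ratios)
print(score)
```

A smaller score means the replicated spots agree more closely, which is how the table compares the background methods.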

Between slide variability

T

Adjusted p-values

Rank  S.nbg  SA.nbg  GP.nbg  QA.fix.nbg  QA.adp.nbg  QA.hist.nbg  S.valley    GP  QA.fix  QA.adp  QA.hist  S.morph  S.const
   1   0.00    0.00    0.00        0.00        0.00         0.00      0.00  0.00    0.00    0.00     0.00     0.00     0.00
   2   0.00    0.00    0.00        0.00        0.00         0.00      0.00  0.00    0.00    0.00     0.00     0.00     0.00
   3   0.00    0.00    0.00        0.00        0.00         0.00      0.00  0.00    0.00    0.00     0.00     0.00     0.00
   4   0.00    0.00    0.00        0.00        0.00         0.00      0.00  0.00    0.00    0.00     0.00     0.00     0.00
   5   0.00    0.00    0.01        0.00        0.04         0.00      0.00  0.00    0.00    0.00     0.00     0.00     0.00
   6   0.00    0.00    0.01        0.02        0.05         0.01      0.00  0.01    0.00    0.49     0.00     0.00     0.00
   7   0.01    0.02    0.01        0.03        0.36         0.02      0.01  0.01    0.02    0.50     0.02     0.00     0.01
   8   0.01    0.03    0.05        0.07        0.53         0.03      0.01  0.02    0.27    0.55     0.03     0.00     0.01
   9   0.56    0.15    0.21        0.10        0.55         0.14      0.26  0.19    0.40    0.56     0.03     0.60     0.28
  10   0.67    0.16    0.25        0.21        0.81         0.41      0.74  0.40    0.44    0.81     0.11     0.64     0.73

Acknowledgments

Terry Speed

Michael Buckley

Sandrine Dudoit

Natalie Roberts

Ben Bolstad

CSIRO Image Analysis Group

Ryan Lagerstrom

Richard Beare

Hugues Talbot

Kevin Cheong

Matt Callow (LBL)

Percy Luu (USB)

Dave Lin (USB)

Vivian Pang (USB)

Elva Diaz (USB)

WEHI Bioinformatics group

Steps in Image Processing

3. Information Extraction

• Spot intensities
– mean (pixel intensities).
– median (pixel intensities).
– Pixel variation (IQR of log (pixel intensities)).
• Background values
– Local
– Morphological opening
– Constant (global)
– None
• Quality information

[Figure labels: Signal, Background]
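The pixel variation measure named on this slide, the interquartile range of the log pixel intensities within a spot, might be computed as below. The base of the logarithm is not given on the slide; base 2 is assumed here, and the pixel values are made up.

```python
import numpy as np

def pixel_variation(pixels):
    """IQR of log2 pixel intensities within one spot (base 2 is an assumption)."""
    logs = np.log2(np.asarray(pixels, dtype=float))
    q75, q25 = np.percentile(logs, [75, 25])
    return float(q75 - q25)

# Illustrative spot: nine pixels with intensities 1, 2, 4, ..., 256.
pixels = 2.0 ** np.arange(9)
iqr = pixel_variation(pixels)
print(iqr)
```

Working on the log scale makes the measure reflect relative rather than absolute spread, so bright and faint spots can be compared.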

Addressing

[Figure: registration]

Quality Measurements

• Array
– Correlation between spot intensities.
– Percentage of spots with no signals.
– Distribution of spot signal area.
• Spot
– Signal / Noise ratio.
– Variation in pixel intensities.
– Identification of “bad spot” (spots with no signal).
• Ratio (2 spots combined)
– Circularity

T