micro arrays ii - image analysis and data pre-processing(1)

Upload: beto-cavazos

Post on 06-Apr-2018


  • 8/2/2019 Micro Arrays II - Image Analysis and Data Pre-Processing(1)

    Functional Genomics (GENÓMICA FUNCIONAL)
    DR. VÍCTOR [email protected]

    A7-421

    Microarrays Image Analysis

    Microarray - Pre-Processing Purpose

    Microarray Image Analysis: TECHNOLOGIES

    DNA probes: Oligos (~20, 40 nt); Target (cDNA, PCR products, etc.)
    Copies per gene: usually 1; usually 3
    Organization: sectors (print-tip); n x m probesets

    [Figure labels: x sectors (~3) by y sectors (~3), each sector with i x j spots (18x20); n probesets (~100) by m probesets (~100); probeset with perfect match probes (pm) and mismatch probes (mm); empty spots; landing lights; controls]

    Microarray - Image Analysis: TECHNOLOGIES

    Two dyes: 10,000 genes * 2 dyes * 3 copies/gene * ~40 pixels/gene = 2,400,000 values, reduced to only 10,000 values

    Affymetrix: 10,000 genes * 20 oligos * 2 (pm, mm) * ~36 pixels/gene = 14,400,000 values, reduced to only 10,000 values

    RAW DATA -> Image Analysis, Pre-processing -> only 10,000 values
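The data-reduction arithmetic on this slide can be checked with a few lines of Python. This is only a sketch: the factors are the approximate figures quoted on the slide, and the function name `raw_values` is made up for illustration.

```python
def raw_values(*factors):
    """Multiply the per-array factors to get the number of raw pixel values."""
    total = 1
    for factor in factors:
        total *= factor
    return total

genes = 10_000
two_dye = raw_values(genes, 2, 3, 40)        # genes * dyes * copies/gene * pixels
affymetrix = raw_values(genes, 20, 2, 36)    # genes * oligos * (pm, mm) * pixels

print(two_dye, affymetrix)                    # 2400000 14400000
print(two_dye // genes, affymetrix // genes)  # reduction factors: 240 1440
```

In both technologies, image analysis and pre-processing collapse hundreds or thousands of raw pixel values into a single value per gene.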

    Image Analysis

    Addressing: Estimate location of spot centers.
    Segmentation: Classify pixels as foreground or background.
    Extraction: For each spot on the array and each dye, compute
      - foreground intensities
      - background intensities
      - quality measures.

    Addressing: done by the GeneChip (Affymetrix) software

    Addressing (by grid, GenePix)

    Segmentation

    - Circular feature
    - Irregular feature shape

    Finally, compute the average.
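As a rough illustration of circular-feature segmentation followed by extraction, the sketch below classifies the pixels of one small spot patch as foreground (inside a fixed circle) or background (outside), then averages the foreground. The patch values, the fixed radius, and the use of the median for the background are assumptions chosen for the example, not the method of any particular image-analysis package.

```python
import statistics

def segment_spot(patch, cx, cy, radius):
    """Fixed-circle segmentation of one spot patch (a list of pixel rows)."""
    foreground, background = [], []
    for y, row in enumerate(patch):
        for x, value in enumerate(row):
            inside = (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2
            (foreground if inside else background).append(value)
    return foreground, background

def extract_spot(patch, cx, cy, radius):
    """Summarize a spot: mean of foreground pixels, median of background pixels."""
    fg, bg = segment_spot(patch, cx, cy, radius)
    return {"foreground": statistics.mean(fg), "background": statistics.median(bg)}

# Hypothetical 5x5 patch with a bright spot in the middle.
patch = [
    [10, 10, 10, 10, 10],
    [10, 90, 100, 90, 10],
    [10, 100, 120, 100, 10],
    [10, 90, 100, 90, 10],
    [10, 10, 10, 10, 10],
]
print(extract_spot(patch, cx=2, cy=2, radius=1))
```

An irregular-feature segmentation would replace the circle test with a per-pixel rule (for example an intensity threshold), but the extraction step stays the same.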

    Background Reduction

    Extraction: Determining Background

    Image Analysis

    Segmentation (spot detection) -> Background estimation -> Value

    Value = Spot Intensity - Spot Background

    Gene      Sample 1   Sample 1
    Gene 1    100        98
    Gene 2    209        4209
    Gene 3    7          2
    ...
    Gene k    9882       9711
    ...
    Gene N    2298       28

    Data Transformation: two dyes

    Gene      Sample 1   Sample 1
    Gene 1    100        98
    Gene 2    209        4209
    Gene 3    7          2
    ...
    Gene k    9882       9711
    ...
    Gene N    2298       28

    G = Sample 1, R = Sample 1

    Log2 transformation: Log2(G = Sample 1), Log2(R = Sample 1)

    Microarray Bioinformatics - D. Stekel (Cambridge, 2003)

    Data Transformation: two dyes

    R and G (log2 scale): 1 value?

    M = log2(R/G)       (deviation)
    A = log2(R*G)/2     (intensity)

    MA-Plot: M (deviation) versus A (intensity), with G = Sample 1 and R = Sample 1
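The M and A definitions can be tried out directly. The sketch below assumes background-corrected, strictly positive intensities, and the example numbers are only illustrative (which channel is G and which is R is an assumption here).

```python
import math

def ma_values(red, green):
    """Return (M, A) for one spot from background-corrected R and G intensities."""
    m = math.log2(red / green)        # M = log2(R/G): deviation between channels
    a = math.log2(red * green) / 2    # A = log2(R*G)/2: average log intensity
    return m, a

# Illustrative intensities; the channel assignment is assumed.
green = [100, 209, 7, 9882, 2298]
red = [98, 4209, 2, 9711, 28]
for g, r in zip(green, red):
    m, a = ma_values(r, g)
    print(f"G={g:>5} R={r:>5}  M={m:+.2f}  A={a:.2f}")
```

A spot with equal signal in both channels gets M = 0, and a spot whose red signal is four times its green signal gets M = 2.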

    Normalization: 2 dyes

    "Within" (2-color technologies)
    (assumption: the majority of genes do not change)

    [Figure: MA-plot with A = (log2(G)+log2(R)) / 2 on the x-axis and M = log2(R)-log2(G) on the y-axis]

    Normalization: 2 dyes

    "Within" (2-color technologies)
    (assumption: the majority of genes do not change)

    [Figure panels: Before; After]

    Normalization: 2 dyes

    "Within", spatial (2-color technologies)

    [Figure panels: Before normalization; After global loess normalization; After loess normalization by sector (print-tip)]
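To make the global versus by-sector distinction concrete, here is a deliberately simplified sketch. It does not implement loess (which fits an intensity-dependent curve); it only subtracts the median M, either over the whole array or within each sector, relying on the same "majority of genes do not change" assumption. The function names and the example values are hypothetical.

```python
import statistics
from collections import defaultdict

def normalize_global(m_values):
    """Center all M values of one array on zero using the array-wide median."""
    center = statistics.median(m_values)
    return [m - center for m in m_values]

def normalize_by_sector(m_values, sectors):
    """Center M values on zero separately within each sector (print-tip group)."""
    by_sector = defaultdict(list)
    for m, sector in zip(m_values, sectors):
        by_sector[sector].append(m)
    centers = {s: statistics.median(v) for s, v in by_sector.items()}
    return [m - centers[s] for m, s in zip(m_values, sectors)]

# Hypothetical array with two sectors that have opposite dye biases.
m = [0.5, 0.7, 0.6, -0.4, -0.2, -0.3]
sectors = [1, 1, 1, 2, 2, 2]
print(normalize_global(m))
print(normalize_by_sector(m, sectors))
```

In this toy example the global correction leaves each sector biased, while the per-sector correction centers both, which is the motivation for normalizing by print-tip.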

    Data Transformation: one dye

    Gene      Sample 1
    Gene 1    100
    Gene 2    209
    Gene 3    7
    ...
    Gene k    9882
    ...
    Gene N    2298

    Log2 transformation

    Normalization: 1 or 2 dyes

    Between-slides

    [Figure: density of log intensity per array; panels: Before normalization, After normalization]

    Methods:
    - quantile
    - MAD (median absolute deviation)
    - scale
    - qspline
    - invariantset
    - loess
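Of the between-slide methods listed, quantile normalization is the easiest to sketch: every array is forced to share the same distribution, taken as the mean of the sorted values across arrays. The version below is a minimal illustration that assumes equal-length arrays without missing values and breaks ties by position; library implementations treat ties and missing data more carefully.

```python
def quantile_normalize(arrays):
    """Quantile-normalize equal-length lists of log intensities (one list per array)."""
    n = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    # Reference distribution: mean of the k-th smallest value across arrays.
    reference = [sum(s[k] for s in sorted_arrays) / len(arrays) for k in range(n)]
    normalized = []
    for a in arrays:
        order = sorted(range(n), key=lambda i: a[i])  # indices, smallest value first
        out = [0.0] * n
        for rank, i in enumerate(order):
            out[i] = reference[rank]                  # same rank -> same value
        normalized.append(out)
    return normalized

arrays = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
print(quantile_normalize(arrays))  # [[5.5, 1.5, 3.5], [3.5, 1.5, 5.5]]
```

After normalization the density curves of all arrays coincide, which is what the "after" panel of such a figure would show.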

    Summarization: Affymetrix

    Summarization = "Average"(Intensities)

    Oligonucleotide-dependent technologies

    Usual methods:
    - tukey-biweight
    - av-diff
    - median-polish

    PM, MM

    The "summarization" equivalent in two-dye technologies is the average of gene replicates within the slide.
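As one example of the listed methods, the sketch below applies Tukey's median polish to a small probes-by-arrays matrix of log2 intensities and reports one summarized value per array (overall effect plus array effect). It is a bare illustration: it runs a fixed number of sweeps instead of testing convergence, and it assumes the intensities were already background-corrected and normalized. The example matrix is invented.

```python
import statistics

def median_polish(matrix, n_iter=10):
    """Median polish of a probes x arrays matrix; returns one value per array."""
    resid = [row[:] for row in matrix]
    n_rows, n_cols = len(resid), len(resid[0])
    overall = 0.0
    row_eff = [0.0] * n_rows   # probe effects
    col_eff = [0.0] * n_cols   # array effects
    for _ in range(n_iter):
        # Sweep row medians out of the residuals into the probe effects.
        for i in range(n_rows):
            med = statistics.median(resid[i])
            row_eff[i] += med
            resid[i] = [v - med for v in resid[i]]
        med = statistics.median(col_eff)
        overall += med
        col_eff = [c - med for c in col_eff]
        # Sweep column medians out of the residuals into the array effects.
        for j in range(n_cols):
            med = statistics.median(resid[i][j] for i in range(n_rows))
            col_eff[j] += med
            for i in range(n_rows):
                resid[i][j] -= med
        med = statistics.median(row_eff)
        overall += med
        row_eff = [r - med for r in row_eff]
    return [overall + c for c in col_eff]

# Hypothetical probeset: 3 probes (rows) measured on 3 arrays (columns).
probeset = [[8, 10, 9], [9, 11, 10], [7, 9, 8]]
print(median_polish(probeset))  # [8.0, 10.0, 9.0]
```

Because medians are used instead of means, a single outlying probe has little influence on the summarized value.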

    Microarrays: Filtering / Treating Undefined Values

    Problems:
    - Some spots may be defective from the printing process
    - Some spots could not be detected
    - Some spots may be damaged during the assay
    - Artefacts may be present (bubbles, etc.)

    Treatments:
    - Use replicated spots as averages
    - Remove unrecoverable genes
    - Remove problematic spots in all arrays
    - Infer values using computational methods (warning)
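The first two treatments can be sketched as follows: replicate spots are averaged while skipping undefined ones, and a gene with no usable spot is reported as unrecoverable. Marking undefined spots with `None` and the gene names are assumptions made for this example.

```python
import statistics

def average_replicates(replicates):
    """Average the defined replicate spots of each gene; None marks an undefined spot."""
    averaged, removed = {}, []
    for gene, values in replicates.items():
        valid = [v for v in values if v is not None]
        if valid:
            averaged[gene] = statistics.mean(valid)
        else:
            removed.append(gene)  # unrecoverable: no usable spot on this array
    return averaged, removed

# Hypothetical log2 values for three replicate spots per gene.
spots = {
    "gene1": [7.0, None, 7.4],
    "gene2": [None, None, None],
    "gene3": [5.0, 5.0, 5.0],
}
averaged, removed = average_replicates(spots)
print(averaged, removed)
```

Inferring (imputing) missing values is intentionally left out of this sketch, in line with the warning on the slide.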

    Microarray Data Filtering

    - More than 10,000 genes
    - Too much data increases computation time and analysis complexity

    Remove:
    - Genes that do not change significantly
    - Undefined genes
    - Low expression

    Keeping:
    - Large signal-to-noise ratio
    - Large statistical significance
    - Large variability
    - Large expression
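A filter along these lines might look like the sketch below, which drops undefined genes, genes with low mean expression, and genes with little variability across arrays. The thresholds (`min_mean`, `min_sd`) and the example data are arbitrary placeholders; suitable cut-offs depend on the platform and the experiment.

```python
import statistics

def filter_genes(expression, min_mean=7.0, min_sd=0.5):
    """Keep genes that are defined, expressed, and variable across arrays."""
    kept = {}
    for gene, values in expression.items():
        if None in values:
            continue  # undefined gene
        if statistics.mean(values) < min_mean:
            continue  # low expression
        if statistics.stdev(values) < min_sd:
            continue  # does not change noticeably
        kept[gene] = values
    return kept

# Hypothetical log2 expression values across four arrays.
data = {
    "geneA": [10.0, 12.0, 9.0, 13.0],  # expressed and variable: kept
    "geneB": [10.0, 10.1, 9.9, 10.0],  # flat: removed
    "geneC": [3.0, 5.0, 2.0, 6.0],     # low expression: removed
    "geneD": [10.0, None, 12.0, 9.0],  # undefined value: removed
}
print(list(filter_genes(data)))  # ['geneA']
```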

    Microarray Pre-Processing Summary

    Data Processing (Affymetrix, two dyes)

    a) Microarray image: scanning, spot detection, intensity value, background detection & subtraction
    b) Image analysis and background subtraction
    c) Transformation: A = log2(R*G)/2, M = log2(R/G)
    d) Normalization: within, between

    Image Analysis Exercise

    - Data processing of placental microarrays (Dr. Hugo A. Barrera Saldaña)
    - Paper in Mol. Med. 2007: "DNA Microarrays - A Powerful Genomic Tool for Biomedical Research", Trevino - Barrera, Mol Med 2007
    - Search PubMed for "Trevino V"

    Experimental Design
    Goal: Differential Expression

    - Placenta 1, Placenta 2, Reference Pool
    - mRNA Extraction
    - Labelling (Green, Red)
    - Microarray Hybridization (by duplicates)
    - Scanning & Data Processing: Image Analysis, Within Normalization (per array), Between Normalization (all arrays)
    - Detection of Differentially Expressed Genes: t-test, H0: μ = 0; p-values correction: False Discovery Rate
    - Validation and Analysis: Comparison With Known Tissue Specific Genes (controls)

    (Dr. Hugo Barrera)
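The slide mentions a False Discovery Rate correction without naming a procedure. A common choice is Benjamini-Hochberg, sketched below as an assumption and not necessarily what was used in the exercise; the p-values are invented.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical t-test p-values for four genes.
p = [0.01, 0.04, 0.03, 0.005]
print(benjamini_hochberg(p))  # approximately [0.02, 0.04, 0.04, 0.02]
```

Genes whose adjusted p-value falls below the chosen FDR level (for example 0.05) would be called differentially expressed.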

    Experimental Design - Slides

    SLIDES' SCANNINGS

    GROUP  SLIDE   CY3 (GREEN)  CY5 (RED)  COMMENTS
    1a     52 A V  Sample       Control
    1b     52 B V  Sample       Control
    2a     51 A V  Sample       Control    RIGHT TOP GROUP
    2b     51 B V  Sample       Control    RIGHT BOTTOM GROUP
    3a     56 A V  Control      Sample
    3b     56 B V  Control      Sample
    4a     A 54 V  Control      Sample
    4b     B 54 V  Control      Sample
    5a     A 55 V  Control      Control    LEFT TOP GROUP
    5b     B 55 V  Control      Control    LEFT BOTTOM GROUP
    6a     A 53 V  Control      Control
    6b     B 53 V  Control      Control

    Download images from http://bioinformatica.mty.itesm.mx/?q=node/68

    Read Images

    - Read BOTH images together using SpotFinder
    - Mark file 1 as "Cy3" = Green
    - Mark file 2 as "Cy5" = Red
    - Adjust image brightness and contrast

    Create Grid

    - Metarows = 12, Metacolumns = 4
    - Rows = 24, Columns = 24
    - Pixels = 450 (of the 24 x 24 spots)
    - Spacing = 18 (between metacolumns and metarows)
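A quick sanity check of these grid parameters is shown below. It assumes that "Pixels = 450" is the side length of one 24 x 24 block and that spots are evenly spaced inside a block; both are interpretations of the slide, not documented SpotFinder behavior.

```python
metarows, metacolumns = 12, 4   # blocks (sub-grids) on the slide
rows, columns = 24, 24          # spots per block
block_pixels = 450              # assumed: side length of one block, in pixels

blocks = metarows * metacolumns
spots_per_block = rows * columns
total_spots = blocks * spots_per_block
pitch = block_pixels / rows     # assumed: even spacing of spots within a block

print(blocks, spots_per_block, total_spots, pitch)  # 48 576 27648 18.75
```

If the spot count does not match the number of features in the array layout file, the grid parameters should be rechecked before adjusting positions.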

    Adjust Grid

    Created grids are not aligned to the image.

    - Adjust each of the 12*4 grids to its correct position
    - Right mouse button in a grid to move that grid; arrow keys also work
    - Right mouse button in a blank section to move all grids
    - Use "Visible All" (right click in a blank area)
    - Use "Move All" to adjust the overall position. Use "Visible All" to restore the grid.

    Save Grid

    Save the grid frequently to avoid losing your work

    Image Analysis

    - Use Gridding and Processing
      - Adjust (save grid first; on Mac, adjust doesn't work well)
      - Process
    - Copy images
      - 1 from the grid adjust
      - 1 from the RI plot
      - 1 from the data (figure)
      - 2 from the QC view (A and B). What do they represent?
    - Export to .mev file
    - Open .mev file in Excel
    - Remove comment lines
    - Compute signal:
      - Signal A = Cy3 Green = MNA - MedBkgA = mean of spot A - median of background A
      - Signal B = Cy5 Red = MNB - MedBkgB = mean of spot B - median of background B
    - Plot Signal A vs Signal B
    - Copy image in a Word file
    - DO NOT SAVE THE modified .MEV FILE

    Execute Process

    - Select Gridding Tab
    - Use Histogram Segmentation
    - Spot Size = 10
    - Process All!

    Inspect DATA PROCESSED

    - Select Data Tab
    - Select a row / spot
    - See results and interpret output

    Inspect MA-PLOT

    - Select RI-PLOT Tab
    - Observe the MA-PLOT
    - You can switch on/off specific grids
    - A tendency can be observed (which has to be corrected to 0; see MIDAS exercise)

    Quality Control View

    - Quality view tab
    - View 1 gives the count of all M values per color (yellow, gray, blue, and green)
    - View 2 shows whether each spot had M > 1 (yellow; 0.5 in this image) or M < -1

    Export DATA and VIEW in Excel

    - Save data to a .mev file
    - Open .mev file in Excel
    - Remove comment lines (important!)
    - Compute signal:
      - Signal A = Cy3 Green = MNA - MedBkgA = mean of spot A - median of background A
      - Signal B = Cy5 Red = MNB - MedBkgB = mean of spot B - median of background B
    - Plot Signal A vs Signal B
    - Copy image in a Word file
    - DO NOT SAVE THE modified .MEV FILE

    The plot in Excel should be similar to the MA-plot (RI-Plot)
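The same signal computation can be done outside Excel. The sketch below assumes the exported file is tab-delimited with a header row, that comment lines start with `#`, and that it has the columns MNA, MNB, MedBkgA and MedBkgB named on the slide; the sample text is invented, so check these assumptions against a real export before relying on it.

```python
import csv
import math

def read_signals(text):
    """Parse tab-delimited text, skip '#' comment lines, and compute both signals."""
    lines = [ln for ln in text.splitlines() if ln.strip() and not ln.startswith("#")]
    signals = []
    for row in csv.DictReader(lines, delimiter="\t"):
        signal_a = float(row["MNA"]) - float(row["MedBkgA"])  # Cy3, green
        signal_b = float(row["MNB"]) - float(row["MedBkgB"])  # Cy5, red
        signals.append((signal_a, signal_b))
    return signals

# Hypothetical two-spot export (real files contain many more columns).
sample = (
    "# hypothetical comment line\n"
    "MNA\tMNB\tMedBkgA\tMedBkgB\n"
    "1200\t800\t200\t300\n"
    "150\t400\t100\t100\n"
)
for a, b in read_signals(sample):
    print(a, b, round(math.log2(b / a), 2))
```

Working on a parsed copy in memory also respects the instruction not to save a modified .mev file.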

    Summary of SpotFinder Use

    We read 2 images, Green = Cy3 and Red = Cy5, to generate an intensity value with reduced background noise for each color:
    - We generated a grid with the number of spots and the spatial layout specified for the microarray
    - We adjusted the positions visually by moving the grids
    - We computed the signal value and the background noise for each color
    - We obtained a file with data

    Image -> Data