Computational Laboratory: aCGH Data Analysis Feb. 4, 2011 Per Chia-Chin Wu


Page 1: Computational Laboratory: aCGH Data Analysis

Computational Laboratory: aCGH Data Analysis

Feb. 4, 2011

Per Chia-Chin Wu

Page 2: Computational Laboratory: aCGH Data Analysis

Today’s Topics

• Review aCGH and its data analysis

• Homework: aCGH data analysis using tools in Genboree and Ruby

Page 3: Computational Laboratory: aCGH Data Analysis

Chromosomal Aberrations

REF: Albertson et al

Page 4: Computational Laboratory: aCGH Data Analysis

Array CGH

Label Patient DNA with Cy3

Label Control DNA with Cy5

Hybridize DNA to genomic clone microarray

Analyze Cy3/Cy5 fluorescence ratio of patient to control (log of Cy3/Cy5)

Page 5: Computational Laboratory: aCGH Data Analysis

Workflow of aCGH Analysis

Finished chips

Raw image data and experiment info (from the scanner)

Probe level raw intensity data (from image processing software)

Background adjustment, normalization, transformation

Raw copy number (CN) data [log ratio of tumor/normal intensities]

Segmentation and boundary determination; estimation of CN

Characterizing individual genomic profiles

Page 6: Computational Laboratory: aCGH Data Analysis

• Background Adjustment/Correction

Reduces unevenness of a single chip

[Figure: chip image before adjustment and after adjustment]

Corrected Intensity (S’) = Observed Intensity (S) - Background Intensity (B)

Eliminates non-specific hybridization signal
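The correction formula above can be sketched in Ruby. This is a minimal illustration, assuming one background estimate per probe. The method name `background_correct` and the `floor` parameter are hypothetical and do not come from the slides; the floor is only there so that a later log transformation stays defined.

```ruby
# Subtract the background intensity (B) from the observed intensity (S)
# for each probe: S' = S - B.
# Assumption: corrected values are floored at a small positive number so
# that log2 can be applied afterwards; the slides do not specify this.
def background_correct(observed, background, floor = 1.0)
  observed.zip(background).map do |s, b|
    [s - b, floor].max
  end
end

# Hypothetical intensities for three probes.
observed   = [520.0, 1310.0, 95.0]
background = [100.0, 110.0, 120.0]
corrected  = background_correct(observed, background)
puts corrected.inspect
```

The third probe shows why a floor may be useful: its background exceeds its observed signal, so a plain subtraction would give a negative intensity.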

Normalization

Page 7: Computational Laboratory: aCGH Data Analysis

• Normalization

Reduces technical variation between chips

[Figure: intensities before and after normalization]

S’ = (S - Mean of S) / STD of S

S’ ~ N(0, 1)
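As a rough sketch, the standardization above could be written in Ruby as follows. The method name `z_normalize` is hypothetical, and the choice of the population standard deviation (dividing by n) is an assumption, since the slide does not say which estimator is intended.

```ruby
# Standardize the intensities of one chip: S' = (S - mean of S) / STD of S.
# Assumption: population standard deviation (divide by n, not n - 1).
def z_normalize(values)
  n = values.size.to_f
  mean = values.sum / n
  sd = Math.sqrt(values.sum { |v| (v - mean)**2 } / n)
  raise ArgumentError, "all values are identical" if sd.zero?

  values.map { |v| (v - mean) / sd }
end

# Hypothetical intensities from a single chip.
chip = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
normalized = z_normalize(chip)
puts normalized.inspect
```

After this step each chip has mean 0 and standard deviation 1, which is what makes values from different chips comparable.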

Normalization

• Log Transformation

[Figure: distribution of S before log transformation and of Log(S) after log transformation]

S: probe raw intensity; S’: log-transformed intensity, S’ = log2(S)

CN = S’_tumor - S’_normal = log2(S_tumor / S_normal)
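The raw copy number definition can be checked with a few lines of Ruby. This is only an illustration: the method name `raw_copy_number` and the intensities are made up, chosen so that the tumor/normal ratios are 1, 2, and 0.5.

```ruby
# Raw copy number for one probe: CN = log2(S_tumor) - log2(S_normal),
# which equals log2(S_tumor / S_normal).
def raw_copy_number(tumor, normal)
  Math.log2(tumor) - Math.log2(normal)
end

# Hypothetical intensities for three probes.
tumor  = [800.0, 1600.0, 400.0]
normal = [800.0, 800.0, 800.0]
cn = tumor.zip(normal).map { |t, n| raw_copy_number(t, n) }
puts cn.inspect
```

On the log2 scale an unchanged probe sits at 0, a doubling at about +1, and a halving at about -1, so gains and losses become symmetric around zero.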

Page 8: Computational Laboratory: aCGH Data Analysis

Segmentation/Smoothing

[Figure: CN plotted against Clone/Chromosome]

Page 9: Computational Laboratory: aCGH Data Analysis

Segmentation/Smoothing

[Figure: CN plotted against Clone/Chromosome]

Page 10: Computational Laboratory: aCGH Data Analysis

Segmentation/Smoothing

• Goal: To partition the clones into sets with the same copy number and to characterize the genomic segments.

Noise reduction

Detection of Loss, Normal, Gain, Amplification

Breakpoint analysis

• Biological model: genomic rearrangements lead to gains or losses of sizable contiguous parts of the genome. Aberrations that recur across tumors may indicate an oncogene or a tumor suppressor gene.
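To make the goal concrete, here is a deliberately naive Ruby sketch that calls a state for each clone with fixed cutoffs and merges neighbouring clones that share a state. It is not AWS, CBS, or an HMM, and it does no noise reduction. The cutoffs in `THRESHOLDS` and the method names are hypothetical, chosen only for illustration.

```ruby
# Hypothetical log2-ratio cutoffs; real analyses tune these per platform.
THRESHOLDS = { loss: -0.3, gain: 0.3, amplification: 1.0 }.freeze

# Assign one of the four states named on the slide to a log2 ratio.
def call_state(cn)
  return :amplification if cn >= THRESHOLDS[:amplification]
  return :gain if cn >= THRESHOLDS[:gain]
  return :loss if cn <= THRESHOLDS[:loss]

  :normal
end

# Merge consecutive clones (already ordered along one chromosome) that share
# a state into segments; each boundary between segments is a candidate
# breakpoint.
def segments(cn_values)
  runs = cn_values.each_with_index.chunk_while do |(a, _), (b, _)|
    call_state(a) == call_state(b)
  end
  runs.map do |run|
    values = run.map(&:first)
    { first: run.first.last, last: run.last.last,
      state: call_state(values.first),
      mean_cn: values.sum / values.size }
  end
end

# Hypothetical log2 ratios for seven clones on one chromosome.
cn = [0.05, -0.1, 0.6, 0.5, 0.7, -0.8, -0.6]
segs = segments(cn)
segs.each { |s| puts "clones #{s[:first]}-#{s[:last]}: #{s[:state]}" }
```

Because every clone is called independently, a single noisy value would split a segment in two, which is the weakness that dedicated segmentation methods are designed to avoid.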

Page 11: Computational Laboratory: aCGH Data Analysis

Segmentation Methods

• AWS - Adaptive Weights Smoothing

• CBS - Circular Binary Segmentation

• HMM - Hidden Markov Model partitioning

• Many more

All existing methods amount to unsupervised, location-specific partitioning and operate on individual chromosomes.

Page 12: Computational Laboratory: aCGH Data Analysis

Workflow of aCGH Data Analysis

Finished chips

Raw image data and experiment info (from the scanner)

Probe level raw intensity data (from image processing software)

Background adjustment, normalization, transformation

Raw copy number (CN) data [log ratio of tumor/normal intensities]

Segmentation and boundary determination; estimation of CN

Characterizing individual genomic profiles

Page 13: Computational Laboratory: aCGH Data Analysis

Homework: Analyze TCGA Data

Page 14: Computational Laboratory: aCGH Data Analysis

The Cancer Genome Atlas Project (TCGA)

• Goal: find genomic alterations that cause cancer (mutations, CNA, methylation, …)

• Pilot project

1. brain (glioblastoma multiforme): 186 pairs of tumor and normal samples

2. lung (squamous)

3. ovarian (serous cystadenocarcinoma)

Page 15: Computational Laboratory: aCGH Data Analysis

Flowchart of Data Analysis

Raw copy number (CN) data [log ratio of tumor/normal intensities]

Segmentation and boundary determination; estimation of CN

Characterizing individual genomic profiles

Annotation

Identify Recurrent Genes

Page 16: Computational Laboratory: aCGH Data Analysis

Ruby: Mapping Probes

Page 17: Computational Laboratory: aCGH Data Analysis

Ruby: Mapping Probes

Page 18: Computational Laboratory: aCGH Data Analysis

Ruby: Mapping Probes

LFF format
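The code shown on these slides did not survive in the transcript, so the following Ruby is only a sketch of what mapping probes to LFF might look like, not the script from the lecture. It assumes the tab-delimited LFF annotation layout with ten columns (class, name, type, subtype, chromosome, start, stop, strand, phase, score); that order should be checked against the Genboree LFF documentation. The method names, the class/type/subtype labels, and the input hashes are all hypothetical.

```ruby
# Build one tab-delimited LFF annotation line for a probe.
# Assumed column order: class, name, type, subtype, chromosome, start, stop,
# strand, phase, score. The class/type/subtype defaults are placeholders.
def lff_line(probe_id, chrom, start, stop, score,
             klass: "aCGH", type: "TCGA", subtype: "sample1")
  [klass, probe_id, type, subtype, chrom, start, stop, "+", ".", score].join("\t")
end

# Join probe coordinates with log-ratio values by probe ID.
# positions:  probe_id => [chrom, start, stop]
# log_ratios: probe_id => log2 ratio
def probes_to_lff(positions, log_ratios)
  lines = log_ratios.map do |probe_id, value|
    coords = positions[probe_id]
    next unless coords # skip probes that have no mapped position

    lff_line(probe_id, *coords, value)
  end
  lines.compact
end

# Hypothetical probe annotation and data.
positions  = { "P1" => ["chr1", 1000, 1060], "P2" => ["chr7", 55_000, 55_060] }
log_ratios = { "P1" => 0.12, "P2" => 1.8, "P3" => -0.4 }
lines = probes_to_lff(positions, log_ratios)
puts lines
```

In a real script the two hashes would be filled by reading the probe annotation file and the data file line by line; probe `P3` is dropped here because it has no known position.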

Page 19: Computational Laboratory: aCGH Data Analysis

Upload Data

Page 20: Computational Laboratory: aCGH Data Analysis

Data Analysis: Segmentation

Page 21: Computational Laboratory: aCGH Data Analysis

Data Analysis: Combine Tracks

Page 22: Computational Laboratory: aCGH Data Analysis

Data Analysis: Annotation Selector

Page 23: Computational Laboratory: aCGH Data Analysis

Data Analysis: Mapping Genes

Page 24: Computational Laboratory: aCGH Data Analysis

Data Analysis: Recurrent Genes

Page 25: Computational Laboratory: aCGH Data Analysis

Overview of Data Analysis

Raw copy number (CN) data [log ratio of tumor/normal intensities]

Data Preprocessing (Ruby) and uploading data to Genboree

Segmentation (Segmentation Tool)

Characterizing individual genomic profiles

Combining data

Annotation (Annotation Selector; Attribute Lifter)

Identify Recurrent Genes (Ruby)
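The final Ruby step might look like the sketch below, which counts how many distinct samples each gene is altered in and prints two columns. The input layout (tab-delimited "gene, sample" rows), the method name `recurrent_genes`, and the `min_samples` cutoff are assumptions; the table exported from Genboree may be organized differently.

```ruby
require "set"

# Count, for each gene, the number of distinct samples in which it is altered.
# Assumed input: tab-delimited rows of "gene<TAB>sample".
# Returns [gene, count] pairs, most recurrent first.
def recurrent_genes(rows, min_samples = 2)
  samples_by_gene = Hash.new { |hash, gene| hash[gene] = Set.new }
  rows.each do |row|
    gene, sample = row.chomp.split("\t")
    next if gene.nil? || sample.nil? # skip blank or malformed rows

    samples_by_gene[gene] << sample
  end
  samples_by_gene
    .map { |gene, samples| [gene, samples.size] }
    .select { |_, count| count >= min_samples }
    .sort_by { |gene, count| [-count, gene] }
end

# Hypothetical rows; GENE_A is listed twice for sample S2 on purpose.
rows = ["GENE_A\tS1", "GENE_A\tS2", "GENE_A\tS2", "GENE_B\tS1",
        "GENE_B\tS3", "GENE_C\tS2", "GENE_A\tS3"]
result = recurrent_genes(rows)
result.each { |gene, count| puts "#{gene}\t#{count}" }
```

Using a set per gene means a gene hit by several segments in the same sample is still counted once for that sample.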

Page 26: Computational Laboratory: aCGH Data Analysis

You Need To Submit

1. Ruby script from step 1 that creates your LFF file

2. Ruby script from step 5 that parses your table

3. Two-column final output from step 5