
Page 1: A NOVEL APPROACH TO IMPROVE  THE NOISE IN DETECTING COPY NUMBER VARIATIONS USING OLIGONUCLEOTIDE

____ __ __ _______ Birol et al :: AGBT :: 7 February 2008

A NOVEL APPROACH TO IMPROVE THE NOISE IN DETECTING COPY NUMBER VARIATIONS USING OLIGONUCLEOTIDE MICROARRAYS

12 November 2008

Noushin Farnoud, Marco Marra, Jan Friedman, Stephane Flibotte, Allen Delaney

Canada’s Michael Smith Genome Sciences Centre

Page 2:

Outline

- What are copy number variations (CNVs)?
- Why is it important to study copy number variations?
- How can we study CNVs?
- What are the issues associated with studying CNVs?
- How can we deal with them?

Page 3:

What is Copy Number Variation (CNV)?

The DNA copy number of a genomic region is the number of copies of that region present in the genome.

In humans, the normal copy number is two for most autosomal regions. However, recent discoveries have revealed that many segments of DNA, ranging in size from kilobases to megabases, can vary in copy number.

These DNA copy number variations (CNVs) result from genomic events that cause discrete gains and losses in contiguous segments of the genome.

Page 4:

Why is it important to study CNVs?

CNVs are common in cancer and other diseases. For example, a review paper by Charles Lee lists 17 conditions of the nervous system alone, including Parkinson’s disease and Alzheimer’s disease, that can result from copy number variation (Neuron, Oct 06).

CNVs are also common in normal individuals and contribute to our uniqueness. These changes can also influence susceptibility to disease.

Since CNVs often encompass genes, they can play important roles both in characterizing human disease and in discovering drug response targets.

Understanding the mechanisms of CNV formation may also help us better understand human genome evolution.

Page 5:

How can we detect CNVs?

[Figure: two-color arrays and one-color arrays, with Patient and Reference samples]

Page 6:

Main issue of oligonucleotide microarrays

[Figure: log2 ratio of intensity (y-axis, -4 to 4) plotted against position in Mb (x-axis, 0 to 250)]

* http://dsgweb.wustl.edu/qunyuan

Although high-density microarrays provide genome-wide data on copy number, they are often associated with a substantial amount of noise that can degrade the performance of the analyses.

Page 7:

How can we improve this noise?

Can we improve the oligonucleotide microarray noise by analyzing individual oligonucleotide probes?

Hypothesis

Each SNP probe set consists of several oligonucleotide probes. Total number of oligonucleotide probes per array generation:

- 10K array: 647,080 oligos
- 100K array: 4,648,160 oligos
- 500K array: 12,013,632 oligos

Page 8:

Page 9:

Therefore…

We can conclude that a major source of the noise is the differing behavior of the individual oligonucleotide probes within a SNP probe set.

This indicates that averaging all perfect-match (PM) oligos is not a good approximation of the information content of a SNP.
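
To make the averaging problem concrete, here is a minimal sketch with made-up numbers. The log2 ratios, and the split into "responsive" and "unresponsive" oligos, are hypothetical and are not data from the study; they only show how a plain probe-set average behaves when some oligos do not track the copy number.

```python
from statistics import mean

# Hypothetical log2 ratios (test / reference) for the PM oligos of one SNP
# located in a single-copy deletion, where the ideal log2 ratio is about -1.
responsive = [-1.05, -0.92, -1.10, -0.98, -0.95, -1.02]  # follow the copy number
unresponsive = [0.10, -0.05, 0.15, 0.02]                 # behave as if CN were normal

all_pm = responsive + unresponsive

mean_all = mean(all_pm)             # what a simple probe-set average reports
mean_responsive = mean(responsive)  # what the informative oligos alone report

print(f"mean of all PM oligos:     {mean_all:.3f}")
print(f"mean of responsive oligos: {mean_responsive:.3f}")
```

With these illustrative values the full average is pulled toward zero (about -0.58) while the responsive oligos alone sit near -1.0, so the deletion looks weaker than it is.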

Page 10:

Novel Algorithm: Oligonucleotide Probe-level Analysis of Signal intensities (OPAS)

- Cluster the individual oligos in each SNP probe set.
- Apply null-hypothesis testing: estimate the likelihood (p-value) that each cluster of oligos has a log-ratio intensity = 0, > 0 or < 0.
- Based on these p-values and ML classification algorithms, identify the “most significant cluster of oligos”. The other cluster(s) of oligos are treated as noise and excluded from the analysis.
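
The three steps can be sketched roughly as follows. This is not the OPAS implementation: the slides do not give the clustering details, the exact tests or the classifier, so the sketch swaps in a basic two-cluster k-means, a normal-approximation test of "mean log ratio = 0", and a "smallest p-value wins" selection rule. All function names and the example values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def two_means_1d(values, iterations=20):
    """Split 1-D values into two clusters with a basic k-means (k = 2)."""
    lo, hi = min(values), max(values)          # initial centres: the extremes
    for _ in range(iterations):
        a = [v for v in values if abs(v - lo) <= abs(v - hi)]
        b = [v for v in values if abs(v - lo) > abs(v - hi)]
        if not a or not b:                     # everything fell into one group
            break
        lo, hi = mean(a), mean(b)
    return [c for c in (a, b) if c]


def p_value_mean_is_zero(cluster):
    """Two-sided p-value for H0: mean log ratio = 0 (normal approximation)."""
    if len(cluster) < 2:
        return 1.0                             # too few oligos to test
    se = stdev(cluster) / sqrt(len(cluster))
    if se == 0:
        return 0.0 if mean(cluster) != 0 else 1.0
    z = mean(cluster) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def most_significant_cluster(log_ratios):
    """Return the cluster whose mean departs most clearly from zero."""
    clusters = two_means_1d(log_ratios)
    return min(clusters, key=p_value_mean_is_zero)


# Hypothetical probe set: six oligos near -1 and four near 0.
probe_set = [-1.05, -0.92, -1.10, -0.98, -0.95, -1.02, 0.10, -0.05, 0.15, 0.02]
kept = most_significant_cluster(probe_set)
print(sorted(kept), round(mean(kept), 3))
```

In this toy case the six oligos near -1 are kept and the four near 0 are discarded. A real classifier would also need a rule for probe sets in normal regions, where no cluster should differ from zero.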

Page 11:

Example of Improving the SNP Noise by OPAS

[Figure panels: Before, After]

Page 12:

How does OPAS Affect CN analysis?

Page 13:

What's next?

The next generation of DNA microarray-based technologies will allow large and small CNVs to be detected equally well.

Also on the horizon are new DNA sequencing technologies enabling rapid (and ultimately inexpensive) 'personalized' genome sequencing projects.

Coupled together, these technologies will capture almost all the variation in a genome.

Page 14:

Acknowledgments

Funding

Contact: [email protected]

GSC: Marco Marra, Stephane Flibotte, Allen Delaney, Irene Li, Hong Qian, Robert Holt, Sussana Chan

BC Children’s & Women’s Hospital: Jan Friedman, Patrice Eydoux

Page 15:

Page 16:

Advantages of Array CGH

Page 17:

- Inputs: Test Array (normalized log2 raw intensity) and Ref Set (pool of normal parents). The reference set is the average of a pool of normal-parent arrays, each of which is first log-normalized.
- Log2 Ratio (test array versus reference set).
- Clustering PM oligos (using a fuzzy clustering approach).
- Likelihood Estimation: apply a series of null-hypothesis tests to determine the likelihoods PHs(cluster = 0), PHs(cluster < -0.5) and PHs(cluster > +0.6).
- Classification: each SNP is classified as deleted, normal or amplified, based on comparing the P’s of its constituent clusters of PM oligos.
- Post Processing the Results.
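
The log-ratio and fuzzy clustering steps might look like the sketch below. The slides do not say which fuzzy clustering variant was used, so standard fuzzy c-means is assumed here. The function names, the fuzzifier `m = 2`, the two-cluster default (the code requires at least two clusters) and the toy intensities are all hypothetical.

```python
from statistics import mean


def log2_ratios(test_log2, reference_log2_arrays):
    """Subtract the per-oligo mean of the reference arrays from the test array.

    All inputs are assumed to be log2 intensities that are already normalized.
    """
    ref_mean = [mean(col) for col in zip(*reference_log2_arrays)]
    return [t - r for t, r in zip(test_log2, ref_mean)]


def fuzzy_c_means_1d(values, n_clusters=2, m=2.0, iterations=50):
    """Plain fuzzy c-means on 1-D data; returns (centres, memberships)."""
    lo, hi = min(values), max(values)
    # Spread the initial centres evenly between the smallest and largest value.
    centres = [lo + (hi - lo) * i / (n_clusters - 1) for i in range(n_clusters)]
    power = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(iterations):
        memberships = []
        for x in values:
            dists = [abs(x - c) for c in centres]
            if 0.0 in dists:                   # x sits exactly on a centre
                hits = dists.count(0.0)
                row = [1.0 / hits if d == 0.0 else 0.0 for d in dists]
            else:
                row = [1.0 / sum((d_i / d_j) ** power for d_j in dists)
                       for d_i in dists]
            memberships.append(row)
        # Update each centre as the membership-weighted mean of the data.
        for i in range(n_clusters):
            weights = [row[i] ** m for row in memberships]
            centres[i] = sum(w * x for w, x in zip(weights, values)) / sum(weights)
    return centres, memberships


# Toy example: five oligos of one probe set, one test array, two reference arrays.
test = [9.0, 9.1, 8.9, 10.1, 10.0]
refs = [[10.0, 10.1, 10.0, 10.0, 10.1],
        [10.0, 10.1, 9.8, 10.2, 9.9]]
ratios = log2_ratios(test, refs)
centres, u = fuzzy_c_means_1d(ratios)
print([round(c, 2) for c in centres])
```

Unlike hard clustering, each oligo receives a membership in every cluster (each row of `u` sums to 1), so borderline oligos can be down-weighted rather than forced into one group.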
Page 18:

What is Copy Number

Introduction

- What is a SNP?

- What is a SNP array? Array Design + Target Preparation

• Applications of SNP arrays (other than genotyping)
  - Copy number analysis

• Genotyping using SNP arrays
  - Generations of methodologies
  - Properties of SNP arrays

Page 19:

Schematic Representation of DNA Copy Number Change

[Figure: normal cell (CN=2); deletion (CN=0, CN=1); amplification (CN=3, CN=4)]
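
For reference, the ideal log2 ratio for each of these copy-number states can be worked out directly. This sketch assumes that signal intensity scales linearly with copy number and that the reference sample has two copies. The function name is hypothetical.

```python
from math import log2

NORMAL_CN = 2  # autosomal copy number in a normal diploid cell


def expected_log2_ratio(copy_number, reference_cn=NORMAL_CN):
    """Ideal log2(test / reference) ratio, assuming signal scales with copy number."""
    if copy_number == 0:
        return float("-inf")  # no target DNA: the ideal ratio is unbounded below
    return log2(copy_number / reference_cn)


for cn in range(5):
    print(f"CN={cn}: {expected_log2_ratio(cn):+.3f}")
```

Under these assumptions a single-copy loss gives -1, a single-copy gain gives about +0.585 and CN=4 gives +1. Measured ratios on real arrays are typically smaller in magnitude than these ideal values.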

Page 20:

[Figure: aligned DNA sequences that differ at a single base, illustrating a Single Nucleotide Polymorphism (SNP)]

Background (1) : What are SNPs?

Definition: SNPs are variations in single base pairs that are randomly dispersed throughout the genome

Page 21:

Major conclusions so far* …

There is considerable variation in the numbers and types of candidate CNVs detected by different analysis approaches.

Multiple programs are needed to find all real aberrations in a test set.

The frequency of false positive deletions is substantial, but can be greatly reduced by using the SNP genotype information to confirm loss of heterozygosity.

* Friedman et al., AJHG, 2006; Baross et al., BMC Bioinformatics, 2007; Delaney et al., in progress
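
The loss-of-heterozygosity check mentioned above can be sketched as a simple filter on genotype calls. The genotype encoding ('AA', 'AB', 'BB', 'NoCall'), the function name and the zero-tolerance default are assumptions made for illustration; the cited papers may use a different criterion.

```python
def supports_deletion(genotype_calls, max_het_fraction=0.0):
    """Check whether genotype calls in a candidate region are consistent with LOH.

    genotype_calls: iterable of 'AA', 'AB', 'BB' or 'NoCall' strings
    (hypothetical encoding). A hemizygous deletion leaves a single allele,
    so heterozygous 'AB' calls argue against it.
    """
    called = [g for g in genotype_calls if g in ("AA", "AB", "BB")]
    if not called:
        return False                  # nothing to confirm the deletion with
    het_fraction = called.count("AB") / len(called)
    return het_fraction <= max_het_fraction


candidate_a = ["AA", "BB", "AA", "NoCall", "BB", "AA"]  # no heterozygous calls
candidate_b = ["AA", "AB", "BB", "AB", "AA", "AB"]      # several heterozygous calls

print(supports_deletion(candidate_a))
print(supports_deletion(candidate_b))
```

The first candidate passes and the second is rejected. Absence of heterozygous calls is necessary rather than sufficient evidence, since a short run of homozygous SNPs can also occur by chance in a normal region.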

Page 22:

Profile of SNP probe sets

Deleted SNPs

SNPs in ‘Normal’ Region

Page 23:

Three Generations of Affymetrix SNP Arrays

                                    10K                  100K                    500K
Number of SNPs                      11,555               116,204                 500,568
Number of oligonucleotide probes    647,080              4,648,160               12,013,632
                                    (14 quartets / SNP)  (10 quartets / SNP)     (6 quartets / SNP)
Number of arrays                    1 (Xba I)            2 (Xba I + Hind III)    2 (Nsp I + Sty I)
Number of SNPs per array            --                   ~58,960 (Xba)           ~262,000 (Nsp)
                                                         ~57,244 (Hind)          ~238,000 (Sty)
Median inter-marker distance (kb)   105                  8.5                     2.5
Mean inter-marker distance (kb)     210                  23.6                    5.8
Average heterozygosity              0.37                 0.3                     0.3
% genome within 10 kb of a SNP      --                   40%                     85%
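
As a quick consistency check on the figures in this table, the number of oligos per SNP can be derived by dividing the probe total by the SNP count and comparing it with the quartet count. The sketch assumes that one quartet is four oligos (PM and MM for each of the two alleles). The dictionary layout and variable names are illustrative only.

```python
# Figures for the three array generations: SNP count, total oligos, quartets per SNP.
ARRAYS = {
    "10K":  {"snps": 11_555,  "oligos": 647_080,    "quartets": 14},
    "100K": {"snps": 116_204, "oligos": 4_648_160,  "quartets": 10},
    "500K": {"snps": 500_568, "oligos": 12_013_632, "quartets": 6},
}

PROBES_PER_QUARTET = 4  # assumption: PM and MM oligos for allele A and allele B

probes_per_snp = {}
for name, a in ARRAYS.items():
    per_snp, remainder = divmod(a["oligos"], a["snps"])
    probes_per_snp[name] = per_snp
    consistent = remainder == 0 and per_snp == a["quartets"] * PROBES_PER_QUARTET
    print(f"{name}: {per_snp} oligos per SNP, consistent with quartet count: {consistent}")
```

The totals divide evenly, giving 56, 40 and 24 oligos per SNP for the 10K, 100K and 500K arrays respectively.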

Page 24:

Background : Structure of Affy SNP array

Each SNP probe set has:

- 56 oligonucleotide probes (10K array): 647,080 oligos in total
- 40 oligonucleotide probes (100K array): 4,648,160 oligos in total
- 24 oligonucleotide probes (500K array): 12,013,632 oligos in total