epigenomics: someepigenomics: some statistical

Post on 03-Feb-2022

10 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

Epigenomics: SomeEpigenomics: SomeEpigenomics: Some Epigenomics: Some Statistical ApplicationsStatistical ApplicationsStatistical Applications Statistical Applications

Rafael A. IrizarryRafael A. IrizarryDepartment of BiostatisticsJ h H ki Bl b S h l f P bliJohn Hopkins Bloomberg School of Public

Health

AcknowledgementsAcknowledgementsAcknowledgementsAcknowledgements

• Tom Albert Nimblegen• Tom Albert, Nimblegen• Benilton Carvalho, JHU Biostatistics• Andy Feinberg Lab, JHU Medicine• Todd Richmond, NimblegenTodd Richmond, Nimblegen• Hao Wu, JHU Biostatistics

J W B Bi t t• Jean Wu, Brown Biostat• Vasan Yegnasubramanian, JHU g

Oncology

OutlineOutlineOutlineOutline

• Quick Introduction to Epigenetics• Quick Introduction to Epigenetics• Introduction to Methylationy• Overview of competing technologies

Review: Expression arrays lessons• Review: Expression arrays lessons• Comparisonp• Role of statisticians

Genetics: the alphabet of lifeGenetics: the alphabet of life

• Letters of DNALetters of DNA sequence carry the informationinformation

EpigeneticsEpigenetics

(3.4x10-10 meters/bp) x (6x109 bp/genome) = ~2 meters/genome

Radius of the nucleus is ~ 10 µM !!!µ

Klug and Cummings, 1997

[(6 x 109 bp/genome) / (195 bp/nucleosome)] = ~ 30.8 x 106 nucleosomes/genome~ 5 % of nuclear volume

http://www.albany.edu/~achm110/solenoidchriomatin.html

Epigenetics: the grammar of Epigenetics: the grammar of lifelifelifelife

DNA methylationDNA methylationDNA methylationDNA methylation

DNA methyltransferase I (DNMT1)Obsered to expeced = Pr(CG) / { Pr(C) Pr(G) }

DNA methylation can lead to silencing of gene expressionDNA methylation can lead to silencing of gene expression

HDACMeCP2

Sin3A HDACMeCP2

Sin3A

HDAC2

HDAC1

MBD3

RbAp46RbAp48

Mi 2

MTA268kD

66kDHDAC2

HDAC1

MBD3

RbAp46RbAp48

Mi 2

MTA268kD

66kD>2 MDalton Complex

MBD2Mi-2 MBD2Mi-2

Robertson and Wolffe, Nat Rev Genet, 2000

ENCODE TrackENCODE TrackENCODE TrackENCODE Track

Expression Array LessonsExpression Array Lessons

NormalizationNormalizationNormalizationNormalization

Probe effectProbe effectProbe effectProbe effect

I t it B k d P b Eff t Q tit EIntensity = Background + Probe Effect x Quantity x Error

Sequence effect for BGSequence effect for BGSequence effect for BGSequence effect for BGWu et al. (2004) JASA 99(468) 909

Aff ∑ ∑ 125

jbk CGTAj

kj kAffinity =

= ∈∑ ∑= 1

1 },,,{,μ µj,k ~ smooth function

of k

Back to MethylationBack to Methylation

High throughput of course….High throughput of course….

Densities for three methodsDensities for three methodsDensities for three methodsDensities for three methods

HCT116 lots of methylation

DKO very little methylation

Hunh?Hunh?Hunh?Hunh?

MeDIP (like ChIPchip)MeDIP (like ChIPchip)

Total Reverse crosslinks Amplify Label/hybridizeAmplify Label/hybridize

Crosslink YLyse & Sonicate

IP Reverse crosslinks

Amplify Label/hybridize

Other controls for IP(e.g., no antibody, non-

specific antibody)

Some DataSome DataSome DataSome Data

Problem: Not specificProblem: Not specificProblem: Not specificProblem: Not specific

HELP: Two enzymesHELP: Two enzymesHELP: Two enzymesHELP: Two enzymesCuts at CCGG Cuts at CMCGG

No Methylation

HELP after PCRHELP after PCRHELP after PCRHELP after PCR

No Methylation

HELPHELPHELPHELP

Methylation

HELPHELPHELPHELP

No Methylation

Problem with HELPProblem with HELPProblem with HELPProblem with HELPCuts at CCGG Cuts at CMCGG

No Methylation

The ProblemThe ProblemThe ProblemThe Problem

Ob d t d P (CG) / { P (C) P (G) }Obsered to expeced = Pr(CG) / { Pr(C) Pr(G) }

Proportion of neighboring CpG also Proportion of neighboring CpG also th l t d/ t th l t dth l t d/ t th l t dmethylated/not methylatedmethylated/not methylated

McRBC on Tiling arrayMcRBC on Tiling arrayMcRBC on Tiling arrayMcRBC on Tiling array

ROC nowROC nowROC nowROC now

ENCODE TrackENCODE TrackENCODE TrackENCODE Track

Problems for StatisticiansProblems for StatisticiansProblems for StatisticiansProblems for Statisticians

• Background Correction +• Background Correction + Normalization

• Probability Model for Segments• Use these to from null and alternative• Use these to from null and alternative

models… we need power!• Use these to create bump finding

algorithmsg• Adapt to high-throughput sequencing

Supplemental SlidesSupplemental Slides

McRBC: One enzymeMcRBC: One enzymeMcRBC: One enzymeMcRBC: One enzymeCuts at AmCG or GmCG Input

No Methylation

McRBC after GelMcRBC after GelMcRBC after GelMcRBC after Gel

No Methylation

McRBC after GelMcRBC after GelMcRBC after GelMcRBC after Gel

No Methylation

McRBCMcRBCMcRBCMcRBC

Methylation

McRBC after GELMcRBC after GELMcRBC after GELMcRBC after GEL

Methylation

McRBC after GELMcRBC after GELMcRBC after GELMcRBC after GEL

Methylation

top related