chicago: statistical methodology for signal detection in ......chicago: statistical methodology for...
TRANSCRIPT
CHiCAGO: Statistical methodology for signal detectionin Capture Hi-C data
Jonathan Cairns
@jonathancairns
Fraser/Spivakov labs, Babraham Insitute
4th October 2016
Table of Contents
1 Introduction
2 The CHiCAGO model
3 Results
J. Cairns (Babraham Institute) regulatorygenomicsgroup.org/chicago 2 / 20
Table of Contents
1 Introduction
2 The CHiCAGO model
3 Results
J. Cairns (Babraham Institute) regulatorygenomicsgroup.org/chicago 3 / 20
Motivation
J. Cairns (Babraham Institute) regulatorygenomicsgroup.org/chicago 4 / 20
CHi-C: improved resolution at promoters, over Hi-C
J. Cairns (Babraham Institute) regulatorygenomicsgroup.org/chicago 5 / 20
Lieberman-Aiden et al (2009)
CHi-C: improved resolution at promoters, over Hi-C
Approx. 12-fold increase in read coverage
J. Cairns (Babraham Institute) regulatorygenomicsgroup.org/chicago 5 / 20
Schonfelder et al (2015), Mifsud et al (2015), Sahlen et al (2015)
The data
Align reads & filter out artefacts with HiCUP
Obtain counts Xij :
1 3 7 5 4 0 0 0 0
1 0 2 0 4 6 5 4 0
0 0 1 2 0 4 6 9 10 ...
0 0 0 1 1 2 5 3 4
0 0 0 0 0 1 1 5 7
... ...
22,000
823,000
baits (j)
other ends (i)
J. Cairns (Babraham Institute) regulatorygenomicsgroup.org/chicago 6 / 20
Wingett et al (2016)
The data
Align reads & filter out artefacts with HiCUP
Obtain counts Xij :
1 3 7 5 4 0 0 0 0
1 0 2 0 4 6 5 4 0
0 0 1 2 0 4 6 9 10 ...
0 0 0 1 1 2 5 3 4
0 0 0 0 0 1 1 5 7
... ...
22,000
823,000
baits (j)
other ends (i)
J. Cairns (Babraham Institute) regulatorygenomicsgroup.org/chicago 6 / 20
Wingett et al (2016)
The data
Align reads & filter out artefacts with HiCUP
Obtain counts Xij :
1 3 7 5 4 0 0 0 0
1 0 2 0 4 6 5 4 0
0 0 1 2 0 4 6 9 10 ...
0 0 0 1 1 2 5 3 4
0 0 0 0 0 1 1 5 7
... ...
22,000
823,000
baits (j)
other ends (i)
J. Cairns (Babraham Institute) regulatorygenomicsgroup.org/chicago 6 / 20
Wingett et al (2016)
●●●●● ● ●●●●●●●●●●●●●●● ●●●●●●●●●●●●●●● ●● ●●●●●● ●●●●● ●●●●●●●●
●●● ●●●●●●●●●●●●●●●
●●●●●●●●●●●●●●●●●●●●●●●●●
●●●●●
●●●●
●●●●●●
●●●
●
●
●●
●
●
●
●
●
●
●
●
●●
●●●
●●●
●
●●●●●●●●●●●●
●●●●●●
●●●●●●●
●●●●● ● ●●
●●●●●●
●●●●●●●●●●●●●●
●●●●●●●●●●●
●●●●●●●●●●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●●●●●●●●●●●●●●●●●●●●●●●●●●● ● ●●●●●●●●●●●●●●●●
−6e+05 −4e+05 −2e+05 0e+00 2e+05 4e+05 6e+05
050
100
200
300
MIR625−201 (224546)
Distance from viewpoint
N
●●●●●●●●● ● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
● ●●●●●●●●●●●●● ●●●●●●●●●●●●●●
●●●●●●●●●●●
●●●●●
●
●●
●●●
●
●
●
●
●●●
●●
●●●
●
●●
●
●
●
●
●
●●
●
●
●
●
●●
●
●●●●●
●●●●
●●●●●●
●●●● ●●●●●●●●●●●●●●●●●●●●
●●●●●●●●●● ● ●●●●●●●●●●●●●●●●●●●●●●●●●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●● ●●●●●●●●●●●●●●●●●●●● ●●
−6e+05 −4e+05 −2e+05 0e+00 2e+05 4e+05 6e+05
010
020
030
040
050
0
PPP1CB−004,PPP1CB−006,PPP1CB−005,PPP1CB−003,PPP1CB−001,PPP1CB−009,... (340147)
Distance from viewpoint
N
no interaction
interaction
MIR625
PPP1CB
Table of Contents
1 Introduction
2 The CHiCAGO model
3 Results
J. Cairns (Babraham Institute) regulatorygenomicsgroup.org/chicago 8 / 20
CHiCAGO
CHiCAGO – Capture Hi-C Analysis of Genomic Organization.
J. Cairns (Babraham Institute) regulatorygenomicsgroup.org/chicago 9 / 20
Model
Background comes from two sources:
Brownian Technical
Source Random collisions Sequencing artefacts
Depends ondistance?
Yes (decreasing) No
Dominates Close to bait Far from bait
Under H0 (no interaction), counts are sum of the two components:
Xij = Bij + Tij
J. Cairns (Babraham Institute) regulatorygenomicsgroup.org/chicago 10 / 20
Model
Background comes from two sources:
Brownian Technical
Source Random collisions Sequencing artefacts
Depends ondistance?
Yes (decreasing) No
Dominates Close to bait Far from bait
Under H0 (no interaction), counts are sum of the two components:
Xij = Bij + Tij
J. Cairns (Babraham Institute) regulatorygenomicsgroup.org/chicago 10 / 20
Model
Background comes from two sources:
Brownian Technical
Source Random collisions Sequencing artefacts
Depends ondistance?
Yes (decreasing) No
Dominates Close to bait Far from bait
Under H0 (no interaction), counts are sum of the two components:
Xij = Bij + Tij
J. Cairns (Babraham Institute) regulatorygenomicsgroup.org/chicago 10 / 20
Model
Background comes from two sources:
Brownian Technical
Source Random collisions Sequencing artefacts
Depends ondistance?
Yes (decreasing) No
Dominates Close to bait Far from bait
Under H0 (no interaction), counts are sum of the two components:
Xij = Bij + Tij
J. Cairns (Babraham Institute) regulatorygenomicsgroup.org/chicago 10 / 20
Model
Background comes from two sources:
Brownian Technical
Source Random collisions Sequencing artefacts
Depends ondistance?
Yes (decreasing) No
Dominates Close to bait Far from bait
Under H0 (no interaction), counts are sum of the two components:
Xij = Bij + Tij
J. Cairns (Babraham Institute) regulatorygenomicsgroup.org/chicago 10 / 20
Brownian background estimation
Xij = Bij + Tij
Bij ∼ NB, with E(Bij) = f (dij) × (bait bias)j × (other end bias)i
J. Cairns (Babraham Institute) regulatorygenomicsgroup.org/chicago 11 / 20
Brownian background estimation
Xij = Bij + Tij
Bij ∼ NB, with E(Bij) = f (dij)
× (bait bias)j × (other end bias)i
0
10
20
30
40
50
0
10
20
30
40
50
0
10
20
30
40
50
0
10
20
30
40
50
0
10
20
30
40
50
2501177
53485373
5382
−500000 −250000 0 250000 500000Distance from bait
Exp
ecte
d co
unt
J. Cairns (Babraham Institute) regulatorygenomicsgroup.org/chicago 11 / 20
Brownian background estimation
Xij = Bij + Tij
Bij ∼ NB, with E(Bij) = f (dij) × (bait bias)j
× (other end bias)i
0
10
20
30
40
50
0
10
20
30
40
50
0
10
20
30
40
50
0
10
20
30
40
50
0
10
20
30
40
50
2501177
53485373
5382
−500000 −250000 0 250000 500000Distance from bait
Exp
ecte
d co
unt
J. Cairns (Babraham Institute) regulatorygenomicsgroup.org/chicago 11 / 20
Brownian background estimation
Xij = Bij + Tij
Bij ∼ NB, with E(Bij) = f (dij) × (bait bias)j × (other end bias)i
0
10
20
30
40
50
0
10
20
30
40
50
0
10
20
30
40
50
0
10
20
30
40
50
0
10
20
30
40
50
2501177
53485373
5382
−500000 −250000 0 250000 500000Distance from bait
Exp
ecte
d co
unt
J. Cairns (Babraham Institute) regulatorygenomicsgroup.org/chicago 11 / 20
Brownian background estimation
Xij = Bij + Tij
Bij ∼ NB, with E(Bij) = f (dij) × (bait bias)j × (other end bias)i
estimated close to bait(< 1.5Mb) in 20kb bins.
bin-wise estimates f (db)from geometric meanacross baits
interpolation: cubic fiton log-log scale
Distance function
10 11 12 13 14
−0.5
0.0
0.5
1.0
1.5
2.0
2.5
log(distance)
log(
f(d)
)
J. Cairns (Babraham Institute) regulatorygenomicsgroup.org/chicago 11 / 20
Brownian background estimation
Xij = Bij + Tij
Bij ∼ NB, with E(Bij) = f (dij) × (bait bias)j × (other end bias)i
f (d):
estimated close to bait(< 1.5Mb) in 20kb bins.
bin-wise estimates f (db)from geometric meanacross baits
interpolation: cubic fiton log-log scale
Distance function
10 11 12 13 14
−0.5
0.0
0.5
1.0
1.5
2.0
2.5
log(distance)
log(
f(d)
)
J. Cairns (Babraham Institute) regulatorygenomicsgroup.org/chicago 11 / 20
Brownian background estimation
Xij = Bij + Tij
Bij ∼ NB, with E(Bij) = f (dij) × (bait bias)j × (other end bias)i
f (d):
estimated close to bait(< 1.5Mb) in 20kb bins.
bin-wise estimates f (db)from geometric meanacross baits
interpolation: cubic fiton log-log scale
Distance function
10 11 12 13 14
−0.5
0.0
0.5
1.0
1.5
2.0
2.5
log(distance)
log(
f(d)
)
J. Cairns (Babraham Institute) regulatorygenomicsgroup.org/chicago 11 / 20
Brownian background estimation
Xij = Bij + Tij
Bij ∼ NB, with E(Bij) = f (dij) × (bait bias)j × (other end bias)i
f (d):
estimated close to bait(< 1.5Mb) in 20kb bins.
bin-wise estimates f (db)from geometric meanacross baits
interpolation: cubic fiton log-log scale
Distance function
10 11 12 13 14
−0.5
0.0
0.5
1.0
1.5
2.0
2.5
log(distance)
log(
f(d)
)
J. Cairns (Babraham Institute) regulatorygenomicsgroup.org/chicago 11 / 20
Brownian background estimation
Xij = Bij + Tij
Bij ∼ NB, with E(Bij) = f (dij) × (bait bias)j × (other end bias)i
Bait-specific bias:
Get bin-wise estimates for each bait.Take median across bins – robust to interactions
Other-end-specific bias:
Too sparse to estimate individuallyAssume trans-chromosomal reads are mostly noisePool other-ends by trans countsEstimate bias parameter, pool-wiseBait-to-bait interactions treated separately
Dispersion parameter
Established maximum likelihood methods.
J. Cairns (Babraham Institute) regulatorygenomicsgroup.org/chicago 11 / 20
Brownian background estimation
Xij = Bij + Tij
Bij ∼ NB, with E(Bij) = f (dij) × (bait bias)j × (other end bias)i
Bait-specific bias:
Get bin-wise estimates for each bait.Take median across bins – robust to interactions
Other-end-specific bias:
Too sparse to estimate individuallyAssume trans-chromosomal reads are mostly noisePool other-ends by trans countsEstimate bias parameter, pool-wiseBait-to-bait interactions treated separately
Dispersion parameter
Established maximum likelihood methods.
J. Cairns (Babraham Institute) regulatorygenomicsgroup.org/chicago 11 / 20
Brownian background estimation
Xij = Bij + Tij
Bij ∼ NB, with E(Bij) = f (dij) × (bait bias)j × (other end bias)i
Bait-specific bias:
Get bin-wise estimates for each bait.Take median across bins – robust to interactions
Other-end-specific bias:
Too sparse to estimate individuallyAssume trans-chromosomal reads are mostly noisePool other-ends by trans countsEstimate bias parameter, pool-wiseBait-to-bait interactions treated separately
Dispersion parameter
Established maximum likelihood methods.
J. Cairns (Babraham Institute) regulatorygenomicsgroup.org/chicago 11 / 20
Brownian background estimation
Xij = Bij + Tij
Bij ∼ NB, with E(Bij) = f (dij) × (bait bias)j × (other end bias)i
Bait-specific bias:
Get bin-wise estimates for each bait.Take median across bins – robust to interactions
Other-end-specific bias:
Too sparse to estimate individuallyAssume trans-chromosomal reads are mostly noisePool other-ends by trans countsEstimate bias parameter, pool-wiseBait-to-bait interactions treated separately
Dispersion parameter
Established maximum likelihood methods.
J. Cairns (Babraham Institute) regulatorygenomicsgroup.org/chicago 11 / 20
Brownian background estimation
Xij = Bij + Tij
Bij ∼ NB, with E(Bij) = f (dij) × (bait bias)j × (other end bias)i
Bait-specific bias:
Get bin-wise estimates for each bait.Take median across bins – robust to interactions
Other-end-specific bias:
Too sparse to estimate individuallyAssume trans-chromosomal reads are mostly noisePool other-ends by trans countsEstimate bias parameter, pool-wiseBait-to-bait interactions treated separately
Dispersion parameter
Established maximum likelihood methods.
J. Cairns (Babraham Institute) regulatorygenomicsgroup.org/chicago 11 / 20
Brownian background estimation
Xij = Bij + Tij
Bij ∼ NB, with E(Bij) = f (dij) × (bait bias)j × (other end bias)i
Bait-specific bias:
Get bin-wise estimates for each bait.Take median across bins – robust to interactions
Other-end-specific bias:
Too sparse to estimate individuallyAssume trans-chromosomal reads are mostly noisePool other-ends by trans countsEstimate bias parameter, pool-wiseBait-to-bait interactions treated separately
Dispersion parameter
Established maximum likelihood methods.
J. Cairns (Babraham Institute) regulatorygenomicsgroup.org/chicago 11 / 20
Technical noise estimation
Xij = Bij + Tij
J. Cairns (Babraham Institute) regulatorygenomicsgroup.org/chicago 11 / 20
Technical noise estimation
Xij = Bij + Tij
Tij ∼ Pois(λij)
Estimated entirely from trans-chromosomal reads
Pool baits and other-ends
Pool-wise estimate: average number of reads per pair of transfragments.
J. Cairns (Babraham Institute) regulatorygenomicsgroup.org/chicago 11 / 20
Technical noise estimation
Xij = Bij + Tij
Tij ∼ Pois(λij)
Estimated entirely from trans-chromosomal reads
Pool baits and other-ends
Pool-wise estimate: average number of reads per pair of transfragments.
J. Cairns (Babraham Institute) regulatorygenomicsgroup.org/chicago 11 / 20
Technical noise estimation
Xij = Bij + Tij
Tij ∼ Pois(λij)
Estimated entirely from trans-chromosomal reads
Pool baits and other-ends
Pool-wise estimate: average number of reads per pair of transfragments.
J. Cairns (Babraham Institute) regulatorygenomicsgroup.org/chicago 11 / 20
Calling p-values
Xij = Bij + Tij
B is Negative Binomial, T is Poisson.
⇒ X has Delaporte distribution.
One-sided hypothesis test – Observed more than expected by chance?
Get p-value
J. Cairns (Babraham Institute) regulatorygenomicsgroup.org/chicago 12 / 20
Statistical model - p-value weighting
Simple p-value thresholding (even using Bonferroni/FDR)→ many false positives (typically, at large distances, with only oneread).
J. Cairns (Babraham Institute) regulatorygenomicsgroup.org/chicago 13 / 20
Statistical model - p-value weighting
Simple p-value thresholding (even using Bonferroni/FDR)→ many false positives (typically, at large distances, with only oneread).
At large distances:
far fewer reproducibleinteractions
Empirical probability of reproducible interaction
log(distance)
10 11 12 13 14 15 16
−15
−10
−5
DataFit
log[empiricalprobability]
J. Cairns (Babraham Institute) regulatorygenomicsgroup.org/chicago 13 / 20
Statistical model - p-value weighting
Simple p-value thresholding (even using Bonferroni/FDR)→ many false positives (typically, at large distances, with only oneread).
At large distances:
far fewer reproducibleinteractions
but vast majority of testsperformed there
Empirical probability of reproducible interaction
log(distance)
10 11 12 13 14 15 16
−15
−10
−5
DataFit
log[empiricalprobability]
J. Cairns (Babraham Institute) regulatorygenomicsgroup.org/chicago 13 / 20
Statistical model - p-value weighting
Simple p-value thresholding (even using Bonferroni/FDR)→ many false positives (typically, at large distances, with only oneread).
At large distances:
far fewer reproducibleinteractions
but vast majority of testsperformed there
So, large-distance falsepositives dominate.
Empirical probability of reproducible interaction
log(distance)
10 11 12 13 14 15 16
−15
−10
−5
DataFit
log[empiricalprobability]
J. Cairns (Babraham Institute) regulatorygenomicsgroup.org/chicago 13 / 20
Statistical model - p-value weighting
Simple p-value thresholding (even using Bonferroni/FDR)→ many false positives (typically, at large distances, with only oneread).
At large distances:
far fewer reproducibleinteractions
but vast majority of testsperformed there
So, large-distance falsepositives dominate.
Empirical probability of reproducible interaction
log(distance)
10 11 12 13 14 15 16
−15
−10
−5
DataFit
log[empiricalprobability]
Solution: p-value weighting (Genovese et al, 2009) to downweightlong-distance interactions
J. Cairns (Babraham Institute) regulatorygenomicsgroup.org/chicago 13 / 20
●●●●● ● ●●●●●●●●●●●●●●● ●●●●●●●●●●●●●●● ●● ●●●●●● ●●●●● ●●●●●●●●
●●● ●●●●●●●●●●●●●●●
●●●●●●●●●●●●●●●●●●●●●●●●●
●●●●●
●●●●
●●●●●●
●●●
●
●
●●
●
●
●
●
●
●
●
●
●●
●●●
●●●
●
●●●●●●●●●●●●
●●●●●●
●●●●●●●
●●●●● ● ●●
●●●●●●
●●●●●●●●●●●●●●
●●●●●●●●●●●
●●●●●●●●●●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●●●●●●●●●●●●●●●●●●●●●●●●●●● ● ●●●●●●●●●●●●●●●●
−6e+05 −4e+05 −2e+05 0e+00 2e+05 4e+05 6e+05
050
100
200
300
MIR625−201 (224546)
Distance from viewpoint
N
●●●●●●●●● ● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
● ●●●●●●●●●●●●● ●●●●●●●●●●●●●●
●●●●●●●●●●●
●●
●●●
●
●●
●●●
●
●
●
●
●●●
●●
●●●
●
●●
●
●
●
●
●
●●
●
●
●
●
●●
●
●●●●●
●●●●
●●●●●●
●●●● ●●●●●●●●●●●●●●●●●●●●
●●●●●●●●●● ● ●●●●●●●●●●●●●●●●●●●●●●●●●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●● ●●●●●●●●●●●●●●●●●●●● ●●
−6e+05 −4e+05 −2e+05 0e+00 2e+05 4e+05 6e+05
010
020
030
040
050
0
PPP1CB−004,PPP1CB−006,PPP1CB−005,PPP1CB−003,PPP1CB−001,PPP1CB−009,... (340147)
Distance from viewpoint
N
no interaction
interaction
MIR625
PPP1CB
Table of Contents
1 Introduction
2 The CHiCAGO model
3 Results
J. Cairns (Babraham Institute) regulatorygenomicsgroup.org/chicago 15 / 20
Downstream analysis
CHiCAGO-derived interactions give us “Promoter-InteractingRegions” (PIRs).
Histone marks?SNPs?Other features?
J. Cairns (Babraham Institute) regulatorygenomicsgroup.org/chicago 16 / 20
Histone marks – significant enrichment at other ends0
2000
4000
6000
8000
1000
012
000
CT
CF
H3K
4me1
H3K
4me3
H3K
27ac
H3K
27me3
H3K
9me3
GM12878
020
0040
0060
0080
0010
000
1200
0
CT
CF
H3K
4me1
H3K
4me3
H3K
27ac
H3K
27me3
H3K
9me3
mESC
Significant interactions
Random samples
Significant interactions
Random samples
Num
ber
of o
verla
ps w
ith fe
atur
e
Num
ber
of o
verla
ps w
ith fe
atur
e
J. Cairns (Babraham Institute) regulatorygenomicsgroup.org/chicago 17 / 20
Paula Freire Pritchett
Interactions in blood cells
Javierre* / Burren* / Wilder* / Kreuzhuber* / Hill* et al. (in press)Genomic regulatory architecture links disease variants to target genes.
PCHi-C in 17 blood cell types (primary cells)
“Interactomes” found to be cell type-specific, matching lineage tree
Disease-associated SNP
J. Cairns (Babraham Institute) regulatorygenomicsgroup.org/chicago 18 / 20
Conclusions
CHiCAGO finds interactions in Capture Hi-C data:
robustlyhaving normalised for various sources of biasusing p-value weighting (to account for variable true positive rate)
Results provide biological understanding:
can detect cell type-specific interactions.can show enrichment for histone marks.can link disease-associated SNPs to their target genes.
J. Cairns (Babraham Institute) regulatorygenomicsgroup.org/chicago 19 / 20
Acknowledgements
CHiCAGO developers
Paula FreirePritchett
Steven W. Wingett
Mikhail Spivakov
Statistical Advice
Vincent Plagnol(UCL/Inivata)
Daniel Zerbino(EBI)
Additional DownstreamAnalysis
Csilla Varnai
Andrew Dimond
Data
Biola Javierre
Stefan Schonfelder
Cameron Osborne(KCL)
Peter Fraser
http://www.regulatorygenomicsgroup.org/chicago
J. Cairns (Babraham Institute) regulatorygenomicsgroup.org/chicago 20 / 20
p-value weighting
We make prior “guesses” Uij . We allow Uij to depend on dij , assumingthat short-range interactions are more likely than long-range interactions,with a smooth transition between the two. The Uij are transformed intoweights Wij by dividing through by the mean value, U, ensuring that theaverage Wij value is 1. Finally, weighted p-values are obtained by dividingthe p-values by their respective weights:
Qij =pijWij
We now specify the Uij model in our particular context. (next slide)
J. Cairns (Babraham Institute) regulatorygenomicsgroup.org/chicago 1 / 8
p-value weighting
Empirical probability of reproducible interaction
log(distance)
10 11 12 13 14 15 16
−15
−10
−5
DataFit
log[empiricalprobability]
Bounded logistic regression model: Uij is assumed a function of both dijand a vector of parameters Θ = (α, β, γ, δ), according to
Uij = ηijUmax + (1− ηij)Umin
where
ηij = expit(α + βlog(dij))
Umin = expit(γ)
Umax = expit(δ)
using the expit function
expit(x) =ex
1 + ex
J. Cairns (Babraham Institute) regulatorygenomicsgroup.org/chicago 2 / 8
Numbers of called interactions
# interactions per sample: 130,000− 190,000
# interactions per captured promoter:
05
1015
2025
30
Naive_
B
Total
_B
al_thy
mus
Activate
d
CD4_MF
nActiv
ated
aive_
CD4
aive_
CD8
Total
_CD8
hage
s_M2
hage
s_M0
hage
s_M1
precu
rsors
karyo
cytes
throb
lasts
Monoc
ytes
Neutro
phils
Ave
rag
e o
f in
tera
ctio
ns
per
C
aptu
red
Pro
mo
ter
0 10
20
30
Naïve B Total B Fetal Thymus
Naïve CD4+ Naïve CD8+ Total CD8+ Macrophages M2 Macrophages M0 Macrophages M1 Endothelial Precursors Megakaryocytes Erythroblasts Monocytes Neutrophils
Total CD4+ Unstimulated Total CD4+
Stimulated Total CD4+
J. Cairns (Babraham Institute) regulatorygenomicsgroup.org/chicago 3 / 8
Numbers of called interactions
# interactions per sample: 130,000− 190,000
# interactions per captured promoter:
05
1015
2025
30
Naive_
B
Total
_B
al_thy
mus
Activate
d
CD4_MF
nActiv
ated
aive_
CD4
aive_
CD8
Total
_CD8
hage
s_M2
hage
s_M0
hage
s_M1
precu
rsors
karyo
cytes
throb
lasts
Monoc
ytes
Neutro
phils
Ave
rag
e o
f in
tera
ctio
ns
per
C
aptu
red
Pro
mo
ter
0 10
20
30
Naïve B Total B Fetal Thymus
Naïve CD4+ Naïve CD8+ Total CD8+ Macrophages M2 Macrophages M0 Macrophages M1 Endothelial Precursors Megakaryocytes Erythroblasts Monocytes Neutrophils
Total CD4+ Unstimulated Total CD4+
Stimulated Total CD4+
J. Cairns (Babraham Institute) regulatorygenomicsgroup.org/chicago 3 / 8
2. PCHiC: sequencing, HICUP & CHiCAGO
Cell Type Processed Reads Capture Unique Valid Reads Significant Interactions
Megakaryocytes 2,696,317,863 653,848,788 150,203
Erythroblasts 2,338,677,291 588,786,672 144,771
Neutrophils 2,241,977,639 736,055,569 131,609
Monocytes 1,942,858,536 572,357,387 151,389
Macrophages M0 2,125,716,849 668,675,248 163,791
Macrophages M1 2,067,485,594 497,683,496 163,399
Macrophages M2 2,055,090,022 523,561,551 173,449
Naïve B 2,127,262,739 629,928,642 171,439
Total B 1,874,130,921 702,533,922 183,119
Fetal Thymus 2,728,388,103 776,491,344 145,577
Naïve CD4+ 2,797,861,611 844,697,853 192,048
Total CD4+ 2,227,386,686 836,974,777 166,668
Unstimulated Total CD4+ 2,034,344,692 721,030,702 177,371
Stimulated Total CD4+ 1,971,143,855 749,720,649 188,714
Naïve CD8+ 1,910,881,702 747,834,572 187,399
Total CD8+ 1,849,225,803 628,771,947 183,964
Endothelial Precursors 2,308,749,174 420,536,621 141,382
37,297,499,080 11,299,489,740 2,816,292
Paula Freire-Pritchett Steven Wingett
* HICUP *CHiCAGO
Clustering of BR according the score of 10K random interactions (cis
1Mb)
NeutrophilsTotal B
Naïve B
Total CD8+
NeutrophilsNeutrophils
MonocytesMonocytes
Erythroblasts
ErythroblastsErythroblasts
MegakaryocytesMegakaryocytes
MegakaryocytesMegakaryocytes
Endothelial Precursors
Endothelial PrecursorsEndothelial Precursors
Macrophages M1Macrophages M1
Macrophages M0
Macrophages M1
Macrophages M2Macrophages M2
Fetal ThymusNaïve CD4+Naïve CD4+
Total CD8+Total CD8+
Stimulated Total CD4+Stimulated Total CD4+Stimulated Total CD4+
Naïve CD8+
UnstimulatedTotal CD4+UnstimulatedTotal CD4+UnstimulatedTotal CD4+
Total CD4+
Naïve CD8+Naïve CD8+
Naïve BNaïve B
Total BTotal B
Fetal ThymusFetal Thymus
Naïve CD4+Total CD4+Naïve CD4+
Total CD4+
Monocytes
Macrophages M2
Macrophages M0Macrophages M0
Distance
400 500 600 700
Sven Sewitz
−1e+06 −5e+05 0e+00 5e+05 1e+06
050
100
150
N10
012
0
−1e+06 −5e+05 0e+00 5e+05 1e+06
020
4060
80
N
B1_Monocyte_D1_step2_chicago2
Megakaryocyte_D5_6_step2_chicago2 789407 − AP1S2
789407 − AP1S2
410124 − CD93−002,...
−1e+06 −5e+05 0e+00 5e+05 1e+06
050
100
150
200
250
300
N
B1_Monocyte_D1_step2_chicago2
−1e+06 −5e+05 0e+00 5e+05 1e+06
050
100
150
200
N
410124 − CD93−002,...Megakaryocyte_D5_6_step2_chicago2