School of Social and Community Medicine, University of Bristol
ARIES Methylation Pre-processing and Clean up
Geoff Woodward
Overview
Initial QC
Normalisation
Batch Correction
Data
MWAS (Methylome-Wide Association Study)
Results
Initial QC
Probe p-value: confidence in detection
• Background
• -ve controls
Overall QC indicator
• High background
• Low signal
• Poor stringency
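A detection p-value of the kind listed above can be sketched as follows. This is a minimal illustration, assuming background is modelled as a normal distribution fitted to the negative-control intensities; the function name, the example intensities and the 0.01 cut-off are all hypothetical, and the slides do not say how the pipeline actually computes the value.

```python
import math
import statistics


def detection_p_values(intensities, negative_controls):
    """Upper-tail normal p-value of each intensity against the negative controls."""
    mu = statistics.mean(negative_controls)
    sigma = statistics.stdev(negative_controls)
    p_values = []
    for x in intensities:
        z = (x - mu) / sigma
        # P(Z > z) for a standard normal, via the complementary error function
        p_values.append(0.5 * math.erfc(z / math.sqrt(2.0)))
    return p_values


# Made-up intensities, for illustration only
negatives = [95, 102, 110, 98, 105, 100, 91, 99]
probes = [100, 150, 1200]
THRESHOLD = 0.01  # hypothetical cut-off, not taken from the slides

for intensity, p in zip(probes, detection_p_values(probes, negatives)):
    status = "detected" if p < THRESHOLD else "not detected"
    print(intensity, p, status)
```

A probe sitting at the background mean gets p = 0.5; probes well above background get p close to 0.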
Initial QC: Control Probes
Mixture of sample-dependent and sample-independent controls
Sample independent
• Staining (Biotin/DNP)
• Hybridisation (synthetic target)
• Extension (hairpin)
Sample dependent
• Bisulfite conversion (HindIII site)
• G/T mismatch (non-specific)
• Specificity & Non-polymorphic
• Negative
Initial QC: LIMS
LIMS Control Dashboard
Real time
Jscript/JSON
Zoom & scroll
All Illumina control probes, +ve & -ve
Area: Max, Median, Min
Initial QC: MDS
Start pre-processing
What’s affecting the data?
• Failures
• Controls
Initial QC: MDS
Remove Controls/Failures
Remove Sex Chromosomes
Sample Confirmation
Genotyping 65 SNP probes
K-means clustering
• Call genotype
Cross reference with SNP data
Calculate % match
• Fully automated in pipeline
• Stored in LIMS
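The genotype-calling and matching steps above can be sketched as follows. This is an illustration only, assuming SNP-probe β values fall into three clusters near 0, 0.5 and 1; the starting centres, the AA/AB/BB labelling order, the helper names and the example values are all assumptions, not the pipeline's actual code.

```python
GENOTYPES = ["AA", "AB", "BB"]  # assumed labelling: low, middle, high beta


def nearest(value, centres):
    """Index of the centre closest to value."""
    return min(range(len(centres)), key=lambda k: abs(value - centres[k]))


def kmeans_1d(values, centres, iterations=50):
    """Plain one-dimensional k-means from the given starting centres."""
    centres = list(centres)
    for _ in range(iterations):
        labels = [nearest(v, centres) for v in values]
        new_centres = []
        for k in range(len(centres)):
            members = [v for v, lab in zip(values, labels) if lab == k]
            # Keep the old centre if a cluster ends up empty
            new_centres.append(sum(members) / len(members) if members else centres[k])
        if new_centres == centres:
            break
        centres = new_centres
    return [nearest(v, centres) for v in values], centres


def call_genotypes(betas):
    labels, _ = kmeans_1d(betas, centres=[0.0, 0.5, 1.0])
    return [GENOTYPES[k] for k in labels]


def percent_match(called, reference):
    matches = sum(c == r for c, r in zip(called, reference))
    return 100.0 * matches / len(called)


# Made-up beta values and reference genotypes for one sample
betas = [0.03, 0.48, 0.97, 0.52, 0.05, 0.91]
reference = ["AA", "AB", "BB", "AB", "AA", "AB"]
called = call_genotypes(betas)
print(called, percent_match(called, reference))
```

A low % match against the independent SNP data would flag a possible sample mix-up.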
Normalisation
Why?
Cancer vs. control: not required
More sensitive differences...
Quantile?
Rank & scale according to reference distribution (average)
Not appropriate: Type I & II assays differ
• Medians at opposite ends of β scale
• SD (across replicates) smaller in Type I probes
• Interrogate different subsets of the genome
– Type II: greater proportion in open-sea
– Type I: greater proportion in gene promoters
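For reference, plain quantile normalisation (rank each array, then scale to an averaged reference distribution) can be sketched as below. This is a generic illustration with made-up numbers, not code from the pipeline; it is the approach the slide argues is not appropriate when Type I and Type II probes are pooled.

```python
def quantile_normalise(samples):
    """Quantile-normalise equal-length lists of values (one list per array).

    Ties are broken by position, which is a simplification.
    """
    n = len(samples[0])
    sorted_samples = [sorted(s) for s in samples]
    # Reference distribution: mean of the k-th smallest value across arrays
    reference = [
        sum(s[k] for s in sorted_samples) / len(samples) for k in range(n)
    ]
    normalised = []
    for s in samples:
        order = sorted(range(n), key=lambda i: s[i])  # probe indices by rank
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalised.append(out)
    return normalised


# Two made-up arrays with three probes each
arrays = [[5, 2, 3], [4, 1, 6]]
print(quantile_normalise(arrays))
```

After normalisation every array has exactly the same set of values; only their assignment to probes differs.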
Normalisation: Method 1
Subset-quantile Within Array Normalisation (SWAN, minfi)
To address differences in distribution:
• No. of CpGs in probe body indicates density/location
• Distributions more similar in these groups
Approach
• Reference quantiles:
– N random Type I & II probes selected for each group
– Split methylated/unmethylated channels
• Linear interpolation to fit probes to reference
Doesn’t treat Type I & II separately
• BUT does decrease difference
Normalisation: Method 2
Touleimat & Tost
To address differences:
• CpG region
– Shore / Shelf / Island / Open-sea
• Treat Type I & II separately
Approach:
• Reference quantiles
– Type I used as “anchors” for each region
– More reliable / lower SD
• Estimate target quantiles
• Fit Type II to target
Normalisation: Method 3
Dasen (wateRmelon)
Under review
Separate QN of
• methylated Type I
• unmethylated Type I
• methylated Type II
• unmethylated Type II
intensities.
Both directions
Normalisation: Comparison
wateRmelon metrics:
Imprinted DMRs
• 237 probes within iDMRs
• iDMRs expected to be ~50% methylated
• SE = SD / √N
– SD of all 237 probes
– N = number of samples

Method   iDMRs
Raw      0.00431
Dasen    0.00241
Tost     0.00214
Swan     0.00428
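The SE measure above can be worked through as follows. This is a small sketch following the slide's definition (SD pooled over all iDMR probe values, divided by √N with N the number of samples); the function name and the β values are made up, and it is not the wateRmelon implementation.

```python
import math
import statistics


def idmr_standard_error(betas_by_sample):
    """SE = SD / sqrt(N): pooled SD of iDMR betas, N = number of samples."""
    n_samples = len(betas_by_sample)
    pooled = [b for sample in betas_by_sample for b in sample]
    return statistics.stdev(pooled) / math.sqrt(n_samples)


# Four made-up samples, three iDMR probes each, all near 50% methylation
idmr_betas = [
    [0.40, 0.50, 0.60],
    [0.50, 0.50, 0.50],
    [0.45, 0.55, 0.50],
    [0.60, 0.40, 0.50],
]
print(idmr_standard_error(idmr_betas))
```

A smaller value means the iDMR probes sit more tightly around their expected level, which is why lower is read as better in the table.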
Normalisation: Comparison
SNP probes
• 63 highly polymorphic SNP probes
• K-means clustering into 3 genotypes
• SE-like measure for each group

Method   AA          AB          BB
Raw      9.025e-05   1.910e-04   5.145e-05
Dasen    1.669e-04   2.047e-04   2.321e-05
Tost     8.253e-05   5.242e-04   1.541e-04
Swan     NA          NA          NA
Normalisation: Comparison
wateRmelon metrics:
X-Chromosome Inactivation
• 11,232 probes
• T-test all probes for sex differences
• ROC analysis
– using p-value for sex difference
• 1 - AUC: 0 being the perfect predictor & best sex separation

Method   X-Inact.
Raw      0.0947
Dasen    0.0889
Tost     0.0892
Swan     0.4952
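The 1 - AUC figure can be sketched as below. This is an illustration, assuming the ROC analysis asks how well the sex-difference p-values separate X-linked probes from the rest (smaller p-value taken as stronger evidence); that reading, the function name and the example values are assumptions, and the quadratic loop is only suitable for small inputs.

```python
def one_minus_auc(p_values, is_x_linked):
    """1 - AUC via the pairwise (Mann-Whitney) definition of AUC."""
    positives = [p for p, x in zip(p_values, is_x_linked) if x]
    negatives = [p for p, x in zip(p_values, is_x_linked) if not x]
    wins = 0.0
    for p_pos in positives:
        for p_neg in negatives:
            if p_pos < p_neg:
                wins += 1.0  # X-linked probe ranked ahead of a non-X probe
            elif p_pos == p_neg:
                wins += 0.5  # ties count as half
    auc = wins / (len(positives) * len(negatives))
    return 1.0 - auc


# Made-up p-values for six probes; True marks an X-linked probe
p_vals = [1e-8, 1e-5, 0.02, 0.4, 0.6, 0.9]
x_linked = [True, True, False, True, False, False]
print(one_minus_auc(p_vals, x_linked))
```

With perfect separation the function returns 0, matching the slide's "0 being the perfect predictor".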
Comparison: Density Plots
Metrics are great, but how do they really affect the data?
[Figure: density plots. Panels: All, Type I, Type II]
Comparison: Density Plots
Normalised distributions
[Figure: density plots. Panels: All, Type I, Type II]
Comparison: Scatter Plot
Pepsi Plot: you’ll see why!
Raw (x) vs. Normalised (y)
• Type I, Type II
[Figure: scatter plots. Panels: SWAN, Tost, Dasen]
Comparison: Scatter Plot
Batch Correction: Exp. Design
Bisulphite Conversion
Excess of samples > 48
Redundant controls
QC and PCR
MSA4 Plate
Well dictates chip position (Robot)
Randomised
• Min. 4 of each time point
• Max 1 control
• Mix of gender
Infinium 450k Chips
12 arrays per chip
Throughput doubled
Batch Correction: Metadata
LIMS tracking
Every process
All consumables
• ~20
• Formamide to hyb. buffers
• > 1000 used so far!
All equipment
• Fridge/centrifuge/PCR block
Batch Correction
What are we seeing?
Bisulphite batch
Correction
Many algorithms available
• SVD/SVA/DWD
Gene expression
ComBat
Chen C, Grennan K, Badner J, Zhang D, Gershon E, et al. (2011) Removing Batch Effects in Analysis of Expression Microarray Data: An Evaluation of Six Batch Adjustment Methods. PLoS ONE 6(2): e17238. doi:10.1371/journal.pone.0017238
Empirical Bayesian framework
• Create a model matrix
• Supply batch variable
• Standardise gene-wise
– Least squares approach
• Fit L/S model to find priors
• Adjust to empirical parametric priors
Batch Correction
Example data
Batch correct Tost-normalised data
Use M values
Convert back to β
Values can escape 0-1 limit
• Scale
• 0.02% of probes
• Distribution unaffected.
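The β to M round trip mentioned above can be sketched as follows. The slides do not say which conversion was used, so the logit form below (M = log2(β / (1 - β))) is an assumption, as are the function names; it requires β strictly between 0 and 1.

```python
import math


def beta_to_m(beta):
    """Logit-style M value; beta must lie strictly between 0 and 1."""
    return math.log2(beta / (1.0 - beta))


def m_to_beta(m):
    """Inverse of beta_to_m."""
    return 2.0 ** m / (2.0 ** m + 1.0)


# Made-up beta values
for beta in [0.05, 0.2, 0.5, 0.8]:
    m = beta_to_m(beta)
    print(beta, m, m_to_beta(m))
```

M values are unbounded and symmetric around 0 (β = 0.5), which is the usual motivation for running batch correction on that scale.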
Batch Correction: BEFORE
Batch Correction: AFTER
Datasets
ARIES pre-release:
Filtered probes
SNP probes

Age group   n
Cord        584
F7          598
TF3 (15)    64
F17         280
Antenatal   394
FOM         329
MWAS
Choice of servers:
Epi-garrod
BlueCrystal
Epi-garrod
Request account via IT-services for: epi-garrod.bris.ac.uk
Relatively quiet server in the department
No queuing system
Check htop before running jobs
Cord data requires ~15% RAM
Epi-garrod
Data: SAN
• Accessible from multiple servers
/mnt/sscm3/ARIES_DATA/…
Permissions for this folder
You must be a member of the aries group
BlueCrystal
Request an account via: https://www.acrc.bris.ac.uk/login-area/apply.cgi
Queuing handled
Data:
/gpfs/cluster/smed/alspac-shared/aries/…
Again, permissions required:
Member of aries group
Files
ALN_dasen_<<time_code>>_betas.Rdata
ALN_tost_<<time_code>>_betas.Rdata
<<time_code>>_manifest.Rdata
fdata.Rdata
MWAS.r
ALN_dasen_<<time_code>>_betas.Rdata
<<time_code>>_manifest.Rdata
Fdata_new.RData
CpGassoc
CRAN http://cran.r-project.org/web/packages/CpGassoc/index.html
Tests for association between an independent variable and methylation
Option to include additional covariates
Assesses significance with:
Holm (step-down Bonferroni)
FDR methods
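The two kinds of multiple-testing adjustment named above can be sketched as follows. This is a generic illustration, not CpGassoc's own code: Holm is implemented as step-down Bonferroni, and Benjamini-Hochberg is used here as one example of an FDR method, which is an assumption since the slide does not name a specific one. The p-values are made up.

```python
def holm(p_values):
    """Holm step-down Bonferroni adjusted p-values, in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # ascending
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        value = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, value)  # enforce monotonicity
        adjusted[idx] = running_max
    return adjusted


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg FDR adjusted p-values, in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for k, idx in enumerate(order):
        rank = m - k  # 1-based rank in ascending order
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up raw p-values for four CpG sites
raw = [0.01, 0.04, 0.03, 0.005]
print(holm(raw))
print(benjamini_hochberg(raw))
```

Holm controls the family-wise error rate and is the stricter of the two; FDR methods tolerate a controlled proportion of false positives among the hits.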
MWAS.r
MWAS.r continued...
MWAS.r continued...
Manhattan / QQ
Replicated the following study's results:
450K Epigenome-Wide Scan Identifies Differential DNA Methylation in Newborns Related to Maternal Smoking during Pregnancy. Bonnie R. Joubert, et al.
Gene hits: GFI1, AHRR, MYO1G, CYP1A1
"CYP1A1 plays a key role in the aryl hydrocarbon receptor signaling pathway, which mediates the detoxification of the components of tobacco smoke." - Joubert, et al.
Results file
BlueCrystal .bashrc
Any Questions?