identification of aberrant epigenetic events in mss/cimp-negative colon … · 2015-10-15 ·...

IDENTIFICATION OF ABERRANT EPIGENETIC

EVENTS IN MSS/CIMP-NEGATIVE COLON CANCER

A DISSERTATION SUBMITTED TO THE GRADUATE DIVISION OF THE

UNIVERSITY OF HAWAI‘I AT MĀNOA IN PARTIAL FULFILLMENT OF

THE REQUIREMENTS FOR THE DEGREE OF

DOCTOR OF PHILOSOPHY

IN

MOLECULAR BIOSCIENCES AND BIOENGINEERING

May 2014

By

Min-Ae Song

Dissertation Committee

Maarit Tiirikainen, Chairperson

Jason Barbour

Dulal Borthakur

Lana Garmire

Alika Maunakea

We certify that we have read this dissertation and that, in our opinion, it is satisfactory in scope and quality as a dissertation for the degree of Doctor of Philosophy in Molecular Biosciences and Bioengineering.

DISSERTATION COMMITTEE

_______________________________

Maarit Tiirikainen, Chairperson

_______________________________

Jason Barbour

________________________________

Dulal Borthakur

_______________________________

Lana Garmire

_______________________________

Alika Maunakea

© Copyright by Min-Ae Song 2014

All Rights Reserved

ACKNOWLEDGEMENTS

Completing my PhD degree has been the most challenging activity of my life. I would never have been able to finish my dissertation without the guidance of my committee members, help from friends, and support from my family. I would like to send warm and well-deserved thanks to the following organizations for the research materials and to the following people for intellectual and moral support.

I would like to thank all colon tissue donors, their families and the Colon Cancer Family Registry for the advancement of science. I would like to express my special appreciation and thanks to my academic advisor, Dr. Maarit Tiirikainen, for outstanding guidance, mentorship and caring throughout my graduate career. I would also like to thank my thesis committee members, Drs. Jason Barbour, Dulal Borthakur, Lana Garmire, and Alika Maunakea, for serving on my committee and for their helpful comments and discussions. I would especially like to thank Dr. Loic Le Marchand for being a great mentor and for being so generous in financially supporting my research. I would like to thank my master's degree advisor, Dr. Suman Lee, for always being in my corner and for being a true inspiration for me. I would like to give special thanks to Drs. Song-yi Park and Unhee Lim for caring, encouraging me, being there for me, and making sure I made the right decision to go through graduate school. I would like to express sincere appreciation to all members of the liver cancer team, especially Drs. Linda Wong, Herbert Yu, and Sandi Kwee, for their generous and continuous support and for helping me to shape my interest and ideas in the liver cancer studies which I was conducting in parallel with my thesis work on colon cancer. I would also especially like to thank my lab-mates, Annette Jones, Ann Seifried and Matt Hiramoto, for their friendship, support, and insight, day in and day out. I would like to offer a special thanks to my family. Words cannot express how grateful I am to my parents, my two elder sisters, Min-Suk Song and Yun-sook Song, and my elder brother, Yui-Sung Song, in Korea for all of their sacrifices and for always supporting and encouraging me with their best wishes. This dissertation would never even have been started if it were not for my wonderful husband, Dong Hyun Kim. I would like to thank him for always cheering me up and standing by me through the good and bad times. I would like to thank my two beloved kids, Claire Haeun Kim and Aiden Seojin Kim, for being such a great daughter and son.

ABSTRACT

My research project focused on identifying aberrant epigenetic changes involving non-coding RNAs and DNA methylation in MSS/CIMP-negative colon cancer, the major subtype. DNA methylation is the best-studied epigenetic event, while non-coding RNA-mediated transcriptional silencing also plays an important role in cancer. My first aim was to profile microRNAs (miRNAs) and their potential target genes in colon cancer by studying 10 paired normal and tumor colon tissues using data from Next Generation Sequencing and Exon arrays. Nineteen miRNAs, including 6 previously associated with colon cancer and 13 not previously implicated, were found to be aberrantly expressed in the tumor tissues. Thirty-six colon cancer-related genes were significantly correlated with the expression levels of the identified miRNAs, and ‘Wnt/beta-catenin Signaling’ was identified as the top canonical pathway for these target genes. My second aim was to identify novel small and long non-coding RNAs in the 8q24 region, which contains one of the most relevant colon cancer risk variants, SNP rs6983267. Thirty-two pre-miRNAs were identified in silico by two algorithms, but none of them were verified in further studies. However, a novel long non-coding RNA spanning rs6983267 was recently identified, and significantly elevated expression levels were observed in 23 colon tumor tissues in our sample set. In addition, one known miRNA in a cluster 400 kb away from the risk SNP showed genotype-dependent expression patterns. My last aim was to elucidate the landscape of genome-wide DNA methylation in colon cancer. General hypomethylation was observed, concentrated in the intergenic regions and gene bodies, while hypermethylation was observed in promoter regions, N_Shores and CpG islands. Differentially methylated CpGs were enriched in genes with roles in cancer and gastrointestinal disease. Observations in imprinted genes suggest a more widespread dysregulation of imprinting in colon cancer than previously reported. These findings of epigenetic alterations in colon cancer will hopefully contribute to a better understanding of how these aberrant events relate to colon cancer development and progression, and may also lead to the discovery of new biomarkers for patient diagnosis, stratification and treatment follow-up.

TABLE OF CONTENTS

ACKNOWLEDGEMENTS
ABSTRACT
LIST OF FIGURES
LIST OF ABBREVIATIONS
LIST OF PUBLICATIONS BY MIN-AE SONG RELATED TO THIS THESIS WORK

CHAPTER 1
INTRODUCTION
1.1. Epigenetic mechanisms
1.1.1. Histone modification
1.1.3. Non-coding RNAs
1.1.2. DNA methylation
1.1.3.1. Long non-coding RNAs
1.1.3.2. MicroRNAs
1.2. Epigenetic alterations in cancer
1.2.1. Histone modification in cancer
1.2.2. DNA methylation in cancer
1.2.3. LncRNAs in cancer
1.2.4. MiRNAs in cancer
1.3. Characterization of colorectal cancer
1.3.1. Molecular classification of colorectal cancer
1.3.2. Genetic and epigenetic alterations of CRC
1.4. Genome-wide State-of-the-art methods used for epigenetics
1.4.1. SOLiD Next Generation Sequencing (NGS) for miRNAs
1.4.2. Illumina Infinium HumanMeth450 BeadChip for Methylation
1.5. Research aims
1.6. Significance

CHAPTER 2
COMPREHENSIVE PROFILING OF EXPRESSION ALTERATIONS IN KNOWN AND NOVEL MIRNAS IN COLON CANCER
2.1. Introduction
2.2. Materials and methods
2.2.1. Small RNA extraction and quality checks
2.2.2. SOLiD sequencing
2.2.3. Statistical data analysis of SOLiD sequencing
2.2.4. Technical Validation and Replication using realtime RT-qPCR
2.2.5. Sample processing for Affymetrix Exon Arrays
2.2.6. Integrated analysis and Ingenuity pathway analysis
2.3. Results and discussion
2.3.1. Quality checks of small RNAs
2.3.2. Small RNA library preparation
2.3.3. Deep sequencing of small RNAs
2.3.4. Evaluation and preliminary analyses of the NGS data
2.3.5. MiRNA expression profiles in colon normal and tumor tissues
2.3.6. Correlation of expression levels between 19 differentially expressed tumor miRNAs and their predicted target genes
2.3.7. Previous findings of known colon cancer miRNAs in cancers
2.3.8. Previous findings of colon cancer new-miRNAs in cancers
2.4. Conclusion

CHAPTER 3
IDENTIFICATION OF NONCODING RNAS IN THE 8q24 REGION SPANNING THE MULTIPLE CANCER RISK LOCUS SNP rs6983267
3.1. Introduction
3.2. Materials and methods
3.2.1. In Silico prediction of potential miRNAs in the 8q24 region
3.2.2. Total RNA extraction
3.2.3. Reverse Transcriptase Quantitative PCR
3.2.4. Affymetrix Genome-Wide Human SNP 6.0 Array
3.3. Results and discussion
3.3.1. Identification of novel miRNAs in the 8q24 region using computational algorithms
3.3.2. Altered expression of five known miRNAs located in the 8q24 region
3.3.3. Altered expression of novel lncRNAs in the 8q24 region
3.4. Conclusion

CHAPTER 4
LANDSCAPE OF ALTERED METHYLATION IN COLON CANCER
4.1. Introduction
4.2. Materials and methods
4.2.1. Information on Patient Specimen
3.2.2. Total RNA extraction
4.2.3. Affymetrix Exon Arrays
4.2.2. DNA Extraction and Bisulfite Conversion
4.2.3. HumanMethylation450 BeadChips
4.2.4. Raw data normalization
4.2.5. Initial filtering of beta-values
4.2.6. Statistical analysis of differential methylation
4.2.7. Ingenuity pathway analysis (IPA)
4.3. Results and discussion
4.3.1. Aims of data analysis
4.3.2. Conversion of beta-values to M-values
4.3.3. Quality checks based on the distribution of the beta-values
4.3.4. Distribution and classification of CpGs
4.3.5. Identification of the genome-wide methylation profiles in colon cancer
4.3.6. Genome-wide methylation patterns of significant DM CpGs in colon cancer
4.3.7. MSS/CIMP-neg colon cancer DM CpGs compared to another cancer type
4.3.8. Deregulated methylation at imprinted genes in MSS/CIMP-neg colon cancer
4.3.9. Correlation between DNA methylation and miRNA expression
4.4. Conclusion

GENERAL DISCUSSION

REFERENCES

LIST OF TABLES

Table 1. Genomic loci associated with CRC risk.

Table 2. HumanMethylation450 BeadChip coverage through gene regions.

Table 3. Sample information for NGS miRNA study.

Table 4. NGS Mapping Results.

Table 5. Numbers of significant mRNAs in 21 comparisons.

Table 6. Expression levels and putative targets of 6 differentially expressed known colon tumor miRNAs.

Table 7. Significantly differentially expressed target genes (FDR q<0.05) among correlated targets (FDR q<0.05).

Table 8. Top IPA network of the CRC miRNA target genes.

Table 9. Significantly correlated target mRNAs that have been reported in other studies and their differential expression in colon tumor compared to normal tissues in the current study.

Table 10. Sample information for the ncRNA study.

Table 11. Lists of predicted pre-miRNAs by ProMiRII and miR-abela.

Table 12. Sample information for methylation analysis.

Table 13. IPA top networks for genes with DM CpGs.

Table 14. List of genes that have previously been reported to be mutated in colon cancer.

Table 15. Significant differential methylation in imprinted gene loci between colon tumor and normal tissues at Bonferroni corrected p<0.05.

Table 16. The number of analyzed CpGs for differentially expressed miRNAs.

LIST OF FIGURES

Figure 1. Dynamic regulation of transcription by histone modifications.

Figure 2. Conversion of the cytosine to 5-methylcytosine by DNA methyltransferase (DNMT).

Figure 3. Distribution of CpG dinucleotides throughout the genome.

Figure 4. The decision tree to select appropriate DNA methylation analysis methods.

Figure 5. Paradigms for cellular functions of lncRNAs (red).

Figure 6. LncRNAs play roles in the chromatin remodeling, transcriptional control, post-transcriptional processing.

Figure 7. MicroRNA biogenesis.

Figure 8. MicroRNAs' involvement in colorectal cancer pathogenesis.

Figure 9. Progressive altered genetics and epigenetics steps in the development of CRCs.

Figure 10. Derivation of molecular CRC groups 1-5 based on CIMP status and MSI status.

Figure 11. Classification of 125 CRCs and heatmap representation of Illumina HumanMethylation27 BeadChip analysis.

Figure 12. Epigenetic alterations in colon cancer.

Figure 13. Overview of SOLiD sequencing chemistry.

Figure 14. Schematic of Infinium I (A) and II (B) technology.

Figure 15. Outline of the three Thesis Aims.

Figure 16. The quality of the total RNAs and the small RNA preparations.

Figure 17. The quality of the size selected cDNA libraries was checked on the Bioanalyzer using the DNA 1000 chip.

Figure 18. Read length distribution (nt, number of nucleotides) of sequences mapped to miRBase.

Figure 19. The strongest oncogenic and tumor suppressor miRNA candidates in colon tumor tissues.

Figure 20. Expression values from SOLiD sequencing (X-axis) plotted against qPCR delta Ct values (Y-axis).

Figure 21. Source of Variation and Principal Component Analysis (PCA). A. Source of variation.

Figure 22. Tumor vs. normal miRNA expression profiles of 13 newly identified miRNAs.

Figure 23. Genes significantly correlated with top miRNAs in the Wnt/beta-catenin signaling pathway in colon tumor tissue.

Figure 24. In Silico-predicted pre-miRNAs in the 30kb region flanking the 8q24 SNP rs6983267.

Figure 25. Examples of secondary stem-loop structures of predicted miRNA precursors from 3’ region of rs6983267 by ProMiRII.

Figure 26. Expression levels of five known miRNAs located in the 8q24 region by different tissue types (10 tumors and 10 normals) and the genotype of the rs6983267 SNP (5 GG versus 5 TT for each group).

Figure 27. Elevated expression of a novel lncRNA, CCAT2, in tumors (A) and the differential expression between GG, GT and TT samples.

Figure 28. The workflow for HumanMethylation450 BeadChips.

Figure 29. M-value transformation to address the issue of heteroscedasticity.

Figure 30. Histograms of beta-values (A) and M-values (B) interrogating CpGs in the total of 485,577 CpGs.

Figure 31. Histogram of average beta-values for 485,577 CpGs in 40 tumor samples (A) and 36 adjacent normal samples (B).

Figure 32. Distribution (A) and median of average beta-values (B) on 113 CpGs consistently methylated in normal, but not in tumor tissues by Peter Laird’s group.

Figure 33. Unsupervised hierarchical clustering of beta-values for 8 CpGs (rows) in pooled samples (A), and only paired samples (B) (columns).

Figure 34. Dot plots of beta-values in 26 paired colon tissues for 8 previously identified hypermethylated CpGs by Karpinski et al.

Figure 35. Distribution of CpGs across functional genomic locations (A) and CGIs (B).

Figure 36. Volcano plots showing the magnitude of differential methylation levels (delta-beta) in the entire CpGs sets; (A) various functional regions; (B) CpG islands and the surrounding regions.

Figure 37. Methylation profiles of (A) 304 DM CpGs with Bonferroni corrected p<0.05 and (B) 152 DM CpGs with delta-beta values ≥ |0.2| by PCA (left), and unsupervised hierarchical clustering (right).

Figure 38. Functional location of the 152 DM CpGs. (A) Distribution of 152 DM CpGs including 18 hypermethylated CpGs (Left) and 134 hypomethylated CpGs (Right).

Figure 39. Distribution of CGIs and surrounding regions of 152 DM CpGs. (A) Distribution of 152 DM CpGs including 18 hypermethylated CpGs (Left) and 134 hypomethylated CpGs (Right).

Figure 40. Clustering of normal tissues (colon and liver) and tumor tissues (colon cancer and HCC) using 152 colon DM CpGs resulting in near perfect discrimination of tissues.

Figure 41. Dot plots of beta-values for 8 differentially methylated CpGs in 5 imprinted genes in MSS/CIMP-negative colon cancer compared to adjacent normal tissues.

Figure 42. Inverse correlation between DNA methylation and gene expression level of MEST.

Figure 43. Correlation between miRNA expression and their DNA methylation.

LIST OF ABBREVIATIONS

AGCC Affymetrix GeneChip Command Console

ANCOVA Analysis of Covariance

APC adenomatous polyposis coli

BH Benjamini-Hochberg

BH-FDR Benjamini-Hochberg’s false discovery rate

BMP bone morphogenetic protein

C-DM cancer specific differentially methylated

CASP3 caspase 3

CCAT1 colon cancer associated transcript 1

CCAT2 colon cancer associated transcript 2

CDK4,6 cyclin dependent kinase 4,6

CIMP CpG island methylator phenotype

CLL chronic lymphocytic leukemia

CML chronic myelogenous leukemia

CRC Colorectal cancer

CTGF connective tissue growth factor

DCC deleted in colorectal carcinoma

DM differentially methylated

ECM extracellular matrix

EGFR epidermal growth factor receptor

EMT epithelial-mesenchymal transition

EXPO5 exportin 5

GWAS genome-wide association studies

hESC human embryonic stem cells

HOTAIR HOX antisense intergenic lncRNA

ICAMs intercellular adhesion molecules

IPA Ingenuity Pathway Analysis

KLF4 Krüppel-like factor 4

known-miRNAs previously reported miRNAs

KRAS Kirsten rat sarcoma viral oncogene homolog

LncRNAs Long non-coding RNAs

LOI loss of imprinting

MC microCosm

miRAGE miRNA serial analysis of gene expression

MMPs matrix metallopeptidases

MMR DNA mismatch repair gene

mRNA messenger RNA

miRNAs microRNAs

MSCs mesenchymal stem cells

MSI microsatellite instability

mTOR mechanistic target of rapamycin

new-miRNAs newly identified miRNAs

NGS Next Generation Sequencing

PCA Principal Component Analysis

PDCD4 programmed cell death 4

PI3K phosphatidylinositol-3-kinase

POU5F1B POU class 5 homeobox 1B

POU5FP1 POU class 5 homeobox 1B pseudoprotein 1

pre-miRNAs precursor miRNAs

PRNCR1 Prostate cancer non-coding RNA 1

PTEN phosphatase and tensin homolog

PVT1 plasmacytoma variant translocation 1

QC quality control

R-SBE repressive SBE sequence

RASSF1A RAS association family 1 gene

RECK reversion inducing cysteine rich protein with kazal motifs

RISC RNA-induced silencing complex

rRNA ribosomal RNA

SBE Smad binding element

SIRT1 sirtuin 1

SNPs single nucleotide polymorphisms

T-DM tissue specific differentially methylated

TGFb transforming growth factor b

TGFR1/2 transforming growth factor, beta receptor 1/2

TIMP3 tissue inhibitor of metalloproteinase 3

tRNAs transfer RNAs

TS TargetScan 5.1

TSP1 thrombospondin 1

TSS transcription start site

UCRs ultra conserved regions

uPAR urokinase plasminogen activator surface receptor

USP33 ubiquitin specific peptidase 33

UTR untranslated region

XIST X-inactive specific transcript

ZEB1/2 zinc finger E box binding homeobox 1/2

5-FU 5-fluorouracil

LIST OF PUBLICATIONS BY MIN-AE SONG RELATED TO THIS THESIS WORK

PUBLICATIONS

1. Unhee Lim and Min-Ae Song, Dietary and Lifestyle Correlates of DNA Methylation, Methods in Molecular Biology in Cancer Epigenetics, Springer Science (Humana Press). 2011. (Book Chapter)

Abstract

Lifestyle factors, such as diet, smoking, physical activity and body weight management, are known to constitute the majority of cancer causes. Epigenetics has been widely proposed as a main mechanism that mediates the reversible effects of dietary and lifestyle factors on carcinogenesis. This chapter reviews human studies on potential dietary and lifestyle determinants of DNA methylation. Apart from a few prospective investigations and interventions of limited size and duration, evidence mostly comes from cross-sectional observational studies and supports some associations. Considering the plasticity of epigenetic marks and correlated nature of lifestyle factors, more longitudinal studies of healthy individuals of varying age, sex, and ethnic groups are warranted, ideally with simultaneous and comprehensive data collection on various lifestyle factors. Studies to date suggest that certain dietary components may alter genomic and gene-specific DNA methylation levels in systemic and target tissues, affecting genomic stability and transcription of tumor suppressors and oncogenes. Most data and supportive evidence exist for folate, a key nutritional factor in one-carbon metabolism that supplies the methyl units for DNA methylation. Other candidate bioactive food components include alcohol and other key nutritional factors of one-carbon metabolism, polyphenols and flavonoids in green tea, phytoestrogen and lycopene. Some data also support a link of DNA methylation with physical activity and energy balance. Effects of dietary and lifestyle exposures on DNA methylation may be additionally modified by common genetic variants, environmental carcinogens, and infectious agents, an aspect that remains largely unexplored. In addition, growing literature supports that the environmental conditions during critical developmental stages may influence later risk of metabolic disorders in part through persistent programming of DNA methylation. Further research of these modifiable determinants of DNA methylation will improve our understanding of cancer etiology and may present certain DNA methylation markers as attractive surrogate endpoints for prevention research.

2. Min-Ae Song, Maarit Tiirikainen, Sandi Kwee, Gordon Okimoto, Herbert Yu, Linda L. Wong. Elucidating the Landscape of Aberrant DNA Methylation in Hepatocellular Carcinoma. PLOS ONE, 8(2): e55761, 2013

Abstract

Background: Hepatocellular carcinoma (HCC) is one of the most common cancers and

frequently presents with an advanced disease at diagnosis. There is only limited knowledge of

genome-scale methylation changes in HCC.

Methods and Findings: We performed genome-wide methylation profiling in a total of 47

samples including 27 HCC and 20 adjacent normal liver tissues using the Illumina

HumanMethylation450 BeadChip. We focused on differential methylation patterns in the

promoter CpG islands as well as in various less studied genomic regions such as those

surrounding the CpG islands, i.e. shores and shelves. Of the 485,577 loci studied, significant

differential methylation (DM) was observed between HCC and adjacent normal tissues at 62,692

loci or 13% (p < 1.03e-07). Of them, 61,058 loci (97%) were hypomethylated and most of these

loci were located in the intergenic regions (43%) or gene bodies (33%). Our analysis also

identified 10,775 differentially methylated (DM) loci (17% out of 62,692 loci) located in or

surrounding the gene promoters, 4% of which reside in known Differentially Methylated

Regions (DMRs) including reprogramming specific DMRs and cancer specific DMRs, while the

rest (10,315) involving 4,106 genes could be potential new HCC DMR loci. Interestingly, the

promoter-related DM loci occurred twice as frequently in the shores than in the actual CpG

islands. We further characterized 982 DM loci in the promoter CpG islands to evaluate their

potential biological function and found that the methylation changes could have effect on the

signaling networks of Cellular development, Gene expression and Cell death (p = 1.0e-38), with

BMP4, CDKN2A, GSTP1, and NFATC1 on the top of the gene list.

Conclusion: Substantial changes of DNA methylation at a genome-wide level were observed in

HCC. Understanding epigenetic changes in HCC will help to elucidate the pathogenesis and may

eventually lead to identification of molecular markers for liver cancer diagnosis, treatment and

prognosis.

3. Hui Ling, Riccardo Spizzo, Yaser Atlasi, Milena Nicoloso, Masayoshi Shimizu,

Roxana S. Redis, Naohiro Nishida, Roberta Gafà, Jian Song, Zhiyi Guo, Cristina Ivan,

Elisa Barbarotto, Ingrid De Vries, Xinna Zhang, Manuela Ferracin, Mike Churchman,

Janneke F. van Galen, Berna H. Beverloo, Maryam Shariati, Franziska Haderk,

Marcos R Estecio, Guillermo Garcia-Manero, Gijs A. Patijn, David C. Gotley, Vikas

Bhardwaj, Shureiqi Imad, Subrata Sen, Asha S. Multani, James Welsh, Ken Yamamoto,

Itsuki Taniguchi, Min-Ae Song, Steven Gallinger, Graham Casey, Stephen N Thibodeau,

Loïc Le Marchand, Maarit Tiirikainen, Sendurai A. Mani, Wei Zhang, Ramana V.

Davuluri, Koshi Mimori, Masaki Mori, Anieta M. Sieuwerts, John W.M. Martens, Ian

Tomlinson, Massimo Negrini, Ioana Berindan Neagoe, John A. Foekens, Stanley R.

Hamilton, Giovanni Lanza, Scott Kopetz, Riccardo Fodde, George A. Calin. CCAT2, a

novel non-coding RNA mapping to 8q24, underlies metastatic progression and chromosomal

instability in colon cancer. Genome Research, 23(9):1446-61, 2013

Abstract

The functional roles of SNPs within the 8q24 gene desert in the cancer phenotype are not yet

well understood. Here, we report that CCAT2, a novel long noncoding RNA transcript (lncRNA)

encompassing the rs6983267 SNP, is highly over-expressed in microsatellite-stable colorectal

cancer and promotes tumor growth, metastasis, and chromosomal instability. We demonstrate

that MYC, miR–17–5p, and miR–20a are up-regulated by CCAT2 through TCF7L2-mediated

transcriptional regulation. We further identify the physical interaction between CCAT2 and

TCF7L2 resulting in an enhancement of WNT signaling activity. We show that CCAT2 is itself a

WNT downstream target, which suggests the existence of a feedback loop. Finally, we

demonstrate that the SNP status affects CCAT2 expression and the risk allele G produces more

CCAT2 transcript. Our results support a new mechanism of MYC and WNT regulation by the

novel lncRNA CCAT2 in colorectal cancer pathogenesis, and provide an alternative explanation

of the SNP-conferred cancer risk.

POSTER ABSTRACTS

1. Min-Ae Song, Lenora WM Loo, Iona Cheng, Graham Casey, Steven

Gallinger, Stephen N Thibodeau, Loïc Le Marchand, Maarit Tiirikainen. Integrated

analysis of microRNA and mRNA expression in microsatellite-stable colon cancer using next-

generation sequencing and cDNA microarrays. American Association for Cancer Research

Annual Meeting, April, 2011.

MANUSCRIPTS IN PREPARATION

1. Min-Ae Song, Lenora WM Loo, Iona Cheng, Graham Casey, Steven

Gallinger, Stephen N Thibodeau, Loïc Le Marchand, Maarit Tiirikainen. Integration

analysis of microRNA and mRNA expression in MSS/CIMP-neg colon cancer using Next

Generation Sequencing. Manuscript in preparation.

2. Min-Ae Song, Lenora WM Loo, Iona Cheng, Graham Casey, Steven

Gallinger, Stephen N Thibodeau, Loïc Le Marchand, Maarit Tiirikainen. The Landscape

of Aberrant DNA Methylation in MSS/CIMP-negative Colon Cancer. Manuscript in

preparation.

CHAPTER 1

INTRODUCTION

1.1. Epigenetic mechanisms

Classical genetics alone cannot explain how, despite their identical DNA sequences,

monozygotic twins or cloned animals can have different phenotypes and different susceptibilities

to diseases. Epigenetic mechanisms may give an explanation for these phenomena (Esteller

2008). Epigenetics is defined as heritable modifications in gene function without a change of

DNA sequences (Goldberg, Allis et al. 2007). Epigenetics is also the gateway to

gene-environment interactions (Song 2011). Two major non-genetic alterations, DNA methylation

and histone modifications, are tightly correlated with gene expression and activity (Goldberg, Allis et al.

2007; Mikkelsen, Ku et al. 2007). Moreover, although not currently known to be heritable, non-

coding RNAs (ncRNAs) such as microRNAs (miRNAs) and long ncRNAs (lncRNAs) have

recently been extensively studied for their roles as gene expression regulators (Cannell, Kong et al. 2008;

Lee 2012) and they are considered to convey further epigenetic regulation.

1.1.1. Histone modification

Within the chromosome, DNA is packed into chromatin, which consists of DNA and

structural histone proteins. Within chromatin, the repeating unit is the nucleosome, which

is made up of about 146 base pairs (bp) of double-stranded DNA wrapped around a histone

octamer consisting of two each of the histones H2A, H2B, H3, and H4 (Fischle, Wang et al.

2003). Epigenetic modification occurs at the amino-terminal tails of the histones (Struhl 1998).

Histones and their modifications have an essential role in the formation of heterochromatin.

Heterochromatin (condensed or silent chromatin) is distinguished by hypoacetylation and H3K9

methylation; euchromatin (open or active chromatin) is characterized by histone H4 acetylation

and histone H3K4 methylation (Grewal and Jia 2007).

Acetylation of histones, mainly targeting the amino-terminal tails of histones H3 and

H4, plays a key role in the regulation of gene expression. Histone acetylation is balanced

by two families of enzymes, histone acetyltransferases (HATs)

and histone deacetylases (HDACs) (Trievel 2004). For a gene to be transcribed, it must become

physically accessible to the transcriptional machinery. HATs promote uncoiling of DNA and

an open chromatin structure. Conversely, HDACs promote tight coiling of DNA and a

closed chromatin structure. Many transcription coactivators such as CBP, p300 and MOF have

been reported to possess intrinsic HAT activity, whereas many transcriptional corepressor

complexes such as mSin3a, NCoR/SMRT and Mi-2/NuRD contain subunits with HDAC

activity (Wang, Zang et al. 2008). Figure 1 shows the chromatin remodeling complexes initiated

by histone modifications in the dynamic regulation of transcription (Davis and Brackmann 2003).

In contrast to the dynamic ‘on-off’ nature of histone acetylation, early studies found that

histones H3 and H4 were highly methylated with little turnover of the methyl groups (Borun,

Pearson et al. 1972; Rice and Allis 2001). Histone methylation can occur on arginine or lysine

residues and is catalyzed by histone methyltransferases (HMTs) (Trievel 2004). Arginine

residues can be mono- or di-methylated while lysines can also be tri-methylated (Cohen, Poreba

et al. 2011).

Histone modification patterns are closely associated with gene expression states. “Active”

histone modification marks such as H3K4me3 and H3K36me3, which are highly enriched within gene

promoters, may be involved in transcription initiation. “Silent” histone marks such as

H3K27me3 and H3K9me3 are correlated with transcriptional repression; in particular,

H3K9me3 is highly correlated with constitutive heterochromatin as found at centromeres and

telomeres (Maunakea, Chepelev et al. 2010).

Figure 1. Dynamic regulation of transcription by histone modifications. In the presence of

histone acetylation by HAT and the absence of methylation by HMT, chromatin is loosely

packed. The chromatin remodeling complex SWI/SNF opens up DNA regions where

transcription machinery proteins such as RNA Polymerase II (RNA Pol II), transcription

factors and co-activators bind to turn on gene transcription. In the absence of SWI/SNF,

nucleosomes remain tightly aligned to one another. Additional methylation by HMT and

deacetylation by HDAC proteins condense DNA around histones. Thus, RNA Pol II and

other activators cannot bind to DNA, leading to gene silencing (Davis and Brackmann

2003).

1.1.3. Non-coding RNAs

The human genome sequencing project revealed quite a surprise: the human

genome encodes just 20,000-25,000 protein-coding genes, representing less than 2% of the total

genome sequence (2004), although around 90% of the human genome is actively transcribed

(Birney, Stamatoyannopoulos et al. 2007). It was discovered that the human transcriptome consists

of a complex network including extensive antisense transcription, overlapping multiple exons,

and non-coding RNA (ncRNA) transcription (Kapranov, Cheng et al. 2007). A ncRNA is a

functional RNA molecule that is not translated into a protein (due to the lack of a significant

open reading frame). This RNA class is classified into two major groups based on the size: small

ncRNAs (<200 nt) such as miRNAs and small interfering RNAs (siRNAs), and long ncRNAs

(lncRNAs) (>200 nt). These are arbitrarily divided by a convenient practical cut-off in typical

RNA purification protocols that exclude small RNAs (Esteller 2011). In the cell, most of the

ncRNAs are located in the cytoplasm although some are found in both cytoplasm and the nucleus

(Banfai, Jia et al. 2012).

1.1.2. DNA methylation

DNA methylation is the best known epigenetic marker (Esteller 2008). DNA methylation

is a covalent modification of post-replicative DNA by DNA methyltransferases (DNMTs)

(Herman and Baylin 2003) which transfer the methyl group from S-adenosylmethionine (SAM)

to the carbon 5 position of a cytosine residue to form 5-methylcytosine (5mC) (Figure 2). These

methyl groups project into the major groove of the DNA double helix and effectively block

transcription. Although a small amount of methylation also occurs at CpNpG sequences, where

N can be A or T, DNA methylation in the human genome mostly occurs at CpG dinucleotides rather

than any other sites (Lee, Jang et al. 2010).

A further possible modification of 5mC is the addition of a hydroxyl group, producing

5-hydroxymethylcytosine (5hmC). 5hmC was initially discovered in the DNA of certain

bacteriophages (Hershey, Dixon et al. 1953) and was reported in mammalian tissues in 1972 in

brain and liver DNA (Penn, Suwalski et al. 1972). In 2009, Tahiliani et al. identified three

proteins, Ten-eleven translocation 1, 2, and 3 (TET1, TET2, TET3), which catalyze 5hmC

production (Tahiliani, Koh et al. 2009). Although it has been suggested that 5hmC may

be produced as an intermediary molecule during demethylation of 5mC, the functional role and

proportion of 5hmC in the human genome remain to be determined.

Figure 2. Conversion of the cytosine to 5-methylcytosine by DNA methyltransferase

(DNMT). DNMT catalyzes the transfer of a methyl group (CH3) from S-

adenosylmethionine (SAM) to the 5-carbon position of cytosine (Singal and Ginder 1999).

In most cases, DNA methylation is fairly long-lived, but in some cases, such as

when silencing of imprinted genes must be reversed in germ cells during fertilization,

epigenetic reprogramming is performed. Although the mechanism for DNA demethylation is not

fully understood, deamination of 5mC, that is, removal of its amino group, may mediate this

process (Morgan, Dean et al. 2004). Cytosine and especially 5mC are chemically less stable than

the other nucleobases. Cytosine deaminates into uracil, and 5mC deaminates into thymine.

Because thymine, unlike uracil, is a normal DNA base, deaminated 5mC is repaired less

efficiently and methylated CpGs are gradually lost over evolutionary time.

Therefore, CpGs are underrepresented about fourfold relative to their expected frequency in

mammalian DNA (Simmen 2008). Although the general level of CpG dinucleotides within the

human genome is low, high levels are observed at long repetitive sequences and CpG islands

(CGIs).

Mammals have three active DNA methyltransferases: DNMT1, DNMT3A and 3B.

DNMT1 maintains DNA methylation at hemi-methylated DNA following DNA replication

during cell division (Bestor 1992), whereas DNMT3A and 3B are both considered de novo

methyltransferases, recruited to establish new DNA methylation patterns (Okano, Bell et al.

1999). Although DNMT2 has been identified as a DNA methyltransferase homolog, it does not

methylate DNA but methylates aspartic acid transfer RNA (Goll, Kirpekar et al. 2006).

In mammalian DNA, 5mC is found in approximately 4% of genomic DNA, primarily at

CpGs. CpGs are not uniformly distributed throughout the human genome, but are found more

frequently at small regions of DNA called CpG islands (CGIs) (Herman and Baylin 2003). The

accepted definition of a CGI is a region of at least 200 bp with a GC content greater than 50%

and an observed-to-expected CpG ratio greater than 0.6 (Gardiner-Garden and

Frommer 1987). About 70% of annotated human genes are associated with the CGIs (Saxonov,

Berg et al. 2006).
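The three criteria above can be checked on any candidate sequence. The following is a minimal sketch in Python, not the procedure used in this dissertation: the function names are hypothetical, the whole input is treated as one candidate region, and the expected CpG count is assumed to be (number of C x number of G) / length, the form commonly attributed to Gardiner-Garden and Frommer.

```python
def cpg_island_stats(seq):
    """Return (length, GC fraction, observed/expected CpG ratio) for a DNA sequence."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # observed CpG dinucleotides on this strand
    gc_fraction = (c + g) / n if n else 0.0
    # Assumed expectation: CpG count if C and G were placed independently
    expected = (c * g) / n if n else 0.0
    obs_exp = cpg / expected if expected else 0.0
    return n, gc_fraction, obs_exp


def is_cpg_island(seq, min_len=200, min_gc=0.5, min_obs_exp=0.6):
    """Apply the length, GC content and observed/expected CpG thresholds."""
    n, gc_fraction, obs_exp = cpg_island_stats(seq)
    return n >= min_len and gc_fraction > min_gc and obs_exp > min_obs_exp
```

In practice, genome-wide CGI annotation scans the sequence with a sliding window rather than testing one fixed region, so this sketch only illustrates how the thresholds combine.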

Figure 3. Distribution of CpG dinucleotides throughout the genome. N and S indicate regions

upstream and downstream of CGIs, respectively.

Recently, the regions surrounding CGIs within the genome have been further classified: CGI

shores (up to 2 kb away from a CGI) and CGI shelves (2 kb to 4 kb away from a CGI). Interestingly, it

was found that most tissue- and cancer-specific differential methylation occurs at CGI shores rather than in the CGIs

themselves (Irizarry, Ladd-Acosta et al. 2009). Figure 3 shows the classification of

CGIs and their surrounding regions.
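The distance-based classification described here can be written as a small lookup. This is an illustrative Python sketch only: the function name, the inclusive coordinate convention and the label "open_sea" for positions more than 4 kb from the island are assumptions that do not come from the text, while the N (upstream) and S (downstream) prefixes follow Figure 3.

```python
def classify_cgi_context(pos, cgi_start, cgi_end, shore=2000, shelf=4000):
    """Classify a coordinate relative to one CGI on the same chromosome.

    Coordinates are assumed to be inclusive and on the forward strand.
    """
    if cgi_start <= pos <= cgi_end:
        return "island"
    # Distance to the nearest island edge; N = upstream, S = downstream
    if pos < cgi_start:
        dist, side = cgi_start - pos, "N"
    else:
        dist, side = pos - cgi_end, "S"
    if dist <= shore:
        return f"{side}_shore"
    if dist <= shelf:
        return f"{side}_shelf"
    return "open_sea"  # assumed label for anything beyond the shelves
```

A probe located 500 bp upstream of an island would be reported as `N_shore`, and one located 2.5 kb downstream as `S_shelf`.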

No single method to detect DNA methylation can be appropriate for every study. DNA

methylation can be analyzed by many different assays depending on the purpose of the study as

described in Figure 4 (Shen and Waterland 2007). Bisulfite conversion of DNA, which

converts unmethylated cytosines to uracil while leaving methylated cytosines unchanged, is

essential for most methods that analyze DNA methylation at specific

CpGs.
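The effect of bisulfite treatment on a sequence read can be simulated in a few lines. The sketch below is an assumption-laden illustration, not a laboratory protocol: the function name and the 0-based position list are hypothetical, and converted cytosines are written as T because uracil is read as thymine after PCR amplification.

```python
def bisulfite_convert(seq, methylated_positions=()):
    """Simulate bisulfite conversion plus PCR for one DNA strand.

    Unmethylated C is reported as T (C -> U -> T after amplification);
    C at any 0-based index in methylated_positions is left unchanged.
    """
    methylated = set(methylated_positions)
    converted = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)
```

Comparing the converted read with the reference sequence then reveals methylation status: a C that survives conversion was methylated, while a C that appears as T was not.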

Figure 4. Decision tree for selecting appropriate DNA methylation analysis methods (Shen

and Waterland 2007).

1.1.3.1. Long non-coding RNAs

LncRNAs can be produced from the mRNA transcription process via alternative splicing

intragenically from exons and introns (Shi, Sun et al. 2013), and they play regulatory

roles at almost every stage of gene expression, from targeting epigenetic modifications in the

nucleus to modulating mRNA stability and translation in the cytoplasm (Mercer and Mattick

2013) (Figure 5).

Figure 5. Paradigms for cellular functions of lncRNAs (red). Transcription from an

upstream lncRNA promoter can negatively (1) or positively (2) affect expression of the

coding gene (purple) by inhibiting RNA Pol II recruitment or inducing chromatin

remodeling (the HOTAIR lncRNA recruits the polycomb complex to induce heterochromatin

formation by H3K27me). In addition, antisense transcripts can pair to their specific sense

RNA, generating alternative splicing (3) or endo-siRNAs (4). When they interact with

proteins, they may influence protein activity (5) or localization (6) or even form cellular

substructures or protein complexes (7). LncRNAs can be processed to yield small, single- or

double-stranded RNAs that may act as endo-siRNAs or miRNAs (8). Moreover, they can

also act as “miRNA sponges” that affect the ceRNA network. LncRNA: long noncoding

RNA, miRNA: microRNA, ceRNA: competing endogenous RNA (Shi, Sun et al. 2013).

LncRNAs are known for their important roles in epigenetic regulation via chromatin

modification, transcription, and post-transcriptional processing (Figure 6) (Mercer, Dinger et al.

2009). Interestingly, at least 38% of lncRNAs bind to the histone methyltransferase complex

‘Polycomb repressive complex 2’ (PRC2) or other chromatin-modifying proteins (Cheetham, Gruhl

et al. 2013). A well-characterized lncRNA, first discovered in 1991, is the X-inactive specific

transcript (XIST) (Brown, Ballabio et al. 1991). XIST contains conserved repeats within the

transcript and is largely localized in the nucleus (Brown, Hendrich et al. 1992). Repeat region A

(RepA) is required for the silencing function of XIST in cis X inactivation. RepA recruits PRC2,

which lays down H3K27me, to silence one of the X chromosomes (Zhao, Sun et al. 2008).

Another well-studied lncRNA is Hox transcript antisense RNA (HOTAIR), which originates from

the HOXC locus on chromosome 12 and silences the HOXD locus on chromosome 2 by recruiting

PRC2 (Rinn, Kertesz et al. 2007).

Figure 6. LncRNAs play roles in chromatin remodeling, transcriptional control, and

post-transcriptional processing. (a) LncRNAs can recruit chromatin-modifying complexes to

specific genomic loci. HOTAIR, XIST, and Kcnq1ot1 recruit the chromatin-modifying Polycomb

complex to the HoxD locus, the X chromosome, or the Kcnq1 domain, respectively, where

they methylate H3K27 to induce heterochromatin formation and repress gene expression.

Therefore, a lncRNA can regulate the transcriptional process. (b) A lncRNA binds to the

cyclin D1 gene and recruits the RNA binding protein to modulate the p300 to repress gene

transcription. (c) A lncRNA acts as a co-activator to the transcription factor and regulates

gene expression. (d) A lncRNA transcribed from the DHFR minor promoter in humans can

form a triplex at the major promoter to prevent the binding of the general transcription

factor TFIID, leading to silencing of DHFR gene expression. (e) An antisense ncRNA binds to

mRNA, resulting in alternative splicing by blocking the spliceosome.

1.1.3.2. MicroRNAs

MiRNAs were first discovered in 1993 during a study of the gene lin-14 in Caenorhabditis

elegans (C. elegans) development by Lee et al. (Lee, Feinbaum et al. 1993) and have been

shown to be an essential component of epigenetic regulation. In 2000, a second important

miRNA, let-7, was also discovered in C. elegans (Reinhart, Slack et al. 2000). Let-7 miRNAs

have now been predicted or experimentally identified in a wide range of species (MIPF000002).

MiRNAs play an important role in the regulation of gene expression in different species

including the vertebrates (Lagos-Quintana, Rauhut et al. 2001). As of June 2013, 30,424

mature miRNAs in 206 species including 2,555 mature human miRNAs have been registered in

the miRBase database (http://microrna.sanger.ac.uk). MiRNAs play important roles in basic

biological functions including cell growth, proliferation, differentiation, invasion, and

angiogenesis by the downregulation of their target mRNAs.

The biogenesis of a miRNA begins with the transcription of a primary transcript (pri-

miRNA). This hairpin structure is transcribed from the miRNA gene as 500-3,000 nucleotide

long transcripts by RNA Pol II and then cleaved by a protein complex involving Drosha/DGCR8.

This results in the precursor miRNA (pre-miRNA, ~60-100 nucleotides) and these double

stranded hairpin structures are exported from the nucleus to cytoplasm by exportin 5 (EXPO5)

(Lagos-Quintana, Rauhut et al. 2001). Next, the pre-miRNA is further cleaved by Dicer1 (an

RNaseIII-containing enzyme) to produce the double-stranded miRNA that includes a mature

miRNA sequence (~22 nucleotides, guide strand) and its complementary sequence, the

miR* (star) strand (also called the passenger strand or 3p strand) (Denli, Tops et al. 2004). The guide strand is

incorporated into the RNA-induced silencing complex (RISC), and its 5’-end (the so-called “seed site”)

binds to the 3’ untranslated region (UTR) of its target mRNAs to repress them (O'Toole, Miller et al. 2006), whereas the

passenger strand is usually subjected to degradation (Khvorova, Reynolds et al. 2003). Binding

of a guide miRNA to an mRNA triggers either mRNA cleavage or inhibition of translation,

depending on the degree of complementarity between the miRNA and the target sequence

(Figure 7). Interestingly, each miRNA has a potential to target a large number of genes and

bioinformatic analysis of miRNAs predicts that the 3’UTRs of single genes are often targeted by

several different miRNAs (Lewis, Burge et al. 2005). Many different algorithms have been

developed for the prediction of miRNA-mRNA interactions. Well-known algorithms are

based on so-called conservation criteria, such as miRanda (John, Enright et al. 2004), PicTar

(Krek, Grun et al. 2005) and TargetScan (Grimson, Farh et al. 2007). Moreover, other parameters

have been used, such as the free energy of binding or secondary structures of the 3’UTR that can

promote or prevent miRNA binding (Witkos, Koscianska et al. 2011). However, the rules for

predicting the interaction have not been fully established yet, and current knowledge of miRNAs

and their targets is based mainly on experimentally validated miRNA-mRNA interactions

(Witkos, Koscianska et al. 2011).
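The simplest ingredient shared by such prediction tools, a search for 3’UTR sites complementary to the miRNA seed, can be sketched as follows. This Python example is a simplified illustration and does not reproduce miRanda, PicTar or TargetScan: the function names are hypothetical, the seed is assumed to span nucleotides 2 to 8 of the guide strand, and conservation, binding free energy and UTR secondary structure are all ignored.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def to_rna(seq):
    """Upper-case a sequence and write it in the RNA alphabet."""
    return seq.upper().replace("T", "U")


def seed_sites(mirna, utr, seed_start=2, seed_end=8):
    """Return 0-based UTR positions that pair perfectly with the miRNA seed.

    seed_start and seed_end are 1-based, inclusive positions on the miRNA
    (assumed default: nucleotides 2-8). Both sequences are given 5' to 3'.
    """
    mirna, utr = to_rna(mirna), to_rna(utr)
    seed = mirna[seed_start - 1:seed_end]
    # The target site is the reverse complement of the seed
    site = "".join(COMPLEMENT[base] for base in reversed(seed))
    hits = []
    i = utr.find(site)
    while i != -1:
        hits.append(i)
        i = utr.find(site, i + 1)
    return hits
```

A perfect seed match is neither necessary nor sufficient for repression, which is one reason the predictions still need experimental validation.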

Figure 7. MicroRNA biogenesis. (a) MiRNAs are transcribed by RNA pol II into pri-

miRNAs which are recognized and cleaved in the nucleus by Drosha, resulting in hairpin

pre-miRNAs. (b) Pre-miRNAs are exported by Exportin 5 from the nucleus to the

cytoplasm and further cleaved by Dicer, (c) resulting in a miRNA duplex. (d) One strand of

the miRNA duplex (mature miRNA) is incorporated into the RISC. The mature miRNA

leads RISC to cleave the mRNA or induce translational repression depending on the degree

of complementarity between the miRNA and its target (Garzon, Calin et al. 2009).

1.2. Epigenetic alterations in cancer

1.2.1. Histone modification in cancer

Given the fundamental role of histone modification in regulation of gene expression as

explained in Section 1.1.1 (Histone modification), it is not surprising that aberrant histone modification is

found in cancer. Histone modification by HATs, HMTs, and HDACs has been found to be

involved in tumorigenesis (Fullgrabe, Kavanagh et al. 2011). Two HATs, p300 and CBP, are

considered tumor suppressors (Chan and La Thangue 2001), and loss of heterozygosity (LOH)

at the p300 locus is associated with hyperacetylation in many cancers (Tillinghast, Partee et al.

2003; Koshiishi, Chong et al. 2004). Aberrant expression of HDACs has also been found in

multiple cancers (Chervona and Costa 2012). Furthermore, HDACs have been shown to associate with

the tumor suppressor retinoblastoma protein (RB) and repress the RB-dependent cell cycle (Siddiqui,

Solomon et al. 2003).

1.2.2. DNA methylation in cancer

Global hypomethylation in tumors as compared to the normal tissue was one of the first

epigenetic alterations to be found (Feinberg and Vogelstein 1983). It is mainly caused by

hypomethylation of repetitive DNA sequences such as LINE-1, and also involves demethylation of

coding regions and introns, which can result in altered transcripts (Feinberg and Tycko

2004). A recent study found that hypomethylation of LINE-1 leads to activation of

proto-oncogenes such as MET, RAB3IP, and CHRM3 in colorectal liver metastasis tissues compared to

primary colorectal cancer tissues (Hur, Cejas et al. 2013). This study also indicates that increased

5hmC content is associated with LINE-1 hypomethylation in colorectal cancer, providing

important mechanistic insights into the fundamental processes underlying global DNA

hypomethylation. Hypomethylation of DNA was recently found in many CGIs in cancer, unlike

the normal pattern of methylation in somatic tissues. This can lead to gene activation in tumors

including oncogenes such as HRAS (Feinberg and Vogelstein 1983), cyclin D2, HPV16, WNT5A

and S100P (Feinberg and Tycko 2004; Wang, Williamson et al. 2007). However,

hypomethylation of DNA has many mechanistic implications and is not fully understood

(Feinberg and Tycko 2004).

Hypermethylation of DNA in the promoter regions of tumor suppressor genes has been

well studied as a major event in tumorigenesis. The first finding of a hypermethylated tumor

suppressor gene, the retinoblastoma gene RB, (Greger, Passarge et al. 1989) was soon followed

by the identification of many other hypermethylated tumor suppressor genes including VHL, p16,

hMLH1, MGMT, WRN and BRCA1 (Herman, Latif et al. 1994; Esteller 2008; Kawasaki, Ohnishi

et al. 2008). Moreover, in 1999, a subtype of colorectal cancer (CRC) with hypermethylation at

a specific set of CGIs, the “CpG island methylator phenotype (CIMP) markers”, was recognized

as a distinct subgroup of CRC (Toyota, Ahuja et al. 1999). This classification method has since

been applied to other cancers including gastric (Toyota, Ahuja et al. 1999), breast (B-CIMP)

(Fang, Turcan et al. 2011) and glioblastoma multiforme (G-CIMP) (Noushmehr, Weisenberger et

al. 2010). Recently, the role of CIMP has also been investigated in ovarian tumors, especially at

seven loci: BRCA1, HIC1, MINT25, MINT31, MLH1, p73, and hTR. Hypermethylation of

those loci was found in a significant proportion of the ovarian tumors, and methylation of at

least one of them was found in the majority (71%, 63/93) of samples (Strathdee, Appleton

et al. 2001).

Recently devised epigenomic techniques suggest that 100 to 400 hypermethylated CGIs

in the promoter regions occur in a given tumor (Esteller 2007). Despite the extensive studies

of altered methylation in CGIs, it is still not clearly understood how CGIs become

hypermethylated in some types of cancer, but not in others. Moreover, the potential involvement

of methylation beyond the CGI promoters in human disease has been largely overlooked even in

genome-wide studies and the neighborhood of CGIs requires further work for our understanding

of cancers (Jones 2012).

1.2.3. LncRNAs in cancer

Since a number of ncRNAs such as miRNAs, tRNAs, rRNAs, and spliceosomal RNAs

are important to the functioning of the cells, it has been suggested that additional ncRNAs

may play a role in the regulation of cellular machinery (Wilusz, Sunwoo et al. 2009). Indeed,

a group of lncRNAs has been shown to be associated with developmental processes (Rinn,

Kertesz et al. 2007) and human diseases including cancer (Costa 2005), suggesting that

lncRNAs are a new class of functional transcripts. Although the biological significance of this

group of RNAs is still unclear, a variety of functions of lncRNAs have been found in normal

cells including the X-chromosome inactivation by the XIST gene (Wilusz, Sunwoo et al.

2009), genomic imprinting by H19 (Brannan, Dees et al. 1990) and DNA demethylation by

KHPS1a (Imamura, Yamamoto et al. 2004). Furthermore, recent studies have revealed

functional roles for several lncRNAs in cancer. For instance, human cancers have been

described to have aberrant overexpression of non-coding satellite repeats (Ting, Lipson et al.

2011). Also, highly conserved genomic regions called ultraconserved regions (UCRs) are

frequently aberrantly expressed in human leukemia (Calin, Liu et al. 2007) and colon cancer

(Wojcik, Rossi et al. 2010). Similarly, HOTAIR is highly expressed in breast cancers and plays

a role in retargeting chromatin-remodeling complexes (Gupta, Shah et al. 2010).

Other lncRNAs have been found to be key regulators of the protein signaling pathways

in carcinogenesis. The lncRNA lincRNA-p21 contains binding sites for the tumor suppressor

p53 in its promoter and is directly activated by p53 upon DNA damage. Similar to p53,

lincRNA-p21 has been suggested to act as a tumor suppressor (Huarte, Guttman et al. 2010).

To achieve replicative immortality, cancerous cells need to bypass the cellular

mechanisms inhibiting proliferation. In humans, telomeres consist of many kilobases of short
repeats that protect the chromosome ends. They are extended by telomerases, a subgroup of
specialized reverse transcriptase enzymes known as Telomerase Reverse Transcriptases (TERTs).
Because TERT expression is very low in many types of normal human cells, telomeres shorten
slightly every time a cell replicates. Recent studies have discovered that telomeric ends are
transcribed into a lncRNA, TERRA, which acts as an inhibitor of telomerase (Redon,
Reichenbach et al. 2010). In many

cancer cells, alteration of TERRA expression has been observed (Arora, Brun et al. 2012).

Recent studies indicate that several cancer-risk-associated loci are transcribed into

lncRNAs and these transcripts play important roles in tumorigenesis (Cheetham, Gruhl et al.

2013). LncRNAs including POU5F1B (Takeda, Seino et al. 1992), PVT1 (Shtivelman,

Henglein et al. 1989), PRNCR1 (Chung, Nakagawa et al. 2011), POU5F1P1 (Wright, Brown et

al. 2010), CCAT1 (Nissan, Stojadinovic et al. 2012) and CCAT2 (Ling, Spizzo et al. 2013;

Redis, Sieuwerts et al. 2013) have been identified in the 8q24 gene desert region which

harbors multiple cancer risk loci for prostate, breast, ovarian and colon cancer susceptibility

(Pomerantz et al. 2009). ANRIL, a large lncRNA gene spanning 126 kb adjacent to p14/ARF,

is located in a genome-wide association studies (GWAS) “hot spot” linked to many complex

diseases including type-2 diabetes and cancers. Recent studies have shown that multiple

disease associated SNPs mapped to the ANRIL locus may affect ANRIL function differently,

resulting in diverse diseases (Cheetham, Gruhl et al. 2013). However, the functional roles of
lncRNAs in cancer development and progression are still not completely known, and more
investigation is needed to understand them comprehensively.

1.2.4. MiRNAs in cancer

MiRNAs are directly involved in gene regulation by binding to the 3’UTR in their target

mRNAs, and many of them have been implicated in cancer. According to bioinformatic
analysis, miRNAs are thought to regulate ~30% of all genes (Lewis, Burge et al. 2005). In 2001,
Bullrich et al. found chronic lymphocytic leukaemia (CLL) cases with a deletion of about 30 kb

at 13q14, at a chromosomal breakpoint (Bullrich, Fujii et al. 2001). Interestingly, two miRNA

genes, miR-15a and miR-16-1, were found in this region, and loss of these miRNAs was observed

in 70% of CLLs. Following these initial observations, many other miRNAs have also been

identified in chromosomal loci which include regions of LOH, amplification, fragile sites, viral

integration sites, and other cancer associated genomic regions (Calin, Sevignani et al. 2004; Iorio

and Croce 2012). In 2005, Lu et al. presented systematic miRNA profiling in multiple human

cancer samples showing that the altered expression of miRNAs is highly correlated with

developmental lineages and differentiation states of the cancers whereas the classification based

on the mRNA profiles was highly inaccurate (Lu, Getz et al. 2005).

Recently, many approaches have been applied to investigate the connection between miRNAs

and cancer (Witkos, Koscianska et al. 2011). MiRNAs have been shown to have a role in many

known oncogenic and tumor suppressor pathways involved in the pathogenesis of many cancers

such as the regulation of KRAS pathway by miR-143 (Johnson, Grosshans et al. 2005),

phosphatidylinositol-3-kinase (PI3K) pathway by miR-126 and miR-21 (Guo, Sah et al. 2008),

p53 as a transactivator of miR-34a (Chang, Wentzel et al. 2007), regulation of epithelial-

mesenchymal transition (EMT) transcription factors by the miR-200 family (Burk, Schubert et al.

2008) as well as the Wnt/beta-catenin pathway regulation by miR-135 (Nagel, le Sage et al.

2008). Furthermore, the miR-17-92 cluster (miR-17, miR-18a, miR-19a, miR-20a, miR-19b-1

and miR-92-1) on chromosome 13 mediates Myc-dependent tumor promoting effects (Venturini,

Battmer et al. 2007). Figure 8 shows examples of aberrantly expressed miRNAs in colorectal

cancer pathogenesis (Slaby, Svoboda et al. 2009).

Figure 8. MicroRNAs' involvement in colorectal cancer pathogenesis. Deregulation of

miRNAs can influence colon cancer carcinogenesis if their mRNA targets are tumor

suppressor genes or oncogenes. Numerous studies have identified target mRNAs in tumor
suppressor and oncogenic pathways that are involved in the pathogenesis of CRC. Many
target proteins are involved in key signaling pathways of CRC, such as Wnt/beta-catenin,
PI3K, KRAS, and p53 (Slaby, Svoboda et al. 2009).

1.3. Characterization of colorectal cancer

Colorectal cancer (CRC) is a disease of the gastrointestinal tract arising in the epithelial
cells lining the colon (the ascending, transverse, descending, and sigmoid colon) or the
rectum (2012). The development of colon cancer involves a heterogeneous complex of
etiological factors and pathogenic mechanisms (Fearon 2011). It is the third most common
cancer worldwide, and in both men and women in the United States, and the fourth most
common cause of cancer death (Wiseman 2008). The American Cancer Society estimates 102,480 new

cases of colon cancer and 40,340 new cases of rectal cancer for 2013, and about 50,830 deaths

during 2013 (http://www.cancer.org).

Most colon cancer develops slowly over several years beginning as a non-cancerous

polyp on the inner lining of the colon or rectum. The vast majority of colon cancer (about 80%)

arises from adenomatous polyps (Cooper, Squires et al. 2010), which start in cells that form

glands.

The risk of developing colon cancer is influenced by several risk factors including

modifiable risk factors such as environmental exposures, dietary factors, and lifestyle factors

(physical inactivity, obesity, high consumption of red meats, smoking, heavy alcohol use) and

non-modifiable risk factors like a personal or family history of colon cancer or adenomatous

polyps, and chronic inflammatory bowel disease (Wei, Giovannucci et al. 2004; Lin 2009).

About 25% of colon cancer occurs in people with a family history (Cooper, Squires et al.

2010). On the other hand, about 5% to 10% of people who develop colon cancer have inherited

gene defects such as mutations that cause the disease.

1.3.1. Molecular classification of colorectal cancer

CRC results from a relatively uniform and linear sequence of steps caused by both

genetic and epigenetic alterations (Figure 9).

Figure 9. Progressive genetic and epigenetic alterations in the development of CRC.
Inactivation of APC, which encodes a protein involved in cell adhesion and transcription, is
found in up to 85% of all colon cancers. KRAS is mutated in 50-60% of colon cancers.
SMAD4 is involved in the transforming growth factor β (TGF-β) signaling pathway. TP53
mutation tends to be a late event and increases the resistance of cancer cells to apoptosis.
Source: Longo DL, Fauci AS, Kasper DL, Hauser SL, Jameson JL, Loscalzo J: Harrison’s
Principles of Internal Medicine, 18th Edition: www.accessmedicine.com

CRC occurs mostly sporadically, and only about 20-25% of colon cancer patients have a
family history, suggesting an interaction between genes and environmental factors. Indeed, an

accumulation of multiple genetic (Fearon and Vogelstein 1990) and epigenetic alterations (Wong,

Hawkins et al. 2007) has been found in colon epithelial cells that have transformed into

adenocarcinomas. These alterations may be defined on the basis of two molecular features

including DNA microsatellite instability (MSI) status, classified as MSI-high (MSI-H), MSI-low

(MSI-L) and MS stable (MSS), and the CIMP status, classified as CIMP-high, CIMP-low and

CIMP-negative (CIMP-neg). The most common comprehensive molecular classification system

of colon cancer was first proposed by Jass, defined according to MSI and CIMP status in

conjunction with clinical and pathological features (Jass 2007): Type 1 (CIMP-high⁄ MSI-H ⁄

BRAF mutation), Type 2 (CIMP-high ⁄ MSI-L or MSS ⁄ BRAF mutation), Type 3 (CIMP-low ⁄

MSS or MSI-L ⁄ KRAS mutation), Type 4 (CIMP-neg ⁄ MSS) and Type 5 or Lynch syndrome

(CIMP-neg ⁄ MSI-H). Type 4 is the major subtype of CRC (Figure 10) (Jass 2007).

Figure 10. Derivation of molecular CRC groups 1-5

based on CIMP status and MSI status (Jass 2007).
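The five-group scheme above can be read as a simple lookup from CIMP and MSI status to a group number. The following is a minimal sketch under that reading; the function name `jass_type` and the status labels are illustrative assumptions, and the BRAF/KRAS mutation features listed for Types 1 to 3 are treated as characteristic rather than defining, so they are not used here.

```python
from typing import Optional

# Lookup of (CIMP status, MSI status) -> Jass group, as listed in the text.
# Labels ("high", "low", "neg", "MSI-H", "MSI-L", "MSS") are illustrative.
JASS_GROUPS = {
    ("high", "MSI-H"): 1,
    ("high", "MSI-L"): 2,
    ("high", "MSS"): 2,
    ("low", "MSS"): 3,
    ("low", "MSI-L"): 3,
    ("neg", "MSS"): 4,
    ("neg", "MSI-H"): 5,  # Lynch syndrome group
}


def jass_type(cimp: str, msi: str) -> Optional[int]:
    """Return the Jass group (1-5), or None for a combination not listed."""
    return JASS_GROUPS.get((cimp, msi))


if __name__ == "__main__":
    print(jass_type("neg", "MSS"))  # the MSS/CIMP-neg group studied here
```

Combinations that the scheme does not name (for example CIMP-low with MSI-H) return `None` rather than being forced into a group.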

1.3.2. Genetic and epigenetic alterations of CRC

The cellular transformation process includes molecular alterations of oncogenes and

tumor suppressor genes via mechanisms such as point mutations, rearrangements and

amplifications that can disrupt regulated gene expression (Wong, Hawkins et al. 2007). The

earliest genetic change in colon cancer is often the inactivation of the APC (adenomatous

polyposis coli) gene which is a negative regulator of the Wnt signaling pathway (Gregorieff and

Clevers 2005). Also, genetic variations and altered gene expression levels in other tumor

suppressor genes (SMAD2 and TP53), oncogenes (KRAS) and multiple pathways (Wnt/beta-catenin,
TGF-beta and/or base excision repair (BER) pathways) accompany transitions from normal cells to

highly malignant tumor cells (Frosina, Fortini et al. 1996; Bellacosa 2003; Gregorieff and

Clevers 2005; Slattery, Herrick et al. 2011).

About 65-70% of sporadic colon cancer exhibits chromosomal instability (CIN), which

leads to an increased rate of loss or gain of whole chromosomes or parts of chromosomes
(Lengauer, Kinzler et al. 1998). Loeb et al. proposed that cancer cells must acquire intrinsic
genomic instability to increase the rate of new mutations (Loeb, Loeb et al. 2003). The CIN phenotype is

caused by alteration of the chromosome segregation pathway (Pino and Chung 2010). CIN in

colon cancer has been shown to be a marker of poor prognosis (Pritchard and Grady 2011).

A defect in the DNA mismatch repair (MMR) genes leads to microsatellite instability
(MSI). MSI is a condition of genetic hypermutability that results from loss of function of an
MMR gene (Boland and Goel 2010). Cells with abnormally functioning MMR tend to
accumulate errors, and novel microsatellite fragments are created. Microsatellites are short
repeated sequences of DNA with repeat units of 1-6 bp (Queller, Strassmann et al. 1993).

Although the length of these microsatellites is highly variable from person to person (part of

DNA fingerprint), each individual has microsatellites of a set length. Five markers have been

recommended by the National Cancer Institute to screen for MSI (Umar, Boland et al. 2004).

Generally, instability in two or more of the markers is considered a positive result, that is, a
high probability of MSI-H. About 15% of colon cancers display MSI because of either epigenetic
silencing by methylation of a mismatch repair gene, MLH1, or a germline mutation in MLH1,
MSH2, MSH6 or PMS2. The remaining 85% of colon cancers are characterized as MSS (Wong,

Hawkins et al. 2007), but the clinicopathologic features of this group remain to be investigated.
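The marker-based call described above can be sketched as a small function. This is an illustration only: the text states that instability in two of the five markers indicates MSI-H, while calling a single unstable marker MSI-L and none MSS follows the usual reading of the NCI panel and is an assumption here. The function name `classify_msi` is hypothetical.

```python
def classify_msi(unstable_markers: int, total_markers: int = 5) -> str:
    """Classify MSI status from the number of unstable panel markers.

    Assumed thresholds: >= 2 unstable markers -> MSI-H (as in the text),
    exactly 1 -> MSI-L, 0 -> MSS.
    """
    if not 0 <= unstable_markers <= total_markers:
        raise ValueError("unstable_markers must be between 0 and total_markers")
    if unstable_markers >= 2:
        return "MSI-H"
    if unstable_markers == 1:
        return "MSI-L"
    return "MSS"


if __name__ == "__main__":
    for n in range(6):
        print(n, classify_msi(n))
```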

Recently, millions of single nucleotide polymorphisms (SNPs) have been studied by

means of GWAS and the meta-analysis of GWAS. Polymorphisms underlying genetic

susceptibility to colon cancer have been intensively investigated (Table 1) (Migliore, Migheli et

al. 2011).

Table 1. Genomic loci associated with CRC risk.

Several independent GWAS have implicated the most promising cancer risk loci at the

8q24 region (128.0-130 Mb) in multiple epithelial cancers, including colon cancer (Easton,

Pooley et al. 2007; Zanke, Greenwood et al. 2007; Ghoussaini, Song et al. 2008). The 800 kb

region of 8q24 contains multiple cancer risk loci and the MYC proto-oncogene. This region

includes at least three regions that independently influence the risk of prostate cancer (region 2:

128.14–128.28, region 3: 128.47–128.54, and region 1: 128.54–128.62), colon cancer (128.47–

128.54) and breast cancer (128.35–128.51). Interestingly, this region contains no known protein

coding genes, but is bounded at its centromeric end by FAM84B and at its telomeric end by
c-MYC, two candidate cancer susceptibility genes. In addition to c-MYC and FAM84B, the
pseudogene POU5F1P1 and PVT1 within the 128.0- to 130-Mb region of 8q24 have been shown
to be associated with cancer risk. The overexpression of POU5F1P1 in prostate cancer, together
with its location in a region harboring risk-associated genetic variation, suggests that it may
carry functional variants that modulate prostate cancer susceptibility (Kastler, Honold et al. 2010).
Previous studies have revealed various genetic alterations at the PVT1 locus, including
chromosomal translocations, amplifications and SNPs, in human disease (Huppi, Pitt et al. 2012).
The rs6983267

SNP at 8q24.21 has been consistently associated with an increased risk of colon cancer with the

G risk allele (Pomerantz, Ahmadiyeh et al. 2009). Interestingly, signatures of functional elements

such as enhancers have been found at the genomic region spanning rs6983267 (Tuupanen,

Turunen et al. 2009).

Colon cancer can be classified into three subtypes based on the level of methylation, which
reflects epigenetic instability: CIMP-high, CIMP-low, and CIMP-neg. On the genetic level,
CIMP-high cases are characterized by MSI and BRAF mutations and relatively rare KRAS and p53 mutations;

CIMP-low is associated with KRAS mutations and rare MSI, BRAF, or p53 mutations; CIMP-

neg cases have a high rate of p53 mutations, but lower rates of MSI or mutations of KRAS or

BRAF (Shen, Toyota et al. 2007; Ogino and Goel 2008; Hinoue, Weisenberger et al. 2012)

(Figure 11).

Figure 11. Classification of 125 CRCs and heatmap representation of Illumina

HumanMethylation27 BeadChip analysis. DNA methylation profiles are shown for the 1,401
probes with the most variable DNA methylation values (standard deviation >0.2). The color
scale runs from dark blue (low DNA methylation) to yellow (high DNA methylation) (Hinoue,
Weisenberger et al. 2012).

The understanding of epigenetic changes in colon cancer has advanced recently.

Examples of the altered epigenetic events are shown in Figure 12. Aberrant methylation of tumor

suppressors, oncogenes and repetitive elements such as LINE1 (Goto, Mizukami et al. 2009;

Kim, Lee et al. 2010; Migliore, Migheli et al. 2011) and also epigenetic regulation changes by

miRNAs have been identified in colon cancer (Yamakuchi, Ferlito et al. 2008; Liu and Chen

2010; Melo and Esteller 2011) (Figure 12).

Figure 12. Epigenetic alterations in colon cancer. Many signaling pathways are affected by
altered epigenetic events which include DNA methylation and aberrant expression of miRNAs.

1.4. Genome-wide State-of-the-art methods used for epigenetics

1.4.1. SOLiD Next Generation Sequencing (NGS) for miRNAs

Profiling of mature miRNAs in specific tissue types is one of the key approaches to

investigating the biological roles of miRNAs. Considerable effort has been devoted to

developing methods for high throughput detection of miRNAs. Because of the short length of the

mature miRNAs, very little sequence is available to design assays for quantitative PCR or

microarrays for analyzing miRNAs without bias. Moreover, since miRNAs within a family have
similar sequences, often differing by only one nucleotide, it is also difficult to detect a
particular miRNA specifically (Wark, Lee et al. 2008). Northern blotting is one

of the earliest simple methods to detect a single miRNA without chemical or enzymatic

modification of the target miRNA before analysis (Wark, Lee et al. 2008). However, this method
has relatively low sensitivity, is time-consuming, and requires a large amount of starting
RNA (Varallyay, Burgyan et al. 2008).

Recently, the next generation sequencing (NGS) approach to sequencing miRNAs, i.e.
massively parallel high throughput sequencing, has overcome the limitations of quantitative PCR,
microarrays, and northern blotting methods. NGS offers many advantages for profiling miRNA
expression, such as sample throughput and the capability to discover novel miRNAs (Metzker 2010;
Vigneault, Ter-Ovanesyan et al. 2012).

A variety of NGS platforms have enhanced our understanding of how miRNAs affect
diseases including cancer. Common commercially available technologies include 454
pyrosequencing (Roche), MiSeq/HiSeq (Illumina), PGM (Ion Torrent), and the SOLiD system
(Life Technologies).

For my thesis work, I used the SOLiD system to profile miRNA expression. The SOLiD
sequencer uses the sequencing-by-ligation approach following library fragmentation, and uses
an emulsion PCR approach with small magnetic beads to amplify the fragments clonally for
sequencing (http://www.lifetechnologies.com). This method uses two-base encoded probes,
which give the primary advantage of improved accuracy in color calling. A universal primer
complementary to the adaptor sequence is hybridized to the templates, which are then amplified
to cDNA and size selected. Next, the size-selected cDNA libraries are amplified by emulsion
PCR for clonal amplification. Cycles of 1,2-probe hybridization and ligation, imaging, and probe
cleavage are then repeated. The SOLiD NGS chemistry is illustrated in Figure 13. In this method,
fluorescently labeled oligonucleotide probes are ligated to the primer only if they are perfectly
matched to the upstream sequences. This ligated DNA now serves as a primer, and the next
labeled probe is ligated to it if it matches the upstream sequences. The extended product is
then removed and the template is reset with a primer complementary to the n-1 position for a
second round of ligation cycles. Five rounds of primer reset are completed for each sequence tag.
Thus, this method has significantly higher specificity and higher accuracy than the
sequencing-by-synthesis approach.

Figure 13. Overview of SOLiD sequencing chemistry. (http://www.lifetechnologies.com)
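To make the two-base encoding concrete, the sketch below decodes a color-space read into bases. It assumes the commonly described SOLiD scheme in which each color (0 to 3) encodes the transition between two adjacent bases and equals the XOR of their 2-bit codes (A=0, C=1, G=2, T=3), and that the read starts with a known primer base. The function name and the example reads are illustrative, and missing color calls (written as "." in color-space files) are not handled.

```python
# 2-bit base codes; the color between two adjacent bases is the XOR of their codes.
BASE_TO_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
CODE_TO_BASE = "ACGT"


def decode_color_read(read: str) -> str:
    """Decode a color-space read such as 'T3201' into base space.

    The first character is the known primer base; each following digit is a
    color giving the transition from the previous base to the next one.
    The primer base itself is not part of the returned sequence.
    """
    previous = BASE_TO_CODE[read[0]]
    bases = []
    for color in read[1:]:
        previous ^= int(color)  # apply the transition to get the next base
        bases.append(CODE_TO_BASE[previous])
    return "".join(bases)


if __name__ == "__main__":
    print(decode_color_read("T3201"))  # AGGT
    print(decode_color_read("T3301"))  # ATTG: one changed color alters all later bases
```

Because a single miscalled color changes every downstream base when decoded this way, while a true single-base variant changes two adjacent colors, analysis is usually carried out in color space and isolated color changes can be flagged as likely measurement errors.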

1.4.2. Illumina Infinium HumanMeth450 BeadChip for Methylation

DNA methylation microarrays allow a researcher to study methylation at genome scale.
Illumina first offered Illumina GoldenGate DNA Methylation BeadArrays for 1,505 CpGs (Byun,
Siegmund et al. 2009) and Infinium HumanMethylation27 BeadChips for 27,000 CpGs (Kanduri,
Cahill et al. 2010). More recently, the Infinium HumanMeth450 BeadChip was developed to allow
researchers to study comprehensive genome-scale methylation, with expert-selected coverage
and high sample throughput (Sandoval, Heyn et al. 2011). These unique features make it an ideal

solution for epigenome-wide association studies. This chip covers over 450,000 CpGs across a

large number of genes as well as non-coding regions at a single nucleotide resolution. Coverage

is targeted across 99% of RefSeq genes with sites in TSS1500, TSS200, 5’UTR, 1stExon, Gene
body, and 3’UTR (Table 2). Furthermore, it covers 96% of CGIs with additional coverage in CGI

shores and shelves. It also covers non-CpG methylated sites identified in human stem cells,

differentially methylated sites identified in tumor tissues compared to normal tissues and miRNA

promoter regions.

Table 2. HumanMethylation450 BeadChip coverage through gene regions.

Gene location | Description | Genes mapped from UCSC database (% genes covered)
TSS1500 | Region between 200 and 1,500 bp upstream of the transcription start site (TSS) | NM: 17,820 (94%); NR: 2,672 (88%)
TSS200 | Region from the TSS to 200 bp upstream of the TSS | NM: 14,895 (79%); NR: 1,967 (65%)
5’UTR | Untranslated region at the 5’ end | NM: 13,865 (78%)
1stExon | 1st exon | NM: 15,127 (80%)
Body | Region between 1st exon and 3’UTR | NM: 17,071 (97%); NR: 2,345 (77%)
3’UTR | Untranslated region at the 3’ end | 13,042 (72%)
Intergenic | Sites which are not in the above categories |

NM: mRNA confirmed by experimental evidence; NR: non-coding RNA

This BeadChip method is based on a combination of the Infinium I and Infinium II assays,
both of which are performed on bisulfite-converted DNA (Figure 14) (Bibikova, Barnes et al.
2011). Infinium I uses two site-specific probes for each targeted CpG, one designed for the
methylated locus and another for the unmethylated locus. Infinium II uses a single probe per
CpG, with single-base extension incorporating a labeled ddNTP. The level of DNA methylation is determined

by the ratio of the methylated probe intensity and the overall intensity (sum of methylated and

unmethylated probe intensities) and is called a beta-value (Bibikova, Lin et al. 2006).
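The beta-value defined above is a simple ratio, sketched below. The function name `beta_value` and the example intensities are illustrative. The optional `offset` is an assumption not stated in the text: a small constant is often added to the denominator to stabilize the ratio when both intensities are low, and it defaults to zero here so that the function matches the definition given above.

```python
def beta_value(methylated: float, unmethylated: float, offset: float = 0.0) -> float:
    """Beta = M / (M + U + offset), where M and U are probe intensities.

    Negative intensities (possible after background subtraction) are clipped to 0.
    """
    m = max(methylated, 0.0)
    u = max(unmethylated, 0.0)
    total = m + u + offset
    if total == 0:
        raise ValueError("total intensity is zero; beta-value is undefined")
    return m / total


if __name__ == "__main__":
    print(beta_value(3000, 1000))            # 0.75, mostly methylated
    print(beta_value(900, 0, offset=100))    # 0.9
```

A beta-value near 0 indicates an unmethylated site and a value near 1 a fully methylated site.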

Figure 14. Schematic of Infinium I (A) and II (B) technology.

1.5. Research aims

The goal of this introduction is to highlight why identification of epigenetic events in

colon cancer is important scientifically and clinically, and how the identification of genes

affected by epigenetic regulation can be achieved. My thesis project focuses on identification of

aberrant epigenetic events in the MSS/CIMP-neg colon cancer, and includes the identification of

these changes via non-coding RNAs and DNA methylation, using a number of different standard

genetic and epigenetic methods, as well as state-of-the-art techniques such as NGS and Illumina

HumanMethylation450 BeadChip. Figure 15 illustrates the outline of the three aims for this

thesis. The three aims focus on the following epigenetic changes: (1) Profiling
of miRNAs and their associated potential target genes in colon cancer using data from NGS and
Affymetrix Exon arrays, respectively; (2) Identification of small and long novel ncRNAs at the
8q24 region, which contains one of the most relevant colon cancer risk variants, SNP
rs6983267, including an in-depth look at the possible role of these ncRNAs in colon cancer;

(3) Elucidation of the landscape of the genome-scale DNA methylation in colon cancer using the

Illumina HumanMethylation450 BeadChip.

Through these projects, I have conducted studies to understand the aberrant epigenetic

events in colon cancer and how they are related to colon cancer development.

Figure 15. Outline of the three Thesis Aims.

1.6. Significance

Thanks to the technology revolution, state-of-the-art technologies such as microarrays
and NGS give us an excellent way to discover new epigenetic alterations. Moreover, they allow
us to comprehensively understand the landscape of epigenetic changes. My dissertation

project on the genome-wide epigenetic changes including non-coding RNAs and DNA

methylation may further our understanding of the genes and pathways involved in MSS/CIMP-

neg colon cancer development and can give biological insights into possible new genes to be

used as biomarkers.

CHAPTER 2

COMPREHENSIVE PROFILING OF EXPRESSION ALTERATIONS IN

KNOWN AND NOVEL MIRNAS IN COLON CANCER

2.1. Introduction

It is now known that miRNA alterations are involved in the initiation and progression of

human cancer. Next Generation Sequencing (NGS) offers an opportunity to identify these

alterations genome-wide, comprehensively and accurately. An additional advantage of NGS is

the ability to detect expression differences for even the low-abundance miRNAs which may be

functionally significant but cannot be detected by hybridization-based methods (such as

microarrays). The goal of this aim was to comprehensively profile alterations in the expression of

miRNAs that may contribute to the development and progression of the MSS/CIMP-negative

colon cancer.

2.2. Materials and methods

2.2.1. Small RNA extraction and quality checks

Colon tissue samples (tumor and normal) were collected from patients with biopsy-

confirmed adenocarcinoma of the colon. These samples are a subgroup of samples described in

Table 3. The tissues, sectioned by a pathologist, were fresh frozen and stored in liquid nitrogen.

The tissues were immersed in RLT Plus Lysis buffer (Qiagen Inc, Valencia, CA), thawed and

homogenized. This was followed by simultaneous DNA/RNA extraction using the AllPrep

DNA/RNA kit (Qiagen). Isolated total RNA was stored at -80ºC and the DNA at -20ºC.

For small RNA, the flow-through from the RNA-column was collected and miRNA-enriched

fractions were extracted with the RNeasy MinElute cleanup kit according to the manufacturer’s
supplementary protocol (Qiagen). Small RNA fractions contained 5 to 42% miRNA (mean 22%),
measured as the 10-40 nt fraction on the Agilent Small RNA chip.
Sample information is shown in Table 3. Ten fresh frozen colon adenocarcinomas and 10

adjacent normal tissues were collected from colon cancer patients at three participating centers of

the Colorectal Cancer Family Registry (Mayo Clinic, Mount Sinai Hospital, and Cleveland

Clinic).

Table 3. Sample information for NGS miRNA study.

Tumor sample | Tissue type | Gender | Site | KRAS | TNM stage
T1* | Tumor | Female | Right | Mutation | IV
T2 | Tumor | Female | Right | Mutation | III
T3 | Tumor | Male | Right | Wild type | II
T4 | Tumor | Male | Right | Mutation | III
T5* | Tumor | Male | Right | Mutation | N/A
T6* | Tumor | Female | Right | Mutation | II
T7* | Tumor | Male | Right | Wild type | II
T8* | Tumor | Male | Left | Mutation | III
T9* | Tumor | Male | Right | Wild type | II
T10 | Tumor | Male | Left | Mutation | N/A

* Paired normal was also analyzed
N/A: not available

2.2.2. SOLiD sequencing

Small RNA fraction (<200bp) was processed into sequencing libraries using the Small

RNA Expression Kit (SREK, Applied Biosystems, Foster City, CA). Briefly, RNA was ligated

overnight with the “A” adaptors from the kit, reverse transcribed, RNAse H-treated, and PCR

amplified before size selection on polyacrylamide gels to isolate the amplicons with 18-30

nucleotides of insert sequences. In addition to the expected 100-120 bp amplicons there was an

additional distinct band at or just above the 120 bp length containing a 30 nucleotide transcript,

so two libraries were made from each of the 20 samples to obtain a larger range of small RNAs;

one library from the region between the 100 and 120 bp of ligated amplicons for the classic

miRNAs (18-24 bp long) and the other for the 30 bp long small RNAs. After checking the size of

the prepared two sets of libraries on the DNA 1000 chip (Agilent, Santa Clara, CA), libraries

were amplified onto beads using emulsion PCR, deposited onto slides, and sequenced using the

SOLiD sequencing system at the Applied Biosystems (ABI) facility. Results were obtained in a

csfasta format. The ten different samples were distinguished by unique barcodes provided by
labeled amplification primers in the SREK kit, and all ten libraries of a given size were mixed
and sequenced on a single slide. However, since the initial data analysis indicated that a significant

proportion of classic miRNAs were included in the library of the larger small RNA species

(likely due to the inaccuracy of gel based size selection), the reads from the two separate libraries

for each sample were merged before analysis.

2.2.3. Statistical data analysis of SOLiD sequencing

GeneSifter (Geospiza, Inc., Seattle, WA) was used to align the sequences to miRBase

version 14, and Partek Genomics Suite (Partek Inc., St. Louis, MO) was used to carry out

statistical analyses. Up to two mismatches per read were allowed. To quantify and compare
miRNA expression across datasets, corrected read counts were scaled to “reads per million”
(RPM), the most common way to normalize reads in NGS samples, by GeneSifter. Using
the Partek software, fold change filters were applied to the log2-transformed RPM values to
select the miRNAs that were regulated more than 2-fold. P-values were calculated using one-way ANOVA

and adjusted by Bonferroni correction for multiple testing.
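The normalization and filtering steps described above can be illustrated with a short sketch. It is not the GeneSifter or Partek implementation: the function names, the miRNA labels and the read counts are hypothetical, and the pseudocount added before the log2 transformation (to avoid taking the logarithm of zero) is an assumption.

```python
import math


def reads_per_million(counts: dict) -> dict:
    """Scale raw read counts so that they sum to one million per sample."""
    total = sum(counts.values())
    return {name: count * 1_000_000 / total for name, count in counts.items()}


def log2_fold_change(rpm_tumor: float, rpm_normal: float, pseudocount: float = 1.0) -> float:
    """log2(tumor / normal) on RPM values; |value| > 1 means more than 2-fold."""
    return math.log2((rpm_tumor + pseudocount) / (rpm_normal + pseudocount))


def bonferroni(p_values: list) -> list:
    """Bonferroni adjustment: multiply each p-value by the number of tests, cap at 1."""
    n_tests = len(p_values)
    return [min(1.0, p * n_tests) for p in p_values]


if __name__ == "__main__":
    # Hypothetical counts for one sample (100,000 mapped reads in total).
    sample = {"miR-A": 300, "miR-B": 100, "other": 99_600}
    print(reads_per_million(sample)["miR-A"])  # 3000.0
    print(log2_fold_change(7.0, 1.0))          # 2.0, i.e. 4-fold with the pseudocount
    print(bonferroni([0.01, 0.04, 0.5]))
```

Bonferroni is the most conservative of the common multiple-testing corrections, which suits a small candidate list but can discard true differences when many miRNAs are tested.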

2.2.4. Technical Validation and Replication using real-time RT-qPCR

Technical validation and replication analyses were performed using real-time RT-qPCR

and TaqMan assays (Applied Biosystems, Foster City, CA) for mature miRNAs. The cDNA was

synthesized from small RNA (<200 bp) using gene-specific primers according to the Multiplex

RT-TaqMan Assay protocol with preamplification of the RT product for 6 newly identified

miRNAs (miR-549, miR-602, miR-638, miR-935, miR-1180, and miR-1268) and 5 cancer

marker miRNAs (miR-18a, miR-20a, miR-21, miR-31 and miR-143). Reverse

transcription was performed with 0.05x RT primer pools using the following program: 30 min at

16°C, 30 min at 42°C, 5 min at 85°C, and a hold at 4°C. After reverse transcription,

preamplification was done with 0.2x TaqMan miRNA assay pool according to the

PreAmplification Protocol provided by Applied Biosystems’ technical support team. Briefly, 500

pg of small RNA based on the Bioanalyzer Small RNA chip analysis was used for multiplex

reverse transcription, and 2.25 μl of RT product was used for the preamplification step. The

RT-qPCR was performed using 4.5 μl of 1:8 diluted preamplified sample, each specific miRNA

assay, and 2x Universal master mix (Applied Biosystems). All reactions were done in a total

reaction volume of 10 μl using relative quantification by real-time PCR on an Applied

Biosystems 7900HT system. The thermal cycling program used for quantification was as follows:

95°C for 10 min, followed by 45 cycles of 95°C for 15 sec and 60°C for 1 min. Small RNA input

was normalized to the average of two endogenous controls, RNU48 and miR-16, and the

delta-delta Ct method with the formula $2^{-\Delta\Delta C_T}$ was used to calculate the fold change (Livak and

Schmittgen 2001). Each measurement was performed in duplicate, and no-template (water)

controls were included for each assay. Data analysis was performed with RQ Manager 1.2.1

(Applied Biosystems).
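As a worked illustration of the comparative Ct calculation described above, the sketch below computes a fold change from hypothetical Ct values. The function names and numbers are made up for illustration; the actual analysis was done in RQ Manager, and the formula assumes a doubling of product per cycle.

```python
def delta_ct(ct_target, ct_references):
    """Delta Ct: target Ct minus the mean Ct of the endogenous controls."""
    return ct_target - sum(ct_references) / len(ct_references)

def fold_change(dct_sample, dct_calibrator):
    """Relative quantity 2^(-ddCt), with ddCt = dCt(sample) - dCt(calibrator)."""
    return 2.0 ** -(dct_sample - dct_calibrator)

# Hypothetical Ct values; the two references stand for RNU48 and miR-16.
dct_tumor = delta_ct(24.0, [22.0, 20.0])    # 24 - 21 = 3
dct_normal = delta_ct(27.0, [22.5, 19.5])   # 27 - 21 = 6

# ddCt = 3 - 6 = -3, so the tumor has 2^3 = 8 times the normal level.
fc = fold_change(dct_tumor, dct_normal)
print(fc)
```

A lower delta Ct means the target crossed the threshold earlier relative to the controls, so a negative delta-delta Ct corresponds to higher expression in the sample than in the calibrator.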

2.2.5. Sample processing for Affymetrix Exon Arrays

Nine normal colon and 10 tumor samples (including 6 paired samples) analyzed by SOLiD

sequencing were also analyzed on Affymetrix Exon arrays (Affymetrix Inc., Santa Clara, CA). For each

sample, 1 μg of total RNA was first processed using a ribosomal RNA (rRNA) reduction

procedure as suggested by Affymetrix. The rRNA reduction was verified by running the reduced

RNA samples on the Bioanalyzer (Agilent Technologies, Santa Clara, CA). After rRNA

reduction, the Affymetrix GeneChip® Whole Transcript (WT) Sense Target Labeling Assay

(Affymetrix) was used to generate amplified and biotinylated sense-strand DNA targets for

hybridization on GeneChip® Exon 1.0 ST Arrays following the Affymetrix protocol. Briefly,

double-stranded cDNA was derived from 1 μg of concentrated rRNA-reduced RNA using

T7-(N)6 random hexamers. This was followed by in vitro transcription to produce amplified

antisense cRNA, which was converted back to single-stranded sense DNA. Then, 5.5 μg of sense DNA

was enzymatically fragmented, checked on the Bioanalyzer for the appropriate size, terminally

labeled with biotin, and hybridized onto Exon Arrays. After an 18-hour hybridization, the arrays

were washed and stained using the GeneChip® Hybridization, Wash and Stain Kit and the

suggested protocol. The arrays were scanned on the GeneChip® Scanner 3000 7G using the

AGCC (Affymetrix GeneChip® Command Console®) Software to measure the fluorescent

signal intensities at each probe location.

2.2.6. Integrated analysis and Ingenuity pathway analysis

The data used for integration analysis consisted of 715 miRNAs and 18,415 target

mRNAs. The predicted targets for 19 miRNAs were extracted from microCosm and

TargetScan 5.1 using Partek Genomics Suite software and analyzed for correlation with differential

miRNA expression. Correlation analysis of the differentially expressed predicted target mRNAs

for the top 19 miRNAs was conducted using Pearson’s correlation and the Benjamini-Hochberg

false discovery rate (BH-FDR), with a q-value cut-off of <0.05. Ingenuity Pathways Analysis (IPA,

Ingenuity Systems, Inc., Redwood City, CA) was used to identify the biological functions of the

target genes and the involved pathways.
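The correlation step can be sketched for a single miRNA and one predicted target as follows. This is a minimal illustration with hypothetical expression values, not the Partek Genomics Suite procedure, and it omits the significance test and the multiple-testing adjustment.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical values across five samples (three normals, then two tumors).
mirna_log2_rpm = [2.0, 2.5, 3.0, 6.0, 6.5]
target_log2_signal = [9.0, 8.8, 8.9, 7.0, 6.8]

r = pearson_r(mirna_log2_rpm, target_log2_signal)
direction = "negative" if r < 0 else "positive"
print(round(r, 3), direction)
```

In the full analysis this coefficient would be computed for every miRNA and predicted target pair, which is why an FDR correction is applied to the resulting p-values.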

2.3. Results and discussion

2.3.1. Quality checks of small RNAs

To verify the quantity and quality of the extracted small RNA, 1 μl of each RNA sample was

analyzed on the Agilent Bioanalyzer, using RNA 6000 Nano chips for total RNA and Small RNA chips for the small

RNA (Figure 16).

Figure 16. The quality of the total RNAs and the small RNA preparations. Total RNAs and

the small RNAs were checked on Bioanalyzer using the RNA 6000 Nano chip (A) and the

Small RNA chip (B), respectively.

2.3.2. Small RNA library preparation

The amplified cDNA library was made in multiple steps, including ligation of small RNAs to

adaptors that include a defined sequence required for SOLiD NGS, reverse transcription of the

ligated small RNAs, and PCR amplification with “barcode” sequences. To concentrate the

amplified cDNA library and to remove the PCR by-products, size selection was done by

polyacrylamide gel electrophoresis and the purified cDNA library was analyzed on the DNA

1000 chip (Bioanalyzer) (Figure 17).

Figure 17. The quality of the size selected cDNA libraries was checked on the Bioanalyzer

using the DNA 1000 chip.

Interestingly, there was a distinct band at 120 bp containing 30 nucleotides (Figure 17,

bottom panel), so I made two libraries from each of the 20 samples to capture all of the small

RNAs: one from the region between 100 bp and 120 bp for classic miRNAs (18-24 bp long)

and the other from the 120 bp fragments for the longer small RNAs, as recommended by ABI. The prepared

libraries were amplified on beads using emulsion PCR and sequenced by ABI.

2.3.3. Deep sequencing of small RNAs

The sequencing process yielded on average 35.6 million and 31.9 million sequences from

normal (n=10) and tumor tissues (n=10), respectively. Among the total sequencing reads, 11.9%

of the reads in normals and 5.1% in tumors were mapped to miRBase version 14 (Table 4). These

miRNAs were further analyzed for differential expression in colon cancer. The read lengths

of the sequences mapped to miRBase version 14 ranged from 18 nt to 33 nt

(Figure 18) in all samples. The most common sizes were 21 nt (13%), 22 nt (30.4%) and 23 nt

(33.6%), as expected.

Table 4. NGS Mapping Results.

Figure 18. Read length distribution (nt, number of nucleotides) of sequences mapped to miRBase.

The pie chart depicts the percentage of read lengths relative to the total number of reads,

averaged over all 20 samples.

2.3.4. Evaluation and preliminary analyses of the NGS data

For a preliminary feasibility study and to further evaluate the quality of the small RNA

preparations, the expression patterns of well-studied oncogenic (n=13) and tumor suppressor

miRNAs (n=8) in colon adenoma and carcinoma were confirmed first (Figure 19).

Except for three miRNAs (let-7g, miR-200c, miR-320), these known miRNAs were significantly

differentially expressed in our NGS data with 100% and 63% (5/8) concordance in the direction

for the oncogenic and tumor suppressor miRNAs, respectively, at BH-FDR p<0.05. Among those

miRNAs, five selected miRNAs were confirmed by quantitative PCR in 5 normal and 6 tumor

tissues, including 4 oncogenic miRNAs (miR-18a, miR-20a, miR-21 and miR-31) and 1 tumor

suppressor miRNA (miR-143) (Figure 20). A high correlation (on average, r=-0.91) between

SOLiD (reads) and qPCR (delta Ct) was found. These preliminary data showed that the NGS

performed well and that the results were reliable for further analysis.
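The negative sign of the correlation coefficient follows from how Ct values relate to abundance. As a short worked relation, using the delta Ct definition from the Figure 20 legend and assuming ideal amplification efficiency (a doubling of product per cycle, which is an assumption and not stated in the text):

$$\Delta C_T = C_{T,\mathrm{miRNA}} - \tfrac{1}{2}\left(C_{T,\mathrm{RNU48}} + C_{T,\mathrm{miR\text{-}16}}\right), \qquad \text{relative abundance} \propto 2^{-\Delta C_T}$$

Taking log2 of the second expression gives $\log_2(\text{relative abundance}) = -\Delta C_T + \text{const}$, so samples with more sequencing reads for a given miRNA are expected to show lower delta Ct values, and good agreement between the two platforms appears as r close to -1.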

Figure 19. The strongest oncogenic and tumor suppressor miRNA candidates in colon

tumor tissues. The SOLiD data were compared to published data of 21 well-studied

miRNAs (Faber, Kirchner et al. 2009). Big green and red arrows indicate previously

reported changes, and small ones indicate expression change directions in our SOLiD data.

Figure 20. Expression values from SOLiD sequencing (X-axis) plotted against qPCR delta

Ct values (Y-axis). qPCR deltaCt = (Ct miRNA - Ct average of RNU48 and miR-16).

A second preliminary data analysis was performed to see the effect of clinical phenotype

factors and the risk genotype on miRNA expression. This analysis was done using a t-test, and

the p-value was adjusted with Benjamini-Hochberg (BH) or Bonferroni correction. A p-value less

than 0.05 was considered to indicate a significant difference between groups. Table 5 shows

the numbers of significant miRNAs from 21 comparisons including phenotype factors such as

different tissue type (tumor and normal), tissue origin (left and right), gender (male and female),

and the genotype of rs6983267 (GG or TT). Surprisingly, about 35% of analyzed miRNAs

(252/715) were significantly differentially expressed at BH-FDR of <0.05, and differential

expression of the majority of these miRNAs was observed in tumors compared to the normals in

both the pooled and paired analyses. While not all of the 21 comparisons yielded significant

findings, there was also a significant difference between the pooled GG tumors

and GG normals; between the right side tumors and normals; as well as between the male tumors

and male normals, but these findings should be interpreted with caution because of the limited

sample sizes in the reference groups (patients with TT genotype, left tumors, and the female

patients).
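The difference between the raw, Benjamini-Hochberg and Bonferroni columns of Table 5 can be illustrated with a small sketch of the two adjustments. The p-values below are hypothetical and the functions are a plain re-implementation for illustration, not the Partek routines used in the analysis.

```python
def bonferroni(pvalues):
    """Bonferroni-adjusted p-values: p * m, capped at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values (q-values), in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

raw = [0.001, 0.008, 0.012, 0.045, 0.2]  # hypothetical raw p-values

n_raw = sum(p < 0.05 for p in raw)
n_bh = sum(q < 0.05 for q in benjamini_hochberg(raw))
n_bonferroni = sum(p < 0.05 for p in bonferroni(raw))
print(n_raw, n_bh, n_bonferroni)
```

Bonferroni controls the chance of any false positive and is therefore the most conservative, while Benjamini-Hochberg controls the expected proportion of false positives among the reported hits, which is why it typically retains more miRNAs.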

Table 5. Numbers of significant miRNAs in 21 comparisons.

Analysis  Group 1 (control)          Group 2                    Raw p-value   Benjamini and Hochberg  Bonferroni
                                                                Up    Down    Up    Down              Up    Down
1         All normal (n=10)          All tumor (n=10)           261   25      231   21                11    1
2         TT tumor (n=10)            GG tumor (n=10)            28    0       0     0                 0     0
3         TT normal (n=5)            GG normal (n=5)            3     20      0     0                 0     0
4         TT normal (n=5)            TT tumor (n=5)             109   5       0     0                 0     0
5         GG normal (n=5)            GG tumor (n=5)             179   19      52    8                 1     0
6         All normal, paired (n=6)   All tumor, paired (n=6)    403   15      235   4                 1     0
7         TT normal, paired (n=4)    TT tumor, paired (n=4)     130   4       0     0                 0     0
8         GG normal, paired (n=2)    GG tumor, paired (n=2)     144   8       0     0                 0     0
9         All right (n=16)           All left (n=4)             0     2       0     0                 0     0
10        Right tumor (n=8)          Left tumor (n=2)           0     11      0     0                 0     0
11        Right normal (n=8)         Left normal (n=2)          9     22      0     0                 0     0
12        Right normal (n=8)         Right tumor (n=8)          250   19      204   15                10    1
13        Left normal (n=2)          Left tumor (n=2)           18    3       0     0                 0     0
14        All female (n=5)           All male (n=15)            33    2       0     0                 0     0
15        Female normal (n=2)        Female tumor (n=3)         21    2       0     0                 0     0
16        Male normal (n=8)          Male tumor (n=7)           241   23      187   17                3     1
17        Female normal (n=2)        Male normal (n=8)          8     0       0     0                 0     0
18        Female tumor (n=3)         Male tumor (n=7)           24    6       0     0                 0     0
19        KRAS wt in tumor (n=3)     KRAS mt in tumor (n=7)     1     8       0     0                 0     0
20        Stage II tumor (n=4)       Stage III tumor (n=3)      4     18      0     0                 0     0
21        Localized tumor (n=5)      Advanced tumor (n=3)       7     37      0     0                 0     0

2.3.5. MiRNA expression profiles in colon normal and tumor tissues

To determine expression alterations among 715 known miRNAs in colon tumors, I

analyzed the SOLiD NGS data using the Analysis of Covariance (ANCOVA). As I observed in

the preliminary analysis (Table 5), miRNA profiling showed substantial differential expression

patterns between tumors and normals (F-Ratio=6.71), while other factors such as gender (male

and female, F-Ratio=1.46), tumor tissue origin (right and left, F-Ratio=1.15), KRAS mutation

status (F-Ratio=1.08) and TNM stage (II, III and IV, F-Ratio=1.8) had a smaller effect on

miRNA expression (Figure 21A) and were not confounding factors in the tumor versus normal

comparison. Interestingly, Principal Component Analysis (PCA) showed clear separation of

tumors (red) from normals (blue) based on the entire set of the 715 microRNAs analyzed (Figure

21B).

Figure 21. Source of Variation and Principal Component Analysis (PCA). A. Sources of

variation. B. Principal component analysis (PCA) scatter plot of all normalized NGS data

for tumors versus normals. The points are colored and connected to the centroid of each

tissue group, and ellipsoids are drawn for each group as well. X-axis, first principal component

(PC1); y-axis, second principal component (PC2).
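How a first principal component can separate two tissue groups is sketched below with a small power-iteration routine. This is an illustration on hypothetical log2 RPM values for three miRNAs, not the Partek PCA implementation, and power iteration is a swapped-in technique chosen only to keep the example dependency-free.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (loadings, scores) for PC1 via power iteration on centered data."""
    n, d = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in rows]
    v = [1.0 / math.sqrt(d)] * d  # arbitrary unit-length starting vector
    for _ in range(iterations):
        # Multiply v by X^T X (proportional to the covariance matrix), then renormalize.
        scores = [sum(x[j] * v[j] for j in range(d)) for x in centered]
        w = [sum(centered[i][j] * scores[i] for i in range(n)) for j in range(d)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    scores = [sum(x[j] * v[j] for j in range(d)) for x in centered]
    return v, scores

# Hypothetical log2 RPM values: rows are samples, columns are three miRNAs.
normals = [[2.0, 3.0, 8.0], [2.5, 3.2, 7.6], [1.8, 2.9, 8.3]]
tumors = [[6.0, 7.0, 4.0], [6.4, 6.8, 4.5], [5.7, 7.3, 3.8]]

loadings, pc1 = first_principal_component(normals + tumors)
normal_scores, tumor_scores = pc1[:3], pc1[3:]
print([round(s, 2) for s in pc1])
```

When the dominant source of variance is the tissue type, as the F-ratios above suggest, the PC1 scores of the two groups fall on opposite sides of zero.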

Apparent deregulation of 392 (375 up and 17 down) miRNAs out of the 715 mapped

miRNAs was observed at absolute fold change ≥2 with a raw p-value of <0.05. Nineteen

miRNAs (18 up and 1 down) were significantly differentially expressed after Bonferroni

correction at p<0.01 (corresponding to a raw p-value <0.000007). Six of the 19 miRNAs

(miR-30a, miR-31, miR-135b, miR-182, miR-183 and miR-202) have previously been reported in

colon cancer and have also been studied for their effect on putative target gene expression in

colon and/or other cancers (Table 6). Among them, miR-31 showed the largest fold

change in tumors compared to normals in our study (fold change of 41.88). Importantly, the

rest of the miRNAs (miR-220b, miR-365-1, miR-549, miR-588, miR-602, miR-638, miR-935,

miR-937, miR-1180, miR-1268, miR-1292, miR-1909 and miR-1914) were identified for the first time in

our study, suggesting a novel association of these 13 miRNAs with colon cancer (Figure 22A).

Expression levels based on average log2 transformed RPM numbers (Reads Per Million) for the

13 newly identified miRNAs (new-miRNAs) (RPM: 2.66 in 10 normals, RPM: 4.44 in 10 tumors)

were substantially lower than those for the 6 known-miRNAs (RPM: 6.66 in 10 normals, RPM:

9.14 in 10 tumors) (data not shown). Likely for this reason, I was able to successfully amplify

only six (miR-549, miR-602, miR-638, miR-935, miR-1180, and miR-1268) out of the 13 new-

miRNAs using real-time RT-qPCR assays for the mature miRNAs. Subsequently, these 6

miRNAs were successfully technically validated in the samples used for the SOLiD analysis (5

normals and 6 tumors, including 4 paired samples). The extent of differential expression between

the tumor and normal samples was not identical between the NGS and RT-qPCR, but all 6

miRNAs were confirmed as upregulated in tumor versus the normal tissue. To confirm the

relevance of these CRC new-miRNAs, differential expression of five of the newly discovered

miRNAs (miR-549, miR-602, miR-638, miR-935 and miR-1268) was further replicated in

another set of 8 MSS/CIMP-negative paired tumor and normal samples (Figure 22B).

Interestingly, miR-365-1 is actually a pre-miRNA form of miR-365 but I was only able to

examine the expression level of the mature miR-365 because there were no pre-designed

TaqMan assays for the pre-miRNA of miR-365-1. In contrast to the overexpression of pre-miR-365-1

in tumors observed by NGS analysis, downregulation of the mature miRNA was observed in the tumors of

the validation sample set by RT-qPCR.

Table 6. Expression levels and putative targets of 6 differentially expressed known colon tumor miRNAs.

Figure 22A. Tumor vs. normal miRNA expression profiles of 13 newly identified miRNAs.

A. Box plots of gene expression levels for 13 newly identified miRNAs in tumors (n=10, blue

dots) compared to normals (n=10, red dots). Each point represents the normalized miRNA

expression levels for an individual. The median gene expression level for each genotype

specific group is indicated by a line inside each box within the graph. Paired samples are

connected by a line. The p-value indicates the significance of the difference in miRNA expression,

tumor versus normal. The corrected p-value indicates the significance after the Bonferroni

correction for multiple comparisons.

Figure 22B. Quantitative real-time RT-PCR levels of the 6 newly identified miRNAs (new-

miRNAs) in the validation (n=11) and replication (n=16) samples. Expression values were

calculated relative to the average of the RNU48 and miR-16 levels, and assays were

performed in duplicate. The data are shown as mean ± standard error of the mean.

2.3.6. Correlation of expression levels between 19 differentially expressed tumor miRNAs

and their predicted target genes

Expression levels of possible targets of the 19 altered miRNAs were verified using

summarized gene expression levels from Affymetrix Exon arrays. Since one of the normal

samples was not available for the Affymetrix Exon arrays, the integrative analysis was done

using 9 normals and 10 tumors, which were profiled for miRNAs using NGS. In general, the

expression levels of miRNAs were both positively and negatively correlated with their predicted

targets for all the miRNAs analyzed with a Pearson correlation FDR q-value <0.05. Although the

negatively correlated miRNA::mRNA pairs suggest potential direct interactions (i.e. upregulated

miRNA expression correlates with target gene downregulation), positively correlated ones were

also indicated in the analysis, suggesting an indirect mechanism of the gene regulation (not

mediated via the miRNA seed region). The predicted target genes for the 6 known and the 13

new colon cancer miRNAs were extracted from TargetScan 5.1 (TS) and microCosm (MC) using

Partek Genomics Suite and the expression levels of the mRNAs were integrated with the

expression levels of miRNAs using the Pearson correlation analysis.

Using targets from both sources, 166 and 41 genes were significantly correlated with 6

known miRNAs by TS and MC, respectively (data not shown). Among them, 10 genes were

predicted by both databases (r=±0.62-0.83, FDR q<0.05). Since there were no predicted targets

for a subset of the 13 new-miRNAs in either TS or MC, only 7 and 8 new-miRNAs were

analyzed for correlation with their potential targets by TS and MC, respectively. Ninety-two

pairs with 88 genes were significantly correlated at an FDR q<0.05 using the combined TS and MC

databases (data not shown). Table 7 shows the list of significantly differentially expressed

target genes (FDR q<0.05) among correlated targets (with Pearson Correlation FDR q<0.05) for

known and the newly identified colon cancer miRNAs.
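Combining the two prediction sources amounts to simple set operations on the target lists. The sketch below uses a subset of the miR-30a targets listed in Table 7 purely as an illustration; the variable names are hypothetical and this is not the Partek workflow itself.

```python
# Subset of the miR-30a targets listed in Table 7, used only for illustration.
targetscan = {"C13orf18", "C19orf50", "PLXNA1", "CDCA7", "EYA2"}
microcosm = {"C13orf18", "C19orf50", "CSNK1E"}

in_both = sorted(targetscan & microcosm)          # predicted by both databases
in_either = sorted(targetscan | microcosm)        # combined target list
targetscan_only = sorted(targetscan - microcosm)  # unique to TargetScan

print(in_both)
print(len(in_either))
```

Targets predicted by both databases are supported by two independent prediction algorithms, while the combined list gives the broadest set of candidates for the correlation analysis.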

Table 7. Significantly differentially expressed target genes (FDR q<0.05) among correlated

targets (FDR q<0.05). Negative correlations between miRNAs and target genes are indicated by

bolded gene names.

Known miRNAs  TargetScan  microCosm   Newly identified miRNAs  TargetScan  microCosm
hsa-mir-135b  RNF43       TCFL5       hsa-mir-220b             RICH2       -
              MMP11       -           hsa-mir-365-1            ANK3        CLMN
              PCMTD2      -           hsa-mir-365-1            CPT1A       GTF2F2
hsa-mir-182   EFNA5       RPUSD4      hsa-mir-365-1            CNTN4       RPS6
              ZZEF1       TCFL5       hsa-mir-365-1            NR3C2       TNFSF15
              KIAA0513    -           hsa-mir-365-1            PDE4D       INHBA
              NUMB        -           hsa-mir-365-1            MYLK        RPUSD4
              RELL1       -           hsa-mir-365-1            XPO4        TFAP4
              ME2         -           hsa-mir-365-1            CCND1       C13orf18
              TTYH3       -           hsa-mir-365-1            CDC25A      LSM8
              TRIB3       -           hsa-mir-365-1            SET         -
hsa-mir-183   -           PIAS2       hsa-mir-549              NEGR1       TMIGD1
              -           DDX31       hsa-mir-549              KIAA0430    C6orf105
              -           TCFL5       hsa-mir-549              RICH2       NKD1
hsa-mir-202   AKAP13      ABR         hsa-mir-549              MFAP5       -
              SYT7        RABEPK      hsa-mir-549              LPIN2       -
              ACSL6       C13orf18    hsa-mir-549              MYO5A       -
              CCND1       CDH3        hsa-mir-549              RAG1        -
              -           IFRD2       hsa-mir-549              C1orf135    -
hsa-mir-30a   C13orf18    C13orf18    hsa-mir-549              MET         -
              C19orf50    C19orf50    hsa-mir-549              GALNT2      -
              PLXNA1      CSNK1E      hsa-mir-588              KLF12       -
              CDCA7       -           hsa-mir-588              FOXN3       -
              EYA2        -           hsa-mir-588              TNS4        -
              NEDD4L      -           hsa-mir-588              RHBDF1      -
              WDR7        -           hsa-mir-602              NKX2-3      -
              SCARA5      -           hsa-mir-638              EPHA7       TMEM161A
              RELL1       -           hsa-mir-935              APC         TACSTD2
              NR3C2       -           hsa-mir-935              SH3TC2      C13orf18
              LIFR        -           hsa-mir-935              PLXNA1      -
              NDEL1       -           hsa-mir-935              C13orf18    -
              MARCH8      -           hsa-mir-937              -           LSM7
              C14orf43    -
hsa-mir-31    NUMB        ITPA

To explore the biological implications associated with the significantly correlated target

genes from TS and MC targets, for both the known and newly identified colon cancer miRNAs,

269 unique genes were analyzed to identify functional networks that are possibly impacted by

the miRNAs. IPA analysis indicated that the target gene list is enriched for genes having a role in

cell cycle, cancer, and gastrointestinal disease (Table 8). About 13% (36/269) of these genes have

previously been reported to be deregulated in colon cancer (Table 9) either by gene expression

alterations, mutation or methylation. The altered expression profiles agreed with previous

studies for 89% of the downregulated genes (tumor suppressors) and 100% of the upregulated genes (oncogenes) in

colon tumor tissues (Table 9). Interestingly, Wnt/beta-catenin signaling was shown as the top

canonical pathway in the IPA analysis and ten potential targets are involved in this pathway

(Figure 23). Wnt/beta-catenin signaling is one of the best known activated pathways related to

colon cancer development (Kinzler and Vogelstein 1996). The activation of this pathway is

associated with the downregulation or mutation of the APC gene. APC expression was negatively

correlated with miR-935 expression in this study.

In conclusion, the 19 altered miRNAs that were found may have an important role in

colon cancer development through the regulation of target oncogenic or tumor suppressor

mRNAs that play important roles in cell cycle and the Wnt/beta-catenin pathway.

Table 8. Top IPA network of the CRC miRNA target genes.

a score = -log10(p-value)

b Number (percentage) of the genes in the given input gene list which are involved in a

network

Table 9. Significantly correlated target mRNAs that have been reported in other studies and their differential expression in

colon tumor compared to normal tissues in the current study.

Figure 23. Genes significantly correlated with top miRNAs in the Wnt/beta-catenin

signaling pathway in colon tumor tissue. Wnt/beta-catenin pathway genes that were

significantly correlated with significantly altered expression of miRNAs in colon tumors

compared to adjacent normal tissues are indicated by red for positive correlation or green

for negative correlation. Each miRNA that is correlated with the gene’s expression is

shown in the blue box.

2.3.7. Previous findings of known colon cancer miRNAs in cancers

Although bioinformatics tools provide means to predict potential target genes for

miRNAs via computational algorithms, the predicted target genes should be further validated

experimentally. It is important to elucidate the roles of the miRNAs via the target mRNAs

because one miRNA could downregulate several target genes. Thus, I searched the literature to

support and validate my findings, in order to understand the potential roles of the known and

the newly identified miRNAs in this study.

Six of the 19 top miRNAs (miR-30a, miR-31, miR-135b, miR-182, miR-183 and

miR-202) have previously been described among the most consistently deregulated in colon

cancer in general (Bandres, Cubedo et al. 2006; Nagel, le Sage et al. 2008; Kim, Choi et al.

2009; Ng, Chong et al. 2009; Sarver, Li et al. 2010). All of them except miR-30a are

upregulated in colon cancer (indicating oncogenic miRNAs). MiR-30a, which is suggested to

have a tumor suppressor-like function, was downregulated in our data, supporting the findings

by others (Schetter, Leung et al. 2008; Wang, Zhou et al. 2009; Zhong, Bian et al. 2013).

MiR-30a can inhibit mitochondrial fission by suppressing the expression of p53 (Li, Donath et

al. 2010). In our study, upregulation of C13orf18, C19orf50, PLXNA1, CSNK1E and CDCA7

and downregulation of EYA2, NEDD4L, WDR7, SCARA5, RELL1, NR3C2, LIFR, NDEL1,

MARCH8 and C14orf43 were significantly correlated with downregulated miR-30a (|r|>0.62,

FDR corrected q<0.05). Notably, C13orf18 and C19orf50 were identified in both microCosm

and TargetScan 5.1 databases as targets of miR-30a. Genes EYA2, NR3C2 and CSNK1E

function as transcription factors or activators. Transcription factors E2F3 and TFDP1 have

been found overexpressed in various cancers and are thought to have a key role in cell cycle

regulation (Yasui, Arii et al. 2002; Foster, Falconer et al. 2004). In our study, E2F3 was

significantly negatively correlated with miR-30a (r=-0.64, q=0.46), but not differentially

expressed in tumors compared to normals.

In the study of Wang et al., miR-31 expression was positively associated with

advanced TNM stage, suggesting overexpression of miR-31 may be involved in the

development and progression of CRC (Wang, Zhou et al. 2009). In our analysis of a limited

number of samples, there was no difference in miR-31 expression in stages III and IV versus

stage II, only in MSS tumor versus normal. Interestingly, among putative targets of miR-31, the

most notable oncogenic targets are a member of the Wnt signaling pathway, AXIN1, and the

forkhead family transcription factor FOXP3 (as indicated by microCosm). However, I did not

see significant correlations in expression of these two genes with miR-31 (AXIN1: r=0.46,

raw p=0.45; FOXP3: r=0.39, raw p=0.1). The most significantly correlated genes were NUMB

(r=-0.75, q=0.027) and ITPA (r=0.79, q=0.03) and those genes were also significantly

differentially expressed in tumors compared to normals (q<0.05). Interestingly, NUMB, a

negative regulator of Notch-1, has been reported to regulate p53, preventing its degradation

(Colaluca, Tosoni et al. 2008) and down-regulation of NUMB has been studied in advanced

colon cancers (Meng, Shelton et al. 2009).

More than 60% of all colorectal adenomas and carcinomas carry a mutation in the

APC gene. APC is known as a tumor suppressor in its capacity to properly regulate

intracellular β-catenin levels (Powell, Zilz et al. 1992), and it encodes a multifunctional protein

that may participate in several cellular processes such as cell adhesion and migration, signal

transduction, microtubule assembly and chromosome segregation. Studies of the functional

roles of individual miRNAs have demonstrated that miR-135a and miR-135b directly target the

3'-UTR of APC, suppress its expression and activate Wnt/beta-catenin signaling,

and overexpression of miR-135 has been observed in colorectal adenomas and carcinomas

(Nagel, le Sage et al. 2008). However, I did not see any significant correlation between APC and

miR-135b expression in our tumor set (r=-0.21, raw p=0.38). I observed that downregulation of

SMAD4, a key mediator of the TGF-β pathway that is mutated and/or deleted in many

cancers including colon cancer (Ali, McKay et al. 2010), correlated with miR-135b

upregulation (r=-0.66, raw p=0.002), although this gene was not significantly differentially

expressed in tumors compared to normals. I also observed significant positive correlation

between miR-135b and RNF43, TCFL5, MMP11 and PCMTD2 (r>0.7, FDR corrected

q<0.05).

Recently, highly significant differential expression of miR-182 and miR-183 in colon

tumors was reported in both MSI-high and MSS colon tumors (Sarver, French et al. 2009)

and miR-183 was also studied as an oncogene with the transcription factor EGR1 as a

putative target (Sarver, Li et al. 2010). In this study, among the predicted target genes,

I did not see a correlation between miR-183 and EGR1 expression (r=0.25, raw p=0.29), but

instead, downregulation of PIAS2 and upregulation of DDX31 and TCFL5 were associated

with miR-183 overexpression. Downregulation of EFNA5, ZZEF1, KIAA0513, NUMB,

RELL1, and ME2 and upregulation of TTYH3, TRIB3, RPUSD4, and TCFL5 were correlated

with miR-182 overexpression. All of the above genes (except for EGR1) were significantly

correlated with miR-182 or miR-183 (|r|>0.62, FDR corrected q<0.05) as well as

significantly differentially expressed in tumors relative to normals (FDR corrected q<0.05).

2.3.8. Previous findings of colon cancer new-miRNAs in cancers

Up-regulation of miR-202 (9.15 fold compared to healthy controls) has previously been

shown in plasma and biopsy samples of CRC using RT-qPCR based miRNA profiling arrays

(Ng, Chong et al. 2009). A putative target gene of miR-202, CCND1, is a key regulatory

protein of the cell cycle and overexpression of this gene has been associated with increased cell

proliferation and poor prognosis in CRC (Le Marchand, Seifried et al. 2003). In this study, I

observed significant positive correlation between miR-202 and CCND1 and also CDH3, a gene

involved in Wnt/beta-catenin signaling, as well as negative correlation between miR-202 and

AKAP13, a gene related to apoptosis (|r|>0.65, q<0.05). These genes are also significantly

differentially expressed in the MSS tumors compared to normals (with q<0.05).

Although one might expect some markers to be expressed specifically in one tumor

type alone, it is conceivable that others would be expressed in a range of tumors. Here, I

found 13 miRNAs to be potential new colon tumor markers: miR-220b, miR-365-1, miR-549,

miR-588, miR-602, miR-638, miR-935, miR-937, miR-1180, miR-1268, miR-1292, miR-1909 and

miR-1914. To our knowledge, this is the first finding of these miRNAs in resected colon cancer

tissues (except for the finding of upregulated miR-549 in a mixed set of tumors from colon and

rectum; more details below). Cummins et al. previously reported colorectal “microRNAome”

results in 4 colorectal cancer cell lines, 2 colon normals and 2 tumor tissues using a newly

developed approach called miRNA serial analysis of gene expression (miRAGE) (Cummins, He

et al. 2006). In that study, four of our newly identified

miRNAs (miR-549, miR-588, miR-602 and miR-638) were identified in one of the colorectal

cancer cell lines (HCT-116). However, none of them had differential expression in colon

tumors as compared to normal tissue. A few past studies have examined expression levels of

these 13 miRNAs in other cancers. Our study examines the roles of these newly

identified miRNAs in colon cancer based on the observed miRNA::mRNA interactions.

Interestingly, miR-220b, one of our newly identified colon tumor miRNAs, was

withdrawn from the miRBase based on a study by Chiang’s group (Chiang, Schoenfeld et al.

2010), and it has been suggested that the absence of miR-220b in data sets might reflect either

inaccuracy of its annotation or very low expression. Here, I identified this low-abundance

miRNA as consistently upregulated in colon tumors, which suggests that miR-220b may play an

important role as an oncogenic miRNA despite its low abundance (average log2 transformed

RPM: 3.75 in normals, 6.37 in tumors). Downregulation of RICH2 mRNA was significantly

correlated with upregulation of miR-220b in tumors (r=-0.66, FDR corrected q=0.04) and

RICH2 was also differentially expressed in tumors compared to normals (FDR corrected

q=0.04) in our data set.
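
To put the log2 transformed RPM averages quoted above (3.75 in normals, 6.37 in tumors) on a linear scale, the difference of the two means can be exponentiated. The short sketch below does this; note that a difference of mean log2 values corresponds to a ratio of geometric means, so the result is only an approximate fold change, and the function name is illustrative.

```python
def fold_change_from_log2(mean_log2_tumor, mean_log2_normal):
    """Linear-scale ratio implied by a difference of mean log2 values."""
    return 2 ** (mean_log2_tumor - mean_log2_normal)


# Average log2 RPM values for miR-220b quoted in the text.
fold = fold_change_from_log2(6.37, 3.75)
print(f"{fold:.1f}-fold")  # about 6.1-fold
```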

I identified upregulation of miR-365-1, a pre-miRNA of miR-365, in colon cancer by

NGS. Despite the consistent and significant upregulation of this small pre-miRNA in our

study, the mature miR-365 was not significantly deregulated in our NGS data and in fact I

observed downregulation of the mature form by RT-qPCR (Figure 20). Downregulation of the

mature miR-365 was also observed in a recent colon cancer study (Nie, Liu et al. 2012) as well

as in a small study using 2 paired normal and tumor samples and 4 colorectal cancer cell lines

(Cummins, He et al. 2006). The potential target genes (of the mature miRNA) include

members of the RAS oncogene family, such as RAB1B and RAB22A, and ubiquitin specific

peptidase 33 (USP33) (Yan, Huang et al. 2008), but I did not find a significant correlation of these

three genes with miR-365-1. Instead, the most notable targets for miR-365-1 were the

upregulated cell cycle related genes CCND1, INHBA, and CDC25A and the cell death related

genes PDE4D, GTF2F2, RPS6, TFAP4, TNFSF15 and MYLK (|r|>0.66, FDR

corrected q<0.05). These genes were also significantly differentially expressed in tumors

compared to normals (q<0.05).

MiR-549 was very recently found to be upregulated in a study that utilized high-

throughput sequencing for a mixed set of colon and rectal adenocarcinomas (Hamfjord,

Stangeland et al. 2012). No further molecular classification was provided for the tumors, but

the upregulation of miR-549 matches our finding in the MSS/CIMP-negative

colon tumors. Surprisingly, only one more miRNA (miR-135b) was significantly

deregulated in both studies, which may be due to the higher heterogeneity between the tumors

in the Hamfjord study (with a total of 37 significantly deregulated miRNAs). MiR-549 is

thought to be co-transcribed with the co-located KIAA1199, a gene previously found to be

upregulated in colon cancer (Sabates-Bellver, Van der Flier et al. 2007). However, this gene

was not on the predicted target gene list, so I did not examine the expression correlation at this time.

MiR-549 has also previously been reported as an oncogenic miRNA targeting leucine zipper

putative tumor suppressor LZTS1 gene in uveal metastatic melanoma (Radhakrishnan,

Badhrinarayanan et al. 2009), but this gene was not significantly correlated with miR-549 in

our colon cancers. Instead, I found downregulation of the dihydropyrimidine dehydrogenase gene

(DPYD) to be associated with overexpression of miR-549 in our study (r=-0.74, FDR corrected

q=0.04). DPYD encodes the initial rate-limiting enzyme in the degradation of 5-fluorouracil (5-FU),

and is known to be a principal factor in clinical response to the anticancer agent 5-FU. Low

expression of DPYD has been previously observed in colon cancer and one study showed that

aberrant methylation of the DPYD promoter region acted as one of the repressor mechanisms

for DPYD expression (Noguchi, Tanimoto et al. 2004). I also observed DPYD to be

downregulated in the tumor versus normal tissue. Our findings suggest a regulatory mechanism

of DPYD by miR-549.

Yang et al. found that miR-602 expression increased with progression of HBV-related

hepatitis to cirrhosis and hepatocellular carcinoma, and noted that the tumor suppressor RAS

association family 1 gene (RASSF1A) was inhibited in cell lines that highly expressed miR-602

(Yang, Ma et al. 2010). However, I did not observe a correlation between miR-602 and

RASSF1 (r=0.15, raw p=0.55) in colon tissues. I did observe that downregulation of a predicted

target gene, NKX2-3, was significantly correlated with overexpression of miR-602 (r=-0.64,

FDR corrected q=0.04), although the gene expression was not significantly different in tumors

compared to normals. NKX2-3 belongs to a large family of related genes that encode

homeodomain-containing transcription factors and is involved in gut and lymphoid organ

formation (Pabst, Forster et al. 2000).

Overexpression of miR-638 (19p13.2) has been found to be associated with

mesenchymal stem cells (MSCs) (Liu, Fu et al. 2009) and it was found to be upregulated in the

plasma of patients with pancreatic cancer (Ali, Almhanna et al. 2010), while downregulation of

this miRNA was observed in gastric cancer tissue compared to normal gastric tissue (Katada,

Ishiguro et al. 2009; Yao, Suo et al. 2009). MiR-638 is significantly upregulated by 5-FU in

MCF-7 breast cancer cells (Shah, Pan et al. 2010) and it has also been identified as a regulator of

breast cancer 1, early onset (BRCA1) gene via binding to its target site inside the coding

sequence (Nicoloso, Sun et al. 2010). In our study, BRCA1 was not significantly correlated

with miR-638 (r=0.47, FDR corrected q=0.13), but overexpression of TMEM161A and

downregulation of EPHA7 were significantly correlated with upregulation of miR-638

(|r|>0.67, FDR corrected q<0.05). These genes were also significantly differentially expressed

in tumors compared to normals (FDR corrected q<0.05).

Overexpression of miR-935 and miR-937 has previously been identified in cervical

cancer (Lui, Pourmand et al. 2007), although, in contrast, miR-935 was found to be

downregulated in stage III/IV ovarian carcinomas by deep sequencing (Wyman, Parkin et al.

2009). Most interestingly, APC is one of the potential targets of miR-935 (based on the TargetScan

database) and the expression levels of miR-935 and APC were significantly inversely correlated

(r=-0.64, p=0.0028) in our colon cancer data. I also observed significant downregulation of

the Krüppel-like factor 4 (KLF4) to be associated with miR-935 (r=-0.72, FDR corrected q

=0.045) in colon tumor tissues, although this gene was not significantly differentially

expressed in tumors compared to normals (expression in tumors vs normals: p=0.0047, FDR

corrected q=0.08). KLF4 is a zinc finger-containing transcription factor that inhibits cell

proliferation and its downregulation promotes proliferation and differentiation in epithelial

cells, both during development and in tumorigenesis. Furthermore, KLF4 is suggested to act

as a tumor suppressor in colon tissue (Yori, Johnson et al. 2010). A microRNA mimic to

miR-935 has been found to sensitize the HCT-116 CRC cell line to a BCL-2 family inhibitor,

thereby promoting apoptosis (Lam, Lu et al. 2010). Sensitizing miRNAs are expected to be at

low levels in tumors, but surprisingly, I found upregulation of miR-935 in the MSS colon

tumors. Also, interestingly, Shah et al. reported upregulation of miR-935, miR-1180, miR-1268

and miR-1292 in response to the chemotherapeutic drug 5-FU in breast cancer cells (Shah, Pan et

al. 2010). MiR-937 has been found to be significantly upregulated in inflammatory versus non-

inflammatory breast cancer (Lerebours, Cizeron-Clairac et al. 2013). The expression levels of

two potential target genes including ABHD11 (r=0.75, FDR corrected q=0.042) and LSM7

(r=0.77, FDR corrected q=0.034) (microCosm) were significantly correlated with expression of

miR-937 in our study. Moreover, the LSM7 gene was also differentially expressed in tumors

compared to normals (p=0.0003, FDR corrected q=0.042).

MiR-1180 and miR-1268 were found to be upregulated in the plasma of patients with cancers

of pancreas and prostate (Ali, Almhanna et al. 2010). MiR-1180 has been found to be

significantly deregulated by tamoxifen in MCF-7 cells bearing an oncogenic isoform of HER2

(Cittelly, Das et al. 2010). However, there are no significantly correlated predicted targets for

miR-1180 and there are no predicted targets for miR-1268 at all.

Recently, a set of miRNAs called T/B-miRs that are post-transcriptionally regulated by

TGFb/BMP (transforming growth factor b (TGFb)/bone morphogenetic protein (BMP)) were

identified (Davis, Hilyard et al. 2010). The stem region of primary transcripts of T/B-miRs

contains a conserved sequence similar to the Smad binding element (SBE) found in the

promoters of TGFb/BMP regulated genes. MiR-1292 contains the repressive SBE sequence

(R-SBE) (5’-CAGAC-3’) and it is suggested that the biosynthesis of miR-1292 is controlled by

the TGFb-Smad signaling pathway. MiR-1909 and miR-1292 were first identified in human

embryonic stem cells (hESC) by deep sequencing of small RNA libraries and their predicted

targets are related to chromatin remodeling (TargetScan) (Bar, Wyman et al. 2008). However,

there has been no study of this miRNA associated with any disease so far.

MiR-1914, the final miRNA identified in this study, is located at a breakpoint region

(chr20:62,043,262-62,043,341) and has been studied in chronic myelogenous leukemia (CML)

(Albano, Anelli et al. 2010). However, it has not been implicated in any solid cancers

except for the star form, which was found to be associated with the differentiation status in

liver cancer (Murakami, Tamori et al. 2013) and has also been studied in plasma samples of

patients with gastric cancer (Konishi, Ichikawa et al. 2012). Currently, there are no predicted

targets for miR-1914 in either TargetScan 5.1 or microCosm.

2.4. Conclusion

Deep small RNA sequencing in MSS/CIMP-negative colon cancer confirmed

differential expression of 6 known-miRNAs and 13 new-miRNAs, that is, miRNAs that have not been

associated with colon cancer before our study. Most of the 13 are relatively late additions to

miRBase and may not have been included in many pre-designed miRNA assay panels. This,

combined with our observation that all 13 are low-abundance RNAs, may have prevented them

from emerging in other colon cancer studies. Our findings underline the importance of an

unbiased and sensitive analysis platform, such as that provided by a next generation sequencing

application, for any genome-wide transcriptome analysis. The present study points to exciting

directions for future functional gene research: the target genes of the newly

identified miRNAs play roles in cell cycle, cell death, cell proliferation and tumorigenesis.

It will be important to examine the functional significance of the identified miRNA::mRNA

interactions. If their relevant role for colon cancer is confirmed in our follow-up in vitro

studies, these new-miRNAs may be candidates for new biomarkers of colon cancer.

CHAPTER 3

IDENTIFICATION OF NONCODING RNAS IN THE 8q24 REGION

SPANNING THE MULTIPLE CANCER RISK LOCUS SNP rs6983267

3.1. Introduction

Large genome-wide association studies (GWAS) have recently identified CRC-

associated loci on 8q23.3, 10p14, 11q23, 15q13, and 18q21 (Tenesa, Farrington et al. 2008;

Tomlinson, Webb et al. 2008). Most interestingly, germline genetic variations in a very gene-poor

8q24 region (128.1-128.7 Mb) were identified by GWAS in patients that

developed prostate, colon and ovarian cancers. Among them, the SNP rs6983267 has been

considered as the most promising variant for functional assessment (Poynter, Figueiredo et al.

2007; Huppi, Pitt et al. 2012). The G allele of rs6983267 SNP located in 8q24.21 has been

associated with an increased risk of prostate, ovarian, breast, and colon cancer (Esteller 2008;

Tenesa, Farrington et al. 2008). Despite the consistent association between rs6983267 and

cancer risk, the molecular mechanism mediating the risk is still largely unknown. The genomic

region spanning rs6983267 at this 8q24 gene desert region was found to contain DNA enhancer

elements (Jia, Landan et al. 2009) and the risk allele G has been shown to produce a stronger

binding site for the Wnt-regulated transcription factor TCF4 (Tuupanen, Turunen et al. 2009).

The rs6983267 SNP is located in the intergenic region 335 kb upstream of the MYC proto-oncogene,

which is the most frequently amplified protein-coding gene in cancers,

including colon cancer (Beroukhim, Mermel et al. 2010). I hypothesized that the “gene

desert” locus upstream of the c-MYC oncogene that contains rs6983267 may contain non-coding

RNAs that may play a role in colon cancer development.

Aim 1: Identification of small non-coding RNAs (miRNAs) in 8q24 using computational

algorithms. The aim was to identify novel miRNAs around the rs6983267 SNP on

8q24 using computational algorithms in silico.

Aim 2: Genotype-associated expression of known miRNAs in the 8q24 region.

The aim was to study expression levels of known 8q24 region miRNAs depending on the

rs6983267 SNP genotype.

Aim 3: Identification of lncRNAs. A highly conserved lncRNA in the 8q24.21 genomic

region encompassing the rs6983267 SNP was recently found. My aim was to study the

expression of this lncRNA in our colon cancer samples and to try to elucidate the possible role

of this lncRNA in colon cancer carcinogenesis.

3.2. Materials and methods

3.2.1. In Silico prediction of potential miRNAs in the 8q24 region

I utilized several publicly available algorithms to predict novel miRNAs in a 30 kb

region, corresponding to the strongest LD block in the CEU population (representing Caucasians),

that harbors the 8q24 SNP rs6983267. Algorithms “ProMiR II” and “miR-abela” were used for

novel pre-miRNA prediction using the genomic sequence. To further confirm that the candidate

sequences were likely microRNAs, the “MiPred” and “CID-miRNA” were used. Furthermore,

the candidates were aligned to known precursor and/or mature miRNAs on the miRBase

website database.

3.2.2. Total RNA extraction

All tumor samples were sectioned and stained with hematoxylin and eosin, then

reviewed by a pathologist to determine tumor cell content. Tumor tissue samples with >70%

tumor cell content were used for this study. Total RNA was extracted from the tissue samples

using the AllPrep DNA/RNA Mini kit (QIAGEN, Valencia, CA) following the manufacturer’s

recommendations. Isolated RNA quality was checked by Bioanalyzer analysis (Agilent; Foster

City, CA). Sample information is shown in Table 10. Samples were collected from colon

cancer patients at three participating centers of the Colorectal Cancer Family Registry (Mayo

Clinic, Mount Sinai Hospital, and Cleveland Clinic).

Table 10. Sample information for the ncRNA study.

Characteristic    Category          Number   %
8q24 genotype     GG                8        35%
                  TT                3        13%
                  GT                12       52%
Sex               Female            6        26%
                  Male              17       74%
Smoking status    Ex smoker         10       43%
                  Current smoker    3        13%
                  Never smoker      4        17%
                  N/A               6        26%
TNM stage         II                1        4%
                  III               1        4%
                  N/A               21       91%
Location          Left              18       78%
                  Right             5        22%

3.2.3. Reverse Transcriptase Quantitative PCR

cDNA was synthesized using the High Capacity cDNA Reverse Transcription kit (Life

Technologies, Foster City, CA) and 25 ng of cDNA, based on the RNA input, was used for

quantitative PCR analysis in duplicate using the ABI 7900HT real-time PCR system (Life

Technologies) with the appropriate primers. The 2^(-ΔCt) method was used to calculate the

relative abundance of the lncRNA to GAPDH.
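
As a brief illustration of the 2^(-ΔCt) calculation, the sketch below computes the abundance of a target relative to a reference gene from duplicate Ct values. It assumes that duplicate wells are combined by averaging their Ct values (the text states only that reactions were run in duplicate), and the Ct values in the usage lines are hypothetical.

```python
def relative_abundance(ct_target, ct_reference):
    """Relative abundance by the 2^(-dCt) method.

    dCt = mean Ct of the target minus mean Ct of the reference gene.
    Averaging the replicate wells is an assumption of this sketch.
    """
    mean_target = sum(ct_target) / len(ct_target)
    mean_reference = sum(ct_reference) / len(ct_reference)
    delta_ct = mean_target - mean_reference
    return 2 ** (-delta_ct)


# Hypothetical duplicate Ct values for the lncRNA and for GAPDH.
lncrna_ct = [30.0, 30.0]
gapdh_ct = [20.0, 20.0]
print(relative_abundance(lncrna_ct, gapdh_ct))  # 2^-10 = 0.0009765625
```

A higher Ct for the target than for the reference gives a value below 1, so low-abundance transcripts such as lncRNAs typically yield small relative values.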

3.2.4. Affymetrix Genome-Wide Human SNP 6.0 Array

Samples were genotyped using the Affymetrix Genome-Wide Human SNP 6.0 Array.

DNA samples were processed, labeled and hybridized according to the manufacturer's

recommendations. All arrays were scanned on the GeneChip® Scanner 3000 7G using the

Affymetrix GeneChip Command Console (AGCC) Software to measure the fluorescent signal

intensities at each probe location.

3.3. Results and discussion

3.3.1. Identification of novel miRNAs in the 8q24 region using computational algorithms

Bioinformatics-computational algorithms predict new miRNAs by homology and

examine them for the stem-loop structure. One advantage of this method is that a

large number of sequences can be scanned and examined for the characteristic hairpin structure

in pri-miRNA and pre-miRNA precursor structures to predict the existence of miRNAs. To

predict the novel miRNAs in a 30kb region corresponding to the LD block harboring the 8q24

SNP rs6983267 (hg18, chr8:128,472,000-128,501,999), I first used ProMiRII (Nam, Kim et al.

2006) and miR-abela (Sewer, Paul et al. 2005) (Figure 24) which are algorithms for unknown

pre-miRNAs from genomic targets. Eleven and 21 potential pre-miRNAs were identified by

ProMiRII (red arrow) and miR-abela (blue arrow), respectively. The genomic locations of four

candidates predicted by both algorithms partially overlapped (green arrow); they are shown as identically

colored numbers in Table 11. Seven candidates marked with a red star included SNPs. Examples of

secondary structures for predicted pre-miRNAs by ProMiRII are shown in Figure 25.
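
The comparison of the two prediction sets by genomic position can be sketched as a simple interval-overlap check. This is an illustration only: the candidate coordinates below are hypothetical placeholders inside the scanned region, not the predicted pre-miRNAs, and only the region coordinates (hg18, chr8:128,472,000-128,501,999) come from the text.

```python
# Scanned region from the text (hg18, chr8), inclusive coordinates.
REGION = (128_472_000, 128_501_999)
region_length = REGION[1] - REGION[0] + 1  # 30,000 bp


def overlaps(a, b):
    """True if two inclusive (start, end) intervals share at least one base."""
    return a[0] <= b[1] and b[0] <= a[1]


def overlapping_pairs(set_a, set_b):
    """All pairs (a, b) of intervals from the two sets that partially overlap."""
    return [(a, b) for a in set_a for b in set_b if overlaps(a, b)]


# Hypothetical candidate intervals standing in for the two prediction sets.
promir_candidates = [(128_473_000, 128_473_090), (128_480_500, 128_480_585)]
abela_candidates = [(128_473_050, 128_473_140), (128_490_000, 128_490_080)]

shared = overlapping_pairs(promir_candidates, abela_candidates)
print(region_length, shared)
```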

Figure 24. In Silico-predicted pre-miRNAs in the 30kb region flanking the 8q24 SNP

rs6983267.

Figure 25. Examples of secondary stem-loop structures of predicted miRNA precursors

from 3’region of rs6983267 by ProMiRII.

Next, MiPred (Jiang, Wu et al. 2007) and CIDmiRNA (Tyagi, Vaz et al. 2008) algorithms were

used to distinguish the real pre-miRNAs from other hairpin sequences such as pseudo pre-miRNAs

among the 32 pre-miRNA candidates (Table 11). Furthermore, the candidates were

aligned with known precursors and mature miRNAs across all the available species (mmu;

mouse, hsa; Human, mo; morpholino, bta; bos taurus, ppt; physcomitrella patens) in the

miRBase (http://www.mirbase.org). Three out of the eleven predicted candidates by ProMiRII

and five out of twenty-one predicted by miR-abela were indicated as potential real miRNAs by

MiPred (Table 11). Table 11 shows the genomic location (hg18), length and free energy of

each candidate. Seven out of the 32 pre-miRNAs had high absolute values,

beyond the optimal cut-off of 0.6, by CID-miRNA. However, none of the 32 candidates

matched any of our reads from the colon cancer NGS (Chapter 2). Furthermore, while I was

conducting this project, another study (on prostate cancer) also failed to find any evidence for

significant novel miRNAs in this genomic region using NGS (Pomerantz, Beckwith et al.

2009). This suggests either that it is difficult to identify miRNAs computationally or that there are indeed

no novel miRNAs in this region.

Table 11. Lists of predicted pre-miRNAs by ProMiRII and miR-abela.

The colored numbers indicate partially overlapping pre-miRNAs predicted by the two

algorithms, ProMiRII and miR-abela.

3.3.2. Altered expression of five known miRNAs located in the 8q24 region

While no known miRNAs reside in an approximately 800 kb genomic area at 8q24.1

encompassing the SNP rs6983267, five miRNAs (miR-1204, miR-1205, miR-1206, miR-1207,

miR-1208) were recently found in the region of the non-coding PVT1 locus on 8q24, 400 kb

downstream of the rs6983267 (Huppi, Pitt et al. 2012). To study if the expression levels of

those miRNAs were associated with colon cancer, as well as with the risk allele (G) of

rs6983267 SNP, in cis, the expression levels of these miRNAs were studied in the SOLiD NGS

data (Figure 26) (detailed materials and methods are shown in Chapter 2).

Figure 26. Expression levels of five known miRNAs located in the 8q24 region by

different tissue types (10 tumors and 10 normals) and the genotype of the rs6983267 SNP

(5 GG versus 5 TT for each group). Green and purple dots represent the normal and tumor,

respectively. Blue dots represent the TT and red dots the GG genotype.

The microRNAs miR-1204, miR-1205, miR-1207, and miR-1208 demonstrated

differential expression levels by colon tissue type (tumor or normal, Figure 26), indicating

an association of these miRNAs with colon cancer. This finding is supported by the report of

Huppi et al. that enhanced expression of hsa-miR-1204 (the precursor of

miR-1204) was seen in two colon cell lines (HCT-116 and COLO-320) and breast cancer cell

lines (Huppi, Volfovsky et al. 2008). The genotype of rs6983267 was confirmed by using the

Affymetrix Genome-Wide Human SNP 6.0 Array on germline DNA and the samples were

separated into GG and TT genotypes. Interestingly, the expression level of miR-1204 was

significantly related to the GG genotype in both colon tumor (p=0.011) and adjacent normal

(p=0.012) tissues, suggesting a potential association of this miRNA with the rs6983267 SNP.

Since the genomic region spanning rs6983267 was found to contain DNA enhancer elements

(Jia, Landan et al. 2009), this miRNA may be deregulated by the SNP.

3.3.3. Altered expression of novel lncRNAs in the 8q24 region

Inspired by the localization of the SNP rs6983267 (the most promising risk variant in

colon cancer) within a highly conserved region of the genome (Sotelo, Esposito et al. 2010)

and a recent discovery of transcription of UTRs (Calin, Liu et al. 2007), our collaborator found

novel lncRNAs encompassing rs6983267 SNP by RACE (rapid amplification of cDNA ends)

in bone marrow cDNA (Ling, Spizzo et al. 2013). I identified significantly upregulated

expression of this lncRNA, CCAT2, in 23 colon tumor tissues as compared to the 23 paired adjacent

normals (p<0.002, nonparametric Mann-Whitney-Wilcoxon test) (Figure 27A). To study the

role of this lncRNA in colon cancer, which may be mediated via the rs6983267 SNP, I

examined if the rs6983267 SNP status affects the differential expression of the CCAT2 between

the colon tumor and adjacent normal tissues. Interestingly, higher fold changes of CCAT2

expression between normal and tumors were found in GG (N=8) and GT (N=12) samples

compared to those with the TT (N=3) genotype (Figure 27B). Although a significant difference

was observed in the slightly larger Italian cohort by our collaborator (p=0.009) for GG (N=19)

versus TT (N=24) in CRC (data not shown), no statistical difference was observed in our study

between the two genotypes (GG vs TT) in either tumor or normal samples, probably due to

the limited sample size. A significant difference was observed only for GT (N=12)

versus TT (N=3) in the tumor samples (p<0.048). However, these data should be further

validated in a larger number of samples to determine whether this lncRNA is indeed associated

with the rs6983267 cancer risk SNP.
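
To illustrate how small genotype groups limit a rank-based comparison, the sketch below implements a two-sided exact Mann-Whitney test by enumerating all group assignments. It is a minimal sketch under stated assumptions: the text names the Mann-Whitney-Wilcoxon test for the tumor versus normal comparison but does not state which test was used between genotypes, and the function names and the values in the usage lines are hypothetical.

```python
from itertools import combinations


def midranks(values):
    """1-based ranks of the values, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def mann_whitney_exact(x, y):
    """Return (U for x, two-sided exact p) by enumerating all group assignments."""
    n1, n2 = len(x), len(y)
    ranks = midranks(list(x) + list(y))

    def u_from(rank_sum):
        return rank_sum - n1 * (n1 + 1) / 2

    u_observed = u_from(sum(ranks[:n1]))
    mean_u = n1 * n2 / 2
    observed_deviation = abs(u_observed - mean_u)
    total = 0
    extreme = 0
    for subset in combinations(range(n1 + n2), n1):
        total += 1
        u = u_from(sum(ranks[i] for i in subset))
        if abs(u - mean_u) >= observed_deviation - 1e-9:
            extreme += 1
    return u_observed, extreme / total


# Hypothetical expression values for two small groups.
u, p = mann_whitney_exact([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
print(u, p)  # 0.0 0.1
```

In this exact test, even complete separation of 3 samples from 12 gives a two-sided p-value of 2/455 (about 0.0044), and 3 versus 8 gives 2/165 (about 0.012), which shows how little room the TT group of three samples leaves for detecting a difference.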

Figure 27. Elevated expression of a novel lncRNA, CCAT2, in tumors and the

differential expression between GG, GT and TT samples. A. Tumor and normal: relative

expression levels of CCAT2 to GAPDH in 23 normal and the paired tumor tissues. B and C.

Expression levels of CCAT2 in tumor (B) and normal (C) tissues by rs6983267 genotype.

The data are presented as box-whisker plots showing the 25th percentile (lower box), median

(line), and the 75th percentile (upper box).

3.4. Conclusion

Although no evidence was found for significant miRNA transcription within the 8q24

colon cancer risk locus using computational algorithms and NGS, the expression level of one

of the five miRNAs in 8q24 (miR-1204) was significantly associated with the GG genotype in both

colon tumor (p=0.011) and adjacent normal (p=0.012) tissues, suggesting a potential

association of this miRNA with the rs6983267 SNP. Interestingly, a lncRNA encompassing the

rs6983267 (called CCAT2) was also identified and confirmed in our sample set. Moreover,

higher expression of CCAT2 was observed in colon tumors compared to matched normal

tissues, suggesting an oncogenic role in colon cancer development. The discovery of

CCAT2 may point to a novel mechanism involving ncRNAs in colon

cancer pathogenesis and may provide a potential new diagnostic and/or prognostic marker if

further confirmed in a large number of independent samples.

CHAPTER 4

LANDSCAPE OF ALTERED METHYLATION IN COLON CANCER

4.1. Introduction

Aberrant methylation of tumor suppressors, oncogenes and repetitive elements such as

LINE1 has been identified in colon cancer, but the studies so far have focused on small

numbers of specific genes. Therefore, there is limited knowledge of the genome-wide

methylation events in the colon and how they may contribute to colon carcinogenesis. I

hypothesize that altered methylation in various functional genomic locations (promoter, gene

body, UTR) of the CGIs as well as the surrounding regions, and at imprinted genes, may

contribute to the development and progression of colon cancer via disturbed gene regulation.

4.2. Materials and methods

4.2.1. Information on Patient Specimen

A total of 40 MSS/CIMP-neg colon tumors and 30 colon normal tissues, including 26

pairs, were collected from colon cancer patients at three participating centers of the Colorectal

Cancer Family Registry (Mayo Clinic, Mount Sinai Hospital, and Cleveland Clinic). Sample

information is shown in Table 12.

Table 12. Sample information for methylation analysis.

4.2.2. Total RNA extraction

All tumor samples were sectioned and stained with hematoxylin and eosin, then

reviewed by a pathologist to determine tumor cell content. Tumor samples with >70% tumor

cell content were used for this study. Total RNA was extracted from the tissue samples using

the AllPrep DNA/RNA Mini kit (QIAGEN, Valencia, CA) following manufacturer’s

recommendations. Isolated RNA quality was checked by Bioanalyzer analysis (Agilent; Foster

City, CA).

4.2.3. Affymetrix Exon Arrays

1 μg of total RNA for each sample was first processed using a ribosomal RNA (rRNA)

reduction procedure as suggested by Affymetrix (Affymetrix, Santa Clara, CA). The rRNA

reduction was verified by running the reduced RNA samples on the Bioanalyzer (Agilent

Technologies, Santa Clara, CA). After rRNA reduction, the Affymetrix GeneChip® Whole

Transcript (WT) Sense Target Labeling Assay (Affymetrix) was used to generate amplified and

biotinylated sense-strand DNA targets for hybridization on GeneChip® Exon 1.0 ST Arrays

following the Affymetrix protocol. Briefly, double stranded cDNA was derived from 1 μg of

concentrated rRNA-reduced RNA using T7-(N)6 random hexamers. This was followed by in

vitro transcription to produce amplified antisense cRNA, which was converted back to single-stranded

sense DNA. The sense DNA (5.5 μg) was enzymatically fragmented, checked on

Bioanalyzer for the appropriate size, terminally labeled with biotin and hybridized onto Exon

Arrays. After an 18-hour hybridization, the arrays were washed and stained using the

GeneChip® Hybridization, Wash and Stain Kit and the suggested protocol. The arrays were

scanned on the GeneChip® Scanner 3000 7G using the AGCC Software to measure the

fluorescent signal intensities at each probe location.

4.2.2. DNA Extraction and Bisulfite Conversion

DNA was extracted from the tissue specimens using the AllPrep DNA/RNA Mini kit

(Qiagen Inc, Valencia, CA). The purity and quantity of extracted DNA were examined by the

NanoDrop-2000 (Thermo Scientific, Wilmington, DE) and the integrity of the DNA was

checked by agarose gel electrophoresis. Bisulfite conversion of 500 ng of DNA was performed

on each sample according to the manufacturer’s recommendations for the Methylation450

BeadChip using the EZ DNA Methylation kit (Zymo Research, Irvine, CA). The treatment

protocol includes 16 cycles of denaturing at 95°C for 30 sec and incubation at 50°C for 60 min,

as well as a final step of holding at 4°C.

4.2.3. HumanMethylation450 BeadChips

Figure 28. The workflow for HumanMethylation450 BeadChips.

HumanMethylation450 BeadChips (Illumina, San Diego, CA) were used to analyze the genome-wide

DNA methylation profiles across 485,577 CpGs. These CpGs cover 96% of the known CpG islands

and 99% of the NCBI Reference Sequence (Illumina) genes, with an average of 17 CpGs per gene

distributed across the TSS1500, TSS200, 5’UTR, 1st exon, gene body, and 3’UTR regions

(Table 2 in Chapter 1). The 485,577 cytosine positions in the genome include

482,421 (99.35%) CpG dinucleotides, 3,091 (0.64%) CNG targets, and 65 (0.01%) SNP sites.

The workflow is shown in Figure 28. Four µl of bisulfite-converted DNA was used for

hybridization onto the HumanMethylation450 BeadChips, following the Illumina Infinium HD

Methylation protocol. This consisted of a whole genome amplification step followed by

enzymatic end-point fragmentation, precipitation and resuspension. The resuspended samples

were hybridized onto the BeadChips for 16 hours at 48°C. After hybridization, the

unhybridized and non-specifically bound DNA was washed away, followed by single

nucleotide extension using the hybridized bisulfite-treated DNA as a template. The Illumina

iScan SQ scanner was used to create images of the single arrays and the intensities of the

images were extracted using GenomeStudio (v.2011.1) Methylation module (v.1.9.0) software.

4.2.4. Raw data normalization

The data was normalized using the ‘Background Subtraction’ and ‘Normalization to

Internal Controls’ methods offered by the GenomeStudio software. First, the background

subtraction value was derived from the signals of built-in negative control bead types for each

channel, setting the background level at the 5th percentile of the negative controls in the given

channel. Background was then subtracted from probe intensities in the same channel. Any

intensity that became negative was set to 0. Secondly, the internal control probe pairs on the

HumanMethylation450 BeadChips were utilized for normalization. The normalization control

probe pairs (over 90 of them) are designed to target the same region within housekeeping genes

and have no underlying CpGs in the probe. For normalization, probe intensity in the given

sample was multiplied by a constant normalization factor (for all samples) and divided by the

average of normalization controls in the probe’s channel in the given sample.
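The two normalization steps described above can be sketched as follows. This is a minimal illustration and not the GenomeStudio implementation: the function names, the nearest-rank percentile rule, the toy intensities and the value of the normalization factor are all assumptions made for the example.

```python
import math

def background_level(negative_controls, percentile=5.0):
    """Nearest-rank percentile of the negative-control intensities.
    The nearest-rank rule is an assumption; the software may interpolate."""
    ordered = sorted(negative_controls)
    rank = max(1, math.ceil(percentile / 100.0 * len(ordered)))
    return ordered[rank - 1]

def subtract_background(intensities, background):
    """Subtract the channel background; negative results are set to 0."""
    return [max(0.0, x - background) for x in intensities]

def normalize_to_controls(intensities, norm_controls, factor):
    """Multiply by a constant factor and divide by the mean of the
    normalization controls of the same channel and sample."""
    control_mean = sum(norm_controls) / len(norm_controls)
    return [x * factor / control_mean for x in intensities]

# Toy data for one channel of one sample (hypothetical values).
negatives = [100.0, 120.0, 90.0, 110.0, 95.0, 105.0, 130.0, 85.0, 115.0, 125.0]
bg = background_level(negatives)
probes = subtract_background([80.0, 1085.0, 5085.0], bg)
normalized = normalize_to_controls(probes, norm_controls=[2000.0, 3000.0], factor=5000.0)
print(bg, probes, normalized)
```

Because the same constant factor is used for every sample, the division by each sample's own control mean is what brings the samples onto a common intensity scale.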

4.2.5. Initial filtering of beta-values

The methylation score for each CpG site is represented by a beta-value calculated

according to the normalized probe fluorescence intensity ratios between methylated and

unmethylated signals. Beta-values vary between 0 (fully unmethylated) and 1 (fully

methylated). Every beta-value on the HumanMethylation450 BeadChip is accompanied by a

detection p-value indicating signals significantly greater than background. Any sites with

detection p-values greater than 0.05 were filtered out before further analysis. Finally, we

excluded probes that are designed for sequences on either the X or Y chromosome. Average

delta-beta-values indicating the differential methylation between colon cancer and the adjacent

normal tissue were calculated by subtracting the average beta-value of pooled adjacent normal

tissues from that of pooled colon cancer samples, so that negative values indicate hypomethylation in tumors.
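As a rough illustration of the filtering and delta-beta steps, the sketch below computes a beta-value, applies the detection p-value and sex-chromosome filters, and takes the tumor-minus-normal difference of mean beta-values, which matches the sign convention used in the results (negative values for hypomethylation in tumors). The function names are hypothetical. The offset of 100 in the beta-value formula is the commonly cited Illumina default and is an assumption here, as is the choice to drop a probe when any sample fails detection.

```python
def beta_value(meth, unmeth, offset=100.0):
    """Beta = M / (M + U + offset), with negative intensities floored at 0.
    The offset of 100 is assumed (commonly cited Illumina default)."""
    meth, unmeth = max(meth, 0.0), max(unmeth, 0.0)
    return meth / (meth + unmeth + offset)

def keep_probe(detection_pvalues, chromosome, alpha=0.05):
    """Keep autosomal probes whose detection p-value is <= alpha in every
    sample (the 'every sample' rule is an assumption)."""
    if chromosome in ("X", "Y"):
        return False
    return all(p <= alpha for p in detection_pvalues)

def delta_beta(tumor_betas, normal_betas):
    """Mean tumor beta minus mean normal beta; negative means hypomethylation
    in tumor relative to normal."""
    mean_tumor = sum(tumor_betas) / len(tumor_betas)
    mean_normal = sum(normal_betas) / len(normal_betas)
    return mean_tumor - mean_normal

# Toy example (hypothetical values).
print(beta_value(900.0, 0.0))
print(keep_probe([0.01, 0.20], "1"), keep_probe([0.01, 0.02], "X"))
print(delta_beta([0.2, 0.4], [0.7, 0.9]))
```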

4.2.6. Statistical analysis of differential methylation

To address the heteroscedasticity of beta-values, they were converted to M-values for the

statistical analysis and the Partek Genomics Suite (Partek Inc., St. Louis, MO) was used for the

statistical analysis of differential methylation. First, data quality control (QC) analyses on the

normalized average beta-values generated by GenomeStudio were performed. These

included graphing the sample histograms for signal distributions (Figure 29). For the actual

differential methylation analysis, a multivariate ANOVA (ANCOVA), including factors such as

tissue (tumor vs. normal), stage and scan date, was performed to evaluate the contribution of

these factors to differential methylation. Heat maps were created using the Partek Genomics

Suite. For hierarchical clustering of the samples (tumors and normals), the Euclidean distance

was used with average linkage.

4.2.7. Ingenuity pathway analysis (IPA)

The Ingenuity Pathway Analysis (IPA) program (http://www.ingenuity.com/index.html)

was used to identify the potentially affected gene networks, functional categories and

canonical pathways related to colon cancer. IPA ranks gene networks by a score (-log (p-value))

that takes into account the number of focus genes and the size of the network.

4.3. Results and discussion

4.3.1. Aims of data analysis

There were five major aims for the data analysis: 1) To identify differentially methylated

CpGs genome-wide in 26 paired samples; 2) To study the biological functional roles of genes

covered by differentially methylated (DM) CpGs; 3) To identify specific colon cancer DM

CpGs by a comparison to hepatocellular carcinoma; 4) To identify DM CpGs in imprinted

genes; 5) To identify correlations between methylation and miRNA expression. The results of

these analyses are described in the following sections.

4.3.2. Conversion of beta-values to M-values

High-throughput statistical analysis using ANOVA assumes homoscedasticity, that is, the

variance of the measured variable should be approximately constant across its range.

Beta-values are heteroscedastic, with the variance compressed near 0 and 1, so they were

transformed to M-values, which are more appropriate for ANOVA (Du,

Zhang et al. 2010). M-values were computed in the Partek Genomics Suite by applying the logit

transformation (Figure 29). In addition, logit transformation reduces Infinium type I and type II

probe bias. Figure 30 shows histograms of beta-values and M-values for all samples analyzed

by HumanMethylation450 BeadChip interrogating the entire 485,577 CpGs. The histogram of

M-values shows a clear bimodal distribution with one positive (methylation) and one negative

(unmethylation) peak. In contrast, beta-values are heavily concentrated in the low (between 0

and 0.1) and high (between 0.7 and 1.0) ranges. The range of beta-values is between 0 and 1, which

can be interpreted as the percentage of DNA methylation at a given CpG site across the cell

population in a sample. The beta-value therefore has the more intuitive biological interpretation.

The M-value is statistically more appropriate for

the differential analysis of methylation levels, but it is difficult to interpret DNA

methylation directly from M-values. Therefore, Du et al. recommend using the M-value method

for performing differential methylation analysis and including the beta-values to report the

results (Du, Zhang et al. 2010).
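The logit transformation referred to here is M = log2(beta / (1 - beta)). A minimal sketch is given below; it is not the Partek implementation, and the function names and the small clamp `eps` (which keeps beta-values of exactly 0 or 1 from producing an undefined logit) are assumptions made for illustration.

```python
import math

def beta_to_m(beta, eps=1e-6):
    """Base-2 logit of a beta-value. eps is a hypothetical clamp that keeps
    beta away from exactly 0 or 1, where the logit is undefined."""
    b = min(max(beta, eps), 1.0 - eps)
    return math.log2(b / (1.0 - b))

def m_to_beta(m):
    """Inverse transform, for reporting results on the beta scale."""
    return 2.0 ** m / (2.0 ** m + 1.0)

# Equal steps in beta near the extremes become larger steps in M.
for beta in (0.05, 0.2, 0.5, 0.8, 0.95):
    print(f"beta={beta:.2f}  M={beta_to_m(beta):+.2f}")
```

A beta-value of 0.5 maps to an M-value of 0, and beta-values of 0.2 and 0.8 map to approximately -2 and +2, which is why the M-value histogram shows one negative and one positive peak.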

Figure 29. M-value transformation to address the issue of heteroscedasticity. (Source:

Institute of Genetic Medicine)

Figure 30. Histograms of beta-values (A) and M-values (B) across the total of 485,577 CpGs.

4.3.3. Quality checks based on the distribution of the beta-values

A total of 40 MSS/CIMP-neg colon tumor and 36 adjacent normal tissues were analyzed

on the HumanMethylation450 BeadChips. While all tumor samples showed good quality of

DNA, ten normal samples showed a decreased quality of DNA with some smearing on the gel

analysis (data not shown). To increase the number of adjacent normal samples and thus the statistical

power, the lower-quality DNA samples were also initially analyzed on the BeadChips. First,

data quality control analyses for the 40 tumor (Figure 31A) and 36 normal colon samples

(Figure 31B) were performed based on the normalized average beta-values across all 485,577

CpGs. Relatively lower average beta-values were seen for five normal samples (red arrows in

Figure 31B).

Figure 31. Histogram of average beta-values for 485,577 CpGs in 40 tumor samples (A)

and 36 adjacent normal samples (B). Quality of DNA analyzed on the gel is indicated on the

top row (B, bad; G, good; F, fair). The x-axis and y-axis represent the individual samples and

average beta-values, respectively. The arrows indicate the samples with relatively low average

beta-values.

Next, to check whether the low average beta-values of these five normal samples were expected

(Figure 32), 113 CpGs of interest based on a HumanMethylation27 BeadChip

analysis by Peter Laird’s group were checked in our normal samples. These 113 CpGs were

constitutively methylated in normal samples, but showed variable levels of DNA

hypomethylation in colon tumor tissues in the previous study (Hinoue, Weisenberger et al.

2012). Six samples including the anticipated five bad normal samples showed relatively lower

average beta-values for the 113 CpGs (as shown in Figure 32). These six samples, with

average beta-values ranging between 0.35 and 0.50, were thus excluded from further statistical

analysis aiming to identify the differentially methylated CpGs in MSS/CIMP-neg colon cancer

compared to normal tissues.

Figure 32. Distribution (A) and median of average beta-values (B) for 113 CpGs reported by

Peter Laird’s group to be consistently methylated in normal, but not in tumor, tissues. Six

bad samples are indicated.

Next, I tested whether our data reproduced DM CpGs previously identified by

other studies in colon cancer. Using data generated by the Illumina HumanMethylation27

BeadChip, Hinoue and colleagues proposed a new “two-panel method” to differentiate CIMP-

high and CIMP-low subtypes (Hinoue, Weisenberger et al. 2012). Using this approach,

Karpinski and colleagues recently classified new subgroups that include HME, IME, and LME

corresponding to CIMP-high, CIMP-low, and CIMP-neg, respectively (Karpinski, Walter et al.

2013). From their study, I selected 8 CpGs that were all relatively hypermethylated in LME

(comparable to CIMP-neg) cancer compared to adjacent normal tissues, although they are less

methylated in CIMP-neg cancer compared to other cancer groups such as HME and IME. The

8 loci are located in genes THBD, FBN2, RAB31, IGF2 (or INS-IGF2 or IGF2AS), ELMO1,

RAB31, FAM78A, and SLIT1. These eight CpGs were checked in 70 samples comprising 40

MSS/CIMP-neg colon tumors and 30 adjacent normals and further confirmed using only the 26

paired samples, as shown in Figures 33A and 33B, respectively. Unsupervised clustering

showed that the eight CpGs separated the tissue types (tumor and normal) more clearly in

the paired sample analysis than in all pooled samples. This result indicated that DNA

methylation patterns in some MSS/CIMP-neg colon tumor tissues resemble those of normal tissues.

Because these tumors had no matched normal samples, the

heterogeneous methylation patterns in pooled samples could not be fully explained. For this reason, I decided to study

the paired samples first and to examine the signatures in pooled samples later.

Figure 33. Unsupervised hierarchical clustering of beta-values for 8 CpGs (rows) in

pooled samples (A) and in paired samples only (B); columns represent samples. Green and red blocks on the

maps represent 30 normals and 40 MSS/CIMP-neg tumor tissues (A) or 26 normals and

matched 26 tumors (B), respectively. The red and blue for the CpGs represent

hypermethylation and hypomethylation, respectively.

Next, I checked how many paired samples changed in the expected direction

(hypermethylation in tumor compared to normal tissues) for these eight previously identified

DM CpGs. A high frequency of hypermethylation at these loci in MSS/CIMP-neg colon tumor

tissues was observed, in up to 84.6% of pairs, with a 100% consistency in the direction of methylation

changes (Figure 34). Moreover, all eight CpGs differed significantly between tumor and normal tissues (p-value <0.05;

ANCOVA adjusted for pair and batch effects).
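The per-CpG frequency of hypermethylation among paired samples can be computed as in the sketch below. The function name and the beta-values are hypothetical; the toy data are constructed so that 22 of 26 pairs are hypermethylated, which gives the same kind of proportion (84.6%) as those reported under the plots in Figure 34.

```python
def hypermethylated_fraction(pairs):
    """pairs: list of (normal_beta, tumor_beta) tuples for one CpG.
    Returns the number and fraction of pairs with tumor > normal."""
    count = sum(1 for normal, tumor in pairs if tumor > normal)
    return count, count / len(pairs)

# Toy data for one CpG: 22 pairs gain methylation in tumor, 4 do not.
toy_pairs = [(0.20, 0.50)] * 22 + [(0.30, 0.25)] * 4
count, fraction = hypermethylated_fraction(toy_pairs)
print(f"{count}/{len(toy_pairs)} pairs hypermethylated ({fraction:.1%})")
```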

Figure 34. Dot plots of beta-values in 26 paired colon tissues for 8 previously identified

hypermethylated CpGs by Karpinski et al. (Karpinski, Walter et al. 2013). Each point

represents the beta-values for an individual. The paired samples are connected by a line.

Numbers (proportion) of observed hypermethylated paired samples among the 26 total

paired samples are presented in the bottom of the plots. (N: Normal, T: Tumor).

4.3.4. Distribution and classification of CpGs

Before doing statistical analysis to identify differentially methylated CpGs, distribution

of CpGs on the HumanMethylation450 BeadChip was checked. After exclusion of unreliable

CpGs at detection p-value >0.05 and CpGs on either the X or Y chromosome, the proportions of

472,826 CpGs across functional genomic locations (A) and CGIs and their surrounding regions

(B) are illustrated in Figure 35. For the functional genomic location, 44% CpGs are located in

proximal promoters including CpGs in TSS1500 (15%), TSS200 (11%), 5’UTR (11%), and

1stExon (7%) (Sandoval, Heyn et al. 2011). Moreover, 4%, 31%, and 21% CpGs corresponded

to 3’UTR, gene body and intergenic sequences, respectively (Figure 35A). Of the 472,826

CpGs, 31% CpGs are in the CGIs, 23% in CGI shores, 10% in CGI shelves, while 36% are in

other regions of the genome (Open Sea) (Figure 35B).

A. B.

Figure 35. Distribution of CpGs across functional genomic locations (A) and CGIs (B).

4.3.5. Identification of the genome-wide methylation profiles in colon cancer

After normalization of the data with internal controls and the background subtraction,

Principal Component Analysis (PCA) was performed on the beta-values of 472,826 CpGs to

show methylation signal clustering of the samples by tissue type. The results showed distinctly

different overall methylation patterns between all colon tumors and matched normal samples

(data not shown). To explore the landscape of consistently aberrant DNA methylation in the

MSS/CIMP-neg colon cancer, 26 MSS/CIMP-neg colon tumor and 26 matched normal tissues

were analyzed. A total of six chips were used to profile all samples in 2 batches, and tumor-

normal sample pairs were analyzed on the same chip to minimize experimental variation.

Differential methylation levels (delta-beta values) against raw p-values are shown by volcano

plots in Figure 36. I observed an enrichment of CpGs with negative delta-beta values among

the genome-wide 472,826 CpGs at Bonferroni corrected p-value <0.05 (horizontal red line),

reflecting the general hypomethylation in MSS/CIMP-neg colon cancer. By

functional location, general hypomethylation was observed in the 3’UTR and

intergenic regions and general hypermethylation was found in TSS1500, TSS200, 5’UTR, and

the 1stExon which are considered the functional promoter regions (Figure 36A).

A.

Figure 36A. Volcano plots showing the magnitude of differential methylation levels (delta-beta)

for all CpGs at various functional regions. The x-axis is used for delta-beta

values and the y-axis shows the negative log10 of p-values (a higher value indicates greater

significance). The horizontal red and blue lines mark the thresholds at Bonferroni corrected p-

value = 0.05 for defining differentially methylated CpGs in colon tumors compared to normal

tissues. The vertical lines represent delta-beta values at -0.2 and 0.2, respectively.

For the CGIs and the surrounding regions, the enrichment of hypermethylation was

observed in N_Shores and CGIs, but less hypermethylation was observed in the N_Shelf,

S_Shore, S_Shelf, and Open Sea loci (Figure 36B).

B.

Figure 36B. Volcano plots showing the magnitude of differential methylation levels (delta-beta)

for all CpGs at CpG islands and the surrounding regions. The x-axis is

used for delta-beta values and the y-axis shows the negative log10 of p-values (a higher value

indicates greater significance). The horizontal red and blue lines mark the thresholds at

Bonferroni corrected p-value = 0.05 for defining differentially methylated CpGs in colon

tumors compared to normal tissues. The vertical lines represent delta-beta values at -0.2 and

0.2, respectively.

4.3.6. Genome-wide methylation patterns of significant DM CpGs in colon cancer

Significantly differential methylation events between the MSS/CIMP-neg colon tumors

and matched normal tissues were observed at 0.06% of the CpGs analyzed (304/472,826) at a

Bonferroni corrected p-value <0.05 (corresponding to raw p-value <1.04997e-007), and clear

separation by these 304 CpGs was observed between the colon tumors and the adjacent normal

tissues by PCA and by the unsupervised hierarchical clustering of beta-values (Figure 37A). Of

the 304 CpGs, 50% (152/304) had absolute delta-beta values ≥ 0.2, which means greater than or equal

to 20% methylation differences between tumor and normal tissues (Figure 37B). Among these

CpGs, 88% (134/152) and 12% (18/152) were hypomethylated and hypermethylated,

respectively. This general hypomethylation is supported by previous findings using other

methods such as an antibody-based method combined with restriction enzymes (Hernandez-

Blazquez, Habib et al. 2000) and MethyLight for the LINE-1 repetitive element (Sunami, de

Maat et al. 2011) in colon cancer development.
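The two-step selection used here (Bonferroni-corrected significance, then an absolute delta-beta of at least 0.2) can be sketched as below. The function names and category labels are hypothetical. With 472,826 tests, 0.05 divided by the number of tests is about 1.06e-7; the reported cutoff of 1.04997e-007 is close but not identical, possibly because the software counted a slightly different number of tests, which is only a guess here.

```python
def bonferroni_threshold(alpha, n_tests):
    """Raw p-value cutoff equivalent to a Bonferroni-corrected alpha."""
    return alpha / n_tests

def classify_cpg(raw_p, delta_beta, n_tests, alpha=0.05, min_abs_delta=0.2):
    """Label one CpG using the significance and effect-size cutoffs.
    delta_beta is tumor minus normal, so negative means hypomethylation."""
    if raw_p >= bonferroni_threshold(alpha, n_tests):
        return "not significant"
    if abs(delta_beta) < min_abs_delta:
        return "significant, small effect"
    return "hypermethylated" if delta_beta > 0 else "hypomethylated"

n_cpgs = 472826
print(f"raw p-value cutoff: {bonferroni_threshold(0.05, n_cpgs):.4e}")
print(classify_cpg(1e-9, -0.35, n_cpgs))
print(classify_cpg(1e-9, 0.10, n_cpgs))
```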

Figure 37. Methylation profiles of (A) 304 DM CpGs with Bonferroni corrected p<0.05

and (B) 152 DM CpGs with absolute delta-beta values ≥ 0.2 by PCA (left), and unsupervised

hierarchical clustering (right). The heat maps show beta-values, with red being more

methylated and blue less. Columns represent individual samples, and rows represent the

DM CpGs (304 in A, 152 in B).

To explore the landscape of the observed DM distribution, I further analyzed functional

location of the 152 significant hyper- or hypomethylated DM CpGs separately (Figure 38A). A

CpG site can be in more than one functional location since a locus can reside in several

transcript variants of the same gene or in different genes. Among the 18 hypermethylated CpGs,

most were localized in the 5’UTR (38%) and TSS200 (23%), followed by 12% and 10% in the

TSS1500 and Body regions (Figure 38A). The highest frequency of hypermethylation was

observed in the 5’UTR (6/15, 40%) (Figure 38B). In contrast, all three DM CpGs in the 3’UTR

were hypomethylated in colon cancer compared to matched normal tissues. The

hypomethylated CpGs were similarly distributed across 7 different functional locations.

(Figure 38A).

Next, the localization of the 152 DM CpGs respective to the CGIs and the surrounding

areas was studied (Figure 39). While most of the DM CpGs found in this study reside in Open

Seas (103/152, 68%) and are hypomethylated (100/134, 75%), most of the hypermethylated

loci were in the CGIs (12/18, 67%).

A

Figure 38A. Functional location of the 152 DM CpGs. Distribution of 152 DM CpGs

including 18 hypermethylated CpGs (Left) and 134 hypomethylated CpGs (Right).

B

Figure 38B. Functional location of the 152 DM CpGs. DNA methylation patterns of 152 DM CpGs by a functional

location.

A

Figure 39A. Distribution of CGIs and surrounding regions of 152 DM CpGs. Distribution of 152 DM CpGs including

18 hypermethylated CpGs (Left) and 134 hypomethylated CpGs (Right).

B

Figure 39B. Distribution of CGIs and surrounding regions of 152 DM CpGs. DNA methylation patterns of 152 DM

CpGs by CGIs and surrounding regions.

4.3.7. MSS/CIMP-neg colon cancer DM CpGs compared to another cancer type

Genome-wide studies have shown that DNA methylation profiles in mammals are tissue

specific (Kitamura, Igarashi et al. 2007; Rakyan, Down et al. 2008) as well as tumor specific

(Rakyan, Hildmann et al. 2004; Eckhardt, Lewin et al. 2006). Therefore, I asked whether the colon

DM CpGs found in this study are colon tissue specific (T-DM) or colon cancer specific (C-DM).

To answer this question, I first compared the methylation levels at these CpGs between colon

cancer (N=26 paired samples) and hepatocellular carcinoma (HCC, N=19 paired samples) using

the 152 colon DM CpGs identified in this thesis (Figure 40). For HCC samples, the same

comprehensive genome-wide approach was used and the landscape of aberrant methylation in

HCC was recently published by our group (Song, Tiirikainen et al. 2013). Interestingly, clear

separation was observed by cancer types (colon cancer and HCC, y-axis; PC1, 66%) as well as

tissue types (colon and liver, x-axis; PC2, 11.8%) in PCA using the 152 colon DM CpGs

(Figure 40A). As expected, given that these CpGs were selected for differential methylation, the colon cancers and matched normal tissues were segregated in

unsupervised clustering (Figure 40B). Moreover, the two major branches in the dendrogram

correspond to tissue type, with the exception of a few samples.

Next, the 18 hypermethylated and 134 hypomethylated colon DM CpGs were separately

investigated by PCA analysis to see if these differential methylation patterns represent either

colon T-DM or colon C-DM CpGs (Figure 40C and 40D). Interestingly, the colon tumor sample

(red) cluster lies to the right while the colon normal (blue), liver normal (green) and HCC

(purple) samples are located on the left using the 18 hypermethylated colon DM CpGs on PC1

(65.5%), suggesting that they could be C-DM CpGs, or colon cancer specific (Figure 40C).

Unlike with the hypermethylated colon DM CpGs, the tumor and normal samples are located on

A B

C D

Figure 40. Clustering of normal tissues (colon and liver) and tumor tissues (colon cancer

and HCC) using 152 colon DM CpGs resulting in near perfect discrimination of tissues.

The beta-values of all tissues for the 152 colon DM CpGs were used for the PCA (A) and

unsupervised clustering (B). The heat map shows beta-values, with red being more

methylated and blue less. Columns represent individual samples, and rows represent 152

DM CpGs. PCA was also performed separately with the 18 hypermethylated (C) and 134 hypomethylated (D)

colon DM CpGs.

left and right regardless of cancer types, but with the colon tissues in the upper sections and liver

tissues in the lower sections, using the 134 hypomethylated colon DM CpGs on PC1 (72.3%),

indicating that they could be T-DM CpGs, or tissue specific (Figure 40D).

To explore the biological implications of the 152 significant DM loci, I characterized the

associated genes using IPA (Table 13). A gene list of 83 unique genes harboring the 152 DM

CpGs was created, and 73 genes were analyzed by IPA to identify their biological networks and

possible functional roles. Fifteen genes were involved in the top IPA network related to “Cell

Death and Survival, Nervous System Development and Function, Cellular Assembly and

Organization”. Moreover, 14 genes were found to be involved in “Dermatological Diseases and

Conditions, Organismal Injury and Abnormalities, Cell Death and Survival”, and another 11

genes were identified in “Embryonic Development, Organismal Development, Tissue

Development” (Table 13). IPA analysis indicated that this gene list is enriched for genes having

roles in cancer (47 genes) and gastrointestinal disease (31 genes) (data not shown). About 33%

(24/73) of these genes have previously been reported to be mutated in colon cancer (Table 14).

Looking at their cellular functions in more detail, the DM genes identified in this study

have a variety of functional roles, including as enzymes, transcription regulators, and ion channels.

- Enzyme: DPYS is involved in the catabolism of pyrimidine base (Hamajima, Kouwaki

et al. 1998) and has been suggested as a novel tumor marker in cancers including colon cancer

(Chung, Kwabi-Addo et al. 2008). MACF1, microtubule-actin crosslinking factor 1, is a member

of ATPase and involved in the Wnt signaling pathway and functions as a positive regulator in the

translocation of Axin and its associated complex from the cytoplasm to the cell membrane (Chen,

Lin et al. 2006). NADPH oxidase 4, NOX4 (hypomethylation in cancer), has recently been

identified to be overexpressed in colon cancer, and associated with carcinogenesis (Wang,

Dashwood et al. 2011).

- Transcription regulator: AFF1 is a sequence specific DNA binding transcription factor

and is associated with leukemia. The t(4;11)(q21;q23) involves the genes MLL and AFF1 and

this fusion related leukemia is associated with poor prognosis (Tamai, Miyake et al. 2011). An

important paralog of AFF1 is AFF3, which is known as a putative transcription activator that

may function in lymphoid development and as an oncogene (Luo, Lin et al. 2012). However,

these genes have not been studied in colon cancer development. MYT1L is a myelin

transcription factor 1-like gene that activates A2BP1, resulting in suppression of glioblastoma

(Hu, Ho et al. 2013). Two other transcription regulators, OSR2 and SIM1, were found to be

aberrantly methylated in our study; however, those genes are poorly understood in cancers.

- Ion channel: Membrane ion channels are essential for cell proliferation and have an

important role in the development of cancers (Kunzelmann 2005). Voltage-gated potassium

channels have an oncogenic function. KCNJ1 is a potassium channel and was recently presented

for its important role in colon cancer by Zhu and colleagues at AACR Annual Meeting 2013.

NOX5 is a calcium-dependent NADPH oxidase and has been suggested to be involved in cell

growth and apoptosis in a prostate cancer cell line (Brar, Corbin et al. 2003).

Table 13. IPA top networks for genes with DM CpGs.

Table 14. List of genes that have previously been reported to be mutated in colon cancer.

4.3.8. Deregulated methylation at imprinted genes in MSS/CIMP-neg colon cancer

Imprinting is an important epigenetic regulatory mechanism by which

certain genes are expressed in a parent-of-origin-specific manner (Robertson

2005). Although loss of imprinting (LOI) of IGF2 has been found in

the matched normal colonic mucosa of 30% of CRC patients, as compared with 10% of

individuals without CRC (Cui, Cruz-Correa et al. 2003), colon cancer related changes in

methylation across all imprinted regions have not been systematically investigated.

Thanks to the genome scale coverage of the Illumina HumanMethylation450

BeadChip, I was able to analyze 1,257 CpGs co-localizing to 41 imprinted genes in 26

MSS/CIMP-neg colon cancer pairs. Among the CpGs, methylation increased

significantly at 28 CpGs and decreased at 27 CpGs across 18 unique genes, as observed

at Bonferroni corrected p<0.05 (Table 15). The strongest hypomethylation and

hypermethylation events (|delta-beta| ≥ 0.2) were observed at 3 and 5 CpGs, respectively:

hypomethylation at DLX5 and hypermethylation at four genes (LRTM1, GNASAS, MEST, and

KCNK9) (Table 15). Furthermore, a high frequency of both

hypomethylation and hypermethylation among the 26 paired samples was observed with

up to 92% consistency (24/26 pairs) in the direction of the methylation changes (Figure

41). Interestingly, two genes out of the 5 DM imprinted genes (DLX5 and MEST) were

recently shown to be differentially methylated in prostate tumor tissues by the

HumanMethylation27 BeadChip (Jacobs, Mao et al. 2013).
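The thresholds quoted in this paragraph can be illustrated with a short sketch. It is not the analysis pipeline used in this study: the function names and the beta-values are hypothetical, delta-beta is assumed to mean the average of (tumor minus normal) across pairs, and the statistical test that produced the p-values is deliberately left out. The sketch only shows the Bonferroni cutoff for 1,257 tests, the |delta-beta| ≥ 0.2 filter, and the count of pairs that change in the same direction as the average.

```python
from statistics import mean


def bonferroni_alpha(alpha, n_tests):
    """Per-test significance cutoff after Bonferroni correction."""
    return alpha / n_tests


def summarize_pairs(normal_betas, tumor_betas):
    """Summarize one CpG across paired samples (lists share the same patient order).

    Returns the mean paired difference (assumed definition of delta-beta),
    the number of pairs changing in the same direction as that mean,
    and the corresponding fraction of pairs.
    """
    deltas = [t - n for n, t in zip(normal_betas, tumor_betas)]
    delta_beta = mean(deltas)
    n_consistent = sum(1 for d in deltas if d * delta_beta > 0)
    return delta_beta, n_consistent, n_consistent / len(deltas)


# Cutoff for the 1,257 imprinted-gene CpGs mentioned in the text.
cutoff = bonferroni_alpha(0.05, 1257)

# Hypothetical beta-values for one CpG in four normal/tumor pairs.
normal = [0.20, 0.25, 0.30, 0.22]
tumor = [0.55, 0.50, 0.28, 0.60]

delta_beta, n_consistent, fraction = summarize_pairs(normal, tumor)
is_strong = abs(delta_beta) >= 0.2  # the |delta-beta| >= 0.2 filter

print(f"Bonferroni cutoff per CpG: {cutoff:.2e}")
print(f"delta-beta: {delta_beta:.2f}, strong change: {is_strong}")
print(f"pairs in the same direction: {n_consistent}/{len(normal)} ({fraction:.0%})")
```

Applied to the numbers in the text, the same fraction calculation gives 24/26, or about 92%, for the most consistent CpGs.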

Next, I asked whether these methylation differences were correlated with the host

gene expression levels. The Affymetrix Exon Arrays were used for the gene expression

study on 23 pairs that were also analyzed on the Illumina HumanMethylation450


BeadChips. Four out of the five DM imprinted genes were analyzed on the GeneChip®

Human Exon 1.0 ST Array. Methylation levels of two MEST promoter (TSS1500) CpGs

were significantly inversely correlated with MEST gene expression values (Figure 42).

This result is supported by other findings that LOI of MEST is linked to colon cancer

(Nishihara, Hayashida et al. 2000), breast cancer (Pedersen, Dervan et al. 1999), and lung

cancer (Nakanishi, Suda et al. 2004). Aberrant methylation of MEST is also associated

with male infertility (Marques, Costa et al. 2008).


Table 15. Significant differential methylation in imprinted gene loci between colon

tumor and normal tissues at Bonferroni corrected p<0.05.


Figure 41. Dot plots of beta-values for 8 differentially methylated CpGs in 5 imprinted genes in MSS/CIMP-negative colon

cancer compared to adjacent normal tissues. Each point represents the beta-values for an individual. The paired samples are

connected by a line. The number and proportion of observed paired samples among 26 total paired samples are presented in

the bottom of the plots. (N: Normal, T: Tumor)


Figure 42. Inverse correlation between DNA methylation and gene expression level

of MEST. Gene name, TargetID (locus) by Illumina, and the correlation coefficients

(r) are presented. Beta-values and expression levels for each individual colon tumor

(red) and normal (blue) samples are presented by dots. The x-axis and y-axis

indicate the levels of gene expression from Affymetrix Exon arrays and the DNA

methylation by Illumina HumanMethylation450 BeadChips, respectively.

4.3.9. Correlation between DNA methylation and miRNA expression

Chapter 2 describes the discovery of nineteen miRNAs, including 6 previously

colon cancer associated and 13 new-miRNAs (not previously implicated in colon cancer),

which were identified to be differentially expressed between the MSS/CIMP-neg colon

cancer and normal tissues. Although methylation patterns do not always correlate with

gene expression because of diversity of epigenetic changes (Eckhardt, Lewin et al. 2006),

I asked whether altered expression of these miRNAs could potentially be regulated by

DNA methylation. First, the CpGs that cover the miRNAs were selected. Illumina

HumanMethylation450 BeadChip includes 117 CpGs that cover four of the known-miRNAs


and 11 new-miRNAs. The number of analyzed CpGs for each miRNA is shown in Table

16. All miRNAs except miR-30a were more highly expressed in MSS/CIMP-neg colon

cancer compared to adjacent normal tissues.

Table 16. The number of analyzed CpGs for differentially expressed miRNAs.

miRNA           # of CpG sites on HumanMethylation450

Known-miRNAs
  miR-30a        3
  miR-135b       4
  miR-182        8
  miR-202       10

New-miRNAs
  miR-183        9
  miR-365-1      6
  miR-549        5
  miR-602        3
  miR-638       18
  miR-935        7
  miR-937        6
  miR-1180      11
  miR-1292       9
  miR-1909      11
  miR-1914       7

Because not all samples were analyzed by both NGS (for miRNA expression)

and the HumanMethylation450 BeadChip, only a subset of samples (7 normal and 6 tumor

samples, including 3 pairs) was used to study the correlation between miRNA expression and

methylation levels, in pooled samples (normal plus tumor) and in colon tumor samples

separately (Figure 43). Spearman’s rank correlation test showed that four CpGs were

significantly correlated either negatively (miR-1292) or positively (miR-135b, miR-182,


and miR-602) with four corresponding miRNAs in colon tumor samples at Spearman

correlation p<0.05 (Figure 43, right). However, this finding should be confirmed in a

larger number of samples, and experiments are needed to test whether the correlations reflect

direct interaction.
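As an illustration of the statistic named above, the following is a minimal sketch of Spearman's rank correlation between miRNA expression and CpG beta-values, written with the standard library only. It is not the software used in this study; the function names and the six data points are hypothetical, and no p-value is computed. With as few as six tumor samples, any such coefficient should be read with caution.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, assigning tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the ranks of x and y.

    Assumes both inputs have the same length and are not constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Hypothetical values for six tumor samples: miRNA read counts and CpG beta-values.
expression = [10, 20, 30, 40, 50, 60]
beta = [0.9, 0.8, 0.7, 0.5, 0.6, 0.2]

rho = spearman_rho(expression, beta)
print(f"Spearman rho: {rho:.3f}")
```

A negative rho, as in this toy example, corresponds to the inverse relationship reported for miR-1292; a positive rho corresponds to the pattern reported for miR-135b, miR-182, and miR-602.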


Figure 43. Correlation between miRNA expression and their DNA methylation. miRNA name, TargetID (locus) by Illumina, and the correlation coefficients (R) are presented. Beta-values for each individual colon cancer (red) and normal (green) sample are presented by dots. The x-axis and y-axis indicate the miRNA expression from NGS and DNA methylation from HumanMethylation450 BeadChip, respectively.


4.4. Conclusion

The main observations of this methylation study are: (1) General hypomethylation

was observed in MSS/CIMP-neg colon cancer, concentrating in the intergenic regions

and gene bodies; (2) Hypermethylation was observed in promoter regions; (3)

Enrichment of hypermethylation was observed in the N_Shore and CGIs; (4) Colon

cancer specific altered methylation was identified at hypermethylated CpGs and tissue

specific (normal or tumor, colon or liver) altered methylation was identified at

hypomethylated CpGs (in colon cancer compared to matched normal tissues); (5)

Significant DM CpGs were enriched for genes having a role in cancer and

gastrointestinal disease; (6) The genome-wide observations at imprinted genes suggest a

more widespread dysregulation of imprinting in colon cancer than previously reported, and

are consistent with previous reports for some of the genes; (7) A subset of differentially

expressed miRNAs had a significant correlation with their genomic methylation. These

findings from the genome-wide profiling of MSS/CIMP-neg colon cancer may help to

define the landscape of aberrant DNA methylation in MSS/CIMP-neg colon cancer and

give more depth to the observation of altered methylation in imprinted genes.

Understanding epigenetic changes may lead to identification of molecular markers for the

MSS/CIMP-neg colon cancer to be utilized in diagnosis, treatment and prognosis.


GENERAL DISCUSSION

This dissertation comprises three main studies of epigenetic events in microsatellite

stable (MSS) and CpG island methylator phenotype-negative (CIMP-negative) colon

cancer. These studies have (i) identified differential expression of known and novel

microRNAs (miRNAs), (ii) characterized non-coding RNAs (ncRNAs) in the 8q24 gene

desert region containing the cancer risk variant rs6983267, and (iii) elucidated the

landscape of altered genome-wide DNA methylation in colon tumors compared to

adjacent normal tissues.

Thanks to the technology revolution, state-of-the-art technologies such as microarrays

and Next Generation Sequencing (NGS) give us excellent ways to study epigenetic

alterations comprehensively. My dissertation projects discovered widespread epigenetic

changes involving miRNAs and DNA methylation in colon cancer.

First, I utilized SOLiD NGS for comprehensive profiling of novel and known miRNAs in

colon cancer. To enable this cutting-edge project, Dr. Maarit Tiirikainen, my longtime

supervisor and academic advisor, received an award from Applied Biosystems

based on the preliminary results from my work. I hypothesized that the MSS/CIMP-

negative colon cancer is likely to have altered expression of miRNAs, which may

contribute to the development and progression of the disease. For this study, I used 10

colon tumor tissues and 10 adjacent normal tissues. I identified statistically significant

differential expression in 19 miRNAs, including 13 new colon cancer miRNAs, as

compared to the adjacent normal. Because these newly identified miRNAs were of low


abundance, only 6 of them could be measured by quantitative PCR for the validation of

the findings. However, the findings for these six new colon cancer miRNAs were

confirmed, and for five of them tumor-specific expression was replicated in another set of

samples. Furthermore, I identified close to 100 significantly correlated potential target

mRNAs for a subset of the newly identified miRNAs, and pathway analysis revealed

plausible roles of these target genes in colon cancer development. Early results of this

project were presented at the AACR Annual Meeting in 2011, and the manuscript is in

preparation.

It is now known that miRNAs are found throughout the genome, and interestingly, many

miRNAs are located near cancer susceptibility loci or are associated with regions of

genomic instability. Several studies have also revealed that polymorphisms in miRNAs

and in their target sites can lead to aberrant gene regulation. The genome-wide

association studies (GWAS) have found many cancer susceptibility loci in non-coding

genomic areas such as the multiple cancer susceptibility locus at 8q24. This region

is very interesting because there are no coding genes in about 600 kb. The

SNP rs6983267 is located upstream of the well-known proto-oncogene MYC. However,

despite the consistent association between the SNP and colon cancer risk, the molecular

mechanisms of action remain unclear.

I hypothesized that the “gene desert” locus harboring rs6983267 may contain non-coding

RNAs that may play a role in colon cancer development. First, to predict novel

miRNAs in that genomic region, I used publicly available algorithms, and over thirty


potential candidates were identified within close proximity (30kb) of the SNP. Secondly,

I tried to identify novel miRNAs in the same genomic region by utilizing the SOLiD

NGS data. I realized, however, that in silico methods for predicting miRNAs are very

limited: none of the in silico candidates matched the SOLiD

NGS reads. Moreover, no other novel miRNA reads were found anywhere near

the SNP in our NGS data. It is therefore not surprising that another study on prostate

cancer also failed to find novel miRNAs in this region using NGS (Pomerantz, Beckwith

et al. 2009). This suggests either that it is difficult to identify novel miRNAs

computationally or even empirically, or rather that there are indeed no miRNAs in this region.

However, five miRNAs (miR-1204, miR-1205, miR-1206, miR-1207 and miR-1208)

were recently found in a more distal 8q24 region, in the non-coding PVT1 locus, 400 kb

downstream of rs6983267 (Huppi, Volfovsky et al. 2008). Interestingly, four of the

microRNAs (miR-1204, miR-1205, miR-1207 and miR-1208) demonstrated differential

expression levels between tumor and normal, indicating association of these miRNAs

with colon cancer.

Several groups have studied whether the cancer risk SNP rs6983267 is associated

with altered gene expression to explain the etiology of the cancer risk conferred by the

SNP. These studies have shown that a ~335 kb DNA loop brings the genomic region

containing the SNP close to the MYC locus, and this physical association may enable

enhancer function of the SNP-containing region, thus affecting MYC transcription.

Therefore, I next asked whether the risk allele (G) of the rs6983267 SNP could, in cis,

affect the expression of the PVT1 locus miRNAs. I studied the expression levels of these


miRNAs in the SOLiD NGS data. Interestingly, the expression level of miR-1204 was

significantly related to the GG genotype in both colon tumor and adjacent normal tissues,

suggesting a potential association of this miRNA with the risk allele of the rs6983267

SNP. Furthermore, although I could not find any miRNAs in the 8q24 region proximal

to the SNP, our collaborator (Dr. George A. Calin, MD Anderson, Texas) recently

discovered a long ncRNA called CCAT2 in this region, harboring the cancer risk SNP

rs6983267 (Ling, Spizzo et al. 2013). Interestingly, this lncRNA is significantly

overexpressed in colon tumors compared to paired adjacent normal tissues. To study how

the lncRNA expression correlates with the different genotypes of the SNP, the expression

level of CCAT2 was separately analyzed in tumors and normals by the genotypes GG, GT,

TT. There was a trend for a higher expression in the presence of the G risk allele as

compared to the T allele, especially in the GT heterozygotes among our tumors. The

multicenter CCAT2 study examined the correlation between genotype and CCAT2 expression in

several colon cancer cohorts, and in one of the cohorts the GG tumors showed the highest

expression level compared with tumors of the other genotypes. These findings should be further

confirmed in a larger number of samples and in various colon cancer subtypes.

The first two parts have focused on RNA-based epigenetics. Since DNA methylation is

the most well-known epigenetic mechanism, I also studied altered DNA methylation in a

larger sample set from the same MSS/CIMP-negative colon cancer cohort (26 colon

tumor tissues and 26 paired adjacent normal tissues). Altered methylation events in tumor

suppressors, oncogenes, imprinted genes and repetitive elements such as LINE1 have

been identified in colon cancer, but these studies have focused mostly on promoter CpG


islands in a small number of specific genes or on certain repetitive elements. Therefore, I

hypothesized that methylation changes in various functional genomic locations (promoter,

gene body, UTR), enhancer elements, CpG islands, as well as their surrounding regions

are expected in colon tumors of even the methylator-negative type, and these alterations

may contribute to the development and progression of this type of colon cancer. Twenty-six

colon tumor and matched adjacent normal tissues were analyzed using the genome-scale

HumanMethylation450 BeadChip. This platform, although not giving a totally

unbiased genome-wide coverage, covers 99% of the RefSeq genes, with an average of 17

CpG sites per gene region, distributed across the promoter, 5'UTR, first exon, gene body,

and 3'UTR. It also covers 96% of CpG islands, with additional coverage in island shores

and island shelves. Thanks to this genome-scale coverage, I was able to use this platform

to test my hypothesis. My first finding was general hypomethylation in intergenic regions

and gene bodies in tumors compared to normal tissues, which was an expected result

since global hypomethylation has been found in many cancers, including colon cancer, using other

techniques such as HPLC and MethyLight for the LINE1 repetitive element. The most

well-known epigenetic alteration, hypermethylation in promoters and CpG islands, was

also observed.

Although overall methylation levels are similar in individuals, it has been known that

there are significant differences in overall and specific methylation levels between

different tissue types and between normal and cancer cells from the same tissue.

Interestingly, I observed colon cancer specific altered methylation at hypermethylated loci (as compared to liver cancer)


and tissue specific (tumor vs normal, colon vs liver) altered methylation at the

hypomethylated loci. Those significant differentially methylated loci were enriched in

genes that play a role in cancer and gastrointestinal disease. The other interesting finding

was the observation of widespread altered methylation at imprinted genes in the

MSS/CIMP-neg colon cancer, especially in two promoter related loci in MEST, the

methylation of which was shown to be correlated with the gene’s expression.

Although methylation patterns do not always correlate with gene expression due to the

variety of epigenetic changes, I asked whether altered expression of the miRNAs found by

NGS could potentially be regulated by DNA methylation. Interestingly, Spearman’s rank

correlation test showed that four CpGs were significantly correlated either negatively

(miR-1292) or positively (miR-135b, miR-182 and miR-602) with four corresponding

miRNAs in the colon tumor samples at p<0.05. However, this finding should be

confirmed in a larger number of samples, and experiments are needed to test whether the

correlation reflects direct interaction. Although I observed the above interesting altered

methylation events in tumors compared to normal tissues, there are also limitations that

should be considered when interpreting the results. One limitation is that many of the

differentially methylated loci were represented by a single statistically significant CpG site

and it may be difficult to explain how a single CpG site could contribute to altered

complex biological mechanisms such as gene expression. However, this limitation can be

overcome by further analysis of the surrounding CpGs. This can be best accomplished by

other types of analysis such as NGS of bisulfite converted DNA giving a more

comprehensive resolution at single nucleotide level.


Significance and Future Perspectives

The above findings may give biological insight into the genes and pathways

involved in MSS/CIMP-neg colon cancer development; furthermore, the identified

specific molecular changes may potentially lead to the discovery of new colon cancer

biomarkers. My dissertation projects focused only on the identification of the aberrant

epigenetic events, but there are several interesting areas for future work.

1) Functional research to understand the biological significance of the epigenetic

alterations found:

Project Further research question

miRNAs:

- Are the correlated potential target genes direct targets? Specific miRNA mimics or inhibitors could be designed for each identified miRNA for artificial upregulation and downregulation of target mRNA translation.

- Are the identified miRNAs regulated by other epigenetic mechanisms such as DNA methylation? The DNA methylation inhibitors 5-azacytidine and 5-aza-2'-deoxycytidine could be used to treat colon cancer cell lines to study whether hypermethylation or hypomethylation is directly associated with decreased or increased miRNA expression.

- Is the expression of the identified miRNAs associated with copy number variation? The relationship between DNA copy number variation and expression of the miRNAs in matched colon cancer tissues and adjacent normal tissues could be studied by quantitative PCR.

lncRNAs:

- Does the CCAT2 lncRNA regulate genes? Gain and loss of function experiments could be done.

- Does the CCAT2 lncRNA regulate proteins? RNA immunoprecipitation could be used to isolate proteins bound to CCAT2.

DNA methylation:

- Do differentially methylated loci regulate host gene expression directly? The DNA methylation inhibitors 5-azacytidine and 5-aza-2'-deoxycytidine could be used to treat colon cancer cell lines to study whether hypermethylation or hypomethylation is directly associated with decreased or increased host gene expression.

2) Translational research using the potential new epigenetic biomarkers in tissues or

body fluids such as urine, blood, plasma, serum, and stool samples, for diagnosis and

prognosis:

Project Further research question

miRNAs:

- Can the identified miRNAs be used as markers of colon cancer development? The identified miRNAs could be analyzed in different colon cancer stages such as adenoma, adeno-carcinoma, carcinoma, and advanced carcinoma to see whether they are involved in the early changes in tumorigenesis or are rather markers for advanced carcinomas.

- Can the identified miRNAs be used as biomarkers for diagnosis and prognosis? The identified miRNAs could be analyzed in peripheral fluids like urine, blood, plasma, serum, and stool samples.

- Are SNPs in the identified miRNAs associated with colon cancer risk? MiRNA-related SNPs (MirSNPs) might promote carcinogenesis by affecting miRNA function and/or maturation. Among the 19 identified miRNAs, four (miR-182 (rs76481776), miR-202 (rs12355840), miR-602 (rs201175632), and miR-1268 (rs28599926)) include SNPs in their pre-miRNA sequences. Therefore, an association study between the MirSNPs and colon cancer risk in a larger number of samples could be done.

lncRNAs:

- Can the CCAT2 lncRNA be used as a colon cancer development marker? The CCAT2 lncRNA could be analyzed in different colon cancer stages such as adenoma, adeno-carcinoma, carcinoma, and advanced carcinoma to see whether its expression is related to early changes in tumorigenesis or whether it is a marker for advanced carcinomas.

- Is the CCAT2 lncRNA associated with patients’ survival or other risk factors of colon cancer? Since there was very limited clinical information for the samples in this cohort, I could not study this. However, an association study between CCAT2 expression and clinical parameters could give insight into the role of this potential biomarker.

DNA methylation:

- Can the altered DNA methylation events be used as biomarkers? The altered DNA methylation events could be studied in peripheral fluids like urine, blood, plasma, serum, and stool samples for diagnosis and prognosis.

3) Epigenetic epidemiology studies to understand the epigenetic markers of response to

environmental exposures and lifestyle that contribute to the development of colon cancer:

miRNAs, ncRNAs, and DNA methylation:

- Increasing evidence shows that aging, environmental factors and lifestyle, as well as dietary factors, may influence epigenetic mechanisms such as miRNAs and DNA methylation. Therefore, epigenetic changes could represent an important pathway to bridge the effects of environmental factors and lifestyle. If information on these factors were available for the individuals, a correlation study could be done between the factors and epigenetic mechanisms. However, we have to keep in mind that these epigenetic changes are cumulative and manifest over time.


REFERENCES

(2004). "Finishing the euchromatic sequence of the human genome." Nature 431(7011):

931-945.

(2012). "Comprehensive molecular characterization of human colon and rectal cancer."

Nature 487(7407): 330-337.

Albano, F., L. Anelli, et al. (2010). "Non random distribution of genomic features in

breakpoint regions involved in chronic myeloid leukemia cases with variant t(9;22)

or additional chromosomal rearrangements." Mol Cancer 9: 120.

Ali, N. A., M. J. McKay, et al. (2010). "Proteomics of Smad4 regulated transforming

growth factor-beta signalling in colon cancer cells." Mol Biosyst 6(11): 2332-

2338.

Ali, S., K. Almhanna, et al. (2010). "Differentially expressed miRNAs in the plasma may

provide a molecular signature for aggressive pancreatic cancer." Am J Transl Res

3(1): 28-47.

Arora, R., C. M. Brun, et al. (2012). "Transcription regulates telomere dynamics in human

cancer cells." RNA 18(4): 684-693.

Bandres, E., E. Cubedo, et al. (2006). "Identification by Real-time PCR of 13 mature

microRNAs differentially expressed in colorectal cancer and non-tumoral

tissues." Mol Cancer 5: 29.

Banfai, B., H. Jia, et al. (2012). "Long noncoding RNAs are rarely translated in two human

cell lines." Genome Res 22(9): 1646-1657.

Bar, M., S. K. Wyman, et al. (2008). "MicroRNA discovery and profiling in human

embryonic stem cells by deep sequencing of small RNA libraries." Stem Cells

26(10): 2496-2505.

Bellacosa, A. (2003). "Genetic hits and mutation rate in colorectal tumorigenesis:

versatility of Knudson's theory and implications for cancer prevention." Genes

Chromosomes Cancer 38(4): 382-388.

Beroukhim, R., C. H. Mermel, et al. (2010). "The landscape of somatic copy-number

alteration across human cancers." Nature 463(7283): 899-905.

Bestor, T. H. (1992). "Activation of mammalian DNA methyltransferase by cleavage of a

Zn binding regulatory domain." EMBO J 11(7): 2611-2617.

Bibikova, M., B. Barnes, et al. (2011). "High density DNA methylation array with single

CpG site resolution." Genomics 98(4): 288-295.

Bibikova, M., Z. Lin, et al. (2006). "High-throughput DNA methylation profiling using

universal bead arrays." Genome Res 16(3): 383-393.

Birney, E., J. A. Stamatoyannopoulos, et al. (2007). "Identification and analysis of

functional elements in 1% of the human genome by the ENCODE pilot project."

Nature 447(7146): 799-816.

Boland, C. R. and A. Goel (2010). "Microsatellite instability in colorectal cancer."

Gastroenterology 138(6): 2073-2087 e2073.

Borun, T. W., D. Pearson, et al. (1972). "Studies of histone methylation during the HeLa

S-3 cell cycle." J Biol Chem 247(13): 4288-4298.

Brannan, C. I., E. C. Dees, et al. (1990). "The product of the H19 gene may function as an

RNA." Mol Cell Biol 10(1): 28-36.

Brar, S. S., Z. Corbin, et al. (2003). "NOX5 NAD(P)H oxidase regulates growth and

apoptosis in DU 145 prostate cancer cells." Am J Physiol Cell Physiol 285(2):


C353-369.

Brown, C. J., A. Ballabio, et al. (1991). "A gene from the region of the human X

inactivation centre is expressed exclusively from the inactive X chromosome."

Nature 349(6304): 38-44.

Brown, C. J., B. D. Hendrich, et al. (1992). "The human XIST gene: analysis of a 17 kb

inactive X-specific RNA that contains conserved repeats and is highly localized

within the nucleus." Cell 71(3): 527-542.

Bullrich, F., H. Fujii, et al. (2001). "Characterization of the 13q14 tumor suppressor locus

in CLL: identification of ALT1, an alternative splice variant of the LEU2 gene."

Cancer Res 61(18): 6640-6648.

Burk, U., J. Schubert, et al. (2008). "A reciprocal repression between ZEB1 and members

of the miR-200 family promotes EMT and invasion in cancer cells." EMBO Rep

9(6): 582-589.

Byun, H. M., K. D. Siegmund, et al. (2009). "Epigenetic profiling of somatic tissues from

human autopsy specimens identifies tissue- and individual-specific DNA

methylation patterns." Hum Mol Genet 18(24): 4808-4817.

Calin, G. A., C. G. Liu, et al. (2007). "Ultraconserved regions encoding ncRNAs are

altered in human leukemias and carcinomas." Cancer Cell 12(3): 215-229.

Calin, G. A., C. Sevignani, et al. (2004). "Human microRNA genes are frequently located

at fragile sites and genomic regions involved in cancers." Proc Natl Acad Sci U S

A 101(9): 2999-3004.

Cannell, I. G., Y. W. Kong, et al. (2008). "How do microRNAs regulate gene expression?"

Biochem Soc Trans 36(Pt 6): 1224-1231.

Chan, H. M. and N. B. La Thangue (2001). "p300/CBP proteins: HATs for transcriptional

bridges and scaffolds." J Cell Sci 114(Pt 13): 2363-2373.

Chang, T. C., E. A. Wentzel, et al. (2007). "Transactivation of miR-34a by p53 broadly

influences gene expression and promotes apoptosis." Mol Cell 26(5): 745-752.

Cheetham, S. W., F. Gruhl, et al. (2013). "Long noncoding RNAs and the genetics of

cancer." Br J Cancer 108(12): 2419-2425.

Chen, H. J., C. M. Lin, et al. (2006). "The role of microtubule actin cross-linking factor 1

(MACF1) in the Wnt signaling pathway." Genes Dev 20(14): 1933-1945.

Chervona, Y. and M. Costa (2012). "Histone modifications and cancer: biomarkers of

prognosis?" Am J Cancer Res 2(5): 589-597.

Chiang, H. R., L. W. Schoenfeld, et al. (2010). "Mammalian microRNAs: experimental

evaluation of novel and previously annotated genes." Genes Dev 24(10): 992-

1009.

Chung, S., H. Nakagawa, et al. (2011). "Association of a novel long non-coding RNA in

8q24 with prostate cancer susceptibility." Cancer Sci 102(1): 245-252.

Chung, W., B. Kwabi-Addo, et al. (2008). "Identification of novel tumor markers in

prostate, colon and breast cancer by unbiased methylation profiling." PLoS One

3(4): e2079.

Cittelly, D. M., P. M. Das, et al. (2010). "Downregulation of miR-342 is associated with

tamoxifen resistant breast tumors." Mol Cancer 9: 317.

Cohen, I., E. Poreba, et al. (2011). "Histone modifiers in cancer: friends or foes?" Genes

Cancer 2(6): 631-647.

Colaluca, I. N., D. Tosoni, et al. (2008). "NUMB controls p53 tumour suppressor activity."

Nature 451(7174): 76-80.

Cooper, K., H. Squires, et al. (2010). "Chemoprevention of colorectal cancer: systematic

review and economic evaluation." Health Technol Assess 14(32): 1-206.

Costa, F. F. (2005). "Non-coding RNAs: new players in eukaryotic biology." Gene 357(2):

83-94.

Cui, H., M. Cruz-Correa, et al. (2003). "Loss of IGF2 imprinting: a potential marker of

colorectal cancer risk." Science 299(5613): 1753-1755.

Cummins, J. M., Y. He, et al. (2006). "The colorectal microRNAome." Proc Natl Acad Sci

U S A 103(10): 3687-3692.

Davis, B. N., A. C. Hilyard, et al. (2010). "Smad proteins bind a conserved RNA sequence

to promote microRNA maturation by Drosha." Mol Cell 39(3): 373-384.

Davis, P. K. and R. K. Brachmann (2003). "Chromatin remodeling and cancer." Cancer

Biol Ther 2(1): 22-29.

Denli, A. M., B. B. Tops, et al. (2004). "Processing of primary microRNAs by the

Microprocessor complex." Nature 432(7014): 231-235.

Du, P., X. Zhang, et al. (2010). "Comparison of Beta-value and M-value methods for

quantifying methylation levels by microarray analysis." BMC Bioinformatics 11:

587.

Easton, D. F., K. A. Pooley, et al. (2007). "Genome-wide association study identifies

novel breast cancer susceptibility loci." Nature 447(7148): 1087-1093.

Eckhardt, F., J. Lewin, et al. (2006). "DNA methylation profiling of human chromosomes 6,

20 and 22." Nat Genet 38(12): 1378-1385.

Esteller, M. (2007). "Cancer epigenomics: DNA methylomes and histone-modification

maps." Nat Rev Genet 8(4): 286-298.

Esteller, M. (2008). "Epigenetics in cancer." N Engl J Med 358(11): 1148-1159.

Esteller, M. (2011). "Non-coding RNAs in human disease." Nat Rev Genet 12(12): 861-

874.

Faber, C., T. Kirchner, et al. (2009). "The impact of microRNAs on colorectal cancer."

Virchows Arch 454(4): 359-367.

Fang, F., S. Turcan, et al. (2011). "Breast cancer methylomes establish an epigenomic

foundation for metastasis." Sci Transl Med 3(75): 75ra25.

Fearon, E. R. (2011). "Molecular genetics of colorectal cancer." Annu Rev Pathol 6: 479-

507.

Fearon, E. R. and B. Vogelstein (1990). "A genetic model for colorectal tumorigenesis."

Cell 61(5): 759-767.

Feinberg, A. P. and B. Tycko (2004). "The history of cancer epigenetics." Nat Rev

Cancer 4(2): 143-153.

Feinberg, A. P. and B. Vogelstein (1983). "Hypomethylation distinguishes genes of some

human cancers from their normal counterparts." Nature 301(5895): 89-92.

Feinberg, A. P. and B. Vogelstein (1983). "Hypomethylation of ras oncogenes in primary

human cancers." Biochem Biophys Res Commun 111(1): 47-54.

Fischle, W., Y. Wang, et al. (2003). "Histone and chromatin cross-talk." Curr Opin Cell

Biol 15(2): 172-183.

Foster, C. S., A. Falconer, et al. (2004). "Transcription factor E2F3 overexpressed in

prostate cancer independently predicts clinical outcome." Oncogene 23(35):

5871-5879.

Frosina, G., P. Fortini, et al. (1996). "Two pathways for base excision repair in

mammalian cells." J Biol Chem 271(16): 9573-9578.

Fullgrabe, J., E. Kavanagh, et al. (2011). "Histone onco-modifications." Oncogene 30(31):

3391-3403.

Gardiner-Garden, M. and M. Frommer (1987). "CpG islands in vertebrate genomes." J Mol

Biol 196(2): 261-282.

Garzon, R., G. A. Calin, et al. (2009). "MicroRNAs in Cancer." Annu Rev Med 60: 167-179.

Ghoussaini, M., H. Song, et al. (2008). "Multiple loci with different cancer specificities

within the 8q24 gene desert." J Natl Cancer Inst 100(13): 962-966.

Goldberg, A. D., C. D. Allis, et al. (2007). "Epigenetics: a landscape takes shape." Cell

128(4): 635-638.

Goll, M. G., F. Kirpekar, et al. (2006). "Methylation of tRNAAsp by the DNA

methyltransferase homolog Dnmt2." Science 311(5759): 395-398.

Goto, T., H. Mizukami, et al. (2009). "Aberrant methylation of the p16 gene is frequently

detected in advanced colorectal cancer." Anticancer Res 29(1): 275-277.

Greger, V., E. Passarge, et al. (1989). "Epigenetic changes may contribute to the

formation and spontaneous regression of retinoblastoma." Hum Genet 83(2): 155-

158.

Gregorieff, A. and H. Clevers (2005). "Wnt signaling in the intestinal epithelium: from

endoderm to cancer." Genes Dev 19(8): 877-890.

Grewal, S. I. and S. Jia (2007). "Heterochromatin revisited." Nat Rev Genet 8(1): 35-46.

Grimson, A., K. K. Farh, et al. (2007). "MicroRNA targeting specificity in mammals:

determinants beyond seed pairing." Mol Cell 27(1): 91-105.

Guo, C., J. F. Sah, et al. (2008). "The noncoding RNA, miR-126, suppresses the growth of

neoplastic cells by targeting phosphatidylinositol 3-kinase signaling and is

frequently lost in colon cancers." Genes Chromosomes Cancer 47(11): 939-946.

Gupta, R. A., N. Shah, et al. (2010). "Long non-coding RNA HOTAIR reprograms

chromatin state to promote cancer metastasis." Nature 464(7291): 1071-1076.

Hamajima, N., M. Kouwaki, et al. (1998). "Dihydropyrimidinase deficiency: structural

organization, chromosomal localization, and mutation analysis of the human

dihydropyrimidinase gene." Am J Hum Genet 63(3): 717-726.

Hamfjord, J., A. M. Stangeland, et al. (2012). "Differential expression of miRNAs in

colorectal cancer: comparison of paired tumor tissue and adjacent normal mucosa

using high-throughput sequencing." PLoS One 7(4): e34150.

Herman, J. G. and S. B. Baylin (2003). "Gene silencing in cancer in association with

promoter hypermethylation." N Engl J Med 349(21): 2042-2054.

Herman, J. G., F. Latif, et al. (1994). "Silencing of the VHL tumor-suppressor gene by

DNA methylation in renal carcinoma." Proc Natl Acad Sci U S A 91(21): 9700-

9704.

Hernandez-Blazquez, F. J., M. Habib, et al. (2000). "Evaluation of global DNA

hypomethylation in human colon cancer tissues by immunohistochemistry and

image analysis." Gut 47(5): 689-693.

Hershey, A. D., J. Dixon, et al. (1953). "Nucleic acid economy in bacteria infected with

bacteriophage T2. I. Purine and pyrimidine composition." J Gen Physiol 36(6):

777-789.

Hinoue, T., D. J. Weisenberger, et al. (2012). "Genome-scale analysis of aberrant DNA

methylation in colorectal cancer." Genome Res 22(2): 271-282.

Hu, J., A. L. Ho, et al. (2013). "From the Cover: Neutralization of terminal differentiation

in gliomagenesis." Proc Natl Acad Sci U S A 110(36): 14520-14527.

Huarte, M., M. Guttman, et al. (2010). "A large intergenic noncoding RNA induced by p53

mediates global gene repression in the p53 response." Cell 142(3): 409-419.

Huppi, K., J. J. Pitt, et al. (2012). "The 8q24 gene desert: an oasis of non-coding

transcriptional activity." Front Genet 3: 69.

Huppi, K., N. Volfovsky, et al. (2008). "The identification of microRNAs in a genomically

unstable region of human chromosome 8q24." Mol Cancer Res 6(2): 212-221.

Hur, K., P. Cejas, et al. (2013). "Hypomethylation of long interspersed nuclear element-1

(LINE-1) leads to activation of proto-oncogenes in human colorectal cancer

metastasis." Gut.

Imamura, T., S. Yamamoto, et al. (2004). "Non-coding RNA directed DNA demethylation

of Sphk1 CpG island." Biochem Biophys Res Commun 322(2): 593-600.

Iorio, M. V. and C. M. Croce (2012). "MicroRNA dysregulation in cancer: diagnostics,

monitoring and therapeutics. A comprehensive review." EMBO Mol Med 4(3):

143-159.

Irizarry, R. A., C. Ladd-Acosta, et al. (2009). "The human colon cancer methylome shows

similar hypo- and hypermethylation at conserved tissue-specific CpG island

shores." Nat Genet 41(2): 178-186.

Jacobs, D. I., Y. Mao, et al. (2013). "Dysregulated methylation at imprinted genes in

prostate tumor tissue detected by methylation microarray." BMC Urol 13(1): 37.

Jass, J. R. (2007). "Classification of colorectal cancer based on correlation of clinical,

morphological and molecular features." Histopathology 50(1): 113-130.

Jia, L., G. Landan, et al. (2009). "Functional enhancers at the gene-poor 8q24 cancer-

linked locus." PLoS Genet 5(8): e1000597.

Jiang, P., H. Wu, et al. (2007). "MiPred: classification of real and pseudo microRNA

precursors using random forest prediction model with combined features."

Nucleic Acids Res 35(Web Server issue): W339-344.

John, B., A. J. Enright, et al. (2004). "Human MicroRNA targets." PLoS Biol 2(11): e363.

Johnson, S. M., H. Grosshans, et al. (2005). "RAS is regulated by the let-7 microRNA

family." Cell 120(5): 635-647.

Jones, P. A. (2012). "Functions of DNA methylation: islands, start sites, gene bodies and

beyond." Nat Rev Genet 13(7): 484-492.

Kanduri, M., N. Cahill, et al. (2010). "Differential genome-wide array-based methylation

profiles in prognostic subsets of chronic lymphocytic leukemia." Blood 115(2):

296-305.

Kapranov, P., J. Cheng, et al. (2007). "RNA maps reveal new RNA classes and a possible

function for pervasive transcription." Science 316(5830): 1484-1488.

Karpinski, P., M. Walter, et al. (2013). "Intermediate- and low-methylation epigenotypes

do not correspond to CpG island methylator phenotype (low and -zero) in

colorectal cancer." Cancer Epidemiol Biomarkers Prev 22(2): 201-208.

Kastler, S., L. Honold, et al. (2010). "POU5F1P1, a putative cancer susceptibility gene, is

overexpressed in prostatic carcinoma." Prostate 70(6): 666-674.

Katada, T., H. Ishiguro, et al. (2009). "microRNA expression profile in undifferentiated

gastric cancer." Int J Oncol 34(2): 537-542.

Kawasaki, T., M. Ohnishi, et al. (2008). "WRN promoter methylation possibly connects

mucinous differentiation, microsatellite instability and CpG island methylator

phenotype in colorectal cancer." Mod Pathol 21(2): 150-158.

Khvorova, A., A. Reynolds, et al. (2003). "Functional siRNAs and miRNAs exhibit strand

bias." Cell 115(2): 209-216.

Kim, M. S., J. Lee, et al. (2010). "DNA methylation markers in colorectal cancer." Cancer

Metastasis Rev 29(1): 181-206.

Kim, S., M. Choi, et al. (2009). "Identifying the target mRNAs of microRNAs in colorectal

cancer." Comput Biol Chem 33(1): 94-99.

Kinzler, K. W. and B. Vogelstein (1996). "Lessons from hereditary colorectal cancer."

Cell 87(2): 159-170.

Kitamura, E., J. Igarashi, et al. (2007). "Analysis of tissue-specific differentially

methylated regions (TDMs) in humans." Genomics 89(3): 326-337.

Konishi, H., D. Ichikawa, et al. (2012). "Detection of gastric cancer-associated

microRNAs on microRNA microarray comparing pre- and post-operative plasma."

Br J Cancer 106(4): 740-747.

Koshiishi, N., J. M. Chong, et al. (2004). "p300 gene alterations in intestinal and diffuse

types of gastric carcinoma." Gastric Cancer 7(2): 85-90.

Krek, A., D. Grun, et al. (2005). "Combinatorial microRNA target predictions." Nat Genet

37(5): 495-500.

Kunzelmann, K. (2005). "Ion channels and cancer." J Membr Biol 205(3): 159-173.

Lagos-Quintana, M., R. Rauhut, et al. (2001). "Identification of novel genes coding for

small expressed RNAs." Science 294(5543): 853-858.

Lam, L. T., X. Lu, et al. (2010). "A microRNA screen to identify modulators of sensitivity

to BCL2 inhibitor ABT-263 (navitoclax)." Mol Cancer Ther 9(11): 2943-2950.

Le Marchand, L., A. Seifried, et al. (2003). "Association of the cyclin D1 A870G

polymorphism with advanced colorectal cancer." JAMA 290(21): 2843-2848.

Lee, J., S. J. Jang, et al. (2010). "Presence of 5-methylcytosine in CpNpG trinucleotides

in the human genome." Genomics 96(2): 67-72.

Lee, J. T. (2012). "Epigenetic regulation by long noncoding RNAs." Science 338(6113):

1435-1439.

Lee, R. C., R. L. Feinbaum, et al. (1993). "The C. elegans heterochronic gene lin-4

encodes small RNAs with antisense complementarity to lin-14." Cell 75(5): 843-

854.

Lengauer, C., K. W. Kinzler, et al. (1998). "Genetic instabilities in human cancers." Nature

396(6712): 643-649.

Lerebours, F., G. Cizeron-Clairac, et al. (2013). "miRNA expression profiling of

inflammatory breast cancer identifies a 5-miRNA signature predictive of breast

tumor aggressiveness." Int J Cancer 133(7): 1614-1623.

Lewis, B. P., C. B. Burge, et al. (2005). "Conserved seed pairing, often flanked by

adenosines, indicates that thousands of human genes are microRNA targets." Cell

120(1): 15-20.

Li, J., S. Donath, et al. (2010). "miR-30 regulates mitochondrial fission through targeting

p53 and the dynamin-related protein-1 pathway." PLoS Genet 6(1): e1000795.

Lin, O. S. (2009). "Acquired risk factors for colorectal cancer." Methods Mol Biol 472:

361-372.

Ling, H., R. Spizzo, et al. (2013). "CCAT2, a novel noncoding RNA mapping to 8q24,

underlies metastatic progression and chromosomal instability in colon cancer."

Genome Res 23(9): 1446-1461.

Liu, M. and H. Chen (2010). "The role of microRNAs in colorectal cancer." J Genet

Genomics 37(6): 347-358.

Liu, S. P., R. H. Fu, et al. (2009). "MicroRNAs regulation modulated self-renewal and

lineage differentiation of stem cells." Cell Transplant 18(9): 1039-1045.

Loeb, L. A., K. R. Loeb, et al. (2003). "Multiple mutations and cancer." Proc Natl Acad Sci

U S A 100(3): 776-781.

Lu, J., G. Getz, et al. (2005). "MicroRNA expression profiles classify human cancers."

Nature 435(7043): 834-838.

Lui, W. O., N. Pourmand, et al. (2007). "Patterns of known and novel small RNAs in

human cervical cancer." Cancer Res 67(13): 6031-6043.

Luo, Z., C. Lin, et al. (2012). "The super elongation complex (SEC) family in

transcriptional control." Nat Rev Mol Cell Biol 13(9): 543-547.

Marques, C. J., P. Costa, et al. (2008). "Abnormal methylation of imprinted genes in

human sperm is associated with oligozoospermia." Mol Hum Reprod 14(2): 67-74.

Maunakea, A. K., I. Chepelev, et al. (2010). "Epigenome mapping in normal and disease

states." Circ Res 107(3): 327-339.

Melo, S. A. and M. Esteller (2011). "Dysregulation of microRNAs in cancer: playing with

fire." FEBS Lett 585(13): 2087-2099.

Meng, R. D., C. C. Shelton, et al. (2009). "gamma-Secretase inhibitors abrogate

oxaliplatin-induced activation of the Notch-1 signaling pathway in colon cancer

cells resulting in enhanced chemosensitivity." Cancer Res 69(2): 573-582.

Mercer, T. R., M. E. Dinger, et al. (2009). "Long non-coding RNAs: insights into

functions." Nat Rev Genet 10(3): 155-159.

Mercer, T. R. and J. S. Mattick (2013). "Structure and function of long noncoding RNAs in

epigenetic regulation." Nat Struct Mol Biol 20(3): 300-307.

Metzker, M. L. (2010). "Sequencing technologies - the next generation." Nat Rev Genet

11(1): 31-46.

Migliore, L., F. Migheli, et al. (2011). "Genetics, cytogenetics, and epigenetics of

colorectal cancer." J Biomed Biotechnol 2011: 792362.

Mikkelsen, T. S., M. Ku, et al. (2007). "Genome-wide maps of chromatin state in

pluripotent and lineage-committed cells." Nature 448(7153): 553-560.

Morgan, H. D., W. Dean, et al. (2004). "Activation-induced cytidine deaminase deaminates

5-methylcytosine in DNA and is expressed in pluripotent tissues: implications for

epigenetic reprogramming." J Biol Chem 279(50): 52353-52360.

Murakami, Y., A. Tamori, et al. (2013). "The expression level of miR-18b in

hepatocellular carcinoma is associated with the grade of malignancy and

prognosis." BMC Cancer 13: 99.

Nagel, R., C. le Sage, et al. (2008). "Regulation of the adenomatous polyposis coli gene by

the miR-135 family in colorectal cancer." Cancer Res 68(14): 5795-5802.

Nakanishi, H., T. Suda, et al. (2004). "Loss of imprinting of PEG1/MEST in lung cancer

cell lines." Oncol Rep 12(6): 1273-1278.

Nam, J. W., J. Kim, et al. (2006). "ProMiR II: a web server for the probabilistic prediction

of clustered, nonclustered, conserved and nonconserved microRNAs." Nucleic

Acids Res 34(Web Server issue): W455-458.

Ng, E. K., W. W. Chong, et al. (2009). "Differential expression of microRNAs in plasma of

patients with colorectal cancer: a potential marker for colorectal cancer

screening." Gut 58(10): 1375-1381.

Nicoloso, M. S., H. Sun, et al. (2010). "Single-nucleotide polymorphisms inside microRNA

target sites influence tumor susceptibility." Cancer Res 70(7): 2789-2798.

Nie, J., L. Liu, et al. (2012). "microRNA-365, down-regulated in colon cancer, inhibits

cell cycle progression and promotes apoptosis of colon cancer cells by probably

targeting Cyclin D1 and Bcl-2." Carcinogenesis 33(1): 220-225.

Nishihara, S., T. Hayashida, et al. (2000). "Multipoint imprinting analysis in sporadic

colorectal cancers with and without microsatellite instability." Int J Oncol 17(2):

317-322.

Nissan, A., A. Stojadinovic, et al. (2012). "Colon cancer associated transcript-1: a novel

RNA expressed in malignant and pre-malignant human tissues." Int J Cancer

130(7): 1598-1606.

Noguchi, T., K. Tanimoto, et al. (2004). "Aberrant methylation of DPYD promoter, DPYD

expression, and cellular sensitivity to 5-fluorouracil in cancer cells." Clin Cancer

Res 10(20): 7100-7107.

Noushmehr, H., D. J. Weisenberger, et al. (2010). "Identification of a CpG island

methylator phenotype that defines a distinct subgroup of glioma." Cancer Cell

17(5): 510-522.

O'Toole, A. S., S. Miller, et al. (2006). "Comprehensive thermodynamic analysis of 3'

double-nucleotide overhangs neighboring Watson-Crick terminal base pairs."

Nucleic Acids Res 34(11): 3338-3344.

Ogino, S. and A. Goel (2008). "Molecular classification and correlates in colorectal

cancer." J Mol Diagn 10(1): 13-27.

Okano, M., D. W. Bell, et al. (1999). "DNA methyltransferases Dnmt3a and Dnmt3b are

essential for de novo methylation and mammalian development." Cell 99(3): 247-

257.

Pabst, O., R. Forster, et al. (2000). "NKX2.3 is required for MAdCAM-1 expression and

homing of lymphocytes in spleen and mucosa-associated lymphoid tissue." EMBO

J 19(9): 2015-2023.

Pedersen, I. S., P. A. Dervan, et al. (1999). "Frequent loss of imprinting of PEG1/MEST in

invasive breast cancer." Cancer Res 59(21): 5449-5451.

Penn, N. W., R. Suwalski, et al. (1972). "The presence of 5-hydroxymethylcytosine in

animal deoxyribonucleic acid." Biochem J 126(4): 781-790.

Pino, M. S. and D. C. Chung (2010). "The chromosomal instability pathway in colon

cancer." Gastroenterology 138(6): 2059-2072.

Pomerantz, M. M., N. Ahmadiyeh, et al. (2009). "The 8q24 cancer risk variant rs6983267

shows long-range interaction with MYC in colorectal cancer." Nat Genet 41(8):

882-884.

Pomerantz, M. M., C. A. Beckwith, et al. (2009). "Evaluation of the 8q24 prostate cancer

risk locus and MYC expression." Cancer Res 69(13): 5568-5574.

Powell, S. M., N. Zilz, et al. (1992). "APC mutations occur early during colorectal

tumorigenesis." Nature 359(6392): 235-237.

Poynter, J. N., J. C. Figueiredo, et al. (2007). "Variants on 9p24 and 8q24 are associated

with risk of colorectal cancer: results from the Colon Cancer Family Registry."

Cancer Res 67(23): 11128-11132.

Pritchard, C. C. and W. M. Grady (2011). "Colorectal cancer molecular biology moves into

clinical practice." Gut 60(1): 116-129.

Queller, D. C., J. E. Strassmann, et al. (1993). "Microsatellites and kinship." Trends Ecol

Evol 8(8): 285-288.

Radhakrishnan, A., N. Badhrinarayanan, et al. (2009). "Analysis of chromosomal

aberration (1, 3, and 8) and association of microRNAs in uveal melanoma." Mol

Vis 15: 2146-2154.

Rakyan, V. K., T. A. Down, et al. (2008). "An integrated resource for genome-wide

identification and analysis of human tissue-specific differentially methylated

regions (tDMRs)." Genome Res 18(9): 1518-1529.

Rakyan, V. K., T. Hildmann, et al. (2004). "DNA methylation profiling of the human major

histocompatibility complex: a pilot study for the human epigenome project." PLoS

Biol 2(12): e405.

Redis, R. S., A. M. Sieuwerts, et al. (2013). "CCAT2, a novel long non-coding RNA in

breast cancer: expression study and clinical correlations." Oncotarget.

Redon, S., P. Reichenbach, et al. (2010). "The non-coding RNA TERRA is a natural ligand

and direct inhibitor of human telomerase." Nucleic Acids Res 38(17): 5797-5806.

Reinhart, B. J., F. J. Slack, et al. (2000). "The 21-nucleotide let-7 RNA regulates

developmental timing in Caenorhabditis elegans." Nature 403(6772): 901-906.

Rice, J. C. and C. D. Allis (2001). "Histone methylation versus histone acetylation: new

insights into epigenetic regulation." Curr Opin Cell Biol 13(3): 263-273.

Rinn, J. L., M. Kertesz, et al. (2007). "Functional demarcation of active and silent

chromatin domains in human HOX loci by noncoding RNAs." Cell 129(7): 1311-

1323.

Robertson, K. D. (2005). "DNA methylation and human disease." Nat Rev Genet 6(8):

597-610.

Sabates-Bellver, J., L. G. Van der Flier, et al. (2007). "Transcriptome profile of human

colorectal adenomas." Mol Cancer Res 5(12): 1263-1275.

Sandoval, J., H. Heyn, et al. (2011). "Validation of a DNA methylation microarray for

450,000 CpG sites in the human genome." Epigenetics 6(6): 692-702.

Sarver, A. L., A. J. French, et al. (2009). "Human colon cancer profiles show differential

microRNA expression depending on mismatch repair status and are characteristic

of undifferentiated proliferative states." BMC Cancer 9: 401.

Sarver, A. L., L. Li, et al. (2010). "MicroRNA miR-183 functions as an oncogene by

targeting the transcription factor EGR1 and promoting tumor cell migration."

Cancer Res 70(23): 9570-9580.

Saxonov, S., P. Berg, et al. (2006). "A genome-wide analysis of CpG dinucleotides in the

human genome distinguishes two distinct classes of promoters." Proc Natl Acad

Sci U S A 103(5): 1412-1417.

Schetter, A. J., S. Y. Leung, et al. (2008). "MicroRNA expression profiles associated with

prognosis and therapeutic outcome in colon adenocarcinoma." JAMA 299(4): 425-

436.

Sewer, A., N. Paul, et al. (2005). "Identification of clustered microRNAs using an ab initio

prediction method." BMC Bioinformatics 6: 267.

Shah, M. Y., X. Pan, et al. (2010). "5-Fluorouracil drug alters the microRNA expression

profiles in MCF-7 breast cancer cells." J Cell Physiol.

Shen, L., M. Toyota, et al. (2007). "Integrated genetic and epigenetic analysis identifies

three different subclasses of colon cancer." Proc Natl Acad Sci U S A 104(47):

18654-18659.

Shen, L. and R. A. Waterland (2007). "Methods of DNA methylation analysis." Curr Opin

Clin Nutr Metab Care 10(5): 576-581.

Shi, X., M. Sun, et al. (2013). "Long non-coding RNAs: a new frontier in the study of

human diseases." Cancer Lett 339(2): 159-166.

Shtivelman, E., B. Henglein, et al. (1989). "Identification of a human transcription unit

affected by the variant chromosomal translocations 2;8 and 8;22 of Burkitt

lymphoma." Proc Natl Acad Sci U S A 86(9): 3257-3260.

Siddiqui, H., D. A. Solomon, et al. (2003). "Histone deacetylation of RB-responsive

promoters: requisite for specific gene repression but dispensable for cell cycle

inhibition." Mol Cell Biol 23(21): 7719-7731.

Simmen, M. W. (2008). "Genome-scale relationships between cytosine methylation and

dinucleotide abundances in animals." Genomics 92(1): 33-40.

Singal, R. and G. D. Ginder (1999). "DNA methylation." Blood 93(12): 4059-4070.

Slaby, O., M. Svoboda, et al. (2009). "MicroRNAs in colorectal cancer: translation of

molecular biology into clinical application." Mol Cancer 8: 102.

Slattery, M. L., J. S. Herrick, et al. (2011). "Genetic variation in the TGF-beta signaling

pathway and colon and rectal cancer risk." Cancer Epidemiol Biomarkers Prev

20(1): 57-69.

Song, M. A., M. Tiirikainen, et al. (2013). "Elucidating the landscape of aberrant DNA

methylation in hepatocellular carcinoma." PLoS One 8(2): e55761.

Lim, U. and M.-A. Song (2011). "Dietary and Lifestyle Correlates of DNA Methylation."

Springer Science (Humana Press).

Sotelo, J., D. Esposito, et al. (2010). "Long-range enhancers on 8q24 regulate c-Myc."

Proc Natl Acad Sci U S A 107(7): 3001-3005.

Strathdee, G., K. Appleton, et al. (2001). "Primary ovarian carcinomas display multiple

methylator phenotypes involving known tumor suppressor genes." Am J Pathol

158(3): 1121-1127.

Struhl, K. (1998). "Histone acetylation and transcriptional regulatory mechanisms." Genes

Dev 12(5): 599-606.

Sunami, E., M. de Maat, et al. (2011). "LINE-1 hypomethylation during primary colon

cancer progression." PLoS One 6(4): e18884.

Tahiliani, M., K. P. Koh, et al. (2009). "Conversion of 5-methylcytosine to 5-

hydroxymethylcytosine in mammalian DNA by MLL partner TET1." Science

324(5929): 930-935.

Takeda, J., S. Seino, et al. (1992). "Human Oct3 gene family: cDNA sequences,

alternative splicing, gene organization, chromosomal location, and expression at

low levels in adult tissues." Nucleic Acids Res 20(17): 4613-4620.

Tamai, H., K. Miyake, et al. (2011). "Resistance of MLL-AFF1-positive acute

lymphoblastic leukemia to tumor necrosis factor-alpha is mediated by S100A6

upregulation." Blood Cancer J 1(11): e38.

Tenesa, A., S. M. Farrington, et al. (2008). "Genome-wide association scan identifies a

colorectal cancer susceptibility locus on 11q23 and replicates risk loci at 8q24

and 18q21." Nat Genet 40(5): 631-637.

Tillinghast, G. W., J. Partee, et al. (2003). "Analysis of genetic stability at the EP300 and

CREBBP loci in a panel of cancer cell lines." Genes Chromosomes Cancer 37(2):

121-131.

Ting, D. T., D. Lipson, et al. (2011). "Aberrant overexpression of satellite repeats in

pancreatic and other epithelial cancers." Science 331(6017): 593-596.

Tomlinson, I. P., E. Webb, et al. (2008). "A genome-wide association study identifies

colorectal cancer susceptibility loci on chromosomes 10p14 and 8q23.3." Nat

Genet 40(5): 623-630.

Toyota, M., N. Ahuja, et al. (1999). "CpG island methylator phenotype in colorectal

cancer." Proc Natl Acad Sci U S A 96(15): 8681-8686.

Toyota, M., N. Ahuja, et al. (1999). "Aberrant methylation in gastric cancer associated

with the CpG island methylator phenotype." Cancer Res 59(21): 5438-5442.

Trievel, R. C. (2004). "Structure and function of histone methyltransferases." Crit Rev

Eukaryot Gene Expr 14(3): 147-169.

Tuupanen, S., M. Turunen, et al. (2009). "The common colorectal cancer predisposition

SNP rs6983267 at chromosome 8q24 confers potential to enhanced Wnt

signaling." Nat Genet 41(8): 885-890.

Tyagi, S., C. Vaz, et al. (2008). "CID-miRNA: a web server for prediction of novel miRNA

precursors in human genome." Biochem Biophys Res Commun 372(4): 831-834.

Umar, A., C. R. Boland, et al. (2004). "Revised Bethesda Guidelines for hereditary

nonpolyposis colorectal cancer (Lynch syndrome) and microsatellite instability." J

Natl Cancer Inst 96(4): 261-268.

Varallyay, E., J. Burgyan, et al. (2008). "MicroRNA detection by northern blotting using

locked nucleic acid probes." Nat Protoc 3(2): 190-196.

Venturini, L., K. Battmer, et al. (2007). "Expression of the miR-17-92 polycistron in

chronic myeloid leukemia (CML) CD34+ cells." Blood 109(10): 4399-4405.

Vigneault, F., D. Ter-Ovanesyan, et al. (2012). "High-throughput multiplex sequencing of

miRNA." Curr Protoc Hum Genet Chapter 11: Unit 11.12.1-10.

Wang, C. J., Z. G. Zhou, et al. (2009). "Clinicopathological significance of microRNA-31, -

143 and -145 expression in colorectal cancer." Dis Markers 26(1): 27-34.

Wang, Q., M. Williamson, et al. (2007). "Hypomethylation of WNT5A, CRIP1 and S100P in

prostate cancer." Oncogene 26(45): 6560-6565.

Wang, R., W. M. Dashwood, et al. (2011). "NADPH oxidase overexpression in human

colon cancers and rat colon tumors induced by 2-amino-1-methyl-6-

phenylimidazo[4,5-b]pyridine (PhIP)." Int J Cancer 128(11): 2581-2590.

Wang, Z., C. Zang, et al. (2008). "Combinatorial patterns of histone acetylations and

methylations in the human genome." Nat Genet 40(7): 897-903.

Wark, A. W., H. J. Lee, et al. (2008). "Multiplexed detection methods for profiling

microRNA expression in biological samples." Angew Chem Int Ed Engl 47(4):

644-652.

Wei, E. K., E. Giovannucci, et al. (2004). "Comparison of risk factors for colon and rectal

cancer." Int J Cancer 108(3): 433-442.

Wilusz, J. E., H. Sunwoo, et al. (2009). "Long noncoding RNAs: functional surprises from

the RNA world." Genes Dev 23(13): 1494-1504.

Wiseman, M. (2008). "The second World Cancer Research Fund/American Institute for

Cancer Research expert report. Food, nutrition, physical activity, and the

prevention of cancer: a global perspective." Proc Nutr Soc 67(3): 253-256.

Witkos, T. M., E. Koscianska, et al. (2011). "Practical Aspects of microRNA Target

Prediction." Curr Mol Med 11(2): 93-109.

Wojcik, S. E., S. Rossi, et al. (2010). "Non-codingRNA sequence variations in human

chronic lymphocytic leukemia and colorectal cancer." Carcinogenesis 31(2): 208-

215.

Wong, J. J., N. J. Hawkins, et al. (2007). "Colorectal cancer: a model for epigenetic

tumorigenesis." Gut 56(1): 140-148.

Wright, J. B., S. J. Brown, et al. (2010). "Upregulation of c-MYC in cis through a large

chromatin loop linked to a cancer risk-associated single-nucleotide

polymorphism in colorectal cancer cells." Mol Cell Biol 30(6): 1411-1420.

Wyman, S. K., R. K. Parkin, et al. (2009). "Repertoire of microRNAs in epithelial ovarian

cancer as determined by next generation sequencing of small RNA cDNA

libraries." PLoS One 4(4): e5311.

Yamakuchi, M., M. Ferlito, et al. (2008). "miR-34a repression of SIRT1 regulates

apoptosis." Proc Natl Acad Sci U S A 105(36): 13421-13426.

Yan, L. X., X. F. Huang, et al. (2008). "MicroRNA miR-21 overexpression in human breast

cancer is associated with advanced clinical stage, lymph node metastasis and

patient poor prognosis." RNA 14(11): 2348-2360.

Yang, L., Z. Ma, et al. (2010). "MicroRNA-602 regulating tumor suppressive gene

RASSF1A is overexpressed in hepatitis B virus-infected liver and hepatocellular

carcinoma." Cancer Biol Ther 9(10): 803-808.

Yao, Y., A. L. Suo, et al. (2009). "MicroRNA profiling of human gastric cancer." Mol Med

Rep 2(6): 963-970.

Yasui, K., S. Arii, et al. (2002). "TFDP1, CUL4A, and CDC16 identified as targets for

amplification at 13q34 in hepatocellular carcinomas." Hepatology 35(6): 1476-

1484.

Yori, J. L., E. Johnson, et al. (2010). "Kruppel-like factor 4 inhibits epithelial-to-

mesenchymal transition through regulation of E-cadherin gene expression." J Biol

Chem 285(22): 16854-16863.

Zanke, B. W., C. M. Greenwood, et al. (2007). "Genome-wide association scan identifies a

colorectal cancer susceptibility locus on chromosome 8q24." Nat Genet 39(8):

989-994.

Zhao, J., B. K. Sun, et al. (2008). "Polycomb proteins targeted by a short repeat RNA to

the mouse X chromosome." Science 322(5902): 750-756.

Zhong, M., Z. Bian, et al. (2013). "miR-30a Suppresses Cell Migration and Invasion

Through Downregulation of PIK3CD in Colorectal Carcinoma." Cell Physiol

Biochem 31(2-3): 209-218.