study of mirna polymorphisms and their potential...

STUDY OF MIRNA POLYMORPHISMS AND

THEIR POTENTIAL ASSOCIATION WITH

BREAST CANCER RISK IN AUSTRALIAN

CAUCASIAN POPULATIONS.

Diego Fernando Chacon Cortes MBBS, MMedRes

Student No n6542042

Submitted in fulfilment of the requirements for the degree of

Doctor of Philosophy

School of Biomedical Sciences

Institute of Health and Biomedical Innovation

Faculty of Health

Queensland University of Technology

2015

Keywords

Association study; breast cancer; DNA extraction; genomic DNA, genotyping;

MALDI-TOF MS analysis; microRNA (miRNA); microRNA SNPs (miR-SNPs);

single nucleotide polymorphisms (SNPs); haplotype

Study of miRNA polymorphisms and their potential association with breast cancer risk in Australian Caucasian populations. i

Abstract

Breast cancer is a malignant neoplasm that results from innate and/or acquired

genetic predisposition from somatic mutations in breast tissue and their interaction

with hormonal exposure and other risk factors. MicroRNAs might be important for

the pathogenesis of breast cancer since they have been described as important

epigenetic regulators of gene expression present in all the molecular pathways

involved in cancer biology. Previous research has shown that mutations present in

miRNA sequences could result in the changes in miRNA synthesis and function seen

in cancer biology and single nucleotide polymorphisms (SNPS) are the most

common type of variation present in miRNA genes. Only a small fraction of the total

of microRNA SNPs (miR-SNPs) identified to date, have been studied in relation to

breast cancer. Findings are still somewhat controversial and this is probably the

result of limitations in terms of the populations examined, experimental design and

methods used for analysis. To address this gap in the literature the present study aims

to achieve the following aims: i) establish a large sized and well characterised

prospective DNA biobank to be used as a replication cohort to the one previously

available in our facility, using the most appropriate DNA extraction protocol in a

time-efficient and cost-effective manner; and ii) identify miR-SNPs associated with

breast cancer risk in both Australian Caucasian populations using genotype and

haplotype association analysis.

Chapter One explores the scientific problem and aims of this study. Chapter

Two and Three provide a comprehensive literature review of the background in this

field of work and the methodology used to conduct our research. We aimed to

establish a prospective genomic DNA (gDNA) biobank extracted from whole blood

samples obtained from 929 breast cancer patients in Queensland recruited in

collaboration with the Cancer Council Queensland. In order to determine the best

possible method to perform DNA extraction from the ones available at our facility,

Chapter Four presents our in-house evaluation study conducted to determine the most

cost effective and time efficient method to perform genomic DNA (gDNA)

extraction from whole blood samples. We compared the modified salting-out

iiStudy of miRNA polymorphisms and their potential association with breast cancer risk in Australian Caucasian populations.

protocol described by Nasiri et al, against a traditional salting out method and a

commercially available DNA extraction kit and we performed gDNA extraction from

whole blood samples. We evaluated the performance of each method in terms of the

quality and quantity of DNA obtained, as well as time requirements and costs for

each technique. Our results highlighted the modified salting out method to be the

most cost-effective and time-efficient technique to isolate genomic DNA from whole

blood samples. The modified salting out method has consistently proved to be an

appropriate technique based on the quality and quantity of DNA obtained from blood

samples extracted at our facility and these samples have also been shown to perform

well in downstream applications used for the genotyping studies included in this

thesis. Some of the limitations of this study included the small number of samples

used to test our chosen DNA extraction methods and that we also failed to include a

more comprehensive range of techniques that were unavailable at the time at our

facility.

We also decided to review current knowledge of genomic DNA extraction

from whole blood methods. Our study (Chapter Five) reviews the history of DNA

extraction techniques, the main categories of the methods most commonly used in

facilities around the world, as well as the available literature on evaluation studies

comparing different extraction techniques to highlight their strengths and

weaknesses. This review evidenced the limited scientific evidence available in

regards to the best choice for gDNA extraction and it also highlighted the need to

conduct our own validation study to determine the most appropriate technique to be

used in the establishment of our new case control cohort.

The following study (Chapter Six) validated previous findings in relation to the

miR-SNP rs2910164 present in MIR146A that has shown conflicting results in

previous studies because it had not been properly analysed in purely sporadic or

purely Caucasian breast cancer populations. We genotyped this variant using high

resolution melt (HRM) analysis and were able to determine that presence of the

mutant allele in this gene is significantly associated with breast cancer risk in both

our primary and replicate populations (p = 0.03 and p = 0.00013). However we

Study of miRNA polymorphisms and their potential association with breast cancer risk in Australian Caucasian populations. iii

acknowledge that we were limited by the small number of samples included in our

replication population (GU-CCQ BB population), as this was still under

development. To the best of our knowledge this is the first study to identify

association between this rs2910164 and breast cancer risk in a purely sporadic breast

cancer bearing Caucasian population.

Chapter Seven presents the results of a large scale genotyping study using both

of our previously established cohorts. In this study, we developed a selection

algorithm to choose miR-SNPs to be included in this and other studies, which

allowed the selection of a large panel of miR-SNPs (24 miR-SNPs in total present in

9 miRNA genes) to be analysed using multiplex PCR and chip-based matrix assisted

laser desorption ionization time-of-flight (MALDI-TOF) mass spectrometry (MS).

Results from our study showed 6 of the miR-SNPs (rs73798217, rs112394324,

rs35301225, rs1547354, rs7050391 and rs3851812) not to be polymorphic in our

populations, providing initial validation of the rare presence of these miR-SNPs in

Caucasian populations. We were also most importantly, able to show significant

association to breast cancer risk for one of our selected variants present in MIR145.

We determined that miR-SNP rs353291 in the MIR145 gene had significant

differences in allele frequencies between cases and controls in both of our tested

populations (p = 0.041 and p = 0.023) and to the best of our knowledge this is the

first report of an association to breast cancer risk for this variant. Our study provided

initial validation of our findings using two independent populations, but further

validation of these results in larger cohorts or populations of both similar and

different ethnic background is required to strengthen their significance. If we take

into account multiple testing correction of our results, our findings are not

statistically significant, but since they were validated in two independent populations

they are still very interesting and worthy of further research.

Our last study (Chapter Eight) presents the results of our genotype and

haplotype analysis of 6 miR-SNPs present in the microRNA 17-92 cluster host gene

(MIR17HG) conducted in our two Australian Caucasian breast cancer cohorts. We

decided to study polymorphisms in MIR17HG because of its importance in cancer

development, which has been shown in previous studies. It was the first cluster host

ivStudy of miRNA polymorphisms and their potential association with breast cancer risk in Australian Caucasian populations.

gene to be found to play a role in cancer biology and it has also been linked to the

pathogenesis of breast cancer targeting important genes in breast biology, like

NCOA3, CCDN1 and ESR1. To the best of our knowledge there are no previous

studies evaluating the role of miR-SNPs for this particular cluster host gene, making

it an ideal candidate for research. We selected 6 miR-SNPs located in the cluster

gene region or in close proximity to it providing good coverage for this miRNA

cluster gene, and they included a polymorphism located inside the mature sequence

for MIR92A1 (rs9589207). We genotyped these miR-SNPs using multiplex PCR and

chip-based MALDI-TOF MS analysis in our two independent Australian Caucasian

breast cancer case-control cohorts. Our results showed significant differences

between cases and control for miR-SNP rs4284505 in both of our tested populations

(p = 0.01 and p = 0.03) and for rs7336610 only in our initial population (p = 0.03) for

the genotypic analysis. Furthermore haplotypic analysis on the combined results

from both populations identified these two miR-SNPs (rs4284505/rs7336610) to be

in linkage disequilibrium and part of a haplotype block (D’ score = 0.98). Chi-square

analysis of the haplotypes contained in this block showed a lower frequency for

haplotype AC in cases compared to controls and our analysis showed it to be

significantly associated with breast cancer (p = 5x10-4). Odds ratio (OR) for this

haplotype suggests that individuals that carry the AC haplotype had a reduced risk of

developing breast cancer (about 25%) in comparison to other haplotype carriers (OR:

0.75; 95% CI: 0.60 to 0.94; p = 0.012). To the best of our knowledge this is the first

report of a significant association between these polymorphisms and breast cancer

risk.

Taken together, these data indicate that the presence of specific miR-SNPs may

play a role in breast cancer development and the results presented in Chapters Six to

Eight of this document are only the initial step required to determine their effect on

breast cancer pathogenesis. Further research that includes functional assays will be

required for these findings to be translated into strategies that impact breast cancer

diagnosis, treatment and prognosis. Further validation of our positive findings

through genotyping of larger populations or populations of different ethnic

backgrounds as well as functional validation using cell culture or animal models are

Study of miRNA polymorphisms and their potential association with breast cancer risk in Australian Caucasian populations. v

also required to proceed with translational studies and these would allow a better

understanding of the detailed mechanisms occurring in breast cancer biology.

viStudy of miRNA polymorphisms and their potential association with breast cancer risk in Australian Caucasian populations.

List of publications and manuscripts

The following is a list of publications and manuscripts that have been prepared in

conjunction with this thesis

Chacon-Cortes D, Smith RA, Lea RA, Youl PH and Griffiths LR (2015).

Association of microRNA 17-92 cluster host gene (MIR17HG)

polymorphisms with breast cancer. Tumor Biology, 36(7), 5369-76

Chacon-Cortes D, and Griffiths LR (2014). Methods for extracting

genomic DNA from whole blood samples: current perspectives. Journal of

Biorepository Science for Applied Medicine, 2, 1-9.

Chacon-Cortes D, Haupt LM, Lea RA and Griffiths LR (2012).

Comparison of genomic DNA extraction techniques from whole blood

samples: a time, cost and quality evaluation study. Molecular Biology

Reports, 39(5), 5961-6.

Upadhyaya A, Smith RA, Chacon-Cortes D, Revêchon G, Haupt LM,

Chambers SK, Youl PH and Griffiths LR. Association of the microRNA-

single nucleotide polymorphism rs2910164 in miR146a with sporadic

breast cancer susceptibility: a case control study. Gene (Under revision)

Chacon-Cortes D, Smith RA, Lea RA, Youl PH and Griffiths LR. Genetic

association analysis of miRNA single nucleotide polymorphisms (miR-

SNPs) in Australian Caucasian breast cancer case control cohorts. BMC

Medical Genetics (Under revision)

Study of miRNA polymorphisms and their potential association with breast cancer risk in Australian Caucasian populations. vii

List of additional publications and manuscripts

The following is a list of publications involving collaborative research

undertaken as part of the PhD

Journal articles

Okolicsanyi RK, Faure M, Jacinto JM, Chacon-Cortes D, Chambers S,

Youl PH, et al. Association of the SNP rs2623047 in the HSPG

modification enzyme SULF1 with an Australian Caucasian Breast Cancer

Cohort. Gene. 2014; 547(1):50-4.

Okolicsanyi RK, Buffiere A, Jacinto JM, Chacon-Cortes D, Chambers S,

Youl PH, et al. Association of heparan sulfate proteoglycans SDC1 and

SDC4 polymorphisms with breast cancer in an Australian Caucasian

population. Tumor Biology. 2014:1-8

Conference papers

Chacon-Cortes D, Smith RA, Lea RA, Chambers S, Youl PH., Griffiths

LR. Disease association analysis of selected miRNA SNP genotypes for

two Australian Caucasian Breast Cancer Populations. IHBI Inspires

Postgraduate Conference. Gold Coast, November 20th-21st 2014.

viiiStudy of miRNA polymorphisms and their potential association with breast cancer risk in Australian Caucasian populations.

Chacon-Cortes D, Haupt LM, Lea RA, Griffiths LR. DNA Extraction

techniques from whole blood samples: A time, cost and quality evaluation

study. GHU Gold Coast Health & Medical Research Conference. Gold

Coast, 1-2nd December 2011.

Chacon-Cortes D, Chambers S., Youl PH., Smith RA., Lea RA, Griffiths,

LR. A genetic approach to investigate clinical and psychological outcomes

for women with breast cancer – establishment of a Griffith Health Institute

Breast Cancer Biobank. Annual Human Genetics Society of Australasia

conference. Gold Coast International, Surfers Paradise. July 31st – August

2nd 2011.

Study of miRNA polymorphisms and their potential association with breast cancer risk in Australian Caucasian populations. ix

Table of Contents

Keywords .......................................................................................................................................... i

Abstract ............................................................................................................................................ ii

List of publications and manuscripts ................................................................................................. vii

List of additional publications and manuscripts................................................................................ viii

Table of Contents .............................................................................................................................. x

List of Tables ................................................................................................................................... xv

List of Figures ............................................................................................................................... xvii

List of Abbreviations .................................................................................................................... xviii

Statement of Original Authorship .................................................................................................... xxi

Acknowledgements ........................................................................................................................ xxii

CHAPTER 1: INTRODUCTION ................................................................................................ 24

1.1 A description of the scientific problem investigated .............................................................. 24

1.2 The overall objective of the study ......................................................................................... 25

1.3 The specific aims of the study .............................................................................................. 26

1.4 An account of scientific progress linking the scientific papers ............................................... 26

CHAPTER 2: LITERATURE REVIEW ..................................................................................... 29

2.1 Breast cancer epidemiology .................................................................................................. 29

2.2 Anatomy of the mamary gland.............................................................................................. 32

2.3 Breast cancer definition and aetiology .................................................................................. 35

2.4 Breast cancer classification................................................................................................... 38

2.4.1 Histological Classification ......................................................................................... 39

2.4.2 Bloom and Richardson grading system ...................................................................... 41

2.4.3 The American Joint Committee on Cancer (AJCC) staging system. ............................ 42

2.5 Genetics and breast cancer.................................................................................................... 43

2.5.1 Genes and breast cancer............................................................................................. 45

xStudy of miRNA polymorphisms and their potential association with breast cancer risk in Australian Caucasian populations.

2.6 MicroRNAs and breast cancer .............................................................................................. 56

2.6.1 MicroRNA biogenesis and functions.......................................................................... 56

2.6.2 MicroRNAs in cancer ................................................................................................ 60

2.6.3 MicroRNAs and breast cancer ................................................................................... 66

2.6.4 MicroRNA SNPs (miR-SNPs) and breast cancer ........................................................ 73

2.7 Summary and Implications ................................................................................................... 78

CHAPTER 3: METHODOLOGY BACKGROUND .................................................................. 81

3.1 Study Populations ................................................................................................................ 81

3.1.1 Genomics Research Centre Breast Cancer Population (GRC-BC population) .............. 81

3.1.2 The Griffith University – Cancer Council Queensland Breast Cancer Biobank (GU-CCQ BB) .......................................................................................................... 81

3.2 Case - Control Association studies........................................................................................ 82

3.2.1 Genomic DNA Extraction ......................................................................................... 82

3.2.2 Genotyping Techniques ............................................................................................. 84

3.2.3 Statistical Analysis .................................................................................................... 88

CHAPTER 4: COMPARISON OF GENOMIC DNA EXTRACTION TECHNIQUES FROM WHOLE BLOOD SAMPLES: A TIME, COST AND QUALITY EVALUATION STUDY. ..... 91

4.1 Statement of contribution of co-authors for thesis by published paper ................................... 91

4.2 Abstract ............................................................................................................................... 93

4.3 Introduction ......................................................................................................................... 93

4.4 Materials and methods ......................................................................................................... 95

4.4.1 Samples .................................................................................................................... 95

4.4.2 Genomic DNA extraction methods ............................................................................ 95

4.4.3 Evaluation of extracted DNA ..................................................................................... 97

4.5 Results ................................................................................................................................. 98

4.6 Discussion ......................................................................................................................... 103

CHAPTER 5: METHODS FOR EXTRACTING GENOMIC DNA FROM WHOLE BLOOD SAMPLES: CURRENT PERSPECTIVES. ............................................................................... 105

5.1 Statement of contribution of co-authors for thesis by published paper ................................. 105

5.2 Abstract ............................................................................................................................. 107

Study of miRNA polymorphisms and their potential association with breast cancer risk in Australian Caucasian populations. xi

5.3 Introduction ....................................................................................................................... 107

5.4 Initial development of DNA extraction techniques .............................................................. 108

5.5 Main types of DNA extraction methods from human whole blood samples ......................... 110

5.5.1 Solution based DNA extraction methods .................................................................. 112

5.5.2 Solid phase DNA extraction methods ....................................................................... 114

5.6 Choosing the appropriate protocol ...................................................................................... 118

5.7 Conclusions ....................................................................................................................... 125

CHAPTER 6: ASSOCIATION OF THE MICRORNA-SNP RS2910164 IN MIR146A WITH SPORADIC BREAST CANCER SUSCEPTIBILITY: A CASE CONTROL STUDY. ............ 126

6.1 Statement of contribution of co-authors for thesis by published paper.................................. 126

6.2 Abstract ............................................................................................................................. 128

6.3 Background ....................................................................................................................... 128

6.4 Materials and methods........................................................................................................ 130

6.4.1 Study Cohort ........................................................................................................... 130

6.4.2 Genotyping ............................................................................................................. 131

6.4.3 Statistical Analysis .................................................................................................. 132

6.5 Results ............................................................................................................................... 132

6.6 Discussion ......................................................................................................................... 134

6.7 Conclusions ....................................................................................................................... 137

CHAPTER 7: GENETIC ASSOCIATION ANALYSIS OF MIRNA SNPS IMPLICATES MIR145 IN BREAST CANCER SUSCEPTIBILITY. ................................................................ 138

7.1 Statement of contribution of co-authors for thesis by published paper.................................. 138

7.2 Abstract ............................................................................................................................. 140

7.3 Background ....................................................................................................................... 140

7.4 Materials and methods........................................................................................................ 143

7.4.1 Study Populations .................................................................................................... 143

7.4.2 Genomic DNA sample preparation from whole human blood. .................................. 144

7.4.3 miRNA SNP selection ............................................................................................. 145

7.4.4 Primer design .......................................................................................................... 148

7.4.5 Primary Multiplex PCR ........................................................................................... 151

xiiStudy of miRNA polymorphisms and their potential association with breast cancer risk in Australian Caucasian populations.

7.4.6 MALDI-TOF MS analysis and data analysis ............................................................ 151

7.4.7 Statistical analysis ................................................................................................... 151

7.5 Results and discussion ........................................................................................................ 152

7.6 Conclusions ....................................................................................................................... 158

CHAPTER 8: ASSOCIATION OF MICRORNA 17-92 CLUSTER HOST GENE (MIR17HG) POLYMORPHISMS WITH BREAST CANCER. .................................................................... 160

8.1 Statement of contribution of co-authors for thesis by published paper ................................. 160

8.2 Abstract ............................................................................................................................. 162

8.3 Introduction ....................................................................................................................... 162

8.4 Materials and methods ....................................................................................................... 166

8.4.1 Study populations .................................................................................................... 166

8.4.2 Genomic DNA sample preparation from whole human blood. .................................. 168

8.4.3 miRNA SNP selection ............................................................................................. 168

8.4.4 Primer design and multiplex PCR genotyping .......................................................... 169

8.4.5 MALDI-TOF MS analysis and data analysis ............................................................ 171

8.4.6 Statistical analysis ................................................................................................... 171

8.5 Results ............................................................................................................................... 172

8.5.1 Genotypic Analysis ................................................................................................. 172

8.5.2 Haplotypic Analysis ................................................................................................ 174

8.6 Discussion ......................................................................................................................... 176

CHAPTER 9: GENERAL DISCUSSION .................................................................................. 178

9.1 Introduction ....................................................................................................................... 178

9.2 Breast cancer and microRNAs ............................................................................................ 179

9.3 Population resources development ...................................................................................... 181

9.3.1 DNA extraction methods ......................................................................................... 182

9.3.2 Comparison of DNA Extraction methods ................................................................. 182

9.4 Association studies............................................................................................................. 184

9.4.1 MIR146A results ...................................................................................................... 184

9.4.2 miRNA selection and analysis ................................................................................. 184

9.4.3 MicroRNA 17-92 cluster host gene analysis ............................................................. 185 Study of miRNA polymorphisms and their potential association with breast cancer risk in Australian Caucasian populations. xiii

9.4.4 Comparison to GWAS results and miRNA functional implications. .......................... 187

9.5 Limitations and future directions. ....................................................................................... 188

9.6 Conclusions ....................................................................................................................... 189

BIBLIOGRAPHY ....................................................................................................................... 190

APPENDICES............................................................................................................................. 216

Appendix A Modified salting out protocol for DNA extraction from whole blood samples. Adapted from Nasiri et al., 2005. ............................................................... 216

Appendix B Protocol for genotyping using multiplex PCR and matrix-assisted laser desorption/ionisation time-of-flight mass spectrometry (MALDI-TOF MS) analysis. .................................................................................................................. 219

Appendix C Protocol for genotyping using automated Sanger sequencing ........................... 222

xivStudy of miRNA polymorphisms and their potential association with breast cancer risk in Australian Caucasian populations.

List of Tables

Table 2-1 Cancer incidence and mortality in the world in 2007 and 2011 according to the International Agency for Research on Cancer (Ferlay et al., 2010; Ferlay et al., 2013). ............................................................................................................................. 30

Table 2-2 Incidence, mortality and 5 year prevalence of breast cancer in women in Australia and the world in 2011 according to the International Agency for Research on Cancer (Ferlay et al., 2013). ........................................................................................................ 30

Table 2-3 Histological Classification of Breast Cancer (Robbins et al., 2010). .................................. 40

Table 2-4 American Joint Committee on Cancer Staging System (Adapted from AJCC Staging Manual 2002) ................................................................................................................. 42

Table 2-5 Autosomal dominant inherited breast cancer syndromes. .................................................. 50

Table 2-6 Familial forms of breast cancer. ....................................................................................... 51

Table 2-7 Some of the miRNAs and their target genes most commonly found in relation to cancer. ............................................................................................................................ 63

Table 2-8 Summary of the most significantly associated microRNAs in breast cancer and their target genes .................................................................................................................... 69

Table 4-1 Spectrophotometry measures from whole blood samples (BC319-BC323)........................ 99

Table 4-2 Spectrophotometry measures from gDNA extracted from whole blood samples using 3 different extraction methods ....................................................................................... 100

Table 4-3 Time and Cost summary for different gDNA extraction methods ................................... 102

Table 5-1 DNA extraction methods commonly used for extraction from whole blood samples ....... 111

Table 5-2 Summary of comparative study of DNA extraction methods by Lahiri et al .................... 119

Table 5-3 Summary of results from comparative study on DNA extraction techniques by Lee et al .................................................................................................................................. 122

Table 5-4 Summary of features evaluated for deoxyribonucleic acid (DNA) extraction methods reviewed by Carpi et al.1 ............................................................................................... 123

Table 5-5 Summary of results from comparative study of DNA extraction methods by Chacon-Cortes et al ................................................................................................................... 124

Table 6-1 Genotype and allele frequencies for GRC-BC population .............................................. 132

Table 6-2 Genotype and allele frequencies for GU-CCQ BB population ........................................ 133

Table 6-3 Combined primary and secondary population genotype and allele frequencies and analysis results.............................................................................................................. 134

Table 6-4 Odds Ratios for combined population analysis results. ................................................... 134

Table 7-1 Selected features from the Human miRNA disease database (HMDD) included in present study ................................................................................................................ 145

Study of miRNA polymorphisms and their potential association with breast cancer risk in Australian Caucasian populations. xv

Table 7-2 List of the miRNA SNPs included in multiplex primer design using the MassARRAY® Assay Design Suite v1.0 software (SEQUENOM Inc., San Diego, CA, USA) ..................................................................................................................... 149

Table 7-3 Primer sequences for the miRNA SNPs included in genotyping study using multiplex PCR reaction and MALDI-TOF MS .............................................................. 150

Table 7-4 Allele and genotype frequencies for miRNA SNPs in MIR210 obtained from the GRC-BC population ..................................................................................................... 154

Table 7-5 Allele and genotype frequencies for miRNA SNPs in MIR221 and MIR222 from the GRC-BC Population ..................................................................................................... 154

Table 7-6 Allele and genotype frequencies for miRNA SNPs in MIR21 obtained from the GRC-BC population ..................................................................................................... 155

Table 7-7 Allele and genotype frequencies for miRNA SNPs in MIRLET7A1 obtained from the GRC-BC cohort ............................................................................................................ 155

Table 7-8 Allele and genotype frequencies for miRNA SNPs in MIRLET7A2 obtained from the GRC-BC population ..................................................................................................... 156

Table 7-9 Allele and genotype frequencies for rs55945735 located in MIR145 obtained from the GRC-BC population ................................................................................................ 156

Table 7-10 Allele and genotype frequencies for SNP rs353291 located in MIR145 obtained from the GRC-BC and GU-CCQ BB cohorts ................................................................. 157

Table 8-1 SNPs in the microRNA 17-92 cluster host gene selected for primer design using the MassARRAY® Assay Design Suite v1.0 software (SEQUENOM Inc., San Diego, CA, USA) ..................................................................................................................... 170

Table 8-2 Primer sequences for microRNA 17-92 cluster host gene SNPs included in genotyping study using multiplex PCR reaction and MALDI-TOF MS .......................... 170

Table 8-3 Chromosomal location and allele information for selected miRNA SNPs in the microRNA 17-92 cluster host gene ................................................................................ 173

Table 8-4 Allele frequencies of SNPs in the microRNA 17-92 cluster host gene obtained from the GRC-BC and GU-CCQ BB cohorts ......................................................................... 173

Table 8-5 Haplotype block association analysis for selected SNPS in the microRNA 17-92 cluster host gene in combined GRC-BC/GU-CCQ BB population .................................. 176

xviStudy of miRNA polymorphisms and their potential association with breast cancer risk in Australian Caucasian populations.

List of Figures

Figure 2-1 Anatomical structure of mammary glands. Anterolateral disection and sagital section (Reproduced from , Netter, F. H. (2011). Atlas of human anatomy. Copyright © 2011 by permission of Elsevier, Inc.)........................................................................... 33

Figure 2-2 Lymphatic drainage of the breast ((Reprinted from , Netter, F. H. (2011). Atlas of human anatomy. Copyright © 2011 by permission of Elsevier, Inc.). ............................... 34

Figure 2-3 ERBB2 Receptor and intracellular signalling pathways. Reproduced from Johnston & Dowsett, 2003. Copyright © 2003, with permission from Nature Publishing Group. ............................................................................................................................ 48

Figure 2-4 Copyright © 2006. Canonical pathway of miRNA genesis and mRNA regulation. Reprinted by permission of Nature Publishing Group (Esquela-Kerscher & Slack, 2006). ............................................................................................................................. 59

Figure 3-1 Example of High Resolution Melt (HRM) curves showing three different miR-SNP genotypes. ...................................................................................................................... 85

Figure 4-1 Agarose gel showing PCR product amplification on DNA extracted by traditional salting out method(lanes 1-6) and modified salting out method(lanes 7-12) .................... 102

Figure 7-1 MicroRNA SNP (miR-SNP) selection algorithm using the Human miRNA Disease Database (HMDD). This flow chart shows workflow for selection of preliminary miR-SNPs included in genotyping study. ...................................................................... 147

Figure 8-1 Linkage Disequilibrium (LD) blocks for analysed microRNA polymorphisms in the microRNA 17-92 cluster host gene................................................................................ 175

Study of miRNA polymorphisms and their potential association with breast cancer risk in Australian Caucasian populations. xvii

List of Abbreviations

3’-UTR 3’ untranslated region

AJCC American joint committee on cancer

AT Ataxia telangiectasia

BCLL B-cell chronic lymphocytic leukaemia

CI Confidence interval

CLL Chronic lymphocytic leukaemia

CS Cowden syndrome

CsCl Cesium Chloride

DEAE diethylaminoethyl cellulose

DCIS Ductal carcinoma in situ

DGCR8 DiGeorge syndrome critical region gene 8

DLBCL Diffuse large B cell lymphoma

DNA Deoxyribonucleic Acid

dsRBD Double stranded RNA binding domain

EDTA Ethilenediaminetetraacetic acid

ERα Oestrogen receptor alpha

ERβ Oestrogen receptor beta

ER Oestrogen receptor

ERBB2 Human epidermal growth factor receptor 2 gene

FDA Food and drug administration

FGFR2 Fibroblast growth factor receptor 2

gDNA Genomic DNA

GRC-BC Genomics research centre breast cancer population

xviiiStudy of miRNA polymorphisms and their potential association with breast cancer risk in Australian Caucasian populations.

GU-CCQ BB Griffith University-Cancer Council Queensland

Biobank

GWAS Genome-wide association studies

GTC Guanidine thiocyanate

HER2/neu Human epidermal growth factor receptor 2

HL Hodgkin’s lymphoma

HMDD Human miRNA disease database

HRM High resolution melting

HWE Hardy-Weinberg equilibrium

IARC International agency for research on cancer

ISH In situ hybridisation

Kb kilobases

LCIS Lobular carcinoma in situ

LD Linkage disequilibrium

µTAS Miniaturised total chemical analysis system

MAF Minor allele frequency

MALDI-TOF Matrix assisted laser desorption ionization time-of-

flight

MGA Microfluidic genetic analysis

MINDACT Microarray in node negative disease may avoid

chemotherapy trial

MIR17HG MicroRNA 17-92 cluster host gene

miRNA MicroRNA

miR-SNP MicroRNA-SNP

mRNA Messenger RNA

MS Mass Spectrometry

MTHFR Methylenetetrahydrofolate reductase gene

Study of miRNA polymorphisms and their potential association with breast cancer risk in Australian Caucasian populations. xix

NCBI National center for biotechnology information

NST No special type

OR Odds ratio

P-bodies Processing bodies

PCR Polymerase chain reaction

piRNA piwi-interacting RNA

PI3K phosphatidylinositol 3-kinase

PMBL primary mediastinal B cell lymphoma

PR Progesterone receptor

pre-miRNA MicroRNA precursor

pri-miRNA MicroRNA primary transcript

qPCR Quantitative real time polymerase chain reaction

RBP RNA binding protein

rcf Relative centrifugal force

RISC RNA-induced silencing complex

RNA Ribonucleic Acid

RNase Ribonuclease

SD Standard deviation

SDS Sodium dodecyl sulfate

siRNA Short interfering RNA

snoRNA Small nucleolar RNA

SNP Single Nucleotide Polymorphism

TS Tumour suppressor

XPO5 Exportin5

.

xxStudy of miRNA polymorphisms and their potential association with breast cancer risk in Australian Caucasian populations.

Statement of Original Authorship

The work contained in this thesis has not been previously submitted to meet

requirements for an award at this or any other higher education institution. To the

best of my knowledge and belief, the thesis contains no material previously

published or written by another person except where due reference is made.

Signature: QUT Verified Signature

Date: October 2015

Study of miRNA polymorphisms and their potential association with breast cancer risk in Australian Caucasian populations. xxi

Acknowledgements

I am pleased to thank the many colleagues, friends and family who have kindly

given me some advice, help, encouragement and support over the last four years.

I would like to thank my supervisor Prof Lyn Griffiths for welcoming me to the

Genomics Research Centre and for giving me the opportunity and support required to

develop my research project. I would like to also thank my associate supervisor Dr

Robert A. Smith because his invaluable support, guidance and advice were essential

to the successful completion of this degree.

I would like to thank staff at Queensland University of Technology, the

Institute of Health and Biomedical Innovation and/or the Genomics Research Centre

who contributed in any way to the completion of my work. I am also grateful to the

reviewers of this thesis for their time, valuable comments and suggestions throughout

the examination process of my thesis.

I would like to thank the Australian Government, Griffith University and

Queensland University of Technology for providing the financial support required to

undertake my studies, as well as all the financial sponsors and government bodies

supporting this research project.

I would also like to thank all my friends, both at Griffith University and

Queensland University of Technology, who provided me with advice and support

when I required it and for your friendship and company which made my studies in

much more enjoyable.

Finally, I would like to thank my wife for standing by my side during the ups

and downs of these past four years. You have been my rock and your love,

xxiiStudy of miRNA polymorphisms and their potential association with breast cancer risk in Australian Caucasian populations.

encouragement and support have been essential to this achievement. I would also like

to thank my family in particular both my parents, my brother and sister because even

in the distance their love, encouragement and support keeps me motivated and

because they are the reason behind everything I do.

Study of miRNA polymorphisms and their potential association with breast cancer risk in Australian Caucasian populations. xxiii

Chapter 1: Introduction

1.1 A DESCRIPTION OF THE SCIENTIFIC PROBLEM INVESTIGATED

Breast Cancer is the second most common type of cancer in the world and it is

the most common type of cancer diagnosed in women in Australia. It is a major

concern for Australian public health, because more and more cases are being

diagnosed every day, with Australia’s breast cancer incidence being the third highest

in the world. Furthermore, it is the second most common cause of death for women

in this country (AIHW, 2009, 2012; AIHW & AACR, 2010, 2012; Ferlay et al.,

2010; Ferlay et al., 2013; Robbins, Kumar, & Cotran, 2010).

Genetic information stored within the cells in breast tissue governs cellular fate

from cellular genesis to their death. Breast cancer is a complex disease, which arises

as a result of cumulative and permanent DNA damage to cells within tissues in the

breast and their interaction with many other risk factors (Robbins et al., 2010;

Strachan & Read, 2011; Weinberg, 2007). Therefore, genetics plays a key role in

understanding, preventing and managing this life threatening disease. Many different

genetic markers have been identified and are being used to refine early diagnosis and

prevention, to classify cases and to make more accurate treatment decisions and to

predict response to treatment and other outcomes in breast cancer patients. However,

none of the current genetic markers have proven to be a sufficient and highly

accurate way of addressing any of these important disease features (Ben Zhang,

Beeghly-Fadiel, Long, & Zheng, 2011), therefore more research is needed to identify

molecular markers of disease development and progression.

Study of MicroRNA genes is an emerging field of study in relation to cancer

and they could be useful to understand the complexity of the molecular and cellular

Chapter 1: Introduction 24

mechanisms involved in breast cancer carcinogenesis and progression. They have

been shown to be involved in many different types of cancer because they regulate

complex gene networks required for cell growth, differentiation and cell survival

(Farazi, Spitzer, Morozov, & Tuschl, 2011; Robbins et al., 2010; Westholm & Lai,

2011; Zuoren Yu et al., 2010). Different types of cancer have been seen to exhibit

up-regulation and down-regulation of specific miRNA genes. However, we still lack

knowledge of the mechanisms that result in changes in miRNA expression in cancer,

in particular for breast cancer. Polymorphisms present in the functional and

regulatory regions of specific miRNA genes could change the expression of the

miRNA genes and studying these would be an initial way to address the current gap

in knowledge for their role in the pathogenesis of breast cancer (Gong et al., 2012;

H.-C. Lee, Yang, Chen, & Au, 2011; Mishra, Banerjee, & Bertino, 2008). To

successfully characterise microRNA single nucleotide polymorphisms (miR-SNPs)

we are required to conduct genotyping studies in well-characterized, large case

control cohorts. Therefore in our study we also aimed to establish a larger population

of DNA samples from breast cancer patients and matched controls that could serve as

a replicate population to our previously existing cohort. We determined the most

appropriate method for DNA extraction from whole blood samples that would yield a

good quality and good quantity of genomic DNA to use in our miR-SNP genotyping

analysis. We identified microRNAs previously implicated in breast cancer

development and we selected polymorphisms within those genes to genotype in our

two Australian Caucasian breast cancer cohorts.

1.2 THE OVERALL OBJECTIVE OF THE STUDY

The overall objective of the work described in this thesis is to contribute to the

general knowledge of microRNA polymorphisms and their potential association with

breast cancer risk.


1.3 THE SPECIFIC AIMS OF THE STUDY

1. To create and establish a significant sized prospective DNA biobank from

blood samples collected from newly diagnosed breast cancer patients and

control volunteers in Queensland.

2. To determine the most cost-effective and time-efficient technique to

isolate DNA from whole blood samples that improves our genotyping

results.

3. To conduct association analysis of single nucleotide polymorphisms

present in previously identified microRNA genes using high resolution

melting (HRM) and multiplex PCR and matrix assisted laser desorption

ionization time-of-flight (MALDI-TOF) in both our previously

established and newly developed Australian Caucasian case-control

populations.

.

1.4 AN ACCOUNT OF SCIENTIFIC PROGRESS LINKING THE SCIENTIFIC PAPERS

The five papers presented as part of this thesis are directly linked to the topic of

genetic association analysis of miRNA polymorphisms and breast cancer risk in

Australian Caucasian populations.

Chapter Four presents an evaluation study that was performed to choose a

suitable technique to extract genomic DNA from whole blood samples for the GU-

CCQ BB population. It highlights the modified salting-out method developed by

Nasiri et al to be both a cost-effective and time-efficient technique. The work

presented in this chapter was published in the journal “Molecular Biology Reports”.

26 Chapter 1: Introduction

Chapter Five presents a review of most commonly used laboratory techniques

used to extract genomic DNA from whole blood samples. This literature review was

undertaken to try to identify current findings in terms of cost-effective and time

efficient methods for DNA extraction to determine the most suitable technique to use

for our large sized DNA biobank: The Griffith University-Cancer Council

Queensland Breast Cancer Biobank (GU-CCQ BB). The work presented in this

chapter was published in the “Journal of Biorepository Science for Applied

Medicine”.

Chapter Six describes our initial genotyping study for a single nucleotide

polymorphism (SNP) present in MIR146A in our Australian Caucasian breast cancer

case-control cohorts; the Genomics Research Centre Breast Cancer (GRC-BC)

population and the GU-CCQ Biobank. This study is the first to identify rs2910164 to

be associated with breast cancer risk in an Australian Caucasian population. The

work presented in this chapter has been submitted to the journal “Gene”.

Chapter Seven presents our large scale genetic association analysis of a

significant number of microRNA single nucleotide polymorphisms (miR-SNPs)

present in 10 miRNA genes in both of our Australian Caucasian breast cancer

populations. This work is the first to analyse the 24 selected miR-SNPs in 10 miRNA

genes in relation to breast cancer risk. It is also the first study to identify association

of rs353291 (MIR145) with risk of developing breast cancer in Australian Caucasian

females. The work presented in this chapter has been submitted the “BMC Medical

Genetics”.

Chapter Eight describes our large scale genotype and haplotype analysis of six

miR-SNPs present in the microRNA 17-92 cluster host gene (MIR17HG) conducted

in both Australian Caucasian breast cancer case-control populations. This is the first

association study for MIR17HG SNPs conducted in Australian Caucasian

individuals. It is also the first to identify association with breast cancer risk with both

rs4284505 and haplotype for rs4284505/rs7336610 in Australian Caucasian


populations. The work presented in this chapter has been accepted for publication in

the journal “Tumor Biology”.

Chapter Nine provides a comprehensive discussion of the main findings of this

study and highlights the outcome of the research.

28 Chapter 1: Introduction

Chapter 2: Literature Review

2.1 BREAST CANCER EPIDEMIOLOGY

Cancer represents a significant burden of disease worldwide. In 2007, cancer

accounted for 12.7 million new cases and 7.6 million deaths around the world

according to the International Agency for Research on Cancer (IARC). By 2011 it

had reached 14.1 million new cases and 8.2 million deaths which represented an

increase of about 11% in world incidence and about 8% in world mortality (See

Table 2-1) (Ferlay et al., 2010; Ferlay et al., 2013). This percentage increase in

cancer incidence is particularly remarkable since it is double the increase in world

population for the same period of time, which was only about 5.5% according to the

figures reported by the Population Reference Bureau (6,625 million people in 2007

and 6,987 million people in 2011) (PRB, 2007, 2011).

Cancer in Australia is also a major cause for concern, since incidence increased

about 13% from 2007 to 2011 with 108,368 and 122,031 new cancer cases diagnosed

in each year respectively. In 2011 52,361 cancer cases in Australia were diagnosed in

women, 43% of all diagnosed cancer cases. Similarly, cancer deaths in Australia are

rising with 39,884 people reported as dying from cancer in 2007 and 43,403 in 2011.

44% of all Australian cancer related deaths were women, with about 17,000 women

reported to die from cancer in 2007 and about 19,000 women in 2011 (AIHW &

AACR, 2010, 2012). The Australian population increased from 21,180,600 to

22485300 people in the 2007-2011 period according to the Australian Bureau of

Statistics, which corresponds to a 6.2% increment (ABS, 2007, 2011). Based on

these figures, we observe a similar pattern both in Australia and worldwide statistics

with growth of cancer incidence doubling the reported percent population increase.

Chapter 2: Literature Review 29

Table 2-1 Cancer incidence and mortality in the world in 2007 and 2011 according to the International

Agency for Research on Cancer (Ferlay et al., 2010; Ferlay et al., 2013).

World Mortality World Incidence Year 2007 2011 Increase 2007 2011 Increase All cancers 7,6000,000 8,200,000 8.2% 12,700,000 14,100,000 11% Breast cancer

458,000 522,000 14% 1,380,000 1,700,000 23%

Breast cancer is a particularly concerning type of cancer since 1.38 and 1.7

million new breast cancer cases were reported in 2007 and 2011, respectively,

showing an increase of about 23% in 4 years. In 2007 458,000 deaths in the world

were related to breast cancer, accounting for about 6% of all cancer related deaths,

and they increased to about 14% in 2011 with 522,000 deaths (See Table 2-1). Breast

malignancies are also the most common type of cancer in women in Australia and

worldwide, and it is estimated that they will be diagnosed in one in eight women

living to the age of 90 (Ferlay et al., 2010; Ferlay et al., 2013; Robbins et al., 2010).

Breast cancer has the highest rates for 5 year prevalence for any cancer for women

both in Australia and around the world and it is also the first cause of cancer related

deaths for women worldwide, while being the second cause of cancer death in

Australian women (see Table 2-2) (AIHW, 2009, 2012).

Table 2-2 Incidence, mortality and 5 year prevalence of breast cancer in women in Australia and the

world in 2011 according to the International Agency for Research on Cancer (Ferlay et al., 2013).

World Australia No Cases % Ranking No Cases % Ranking Incidence 1,676,633 25.2 1st 14,710 28.1 1st Mortality 521,817 14.7 1st 2,968 15.7 2nd 5 year prevalence 6,255,391 36.4 1st 59,584 37.5 1st

There are a number of epidemiological risk factors associated with breast cancer and

they include age; reproductive factors such as age at menarche, menopause and first

pregnancy, oral contraceptives and hormone replace therapy; lifestyle factors like

diet, weight, alcohol intake, smoking and ionizing radiation and chemical exposure.

30 Chapter 2: Literature Review

Breast cancer incidence is low in women under 25 years of age and it doubles about

every 10 years until menopause, when the rate of increase slows considerably.

However, this phenomenon seems to be dependent on geographical location, in some

countries like Japan and Colombia there is a flattening of the incidence curve after

menopause, while in US and Sweden risk keeps on increasing until 75 years of age

(Dumitrescu & Cotarla, 2005; McPherson, Steel, & Dixon, 2000).

Reproductive factors also seem to affect breast cancer incidence. Early age of

menarche and late menopause increase the risk of developing breast cancer by about

20%. On the other hand, women who undergo bilateral oophorectomy before 35

years of age have only 40% of the risk of breast cancer of women experiencing

natural menopause. Nulliparity and late age at first birth increase risk of breast

cancer. Furthermore, women giving their first birth after the age of 35 have a higher

risk than nulliparous women. Finally both use of oral contraceptives and hormonal

replacement therapy seem to be associated with a small increase in the risk of

developing breast cancer, but further studies are still required (relative risk (1.07)

(Dumitrescu & Cotarla, 2005; Hulka & Moorman, 2001; McPherson et al., 2000).

Although there seems to be a correlation between the incidence of breast cancer and

dietary fat intake, however epidemiological evidence available up to date is not

particularly consistent or strong. Obesity in postmenopausal women is associated

with a twofold increase in the risk of developing breast cancer. Conversely, obesity is

associated with a reduced incidence of breast cancer in premenopausal women.

Results from studies evaluating the role of other dietary components such as soy

protein, vitamins A, C, E and D have been inconclusive and further research is

required (Dumitrescu & Cotarla, 2005; Hulka & Moorman, 2001; McPherson et al.,

2000).

Alcohol consumption and smoking have been linked to many types of cancer and

there is evidence that suggests that alcohol consumption of one to two drinks on

average per day increases the risk of developing breast cancer (relative risk of 1.2 to

1.4). On the other hand, current evidence shows that smoking has a little overall


effect on breast cancer risk (Dumitrescu & Cotarla, 2005; Hulka & Moorman, 2001;

McPherson et al., 2000).

Finally, studies on chemical compounds including the pesticide bis(4-chlorophenyl)-

1,1,1-trichloroethane (DDT) and its metabolite bis(4-chlorophenyl)-1,1-

dichloroethene (DDE) and polychlorinated biphenyls (PCBs) have shown no

association to breast cancer risk. Exposure to ionising radiation doubled the risk of

breast cancer for teenage girls exposed to radiation during the Second World War

(Hulka & Moorman, 2001; McPherson et al., 2000; Wolff, Collman, Barrett, & Huff,

1996).

2.2 ANATOMY OF THE MAMARY GLAND

Female mammary glands (or breasts) are skin-covered modified

sebaceous/sweat glands located in the superficial fascia on the ventral wall of the

thorax, anterior to the pectoral muscles. They stretch from the second rib to about the

seventh intercostal cartilage and from the lateral edge of the sternum to the

midaxillary line. Breasts are part of the female reproductive system and they become

functional after pregnancy finishes, producing milk to nurse the newborn baby. They

are markedly undeveloped during childhood and they start acquiring their round

conical adult shape after puberty.

On the exterior surface, mammary glands have a central ring of pigmented skin

known as the areola. It surrounds a protruding conical structure of about 12mm in

diameter called the nipple, commonly located at about the level of the fourth

intercostal space in young women. Lying underneath the skin, there are between 15

to 20 lobes that open through their individual excretory duct to the surface of the

nipple. Lobes are formed by smaller structures called lobules, which contain multiple

glandular alveoli (or acini) able to produce milk during lactation. Acini are connected

through a network of lactiferous ducts that merge into the lactiferous sinus, a dilated

structure leading to the excretory duct, where milk accumulates throughout nursing.


Glandular tissue, lactiferous ducts and lactiferous sinuses are surrounded by fatty

connective tissue. These structures are connected to the skin and the muscular layer

underneath by fibrous bands of the superficial fascia, which form the suspensory

ligament of Cooper on the upper part of the breasts (Ellis, 2010; Love & Barsky,

2004; Marieb, 2010; Morehead, 1982) (See Figure 2-1).

Figure 2-1 Anatomical structure of mammary glands. Anterolateral disection and sagital section (Reproduced from , Netter, F. H. (2011). Atlas of human anatomy. Copyright © 2011 by permission of Elsevier, Inc.).

Blood supply to the mammary gland is provided by three main sources:

Internal thoracic artery, axillary artery and intercostal arteries. Internal thoracic

artery arises from the subclavian artery and it provides the largest arterial vessels

reaching the breasts. Axillary artery is the second major source of blood, which is


supplied through 4 main branches: Superior thoracic, pectoral branch, lateral thoracic

and subscapular arteries. Venous drainage follows the pathway of the

aforementioned arteries and it ends in two major vessels: Axillary vein or the internal

thoracic vein.

There are two main sites for lymphatic drainage from mammary glands and

they are the axillary lymph nodes and the internal thoracic lymph nodes (Figure 2-2).

Axillary lymph nodes are divided into 5 groups and they receive approximately 75%

of all drainage, which explains why early metastases for most breast cancer patients

are located in the axilla. (Morehead, 1982; Netter, 2011).

Figure 2-2 Lymphatic drainage of the breast ((Reprinted from , Netter, F. H. (2011). Atlas of human anatomy. Copyright © 2011 by permission of Elsevier, Inc.).


2.3 BREAST CANCER DEFINITION AND AETIOLOGY

Malignant neoplasm or cancer is the term to define a new growth that invades

local tissue and/or distant places within the human body. It is a complex disease in

which cells proliferate with no control and with no physiological purpose. Cancer is

the result of genetic alteration to a cell or a group of cells that modifies normal

growth pattern (Neighbors & Tannehill-Jones, 2006). Robert Weinberg defined it as

“a disease of chaos, a breakdown of existing biological order within the body”

(Weinberg, 2007), whereas Strachan and Read referred to it as “the natural end of a

multi cellular organism” (Strachan & Read, 2011). It is a medical condition that

requires at least six successive somatic mutations within a cell, or a number of

different cells in a tissue, to allow uncontrolled cellular division. Malignant

proliferation of cells can be accomplished through seven main cellular capabilities:

1. Autonomy from external growth signals.

2. Freedom from external anti-growth signals.

3. Ability to avoid apoptosis (programmed cell-death).

4. Faculty to replicate indefinitely.

5. Competence to trigger angiogenesis.

6. Ability to invade tissues and establish secondary tumours (metastasis).

7. Failure in DNA repair

Hanahan and Weinberg described these features as the hallmarks of cancer in

2000 and they have been a fundamental principle in our understanding of the

pathogenesis of cancer (Hanahan & Weinberg, 2000). These features are

accomplished in cancers by two basic means: Mutations that increase cell

proliferation and mutations that affect stability of the entire genome increasing

mutation rate (Robbins et al., 2010; Strachan & Read, 2011; Weinberg, 2007).

Research into the molecular mechanisms underlying neoplastic disease has also

identified two additional hallmarks: reprogramming of energy metabolism and


evading immune destruction. These hallmarks are achieved through the activation

and/or silencing of complex intracellular pathways which are out of the scope of the

present review. However, neoplasms are not the result of isolated cancer cells.

Malignant tumours have also been recognised as organs because they require the

interplay of specialised cell types and the tumour microenvironment that they

develop during the progression of the disease. Cancer cells, cancer stem cells,

endothelial cells, pericytes, immune inflammatory cells, cancer-associated fibroblast

and stem and progenitor cells of the tumour stroma are key players of the tumour

microenvironment. Cancer progression is achieved through very complex signalling

networks that are established amongst all specialised cells previously mentioned. To

date, many of these networks remain to be fully identified and they are also the focus

of current research studies into targeted cancer therapies (Hanahan & Weinberg,

2011).

Although there is no definitive cause to the genetic alteration to most cancers,

one or many contributing risk factors have been identified, including race/ethnicity,

exposure to chemical carcinogens, hormones, radiation exposure, viruses, genetic

predisposition or familial history and other personal circumstances such as alcohol

use, nutritional habits and smoking (Martin & Weber, 2000; Neighbors & Tannehill-

Jones, 2006).

Breast cancer is a malignant neoplasm arising from the epithelial cells that line

the terminal duct lobular unit. Out of all the risk factors previously mentioned,

hormonal factors and genetic factors are the two mainly involved in the development

of carcinoma of the breast in women. As a result, pathologists usually divide breast

cancer into those related to hormonal exposure or sporadic cases and the ones

featuring germline mutations or hereditary cases (Copstead & Banasik, 2010;

Robbins et al., 2010; Sainsbury, Anderson, & Morgan, 2000).

Genes and germline mutations involved in the aetiology of breast cancer will

be further discussed in this chapter, however they account only for a small fraction of

all diagnosed breast cancer patients. Sporadic cases are primarily associated with


hormone exposure and its main effects on women, mediated through factors such as

age at menarche and menopause, reproductive history, breastfeeding and use of

exogenous oestrogen (Martin & Weber, 2000). Oestrogen is a key hormone in the

development of normal breast epithelium, but it also seems to have an effect on

breast cancer development by increasing the number of malignant target cells

through breast growth stimulation. DNA damage may also occur through mutations

caused by hormonal metabolites or production of free radicals and due to their effect

on gene variants related to oestrogen synthesis and metabolism. It binds to two

specific receptors, oestrogen receptor alpha (ERα) and oestrogen receptor beta

(ERβ), which act as nuclear transcription factors. ERα is the receptor most

commonly associated with breast cancer since it is the one mainly expressed in breast

tissue (Abeloff, Armitage, & Niederhuber, 2008; Robbins et al., 2010).

Of the hormonal related pathways for malignant development, two of the most

clinically important ones are the ones related to oestrogen receptor (ER): ER-positive

and ER-negative pathways. The majority of breast cancers are ER-positive, about

70%, and they most probably develop initially in the epithelium of the glandular

ducts, mainly in the luminal cells expressing oestrogen receptor. ER-negative

carcinomas have no clear precursor cells but they might be derived from ER-negative

myoepithelial cells or from ER-positive cells that lose ER expression. ER- positive

carcinomas tend to have better differentiation and a slowly growing pattern. Almost

half of breast carcinomas in young women who develop breast cancer are ER-

negative or human epidermal growth factor receptor 2 (HER2/neu) positive, while

these markers are only positive in less than a third of women over 40 years of age,

contributing to a more aggressive phenotype seen in breast cancers arising earlier in

life. Most lobular carcinoma in situ (LCIS) are ER-negative, they usually tend to

over express HER2/neu and to lose expression of E-cadherin (Abeloff et al., 2008;

Copstead & Banasik, 2010; Robbins et al., 2010). HER2/neu will be discussed in

section 2.5.1 referring to genes and breast cancer further ahead in this chapter and

LCIS will be described as part of section 2.4.1, which deals with the classification of

breast cancer (Abeloff et al., 2008; Robbins et al., 2010).


Progesterone receptor (PR) is an oestrogen regulated gene. There are two

isoforms of this receptor PR-A and PR-B, the last one being more specific to breast

cancer. Polymorphisms in the PR gene promoter seem to increase risk of endometrial

and breast cancer. The relationship of PR with ER seems to have an effect on tumour

biology. Tumours which are ER/PR-positive seem to be less aggressive, they usually

do not extend to surrounding lymph nodes and they have a better response to anti-

oestrogen therapy. On the other hand, ER-positive/PR-negative carcinomas are more

frequently found in women over 50 years of age and they associate with abnormal

chromosomal count and nodal involvement. Furthermore, they have a poor clinical

response to anti-oestrogen therapy (Abeloff et al., 2008; De Vivo et al., 2002).

Breast cancer is a complex disease and there is no robust association between a

single genetic marker and sporadic breast cancer development. However, there is

clear evidence about the role genetics play in the development of cancer. There are

many different molecular pathways described and they seem to interact with each

other creating very complex signalling networks. Therefore, it is more likely that

several genes play a role in carcinogenesis. These genes are the target for genetic

damage during carcinogenesis, a process aided by many other environmental factors

previously mentioned. Although inherited predisposition is very rare for most types

of breast cancer, recognition of genes contributing to such predisposition has had a

major impact in our understanding of carcinogenesis. Genes associated with cancers

with a strong hereditary component seem to also be involved in the sporadic forms.

Genetic predisposition to breast cancer is found within two categories: Autosomal

dominant inherited cancer syndromes and familial cancers, which are summarized in

Table 2-5, which will be discussed section 2.5.1 of this review, which covers

genetics and breast cancer due to their importance for both familial and sporadic

forms of the disease (Robbins et al., 2010; Weinberg, 2007; Ben Zhang et al., 2011).

2.4 BREAST CANCER CLASSIFICATION

Many different classifications of breast carcinomas have been created over the

years in an attempt to group tumours according to morphologic patterns and clinical


behaviour (Bloom & Richardson, 1957; Meyer et al., 2005; Sainsbury et al., 2000).

We will focus on some of the most commonly used by pathologist and clinicians: the

histological classification, Bloom and Richardson grading system and the American

Joint Committee on Cancer (AJCC) staging system. However, the molecular

classification of breast cancer will be covered further ahead in section 2.5.1.2

discussing genetics and breast cancer because it relates to gene expression signatures

used in clinical settings to determine treatment and predict outcome.

2.4.1 Histological Classification

Adenocarcinomas are the main type of breast cancer and they can be further

grouped as in situ carcinomas and invasive carcinomas. A neoplastic tumour within

ducts and lobules limited by the basement membrane is called carcinoma in situ,

whereas the invasive form crosses the basement membrane and invades the stroma.

The latter form can potentially extend to vasculature and reach regional lymph nodes

and distant anatomical locations. Previously, ductal carcinoma was a general term

initially used to designate neoplastic proliferation within the ducts and lobular

carcinoma was the one used for malignant tumours within the lobules. However as

classification evolved, these terms are now used to designate differences in tumour

cell biology, based on particular patterns of growth and cellular morphology and

specific markers. Histological classification of breast cancer is summarised in Table

2-3 (Robbins et al., 2010; Sainsbury et al., 2000).


Table 2-3 Histological Classification of Breast Cancer (Robbins et al., 2010).

CARCINOMA IN SITU Type Subtype

Ductal Carcinoma in Situ (DCIS)

Comedocarcinoma

Non-comedo DCIS

Solid Cribriform Papillary

Micropapillary Paget's Disease

Lobular Carcinoma in Situ (LCIS)

INVASIVE (INFILTRATING) CARCINOMA Type Subtype

Invasive Carcinoma No Special Type (NST; Invasive Ductal Carcinoma)

Invasive Lobular Carcinoma Special Type

Tubular Cribriform Mucinous Papillary

Medullary Classic Lobular

In general, in situ carcinomas have a better outcome overall compared to

invasive carcinomas but their overall incidence is lower, about 30% of total cases.

Ductal carcinoma in situ (DCIS) is the second most common type of carcinoma (12

to 24% of cases) and it is further classified into 5 architectural subtypes which are

present in the majority of DCIS tumours. It can have infiltrating edges and when it

becomes invasive it usually spreads to the surrounding normal breast tissue.

Following this it disseminates into the lymphatic system, initially to the lymph

system of the axilla and from here to other distant sites such as lung, liver and bone.

In its initial stages, it can be cured in about 95% of patients through mastectomy and

DCIS recurrences or deaths after surgery are rare. Most lobular carcinoma in situ

(LCIS) are found incidentally and they account for about 3 to 6% of total cases. They

are very hard to detect through mammography and they affect primarily young

women (up to 90% of cases are identified before menopause). Treatments for LCIS

tend to be more aggressive and they include bilateral prophylactic mastectomy

(Abeloff et al., 2008; Robbins et al., 2010).


On the other hand, invasive carcinomas constitute the majority of breast cancer

cases (70 to 85% of all cases). Invasive carcinomas of no special type (NST), also

known as invasive ductal carcinomas, are the most common subtype and they are

about 55 to 67% of all cases. Molecular classification of invasive ductal carcinomas

is clinically important to determine prognosis and response to therapy and will be

discussed in detail in section 2.5.1.2. As a general rule, neoplasms classified as

special type carcinoma tend to have a better prognosis compared to the no special

type. Invasive lobular carcinoma of special type can be divided in 6 main subtypes:

Classic lobular carcinoma, cribriform, mucinous, papillary, tubular and medullary

carcinoma (Page, 2003). Classic lobular carcinoma account for about 7 to 8.5% of

total cases and they are hard to detect on physical examination and mammography.

Invasive lobular carcinoma is usually graded using grading scales and staging

systems, further described in section 2.4.2 and they have a different pattern of

metastasis, usually spreading to the abdominal and pelvic cavities (peritoneum,

retroperitoneum, gastrointestinal tract, ovaries and uterus). Medullary and mucinous

carcinomas together make up about 1.7% of all cases. Medullary carcinomas are very

common in the sixth decade and usually poorly differentiated. Mucinous carcinomas,

also known as colloid carcinomas, occur after the seventh decade and have a slow

growing rate. Both of these subtypes rarely involve lymph node metastases and have

a slightly better prognosis than no specific type carcinomas. Finally, tubular

carcinomas range from 4 to 5% of total cases and are commonly present in the fourth

decade. They are usually detected by mammography and overall they tend to have an

excellent clinical prognosis (Abeloff et al., 2008; Robbins et al., 2010).

2.4.2 Bloom and Richardson grading system

Bloom and Richardson grading, also known as Scarff, Bloom and Richardson

grading, is an histological grading scale developed to obtain prognostic information

especially for invasive carcinoma of no special type. Tumours are graded on three

main factors: Degree of glandular formation, nuclear pleomorphism and frequency of

mitoses. Each one of these factors is scored from 1 to 3 and these results combined

would allocate a tumour into one of three groups: Grade I (Scores 3 to 5), grade II


(scores 6-7) and grade III (8-9). Lower scores for each factor, leading to lower

grades, would give a better prognosis for both disease free and overall survival, as it

has been shown that most deaths occur within the first 10 years after diagnosis for

grade III patients (Bloom & Richardson, 1957; Robbins et al., 2010; Sainsbury et al.,

2000).

2.4.3 The American Joint Committee on Cancer (AJCC) staging system.

The American Joint Committee on Cancer (AJCC) developed a staging system

in 2002, which relies on the pathological examination of the primary tumour as well

as the axillary lymph nodes to establish prognosis. This staging system is very

complex and it defines various different categories to classify tumours and lymph

nodes. The main features of the AJCC staging system, which divides patients into 5

different stages associated with different survival rates are summarized in Table 2-4.

Breast cancers classified in the lower grades show a better outcome compared to the

higher grades, starting with 92% survival for local, non-invasive carcinomas in grade

0 and lowering down to 13% for invasive carcinomas classified as grade IV. (F.

Greene, Balch, Haller, & Morrow, 2002; Robbins et al., 2010; Sainsbury et al.,

2000).

Table 2-4 American Joint Committee on Cancer Staging System (Adapted from AJCC Staging

Manual 2002)

Stage T: Primary Cancer Lymph Nodes (LNs) M: Distant Metastasis

5-Year Survival

(%) 0 DCIS or LCIS No metastases Absent 92 I Invasive Carcinoma ≤ 2 cm No metastases Absent 87 II Invasive Carcinoma > 2 cm No metastases Absent 75

Invasive Carcinoma < 5 cm 1-3 positive LNs Absent III Invasive Carcinoma > 5 cm 1-3 positive LNs Absent 46

Any size invasive carcinoma ≥ 4 positive LNs Absent Invasive carcinoma with skin or chest wall involvement or inflammatory carcinoma

0 to > 10 positive LNs

Absent

IV Any size invasive carcinoma Negative or positive LNs

Present 13


Current available classifications for breast cancer, including molecular

profiling discussed in section 2.5.1.2, still fail to provide accurate patient prognosis,

response to treatment and prediction of disease relapse. Molecular diversity overlaps

with histopathology, therefore changes to these classifications are currently under

work. Reviews and studies are being conducted to include validated genetic profiling

in an effort to strengthen our understanding of breast cancer and to provide better

treatment rationale and disease outcome. (Singletary et al., 2003; Singletary &

Connolly, 2006).

2.5 GENETICS AND BREAST CANCER

Four main types of regulatory genes are involved in the development and

progression of breast cancer: growth promoting proto-oncogenes, growth-inhibiting

tumour suppressor genes, genes regulating apoptosis and genes participating in DNA

repair (Robbins et al., 2010; Strachan & Read, 2011; Weinberg, 2007).

Oncogenes are genes that promote self-sufficient cell growth in cancer cells.

They derive from proto-oncogenes and they are transcribed and translated to produce

oncoproteins, required for the continued, autonomous replication of cells even in the

absence of growth factors or other external signals. Proto-oncogenes, under normal

conditions, are involved in physiological processes like cellular growth and

proliferation and they transform into oncogenes when DNA damage occurs within

them. Activation or transformation of proto-oncogenes can be achieved through 3

main mechanisms: amplification, point mutation and chromosomal translocation.

Each of these results in overproduction of the product of the gene, or an altered

product with increased signalling capacity. Oncoproteins, the final product of

oncogenes, can be classified as one of four main categories: growth factors, growth

factor receptors, signal transduction proteins and nuclear-regulatory proteins

(Robbins et al., 2010; Strachan & Read, 2011; Weinberg, 2007).


Tumour suppressor genes have a counteracting function to that of oncogenes.

They inhibit cell proliferation by restraining inadequate cell division, maintaining

DNA integrity or driving cells to apoptosis. They are able to achieve these functions

through different final protein products which include categories like transcription

factors, cell cycle inhibitors, signal transduction molecules, cell surface receptors and

regulators of cellular responses to DNA damage (Robbins et al., 2010; Strachan &

Read, 2011; Weinberg, 2007). Tumour suppressor genes rarely contribute directly to

cancer formation; instead it is primarily their inactivation or destruction, through

mutation or suppression of gene expression that allows the further progression of

early dysplasias and adenomas into malignancy.

Apoptosis is the process of programmed cell death. It is achieved through

highly regulated and controlled biochemical and molecular events that lead to

cellular changes and eventually death. Genes regulating apoptosis include those

involved in TNF, FAS and Caspase pathways. Changes in apoptosis-regulating genes

result in increased cell survival and this is generally achieved through mutations

leading to enhanced activity of genes suppressing apoptosis or reduced expression of

cell-death promoting genes (Robbins et al., 2010; Strachan & Read, 2011; Weinberg,

2007).

Finally, genes involved in various DNA repair mechanisms are also implicated

in carcinogenesis. Loss-of-function mutations in genes involved in DNA repair

impairs the cellular ability to identify and repair genetic damage, which leads to the

high rates of acquired mutations commonly seen in carcinogenesis. ATM, CHEK1,

CHEK2, BRCA1 and BRCA2 are examples of genes associated with inherited

disorders and familial cancer syndromes, which are usually the result of

chromosomal abnormalities like breaks, abnormal chromosome numbers and

widespread mutations. These genes have also been studied and associated with some

cases of sporadic cancer (Robbins et al., 2010; Strachan & Read, 2011; Weinberg,

2007).


Due to the complexity of the cellular networks and the large number of genes

involved with potential roles in the development and progression of breast cancer, we

will focus on some of the most important genes identified to date in subsection 2.5.1.

We will also discuss microRNAs (miRNAs) and their role as epigenetic modifiers of

gene expression in relation to breast cancer as part section 2.6.3 of this review to give

a more comprehensive overview due to their importance for the topic of this study.

2.5.1 Genes and breast cancer

As mentioned before, approximately 5-10% of breast cancers are caused by the

inheritance of a susceptibility gene or a set of genes. These genes can be divided into

three different groups: High risk, high penetrance, breast cancer susceptibility genes,

moderate risk breast cancer susceptibility genes with moderate penetrance and low-

penetrance variants for some other identified genes (Oldenburg, Meijers-Heijboer,

Cornelisse, & Devilee, 2007; Ben Zhang et al., 2011). Families with multiple

affected first-degree relatives, especially with subjects affected before menopause

and/or medical history of other types of cancer increase the possibility of breast

cancer development within their members. Most common single gene mutations are

associated with increased susceptibility to breast cancer, autosomal dominant

syndromes and familial forms and they include BRCA1, BRCA2, p53, PTEN, ATM

and CHEK2 briefly outlined in Table 2-5 and Table 2-6 (Robbins et al., 2010;

Strachan & Read, 2011; Weinberg, 2007). Due to their apparent role in the

development of sporadic forms of breast cancer, the main features and current

findings in relation to these genes will be discussed in further detail.

BRCA1: BRCA1 is a tumour suppressor gene of about 100 kilobases (kb)

localised in chromosome 17q21, it contains 24 exons, 22 of which are protein

coding. (Martin & Weber, 2000). It was initially identified by linkage analysis in

1994 (Hall et al., 1990; Miki et al., 1994) and it is one of four high penetrance genes

described so far in relation with breast cancer. Prevalence of heterozygous carriers of

the high-risk mutation in the general Caucasian population is one in 1000 (0.10%)

and it has been shown to confer female carriers a cumulative lifetime risk of about 50


to 85% for developing breast cancer. The BRCA1 protein, containing about 1863

amino acids, interacts with transcripts of some of the other genes described in this

section, including TP53, RB, ATM and BRCA2. It is involved in DNA-damage repair,

cell-cycle regulation, transcriptional regulation and chromatin remodelling (Carter,

2001; Martin & Weber, 2000; Oldenburg et al., 2007; Welcsh & King, 2001; Ben

Zhang et al., 2011). More than 500 sequence variations, mainly deletions or

insertions of nucleotides, have been described and associated with breast cancer,

including the founder mutations identified in the Ashkenazi Jewish population

(185delAG and 5382insC) (Carter, 2001; Martin & Weber, 2000). BRCA1

expressing tumours are commonly of the basal type, which will be discussed ahead

in more detail in the molecular classification of breast cancer (Ben Zhang et al.,

2011).

BRCA2: BRCA2 is a tumour suppressor gene, spanning about 70kb of genomic

DNA, localised in chromosome 13 (13q12.3) and it contains 27 exons, one of which

is not translated. Initially identified by linkage analysis in 1995, it is a high

penetrance gene involved in inherited forms of breast cancer. At least 250 mutations

in the gene have been identified and they include a founder mutation in the

Ashkenazi Jewish population (6174delT). Prevalence of heterozygous carriers of the

high risk BRCA2 mutations in the general Caucasian population is about 1 in 750

(0.13%) and mutation carriers have a lifetime breast cancer risk estimated in the

range of 60 to 85% (Martin & Weber, 2000; Oldenburg et al., 2007; Wooster et al.,

1995). BRCA2 encodes a protein of about 3418 amino acids, which has a role in

DNA recombination and repair pathways, interacting with other genes described in

this section, such as including CHEK2 and ATM, and it may possibly be involved in

cytokinesis (Martin & Weber, 2000; Ben Zhang et al., 2011).

p53: TP53 or p53, also known as “the guardian of the genome” is a commonly

altered gene in most human cancers. It is localised on chromosome 17p13.1 and it is

involved in inducing apoptosis or cell-cycle arrest in response to cellular stress,

through modulation of transcription of about a dozen genes. It is one of the high

penetrance risk genes identified in relation to breast cancer since inheritance of a p53

mutant allele causes Li-Fraumeni syndrome, which involves multiple early-onset


cancers, including breast cancer in adulthood. 90% of all mutations for p53, which

account for over 2,000 in total, are located in exons 5 to 8 of the gene and more than

90% of mutation carriers are expected to develop cancer by age 70. Patients with

ductal and medullary tumours below 60 years of age typically carry mutations

located in these genomic regions of the p53 gene and they are about 30 to 50% of

sporadic breast cancer cases. It has also been studied as a marker for breast cancer-

specific mortality, showing an increase of 2.27-fold in overall relative risk. Germline

mutations of ATM and CHEK2 required for p53 activation also appear to increase

susceptibility to breast cancer (Martin & Weber, 2000; Oldenburg et al., 2007; Ben

Zhang et al., 2011).

ERBB2 (HER2): ERBB2, also known as HER2/neu, is a proto-oncogene

localised on chromosome 17q21.1. It encodes the human epidermal growth factor

receptor 2, which is a transmembrane tyrosine kinase receptor, a member of a family

of growth factor receptors that also includes EGFR (ERBB1), ERBB3 (HER3) and

ERBB4. HER2 is part of the RAS-MEK and AKT cellular pathway, it does not bind to

any particular ligand and therefore it requires interaction with other members of the

family, mainly HER3, to perform its cellular activity (See Figure 2-3). 20% of breast

cancers have HER2/neu amplification and/or overexpression of HER2 protein, which

correlates with these tumours being more aggressive and resulting in a reduced

patient’s survival time. It is well known and studied because it is the target of

specific therapy. Trastuzumab (Herceptin®) is a humanised monoclonal antibody

that binds to ERBB2 and it is used as an adjuvant therapeutic agent. It reduces

recurrence of tumours and prolongs survival when combined with chemotherapy in

patients who are HER2/neu positive (Abeloff et al., 2008).


Figure 2-3 ERBB2 Receptor and intracellular signalling pathways. Reproduced from Johnston & Dowsett, 2003. Copyright © 2003, with permission from Nature

Publishing Group.

PTEN: PTEN, phosphatase and tensin homolog, is a tumour suppressor gene

localised on 10q23.3, which inactivates the signal of phosphatidylinositol 3-kinase

(PI3K). PI3K produces phosphatidylinositol-3,4,5-triphosphate, a second messenger

responsible of the activation of AKT and other pathways involved in cellular

proliferation and apoptosis inhibition. High penetrance germline mutations of PTEN

produce inherited cancer predisposition through Cowden syndrome (CS). CS is a

clinical syndrome featuring a high incidence of neoplasms which include breast

carcinoma. Female patients carrying a PTEN mutation have a 25 to 50% lifetime

breast cancer risk. This gene usually becomes inactive through point mutations and

methylation, which causes genetic instability. Although 11 to 41% of sporadic breast

cancers cases show loss of heterozygosity for PTEN, to date no somatic mutations in


the PTEN gene have been identified in the remaining allele or in this gene in breast

cancer families without features of Cowden syndrome (Martin & Weber, 2000;

Oldenburg et al., 2007; Ben Zhang et al., 2011).

ATM: The ATM gene is located on chromosome 11q22-23. It produces a

protein that plays an important role in sensing and signalling DNA double-strand

breaks and genome instability. It becomes very active when cells are irradiated and it

autophosphorylates, regulating other tumour suppressor proteins, including tumour

suppressor proteins p53, BRCA1 and checkpoint kinase CHEK2. Homozygous or

heterozygous mutations in the ATM gene are associated with ataxia telangiectasia

(AT), an autosomal recessive disorder presenting with movement disorder (ataxia)

secondary to cerebellar degeneration, telangiectasia, immunodeficiency,

chromosomal instability and increased susceptibility to cancer. AT female patients

have an estimated relative risk for breast cancer of about 2.3. No germline mutations

in the ATM gene have been identified in studies of sporadic breast cancer cases;

haplotype analysis of common variants for the ATM gene suggested a slight

association with breast cancer risk but large population studies are required for

further validation of these results (Einarsdottir et al., 2006; Oldenburg et al., 2007;

Szabo et al., 2004).

CHEK2: The tumour suppressor gene CHEK2 is located on chromosome

22q12.1. It has been found to confer low to moderate risk for breast cancer, due to

moderate penetrance and it is involved in DNA repair of double strand-breaks. The

CHEK2 protein is a G2 check point kinase that requires phosphorylation by ATM,

usually after exposure to ionising radiation, and it also phosphorylates other cell

cycle proteins such as BRCA1 and p53. A 1100delC mutation of CHEK2 was found

in a patient with Li-Fraumeni syndrome with no mutation in TP53 (Li-Fraumeni

variant syndrome) and this variant was detected in about 1% of the healthy control

population and about 1.5 to 3 times more commonly in patients with breast cancer.

Furthermore, a two-fold increase in risk of breast cancer was found in

CHEK2*1100delC mutation carrier patients with no mutation in BRCA1/2 using

segregation analysis (Einarsdottir et al., 2006; Oldenburg et al., 2007; Ben Zhang et

al., 2011).


Table 2-5 Autosomal dominant inherited breast cancer syndromes.

CATEGORY CLINICAL SYNDROME - GENE (Location) % OF

BREAST CANCERS

FUNCTION

BREAST CANCER RISK BY AGE 70

MECHANISM IN SPORADIC

BREAST CANCER

Autosomal Dominant Inherited Cancer Syndromes. Individuals have an increased risk of developing breast cancer after inheritance of a single autosomal dominant mutant gene. It involves usually a single point mutation in a single allele for a tumour suppressor gene.

Li-Fraumeni Syndrome - p53 (17p13.1) < 1% Tumour suppressor gene involved in cell cycle control, DNA replication, DNA repair and apoptosis.

> 90% Mutations in 20%. LOH in 30% to 42%. Most frequent in triple negative cancers.

Li-Fraumeni Variant Syndrome - CHEK2 (22q12.1) ~ 1% Cell cycle checkpoint kinase. Involved in recognition and repair of DNA damage. Activates BRCA1 and p53 by phosphorylation

10 - 20% Mutations are rare (< 5%). Loss of protein expression in 1/3 of cases.

Rare Syndromes: < 1% Cowden Syndrome – PTEN

Peutz-Jeghers Syndrome - LKBI/STK11 Ataxia telangiectasia – ATM


Table 2-6 Familial forms of breast cancer.

CATEGORY CLINICAL SYNDROME - GENE (Location) % OF

BREAST CANCERS

FUNCTION

BREAST CANCER RISK BY AGE 70

MECHANISM IN SPORADIC

BREAST CANCER

Familial Breast Cancer. They occur at higher frequency within a family without a defined pattern of transmission. Usually early age at onset, tumours in two or more relatives and multiple or bilateral tumours. It is likely that familial susceptibility depends on multiple penetrating alleles.

BRCA1 (17q21) < 2% Tumour suppressor gene. Involved in transcriptional regulation and repair of double stranded DNA breaks.

40 - 90% Mutations are rare. Inactivated in 50% of some subtypes by methylation.

BRCA2 (13q12-13) < 1% Tumour suppressor. Involved in transcriptional regulation and repair of double stranded DNA breaks.

30 - 90% Mutations and loss of expression are rare.


As previously stated, breast cancer is a complex disease. It originates from an

interaction between genetic and environmental factors and it involves different types

of tumours derived from tissues in the mammary glands. This might explain why all

of the single genes previously discussed have not shown strong association with the

sporadic forms of the disease. They have different cellular roles as tumour suppressor

genes or proto-oncogenes; hence they could be involved in the development and

progression of sporadic cases of breast cancer. However, in spite of the evidence

about their association to the inherited forms of breast cancer, they do not

sufficiently explain breast cancer aetiology and natural history of breast carcinoma

for all forms of the disease.

Genetic studies on breast cancer have been conducted in different ways to

address a diverse range of research questions. Some of these approaches include

studies on common genetic variants, like single nucleotide polymorphisms (SNPs),

to determine if they play a role in the biogenesis of breast cancer. Additional avenues

of research include gene expression studies aimed to develop molecular profiles that

could be used in the diagnosis or treatment of breast cancer. We will briefly discuss

these two approaches as part of this section of the literature review to present key

findings in the field of genetics in relation to breast cancer. Research focused on

miRNA genes and their role in breast cancer will be covered in section 2.6.3 of this

review to provide a more in-depth view of the subject due to its importance for the

general and specific aims of this study.

2.5.1.1 Single nucleotide polymorphisms and breast cancer

A single base mutation can be classified as a single nucleotide polymorphism

(SNP) when the frequency of the minor allele exceeds 1% in at least one population.

There are around 10 million SNPs in the human genome; about one SNP every 100

to 400 base pairs and about 56% of SNPs have been successfully typed. They form

haplotypes, which comprise genetic variants on the same chromosome that are

inherited together having little chance of recombination. They tend to be transmitted

as blocks within small chromosomal regions, that is, they tend to be in linkage with


one another. Linkage disequilibrium is a term used to define the correlation between

SNPs at nearby sites descended from single, ancestral chromosomes. This allows the

prediction of the genotypes at a large number of SNP loci from identification of

representative SNPs called “tag SNPs” or “haplotype tag SNPs” (Brown, 2007; Dutt

& Beroukhim, 2007; LaFramboise, 2009; Reich et al., 2001).

It is very likely that more than a single gene variant is required to address

tumour heterogeneity in breast cancer. Therefore, genome-wide association studies

(GWAS) have been conducted to try to identify genes and genetic variants associated

with this particular disease. Results from 14 GWAS conducted between 2007 and

2010 show association between breast cancer incidence in different cohorts and 49

low penetrance genetic variants of different genes or loci. Seven single nucleotide

polymorphisms (SNPs) are the most common variants found from these GWAS and

they include: Two single nucleotide polymorphisms in intronic regions of fibroblast

growth factor receptor 2 (FGFR2) gene: rs1219648 and rs2981579 (Fletcher et al.,

2011; Hunter et al., 2007; Li et al., 2011; G. Thomas et al., 2009; Turnbull et al.,

2010) and; 5 SNPs located in intergenic regions. The five intergenic SNPs are one

SNP on chromosome 2q35 between TNP1 and DIRC3 (rs13387042), one SNP on

chromosome 3p24.1 near to the SLC4A7 gene (rs4973768), one SNP on chromosome

5q11.2 between the RPL26P19 and MAP3K1 genes (rs889312), one SNP located on

chromosome 8q24.21 between the SRRM1P1 and POU5F1B genes (rs1562430) and

one SNP located on chromosome 16q12.1-2 between the TOX3 and CHD9 genes

(rs3803662) (Easton et al., 2007; Fletcher et al., 2011; Li et al., 2011; Stacey et al.,

2007; G. Thomas et al., 2009; Turnbull et al., 2010). These variants are located in

relation to genes that are involved in molecular functions or cellular processes linked

to cancer development and their roles remain unclear because they are part of

complicated gene regulatory networks which are still under investigation.

In addition, due to the limitations of the platforms used to undertake GWAS

research, primarily microarray, the full array of human polymorphisms have not been

assessed for association with breast cancer. Many of the intergenic SNPs may have

been associated with breast cancer through linkage to functional SNPs in gene

sequences, while other areas of the genome may have had very little assessment due


to the lack of a known common SNP in the region to be included on GWAS

microarrays. Regions previously thought to be intergenic and thus of less interest

during the development of GWAS microarrays may contain microRNA genes and

similar non-protein coding sequences nonetheless important for control of cellular

growth. Lack of assessment of previously unknown SNPs and microRNA containing

regions means that research examining the role of SNPs in breast cancer aetiology

should take these into consideration to improve our understanding of the disease.

Indeed, this is one of the main rationales for the development of this research.

2.5.1.2 Molecular classification of breast cancer

Molecular classification of breast carcinoma was developed to try to determine

suitable therapy and predict prognosis and response to therapy. As a result, 5

different patterns of gene expression have been identified using microarray

technologies (Blenkiron et al., 2007; Perou et al., 2000; Robbins et al., 2010; Schnitt,

2010; Sørlie et al., 2003):

a. Luminal A: They account for 40 to 55% of no special type (NST) cancers.

The category consists of breast cancers that are ER-positive and

HER2/neu-negative. They express genes under the control of ER and also

genes characteristic of luminal cells. Usually well differentiated, they

occur mainly in postmenopausal women. They are slow growing and

respond well to hormonal treatment.

b. Luminal B: They correspond to about 15 to 20% of NST cancers and they

are also known as triple-positive cancers. They are ER-positive, usually

highly proliferative and poorly differentiated (having high grading) and

overexpress HER2/neu. Luminal B tumours are likely to respond well to

chemotherapy and are often lymph node positive.


c. Normal breast-like: They are about 6 to 10% of NST breast cancers. They

are well differentiated, ER-positive, HER2/neu-negative and they are very

similar in gene expression to normal breast tissue.

d. Basal-like: 13 to 25% of no special type breast cancer tumours fall within

this group. They are ER-negative, progesterone receptor (PR)-negative and

HER2/neu-negative and they have a similar gene expression pattern to

myoepithelial and putative stem cells. They can be grouped with other

types of breast carcinomas that are “triple-negative” carcinomas such as

medullary carcinomas and metaplastic carcinomas. Basal-like carcinomas

are poorly differentiated and have a high proliferation rate. They usually

metastasise to brain and viscera. Although they are associated with poor

prognosis, about 15 to 20% of cancers in this subgroup are sensitive to

chemotherapy and likely to be cured. They are frequently associated with

BRCA1 dysfunction, which occurs in about 80% of basal like tumours.

e. HER2 Positive: This subgroup comprises about 7 to 12% of ER- negative

no special type carcinomas, which also over express HER2/neu protein.

They are poorly differentiated, have a high proliferation rate and they

usually are associated with a high frequency of brain metastasis (Blenkiron

et al., 2007; Perou et al., 2000; Robbins et al., 2010; Schnitt, 2010; Sørlie

et al., 2003).

As a result of this molecular profiling classification, two main multigene

prediction score sets were designed and validated and they are pending approval

from the U.S. Food and Drug Administration (FDA). Both methods, the 21-gene

predictor set, commercially available under the name of Oncotype Dx® (Genomic

Health, Inc., Redwood City, CA), and the technology 70-gene predictor set,

commercially available as MammaPrint® (Agendia, Inc., Amsterdam, The

Netherlands) (Paik et al., 2004; Schnitt, 2010; van de Vijver et al., 2002) are based

on quantitative real time polymerase chain reaction (qPCR) and microarray

technologies, respectively. Oncotype Dx® produces a score for recurrence of breast


carcinoma and it classifies patients as low, moderate or high risk based on testing for

16 target transcripts, including ER, PR and HER2. The TAILORx (Trial Assigning

Individualized Options for Treatment (Rx)) trial is currently in progress and it aims

to validate the “recurrence score”, which also appears to correspond to likelihood of

response to chemotherapy and endocrine therapy. On the other hand, the

MammaPrint® predictor set, currently under validation in the ongoing MINDACT

(Microarray In Node Negative Disease may Avoid Chemotherapy) trial, aims to

identify patients with good and poor prognostic signatures, likely to benefit from the

use of adjuvant therapy. Thus, molecular classification and profiling of patients using

the multigene predictor sets is currently being used to try to individualise therapy for

breast cancer patients to improve outcomes (Abeloff et al., 2008; Galanina, Bossuyt,

& Harris, 2011; Paik et al., 2004; Schnitt, 2010; van de Vijver et al., 2002).

2.6 MICRORNAS AND BREAST CANCER

2.6.1 MicroRNA biogenesis and functions.

MicroRNAs (miRNAs) are small non-coding single stranded RNA molecules

of about 21 to 25 nucleotides in length, involved in gene regulation at the

translational level. They are part of a family of small RNAs that also include small

nucleolar RNA (snoRNA), short interfering RNA (siRNA) and piwi-interacting RNA

(piRNA). miRNAs are estimated to account for about 1% of the human genome and

they were initially identified in Caenorhabditis elegans (R. C. Lee, Feinbaum, &

Ambros, 1993; Reinhart et al., 2000). They regulate important cellular processes like

cell growth, differentiation and cell survival and therefore they have been linked to

cancer and tumour development (Thalia A. Farazi et al., 2011; Robbins et al., 2010;

Westholm & Lai, 2011; Zuoren Yu et al., 2010).

52% of miRNA genes are located in different non-random positions within the

genome in regions of chromosomal instability or cancer-associated regions. The

majority of miRNA genes are found in intergenic regions, 1 kb away from known


genes, and about one third of miRNA genes are found in the introns of protein-

coding genes. Many of these genes are transcribed independently and they have

promoter regions that allow gene expression in a cell-type specific manner. Many

others form genomic clusters, about 113 in total which can be up to 51 kb in length,

and they are probably transcribed as a single polycistronic transcript (Thalia A.

Farazi et al., 2011; Lynam-Lennon, Maher, & Reynolds, 2009). 1881 micro RNA

precursors and 2588 mature sequences have been identified for the human genome

according to miRBase (www.mirbase.org), a miRNA database, in their latest release:

miRBase release 21 (June 2014) (Kozomara & Griffiths-Jones, 2011, 2014).

miRNAs are transcribed from long microRNA primary transcripts (pri-

miRNAs) about 1kb in size that contain up to six miRNA precursors (pre-miRNA).

Pre-miRNAs are stem-loop molecules of about 55 to 70 nucleotides in length and

they are produced by canonical and non-canonical pathways. Non-canonical

pathways are also called “mirtron” pathways, which are well described in

invertebrates but they have not been experimentally demonstrated in mammalians or

humans (Ha & Kim, 2014; Westholm & Lai, 2011). In the canonical pathway, pre-

miRNAs are produced when pri-miRNA are cleaved by the complex Drosha –

DCGR8 (DiGeorge syndrome critical region gene 8, also known as pasha) within the

cellular nucleus (Yoontae Lee et al., 2003). Pre-miRNA are then transported to the

cytoplasm by XPO5 (exportin5) and the nuclear protein ran-GTP. In the cytoplasm

pre-miRNAs are processed by DICER1, a ribonuclease type III enzyme, and double-

stranded RNA binding domain (dsRBD) proteins TARBP2 and/or PKRA producing

a duplex molecule that contains both single-stranded mature miRNA strand and a

complementary strand (miRNA*) (Chendrimada et al., 2005; Yi, Qin, Macara, &

Cullen, 2003). The miRNA:miRNA* molecule is loaded into the RNA-induced

silencing complex (RISC), where the mature miRNA sequence is merged into the

Argonaute/EIFC2C (Ago) proteins and the miRNA* is released and degraded (Thalia

A. Farazi et al., 2011; Lynam-Lennon et al., 2009; Westholm & Lai, 2011).

Ago proteins bound to mature miRNA mediate target messenger RNA

(mRNA) recognition and interaction between these three components resulting in

gene regulation. The mRNA target is recognised by pairing of the miRNA seed


http://www.mirbase.org/

region (nucleotides 2 to 8) located in the 5’-end of the miRNA and the

complementary sequence mainly located in the 3’ untranslated region (3’-UTR)

region of target mRNA but binding sites can also found in the coding region

(Khvorova, Reynolds, & Jayasena, 2003; Lai, 2002; Y. Lee, Jeon, Lee, Kim, & Kim,

2002). This process is mediated and regulated by Ago proteins and the mRNA

processing bodies (P-bodies), which are cytoplasmic structures that contain

regulatory proteins from the GW182/TNRC6 family of proteins. If the seed region of

the miRNA and the target region of mRNA are highly complementary, the RISC

complex will cleave the target mRNA, a process known as RNA interference. If there

is not enough complementarity between miRNA seed region and target mRNA

regulation occurs by repression of translation, a process that has not yet been fully

elucidated but that may occur before or after mRNA translation (see Figure 2-4)

(Thalia A. Farazi et al., 2011; Lynam-Lennon et al., 2009; Westholm & Lai, 2011).


Figure 2-4 Copyright © 2006. Canonical pathway of miRNA genesis and mRNA regulation. Reprinted by permission of Nature Publishing Group (Esquela-Kerscher

& Slack, 2006).

Each miRNA may bind to up to 200 gene targets and each gene could have

multiple binding sites for different miRNAs. Therefore they have been reported to

regulate up to 60% of the genes within the human genome and targets for miRNAs

are still under current investigation and validation (Bartel, 2009; Rajewsky, 2006).


2.6.2 MicroRNAs in cancer

It is very likely that miRNAs play an important role in cancer biology because

of the complexity of the molecular processes and mediators involved in miRNA

genesis and miRNA-mRNA regulation and because they are involved in cell

development, proliferation and death. Oncogenic miRNAs, or oncomirs, are known

to regulate cancer pathways and they can control neoplastic development by reducing

the expression of tumour suppressor genes or enhancing expression of certain

oncogenes (Baohong Zhang, Pan, Cobb, & Anderson, 2007).

Research studies have been conducted since microRNAs were initially

described and they have linked miRNAs to different types of cancer. Because of the

large number of miRNA genes currently identified and the large number of different

approaches used to study them, we will focus on some of the most important studies

published to date in relation to different types of cancers; those addressing breast

cancers will be further covered in subsection 2.6.3 of this review.

Initially, research studies focused on the expression patterns of different

microRNAs in relation to cancer. The first reports of miRNA involvement in cancer

date back about 12 years when Calin et al found MIR15 and MIR16 to be deleted or

downregulated in patients with chronic lymphocytic leukaemia (CLL), the most

common form of leukaemia in adults, as well as in prostate cancer cell lines in a

study published in 2002 (Calin et al., 2002). A t(8, 17) translocation in an aggressive

type of B cell leukaemia was also found to produce loss of expression of MIR142

resulting in overexpression of MYC (Gauwerky, Huebner, Isobe, Nowell, & Croce,

1989; Lagos-Quintana et al., 2002). In a review published in 2003, McManus et al

postulated that microRNAs could play a role in cancer as either tumour suppressor

(TS) genes or oncogenes (McManus, 2003). Following this, a study in colorectal

carcinoma identified reduced expression of MIR143 and MIR145 (Michael, O'

Connor, van Holst Pellekaan, Young, & James, 2003) and a different study showed

that reduced expression of MIRLET7A1 was associated with a poor prognosis in lung

cancer (Takamizawa et al., 2004). Downregulation of MIR145 is also a common


finding previously reported in colorectal (Arndt et al., 2009; Michael et al., 2003),

bladder (Chiyomaru et al., 2010), lung (Cho, Chow, & Au, 2009, 2011) and

oesophageal (Kano et al., 2010) cancers and it has been associated with poor

prognosis for cancer patients.

On the other hand, two separate studies identified expression of MIR155 to be

increased in paediatric Burkitt lymphoma (Metzler, Wilda, Busch, Viehmann, &

Borkhardt, 2004) Hodgkin´s lymphoma (HL), primary mediastinal B cell lymphoma

(PMBL), diffuse large B cell lymphoma (DLBCL) and CLL (Eis et al., 2005; Kluiver

et al., 2005).Similarly, Chen et al found markedly elevated levels of MIR21 in

glioblastoma tumour tissue and glioblastoma cell lines (Chan, Krichevsky, & Kosik,

2005). Calin et al determined that microRNAs were frequently located in cancer

associated genomic regions as well as fragile chromosomal sites and they proposed

that microRNAs can act both as tumour suppressor or oncogenes due to the variety

of genes they regulate and the complexity of their networks (Calin, Sevignani, et al.,

2004).

The microRNA 17-92 cluster host gene (MIR17HG) located on chromosome

13q31.3 was the first miRNA cluster to be associated with cancer. The miRNA 17-92

cluster host gene transcript is about 800bp in length and contains six miRNAs:

MIR17, MIR18A, MIR19A, MIR20A, MIR19B1 and MIR92A1, which belong to 4

seed families (miRNA 17, miRNA 18, miRNA 19 and miRNA 92). Based on their

seed sequences, two related paralogues of this cluster have also been identified in the

human genome: miRNA 106b-25 cluster located on chromosome 7 (7q22.1) in the

13th intron of gene MCM7 and miRNA 106a-363 cluster located on chromosome X

(Xq26.2) (Ventura et al., 2008). In 2004 Ota et al observed amplification of

MIR17HG in DLBCL and lymphoma cell lines which was corroborated by another

study in 2005 (L. He et al., 2005; Ota et al., 2004). He et al demonstrated increased

expression of MIR17HG in B cell lymphomas in cell lines and tumour samples and

they also found this cluster gene promoted accelerated tumour development in a

murine model. Therefore they named MIR7HG as oncomir-1 (L. He et al., 2005). It

has also been implicated to play a role in different types of cancer including lung


cancer, breast cancer, colon cancer gastric cancer and pancreatic cancer (Hayashita et

al., 2005; Mendell, 2008; Mogilyansky & Rigoutsos, 2013; Volinia et al., 2006).

Research studies have been conducted to identify specific miRNA expression

profiles for different types of cancer. For example, Lu et al analysed the expression

of 217 microRNAs using bead-based flow cytometry in 334 samples of leukaemia

and solid cancers and they were able to correlate miRNA expression with

developmental lineage and differentiation state (J. Lu et al., 2005). Using a

microarray platform, Volinia et al investigated the miRNA profile of 363 cancer

samples, including B-cell chronic lymphocytic leukaemia (BCLL), breast, colon,

lung, pancreas, prostate and stomach cancer to compare them to 177 normal tissue

samples. 228 miRNA genes were evaluated and they identified 36 microRNA genes

to be over expressed and 21 to be downregulated in cancer cells versus normal

tissues (Volinia et al., 2006). Many other profiling studies have been conducted in

different types of cancer including chronic lymphocytic leukaemia (CLL) (Calin,

Liu, et al., 2004), papillary thyroid carcinoma (H. He et al., 2005), breast cancer

(Iorio et al., 2005), colon cancer (Cummins et al., 2006), glioblastoma (Ciafre et al.,

2005), lung cancer (Yanaihara et al., 2006), hepatocellular carcinoma (Murakami et

al., 2006) and pancreatic tumours (Roldo et al., 2006). All these studies consistently

show that expression patterns for cancer tissues involve both overexpressed and

downregulated miRNA genes which may corroborate what was previously proposed

by McManus in 2003 (McManus, 2003). Some of the miRNAs that were most

commonly identified in these studies are shown in Table 2-7.


Table 2-7 Some of the miRNAs and their target genes most commonly found in relation to cancer.

miRNA Upregulated Downregulated Predicted target genes References MIR17HG Lymphomas, lung cancer, breast

cancer, colon cancer, gastric cancer, pancreatic cancer

E2F1, BCL2L11, PTEN

He et al., 2005; Iorio et al., 2005; Mendell , 2008; Ota et al., 2004; Volinia et al., 2006

MIR21 Brain cancer, breast cancer, glioblastomas, acute myeloid leukaemia, colon cancer, pancreatic cancer, prostate cancer, liver cancer

BCL2, PTEN, PDCD4, TPM1, TIMP3

Ciafre et al., 2005; Chan et al., 2005; Iorio et al., 2005; Murakami et al., 2006; Volinia et al., 2006

MIR221 Brain Cancer, colon Cancer, pancreatic cancer, prostate cancer, papillary thyroid carcinoma

CDKN1B, CDK1C, PTEN, TIMP3

Ciafre et al., 2005; Chan et al., 2005; Cummins et al., 2006; He et al., 2005; Roldo et al., 2006; Volinia et al., 2006

MIR155 Lymphomas Breast Cancer MAF, AGTR1 Eis et al., 2005, Kluiver et al., 2005; Iorio et al., 2005

MIR15A Chronic lymphocytic leukaemia, prostate cancer, multiple myeloma

BCL2, WT1, RAB9B Calin et al., 2002; Calin et al., 2004

MIR16-1 Chronic lymphocytic leukaemia, prostate cancer, multiple myeloma

BCL2, WT1, RAB9B Calin et al., 2002; Calin et al., 2004

MIR34A Pancreatic cancer, colon cancer, breast cancer, liver cancer

CDK4, CDK6, CCNE2, E2F3, MET

Blenkiron et al, 2007; Cummins et al., 2006; Iorio et al., 2005; Farazi et al., 2011; Murakami et al., 2006; Roldo et al., 2006

MIR145 Breast Cancer, colorectal cancer ERG Cummins et al., 2006; Iorio et al., 2005; Michael et al., 2003

miRNA Let-7 family Lung cancer, breast cancer, haematopoietic malignancies

RAS family, MYC, HMGA2

Blenkiron et al., 2007; Yanaihara et al., 2006;


MicroRNAs have been used in cancer diagnosis through the following

approaches: Profiling of tumours based on differential expression of miRNAs can be

used to identify tissue of origin in cancers of unknown primary site, to identify

tumour subtype and/or predict prognosis (Calin & Croce, 2006; J. Lu et al., 2005;

Paranjape, Slack, & Weidhaas, 2009; Baohong Zhang, Pan, et al., 2007). Profiling of

circulating blood or tumour derived exosomal miRNAs has also been studied for

their potential use as a non-invasive method for early detection of cancer (X. Chen et

al., 2008; Heneghan et al., 2010; Kosaka, Iguchi, & Ochiya, 2010; Mitchell et al.,

2008; Roth et al., 2010; W. Zhu, Qin, Atasoy, & Sauter, 2009).

It has been recognised that changes in miRNA synthesis and expression in

relation to cancer could occur through many different mechanisms and some of these

include: point mutations in miRNA and mRNA sequences, epigenetic changes like

mRNA gene methylation, loss or mutation in the promoter regions for specific

miRNA clusters and alterations in pathway related RNA binding protein (RBP)

(Thalia A. Farazi et al., 2011; H.-C. Lee et al., 2011; Lynam-Lennon et al., 2009;

Ryan, Robles, & Harris, 2010; G. Sun et al., 2009; Westholm & Lai, 2011).

Therefore, the study of single nucleotide polymorphisms (SNPs) in miRNA

sequences has emerged as an approach to investigate cancer predisposition, as well

as explain changes in miRNA biogenesis and function (Paranjape et al., 2009; G. Sun

et al., 2009).

1899 SNPs in 961 pre-miRNAs, 601 miR-SNPs in 470 miRNA mature

sequences and 203 miR-SNPs located in the seed region of 170 mature miRNAs

were identified in 2013 by Han et al (M. Han & Zheng, 2013) in a genome-wide

scan analysis of the SNPs contained in the NCBI dbSNP database (build 137 for

humans) (Smigielski, Sirotkin, Ward, & Sherry, 2000) and the genomic locations and

sequences of pre-miRNAs and mature miRNAs from the miRBase (release 18.0,

November 2011) (Kozomara & Griffiths-Jones, 2011). This highlights miRNA SNPs

(miR-SNPs) as the most common type of genetic variation in miRNA sequences and

neighbouring regions and they have been determined to alter miRNA biogenesis and

processing (R. Duan, Pak, & Jin, 2007; Gottwein, Cai, & Cullen, 2006; Ryan et al.,

2010; G. Sun et al., 2009), which could have an effect on numerous targets and


diverse functional consequences. Therefore miR-SNPs may be ideal candidate

biomarkers to use in early diagnosis of cancer, to determine prognosis and to predict

outcome in cancer patients, and this hypothesis has obtained significant support from

a number of studies in different types of cancer. Some of the most relevant miR-SNP

studies in various types of cancer will be the focus of subsection 2.6.2 since similar

studies in relation to breast cancer will be discussed further in subsection 2.6.3 of this

review.

In 2008, a miR-SNP in MIR196A2 (rs11614913) which is present in the mature

sequence was found to increase processing of the mature miRNA and be associated

with significantly decreased survival in patients with non-small cell lung cancer (Z.

Hu et al., 2008). rs2910164 present in the precursor sequence of MIR146A was found

to reduce the amount of both pre- and mature MIR146A and it was also shown to

affect the Drosha/DGCR8 processing step (Z. Hu et al., 2009; Jazdzewski et al.,

2008; Shen et al., 2008). Hu et al also showed that rs11614913 (MIR196A2),

rs2910164 (MIR146A), rs2292832 (MIR149) and rs3746444 (MIR499A) were

associated with increased breast cancer risk in Chinese women (Z. Hu et al., 2009).

Similarly, two separate studies showed rs2910164 to be associated with papillary

thyroid carcinoma (Jazdzewski et al., 2008) and hepatocellular carcinoma (T. Xu et

al., 2008). Ye et al showed that rs6505162 located in the precursor of MIR423 to be

significantly associated with oesophageal cancer risk (Ye et al., 2008). All of the

previously mentioned studies show that the effects of miR-SNPs can be modulated in

a cell type-specific manner, implying that not all functional miR-SNPs affect cancer

risk globally. However, there are still a few issues to overcome in regards to miR-

SNP studies in relation to cancer and they include the following: Minor allele

frequencies of many identified miR-SNPs are still unknown, pointing to the need to

conduct population studies to determine whether these are polymorphic in specific

populations; most studies focus on the analysis of pre and mature regions of miRNAs

because further clarification of the extent of the pri region is still required for many

miRNAs to determine the true extent of miRNA-related genetic variation;

identification of tag SNPs for miRNA-related SNPs will be useful to identify miR-

SNPs which are in linkage disequilibrium and therefore inherited together; and


finally the inclusion of more miR-SNPs in genome wide association studies (GWAS)

to identify low penetrance susceptibility mutations.

2.6.3 MicroRNAs and breast cancer

In an effort to understand the role of miRNAs in breast cancer, they have been

studied using microRNA expression evaluation in different breast cancer cell lines

and normal and breast cancer tissue samples. This approach is potentially valuable as

miRNAs could be used as a diagnostic tool for prediction of breast cancer

development and tumour profiling, as therapeutic targets for cancer treatment and as

prognostic markers to predict metastasis and to evaluate treatment response and

patient survival (Paranjape et al., 2009).

In 2005, Iorio et al examined expression of 246 miRNA using 76 breast tumour

samples and they identified 29 miRNAs that showed significantly different levels of

expression in cancer samples versus normal breast tissue (Iorio et al., 2005). They

identified 5 miRNAs to be the most consistently deregulated across all tumour

samples: MIR10B, MIR125B1 and MIR145 were significantly down-regulated and

MIR21 and MIR155 were significantly up-regulated, suggesting their role as tumour

suppressor genes or oncogenes, respectively. However they failed to identify

miRNAs that were differentially expressed between tumour subtypes. A following

study conducted by Mattie et al in 2006 was able to distinguish HER2+/ER- and

HER-/ER+ breast cancer using profiling of 204 miRNAs in 20 tumour samples. This

study also identified a subset of miRNAs specific to HER2+ tumours (MIRLET7F1,

MIRLET7G, MIR107, MIR10B, MIR126, MIR154 and MIR195) and to ER+/PR+

tumours (MIR142 5p, MIR200A, MIR205 and MIR25) (Mattie et al., 2006).

A further study conducted by Blenkiron et al using a bead-based flow

cytometric platform showed that miRNAs are differentially expressed between

molecular subtypes of breast cancer (Luminal A, Luminal B, Basal-like and HER2

positive), previously described in section 2.5.1.2 on molecular classification of breast


cancer (Blenkiron et al., 2007). Blenkiron et al also evidenced changes in miRNA

biosynthesis in 93 frozen breast cancer tumour samples due to changes in gene

expression of post-transcriptional mediators. Basal-like, HER2 positive and luminal

B type tumours showed down regulation of DICER1 expression and a higher

expression of AGO2. Oestrogen receptor (ER) status was also found to be associated

with gene expression changes and ER negative tumours showed AGO2 and DROSHA

to be up regulated and DICER1 to have lower levels of gene expression (Blenkiron et

al., 2007).

Sempere et al found that altered microRNA expression corresponded to

particular epithelial cell subpopulation using microarray analysis, northern blot and

in-situ hybridization (ISH) with archived formalin-fixed paraffin-embedded breast

cancer tissue samples breast cancer. They found MIR145 and MIR205 to be restricted

to the myoepithelial/basal cell compartment of normal ducts and lobules and their

expression was significantly down-regulated in tumour breast samples. On the other

hand MIR21 was found to be present in luminal epithelial cells in normal breast

tissue and its expression significantly increased in breast cancer samples compared to

normal tissue (Sempere et al., 2007).

Lowery et al validated three different miRNA cluster signatures to be

predictive of oestrogen receptor (ER) (MIR342, MIR299, MIR217, MIR190,

MIR135B and MIR218), progesterone receptor (PR) (MIR520G, MIR377,

MIR518A1, MIR520F and MIR520C) and HER2/neu receptor (MIR520D, MIR181C,

MIR302C, MIR376B and MIR30E) status in breast cancer using artificial neural

network algorithms, miRNA microarray and quantitative PCR (qPCR) (Lowery et

al., 2009). There is room for debate in regards to some of the findings for some of

the miRNA signatures previously mentioned. For instance, Blenkiron et al showed

MIR342 to be downregulated in Basal-like and HER2 positive type tumours

(Blenkiron et al., 2007). There is also some contradicting evidence in relation to

HER2 positive tumours; since the cluster microRNA signature developed and

validated by Lowery et al found MIR342 expression to be higher in ER positive,

HER2/neu positive and luminal B type tumours and lowest in Basal-like tumours

(Lowery et al., 2009). Finally, Buffa et al (Buffa et al., 2011) found that high levels


of expression of MIR342 were associated with good prognosis in ER negative

tumours. Target genes for MIR342 include RRM2, PGM2L1 and TLE1, which could

be potential therapeutic targets for breast cancer. Some of the differences in the

results of these studies could be the result of differences in methods of collection,

preservation and storage of primary tumours used, resulting in RNA degradation and

distortion of qPCR results. There are also differences in the experimental methods

used and statistical analysis of results which could lead to the inconsistencies in

findings.

We will discuss some of the miRNAs most commonly identified to have a role

in breast cancer development before we move on to discuss findings in relation to

miR-SNPs and breast cancer. These miRNAs are summarised in Table 2-8 along

with some of their gene targets.

As mentioned before, MicroRNA-10b (MIR10B) has been described as an

oncogene and it was initially identified in the work conducted by Iorio et al (Iorio et

al., 2005). They found MIR10B to be down-regulated across tumour samples but

further studies showed different findings. Studies by Mattie et al (Mattie et al., 2006)

and Liu et al (Y. Liu et al., 2012) showed MIR10B to be significantly up-regulated

particularly in metastatic breast cancer. Differences in the results found could be the

result of the primary sample used and methodological approach. Studies conducted

by Iorio et al and Mattie et al both used cryopreserved tumour samples, while Liu et

al used freshly isolated tumour tissue. Moreover, there were significant differences in

their methods because not every study considered important features such as tumour

size, ERBB2 status, ER/PR positivity, tumour staging and grading and lymph node

metastasis in their analysis. Target genes for MIR10B include FLT1, BDNF,

HOXD10, CDH1 and TGFB1 (X. Han et al., 2014; Y. Liu et al., 2012; Mattie et al.,

2006).


Table 2-8 Summary of the most significantly associated microRNAs in breast cancer and their target

genes

MiRNA PREDICTED TARGET GENES AUTHOR MIR10B FLT1

BDNF HOXD10

CDH1 TGFB1

Iorio et al., 2005 Volinia et al., 2006

Blenkiron et al., 2007 Ma et al., 2007 Liu et al., 2012

MIR17HG NCOA3 E2F1 MYC RB1

CCND1

O’Donnell et al., 2005 Hossain et al., 2006

Yu et al., 2008 Leivonen et al., 2009

Mogilyansky et al., 2013 MIR125A - MIR125B ERBB2

ERBB3 YES1

Iorio et al., 2005) Volinia et al., 2006

ETS1 AKT3

FGFR2 MAP3K10 MAP3K11 MAPK14

MIR145 MYCN Iorio et al., 2005 Volinia et al., 2006

Blenkiron et al., 2007 Sempere et al., 2007

FOS CCND2

MAP3K3 MAP4K4

MIR155 SOCS1 TP53INP1

VHL APC

WEE1 RAB6A RAB6C

Iorio et al., 2005 Volinia et al., 2006

Blenkiron et al., 2007 Jiang et al., 2010 Zhang et al., 2013 Kong et al., 2014

MIR21 Caspase protease family Iorio et al., 2005 Volinia et al., 2006

Si et al., 2007 Zhu et al., 2007

TPM1 PDCD4 PTEN BCL2

TGFB1 RAB6A RAB6C


As was mentioned in subsection 2.6.2 of this review the microRNA 17-92

cluster host gene (MIR17HG) has been associated with different types of cancer

including lymphomas, lung cancer, colon cancer, gastric cancer and pancreatic

cancer. MIR17HG was the first miRNA cluster to be identified in relation to cancer

and subsequently named oncomir-1 (L. He et al., 2005; Ota et al., 2004). It has been

found to inhibit tumour growth in breast cancer cell lines and it targets cell cycle

regulators including NCOA3, E2F1, MYC, RB1 and CCND1(Hossain, Kuo, &

Saunders, 2006; Leivonen et al., 2009; Z. Yu et al., 2008) .

He et al demonstrated increased expression of MIR17HG in B cell lymphomas

in cell lines and tumour samples they also found this cluster gene to promote

accelerated tumour development in a murine model. (L. He et al., 2005) and it has

also been implicated to play a role in different types of cancer including lung cancer,

breast cancer, colon cancer gastric cancer and pancreatic cancer (Hayashita et al.,

2005; Mendell, 2008; Mogilyansky & Rigoutsos, 2013; Volinia et al., 2006)

Mattie et al found microRNA-125a (MIR125A) and miRNA-125b (MIR125B)

to be downregulated in in HER2+ breast cancer. Both miRNAs are considered to be

tumour suppressor genes and they target important genes including ERBB2 (HER2),

ERBB3 (HER3) and FGFR2 (Scott et al., 2007).Overexpression of ERBB2 is also

associated with poor prognosis for breast cancer patients (Ross & Fletcher, 1998).

MIR145 has been previously reported to play a role in cancer biogenesis and

progression for other types of cancer (Arndt et al., 2009; Chiyomaru et al., 2010; Cho

et al., 2009, 2011; Kano et al., 2010; Michael et al., 2003). As previously mentioned,

it has also been found to have a tumour suppressor role in breast tissue particularly in

the myoepithelial/basal cell compartment and it is found to be downregulated in

breast cancer tissue samples (Iorio et al., 2005; Sempere et al., 2007; Volinia et al.,

2006). Research using breast cancer cell lines showed it regulates some important

genes involved in modulation of apoptosis, shown in Table 2-8 (Spizzo et al., 2010;

S. Wang et al., 2009). In vitro studies by Michael et al using MCF-7 and T47-D cell

lines showed down regulation of MIR145 (Michael et al., 2003). Finally studies


conducted by Goette et al (Goette et al., 2010) and Radojicic et al (Radojicic et al.,

2011) showed down-regulation of MIR145 to be associated with a more aggressive

behaviour of breast cancer based on results performed in both breast cancer cell line

and tissue samples.

miRNA-155 (MIR155) has been determined to be upregulated in breast cancer

according to studies previously mentioned (Iorio et al., 2005; Volinia et al., 2006). It

is considered to be an oncogene because it has been shown to target tumour

suppressor genes including SOCS1, TP53INP1 and VHL according to studies

performed both in breast cancer cell lines and tumour tissue samples. It has also been

associated with poor prognosis in patients with triple negative breast cancer (Czyzyk-

Krzeska & Zhang, 2014; Jiang et al., 2010; Kong et al., 2014; C.-M. Zhang, Zhao, &

Deng, 2013)

miRNA-21 (MIR21) has been found to be consistently up-regulated in different

types of cancer (Chan et al., 2005; Y. Hu et al., 2011; Tetzlaff et al., 2007; Volinia et

al., 2006). In 2005 Iorio et al identified MIR21 to be up-regulated in breast cancer

tumour samples as well (Iorio et al., 2005). As previously mentioned, Sempere et al

found it to be expressed in luminal epithelial cells and to be further increased in

malignant tissue when compared to normal samples (Sempere et al., 2007) and

Radojicic et al found it to be overexpressed in triple negative tumours (Radojicic et

al., 2011). Moreover, overexpression of MIR21 has been associated with metastasis

and poor prognosis for breast cancer sufferers (Yan et al., 2008). Using cell culture

and a xenograft carcinoma mouse model, Si et al identified that knock-down of

MIR21 resulted in increased apoptosis and reduced tumour proliferation (Si et al.,

2007). This group established that the oncogenic role of MIR21 was the result of the

inhibition of a tumour suppressor gene (TPM1) (S. Zhu, Si, Wu, & Mo, 2007). Other

studies have identified additional targets of MIR21 in breast cancer including

PDCD4, Caspase protease family, BCL2 and PTEN amongst others (Thalia A. Farazi

et al., 2011; Frankel et al., 2008; Robbins et al., 2010; Z. X. Wang, Lu, Wang,

Cheng, & Yin, 2011; Wen et al., 2007).


Other miRNA findings in relation to breast cancer include the following:

MIR222 host gene (MIR222HG) targets CDK inhibitor to regulate cell cycle both in

breast cancer MCF-7 cell line and HER2/neu positive breast tumours and it is also

associated with Tamoxifen resistance (Thalia A. Farazi et al., 2011; Negrini & Calin,

2008; Robbins et al., 2010; Zuoren Yu et al., 2010); epigenetic modification such as

inactivation of expression by methylation of MIR9-1, MIR124-2HG, MIR148,

MIR152 and MIR663 has been described in breast cancer (Lehmann et al., 2008) and

copy number alteration of miRNA genes has also been evidenced in breast cancer

using array-based comparative genomic hybridization (L. Zhang et al., 2006). Other

miRNAs have been associated with metastasis including the microRNA let-7 family,

MIR10B, MIR9-1 and MIR31. On the other hand MIR335, MIR206 and MIR126 have

been implicated to have a role as breast cancer metastasis suppressor miRNAs

(Thalia A. Farazi et al., 2011; Negrini & Calin, 2008; Robbins et al., 2010; Zuoren

Yu et al., 2010).

MicroRNAs mentioned in this literature review have been evaluated in

different studies using breast cancer cell lines and breast cancer tissue samples,

however recent studies have been conducted to try to measure microRNA expression

in blood products. Evaluation of circulating microRNAs in breast cancer patients

could be used for early diagnosis, follow up and to characterise novel miRNAs.

MicroRNAs are remarkably stable molecules and therefore it was hypothesised that

they should be present in the circulation of cancer patients. Heneghan et al

investigated whether cancer specific miRNAs (MIR10B, MIR16, MIR21, MIR145,

MIR155, MIR195 and MIRLET7A1) were detectable and expression was altered in

serum, plasma or whole blood obtained from cancer patients in comparison to age-

matched controls. They identified whole blood as the best source of circulating

miRNAs and using quantitative PCR (qPCR) they also showed changes in miRNA

expression for MIR195 and MIRLET7A1 when comparing cancer patients to controls

and also when miRNA expression was measured before and after surgical treatment

(Heneghan et al., 2010). However, Roth et al (Roth et al., 2010) showed contrasting

results when they evaluated levels of MIR10B, miR34A, MIR141 and MIR155 in

blood serum of post-operative breast cancer patients and in breast cancer cell lines.

Using blood serum, they found MIR155 in breast cancer patients with no metastasis


and MIR-10B, MIR34A and MIR155 in patients with metastatic breast cancer were

significantly higher than healthy controls. Therefore differences in results obtained

for both studies using similar miRNA panels, need to be addressed, primarily by

evaluating these circulating microRNAs using bigger and more thoroughly

characterised cohorts.

Using a different approach, Wu et al (Q. Wu et al., 2011) performed next-

generation sequencing of microRNAs, comparing breast tissue samples and normal

breast tissue and validated the top five miRNAs found to be altered more than

fivefold using RT- PCR in serum samples from breast cancer patients and controls.

This study found significantly elevated levels of MIR21 and MIR29A in breast cancer

patients compared to controls, which could be used as early non-invasive markers for

the detection of breast cancer, a finding that requires further validation in a bigger

sample population and correlation with clinical features if it is to translate to clinical

practice (Budhu, Ji, & Wang, 2010; Heneghan et al., 2010; Roth et al., 2010; Q. Wu

et al., 2011).

2.6.4 MicroRNA SNPs (miR-SNPs) and breast cancer

Scientific literature addressing SNPs in microRNA genes in relation to breast

cancer is scarce but it seems to be increasing as it has been recognised that point

mutations in miRNA sequences can affect their synthesis and expression (Gong et

al., 2012; H.-C. Lee et al., 2011; Mishra et al., 2008).

A SNP located in the terminal loop of the precursor of MIR27A, rs895819, has

been associated with gastric cancer and colorectal cancer. However findings for this

polymorphism in relation to breast cancer remain controversial. Initially, rs895819

was found to reduce risk of developing breast cancer in German families with a

history of non-BRCA related breast cancer (1217 patients versus 1422 controls)

(Rongxi Yang et al., 2010), as well as in individuals from Jewish families with

mutant BRCA2 (58 patients vs 48 control) (Kontorovich, Levy, Korostishevsky, Nir,


& Friedman, 2010). However two studies by Zhang et al, conducted on Chinese

women (252 patients versus 248 controls) (M. Zhang et al., 2012); and Catucci et al,

conducted on Italian women with non-BRCA related familial breast cancer (1205

cases versus 1593 controls) (Catucci et al., 2012), failed to replicate these findings.

It was argued that the genotyping results using the TaqMan Assay obtained by

Catucci et al (Catucci et al., 2012) were affected and biased by a nearby SNP

(rs11671784) (R. Yang & Burwinkel, 2012). To make matters worse, a recent study

by Zhang et al conducted in Chinese women (264 patients versus 255 controls)

identified an association between rs895819 and reduced sporadic breast cancer risk

(N. Zhang et al., 2013) contradicting the result previously found in a similar

population (M. Zhang et al., 2012). Most recently, Zhong et al conducted a meta-

analysis of the published studies for rs895819 in an effort to clarify the conflicting

results in relation to cancer risk. As a result, they were able to identify the G allele in

rs895819 to be associated with decreased risk in breast cancer cases and in the

Caucasian population (Zhong, Chen, Xu, Li, & Zhao, 2013).

In 2009 Kontorovich et al identified the AA genotype of rs560516, present in

MIR423 to reduce the risk of developing both ovarian and breast cancer in

individuals from Jewish families with mutant BRCA2 (Kontorovich et al., 2010). On

the other hand, a study conducted by Smith et al in 2012 showed that the presence of

the CC genotype for the same miR-SNP was significantly associated with reduced

risk of sporadic breast cancer development in Australian Caucasian individuals (R.

A. Smith et al., 2012). It is argued that this contrasting result may be due to the

difference in ethnicities for the population examined as well as the type of cancer

(familial versus sporadic). It may also be due to the effect this miR-SNP has on

expression or processing of the mature sequence of the microRNA or how it interacts

with BRCA2 mutations.

The microRNA SNP rs2910164 present in MIR146A constitutes another

example of conflicting results in breast cancer susceptibility studies for miR-SNPs.

Shen et al initially reported that the presence of this polymorphism was associated

with a younger age of diagnosis in 2008. This study, conducted in the United States,

included patients with familial breast cancer with different ethnic backgrounds (35


Caucasian, 4 African American and 3 unknown) and they determined the C allele to

significantly lower the age of diagnosis in breast cancer patients, as well as increase

the levels of expression for the mature MIR146A (Shen et al., 2008). However a

similar study that year conducted on Chinese individuals (1009 cases versus 1093

controls) failed to find association with age of diagnosis or risk of breast cancer

development (Z. Hu et al., 2009). In 2010 Pastrello et al conducted a similar study in

Italian patients diagnosed with non-BRCA related familial breast cancer (348 cases

versus 148 controls) and they found the presence of the C allele to be associated with

a younger age of diagnosis (Pastrello, Polesel, Della Puppa, Viel, & Maestro, 2010).

These results confirmed those found by Shen et al (Shen et al., 2008) and

contradicted the results of Hu et al (Z. Hu et al., 2009) and this may be explained by

the difference in the ethnicity of the populations examined, attributable to genetic or

environmental factors differing between them. On the other hand, Catucci et al

conducted a similar study for rs2910164 in German and Italian individuals with non-

BRCA related familial breast cancer (1894 versus 2760) and they found no

association to breast cancer risk or age of onset (Catucci et al., 2010). Also on the

previous year, a study by Hoffman et al conducted in the United States failed to

identify association between rs2910164 and sporadic breast cancer. It is worth noting

that this study included patients with different ethnic backgrounds (about 9% of

patients were African American or other mixed ethnicities) and also about 20% of

individuals in both cases and controls had familial history of breast cancer (1st degree

relative), which may have introduced bias to the study results (Hoffman et al., 2009).

Garcia et al conducted a genotyping study on French individuals with familial

history of breast cancer and either BRCA1 or BRCA2 mutations both in cases and

controls (1130 individuals and 596 individuals respectively). Results from this study

did not show an association between breast cancer risk and rs2910164 (Amandine I.

Garcia et al., 2011). A meta-analysis was conducted in 2011 by Gao et al to try to

clarify the conflicting results obtained for rs2910164 to date in relation to breast

cancer. Results of this meta-analysis showed no association between this miR-SNP

and breast cancer risk, however it is to be noted that the studies included in the

analysis contained a mixture of different ethnic backgrounds (Caucasians, African

Americans, unknown ethnicities and Chinese) and this was not considered as part of

the analysis, raising potential questions about its utility (Gao et al., 2011). Three

studies were conducted in 2012 in relation to rs2910164 and breast cancer and they


included two case control genotype association analyses and one meta-analysis.

Alshatwi et al performed a genotyping analysis in Saudi-Arabian patients with

sporadic breast cancer (100 cases vs 100 controls) and they showed no association

between the presence of mutation to breast cancer risk (Alshatwi et al., 2012).

Similarly, Esteban Cardeñosa et al showed no association between rs2910164 and

breast cancer risk, for both sporadic and familial types, or any association with age of

onset in their study conducted on a Spanish Caucasian population (538 cases versus

189 controls) (Esteban Cardenosa et al., 2012). However, a meta-analysis by Lian et

al published that year showed significant association between the CC genotype of

rs2910164 and breast cancer risk only in European Caucasians (Lian, Wang, &

Zhang, 2012). Interestingly, a very similar meta-analysis by Wang et al, which

included most of the studies analysed by Lian et al (Lian et al., 2012), reported no

association between rs2910164 and breast cancer risk (P. Y. Wang et al., 2013).

Finally, a recent study by Bansai et al conducted in 121 patients with sporadic breast

cancer and 164 controls from a North-Indian population identified that the presence

of the C allele significantly decreased the risk of developing breast cancer (Bansal et

al., 2014).

Conflicting results obtained for rs2910164 may be due to the variations in the

studies in terms of ethnicity of participants and type of cancer that were evaluated, as

well as the statistical approach to analyse the results. It is also worth noting that only

one study evaluated risk of familial breast cancer in a Caucasian population (Catucci

et al., 2010) and there is no replication of these results, since the following studies

for sporadic breast cancer were not performed in a purely Caucasian populations. To

the best of our knowledge there is no evidence in the literature about the association

of this miR-SNP with risk of sporadic breast cancer in a purely Caucasian

population. Therefore, more research for this miR-SNP is required using well-

characterised cohorts to determine whether there is an association with risk of

developing breast cancer.

Another example of conflicting evidence in terms of association analysis with

breast cancer risk is seen in the results of the studies conducted for rs11614913

present in MIR196A2. A study previously mentioned, conducted by Hu et al on


Chinese women identified significant association between presence of mutation in

rs11614913 and a higher risk of developing breast cancer (Z. Hu et al., 2009). On the

other hand, Hoffman et al showed that the presence of mutation in rs11614913

significantly decreased risk of breast cancer, a fact that was corroborated by both

their methylation assay of a CpG island upstream from the precursor of MIR196A2

and their functional validation in vitro using the MCF-7 breast cancer cell line

(Hoffman et al., 2009). However in 2010, Catucci et al failed to identify an

association between rs11614913 and breast cancer risk or age of onset (Catucci et al.,

2010). A meta-analysis conducted by Gao et al showed the CC genotype of

rs11614913 to be associated with an increased breast cancer risk (Gao et al., 2011),

which contradicted the results obtained by Hoffman et al (Hoffman et al., 2009). In

211, a study performed by Jedlinski et al in an Australian Caucasian sporadic breast

cancer cohort (193 cases versus 190 controls) showed no association to breast cancer

risk (Jedlinski, Gabrovska, Weinstein, Smith, & Griffiths, 2011). Linhares et al

showed that the presence of the CC allele significantly decreased the risk of breast

cancer in their study conducted in Brazilian women (388 cases and 388 controls)

(Linhares et al., 2012). Zhang et al found a marginally significant association

between the presence of mutation in rs11614913 and decreased risk of breast cancer

in their study in a Chinese cohort (252 cases versus 248 controls) (M. Zhang et al.,

2012). A previously mentioned meta-analysis conducted by Wang et al showed

significant association between CC genotype of rs11614913 and increased breast

cancer risk (P. Y. Wang et al., 2013). A study by Bansal et al, previously mentioned,

conducted in a North Indian population showed no association between mutation at

this loci and breast cancer risk (Bansal et al., 2014). Looking at patient outcomes

instead of risk, a recent study conducted in a Korean population by Lee et al

identified the presence of mutation in rs11614913 to be significantly associated with

poor prognosis for patients with sporadic breast cancer, which implicates some role

for the polymorphism in breast cancer aetiology (S. J. Lee et al., 2014). Despite the

conflicting evidence presented in here, it is more likely that rs11619913 is not

associated with sporadic breast cancer risk in Caucasian individuals, since it is a

common finding for the only two studies that evaluated this miR-SNP in purely

Caucasian breast cancer cohorts (Catucci et al., 2010; Jedlinski et al., 2011).

However further studies in larger cohorts may be required to ascertain this finding


and determine whether the role in patient outcome identified by Lee et al can be

replicated.

Finally rs3746444 present in MIR499 has been associated with increased breast

cancer risk in Asian populations according to the results obtained in the studies

conducted by Hu et al (Z. Hu et al., 2009) and Chen et al (P. Chen, Zhang, & Zhou,

2012). However studies performed in populations of other ethnicities have failed to

replicate this finding (Bansal et al., 2014; Catucci et al., 2010; P. Y. Wang et al.,

2013), which may suggest that the effects of this miR-SNP are specific to certain

ethnicities or affected by environmental factors.

2.7 SUMMARY AND IMPLICATIONS

Breast cancer represents a significant burden of disease, in particular to the

Australian population. Both incidence and mortality rates are increasing despite our

current knowledge on the disease (AIHW, 2009, 2012).

It is a complex disease caused by the interplay between our genetic

susceptibility and a number other risk factors (Martin & Weber, 2000; Robbins et al.,

2010). Single gene mutations associated with the most commonly found types of

familial breast cancer have already been identified, but they only account to about

10% of all cases and even in familial cancer, not all mutations have been identified

(Robbins et al., 2010; Strachan & Read, 2011; Weinberg, 2007). We still lack

knowledge about the genetics mechanisms involved in development of the sporadic

forms of the disease. Complicated gene networks interacting with the environment

may be at play in the aetiology of the disease and we need to identify mediators with

an important role in the pathophysiology of breast cancer. MicroRNA genes have

been identified to regulate a large number of genes involved in molecular pathways

important for cancer development. Therefore, a deeper understanding of these genes

is required to develop diagnostic or therapeutic strategies that will allow earlier

detection or will have a big impact for breast cancer sufferers in their prognosis.


Deregulation of microRNA genes has been identified as a feature present in

most types of cancer, in particular breast cancer. However, we still lack detailed

knowledge of the mechanisms that result in the miRNA expression patterns seen in

relation to breast cancer. It has been proposed that changes in the sequence of

miRNA genes might have an impact on their biogenesis and function in relation to

cancer and functional sequence variants have been identified. Single nucleotide

polymorphisms (SNPs) are the most common type of variation identified in miRNA

sequences and they are very likely to be key players in the development of cancer. A

large number of miR-SNPs have been identified to date and only a few have been

associated with breast cancer. Moreover, there is conflicting evidence for some of the

miR-SNPs that have been previously studied in relation to breast cancer. Further

studies to validate those in larger cohorts are needed, as well as studies to address a

large number of miR-SNPs not previously analysed, in order to successfully identify

those variants significantly associated with breast cancer.

One of the many problems that research studies face today in relation to

validating polymorphisms in miRNA genes; is the availability of large and well

characterized cohorts and the resources to recruit them. Additionally, genomic DNA

of good quality is essential to perform genotyping association studies that will

successfully identify miRNA variants. Therefore, it is important to use a DNA

extraction method that is cost-effective and time efficient and that will result in

optimal DNA yield. It is even more important when we have to deal with a large

number of samples in order to be quick and effective in the establishment of the

study population.

Careful selection of the miRNA variants to investigate in relation to breast

cancer is another key step. It requires careful assessment of the validity of previous

reports for miRNA genes and their selected variants. It is also important to consider

their prevalence and the potential impact they may have in the target study

population. Selection of an efficient and reliable genotyping technique is also

fundamental to obtain valid results.


Therefore our study aimed to establish a new prospective large cohort of DNA

samples obtained from Cancer patients recruited in Queensland in collaboration with

the Queensland Cancer Council that would serve as a replication population to our

previously established case control population. To validate any positive association

result it is important to test for association in a second independent case-control

population. Hence the establishment of a large second independent cohort was a

primary aim of this research. We also developed a selection process for miR-SNPs to

validate in both our Australian Caucasian case control cohorts to identify variants

that could be further analysed in order to determine their role in breast cancer

development.


Chapter 3: Methodology background

3.1 STUDY POPULATIONS

3.1.1 Genomics Research Centre Breast Cancer Population (GRC-BC population)

The Genomics Research Centre Breast Cancer (GRC-BC) population was

obtained from two facilities: 244 Caucasian Australian breast cancer patients residing

in the South East Queensland Region recruited from the Gold Coast Hospital,

Southport and 187 female individuals matched for age (± 5 years) and ethnicity with

no history of personal or familial breast cancer collected at the Genomics Research

Centre Clinic, Southport. All participants supplied informed written consent and

research was approved by Griffith University’s Human Ethics Committee (Approval:

MSC/07/08/HREC and PSY/01/11/HREC) and the Queensland University of

Technology Human Research Ethics Committee (Approval: 1400000104). The mean

age for cases and controls in the GRC-BC population was 57.52 years and 57 years,

respectively.

3.1.2 The Griffith University – Cancer Council Queensland Breast Cancer Biobank (GU-CCQ BB)

The Griffith University – Cancer Council Queensland Breast Cancer Biobank

(GU-CCQ BB) was established as a collaboration between the Genomics Research

Centre and the Cancer Council of Queensland. All women aged 20 to 79 years

resident in Queensland with a histologically confirmed diagnosis of invasive breast

cancer between have been notified to the Queensland Cancer Registry since January

2010 and recruitment will extend throughout the following 5 years. 929 Patients were

recruited at the Cancer Council Queensland as part of this 5-year population-based

longitudinal study of women newly diagnosed with breast cancer. Exclusion criteria

Chapter 3: Methodology background 81

for this study include women unable to speak or understand English and women with

cognitive impairment.

All patients provided informed written consent to participate and clinical and

demographic information was obtained from through the database at the Queensland

Cancer Registry, computer assisted telephone interview (CATI), self-administered

questionnaires (SAQ) and the Genomics Research Centre General Questionnaire.

Data was collected over two main time points: 4 to 6 months post diagnosis (Time 1)

and 18 months post diagnosis [approximately 12 months after initial interview]

(Time 2). Patients were also requested to provide a blood sample for molecular and

genetic marker analysis at Time 1 (4 to 6 months post diagnosis).

Control population comprised 102 healthy female volunteers with no personal

or familial history of breast cancer matched by age (± 5 years) and ethnicity and it

was recruited at the Genomics Research Centre Clinic, Southport. Individuals from

the control population provided written consent to participate and research was

approved by Griffith University’s Human Ethics Committee (Approval:

MSC/07/08/HREC and PSY/01/11/HREC) and the Queensland University of

Technology Human Research Ethics Committee (Approval: 1400000104). Mean

ages for cases and controls was 60.17 years and 60.20 years respectively.

3.2 CASE - CONTROL ASSOCIATION STUDIES

3.2.1 Genomic DNA Extraction

There are a number of DNA extraction techniques available to perform nucleic

acid extraction from the blood samples obtained for our case and control population.

DNA extraction can be performed using laboratory protocols that involve organic

and non-organic reagents, centrifugation methods and there is also a wide range of

commercially available kits (J. J. Greene & Rao, 1998; Grimberg et al., 1989; Miller,

82 Chapter 3: Methodology background

Dykes, & Polesky, 1988). We conducted a review of the evidence supporting DNA

extraction techniques most commonly used at research facilities to determine the

most suitable protocol for the establishment of the GU-CCQ BB population. Results

from our literature review are shown in Chapter Five. As it is further described in

detail in Chapter Four, we tested 3 different protocols including a commercially

available kit to determine the most suitable one in terms of cost-effectiveness to use

for DNA extraction of the samples obtained for the DNA biobank (Chacon-Cortes,

Haupt, Lea, & Griffiths, 2012). Our study showed the modified salting-out DNA

extraction method, developed by Nasiri et al (Nasiri, Forouzandeh, Rasaee, &

Rahbarizadeh, 2005), to have the best performance in terms of cost-effectiveness and

time-efficiency. This method is very similar to traditional salting out extraction

protocols, but it replaces proteinase K with laundry detergent instead, as a reagent for

the lysis and removal of contaminants in DNA isolation, reducing time (no overnight

incubation) and general costs overall. However, it maintains similar results, in terms

of quality and quantity of final DNA yield obtained, as the ones observed with

traditional salting out method and commercially available DNA extraction kit

(Chacon-Cortes et al., 2012). Therefore, DNA isolation from blood samples from the

GU-CCQ BB population were performed using the modified salting-out protocol

initially developed by Nasiri et al (Nasiri et al., 2005), which is further described in

detail in Appendix A.

Following DNA extraction from blood, samples were assessed for integrity

(260/280 nm ratio and absorbance spectrum) and concentration (obtained in ng/µl)

using the Thermo Scientific NanoDropTM 8000 UV-Vis Spectrophotometer

(Thermo Fisher Scientific Inc., Wilmington, DE. USA). To ensure consistency of

DNA sample evaluation, samples were measured in duplicate to check for

consistency and the average result was kept on record for our sample population

database.


3.2.2 Genotyping Techniques

As previously mentioned, this project aimed to investigate genetic variants

(miR-SNPs) in microRNA genes identified to play a role in breast cancer biology.

Therefore, we chose the following genotyping techniques to conduct the case-control

association studies presented in Chapters Six to Eight: Genotyping using High-

Resolution Melting (HRM) analysis and Multiplex PCR and matrix-assisted laser

desorption/ionisation time-of-flight mass spectrometry (MALDI-TOF MS) analysis

3.2.2.1 Genotyping by High Resolution Melting Analysis (HRM)

High resolution melting (HRM) is a genotyping method that involves melting

curve analysis of real- time PCR by continuously monitoring each cycle. It relies on

the same principle of real-time PCR where a fluorescent dye, which produces a

signal only when it is attached to double stranded DNA (dsDNA), is monitored to

produce a graph that will be specific for homozygotes and heterozygotes from a

chosen SNP. High level of fluorescence is observed at low temperatures when the

DNA is still in double stranded form and fluorescence decreases when the

temperature rises denaturing DNA into single strands.

The highest rate of fluorescence decrease is generally at the melting

temperature of the DNA sample (Tm). The Tm is defined as the temperature at which

50% of the DNA sample is double-stranded and 50% is single-stranded. The Tm is

typically higher for DNA fragments that are longer and/or have a high GC content.

SNPs or sequence variations are classified into four different classes (I to IV)

according to the expected shift in Tm. SNPs in class I correspond to base changes C/T

and G/A, which occur in approximately 64% of the human genome. Class II includes

C/A and G/T base changes and they occur in about 20% of the human genome. SNPs

in class I and II generally produce large changes in Tm (> 0.5°C). Class III includes

SNPs with C/G base changes, which usually produce a small change in Tm (0.2 to

0.5°C) and they are found in approximately 9% of the human genome. Finally, SNPs

in class IV are found in 7% of the human genome and they correspond to A/T base


changes producing a very small change in Tm (< 0.2°C). Changes in the binding

strength of the fluorescent dye to the amplicon are plotted showing the level of

fluorescence vs. the temperature, generating a melting curve. Melt curves generated

after High Resolution Melting analysis are normally plotted with fluorescence on the

Y axis and temperature on the X axis. These are similar to real-time PCR

amplification plots but with the substitution of temperature for cycle number (Liew

et al., 2004; Liew, Wittwer, & Voelkerding, 2010; Reed, Kent, & Wittwer, 2007;

Wittwer, Reed, Gundry, Vandersteen, & Pryor, 2003).

Figure 3-1 Example of High Resolution Melt (HRM) curves showing three different miR-SNP genotypes.

HRM curve seen from genotyping experiments for miR-SNP rs2910164 showing similar shape for both homozygous genotypes and a distinct curve for heterozygous

genotype.

A temperature shift along the x-axis in a HRM melt curve is characteristic of

homozygous allelic variants. In contrast, a change in the shape is indicative of

heterozygotes and these changes provide clear distinctions between all three potential

genotypes allowing both automatic and visual calling of genotypes (See Figure 3-1).


It is a reliable and fast technique, but it requires careful optimization to be able to

produce distinctive melting curves shapes for proper identification of the different

genotypes from the chosen amplicon (Liew et al., 2004; Liew et al., 2010; Reed et

al., 2007; Wittwer et al., 2003). Specific details of the protocol used for HRM

genotyping are detailed in the methods section of Chapter 6.

3.2.2.2 Multiplex PCR and matrix-assisted laser desorption/ionisation time-of-flight mass spectrometry (MALDI-TOF MS) analysis.

Matrix-assisted laser desorption ionisation time-of flight (MALDI-TOF) was

developed both by Tanaka et al and Karas and Hillenkamp in 1988 and it was

initially used to study proteins and peptides (Karas & Hillenkamp, 1988; Tanaka et

al., 1988). MALDI-TOF has also been developed since then to be used in other

applications including DNA methylation analysis, expression profiling, mutation

detection and SNP analysis (Buetow et al., 2001; Griffin & Smith, 2000; Gut, 2004;

Werner et al., 2002). To date MALDI-TOF mass spectrometry (MS) analysis has

been used successfully in different SNP genotyping studies (Adkins, Strom,

Jacobson, Seemann, & et al., 2002; Falzoi, Mossa, Congeddu, Saba, & Pani, 2010;

Hung et al., 2005; Little, Braun, Darnhofer-Demar, & Koster, 1997; Little, Braun,

O'Donnell, & Koster, 1997; Nakai et al., 2002; Shi, Xiang, Li, & Shen, 2011;

Stepanov & Trifonova, 2013; Syrmis et al., 2013).

Multiplex PCR and Chip-based SNP genotyping using MALDI-TOF MS

analysis allows high-throughput genotyping in single tube reactions based on primer

extension. This technique consists of an initial locus-specific PCR to amplify the

target region where the SNP is located, followed by an extension reaction where an

extension primers anneals immediately upstream to the target site. This extension

reaction contains mass modified dideoxynucleotide terminators with differences

large enough to be detected by the mass spectrometry detection system. Products are

then loaded onto a chip coated with a matrix that allows crystallization of the PCR

product. A laser is fired at the crystals to ionise the molecules and they travel through

a vacuum tube with an ion detector located at the end. The software in the detection


system calculates the mass of the fragments based on the time it takes fragments to

reach the detector and the mass determines the specific genotype of the fragments

present in each reaction (Gabriel, Ziaugra, & Tabbaa, 2001).

Details of the protocol used for genotyping by multiplex PCR and MALDI-

TOF MS analysis used in our miR-SNP association studies are presented in

Appendix C.

3.2.2.3 Sanger Sequencing

Sanger sequencing has been widely used since it was initially developed by

Frederick Sanger and colleagues in 1974 (Sanger, Nicklen, & Coulson, 1977). It was

used in the human genome project (Lander et al., 2001) and it has been highly

developed over the past decades. The principle behind it is to generate all possible

single stranded DNA molecules differing in one nucleotide in length that are

complementary to a template. It basically starts by annealing an oligonucleotide of

about 20 bases specific to a sequence of a target gene in the 5’ end of a denatured

template. Complementary strand is catalysed by DNA polymerase in presence of

four deoxynucleotide triphosphates (dNTPs), creating molecules that differ in length

by one nucleotide randomly terminated by the incorporation of a random

dideoxynucleotide triphosphate (ddNTP). DNA is denatured again and fragments are

separated, a process that was initially done manually on polyacrylamide gel

electrophoresis and which is nowadays performed automatically using different

techniques and equipment including fluorescently tagged chain-terminators, scanning

of laser induced fluorescence, reading using thermostable polymerases or capillary

electrophoresis (Gibson & Muse, 2009; Pettersson, Lundeberg, & Ahmadian, 2009;

L. M. Smith et al., 1986).

Disadvantages of this technique include high costs and limitations to detect

rare mutations due to analogue nature of the sequence output. New sequencing

technologies are being developed to address these issues, which involve sequencing-

by-synthesis principle in a parallel format, making them able to detect rare mutations


in a more affordable manner, including pyrosequencing (Gibson & Muse, 2009;

Morozova & Marra, 2008; Pettersson et al., 2009; Strausberg, Levy, & Rogers,

2008).

Automated DNA Sanger sequencing was used to validate results from

genotyping techniques or on samples that were not adequately genotyped by any

other technique. Details of the Sanger sequencing protocol used in our genotyping

studies are described in Appendix C.

3.2.3 Statistical Analysis

We examined the power of our available 2 populations (GRC-BC and CCQ-

GU BB) for breast cancer association analysis. Power analysis indicated that even if

any of the analysed polymorphism had a relatively small effect size (w ≥ 0.2) in the

risk of developing breast cancer, both our populations (GRC-BC and CCQ-GU BB)

are of sufficient size to have ~80% or greater power to detect an allelic association at

the 0.05 level.

We determined genotype and allele frequencies for each miRNA SNP in our

case and control populations. Hardy-Weinberg equilibrium (HWE) was used to

evaluate deviation between observed and ideal Hardy-Weinberg frequencies. (Guo

& Thompson, 1992; Hardy, 1908). HWE states that the frequencies of alleles and

genotypes in a population always remain constant due to the presence of random

mating in absence of evolutionary influences. It is represented by the following

equation: p2 + 2pq + q2 = 1, where p represents proportion of individuals carrying the

dominant allele (homozygous) and q represents the proportion of individuals

carrying the recessive allele. We compared our genotyping results to an ideal Hardy-

Weinberg population to identify significant deviation which indicates selection bias,

population stratification, inbreeding or genotyping error (Guo & Thompson, 1992;

Hardy, 1908; Wigginton, Cutler, & Abecasis, 2005).


Following this, we evaluated differences in genotype and allele frequencies

between cases and controls using Chi-square analysis for each independent

population to determine association with breast cancer. Chi-square analysis (χ2) is a

non-parametric test used to examine the association between two variables at the

categorical level. It compares observed population values against values for an

expected population, using the differences between them to determine a p-value. The

equation of the chi-squared is χ2 = Σ [(O-E)2 / E] where the observed frequency in

this study was the cancer affected population while the expected frequency was the

control population (Fisher & Yates, 1963). The α for p-values was set at 0.05 to

determine statistical significance indicating that the differences observed would not

happen by chance 95% of the time. Some limitations for Chi-square analysis include

a large enough sample size so that the minimum expected frequency is five in each

category and normal distribution of the data. We also calculated odds ratio (OR) and

a confidence interval (CI) of 95% to assess disease risk (Fisher & Yates, 1963;

Rosner, 2006).

We used the G*Power statistical package version 3.1.9.2 (Heinrich-Heine-

Universitat Dusseldorf) to determine the power of our study (Faul, Erdfelder,

Buchner, & Lang, 2009; Faul, Erdfelder, Lang, & Buchner, 2007). All other

statistical analyses for genotypes and alleles were performed using Plink software

version 1.07 (http://pngu.mgh.harvard.edu/purcell/plink/) (Purcell et al., 2007).

Genetic linkage is an association between genetic polymorphisms which occurs

when there is non-random association of their alleles as a result of their proximity on

the same chromosome. Haplotypes comprise of SNPs or genetic variants on the same

chromosome that are inherited together which have little chance of recombination

and Linkage disequilibrium (LD) refers to the correlation between neighbouring

haplotypes descended from single, ancestral chromosomes (Hong & Park, 2012;

NAP, 2012; Reich et al., 2001). Therefore, where appropriate, we performed linkage

disequilibrium (LD) and haplotype block association analysis on the combined

genotyping data using Haploview software version 4.2 (Daly Lab, Broad Institute,


http://pngu.mgh.harvard.edu/purcell/plink/

Cambridge, MA, USA) (Barrett, Fry, Maller, & Daly, 2005) to determine association

between haplotypes and disease risk.


Chapter 4: Comparison of genomic DNA extraction techniques from whole blood samples: a time, cost and quality evaluation study.

4.1 STATEMENT OF CONTRIBUTION OF CO-AUTHORS FOR THESIS BY PUBLISHED PAPER

The authors listed below have certified that:

1. they meet the criteria for authorship in that they have participated in the

conception, execution, or interpretation, of at least that part of the publication in their

field of expertise;

2. they take public responsibility for their part of the publication, except for the

responsible author who accepts overall responsibility for the publication;

3. there are no other authors of the publication according to these criteria;

4. potential conflicts of interest have been disclosed to (a) granting bodies, and (b)

the editor or publisher of Molecular Biology Reports, and;

5. they agree to the use of the publication in the student’s thesis and its publication

on the QUT ePrints database consistent with any limitations set by publisher

requirements.

In the case of this chapter:

Chacon-Cortes D, Haupt LM, Rod LA and Griffiths LR (2012). Comparison of

genomic DNA extraction techniques from whole blood samples: a time, cost and

quality evaluation study. Molecular Biology Reports, 39, 5961–5966.

Chapter 4: Comparison of genomic DNA extraction techniques from whole blood samples: a time, cost and quality evaluation study. 91

Contributor Statement of Contribution

Diego Chacon-Cortes

(Candidate)

Conducted all experimental work, contributed to

experimental design, research plan and data analysis

and prepared and approved final version of the

manuscript.

Larisa M Haupt Assisted in experimental design, provided advice on

manuscript editing, critically reviewed and approved

final version of the manuscript.

Rod A Lea Assisted in statistical analysis, critically reviewed and

approved final version of the manuscript.

Lyn R. Griffiths Critically reviewed and approved the final version of

the manuscript.

Principal Supervisor Confirmation

I have sighted email or other correspondence from all co-authors confirming

their certifying authorship.

92Chapter 4: Comparison of genomic DNA extraction techniques from whole blood samples: a time, cost and quality evaluation study

ABSTRACT

Genomic DNA obtained from patient whole blood samples is a key element for

genomic research. Advantages and disadvantages, in terms of time-efficiency, cost-

effectiveness and laboratory requirements, of procedures available to isolate nucleic

acids need to be considered before choosing any particular method. These

characteristics have not been fully evaluated for some laboratory techniques, such as

the salting out method for DNA extraction, which has been excluded from

comparison in different studies published to date. We compared three different

protocols (a traditional salting out method, a modified salting out method and a

commercially available kit method) to determine the most cost-effective and time-

efficient method to extract DNA. We extracted genomic DNA from whole blood

samples obtained from breast cancer patient volunteers and compared the results of

the product obtained in terms of quantity (concentration of DNA extracted and DNA

obtained per ml of blood used) and quality (260/280 ratio and Polymerase Chain

Reaction product amplification) of the obtained yield. On average, all three methods

showed no statistically significant differences between the final result, but when we

accounted for time and cost derived for each method, they showed very significant

differences. The modified salting out method resulted in a 7 and 2 fold reduction in

cost compared to the commercial kit and traditional salting out method, respectively

and reduced time from 3 days to one hour compared to the traditional salting out

method. This highlights a modified salting out method as a suitable choice to be used

in laboratories and research centres, particularly when dealing with a large number of

samples.

4.2 INTRODUCTION

Genomic research studies benefit from the use of large populations. Biobanks

are becoming more popular and they are being recognized as a useful tool for

research in the field (Cambon-Thomsen, 2003; Cambon-Thomsen, Ducournau,

Gourraud, & Pontille, 2003). Whole blood is one of the available sources to obtain

genomic DNA (gDNA) required for experimental purposes in human science Chapter 4: Comparison of genomic DNA extraction techniques from whole blood samples: a time, cost and quality evaluation study. 93

research. After blood samples are collected, DNA extraction is the initial step of

laboratory procedures required to perform any chosen molecular application.

Therefore, it is essential to choose the appropriate DNA extraction method to obtain

high-yield and high-quality genomic DNA from any selected population of samples.

Technical requirements, time-efficiency and cost-effectiveness of the chosen method

are among the issues to consider when choosing a particular technique, especially if

you are dealing with a large number of samples.

DNA extraction techniques have come a long way since they were initially

described and most of them follow the same basic steps. They include the use of

organic and non-organic reagents, centrifugation methods (J. J. Greene & Rao, 1998)

and a variety of commercially available kits. One of these methods is DNA

extraction by salting out. It was initially described by Miller et al in 1988 (Miller et

al., 1988) and it is one of the techniques used for DNA isolation in many science

research facilities. In order to purify genomic DNA, laboratory protocols may require

the use of organic solvents such as phenol, chloroform and isoamyl alcohol or

enzymes such as proteinase K (Grimberg et al., 1989). As a result such protocols can

be expensive and time-consuming and they require the use of toxic materials. In an

effort to modify the methods used to isolate DNA, variations to the initially describe

techniques have been tested. In 2005, Nasiri et al described a different approach to

the traditional salting-out method, in which they did not require the use of toxic

reagents or long digestions with proteinase K to purify DNA. Instead of those, they

used a common, easily available laundry detergent in the process and the protocol

was optimized to obtain high-quality and high-yield genomic DNA (Nasiri et al.,

2005). Their research study showed the modified protocol to be a fast, safe and

efficient technique. It has been used as a protocol for DNA extraction in different

research facilities around the world (Garcia-Sepulveda, Carrillo-Acuña, Guerra-

Palomares, & Barriga-Moreno, 2010; Iri-Sofla, Rahbarizadeh, Ahmadvand, &

Rasaee, 2011; B. Liu et al., 2010; Trejo-de la O et al., 2008; Y. Wu et al., 2011; Y.

Zhang et al., 2009) and it has shown to be a suitable method. However, there is

limited evidence attesting to its advantages in important characteristics like cost-

effectiveness, time efficiency and technical requirements when compared to other

available methods. There are a number of studies which have evaluated different


DNA extraction techniques using whole blood samples (Abd El-Aal, Abd Elghany,

Mohamadin, & El- Badry, 2010; Di Pietro, Ortenzi, Tilio, Concetti, & Napolioni,

2011), but they excluded the salting out method and modified salting out from their

comparisons. In order to assess the modified salting-out protocol described by Nasiri

et al (Nasiri et al., 2005), we compared it against a traditional salting out method and

against an established kit method to determine which of these available procedures

was the best choice to obtain high-quality and high-quantity DNA.

4.3 MATERIALS AND METHODS

4.3.1 Samples

Eleven whole blood samples obtained from breast cancer patients residing in

the South East Queensland Region, recruited from the Gold Coast Hospital,

Southport were randomly selected for this study. Collection of samples was approved

by Griffith University’s Ethics Committee and it was performed in 2003-2004.

Samples were collected in EDTA tubes and stored at - 80oC prior to DNA extraction

about 8 years before gDNA extraction. It was noted that a number of the blood

samples showed some marked clotting, which may have been the result of incorrect

mixing prior to freezing and long term storage prior to their use.

4.3.2 Genomic DNA extraction methods

Three genomic DNA extraction methods were evaluated on the previously

selected samples, as follows:


5.4.2.1 Traditional salting out method

5 ml of whole blood were thawed and poured into a 50 ml falcon tube. 10 ml of

NKM buffer (3 mM MgCl2, 140 mM NaCl, 30 mM KCl) was added to the samples

and tubes were shaken vigorously. After centrifugation at 4600 rcf for 25 minutes at

4ºC, supernatant was discarded and 5 ml of RSB buffer (40mM Tris-HCl pH 7.5, 3

mM MgCl2, 40mM NaCl) was added to the falcon tube to break up the pellet. Tubes

were shaken vigorously again and RSB buffer was added to bring the volume up to

15 ml in total. Mixture was centrifuged at 3800 rcf for 15 minutes at 4°C, supernatant

was discarded and pellets were resuspended by adding 0.5 ml of RSB buffer. 2 ml of

lympholysis buffer (50 mM EDTA pH 8.0, 50 mM Tris-HCl pH 7.5, 150 mM NaCl,

10% SDS) and 125 µl of proteinase K solution (2 mg/ml proteinase K, 1 mM CaCl2

– 1 mM Tris pH 7.5) were added to the samples and falcon tubes were placed

overnight in a 37°C shaking water bath. On the following day 2ml of sterile saturated

NaCl (6M) solution was added and samples were mixed by inversion. Tubes were

centrifuged at 2400 rcf for 15 minutes at 4°C. Supernatant was carefully transferred

to a 10ml tube and they were centrifuged again at 2400 rcf for 15 minutes at 4°C.

Supernatant was transferred into a new 50ml falcon tube and two volumes of -20°C

chilled absolute (100%) ethanol were added. DNA strand precipitate was transferred

using an inoculating loop into a labelled 2ml tube containing 1ml of 1X TE buffer

(10mM Tris pH 8.0, 1mM EDTA). Extracted DNA was allowed to dissolve by

incubating at 37°C overnight and samples were stored at 4°C after quantification and

testing.

5.4.2.2 Modified salting out method

5 ml of whole blood were thawed and transferred into a 15 ml falcon tube. 8 ml

of lysis buffer (0.3 M sucrose, 0.01 M TrisHCl, pH 7.5, 5 mM MgCl2, 1% Triton

X100) were added to each sample and they were mixed by inversion. Tubes were

centrifuged at 2500 rcf for 30 minutes at 4°C. Supernatant was discarded without

disturbing pellet on the bottom. 300 µl of 10 mM TrisHCl, pH 8.0 were added to the

falcon tube and the pellet was broken by pipetting. Samples were transferred to fresh


microfuge tubes and they were resuspended by vortexing. After centrifugation at

2500 rcf for 20 minutes at 4oC, supernatant was discarded. 330 µl of 10 mM

TrisHCl, pH 8.0; 330 µl of commercial laundry powder (Biozett Attack© Front

Loader, Kao Marketing Pty. Ltd., Australia) solution (30 mg/ml) and a glass bead

were added to the pellet. Samples were vortexed for 1 minute and 250 µl of 6M NaCl

were added and samples were vortexed again for 20 seconds. Mixture was

transferred to fresh 2 ml microfuge tubes and they were centrifuged at 15000 rcf for

10 minutes. Supernatant was transferred to fresh microfuge tubes and DNA was

precipitated by addition of 1 to 2 volumes of -20°C chilled absolute (100%) ethanol.

Using an inoculating loop, washed twice in 70% ethanol, DNA strands were placed

into one labelled 2 ml tube containing 1 ml of 1X TE buffer (10 mM Tris pH 8.0, 1

mM EDTA). DNA was allowed to dissolve by incubating at 70°C for 5 minutes.

Samples were finally stored at 4°C after quantification and testing.

5.4.2.3 QIAGEN© QIAamp® DNA Blood Maxi Kits (QIAGEN Pty Ltd, PO Box 25, Clifton Hill, Victoria 3068, Australia)

5ml of whole blood were thawed and the protocol was carried out according to

manufacturer’s instructions in their handbook. Obtained gDNA samples were stored

at 4°C after quantification and testing.

4.3.3 Evaluation of extracted DNA

Genomic DNA extracted from blood samples was evaluated using the

following procedures:

5.4.3.1 gDNA yield concentration and quality evaluation by spectrophotometry

To assess integrity and quantity of genomic DNA extracted from blood

samples using each method, we performed spectrophotometry measurements using


The Thermo Scientific NanoDrop™ 1000 Spectrophotometer (Thermo Fisher

Scientific Inc, Wilmington, DE. USA.). Samples were measured in duplicate and the

average result was used for comparisons. Results for DNA quality (260/280 nm ratio

and absorbance spectrum) and concentration (measured in ng/µl) were compared for

each individual sample and using the average of all samples from each method.

Statistical analysis of the results was performed using Office Excel Spreadsheet 2007

(Microsoft®) and IBM® SPSS® software version 19 (IBM Company ©).

5.4.3.2 Polymerase Chain Reaction

Extracted samples were also tested using standard polymerase chain reaction

(PCR) to amplify the methylenetetrahydrofolate reductase (MTHFR) gene producing

a fragment of about 190 base pairs using the following primer sequence: 5’-

TGAAGGAGAAGGTGTCTGCGTTA. Reaction mixture contained 0.2 µl of

GoTaq® Flexi DNA polymerase (Promega, Madison, WI, USA) 100 nM of each

primer, 1X PCR Buffer, 1.75 nM MgCl2, 200 nM of deoxynucleotides (dNTPs) and

40 ng of gDNA in a final reaction volume of 20µl. Reactions were performed using

Applied Biosystems Veriti™ Thermal Cycler (Applied Biosystems, Foster City,

California, USA) with cycling conditions as follows: Initial cycle of denaturation at

94oC for 10 minutes, followed by 30 cycles of denaturation at 94oC for 10 seconds,

and annealing at 58oC for 45 seconds and extension at 72oC for 45 seconds followed

by additional 7 minutes at 72oC and a final hold at 4oC for 5 minutes. Each DNA

sample was run in duplicate and against a negative control. Product amplification

was determined by electrophoresis on 2% agarose gel stained with ethidium bromide,

visualized under U.V. light using Gel Doc XR System (Bio-Rad) and Quantity One®

software.

4.4 RESULTS

In an initial experiment, 10 ml of whole blood from 5 different patient samples

were split to be extracted using both the traditional salting out and the modified


salting out method according to the protocols outlined above. The average values of

measures obtained by spectrophotometry from initially extracted gDNA samples are

shown in Table 4-1.

Table 4-1 Spectrophotometry measures from whole blood samples (BC319-BC323)

TRADITIONAL SALTING OUT

METHOD MODIFIED SALTING OUT

METHOD Sample ID Final

gDNA Yield (µg)

Obtained gDNA Yield (µg) per ml

of blood

260/280 nm

Ratio

Final gDNA Yield (µg)


of blood

260/280 nm

Ratio

BC319 6.54 1.30 1.63 250.99 50.20 1.73 BC320 17.34 3.47 1.99 168.55 33.71 1.68 BC321 13.07 2.61 1.80 33.72 6.74 1.82 BC322 43.27 8.65 1.90 19.68 3.94 1.72 BC323 13.99 2.80 1.84 14.16 2.83 1.44

Average 18.84 3.77 1.83 97.42 19.48 1.68

On average final gDNA yield and obtained gDNA yield per ml of blood was

significantly higher when using the modified salting out method but 260/280 nm

ratio was slightly lower to the results obtained using the traditional salting out

method. However, when performing statistical analyses (independent t-test and

ANOVA) on the results obtained, there were no statistically significant differences

for final gDNA yield and obtained gDNA yield per ml of blood (P = 0.176 for both

measures) and for 260/280 nm ratio (P = 0.116). Because both methods proved to be

comparable in terms of effectiveness, we decided to compare them against a

commercially available DNA extraction kit, believed to be more effective and

efficient than the ones previously mentioned.

Six different new 15 ml whole blood patient samples were split to extract

gDNA, using all previously selected methods: traditional salting out, modified

salting out and QIAGEN© QIAamp® DNA Blood Maxi Kit. gDNA extracted from

each sample using each method was evaluated by spectrophotometry and the results

obtained are presented in Table 4-2.


Table 4-2 Spectrophotometry measures from gDNA extracted from whole blood samples using 3 different extraction methods

TRADITIONAL SALTING OUT

METHOD MODIFIED SALTING OUT METHOD QIAGEN© QIAamp® DNA Blood Maxi Kit

Sample ID Final gDNA Yield (µg)


of blood

260/280 nm Ratio



of blood

260/280 nm Ratio



of blood

260/280 nm Ratio

BC316 50.55 10.11 1.86 67.48 13.50 1.65 112.00 22.40 2.12 BC317 26.22 5.24 1.59 30.55 6.11 1.76 84.55 16.91 1.78 BC324 10.46 2.09 1.66 27.21 5.44 1.82 47.26 9.45 2.59 BC325 31.00 6.20 1.77 44.68 8.94 1.71 60.98 12.20 1.67 BC326 50.01 10.00 1.83 38.46 7.69 1.76 11.99 2.40 2.06 BC327 27.64 5.53 1.80 34.46 6.89 1.82 54.41 10.88 1.91

Average 32.65 6.53 1.75 40.47 8.09 1.75 61.86 12.37 2.02


As shown in Table 4-2, on average, QIAGEN© QIAamp® DNA Blood Maxi

Kit provided higher final gDNA yield, obtained higher gDNA yield per ml of blood

and a better 260/280 nm ratio than traditional and modified salting out methods. On

the other hand, the modified salting out method resulted, on average, in higher final

gDNA yield and obtained higher gDNA yield per ml of blood than traditional salting

out method. However, they both had the same average result for 260/280 nm ratio

measure. It appeared as if QIAGEN © QIAamp® DNA Blood Maxi Kit provided a

more highly concentrated and better quality gDNA than the other two methods.

Although, statistical analysis using ANOVA showed that the results from all 3

methods were not significantly different (p value= 0.110 for gDNA yield) and a

marginally significant difference (p value = 0.05) was obtained for the 260/280 nm

ratio. Furthermore, using an independent t-test to compare modified salting out

method and QIAGEN © QIAamp® DNA Blood Maxi Kit, there were no statistically

significant differences between these results (P value = 0.187 for gDNA yield and P

value= 0.076 for 260/280 nm ratio).

Once quality and quantity of gDNA obtained was evaluated, a polymerase

chain reaction (PCR) to investigate amplification of the methylenetetrahydrofolate

reductase (MTHFR) gene was performed on all samples obtained from traditional

and modified salting out method. PCR was run using the experimental conditions

previously exposed in our methods section and the results from this experiment are

shown in Figure 4-1.

PCR product amplification for the MTHFR gene was equally achieved using

gDNA samples extracted from both traditional and modified salting out. gDNA

quality was shown to be similar and both techniques were able to result in suitable

products for future use in molecular applications (Figure 4-1).


Figure 4-1 Agarose gel showing PCR product amplification on DNA extracted by traditional salting out method(lanes 1-6) and modified salting out method(lanes 7-12)

When analysing the costs derived from the use of either one of the previously

exposed methods, based on prices from our local providers, we were able to define

the information summarized in the following table (Table 4-3).

Table 4-3 Time and Cost summary for different gDNA extraction methods

gDNA Extraction Method

Cost per gDNA extraction from

5ml of whole blood sample

Cost of gDNA extraction from

100 Samples Overnight steps

Traditional salting out method

AUD 3.65 AUD 365.00 Incubation 37oC

Modified salting out method

AUD 1.90 AUD 190.00 None

QIAGEN© QIAamp® DNA Blood Maxi Kit

AUD 12.3 AUD 1,230.00 None

As shown in Table 4-3, QIAGEN © QIAamp® DNA Blood Maxi Kit is the

most expensive out of all the techniques chosen for extraction comparison. It is

almost seven times more expensive than the modified salting out method. The

102Chapter 4: Comparison of genomic DNA extraction techniques from whole blood samples: a time, cost and quality evaluation study.

modified salting out method and QIAGEN © QIAamp® DNA Blood Maxi Kit both

require about the same time to be performed, taking on average about 1 hour per

sample extracted. On the other hand, the traditional salting out method requires the

most time to be performed because it involves an overnight incubation at 37°C to

allow proteinase K digestion and it almost doubles the cost of the modified salting

out protocol. As a result the modified salting out method proved to be the most cost-

effective technique, reducing expenses about seven times less per sample than the

commercial kit and approximately half the costs of traditional salting out method.

4.5 DISCUSSION

Genomic DNA (gDNA) is a key component in order to perform molecular

applications involving genomic studies. Whole blood is the primary source for DNA

isolation and there are a number of different methods to choose to prepare this

material. They go all the way from old-school basic laboratory protocols to

commercially available tested kits and they have been tested and improved over the

years. Nonetheless, there are a number of advantages and disadvantages to each one

of the available choices and they must be considered before choosing a particular

technique. Issues like the time, direct costs and indirectly generated costs and

efficiency from each method should be carefully evaluated, especially when dealing

with large populations and particular baseline conditions of the samples to extract.

Because of technological advances, some protocols have been disregarded

despite their potential advantages, particularly in terms of costs. Although it is being

widely used nowadays, the salting out method has been excluded from studies where

comparisons have been made between available techniques for DNA isolation.

Therefore, we decided to compare a modified version of the salting out method to the

traditional protocol and also to one of the commercially available choices to

determine their advantages. Our results show that despite some differences in total

gDNA obtained from one individual to the next, all of these techniques do not differ

significantly in the quality and quantity of DNA yield obtained from whole blood

samples, previously frozen. They all result in gDNA that can be used safely for Chapter 4: Comparison of genomic DNA extraction techniques from whole blood samples: a time, cost and quality evaluation study. 103

molecular genomic studies. However, when you take time and money into

consideration some important differences were observed. The modified salting out

and QIAGEN © QIAamp® DNA Blood Maxi Kit method significantly reduce the

amount of time required for DNA extraction, from up to 3 days to about 1 hour.

When compared to the traditional salting out method, the modified salting out

method provides an additional advantage, in spite of their similarities in time

performance to QIAGEN © QIAamp® DNA Blood Maxi Kit. With our findings

showing almost a 7-fold cost reduction when using modified salting out method

instead of the commercially available kit and almost half a reduction in costs to the

traditional salting out technique.

In conclusion, any of these three techniques perform similarly in terms of

quantity and quality of final product, but our results showed the modified salting out

method to be the most cost-effective and time-efficient technique to isolate gDNA

from whole blood samples.

104Chapter 4: Comparison of genomic DNA extraction techniques from whole blood samples: a time, cost and quality evaluation study.

Chapter 5: Methods for extracting genomic DNA from whole blood samples: current perspectives.





field of expertise;




4. potential conflicts of interest have been disclosed to (a) granting bodies, and (b)

the editor or publisher of Journal of Biorepository Science for Applied Medicine,

and;



requirements.


Chacon-Cortes D, and Griffiths LR (2014). Methods for extracting genomic DNA

from whole blood samples: current perspectives. Journal of Biorepository Science

for Applied Medicine, 2, 1-9.

Chapter 5: Methods for extracting genomic DNA from whole blood samples: current perspectives. 105


Diego Chacon-Cortes

(Candidate)

Conducted literature research, prepared and approved


Lyn R. Griffiths Critically reviewed and approved the final version of

the manuscript.




106 Chapter 5: Methods for extracting genomic DNA from whole blood samples: current perspectives.

5.2 ABSTRACT

DNA extraction has considerably evolved since it was initially performed back

in 1869. It is the first step required for many of the available downstream

applications used in the field of molecular biology. Whole blood samples are one of

the main sources used to obtain DNA and there are many different protocols

available to perform nucleic acid extraction on such samples. These methods vary

from very basic manual protocols to more sophisticated methods included in

automated DNA extraction protocols. Based on the wide range of available options,

it would be ideal to determine the ones that perform best in terms of cost-

effectiveness and time-efficiency.

We have reviewed DNA extraction history and the most commonly used

methods for DNA extraction from whole blood samples, highlighting their individual

advantages and disadvantages. We also searched current scientific literature to find

studies comparing different nucleic acid extractions methods to determine the best

available choice. Based on our research we have determined that there is not enough

scientific evidence to support one particular DNA extraction method from whole

blood samples. Choosing a suitable method is still a process that requires

consideration of many different factors and more research is needed to validate

choices made at facilities around the world.

5.3 INTRODUCTION

Human health studies in the field of molecular biology require the use of

deoxyribonucleic acid (DNA), ribonucleic acid (RNA) and protein samples.

Successful use of available downstream applications will benefit from the use of high

quality and high quantity DNA. Therefore, nucleic acid extraction is a key step in

laboratory procedures required to perform further molecular research applications. It

is essential to choose a suitable extraction method and there are a few considerations

to be made when evaluating the available options. These may include technical


requirements, time-efficiency, cost-effectiveness, as well as biological specimens to

be used and their collection and storage requirements (Carpi, Di Pietro, Vincenzetti,

Mignini, & Napolioni, 2011).

Whole blood is one of many different available sources to obtain genomic

DNA (gDNA) and it has been widely used in facilities around the world. Therefore,

we will focus on DNA extraction protocols using whole blood samples. Issues

regarding collection, storage and manual handling of human whole blood specimens

escape the scope of this publication and will not be covered. However, they are

important and they should be considered as they could potentially impact on the

performance and success of any DNA extraction technique chosen.

5.4 INITIAL DEVELOPMENT OF DNA EXTRACTION TECHNIQUES

Friedrich Miescher was the first scientist to isolate DNA, while studying the

chemical composition of cells. In 1869, he used leucocytes that he collected from the

samples on fresh surgical bandages and he conducted experiments to purify and

classify proteins contained in these cells. During his experiments, he identified a

novel substance in the nuclei, which he called “nuclein” (Dahm, 2005). He then

developed two protocols to separate cells’ nuclei from cytoplasm and to isolate this

novel compound, nowadays known as DNA, which differed from proteins and other

cellular substances. This scientific finding, along with the isolation protocols used,

was published in 1871 in collaboration with his mentor Felix Hoppe-Seyler (Dahm,

2005; Holmes, 2001). However, it was only in 1958, that Meselson and Stahl

developed a routine laboratory procedure for DNA extraction. They performed DNA

extraction from bacterial samples of E.Coli using a salt density gradient

centrifugation protocol (Holmes, 2001; Meselson & Stahl, 1958). Since then, DNA

extraction techniques have been adapted to perform extractions on many different

types of biological sources.


DNA extraction methods follow some common procedures aimed to achieve

effective disruption of cells, denaturation of nucleoprotein complexes, inactivation of

nucleases and other enzymes and removal of biological and chemical contaminants

and finally DNA precipitation (Tan & Yiap, 2009). Most of them follow similar

basic steps and they include the use of organic and non-organic reagents and

centrifugation methods. Finally they have developed into a variety of automated

procedures and commercially available kits (Carpi et al., 2011; J. J. Greene & Rao,

1998; Price, Leslie, & Landers, 2009; Tan & Yiap, 2009).

Initially we will discuss protocols and steps aimed to achieve cell lysis,

inactivation of cellular enzymes and denaturation of cellular complexes and DNA

precipitation, which require similar procedures and/or reagents during DNA

extraction from whole blood samples. Key differences in steps aiming to remove

biological and chemical contaminants will be highlighted further ahead when we

discuss each protocol in detail.

As previously mentioned, lysis of cells is a common step to most DNA

extraction protocols and it is commonly achieved through the use of detergents and

enzymes. SDS and Triton X-100 are examples of popular detergents used to

solubilize cell membranes. Enzymes are also combined with detergents to target cell

surface or cytosolic components. Proteinase K is a commonly used enzyme used in

various protocols in order to cleave glycoproteins and inactivate RNases and

DNases. Other denaturants such as urea, guanidium salts and chemical chaotropes

have also been used to disrupt cells and inactivate cellular enzymes, but these can

impact on quality and nucleic acid yield (Boom et al., 1990; Carpi et al., 2011;

Herzer, 2001; Price et al., 2009).

DNA precipitation is achieved by adding high concentrations of salt to DNA-

containing solutions, since cations from salts such as ammonium acetate counteract

repulsion caused by negative charge of phosphate backbone. A mixture of DNA and

salts in the presence of solvents like ethanol, used in final concentrations from 70%

to 80%, or isopropanol (final concentrations of 40-50%) causes nucleic acids to


precipitate. Some protocols include washing steps with 70% ethanol to remove

excess salt from DNA. Finally nucleic acids are resuspended in water or TE buffer

(10mM Tris, 1mM ethilenediaminetetraacetic acid (EDTA)) (Boom et al., 1990;

Herzer, 2001; Price et al., 2009). TE buffer is commonly used for long term DNA

storage because it prevents it from being damaged by nucleases, inadequate pH,

heavy metals and oxidation by free radicals. Tris provides safe a safe pH of 7 to 8

and EDTA chelates divalent ions used in nuclease activity and counteracts oxidative

damage from heavy metals (Herzer, 2001).

5.5 MAIN TYPES OF DNA EXTRACTION METHODS FROM HUMAN WHOLE BLOOD SAMPLES

Table 5-1 shows the main categories and sub-categories of DNA extraction

methods from whole blood samples that are generally used in research facilities

worldwide. Laboratory reagents commonly used for each stage of the nucleic acid

extraction protocol are included in this table in order to highlight similarities and

differences between them.

DNA extraction techniques included in Table 5-1 will be discussed in more

detail in the following sections, along with a brief summary of the technique history

and background. In recent years, some of these protocols have been adapted to

microdevices that would develop miniaturized total chemical analysis system

(μTAS) or microfluidic genetic analysis (MGA) microchips (Duarte et al., 2010;

Price et al., 2009). However, we will limit the scope of our review to those

techniques that are available for macroscale nucleic acid extraction.


Table 5-1 DNA extraction methods commonly used for extraction from whole blood samples

DNA Extraction

Method (Main

category)

DNA Extraction Method

(Sub-category)

DNA Extraction Protocol Stage Cell Lysis Denaturation of

Nucleoproteins/ Inactivation of

cellular enzymes

Removal of Contaminants DNA precipitation

Solution Based DNA Extraction Methods

Salting Out Methods • SDS • SDS/

Proteinase k • Triton X-100

• Proteinase K • Laundry

powder

• Potassium acetate • Sodium acetate • Sodium chloride

• Ethanol • Isopropanol

Organic Solvent/Chaotropes Methods

• SDS • SDS/

Proteinase k

• Guanidine thiocyanate (GTC)

• Phenol

• Phenol • Phenol: chloroform • Phenol: chloroform

isoamyl alcohol

• Sodium acetate/ethanol • Sodium

acetate/isopropanol

Solid Phase DNA

Extraction Methods

Glass Milk/ Silica Resin Methods

• SDS • Triton X-100

• Guanidine thiocyanate

• Glass Milk (Silica in chaotropic buffer)

• Silica matrix • Diatomaceous earth

• Ethanol • Isopropanol

Anion Exchange Methods

• Heat • Chelex • Chelex/Prote

inase K

• Chelex • N/A

Magnetic Beads Methods

• SDS • N/A • Sodium Chloride/ Polyethylene glycol

• Magnetic beads

Abbreviations: DNA, deoxyribonucleic acid; N/A, not applicable; SDS, sodium dodecyl sulfate


5.5.1 Solution based DNA extraction methods

As previously mentioned, solution based protocols have two main approaches:

I) Solutions based methods using organic solvents and II) those based on a salting

out technique. Further description of both methods follows.

Solution based DNA extraction methods using organic solvents

DNA extraction protocols using organic solvents derived originally from a

series of related RNA extraction methods. Some of the main steps used in these

methods are: I) Cell lysis undertaken by adding a detergent/chaotropic-containing

solution, including sodium dodecyl sulfate (SDS) or N-Lauroyl sarcosine. II)

Inactivation of DNases and RNases usually through the use of organic solvents, III)

purification of DNA and removal of RNA, lipids and proteins and finally IV)

resuspension of extracted nucleic acids (Carpi et al., 2011; Sambrook & Russell,

2001; Tan & Yiap, 2009).

This method was initially developed in 1977 when an RNA extraction

technique using guanidium isothyocyanate was used by Ulrich et al to isolate

plasmid DNA (Ullrich et al., 1977). This technique was later modified by Chirgwin

et al in 1979 and it required the use of guanidium thyocyanate and long hours of

ultracentrifugation through a cesium chloride (CsCl) cushion (Chirgwin, Przybyla,

MacDonald, & Rutter, 1979). In an effort to improve this method, Chomczynski et al

in 1987 developed a protocol for RNA extraction using guanidium thiocyanate-

phenol-chloroform and much shorter centrifugation (Chomczynski & Sacchi, 1987,

2006). This last RNA extraction protocol was able to isolate RNA, DNA and

proteins, but in order to be used as a DNA extraction technique, guanidium

thiocyanate-phenol-chloroform was later replaced by a mixture of phenol:

chloroform: isoamyl alcohol since the former solvent did not completely inhibit

RNase activity (Sambrook & Russell, 2001). Phenol is a carbolic acid that denatures

proteins quickly, but it is highly corrosive, toxic and flammable. This organic solvent

is usually added to the sample and then using centrifugal force a biphasic emulsion is


obtained. The top hydrophilic layer contains diluted DNA and the bottom

hydrophobic layer is composed of organic solvents, cellular debris, proteins and

other hydrophobic compounds. DNA is then precipitated after centrifugation by

adding high concentrations of salt, such as sodium acetate and ethanol or isopropanol

in 2:1 or 1:1 ratios. Excess salt can be removed by adding 70% ethanol and the

sample is then centrifuged to collect the DNA pellet which can be resuspended in

sterile distilled water or TE buffer (10mM Tris; 1mM EDTA pH 8.0) (Carpi et al.,

2011; Tan & Yiap, 2009).

Because these techniques involve the use of toxic and corrosive organic

solvents, safety is a main concern. Personal protective equipment, safety measures

involving the use of a biohazard hood and training are required. Phenol/chloroform

needs to be equilibrated to an adequate pH and protocol conditions should be

optimized (Price et al., 2009). In an effort to improve safety and ease of use of these

protocols, certain modifications have been introduced to avoid physical contact with

solvents. These include incorporating a silica gel polymer (S. Thomas, Tilzer, &

Moreno) or replacing solvents with other substances like benzyl alcohol (Ness.,

Cimler., Rich B. Meyer, & Vermeulen.).

Solution based DNA extraction methods using salting out

Some nucleic acid extraction techniques that avoid the use of organic solvents

have also been developed over the years (Carpi et al., 2011; J. J. Greene & Rao,

1998; Sambrook & Russell, 2001). In 1988, Miller et al published a protocol that

achieved DNA purification through protein precipitation at high salt concentration

(Miller et al., 1988). The traditional protocol involves initial cell disruption and

digestion with SDS-Proteinase K, followed by addition of high concentrations of

salts, usually 6M NaCl. The mixture is then centrifuged to allow proteins to

precipitate to the bottom, with the supernatant containing DNA then transferred to a

new vial. DNA is then precipitated using ethanol or isopropanol in the same manner

as described for organic solvent methods (Carpi et al., 2011; F. Greene et al., 2002;

Grimberg et al., 1989; Miller et al., 1988).


However, the use of proteinase K can be time consuming and expensive when

compared to other reagents used in different solution-based approaches, so there

have been a few attempts to find alternative reagents for deproteinisation of DNA

(Bahl & Pfenninger, 1996; Drabek & Petrek, 2002; Gaaib, Nassief, & Al-Assi, 2011;

Debomoy K Lahiri & Nurnberger Jr, 1991). In 1991, Lahiri et al developed a DNA

extraction protocol from blood samples that eliminated the use of organic solvents

and prolonged incubation with Proteinase K. Their protocol used Nonidet P-40 (NP-

40, Sigma) to lyse blood cells and high salt buffers and 10% SDS to inactivate and

remove contaminants. Another protocol is the modified salting out method published

in 2005 by Nasiri et al in 2005 (Nasiri et al., 2005), which replaced Proteinase K

digestion with the use of laundry powder. This modified technique has been

successfully used as a DNA extraction protocol in many facilities around the world

(Arévalo-Herrera et al., 2011; Garcia-Sepulveda et al., 2010; Iri-Sofla et al., 2011; B.

Liu et al., 2010; H. Liu et al., 2013; Trejo-de la O et al., 2008; Y. Wu et al., 2011; Y.

Zhang et al., 2008; Y. Zhang et al., 2009).

5.5.2 Solid phase DNA extraction methods

Purification of DNA using the liquid/solid phase approach can be traced back

to 1979 when Vogelstein et al used silica in a glass powder form in their protocol to

purify DNA fragments previously separated by agarose gel electrophoresis

(Vogelstein & Gillespie, 1979). Solid phase extraction methods for DNA extraction

from blood samples were initially described in 1989 by McCormick et al. They

published a technique using siliceous-based insoluble particles, chemically similar to

phenol, that interact with proteins to allow DNA purification (McCormick, 1989). A

number of different procedures using the liquid/solid DNA extraction approach have

been developed since then and they are used in the majority of commercially

available extraction kits.

These techniques will absorb DNA under particular pH and salt content

conditions through any of the following principles: I) hydrogen-binding in the


presence of a chaotropic agent to a hydrophilic matrix, II) ionic exchange using an

anion exchanger under aqueous conditions and III) affinity and size exclusion

mechanisms (Price et al., 2009; Tan & Yiap, 2009). Most of these methods follow a

series of similar steps to achieve cell disruption, DNA adsorption, nucleic acid

washing and final elution. Most solid phase techniques use a spin column to bind

nucleic acid under centrifugal force. Spin columns are made of silica matrices, glass

particles or powder, diatomaceous earth or anion exchange carriers and these

compounds generally need to be conditioned using buffer solutions at specific pH to

turn them into the required chemical form. Blood cells previously degraded using

particular lysis buffers are applied to the columns and centrifuged and the DNA

binds to column aided by pH and salt concentration conditions provided by binding

solutions. Some proteins and other biochemical compounds may also bind to the

column and they are later removed using washing buffers containing competitive

agents during a series of washing steps. DNA is finally eluted in sterile distilled

water or TE buffer (Carpi et al., 2011; Price et al., 2009; Tan & Yiap, 2009).

4.5.2.1 DNA extraction methods using silica and silica matrices

Silica matrices have unique properties for DNA binding. They are positively

charged and have high affinity towards the negative charge of the DNA backbone.

High salt conditions and pH are achieved using sodium cations, which bind tightly to

the negatively charged oxygen in the phosphate backbone of DNA. Contaminants are

removed with a series of washing steps, followed by DNA elution under low ionic

strength (pH ≥ 7) using TE buffer or sterile distilled water. Commercially available

kits using silica based approach are manufactured by Clontech (NucleoSpin™), Mo

Bio Laboratories (Ultraclean™ BloodSpin™), QIAGEN (QiaAmp®), Promega

(Wizard™), Epoch Biolabs (EconoSpin™), Sigma Aldrich (GenElute™) amongst

others. In these protocols blood samples are incubated for a few minutes with a lysis

buffer and most protocols take about 40 minutes to one hour to complete producing

high yields of DNA with minimum contamination (Carpi et al., 2011; Price et al.,

2009; Tan & Yiap, 2009).


A substance that contains high amounts of silica (up to 94%), known as

kieselguhr, diatomite or diatomaceous earth has also been used for DNA purification.

It was initially described by Boom et al in 1990 (Boom et al., 1990). It binds DNA

in the presence of chaotropic agents, followed by washing with a buffer containing

alcohol and finally DNA is eluted in a low salt buffer or sterile distilled water.

BioRad (Quantum Prep®) is an example of a DNA extraction product developed

using diatomaceous earth (Price et al., 2009).

DNA extraction kits have also evolved and they are incorporated into semi-

and fully automated equipment able to perform protocols from sample lysis to some

downstream applications like polymerase chain reaction (PCR), such as BioRobot

EZ1® Advanced (QIAGEN), Biomek® 4000 Laboratory Automation Workstation

(Beckman Coulter, Inc., Brea, CA, USA), amongst others. Pipetting error, reduced

number of sample transfer and less protocol time are among the advantages of these

devices, however they should be carefully considered given the high cost of some of

the available choices of equipment. They have also been incorporated into

miniaturized total chemical analysis systems, which are silicon microchips where

DNA purification separation and detection is achieved (Price et al., 2009).

4.5.2.2 DNA extraction using anion exchange resins

Positively charged chemical substances able to bind to negatively charged

nucleic acids or contaminants or enzymes, such as nucleases, are called anion

exchange resins and they have also been used as part of DNA extraction protocols

from blood samples (Herzer, 2001).

Chelex® 100 resin is made of styrene divinylbenzene copolymers that contain

paired imminodiacetate ions. It is used in DNA extraction protocols as a chelating

ion exchange resin that binds polyvalent metal ions, such as nucleases, commonly

used in DNA extraction from forensic samples. The initial laboratory protocol, using

blood as a biological source, was described by Wash et al in 1991 (Walsh, Metzger,

& Higushi, 2013). Based on this initial approach, other protocols have been


developed to perform nucleic acid extraction from whole blood samples. They

require small sample volumes (under 1 ml of blood) and are usually performed in a

single tube reaction with different steps and reagents involved. Blood samples could

be lysed using proteinase K and/or incubation at high temperature, and removal of

contaminants is achieved by adding Chelex® 100 resin which precipitates them.

Single stranded DNA is obtained and it remains suspended in the supernatant, which

can be immediately used in downstream application or can be transferred to a new

vial for long-term storage (Butler, 2009; NFSTC, 2006; Walsh et al., 2013).

Selingson et al (Seligson & Shrawder) used anion exchange materials as part of

their invention to isolate nucleic acid samples from a variety of sources, including

whole blood samples. Selingson’s protocol uses a column containing a resin with

positively charged diethylaminoethyl cellulose (DEAE) groups on its surface to bind

negatively charged phosphates of the backbone of DNA. The strength of DNA

binding to column, as well as RNA and other impurities, can be altered through salt

concentrations and pH conditions of buffers used in this nucleic acid isolation

protocol. Contaminants such as protein and RNA can be washed from the DNA-

containing column using medium salt buffers (Seligson & Shrawder; Tan & Yiap,

2009).

4.5.2.3 DNA extraction methods using magnetic beads

Nucleic acid extraction techniques using magnetic separation have been

emerging since the early 90’s. They were originally used to extract plasmid DNA

from bacterial cell lysates by Hawkins et al in 1994 (Hawkins, O'Connor-Morin,

Roy, & Santillan, 1994) and by 2006 Saiyed et al (Saiyed, Bochiwal, Gorasia,

Telang, & Ramchand, 2006; Saiyed, Ramchand, & Telang, 2008) developed and

validated a protocol using naked magnetic nanoparticles for genomic DNA

extraction from whole blood samples.

Magnetic particles are made of one or several magnetic cores, such as

magnetite (Fe3O4) or maghemite (gamma Fe2O3), coated with a matrix of polymers,


silica or hydroylapatite with terminal functionalized groups. In the protocol

developed by Saiyed et al, 30 µl of whole blood are mixed with an equal volume of

1% (w/v) SDS solution. The tube is mixed by inversion two or three times and

incubated at room temperature one minute. 10 µl of magnetic nanoparticles are

added to this mixture, followed by addition of 75µl of binding buffer (1.25 M sodium

chloride and 10% polyethylene glycol 6000). The solution is mixed by inversion and

allowed to rest for 3 minutes at room temperature and the magnetic pellet is

immobilized using an external magnet to discard the supernatant. The magnetic

pellet is washed with 70% ethanol and dried. The magnetic pellet is resuspended in

50 µl of TE buffer and magnetic particles bound to DNA are eluted by incubation at

65ºC in continuous agitation (Saiyed et al., 2006; Saiyed et al., 2008).

5.6 CHOOSING THE APPROPRIATE PROTOCOL

The ideal extraction method should fit the following criteria: It should be

sensitive, consistent, quick and easy to use and depending on the country in which it

is used it may be important to minimize specialized equipment or biochemical

knowledge. It should also pose minimum risk to users, as well as avoid possible

cross-contamination of samples. Finally and most importantly, the DNA extraction

technique chosen should be able to deliver pure DNA samples ready to be used in

downstream molecular applications (Boom et al., 1990; Debomoy K. Lahiri, Bye,

Nurnberger Jr, Hodes, & Crisp, 1992; Price et al., 2009).

The quality and quantity of genomic DNA extracted from blood samples is a

key feature most facilities consider when choosing a protocol. Measuring UV light

absorbance using spectrophotometry at different wavelengths (230, 240, 260 and

280nm) is an initial quick and efficient way of determining purity and concentration

of nucleic acid samples. Concentration is usually calculated from DNA absorbance

reading at 260 nm using Beer-Lambert Law. Purity of nucleic acid samples is

assessed on 260/280 absorbance ratio and values in the range of 1.8 to 2.0 are

generally considered acceptable. 260/230 absorbance ratios between 2.0 and 2.2 are

also considered to be adequate as a secondary measure of purity for DNA


(Huberman, 1995; Sahota, Brooks, & Tischfield, 2007; Sambrook & Russell, 2001;

Troutman et al., 2002).

Lahiri et al (Debomoy K. Lahiri et al., 1992) published a study in 1992 where

they compared ten solution-based extraction methods for DNA extraction using

whole blood as source. They compared a protocol previously developed by their

group (method 10a and 10b) (Debomoy K Lahiri & Nurnberger Jr, 1991), which

required no use of organic solvents or enzyme digestion against nine other methods

previously published and used for DNA extraction from blood. On their study, Lahiri

et al (Debomoy K. Lahiri et al., 1992) extracted whole blood samples from 5

individuals in triplicate using the aforementioned methods. They determined DNA

concentration from samples using spectrophotometry absorbance reading at 260nm

and assessed quality through 260/280 absorbance ratio and electrophoresis on

agarose gel, as well as restriction enzyme digestion and southern blot. A summary of

some of the protocol features, as well as findings, is presented in Table 5-2.

Table 5-2 Summary of comparative study of DNA extraction methods by Lahiri et al

Method Enzyme Digestion

Hazardous Reagents

RNase Treatment

Time (Hours)

260/280 Ratio

DNA Yield (μg)

10a No No No 1 1.81 210 10b No No No 1 1.8 175 6 Yes Yes No 6 1.94 174 9 No Yes No 3 1.7 170 7 Yes Yes No 6 1.81 147 3 Yes Yes Yes 3 1.81 135 2 Yes No Yes 5 1.7 95 8 Yes Yes No Overnight 1.74 130 1 Yes No No Overnight 1.75 116 5 Yes Yes No Overnight 1.72 75 4 Yes Yes No Overnight 1.72 55

Note: Reprinted from J Biochem Biophys Methods 1992;25(4). Lahiri DK, Bye S, Nurnberger Ji Jr,

Hodes Me, Crisp M. A non-organic and non-enzymatic extraction method gives higher yields of

genomic DNA from whole-blood samples than do nine other methods tested. 193–205. Copyright ©

1992, with permission from elsevier. Abbreviations: DNA, deoxyribonucleic acid; RNase, ribonucleic

acid.


All protocols tested were able to isolate DNA with relatively good purity

(260/280 ratios from 1.7 to 1.94), but DNA obtained with methods 2, 5 and 6 showed

different amounts of degradation evidenced in gel electrophoresis. Seven of the

protocols tested, methods 3 to 9, required use of organic solvents and/or hazardous

substances such as phenol: chloroform or chloroform. Methods 1 and 2 did not use

organic compounds, but method 1 was the most time consuming. It required

overnight incubation with proteinase K, a problem solved in protocol 2 by incubating

samples for 30 minutes with both proteinase K and RNase A reducing DNA

extraction time to 5 hours. Method 10 (version a and b) was the quickest of all DNA

isolation methods (1 hour) and it removed enzyme digestion and the use of organic

solvents/hazardous substances. Both versions of this protocol were able to recover

similar or higher DNA yields than the other tested protocols, with about comparable

260/280 ratios. Based on their study findings, Lahiri et al were able to conclude that

the DNA extraction method they developed was the quickest and safest of the

solution-based method tested, recovering DNA of comparable quality and quantity

(Debomoy K. Lahiri et al., 1992).

Abd El-Aal et al compared a combination of manual and automated extraction

methods for DNA extraction from whole blood samples. Their study included six

techniques: phenol-chloroform purification, DNA extraction using microwave

thermal shock, DNA extraction with Wizard Genomic DNA Purification Kit

(Promega Corporation, Madison, WI, USA), magnetic separation (LC MagNA Pure

Compact Instrument – Roche Diagnostics) both manually and partially automated

using the Precision™ XS Microplate Sample processor from Biotek Instruments

(Vermont, USA) and finally they modified the Wizard SV 96 Genomic DNA

Purification Kit (Promega) combining it with magnetic separation using the MagNA

Pure purification method. They extracted 96 blood samples and used 100 μl as the

initial volume. However, they failed to mention how many samples were extracted

using each method and the number of experimental replicates performed. Their

results showed DNA extracted for each protocol with final concentrations ranging

from 0.50 to 0.98 μg/μl. While phenol-chloroform, manual magnetic separation and

the combined Promega-MagNA pure method isolated DNA showed relatively similar

DNA concentrations (0.72 to 0.79 μg/μl); magnetic separation and microwaving


technique achieved the highest and lowest DNA concentrations, 0.98 μg/μl and 0.50

μg/μl respectively. In their comparisons, they established five categories for

simplicity of extraction: extremely simple; simple; less simple; more simple and

difficult; but their system can be confusing since they failed to present criteria used

for each category. However, they categorized phenol/chloroform as difficult and

automated MagNA Pure as extremely simple using their previously mentioned

category system. They also have five categories for cost of each protocol, with

automated MagNA Pure technique as the most expensive and microwaving as the

cheapest method. Their costing categories were also not defined and there is no

actual mention of specific costs for each method in their study. Based on previously

mentioned findings, they concluded that magnetic separation using an automated

protocol performed best in terms of simplicity of extraction, purity of extracted DNA

and speed, even though it is the one with the highest cost. They also concluded that it

was essential to optimize any method chosen and they recommended the use of

magnetic separation, because it required minimal starting material and it was both

cost-effective and user-friendly. However, as previously stated there is no mention of

how cost-effectiveness was determined (Abd El-Aal et al., 2010).

Lee et al (J.-H. Lee, Park, Choi, Lee, & Kim, 2010) extracted DNA from 22

whole blood samples using three automated extraction systems. All three protocols

compared were based on solid phase extraction techniques: QIAamp® Blood Mini

Kit (Qiagen, Hilden, Germany) with QIAcube, which uses a silica membranes and

resins within a spin column to bind DNA and two other protocols that are based on

magnetic based DNA isolation techniques MagNA Pure LC Nucleic Acid Isolation

Kit I with MagNA Pure LC (Roche Diagnostics, GmbH, Manheim, Germany) and

Magtration-Magnazorb DNA common kit-200N with Magtration System 12GC

(Precision System Science (PSS) Co. Ltd., China, Japan)). DNA concentration was

measured by spectrophotometry and purity was assessed by 260/280 ratio, DNA

electrophoresis on agarose gel and PCR. Statistical analysis was performed to

validate study results and they are summarized in Table 5-3.


Table 5-3 Summary of results from comparative study on DNA extraction techniques by Lee et al


Mean Corrected

DNA Concentration (ng/µl) ± SD

DNA Concentration

range (min-max)

Mean 260/280

Ratio ± SD

260/280 Range (min-max)

QIAamp® Blood Mini Kit (Qiagen, Hilden, Germany) with QIAcube

25.42 ± 8.82

13.49 - 52.85

1.84 ± 0.09

1.59 - 2.04

MagNA Pure LC Nucleic Acid Isolation Kit I with MagNA Pure LC

22.66 ± 7.24

9.59 - 46.70

1.88 ± 0.08

1.6 - 1.97

Magtration-Magnazorb DNA common kit-200N with Magtration System 12GC

22.35 ± 6.47

12.57 - 35.08

1.70 ± 0.08

1.56 -1.90

Note: Reproduced from Lee J-H, Park Y, Choi JR, Lee eK, Kim H-S. Comparisons of three automated

systems for genomic DNA extraction in a clinical diagnostic laboratory. Yonsei Med J.

2010;51(1):104–110.49

Abbreviations: DNA, deoxyribonucleic acid; min, minimum; max, maximum; SD, standard deviation.

Results showed no statistical difference between DNA concentrations obtained

among the three commercial methods, but DNA purity was slightly lower for

Magtration-Magnazorb DNA common kit-200N when compared to the other two

methods. DNA extracted was of similar quality based on results from PCR and

electrophoresis on agarose gel. Therefore, they concluded that effectiveness for all

systems was equivalent and they all produced acceptable nucleic acid isolation (J.-H.

Lee et al., 2010).

Table 5-4 was adapted from a review published by Carpi et al in 2011 (Carpi

et al., 2011), where they reviewed DNA extraction methods used in a wide range of

biological sources, including 6 methods used on whole blood samples. Summary of

the three evaluated features for nucleic acid isolation methods, such as use of toxic

compounds, cost per sample and time required.


Table 5-4 Summary of features evaluated for deoxyribonucleic acid (DNA) extraction methods

reviewed by Carpi et al.1


Toxic Compounds

Cost estimate per sample ($)

Time Required (hours)

Phenol-Chloroform method

Phenol, Chloroform

<5

>3

Silica gel method Phenol, Chloroform

<5

>3

Benzyl alcohol method

Benzyl alcohol <5 >3

Salting out method None <5 >3 Magnetic bead-based

method None >5

>0.5

Note: Copyright © 2011. Bentham Science Publishers. Carpi FM, Di Pietro F, vincenzetti S,

Mignini F, Napolioni v. Human DNA extraction methods: patents and applications. Recent Pat DNA

Gene Seq. 2011;5(1):1–7. Reprinted by permission of eureka Science Ltd.1

Based on the methods included in Table 5-4, it can be noted that magnetic

bead-based method is the quickest DNA extraction protocol requiring over 30

minutes to be performed, whereas all the other protocols require more than 3 hours.

Also, the magnetic bead-based method is the most expensive one costing more than

$5 per sample extracted. However, there is no mention in this review of the

differences when comparing quality and quantity of nucleic acid isolated for each

method (Carpi et al., 2011).

Chacon-Cortes et al evaluated cost effectiveness and time efficiency of three

available DNA extraction techniques from whole blood samples: a traditional salting

out method, a modified salting out method and a commercially available kit based on

solid-phase DNA extraction method (QIAGEN© QIAamp® DNA Blood Maxi Kits

(QIAGEN Pty Ltd, Victoria, Australia) (Chacon-Cortes et al., 2012). Modified

salting out protocol (Nasiri et al., 2005) replaced the sample overnight incubation

step from the traditional salting out method (Miller et al., 1988), required for

contaminant removal using proteinase K, with the use of laundry detergent to reduce

time of extraction to about one hour. 5 ml of whole blood from 6 breast cancer

patients were manually extracted using each protocol and techniques were compared


in terms of quality and quantity of DNA extracted, as well as required cost and time.

DNA quantity was measured using spectrophotometry and DNA quality was

assessed by both 260/280 ratio and agarose gel electrophoresis of PCR product. A

summary of findings is presented in Table 5-5 (Chacon-Cortes et al., 2012).

Table 5-5 Summary of results from comparative study of DNA extraction methods by Chacon-Cortes

et al


Average Final gDNA Yield

(μg)

Average 260/280 Ratio

Cost estimate per sample

(AUD)

Time Required (hours)

Traditional Salting Out Method

32.65 1.75 3.65 Overnight

Modified Salting Out Method

40.47 1.75 1.9 1

QIAGEN© QIAamp® DNA Blood Maxi Kit

61.86 2.02 12.3 1

Note: Copyright © 2012. Springer. Mol Biol Rep. 2012;39:5961–5966. Comparison of genomic DNA

extraction techniques from whole blood samples: a time, cost and quality evaluation study. Chacon-

Cortes D, Haupt L, Lea R, Griffiths L. Reproduced with kind permission from Springer Science and

Business Media.50

Abbreviations: AUD, Australian dollars; gDNA, genomic deoxyribonucleic acid.

As shown in Table 5-5 QIAGEN© QIAamp® DNA Blood Maxi Kits produced

the highest yield and 260/280 ratio from all methods evaluated with average measure

of 61.86 µg and 2.02 respectively, but traditional and modified salting out protocols

both produced similar results in terms of DNA yield and 260/280 ratios. However,

statistical analysis using ANOVA showed that DNA yield and 260/280 ratio results

were not significantly different for all 3 methods (P-value of 0.110 and 0.05

respectively). On the other hand, the traditional salting out method required an

overnight incubation step and the QIAGEN© QIAamp® DNA Blood Maxi Kit was

the most expensive of all the methods included costing about 12.30 Australian

dollars, almost 7 times than the modified salting out protocol. Therefore, based on

these findings the modified salting out protocol developed by Nasiri et al (Nasiri et

al., 2005) proved to be to be the most cost-effective and time-efficient technique to

isolate gDNA from whole blood samples (Chacon-Cortes et al., 2012).


Unfortunately, no other solid phase extraction methods for DNA extraction were

included in this study.

5.7 CONCLUSIONS

DNA extraction has evolved for the past 145 years and it has developed into a

diversity of laboratory techniques. This review highlights currently available

methods for DNA extraction from whole blood samples and it summarizes

comparison studies using different nucleic acid extraction approaches published to

date. DNA extraction has evolved from solution and solid-phase manual techniques

initially performed manually into incorporating these into automated methods. There

is no consensus on a gold standard method for DNA extraction from whole blood

samples and they all differ in many different aspects. Studies comparing extraction

techniques and highlighting their strengths and weaknesses are limited and to our

knowledge there is no publication that evaluates all approaches in terms of all

possible features. Therefore, it is very difficult to determine the best choice available.

Facilities around the world usually choose a method based on the availability of

equipment, sample and reagents, as well as considering speed, extraction efficiency

and quality, technical requirements and cost, but based on our review findings there

is not enough scientific evidence to support these choices


Chapter 6: Association of the microRNA-SNP rs2910164 in MIR146A with sporadic breast cancer susceptibility: a case control study.





field of expertise;




4. potential conflicts of interest have been disclosed to (a) granting bodies, and;



requirements.


Upadhyaya A, Smith RA, Chacon-Cortes D, Revêchon G, Haupt LM, Chambers SK,

Youl PH and Griffiths LR. Association of the microRNA-single nucleotide

polymorphism rs2910164 in miR146a with sporadic breast cancer susceptibility: a

case control study.

Gene (Under revision).

Chapter 6: Association of the microRNA-SNP rs2910164 in MIR146A with sporadic breast cancer susceptibility: a case control study. 126

DOI:10 .1016/j.gene.2015.10.019

http://www.sciencedirect.com/science/article/pii/S0378111915012275


Akanksha Upadhyaya Performed experimental work on first breast cancer

cohort, prepared and approved final version of the

manuscript

Robert A. Smith Assisted in assay design and optimisation and

critically reviewed and approved final version of the

manuscript.

Diego Chacon-Cortes

(Candidate)

Performed genotyping on replication population and

assisted in sequencing, critically reviewed and


Gwladys Revêchon Participated in experimental work and assisted with

DNA extraction for populations.

Larisa M. Haupt Assisted in assay design, provided advice on

manuscript editing, critically reviewed and approved


Suzanne K. Chambers Administered recruitment of the replication population

and approved final version of the manuscript.

Philippa H. Youl Assisted in recruitment of the replication population

and provided organisation for replicate data entry for

control matching and approved final version of the

manuscript

Lyn R. Griffiths Provided advice on study design, laboratory oversight,

recruited the primary population, critically reviewed

and approved the final draft of the manuscript.





6.2 ABSTRACT

Background: Breast Cancer (BC) is primarily considered a genetic disorder

with a complex interplay of factors including age, gender, ethnicity, family history,

personal history and lifestyle with associated hormonal and non-hormonal risk

factors. The SNP rs2910164 in MIR146A (a G to C polymorphism) was previously

associated with increased risk of BC in cases with at least a single copy of the C

allele in breast cancer, though results in other cancers and populations have shown

significant variation. Methods: In this study, we examined this SNP in an Australian

sporadic breast cancer population of 160 cases and matched controls, with a replicate

population of 403 breast cancer cases using High Resolution Melting. Results: Our

analysis indicated that the rs2910164 polymorphism is associated with breast cancer

risk in both a primary and replicate population (p= 0.03 and 0.0013, respectively). In

contrast to the results of familial breast cancer studies, however, we found that the

presence of the G allele of rs2910164 is associated with increased cancer risk, with

an OR of 1.774 (95% CI 1.402-2.237). Conclusions: The microRNA MIR146A has a

potential role in the development of breast cancer and the effects of its SNPs require

further inquiry to determine the nature of their influence on breast tissue and cancer.

6.3 BACKGROUND

Breast cancer (BC) is the leading cause of death amongst Australian women

with risk increasing with age and with an average age of diagnosis of 60 years

(Australia, 2011). BRCA1 and BRCA2 are the two most commonly associated genes

in familial BC. With familial BC cases comprising only 5-10% of all BC cases,

sporadic BC cases have a much higher incidence rate (Anderson, 1991). Causes of

sporadic BC are not yet clearly understood and it is regarded as the more complex

form of the disease. miRNAs are a highly conserved class of small (~22 nucleotides),

endogenous, non-protein coding RNAs recently found to play a role not only in BC

but also in various other cancers. These miRNAs bind to target mRNA regions where

they can both inhibit transcription or initiate mRNA cleavage (Bartel, 2004; Lynam-

Lennon et al., 2009). Depending on target interaction, miRNAs can be classified as

128Chapter 6: Association of the microRNA-SNP rs2910164 in MIR146A with sporadic breast cancer susceptibility: a case control study.

either cancer causing (oncogenic) or tumour-suppressing (non-oncogenic).

Polymorphisms in miRNAs potentially alter these interactions by interfering,

generating or removing miRNA-binding sites as well as affecting expression of

miRNAs. Thus, these polymorphisms have the potential to significantly alter how the

miRNA interacts with its target/s and its effect on cellular biology. Consequently, the

identification of these SNPs and their association with disease susceptibility could be

used for improved diagnostic and therapeutic strategies.

MIR146A is a miRNA located on human chromosome 5 at locus 5q34 and has

been linked with BRCA1/BRCA2 activity, as these are among its potential targets.

The SNP rs2910164 is located in the middle of the stem hairpin in the miRNA and

leads to a G:U pair mismatch (instead of C:U) in the structure of MIR146A. In a

study by Shen et al (Shen et al., 2008), this polymorphism was linked with age of

diagnosis in breast and ovarian cancer patients and was the first research to show

association of MIR146A with breast cancer. The study reported that the presence of

at least one C allele lowered the age of diagnosis in BC patients with the level of

expression of mature MIR146A 60% higher in C allele carriers (n= 124 white

Caucasian women, 42 diagnosed with familial BC, 82 diagnosed with ovarian

cancer). In addition the authors identified an association of MIR146A with

BRCA1/BRCA2 by confirming that MIR146A could directly bind to the 3’UTR of

both genes and therefore play a role in regulating their expression. Interestingly, a

similar study conducted by Hu et al in 2009 on Han Chinese familial BC cases

(n=1009), produced contradictory results. In their study, although a much larger

cohort was examined, no association of the MIR146A SNP with age of BC diagnosis

was demonstrated; suggesting the effect of genotype or alleles for the SNP

rs2910164 may be population specific or modulated by environmental factors (Z. Hu

et al., 2009). In 2010, Pastrello et al conducted a similar study on Italian patients

diagnosed with familial BC and negative for BRCA1/BRCA2 mutations (n=101) to

ascertain if rs2910164 had any association with age of diagnosis (Pastrello et al.,

2010). Their results confirmed those by Shen et al and contradict those found by Hu

et al (Z. Hu et al., 2009; Shen et al., 2008). In addition, a study by Alshatwi et al in

Saudi-Arabian breast cancer patients indicated that while miR146a expression was

significantly altered in breast cancer patients compared to controls, rs2910164 did


not appear to be associated with breast cancer risk, though their population was only

100 individuals and the ages of some of their patients indicated a potential familial

aspect in these cases (Alshatwi et al., 2012). Due to the variations in these studies

and the lack of research in a purely sporadic population of breast cancer patients in

Caucasians, we examined this SNP in a case-control Australian Caucasian sporadic

breast cancer population.


6.4.1 Study Cohort

The study population (GRC-BC population) consisted of 160 Caucasian

Australian females, who were recruited on a voluntary basis with the criteria of being

diagnosed with sporadic breast cancer, showing no family history of the disease.

These women were matched for age (±5 years), Caucasian ethnicity and sex to a

control cohort of healthy individuals likewise showing no family history of breast or

related cancers. The case and control populations had average ages of 58±12 and

57.5±11.6 years, respectively. Case samples were recruited from the Gold Coast

Hospital, Southport, Australia, inviting all women with breast cancer attending to

participate. Control samples were recruited by the Genomics Research Centre Clinic,

Southport, Australia by invitation for healthy women from the community to

volunteer for the research.

The secondary replication population consisted of 403 sporadic breast cancer

patient samples obtained from the Griffith University-Cancer Council Queensland

Breast Cancer Biobank (GU-CCQ BB). These individuals were all those currently

available, recruited on a prevalence basis, but with confirmed diagnosis on the

Queensland Cancer Registry. The secondary population consisted of an additional 89

healthy controls, representing the currently available healthy women volunteers,

recruited and matched on a first-available-match basis to one of the cases as for the


primary population (age (±5 years), ethnicity and sex). Biobank samples and controls

had average ages of 61.09±25.09 and 60.2±28.2 years, respectively.

The project was carried out with the approval of the Griffith University Human

Research Ethics Committee, approval numbers: MSC/07/08/HREC and

PSY/01/11/HREC. All participants supplied informed written consent.

6.4.2 Genotyping

The candidate SNP in this study was identified using dbSNP

(http://www.ncbi.nlm.nih.gov/projects/SNP). Primers were designed using Primer3

(http://frodo.wi.mit.edu/) and checked for specificity through BLAST

(http://blast.ncbi.nlm.nih.gov). The SNPs were genotyped using high resolution melt

(HRM) analysis on a RotorGene-Q (Qiagen, Australia). The primer sequences used

were: forward 5’-TTACAGGGCTGGGACAG-3', reverse-5' -

CCTCAAGCCCACGATGA-3'. PCR amplification and HRM analyses were carried

out in a 12µL reaction containing: 2µL of 20ng/µL genomic DNA, 1µL of 10µM

each forward and reverse primers, 2.3µL of 25mM MgCl2, 0.8µL of 5µM dNTPs,

2.4µL of 5X GoTaq Buffer, 0.1µL of 5Units/µL GoTaq and 1.75µL dH2O. PCR

conditions were: 95oC for 1 minute, followed by 40 cycles of 95oC for 15 seconds,

62oC for 15 seconds and 75oC for 30 seconds, followed by 7 minutes at 72oC.

Melting analysis was carried out between 78oC and 86oC with data collected every

0.1oC. Two negative controls and two positive controls (Homozygous G and

Heterozygous CG) were included in each run to ensure a maintained high quality of

genotyping. Each sample was performed in duplicate. Sanger Sequencing (DNS) was

used on a random selection of four of each genotype in each population for

validation of genotypes identified by HRM.

Following HRM, the number of samples that were able to be accurately

genotyped was 143 cases and 157 controls for the GRC-BC population and the full

403 cases and 89 controls for the GU-CCQ BB population. Samples which were


unable to provide an unambiguous genotype result or would not amplify in PCR for

direct sequencing were discarded.

6.4.3 Statistical Analysis

Analysis was undertaken using chi-square analysis in the SPSS statistics

package. Analysis was undertaken using the Chi-square test of independence and

Odds Ratios and 95% confidence intervals were calculated from the combined results

of both populations. All populations also underwent testing for Hardy Weinberg

equilibrium. The α was set at 0.05.

6.5 RESULTS

Following detection by HRM and sequencing validation, we obtained and

compared the observed genotypic and allelic frequencies for rs2910164 between the

case and control populations. There were some shifts in genotype frequencies

between the case and control populations, with the C allele approximately 10% rarer

in the case population, reflected in decreases in the CC and GC genotypes in cases

compared to controls, as seen in Table 6-1. Hardy-Weinberg analysis indicated that

both populations were in Hardy-Weinberg equilibrium. Chi square analysis indicated

that the genotype differences were not significant, but that allelic frequencies

between the populations were significantly altered (p= 0.11 and 0.03 respectively).

Table 6-1 Genotype and allele frequencies for GRC-BC population

Genotypes Alleles GG GC CC G C Cases 80 49 14 209 77 (n=143) (55.9%) (34.3%) (9.8%) (73.1%) (26.9%) Controls 70 63 24 203 111 (n=157) (44.6%) (40.1%) (15.3%) (64.6%) (35.4%) Allele Analysis x2 = 4.95 P = 0.03


Because the primary breast cancer population (GRC-BC) showed a significant

allelic association between rs2910164 alleles and breast cancer susceptibility, we

undertook genotyping in our secondary population (GU-CCQ BB). The results of

this genotyping showed a similar trend, with the C allele being approximately 10%

rarer in the case population compared to the controls, with both CC and GC

genotypes rarer in the case population compared to controls (Table 6-2). As for the

initial population, both case and control populations for the secondary analysis were

in Hardy-Weinberg equilibrium. Chi square analysis on the secondary association

population indicated that the differences between cases and controls were significant

for both genotype and allelic frequencies, at p= 0.00088 and 0.0013, respectively.

Table 6-2 Genotype and allele frequencies for GU-CCQ BB population

Genotypes Alleles GG GC CC G C Cases 245 144 14 634 172 (n=403) (60.8%) (35.7%) (3.5%) (78.6%) (21.4%) Controls 42 36 11 120 58 (n=89) (47.2%) (40.4%) (12.4%) (67.4%) (32.6%) Allele Analysis x2 = 10.29 P = 0.00134

Since both populations showed significant association with breast cancer

susceptibility, we undertook odds ratio calculations to determine the magnitude of

the effect. All three models tested (G vs. C for additive, GG vs. GC+CC for recessive

and GG+GC vs. CC for dominant inheritance models) showed increases in breast

cancer risk associated with a higher dose of the G allele, with the G vs. C model

showing the most precise odds estimate, with an OR of 1.774 (95% CI 1.402 -

2.237). A summary of combined statistical analysis for both populations can be

found in Table 6-3 and the complete Odds Ratios can be found in Table 6-4.


Table 6-3 Combined primary and secondary population genotype and allele frequencies and analysis

results.

Genotypes Alleles GG GC CC G C Cases 325 193 28 843 249 (n=546) (59.5%) (35.3%) (5.2%) (77.2%) (22.8%) Controls 112 99 35 323 169 (n=246) (45%) (40.2%) (14.2%) (65.7%) (34.3%) Genotype Analysis x2 = 24.78 P = 0.000004 Allele

Analysis x2 = 23.28 P = 0.000001

Table 6-4 Odds Ratios for combined population analysis results.

Odds Ratios OR 95% CI G vs. C model 1.774 1.402 - 2.237 GG vs. GC and CC model 1.8428 1.3584 – 2.4999 GG and GC vs. CC model 3.0687 1.8206 - 5.1725

6.6 DISCUSSION

Our results indicate that the presence of the G allele for rs2910164 increases

the risk for sporadic breast cancer in a Caucasian population, confirmed through

analysis in two independently collected populations. These results show some

deviation from the expected frequency of the C allele in the general population

compared to reported frequencies for the rs2910164 SNP in Caucasian populations.

The reported allelic frequency as per the 1000 Genomes Project in Europeans is G-

78.2% and C-21.8%, while HapMap gives frequencies of G-76.5% and C-23.5%.

Data from other populations in both the 1000 Genomes Project and HapMap,

however, show that rs2910164 varies significantly in different ethnic groups, and in

Asians, the most common allele is C, with frequencies of C-60% and G-40%. More

importantly for this research, a subset breakdown within the European population

shows that Italians have an increased frequency of the C allele compared to other

European groups, at C-27% and G-73%. It thus seems possible that our elevated C

allele count is the result of a contribution from Italian and related Caucasian

subgroups, especially given that sequencing of HRM analysed samples showed

confirmation of the genotype calls in both cancer and control populations.


The results also show some variation form the implications of previous

research on rs2910164 in breast cancer, though those studies have also shown

conflicting results. In 2008, Shen et al identified an increased risk of breast cancer

amongst patients who had a single copy of the C allele and that the binding capacity

of MIR146A to BRCA1 was significantly higher in vitro when cells were transfected

with C allele mimics as compared to the G allele (Shen et al., 2008). These results

led them to draw the conclusion that the CC or CG variant of the rs291016

polymorphism increases susceptibility to familial BC. A similar study conducted by

Hu et al in 2009 found contrasting results (Z. Hu et al., 2009). The Hu et al study

involved 1009 familial breast cancer patients and 1093 healthy controls from a Han

Chinese population. In this study, no association of miR146a with increased risk of

familial breast cancer was observed. More recently, a study conducted by Catucci et

al in 2010 on 1894 familial breast cancer patients and 2760 healthy controls from

German and Italian populations also demonstrated no association with increased risk

of familial breast cancer and miR146a (Catucci et al., 2010). The discrepancy

between the results obtained from previous studies, including the current study may

be due to the varied ethnic backgrounds investigated, however, the observed

differences are more likely a result of fundamental differences in the type of breast

cancer addressed in the population cohorts examined.

This difference is that each of the previous breast cancer association studies

have been carried out in populations of individuals with familial cancer, most of

whom show symptoms of BRCA pathway malfunction, but lack BRCA1 or BRCA2

mutations. The suggested association of MIR146A with increased risk of breast

cancer by Shen et al in 2008 was a result of a study conducted on patients with

BRCA1/BRCA2 mutations, indicating familial breast cancer where these genes are

known to play an essential role (Shen et al., 2008). In addition, the study by Pastrello

et al where BRCA-negative patients were examined supports the data observed by

Shen et al for the effects of the C allele on age of onset (Pastrello et al., 2010; Shen

et al., 2008). Similar to the study by Shen et al and Alshatwi et al, the Pastrello et al

study cohort was a much smaller population when compared to the studies conducted

by Hu et al and Catucci et al (Catucci et al., 2010; Shen et al., 2008). Finally, a study Chapter 6: Association of the microRNA-SNP rs2910164 in MIR146A with sporadic breast cancer susceptibility: a case control study. 135

on 1166 BRCA1 mutation carriers and 560 BRCA2 mutation carriers from France

and America (no ethnicity specified) by Garcia et al failed to find an association

between rs2910164 and cancer risk in these patients. MIR146A has thus been

variably associated with BRCA1/BRCA2 familial BC risk, age of onset and age of

diagnosis in these studies. Meta-analyses on the BC studies to date indicate that both

ethnicity and gender is a factor in the effect of rs2910164 on cancer susceptibility

(Amandine I. Garcia et al., 2011; A. Wang et al., 2012). This would make sense in

the context of recent results from Omrani et al. who tested the rs2910164 SNP in an

Iranian breast cancer population and failed to find any association with the alleles

and breast cancer (Omrani et al., 2014).

The conflicting results of previous studies may arise through differential effects

of the SNP in the background of a familial impairment of BRCA1 and 2 associated

pathways compared to non-impairment of these pathways in sporadic breast cancer

patients. If BRCA associated pathways are poorly functional in familial breast cancer

patients, then the effect of the increased BRCA1/BRCA2 binding for the C allele of

rs2910164 may result in greater loss of cell growth regulation. By comparison, in

sporadic breast cancer the increased binding of the C allele may be compensated for

by upregulation of the precise BRCA related mechanisms that are impaired in

familial patients. This would lead to a situation where the control populations used to

determine association for the familial populations are subject to allelic pressure in the

same direction, partially masking the effect of the SNP on risk in these patients.

There is also support for differential biological effects for the SNP from data

gathered in other cancer types. In head and neck cancers, colorectal cancer, acute

lymphoblastic leukaemia and H. pylori infection related dysplasia, the C allele of

rs2910164 has been associated with increased risk of development (Chae et al., 2013;

Hasani et al., 2014; Lian et al., 2012; Ma et al., 2013; Orsós et al., 2013; Song et al.,

2013). By contrast, the G allele has been associated with increased risk of

hepatocellular carcinoma and prostate cancer, while melanomas showed a more

complex relationship where heterozygotes had increased risk of cancer development,

but cells carrying the G allele had increased invasive and proliferative characteristics

(B. Xu et al., 2010; Yeqiong Xu et al., 2013; Yumin Xu et al., 2013; Yamashita et


al., 2013). These differences add weight to the implications from the breast cancer

studies suggesting that the effect of the polymorphism is modulated by genetic and

environmental factors, probably due to the relative expression of its target mRNAs

and factors affecting its own expression. In addition, there may be significant effects

on the effect of miR146a through the presence or absence of BRCA1 or BRCA2

mutations, epistatic effects from other SNPs, or specific environmental triggers

common to closely related individuals compared to more sporadic population

cohorts. These in turn may influence the overall impact of the rs2910164 SNP on

breast cancer susceptibility, despite its observed functional effects.

6.7 CONCLUSIONS

Our analysis indicates that the rs2910164 SNP in MIR146A has a significant

effect on breast cancer susceptibility. Given the interaction with BRCA pathways, the

potential role of other polymorphisms (inferred from the role of ethnicity) and the

variable results of research on rs2910164 in breast cancer, further analyses on this

polymorphism should take into account the potential for familial versus sporadic

breast cancer effects, and should carefully choose controls to establish the effects of

the SNP more precisely in the target populations. This latter point may be a

limitation on our study, as ethnic variation of the SNP differs strongly and may have

affected our frequency data and obscured the true strength of the polymorphism in

breast cancer risk. Leading on from this, our relatively low population size may also

be affecting overall accuracy of the study, especially in the lack of controls available

for the secondary population. As a result, research using sporadic breast cancer

populations in larger cohorts of Caucasians and other ethnicities should be

undertaken to confirm the role of rs2910164 in cancer risk in the general population.


Chapter 7: Genetic association analysis of miRNA SNPs implicates MIR145 in breast cancer susceptibility.





field of expertise;







requirements.


Chacon-Cortes D, Smith RA, Rod LA, Youl PH and Griffiths LR. Genetic

association analysis of miRNA single nucleotide polymorphisms (miR-SNPs) in

Australian Caucasian breast cancer case control cohorts.

BMC Medical Genetics. (Under revision)


Diego Chacon-Cortes

(Candidate)

Performed experimental work, contributed to

experimental design, research plan and data analysis,

Chapter 7: Genetic association analysis of miRNA SNPs implicates MIR145 in breast cancer susceptibility. 138


prepared and approved final version of the manuscript.



manuscript.

Rodney A. Lea Assisted in statistical analysis, critically reviewed and





manuscript








7.2 ABSTRACT

MicroRNAs (miRNAs) are important small non-coding RNA molecules that

regulate gene expression in many important cellular processes related to the

pathogenesis of cancer. Genetic variation in miRNA genes could impact their

synthesis and cellular effects and single nucleotide polymorphisms (SNPs) are one

example of genetic variants previously studied in relation to breast cancer. Studies

aimed at identifying miRNA SNPs (miR-SNPs) associated with breast malignancies

could be the initial step towards further understanding of the disease and in

developing clinical applications for early diagnosis and treatment. In this study we

genotyped a panel of 24 miR-SNPs using multiplex PCR and chip-based matrix

assisted laser desorption ionization time-of-flight (MALDI-TOF) mass spectrometry

(MS) analysis in our Australian Caucasian breast cancer case control populations.

Statistical analysis of our genotyping results showed six of these miR-SNPs to be

non-polymorphic and twelve of our selected miR-SNPs to have no association with

breast cancer risk. However we were able to show association between rs353291

(located in MIR145) and the risk of developing breast cancer in two independent case

control cohorts (p = 0.041 and p=0.023). Our study is the first to report an

association between a miR-SNP present in MIR145 in individuals of Caucasian

background and breast cancer risk. This finding requires further validation through

genotyping of larger cohorts or in individuals of different ethnicities to determine the

potential significance of this finding as well as studies to determine functional

significance.

7.3 BACKGROUND


of about 21 to 25 nucleotides in length involved in negative regulation of gene

expression. They account for about 1% of the human genome and they were initially

identified in Caenorhabditis elegans (R. C. Lee et al., 1993; Reinhart et al., 2000).

They are transcribed from long microRNA primary transcripts (pri-miRNAs) that

contain miRNA precursors (pre-miRNA), which are stem-loop molecules of about 55

140 Chapter 7: Genetic association analysis of miRNA SNPs implicates MIR145 in breast cancer susceptibility.

to 70 nucleotides in length. Pre-miRNA are usually generated via the canonical

pathway, however a small group are produced through splicing/debranching

machinery in the “miRtron pathway”, or non-canonical pathway (Westholm & Lai,

2011). In the canonical pathway, pri-miRNA are cleaved by the complex formed by

the nuclear ribonuclease (RNase) III DROSHA and the RNA-binding protein

DCGR8 (DiGeorge syndrome critical region gene 8) (Yoontae Lee et al., 2003). Pre-

miRNAs are transported to the cytoplasm by XPO5 (exportin5) and the nuclear

protein ran-GTP and they are processed by DICER1, a RNase type III enzyme, and

double-stranded RNA binding domain (dsRBD) proteins TARBP2 and/or PKRA

(Chendrimada et al., 2005; Yi et al., 2003). This process results in the production of

a duplex molecule that contains both single-stranded mature miRNA sequence and a

complementary strand named miRNA*. The miRNA* is released and degraded while

the mature miRNA is loaded into the RNA-induced silencing complex (RISC) ,

where it is merged into the Argonaute/EIFC2C (Ago) proteins to mediate target

messenger RNA (mRNA) recognition and interaction. The mRNA target is

recognised by base pairing of the second to eighth nucleotides in the 5’-end of the

miRNA (Seed region) and the complementary sequence mainly located in the 3’

untranslated region (3’-UTR) region of the mRNA, but binding sites may also be

found in the coding region (Khvorova et al., 2003; Y. Lee et al., 2002). Each

miRNA may bind to up to 200 gene targets and each gene could have multiple

binding sites for different miRNAs, making a highly complex web of interactions

that may have extensive effects on a variety of biological pathways (Bartel, 2009;

Rajewsky, 2006).

miRNAs are very likely to play an important role in cancer biology because

they are involved in regulation of important cellular processes like cell growth,

differentiation and cell survival (Bartel, 2004; Thalia A. Farazi et al., 2011; Robbins

et al., 2010; Zuoren Yu et al., 2010; Baohong Zhang, Pan, et al., 2007; Baohong

Zhang, Wang, & Pan, 2007). Some of the mechanisms that can result in changes in

miRNA synthesis and expression in relation to cancer include: point mutations in

miRNA and mRNA sequences, loss or mutation in the promoter regions for specific

miRNA clusters, epigenetic changes and alterations in pathway related to dsRBD

proteins (Ha & Kim, 2014; Kim, 2005; Lynam-Lennon et al., 2009). Single


nucleotide polymorphisms (SNPs) in miRNA genes (miR-SNPs) are one example of

point mutation (albeit one occurring in the past) that could affect miRNA function in

one of three possible ways: altering transcription of the primary miRNA transcript;

altering processing of the pri-miRNA and pre-miRNA and; through their effects on

modulation of miRNA-mRNA interactions(S. Duan, Mi, Zhang, & Dolan, 2009; H.-

C. Lee et al., 2011; Ryan et al., 2010; G. Sun et al., 2009). As a result miR-SNPs

have been associated with different types of cancer, including chronic lymphocytic

leukaemia, gastric, lung and thyroid carcinoma (Calin & Croce, 2006; Z. Hu et al.,

2008; Jazdzewski et al., 2008; Q. Sun et al., 2010).

Breast cancer is the second most commonly diagnosed cancer around the world

(1.67 million new cases, 25%) and the most common type of cancer for women in

developed countries (793,684 new cases, 28%) according to the International Agency

for Research on Cancer (IARC) (Ferlay et al., 2013). Breast cancer is also the most

common cause of cancer-related deaths for women in the world (521,817, 14.7%)

and it ranks second for women in developed countries amongst all cancers

deaths(Ferlay et al., 2013). There is evidence of the role that miR-SNPs play in

relation to breast cancer. Hu et al. showed that the presence of mutant alleles of

MIR196A2 rs11614913 and MIR499A rs3746444 significantly increased breast

cancer susceptibility in Chinese women (Z. Hu et al., 2009). However in the genetic

association analysis of Caucasian populations and functional studies in breast cancer

cell lines performed by Hoffman et al. the presence of mutation in MIR196A2

rs11614913 was significantly associated with reduced risk of breast cancer, as well

as less efficient processing of MIR196A2 and reduced capacity to regulate target

genes, indicating additional factors may be at work in this SNP’s effect on breast

cancer (Hoffman et al., 2009). Kontorovich et al. found 2 SNPs, rs6505162 and

rs895819 located in MIR423 and MIR27A precursors respectively, to be significantly

associated with decreased risk of breast cancer in BRCA2 mutation carriers from a

Jewish population (Kontorovich et al., 2010). rs895819 was also significantly

associated with reduced risk of developing breast cancer in families with a history of

non-BRCA related breast cancer in a later study performed by Yang et al (Rongxi

Yang et al., 2010). Finally rs2910164, a miR-SNP located in the 3p strand of


MIR146A, was found to be associated with a younger age of diagnosis in familial

breast cancer for BRCA1 mutation carriers (Shen et al., 2008).

This study investigated a panel of miR-SNPs present in miRNA genes

previously identified to be involved in the pathophysiological mechanisms of breast

cancer. Our initial selection included 24 variants located in 9 miRNA genes or in

close proximity to them and these were genotyped using multiplex PCR and matrix-

assisted laser desorption/ionisation time-of-flight mass spectrometry (MALDI-TOF

MS) analysis. Genotyping results of the 19 miR-SNPS successfully genotyped and

included in this study identified six previously reported microRNA SNPs as non-

polymorphic. From the remaining polymorphisms twelve miR-SNPS showed no

significant differences between cases and controls and one variant in MIR145 was

identified to be associated with breast cancer susceptibility in both of our Australian

Caucasian breast cancer case-control populations.


7.4.1 Study Populations

Two independent Australian Caucasian case control populations were available

for our study: The Genomics Research Centre Breast Cancer (GRC-BC) population

and part of the Griffith University-Cancer Council Queensland Breast Cancer

Biobank (GU-CCQ BB). We conducted single nucleotide polymorphism genotyping

in the GRC-BC population initially. It consisted of DNA samples from 173 breast

cancer patients from South East Queensland and DNA samples from 187 healthy age

and sex matched females with no history of personal or familial cancer collected at

the Genomics Research Centre Clinic, Southport, with research approved by Griffith

University’s Human Ethics Committee (Approval: MSC/07/08/HREC and

PSY/01/11/HREC) and the Queensland University of Technology Human Research

Ethics Committee (Approval: 1400000104). All participants supplied informed


written consent. Average age of test population was 57.52 years and 57 years for

cases and controls respectively.

Further validation of genotyping results was performed on a subset of the GU-

CCQ BB population. 679 DNA samples from breast cancer patients residing in

Queensland with a diagnosis of invasive breast cancer confirmed histologically were

used to validate genotyping of miR-SNPs. Patient samples had been collected by the

Genomics Research Centre in collaboration with the Cancer Council of Queensland

as part of a 5-year population-based longitudinal study commenced in January 2010.

Patients included in this study were between 33 and 80 years of age, with an average

age of 60.16 years. Control population for the GU-CCQ BB was established from 2

sources: 107 healthy age and ethnicity matched females with no personal or familial

history of cancer, recruited since January 2000 at the Genomics Research Centre at

the Institute of Health and Biomedical Innovation, Queensland University of

Technology and genotyping result data taken from 201 age, ethnicity and sex

matched individuals belonging to the phase 1 European population from the

1000Genomes project (Abecasis et al., 2012).

7.4.2 Genomic DNA sample preparation from whole human blood.

Genomic DNA was extracted from whole blood samples using a modified

salting out method described previously (Chacon-Cortes et al., 2012; Nasiri et al.,

2005). DNA samples were evaluated by spectrophotometry using the Thermo

Scientific NanoDropTM 8000 UV-Vis Spectrophotometer (Thermo Fisher Scientific

Inc., Wilmington, DE. USA) to determine DNA yield and 260/280 ratios (Chacon-

Cortes & Griffiths, 2014; Huberman, 1995; Sahota, Brooks, Tischfield, & King,

2007). Samples with a reading below 1.7 for their 260/280 ratio were purified using

an ethanol precipitation protocol to guarantee DNA sample purity (Sambrook &

Russell, 2001).


7.4.3 miRNA SNP selection

Figure 7-1 shows the selection process we followed to determine miRNA SNPs

(miR-SNPs) that could be included in our study. Two datasets, “The whole miRNA-

disease association data” and “The miRNA function set data” from the human

miRNA disease database (HMMDD) created by Lu et al (M. Lu et al., 2008) and

updated in January 2012, were used to select 8 diseases and/or pathological

characteristics and 24 biological and/or cellular functions related to breast cancer

(See Table 7-1).

Table 7-1 Selected features from the Human miRNA disease database (HMDD) included in present

study

Selected diseases and/or pathological features related to breast cancer

miRNA-disease

association dataset

Human miRNA genes in dataset

1. Adenocarcinoma 2. Breast Neoplasms 3. Carcinoma 4. Ductal breast carcinoma

5. Squamous Cell Carcinoma 6. Neoplasms 7. Germ cell and embryonal neoplasm 8. Squamous cell neoplasms

503 Human Diseases in

dataset 396

Selected biological and/or cellular function in relation to Cancer

miRNA function dataset

Human miRNA genes in dataset

1. Activation of caspases cascade 2. AKT pathway 3. Angiogenesis 4. Anti-cell proliferation 5. Apoptosis 6. Cell cycle related 7. Cell death 8. Cell differentiation 9. Cell division 10. Cell fate determination 11. Cell motility 12. Cell proliferation

13. Chemosensitivity of tumour cells 14. Chemotaxis 15. Chromatin remodelling 16. DNA Repair 17. Epithelial-mesenchymal transition 18. Human embryonic stem cell (hESC) regulation 19. Immune response 20. Immune system 21. Inflammation 22. miRNA tumour suppressors 23. Onco-miRNAs

503

Biological and cellular functions in

dataset

43

As shown in Figure 7-1, we picked the 50 miRNA genes from each dataset that

were present in the majority of selected features for inclusion in the following steps.

This list was narrowed down to the 25 miRNA genes on each dataset with the

strongest evidence in order to maximise the potential for identification of

biologically relevant molecules using two main criteria: miRNAs involved in the

largest number of selected features from each group followed by a literature search to


confirm the number of publications showing significant relationships to cancer

biology or the possession of known functional effects of polymorphisms within the

miRNA itself. Following this, we chose 10 miRNA genes from the 25 genes on both

lists, again prioritising by number of functions and publications, and conducted a

search to identify SNPs using both dbSNP database from The National Center for

Biotechnology Information (NCBI) (Sherry et al., 2001) and 1000 Genomes project

browser (Consortium, 2012). Final selection of SNPs was done using this algorithm:

All microRNA-SNPs located inside the pre-miRNA gene were automatically

included in the SNP selection. However, SNPs located outside of the pre-miRNA

gene were assessed using the following criteria: miR-SNPs located up to 500bp

upstream or downstream from pre-miRNA were automatically included in the SNP

selection. On the other hand, SNPs located more than 500bp from the 3’ or 5’ end

were chosen only if they had a previously reported minor allele frequency higher

than 5% in Caucasian populations. As a result 56 microRNA SNPs were identified in

this preliminary selection (Data not shown) (See Figure 7-1).


Figure 7-1 MicroRNA SNP (miR-SNP) selection algorithm using the Human miRNA Disease Database (HMDD). This flow chart shows workflow for selection of preliminary miR-SNPs included in genotyping study.

Abbreviations: dbSNP, single nucleotide polymorphism database; MAF, minor allele frequency; miRNA, microRNA; NCBI National Center for Biotechnology Information; SNP, Single nucleotide polymorphisms


7.4.4 Primer design

Using the MassARRAY® Assay Design Suite v1.0 software (SEQUENOM

Inc., San Diego, CA, USA) we were able to create a single multiplex PCR

genotyping assay containing 24 miR-SNPs from our preliminary selection (See Table

7-2).

We designed forward and reverse PCR primers and one iPLEX® (extension)

primer and verified that the mass of extension primers differed by at least 30 Da

among different SNPs and by 5 Da between alternative alleles of the same marker to

achieve successful marker and allele identification by mass spectrometry analysis.

Primers were manufactured by Integrated DNA Technologies (IDT®) Pte. Ltd.

(Baulkham Hills, NSW 2153, Australia) and primer information is shown in Table

7-3.


Table 7-2 List of the miRNA SNPs included in multiplex primer design using the MassARRAY® Assay Design Suite v1.0 software (SEQUENOM Inc., San Diego, CA,

USA)

miRNA Locus Location SNPs in gene region MAF Location SNPs near

3' end MAF Location SNPs near 5' end MAF Location

MIR210 11p15.5 568,089 - 568,198 rs1062099 0.187 567,627 rs7395206 0.500 568,211 rs1090217

3 0.483 570,072

MIR34A 1p36.22 9,211,727 - 9,211,836 rs35301225 NA 9,211,802 rs77585961 0.024 9,211,468 MIR155 21q21.3 26,946,292 - 26,946,356 rs2829801 0.470 26,944,305 rs1547354 0.081 26,946,709 MIR221 Xp11.3 45,605,585 - 45,605,694 rs7050391 0.013 45,605,273 rs2858061 0.435 45,606,132

rs2858060 0.499 45,607,008 rs2858059 0.478 45,607,190

MIR222 Xp11.3 45,606,421 - 45,606,530 rs2858061 0.435 45,606,132 rs2858060 0.499 45,607,008 rs2858059 0.478 45,607,190

MIR21 17q23.1 57,918,627 - 57,918,698 rs112394324 0.003 57,918,251 rs1292037 0.244 57918908 rs3851812 0.037 57,918,370 rs13137 0.244 57919031

MIRLET7A1 9q22.32 96,938,239 - 96,938,318 rs10761322 0.458 96,937,478 rs10739971 0.260 96,937,680

MIRLET7A2 11q24.1 122,017,230 - 122,017,301

rs629367 0.145 122,017,014 rs1143770 0.486 122,017,598 rs562052 0.326 122,018,550 rs693120 0.326 122,019,011

MIR145 5q32 148,810,209 - 148,810,296

rs55945735 0.201 148,809,158 rs353291 0.332 148,810,746 rs73798217 0.054 148,809,918


Table 7-3 Primer sequences for the miRNA SNPs included in genotyping study using multiplex PCR reaction and MALDI-TOF MS

SNP Forward primer sequence Reverse primer sequence iPLEX® (Extension) primer sequence rs2829801 ACGTTGGATGTTCCCACTCAGGCATATAGC ACGTTGGATGTGGAGTCTATGTCCACTTCC TGGGCTAAGCAACCA rs7395206 ACGTTGGATGTCGGACGCCCAAGTTGGAG ACGTTGGATGTCACACGCACAGTGGGTCT GGTGGGCGGGCGGAG rs73798217 ACGTTGGATGACTGGAGGTTATCAGAGAGG ACGTTGGATGATGTGTATTCCCCAGTCTCC ACCATTCAGTTCCTAGC rs353291 ACGTTGGATGGTAGAGATGCCACAAGAGGG ACGTTGGATGAAACCTTAAGTCTTCGTTCG GGTTGTTCTCTGGCTGC rs55945735 ACGTTGGATGGGTCTCAAACTCCTGACTTC ACGTTGGATGCTGCATCTGAGCACTTCAAG TGATGCCTGGCGGGGCG rs10902173 ACGTTGGATGTCACAGGCACCTTTTCTCAG ACGTTGGATGGAAGCCTGGGTATTAGGATG ACCTTTTCTCAGCATCTG rs112394324 ACGTTGGATGACTGGAGAGAGAAATTACCC ACGTTGGATGGTTGAAACCAGAGTACATGC CAGATACGACAGAGTGTG rs1292037 ACGTTGGATGTACAGCTAGAAAAGTCCCTG ACGTTGGATGGGAGGGAGGATTTTATGGAG AACCTTTTCAAAACCCACA rs562052 ACGTTGGATGCTCCAGGCTAGTGGAATAAC ACGTTGGATGCTACTGCACTGATCGTGTTC CAGATTCAAATGCCATAGCA rs2858059 ACGTTGGATGCAGTAAGTATTTCTGGGGTG ACGTTGGATGCTCCCATGATACAATGAAGG TGGGGTGGATAAATGAATAG rs1062099 ACGTTGGATGGACCCGGTCCTGATTTTAAC ACGTTGGATGTGTGTTTCTGCCGCTTCAGT TTTAACAGTAGACTTGAGAAG rs10739971 ACGTTGGATGCCTAATAAGACCACTTAGTGT ACGTTGGATGATGCACTAACATACAACGAG CCACCTACTCATTTATCCCATG rs2858060 ACGTTGGATGACTGTATTATCCTCAGTTC ACGTTGGATGACTTGGGTAATCTAGCAATG GTATTATCCTCAGTTCGTAACA rs2858061 ACGTTGGATGGCTTTCAATACTACAAGGG ACGTTGGATGAATGATACCTTTCATAGGGG GTAAAACAAAAACAGGTAAGAG rs35301225 ACGTTGGATGGGCAGTATACTTGCTGATTG ACGTTGGATGGCTGTGAGTGTTTCTTTGGC GGATTATTGCTCACAACAACCAG rs1547354 ACGTTGGATGTTGCAGGTTTTGGCTTGTTC ACGTTGGATGGGAGGTTAGTAGTCCTTCTA TTTGATTCAACTGTTAGAAATGTG rs13137 ACGTTGGATGAGGTGAAAGAGATGAACCAC ACGTTGGATGAAAGCATTCCCAAAATGCTC TCCCAAAATGCTCTATTTTAGATAG rs10761322 ACGTTGGATGGCTTTTGGTTACTAAATCAC ACGTTGGATGCTTCATATTTAGGAGGTAGC GGCATATTTAGGAGGTAGCTACTAC rs693120 ACGTTGGATGGTAGATGGCACATATAGAAA ACGTTGGATGCATCCCTTAACTGTAAGTTC GATTGTCAAATGAAAAGAAGAATAT rs1143770 ACGTTGGATGCTGAACAATTTAATGCCTTC ACGTTGGATGTTCAGTTTTACCAGAGGAAC CCCCCATGCCTTCTGATATCTGTTGA rs77585961 ACGTTGGATGGTTTCCTTCTCTGCAAGACG ACGTTGGATGCACTTACTATGCAGGAAGGC CCTACATGATGTAATACACTTACAATA rs7050391 ACGTTGGATGGTAAGGCAGTATGATTAGGC ACGTTGGATGGCCTCAACTGTCAAAGATTG TGTTCATAATTATTATCAGAAGGCATA rs3851812 ACGTTGGATGTTTTCCTCCCAAGCAAAAC ACGTTGGATGTTCTTGCCGTTCTGTAAGTG GGTGGAAGTGTTTTATTCTTAGTGTGA rs629367 ACGTTGGATGTATGCAGCATTTTTGTGAC ACGTTGGATGATTCTGTTTCCTCGGGTTAG GGGGTCAGCATTTTTGTGACAATGGACA


7.4.5 Primary Multiplex PCR

Genotyping was undertaken following the iPLEX™ GOLD genotyping

protocol using the iPLEX® Gold Reagent Kit (SEQUENOM Inc., San Diego, CA,

USA). Primer extension reactions were performed according to the instructions for

the SEQUENOM linear adjustment method included in the iPLEX™ GOLD

genotyping protocol (SEQUENOM Inc., San Diego, CA, USA). All reactions were

performed using Applied Biosystems® MicroAmp® EnduraPlate™ Optical 96-Well

Clear Reaction Plates with Barcode (Life Technologies Australia Pty Ltd.,

Mulgrave, VIC, Australia) and an Applied Biosystems® Veriti® 96-Well Thermal

Cycler (Life Technologies Australia Pty Ltd., Mulgrave, VIC, Australia).

7.4.6 MALDI-TOF MS analysis and data analysis

A total of 12-16 nl of each iPLEX® reaction product were transferred onto a

SpectroCHIP® II G96 (SEQUENOM Inc., San Diego, CA, USA) using

SEQUENOM® MassARRAY® Nanodispenser (SEQUENOM Inc., San Diego, CA,

USA). SpectroCHIP® analysis was carried out by SEQUENOM® MassArray®

Analyzer 4 and the SpectroAcquire software Version 4.0 (SEQUENOM Inc., San

Diego, CA, USA). Finally data analysis for genotype determination was done using

the MassARRAY® Typer software version 4.0 (SEQUENOM Inc., San Diego, CA,

USA).

7.4.7 Statistical analysis

Statistical analysis of genotypes and alleles was conducted using Plink

software version 1.07 (http://pngu.mgh.harvard.edu/purcell/plink/) (Purcell et al.,

2007). The α for p-values was set at 0.05 to determine statistically significant

association with breast cancer. Genotype and allele frequencies for each miRNA

SNP in our case and control populations were established and we used Hardy-

Weinberg equilibrium (HWE) to evaluate deviation between observed and expected



frequencies for identification of unexpected population or genotyping biases (Guo &

Thompson, 1992; Hardy, 1908). We performed Chi square analysis to evaluate

differences in genotype and allele frequencies between cases and controls for each

independent population (Fisher & Yates, 1963). Finally we calculated odds ratio

(OR) and obtained 95% confidence interval (CI) 95% to assess disease risk.

7.5 RESULTS AND DISCUSSION

MicroRNAs are some of the small non-coding RNA molecules responsible for

gene regulation at the translational level. They require a very complex series of

nuclear and cytoplasmic processes for their synthesis and to achieve their functional

effects on genes involved in key cellular functions like replication and cell

differentiation (Bartel, 2009; Y. Lee et al., 2002; Baohong Zhang, Wang, et al.,

2007). As a result they are likely to play a role in the development and progression of

cancer due to the important biological mechanisms they regulate (Thalia A. Farazi et

al., 2011; Robbins et al., 2010; Baohong Zhang, Pan, et al., 2007). They have been

shown to have different roles in various types of cancer including breast cancer

(Calin & Croce, 2006; Lynam-Lennon et al., 2009). Breast cancer is a cause for

concern since recent reports by the IARC show it has very high incidence and

mortality rates around the world (Ferlay et al., 2013). Analysis of single nucleotide

polymorphisms in well-defined case control cohorts has provided information on

miR-SNPs involved in the pathophysiology of different types of cancer including

breast cancer (Hoffman et al., 2009; Z. Hu et al., 2008; Z. Hu et al., 2009;

Jazdzewski et al., 2008; Kontorovich et al., 2010; Shen et al., 2008; Q. Sun et al.,

2010; Rongxi Yang et al., 2010). Therefore we selected a panel of 24 miR-SNPs

related to 9 miRNA genes previously identified to play a role in breast cancer to

genotype in our Australian Caucasian breast cancer case control populations.

Genotyping of our selected miRNA variants in the GRC-BC cohort showed six of

them to be non-polymorphic (rs73798217, rs112394324, rs35301225, rs1547354,

rs7050391 and rs3851812) and another five of the chosen miR-SNPs failed to

successfully deliver genotypes (rs2829801, rs7395206, rs2858059, rs1143770 and

rs77585961).


On the other hand, genotype and allele frequencies of the remaining 13 miR-

SNPs in the GRC-BC cases and controls showed these to be closely similar to those

found in Hapmap for Caucasian populations and they were also in Hardy Weinberg

Equilibrium (HWE) (p > 0.05). Ultimately, however, chi-square analysis of

genotyping results of 12 SNPs located in relation to seven miRNAs (MIR210,

MIR221, MIR222, MIR21, MIRLET7A1, MIRLET7A2 and MIR145) showed no

significant differences for genotype and allele frequencies between cases and

controls in our GRC-BC population (See Tables 6-4 to 6-9).

In contrast, we were able to determine significant differences at the allelic level

for rs353291 after chi-square analysis in our GRC-BC cohort (p=0.041) although no

significant difference in genotype frequencies (p=0.09) (See Table 7-10). We then

proceeded to genotype this SNP in the GU-CCQ BB population and we also found

that genotype and allele frequencies closely matched Hapmap frequencies for

Caucasian populations and that cases and controls were in HWE as well. Statistical

analysis of genotyping in our replication population showed similar findings to those

obtained in the GRB-BC cohort shown in Table 7-10. We were able to find

significant differences in allele frequencies between cases and controls in the GU-

CCQ BB population (p=0.023) and also no statistically significant differences at the

genotype level (p=0.07). Finally we calculated odds ratios for alleles in the GRC-BC

and GU-CCQ BB populations to be 1.37(CI 95%: 1.01-1.84) and 1.25 (CI 95%:

1.03-1.52) respectively, which suggests that the presence of the C allele at this locus

increases the risk of developing breast cancer. However it should be noted that if we

consider statistical correction for multiple testing, our finding for the analysis of

allele frequencies for both populations for this variant is not significant if we

determine Bonferroni correction of the α for p-value to be 2.08 x 10-3. However these

results are still potentially interesting, particularly considering the similarity of allelic

significance in both independent case-control cohorts and point to the need for

further genotyping in extended populations.


Table 7-4 Allele and genotype frequencies for miRNA SNPs in MIR210 obtained from the GRC-BC population

rs1062099 rs10902173 Allele Genotype Allele Genotype G

(%) C

(%) p-value GG

(%) CG (%)

CC (%)

p-value C (%)

T (%)

p-value CC (%)

TC (%)

TT (%)

p-value

Control 300 (83.3)

60 (16.7)

0.36 126 (70.0)

48 (26.7)

6 (3.3)

0.61 216 (59.3)

148 (40.7)

0.93 63 (36.6)

90 (49.5)

29 (15.9)

0.64

Cases 263 (80.7)

63 (19.3)

106 (65.0)

51 (31.3)

6 (3.7)

203 (59.0)

141 (41.0)

63 (34.6)

77 (44.8)

32 (18.6)

Hapmap (%) 81.7 18.3 66.8 29.8 3.4 60.4 39.6 34.8 51.2 14.0

Table 7-5 Allele and genotype frequencies for miRNA SNPs in MIR221 and MIR222 from the GRC-BC Population

rs2858061 rs2858060 Allele Genotype Allele Genotype G

(%) C

(%) p-value GG

(%) CG (%)

CC (%)

p-value C (%)

G (%)

p-value CC (%)

GC (%)

GG (%)

p-value

Control 323 (87.8)

45 (12.2)

0.43 142 (77.2)

39 (21.2)

3 (1.6)

0.38 219 (69.3)

97 (30.7)

0.75 81 (51.3)

57 (36.1)

20 (12.7)

0.35

Cases 295 (85.8)

49 (14.2)

130 (75.6)

35 (20.3)

7 (4.1)

218 (68.1)

102 (31.9)

74 (46.3)

70 (43.8)

16 (10.0)

Hapmap (%) 86.2 13.8 80.2 12.4 7.4 72.8 27.2 62.3 20.6 17.2


Table 7-6 Allele and genotype frequencies for miRNA SNPs in MIR21 obtained from the GRC-BC population

rs1292037 rs13137 Allele Genotype Allele Genotype A

(%) G

(%) p-

value AA (%)

GA (%)

GG (%)

p-value

T (%)

A (%)

p-value

TT (%) AT (%) AA (%)

p-value

Control 300 (80.6)

72 (19.4)

0.86 122 (65.6%)

56 (30.1)

8 (4.3)

0.95 299 (80.8)

71 (19.2)

0.85 122 (65.9)

55 (29.7)

8 (4.3)

0.94

Cases 274 (80.1)

68 (19.9)

110 (64.3) 54 (31.6)

7 (4.1)

276 (80.2)

68 (19.8)

111 (64.5)

54 (31.4)

7 (4.1)

Hapmap (%)

81.0 19.0 65.4 31.1 3.4 81.0 19.0 65.4 31.1 3.4

Table 7-7 Allele and genotype frequencies for miRNA SNPs in MIRLET7A1 obtained from the GRC-BC cohort

rs10761322 rs10739971 Allele Genotype Allele Genotype

T (%)

C (%)

p-value TT (%) TC (%)

CC (%)

p-value G (%)

A (%)

p-value GG (%)

AG (%)

AA (%)

p-value

Control 225 (61.8)

139 (38.2)

0.73 76 (41.8)

73 (40.1)

33 (18.1)

0.08 243 (67.5)

117 (32.5)

0.39 89 (49.4)

65 (36.1)

26 (14.4)

0.36

Cases 207 (60.5)

135 (39.5)

59 (34.5)

89 (52.0)

23 (13.5)

223 (64.5)

123 (35.5)

74 (42.8)

75 (43.4)

24 (13.9)

Hapmap (%) 61.9 38.1 40.9 42.0 17.2 68.1 31.9 46.4 43.3 10.3


Table 7-8 Allele and genotype frequencies for miRNA SNPs in MIRLET7A2 obtained from the GRC-BC population

rs629367 rs562052 rs693120

Allele Genotype Allele Genotype Allele Genotype

A

(%) C

(%) p-

value AA (%)

CA (%)

CC (%)

p-value G (%) A (%) p-

value GG (%)

AG (%)

AA (%)

p-value

G (%)

A (%)

p-value

GG (%)

AG (%)

AA (%)

p-value

Control 302 (82.5)

64 (17.5) 0.72

125 (68.3)

52 (28.4)

6 (3.3) 0.88

242 (65.4)

128 (34.6) 0.76

81 (43.8)

80 (43.2)

24 (13.0) 0.57

230 (67.3)

112 (32.7) 0.77

82 (48.0)

66 (38.6)

23 (13.5) 0.21

Cases 289 (83.5)

57 (16.5)

122 (70.5)

45 (26.0)

6 (3.5)

230 (66.5)

116 (33.5)

74 (42.8)

82 (47.4)

17 (9.8)

225 (66.2)

115 (33.8)

72 (42.4)

81 (47.6)

17 (10.0)

Hapmap (%) 84.6 15.4 75.4 18.5 6.2 68.1 31.9 46.2 43.8 10.0 68.1 31.9 46.2 43.8 10.0

Table 7-9 Allele and genotype frequencies for rs55945735 located in MIR145 obtained from the GRC-BC population

rs55945735 Allele Genotype

A (%) G (%) p-value AA (%) GA (%) GG (%) p-value Control 193 (59.9) 129 (40.1) 0.29 61 (37.9) 71 (44.1) 29 (18.0) 0.58 Cases 182 (64.1) 102 (35.9) 60 (42.3) 62 (43.7) 20 (14.1)

Hapmap (%) 62.8 37.2 38.3 49.1 12.7


Table 7-10 Allele and genotype frequencies for SNP rs353291 located in MIR145 obtained from the GRC-BC and GU-CCQ BB cohorts

GRC-BC Population GU-CCQ BB Population Allele Genotype Allele Genotype

T (%)

C (%)

p-value TT (%)

CT (%)

CC (%)

p-value T (%)

C (%)

p-value TT (%)

CT (%)

CC (%)

p-value

Control 211 (58.6)

149 (41.4)

0.041 61 (33.9)

89 (49.4)

30 (16.7)

0.09 386 (62.7)

230 (37.3)

0.023 128 (41.6)

130 (42.2)

50 (16.2)

0.07

Cases 171 (50.9)

165 (49.1)

40 (23.8)

91 (54.2)

37 (22.0)

769 (57.2)

575 (42.8)

229 (34.1)

311 (46.3)

132 (19.6)

Hapmap (%) 62.7 37.3 40.6 44.1 15.3 62.7 37.3 40.6 44.1 15.3


miR-SNP rs353291 is located 450 bp upstream from the MIR145 gene, inside the

MIRNA143HG transcript, in the long arm of chromosome 5 region 32 at position

148,810,746. To the best of our knowledge there are no previous association studies

in relation to this miR-SNP and breast cancer risk in Caucasians or any populations

of other ethnicities. However, MIR145 has been previously reported to play a role in

cancer biogenesis and progression. Downregulation of MIR145 is a common finding

previously reported in colorectal (Arndt et al., 2009; Michael et al., 2003), bladder

(Chiyomaru et al., 2010), lung (Cho et al., 2009, 2011) and oesophageal (Kano et al.,

2010) cancers possibly leading to poor prognosis. Similarly, it has also been found to

have a tumour suppressor role in breast tissue particularly in the myoepithelial/basal

cell compartment and it is found to be downregulated in breast cancer tissue samples

(Iorio et al., 2005; Sempere et al., 2007; Volinia et al., 2006). Research using breast

cancer cell lines showed it regulates genes involved in modulation of apoptosis

(Spizzo et al., 2010; S. Wang et al., 2009). Finally downregulation of MIR145 has

been associated with a more aggressive behaviour of breast malignancies based on

results performed in both breast cancer cell line and tissue samples (Goette et al.,

2010; Radojicic et al., 2011). However the molecular mechanisms leading to

decreased expression of MIR145 still remain unknown and our finding could

potentially help to provide further knowledge on these mechanisms through further

functional validation on breast cancer cell culture and/or animal/models. It is also

possible that rs353291 is linked to a different nearby SNP not considered in this

research that may have direct functional effects on MIR145, so additional

sequencing/genotyping studies may need to be performed prior to functional

assessment of the link between rs353291 and breast cancer.

7.6 CONCLUSIONS

This study suggests a role for a mutation in miR-SNP rs353291 in breast

cancer. To the best of our knowledge, this is the first report on breast cancer risk

association for this variant. This finding could potentially explain the previously

described role that MIR145 plays in breast cancer, previously documented in the

literature, but it requires confirmation via functional studies using cell culture or


animal models. It also requires further validation on larger Caucasian population as

well as in cohorts of individuals with different ethnic backgrounds before it can be

translated into clinical applications used in breast cancer diagnosis or treatment.


Chapter 8: Association of microRNA 17-92 cluster host gene (MIR17HG) polymorphisms with breast cancer.





field of expertise;







requirements.


Chacon-Cortes D, Smith RA, Lea RA, Youl PH and Griffiths LR (2015). Association

of microRNA 17-92 cluster host gene (MIR17HG) polymorphisms with breast

cancer. Tumor Biology, 36(7), 5369-76


Diego Chacon-Cortes

(Candidate)

Performed experimental work, contributed to

experimental design, research plan and data analysis,

prepared and approved final version of the manuscript.



160 Chapter 8: Association of microRNA 17-92 cluster host gene (MIR17HG) polymorphisms with breast cancer.


manuscript.

Rodney A. Lea Assisted in statistical analysis, critically reviewed and





manuscript







Chapter 8: Association of microRNA 17-92 cluster host gene (MIR17HG) polymorphisms with breast cancer. 161

8.2 ABSTRACT

Breast cancer incidence and mortality rates are increasing despite our current

knowledge of the disease. 90-95% of breast cancer cases correspond to sporadic

forms of the disease and are believed to involve an interaction between

environmental and genetic determinants. The microRNA 17-92 cluster host gene

(MIR17HG) has been shown to regulate expression of genes involved in breast

cancer development and progression. Study of single nucleotide polymorphisms

(SNPs) located in this cluster gene could help provide a further understanding of its

role in breast cancer. Therefore, this study investigated 6 SNPs in the MIR17HG

using two independent Australian Caucasian case-control populations (GRC-BC and

GU-CCQ BB populations) to investigate association to breast cancer susceptibility.

Genotyping was undertaken using chip-based matrix assisted laser desorption

ionization time-of-flight (MALDI-TOF) mass spectrometry (MS). We found

significant association between rs4284505 and breast cancer at the allelic level in

both study cohorts (GRC-BC p = 0.01 and GU-CCQ BB p = 0.03). Furthermore

haplotypic analysis of results from our combined population determined a significant

association between rs4284505/rs7336610 and breast cancer susceptibility (p = 5x10-

4). Our study is the first to show that the A allele of rs4284505 and the AC

haplotype of rs4284505/rs7336610 are associated with risk of breast cancer

development. However, definitive validation of this finding requires larger cohorts or

populations in different ethnic backgrounds. Finally, functional studies of these SNPs

could provide a deeper understanding of the role that MIR17HG plays in the

pathophysiology of breast cancer.

8.3 INTRODUCTION

In 2012, 14.09 million people in the world were newly diagnosed with cancer

and about 21% of them corresponded to breast cancer (1,676,633 cases in total)

according to the International Agency for Research on Cancer (IARC). It is the most

common type of cancer in women worldwide and is estimated to be developed by

one in eight women living to the age of 90. Breast cancer is also the fifth cause of


cancer-related deaths in the world (6.4%) and accounted for 197,528 of all cancer

related deaths in developed countries. It has become a cause for concern since both

incidence and mortality rates have increased by more than 20% and 14% respectively

from figures reported in 2008 (Ferlay et al., 2013; Robbins et al., 2010; Stewart,

Wild, International Agency for Research on, & World Health, 2014). Sporadic forms

of breast cancer account for about 90-95% of all cases and they are most likely the

result of interaction between genetic and environmental factors (Anderson, 1991;

Claus, Schildkraut, Thompson, & Risch, 1996; Martin & Weber, 2000; Robbins et

al., 2010).


of about 21 to 25 nucleotides in length involved in gene regulation at the

translational level. They account to about 1% of the human genome and are very

likely to be important in cancer biology due to their role in regulating important

cellular processes like cell growth, differentiation and cell survival (Calin & Croce,

2006; Thalia A. Farazi et al., 2011; Lynam-Lennon et al., 2009; Robbins et al., 2010;

Zuoren Yu et al., 2010). miRNAs are usually generated following the canonical

pathway where they are transcribed from long microRNA primary transcripts (pri-

miRNAs) about 1kb in size that contain up to six miRNA precursors (pre-miRNA).

Pre-miRNAs are stem-loop molecules of about 55 to 70 nucleotides in length and

they are produced when pri-miRNA are cleaved by the complex Drosha – DCGR8

(DiGeorge syndrome critical region gene 8, also known as pasha) within the cellular

nucleus. Pre-miRNAs are then transported to the cytoplasm by XPO5 (exportin5)

and the nuclear protein Ran-GTP. In the cytoplasm pre-miRNAs are processed by

DICER1, a ribonuclease type III enzyme, and double-stranded RNA binding proteins

(RBP) TARBP2 and/or PKRA producing a duplex molecule that contains both the

single-stranded mature miRNA sequence and its complementary strand named

miRNA*. The miRNA:miRNA* molecule is loaded into the RNA-induced silencing

complex (RISC), where the mature miRNA sequence is merged into the

Argonaute/EIFC2C (Ago) proteins and the miRNA* is released and degraded. Ago

proteins bound to mature miRNA mediate target messenger RNA (mRNA)

recognition and interaction between these three components results in gene

regulation. The mRNA target is recognised by pairing of the miRNA seed region


(nucleotides 2 to 8) located in the 5’-end of the miRNA and the complementary

sequence mainly located in the 3’-UTR region of the mRNA(Chendrimada et al.,

2005; Yoontae Lee et al., 2003; Y. Lee et al., 2002; Yi et al., 2003). Each miRNA

may bind to up to 200 gene targets and each gene could have multiple binding sites

for different miRNAs (Bartel, 2009; Rajewsky, 2006). Location of promoter regions

of microRNA genes, mechanisms that regulate their biogenesis and molecular

mediators of their functional effects within cells and tissues are still being studied

and remain unclear for most miRNAs identified to date (Ha & Kim, 2014; Kim,

2005).

Around 52% of all miRNA genes are located in different non-random positions

within the genome, generally in regions of chromosomal instability or cancer-

associated regions (Calin, Sevignani, et al., 2004). Changes in miRNA synthesis,

expression and effect on target genes in relation to cancer could occur through many

different mechanisms and some of them include: point mutations in miRNA genes,

mRNA sequences and surrounding regions, epigenetic changes such as mRNA gene

methylation, loss or mutation in the promoter regions for specific miRNA clusters

and/or alterations in pathway related RBP (Ha & Kim, 2014; Lynam-Lennon et al.,

2009).

Approximately one-third of all miRNAs form genomic clusters, about 113 in

total, which may measure up to 51 kb in length, and they are usually transcribed as a

single polycistronic transcript (Thalia A. Farazi et al., 2011; Lynam-Lennon et al.,

2009). One such cluster is the microRNA 17-92 cluster host gene (MIR17HG) cluster

located on chromosome 13q31.3 in the third intron of the c13orf25 gene. It was

initially identified in relation to B cell lymphoma both in cell lines and tissue sample

from patients by Ota et al in 2004 (Ota et al., 2004). The miRNA 17-92 cluster host

gene transcript is about 800bp in length and contains six miRNAs: MIR17, MIR18A,

MIR19A, MIR20A, MIR19B1 and MIR92A1, which belong to 4 seed families

(miRNA 17, miRNA 18, miRNA 19 and miRNA 92). Based on their seed sequences,

two related paralogues of this cluster have also been identified in the human genome:

miRNA 106b-25 cluster located on chromosome 7 (7q22.1) in the 13th intron of gene


MCM7 and miRNA 106a-363 cluster located on chromosome X (Xq26.2) (Ventura

et al., 2008).

The MIR17HG was named oncomiR-1 by He et al who found it to promote

oncogenesis in B-cell lymphoma using a murine model (L. He et al., 2005). Further

studies have confirmed roles both in cell proliferation and cell death, a feature that

seems to be dependent on the cell types and tissues in which it is found. Research has

shown that MIR17HG inhibits tumour growth by targeting important cell cycle

regulator genes including E2F1, MYC, RB1 and CCND1. In 2005, O’Donnell et al

observed that MIR17HG down-regulated expression of both E2F1 and MYC genes

using Burkitt lymphoma cells (P493-6) as a model (O'Donnell, Wentzel, Zeller,

Dang, & Mendell, 2005). On the other hand, later studies have shown this cluster

gene region seems to have an oncogenic effect in other types of tumour.

Experimental models of human B-cell lymphomas overexpressing MIR17HG

showed evasion of apoptosis due to enhanced activity of MYC. Moreover, studies of

small-cell lung carcinomas showed negative regulation of the pro-apoptotic gene

BCL2L11 resulting from overexpression of MIR17HG (Hayashita et al., 2005). This

finding may suggest that fine regulation of MIR17HG expression is important.

Research studies indicate that specific processing of miRNAs included in this cluster

might be the result of largely unknown intricate regulatory mechanisms. There are 34

transcription factors that regulate MIR17HG transcription identified to date and they

include MYC, MYCN, MXI, E2F3 and TP53 amongst others, that are specific to

different cell types and biological contexts (Mogilyansky & Rigoutsos, 2013; Olive,

Jiang, & He, 2010).

In relation to breast malignancies, a number of studies on the effects of

MIR17HG in breast cancer cell lines have also been published. In 2006 MIR17 5p,

one of the miRNAs included in the cluster, was shown to inhibit oestrogen receptor α

(ESR1) co-activator NCOA3 in different breast cancer cell lines (Hossain et al.,

2006). Following this finding, Yu et al demonstrated a tumour suppressor role for

MIR17HG inducing cell cycle arrest and decreasing cell proliferation by directly

inhibiting CCND1 in the MCF-7 breast cancer cell line in 2008 (Z. Yu et al., 2008).

Similarly in a later study, Leivonen et al showed that MIR17HG reduced cell growth


and progression of cell cycle in breast cancer cell lines MCF-7 and BT-474 using

two mechanisms: translational repression of ESR1 and/or downregulation of ESR1

responding genes. Additionally, they were able to confirm their in-vitro findings

through expression studies on breast cancer tumour samples showing significant

association between high expression levels of miR-18a and ESR1-negative breast

cancer tumours (Leivonen et al., 2009).

According to the evidence previously presented, MIR17HG seems to play an

important role in the development of different carcinomas including breast cancer.

However, there is a lack of knowledge on the molecular mechanisms leading to

altered expression or regulation of this particular cluster as well as each of the mature

miRNAs included in it. Genetic variants such as single nucleotide polymorphisms

(SNPs) could potentially impact biological processes involved in the production or

functional effects of the microRNA 17-92 cluster host gene and the study of such

variants in relation to cancer could provide insight into its aetiological mechanisms.

In this study, we genotyped 6 SNPs located in the cluster gene region or in

close proximity to it. Genotyping of these variants was performed using multiplex

PCR and Matrix-assisted laser desorption/ionization time-of-flight mass

spectrometry (MALDI-TOF MS) analysis in two independent Australian Caucasian

breast cancer cohorts. Results from our allele and haplotype association analysis

showed statistical significance for two of the selected SNPs and the risk of breast

cancer in these Caucasian populations.


8.4.1 Study populations

Genotyping was carried out in two independent cohorts of females of

Caucasian (Northern European) origin: The Genomics Research Centre Breast


Cancer (GRC-BC) population and a subset of the Griffith University-Cancer Council

Queensland Breast Cancer Biobank (GU-CCQ BB). GRC-BC population was

recruited from the Gold Coast Hospital, Southport and included 244 breast cancer

patient samples from patients residing in the South East Queensland region. Samples

from 187 healthy females with no history of personal or familial cancer were used as

controls and they were also obtained via the Genomics Research Centre Clinic,

Southport, with the research approved by Griffith University’s Human Ethics

Committee (Approval: MSC/07/08/HREC and PSY/01/11/HREC) and the

Queensland University of Technology Human Research Ethics Committee

(Approval: 1400000104). Mean age of the test populations was 57.52 years and 57

years, for cases and controls, respectively.

Replication was performed using 679 samples from the GU-CCQ BB

population. Patient samples were collected by the Genomics Research Centre in

collaboration with the Cancer Council of Queensland as part of a 5-year population-

based longitudinal study of women newly diagnosed with breast cancer. 929 women

resident in Queensland with a diagnosis of invasive breast cancer confirmed

histologically have been recruited by this biobank since January 2010. Age range of

patient samples included in this study varied from 33 to 80 years, with an average

age of 60.16. Clinical and demographic information was obtained through the

Queensland Cancer Registry, Genomics Research Centre General Questionnaire,

computer assisted telephone interview (CATI) and self-administered questionnaires

(SAQ) over two main time points: 4 to 6 months post diagnosis and 18 months post

diagnosis [approximately 12 months after initial interview]. Additionally, using the

Queensland Cancer Registry and the National Death Index, the study cohort will be

followed-up for five years post diagnosis to record cancer recurrence and mortality

(Youl et al., 2011). 308 ethnicity and sex matched control samples for the GU-CCQ

BB population were obtained from 2 sources: 107 females with no personal or

familial history of cancer, recruited from January 2000 through the Genomics

Research Centre at the Institute of Health and Biomedical Innovation Institute,

Queensland University of Technology and also genotyping data obtained from 201

individuals belonging to the phase 1 European population from the 1000Genomes

project (Abecasis et al., 2012).


8.4.2 Genomic DNA sample preparation from whole human blood.

Genomic DNA was obtained from whole blood samples using a modified

salting out method (Chacon-Cortes et al., 2012; Nasiri et al., 2005). DNA samples

were checked for quantity and purity using spectrophotometry measurements using

the Thermo Scientific NanoDropTM 8000 UV-Vis Spectrophotometer (Thermo Fisher

Scientific Inc., Wilmington, DE. USA) to determine concentration and 260/280

ratios (Chacon-Cortes & Griffiths, 2014; Huberman, 1995; Sahota, Brooks,

Tischfield, et al., 2007). To guarantee DNA sample purity, those with a value below

1.7 for their 260/280 ratio were further purified using an ethanol precipitation

protocol (Sambrook & Russell, 2001).

8.4.3 miRNA SNP selection

We used the human miRNA disease database (HMDD), developed by Lu et al

(M. Lu et al., 2008) and updated in January 2012, to identify microRNAs involved in

development and progression of breast cancer. We selected 24 biological/cellular

functions and 8 diseases and/or pathological features related to breast cancer from

two datasets included in the HMDD: “The whole miRNA-disease association data”

and “The miRNA function set data”. The microRNA 17-92 cluster host gene

(MIR17HG) was found to be present in most of the features on each dataset and

therefore we conducted a preliminary literature search to determine its role in cancer

development and progression and to investigate any SNP genotyping studies for this

cluster gene. Following this, we searched for previously identified SNPs in the

MIR17HG using both the dbSNP database from The National Center for

Biotechnology Information (NCBI) (Sherry et al., 2001) and 1000 Genomes project

browser (Abecasis et al., 2012). MicroRNA SNPs (miR-SNPs) that were located

inside the pre-miRNA gene were automatically included in a preliminary selection

and those located outside of the pre-miRNA gene were selected using either one of

the following criteria: SNPs located within 500bp to the 3’ or 5’ end of the pre-

miRNA gene region were automatically included in the preliminary selection.


Finally, miR-SNPs located at a distance higher than 500bp from either end of the

pre-miRNA gene were chosen only if they had a minor allele frequency (MAF)

higher than 0.05 in Caucasian populations. Our preliminary selection identified 13

miR-SNPs in the MIR17HG cluster or located up to 1.6 kb downstream and two

variants (rs72631821 and rs9589207) located inside one of the mature miRNA

sequences (MIR92A1) (See Table 8-1).

8.4.4 Primer design and multiplex PCR genotyping

We were able to design a multiplex PCR assay to include six miR-SNPs: one

of the variants located inside the MIR92A1 gene (rs9589207) and five miR-SNPs

located throughout the MIR17HG region using the MassARRAY® Assay Design

Suite v1.0 software (SEQUENOM Inc., San Diego, CA, USA). Two PCR primers

(forward and reverse) and one iPLEX® (extension) primer were designed for each

miR-SNP to achieve successful marker and allele identification by mass

spectrometry (See Table 8-2). We verified that masses of extension primers differed

by at least 30 Da among different SNPs and by 5 Da between alternative alleles of

the same marker and primers were synthesized by Integrated DNA Technologies

(IDT®) Pte. Ltd. (Baulkham Hills, NSW 2153, Australia). Multiplex PCR reactions

were performed according to the manufacturer’s protocol for the iPLEX™ GOLD

genotyping application using iPLEX® GOLD reaction kit reagents (SEQUENOM

Inc., San Diego, CA, USA). Primer extension reactions were carried out using the

SEQUENOM linear adjustment method as per manufacturer’s iPLEX® GOLD

genotyping protocol. All iPLEX® GOLD genotyping reactions were performed using

Applied Biosystems® MicroAmp® EnduraPlate™ Optical 96-Well Clear Reaction

Plates with Barcode (Life Technologies Australia Pty Ltd., Mulgrave, VIC,

Australia) and a Applied Biosystems® Veriti® 96-Well Thermal Cycler (Life

Technologies Australia Pty Ltd., Mulgrave, VIC, Australia).


Table 8-1 SNPs in the microRNA 17-92 cluster host gene selected for primer design using the MassARRAY® Assay Design Suite v1.0 software (SEQUENOM Inc., San

Diego, CA, USA)

miRNA Locus Location miRNA SNPs in gene region

MAF Location SNPs near 3' end

MAF Location Downstream distance (bp)

MIR92A1 13q31.3 92,003,568 - 92,003,645 rs72631821 NA 92,003,588

rs9589207 0.0211 92,003,589 MicroRNA 17-92 cluster host gene

13q31.3 92,000,074 - 92,006,829 rs4284505 0.4867 92,001,472 rs138514639 0.0517 91,998,497 1577 rs72640333 0.0939 92,004,927 rs113242099 0.0504 92,005,118 rs1888138 0.0916 91,999,496 578 rs111371822 0.1484 92,005,134 rs7336610 0.4524 92,005,137 rs7318578 0.3594 92,005,469 rs2351704 0.3539 91,999,716 358 rs17735387 0.0829 92,006,054 rs1428 0.463 92,006,770

Table 8-2 Primer sequences for microRNA 17-92 cluster host gene SNPs included in genotyping study using multiplex PCR reaction and MALDI-TOF MS

SNP Forward primer sequence Reverse primer sequence iPLEX® (Extension) primer sequence rs1888138 ACGTTGGATGGTTTATTACTTTACCGGCCC ACGTTGGATGTACTTCTCTGGTTCCGGTTG TGGGTTTCCGAGGTA rs7336610 ACGTTGGATGAAAAAGTTCCGGCTGGACAC ACGTTGGATGACAGCGTTTCACCATGTCGG GACTGACCTCAGGTAATCC rs9589207 ACGTTGGATGACTCAAACCCCTTTCTACAC ACGTTGGATGGGACAAGTGCAATACCATAC AAGGAACACAGCATTGCAAC rs17735387 ACGTTGGATGAGCCTTAACTATTTGGAGGG ACGTTGGATGGCTTTCTTTCCAAATATAGGC GGGGAGAAAGTTGTACATGCAAA rs4284505 ACGTTGGATGCTTTGCAGTCTCGGGTGTTC ACGTTGGATGTGATATTGCAACGACGAGCC TGATCCTGCCTTTTTCAGTTCCTT rs1428 ACGTTGGATGTCAATATTCTCGTTCTGGAC ACGTTGGATGACAGTTTGGTCTGGCTGTTT GCATTTAATGTTAATAAATAAAATACTG


8.4.5 MALDI-TOF MS analysis and data analysis

We dispensed a total of 12-16 nl of each iPLEX® reaction product onto a

SpectroCHIP® II G96 (SEQUENOM Inc., San Diego, CA, USA) using

SEQUENOM® MassARRAY® Nanodispenser (SEQUENOM Inc., San Diego, CA,

USA) and SpectroCHIP® analysis was performed by SEQUENOM® MassArray®

Analyzer 4 (SEQUENOM Inc., San Diego, CA, USA). We acquired automated

spectra after laser desorption/ionisation using the SpectroAcquire software version

4.0 (SEQUENOM Inc., San Diego, CA, USA) and genotype data analysis was

performed using the MassARRAY® Typer software version 4.0 (SEQUENOM Inc.,

San Diego, CA, USA).

8.4.6 Statistical analysis

We determined genotype and allele frequencies for each miRNA SNP in our

case and control populations. Hardy-Weinberg equilibrium (HWE)(Guo &

Thompson, 1992; Hardy, 1908) was used to evaluate deviation between observed and

ideal Hardy-Weinberg frequencies. Differences in genotype and allele frequencies

between cases and controls were evaluated using Chi-square analysis (Fisher &

Yates, 1963) for each independent population to determine association with breast

cancer. The α for p-values was set at 0.05 to determine statistical significance.

Statistical analysis of genotypes and alleles was conducted using Plink software

version 1.07 (http://pngu.mgh.harvard.edu/purcell/plink/) (Purcell et al., 2007). We

also calculated odds ratio (OR) and a confidence interval (CI) of 95% to assess

disease risk. Finally, we merged results from both populations and we performed

linkage disequilibrium (LD) and haplotype block association analysis on the

combined genotyping data using Haploview software version 4.2 (Daly Lab, Broad

Institute, Cambridge, MA, USA) (Barrett et al., 2005).



8.5 RESULTS

8.5.1 Genotypic Analysis

Table 8-3 shows the 6 SNPs genotyped in our study provided good coverage of

the MIR17HG cluster. They also included rs9589207 located inside a mature miRNA

sequence (MIR92A1) and rs1888138 located 578 bp downstream of the cluster gene.

All SNPs had a MAF greater than 0.05, with the exception of rs9589207 which had a

MAF of 0.02. We proceeded to include all these SNPs in our multiplex PCR, in

particular rs9589207 because of their genetic location. However, rs9589207 was

finally excluded from our association analysis because we were unable to find the

mutant allele in our initial genotyping of the GRC-BC cohort.

Hence genotyping results of the 5 remaining SNPs obtained from both

populations are shown in Table 8-4. All control populations were in HWE (p > 0.05)

for all tested SNPs and allele and genotype frequencies closely matched those found

in Hapmap for Caucasian populations. As shown in Table 8-4, we were able to

validate a significant difference in allele frequencies between cases and controls for

rs4284505 in both the GRC-BC and GU-CCQ BB populations using chi-square

analysis (p = 0.01 and 0.03 respectively). Statistical analysis of differences between

allele frequencies in cases and controls in the GRC-BC population for rs7336610

also showed significance (p=0.03), but we were unable to obtain a similar finding in

our independent replication population (p=0.10) and all the other remaining SNPs

showed no significant differences between cases and controls in both populations.

We observed lower frequencies (38.3% and 35.9% for GRC-BC and GU-CCQ

cohorts respectively) of the A allele for rs4284505 in cases for both independent

populations (See Table 8-4) and we calculated an odds ratio of 0.70 (CI 95%: 0.52 –

0.93) for GRC-BC population and of 0.79 (CI 95%: 0.65 – 0.97) for GU-CCQ

population. These results indicate that the presence of the A allele in this locus seems

to have a protective effect on susceptibility to breast cancer.


Table 8-3 Chromosomal location and allele information for selected miRNA SNPs in the microRNA 17-92 cluster host gene

SNP Locus Wild Type allele Mutant allele Chromosomal position MAF rs1888138 13q31.3 A T chr13: 91999496 0.0916 rs4284505 13q31.3 G A chr13: 92001472 0.4867 rs9589207 13q31.3 G A chr13: 92003589 0.0211 rs7336610 13q31.3 T C chr13: 92005137 0.4524 rs17735387 13q31.3 G A chr13: 92006054 0.0829 rs1428 13q31.3 T G chr13: 92006770 0.4625

Table 8-4 Allele frequencies of SNPs in the microRNA 17-92 cluster host gene obtained from the GRC-BC and GU-CCQ BB cohorts

rs1888138 rs4284505 rs7336610 rs17735387 rs1428 Alleles Population A

(%) T

(%) p-

value G

(%) A

(%) p-

value T

(%) C

(%) p-

value G

(%) A

(%) p-

value T

(%) G

(%) p-

value

GRC-BC Population

Control 361 (96.5)

13 (3.5)

0.86 189 (53.1)

167 (46.9)

0.01 185 (52.6)

167 (47.4)

0.03 360 (96.3)

14 (3.7)

0.68 182 (56.2)

142 (43.8)

0.67

Cases 470 (96.3)

18 (3.7)

263 (61.7)

163 (38.3)

263 (60.0)

175 (40.0)

449 (96.8)

15 (3.2)

278 (57.7)

204 (42.3)

GU-CCQ BB

Population

Control 590 (95.8)

26 (4.2)

0.87 362 (58.8)

254 (41.2)

0.03 346 (56.2)

270 (43.8)

0.10 590 (95.8)

26 (4.2)

0.66 383 (62.8)

227 (37.2)

0.36

Cases 1295 (95.9)

55 (4.1)

767 (64.1)

429 (35.9)

712 (60.1)

472 (39.9)

1287 (96.2)

51 (3.8)

655 (60.5)

427 (39.5)

Hapmap (%) 95.9 4.1 57.9 42.1 55.7 44.3 98.2 1.8 65.6 34.4


Finally, chi square analysis of genotypes for all selected MIR17HG SNPs

between cases and controls showed significant differences for rs4284505 and

rs7336610 (p = 0.01 for both SNPs) in the GRC-BC population. Interestingly, we

were only able to obtain statistical significance in the analysis at the genotype level

for rs7336610 in our replication cohort (p = 0.04).

8.5.2 Haplotypic Analysis

Following the basic genotyping association analysis, we combined genotyping

results from both GRC-BC and GU-CCQ populations to conduct linkage

disequilibrium (LD) and haplotype block association analysis. As shown in Figure

8-1, two of our selected SNPs (rs4284505 and rs7336610) were part of a 3kb LD

block (Block 1) with a D’ score of 0.98.

Table 8-5 Haplotype block association analysis for selected SNPS in the

microRNA 17-92 cluster host gene in combined GRC-BC/GU-CCQ BB population

shows frequencies for the different haplotypes in block 1 found in cases and controls,

as well as results of haplotype association analysis to breast cancer using chi-square

analysis. According to our findings, cases had a lower frequency of haplotype AC

(36.4%) than controls (43.3%) and statistical analysis proved this haplotype to be

significantly associated with breast cancer risk (p = 5x10-4). Finally, we calculated an

odds ratio for haplotype AC of 0.75 with a 95% confidence interval of 0.60 to 0.94 (p

= 0.012). Results from our haplotype analysis validate our finding for the genotyping

analysis at the allele level in rs7336610. These results also suggest that for the

individuals that carry the AC haplotype, it appears to confer a protective effect on

risk of breast cancer in Caucasians reducing the risk of being affected by breast

cancer to about 25% in comparison to other haplotype carriers.


Figure 8-1 Linkage Disequilibrium (LD) blocks for analysed microRNA polymorphisms in the microRNA 17-92 cluster host gene.

Figure 8-1 shows the haplotype block formed by rs4284505 and rs7336610 found by linkage disequilibrium analysis using Haploview software version 4.2 (Daly Lab,

Broad Institute, Cambridge, MA, USA) for the SNPs in the microRNA 17-92 cluster host gene.


Table 8-5 Haplotype block association analysis for selected SNPS in the microRNA 17-92 cluster host

gene in combined GRC-BC/GU-CCQ BB population

GT (%) AC (%) GC (%) Control 269 (54.3) 214 (43.3) 12 (2.4) Cases 544 (58.9) 336 (36.4) 43 (4.7)

x2 5.05 12.21 6.58 p-value 0.02 5.0E-04 0.01

Global Statistics x2 = 9.21 p-value = 0.01

8.6 DISCUSSION

Breast cancer affects a significant number of women and mortality rates seem

to be increasing, despite substantial research and our current knowledge of the

disease (Ferlay et al., 2013; Stewart et al., 2014). MicroRNAs are involved in

molecular pathways leading to cell growth, differentiation and survival and there is

enough evidence to show they play a significant role in the mechanisms that lead to

development and progression of different types of cancer. Changes in expression of

the microRNA 17-92 cluster host gene, also known as oncomiR-1, were initially

described in haematopoietic and lung cancers where it was shown to play a role in

tumour growth and apoptosis (Hayashita et al., 2005; L. He et al., 2005; O'Donnell et

al., 2005; Ota et al., 2004; Ventura et al., 2008). Research has also highlighted

MIR17HG as an important regulator of genes involved in breast cancer pathways.

Studies by Hossain et al and Leivonen et al showed it plays a role in regulation of

oestrogen receptor α (ESR1) in different breast cancer cell lines and tumour tissues

through transcriptional downregulation of this receptor and genes that interact or co-

activate ESR1 (Hossain et al., 2006; Leivonen et al., 2009). Yu et al found the

microRNA 17-92 cluster host gene to have a tumour suppressor role through CCND1

inhibition in the MCF-7 cell line (Z. Yu et al., 2008). However, we currently lack

understanding of the specific features that result in altered biogenesis and functional

effects of this particular miRNA cluster gene in breast malignancies. Therefore the

study of single nucleotide polymorphisms located inside the MIR17HG cluster gene

using well defined breast cancer cohorts may help to assess whether they have a role

on the molecular events leading to the development of this disease.


Selection of 6 miRNA SNPs found 0.6 kb downstream and inside the

microRNA 17-92 cluster host gene region provided a thorough coverage of the gene

and we were able to successfully genotype all of these variants in the two available

cohorts, even rs9589207 located inside the MIR92A1 transcript. Unfortunately this

particular SNP was not included in our association analysis because we could not

identify the presence of the mutant allele in any of our populations. However two of

the five remaining SNPs, rs4248505 located about 1.4 kb downstream from MIR17A

and rs7336610 found about 1.4 kb upstream from MIR92A1, showed some

interesting results on our association analysis of breast cancer risk. Moreover, there

does not appear to be any previous genotyping studies on these variants in relation to

breast cancer or other diseases in Caucasian or any other ethnic populations.

Statistical analysis of our results for rs4284505 showed a significantly higher

presence of the A allele in healthy individuals for both GRC-BC and GU-CCQ BB

populations (p = 0.01 and 0.03 respectively) and a reduction in breast cancer risk of

up to 30%. We also found higher frequencies for the C allele of rs7336610 in

controls in our GRC-BC cohort but failed to establish a similar finding in our

replication population. Analysis of the haplotypes present in the LD block

rs4284505/rs7336610 showed results consistent to our previous finding in the

individual SNP analysis, with the presence of the AC haplotype found to have

significantly higher frequencies in controls (p = 5x10-4) decreasing the risk of

developing breast cancer by about 25%.

In conclusion, via analysis of SNPs from MIR17HG, we were able to identify

significant association between rs4248505 at the allele level and

rs4248505/rs7336610 at the haplotype level with susceptibility to breast cancer. To

the best of our knowledge this is the first study investigating SNPs in this cluster

gene in breast cancer populations and the first to determine that the presence of the

AC haplotype significantly affects the risk of developing breast cancer. However our

findings require further validation in larger populations and/or population of different

ethnicities, as well as functional studies to further define the role of this haplotype in

miRNA expression or molecular pathways leading to the development of breast

cancer.


Chapter 9: General Discussion

9.1 INTRODUCTION

Cancer affects a large number of individuals worldwide. Increase in incidence

and mortality rates is a concerning fact and it is a trend also observed in Australian

population (AIHW & AACR, 2010, 2012; Ferlay et al., 2010; Ferlay et al., 2013).

Breast cancer is the most common type of cancer in women and it ranks first as a

cancer-related cause of death worldwide and second in Australia (AIHW, 2009,

2012).

Breast cancer is a malignant neoplasm characterised by uncontrolled cellular

growth within breast tissue with the potential to invade distant tissues and organ

within the human body. It is a complex disease that results from innate and/or

acquired genetic predisposition from somatic mutations in breast tissues and their

interaction with hormonal exposure and other risk factors (Robbins et al., 2010;

Strachan & Read, 2011; Weinberg, 2007). Hereditary or familial forms of the disease

that can be primarily attributed to germline mutation are only a small fraction of the

total of cases currently diagnosed accounting for about 10% of all reported cases

(Anderson, 1991). On the other hand sporadic breast cancer requires very complex

signalling networks created by the interaction of many molecular pathways that have

been identified up to date. Therefore several genes and mediators may play a role in

the development of the sporadic disease, including input from the environment

(Robbins et al., 2010; Weinberg, 2007; Ben Zhang et al., 2011). Genes involved in

the inherited or familial types of the disease have helped our understanding of the

molecular mechanisms and have shown the molecular pathways involved in the

pathogenesis but we still lack knowledge on the detailed molecular mechanisms

involved in those pathways. Better understanding of the mediators and molecular

178 Chapter 9: General Discussion

regulator involved in those pathways will give us a better understanding of the

pathophysiology of the disease, as well as help in the development of diagnostic or

treatment approaches that would have a substantial impact for breast cancer patients.

9.2 BREAST CANCER AND MICRORNAS

MicroRNAs have been described as important epigenetic regulators of gene

expression present in all the molecular pathways involved in cell growth,

differentiation and survival. Therefore, they might be important for the pathogenesis

of breast cancer because they are known to regulate about 60% of the genes in the

human genome (Thalia A. Farazi et al., 2011; Robbins et al., 2010; Zuoren Yu et al.,

2010). Research studies aimed at a better understanding of the molecular

mechanisms and mediators involved in miRNA biogenesis and function in relation to

cancer are fundamental. Scientific research to elucidate the role of microRNAs in

cancer biology is increasing and differential patterns of expression for miRNAs have

been identified in many types of cancer including breast cancer (Calin & Croce,

2006; Iorio et al., 2005; J. Lu et al., 2005; Volinia et al., 2006). The molecular

mechanisms leading to miRNA expression deregulation that results in cancer

development are still not fully understood. However it has been recognised that

mutations present in the miRNA sequences could result in the changes in miRNA

synthesis and function seen in cancer biology.

Single nucleotide polymorphisms (SNPS) are the most common type of

variation present in miRNA genes (T. A. Farazi et al., 2011; H.-C. Lee et al., 2011;

Ryan et al., 2010). Therefore the study of miRNA SNPs (miR-SNPs) has the

potential to explain the mechanisms behind miRNA changes in biogenesis and

function. They can also be used as markers to identify cancer predisposition, predict

cancer behaviour and prognosis and as targets for cancer therapy (Budhu et al., 2010;

Heneghan et al., 2010; Roth et al., 2010; Q. Wu et al., 2011). There is not currently a

lot of evidence for involvement of miR-SNPs in cancer but it is a field of research

that seems to be growing (Gong et al., 2012; H.-C. Lee et al., 2011; Mishra &

Bertino, 2009). The lack of data arises primarily because only a small fraction of the

Chapter 9: General Discussion 179

total of miR-SNPs identified to date have been studied in relation to cancer and

breast cancer more specifically (Z. Hu et al., 2008; Z. Hu et al., 2009; Jazdzewski et

al., 2009; Jazdzewski et al., 2008; B. Xu et al., 2010; T. Xu et al., 2008; Yeqiong Xu

et al., 2013; Yamashita et al., 2013; Ye et al., 2008).

miR-SNPs located in MIR27A, MIR423, MIR146A, MIR196A2 and MIR499

have been associated with breast cancer, but findings are still somewhat controversial

(Alshatwi et al., 2012; Bansal et al., 2014; Catucci et al., 2012; Catucci et al., 2010;

Esteban Cardenosa et al., 2012; Gao et al., 2011; Amandine I. Garcia et al., 2011;

Hoffman et al., 2009; Jedlinski et al., 2011; Kontorovich et al., 2010; S. J. Lee et al.,

2014; Lian et al., 2012; Linhares et al., 2012; Pastrello et al., 2010; Shen et al., 2008;

R. A. Smith et al., 2012; P. Y. Wang et al., 2013; R. Yang & Burwinkel, 2012;

Rongxi Yang et al., 2010; M. Zhang et al., 2012; N. Zhang et al., 2013; Zhong et al.,

2013). Conflicting results in regards to miR-SNPs and breast cancer are probably the

result of limitations in terms of the populations examined, experimental design and

methods used for analysis. Therefore all of these need to be carefully considered and

addressed in the study of the remaining miR-SNPs that need to be analysed in

relation to breast cancer or to validate previous findings. More research is required to

address a larger number or miR-SNPs, in particular those related to genes previously

identified to be involved in breast cancer development and progression as this would

allow better understanding of the mechanisms involved and it might result in the

development of new and/or improved diagnostic and therapeutic approaches.

This study aimed to contribute to the understanding of microRNA

polymorphisms and their potential association with breast cancer risk as an initial

step in the identification of those miR-SNPs that may have a substantial role in the

biology of cancer. In order to successfully identify those miRNA variants that are

significantly associated with breast cancer we established some specific aims: i) the

establishment of a large sized and well characterised prospective DNA biobank to be

used as a replication cohort to the one previously available in our facility. ii) to

determine the most appropriate protocol to use in the establishment of our new breast

cancer case-control cohort in a time-efficient and cost-effective manner. iii) to


identify miR-SNPs associated with breast cancer risk in both Australian Caucasian

populations using genotype association analysis.

9.3 POPULATION RESOURCES DEVELOPMENT

We aimed to establish a significant sized and well-characterised prospective

genomic DNA (gDNA) biobank extracted from whole blood samples obtained from

breast cancer patients in Queensland recruited in collaboration with the Cancer

Council Queensland. A well-characterised biobank would contain comprehensive

information from volunteers regarding ethnic origin, age, sex, biometric data and

complete clinical information to allow selection of candidates with very specific

phenotypes to be included in genetic studies in relation to a condition, which would

minimise bias from experimental findings. This goal was achieved through the very

thorough process of recruitment and follow-up described in Section 3.1.2, resulting

in a very comprehensive database containing the information required for appropriate

characterisation of volunteers. However, for this study, we were limited in our access

to some of the information, due to the fact that both the biobank and the database are

still under development and data is still being collated and finalised. This limited

information in turn prevented us from including important features like type of breast

cancer, stage at diagnosis and characteristics of the tumours into our analysis and

results. This could have an impact on our findings that is yet to be determined.

Therefore, once such information is available it could be included as part of the

future work needed to further validate and interpret our findings.

About 3000 samples were expected to be collected during the 5 year period of

patient recruitment and 929 breast cancer patients had been recruited by this biobank

since January 2010. We were able to successfully perform DNA extraction of all the

929 samples collected to date with the resources available to the study. This was the

result of careful planning and assessment of the DNA extraction technique that we

used which allowed us to progress the construction of the population most

efficiently. As previously mentioned in Section 3.2.3, we were also able to establish

significant sized populations because both GRC-BC and CCQ-GU BB populations


has a sufficient size to achieve ~80% or greater power to detect an allelic association

at the 0.05 level.

9.3.1 DNA extraction methods

In order to determine the best possible method to perform DNA extraction

from the ones available at our facility, we decided to review current knowledge of

genomic DNA extraction from whole blood methods to determine the most suitable

one to successfully achieve the establishment of our biobank population. Chapter

Five reviews the history of DNA extraction techniques and the main categories of the

methods used in DNA extraction from whole blood samples most commonly used in

facilities around the world. Finally, we reviewed available literature on evaluation

studies previously published comparing different extraction techniques to highlight

their strengths and weaknesses. This review evidenced the limited scientific

evidence available in regards to the best choice for gDNA extraction, since to the

best of our knowledge there is no publication that comprehensively evaluates all

approaches in terms of all possible features. It also highlighted the need to conduct

our own validation study to determine the most appropriate technique to be used

from the ones currently available at our facility to determine the one to be used in the

establishment of our new case control cohort.

9.3.2 Comparison of DNA Extraction methods

Chapter Four presents our in-house evaluation study conducted to determine

the most appropriate method in terms of cost effectiveness and time efficiency to

perform genomic DNA (gDNA) extraction from whole blood samples. We

conducted a study that compared the modified salting-out protocol described by

Nasiri et al (Nasiri et al., 2005), against a traditional salting out method (Miller et al.,

1988) and a commercially available DNA extraction kit. We performed gDNA

extraction in eleven whole blood samples obtained from breast cancer patients and

evaluated the performance of each method in terms of the quality and quantity of


DNA obtained, as well as time requirements and costs for each technique. Our

results showed that any of these three techniques perform similarly in terms of

quantity and quality of final DNA product, but it highlighted the modified salting out

method to be the most cost-effective and time-efficient technique to isolate gDNA

from whole blood samples achieving similar results to the ones obtained with the

commercial DNA extraction kit. Unfortunately, costs associated with the collection,

handling and storage of the sample were not taken into account. However they are

important because this could potentially impact on the integrity of the initial sample,

as well as the performance and success of any DNA extraction technique chosen.

They are also a significant factor to consider, since it is very important to carefully

obtain and preserve initial samples, especially for future extraction of DNA samples

from those which are exhausted from the biobank.

Some of the limitations of this study included the small number of samples

used to test our chosen DNA extraction methods and we also failed to include a more

comprehensive range of techniques like magnetic bead-based extraction methods or

automated method unavailable at our facility. Spectrophotometry can potentially

overestimate the amount of double stranded DNA resulting from the chosen

extraction method up to 35% compared to other methods such as PicoGreen, because

there are different amounts of double and single stranded DNA and RNA remaining

as a result of differences in chemistry between different methods. Moreover, using

qPCR to evidence differences in quantification cycle (Cq) would potentially improve

our data in regards to the quality of DNA yield.

Nonetheless, the modified salting out method has consistently proved to be an

appropriate technique based on the quality and quantity of DNA obtained from blood

samples extracted for the GU-CCQ biobank population. DNA samples extracted with

this technique have also showed to perform well in downstream application, as

evidenced by the genotyping results obtained in the genotyping analysis included in

this study, as well as other studies performed at our facility not included here because

they escaped the scope of this manuscript.


9.4 ASSOCIATION STUDIES

9.4.1 MIR146A results

Chapter Six was our pilot genotyping study to determine whether our newly

established population extracted with the modified salting out technique performed

well in downstream applications. It also provided us with the opportunity to validate

previous findings in relation to a miR-SNP that has shown conflicting results in

studies previously conducted by other research groups. We selected the miR-SNP

rs2910164 present in MIR146A because it had not been properly analysed in purely

sporadic or purely Caucasian breast cancer populations to validate previous findings.

We genotyped this variant using high resolution melt (HRM) analysis and were able

to determine that presence of the mutant allele in this locus to be significantly

associated with breast cancer risk in both our primary and replicate populations. One

limitation of this study was the small number of samples included in our replication

population (GU-CCQ BB population), as this was still under development. However

we were able to successfully conduct our assay and to achieve statistically significant

results. Our study showed that the presence of the mutant allele in rs2910164 was

significantly associated with sporadic breast cancer risk in both of our cohorts (p =

0.03 and p = 0.00013). To the best of our knowledge this is the first study to identify

an association between this variant and breast cancer risk in a purely sporadic breast

cancer bearing Caucasian population.

9.4.2 miRNA selection and analysis

Chapter Seven presents the results of a large scale genotyping study using both

of our previously established cohorts. In this study, we developed a selection

algorithm to choose miR-SNPs to be included shown in Figure 7-1 of this chapter.

This allowed the selection of a large panel of miR-SNPs (24 miR-SNPs in total

present in 9 miRNA genes) to be evaluated using multiplex PCR and chip-based

matrix assisted laser desorption ionization time-of-flight (MALDI-TOF) mass


spectrometry (MS) analysis. Results from our study showed 6 of the miR-SNPs

(rs73798217, rs112394324, rs35301225, rs1547354, rs7050391 and rs3851812) not

to be polymorphic in our populations. As mentioned in Section 2.5.1.1, there are

around 10 million SNPs in the human genome and only about 56% of SNPs have

been successfully typed. Many of these SNPs have been tested using small

populations and therefore it is very likely that a large number of those will turn out to

be non-polymorphic, or have altered minor allele frequencies when genotyping larger

cohorts, or cohorts from different ethnicities. Therefore, it is not surprising that up to

21% of our selected miR-SNPs were shown not to be polymorphic. Our results

provide initial validation of the rarity of these variants in Caucasian populations.

From the remaining polymorphisms analysed, we were able to show significant

association to breast cancer risk for one of our selected variants. We determined that

miR-SNP rs353291, occurring in relation to MIR145 had significant differences in

allele frequencies between cases and controls in both populations (p = 0.041 and p =

0.023). To the best of our knowledge this is the first report of an association to breast

cancer risk for this variant in Caucasian populations. Although we provided initial

validation of our findings using two independent populations, further validation of

this results in larger cohorts or populations of both similar and different ethnic

background is required to strengthen their significance. If we take into account

multiple testing correction of our results, however, our findings are no longer

significant, yet still interesting and worthy of further research based on the similarity

of significance found in both populations. Functional studies to help refine the likely

mechanism by which the miR-SNP may interact with breast cancer influencing

pathways may also help in properly identifying populations who may be affected by

it.

9.4.3 MicroRNA 17-92 cluster host gene analysis

Chapter Eight presents the results of our genotype and haplotype analysis of 6

miR-SNPs present in the microRNA 17-92 cluster host gene (MIR17HG) conducted

in our two Australian Caucasian breast cancer cohorts. We decided to study


polymorphisms in MIR17HG because it has been previously shown in the literature

to be associated with cancer development. It was the first cluster host gene to be

found to play a role in cancer biology for hematopoietic malignancies and other types

of cancer (Hayashita et al., 2005; L. He et al., 2005; Mogilyansky & Rigoutsos,

2013; O'Donnell et al., 2005; Olive et al., 2010). It has also been linked to the

pathogenesis of breast cancer. It has been shown to target important genes in breast

biology, such as NCOA3, CCDN1 and ESR1 (Hossain et al., 2006; Leivonen et al.,

2009; Z. Yu et al., 2008). The molecular mechanisms that lead to deregulation of

MIR17HG may be influenced by SNPs and to the best of our knowledge there are no

previous studies evaluating the role of miR-SNPs for this particular cluster host gene,

making it an ideal candidate for study. We selected 6 miR-SNPs located in the

cluster gene region or in close proximity to it providing good coverage for this

miRNA cluster, and they included a polymorphism located inside the mature

sequence for MIR92A1 (rs9589207), a variant almost certain to be functional in some

respect. We genotyped these SNPs using multiplex PCR and chip-based MALDI-

TOF MS analysis in our two independent Australian Caucasian breast cancer case-

control cohorts. Genotypic analysis of our results showed significant differences

between cases and control for miR-SNP rs4284505 in both of our populations (p =

0.01 and p = 0.03) and for rs7336610 only in our initial population (p = 0.03). We

then performed haplotypic analysis on the combined results from both populations

and we identified these two miR-SNPs (rs4284505/rs7336610) to be in linkage

disequilibrium and part of a haplotype block (D’ score = 0.98). Chi-square analysis

of the haplotypes contained in that block showed a lower frequency for haplotype

AC in cases compared to controls and our analysis showed it to be significantly

associated with breast cancer (p = 5x10-4). Odds ratio for this haplotype was

calculated to be 0.75 with a 95% confidence interval of 0.60 to 0.94 (p = 0.012)

suggesting that individuals that carry the AC haplotype had a reduced risk of

developing breast cancer (about 25%) in comparison to other haplotype carriers. To

the best of our knowledge this is the first report of a significant association between

these polymorphisms and breast cancer risk in Caucasian populations.


9.4.4 Comparison to GWAS results and miRNA functional implications.

As mentioned before and to the best of our knowledge the genotype association

studies included in this thesis are the first to report an association between the risk of

sporadic breast cancer in purely Caucasian populations and the presence of mutation

in these loci (MIR146A (rs2910164), MIR145 (rs353291) and MIR17HG

(rs4284505)). These miR-SNPs are located in intergenic regions at a large distance

from any implicated genomic regions previously identified in GWAS for breast

cancer (Easton et al., 2007; Fletcher et al., 2011; Hunter et al., 2007; Khan et al.,

2014; Li et al., 2011; Michailidou et al., 2013; Stacey et al., 2007; G. Thomas et al.,

2009; Turnbull et al., 2010) with none of these GWAS implicating regions near these

miRNA genes.

Our findings are supported by the functional implications of these miRNA

genes for breast cancer development and progression previously identified by other

studies. For instance, MIR146A has been found to downregulate expression of

BRCA1 and BRCA2 genes (A. I. Garcia et al., 2011; Shen et al., 2008) and to inhibit

breast cancer cell invasion and metastasis (Bhaumik et al., 2008; Hurst et al., 2009).

Moreover, some of the predicted targets for MIR146A include IRAK1, CARD10,

MMP16, MNAT1, GTF2H4 and TRAF6 which are found in diverse cancer related

pathways (Z. Hu et al., 2009). MIR145 has been found to play a tumour suppressor

role in breast cancer through activation of TP53 and reduced expression of ESR1

which reduces proliferation and induces apoptosis of breast cancer cells (Spizzo et

al., 2010). Furthermore, it has been found to reduce expression of RTKN which

confers breast cancer cell resistance to apoptosis (S. Wang et al., 2009). Finally

MIR17HG has been also described to have a tumour suppressor role through

regulation of expression of important genes for breast cancer development including

ESR1 and CCND1 (Hossain et al., 2006; Leivonen et al., 2009; Z. Yu et al., 2008).


9.5 LIMITATIONS AND FUTURE DIRECTIONS.

There are some limitations to the studies presented in this thesis and they

include: Size and ethnicity of the study populations, failing to include other variables

like age and type of breast cancer in the analysis and lack of studies to establish the

functional effects of the miR-SNPs positively associated to breast cancer risk.

We acknowledge that the size of the populations used in some of our studies

may limit their power to detect association for the rare variants that were included.

We may also be able to strengthen the significance of our findings by testing the

miR-SNPs found to be positively associated with breast cancer risk in larger

populations. This would be particularly important for those association studies were

multiple testing was performed. Another alternative to increase the power for our

populations would be to increase the number of individuals present in the control

populations to obtain a case-control ratio of 1:4 (Hong & Park, 2012; Schork, 2002).

However, some of these populations are still recruiting volunteers for both cases and

controls which could address this issue of population size for future studies.

Our populations only included women of Caucasian (Northern European)

origin due to the importance of using well-characterised population to conduct

genetic association studies but this also limits the strength of our findings. Therefore,

it would be important to test our findings using population of other ethnic

backgrounds as this changes the distribution of genotypes and alleles for miR-SNPs.

Findings in regard to other ethnicities may also indicate the role these miR-SNPs

play in the development of the disease, therefore they should be the focus of future

genotyping studies.

The studies included in this thesis focused on the association between the

presence of mutation in miRNAs and the risk of developing breast cancer. However

it would be also important to determine their association to some other important

characteristics like type of breast cancer, age of onset, response to treatment options


and recurrence of cancer. Unfortunately, information on these features was

unavailable to us because some of the cohorts were still under development but they

could be the focus of future studies as they could provide information on the role

they may play in the development and progression of breast cancer.

Finally it is important to conduct functional studies using cell culture and/or

animal models to establish the effects of these SNPs in the synthesis and expression

of the miRNA genes where they are present, as well as their effect on the genes these

miRNAs target and the molecular pathways where they are present. This is a

fundamental step towards the translation of these findings into diagnostic or

therapeutic approaches that could impact prognosis of breast cancer sufferers.

9.6 CONCLUSIONS

This study has implicated a number of miR-SNPs in breast cancer. More

specifically, we were able to identify significant association with the presence of

mutation and breast cancer risk for three miR-SNPs and one haplotype (rs2910164 in

MIR146A, rs353291 in MIR145, rs4284505 in MIR17HG and rs4284505/rs7336610

in MIR17HG) in both of our tested populations.

Results presented in Chapters Six to Eight of this document are only the initial

step required to determine whether the studied microRNA SNPs have an effect in

breast cancer development. Further research that includes functional assays will be

required for these findings to be able to translate into strategies that impact breast

cancer diagnosis, treatment and prognosis. The next immediate step would be

validation of our positive findings through genotyping of larger populations or

populations of different ethnic backgrounds to ascertain our findings. Following this,

functional validation using cell culture or animal models would allow a better

understanding of the detailed mechanisms behind their involvement in breast cancer

biology, which is also fundamental to proceed with future translational studies.


Bibliography

Abd El-Aal, A. A., Abd Elghany, N. A., Mohamadin, A. M., & El- Badry, A. A. (2010). Comparative study of five methods for DNA extraction from whole blood samples. International Journal Of Health Science, III(1), 285-287.

Abecasis, G. R., Auton, A., Brooks, L. D., DePristo, M. A., Durbin, R. M., Handsaker, R. E., . . . McVean, G. A. (2012). An integrated map of genetic variation from 1,092 human genomes. Nature, 491(7422), 56-65. doi: 10.1038/nature11632

Abeloff, M. D., Armitage, J. O., & Niederhuber, J. E. (2008). Abeloff's clinical oncology: Churchill Livingstone/Elsevier.

ABS. (2007). 3101.0 - Australian Demographic Statistics, Dec 2007. Retrieved 10/09/2015, from http://www.abs.gov.au/AUSSTATS/[email protected]/allprimarymainfeatures/04846512597875E2CA2574CD001239FE?opendocument

ABS. (2011). 3101.0 - Australian Demographic Statistics, Dec 2011. Retrieved 10/09/2015, from http://www.abs.gov.au/ausstats/[email protected]/lookup/3101.0Media%20Release1Dec%202011

Adkins, K. K., Strom, D. A., Jacobson, T. E., Seemann, C. R., & et al. (2002). Utilizing genomic DNA purified from clotted blood samples for single nucleotide polymorphism genotyping. Archives of Pathology and Laboratory Medicine, 126(3), 266-270.

AIHW. (2009). Breast cancer in Australia: an overview, 2009 Cat. no. CAN 46. Canberra: AIWH.

AIHW. (2012). Breast cancer in Australia: an overview, 2012 Cancer series no. 71. Cat. no. CAN 67. Canberra: AIHW.

AIHW, & AACR. (2010). Cancer in Australia: an overview, 2010 Cancer series no. 60. Cat. no. CAN 56. Canberra: AIHW.

AIHW, & AACR. (2012). Cancer in Australia: an overview, 2012 Cancer Series no. 74. Cat. no. CAN 70. Canberra: AIWH.

Alshatwi, A. A., Shafi, G., Hasan, T. N., Syed, N. A., Al-Hazzani, A. A., Alsaif, M. A., & Alsaif, A. A. (2012). Differential Expression Profile and Genetic Variants of MicroRNAs Sequences in Breast Cancer Patients. PloS one, 7(2), e30049. doi: 10.1371/journal.pone.0030049

Anderson, D. E. (1991). Familial versus sporadic breast cancer. Cancer, 70(S4), 1740-7146.

190 0 Bibliography

http://www.abs.gov.au/AUSSTATS/[email protected]/allprimarymainfeatures/04846512597875E2CA2574CD001239FE?opendocument

http://www.abs.gov.au/AUSSTATS/[email protected]/allprimarymainfeatures/04846512597875E2CA2574CD001239FE?opendocument

http://www.abs.gov.au/ausstats/[email protected]/lookup/3101.0Media%20Release1Dec%202011

http://www.abs.gov.au/ausstats/[email protected]/lookup/3101.0Media%20Release1Dec%202011

Arévalo-Herrera, M., Soto, L., Perlaza, B. L., Céspedes, N., Vera, O., Lenis, A. M., . . . Herrera, S. (2011). Antibody-Mediated and Cellular Immune Responses Induced in Naive Volunteers by Vaccination with Long Synthetic Peptides Derived from the Plasmodium vivax Circumsporozoite Protein. The American Journal of Tropical Medicine and Hygiene, 84(2 Suppl), 35-42. doi: 10.4269/ajtmh.2011.09-0507

Arndt, G., Dossey, L., Cullen, L., Lai, A., Druker, R., Eisbacher, M., . . . Raponi, M. (2009). Characterization of global microRNA expression reveals oncogenic potential of miR-145 in metastatic colorectal cancer. BMC Cancer, 9(1), 374.

Australia, C. (2011). Breast Cancer Statistics. Retrieved 06 August 2013, from NBOCC http://canceraustralia.nbocc.org.au/breast-cancer/about-breast-cancer/breast-cancer-statistics

Bahl, A., & Pfenninger, M. (1996). A rapid method of DNA isolation using laundry detergent. Nucleic Acids Research, 24(8), 1587-1588.

Bansal, C., Sharma, K. L., Misra, S., Srivastava, A. N., Mittal, B., & Singh, U. S. (2014). Common genetic variants in pre-microRNAs and risk of breast cancer in the North Indian population. Ecancermedicalscience, 8, 473. doi: 10.3332/ecancer.2014.473

Barrett, J. C., Fry, B., Maller, J., & Daly, M. J. (2005). Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics (Oxford, England), 21(2), 263-265. doi: 10.1093/bioinformatics/bth457

Bartel, D. P. (2004). MicroRNAs: genomics, biogenesis, mechanism, and function. Cell, 116(2), 281-297.

Bartel, D. P. (2009). MicroRNAs: Target Recognition and Regulatory Functions. Cell, 136(2), 215-233.

Bhaumik, D., Scott, G. K., Schokrpur, S., Patil, C. K., Campisi, J., & Benz, C. C. (2008). Expression of microRNA-146 suppresses NF-kappaB activity with reduction of metastatic potential in breast cancer cells. Oncogene, 27(42), 5643-5647. doi: 10.1038/onc.2008.171

Blenkiron, C., Goldstein, L., Thorne, N., Spiteri, I., Chin, S.-F., Dunning, M., . . . Miska, E. (2007). MicroRNA expression profiling of human breast cancer identifies new markers of tumor subtype. Genome Biology, 8(10), R214.

Bloom, H. J., & Richardson, W. W. (1957). Histological grading and prognosis in breast cancer; a study of 1409 cases of which 359 have been followed for 15 years. British Journal of Cancer, 11(3), 359-377.

Boom, R., Sol, C. J., Salimans, M. M., Jansen, C. L., Wertheim-van Dillen, P. M., & van der Noordaa, J. (1990). Rapid and simple method for purification of nucleic acids. Journal of Clinical Microbiology, 28(3), 495-503.

Brown, T. A. (2007). Genomes 3 (3rd ed.). New York: Garland Science Pub.

Budhu, A., Ji, J., & Wang, X. (2010). The clinical potential of microRNAs. Journal of Hematology & Oncology, 3(1), 37.

0 Bibliography 191

http://canceraustralia.nbocc.org.au/breast-cancer/about-breast-cancer/breast-cancer-statistics

http://canceraustralia.nbocc.org.au/breast-cancer/about-breast-cancer/breast-cancer-statistics

Buetow, K. H., Edmonson, M., MacDonald, R., Clifford, R., Yip, P., Kelley, J., . . . Braun, A. (2001). High-throughput development and characterization of a genomewide collection of gene-based single nucleotide polymorphism markers by chip-based matrix-assisted laser desorption/ionization time-of-flight mass spectrometry. Proceedings of the National Academy of Sciences of the United States of America, 98(2), 581-584. doi: 10.1073/pnas.021506298

Buffa, F. M., Camps, C., Winchester, L., Snell, C. E., Gee, H. E., Sheldon, H., . . . Ragoussis, J. (2011). microRNA-Associated Progression Pathways and Potential Therapeutic Targets Identified by Integrated mRNA and microRNA Expression Profiling in Breast Cancer. Cancer Research, 71(17), 5635-5645. doi: 10.1158/0008-5472.can-11-0489

Butler, J. M. (2009). DNA Extraction from Forensic Samples Using Chelex. Cold Spring Harbor Protocols, 4(6). doi: 10.1101/pdb.prot5229

Calin, G. A., & Croce, C. M. (2006). MicroRNA signatures in human cancers. Nature Reviews Cancer, 6(11), 857-866. doi: 10.1038/nrc1997

Calin, G. A., Dumitru, C. D., Shimizu, M., Bichi, R., Zupo, S., Noch, E., . . . Croce, C. M. (2002). Frequent deletions and down-regulation of micro- RNA genes miR15 and miR16 at 13q14 in chronic lymphocytic leukemia. Proceedings of the National Academy of Sciences of the United States of America, 99(24), 15524-15529. doi: 10.1073/pnas.242606799

Calin, G. A., Liu, C.-G., Sevignani, C., Ferracin, M., Felli, N., Dumitru, C. D., . . . Croce, C. M. (2004). MicroRNA profiling reveals distinct signatures in B cell chronic lymphocytic leukemias. Proceedings of the National Academy of Sciences of the United States of America, 101(32), 11755-11760. doi: 10.1073/pnas.0404432101

Calin, G. A., Sevignani, C., Dumitru, C. D., Hyslop, T., Noch, E., Yendamuri, S., . . . Croce, C. M. (2004). Human microRNA genes are frequently located at fragile sites and genomic regions involved in cancers. Proceedings of the National Academy of Sciences of the United States of America, 101(9), 2999-3004. doi: 10.1073/pnas.0307323101

Cambon-Thomsen, A. (2003). Assessing the impact of biobanks. Nature Genetics, 34(1), 25-26.

Cambon-Thomsen, A., Ducournau, P., Gourraud, P.-A., & Pontille, D. (2003). Biobanks for genomics and genomics for biobanks. Comparative and functional genomics, 4(6), 628-634.

Carpi, F. M., Di Pietro, F., Vincenzetti, S., Mignini, F., & Napolioni, V. (2011). Human DNA extraction methods: patents and applications. Recent patents on DNA & gene sequences, 5(1), 1-7.

Carter, R. F. (2001). BRCA1, BRCA2 and breast cancer: a concise clinical review. Clinical and Investigative Medicine. Medecine Clinique et Experimentale, 24(3), 147-157.

Catucci, I., Verderio, P., Pizzamiglio, S., Bernard, L., Dall'olio, V., Sardella, D., . . . Peterlongo, P. (2012). The SNP rs895819 in miR-27a is not associated with

192 0 Bibliography

familial breast cancer risk in Italians. Breast Cancer Research and Treatment, 133(2), 805-807. doi: 10.1007/s10549-012-2011-y

Catucci, I., Yang, R., Verderio, P., Pizzamiglio, S., Heesen, L., Hemminki, K., . . . Peterlongo, P. (2010). Evaluation of SNPs in miR-146a, miR196a2 and miR-499 as low-penetrance alleles in German and Italian familial breast cancer cases. Human Mutation, 31(1), E1052-1057. doi: 10.1002/humu.21141

Chacon-Cortes, D., & Griffiths, L. R. (2014). Methods for extracting genomic DNA from whole blood samples: current perspectives. Journal of Biorepository Science for Applied Medicine, 2, 1-9.

Chacon-Cortes, D., Haupt, L. M., Lea, R. A., & Griffiths, L. R. (2012). Comparison of genomic DNA extraction techniques from whole blood samples: a time, cost and quality evaluation study. Molecular Biology Reports, 39(5), 5961-5966. doi: 10.1007/s11033-011-1408-8

Chae, Y. S., Kim, J. G., Lee, S. J., Kang, B. W., Lee, Y. J., Park, J. Y., . . . Choi, G. S. (2013). A miR-146a Polymorphism (rs2910164) Predicts Risk of and Survival from Colorectal Cancer. Anticancer Research, 33(8), 3233-3239.

Chan, J. A., Krichevsky, A. M., & Kosik, K. S. (2005). MicroRNA-21 is an antiapoptotic factor in human glioblastoma cells. Cancer Research, 65(14), 6029-6033. doi: 10.1158/0008-5472.CAN-05-0137

Chen, P., Zhang, J., & Zhou, F. (2012). miR-499 rs3746444 polymorphism is associated with cancer development among Asians and related to breast cancer susceptibility. Molecular Biology Reports, 39(12), 10433-10438. doi: 10.1007/s11033-012-1922-3

Chen, X., Ba, Y., Ma, L., Cai, X., Yin, Y., Wang, K., . . . Zhang, C. Y. (2008). Characterization of microRNAs in serum: a novel class of biomarkers for diagnosis of cancer and other diseases. Cell Research, 18(10), 997-1006. doi: 10.1038/cr.2008.282

Chendrimada, T. P., Gregory, R. I., Kumaraswamy, E., Norman, J., Cooch, N., Nishikura, K., & Shiekhattar, R. (2005). TRBP recruits the Dicer complex to Ago2 for microRNA processing and gene silencing. Nature, 436(7051), 740-744. doi: 10.1038/nature03868

Chirgwin, J. M., Przybyla, A. E., MacDonald, R. J., & Rutter, W. J. (1979). Isolation of biologically active ribonucleic acid from sources enriched in ribonuclease. Biochemistry, 18(24), 5294-5299. doi: 10.1021/bi00591a005

Chiyomaru, T., Enokida, H., Tatarano, S., Kawahara, K., Uchida, Y., Nishiyama, K., . . . Nakagawa, M. (2010). miR-145 and miR-133a function as tumour suppressors and directly regulate FSCN1 expression in bladder cancer. British Journal of Cancer, 102(5), 883-891.

Cho, W. C. S., Chow, A. S. C., & Au, J. S. K. (2009). Restoration of tumour suppressor hsa-miR-145 inhibits cancer cell growth in lung adenocarcinoma patients with epidermal growth factor receptor mutation. European Journal of Cancer, 45(12), 2197-2206.

0 Bibliography 193

Cho, W. C. S., Chow, A. S. C., & Au, J. S. K. (2011). MiR-145 inhibits cell proliferation of human lung adenocarcinoma by targeting EGFR and NUDT1. RNA Biology, 8(1), 125-131.

Chomczynski, P., & Sacchi, N. (1987). Single-step method of RNA isolation by acid guanidinium thiocyanate-phenol-chloroform extraction. Analytical Biochemistry, 162(1), 156-159.

Chomczynski, P., & Sacchi, N. (2006). The single-step method of RNA isolation by acid guanidinium thiocyanate-phenol-chloroform extraction: twenty-something years on. Nature Protocols, 1(2), 581-585.

Ciafre, S. A., Galardi, S., Mangiola, A., Ferracin, M., Liu, C. G., Sabatino, G., . . . Farace, M. G. (2005). Extensive modulation of a set of microRNAs in primary glioblastoma. Biochemical and Biophysical Research Communications, 334(4), 1351-1358. doi: 10.1016/j.bbrc.2005.07.030

Claus, E. B., Schildkraut, J. M., Thompson, W. D., & Risch, N. J. (1996). The genetic attributable risk of breast and ovarian cancer. Cancer, 77(11), 2318-2324. doi: 10.1002/(sici)1097-0142(19960601)77:11<2318::aid-cncr21>3.0.co;2-z

Consortium, G. P. (2012). An integrated map of genetic variation from 1,092 human genomes. Nature, 491(7422), 56-65.

Copstead, L.-E. C., & Banasik, J. L. (2010). Pathophysiology. St. Louis, Mo.: Saunders Elsevier.

Cummins, J. M., He, Y., Leary, R. J., Pagliarini, R., Diaz, L. A., Jr., Sjoblom, T., . . . Velculescu, V. E. (2006). The colorectal microRNAome. Proceedings of the National Academy of Sciences of the United States of America, 103(10), 3687-3692. doi: 10.1073/pnas.0511155103

Czyzyk-Krzeska, M. F., & Zhang, X. (2014). MiR-155 at the heart of oncogenic pathways. Oncogene, 33(6), 677-678. doi: 10.1038/onc.2013.26

Dahm, R. (2005). Friedrich Miescher and the discovery of DNA. Developmental Biology, 278(2), 274-288.

De Vivo, I., Huggins, G. S., Hankinson, S. E., Lescault, P. J., Boezen, M., Colditz, G. A., & Hunter, D. J. (2002). A functional polymorphism in the promoter of the progesterone receptor gene associated with endometrial cancer risk. Proceedings of the National Academy of Sciences of the United States of America, 99(19), 12263-12268. doi: 10.1073/pnas.192172299

Di Pietro, F., Ortenzi, F., Tilio, M., Concetti, F., & Napolioni, V. (2011). Genomic DNA extraction from whole blood stored from 15- to 30-years at −20 °C by rapid phenol–chloroform protocol: A useful tool for genetic epidemiology studies. Molecular and Cellular Probes, 25(1), 44-48.

Drabek, J., & Petrek, M. (2002). A sugar, laundry detergent, and salt method for extraction of deoxyribonucleic acid from blood. Biomedical papers of the Medical Faculty of the University Palacky, Olomouc, Czechoslovakia, 146(2), 37-39.

194 0 Bibliography

Duan, R., Pak, C., & Jin, P. (2007). Single nucleotide polymorphism associated with mature miR-125a alters the processing of pri-miRNA. Human Molecular Genetics, 16(9), 1124-1131. doi: 10.1093/hmg/ddm062

Duan, S., Mi, S., Zhang, W., & Dolan, M. E. (2009). Comprehensive analysis of the impact of SNPs and CNVs on human microRNAs and their regulatory genes. RNA Biology, 6(4), 412-425.

Duarte, G. R., Price, C. W., Littlewood, J. L., Haverstick, D. M., Ferrance, J. P., Carrilho, E., & Landers, J. P. (2010). Characterization of dynamic solid phase DNA extraction from blood with magnetically controlled silica beads. Analyst, 135(3), 531-537. doi: 10.1039/b918996c

Dumitrescu, R. G., & Cotarla, I. (2005). Understanding breast cancer risk - where do we stand in 2005? Journal of Cellular and Molecular Medicine, 9(1), 208-221. doi: 10.1111/j.1582-4934.2005.tb00350.x

Dutt, A., & Beroukhim, R. (2007). Single nucleotide polymorphism array analysis of cancer. Current Opinion in Oncology, 19(1), 43-49 10.1097/CCO.1090b1013e328011a328018c328011.

Easton, D. F., Pooley, K. A., Dunning, A. M., Pharoah, P. D. P., Thompson, D., Ballinger, D. G., . . . Ponder, B. A. J. (2007). Genome-wide association study identifies novel breast cancer susceptibility loci. Nature, 447(7148), 1087-1093.

Einarsdottir, K., Rosenberg, L., Humphreys, K., Bonnard, C., Palmgren, J., Li, Y., . . . Wedren, S. (2006). Comprehensive analysis of the ATM, CHEK2 and ERBB2 genes in relation to breast tumour characteristics and survival: a population-based case-control and follow-up study. Breast Cancer Research, 8(6), R67.

Eis, P. S., Tam, W., Sun, L., Chadburn, A., Li, Z., Gomez, M. F., . . . Dahlberg, J. E. (2005). Accumulation of miR-155 and BIC RNA in human B cell lymphomas. Proceedings of the National Academy of Sciences of the United States of America, 102(10), 3627-3632. doi: 10.1073/pnas.0500613102

Ellis, H. (2010). Anatomy of the breast. 28(3), 114-116.

Esquela-Kerscher, A., & Slack, F. J. (2006). Oncomirs -- microRNAs with a role in cancer. Nature Reviews Cancer, 6(4), 259-269.

Esteban Cardenosa, E., de Juan Jimenez, I., Palanca Suela, S., Chirivella Gonzalez, I., Segura Huerta, A., Santaballa Beltran, A., . . . Bolufer Gilabert, P. (2012). Low penetrance alleles as risk modifiers in familial and sporadic breast cancer. Familial Cancer, 11(4), 629-636. doi: 10.1007/s10689-012-9563-1

Falzoi, M., Mossa, A., Congeddu, E., Saba, L., & Pani, L. (2010). Multiplex genotyping of CYP3A4, CYP3A5, CYP2C9 and CYP2C19 SNPs using MALDI-TOF mass spectrometry. Pharmacogenomics, 11(4), 559-571. doi: 10.2217/pgs.09.172

Farazi, T. A., Horlings, H. M., Ten Hoeve, J. J., Mihailovic, A., Halfwerk, H., Morozov, P., . . . Tuschl, T. (2011). MicroRNA sequence and expression analysis in breast tumors by deep sequencing. Cancer Research, 71(13), 4443-4453. doi: 10.1158/0008-5472.CAN-11-0608

0 Bibliography 195

Farazi, T. A., Spitzer, J. I., Morozov, P., & Tuschl, T. (2011). miRNAs in human cancer. The Journal of Pathology, 223(2), 102-115.

Faul, F., Erdfelder, E., Buchner, A., & Lang, A. G. (2009). Statistical power analyses using G*Power 3.1: tests for correlation and regression analyses. Behavior Research Methods, 41(4), 1149-1160. doi: 10.3758/BRM.41.4.1149

Faul, F., Erdfelder, E., Lang, A. G., & Buchner, A. (2007). G*Power 3: a flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behavior Research Methods, 39(2), 175-191.

Ferlay, J., Shin, H.-R., Bray, F., Forman, D., Mathers, C., & Parkin, D. M. (2010). Estimates of worldwide burden of cancer in 2008: GLOBOCAN 2008. International Journal of Cancer, 127(12), 2893-2917.

Ferlay, J., Soerjomataram, I., Ervik, M., Dikshit, R., Eser, S., Mathers, C., . . . Bray, F. (2013). GLOBOCAN 2012 v1.0, Cancer Incidence and Mortality Worldwide: IARC CancerBase No. 11 [Internet]. Lyon, France: International Agency for Research on Cancer. Retrieved 22/08/2012, from http://globocan.iarc.fr

Fisher, S. R. A., & Yates, F. (1963). Statistical Tables for Biological, Agricultural and Medical Research: Oliver & Boyd.

Fletcher, O., Johnson, N., Orr, N., Hosking, F. J., Gibson, L. J., Walker, K., . . . Peto, J. (2011). Novel Breast Cancer Susceptibility Locus at 9q31.2: Results of a Genome-Wide Association Study. Journal of the National Cancer Institute. doi: 10.1093/jnci/djq563

Frankel, L. B., Christoffersen, N. R., Jacobsen, A., Lindow, M., Krogh, A., & Lund, A. H. (2008). Programmed cell death 4 (PDCD4) is an important functional target of the microRNA miR-21 in breast cancer cells. Journal of Biological Chemistry, 283(2), 1026-1033. doi: 10.1074/jbc.M707224200

Gaaib, J. N., Nassief, A. F., & Al-Assi, A. H. (2011). Simple salting – out method for genomic DNA extraction from whole blood Tikrit Journal of Pure Science, 16(2), 9-11.

Gabriel, S., Ziaugra, L., & Tabbaa, D. (2001). SNP Genotyping Using the Sequenom MassARRAY iPLEX Platform Current Protocols in Human Genetics: John Wiley & Sons, Inc.

Galanina, N., Bossuyt, V., & Harris, L. N. (2011). Molecular predictors of response to therapy for breast cancer. Cancer Journal, 17(2), 96-103. doi: 10.1097/PPO.0b013e318212dee3

Gao, L. B., Bai, P., Pan, X. M., Jia, J., Li, L. J., Liang, W. B., . . . Zhang, L. (2011). The association between two polymorphisms in pre-miRNAs and breast cancer risk: a meta-analysis. Breast Cancer Research and Treatment, 125(2), 571-574. doi: 10.1007/s10549-010-0993-x

Garcia-Sepulveda, C., Carrillo-Acuña, E., Guerra-Palomares, S., & Barriga-Moreno, M. (2010). Maxiprep genomic DNA extractions for molecular epidemiology studies and biorepositories. Molecular Biology Reports, 37(4), 1883-1890.

196 0 Bibliography

http://globocan.iarc.fr/

Garcia, A. I., Buisson, M., Bertrand, P., Rimokh, R., Rouleau, E., Lopez, B. S., . . . Mazoyer, S. (2011). Down-regulation of BRCA1 expression by miR-146a and miR-146b-5p in triple negative sporadic breast cancers. EMBO Molecular Medicine, 3(5), 279-290. doi: 10.1002/emmm.201100136

Garcia, A. I., Cox, D. G., Barjhoux, L., Verny-Pierre, C., Barnes, D., Gemo Study, C., . . . Mazoyer, S. (2011). The rs2910164:G>C SNP in the MIR146A gene is not associated with breast cancer risk in BRCA1 and BRCA2 mutation carriers. Human Mutation, 32(9), 1004-1007. doi: 10.1002/humu.21539

Gauwerky, C. E., Huebner, K., Isobe, M., Nowell, P. C., & Croce, C. M. (1989). Activation of MYC in a masked t(8;17) translocation results in an aggressive B-cell leukemia. Proceedings of the National Academy of Sciences of the United States of America, 86(22), 8867-8871.

Gibson, G., & Muse, S. V. (2009). A primer of genome science (3rd ed.). Sunderland, Mass.: Sinauer Associates.

Goette, M., Mohr, C., Koo, C. Y., Stock, C., Vaske, A. K., Viola, M., . . . Yip, G. W. (2010). miR-145-dependent targeting of Junctional Adhesion Molecule A and modulation of fascin expression are associated with reduced breast cancer cell motility and invasiveness. Oncogene, 29(50), 6569-6580.

Gong, J., Tong, Y., Zhang, H. M., Wang, K., Hu, T., Shan, G., . . . Guo, A. Y. (2012). Genome-wide identification of SNPs in microRNA genes and the SNP effects on microRNA target binding and biogenesis. Human Mutation, 33(1), 254-263. doi: 10.1002/humu.21641

Gottwein, E., Cai, X., & Cullen, B. R. (2006). A novel assay for viral microRNA function identifies a single nucleotide polymorphism that affects Drosha processing. Journal of Virology, 80(11), 5321-5326. doi: 10.1128/JVI.02734-05

Greene, F., Balch, C., Haller, D., & Morrow, M. (2002). AJCC Cancer Staging Manual (6th Edition): Springer.

Greene, J. J., & Rao, V. B. (Eds.). (1998). Recombinant DNA Principles and Methodologies (1st ed.). New York: Marcel Dekker, Inc.

Griffin, T. J., & Smith, L. M. (2000). Single-nucleotide polymorphism analysis by MALDI-TOF mass spectrometry. Trends in Biotechnology, 18(2), 77-84.

Grimberg, J., Nawoschik, S., Belluscio, L., McKee, R., Turck, A., & Eisenberg, A. (1989). A simple and efficient non-organic procedure for the isolation of genomic DNA from blood. Nucleic Acids Research, 17(20), 8390.

Guo, S. W., & Thompson, E. A. (1992). Performing the exact test of Hardy-Weinberg proportion for multiple alleles. Biometrics, 361-372.

Gut, I. G. (2004). DNA analysis by MALDI-TOF mass spectrometry. Human Mutation, 23(5), 437-441. doi: 10.1002/humu.20023

Ha, M., & Kim, V. N. (2014). Regulation of microRNA biogenesis. Nature Reviews Molecular Cell Biology, 15(8), 509-524. doi: 10.1038/nrm3838

0 Bibliography 197

Hall, J., Lee, M., Newman, B., Morrow, J., Anderson, L., Huey, B., & King, M. (1990). Linkage of early-onset familial breast cancer to chromosome 17q21. Science, 250(4988), 1684-1689. doi: 10.1126/science.2270482

Han, M., & Zheng, Y. (2013). Comprehensive analysis of single nucleotide polymorphisms in human microRNAs. PloS one, 8(11), e78028. doi: 10.1371/journal.pone.0078028

Han, X., Yan, S., Weijie, Z., Feng, W., Liuxing, W., Mengquan, L., & Qingxia, F. (2014). Critical role of miR-10b in transforming growth factor-[beta]1-induced epithelial-mesenchymal transition in breast cancer. Cancer Gene Therapy, 21(2), 60-67. doi: 10.1038/cgt.2013.82

Hanahan, D., & Weinberg, R. A. (2000). The hallmarks of cancer. Cell, 100(1), 57-70.

Hanahan, D., & Weinberg, R. A. (2011). Hallmarks of cancer: the next generation. Cell, 144(5), 646-674. doi: 10.1016/j.cell.2011.02.013

Hardy, G. H. (1908). Mendelian proportions in a mixed population. Science, 28(706), 49-50.

Hasani, S.-S., Hashemi, M., Eskandari-Nasab, E., Naderi, M., Omrani, M., & Sheybani-Nasab, M. (2014). A functional polymorphism in the miR-146a gene is associated with the risk of childhood acute lymphoblastic leukemia: a preliminary report. Tumor Biology, 35(1), 219-225. doi: 10.1007/s13277-013-1027-1

Hawkins, T. L., O'Connor-Morin, T., Roy, A., & Santillan, C. (1994). DNA purification and isolation using a solid-phase. Nucleic Acids Research, 22(21), 4543-4544.

Hayashita, Y., Osada, H., Tatematsu, Y., Yamada, H., Yanagisawa, K., Tomida, S., . . . Takahashi, T. (2005). A polycistronic microRNA cluster, miR-17-92, is overexpressed in human lung cancers and enhances cell proliferation. Cancer Research, 65(21), 9628-9632. doi: 10.1158/0008-5472.CAN-05-2352

He, H., Jazdzewski, K., Li, W., Liyanarachchi, S., Nagy, R., Volinia, S., . . . de la Chapelle, A. (2005). The role of microRNA genes in papillary thyroid carcinoma. Proceedings of the National Academy of Sciences of the United States of America, 102(52), 19075-19080. doi: 10.1073/pnas.0509603102

He, L., Thomson, J. M., Hemann, M. T., Hernando-Monge, E., Mu, D., Goodson, S., . . . Hammond, S. M. (2005). A microRNA polycistron as a potential human oncogene. Nature, 435(7043), 828-833. doi: 10.1038/nature03552

Heneghan, H. M., Miller, N., Lowery, A. J., Sweeney, K. J., Newell, J., & Kerin, M. J. (2010). Circulating microRNAs as Novel Minimally Invasive Biomarkers for Breast Cancer. Annals of Surgery, 251(3), 499-505 410.1097/SLA.1090b1013e3181cc1939f.

Herzer, S. (2001). DNA purification. Molecular Biology problem solver: A laboratory Guide, 167-196.

198 0 Bibliography

Hoffman, A. E., Zheng, T., Yi, C., Leaderer, D., Weidhaas, J., Slack, F., . . . Zhu, Y. (2009). microRNA miR-196a-2 and Breast Cancer: A Genetic and Epigenetic Association Study and Functional Analysis. Cancer Research, 69(14), 5970-5977. doi: 10.1158/0008-5472.can-09-0236

Holmes, F. L. (2001). Meselson, Stahl, and the Replication of DNA: A History of the Most Beautiful Experiment in Biology: Yale University Press.

Hong, E. P., & Park, J. W. (2012). Sample Size and Statistical Power Calculation in Genetic Association Studies. Genomics & Informatics, 10(2), 117-122. doi: 10.5808/GI.2012.10.2.117

Hossain, A., Kuo, M. T., & Saunders, G. F. (2006). Mir-17-5p regulates breast cancer cell proliferation by inhibiting translation of AIB1 mRNA. Molecular and Cellular Biology, 26(21), 8191-8201. doi: 10.1128/MCB.00242-06

Hu, Y., Correa, A. M., Hoque, A., Guan, B., Ye, F., Huang, J., . . . Xu, X.-c. (2011). Prognostic significance of differentially expressed miRNAs in esophageal cancer. International journal of cancer. Journal international du cancer, 128(1), 132-143. doi: 10.1002/ijc.25330

Hu, Z., Chen, J., Tian, T., Zhou, X., Gu, H., Xu, L., . . . Shen, H. (2008). Genetic variants of miRNA sequences and non–small cell lung cancer survival. The Journal of Clinical Investigation, 118(7), 2600-2608. doi: 10.1172/JCI34934

Hu, Z., Liang, J., Wang, Z., Tian, T., Zhou, X., Chen, J., . . . Shen, H. (2009). Common genetic variants in pre-microRNAs were associated with increased risk of breast cancer in Chinese women. Human Mutation, 30(1), 79-84. doi: 10.1002/humu.20837

Huberman, J. A. (1995). Importance of measuring nucleic acid absorbance at 240 nm as well as at 260 and 280 nm. Biotechniques, 18(4), 636.

Hulka, B. S., & Moorman, P. G. (2001). Breast cancer: hormones and other risk factors. Maturitas, 38(1), 103-113.

Hung, S. I., Chung, W. H., Liou, L. B., Chu, C. C., Lin, M., Huang, H. P., . . . Chen, Y. T. (2005). HLA-B*5801 allele as a genetic marker for severe cutaneous adverse reactions caused by allopurinol. Proceedings of the National Academy of Sciences of the United States of America, 102(11), 4134-4139. doi: 10.1073/pnas.0409500102

Hunter, D. J., Kraft, P., Jacobs, K. B., Cox, D. G., Yeager, M., Hankinson, S. E., . . . Chanock, S. J. (2007). A genome-wide association study identifies alleles in FGFR2 associated with risk of sporadic postmenopausal breast cancer. Nature Genetics, 39(7), 870-874.

Hurst, D. R., Edmonds, M. D., Scott, G. K., Benz, C. C., Vaidya, K. S., & Welch, D. R. (2009). Breast cancer metastasis suppressor 1 up-regulates miR-146, which suppresses breast cancer metastasis. Cancer Research, 69(4), 1279-1283. doi: 10.1158/0008-5472.CAN-08-3559

Iorio, M. V., Ferracin, M., Liu, C.-G., Veronese, A., Spizzo, R., Sabbioni, S., . . . Croce, C. M. (2005). MicroRNA Gene Expression Deregulation in Human

0 Bibliography 199

Breast Cancer. Cancer Research, 65(16), 7065-7070. doi: 10.1158/0008-5472.can-05-1783

Iri-Sofla, F. J., Rahbarizadeh, F., Ahmadvand, D., & Rasaee, M. J. (2011). Nanobody-based chimeric receptor gene integration in Jurkat cells mediated by PhiC31 integrase. Experimental Cell Research(0).

Jazdzewski, K., Liyanarachchi, S., Swierniak, M., Pachucki, J., Ringel, M. D., Jarzab, B., & Chapelle, A. d. l. (2009). Polymorphic Mature MicroRNAs from Passenger Strand of Pre-MiR-146a Contribute to Thyroid Cancer. Proceedings of the National Academy of Sciences of the United States of America, 106(5), 1502-1505. doi: 10.2307/40272407

Jazdzewski, K., Murray, E. L., Franssila, K., Jarzab, B., Schoenberg, D. R., & Chapelle, A. d. l. (2008). Common SNP in Pre-miR-146a Decreases Mature miR Expression and Predisposes to Papillary Thyroid Carcinoma. Proceedings of the National Academy of Sciences of the United States of America, 105(20), 7269-7274. doi: 10.2307/25461966

Jedlinski, D. J., Gabrovska, P. N., Weinstein, S. R., Smith, R. A., & Griffiths, L. R. (2011). Single nucleotide polymorphism in hsa-mir-196a-2 and breast cancer risk: a case control study. Twin Research and Human Genetics, 14(5), 417-421. doi: 10.1375/twin.14.5.417

Jiang, S., Zhang, H. W., Lu, M. H., He, X. H., Li, Y., Gu, H., . . . Wang, E. D. (2010). MicroRNA-155 functions as an OncomiR in breast cancer by targeting the suppressor of cytokine signaling 1 gene. Cancer Research, 70(8), 3119-3127. doi: 10.1158/0008-5472.can-09-4250

Johnston, S., R. D. , & Dowsett, M. (2003). Aromatase inhibitors for breast cancer: lessons from the laboratory. Nature Reviews Cancer, 3(11), 821-831.

Kano, M., Seki, N., Kikkawa, N., Fujimura, L., Hoshino, I., Akutsu, Y., . . . Matsubara, H. (2010). miR-145, miR-133a and miR-133b: Tumor-suppressive miRNAs target FSCN1 in esophageal squamous cell carcinoma. International Journal of Cancer, 127(12), 2804-2814. doi: 10.1002/ijc.25284

Karas, M., & Hillenkamp, F. (1988). Laser desorption ionization of proteins with molecular masses exceeding 10,000 daltons. Analytical Chemistry, 60(20), 2299-2301.

Khan, S., Greco, D., Michailidou, K., Milne, R. L., Muranen, T. A., Heikkinen, T., . . . Nevanlinna, H. (2014). MicroRNA related polymorphisms and breast cancer risk. PloS one, 9(11), e109973. doi: 10.1371/journal.pone.0109973

Khvorova, A., Reynolds, A., & Jayasena, S. D. (2003). Functional siRNAs and miRNAs exhibit strand bias. Cell, 115(2), 209-216.

Kim, V. N. (2005). MicroRNA biogenesis: coordinated cropping and dicing. Nature Reviews. Molecular Cell Biology, 6(5), 376-385.

Kluiver, J., Poppema, S., de Jong, D., Blokzijl, T., Harms, G., Jacobs, S., . . . van den Berg, A. (2005). BIC and miR-155 are highly expressed in Hodgkin, primary

200 0 Bibliography

mediastinal and diffuse large B cell lymphomas. The Journal of Pathology, 207(2), 243-249. doi: 10.1002/path.1825

Kong, W., He, L., Richards, E. J., Challa, S., Xu, C. X., Permuth-Wey, J., . . . Cheng, J. Q. (2014). Upregulation of miRNA-155 promotes tumour angiogenesis by targeting VHL and is associated with poor prognosis and triple-negative breast cancer. Oncogene, 33(6), 679-689. doi: 10.1038/onc.2012.636

Kontorovich, T., Levy, A., Korostishevsky, M., Nir, U., & Friedman, E. (2010). Single nucleotide polymorphisms in miRNA binding sites and miRNA genes as breast/ovarian cancer risk modifiers in Jewish high-risk women. International Journal of Cancer, 127(3), 589-597. doi: 10.1002/ijc.25065

Kosaka, N., Iguchi, H., & Ochiya, T. (2010). Circulating microRNA in body fluid: a new potential biomarker for cancer diagnosis and prognosis. Cancer Science, 101(10), 2087-2092. doi: 10.1111/j.1349-7006.2010.01650.x

Kozomara, A., & Griffiths-Jones, S. (2011). miRBase: integrating microRNA annotation and deep-sequencing data. Nucleic Acids Research, 39(suppl 1), D152-D157. doi: 10.1093/nar/gkq1027

Kozomara, A., & Griffiths-Jones, S. (2014). miRBase: annotating high confidence microRNAs using deep sequencing data. Nucleic Acids Research, 42(D1), D68-D73. doi: 10.1093/nar/gkt1181

LaFramboise, T. (2009). Single nucleotide polymorphism arrays: a decade of biological, computational and technological advances. Nucleic Acids Research, 37(13), 4181-4193. doi: 10.1093/nar/gkp552

Lagos-Quintana, M., Rauhut, R., Yalcin, A., Meyer, J., Lendeckel, W., & Tuschl, T. (2002). Identification of tissue-specific microRNAs from mouse. Current Biology, 12(9), 735-739.

Lahiri, D. K., Bye, S., Nurnberger Jr, J. I., Hodes, M. E., & Crisp, M. (1992). A non-organic and non-enzymatic extraction method gives higher yields of genomic DNA from whole-blood samples than do nine other methods tested. Journal of Biochemical and Biophysical Methods, 25(4), 193-205.

Lahiri, D. K., & Nurnberger Jr, J. I. (1991). A rapid non-enzymatic method for the preparation of HMW DNA from blood for RFLP studies. Nucleic Acids Research, 19(19), 5444.

Lai, E. C. (2002). Micro RNAs are complementary to 3' UTR sequence motifs that mediate negative post-transcriptional regulation. Nature Genetics, 30(4), 363-364. doi: 10.1038/ng865

Lander, E. S., Linton, L. M., Birren, B., Nusbaum, C., Zody, M. C., Baldwin, J., . . . Chen, Y. J. (2001). Initial sequencing and analysis of the human genome. Nature, 409(6822), 860-921. doi: 10.1038/35057062

Lee, H.-C., Yang, C.-W., Chen, C.-Y., & Au, L.-C. (2011). Single point mutation of microRNA may cause butterfly effect on alteration of global gene expression. Biochemical and Biophysical Research Communications, 404(4), 1065-1069.

0 Bibliography 201

Lee, J.-H., Park, Y., Choi, J. R., Lee, E. K., & Kim, H.-S. (2010). Comparisons of three automated systems for genomic DNA extraction in a clinical diagnostic laboratory. Yonsei Medical Journal, 51(1), 104-110.

Lee, R. C., Feinbaum, R. L., & Ambros, V. (1993). The C. elegans heterochronic gene lin-4 encodes small RNAs with antisense complementarity to lin-14. Cell, 75(5), 843-854. doi: 10.1016/0092-8674(93)90529-y

Lee, S. J., Seo, J. W., Chae, Y. S., Kim, J. G., Kang, B. W., Kim, W. W., . . . Park, J. Y. (2014). Genetic polymorphism of miR-196a as a prognostic biomarker for early breast cancer. Anticancer Research, 34(6), 2943-2949.

Lee, Y., Ahn, C., Han, J., Choi, H., Kim, J., Yim, J., . . . Kim, N. (2003). The nuclear RNase III Drosha initiates microRNA processing (Vol. 425, pp. 415): Nature Publishing Group.

Lee, Y., Jeon, K., Lee, J. T., Kim, S., & Kim, V. N. (2002). MicroRNA maturation: stepwise processing and subcellular localization. EMBO Journal, 21(17), 4663-4670.

Lehmann, U., Hasemeier, B., Christgen, M., Müller, M., Römermann, D., Länger, F., & Kreipe, H. (2008). Epigenetic inactivation of microRNA gene hsa-mir-9-1 in human breast cancer. The Journal of Pathology, 214(1), 17-24. doi: 10.1002/path.2251

Leivonen, S. K., Makela, R., Ostling, P., Kohonen, P., Haapa-Paananen, S., Kleivi, K., . . . Kallioniemi, O. (2009). Protein lysate microarray analysis to identify microRNAs regulating estrogen receptor signaling in breast cancer cell lines. Oncogene, 28(44), 3926-3936. doi: 10.1038/onc.2009.241

Li, J., Humphreys, K., Heikkinen, T., Aittomäki, K., Blomqvist, C., Pharoah, P., . . . Hall, P. (2011). A combined analysis of genome-wide association studies in breast cancer. Breast Cancer Research and Treatment, 126(3), 717-727. doi: 10.1007/s10549-010-1172-9

Lian, H., Wang, L., & Zhang, J. (2012). Increased Risk of Breast Cancer Associated with CC Genotype of <italic>Has-miR-146a</italic> Rs2910164 Polymorphism in Europeans. PloS one, 7(2), e31615. doi: 10.1371/journal.pone.0031615

Liew, M., Pryor, R., Palais, R., Meadows, C., Erali, M., Lyon, E., & Wittwer, C. (2004). Genotyping of single-nucleotide polymorphisms by high-resolution melting of small amplicons. Clinical Chemistry, 50(7), 1156-1164.

Liew, M., Wittwer, C., & Voelkerding, K. V. (2010). Nucleotide extension genotyping by high-resolution melting. The Journal of molecular diagnostics : JMD, 12(6), 731-738.

Linhares, J. J., Azevedo, M., Jr., Siufi, A. A., de Carvalho, C. V., Wolgien Mdel, C., Noronha, E. C., . . . da Silva, I. D. (2012). Evaluation of single nucleotide polymorphisms in microRNAs (hsa-miR-196a2 rs11614913 C/T) from Brazilian women with breast cancer. BMC Medical Genetics, 13, 119. doi: 10.1186/1471-2350-13-119

202 0 Bibliography

Little, D. P., Braun, A., Darnhofer-Demar, B., & Koster, H. (1997). Identification of apolipoprotein E polymorphisms using temperature cycled primer oligo base extension and mass spectrometry. European Journal of Clinical Chemistry and Clinical Biochemistry, 35(7), 545-548.

Little, D. P., Braun, A., O'Donnell, M. J., & Koster, H. (1997). Mass spectrometry from miniaturized arrays for full comparative DNA analysis. Nature Medicine, 3(12), 1413-1416.

Liu, B., Zhang, Y., Jin, M., Ni, Q., Liang, X., Ma, X., . . . Chen, K. (2010). Association of selected polymorphisms of CCND1, p21, and caspase8 With colorectal cancer risk. Molecular Carcinogenesis, 49(1), 75-84.

Liu, H., Jiang, X., Zhang, M.-w., Pan, Y.-f., Yu, Y.-x., Zhang, S.-c., . . . Chen, K. (2013). Association of CASP9, CASP10 gene polymorphisms and tea drinking with colorectal cancer risk in the Han Chinese population. Journal of Zhejiang University Science B, 14(1), 47-57.

Liu, Y., Zhao, J., Zhang, P. Y., Zhang, Y., Sun, S. Y., Yu, S. Y., & Xi, Q. S. (2012). MicroRNA-10b targets E-cadherin and modulates breast cancer metastasis. Medical Science Monitor, 18(8), BR299-308.

Love, S. M., & Barsky, S. H. (2004). Anatomy of the nipple and breast ducts revisited. Cancer, 101(9), 1947-1957.

Lowery, A., Miller, N., Devaney, A., McNeill, R., Davoren, P., Lemetre, C., . . . Kerin, M. (2009). MicroRNA signatures predict oestrogen receptor, progesterone receptor and HER2/neu receptor status in breast cancer. Breast Cancer Research, 11(3), R27.

Lu, J., Getz, G., Miska, E. A., Alvarez-Saavedra, E., Lamb, J., Peck, D., . . . Golub, T. R. (2005). MicroRNA expression profiles classify human cancers. Nature, 435(7043), 834-838.

Lu, M., Zhang, Q., Deng, M., Miao, J., Guo, Y., Gao, W., & Cui, Q. (2008). An Analysis of Human MicroRNA and Disease Associations. PloS one, 3(10), e3420. doi: 10.1371/journal.pone.0003420

Lynam-Lennon, N., Maher, S. G., & Reynolds, J. V. (2009). The roles of microRNA in cancer and apoptosis. Biological Reviews, 84(1), 55-71. doi: 10.1111/j.1469-185X.2008.00061.x

Ma, L., Zhu, L., Gu, D., Chu, H., Tong, N., Chen, J., . . . Wang, M. (2013). A genetic variant in miR-146a modifies colorectal cancer susceptibility in a Chinese population. Archives of Toxicology, 87(5), 825-833. doi: 10.1007/s00204-012-1004-2

Marieb, E. N. (2010). Human Anatomy and Physiology (8th Ed ed.).

Martin, A.-M., & Weber, B. L. (2000). Genetic and Hormonal Risk Factors in Breast Cancer. Journal of the National Cancer Institute, 92(14), 1126-1135. doi: 10.1093/jnci/92.14.1126

Mattie, M. D., Benz, C. C., Bowers, J., Sensinger, K., Wong, L., Scott, G. K., . . . Haqq, C. (2006). Optimized high-throughput microRNA expression profiling

0 Bibliography 203

provides novel biomarker assessment of clinical prostate and breast cancer biopsies. Molecular Cancer, 5, 24. doi: 10.1186/1476-4598-5-24

McCormick, R. M. (1989). A solid-phase extraction procedure for DNA purification. Analytical Biochemistry, 181(1), 66-74.

McManus, M. T. (2003). MicroRNAs and cancer. Seminars in Cancer Biology, 13(4), 253-258.

McPherson, K., Steel, C. M., & Dixon, J. M. (2000). ABC of breast diseases. Breast cancer-epidemiology, risk factors, and genetics. BMJ, 321(7261), 624-628.

Mendell, J. T. (2008). miRiad roles for the miR-17-92 cluster in development and disease. Cell, 133(2), 217-222. doi: 10.1016/j.cell.2008.04.001

Meselson, M., & Stahl, F. W. (1958). The replication of DNA in Escherichia coli. Proceedings of the National Academy of Sciences of the United States of America, 44(7), 671-682. doi: 10.1073/pnas.44.7.671

Metzler, M., Wilda, M., Busch, K., Viehmann, S., & Borkhardt, A. (2004). High expression of precursor microRNA-155/BIC RNA in children with Burkitt lymphoma. Genes, Chromosomes and Cancer, 39(2), 167-169. doi: 10.1002/gcc.10316

Meyer, J. S., Alvarez, C., Milikowski, C., Olson, N., Russo, I., Russo, J., . . . Parwaresch, R. (2005). Breast carcinoma malignancy grading by Bloom-Richardson system vs proliferation index: reproducibility of grade and advantages of proliferation index. Modern Pathology, 18(8), 1067-1078. doi: 10.1038/modpathol.3800388

Michael, M. Z., O' Connor, S. M., van Holst Pellekaan, N. G., Young, G. P., & James, R. J. (2003). Reduced Accumulation of Specific MicroRNAs in Colorectal Neoplasia. Molecular Cancer Research, 1(12), 882-891.

Michailidou, K., Hall, P., Gonzalez-Neira, A., Ghoussaini, M., Dennis, J., Milne, R. L., . . . Miller, N. (2013). Large-scale genotyping identifies 41 new loci associated with breast cancer risk. Nature Genetics, 45(4), 353-361, 361e351-352.

Miki, Y., Swensen, J., Shattuck-Eidens, D., Futreal, P., Harshman, K., Tavtigian, S., . . . al., E. (1994). A strong candidate for the breast and ovarian cancer susceptibility gene BRCA1. Science, 266(5182), 66-71. doi: 10.1126/science.7545954

Miller, S. A., Dykes, D. D., & Polesky, H. F. (1988). A simple salting out procedure for extracting DNA from human nucleated cells. Nucleic Acids Research, 16(3), 1215.

Mishra, P. J., Banerjee, D., & Bertino, J. R. (2008). MiRSNPs or MiR-polymorphisms, new players in microRNA mediated regulation of the cell: Introducing microRNA pharmacogenomics. Cell cycle (Georgetown, Tex ), 7(7), 853-858.

204 0 Bibliography

Mishra, P. J., & Bertino, J. R. (2009). MicroRNA polymorphisms: the future of pharmacogenomics, molecular epidemiology and individualized medicine. Pharmacogenomics, 10(3), 399-416. doi: 10.2217/14622416.10.3.399

Mitchell, P. S., Parkin, R. K., Kroh, E. M., Fritz, B. R., Wyman, S. K., Pogosova-Agadjanyan, E. L., . . . Tewari, M. (2008). Circulating microRNAs as stable blood-based markers for cancer detection. Proceedings of the National Academy of Sciences of the United States of America, 105(30), 10513-10518. doi: 10.1073/pnas.0804549105

Mogilyansky, E., & Rigoutsos, I. (2013). The miR-17/92 cluster: a comprehensive update on its genomics, genetics, functions and increasingly important and numerous roles in health and disease. Cell Death and Differentiation, 20(12), 1603-1614. doi: 10.1038/cdd.2013.125

Morehead, J. R. (1982). Anatomy and Embryology of the Breast. Clinical Obstetrics and Gynecology, 25(2), 353-357.

Morozova, O., & Marra, M. A. (2008). Applications of next-generation sequencing technologies in functional genomics. Genomics, 92(5), 255-264.

Murakami, Y., Yasuda, T., Saigo, K., Urashima, T., Toyoda, H., Okanoue, T., & Shimotohno, K. (2006). Comprehensive analysis of microRNA expression patterns in hepatocellular carcinoma and non-tumorous tissues. Oncogene, 25(17), 2537-2545. doi: 10.1038/sj.onc.1209283

Nakai, K., Habano, W., Fujita, T., Schnackenberg, J., Kawazoe, K., Suwabe, A., & Itoh, C. (2002). Highly multiplexed genotyping of coronary artery disease-associated SNPs using MALDI-TOF mass spectrometry. Human Mutation, 20(2), 133-138. doi: 10.1002/humu.10099

NAP. (2012). Breast Cancer and the Environment: A Life Course Approach. Washington, DC: The National Academies Press.

Nasiri, H., Forouzandeh, M., Rasaee, M. J., & Rahbarizadeh, F. (2005). Modified salting-out method: high-yield, high-quality genomic DNA extraction from whole blood using laundry detergent. Journal of Clinical Laboratory Analysis, 19(6), 229-232.

Negrini, M., & Calin, G. A. (2008). Breast cancer metastasis: a microRNA story. Breast cancer research : BCR, 10(2), 203. doi: citeulike-article-id:2600681

Neighbors, M., & Tannehill-Jones, R. (2006). Human diseases: Delmar Learning.

Ness., J. V., Cimler., B. M., Rich B. Meyer, J., & Vermeulen., N. M. J. US Patent No. US 5393672.

Netter, F. H. (2011). Atlas of human anatomy. Philadelphia, PA: Saunders/Elsevier.

NFSTC. (2006). DNA Analyst Training-Laboratory Training Manual Protocol 3.05 Chelex®100 Non-Differential Extraction. National Forensic Science Technology Center.

0 Bibliography 205

O'Donnell, K. A., Wentzel, E. A., Zeller, K. I., Dang, C. V., & Mendell, J. T. (2005). c-Myc-regulated microRNAs modulate E2F1 expression. Nature, 435(7043), 839-843. doi: 10.1038/nature03677

Oldenburg, R. A., Meijers-Heijboer, H., Cornelisse, C. J., & Devilee, P. (2007). Genetic susceptibility for breast cancer: How many more genes to be found? Critical Reviews in Oncology/Hematology, 63(2), 125-149. doi: 10.1016/j.critrevonc.2006.12.004

Olive, V., Jiang, I., & He, L. (2010). mir-17-92, a cluster of miRNAs in the midst of the cancer network. International Journal of Biochemistry and Cell Biology, 42(8), 1348-1354. doi: 10.1016/j.biocel.2010.03.004

Omrani, M., Hashemi, M., Eskandari-Nasab, E., Hasani, S.-S., Mashhadi, M. A., Arbabi, F., & Taheri, M. (2014). hsa-mir-499 rs3746444 gene polymorphism is associated with susceptibility to breast cancer in an Iranian population. Biomarkers in Medicine, 8(2), 259-267. doi: 10.2217/bmm.13.118

Orsós, Z., Szanyi, I., Csejtei, A., Gerlinger, I., Ember, I., & Kiss, I. (2013). Association of pre-miR-146a rs2910164 Polymorphism with the Risk of Head and Neck Cancer. Anticancer Research, 33(1), 341-346.

Ota, A., Tagawa, H., Karnan, S., Tsuzuki, S., Karpas, A., Kira, S., . . . Seto, M. (2004). Identification and characterization of a novel gene, C13orf25, as a target for 13q31-q32 amplification in malignant lymphoma. Cancer Research, 64(9), 3087-3095.

Page, D. L. (2003). Special types of invasive breast cancer, with clinical implications. The American Journal of Surgical Pathology, 27(6), 832-835.

Paik, S., Shak, S., Tang, G., Kim, C., Baker, J., Cronin, M., . . . Wolmark, N. (2004). A Multigene Assay to Predict Recurrence of Tamoxifen-Treated, Node-Negative Breast Cancer. New England Journal of Medicine, 351(27), 2817-2826. doi: doi:10.1056/NEJMoa041588

Paranjape, T., Slack, F. J., & Weidhaas, J. B. (2009). MicroRNAs: tools for cancer diagnostics. Gut, 58(11), 1546-1554. doi: 10.1136/gut.2009.179531

Pastrello, C., Polesel, J., Della Puppa, L., Viel, A., & Maestro, R. (2010). Association between hsa-mir-146a genotype and tumor age-of-onset in BRCA1/BRCA2-negative familial breast and ovarian cancer patients. Carcinogenesis, 31(12), 2124-2126. doi: 10.1093/carcin/bgq184

Perou, C. M., Sorlie, T., Eisen, M. B., van de Rijn, M., Jeffrey, S. S., Rees, C. A., . . . Botstein, D. (2000). Molecular portraits of human breast tumours. Nature, 406(6797), 747-752.

Pettersson, E., Lundeberg, J., & Ahmadian, A. (2009). Generations of sequencing technologies. Genomics, 93(2), 105-111.

PRB. (2007). 2007 World Population Data Sheet. Washington, DC.

PRB. (2011). 2011 World Population Data Sheet. Washington, DC.

206 0 Bibliography

Price, C. W., Leslie, D. C., & Landers, J. P. (2009). Nucleic acid extraction techniques and application to the microchip. Lab on a Chip, 9(17), 2484-2494. doi: 10.1039/B907652M

Purcell, S., Neale, B., Todd-Brown, K., Thomas, L., Ferreira, M. A., Bender, D., . . . Sham, P. C. (2007). PLINK: a tool set for whole-genome association and population-based linkage analyses. American Journal of Human Genetics, 81(3), 559-575. doi: 10.1086/519795

Radojicic, J., Zaravinos, A., Vrekoussis, T., Kafousi, M., Spandidos, D. A., & Stathopoulos, E. N. (2011). MicroRNA expression analysis in triple-negative (ER, PR and Her2/neu) breast cancer. Cell Cycle, 10(3), 507-517.

Rajewsky, N. (2006). microRNA target predictions in animals. Nature Genetics.

Reed, G. H., Kent, J. O., & Wittwer, C. T. (2007). High-resolution DNA melting analysis for simple and efficient molecular diagnostics. Pharmacogenomics, 8(6), 597-608.

Reich, D. E., Cargill, M., Bolk, S., Ireland, J., Sabeti, P. C., Richter, D. J., . . . Lander, E. S. (2001). Linkage disequilibrium in the human genome. Nature, 411(6834), 199-204.

Reinhart, B. J., Slack, F. J., Basson, M., Pasquinelli, A. E., Bettinger, J. C., Rougvie, A. E., . . . Ruvkun, G. (2000). The 21-nucleotide let-7 RNA regulates developmental timing in Caenorhabditis elegans. Nature, 403(6772), 901-906.

Robbins, S. L., Kumar, V., & Cotran, R. S. (2010). Robbins and Cotran Pathologic Basis of Disease. Philadelphia, PA: Saunders/Elsevier.

Roldo, C., Missiaglia, E., Hagan, J. P., Falconi, M., Capelli, P., Bersani, S., . . . Croce, C. M. (2006). MicroRNA expression abnormalities in pancreatic endocrine and acinar tumors are associated with distinctive pathologic features and clinical behavior. Journal of clinical oncology : official journal of the American Society of Clinical Oncology, 24(29), 4677-4684. doi: 10.1200/JCO.2005.05.5194

Rosner, B. A. (2006). Fundamentals of Biostatistics: Thomson-Brooks/Cole.

Ross, J. S., & Fletcher, J. A. (1998). The HER-2/neu Oncogene in Breast Cancer: Prognostic Factor, Predictive Factor, and Target for Therapy. The Oncologist, 3(4), 237-252.

Roth, C., Rack, B., Muller, V., Janni, W., Pantel, K., & Schwarzenbach, H. (2010). Circulating microRNAs as blood-based markers for patients with primary and metastatic breast cancer. Breast Cancer Research, 12(6), R90.

Ryan, B. M., Robles, A. I., & Harris, C. C. (2010). Genetic variation in microRNA networks: the implications for cancer research. Nature Reviews Cancer, 10(6), 389-402.

Sahota, A., Brooks, A. I., & Tischfield, J. A. (2007). Preparing DNA from Mammalian sources for genotyping. CSH Protocols, 2007, pdb top19. doi: 10.1101/pdb.top19

0 Bibliography 207

Sahota, A., Brooks, A. I., Tischfield, J. A., & King, I. B. (2007). Preparing DNA from blood for genotyping. CSH Protocols. doi: 10.1101/pdb.prot4830

Sainsbury, J. R., Anderson, T. J., & Morgan, D. A. (2000). ABC of breast diseases: breast cancer. BMJ (Clinical research ed ), 321(7263), 745-750.

Saiyed, Z. M., Bochiwal, C., Gorasia, H., Telang, S. D., & Ramchand, C. N. (2006). Application of magnetic particles (Fe3O4) for isolation of genomic DNA from mammalian cells. Analytical Biochemistry, 356(2), 306-308.

Saiyed, Z. M., Ramchand, C. N., & Telang, S. D. (2008). Isolation of genomic DNA using magnetic nanoparticles as a solid-phase support. Journal of physics Condensed matter : an Institute of Physics journal, 20(20), 204153.

Sambrook, J., & Russell, D. W. (2001). Molecular Cloning: A Laboratory Manual: Cold Spring Harbor Laboratory Press.

Sanger, F., Nicklen, S., & Coulson, A. R. (1977). DNA sequencing with chain-terminating inhibitors. Proceedings of the National Academy of Sciences of the United States of America, 74(12), 5463-5467.

Schnitt, S. J. (2010). Classification and prognosis of invasive breast cancer: from morphology to molecular taxonomy. Modern Pathology, 23(S2), S60-S64.

Schork, N. J. (2002). Power Calculations for Genetic Association Studies Using Estimated Probability Distributions. The American Journal of Human Genetics, 70(6), 1480-1489.

Scott, G. K., Goga, A., Bhaumik, D., Berger, C. E., Sullivan, C. S., & Benz, C. C. (2007). Coordinate suppression of ERBB2 and ERBB3 by enforced expression of micro-RNA miR-125a or miR-125b. Journal of Biological Chemistry, 282(2), 1479-1486. doi: 10.1074/jbc.M609383200

Seligson, D. B., & Shrawder, E. J. 4,935,342.

Sempere, L. F., Christensen, M., Silahtaroglu, A., Bak, M., Heath, C. V., Schwartz, G., . . . Cole, C. N. (2007). Altered MicroRNA Expression Confined to Specific Epithelial Cell Subpopulations in Breast Cancer. Cancer Research, 67(24), 11612-11620. doi: 10.1158/0008-5472.can-07-5019

Shen, J., Ambrosone, C. B., DiCioccio, R. A., Odunsi, K., Lele, S. B., & Zhao, H. (2008). A functional polymorphism in the miR-146a gene and age of familial breast/ovarian cancer diagnosis. Carcinogenesis, 29(10), 1963-1966. doi: 10.1093/carcin/bgn172

Sherry, S. T., Ward, M. H., Kholodov, M., Baker, J., Phan, L., Smigielski, E. M., & Sirotkin, K. (2001). dbSNP: the NCBI database of genetic variation. Nucleic Acids Research, 29(1), 308-311.

Shi, Y., Xiang, P., Li, L., & Shen, M. (2011). Analysis of 50 SNPs in CYP2D6, CYP2C19, CYP2C9, CYP3A4 and CYP1A2 by MALDI-TOF mass spectrometry in Chinese Han population. Forensic Science International, 207(1-3), 183-187. doi: 10.1016/j.forsciint.2010.10.004

208 0 Bibliography

Si, M. L., Zhu, S., Wu, H., Lu, Z., Wu, F., & Mo, Y. Y. (2007). miR-21-mediated tumor growth. Oncogene, 26(19), 2799-2803. doi: 10.1038/sj.onc.1210083

Singletary, S. E., Allred, C., Ashley, P., Bassett, L. W., Berry, D., Bland, K. I., . . . Greene, F. L. (2003). Staging system for breast cancer: revisions for the 6th edition of the AJCC Cancer Staging Manual. The Surgical clinics of North America, 83(4), 803-819.

Singletary, S. E., & Connolly, J. L. (2006). Breast cancer staging: working with the sixth edition of the AJCC Cancer Staging Manual. CA: A Cancer Journal for Clinicians, 56(1), 37-47; quiz 50-31.

Smigielski, E. M., Sirotkin, K., Ward, M., & Sherry, S. T. (2000). dbSNP: a database of single nucleotide polymorphisms. Nucleic Acids Research, 28(1), 352-355.

Smith, L. M., Sanders, J. Z., Kaiser, R. J., Hughes, P., Dodd, C., Connell, C. R., . . . Hood, L. E. (1986). Fluorescence detection in automated DNA sequence analysis. Nature, 321(6071), 674-679. doi: 10.1038/321674a0

Smith, R. A., Jedlinski, D. J., Gabrovska, P. N., Weinstein, S. R., Haupt, L. M., & Griffiths, L. R. (2012). A genetic variant located in miR-423 is associated with reduced breast cancer risk. Cancer Genomics Proteomics, 9(3), 115-118.

Song, M.-y., Su, H.-j., Zhang, L., Ma, J.-l., Li, J.-y., Pan, K.-f., & You, W.-c. (2013). Genetic Polymorphisms of <italic>miR-146a</italic> and <italic>miR-27a</italic>, <italic>H. pylori</italic> Infection, and Risk of Gastric Lesions in a Chinese Population. PloS one, 8(4), e61250. doi: 10.1371/journal.pone.0061250

Sørlie, T., Tibshirani, R., Parker, J., Hastie, T., Marron, J. S., Nobel, A., . . . Botstein, D. (2003). Repeated observation of breast tumor subtypes in independent gene expression data sets. Proceedings of the National Academy of Sciences of the United States of America, 100(14), 8418-8423. doi: 10.1073/pnas.0932692100

Spizzo, R., Nicoloso, M. S., Lupini, L., Lu, Y., Fogarty, J., Rossi, S., . . . Calin, G. A. (2010). miR-145 participates with TP53 in a death-promoting regulatory loop and targets estrogen receptor-alpha in human breast cancer cells. Cell Death and Differentiation, 17(2), 246-254. doi: 10.1038/cdd.2009.117

Stacey, S. N., Manolescu, A., Sulem, P., Rafnar, T., Gudmundsson, J., Gudjonsson, S. A., . . . Stefansson, K. (2007). Common variants on chromosomes 2q35 and 16q12 confer susceptibility to estrogen receptor-positive breast cancer. Nature Genetics, 39(7), 865-869.

Stepanov, V. A., & Trifonova, E. A. (2013). Multiplex genotyping of single nucleotide polymorphisms by MALDI-TOF mass-spectrometry: ferequencies of 56 SNP in immune response genes in human populations. Journal of Molecular Biology, 47(6), 976-986.

Stewart, B. W., Wild, C., International Agency for Research on, C., & World Health, O. (2014). World cancer report 2014. Lyon, France: International Agency for Research on Cancer.

Strachan, T., & Read, A. (2011). Human molecular genetics: Garland Science.

0 Bibliography 209

Strausberg, R. L., Levy, S., & Rogers, Y.-H. (2008). Emerging DNA sequencing technologies for human genomic medicine. Drug Discovery Today, 13(13–14), 569-577. doi: 10.1016/j.drudis.2008.03.025

Sun, G., Yan, J., Noltner, K., Feng, J., Li, H., Sarkis, D. A., . . . Rossi, J. J. (2009). SNPs in human miRNA genes affect biogenesis and function. RNA, 15(9), 1640-1651. doi: 10.1261/rna.1560209

Sun, Q., Gu, H., Zeng, Y., Xia, Y., Wang, Y., Jing, Y., . . . Wang, B. (2010). Hsa-mir-27a genetic variant contributes to gastric cancer susceptibility through affecting miR-27a and target gene expression. Cancer Science, 101(10), 2241-2247. doi: 10.1111/j.1349-7006.2010.01667.x

Syrmis, M. W., Moser, R. J., Kidd, T. J., Hunt, P., Ramsay, K. A., Bell, S. C., . . . Whiley, D. M. (2013). High-throughput single-nucleotide polymorphism-based typing of shared Pseudomonas aeruginosa strains in cystic fibrosis patients using the Sequenom iPLEX platform. Journal of Medical Microbiology, 62(Pt 5), 734-740. doi: 10.1099/jmm.0.055905-0

Szabo, C. I., Schutte, M., Broeks, A., Houwing-Duistermaat, J. J., Thorstenson, Y. R., Durocher, F., . . . Meijers-Heijboer, H. (2004). Are ATM Mutations 7271T→G and IVS10-6T→G Really High-Risk Breast Cancer-Susceptibility Alleles? Cancer Research, 64(3), 840-843. doi: 10.1158/0008-5472.can-03-2678

Takamizawa, J., Konishi, H., Yanagisawa, K., Tomida, S., Osada, H., Endoh, H., . . . Takahashi, T. (2004). Reduced expression of the let-7 microRNAs in human lung cancers in association with shortened postoperative survival. Cancer Research, 64(11), 3753-3756. doi: 10.1158/0008-5472.CAN-04-0637

Tan, S. C., & Yiap, B. C. (2009). DNA, RNA, and Protein Extraction: The Past and The Present. Journal of Biomedicine and Biotechnology, 2009. doi: 10.1155/2009/574398

Tanaka, K., Waki, H., Ido, Y., Akita, S., Yoshida, Y., Yoshida, T., & Matsuo, T. (1988). Protein and polymer analyses up to m/z 100 000 by laser ionization time-of-flight mass spectrometry. Rapid Communications in Mass Spectrometry, 2(8), 151-153. doi: 10.1002/rcm.1290020802

Tetzlaff, M. T., Liu, A., Xu, X., Master, S. R., Baldwin, D. A., Tobias, J. W., . . . Baloch, Z. W. (2007). Differential expression of miRNAs in papillary thyroid carcinoma compared to multinodular goiter using formalin fixed paraffin embedded tissues. Endocrine Pathology, 18(3), 163-173. doi: 10.1007/s12022-007-0023-7

Thomas, G., Jacobs, K. B., Kraft, P., Yeager, M., Wacholder, S., Cox, D. G., . . . Hunter, D. J. (2009). A multistage genome-wide association study in breast cancer identifies two new risk alleles at 1p11.2 and 14q24.1 (RAD51L1). Nature Genetics, 41(5), 579-584.

Thomas, S., Tilzer, L., & Moreno, R. US Patent No. US 5175271.

Trejo-de la O, A., Torres, J., Pérez-Rodríguez, M., Camorlinga-Ponce, M., Luna, L. F., Abdo-Francis, J. M., . . . Maldonado-Bernal, C. (2008). TLR4 single-

210 0 Bibliography

nucleotide polymorphisms alter mucosal cytokine and chemokine patterns in Mexican patients with Helicobacter pylori-associated gastroduodenal diseases. Clinical Immunology, 129(2), 333-340.

Troutman, T., Prasauckas, K. A., Kennedy, M. A., Stevens, J., Davies, M. G., & Dadd, A. T. (2002). How to Properly Use and Maintain Laboratory Equipment Molecular Biology Problem Solver (pp. 49-111): John Wiley & Sons, Inc.

Turnbull, C., Ahmed, S., Morrison, J., Pernet, D., Renwick, A., Maranian, M., . . . Easton, D. F. (2010). Genome-wide association study identifies five new breast cancer susceptibility loci. Nature Genetics, 42(6), 504-507.

Ullrich, A., Shine, J., Chirgwin, J., Pictet, R., Tischer, E., Rutter, W., & Goodman, H. (1977). Rat insulin genes: construction of plasmids containing the coding sequences. Science, 196(4296), 1313-1319. doi: 10.1126/science.325648

van de Vijver, M. J., He, Y. D., van 't Veer, L. J., Dai, H., Hart, A. A. M., Voskuil, D. W., . . . Bernards, R. (2002). A Gene-Expression Signature as a Predictor of Survival in Breast Cancer. New England Journal of Medicine, 347(25), 1999-2009. doi: doi:10.1056/NEJMoa021967

Ventura, A., Young, A. G., Winslow, M. M., Lintault, L., Meissner, A., Erkeland, S. J., . . . Jacks, T. (2008). Targeted deletion reveals essential and overlapping functions of the miR-17 through 92 family of miRNA clusters. Cell, 132(5), 875-886. doi: 10.1016/j.cell.2008.02.019

Vogelstein, B., & Gillespie, D. (1979). Preparative and analytical purification of DNA from agarose. Proceedings of the National Academy of Sciences of the United States of America, 76(2), 615-619.

Volinia, S., Calin, G. A., Liu, C.-G., Ambs, S., Cimmino, A., Petrocca, F., . . . Croce, C. M. (2006). A microRNA expression signature of human solid tumors defines cancer gene targets. Proceedings of the National Academy of Sciences of the United States of America, 103(7), 2257-2261. doi: 10.1073/pnas.0510565103

Walsh, P. S., Metzger, D. A., & Higushi, R. (2013). Chelex 100 as a medium for simple extraction of DNA for PCR-based typing from forensic material. BioTechniques 10(4): 506-13 (April 1991). Biotechniques, 54(3), 134-139.

Wang, A., Xu, B., Tong, N., Chen, S., Yang, Y., Zhang, X., . . . Hu, X. (2012). Meta-analysis confirms that a common G/C variant in the pre-miR-146a gene contributes to cancer susceptibility and that ethnicity, gender and smoking status are risk factors. Genetics and Molecular Research, 11, 3051-3062.

Wang, P. Y., Gao, Z. H., Jiang, Z. H., Li, X. X., Jiang, B. F., & Xie, S. Y. (2013). The associations of single nucleotide polymorphisms in miR-146a, miR-196a and miR-499 with breast cancer susceptibility. PloS one, 8(9), e70656. doi: 10.1371/journal.pone.0070656

Wang, S., Bian, C., Yang, Z., Bo, Y., Li, J., Zeng, L., . . . Zhao, R. C. (2009). miR-145 inhibits breast cancer cell growth through RTKN. International Journal of Oncology, 34(5), 1461-1466. doi: 10.3892/ijo_00000275

0 Bibliography 211

Wang, Z. X., Lu, B. B., Wang, H., Cheng, Z. X., & Yin, Y. M. (2011). MicroRNA-21 modulates chemosensitivity of breast cancer cells to doxorubicin by targeting PTEN. Archives of Medical Research, 42(4), 281-290. doi: 10.1016/j.arcmed.2011.06.008

Weinberg, R. A. (2007). The biology of cancer: Garland Science.

Welcsh, P. L., & King, M.-C. (2001). BRCA1 and BRCA2 and the genetics of breast and ovarian cancer. Human Molecular Genetics, 10(7), 705-713. doi: 10.1093/hmg/10.7.705

Wen, Y. H., Shi, X., Chiriboga, L., Matsahashi, S., Yee, H., & Afonja, O. (2007). Alterations in the expression of PDCD4 in ductal carcinoma of the breast. Oncology Reports, 18(6), 1387-1393.

Werner, M., Sych, M., Herbon, N., Illig, T., Konig, I. R., & Wjst, M. (2002). Large-scale determination of SNP allele frequencies in DNA pools using MALDI-TOF mass spectrometry. Human Mutation, 20(1), 57-64. doi: 10.1002/humu.10094

Westholm, J. O., & Lai, E. C. (2011). Mirtrons: microRNA biogenesis via splicing. Biochimie, 93(11), 1897-1904. doi: 10.1016/j.biochi.2011.06.017

Wigginton, J. E., Cutler, D. J., & Abecasis, G. R. (2005). A note on exact tests of Hardy-Weinberg equilibrium. The American Journal of Human Genetics, 76(5), 887-893.

Wittwer, C. T., Reed, G. H., Gundry, C. N., Vandersteen, J. G., & Pryor, R. J. (2003). High-resolution genotyping by amplicon melting analysis using LCGreen. Clinical Chemistry, 49(6 Pt 1), 853-860.

Wolff, M. S., Collman, G. W., Barrett, J. C., & Huff, J. (1996). Breast cancer and environmental risk factors: epidemiological and experimental findings. Annual Review of Pharmacology and Toxicology, 36, 573-596. doi: 10.1146/annurev.pa.36.040196.003041

Wooster, R., Bignell, G., Lancaster, J., Swift, S., Seal, S., Mangion, J., . . . Stratton, M. R. (1995). Identification of the breast cancer susceptibility gene BRCA2. Nature, 378(6559), 789-792.

Wu, Q., Lu, Z., Li, H., Lu, J., Guo, L., & Ge, Q. (2011). Next-Generation Sequencing of MicroRNAs for Breast Cancer Detection. Journal of Biomedicine and Biotechnology, 2011. doi: 10.1155/2011/597145

Wu, Y., Jin, M., Liu, B., Liang, X., Yu, Y., Li, Q., . . . Chen, K. (2011). The association of XPC polymorphisms and tea drinking with colorectal cancer risk in a Chinese population. Molecular Carcinogenesis, 50(3), 189-198.

Xu, B., Feng, N.-H., Li, P.-C., Tao, J., Wu, D., Zhang, Z.-D., . . . Wu, H.-F. (2010). A functional polymorphism in Pre-miR-146a gene is associated with prostate cancer risk and mature miR-146a expression in vivo. The Prostate, 70(5), 467-472. doi: 10.1002/pros.21080

Xu, T., Zhu, Y., Wei, Q. K., Yuan, Y., Zhou, F., Ge, Y. Y., . . . Zhuang, S. M. (2008). A functional polymorphism in the miR-146a gene is associated with the

212 0 Bibliography

risk for hepatocellular carcinoma. Carcinogenesis, 29(11), 2126-2131. doi: 10.1093/carcin/bgn195

Xu, Y., Gu, L., Pan, Y., Li, R., Gao, T., Song, G., . . . He, B. (2013). Different Effects of Three Polymorphisms in MicroRNAs on Cancer Risk in Asian Population: Evidence from Published Literatures. PloS one, 8(6), e65123. doi: 10.1371/journal.pone.0065123

Xu, Y., Li, L., Xiang, X., Wang, H., Cai, W., Xie, J., . . . Xie, Q. (2013). Three common functional polymorphisms in microRNA encoding genes in the susceptibility to hepatocellular carcinoma: A systematic review and meta-analysis. Gene, 527(2), 584-593.

Yamashita, J., Iwakiri, T., Fukushima, S., Jinnin, M., Miyashita, A., Hamasaki, T., . . . Ihn, H. (2013). The rs2910164 G>C polymorphism in microRNA-146a is associated with the incidence of malignant melanoma. Melanoma Research, 23(1), 13-20.

Yan, L.-X., Huang, X.-F., Shao, Q., Huang, M. A. Y., Deng, L., Wu, Q.-L., . . . Shao, J.-Y. (2008). MicroRNA miR-21 overexpression in human breast cancer is associated with advanced clinical stage, lymph node metastasis and patient poor prognosis. RNA, 14(11), 2348-2360. doi: 10.1261/rna.1034808

Yanaihara, N., Caplen, N., Bowman, E., Seike, M., Kumamoto, K., Yi, M., . . . Harris, C. C. (2006). Unique microRNA molecular profiles in lung cancer diagnosis and prognosis. Cancer Cell, 9(3), 189-198. doi: 10.1016/j.ccr.2006.01.025

Yang, R., & Burwinkel, B. (2012). A bias in genotyping the miR-27a rs895819 and rs11671784 variants. Breast Cancer Research and Treatment, 134(2), 899-901. doi: 10.1007/s10549-012-2140-3

Yang, R., Schlehe, B., Hemminki, K., Sutter, C., Bugert, P., Wappenschmidt, B., . . . Burwinkel, B. (2010). A genetic variant in the pre-miR-27a oncogene is associated with a reduced familial breast cancer risk. Breast Cancer Research and Treatment, 121(3), 693-702.

Ye, Y., Wang, K. K., Gu, J., Yang, H., Lin, J., Ajani, J. A., & Wu, X. (2008). Genetic variations in microRNA-related genes are novel susceptibility loci for esophageal cancer risk. Cancer Prevention Research, 1(6), 460-469. doi: 10.1158/1940-6207.CAPR-08-0135

Yi, R., Qin, Y., Macara, I. G., & Cullen, B. R. (2003). Exportin-5 mediates the nuclear export of pre-microRNAs and short hairpin RNAs. Genes and Development, 17(24), 3011-3016. doi: 10.1101/gad.1158803

Youl, P. H., Baade, P. D., Aitken, J. F., Chambers, S. K., Turrell, G., Pyke, C., & Dunn, J. (2011). A multilevel investigation of inequalities in clinical and psychosocial outcomes for women after breast cancer. BMC cancer, 11, 415. doi: 10.1186/1471-2407-11-415

Yu, Z., Baserga, R., Chen, L., Wang, C., Lisanti, M. P., & Pestell, R. G. (2010). microRNA, Cell Cycle, and Human Breast Cancer. The American Journal of Pathology, 176(3), 1058-1064. doi: 10.2353/ajpath.2010.090664

0 Bibliography 213

Yu, Z., Wang, C., Wang, M., Li, Z., Casimiro, M. C., Liu, M., . . . Pestell, R. G. (2008). A cyclin D1/microRNA 17/20 regulatory feedback loop in control of breast cancer cell proliferation. Journal of Cell Biology, 182(3), 509-517. doi: 10.1083/jcb.200801079

Zhang, B., Beeghly-Fadiel, A., Long, J., & Zheng, W. (2011). Genetic variants associated with breast-cancer risk: comprehensive research synopsis, meta-analysis, and epidemiological evidence. The lancet oncology, 12(5), 477-488. doi: 10.1016/s1470-2045(11)70076-6

Zhang, B., Pan, X., Cobb, G. P., & Anderson, T. A. (2007). microRNAs as oncogenes and tumor suppressors. Developmental Biology, 302(1), 1-12. doi: 10.1016/j.ydbio.2006.08.028

Zhang, B., Wang, Q., & Pan, X. (2007). MicroRNAs and their regulatory roles in animals and plants. Journal of Cellular Physiology, 210(2), 279-289. doi: 10.1002/jcp.20869

Zhang, C.-M., Zhao, J., & Deng, H.-Y. (2013). MiR-155 promotes proliferation of human breast cancer MCF-7 cells through targeting tumor protein 53-induced nuclear protein 1. Journal of Biomedical Science, 20(1), 79.

Zhang, L., Huang, J., Yang, N., Greshock, J., Megraw, M. S., Giannakakis, A., . . . Coukos, G. (2006). microRNAs exhibit high frequency genomic alterations in human cancer. Proceedings of the National Academy of Sciences of the United States of America, 103(24), 9136-9141. doi: 10.1073/pnas.0508889103

Zhang, M., Jin, M., Yu, Y., Zhang, S., Wu, Y., Liu, H., . . . Chen, K. (2012). Associations of miRNA polymorphisms and female physiological characteristics with breast cancer risk in Chinese population. European Journal of Cancer Care, 21(2), 274-280. doi: 10.1111/j.1365-2354.2011.01308.x

Zhang, N., Huo, Q., Wang, X., Chen, X., Long, L., Jiang, L., . . . Yang, Q. (2013). A genetic variant in pre-miR-27a is associated with a reduced breast cancer risk in younger Chinese population. Gene, 529(1), 125-130. doi: 10.1016/j.gene.2013.07.041

Zhang, Y., Jin, M., Liu, B., Ma, X., Yao, K., Li, Q., & Chen, K. (2008). Association between H-RAS T81C genetic polymorphism and gastrointestinal cancer risk: a population based case-control study in China. BMC cancer, 8, 256.

Zhang, Y., Liu, B., Jin, M., Ni, Q., Liang, X., Ma, X., . . . Chen, K. (2009). Genetic polymorphisms of transforming growth factor-β1 and its receptors and colorectal cancer susceptibility: A population-based case-control study in China. Cancer Letters, 275(1), 102-108.

Zhong, S., Chen, Z., Xu, J., Li, W., & Zhao, J. (2013). Pre-mir-27a rs895819 polymorphism and cancer risk: a meta-analysis. Molecular Biology Reports, 40(4), 3181-3186. doi: 10.1007/s11033-012-2392-3

Zhu, S., Si, M. L., Wu, H., & Mo, Y. Y. (2007). MicroRNA-21 targets the tumor suppressor gene tropomyosin 1 (TPM1). Journal of Biological Chemistry, 282(19), 14328-14336. doi: 10.1074/jbc.M611393200

214 0 Bibliography

Zhu, W., Qin, W., Atasoy, U., & Sauter, E. (2009). Circulating microRNAs in breast cancer and healthy subjects. BMC Research Notes, 2(1), 89.

0 Bibliography 215

Appendices

Appendix A Modified salting out protocol for DNA extraction from whole blood samples.

Adapted from Nasiri et al., 2005.

1. Thaw blood by placing in 4°C fridge for a few hours or quick-thawing at

room temperature (RT) on bench.

2. Transfer contents of blood vial by pipetting blood carefully into a 15ml

falcon tube.

3. Add 8ml of lysis buffer (0.3 M sucrose, 0.01 M TrisHCl, pH 7.5, 5mM

MgCl2, 1% Triton X100) to each sample and mix by inversion.

4. Centrifuge the tubes at 2500rcf for 30mins at 4°C. Prepare a beaker with

diluted (5-10%) Decon90® for blood waste disposal.

5. Discard the supernatant by pouring carefully into beaker containing

Decon90® without disturbing pellet on the bottom. Alternatively, remove

supernatant by using a transfer pipette, without disturbing pellet.

6. Add 300µl of 10mM TrisHCl, pH 8.0 to the falcon tube and break up the

pellet using a transfer pipette. Ensure pellet is completely broken

up/resuspended.

7. Transfer to fresh microfuge tubes and resuspend by vortexing vigorously.

216 0 Appendices

8. Centrifuge at 2500rcf for 20 mins at 4°C temperature.

9. Discard supernatant, and add 330µl of 10mM TrisHCl, pH 8.0; 330 µl of

laundry powder (30mg/ml) and a glass bead. Vortex the samples for 1 min

and add 250 µl of 6M NaCl and vortex again for 20 seconds.

10. Transfer sample to a fresh 2ml microfuge tube and centrifuge tube at

15000rcf for 10 min.

11. Transfer supernatant to fresh microfuge tubes (about 900µl) and precipitate

by addition of 900µl of chilled 100% ethanol. Gently swirl and observe the

DNA strands precipitate.

12. Using an inoculating loop, washed twice in 70% ethanol, transfer the DNA

into one labelled 2ml tube containing 1ml 1X TE buffer (10 mM Tris pH 8.0,

1 mM EDTA).

13. Parafilm the 2ml tube and allow the DNA to dissolve by incubating at 70°C

for 5 minutes or at 37 °C overnight.

14. Discard the blood waste into the beaker containing Decon90®, cover with

aluminium foil and let sit overnight. It can be poured down the sink with

copious amounts of water the next day.

15. Vortex and quantitate using the Thermo Scientific NanoDropTM 8000 UV-Vis

Spectrophotometer (Thermo Fisher Scientific Inc., Wilmington, DE. USA) or

other methods (picogreen, fluorometer) and keep a record of concentrations.

Store DNA tubes in appropriate boxes in -20°C freezer.

0 Appendices 217

16. Prepare DNA to a concentration of 20 ng/µl dilution from 1X TE stock DNA

using sterilised deionised water and use in PCRs (store this at 4°C).

218 0 Appendices

Appendix B Protocol for genotyping using multiplex PCR and matrix-assisted laser

desorption/ionisation time-of-flight mass spectrometry (MALDI-TOF MS) analysis.

Primary Multiplex PCR

1. Set up multiplex PCR reactions to a final volume of 5 µl using 10 ng of

genomic DNA, 0.1 µM of PCR Primers mix (IDT® Pte. Ltd., Baulkham Hills,

NSW, Australia), 1 U PCR enzyme (Roche®, Indianapolis, IN, USA), 1X

SEQUENOM PCR Buffer (SEQUENOM Inc., San Diego, CA, USA), 2 mM

SEQUENOM MgCl2 (SEQUENOM Inc., San Diego, CA, USA) and 500 µM

SEQUENOM 2´-deoxynucleoside- 5´-triphosphate (dNTP) mix

(SEQUENOM Inc., San Diego, CA, USA) in an Applied Biosystems®

MicroAmp® EnduraPlate™ Optical 96-Well Clear Reaction Plates with

Barcode (Life Technologies Australia Pty Ltd., Mulgrave, VIC, Australia).

2. Place reaction plate in Applied Biosystems® Veriti® 96-Well Thermal Cycler

(Life Technologies Australia Pty Ltd., Mulgrave, VIC, Australia) and

perform primary PCR using the following conditions: initial denaturation at

95°C for 2 min; 45 cycles at 95°C for 30 s, 56°C for 30 s, 72°C for 1 min;

and final extension at 72°C for 5 min.

Shrimp Alkaline Phosphatase (SAP) dephosphorylation

3. Remove plate from thermal cycler and add 0.5 U of shrimp alkaline

phosphatase (SAP) (iPLEX® GOLD reaction kit, SEQUENOM Inc., San

Diego, CA, USA).

0 Appendices 219

4. To neutralize unincorporated dNTPs run dephosphorylation reaction using

the following cycling conditions: 37°C for 40 min and 85°C for 5 min.

iPLEX® primer extension reactions

5. Remove reaction plate from thermal cycler and perform primer extension

reactions using the iPLEX® Gold Reagent Kit according to the SEQUENOM

linear adjustment method described in iPLEX™ GOLD genotyping

application guide.

6. Prepare the iPLEX® reaction cocktail using: 0.222X iPLEX® Buffer Plus

(SEQUENOM Inc., San Diego, CA, USA), 1X iPLEX® Termination Mix

(SEQUENOM Inc., San Diego, CA, USA), 1X iPLEX® Enzyme

(SEQUENOM Inc., San Diego, CA, USA) and an extend primer mixture

made according to the manufacturer’s linear adjustment method.

7. Add iPLEX® reaction cocktail to the amplification products and place

reaction plate in the thermal cycler. Run iPLEX® primer extension reactions

using the following conditions: initial denaturation at 94°C for 30s; 40 cycles

of 94°C for 5s, annealing at 52°C for 5s and extension 80°C for 5s. Annealing

and extension procedure repeats five-times (to give a total of 200 short

cycles), followed by a 5 s denaturation step, and a final extension step of 3

min at 72°C.

8. Remove reaction plate and add 41 µl of sterile water and 15 mg of Clean

Resin (SEQUENOM Inc., San Diego, CA, USA) to each iPLEX® reaction.

Seal reaction plate and mix continuously by rotation for at least 15 minutes.

9. Centrifuge reaction plate at 3200g for 5 minutes.

220 0 Appendices

MALDI-TOF MS analysis and data analysis

10. Transfer reaction plate to the SEQUENOM® MassARRAY® Nanodispenser

(SEQUENOM Inc., San Diego, CA, USA) and dispense a total of 12-16 nl of

each iPLEX® reaction product onto a SpectroCHIP® II G96 (SEQUENOM

Inc., San Diego, CA, USA).

11. Run SpectroCHIP® analysis using SEQUENOM® MassArray® Analyzer 4

and laser desorption/ionisation, automated spectra acquisition using

SpectroAcquire Version 4.0 (SEQUENOM Inc., San Diego, CA, USA).

12. Perform genotype data analysis with the MassARRAY® Typer software

version 4.0 (SEQUENOM Inc., San Diego, CA, USA).

0 Appendices 221

Appendix C Protocol for genotyping using automated Sanger sequencing

1. Prepare a 10 µl Exo-SAP-IT Clean-up reaction using 5ul of HRM product, 4

µl of Reverse Osmosis water and 1µl of Exo-SAP-IT. Mix reaction and

incubate in Applied Biosystems® Veriti® 96-Well Thermal Cycler (Life

Technologies Australia Pty Ltd., Mulgrave, VIC, Australia) using the

following cycling conditions: 37°C for 15 min and 80°C for 15 min.

2. Subsequently, prepare a 20 µl reaction mixing 15 ng/µl of the Exo-SAP-IT

product, 650nM of each primer (separate reactions for forward and reverse

primer), 3.5ul of BDT v3.1 Ready Reaction Mix (100RR) and 1x Sequencing

Buffer. Incubate in Applied Biosystems® Veriti® 96-Well Thermal Cycler

(Life Technologies Australia Pty Ltd., Mulgrave, VIC, Australia) using the

following cycling conditions: 1 cycle of 96°C for 1 min, 30 cycles of 96°C

for 10 s, 50°C for 5 s. and 60°C for 4 min, 1 cycle of 4°C for 5 min and 1

cycle of 10°C for 5 min.

3. Perform standard ethanol precipitation on the products and load them onto the

ABI-3130 Genetic Analyzer (Life Technologies Australia Pty Ltd., Mulgrave,

VIC, Australia) to perform automated Sanger sequencing.

222 0 Appendices

study of mirna polymorphisms and their potential...

Documents