
IDENTIFICATION AND VALIDATION OF CANDIDATE BREAST CANCER BIOMARKERS: A MASS SPECTROMETRIC APPROACH

by

Vathany Kulasingam

A thesis submitted in conformity with the requirements for the Degree of Doctor of Philosophy

Graduate Department of Laboratory Medicine and Pathobiology University of Toronto

© Copyright by Vathany Kulasingam 2008


IDENTIFICATION AND VALIDATION OF CANDIDATE BREAST CANCER BIOMARKERS: A MASS SPECTROMETRIC APPROACH

Vathany Kulasingam

Doctor of Philosophy 2008

Department of Laboratory Medicine and Pathobiology

University of Toronto

ABSTRACT

One of the best ways to diagnose breast cancer early or to predict therapeutic response is to use serum biomarkers. Unfortunately, effective serological biomarkers for breast cancer are not yet available. We hypothesized that novel candidate tumor markers for breast cancer may be secreted or shed proteins that can be detected in tissue culture supernatants of human breast cancer cell lines. A two-dimensional liquid chromatography-tandem mass spectrometry (2D-LC-MS/MS) strategy was utilized to identify and compare levels of extracellular and membrane-bound proteins in the conditioned media. Proteomic analysis of the media identified in excess of 600, 500 and 700 proteins in MCF-10A, BT474 and MDA-MB-468, respectively. We successfully identified the internal control proteins, kallikreins 5, 6 and 10 (ranging in concentration from 2-50 µg/L), as validated by ELISA, and confidently identified HER-2/neu in BT474 cells. Sub-cellular localization was determined based on Gene Ontology (GO) for the 1,139 proteins, of which 34% were classified as extracellular and membrane-bound. Tissue specificity, functional classifications and label-free quantification were performed. The levels of eleven promising molecules were measured in biological samples to determine their ability to discriminate cases from controls. This screen yielded activated leukocyte cell adhesion molecule (ALCAM) as a promising candidate. The levels of ALCAM, in addition to the classical breast cancer tumor markers carbohydrate antigen 15-3 (CA 15-3) and carcinoembryonic antigen (CEA), were examined in 300 serum samples by quantitative ELISA. All three biomarkers effectively separated cancer from non-cancer groups. ALCAM, with an area under the curve (AUC) of 0.78 [95% CI: 0.73, 0.84], outperformed CA 15-3 (AUC = 0.70 [95% CI: 0.64, 0.76]) and CEA (AUC = 0.63 [95% CI: 0.56, 0.70]). The increase in AUC for ALCAM over that for CA 15-3 was statistically significant (DeLong test, p < 0.05). Serum ALCAM appears to be a new biomarker for breast cancer and may have value for disease diagnosis.


DEDICATION

I dedicate this PhD thesis to my beloved Swami whose guidance and grace

enabled me to complete this degree and to my late father, who taught me

through example to face adversity and to conquer it, no matter the nature of the

challenge.


ACKNOWLEDGEMENTS

“Matha, Pitha, Guru, Daivam” in Sanskrit means 'Mother, Father, Teacher,

God'. It represents the order of significance and respect to be accorded to these

people. Therefore, first and foremost, with utmost humility, I would like to thank my

mother and late father for investing in me. There were many sacrifices made along

the way and I am truly indebted to my family for providing me with unconditional

support and love. Thank you for instilling in me the values of hard work, dedication

and determination.

My PhD journey has been a process of experiencing, learning and maturing,

both in terms of scientific knowledge and personal growth. This thesis would not

have been possible without the confidence, passion and motivation of my

teacher/supervisor. I am grateful to Dr. Eleftherios P. Diamandis for his guidance,

mentorship, patience, encouragement and friendship. Though I do not say it often,

THANK YOU, for giving me the opportunity to work in your laboratory and to explore

the field of scientific research.

I would also like to acknowledge the members of my PhD advisory and oral

examination committee for their mentorship and valuable advice: Drs. Andrew Emili,

Hilmi Ozcelik, Joe Minta, James Scholey and K.W. Michael Siu. I would like to

extend my acknowledgements to the Department of Laboratory Medicine and

Pathobiology, University of Toronto, and to my funding sources: Natural Sciences

and Engineering Research Council of Canada and Proteomic Methods Inc.


In addition, many thanks to all members of the Advanced Centre for Detection

of Cancer (ACDC) laboratory, past and present, for their continued support and

friendship. Thank you for making my stay in the laboratory so much more than a

scientific exercise. In particular, I am grateful to Christopher R. Smith, Ihor Batruch,

Girish Sardana, Anton Soosaipillai and Tammy Earle for their technical support,

expertise and friendship.

Last, but certainly not least, my most humble salutations at the divine lotus

feet of my beloved Swami. May your grace always be with me and may your light

guide me in the right path.


TABLE OF CONTENTS

ABSTRACT
DEDICATION
ACKNOWLEDGEMENTS
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS

CHAPTER 1: INTRODUCTION
1.1 Breast Cancer
    1.1.1 Statistics / Epidemiology
    1.1.2 Anatomy and types of breast cancer
    1.1.3 Current breast cancer screening methods
    1.1.4 Early diagnosis of breast cancer is essential
1.2 Cancer biomarkers
    1.2.1 Definition and types of biomarkers
    1.2.2 Characteristics of an ideal tumor marker
    1.2.3 Historical overview of cancer biomarkers
    1.2.4 Current applications of tumor markers and their clinical utility
    1.2.5 Currently-available breast cancer biomarkers: Clinical utility and limitations
        1.2.5.1 MUC-1
        1.2.5.2 CEA
        1.2.5.3 Circulating levels of HER-2/neu
        1.2.5.4 Other promising serological breast cancer markers
        1.2.5.5 Non-serological markers for breast cancer
    1.2.6 Renewed interest in discovering novel breast cancer biomarkers
1.3 Mechanisms of biomarker elevation in biological fluids
    1.3.1 Gene over-expression
    1.3.2 Increased protein secretion and shedding
    1.3.3 Angiogenesis, invasion and destruction of tissue architecture
1.4 Strategies for discovering novel cancer biomarkers
    1.4.1 Gene-expression profiling
    1.4.2 Mass spectrometry-based profiling
    1.4.3 Peptidomics
    1.4.4 Cancer biomarker family approach
    1.4.5 Secreted protein approach
    1.4.6 Other prominent strategies
1.5 Emergence of proteomics and relevance to breast cancer
    1.5.1 Basic components of a mass spectrometer
    1.5.2 Breast cancer proteomics: Sources to mine for biomarkers
    1.5.3 Tissue culture based biomarker discovery platform
1.6 Purpose and aims of the present study
    1.6.1 Rationale
    1.6.2 Hypothesis
    1.6.3 Objectives

CHAPTER 2: ANALYSIS OF THE CONDITIONED MEDIA OF THREE BREAST CELL LINES
2.1 Introduction
2.2 Materials and Methods
    2.2.1 Cell lines
    2.2.2 Cell culture
    2.2.3 Sample preparation
    2.2.4 Strong cation exchange liquid chromatography
    2.2.5 Tandem mass spectrometry (LC-MS/MS)
    2.2.6 Data analysis
    2.2.7 Spectral counting
    2.2.8 Total protein and lactate dehydrogenase assay
    2.2.9 Quantification of KLK5, KLK6 and KLK10
2.3 Results
    2.3.1 Optimization of cell culture
    2.3.2 Identification of proteins by MS
    2.3.3 Identification of internal control proteins in CM by MS
    2.3.4 Cellular localization of identified proteins
    2.3.5 Overlap of proteins between the three cell lines
    2.3.6 Cell lysate proteome
    2.3.7 Spectral counting and identification of differentially expressed proteins
2.4 Discussion

CHAPTER 3: BIOINFORMATICS AND CANDIDATE SELECTION
3.1 Introduction
3.2 Materials and Methods
    3.2.1 Tissue-specific expression
    3.2.2 Biological functions analysis
    3.2.3 Comparison of proteins identified from CM with other publications
    3.2.4 Single nucleotide polymorphisms (SNPs) and human Plasma Proteome database
    3.2.5 Selection of candidates
3.3 Results
    3.3.1 Tissue-specific expression
    3.3.2 Biological functions analysis
    3.3.3 Comparison of proteins identified from CM with other publications
    3.3.4 SNPs and human Plasma Proteome database
    3.3.5 Selection of candidates
3.4 Discussion

CHAPTER 4: VERIFICATION PHASE
4.1 Introduction
4.2 Materials and Methods
    4.2.1 Quantification of Elafin and Kallikrein 5, 6 and 10
    4.2.2 Verification strategy
    4.2.3 Quantification of Cystatin C, Lipocalin-2 and Transforming growth factor beta-2
    4.2.4 Quantification of ALCAM, BCAM, NrCAM and Fractalkine
4.3 Results
    4.3.1 Elafin
    4.3.2 Kallikrein 5, 6, 10
    4.3.3 Cystatin C
    4.3.4 Lipocalin-2
    4.3.5 Transforming growth factor beta-2
    4.3.6 B-cell adhesion molecule (BCAM)
    4.3.7 Neuronal cell adhesion molecule (NrCAM)
    4.3.8 Fractalkine
    4.3.9 Activated leukocyte cell adhesion molecule (ALCAM)
4.4 Discussion

CHAPTER 5: VALIDATION OF ALCAM AS A SEROLOGICAL BREAST CANCER DIAGNOSTIC MARKER
5.1 Introduction
5.2 Materials and Methods
    5.2.1 Patients and specimens
    5.2.2 Measurement of ALCAM, CA 15-3 and CEA in serum
    5.2.3 Data analysis and statistics
5.3 Results
    5.3.1 ALCAM ELISA assay development
    5.3.2 Association of biomarkers with age
    5.3.3 Correlations among biomarkers
    5.3.4 Association of biomarkers with tumor characteristics for cases
    5.3.5 Association of biomarkers with breast cancer
    5.3.6 The diagnostic values of the three markers
5.4 Discussion

CHAPTER 6: SUMMARY AND FUTURE DIRECTIONS
6.1 Summary
6.2 Future Directions

References


LIST OF TABLES

Table 1.1 Definitions and specifications of biomarkers
Table 1.2 Advantages and disadvantages of a cell culture-based model for biomarker discovery
Table 2.1 Total number of proteins identified per number of peptides
Table 2.2 Top 100 differentially expressed extracellular and membrane proteins
Table 3.1 Proteins elevated in breast tumor tissue proteome and found in CM analysis
Table 3.2 Proposed filtering criteria for candidate selection
Table 3.3 Top 30 candidates from discovery phase
Table 3.4 Our candidate selection criteria
Table 3.5 Top 11 candidates selected for verification phase
Table 5.1 Spearman's rank correlation coefficients among 3 markers for female controls and cases
Table 5.2 Marker distributions by tumor characteristics for cases
Table 5.3 Results from logistic regression models
Table 5.4 ROC analysis for biomarkers


LIST OF FIGURES

Figure 2.1 Cell number, total protein, LDH and kallikrein levels in 24 hour conditioned media of cell lines
Figure 2.2 Outline of experimental workflow
Figure 2.3 Number of proteins identified in CM by LC-MS/MS for the 3 cell lines
Figure 2.4 Cellular localization for the 3 cell lines
Figure 2.5 Overlap of proteins in CM
Figure 2.6 Proteome of MDA-MB-468 cell lysate by LC-MS/MS
Figure 2.7 Overlap of proteins between cell lines and cellular localization using label-free quantification
Figure 3.1 Biological functions analyses
Figure 3.2 Canonical pathways and known genes linked to breast cancer
Figure 3.3 Overlap among other publications
Figure 4.1 Levels of elafin in biological samples
Figure 4.2 Levels of KLK5, KLK6 and KLK10 in biological samples
Figure 4.3 Levels of cystatin C in serum
Figure 4.4 Levels of lipocalin-2 in serum
Figure 4.5 Levels of transforming growth factor beta-2 (TGF-β2) in serum
Figure 4.6 Levels of B-cell adhesion molecule (BCAM) in serum
Figure 4.7 Levels of neuronal cell adhesion molecule (NrCAM) in serum
Figure 4.8 Levels of fractalkine in serum
Figure 4.9 Levels of activated leukocyte cell adhesion molecule (ALCAM) in serum
Figure 5.1 Model of homophilic ALCAM-ALCAM interactions between cells
Figure 5.2 Schematic of a sandwich ELISA assay
Figure 5.3 Scatter plot of individual markers for cases and female controls versus age
Figure 5.4 Scatter plot of ALCAM distribution by tumor grade
Figure 5.5 Distribution of markers: ALCAM, CA 15-3 and CEA
Figure 5.6 ROC curves for the three markers (CA 15-3, CEA, ALCAM)


LIST OF ABBREVIATIONS

2-DE, 2-dimensional electrophoresis

ACN, acetonitrile

AFP, alpha-fetoprotein

ALCAM, activated leukocyte cell adhesion molecule

ASCO, American Society of Clinical Oncology

AUC, area under the curve

BCAM, B-cell adhesion molecule

CA 15-3, carbohydrate antigen 15-3

CAMs, cell adhesion molecules

CBE, clinical breast examination

CDCHO, Chemically Defined Chinese Hamster Ovary

CEA, carcinoembryonic antigen

CI, confidence interval

CIS, carcinoma in situ

CM, conditioned media

CV, coefficient of variation

DFP, diflunisal phosphate

DFS, disease-free survival

DTT, dithiothreitol

ECD, extracellular domain

ELISA, enzyme-linked immunosorbent assay

ER, estrogen receptor


ESI, electrospray ionization

FBS, fetal bovine serum

FDA, Food and Drug Administration

GO, gene ontology

HCG, human chorionic gonadotropin-β

HE4, human epididymis protein 4

HPLC, high performance liquid chromatography

ICAT, isotope-coded affinity tags

IGFBP, insulin-like growth factor binding protein

Ig-SF, immunoglobulin superfamily

IHC, immunohistochemistry

IPA, Ingenuity Pathways Analysis

KLK (in italics), kallikrein gene

KLK, kallikrein protein

LC-MS/MS, liquid chromatography tandem mass spectrometry

LDH, lactate dehydrogenase

MALDI, matrix-assisted laser desorption ionization

MMP, matrix metalloproteinase

MRI, magnetic resonance imaging

MRM, multiple reaction monitoring

MS, mass spectrometry

NAF, nipple aspirate fluid

NrCAM, neuronal cell adhesion molecule


NSCLC, non-small cell lung carcinoma

OR, odds ratio

OS, overall survival

PBS, phosphate buffered saline

PCR, polymerase chain reaction

PDAC, pancreatic ductal adenocarcinoma

PEM, polymorphic epithelial mucin

PgR, progesterone receptor

PSA, prostate-specific antigen

ROC, receiver operating characteristic

SCX, strong cation exchange

SFM, serum-free media

SNP, single nucleotide polymorphism

TFA, trifluoroacetic acid

TGF-β2, transforming growth factor beta-2

TIF, tumor interstitial fluid

TOF, time-of-flight

uPA, urokinase plasminogen activator


CHAPTER 1:

INTRODUCTION

Sections of this chapter were published in Nature Clinical Practice Oncology:

Kulasingam, V. and Diamandis, E.P. Strategies for Discovering Novel Cancer Biomarkers by Utilizing Emerging Technologies.

Nat Clin Pract Oncol. Accepted: November 2007 (In Press)

Copyright permission has been granted.


1.1 Breast Cancer

1.1.1 Statistics / Epidemiology

Breast cancer is an important public health issue. It is both a heterogeneous disease and the most common cancer affecting women worldwide, with approximately one million new cases diagnosed each year [1]. Globally, it accounts for 22% of all new cancer diagnoses in women and represents 7% of the more than 7.6 million cancer-related deaths worldwide [2]. It is the most common cancer in Canadian women, both before and after menopause, and is a major cause of premature death. In Canada, an estimated 22,000 women will be diagnosed with breast cancer and 5,000 will die of it every year. In fact, 1 in 9 Canadian women is expected to develop breast cancer during her lifetime and 1 in 27 will die of it. It is a disease of the middle and late ages of life, as 75% of breast cancers are diagnosed in women over the age of 50 (www.cancerfacts.com). While breast cancer is less common at a young age, younger women tend to have a more aggressive form of the disease than older women. Some of the risk factors for breast cancer include age, family history, reproductive history, obesity, hormone usage, radiation exposure and diet. However, 70% of all breast cancer cases have no identifiable risk factor [3].

After long-term increases in women aged 40 and over, incidence rates have either stabilized or dropped since the 1990s [4]. Earlier, the incidence of breast cancer was increasing while mortality rates were declining [5]. This pattern was partly due to both the increased use of screening for early disease and the widespread administration of systemic adjuvant therapy. The five-year survival rate is close to 98% when the cancer is confined to the breast [6]. However, when breast cancer has metastasized at the time of diagnosis, the five-year survival rate is ~27%. Fortunately, 60% of cases are still localized to the breast at the time of diagnosis, yielding excellent five-year survival rates, but 40% of individuals are diagnosed when the cancer is regional or distant [6]. For metastatic disease, the median time to treatment failure is 9 months and the median survival is 2 years. The prognosis for women with metastatic breast cancer is therefore poor.

1.1.2 Anatomy and types of breast cancer

The breast is organized around the mammary gland. It is made up of 10-20 lobes, and within each lobe there are many smaller lobules. Lobules contain groups of glands that can produce milk [7]. The lobules are all linked by tubes called ducts, and all ducts lead to the nipple. Fat and connective tissue surround the ducts and lobules. The cells forming the ducts and lobules are epithelial cells whose main function is to produce and secrete the various constituents of milk. In addition, the epithelial cells are surrounded by a layer of myoepithelial cells, attached to a basement membrane, whose role is to maintain the tubular structure of the ducts and lobules.

The histologic classification of breast carcinoma can be broadly grouped into two categories: carcinoma in situ (CIS; tumor cells are confined by the basement membrane), which accounts for 15-30% of all cases, and invasive carcinoma (tumor cells have invaded beyond the basement membrane), which accounts for 70-85% of all cases. CIS is further subtyped into ductal CIS and lobular CIS. Invasive carcinoma is subdivided into invasive ductal carcinoma (70-80% of all invasive cases) and invasive lobular carcinoma (5-10%). Moreover, 5-10% of breast cancers are caused by the inheritance of a germline mutation in a cancer-predisposing gene. The most important of these genes are BRCA1 and BRCA2. BRCA1 protein functions in regulating transcription, inhibiting cellular proliferation and repairing DNA [8]. BRCA2 appears to have similar functions to BRCA1. Studies have shown that women who carry a mutation in either of these genes have an 80-85% lifetime risk of developing breast cancer [9].

The main presenting features in women with symptomatic breast cancer include a lump in the breast, nipple change or discharge and skin contour changes. Patients with primary breast cancers are offered surgery, often followed by adjuvant therapy. Additional systemic anticancer treatment given to patients after a cancer is surgically removed is referred to as adjuvant systemic therapy. The goal of adjuvant therapy is to eliminate any remaining tumor cells in the body (micrometastases). This form of therapy improves the likelihood of surviving the cancer because it decreases the chance that the cancer will return. Despite these treatments, 40% of patients with lymph node-positive disease will experience a relapse, and the majority of these patients will die from disseminated cancer [10]. Although adjuvant systemic therapy has led to considerable improvements in the prognosis of the breast cancer population, it also carries the significant adverse effect of overtreatment [11].

1.1.3 Current breast cancer screening methods

The primary goal of screening is to prevent lethal, progressive disease by detecting cancer at an earlier, more treatable stage or by detecting precursor lesions that can be removed before they develop into invasive cancers. Screening is a presumptive identification of disease; it is not a diagnostic tool. Screening alerts individuals to the need for further testing. The current screening methods used to detect breast tumors, whether benign or malignant, include clinical breast examination (CBE), mammography and ultrasound. Mammography remains the cornerstone of breast cancer screening and early diagnosis. Calcifications, masses and distortions can potentially be detected via mammography. The introduction of mammography in 1983 was followed by a steady increase in the age-adjusted incidence rates, particularly among in situ (stage 0) and early-stage (stage 1) patients [12]. Certainly the overall incidence of breast cancer is significantly higher than that of the pre-mammographic era. The sensitivity and specificity of mammography for women over the age of 50 are 77-84% and 90-94%, respectively [12]. Performance is lower in women aged 40-49 years (70% sensitivity with 90% specificity). Mammographic screening for women aged 50–69 years is effective in reducing breast cancer mortality, and reductions in mortality have been observed where screening has been introduced [13]. For example, with the advent of mammographic screening, two-thirds of newly diagnosed breast cancer patients are node-negative [14]. Approximately 70% of these patients are cured of breast cancer by surgery while the remaining 30% develop recurrent disease within 10 years of diagnosis [15].

However, there are a number of limitations to mammography. Specifically, mammography will not detect all breast cancers, and some breast cancers detected with mammography may still have a poor prognosis. Furthermore, it suffers from high false-positive and false-negative rates, hazardous radiation exposure and patient discomfort [16,17]. For women under the age of 40, mammographic screening yields a poor sensitivity of only 33% [18,19]. Not only does it suffer from a low positive predictive value of approximately 25% [20], but its benefit for early detection in pre-menopausal women is also still debated. It is mostly argued that breasts in younger women may be more dense (dense breast tissue decreases the performance of mammography), and corresponding tumor rates may be higher than predicted. Thus a portion of mammographically detected tumors in younger women who undergo regular screening may already be disseminated. Mammography may also be harmful to women who carry germline mutations, such as those underlying ataxia telangiectasia or those in BRCA1/2, possibly because of their increased sensitivity to radiation [21,22]. Finally, the incidence of stage 2 and 3 disease has not fallen commensurately, suggesting a bias in the detection of indolent cancers rather than aggressive cancers by mammography [12].
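The low positive predictive value noted above follows from Bayes' rule whenever disease prevalence among those screened is low. As a minimal sketch, the Python snippet below computes the positive predictive value (PPV) from sensitivity, specificity and prevalence. The sensitivity and specificity are picked from within the ranges quoted in this section; the prevalence values and the function name are hypothetical and serve only as an illustration, not as data from this study.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Return P(disease | positive test) using Bayes' rule."""
    true_pos = sensitivity * prevalence                     # diseased and test-positive
    false_pos = (1.0 - specificity) * (1.0 - prevalence)    # healthy but test-positive
    return true_pos / (true_pos + false_pos)


# Values inside the ranges quoted in the text for women over 50.
SENSITIVITY = 0.80
SPECIFICITY = 0.92

# Hypothetical prevalence values (assumptions, for illustration only).
for prevalence in (0.005, 0.01, 0.05):
    ppv = positive_predictive_value(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence={prevalence:.3f}  PPV={ppv:.3f}")
```

Even with good sensitivity and specificity, most positive results are false positives when only a small fraction of the screened population has the disease, which is why PPV rises as prevalence rises.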

For early detection of breast cancer, the current recommendations are that average-risk women (aged >40 years) begin annual mammography in addition to having an annual CBE [23]. Women in their 20s and 30s are advised to have a CBE every 3 years [23]. Finally, women at significantly increased risk for breast cancer may benefit from earlier initiation of screening (age 25), screening at shorter intervals, and screening with additional modalities such as ultrasound or magnetic resonance imaging (MRI) [23]. Ultrasound and MRI are available but are not recommended as population screening tools owing to a lack of evidence for their benefit, operator dependence, non-reproducible results and high false-positive rates.

As well, four different biases need to be considered when evaluating cancer screening: 1) Lead time bias, where lead time is the period by which screening advances the diagnosis of the disease; survival measured from diagnosis appears longer even if death is not delayed. 2) Length bias, which states that slowly progressing cases are more likely to be detected at screening than rapidly progressing cases. 3) Selection bias, which states that individuals who accept screening are, in general, a healthier group than those who decline it. 4) Overdiagnosis bias, which states that some lesions identified as cancers would not have presented clinically in the absence of screening.
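Lead time bias can be illustrated with simple arithmetic. The sketch below uses a hypothetical patient whose age at death is the same with or without screening; all ages are invented for illustration and are not data from this study.

```python
def survival_after_diagnosis(age_at_diagnosis, age_at_death):
    """Years survived after diagnosis."""
    return age_at_death - age_at_diagnosis


# Hypothetical patient: screening moves the diagnosis earlier
# but does not change the age at death.
AGE_AT_DEATH = 70
AGE_SYMPTOMATIC_DIAGNOSIS = 67   # diagnosed when symptoms appear
AGE_SCREEN_DIAGNOSIS = 63        # diagnosed earlier through screening

without_screening = survival_after_diagnosis(AGE_SYMPTOMATIC_DIAGNOSIS, AGE_AT_DEATH)
with_screening = survival_after_diagnosis(AGE_SCREEN_DIAGNOSIS, AGE_AT_DEATH)
lead_time = AGE_SYMPTOMATIC_DIAGNOSIS - AGE_SCREEN_DIAGNOSIS

print(f"Survival after diagnosis without screening: {without_screening} years")
print(f"Survival after diagnosis with screening:    {with_screening} years")
print(f"Lead time: {lead_time} years")
# The patient counts as a five-year survivor only in the screened scenario,
# although the age at death is identical in both.
print(f"Five-year survivor only when screened: {with_screening >= 5 > without_screening}")
```

The apparent gain in survival equals the lead time, so survival from diagnosis alone cannot show that screening prolongs life; mortality is the more reliable end point.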

It is possible that mammographic screening has significantly contributed to an increase in the detection of early-stage disease that may have little chance of progressing or becoming lethal over the course of a person’s lifetime [12]. Thus, while mammography may have increased the incidence of breast cancer and while it may have good sensitivity and specificity for women over the age of 40, the mortality rates for breast cancer have not changed significantly since its introduction [6]. Hence it is crucial to be able to differentiate indolent/low-risk tumors from aggressive/higher-risk tumors, so that an individual is not over-treated. Mammographic screening does not provide information regarding the prognosis of the lesion detected. What is needed are screening strategies that are less biased towards indolent cancers.


1.1.4 Early diagnosis of breast cancer is essential

The concept of detecting various forms of cancer early, before they spread and become incurable, has enticed physicians and research scientists for decades [24]. Forty percent of breast cancer patients have regional or distant spread of their disease at the time of diagnosis [6]. Moreover, survival rates for people diagnosed with advanced breast cancer have changed little over the past 20 years. It is known that survival is excellent for breast cancer when early-stage disease is treated with existing therapies [24]. Without doubt, shifting all cases to early detection will have a profound impact on overall mortality and economic burden.

Unfortunately, other than definitive diagnosis by biopsy and histopathology, no diagnostic or screening test is presently suitable for the early detection of clinically relevant breast cancer. This is because sufficiently high sensitivity (the probability of the test being positive in individuals with the disease) and sufficiently high specificity (the probability of the test being negative in individuals without the disease) are rarely attributes of the same test; an increase in sensitivity tends to result in a reduction in specificity, and vice versa. Newer diagnostic methods with improved sensitivity and specificity are clearly needed to identify women with early-stage breast cancer.
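The trade-off between sensitivity and specificity described above can be sketched with a single marker and a movable cut-off. The marker values below are hypothetical (arbitrary units) and the function name is an assumption made for this illustration; a value at or above the cut-off is treated as a positive test.

```python
def sensitivity_specificity(cases, controls, cutoff):
    """Return (sensitivity, specificity) when values >= cutoff are called positive."""
    true_pos = sum(v >= cutoff for v in cases)      # diseased, test positive
    true_neg = sum(v < cutoff for v in controls)    # healthy, test negative
    return true_pos / len(cases), true_neg / len(controls)


# Hypothetical marker concentrations, for illustration only.
cases = [42, 55, 61, 70, 74, 88, 93, 105]
controls = [30, 35, 41, 48, 52, 58, 63, 77]

for cutoff in (40, 60, 80):
    sens, spec = sensitivity_specificity(cases, controls, cutoff)
    print(f"cut-off={cutoff}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Raising the cut-off in this toy data set lowers sensitivity while raising specificity, because the distributions of cases and controls overlap.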

Effective early detection must meet four criteria. First, the disease must be common and have a high mortality rate. Second, the screening test must accurately detect early-stage disease. Third, treatment after detection through screening must demonstrably improve prognosis. Finally, the potential benefits must outweigh the potential harms and costs of screening [24]. One of the most promising ways to achieve this is through the use of cancer biomarkers.

1.2 Cancer biomarkers

1.2.1 Definition and types of biomarkers

The ability to detect human malignancy via a simple blood test has long been a major objective in medical screening. The advantages of such an easy-to-use, relatively non-invasive and operator-independent test are self-evident. In this respect, cancer biomarkers can be DNA, mRNA, proteins, metabolites, or processes such as apoptosis, angiogenesis or proliferation [25]. The markers are produced either by the tumor itself or by other tissues, in response to the presence of cancer or other associated conditions, such as inflammation. Such biomarkers can be found in a variety of fluids, tissues and cell lines. Tumor markers can be used for screening the general population, for differential diagnosis in symptomatic patients, and for clinical staging of cancer. Additionally, they can be used to estimate the tumor volume, to evaluate response to treatment, to assess recurrence through monitoring or as prognostic indicators for disease progression (Table 1.1). Given the low prevalence of cancer in any given population, no known marker meets all of these criteria.

A number of different types and forms of tumor markers exist. These markers include hormones and different functional subgroups of proteins such as enzymes, glycoproteins, oncofetal antigens and receptors. Furthermore, other tumor changes such as genetic mutations, amplifications, translocations and changes in microarray-generated profiles (signatures) are also forms of tumor markers. Regardless of the type of tumor marker or profile, the use of a tumor marker in the clinic must be associated with proven improvements in patient outcomes, such as increased survival or enhanced quality of life [25]. As mentioned earlier, other than definitive diagnosis by biopsy and histopathology, no diagnostic or screening test is presently suitable for the early detection of clinically relevant breast cancer.


Table 1.1: Definitions and specifications of biomarkers

Diagnostic (screening) biomarker: A marker that is used to detect and identify a given type of cancer in an individual. These types of markers are expected to have high specificity and sensitivity. For example, the presence of Bence-Jones protein in urine remains one of the strongest diagnostic indicators of multiple myeloma.

Prognostic biomarker: This is used once the disease status has been established. These biomarkers are expected to predict the likely course of the disease and its recurrence, and thus they have an important influence on the aggressiveness of the therapy. For example, the traditional prognostic factors in breast cancer include tumor size, tumor grade and nodal status [26].

Stratification (predictive) biomarker: This serves to predict the response to a drug before starting treatment. It classifies individuals as likely responders or non-responders to a particular treatment. These biomarkers mainly arise from array-type experiments that make it possible to predict clinical outcome from the molecular characteristics of the patient’s tumor.

Specificity: The proportion of control/normal subjects who test negative for the biomarker.

Sensitivity: The proportion of individuals with confirmed disease who test positive for the biomarker.

Receiver operating characteristic (ROC) curve: A graphical representation of the relationship between sensitivity and specificity. It is used to evaluate the efficacy of a tumor marker at various cut-off points. An ideal graph is the one giving the maximum area under the curve (AUC). A diagonal line represents a useless test (AUC = 0.5). A curved line above the diagonal represents a useful but imperfect test (0.5 < AUC < 1.00); a perfect test has AUC = 1.00.

[Illustrations in Table 1.1. First panel: frequency distribution versus tumor marker value, with a chosen cut-off dividing the distributions into A = true negatives, B = false negatives, C = false positives and D = true positives; moving the cut-off in one direction increases specificity and in the other increases sensitivity. Second panel: ROC plot of true positive rate versus false positive rate, each from 0% to 100%.]
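The area under the ROC curve defined in Table 1.1 can also be read as the probability that a randomly chosen case has a higher marker value than a randomly chosen control. The sketch below computes the AUC in that pairwise (Mann-Whitney) form; the marker values and the function name are hypothetical and chosen only for illustration, and this is not the statistical procedure used in this thesis.

```python
def auc_mann_whitney(cases, controls):
    """AUC as the fraction of case/control pairs in which the case
    has the higher marker value (ties count as one half)."""
    wins = 0.0
    for case_value in cases:
        for control_value in controls:
            if case_value > control_value:
                wins += 1.0
            elif case_value == control_value:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical marker concentrations (arbitrary units), for illustration only.
cases = [42, 55, 61, 70, 74, 88, 93, 105]
controls = [30, 35, 41, 48, 52, 58, 63, 77]

print(f"AUC = {auc_mann_whitney(cases, controls):.4f}")
```

An AUC of 0.5 corresponds to the diagonal line of a useless test, and an AUC of 1.00 to complete separation of cases from controls.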


1.2.2 Characteristics of an ideal tumor marker

An ideal tumor marker should be measured easily, reliably and cost-effectively using an assay with high analytical sensitivity and specificity [24]. In particular, an ideal tumor marker should be produced by the tumor cells and enter the circulation; it should be present at low levels in the serum of healthy individuals or patients with benign disease and increase significantly in cancer (preferably in one cancer type). Moreover, an ideal tumor marker should be present in detectable (or higher than normal) quantities at early or preclinical stages, and the quantitative levels of the tumor marker should reflect the tumor burden. Finally, it should demonstrate high diagnostic sensitivity (few false negatives) and specificity (few false positives).

A caveat to currently used tumor markers is that, generally, they suffer from low diagnostic specificity and sensitivity. Only a few markers have entered routine use, and only for a limited number of cancer types and clinical settings. In the majority of cases, the current markers are used in conjunction with imaging, biopsy and associated clinicopathological information before a clinical decision is made.

1.2.3 Historical overview of cancer biomarkers

The first cancer marker ever reported was the presence of the light chain of

immunoglobulin in the urine of 75% of myeloma patients27. Discovered in

1847, the test is still employed by clinicians today, although with modern

quantification techniques. From 1930 to 1960, scientists identified numerous

hormones, enzymes, and other proteins whose concentration was altered in

biological fluids from cancer patients. The modern era of monitoring malignant


disease, however, began in the 1960s with the discovery of alpha-fetoprotein

(AFP)28 and carcinoembryonic antigen (CEA)29, which was facilitated by the

introduction of immunological techniques such as the radioimmunoassay. In the

1980s, the era of hybridoma technology enabled development of the ovarian

epithelial cancer marker, carbohydrate antigen 125 (CA 125)30. In 1980, prostate-

specific antigen (PSA), considered one of the best cancer markers, was

discovered31.

1.2.4 Current applications of tumor markers and their clinical utility

One of the applications of a tumor marker is for population screening. A

screening test should have very high sensitivity and exceptional specificity, to avoid

too many false positives in low cancer prevalence populations. Furthermore, the test

must demonstrate a benefit in terms of clinical outcome. Unfortunately, current

biomarkers have diagnostic sensitivity and specificity too low to serve as

screening markers. With the exception of PSA, current tumor markers are more

frequently elevated at late stages of disease. Hence, the current clinical utility of any

marker to serve as a screening tool is limited. Another application of a tumor marker

is for diagnosis. As with screening, the current biomarkers

have diagnostic sensitivity and specificity too low to serve as diagnostic markers.

A further application of a tumor marker is as a prognostic marker. Most cancer

markers have some prognostic value; however, specific therapeutic interventions

cannot be based on them since their accuracy of prediction is rather poor. In addition, some

markers can serve as a predictive indicator of therapeutic response. In this respect,


very few markers have predictive power (exceptions include steroid hormone

receptors and HER-2 amplification for breast cancer), but the information they provide

helps in therapy selection. Yet another application of a tumor marker is for tumor

staging. Apart from AFP and human chorionic gonadotropin-β (HCG) for staging

testicular cancer, the accuracy of other markers in determining tumor stage is

poor. Two further applications of tumor markers are detecting

early tumor recurrence and monitoring effectiveness of cancer therapy. The

usefulness of the current markers to serve the former role is controversial as lead

time is short and does not significantly affect outcome. In addition, therapies for

treating recurrent disease are not usually effective, clinical relapses can occur

without biomarker elevation, and biomarker elevation can be non-specific. With respect to

the latter application (monitoring effectiveness of cancer therapy), current

biomarkers provide information on therapeutic response (effective or non-effective)

that is readily interpretable and more economical than imaging modalities. Hence,

current markers play an essential clinical role in this application.
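The demand for exceptional specificity in screening can be made concrete with Bayes' rule. The sketch below computes the positive predictive value, i.e. the proportion of positive tests that are true positives. The sensitivity, specificity and prevalence values in the usage note are assumed for illustration only and are not taken from this thesis; the function name is hypothetical.

```python
def positive_predictive_value(sensitivity: float, specificity: float,
                              prevalence: float) -> float:
    """Fraction of positive results that are true positives (Bayes' rule).

    All three arguments are proportions between 0 and 1.
    """
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)
```

For an assumed prevalence of 0.5% and a sensitivity of 90%, a specificity of 95% gives a positive predictive value of only about 8%, so most positive results would be false positives. Raising the specificity to 99.9% lifts it to about 82%.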

1.2.5 Currently-available breast cancer biomarkers: Clinical utility and limitations

The currently used serological breast cancer markers include carbohydrate

antigen 15-3 (CA 15-3) and carcinoembryonic antigen (CEA)32. Briefly, CA 15-3 and

BR 27.29 (also known as CA 27.29) serum assays detect the same antigen, i.e. the

MUC-1 protein, and provide similar clinical information33. CA 15-3 has, however, been

more widely investigated than BR 27.29. CA 15-3 and CEA levels in serum are


related to tumor size and nodal involvement and are recommended by international

bodies such as the American Society of Clinical Oncology (ASCO) for monitoring

patients with metastatic disease during active therapy34. Both these markers have

been recommended to be used in conjunction with diagnostic imaging, history and

physical examination. In general, serum marker levels reflect tumor burden and for

this reason, the current markers are not sensitive enough to be used for screening

and early diagnosis of primary breast cancer32,35,36. However, the role of tumor

markers in diagnosis of recurrent disease and in the evaluation of response to

treatment is well established. This is desirable since tumor marker elevation

represents a simple, objective method for monitoring of therapeutic response, which

has significant advantages over conventional imaging methods, particularly in

relation to cost-effectiveness of the tests. Nevertheless, prospective randomized studies

are required to demonstrate any survival benefit when earlier therapeutic

interventions are instituted upon elevation of serum markers. The currently

available serum breast cancer markers are described in detail in the following

sections.

1.2.5.1 MUC-1

CA 15-3 and CA 27.29 both measure the serum levels of MUC-1. The major

difference between the two tests is the different immunoreagents used,

predominantly the monoclonal antibodies utilized. CA 15-3 is the most widely used

test to assay MUC-1 and can be considered the gold standard. It consists of a

sandwich capture assay which uses the monoclonal antibody 115D8 (raised against


human milk fat globule membranes) and DF3 (raised against a membrane-enriched

fraction of metastatic human breast carcinoma)37,38. On the other hand, CA 27.29 is

measured using a solid-phase competitive immunoassay in which the monoclonal

antibody B27.29 is used either as a catcher or as a tracer. This antibody recognizes

the same epitope as DF3 but the binding of B27.29 is not influenced by the

presence of glucidic residues39. In either case, the test measures serum levels of

MUC-1. MUC-1 is a polymorphic epithelial mucin (PEM), a large glycoprotein found

on the apical surface of polarized epithelial cells40. Normally, MUC-1 is expressed in

the ducts and acini, from where it is released into the milk in soluble form or bound

on milk fat globules. However, in cancer, with disruption of normal cell polarization

and tissue architecture, MUC-1 is shed into the bloodstream and hence its levels

can be measured by this immunoassay. MUC-1 is a high molecular weight (250-

1000 kDa) protein that can activate membrane receptors for growth factors, reduce

E-cadherin-mediated cell adhesion, thereby promoting cell migration, and reduce the

cellular apoptotic response to oxidative stress41,42. For CA 15-3, the diagnostic

sensitivity of the test is 10%, 20% and 40% in patients with stage I, stage II and

stage III disease, respectively. In addition to lacking sensitivity for early disease, CA

15-3 also lacks specificity for breast cancer as it is found elevated in 5% of healthy

individuals43,44. Furthermore, increased levels of CA 15-3 can be observed in several

non-neoplastic conditions, including benign breast pathology, chronic liver disorders

and immunological disorders45. As stated earlier, currently CA 15-3 is used for post-

operative surveillance in patients with no evidence of disease and for monitoring

therapy in advanced disease. Some studies have shown that pre-surgical CA 15-3


level is a prognostic factor with both disease-free survival (DFS) and overall survival

(OS) being shorter in patients with a high value for this marker46,47. Despite this, it

has not been proven that CA 15-3 is an independent prognostic factor and hence it is

not currently used in the clinic to fulfill this role.

Serial tumor marker determinations can be useful tools in the diagnosis of

metastatic breast cancer. In the presence of distant metastases, the clinical

sensitivity of CA 15-3 in different studies has ranged from 50-90% depending on the

anatomical site. For example, liver metastases are associated with the highest

sensitivity followed by skeletal and lung33. Some studies have also suggested that

about two-thirds of patients display elevations of CA 15-3 either before or at the time

of recurrence, with a lead time ranging from 2-9 months45. Unfortunately, some other

studies were not as positive.

CA 15-3 is useful in the monitoring of response to either endocrine therapy or

cytotoxic therapy48. It should be noted that ASCO states that a marker cannot, in any

situation, stand alone to define response to treatment34. To further complicate the

matter, the magnitude of variation (also called the “critical difference”) between

successive marker levels that constitutes a clinically significant change is not well

defined. This “critical difference” depends on both the analytical imprecision of the

assay and the normal intra-individual biological variation. In general for CA 15-3, it

has been estimated that at least a 30% change is required before successive marker

concentrations can be regarded as significantly altered49. Having stated this, another

point to consider is the phenomenon of “tumor marker spike”50. This refers to an

increase in tumor marker levels following initiation of chemotherapy due to massive


neoplastic cell necrosis induced by cytotoxic agents. In fact, this can be observed in

30% of patients who show a response to the therapy. The peak usually

occurs within 30 days from commencement of therapy but marker levels can remain

elevated for as long as 3 months. Therefore, marker values should be evaluated with

caution.
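One common way to formalize the “critical difference” is the reference change value, which combines the analytical coefficient of variation (CV) and the within-subject biological CV. The sketch below assumes that formula, a two-sided 95% confidence level (z = 1.96) and invented CV values; it is not the calculation used in the cited study, and the function names are hypothetical.

```python
import math


def critical_difference(cv_analytical: float, cv_within_subject: float,
                        z: float = 1.96) -> float:
    """Reference change value in percent.

    RCV = sqrt(2) * z * sqrt(CV_analytical^2 + CV_within_subject^2),
    with both CVs given in percent. z = 1.96 is an assumed two-sided
    95% confidence level.
    """
    combined_cv = math.sqrt(cv_analytical ** 2 + cv_within_subject ** 2)
    return math.sqrt(2.0) * z * combined_cv


def is_significant_change(previous: float, current: float,
                          cv_analytical: float, cv_within_subject: float,
                          z: float = 1.96) -> bool:
    """True if the percent change between two serial results reaches the RCV."""
    percent_change = abs(current - previous) / previous * 100.0
    return percent_change >= critical_difference(cv_analytical,
                                                 cv_within_subject, z)
```

With illustrative CVs of 5% (analytical) and 6% (within-subject), the critical difference is about 22%. Greater assay imprecision or biological variation pushes the value toward the 30% figure quoted above.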

1.2.5.2 CEA

CEA is a single-chain glycoprotein of 640 amino acids with a mass of

150-300 kDa. It is an adhesion molecule that is part of the immunoglobulin superfamily51.

Therefore, CEA might play a role in cellular-matrix recognition. CEA can be

measured by commercially available immunoassays using either a radioisotope or a

non-radioactive (enzyme or chemiluminescent) label. Currently, CEA is used in the

clinic for post-operative surveillance in patients with no evidence of breast cancer.

Overall, CEA appears to be less sensitive than CA 15-3/BR 27.29. Furthermore,

CEA is also used in the clinic for monitoring therapy in advanced disease, especially

if CA 15-3/BR 27.29 is not elevated34. Evidence on the ability of CEA to predict prognosis

is conflicting47. Only 50-60% of patients with metastatic disease will have

elevated CEA levels, compared with 75-90% who have elevated CA 15-3 levels52.

1.2.5.3 Circulating levels of HER-2/neu

Some markers are able to predict response to or resistance to a specific

therapy. A classic example of this is the circulating levels of HER-2, an oncoprotein

of 185 kDa. It is a transmembrane glycoprotein whose overexpression is present in


20-30% of primary breast cancers and is associated with poor prognosis, short

survival and recurrence. HER-2 can be proteolytically cleaved and circulating HER-2

ectodomain can be detected in 80% of patients with tumors overexpressing HER-2

compared with 3% of those with tumors not overexpressing the oncoprotein. The

negative prognostic effect of high circulating levels of HER-2 ectodomain seems to

be related to the resistance to chemotherapy such as paclitaxel (a mitotic inhibitor)

and doxorubicin (a DNA interacting drug)53,54.

1.2.5.4 Other promising serological breast cancer markers

High circulating levels of different hormones have been found to represent a

risk factor for breast cancer development. Elevated concentrations of prolactin,

insulin, insulin-like growth factor type I and androgens (testosterone) are frequently

detectable in subjects who finally develop malignant breast neoplasms. Moreover,

several circulating molecules have been revealed to be associated with patient

outcome including cyclins and p53 (cell cycle controllers), matrix metalloproteinases

(MMPs), urokinase plasminogen activator (uPA) and its inhibitor PAI-1, cathepsins

(involved in local invasion and metastasis) and vascular endothelial growth factors

(angiogenesis). However, for most of these potential markers, the real impact of

their application in clinical practice is currently unknown. Recently, it was

recommended that uPA/PAI-1 measurements by ELISA on breast cancer tissue may

be used for evaluating prognosis in patients newly diagnosed with node negative

breast cancer34. Low levels of these markers are associated with low risk of

recurrence.


In summary, measurement of circulating tumor marker levels in breast cancer

is most established in advanced disease so its clinical roles are to detect

recurrences in asymptomatic patients and to monitor anti-neoplastic treatments55.

1.2.5.5 Non-serological markers for breast cancer

Currently, the estrogen and progesterone hormone receptors are used in the

clinic as breast cancer tissue-based markers. Estrogen receptors (ER) are used for

predicting response to hormone therapy in both early and advanced breast cancer

and in combination with other factors for assessing prognosis in breast cancer34. ER

alone is a relatively weak prognostic factor. In a meta-analysis, it was found that

ER-positive patients were seven times less likely to develop recurrent disease than

ER-negative patients after at least 5 years of adjuvant tamoxifen treatment10.

Progesterone receptor (PgR) is usually combined with ER for predicting response to

hormone therapy. The last tissue-based marker used in the clinic for breast cancer is

HER-2/neu. It is used to determine prognosis and it is most useful in node-positive

patients. There are conflicting data in node-negative patients. HER-2 tissue levels are

also used for selecting patients with either early or metastatic breast cancer for

treatment with trastuzumab (Herceptin). Finally, the genetic markers for breast

cancer include BRCA1 and BRCA2, which are used in the clinic in some specialized

centers. Both of these markers are used for identifying individuals who are at high

risk of developing breast or ovarian cancer in high risk families.


1.2.6 Renewed interest in discovering novel breast cancer biomarkers

Rational management of this disease requires the availability of reliable

diagnostic, prognostic and predictive markers. Effective therapies for advanced

cancers remain elusive. Novel non-invasive methods for detecting breast cancer early in

the course of the disease have the potential to reduce morbidity and mortality, and

to achieve higher compliance among patients undergoing screening24. In

addition, the clinical course of breast cancer is highly variable, so it is also crucial to

be able to predict the course of the disease in individual patients (prognosis) to

ensure adequate treatment and surveillance. Moreover, metastatic breast cancer is

regarded as incurable and thus, the goal of treatment is generally palliative at this

stage. In this context, the use of serial measurements of serum tumor markers is

potentially useful in deciding whether to persist in using a particular type of therapy,

to terminate its use or to switch to an alternative therapy. Therefore, optimal

management of patients with breast cancer requires the use of tumor markers. In

addition, the risk of cancer recurrence is high in those who have previously had cancer,

even for those who have been in remission for five years. Cancer survivors

constitute a high-risk group that would benefit from improved tests for early detection

of disease recurrence. Novel biomarkers for breast cancer should have utility for

early breast cancer diagnosis, prognosis and/or prediction of therapeutic response.

Every era of biomarker discovery seems to be associated closely with the

emergence of a new and powerful analytical technology. The past decade has

witnessed an impressive growth in the field of large-scale and high-throughput

biology, which has contributed to an era of new technology development. The


completion of a number of genome sequencing projects, the discovery of oncogenes

and tumor-suppressor genes, and recent advances in genomic and proteomic

technologies, together with powerful bioinformatics tools, will have a direct and major

impact on the way the search for cancer biomarkers is currently conducted. Early

cancer biomarker discoveries were mainly based on empirical observations, such as

the overexpression of CEA. The modern technologies are capable of performing

parallel rather than serial analyses, and can help to identify distinguishing patterns

and multiple markers rather than a single marker; such strategies represent a central

component and a paradigm shift in the search for novel biomarkers.

These breakthroughs have since paved the way for countless new avenues

for biomarker identification. Very few serum tumor markers, however, have been

introduced to the clinic over the past 15 years56. The next two sections will highlight

some mechanisms behind biomarker elevation in biological fluids and outline

strategies for novel marker identification. These strategies should facilitate delivery

of potential candidate molecules for cancer diagnosis, prognosis and prediction of

therapy. These projected discoveries may be instrumental in substantially reducing

the burden of cancer by providing prevention, individualized therapies and improved

monitoring post-treatment.

1.3 Mechanisms of biomarker elevation in biological fluids

Some of the major mechanisms by which molecules can be elevated in

biological fluids during cancer initiation and progression are discussed below. Such

molecules could serve as effective cancer biomarkers.


1.3.1 Gene over-expression

The protein encoded by a gene can be expressed in increased quantities due

to increases in gene or chromosome copy number (i.e. gene amplification) or

through increased transcriptional activity. The latter could be the result of

imbalances between gene repressors and activators. Epigenetic changes, such as

DNA methylation, are also known to affect gene expression. On a larger scale,

chromosomal translocations can result in gene regulation by promoters that are

sometimes enhanced by steroid hormones;57 transposons can also serve a similar

role.

An example of a putative biomarker is the protein human epididymis protein 4

(HE4), which is overexpressed in ovarian carcinoma. Using cDNA microarrays to

identify overexpressed genes in ovarian carcinoma, 101 transcripts were shown to

be overexpressed in ovarian cancers compared with normal tissues58,59. Real-time

polymerase-chain reaction (PCR) of an independent set of benign and malignant

tissues confirmed that 12 of the transcripts were indeed overexpressed in ovarian

cancers. Two of them, WFDC2 (also known as HE4) and MSLN, seemed to have

the highest selectivity. Quantification of HE4 protein levels in serum revealed that it

can be a potential biomarker for ovarian cancer60, although clinical evaluation is

pending. Gene and protein expression of HE4 in a large series of normal and

malignant adult tissues, however, showed that HE4 is present in pulmonary,

endometrial and breast adenocarcinomas, in addition to positive staining in ovarian

carcinoma61.


1.3.2 Increased protein secretion and shedding

Given that 20–25% of all proteins are secreted, aberrant secretion or

shedding of membrane-bound proteins with an extracellular domain (ECD) is

another means by which molecules can be elevated in biological fluids. Alterations in

the signal peptide of proteins caused by single nucleotide polymorphisms may result

in atypical secretion patterns62. Moreover, elevation of molecules in biological fluids

can be the result of a change in the polarity of the cancer cells, which could result in

the release of cancer-associated glycoproteins into the circulation. Increased

expression of proteases that cleave the ECD portion of membrane proteins

represents another possibility for increased circulating levels.

Many proteins are secreted into the circulation such as AFP, which is rapidly

released from both normal and cancer cells63. A classic example of shedding of

membrane proteins into fluids (and thus serving as a cancer biomarker) is HER-2

(also known as ERBB2). HER-2 is a cell membrane surface-bound tyrosine kinase

involved in cell growth and differentiation64. Its overexpression is associated with a

high risk of breast and ovarian cancer relapse and death, and HER-2 is the target of

the therapeutic monoclonal antibody trastuzumab (Herceptin, Genentech)65. The

HER-2 protein consists of a cysteine-rich extracellular ligand-binding domain, a short

transmembrane domain, and a cytoplasmic protein tyrosine kinase domain. The

ECD of HER-2 can be released by proteolytic cleavage from the full-length receptor

protein and can be detected in serum. High levels of HER-2 in serum correlate with

poor prognosis in patients with breast cancer66. In 2000, the Food & Drug


Administration (FDA) approved the serum HER-2 test, which is the first blood test for

measuring circulating levels of HER-2 to be approved for the follow-up and

monitoring of patients with metastatic breast cancer.

1.3.3 Angiogenesis, invasion and destruction of tissue architecture

Tissue invasion by the tumor might permit direct release of molecules into the

interstitial fluid and subsequent delivery by the lymphatics into the blood. For

epithelial cancer types, the proteins must break through the basement membrane of

the invading tumor before they appear in blood. For example, PSA is abundantly

expressed by prostatic columnar epithelial cells and secreted into the glandular

lumen, comprising a major component of seminal plasma (0.5–3.0 g/l) upon

ejaculation. In healthy men, low levels of PSA enter the circulation by diffusing

through a number of anatomic barriers, including the basement membrane, stromal

layer and the walls of blood and lymphatic capillaries. This process gives rise to a

normal serum PSA range of 0.5–2.0 µg/l.

Prostatic carcinomas most often arise in the glandular epithelium of the

peripheral prostate. Although PSA gene transcription is down-regulated in prostate

cancer, PSA protein levels in the circulation of prostate cancer patients increase due

to disruption of the anatomic barriers between the glandular lumen and capillaries.

Concomitant to early-stage prostate cancer is the loss of basal cells, disruption of

cell attachment, degradation of the basement membrane, initiation of

lymphangiogenesis67 and loss of the polarized structure and luminal secretion by

tumor cells. Consequently, PSA levels in the serum can rise to 4–10 µg/l. Late-stage


prostate cancer is characterized by invasion of tumor cells into the stromal layers

and the circulation, and total loss of glandular organization. This situation allows for

considerable amounts of PSA to leak into the bloodstream, where levels typically

range from 10 to 1000 µg/l. It should be mentioned that while PSA is one of the best

biomarkers currently used in the clinic, increased levels of PSA may be observed in

the blood of men with benign prostate conditions, such as prostatitis and benign

prostatic hyperplasia, or with a malignant growth in the prostate, highlighting its lack

of specificity and consequently its associated high false-positive rate.

1.4 Strategies for discovering novel cancer biomarkers

With the introduction of technologies that enabled simultaneous examination

of thousands of proteins and genes in single experiments (such as mass

spectrometry and protein and DNA arrays), renewed interest emerged to discover

novel cancer biomarkers. The new advances include completion of the Human Genome

Project, advances in bioinformatics, array analysis (e.g. DNA, RNA, protein), mass

spectrometry-based profiling and identification, laser-capture microdissection, single

nucleotide polymorphisms, comparative genomic hybridization and high-throughput

sequencing. These modern technologies are capable of performing parallel rather

than serial analyses; therefore, they provide opportunities to identify distinguishing

patterns (signatures; portraits) for cancer diagnosis, classification and prediction of

therapeutic response (individualized treatments). Furthermore, they provide the

means by which new, individual tumor markers could be discovered by using

reasonable hypotheses and novel analytical strategies. Given that the tumor-host


interface can generate enzymatic cleavage and shedding, and sharing of growth

factors, it is conceivable that either the tumor itself or its microenvironment could be

sources for biomarkers that would ultimately be shed into the serum proteome,

allowing for early disease detection and for monitoring therapeutic efficacy.

Certainly, genomic and proteomic technologies have significantly increased

the number of potential DNA, RNA and protein biomarkers under investigation. A

paradigm shift has recently been realized, whereby single biomarker analysis is

being replaced by multiparametric analysis of genes or proteins. A multiparametric

assay refers to the concept that a panel of markers may provide better clinical

information than the performance of the individual markers that make up the panel.

This has triggered the question of whether cancer has a unique fingerprint (i.e.

genomic, proteomic, metabolomic). A number of strategies for cancer biomarker

discovery that utilize emerging technologies are outlined below, together with

discussions of their merits and limitations.

1.4.1 Gene-expression profiling

Genomic microarrays represent a highly powerful technology for gene-

expression studies. Microarray experiments are usually performed with DNA or RNA

isolated from tissues, which are then labeled with a detectable marker and allowed

to hybridize to the arrays that are comprised of gene-specific probes representing

thousands of individual genes68. The greater the degree of hybridization, the more

intense the signal, thus implying a higher relative level of expression. Due to the

massive data per experiment, the molecular markers and their expression patterns


need to be analyzed by elaborate computational tools, which add an additional layer

of statistical complexity. Two basic analyses are unsupervised and supervised

hierarchical clustering algorithms69; the latter identify gene-expression patterns that

discriminate tumors on the basis of pre-defined clinical information70. In addition,

quantitative real-time PCR is generally considered the ‘gold standard’ against which

other methods are validated. The cancer sub-classification hypothesis states that

gene-expression patterns identified using DNA microarrays can predict the clinical

behavior of tumors71. The proof-of-principle for the cancer sub-classification

hypothesis has been provided for various malignancies, such as leukemias, breast

cancers and many other tumor types72-78. For example, results from gene-array

technologies have enabled breast cancers to be classified into prognostic categories

depending on the expression of certain genes. The 70-gene-panel microarray study

of survival prediction led to development of MammaPrint,79 which in February 2007

became the first multi-gene panel test to be approved by the US FDA for predicting

breast cancer relapse. Another gene-expression profile, Oncotype DX, based on

quantitative RT-PCR, has been commercially available for the same use since 2004.

National Surgical Adjuvant Breast and Bowel Project (NSABP) clinical trials initiated

in the 1980s were retrospectively analyzed with a median follow-up of 14 years, to

validate the gene signature identified by Oncotype DX for predicting the recurrence

of tamoxifen-treated, node-negative breast cancer80. Oncotype DX and MammaPrint

use different analytical platforms and despite their similar clinical indication, they

have only a single gene overlap in their panels. Nevertheless, over the past decade,

a tremendous growth in the application of gene-expression profiling has been


witnessed. It has contributed to the cancer subclassification theory,71 insights into

cancer pathogenesis and to the discovery of a large number of diagnostic markers81.

Michiels et al. recently performed a meta-analysis of seven of the most

prominent studies on cancer prognosis that used microarray-based expression

profiling82. Surprisingly, in five of the seven studies on cancer prognosis, the original

data could not be reproduced83. The other two studies showed much weaker

prognostic information than the original data. The meta-analysis also indicated that

the list of genes identified as predictors of prognosis was highly unreliable and that

the molecular signatures were strongly dependent on the selection of patients in the

training sets. This meta-analysis suggests that the results of the aforementioned

studies are over-optimistic and that they need careful validation and larger sample

sizes before conclusions for their clinical utility can be drawn. Despite promising

proof-of-principle data, discovery of novel subtypes of various carcinomas using

gene arrays and the use of these technologies for discovery of diagnostic markers,

these new tools are not yet recommended for widespread clinical use by either

organizations issuing clinical guidelines or by expert panels84. However, it has been

recommended that in newly diagnosed patients with node-negative, estrogen-

receptor positive breast cancer, the Oncotype DX assay can be used to predict the

risk of recurrence in patients treated with tamoxifen. The precise clinical utility and

appropriate application for other multiparameter assays, such as the MammaPrint

assay (a 70-gene panel for predicting the likelihood that cancer will recur), the

"Rotterdam Signature," (a 76-gene panel for predicting low or high risk of developing

metastatic disease) and the Breast Cancer Gene Expression Ratio (based on the


ratio of the expression of two genes, the homeobox B13 gene (HOXB13) and the

interleukin-17B receptor gene (IL17BR); in breast cancers that are more likely

to recur, HOXB13 tends to be over-expressed, while IL17BR

tends to be under-expressed), are under investigation34.
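A two-gene expression ratio of this kind can be computed from quantitative RT-PCR cycle threshold (Ct) values. The sketch below assumes equal, ideal amplification efficiency for both genes (a doubling per cycle) and uses invented Ct values in its usage note. The function name is hypothetical, and no clinical cut-off is implied, since none is given here.

```python
def expression_ratio_from_ct(ct_numerator_gene: float,
                             ct_denominator_gene: float,
                             efficiency: float = 2.0) -> float:
    """Relative expression of the numerator gene over the denominator gene.

    A lower Ct means more starting template, so the ratio is
    efficiency ** (Ct_denominator - Ct_numerator). efficiency = 2.0
    assumes perfect doubling per cycle for both genes.
    """
    return efficiency ** (ct_denominator_gene - ct_numerator_gene)
```

For example, invented Ct values of 24 for HOXB13 and 27 for IL17BR give `expression_ratio_from_ct(24.0, 27.0)`, an eightfold HOXB13:IL17BR ratio, which is the direction described above for tumors more likely to recur. Swapping the two Ct values gives a ratio of 0.125.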

1.4.2 Mass spectrometry-based profiling

Proteomic-pattern profiling is a recent approach to biomarker discovery.

Given that mRNA information does not fully reflect the function of proteins, which

are the functional components within organisms, tumor diagnosis or

subclassification via proteomic patterns seems promising. The rationale is that

proteins produced by cancer cells or their microenvironment may eventually enter

the circulation and that the patterns of expression of these proteins could be

assessed by mass spectrometry and used for diagnostic purposes, in combination

with a mathematical algorithm. Mass spectrometry-based methods for proteomic

analysis have improved and include more-advanced technology that allows for

higher mass accuracy, higher detection capability, and shorter cycling times, thereby

enabling increased throughput and more-reliable data85. Technologies such as

differential in-gel electrophoresis, two-dimensional polyacrylamide gel

electrophoresis and multidimensional protein-identification technology can be used

for high-throughput protein profiling. The technology that has received considerable

attention over the past few years involves the use of a minute amount of unfractionated serum

sample added to a “protein-chip”, which is subsequently analyzed by surface-

enhanced laser-desorption ionization time-of-flight mass spectrometry (SELDI-TOF-


MS) to generate a proteomic signature of serum86. These patterns reflect part of the

blood proteome, but without knowledge of the actual identity of the proteins. The

potential of proteomic pattern analysis was first demonstrated in the diagnosis of

ovarian cancer87. In this study, exceptional results were seen with a sensitivity of

100% (even for early-stage disease) and 95% specificity. These numbers are far

superior to the sensitivities and specificities obtained with current serologic cancer

biomarkers. Since then, this approach has been extended to a number of other

cancer types, such as breast, prostate, colon, liver, renal, pancreatic, and head and neck

cancers88-94.
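
For reference, sensitivity and specificity are computed from the counts of correctly and incorrectly classified samples. The sketch below uses hypothetical counts chosen only to reproduce the 100% and 95% figures quoted above; they are not the counts reported in the cited study.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of diseased samples that the test calls positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of non-diseased samples that the test calls negative."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts: 50 cancer samples, 100 control samples.
sens = sensitivity(true_pos=50, false_neg=0)
spec = specificity(true_neg=95, false_pos=5)
print(f"sensitivity = {sens:.0%}, specificity = {spec:.0%}")
```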

In spite of the optimism for this approach, a number of important limitations

were subsequently identified95. The limitations included bias from artifacts related to

the clinical sample collection and storage, the inherent qualitative nature of mass

spectrometers, failure to identify well-established cancer biomarkers, bias in

identifying high-abundance molecules within the serum and disagreement between

peaks generated by different research labs96-98. Another limitation includes possible

bioinformatic artefacts. Baggerly et al. showed that background/matrix peaks can

achieve a high level of discrimination between normal and cancer patients99. Despite

a 5-year lapse since the first report, no product has reached the clinic and no

independent validation studies have been published. Guideline-developing

organizations and expert panels do not currently recommend serum proteomic

profiling (identification of discriminating peaks) for clinical use34,100.


1.4.3 Peptidomics

The low-molecular-weight plasma or serum proteome has been the focus of

recent attempts to find novel biomarkers101. Peptides are essential for many

physiological processes such as blood pressure (angiotensin II) and blood glucose

(insulin) regulation. It has been suggested that “the low molecular-weight region of

the blood proteome is a treasure trove of diagnostic information ready to be

harvested by nanotechnology”102. The low-molecular-weight serum proteome has

been characterized by ultrafiltration, enzymatic digestion, and liquid chromatography

coupled to tandem mass spectrometry103,104 or via a top-down proteomics approach

(intact peptide is distinguished directly by its fragment ions)105 or by means of

pattern profiling106. Informative diagnostic peptides that are generated after

proteolysis of high abundance proteins by the coagulation and complement

enzymatic cascades can be identified by mass spectrometry. These proteomic

patterns were claimed to distinguish not only controls from cancer patients107 but

also between various types of cancers106.

One major consideration is that these peptides present in the serum are

derived from a low number of high abundance proteins. Koomen et al. studied

peptides in serum and concluded that sample collection is of immense importance,

and could give rise to artifacts, and that serum is not ideal for proteomic experiments

as it contains significant endoproteolytic and exoproteolytic enzymatic activity108.

This finding raises concerns regarding peptidomics data generated by profiling

technologies. Peptidomic profiling might represent nothing more than peptides

cleaved during coagulation or functions inherent to plasma or serum, including


immune modulation, inflammatory response and protease inhibition109. Many of the

aforementioned caveats associated with mass spectrometry-based protein profiling

technologies also apply to peptidomics.

1.4.4 Cancer biomarker family approach

The premise for the ‘cancer biomarker family’ approach is that if a member of

a protein family is already an established biomarker, then other members of that

family might also be good cancer biomarkers. For example, PSA is a member of the

human tissue kallikrein family. Kallikreins are secreted enzymes with trypsin-like or

chymotrypsin-like serine protease activity. They consist of a family of 15 genes

clustered in tandem on chromosome 19q13.4110. PSA (or kallikrein 3) and human

kallikrein 2 currently have important clinical applications as prostate cancer

biomarkers111. Other members of the human kallikrein family have been implicated

in the process of carcinogenesis and are currently being investigated as biomarkers

for diagnosis and prognosis. For example, human kallikrein 6 has been studied as a

novel ovarian cancer biomarker112. It was found that elevated serum levels of this

protein were associated with late-stage tumor, high grade and serous histotype and

with resistance to chemotherapy112. In general, levels of kallikrein 6 were linked to

decreased disease-free and overall survival, thus serving as an independent and

unfavorable prognostic indicator. Similarly, kallikreins 3, 5 and 14 have been shown

to be increased in the serum of breast cancer patients, thus potentially serving as

diagnostic markers. Being serine proteases, these proteins could be implicated in

tumor progression through extracellular matrix degradation.


1.4.5 Secreted protein approach

In theory, a candidate serological tumor marker should be a secreted protein,

because it has the highest likelihood of entering the circulation. Examination of

tissues or biological fluids near to the tumor site of origin may facilitate identification

of candidate molecules for further investigation. The increasing evidence that tumor

growth and progression is dependent on the malignant potential of the tumor cells as

well as on the microenvironment surrounding the tumor (e.g. stroma, endothelial

cells, immune and inflammatory cells), further supports this approach113,114. A

number of technologies can be utilized, but for systematic characterization of

proteins in complex mixtures, mass spectrometry is the preferred technology. In the

case of breast cancer, breast tissue, nipple aspirate fluid, breast cyst fluid, tumor

interstitial fluid and breast cancer cell lines can all be explored. The tumor interstitial

fluid that perfuses the tumor microenvironment in invasive ductal carcinomas of the

breast was examined by proteomic approaches115. Over 250 proteins were identified,

many of which were relevant to processes such as cell proliferation and invasion.

It should be noted that some of the widely used cancer biomarkers such as CEA, CA

125 and HER-2 are actually membrane-bound proteins, which are shed into the

circulation. The identification of secreted proteins in tissues or other biological fluids

does not necessarily imply that the proteins will be detectable in the sera of cancer

patients. Serum-based diagnostic tests depend on the stability of the protein, its

clearance, its association with other serum proteins and the extent of post-

translational modifications.


1.4.6 Other prominent strategies

A number of other cancer biomarker strategies exist. One approach that is

gaining popularity is based on protein arrays. Chinnaiyan and colleagues recently

published data suggesting that autoantibody signatures might improve the early

detection of prostate cancer116. Using a combination of phage-display technology

and protein microarrays, they identified new autoantibody-binding peptides derived

from prostate cancer tissue. Another prevailing view is that tumor-associated

antigens could serve as biosensors for cancer because tumors naturally elicit an

immune response in the host. Moreover, breaking the cancer genetics dogma that

hematologic malignancies result from chromosomal translocations117 and that

mutations underlie epithelial solid tumors, gene-fusions as a result of translocations

in prostate cancer have been reported using gene-expression datasets57. This

translocation seems to be frequent (occurring in 40–50% of cases), may have

prognostic value and it may be an early event in carcinogenesis. In addition, mass

spectrometry-based imaging of fresh-frozen tissue sections has yielded a number of

potential candidate molecules118,119. Besides proteomic profiling of serum, attempts

have been made to decipher the serum proteome via numerous fractionation

schemes to simplify and reduce the dynamic range of molecules present in serum120.

Finally, the use of animal models involving human tumor xenograft experiments

has also shown promise for biomarker discovery121,122.


1.5 Emergence of proteomics and relevance to breast cancer

Recently, we have witnessed the emergence of the “omics” era with

proteomics, peptidomics, degradomics, metabolomics, and so forth. However, the

only “omics” other than genomics to become a “buzzword” is proteomics, a term

used for studying the proteome of an organism. Most of the proteomic technology

platforms for biomarker discovery are centered on the implementation of mass

spectrometric techniques in conjunction with several other analytical techniques

such as gel electrophoresis, isoelectric focusing and chromatography. Furthermore,

these technologies have matured over the past few years and hence are capable of

identifying thousands of proteins simultaneously.

1.5.1 Basic components of a mass spectrometer

Mass spectrometry (MS) is an analytical technique that measures the masses

of individual molecules and atoms. Ultrahigh detection sensitivity and high molecular

specificity are hallmarks of MS. The advantages of MS are its ability to provide

molecular mass with high specificity, provide high detection sensitivity (detects a

single molecule), determine structures of most classes of unknown compounds, its

application to all kinds of samples (volatile, non-volatile, polar, nonpolar, solid, liquid,

gaseous materials) and its ability to analyze complex mixtures123. All information

gathered from a mass spectrometer comes from the analysis of gas-phase ions.

Therefore, the first step is to convert the analyte molecules into gas-phase ionic

species (performed by an ionization source) because, once the analyte is a gaseous ion, its motion can

be manipulated (this cannot be done with neutral species)124. Then a mass analyzer


separates the molecular ions and their charged fragments according to their

mass-to-charge (m/z) ratio. The ion current due to these mass-separated ions is detected

by a detector and displayed as a mass spectrum. All of these steps are carried out

under high vacuum so that ions do not collide with other species. The generated

spectra are then analyzed by various algorithms125.
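
As a worked illustration of the m/z ratio, the sketch below computes the m/z value observed for a molecule of a given neutral mass at several charge states. It assumes positive-mode ionization in which each charge comes from one added proton; the neutral mass of 1000 Da is a hypothetical example.

```python
PROTON_MASS = 1.007276  # Da, mass of a proton


def mz(neutral_mass: float, charge: int) -> float:
    """m/z of a molecule carrying `charge` protons (positive mode assumed)."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


# Hypothetical analyte with a neutral monoisotopic mass of 1000 Da.
for z in (1, 2, 3):
    print(z, round(mz(1000.0, z), 4))
```

The same molecule therefore appears at a lower m/z as its charge state increases, which is why multiply charged ions of large molecules fall within the range of common mass analyzers.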

A variety of ionization sources exist, such as electron ionization, chemical

ionization, fast atom bombardment (FAB), field ionization, matrix-assisted laser

desorption ionization (MALDI) and electrospray ionization (ESI)123. The choice of

method of ionization depends on the nature of samples used. Electron ionization

was the most popular method for organic compounds since it was useful for

thermally stable and relatively volatile compounds. Its upper mass limit was

1 kDa. Desorption ionization was developed for non-volatile and thermally

unstable compounds. MALDI was developed to study masses of greater than

200 kDa. In MALDI, the sample is mixed with a matrix, which is then irradiated with

short laser pulses123. The matrix absorbs energy at the wavelength of

the laser radiation, and this energy is transferred to the sample as the laser beam

causes evaporation of the matrix (most applications use UV lasers, such as N2 lasers, or IR

lasers). Finally, ESI encompasses three different processes: droplet formation,

droplet shrinkage and gaseous ion formation. A solution-based sample is passed

through an electrostatic field (3-4 kV), generating a fine mist of charged

droplets124. Nitrogen gas is normally added to assist in evaporation of solvent from

those charged droplets.


Following ionization, the ions enter the mass analyzer which separates gas-

phase ions generated from the ionization source based on their m/z ratio. Ion motion

in the mass analyzer can be manipulated by electric or magnetic fields, to direct ions

to a detector in an m/z-dependent manner. The commonly used mass analyzers can

be broadly grouped into beam (time-of-flight (TOF) and quadrupole) and trapping

(ion-trap and Fourier-transform ion-cyclotron resonance (FT-ICR)) analyzers.

Specifically, an ion-trap MS stores and manipulates ions in time rather than in space.

Quadrupole ion-trap instruments use an oscillating electric field for storage and

mass analysis of ions. The performance of a mass analyzer is based on mass range

(maximum allowable mass that can be analyzed), resolution (ability to separate 2

neighboring mass ions), scan speed, and detection sensitivity (smallest amount of an

analyte that can be detected at a certain confidence level).
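
Resolution is commonly expressed as the resolving power R = m/Δm, where Δm is the smallest mass difference that can be separated at mass m. The sketch below applies this standard definition; the numerical values are hypothetical and do not describe any particular instrument.

```python
def resolving_power(mass: float, delta_m: float) -> float:
    """R = m / delta_m for two just-separated neighboring peaks."""
    return mass / delta_m


def min_separable_difference(mass: float, r: float) -> float:
    """Smallest mass difference separable at `mass` for resolving power `r`."""
    return mass / r


# Hypothetical analyzer with R = 10,000 measuring an ion at m/z 1000.
print(min_separable_difference(1000.0, 10000.0))
print(resolving_power(1000.0, 0.1))
```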

Finally, a detector measures the electric current in proportion with the number

of ions striking it to generate a mass spectrum which is used to search databases

using pattern matching algorithms such as MASCOT, SEQUEST and X!Tandem, to

generate a list of identified proteins per experiment. Tandem mass spectrometry

refers to mass selection, fragmentation and mass analysis. In this instance, in MS1,

a specified ion from a mixture of ions that are produced in the ion source is selected.

This ion undergoes fragmentation via collisions with neutral gas atoms and in MS2,

the products are analyzed.
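
To make the MS2 step concrete, the sketch below computes theoretical singly charged b- and y-type fragment ions for a peptide, which is the kind of theoretical spectrum that database search tools compare against measured spectra. It is a simplified illustration using standard monoisotopic residue masses, not the scoring algorithm of MASCOT, SEQUEST or X!Tandem; the peptide sequence is an arbitrary example.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # Da
PROTON = 1.007276   # Da


def fragment_ions(peptide: str):
    """Return (b_ions, y_ions) as lists of singly charged m/z values."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    n = len(masses)
    # b_i: N-terminal fragment containing the first i residues.
    b_ions = [sum(masses[:i]) + PROTON for i in range(1, n)]
    # y_i: C-terminal fragment containing the last i residues (plus water).
    y_ions = [sum(masses[n - i:]) + WATER + PROTON for i in range(1, n)]
    return b_ions, y_ions


b, y = fragment_ions("PEPTIDE")  # arbitrary example sequence
for i, (bi, yi) in enumerate(zip(b, y), start=1):
    print(f"b{i} = {bi:.4f}   y{i} = {yi:.4f}")
```

Each complementary pair (b_i and y_(n-i)) sums to the neutral peptide mass plus two protons, which provides a simple internal check on the calculation.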


1.5.2 Breast cancer proteomics: Sources to mine for biomarkers

In 1974, sera from normal volunteers, patients with a variety of non-

neoplastic diseases and patients with malignant or benign tumors were examined by

two-dimensional poly-acrylamide gel electrophoresis126. The authors did not identify

specific proteins, but rather, observed a differential expression pattern discriminating

the groups studied. However, despite the early searches for cancer biomarkers and

despite the rapidly advancing proteomic techniques with superior sensitivity, none of

the potential biomarkers identified in proteomic experiments has found a niche for

the management of breast cancer at the clinical level127.

Nevertheless, proteomics, and in particular mass spectrometry, has been

employed to identify novel breast cancer biomarkers. Such studies have

predominantly examined breast tumor tissues and biological fluids including serum,

plasma, nipple aspirate or ductal lavage as well as cancer cell lines128. The intent of

examining these different sources includes obtaining a better understanding of

mammary oncogenesis129 and potentially leading to improvements in screening,

diagnosis as well as prognosis and/or prediction of therapeutic response. Since

breast cancer is a complex and heterogeneous disease, no single model or biological

source is expected to mimic all aspects of the disease130. For this reason, an

approach to biomarker development should be well conceived and play to the

strengths of current technologies while acknowledging and addressing the

limitations127.

One of the sources to mine for potential biomarkers is serum or plasma of

breast cancer patients, compared to serum of healthy controls. Exploring biological


fluids is an attractive way to look at secreted proteins. The analysis of plasma for

breast cancer biomarkers (and other cancer markers) is currently ongoing56,131. It

has been estimated that blood contains more than 100,000 different protein forms

with abundances that span 10-12 orders of magnitude56. Unfortunately, the

discovery of tumor-derived biomarkers by analyzing plasma is challenging because

the 20 most abundant plasma proteins (concentration ranges in the mg/mL range)

account for 99% of the total protein mass and impede detection of lower abundance

tumor antigens56. Potential tumor markers are expected to exist in the low ng-pg/mL

concentration range. Currently, without up-front fractionation techniques, the

presence of major proteins in blood represents a technological challenge for the

detection of the less abundant ones. The main concern is suppression of ionization

of low abundance proteins by high abundance proteins such as albumin and

immunoglobulins.
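
The scale of the problem can be sketched by comparing the two concentration ranges mentioned above. The calculation below assumes one representative value from each range (1 mg/mL for an abundant protein and 1 pg/mL for a putative marker); both values are illustrative assumptions rather than measurements.

```python
import math


def orders_of_magnitude(high_conc: float, low_conc: float) -> float:
    """Base-10 log of the ratio between two concentrations in the same units."""
    if high_conc <= 0 or low_conc <= 0:
        raise ValueError("concentrations must be positive")
    return math.log10(high_conc / low_conc)


PG_PER_MG = 1e9  # 1 mg = 10^9 pg

abundant_pg_per_ml = 1.0 * PG_PER_MG  # assumed abundant protein at 1 mg/mL
marker_pg_per_ml = 1.0                # assumed tumor marker at 1 pg/mL

span = orders_of_magnitude(abundant_pg_per_ml, marker_pg_per_ml)
print(f"concentration gap: about {span:.0f} orders of magnitude")
```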

Fortunately for breast cancer, the mammary gland offers the possibility to

access local fluids, which could be potential sources for breast cancer biomarker

discovery. Fluid found within the ductal and lobular system of the breast can be

extracted through the nipple using an aspiration device to obtain nipple aspirate fluid

(NAF)132. Non-pregnant and non-lactating women continuously secrete and reabsorb

this fluid133. Consequently, NAF is a viable source to mine since it surrounds the

ducts and breast epithelial cells134-136. Despite this, only a limited number of proteins

have been identified in NAF, predominantly owing to the presence of high

abundance plasma proteins135,136.


Alternatively, another source to mine for potential biomarkers is at the tissue

level – examining normal mammary gland tissues and breast tumors137.

Nevertheless, these structures are complex, incorporating different cell types with

different proportions such as epithelial cells, adipocytes, myoepithelial cells and

fibroblasts. Due to this multifaceted population, breast tumor cells comprise a minor

fraction of this whole population of cells. Furthermore, tumor biopsies also contain

blood components; therefore proteomic analysis of breast tumor tissues also

identifies proteins from circulating cells and from plasma138. For tissue proteomics,

the hypothesis is that certain proteins originating in the tissue could subsequently

appear and be monitored in the bloodstream. Leaky capillary beds, local production

of proteases, and the high rates of cell death within the tumor mass are expected to

facilitate shedding or secretion of tumor proteins into the bloodstream. But given the

complexity of analyzing tissues, microdissection can be regarded as a reasonable

alternative for selectively isolating individual cell types. The limitations with this

approach though are the low amounts of material obtained, the large amount of

sample that is needed to perform an experiment and the quality of the dissected

material which interfere with proteomic experiments139.

Interestingly, there has also recently been an effort to take advantage of

animal models in breast cancer research and their examination by proteomics140.

For example, a conditional HER-2/neu-driven mouse model of breast cancer was

used to examine the proteome of tumor and normal mammary tissue122. The authors

identified over 700 proteins. A caveat to using an animal model to study human

disease is whether the same genetic alterations transform both mouse and human


epithelial cells130. Furthermore, some important aspects of breast cancer, particularly

steroid hormone dependence, are not well modeled in mice141. Regardless of the

species differences, examining the tissues and/or biological fluids in the rodent

model has the same limitations as examining them in humans.

1.5.3 Tissue culture based biomarker discovery platform

Despite optimistic views that many more protein cancer biomarkers will be

discovered through various high-throughput techniques, very few, if any, serum

cancer biomarkers have been introduced at the clinic. These molecules have not yet

been identified presumably because their concentrations in serum and/or biological

fluids are too low and therefore cannot be measured or purified, unless specific

immunological reagents and highly sensitive ELISA methods are available.

Therefore, in the initial discovery phase for novel cancer biomarkers, a less complex

sample (elimination of high abundance proteins) is essential. Although clinical

validation of biomarkers must address variability arising from genetic, environmental,

and behavioral differences among humans, optimization of the discovery and

candidate verification processes involves controlling as many biological variables as

possible so that the current technologies being employed can be directly evaluated.

Moreover, given that a secretome in a tumor microenvironment contains the

extracellular matrix, constituted by proteins, receptors and adhesion molecules, as

well as a whole host of secreted proteins such as cytokines, chemokines, growth

factors and proteases – all of which can be potential biomarkers142, a cell culture

based model for sampling the secretome associated with breast cancer appears


promising. Secreted proteins play important roles in physiology and pathophysiology

and they can act locally and systemically in the body. The secretome reflects the

functionality of a cell in a given environment143. For cell culture-based proteomic

studies, the hypothesis is that proteins or their fragments originating from cancer

cells (hence present in the conditioned media) may eventually enter the circulation.

Conditioned media (CM) as a source to mine for biomarkers is increasingly gaining

popularity, as illustrated by the rise in publications over the past few years.

Breast cancer cell lines have been the most widely used models to

investigate how proliferation, apoptosis and migration become deregulated during

the progression of breast cancer144. A number of studies have used a cell culture

model system where the cells were grown in serum-free media to perform proteomic

analysis145-150. The clinical relevance of using a cell culture model to understand

biological processes and functions has been examined. Using DNA microarrays, the

molecular subtypes of 31 breast cell lines yielded two discriminating clusters

corresponding to luminal cell lines and basal/mesenchymal cell lines151. The basal

subtype was further subdivided into Basal A and Basal B; this subdivision was not

observed in primary tumors. In primary tumors, gene expression patterns have been

used to classify breast tumors into five clinically relevant subgroups (luminal A,

luminal B, basal, ERBB2-overexpressing and normal-like)152,153. In general, the

luminal subtypes are estrogen receptor (ER) positive and grow slowly whereas

basal-type lack ER and are usually high-grade cancers that grow rapidly. Recently,

the molecular taxonomy has been confirmed by protein expression profiling154,155.

Also recently, it was found that cell lines display the same heterogeneity in copy


number and expression abnormalities as the primary tumors156. Indeed, cancer cell

lines that are invasive in culture do form tumors in immune deficient mice. This is

primarily because the cancer cells in culture represent the tumor-forming cells in vivo.

While no single cell line is truly representative, a panel of cell lines shows the

heterogeneity that is observed in primary breast cancers156. Table 1.2 outlines some

of the advantages and disadvantages to using a cell culture-based approach to

discover biomarkers using proteomics.


Table 1.2: Advantages and disadvantages of a cell culture-based model for biomarker discovery

ADVANTAGES

Cell lines are readily available
Cost-effective
High-throughput
Easily modified, versatile
Easily propagated
Enables secretome analysis
Permits detection of low abundance proteins (do not represent the dynamic range problem associated with plasma; less complex mixture)
Allows for reproducibility (under well-defined experimental conditions, it yields reproducible and quantifiable results), growth standardized
The proteome of cancer cells should reflect the genetic alterations they harbor
Cancer cells can be grown as xenografts

DISADVANTAGES

No single cell line will reflect the heterogeneity of cancer
Multiple variants of the same cell line exist
Host stromal environment influencing tumor development and progression is absent
A reductionist approach; cannot mimic complexity of mammary gland; does not take into account the complex interplay between cell types and the tissue microenvironment
Does not provide insight into the evolution of breast cancer from benign lesions and normal breast epithelial cells


1.6 Purpose and aims of the present study

1.6.1 Rationale

Proteins are more diverse than DNA or RNA and therefore carry more

information than nucleic acids, since alternative splicing and post-translational

modifications result in far more species of proteins from the same gene. Proteins are

also more dynamic and reflective of cellular physiology. However, despite optimistic

views that many more protein cancer biomarkers will be discovered through various

high-throughput techniques, very few, if any, serum cancer biomarkers have been

introduced at the clinic over the last 15 years.

The classical tumor markers carcinoembryonic antigen (CEA) and alpha-fetoprotein

(AFP) were discovered in the ‘60s mainly due to the introduction of novel and

relatively sensitive immunological techniques (such as radial immuno-diffusion),

which allowed for the detection of these antigens in cancer tissues with high

specificity and reasonable sensitivity. The most contemporary cancer biomarkers

used at the clinic today (such as carbohydrate antigen CA 125, CA 15.3, CA 19.9

and PSA) were mainly developed due to the emergence, in the late ‘70s, of the

monoclonal antibody technology. Most of these tumor markers were discovered by

using cell lines or tumor extracts as immunogens and then selecting specific

hybridoma clones which recognized these tumor antigens157. Therefore, it is

conceivable that novel tumor markers may be identified in the conditioned media of

cancer cell lines using newly emerging technologies such as mass spectrometry.

The assumption that cancer biomarkers to be discovered will be secreted or shed

proteins is reasonable, since it is expected that secreted or membrane-bound


proteins, the latter having the potential to be cleaved, have a high chance of

reaching the circulation and can be found in serum, where they can be measured

with immunological techniques. All currently known cancer biomarkers are indeed

secreted or shed proteins.

Therefore, in the initial discovery phase for novel cancer biomarkers, a less

complex sample (elimination of high abundance proteins) is essential. Using a cell

culture based model, a proteomic platform for biomarker discovery can be utilized,

which consists of three major phases. The first phase involves the identification of

markers using multi-dimensional protein identification technology (discovery phase).

Following identification, the proteins must be prioritized to select a subset of marker

candidates based on several criteria such as availability of reagent set for assay

development and literature association to disease biology. The second phase

consists of developing preliminary assays to measure the levels of the selected

proteins in a relevant biological fluid from cancer patients and normal individuals

(verification phase). The final phase requires expanding the number of samples

used to evaluate only the candidates that continued to show promise in

discriminating cancer from normal from the verification phase (validation phase).

This step involves the development of a robust analytical immunoassay to measure

the proteins accurately in clinical samples.
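
The prioritization step between discovery and verification can be sketched as a simple filter-and-rank procedure. The example below is a minimal sketch under stated assumptions: the candidate names and values are hypothetical, and ranking by a count of literature associations is one possible way to encode "literature association to disease biology", not necessarily the procedure used in this work.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    localization: str         # e.g. "extracellular", "membrane", "intracellular"
    reagents_available: bool  # antibodies/standards exist for assay development
    literature_hits: int      # hypothetical count of disease-related reports


def prioritize(candidates, top_n=10):
    """Keep secreted/membrane proteins with reagents, ranked by literature hits."""
    eligible = [
        c for c in candidates
        if c.localization in {"extracellular", "membrane"} and c.reagents_available
    ]
    ranked = sorted(eligible, key=lambda c: c.literature_hits, reverse=True)
    return ranked[:top_n]


# Hypothetical discovery-phase output, for illustration only.
candidates = [
    Candidate("protein_A", "extracellular", True, 12),
    Candidate("protein_B", "intracellular", True, 40),
    Candidate("protein_C", "membrane", False, 25),
    Candidate("protein_D", "membrane", True, 30),
]

shortlist = prioritize(candidates, top_n=2)
print([c.name for c in shortlist])
```

Here protein_B is excluded because it is intracellular and protein_C because no reagents are available, leaving protein_D and protein_A for the verification phase.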

Unfortunately, the sets of differentially expressed proteins identified in

proteomic studies thus far differ from one study to another, owing in part to a

lack of experimental standardization and to problems of heterogeneity between

biological materials used. This creates a tremendous challenge to finding statistically


relevant biomarkers. Because of these issues, an approach to biomarker

development should be well conceived and play to the strengths of current

technologies while acknowledging and addressing the limitations127.


1.6.2 Hypothesis

The evolution to a malignant phenotype involves mutations that often alter the

expression pattern of a variety of genes controlling cell proliferation, differentiation

and cell death. In this respect, cancer patients will likely display a distorted

expression pattern of various proteins early in the course of the disease.

Approximately 20-25% of all cellular proteins are secreted. Therefore, proteins or

their fragments originating from cancer cells or their microenvironment may

eventually enter the circulation. We hypothesize that novel candidate tumor markers

for breast cancer may be secreted or shed proteins and can be harnessed from

tissue culture supernatants of human breast cancer cell lines using mass

spectrometry.


1.6.3 Objectives

1. Utilize emerging proteomic technologies such as mass spectrometry to

develop procedures/methodologies to sample the secretome of three human

breast cell lines (discovery phase).

a. Demonstrate that human breast cancer cell lines (MCF-10A, BT474

and MDA-MB-468) can be cultured in large volumes in serum-free

media.

b. Optimize cell culture techniques to minimize cell death by measuring

intracellular protein lactate dehydrogenase (LDH) levels and to

maximize secreted protein concentration.

c. Monitor, whenever possible, the levels of internal positive control

secreted kallikrein proteins (e.g. KLK5, KLK6, KLK10) over time in

culture.

d. Positively identify the internal control proteins and other proteins

present in the CM, via a “bottom-up” proteomic approach involving

fractionation and mass spectrometry.

2. Employ various bioinformatic analyses to select the top 10 most promising

candidate molecules to investigate further as serological breast cancer

biomarkers.

a. Focus on extracellular and membrane proteins, differentially expressed

proteins across the cell lines with relevance to disease based on

literature searches and reagent availability.


3. Evaluate the discriminatory ability of the 10 molecules in serum of normal and

breast cancer patients using an ELISA (commercially purchased or developed

in-house) (verification phase).

4. Select the top candidate from step 3 for further in-depth analysis using larger

sample size, in combination with currently used breast cancer biomarkers CA

15-3 and CEA (validation phase).


CHAPTER 2:

ANALYSIS OF THE CONDITIONED MEDIA OF THREE BREAST CELL LINES

The work presented in this chapter was published in Molecular & Cellular Proteomics:

Kulasingam, V. and Diamandis, E.P. Proteomic analysis of conditioned media from three breast cancer cell lines: A mine for

biomarkers and therapeutic targets. Mol Cell Proteomics 2007 6: 1997-2011



2.1 Introduction

Aberrant secretion or shedding of proteins is commonly associated with

disease, including cancer. The pathogenic signalling pathways involved during the

process of cancer initiation and progression are not confined to the cancer cell itself

but are rather extended to the tumor-host interface113. It is a dynamic environment in

which fluctuating information flows between the tumor cells and the normal host

tissue. Therefore, it is conceivable that either the tumor itself or its microenvironment

could be sources for biomarkers that would ultimately be shed into the serum

proteome, allowing for early disease detection24, monitoring therapeutic efficacy or

understanding the biology of the disease158. Given that approximately 20-25% of all

cellular proteins are secreted, it is reasonable to hypothesize that proteins or their

fragments originating from cancer cells or their microenvironment may eventually

enter the circulation102.

Accordingly, one of the best ways to diagnose cancer early, or to predict

therapeutic response, is to use serum or tissue biomarkers. Carcinoembryonic

antigen (CEA) and carbohydrate antigen 15.3 (CA 15.3) are the most commonly

used tumor markers for breast cancer. Their levels in serum are related to tumor

size and nodal involvement and are recommended for monitoring therapy of

advanced breast cancer or recurrence but are not suitable for population screening

(due to low diagnostic sensitivity and specificity)32,35,36. Currently, mammography

remains the cornerstone of breast cancer screening, despite its disadvantages such

as high false positive and negative rates, hazardous exposure and patient

discomfort16,17. In addition, for women under the age of 40, mammographic


screening yields a poor sensitivity of 33%18,159. Recent technological advances in

proteomics have opened up new and exciting avenues for the discovery of

biomarkers or for characterization of molecules involved in cancer initiation and

progression.

A number of different proteomic-based approaches have been utilized to

discover and characterize disease-specific molecules. Fluid found within the ductal

and lobular system of the breast can be extracted through the nipple using an

aspiration device to obtain nipple aspirate fluid (NAF)132. Non-pregnant and non-

lactating women continuously secrete and reabsorb this fluid133. Because of the

complex nature of biological fluids relevant to breast cancer, only a handful of high

abundance proteins have been identified in NAF, which illustrates the need to find

another source to mine for the initial biomarker discovery 134,160,161.

A number of studies have used a cell culture model system where the cells

were grown in serum-free media to perform proteomic analysis145-150. Typically,

proteomic analysis of conditioned media (CM) involves culturing the cells in serum-

free media (SFM) to ensure that the collected CM contain no other extraneous

proteins, except for the secreted or shed proteins from the cancer cells, thereby

facilitating their identification through MS. Furthermore, given that the cell lines to be

used are specific to epithelial breast cancer cells, the proteins present in the CM

must originate from the cancer cell and not from the surrounding stroma, thereby

avoiding unnecessary complications in the analyses. Seeding density, incubation

time in SFM, volume of media used, type of SFM and type of tissue culture flask are all

variables that need to be explored thoroughly to select the optimal conditions

for growth. Optimizing culture conditions also requires determining the amount of cell death and

autolysis occurring in SFM. Measurements of lactate dehydrogenase (LDH), an

intracellular protein whose presence in the CM of cell lines is an indicator of cell

death, can be utilized. Alternatively, the amount of major cytosolic proteins such as

beta-actin or beta-tubulin can be used to optimize incubation times. Also, when

performing a proteomic analysis of conditioned media from cell lines, it is important

that the cells are extensively washed to remove protein components arising from the

fetal bovine serum (FBS) used in culture. Therefore, an essential experiment to

perform is to incubate tissue culture flasks with no cells added, but treated with the

same wash conditions and incubation times to serve as negative controls. The

proteins identified in the negative controls, predominantly FBS-derived proteins, can

be deleted from the list of proteins identified in the conditioned media as arising from

incomplete washing of the flasks. Moreover, it is known that cell growth is slower in

SFM and that cells are prone to autolysis resulting in non-specific release of

intracellular proteins into the culture supernatant. Hence, it is important to obtain a

proteome of the whole cell lysate derived from the same cell samples that were the

sources of CM to consider only proteins found uniquely in the CM versus those that

were found in the cell lysate. Finally, given the selective ionization process of mass

spectrometry, it is important to have at least biological triplicates (same cell line

prepared independently) in the analysis.

In this study, we report a shot-gun proteomics approach to sample the

conditioned media of three human breast cell lines (MCF-10A, BT474 and MDA-MB-

468). MCF-10A, a basal B subtype, with intact p53, was derived by spontaneous


immortalization of breast epithelial cells from a patient with fibrocystic disease and it

has been used extensively as a normal control in breast cancer studies162. These

cells do not survive when implanted subcutaneously into immunodeficient mice162.

BT474, a luminal subtype, obtained from a stage II localized solid tumor, is positive

for ER and progesterone receptor (PgR), which represent 50-60% of all breast

cancer cases163. This cell line also displays amplification of HER-2/neu or erbB-2 –

which represents 30% of all breast cancer cases64. HER-2/neu is a cell membrane

surface-bound tyrosine kinase involved in signal transduction, leading to cell growth

and differentiation. Its over-expression is associated with a high risk of relapse and

death64 and is the target of the therapeutic monoclonal antibody Herceptin65. Finally,

MDA-MB-468, a basal A-like subtype, obtained from a pleural effusion of a stage IV

patient164, is ER and PgR negative (15-25% of breast cancer) and PTEN negative

(30% of breast cancer)165,166.

These cell lines were cultured in serum-free media (SFM) to ensure that the

collected conditioned media (CM) contain no other extraneous proteins, except for

the secreted or shed proteins from the cancer cells. By collecting and concentrating

large volumes of CM produced from cell lines representing semi-normal (MCF-10A),

non-invasive (BT474) and metastatic origins (MDA-MB-468), the secreted and shed

proteins accumulated in the CM, thereby facilitating their identification through MS.

Our comparative proteomic analysis of the CM of MCF-10A, BT474 and MDA-MB-

468 identified over 600, 500 and 700 proteins, respectively. A large portion of the

proteins was present in all 3 cell lines; however, a significant portion consisted of

proteins that were unique to each of the lines. Among these were the internal control

proteins, human kallikreins 5, 6 and 10, which were identified by MS and ELISA in MDA-

MB-468 cells at concentrations ranging from 2-50 µg/L. Members of the human

kallikrein family (KLKs) have been implicated in the process of carcinogenesis and

the application of kallikreins as biomarkers for diagnosis and prognosis is currently

being investigated. Kallikreins are secreted enzymes with trypsin-like or

chymotrypsin-like serine protease activity110. Prostate-specific antigen (PSA; KLK3),

belonging to the family of human tissue kallikreins, and human kallikrein 2 (KLK2)

currently have important clinical applications as prostate cancer biomarkers111. In

addition to the control proteins, various proteases, receptors, protease inhibitors,

cytokines and growth factors were identified. Finally, spectral counting analysis

revealed promising molecules to investigate further for both understanding the

disease and as potential biomarkers for breast cancer.


2.2 Materials and Methods

2.2.1 Cell lines

The breast epithelial cell line MCF-10A, and the breast cancer cell lines BT-

474 and MDA-MB-468 were purchased from the American Type Culture Collection

(ATCC), Rockville, MD. MCF-10A was maintained in Dulbecco’s modified Eagle’s

medium and F12 medium (DMEM/F12) supplemented with 8% fetal bovine serum

(FBS), epidermal growth factor (20ng/mL), hydrocortisone (0.5µg/mL), cholera toxin

(100ng/mL) and insulin (10µg/mL). BT-474 and MDA-MB-468 were maintained in

phenol-red-free RPMI 1640 culture medium (Gibco) supplemented with 8% FBS. All

cells were cultured in a humidified incubator at 37°C and 5% CO2 in tissue culture

T-75 cm² flasks.

2.2.2 Cell culture

Approximately 30 × 10⁶ cells were seeded individually into six 175 cm² tissue

culture flasks per cell line. After 2 days, the RPMI or DMEM/F12 media were

discarded and the cells rinsed twice with 1X phosphate buffered saline (PBS).

Following this, 30mL of Chemically Defined Chinese Hamster Ovary (CDCHO)

serum-free medium (Gibco), supplemented with glutamine (8mM) (Gibco) was

added and the flasks were incubated for an additional 24 hours. The conditioned

media (CM) were collected and spun down to remove cellular debris. CM were then

frozen at -80°C until further use. A 1mL aliquot was taken at the time of harvest to

measure for total protein (Bradford assay), lactate dehydrogenase (LDH) and human

kallikreins 5, 6 and 10 (KLK5, KLK6, KLK10) via ELISA. The adhered cells were


trypsinized and counted using a hemocytometer. This procedure was repeated

several times for reproducibility. In addition, 30mL of the culture media (RPMI 1640

and DMEM/F12) were subjected to the same conditions as above, with no cells

added, and used for comparison. For the MDA-MB-468 cell lysate experiment, at the

end of 24 hours in SFM, the adhered cells were lysed using a French Press (Thermo

Electron), where the cells are sheared by forcing them through a narrow space.

Total protein was measured and 400µg of protein from the lysate was added to

60mL of CDCHO medium and processed in the same manner as the CM. The cell

lysate experiment was performed in duplicate.

2.2.3 Sample preparation

Two 30mL CM were combined (60mL) for each cell line, creating 3 biological

replicates per cell line, and dialyzed using a molecular weight cut-off membrane of

3.5kDa. The CM was dialyzed in 5L of 1mM ammonium bicarbonate solution

overnight, at 4°C with two buffer changes. The dialyzed CM was poured equally into

two 50mL conical tubes. The CM was frozen and lyophilized to dryness. The

lyophilized sample was denatured using 8M urea and reduced with dithiothreitol

(DTT, final concentration 13mM; Sigma). Following reduction, the sample was

alkylated with 500mM iodoacetamide (Sigma) and desalted using a NAP5 column

(GE Healthcare). The sample was lyophilized and trypsin (Promega) digested (1:50,

trypsin:protein concentration) overnight in a 37°C water bath. Following this, the

peptides were lyophilized to dryness.


2.2.4 Strong cation exchange liquid chromatography

The trypsin-digested dry sample was resuspended in 120µL of mobile phase

A (0.26M formic acid in 10% acetonitrile [ACN]). The sample was directly loaded

onto a PolySULFOETHYL A™ column (The Nest Group, Inc.) containing a

hydrophilic, anionic polymer (poly-2-sulfoethyl aspartamide). A 200Å pore size

column with a diameter of 5µm was used. A one hour fractionation procedure was

performed using a high performance liquid chromatography (HPLC) system (Agilent

1100). A linear gradient of 0.26M formic acid in 10% acetonitrile as the running

buffer and 1M ammonium formate added as the elution buffer was used. The eluent

was monitored at a wavelength of 280nm. Forty fractions, 200µL each, were

collected every minute after the start of the elution gradient. These 40 fractions were

pooled into 8 combined fractions (each pool consisting of 5 fractions) and lyophilized

to ~200µL.
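The pooling step can be sketched as follows. This is only an illustration: the text states that 40 fractions were combined into 8 pools of 5, but not which fractions went into which pool, so consecutive pooling is an assumption here and the function name is hypothetical.

```python
def pool_fractions(n_fractions=40, pool_size=5):
    """Group fraction numbers 1..n_fractions into consecutive pools.

    Assumption: consecutive fractions are pooled together; the source only
    states that 40 fractions were combined into 8 pools of 5 fractions each.
    """
    if n_fractions % pool_size != 0:
        raise ValueError("n_fractions must be a multiple of pool_size")
    fractions = list(range(1, n_fractions + 1))
    return [fractions[i:i + pool_size] for i in range(0, n_fractions, pool_size)]


pools = pool_fractions()
for index, pool in enumerate(pools, start=1):
    print(f"Pool {index}: fractions {pool[0]}-{pool[-1]}")
```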

2.2.5 Tandem mass spectrometry (LC-MS/MS)

The 8 pooled fractions per replicate per cell line were C18-extracted using a

ZipTipC18 pipette tip (Millipore; catalogue # ZTC18S096) and eluted in 4µL of 68%

ACN, made up of Buffer A and Buffer B (90% ACN, 0.1% formic acid, 10% water,

0.02% trifluoroacetic acid [TFA]). 80µL of Buffer A (95% water, 0.1% formic acid, 5%

ACN, 0.02% TFA) was added and 40µL were injected onto a 2 cm C18 trap column

(inner diameter 200 µm). The peptides were eluted from the trap column onto a

resolving 5 cm analytical C18 column (inner diameter 75 µm) with an 8 micron tip

(New Objective). The LC set-up was coupled online to a 2-D Linear Ion Trap (LTQ,


Thermo Inc) mass spectrometer using a nanoelectrospray ionization source (ESI) in

data-dependent mode. Each pooled fraction was run on a 120 minute gradient. The

eluted peptides were subjected to tandem mass spectrometry (MS/MS). DTAs were

created using the Mascot Daemon (v2.16) and extract_msn. The parameters for

DTA creation were: min. mass 300, max. mass 4000, automatic precursor charge

selection, min. peaks 10 per MS/MS scan for acquisition and a min. scans per group

of 1.

2.2.6 Data analysis

The resulting raw mass spectra from each pooled fraction were analyzed

using Mascot (Matrix Science, London, UK; version 2.1.03) and X!Tandem (GPM

Manager, version 2.0.0.4) search engines on the non-redundant IPI Human

database V3.16 (62000+ entries). Up to one missed cleavage was allowed and

searches were performed with fixed carbamidomethylation of cysteines and variable

oxidation of methionine residues. A fragment tolerance of 0.4 Da and a parent

tolerance of 3.0 Da were used for both search engines, with trypsin as the digestion

enzyme. This operation resulted in 8 DAT files (Mascot) and 8 XML files (X!Tandem)

for each replicate sample per cell line. Scaffold (version Scaffold-01_05_19,

Proteome Software Inc., Portland, OR) was used to validate MS/MS based peptide

and protein identifications. Peptide identifications were accepted if they could be

established at greater than 95.0% probability as specified by the PeptideProphet

algorithm167. Protein identifications were accepted if they could be established at

greater than 80.0% probability and contained at least 1 identified peptide. Protein


probabilities were assigned by the ProteinProphet algorithm168. Proteins that

contained similar peptides and could not be differentiated based on MS/MS analysis

alone were grouped to satisfy the principles of parsimony. The DAT and XML files

for each cell line plus their respective negative control files (RPMI or DMEM culture

media only) were inputted into Scaffold to cross-validate Mascot and X!Tandem data

files. Each replicate sample was designated as one biological sample containing

both DAT and XML files in Scaffold and searched with the MudPIT option selected. The

results obtained from Scaffold were processed using an in-house developed

program that generated the protein overlaps between samples. Protein

identifications were assigned a cellular localization based on information available

from Swiss-Prot, Gene Ontology (GO), Human Protein Reference Database

(HPRD) and other publicly available databases. To calculate the false-positive error

rate, the individual fractions were analyzed using the “sequence-reversed” decoy IPI

Human V3.16 database by Mascot and X!Tandem and data analysis was performed

as mentioned above.
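The acceptance criteria described above can be illustrated with a small filter. This is a sketch under stated assumptions, not Scaffold's interface: the `ProteinHit` record, its field names and the example accessions are hypothetical, and only the two probability thresholds and the one-peptide minimum come from the text.

```python
from dataclasses import dataclass, field

PEPTIDE_MIN_PROB = 0.95  # PeptideProphet threshold stated in the text
PROTEIN_MIN_PROB = 0.80  # ProteinProphet threshold stated in the text
MIN_PEPTIDES = 1         # minimum number of accepted peptides per protein


@dataclass
class ProteinHit:
    """Hypothetical record for one protein identification."""
    accession: str
    protein_prob: float
    peptide_probs: list = field(default_factory=list)


def accepted_peptides(hit):
    """Peptide probabilities that exceed the peptide threshold."""
    return [p for p in hit.peptide_probs if p > PEPTIDE_MIN_PROB]


def is_accepted(hit):
    """Apply the protein threshold and the minimum-peptide rule."""
    return (hit.protein_prob > PROTEIN_MIN_PROB
            and len(accepted_peptides(hit)) >= MIN_PEPTIDES)


hits = [
    ProteinHit("EXAMPLE_1", 0.99, [0.97, 0.50]),  # kept: one peptide passes
    ProteinHit("EXAMPLE_2", 0.70, [0.99]),        # dropped: protein probability too low
    ProteinHit("EXAMPLE_3", 0.90, [0.90]),        # dropped: no peptide passes
]
print([h.accession for h in hits if is_accepted(h)])
```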

2.2.7 Spectral counting

Using the number of total spectra output from Scaffold, we identified the

differentially expressed proteins using spectral counting. Common peptides among

proteins were grouped and proteins containing more than 10% of their total spectra

from negative control samples were removed and one excel file containing total

proteins identified and their presence (defined by spectral counts) in the 3 cell lines

were generated. A normalization criterion was applied to normalize the spectral


counts so that the values of the total spectral counts per sample were similar. An

average of the spectral counts was generated for each cell line (based on the

triplicate samples). The sum of the 3 variances for the cell lines, an indicator of the

variance within each cell line, was calculated. The variance of the average spectral

counts for each cell line revealed the variability between the cell lines. ANOVA

(Fisher test) was performed to obtain the ratio of the “between sample variance” to

the “within sample variance”. Apparent fold-changes were calculated when possible.
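The variance ratio described above can be sketched as follows. The normalization rule (scaling every sample to the mean total spectral count) and the use of sample variances are assumptions, since the text does not give the exact formulas; the function names and the example counts are hypothetical. With equal numbers of replicates, this ratio differs from the textbook one-way ANOVA F statistic only by a constant factor.

```python
from statistics import mean, variance


def normalize(samples):
    """Scale each sample so that all samples share the same total spectral count.

    samples maps a sample name to a dict of protein -> spectral count.
    Assumption: the common target is the mean of the per-sample totals.
    """
    totals = {name: sum(counts.values()) for name, counts in samples.items()}
    target = mean(totals.values())
    return {
        name: {protein: count * target / totals[name] for protein, count in counts.items()}
        for name, counts in samples.items()
    }


def variance_ratio(groups):
    """Ratio of between-cell-line variance to within-cell-line variance for one protein.

    groups maps a cell line to the normalized counts of its replicates.
    """
    within = sum(variance(values) for values in groups.values())
    between = variance([mean(values) for values in groups.values()])
    if within == 0:
        return float("inf") if between > 0 else 0.0
    return between / within


# Hypothetical normalized counts for one protein, three replicates per cell line.
example = {
    "MCF-10A": [1, 2, 3],
    "BT474": [11, 12, 13],
    "MDA-MB-468": [21, 22, 23],
}
print(variance_ratio(example))
```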

2.2.8 Total protein and lactate dehydrogenase assay

Total protein was quantitated in the CM using a Coomassie (Bradford) protein

assay reagent (Pierce). All samples were loaded in triplicates on a microtiter plate

and protein concentrations were estimated by reference to absorbances obtained for

a series of bovine serum albumin (BSA) standard protein dilutions. Lactate

dehydrogenase (LDH), an intracellular enzyme which if found in the CM is an

indicator of cell death, was measured using an enzymatic assay based on lactate to

pyruvate conversion and parallel production of NADH from NAD. The production of

NADH was measured by spectrophotometry at 340 nm using an automated method

(Roche Modular system).
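Reading a protein concentration off a BSA standard curve can be sketched as below. The standard concentrations and absorbances are invented for illustration, and a straight-line fit is an assumption that holds only over a limited range for Bradford assays; the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def concentration_from_absorbance(absorbance, slope, intercept):
    """Invert the standard curve to estimate a concentration."""
    return (absorbance - intercept) / slope


# Hypothetical BSA standards (µg/mL) and their absorbances.
bsa_standards = [0, 5, 10, 20]
bsa_absorbances = [0.05, 0.15, 0.25, 0.45]
slope, intercept = fit_line(bsa_standards, bsa_absorbances)

# Hypothetical triplicate readings of one conditioned-media sample.
triplicate = [0.34, 0.35, 0.36]
mean_absorbance = sum(triplicate) / len(triplicate)
print(concentration_from_absorbance(mean_absorbance, slope, intercept))
```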

2.2.9 Quantification of KLK5, KLK6 and KLK10

The concentration of KLK5, 6 and 10 was quantified with KLK5, 6 or 10-

specific non-competitive immunoassays developed in our laboratory 112,169,170. For

more details, see the cited literature.


2.3 Results

2.3.1 Optimization of cell culture

MCF-10A, BT474 and MDA-MB-468 cells were grown in serum-free media

(SFM) to ensure that the conditioned media contained no other exogenous proteins.

Seeding density, incubation time in SFM, volume of media used, type of SFM, type

of tissue culture flasks were all variables that were explored thoroughly to select the

most optimal conditions for growth. Refer to the Materials and Methods section for

details on the optimal conditions selected for the cell lines. Approximately 20, 33

and 40 × 10⁶ cells were found to be attached and alive at the end of the experiment

with 25, 10 and 15 µg of total protein per mL in MCF-10A, BT474 and MDA-MB-468,

respectively (Figure 2.1A and B). Furthermore, to minimize cell death and maximize

secreted protein concentration in the CM, LDH levels, which represent the amount of

cell death occurring in cell culture, were also measured (Figure 2.1C). Known

amounts of all 3 types of cells were lysed individually and their corresponding LDH

levels were graphed to create an LDH standard curve (Figure 2.1D). Using the LDH

standard graph, the value of LDH found in the 24 hour CM and the total number of

cells alive at the end of harvest, it was estimated that approximately 6-7% cell death

was occurring in the CM of the cells. Finally, to demonstrate the accumulation of

extracellular proteins in the optimized cell culture model system, the internal control

proteins KLK5, KLK6 and KLK10 were quantified in MDA-MB-468, using ELISAs

(Figure 2.1E). Kallikrein 5 was the most abundant kallikrein expressed in this cell line

(50µg/L) followed by KLK10 (~ 3.5µg/L) and KLK6 (~ 2µg/L).
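The cell death estimate from the LDH standard curve can be sketched as follows. The text does not state the exact formula, so two assumptions are made: the standard curve is linear through the origin, and percent cell death is dead cells divided by dead plus live cells. The LDH values in the example are hypothetical; only the live cell count of 33 × 10⁶ (BT474) comes from the text.

```python
def percent_cell_death(ldh_in_cm, ldh_per_million_lysed_cells, live_cells_millions):
    """Estimate percent cell death from LDH measured in conditioned media.

    Assumptions: LDH release is proportional to the number of lysed cells,
    and percent death = dead / (dead + live) * 100.
    """
    dead_cells_millions = ldh_in_cm / ldh_per_million_lysed_cells
    return 100.0 * dead_cells_millions / (dead_cells_millions + live_cells_millions)


# Hypothetical LDH values: 25 U/L in the CM and 10 U/L per million lysed cells.
estimate = percent_cell_death(25.0, 10.0, 33.0)
print(f"{estimate:.1f}% estimated cell death")
```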


Figure 2.1: Cell number, total protein, LDH and kallikrein levels in 24 hour conditioned media of cell lines. (A) Total number of cells alive during harvest of the CM; (B) Total protein concentration of CM; (C) LDH levels, indicating approximately 6-7% cell death; (D) Standard LDH curve (n = 2); (E) KLK5, KLK6 and KLK10 levels in MDA-MB-468. This procedure was repeated several times for reproducibility.


2.3.2 Identification of proteins by MS

The workflow and experimental design used in this study are shown in

Figure 2.2. Approximately 35-40 proteins were identified in the negative control

samples, which consisted of the culture media only (the list of proteins can be

found online in the supplementary data171). Many of the proteins in this list originated

from fetal bovine serum used to initially culture the cells. These proteins were

deleted from the list of total proteins identified in the CM and were not considered

further. In MCF-10A, we identified 632 proteins (Figure 2.3A). Of these, 459 were

identified in all 3 replicates, yielding a protein identification reproducibility of 73%.

Furthermore, a total of 505 proteins were identified in BT474 (Figure 2.3B). For this

cell line, 380 proteins were common to all 3 replicates (75% reproducibility). Finally,

723 proteins were identified in MDA-MB-468 (Figure 2.3C), of which 553 were

identified in all 3 replicates (76% reproducibility). In general, using the workflow

presented here, we achieved technical reproducibility (same sample injected twice

into the LTQ) of ~90% (data not shown). The total number of proteins identified per

number of total peptides per cell line is shown in Table 2.1. Many of the proteins

identified contained two or more peptide hits. A table containing detailed information

on all of the proteins identified for each of the cell lines, including number of unique

peptides identified per protein, peptide sequences, precursor ion mass and charge

states can be found online171. In addition, 8, 3 and 5 proteins were identified in MCF-

10A, BT474 and MDA-MB-468 cells respectively using a non-sense database,

yielding a false positive rate of ~ 1%172. The list of proteins found in the non-sense

database can be found online in the supplementary data171.
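The reproducibility and false-positive percentages quoted above follow directly from the reported counts. The short sketch below recomputes them; the decoy rate is taken as decoy identifications divided by forward identifications, which is an assumption about how the ~1% figure was derived.

```python
# Counts reported in the text for each cell line.
identified = {"MCF-10A": 632, "BT474": 505, "MDA-MB-468": 723}
in_all_replicates = {"MCF-10A": 459, "BT474": 380, "MDA-MB-468": 553}
decoy_hits = {"MCF-10A": 8, "BT474": 3, "MDA-MB-468": 5}


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


for cell_line, total in identified.items():
    reproducibility = percent(in_all_replicates[cell_line], total)
    # Assumption: decoy rate = decoy identifications / forward identifications.
    decoy_rate = percent(decoy_hits[cell_line], total)
    print(f"{cell_line}: reproducibility {reproducibility:.1f}%, decoy rate {decoy_rate:.2f}%")
```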


Figure 2.2: Outline of experimental workflow. Steps shown, in order: cell culture; sample preparation; HPLC fractionation (SCX); LC-MS/MS (LTQ); database searching (Mascot and X!Tandem); identification probability; data analysis (reproducibility, overlap, cellular localization, spectral counting).


Figure 2.3 (Venn diagrams of Rep 1, Rep 2 and Rep 3 for each cell line): A) MCF-10A (632 proteins; 459 in all 3 replicates); B) BT474 (505 proteins; 380 in all 3 replicates); C) MDA-MB-468 (723 proteins; 553 in all 3 replicates)

Figure 2.3: Number of proteins identified in CM by LC-MS/MS for the 3 cell lines. (A) MCF-10A, (B) BT474, (C) MDA-MB-468. Three independent replicates (rep) were generated and processed in the same manner for each cell line. Good reproducibility is shown, as 73-76% overlap is observed among the replicates of each cell line.


Table 2.1: Total number of proteins identified per number of peptides

# of Peptides Identified   MCF-10A   BT474             MDA-MB-468
1                          120       124               155
2                          172       137               211 (KLK6)
3                          95        77                111 (KLK5, 10)
4                          63        41 (HER-2/neu)    67
≥5                         182       126               179
Total                      632       505               723


2.3.3 Identification of internal control proteins in CM by MS

One of the advantages of our approach to biomarker discovery was the

presence of endogenous internal control proteins in the CM of MDA-MB-468. We

identified KLK5, 6 and 10 in the CM of MDA-MB-468 in all 3 replicates for this cell

line by MS. Three unique peptides were identified for KLK5, 2 unique peptides for

KLK6 and 3 unique peptides for KLK10. Furthermore, the BT474 cell line is well-

characterized to exhibit amplification of HER-2/neu (also known as ErbB-2). The

HER-2/neu protein consists of a cysteine-rich extracellular ligand binding domain

(ECD), a short transmembrane domain, and a cytoplasmic protein tyrosine kinase

domain173. ECD/HER-2 can be released by proteolytic cleavage from the full-length

HER-2 receptor and detected in serum. In the CM of BT474, we successfully

identified receptor tyrosine-protein kinase erbB-2 precursor with 4 unique peptides.

All of the peptides identified fall into the ECD portion (23-652 amino acid) of this

1255 amino acid protein. However, we did not identify CA 15-3 in the conditioned

media of the three cell lines by mass spectrometry. To validate this observation,

using a commercially available ELISA kit (Elecsys CA 15-3 Immunoassay; Roche

Diagnostics GmbH), we measured the levels of this glycoprotein in the same CM of

the cell lines. Very little (<3 units/mL) to no CA 15-3 was measured by ELISA in the

CM (data not shown).


2.3.4 Cellular localization of identified proteins

Figure 2.4 displays the cellular distribution of proteins identified in the

conditioned media of MCF-10A (A), BT474 (B) and MDA-MB-468 (C). Twenty-two

percent of the proteins identified in MCF-10A CM were classified as being

extracellular and membrane-bound. For BT474 and MDA-MB-468, the percentages

were 25 and 28, respectively. A large portion of proteins identified were classified as

intracellular (> 50%), while 4-5% remained unclassified.

2.3.5 Overlap of proteins between the three cell lines

The proteins identified among the 3 cell lines were analyzed for overlapping

members (Figure 2.5). A significant portion (234 or 20%) of the 1,139 proteins was

identified in all 3 cell lines (Figure 2.5A). Figure 2.5B and 2.5C show the overlap

among the 175 extracellular proteins and the 211 membrane-bound proteins,

respectively. Combined together, extracellular and membrane proteins accounted for

34% of all proteins identified. MDA-MB-468 displayed the greatest number of

extracellular and membrane proteins, presumably illustrating that cancer cells

secrete and/or express an increased amount of these proteins. In accordance with

this postulation, cellular localization analysis of the overlap between BT474 and

MDA-MB-468 yielded the greatest percentage (40%) of secreted and membrane

proteins (Figure 2.5D).
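The overlap analysis amounts to computing the seven regions of a three-set Venn diagram. The in-house program is not described, so the sketch below is only one plausible way to do it; the function name and the toy protein identifiers are hypothetical.

```python
def three_way_overlap(a, b, c):
    """Count the proteins in each region of a three-set Venn diagram."""
    a, b, c = set(a), set(b), set(c)
    return {
        "all_three": len(a & b & c),
        "a_and_b_only": len((a & b) - c),
        "a_and_c_only": len((a & c) - b),
        "b_and_c_only": len((b & c) - a),
        "a_only": len(a - (b | c)),
        "b_only": len(b - (a | c)),
        "c_only": len(c - (a | b)),
        "total": len(a | b | c),
    }


# Toy protein lists standing in for the three cell lines.
mcf10a = {"P1", "P2", "P3", "P4"}
bt474 = {"P1", "P2", "P5"}
mda468 = {"P1", "P3", "P5", "P6"}

regions = three_way_overlap(mcf10a, bt474, mda468)
print(regions)
print(f"Shared by all three: {100.0 * regions['all_three'] / regions['total']:.1f}%")
```

Applied to the reported totals, the same arithmetic gives 234 of 1,139 proteins, or about 20%, shared by all 3 cell lines.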


Figure 2.4 (pie charts; percentage of identified proteins per cellular compartment)

Compartment                A) MCF-10A   B) BT474          C) MDA-MB-468
Cytoplasm                  26%          25%               27%
Nucleus                    27%          27%               18%
Intracellular Organelles   12%          (not recovered)   (not recovered)
Unclassified               5%           4%                4%
Extracellular              9%           12%               13%
Membrane                   13%          13%               15%
Cytoskeleton               8%           8%                9%

Figure 2.4: Cellular localization for the 3 cell lines. (A) MCF-10A, (B) BT474, (C) MDA-MB-468. A non-redundant list of proteins was created after Mascot and X!Tandem searching, and each protein was classified by its cellular location.


Figure 2.5 (Venn diagrams of MCF-10A, BT474 and MDA-MB-468, and a pie chart): A) Total Proteins (1,139); B) Secreted Proteins (175); C) Membrane Proteins (211); D) Overlap Between BT474 and MDA-MB-468 (89)

Figure 2.5: Overlap of proteins in CM. (A), (B), (C) Overlap of proteins (total number in brackets) between the 3 cell lines. (D) Pie chart showing the subcellular localization of 89 proteins uniquely identified in BT474 and MDA-MB-468, two cancer cell lines (not including MCF-10A proteins). Forty percent of the proteins are classified as extracellular and membrane-bound.


2.3.6 Cell lysate proteome

One of the major challenges in the analysis of secreted proteins is

distinguishing between proteins that are targeted to the extracellular space versus

those that arise as low-level contaminants due to cell death during routine cell

culture. To address this, we performed a cell lysate proteome experiment to examine

if our approach was enriching for secreted proteins. A total of 716 proteins were

identified in MDA-MB-468 cell lysate after removal of the negative control proteins

(culture media only). Eighty-seven percent protein identification reproducibility was

observed among the two replicates (Figure 2.6A). Five percent of the total proteome

was classified as extracellular/secreted (Figure 2.6B). In the CM of MDA-MB-468,

13% of the total proteins identified were classified as being secreted (Figure 2.4C).

The internal control secreted proteins, kallikreins 5, 6 and 10, were not identified in

the cell lysate. Of the secreted proteins identified in the cell lysate, 30 were also

identified in the CM for this cell line, while 19 proteins were unique to the lysate. A

table containing all of the lysate proteins identified, as well as the secreted proteins

that were found in both the MDA-MB-468 lysate and CM can be found online171.


Figure 2.6: A) MDA-MB-468 Lysate (716): Venn diagram of Rep 1 and Rep 2, with 626 proteins common to both replicates and 72 and 18 proteins found in only one replicate; B) Lysate Cellular Localization (pie chart)

Figure 2.6: Proteome of MDA-MB-468 cell lysate by LC-MS/MS. (A) Two independent replicates (rep) for the lysate were generated and processed in the same manner. Good reproducibility is shown (87% overlap). (B) GO cellular localization for the 716 proteins identified in the lysate. Five percent is classified as being extracellular.


2.3.7 Spectral counting and identification of differentially expressed proteins

The peptides that a mass spectrometer fragments are selected from a parent

scan (also called MS, full scan, survey scan, precursor scan). The resulting daughter

scans (also called MS-MS, fragment ion, MS2) are therefore dependent on the

parent scan. In our methodology, we sequence the six most abundant ions in the full

scan. If we continued to do this for every full scan, then we would repeatedly

sequence only the most abundant ions. Thus, to enable sequencing of the low

abundance ions, a criterion called dynamic exclusion is selected. In dynamic

exclusion, the mass spectrometer remembers the ions that it sequences so that it

does not re-sequence them in the future. This allows sequencing of the less

abundant ions. In this instance, the ions that are sequenced are put into a list that

has a fixed size (n=100) and this list constantly refreshes with every MS scan. As

can be imagined, high abundance peptides are sequenced more often than low

abundance peptides (i.e. high abundance peptides are displaced from the list and

then are ‘redetected’ and this process repeats itself). This allows for a form of label-

free quantification – spectral counting – to take place. The premise is that counting

the number of spectra acquired per peptide is an indicator of relative protein

abundance in a mixture.
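The interplay between top-N selection and a fixed-size exclusion list can be illustrated with a toy simulation. This is a deliberately simplified sketch, not the instrument's acquisition logic: ion intensities are held constant across scans, chromatographic elution and time-based expiry are ignored, and the ion names, intensities and small list sizes are hypothetical.

```python
from collections import Counter, deque


def simulate_spectral_counts(abundances, n_scans, top_n=6, exclusion_size=100):
    """Count how often each ion is selected for MS/MS under dynamic exclusion.

    abundances maps an ion name to its intensity (assumed constant per scan).
    The exclusion list is modelled as a fixed-size queue: when it is full,
    the oldest entry drops off and that ion becomes selectable again.
    """
    excluded = deque(maxlen=exclusion_size)
    counts = Counter()
    ranked = sorted(abundances, key=abundances.get, reverse=True)
    for _ in range(n_scans):
        # Pick the top_n most intense ions that are not currently excluded.
        selected = [ion for ion in ranked if ion not in excluded][:top_n]
        for ion in selected:
            counts[ion] += 1
            excluded.append(ion)
    return counts


# Eight hypothetical ions, I1 most abundant and I8 least abundant.
toy_abundances = {f"I{rank}": 1000 / rank for rank in range(1, 9)}
toy_counts = simulate_spectral_counts(toy_abundances, n_scans=10, top_n=2, exclusion_size=3)
for ion in toy_abundances:
    print(ion, toy_counts[ion])
```

In this toy run the most abundant ions are re-selected each time they leave the exclusion list, so their spectral counts are the highest, while the least abundant ions are never reached.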

An alternative way to decipher protein abundance is to perform

multidimensional scaling for all 9 experiments (each cell line in triplicate) using

spectral counts. Refer to the Materials and Methods section for details on how the

analysis was conducted. The Venn diagram in Figure 2.7 displays the overlaps of

proteins among the cell lines based on spectral count analysis (A) and their cellular


localization (B). The top ~100 extracellular and membrane-bound proteins obtained

from spectral counting analysis are shown in Table 2.2. The variability within the

replicates (within variance) and between the three cell lines (between variance) are

highlighted along with the F ratio. Apparent fold-changes were calculated where

possible. A numerical value is indicated in places where both cell lines/conditions

being examined contain a normalized spectral count greater than zero. In the event

of a comparison where one of the conditions/cell lines had a spectral count of zero, an

expression is given (e.g., BT>>MCF, indicating that the spectral count for BT474 was

greater than MCF10A). Cells that are grey indicate a negative fold change whereas

cells that are white indicate a positive (numerical value indicated) or no fold change

(cells are blank). In addition, in the first column displaying the fold change between

BT474/MCF-10A, cells in white highlight the proteins that have a higher spectral

count in BT474 compared to MCF-10A whereas cells in black highlight proteins that

have a lower spectral count in BT474 compared to MCF-10A. A similar color coding

scheme applies to the other two columns comparing the different cell

lines/conditions within each column. The known breast cancer biomarker HER-

2/neu is among the top 5 proteins identified by this unbiased method of analysis.

Furthermore, 23 proteins, previously associated with cancer (as determined by

Ingenuity Biomarkers Comparison Analysis software) were found among the top ~

100 extracellular and membrane proteins including epidermal growth factor receptor

(EGFR) and various insulin-like growth factor binding proteins (IGFBP-2, 3, 5 and 6).

A table containing all of the 1,062 proteins that this analysis was performed on can


be found online, in addition to a table that contains the overlaps of the proteins

among the cell lines171.
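The rule for reporting apparent fold-changes can be sketched as a small function. The numeric branch simply returns the ratio of the two normalized spectral counts; how ratios below one were displayed in Table 2.2 is not stated, so that choice, the function name and the example values are assumptions.

```python
def apparent_fold_change(count_a, count_b, label_a, label_b):
    """Report a fold-change, or a '>>' expression when one count is zero.

    Returns a float when both counts are positive, a string such as
    'BT>>MCF' when only one count is positive, and None when both are zero.
    """
    if count_a > 0 and count_b > 0:
        return count_a / count_b
    if count_a > 0:
        return f"{label_a}>>{label_b}"
    if count_b > 0:
        return f"{label_b}>>{label_a}"
    return None  # corresponds to a blank cell


print(apparent_fold_change(12, 4, "BT", "MCF"))
print(apparent_fold_change(5, 0, "BT", "MCF"))
print(apparent_fold_change(0, 5, "BT", "MCF"))
```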


Figure 2.7: A) Venn diagram of MCF-10A, BT474 and MDA-MB-468 (1,062 proteins); B) pie chart of cellular localization: Intracellular Organelles 15%, Nucleus 21%, Cytoplasm 25%, Cytoskeleton 7%, Unclassified 5%, Membrane 14%, Extracellular 13%

Figure 2.7: Spectral counting analysis. Overlap of proteins between cell lines (A) and cellular localization (B) using label-free quantification (spectral counting).


2.4 Discussion

In this study, a shot-gun proteomics strategy was utilized to sample the

conditioned media of 3 breast cell lines, MCF-10A, BT474 and MDA-MB-468. By

searching with both Mascot and X!Tandem, we successfully identified over 1,100

proteins in the CM of all 3 cell lines combined, which, to our knowledge, is one of the

largest repositories of proteins identified for breast cancer. Studies have shown that

by combining results from multiple search engines, a better, more confident protein

identification is made since different search engines are based on different

algorithms and scoring174,175. Two different methods of determining relative

abundance were used: protein identification and spectral counts. While spectral

counting as an index of protein abundance is appealing, there are a number of

different ways to analyze a dataset such as counting the spectra and adjusting by

the length of the protein (NSAF)176, counting peptides (not individual spectra) and

adjusting for number of tryptic peptides in the protein (PAI)177, calculating a function

of PAI called emPAI178, counting 1 if any spectra matched a peptide and assigning 0

otherwise (SASPECT) or merely counting the spectra. Currently, there is no

consensus as to which approach to use. In this study, we used both protein

identification and spectral counts to determine protein abundance.
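As a rough illustration of how three of the indices listed above differ, the following Python sketch computes NSAF, PAI and emPAI from toy inputs. It uses the commonly cited definitions (NSAF as length-normalized spectral counts divided by their sum over all proteins, PAI as observed over observable peptides, emPAI as 10^PAI - 1). The function names, protein names, counts and lengths are hypothetical and chosen only for illustration; this is not the analysis pipeline used in this study.

```python
from typing import Dict


def nsaf(spectral_counts: Dict[str, int], lengths: Dict[str, int]) -> Dict[str, float]:
    """Normalized spectral abundance factor.

    Each protein's spectral count is divided by its length, and the result
    is divided by the sum of these ratios over all proteins in the run.
    """
    saf = {name: spectral_counts[name] / lengths[name] for name in spectral_counts}
    total = sum(saf.values())
    return {name: value / total for name, value in saf.items()}


def pai(observed_peptides: int, observable_peptides: int) -> float:
    """Protein abundance index: observed peptides per observable tryptic peptide."""
    return observed_peptides / observable_peptides


def empai(observed_peptides: int, observable_peptides: int) -> float:
    """Exponentially modified PAI, defined as 10**PAI - 1."""
    return 10 ** pai(observed_peptides, observable_peptides) - 1


# Made-up example values (not data from this study).
counts = {"protein_A": 30, "protein_B": 10}
lengths = {"protein_A": 300, "protein_B": 100}
values = nsaf(counts, lengths)
```

In this toy case the two proteins receive the same NSAF even though their raw spectral counts differ threefold, which is the kind of discrepancy between indices that makes the choice of method matter.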

We specifically examined MCF-10A, BT474 and MDA-MB-468 because they represent a variety of breast cancer cases. We observed higher cell death in MCF-10A, as shown by elevated LDH levels and lower viable cell counts at the end of the culture. This was expected, as MCF-10A is considered to be a “normal” breast epithelial cell line that, unlike the cancer cell lines, does not grow uncontrollably in SFM (Figure 2.1). Our aim was that examining the proteins unique to each cell line might shed light on both the pathways leading to breast cancer development and the discovery of biomarkers for breast cancer.

One of the advantages of our high-throughput qualitative comparative

proteomic analysis was the presence of internal controls in the CM of MDA-MB-468.

KLK5 expression, as assessed by quantitative RT-PCR, has been implicated as an

independent and unfavorable prognostic marker for breast carcinoma179. The clinical

utility of KLK6 as a breast cancer marker has not been determined, while KLK10

levels have been shown to predict response to tamoxifen therapy180. Based on

experience with other currently used biomarkers, the expected concentration of new

cancer markers in serum should be in the low µg/L range. Through our strategy, we

successfully identified all three control proteins by MS, supporting the notion

that our approach enriches for low-abundance proteins and surveys them deeply enough

to mine for other candidate biomarkers for breast cancer. Furthermore, the

fact that we successfully identified the ECD portion of HER-2/neu in BT474 further

supports our hypothesis that candidate biomarkers can be discovered using the CM

of breast cancer cell lines. High levels of HER-2 in serum correlate with poor

prognosis in patients with breast cancer181. In 2000, the FDA cleared the serum

HER-2/neu test, the first FDA-cleared blood test for measuring circulating levels of

HER-2 in the follow-up and monitoring of patients with metastatic breast cancer. The

assay, ADVIA Centaur HER-2/neu Assay (www.oncogene.com), is a sandwich

immunoassay that uses two monoclonal antibodies that are specific for unique


epitopes on the extracellular domain of the HER-2 oncoprotein. The assay has a

reference limit of 15 µg/L with a sensitivity of 0.1 µg/L182.

Particular emphasis was placed on extracellular, membrane-bound and

unclassified proteins since these proteins have the highest chance of being found in

the circulation and thus serving as cancer biomarkers or as important molecules

involved in cancer progression. More than 34% of these proteins were classified as

being extracellular and membrane-bound. Among the known and novel proteins

released by breast cancer cells, we identified various proteases, receptors, protease

inhibitors and cytokines. All experiments were performed in triplicate with excellent

reproducibility between runs. Due to the inherent nature of mass spectrometers, not

all peptides were ionized in each run and consequently, different peptides were

selected for ionization and detected. This selective ionization can account for the

75% reproducibility in our biological triplicates. As well, the various steps during

sample preparation, including C18 extraction of the fractions cannot be dismissed as

an important contributing factor to the variations observed.

Although the objective of this study was to identify secreted and membrane-

bound proteins that have the potential to be cleaved and thus found in circulation,

the identified proteins included many intracellular proteins, including ones classified

by GO as nuclear and cytoplasmic. During the cell culture process, a portion of the

cell population will die, resulting in the release of intracellular proteins into the media.

Despite optimizing cell culture conditions to minimize cell death, the identification of

intracellular proteins in the CM is inevitable because of the high sensitivity of MS-

based techniques utilized in this study. Martin et al examined the CM of a prostate


cancer cell line and found a very similar GO distribution to the one we present here -

that more than 50% of the proteins identified were intracellular145. Therefore, one of

the major challenges in the analysis of secreted proteins is distinguishing between

proteins that are targeted to the extracellular space versus those that arise as low-

level contaminants due to normal cell death in routine cell culture. To address this,

we performed a cell lysate proteome experiment to demonstrate conclusively that

through our approach, we are significantly enriching for secreted proteins.

Furthermore, a recent study examining the cell lysate proteome of a human mammary

epithelial cell line, HMEC, found that 2% of the entire proteome was classified as

extracellular, which was consistent with our findings183.

Our group has previously published data on the conditioned media of a

prostate cancer cell line, PC3(AR)6 using a roller bottle cell culture method184.

Through this approach, the authors identified 262 proteins in the CM. The workflow

presented in the current study has significant differences and improvements

compared to our previous work. At the tissue culture step, we cultured three cell

lines in triplicate versus only one cell line studied. We also optimized our cell culture

conditions (seeding density, incubation time, volume of media used) to minimize cell

death and maximize secreted protein content. Different fractionation methods

(peptide versus protein fractionation) and a more robust and sensitive mass

spectrometer were utilized in this study compared to our previous work. Finally, the

bioinformatic analysis has also been significantly improved upon from before as we

used two different search engines and incorporated protein and peptide probability


calculations into our final list of proteins. As a result, in the current study, we

identified over 1,000 proteins using the conditioned media approach.

Effective therapies for advanced cancers remain elusive. Novel breast cancer

biomarkers that can be effective early in the course of the disease have the potential

to reduce morbidity and mortality, and to achieve a higher rate of patient compliance

with screening. However, there is a growing consensus that

panels of markers may be able to supply the specificity and sensitivity that individual

markers lack. A number of studies have demonstrated that this is indeed true. While

a biomarker should be detected in serum using an immunoassay (ELISA),

developing such an assay for multiple potential novel biomarkers is very labor

intensive185. The vast majority of the proteins identified in this study (extracellular,

membrane and unclassified) do not have commercially available ELISA kits.

Alternatively, to decipher whether the candidates are present in serum, multiple

reaction monitoring (MRM) mass spectrometry technology can be performed. Using

the latter technology, it is possible, in a single experiment, to detect and quantify

specific peptides (representing specific proteins) in biological fluids of patients with

breast cancer to determine if the protein has biomarker potential. A number of

studies have shown the feasibility of such an approach186-188. Nevertheless, before

proceeding to analyzing biological fluids, the potential candidate biomarkers must

first be selected from the > 1,000 proteins identified.


CHAPTER 3:

BIOINFORMATICS and CANDIDATE SELECTION

The work presented in this chapter is published, in part, in Molecular & Cellular Proteomics:





3.1 Introduction

Rapid advances in proteomic and genomic technologies have created

optimistic views that many more biomarkers will be discovered through various high-

throughput techniques. However, these predictions have yet to come true127. While

numerous strategies exist for biomarker discovery, the bottleneck to product

development and routine clinical use is in the verification phase of candidate

biomarkers. A verification and validation platform is more costly and labor-intensive and

requires more time than a discovery program127,185. There is no doubt that

meticulous verification steps are essential and the impact of the end product on the

future of diagnostics would be enormous.

Our discovery strategy involving analysis of conditioned media from breast

cancer cell lines yielded over 1,100 proteins. Given our hypothesis that novel breast

cancer biomarkers will be secreted or shed proteins, we chose to focus on the 402

extracellular, membrane and unclassified proteins for further analysis. To select the

most promising molecules into the next step of the platform for biomarker

identification, a number of bioinformatic and data analyses were performed including

examining tissue-specificity and biological functions. The degree of overlap between

the proteins identified in this study using a cell culture model and other studies using

relevant biological fluids such as NAF and tumor interstitial fluid (TIF) was

evaluated. Finally, the filtering criteria for candidate selection were applied.



3.2.1 Tissue-specific expression

To identify genes in our CM that were relatively breast-specific, Unigene

analysis on the 402 extracellular, membrane-bound and unclassified proteins, using

the EST ProfileViewer, was performed. In addition, the proteins identified in our CM

analysis were cross-compared to the proteome of breast tumors identified by gel

electrophoresis and mass spectrometry137.

3.2.2 Biological functions analysis

Extracellular, membrane-bound and unclassified proteins were evaluated by

the Ingenuity Pathways Analysis software to identify global functions of the proteins.

This software uses a knowledge base derived from the literature to relate gene

products to each other, based on their interaction and function. The list of proteins

and their corresponding IPI human identification numbers were uploaded as an

Excel spreadsheet file onto the Ingenuity software (www.ingenuity.com). Ingenuity

then uses these proteins and their identifiers to navigate the curated literature

database. The biological functions assigned to each network were ranked according

to the significance of that biological function to the network. A Fisher’s exact test

was used to calculate a P-value. For a detailed description of IPA, visit

www.ingenuity.com
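To make the statistical step more concrete, the sketch below computes a right-tailed Fisher's exact test P-value for the enrichment of one functional category in a protein list, using the hypergeometric distribution. The text states only that a Fisher's exact test was used; the one-sided form, the choice of background and the function and parameter names here are assumptions for illustration and are not taken from the IPA software.

```python
from math import comb


def fisher_right_tail(k_hits: int, list_size: int,
                      annotated_total: int, background_size: int) -> float:
    """Right-tailed Fisher's exact test P-value.

    Probability of observing at least `k_hits` annotated proteins in a list
    of `list_size` proteins drawn without replacement from a background of
    `background_size` proteins, of which `annotated_total` carry the
    annotation (hypergeometric upper tail).
    """
    denominator = comb(background_size, list_size)
    upper = min(list_size, annotated_total)
    numerator = sum(
        comb(annotated_total, i)
        * comb(background_size - annotated_total, list_size - i)
        for i in range(k_hits, upper + 1)
    )
    return numerator / denominator


# Made-up numbers: 2 annotated proteins in a background of 4, both found
# in a list of 2 proteins.
p_value = fisher_right_tail(2, 2, 2, 4)
```

A bar chart such as the one described for the biological functions would then plot the negative log of each category's P-value.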


3.2.3 Comparison of proteins identified from CM with other publications

The proteins identified in this study were compared to two nipple aspirate fluid

(NAF) proteomes, one tissue culture-based proteome of a well-studied breast cancer

cell line (MCF-7) and one proteome of breast tumor interstitial fluid (TIF)115,135,136,147.

3.2.4 Single nucleotide polymorphisms (SNPs) and human Plasma Proteome

database

The extracellular proteins identified in our conditioned media analysis were

searched against a well-curated SNP database62.

Furthermore, the 402 extracellular, membrane and unclassified proteins from

the conditioned media analysis were searched against the human Plasma Proteome

database to identify proteins previously found in human plasma. This is a database,

available online, that contains manual annotation of proteins via exhaustive literature

research. The database includes information pertaining to isoform-specific

expression, disease, localization, potential post-translational modifications and SNPs.

3.2.5 Selection of candidates

Since over 1,000 proteins were identified in our proteomic analysis (discovery

phase), we used several methods of protein filtering to create a more limited set of

proteins for future exploration as potential breast cancer biomarkers. Literature

searches were performed on the top 100 differentially expressed secreted and

membrane proteins, to identify molecules that have not been previously studied as

serological markers for breast cancer. The proteins that met this criterion were again


filtered by selecting proteins that were expressed exclusively or preferentially in the

early or advanced stages of cancer (expressed in BT474 and MDA-MB-468 and not

in MCF-10A). More information on the selection criteria is provided in the results

section.


3.3 Results

3.3.1 Tissue-specific expression

Unigene analysis revealed 5 genes that were relatively specific to normal

and/or cancerous breast tissue (SCGB1D2, SBEM, TFF1, DCD and CALML5).

Literature mining on these proteins showed that one of the proteins had previously

been evaluated in serum of breast cancer patients (trefoil factor 1). Two of the

molecules were not considered further since SCGB1D2 (Lipophilin-B) differential

expression levels in the CM were minimal with an F ratio <1 and DCD (Dermcidin)

was identified in the media-only culture. Of the remaining two molecules, SBEM

(small breast epithelial mucin) had previously been examined by RT-PCR as a

potential marker for breast cancer189 but reagents were not available to perform an

immunoassay to evaluate its protein levels in serum of cancer patients. Finally,

CALML5 (Calmodulin-like protein 5) was also a protein that needed to be evaluated

further for its ability to be a circulating breast cancer biomarker.

Recently, the proteome of breast tumors was deciphered using gel

electrophoresis and mass spectrometry yielding over 2,000 proteins137. The authors

examined the proteomes of several breast tumors, tissues peripheral to the tumors

and also samples from patients undergoing non-cancer surgery. Using label-free

quantification, differentially expressed proteins between the normal breast tissue and

cancerous breast tissue were identified. Specifically, 54 proteins were found to be

strongly over-expressed in breast tumors compared to their corresponding

peripheral tissue samples and healthy mammary gland tissue samples. Cross-

comparison of these 54 breast cancer tissue-specific proteins with our conditioned


media analysis revealed that 25 of these proteins (46%) were also identified in our

study using a tissue culture approach (Table 3.1).

3.3.2 Biological functions analysis

Using the Ingenuity Pathways Analysis (IPA) software, we classified

the proteins by biological function. The top 15 functions are displayed in Figure 3.1.

The top functions were cellular movement followed by cell-to-cell signalling and

interaction.

Also using the Ingenuity Pathways Analysis, the 402 proteins were filtered by

extracting only those genes that were reported in the literature in humans

and associated with cancer. Approximately 100 of the 402 extracellular and membrane

genes were identified to meet this criterion. The top 14 canonical pathways that

demonstrated a relationship to the 100 genes filtered as well as known genes linked

to breast cancer are highlighted in Figure 3.2.


Table 3.1: Proteins elevated in breast tumor tissue proteome1 and found in CM analysis

1 Alldridge,L., Metodieva,G., Greenwood,C., Al Janabi,K., Thwaites,L., Sauven,P., & Metodiev,M. Proteome Profiling of Breast Tumors by Gel Electrophoresis and Nanoscale Electrospray Ionization Mass Spectrometry. J. Proteome Res. (2008).


Figure 3.1: Biological functions analyses. The top 15 functions for the 402 extracellular, membrane-bound and unclassified proteins are shown as determined by Ingenuity Pathways Analysis (IPA). The y-axis shows the negative log of P-value.


Figure 3.2: Canonical pathways and known genes linked to breast cancer. Using the Ingenuity Pathways Analysis, ~ 100 genes were identified in the literature as being reported in humans and cancer from the shortened list of candidates. The top 14 canonical pathways have been mapped onto the 100 genes along with genes linked to breast cancer including IGFBP5, VEGF and ERBB2.


3.3.3 Comparison of proteins identified from CM with other publications

Using the isotope-coded affinity tags (ICAT) technology, Pawlik et al

quantified tumor-specific proteins in NAF135. In total, of the 39 proteins that were

differentially expressed in tumor-bearing versus disease-free breasts, 6 were also

found among the 1,139 proteins in our CM. In addition, Varnum et al identified 64

proteins in NAF using MS, of which 15 were previously reported to be altered in

patients with breast cancer136. From the 64 proteins, 21 were also found in our

proteomic study. More importantly, of the 15 proteins previously reported to be

altered in serum or tumor from women with breast cancer, 10 were found in our CM

proteome. A proteomic study involving MCF-7 targeted membrane-associated breast

cancer proteins as potential biomarkers147. Using two-dimensional gel

electrophoresis (2-DE) and MS analysis of the protein spots, Canelle et al identified

98 proteins. Among the 98 proteins, 42 of them were also found in our study. In

addition, Celis et al mined another source of biomarkers through the investigation of

tumor interstitial fluid (TIF) that perfuses the breast tumor microenvironment115. Given

the importance of the tumor-host interface and the increasing appreciation of the role

that the microenvironment plays in cancer initiation and progression, we compared

our list of proteins identified through a cell culture model system to the proteins

excreted by the cells that are within their native cancer microenvironment. The

authors identified approximately 260 proteins using 2-D gel electrophoresis,

immunoblotting and mass spectrometry, of which 112 were also identified in our

study (43%). Figure 3.3 summarizes the overlaps observed among the other

publications and the data presented here. A table containing all of the proteins


identified by our group and the four previous studies can be found online171. Finally,

the lung, along with the bone, is one of the most frequent sites of breast cancer

metastasis. A set of 54 genes that mediate breast cancer metastasis to the lungs

has been identified190. Given that MDA-MB-468 cells were collected from a pleural

effusion, we compared the list of proteins identified from MDA-MB-468 conditioned

media to the 54 candidate lung metastasis genes. Seven genes were found in the

CM of our cell line (KYNU, TNC, ROBO1, FSCN1, MAN1A1, LTBP1 and GSN).

Interestingly, none of the genes that overlapped between the lung and bone

metastasis signatures were identified in MDA-MB-468.

Another interesting study used tandem mass spectrometry to sample the CM

of four isogenic breast cancer cell lines differing in aggressiveness191. Three

independent secreted proteome preparations (biological replicates) for each of the

cancer cell lines were performed. Using a protein fractionation strategy involving C2

columns previously reported to enrich for secreted proteins in the CM146, the authors

identified over 250 proteins per cell line. From the 37 most significant secreted

proteins across the isogenic cell lines, 31 (87%) were also observed in our breast

cancer conditioned media analysis of 3 cell lines. More recently, another group

analyzed the conditioned media of human mammary epithelial cells (HMEC) by LC-

MS/MS and identified ~900 proteins192. To specifically focus on proteins that were

secreted or shed, the authors compared their conditioned media data with their

previous work on analyzing the proteome of whole HMEC lysates. Approximately

150 proteins were identified to be enriched in the extracellular compartment of


HMEC. Eighty-three of these proteins were also identified in our conditioned media

analysis (55% overlap).
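The overlap figures reported in this section follow from simple set arithmetic, sketched below in Python. The function name and the tiny example sets are hypothetical, and the sketch assumes that protein identifiers from the two studies have already been mapped to a common naming scheme, which in practice is the harder part of such comparisons.

```python
from typing import Set, Tuple


def overlap_summary(ours: Set[str], other: Set[str]) -> Tuple[Set[str], float]:
    """Return the shared identifiers and the percentage of `other` that they cover."""
    shared = ours & other
    percent = 100.0 * len(shared) / len(other) if other else 0.0
    return shared, percent


# Toy identifiers for illustration only (not the study's protein lists).
ours = {"KLK5", "KLK6", "CST3", "LCN2"}
other = {"CST3", "LCN2", "ALB", "TF"}
shared, percent = overlap_summary(ours, other)
```

Applied to the counts quoted above, the same ratio gives 112 of approximately 260 TIF proteins, or about 43%.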


Figure 3.3: Overlap among other publications. The overlap of proteins identified in the current conditioned media cell culture study (Kulasingam, 1,139 proteins) and 4 other proteomic studies. Proteins shared with the current study / proteins unique to the other study: Celis, 112 / 148; Pawlik, 6 / 33; Varnum, 21 / 43; Canelle, 42 / 56.


3.3.4 SNPs and human Plasma Proteome database

A potential mechanism for elevated levels of secreted proteins in circulation is

the presence of a SNP in the signal peptide region of a protein resulting in aberrant

secretion. Nine proteins were identified (CFB, CST1, DAG1, FAM3B, FN1, IGFBP7,

IL22RA2, PI3 and SERPINE1) as potentially containing SNPs in the signal peptide

region. CFB (Complement factor B) and FN1 (Fibronectin) were not considered

further as potential targets for the verification phase as they are of high abundance

in circulation and are non-specific to breast cancer. CST1 (Cystatin SN) had not

previously been examined as a serological breast cancer marker; however, no reagents

were available to test its diagnostic potential in serum using an immunoassay. For

DAG1 (Dystroglycan) and IGFBP7 (Insulin-like growth factor-binding protein 7),

there were existing patents with respect to breast cancer and thus these proteins

were not considered further due to conflict of interest. FAM3B (Protein FAM3B) and

IL22RA2 (Interleukin-22 receptor alpha-2 chain) were not considered further as their

differential expression levels in CM were minimal with an F ratio <1. PI3

(elafin) appeared to be a promising molecule to investigate further while SERPINE1

(Plasminogen activator inhibitor 1) had already been examined in breast cancer

patients as a potential biomarker.

Finally, these proteins were searched against the Human Plasma Proteome

database to decipher whether they have been identified in plasma. 104 of 402

proteins were identified in human plasma.


3.3.5 Selection of candidates

A major rate-limiting step for a biomarker discovery platform thus far has been

in the selection of candidate molecules to investigate further as breast cancer

biomarkers from the hundreds to thousands of proteins identified in the discovery phase.

Given the different experimental questions being asked, a standard procedure for

candidate selection from a discovery platform could not be established. However, a

number of filtering criteria exist that can be applied to a dataset. In our proteomic

analysis of breast cancer cell lines, we chose to focus on only the extracellular and

membrane proteins identified since it is these proteins that have the highest chance

of entering the circulation and hence serving as serological markers. As well, one of

the properties of an ideal tumor marker is that it should be tissue specific. Currently,

one of the few markers used in the clinic that are tissue-specific is prostate-specific

antigen (PSA) for prostate cancer. Therefore, it may be important to examine tissue

specificity of the narrowed list of proteins from the discovery phase using the

Unigene database or in silico analysis193 or comparing the proteome of breast

cancer tissues to the proteome of CM and focussing only on the overlapping

members. Another criterion to select a more limited set of proteins for future

exploration as potential breast cancer biomarkers is to compare the dataset with

mRNA microarray databases194 as well as to examine overlapping proteins with

proteomes of relevant biological fluids such as NAF or tumor interstitial fluid (TIF)115.

Additionally, it is equally important to perform literature searches to identify

molecules that have not been previously studied as serological markers for breast

cancer.


Some of the secondary filtering criteria that may assist in selection of

candidate molecules include examining factors such as reagent availability

(recombinant protein, antibodies and ELISA) and focusing on proteins known to

participate in pathways or signaling cascades relevant to cancer progression. If

cell lines differing in aggressiveness were used in the discovery phase,

proteins can be selected that are expressed only in the early or advanced stages of

cancer and not in the normal cell line. Table 3.2 lists some of the criteria that can be

applied to investigate further only the proteins that have the highest chance of being

successful in the verification and validation phases.


Table 3.2: Proposed filtering criteria for candidate selection

1. Focus on extracellular and membrane proteins
2. Examine tissue specificity (Unigene database, SAGE*)
3. Compare to mRNA microarray databases
4. Compare to proteins identified in other biological fluids
5. Examine molecules not previously examined as serological markers for the cancer type (literature searches).
6. Perform tissue proteomics on relevant tissues to the cancer type and compare with cell line data to select only candidates that overlap between the two
7. Reagent availability such as ELISA, antibodies etc
8. Focus on proteins known to participate in pathways/signaling related to cancer
9. Focus on differentially expressed proteins (based on label-free or quantitative proteomics)
10. Exclude high-abundance (µg/mL) serum proteins, especially liver products and acute-phase reactants

*SAGE: Serial analysis of gene expression


In the following, we summarize how the candidates were narrowed to a more

manageable number to investigate further. Based on literature searches, from the

top 100 differentially expressed proteins (as defined by spectral counting), 46 of the

molecules had been previously examined as serological breast cancer markers.

Patent searching of the remaining 54 proteins revealed 11 molecules that had a

previous patent with respect to breast cancer. The remaining 43 proteins were

scrutinized to select proteins for further investigation that were found only in the CM

of breast cancer cell lines and absent in MCF-10A (normal breast epithelial cell line).

This filtering criterion resulted in 30 candidates for further analysis (Table 3.3). Of

these 30 candidates, 11 of them had reagents available to develop an immunoassay

to measure the levels of these proteins in biological fluids. This list included: elafin,

kallikrein 5, 6 and 10, cystatin C, lipocalin-2, transforming growth factor beta-2,

activated leukocyte-cell adhesion molecule (ALCAM), B-cell adhesion molecule

(BCAM), neuronal cell adhesion molecule (NrCAM) and fractalkine. Table 3.4

highlights our candidate selection criteria discussed above and Table 3.5

summarizes some of the properties of the 11 candidates selected for further analysis.


Table 3.4: Our candidate selection criteria

Selection criteria: # of proteins

1) Total proteins identified in conditioned media of all 3 cell lines: 1,139
2) Extracellular and membrane proteins identified: 402
3) Differentially expressed proteins based on label-free quantification: 100
4) Literature searches to focus on novel, serological breast cancer markers: 54
5) Patent searches to focus on proteins without conflict of interest: 43
6) Proteins present in breast cancer cell lines (BT474, MDA-MB-468); absent in normal cell line (MCF-10A): 30
7) Reagent availability (ELISA, antibodies): 11
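The stepwise narrowing summarized in Table 3.4 can be pictured as a chain of filters applied in order, with the surviving count recorded after each step. The Python sketch below is a minimal illustration under stated assumptions: the record fields, filter names and the four toy proteins are hypothetical, and in this study several of these steps (literature and patent searches in particular) were performed manually rather than programmatically.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Protein:
    gene: str                  # placeholder identifier
    localization: str          # e.g. "extracellular", "membrane", "nucleus"
    differential: bool         # among the top differentially expressed proteins
    studied_in_serum: bool     # already examined as a serological marker
    patented: bool             # existing patent with respect to breast cancer
    in_normal_line: bool       # detected in the normal cell line
    reagents_available: bool   # antibodies or ELISA available


# Ordered filters mirroring the criteria of Table 3.4 (hypothetical encoding).
FILTERS: List[Tuple[str, Callable[[Protein], bool]]] = [
    ("localization", lambda p: p.localization in {"extracellular", "membrane", "unclassified"}),
    ("differential expression", lambda p: p.differential),
    ("literature search", lambda p: not p.studied_in_serum),
    ("patent search", lambda p: not p.patented),
    ("absent in normal line", lambda p: not p.in_normal_line),
    ("reagent availability", lambda p: p.reagents_available),
]


def run_funnel(proteins: List[Protein]) -> Tuple[List[Protein], List[Tuple[str, int]]]:
    """Apply each filter in order and record how many proteins remain."""
    remaining = list(proteins)
    counts = [("start", len(remaining))]
    for name, keep in FILTERS:
        remaining = [p for p in remaining if keep(p)]
        counts.append((name, len(remaining)))
    return remaining, counts


# Toy records for illustration only (not data from this study).
toy = [
    Protein("GENE_A", "extracellular", True, False, False, False, True),
    Protein("GENE_B", "nucleus", True, False, False, False, True),
    Protein("GENE_C", "membrane", True, True, False, False, True),
    Protein("GENE_D", "extracellular", True, False, False, True, True),
]
selected, counts = run_funnel(toy)
```

Recording the count after every step makes it easy to see which criterion removes the most candidates, which is useful when deciding whether a filter is too strict for a given dataset.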


Table 3.5: Top 11 candidates selected for verification phase

Protein Name   Gene   Localization   Plasma?   NAF?135,136   TIF?115   CM?191,192

Activated leukocyte cell adhesion molecule   ALCAM   membrane   √
B-cell adhesion molecule   BCAM   membrane
Cystatin C   CST3   extracellular   √ √ √
Elafin   PI3   extracellular   √ √
Fractalkine   CX3CL1   membrane   √
Kallikrein 5   KLK5   extracellular   √
Kallikrein 6   KLK6   extracellular   √
Kallikrein 10   KLK10   extracellular   √
Lipocalin-2   LCN2   extracellular   √ √
Neuronal cell adhesion molecule   NRCAM   membrane
Transforming growth factor beta-2   TGFB2   extracellular   √ √

NAF: nipple aspirate fluid; TIF: tumor interstitial fluid; CM: conditioned media


3.4 Discussion

One of the properties of an ideal tumor marker is that it should be tissue

specific and elevated in cancer and not in healthy or benign conditions. Currently,

the only marker used in the clinic that is tissue-specific is PSA for prostate cancer.

Examining the proteome of breast cancer tissues and comparing it to the peripheral

normal tissue proteome appears promising to identify potential candidates to

evaluate further. However, examining tissues, a complex source for biomarkers,

gives rise predominantly to intracellular proteins. In fact, among the 54 proteins

identified by Alldridge et al as being strongly over-expressed in breast tumors

compared to healthy control tissues, only a handful of proteins were classified by

localization as extracellular or membrane proteins, which would have the highest

chance of entering into circulation137. Nevertheless, the conditioned media analysis

did identify approximately half of the differentially expressed breast tissue-specific

proteins. The most relevant proteins included Ras GTPase-activating-like protein

IQGAP1, polymeric-immunoglobulin receptor and moesin.

Interestingly, the major biological functions of the top 100 proteins identified in

our study were cellular movement and cell-to-cell signaling. In addition,

approximately 25% of these proteins have previously been associated with other

cancer types in the literature. These proteins, when mapped onto various canonical

pathways reveal an intricate connection among themselves. Given that

dysregulation of signalling pathways plays an important role in cancer initiation and

progression, the proteins identified in this study may be useful for further

investigation.


Recently, the analysis of thousands of genes in breast and colorectal cancers

has shown that individual tumors can accumulate an average of 90 mutant

genes195. The authors identified 189 previously unknown genes that were mutated at

a high frequency. From these genes, six were found in this study (filamin-B, spectrin

alpha chain, gelsolin, extracellular sulfatase Sulf-2, neuronal cell adhesion molecule

and polypeptide N-acetylgalactosaminyltransferase 5). Furthermore, it is particularly

important that many of the proteins identified by other groups using relevant

biological fluids such as NAF and TIF were also present in our analysis (Figure 3.3).

Many of the proteins identified by Pawlik et al were highly abundant serum proteins

such as albumin, transferrin and various immunoglobulins – none of which were

identified in our serum-free media culture. Finally, a large portion of the shortened

list of proteins identified was also found in the human Plasma Proteome database.

This finding was not unexpected as it served to highlight the fact that many of the

proteins identified in our study had previously been found in plasma. But it also

demonstrated that many more proteins have yet to be identified in plasma. There

can be a number of reasons why they have not been identified – one of which is the

fact that their concentration in plasma is too low to measure by current technologies

and thus other means of initially identifying them and then developing a specific and

sensitive immunoassay are critical.

Finally, based on all of the bioinformatic and data analyses, the selection of

the top candidates to investigate in the verification phase of the proteomic platform

for biomarker discovery yielded 11 proteins. This included 7 extracellular proteins

and 4 membrane proteins.


CHAPTER 4:

VERIFICATION PHASE

The work presented in this chapter is published, in part, in Molecular & Cellular Proteomics:





4.1 Introduction

Eleven filtered molecules were selected for evaluation as potential

circulating breast cancer biomarkers in serum, each measured with an immunoassay

specific for that molecule. Seven of these molecules were secreted proteins, one of

which was elafin. Elafin, a secreted epithelial proteinase inhibitor also referred to as

skin-derived anti-leukoproteinase (SKALP) or elastase-specific inhibitor (ESI),

belongs to the Trappin gene family196. Proteases and their inhibitors play an

important role in cancer metastasis and angiogenesis197. Previous studies have

shown that elafin is expressed in normal mammary epithelial cells, but it is down-

regulated in most breast tumor cell lines198.

Another family of molecules included among the extracellular filtered

molecules was the kallikreins (KLKs), represented here by KLK5, 6 and 10. Kallikreins are

secreted serine proteases with trypsin-like or chymotrypsin-like activity199. Accumulating evidence

indicates that the KLK family is dysregulated in cancer. Although many KLKs are

over-expressed in cancerous tissues, it is not currently known whether this also

reflects an increase in proteolytic activity. Clinical data linking an increase in KLK

expression with patient prognosis strongly suggests that KLKs are implicated in

tumor progression. Among all biomarkers to date, KLK3/PSA has had the greatest

impact in clinical medicine, for the screening and monitoring of prostate cancer200.

The fifth molecule to be evaluated was cystatin C, an extracellular cysteine

protease inhibitor that belongs to the cystatin superfamily. It has been reported to be

involved in many disease processes, such as inflammation and tumor metastasis201.

In fact, cystatin C has also been found to have utility as a biomarker for renal


function assessment. Specifically, cystatin C serum concentration correlates closely

with the glomerular filtration rate202.

Adding to this list of candidates was lipocalin-2 (neutrophil gelatinase-

associated lipocalin) whose expression has been observed in most tissues, and its

synthesis is induced in epithelial cells during inflammation203. Lipocalin-2 has been

implicated in a variety of cellular processes including the innate immune response,

differentiation, tumorigenesis, and cell survival204,205. Its association with MMP-9 may

modulate protease activity by protecting MMP-9 from degradation206. It has been

associated with several tumor types as well, including breast, ovarian, colorectal,

and pancreatic cancers207-209. Its function in cancer is unclear, although the invasive

and metastatic behavior of tumor cells is suppressed by lipocalin-2 in models of

breast and colon cancer210,211.

The final extracellular protein to be included in the top 11 candidates to

investigate further was transforming growth factor beta-2 (TGF-β2). It has four

fundamental activities. TGF-β2 can act as a growth inhibitor for most types of cells,

serve to enhance the deposition of extracellular matrix, be immunosuppressive and

play a role during fetal development. It is expressed in discrete areas, such as

epithelium, myocardium, cartilage and bone of extremities and in the nervous

system, suggesting specific functions189,212,213.

Furthermore, tumor metastasis involves invasive growth into neighboring

tissue, survival in circulation, extravasation and colonization of distant organs.

Therefore, movement through tissue barriers is a pivotal step in metastasis. For this

step to occur, proteolysis of extracellular matrix, remodeling of the actin cytoskeleton


and selective cell adhesion interactions are all important factors. Cell adhesion

molecules (CAMs) are involved in cell-cell and cell-matrix interactions214. They serve

to maintain tissue architecture and are involved in neurogenesis, hematopoiesis,

immune responses and in tumor progression. Changes in expression of CAMs may

accompany neoplastic progression. In this respect, the first membrane protein to be

evaluated as a potential breast cancer marker was activated leukocyte cell adhesion

molecule (ALCAM). ALCAM is a member of the family of cell adhesion molecules

and is one of the members of a small subgroup of transmembrane glycoproteins of

the immunoglobulin superfamily (IgSF). In addition, ALCAM has been implicated in

cell migration and is a marker for the identification of pluripotent mesenchymal stem

cells. In vitro studies have suggested that ALCAM may favor interactions between

tumor and endothelial cells. One study has suggested that strong cytoplasmic

ALCAM expression in primary breast cancer, as measured by immunohistochemistry,

might be a marker for aggressive breast cancer215.

Besides ALCAM, two additional members of this family are CD146/MUC18

and BCAM/Lutheran blood group glycoprotein (basal cell adhesion molecule;

BCAM)216. BCAM was the first laminin receptor to be identified that is a member of

the Ig superfamily. Laminins are a family of extracellular proteins that are an integral

part of all basement membranes and of the extracellular matrix proteins. Only α5

chain-containing laminins are known ligands for BCAM. Very limited information is

available about the expression of BCAM in tumors and therefore the roles of BCAM

in tumor progression remain unclear. One possibility is that, because BCAM can

interact with laminin α5, which is widely expressed in basement membranes,

BCAM mediates the involvement of tumor cells in invasive

processes216. Yet another adhesion molecule that was evaluated is

neuronal cell adhesion molecule (NrCAM). Like the other two adhesion molecules,

NrCAM functions in cell adhesion and cell movement.

The last candidate to be explored was fractalkine, also called neurotactin or

CX3C membrane-anchored chemokine. It is a single-pass type I membrane protein.

Fractalkine has been associated with cell movement and adhesion. The soluble form

of fractalkine is chemotactic for T-cells and monocytes. The membrane-bound form

promotes adhesion of those leukocytes to endothelial cells and hence it may play a

role in regulating leukocyte adhesion and migration processes at the endothelium.



The first 4 candidates to be analyzed were elafin and kallikreins 5, 6 and 10.

Their levels in serum, various biological fluids and tissues were examined.

4.2.1 Quantification of Elafin and Kallikrein 5, 6 and 10

An elafin sandwich ELISA kit, purchased from Hycult Biotechnology, was used to

measure levels of human elafin in serum, pooled biological samples and pooled

tissue lysates. The assay was performed according to the manufacturer's

instructions. The concentrations of KLK5, 6 and 10 were quantified with

non-competitive immunoassays specific for KLK5, 6 or 10, developed in our laboratory112,170,217. For

more details, see the cited literature.

Serum from apparently healthy females and from patients with varying levels

of CA 15-3 was used. The following 10 biological fluids were examined: amniotic

fluid, ascites, breast cyst fluid, cerebrospinal fluid (CSF), follicular fluid, milk, NAF,

urine, saliva and seminal plasma. Ten samples were combined for each fluid to

generate a pooled sample. The following 27 human tissues were examined: adrenal

(5), aorta (3), bladder (3), bone (4), colon (5), esophagus (4), heart (5), kidney (5),

liver (5), lung (4), lymph node (3), muscles (3), pancreas (5), skin (4), small intestine

(4), spinal cord (3), spleen (5), stomach (4), thyroid (3), trachea (4), ureter (4),

prostate (4), testis (4), breast (4), fallopian tube (4), uterus (4) and ovary (2). The

number of samples in each pool is indicated in brackets. Tissue extracts were

prepared by pulverizing 0.2 g of each tissue in liquid nitrogen followed by addition of

an extraction buffer (2 mL of 50 mmol/L Tris-HCl buffer, pH 8.0, containing 150


mmol/L NaCl, 5 mmol/L EDTA, and 10 mL/L NP-40 surfactant). After incubation of

the mixture on ice for 30 min, it was centrifuged at 14,000 × g at 4°C for 30 min. The

resulting supernatants were collected and stored at -20°C. Our procedures have

been approved by the institutional review boards of Mount Sinai Hospital and the

University Health Network, Toronto, Canada.

4.2.2 Verification strategy

For the remaining 7 candidates, the approach was to examine only serum of

cancer patients since the goal was to obtain a serological marker for breast cancer.

In the first screen, pooled serum samples were used, with each pool containing 3

samples. Serum from normal females, normal males and breast cancer patients with CA 15-3 <

30 units/mL was used (3 pools per group, 3 samples per pool), together with serum

from breast cancer patients with CA 15-3 > 30 units/mL (9 pools, 3 samples per pool).

To determine specificity of the molecules, 1 pooled sample containing 5 serum

samples for each of the following cancer types was also analyzed: prostate, colon,

lung, ovarian and pancreatic cancers. Candidates that showed potential to

discriminate controls from cases were screened in the second step by breaking

down the pooled samples into individual serum samples.
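The two-step strategy described above can be sketched in a few lines of Python. This is a minimal illustration only: it assumes that a pool made from equal volumes reads as the mean of its members, the concentrations are hypothetical, and the flagging rule in `first_screen` is an invented stand-in for the judgment applied to the pooled data in this work.

```python
from statistics import mean


def make_pools(values, pool_size=3):
    """Split individual concentrations into consecutive pools of `pool_size`."""
    return [values[i:i + pool_size] for i in range(0, len(values), pool_size)]


def pooled_reading(pool):
    """Idealized pool value: equal-volume mixing gives the mean concentration."""
    return mean(pool)


def first_screen(case_values, control_values, pool_size=3):
    """Stage 1: compare pooled readings of cases against pooled controls."""
    case_pools = [pooled_reading(p) for p in make_pools(case_values, pool_size)]
    control_pools = [pooled_reading(p) for p in make_pools(control_values, pool_size)]
    # Hypothetical flagging rule (an assumption, not the thesis criterion):
    # the average case pool must exceed the highest control pool.
    flagged = mean(case_pools) > max(control_pools)
    return case_pools, control_pools, flagged


# Hypothetical concentrations (µg/L): 9 individuals per group -> 3 pools of 3.
controls = [50, 55, 60, 52, 58, 54, 57, 53, 59]
cases = [80, 90, 85, 70, 95, 88, 60, 100, 92]

case_pools, control_pools, flagged = first_screen(cases, controls)
# Stage 2: only flagged candidates are re-measured in the individual samples.
second_screen = cases if flagged else []
print(case_pools, control_pools, flagged, len(second_screen))
```

Because a pool averages its members, a single high individual can raise a pooled reading, which is one reason a pooled difference needs to be confirmed in individual samples.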

4.2.3 Quantification of Cystatin C, Lipocalin-2 and Transforming growth factor

beta-2 using commercial ELISA kits

A human cystatin C ELISA kit, purchased from BioVendor, LLC, was used to

measure levels of cystatin C in serum. The assay was performed according to the

manufacturer's instructions. Briefly, 96-well plates were coated with polyclonal anti-

human cystatin C specific antibody. After the samples were added, unbound

proteins were washed away. Horseradish peroxidase (HRP)-conjugated polyclonal anti-

human cystatin C antibody was added to the wells and incubated. Following another

washing step, to remove unbound antibody-HRP conjugate, a substrate solution

(H2O2 and TMB) was added. The enzymatic reaction yielded a blue product that

turned yellow when acidic stop solution was added. The intensity of the color,

measured spectrophotometrically at 450 nm, was directly proportional to the

amount of the human cystatin C bound in the initial step.
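Since the signal at 450 nm is described as directly proportional to the amount of bound analyte, converting an absorbance into a concentration can be sketched as a straight-line standard curve. The sketch below is an assumption-laden illustration: the standards, absorbances and function names are hypothetical, and a kit protocol may prescribe a different curve model.

```python
def fit_line(concentrations, absorbances):
    """Ordinary least squares: absorbance = slope * concentration + intercept."""
    n = len(concentrations)
    mean_c = sum(concentrations) / n
    mean_a = sum(absorbances) / n
    sxx = sum((c - mean_c) ** 2 for c in concentrations)
    sxy = sum((c - mean_c) * (a - mean_a)
              for c, a in zip(concentrations, absorbances))
    slope = sxy / sxx
    intercept = mean_a - slope * mean_c
    return slope, intercept


def concentration_from_absorbance(absorbance, slope, intercept):
    """Invert the fitted line to read a sample concentration off the curve."""
    return (absorbance - intercept) / slope


# Hypothetical standards (mg/L) and their absorbances at 450 nm.
standards_mg_per_l = [0.0, 0.5, 1.0, 2.0, 4.0]
standards_a450 = [0.05, 0.25, 0.45, 0.85, 1.65]

slope, intercept = fit_line(standards_mg_per_l, standards_a450)
sample_mg_per_l = concentration_from_absorbance(1.05, slope, intercept)
print(round(slope, 3), round(intercept, 3), round(sample_mg_per_l, 3))
```

Readings that fall outside the range spanned by the standards should not be extrapolated from such a fit.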

A human lipocalin-2/NGAL Quantikine ELISA kit, purchased from R&D Systems

(Minneapolis, MN), was used to measure lipocalin-2 levels in serum. Similar in principle to

the cystatin C assay, a monoclonal antibody specific for lipocalin-2 was coated onto a

microplate. After sample loading, any lipocalin-2 present was bound by the

immobilized antibody. After washing away unbound substances, an enzyme-linked

monoclonal antibody specific for lipocalin-2 was added to the wells. Following a

wash to remove any unbound antibody-enzyme reagent, a substrate solution was

added to the wells and color developed in proportion to the amount of lipocalin-2

bound in the initial step. The color development was stopped and the intensity of the

color was measured spectrophotometrically at 450 nm.

A human TGF-β2 Quantikine ELISA kit, purchased from R&D Systems

(Minneapolis, MN), was used to measure TGF-β2 levels in serum. The assay is similar in

principle to the lipocalin-2 assay.


4.2.4 Quantification of ALCAM, BCAM, NrCAM and Fractalkine using in-house

developed ELISAs

ELISA immunoassays were developed in-house for the following candidates:

ALCAM, BCAM, NrCAM and fractalkine. In general, the assays were based on

mouse monoclonal antibody capture (ALCAM, BCAM, NrCAM: 250 ng/well and

fractalkine: 500 ng/well) and biotinylated goat anti-human detection antibody

(ALCAM: 5 ng/well; BCAM: 2 ng/well; NrCAM: 4 ng/well and fractalkine: 25 ng/well)

(both obtained from R&D Systems, Minneapolis, MN). The assay had a detection

limit of 0.05 µg/L and a dynamic range of up to 10 µg/L. Briefly, 96-well polystyrene

plates were first coated with a capture antibody specific for the molecule. After

overnight incubation, the plates were washed and loaded with 50 μL of serum or

standards and 50 µL of an assay buffer for 1.5 hours. After washing the plate, 100

µL of the biotinylated detection antibody (R&D) was added, creating a sandwich-type

assay, and the plates were incubated for an additional 1 hour with gentle shaking.

After washing, alkaline phosphatase-conjugated streptavidin was added and

incubated for 15 min and washed. Finally, diflunisal phosphate (DFP) and terbium-

based detection was performed, essentially as described by Christopoulos and

Diamandis218.
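The stated detection limit (0.05 µg/L) and upper end of the dynamic range (10 µg/L) imply a simple acceptance check on each reading. The following sketch shows one way such a check might look. The dilution factor is an assumption introduced for illustration (no dilution scheme is given in the text), and the function names are hypothetical.

```python
DETECTION_LIMIT = 0.05  # µg/L, as stated in the text
UPPER_LIMIT = 10.0      # µg/L, upper end of the dynamic range, as stated


def classify_reading(measured):
    """Classify a raw reading (µg/L) against the assay's working range."""
    if measured < DETECTION_LIMIT:
        return "below detection limit"
    if measured > UPPER_LIMIT:
        return "above dynamic range"
    return "in range"


def reportable_concentration(measured, dilution_factor=1.0):
    """Return (status, concentration); concentration is None if out of range.

    `dilution_factor` is a hypothetical parameter: a sample diluted 10-fold
    before measurement would use dilution_factor=10.
    """
    status = classify_reading(measured)
    if status != "in range":
        return status, None
    return status, measured * dilution_factor


print(reportable_concentration(0.01))
print(reportable_concentration(12.0))
print(reportable_concentration(8.5, dilution_factor=10))
```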


4.3 Results

4.3.1 Elafin

We identified elafin in the CM of BT474 and MDA-MB-468, with more unique peptides

identified for the protein in the latter. Using a commercially available sandwich

immunoassay for elafin, we examined a number of different biological samples for its

expression. Serum from women with different levels of CA 15-3 was measured.

Typically, women with CA 15-3 levels of <30 units/mL are considered in the “normal”

range32. Examining sera from women with <30 units/mL, 30-100 units/mL and

>100 units/mL CA15-3 for elafin showed no significant difference among the groups

(Kruskal-Wallis test, P = 0.64) (Figure 4.1A). In addition, using the Mann Whitney

test, we observed no significant difference between normal and tumor breast

cytosols (P = 0.55) (Figure 4.1B). The levels of elafin in a variety of pooled biological

samples showed that milk contained 18 µg/L of this protein, followed by urine at 7

µg/L, follicular fluid and seminal plasma (Figure 4.1C). These were biological

samples from normal individuals and hence do not correlate to diseased phenotype.

Finally, we checked the levels of elafin in normal pooled tissue lysates. After

correction for total protein content in the samples, colon, small intestine and ureter

contained the highest levels. Among the top 10 expressing tissues, breast was

ranked as seventh (Figure 4.1D).
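The three-group comparison of serum elafin across CA 15-3 strata reported above used the Kruskal-Wallis test. A minimal sketch of that test for exactly three groups follows. The data are hypothetical, and the P-value relies on the large-sample chi-square approximation with 2 degrees of freedom, for which the upper tail is exp(-H/2); statistical software may use a different method for small samples.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def kruskal_wallis_3(group1, group2, group3):
    """Return (H, P) for three groups, with tie correction."""
    groups = [group1, group2, group3]
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    tie_factor = 1 - sum(c ** 3 - c for c in counts.values()) / (n ** 3 - n)
    if tie_factor == 0:  # all values identical: no evidence of a difference
        return 0.0, 1.0
    h /= tie_factor
    # Chi-square upper tail with 2 degrees of freedom (3 groups - 1).
    return h, math.exp(-h / 2)


# Hypothetical elafin values (ng/L) for three CA 15-3 strata.
low = [9000, 12000, 15000, 11000]
mid = [10000, 13000, 14000, 9500]
high = [11500, 12500, 10500, 16000]
h_stat, p_value = kruskal_wallis_3(low, mid, high)
print(round(h_stat, 3), round(p_value, 3))
```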


Figure 4.1: Levels of elafin in biological samples. (A) Serum from women with known CA 15-3 levels were analyzed for elafin expression by ELISA (median values shown). (B) Normal and tumor breast cytosols as well as various pooled biological samples (C) and pooled tissues (D) were assayed. The number of samples used to generate a pooled sample is indicated in brackets. No expression was observed in other biological samples and tissues examined (See Materials and Methods section for a complete list of all samples surveyed).

Panels: (A) elafin (ng/L) by CA 15-3 group (<30, >30 <100 and >100 units/mL); (B) elafin corrected for total protein (ng/g) in normal and tumor breast cytosols; (C) elafin (ng/L) in pooled milk (10), urine (10), follicular fluid (10) and seminal plasma (10); (D) elafin (ng/g) in pooled colon (5), small intestine (4), ureter (4), spinal cord (3), skin (4), bone (4), breast (4), pancreas (5), esophagus (4), heart (5) and spleen (5).


4.3.2 Kallikreins 5, 6 and 10

In addition, using the same sample sets as for elafin, we measured the levels of

kallikreins 5, 6 and 10. In serum of control and breast cancer cases, KLK5 showed

no significant difference by the non-parametric Mann Whitney test (P =

0.83). For both KLK6 and KLK10, however, the difference was weakly significant (P =

0.01 and 0.04, respectively; Mann Whitney test; data not shown).

Furthermore, examining the breast cytosols (normal and tumor) for KLK expression

yielded no significant difference. Interestingly, in pooled biological samples, KLK5

was found expressed in all of the relevant breast fluids (milk, breast cyst fluid and

NAF) (Figure 4.2). Finally, in the various tissues screened for KLK expression, there

was a weak expression in skin and breast for KLK5, weak expression in ovary and

spinal cord for KLK6 and a weak expression in esophagus for KLK10 (data not

shown).


Figure 4.2: Levels of KLK5, KLK6 and KLK10 in biological samples. Various pooled biological samples were assayed for kallikrein expression by ELISA. The number of samples used to generate a pooled sample is indicated in brackets.

Panels: (A) KLK5 (µg/L), (B) KLK6 (µg/L) and (C) KLK10 (µg/L) in pooled breast cyst fluid (3), saliva (3), normal urine (10), NAF (10), amniotic fluid (10), seminal plasma (10), ascites (10), milk, follicular fluid (10) and CSF (7).


4.3.3 Cystatin C

Cystatin C was found in the CM of MCF-10A and BT474 with a 3-fold higher

expression in the latter. It was also identified in the NAF proteome and in the human

plasma proteome. Its levels, based on the immunoassay, indicate that it is a high

abundance protein with values in the low mg/L range. In the first screen, analysis of

the pooled serum samples by immunoassay showed that cystatin C discriminated cases

from controls for breast, prostate and pancreatic cancers (Figure 4.3A). In the second

phase of analysis, the pools were broken down into their respective individual samples and

analyzed for breast, prostate and pancreatic cancers (Figure 4.3B). Individual

analysis revealed only a small incremental change between the groups.


Figure 4.3: Levels of cystatin C in serum. (A) Pooled serum samples representing various cancer types and (B) Individual serum samples from healthy individuals and patients with breast, prostate and pancreatic cancers.

Panels: (A) cystatin C (mg/L) in pooled serum: normal female, normal male, low CA 15-3, breast, prostate, ovarian, colon, lung and pancreatic; (B) cystatin C (mg/L) in individual serum: normal female, normal male, low CA 15-3, breast, prostate and pancreatic.


4.3.4 Lipocalin-2

Lipocalin-2 was found in the CM of BT474 and MDA-MB-468, and absent in

the normal breast epithelial cell line MCF-10A. It was also identified in the human

plasma proteome. In the first verification phase of analysis, lipocalin-2 was highly

elevated in pancreatic cancer compared to all other cancer types examined (Figure

4.4A). In the second phase of verification, 91 serum samples were examined

including 6 normal females, 6 normal males, 10 chronic pancreatitis, 35 resected

pancreatic ductal adenocarcinoma (PDAC) and 34 unresectable pancreatic

adenocarcinoma (Figure 4.4B). At 90% specificity (cut-off value 52 µg/L), 51%

sensitivity (18/35) for PDAC patients was observed.
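The reported figures (cut-off 52 µg/L at 90% specificity; 18 of 35 PDAC patients above it, i.e. 51%) follow a standard two-step calculation: fix the cut-off from the control distribution, then count the cases above it. The sketch below illustrates that calculation with hypothetical values. Which samples served as the control group for the specificity estimate is not stated here, so the control list is an assumption, as are the function names.

```python
def cutoff_at_specificity(control_values, specificity_percent=90):
    """Smallest observed control value with at least the requested share
    of controls at or below it (integer arithmetic avoids float rounding)."""
    ordered = sorted(control_values)
    needed = (specificity_percent * len(ordered) + 99) // 100  # ceiling
    return ordered[needed - 1]


def sensitivity(case_values, cutoff):
    """Fraction of cases strictly above the cut-off (called positive)."""
    positives = sum(1 for v in case_values if v > cutoff)
    return positives / len(case_values)


# Hypothetical lipocalin-2 values (µg/L).
controls = [20, 25, 28, 30, 33, 35, 38, 41, 52, 70]
cases = [30, 45, 55, 60, 80, 120, 50, 52, 200, 90]

cutoff = cutoff_at_specificity(controls, 90)
print(cutoff, sensitivity(cases, cutoff))
```

With this convention a value equal to the cut-off is called negative, so 9 of the 10 hypothetical controls fall at or below the cut-off.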


Figure 4.4: Levels of lipocalin-2 in serum. (A) Pooled serum samples representing various cancer types and (B) Individual serum samples from healthy individuals, patients with chronic pancreatitis and patients with pancreatic cancer. PDAC: resected pancreatic ductal adenocarcinoma. Dotted horizontal line represents the cut-off at 90% specificity.

Panels: (A) lipocalin-2 (µg/L) in pooled serum: normal female, normal male, low CA 15-3, breast, prostate, ovarian, colon, lung and pancreatic; (B) lipocalin-2 (µg/L) in individual serum: normal female, normal male, chronic pancreatitis, PDAC and unresectable.


4.3.5 Transforming growth factor beta-2

TGF-β2 was found only in the metastatic breast cancer cell line, MDA-MB-

468. It was previously identified in the TIF proteome. Analyzing the pooled serum

samples revealed that TGF-β2 did not discriminate between the different

cancer types and between controls and cases (Figure 4.5). This molecule was not

analyzed further.


Figure 4.5: Levels of transforming growth factor beta-2 (TGF-β2) in serum. Pooled serum samples representing various cancer types were examined.

Panel: TGF-β2 (ng/L) in pooled serum: normal female, normal male, low CA 15-3, breast, prostate, ovarian, colon, lung and pancreatic.


4.3.6 Basal cell adhesion molecule (BCAM)

BCAM was initially identified in the CM of the cancer cell lines. In the pooled

samples analysis, BCAM levels were increased in breast and prostate cancers

compared to their respective normal groups (Figure 4.6A). In the individual samples

screening process, at 90% specificity (cut-off point of 32 µg/L) the sensitivity for

breast cancer diagnosis was 34% (Figure 4.6B). Examining the values between

normal women (n = 9) and patients with breast cancer (n = 35) by the non-

parametric Mann Whitney test (two-tailed), demonstrated that the medians were not

significantly different (median normals = 27 µg/L; median cancer = 26 µg/L; P =

0.84). However, the Spearman correlation coefficient between BCAM and CA 15-3

was 0.56 for 35 samples with a P-value of <0.0004 (data not shown). Finally, in the

second phase of the screening process, serum samples from prostate cancer

patients continued to be elevated compared to normal males.
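The Spearman coefficient quoted above for BCAM and CA 15-3 is the Pearson correlation computed on ranks. A minimal sketch follows, using hypothetical paired values. It returns only the coefficient; the P-value reported in the text would come from statistical software and is not reproduced here.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman rank correlation: Pearson correlation of the two rank lists."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical paired measurements for five patients.
bcam_ug_per_l = [21, 26, 24, 35, 30]
ca153_units_per_ml = [12, 40, 25, 300, 90]
print(spearman_rho(bcam_ug_per_l, ca153_units_per_ml))
```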


Figure 4.6: Levels of basal cell adhesion molecule (BCAM) in serum. (A) Pooled serum samples representing various cancer types and (B) Individual serum samples that made the pooled samples were analyzed individually. Dotted horizontal line represents the cut-off at 90% specificity.

Panels: (A) BCAM (µg/L) in pooled serum and (B) BCAM (µg/L) in individual serum: normal female, normal male, low CA 15-3, breast, prostate, ovarian, colon, lung and pancreatic.


4.3.7 Neuronal cell adhesion molecule (NrCAM)

NrCAM was found expressed only in the localized breast cancer cell line

(BT474). In the first phase of the screening process, NrCAM was elevated in pooled

breast cancer serum compared to healthy controls (Figure 4.7A). This observation did

not hold true when the pools were broken down into their individual samples (Figure

4.7B). NrCAM was ineffective at discriminating between the different cancer types

and between controls and cases. This molecule was not analyzed further.


Figure 4.7: Levels of neuronal cell adhesion molecule (NrCAM) in serum. (A) Pooled serum samples representing various cancer types and (B) Individual serum samples that made the pooled samples were analyzed individually. Dotted horizontal line represents the cut-off at 90% specificity.

Panels: (A) NrCAM (µg/L) in pooled serum and (B) NrCAM (µg/L) in individual serum: normal female, normal male, low CA 15-3, breast, prostate, ovarian, colon, lung and pancreatic.


4.3.8 Fractalkine

Like many of the other candidates selected for analysis in the verification

phase of this study, fractalkine was initially found expressed in the CM of the breast

cancer cell lines, BT474 and MDA-MB-468 and absent in the semi-normal breast

epithelial cell line, MCF-10A. Fractalkine was elevated in the lung cancer pool

compared to all other cancer types (Figure 4.8A). In the second phase of evaluation,

58 serum samples were examined including 6 normal females, 6 normal males, 22

adenocarcinoma, 1 large cell carcinoma, 17 non-small cell lung carcinoma (NSCLC)

and 6 squamous cell lung carcinoma (Figure 4.8B). No statistically significant

difference was observed between the groups examined (Kruskal-Wallis test; P =

0.1135).


Figure 4.8: Levels of fractalkine in serum. (A) Pooled serum samples representing various cancer types and (B) Individual serum samples from healthy individuals and patients with different types of lung cancer. NSCLC: Non-small cell lung cancer. Dotted horizontal line represents the cut-off at 90% specificity.

Panels: (A) fractalkine (µg/L) in pooled serum: normal female, normal male, low CA 15-3, breast, prostate, ovarian, colon, lung and pancreatic; (B) fractalkine (µg/L) in individual serum: normal female, normal male, adenocarcinoma, large cell, NSCLC and squamous.


4.3.9 Activated leukocyte cell adhesion molecule (ALCAM)

ALCAM was initially identified in the CM of cancer cell lines, BT474 and MDA-

MB-468. Analyzing the pooled samples by an ALCAM-specific immunoassay (Figure

4.9A) showed elevated levels of ALCAM in breast cancer and prostate cancer

compared to their control groups of normal female and male, respectively. In the

second phase of the screening process, the pools were broken down into individual

serum samples (Figure 4.9B) and ALCAM levels continued to be increased in breast

and prostate cancers compared to their controls. When comparing the values

between normal women (n = 9) and patients with breast cancer (n = 35) by the non-

parametric Mann Whitney test (two-tailed), the medians were significantly different

(median normals = 56 µg/L; median cancer = 84 µg/L; P < 0.0002). For ALCAM, at

90% specificity (cut-off point of 62 µg/L) the sensitivity for breast cancer diagnosis

was 91%. CA 15-3 levels were measured in the serum of breast cancer patients and

the Spearman correlation coefficient between ALCAM and CA 15-3 was 0.63 for 35

samples with a P-value of <0.0001 (data not shown). Finally, the sensitivity of the

test for breast cancer diagnosis in patients where CA 15-3 was normal (<30 U/mL)

was 78% (data not shown).
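The two-group comparison of normal women and breast cancer patients above used the two-tailed Mann Whitney test. A minimal sketch of that test is given below with hypothetical data. It uses the normal approximation with tie correction and no continuity correction, so its P-values are approximate and may differ from those produced by statistical packages, particularly for small groups.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U, two-tailed P) using the normal approximation."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = list(x) + list(y)
    ranks = average_ranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(c ** 3 - c for c in counts.values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:  # all values identical
        return u, 1.0
    z = (u - n1 * n2 / 2) / math.sqrt(variance)
    return u, math.erfc(abs(z) / math.sqrt(2))


# Hypothetical ALCAM values (µg/L).
normal_women = [48, 52, 56, 58, 60, 55, 50, 61, 57]
breast_cancer = [84, 70, 95, 66, 110, 59, 88, 76, 102, 81]
u_stat, p_value = mann_whitney_u(normal_women, breast_cancer)
print(u_stat, p_value)
```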


Figure 4.9: Levels of activated leukocyte cell adhesion molecule (ALCAM) in serum. (A) Pooled serum samples representing various cancer types and (B) Individual serum samples that made the pooled samples were analyzed individually. Dotted horizontal line represents the cut-off at 90% specificity.

Panels: (A) ALCAM (µg/L) in pooled serum and (B) ALCAM (µg/L) in individual serum: normal female, normal male, low CA 15-3, breast, prostate, ovarian, colon, lung and pancreatic.


4.4 Discussion

Analysis of serum from healthy individuals and patients with breast cancer, as

well as other cancer types, resulted in one promising candidate molecule: activated

leukocyte cell adhesion molecule (ALCAM). Our results provide strong evidence that

ALCAM may be a novel breast cancer diagnostic marker, either alone, or combined

with existing breast cancer biomarkers such as CA 15-3 and CEA. Importantly,

ALCAM seemed to be superior to CA 15-3 since it was elevated in 78% of patients

with breast cancer who had normal levels of CA 15-3. This suggests that ALCAM can

identify a considerable proportion of patients (78%) who would be missed by CA 15-3

testing. Furthermore, for ALCAM, at 90% specificity, the sensitivity for breast cancer

diagnosis (all stages) was 91%.
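The complementarity argument above rests on a simple conditional proportion: among patients whose CA 15-3 is in the normal range, what fraction has ALCAM above its cut-off? The sketch below shows that calculation. The cut-offs (30 units/mL for CA 15-3, 62 µg/L for ALCAM) are the ones quoted in this chapter; the patient values and the function name are hypothetical.

```python
def detected_among_ca153_normal(patients, ca153_cutoff=30.0, alcam_cutoff=62.0):
    """Fraction of CA 15-3-normal patients whose ALCAM exceeds its cut-off.

    `patients` is a list of (ca153_units_per_ml, alcam_ug_per_l) pairs.
    Returns None when no patient has a normal CA 15-3 value.
    """
    ca153_normal = [p for p in patients if p[0] < ca153_cutoff]
    if not ca153_normal:
        return None
    detected = sum(1 for _, alcam in ca153_normal if alcam > alcam_cutoff)
    return detected / len(ca153_normal)


# Hypothetical (CA 15-3, ALCAM) pairs for six breast cancer patients.
patients = [(12, 80), (25, 70), (28, 55), (18, 90), (45, 100), (150, 120)]
print(detected_among_ca153_normal(patients))
```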

Interestingly, ALCAM and BCAM levels were elevated in serum of prostate

cancer patients. They continued to be elevated in the second phase of the screening

process where the pooled samples were broken down into their individual serum

samples. Given that both breast and prostate cancers are partly hormone-dependent,

this may suggest a hormonal regulation of both ALCAM and BCAM in breast and

prostate cancer. To date, no hormone response elements have been identified for

ALCAM. Further studies are needed to examine this observation in more detail.

In addition, while we did not observe circulating biomarker potential in elafin

to discriminate normal from diseased individuals, to our knowledge, we are the first

to report on the expression of this protein in breast cytosols, biological fluids and

tissue lysates using an immunoassay. Elafin was found to be expressed in normal

breast tissue. Lastly, the verification phase revealed a potential biomarker for


pancreatic cancer, lipocalin-2. Certainly, a larger set of serum samples with known

clinicopathologic parameters is needed to determine whether lipocalin-2 can be a

diagnostic marker for pancreatic cancer.

In summary, elafin, kallikreins 5, 6 and 10, cystatin C, TGF-β2, BCAM,

NrCAM and fractalkine did not show breast cancer biomarker potential in the

verification phase of analysis. Lipocalin-2 showed promise in identifying a

subset of pancreatic cancer patients, but further studies are warranted to determine its

relevance as a serological marker. ALCAM demonstrated excellent preliminary

promise as a circulating breast cancer biomarker. It is possible that

measuring ALCAM in blood could provide information on patient prognosis,

recurrence or response to chemotherapy.

Using specimens with known clinical information will

facilitate characterizing the utility of ALCAM as a circulating breast cancer biomarker.


CHAPTER 5:

VALIDATION OF ALCAM AS A SEROLOGICAL BREAST CANCER DIAGNOSTIC MARKER


5.1 Introduction

Cell adhesion molecules are cell surface receptors that mediate cell-cell and

cell-substrate interactions. These molecules can be grouped into four families:

integrins, cadherins, selectins and the immunoglobulin superfamily (Ig-SF)219.

Alterations in cellular adhesion and communication can contribute to uncontrolled

cell growth. Tumor cells use adhesion molecules to cluster together and they must

maintain their adhesion to each other to invade. Hence, alterations in adhesion

molecules are essential during primary tumor formation and in metastasis220. In this

respect, activated leukocyte cell adhesion molecule (ALCAM, CD166 or human

melanoma metastasis clone D [MEMD]) is a type 1 transmembrane glycoprotein of

the Ig-SF221. The molecular weight of ALCAM is 65 kDa but, with N-glycosylation at 8

putative sites, the mature ALCAM molecule has a molecular weight of 110 kDa222.

Five extracellular Ig domains, a transmembrane region and a short cytoplasmic tail

make up the ALCAM protein that resembles E-cadherin in motif-arrangement221.

ALCAM mediates both heterophilic (ALCAM-CD6 [lymphocyte cell-surface receptor])

and homophilic (ALCAM-ALCAM) cell-cell interactions223. The extracellular

structures of ALCAM provide two structurally and functionally distinguishable

modules, one involved in ligand binding (to CD6)224 and the other in avidity225

(Figure 5.1). Both modules are required for stable, homophilic ALCAM-ALCAM cell-

cell adhesion223. Its short cytoplasmic tail does not contain any known signaling

motifs. Physiologically, ALCAM is expressed in activated leukocytes and neural,

epithelial and hematopoietic progenitor cells226. Functionally, ALCAM may act as a

cell surface sensor to register local growth saturation and to regulate cellular


signaling and dynamic responses214. ALCAM-CD6 interaction is required for optimal

activation of T-cells suggesting a possible ALCAM involvement in the immunologic

response to tumor cells227. ALCAM may favor interactions between tumor and

endothelial cells214. In fact, ALCAM expression has been shown to correlate with the

invasiveness of malignant melanoma and has been proposed as a prognostic

marker in this disease228,229.

The aim of our study was to investigate whether ALCAM, either alone or in

combination with the classical breast cancer biomarkers (CA 15-3 and CEA),

represents a new strategy for breast cancer diagnosis with high sensitivity and

specificity in serum, using quantitative methodologies. The association between

serum marker concentrations and various clinicopathologic parameters was also

examined.


Figure 5.1: Model of homophilic ALCAM-ALCAM interactions between cells. The extracellular structures of ALCAM consist of 2 functional modules. Ig domains D1-2 are essential for ligand binding while domains D3-5 are involved in cis-oligomerization at the cell surface.

Diagram labels: ligand binding module (D1-2); oligomerizing module (D3-5); extracellular; cell membrane; cytoplasm.



5.2.1 Patients and specimens

The clinical material used consisted of 150 serum samples from primary

breast cancer patients (ages 34 to 82 years; median, 62 years), 100 serum samples

from normal, apparently healthy women (ages 24 to 56 years; median, 40 years),

and as an additional control, 50 serum samples from normal healthy men (ages 23

to 61 years; median, 48 years). The samples from primary breast cancer patients

were from untreated individuals collected prior to surgery. Histologically, 94 were

classified as invasive ductal carcinoma and/or multifocal invasive ductal carcinoma,

24 as invasive lobular carcinoma and/or multifocal invasive lobular carcinoma and

32 as either invasive ductal carcinoma + invasive lobular carcinoma, invasive ductal

carcinoma with various aspects, lobular carcinoma in situ, medullary carcinoma or

other. Histologic classification was based on the World Health Organization

classification of breast tumors. Patients with disease of clinical stages 1 to 3 were

represented in this study. Of the 150 primary breast carcinoma patients, 32 were

stage 1, 57 were stage 2A or 2B, 27 were stage 3A or 3B and stage information was

not available for the remaining 34. Clinical grades 1, 2 and 3, corresponding to 26,

62 and 56 patients, respectively, were included in this study. The characteristics of

the breast cancer patients in terms of tumor diameter, lymph node status,

menopausal status and hormone receptor status are described later. Serum

samples from all patients, obtained from Venice, Italy, were stored at -80°C until

further analysis. Our protocols have been approved by the review boards of the

participating institutions.


5.2.2 Measurement of ALCAM, CA 15-3 and CEA in serum

The concentration of ALCAM in serum was measured by using a highly

sensitive and specific non-competitive “sandwich-type” ELISA, developed in our

laboratory. The assay is based on mouse monoclonal antibody capture and

biotinylated goat-anti human detection antibody (both obtained from R&D Systems,

Minneapolis, MN). The assay has a detection limit of 0.05 µg/L and a dynamic range

of up to 10 µg/L. Imprecision (CV) was less than 10% within the measurement range. Assay

parameters such as stability, linearity, cross-reactivity, recovery and reproducibility

were examined. Serum samples were analyzed in triplicate with inclusion of two

quality control samples in every run. In addition, CA 15-3 and CEA were measured

using a commercially available automated ELISA kit (Elecsys CA 15-3 and CEA

Immunoassay, respectively; Roche Diagnostics, Indianapolis, IN). The upper limit of

normal for CA 15-3 for this method is 30 U/mL and for CEA is 5 ng/mL.

5.2.3 Data analysis and statistics

The relationships between biomarkers and patient and tumor characteristics

were examined with the Kruskal-Wallis test, a nonparametric method for examining

differences among multiple groups. Spearman’s rank correlation coefficient was

used to assess the correlations among biomarkers. Logistic regression was

performed to calculate the odds ratio (OR) that defines the relation between

biomarkers and case or control status. ORs were calculated on log-transformed

biomarkers and are reported with their 95% confidence intervals (95% CI) and

two-sided p-values.
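
The two rank-based statistics named above can be illustrated with a minimal sketch. It is written in Python using only the standard library, which is an assumption made for illustration; the study itself used S-PLUS. The function names and the input values are hypothetical, and the Kruskal-Wallis statistic is shown without the tie correction that statistical packages normally apply.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_r(x, y):
    """Spearman's coefficient: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy)


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (no tie correction) for a list of groups."""
    pooled = [v for g in groups for v in g]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12.0 / (n_total * (n_total + 1)) * total - 3.0 * (n_total + 1)


# Hypothetical marker values, for illustration only (not thesis data).
marker_a = [1.0, 2.0, 3.0, 4.0, 5.0]
marker_b = [2.0, 4.0, 5.0, 4.0, 5.0]
print(round(spearman_r(marker_a, marker_b), 3))
print(round(kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]]), 2))
```

Under the null hypothesis, H is compared with a chi-square distribution with (number of groups - 1) degrees of freedom; that step is omitted here.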

To further evaluate the diagnostic or prognostic usefulness of the markers for

dichotomous classification, we considered receiver operating characteristic (ROC)

curve analysis. If by convention larger values of a biomarker are associated with

adverse outcome, a cut-off point is used to define a positive marker-based test result,

i.e., positive if the marker value exceeds some cut-off point. For a marker measured

on a continuous scale, the ROC curve is a plot of true positive fraction versus false

positive fraction, evaluated for all possible cut-off point values. For a binary outcome,

e.g., response to chemotherapy, the ROC curve quantifies the discriminatory ability

of a marker for separating cases from controls. The standard deviations of the area

under the curve (AUC) and the differences between AUCs are computed with the U-

statistic of DeLong et al230, or the bootstrap resampling method.
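
The construction of an ROC curve described above can be sketched as follows. This is an illustrative Python example, not the S-PLUS code used in the study; the function names are hypothetical, and a subject is counted as positive when the marker value exceeds the cut-off, as in the text.

```python
def roc_points(cases, controls):
    """Return (false positive fraction, true positive fraction) pairs.

    One point is produced per cut-off; a subject tests positive when the
    marker value exceeds the cut-off.
    """
    cutoffs = sorted(set(cases) | set(controls), reverse=True)
    cutoffs.append(float("-inf"))  # cut-off below every value: all positive
    points = []
    for c in cutoffs:
        tpf = sum(v > c for v in cases) / len(cases)
        fpf = sum(v > c for v in controls) / len(controls)
        points.append((fpf, tpf))
    return points


def trapezoid_area(points):
    """Area under the piecewise-linear curve through the ROC points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical marker values, for illustration only (not thesis data).
example = roc_points(cases=[3, 4], controls=[1, 2])
print(example)
print(trapezoid_area(example))
```

Because the cut-offs are visited from highest to lowest, the points run from (0, 0) to (1, 1), which is the order the trapezoid rule needs.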

For each ROC curve, we calculated the AUC, which ranges from 0.5 (for a

non-informative marker) to 1 (for a perfect marker) and corresponds to the

probability that a randomly selected case has a higher marker value than a randomly

selected control. The bootstrap method was used to calculate the confidence intervals

for AUC.
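
The probabilistic reading of the AUC and a bootstrap interval can be sketched as below. The sketch assumes a percentile bootstrap in which cases and controls are resampled separately; the thesis does not state which bootstrap variant was used, so this choice, the function names and the example values are all assumptions.

```python
import random


def auc(cases, controls):
    """Probability that a random case exceeds a random control (ties count 1/2)."""
    wins = 0.0
    for a in cases:
        for b in controls:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def bootstrap_auc_ci(cases, controls, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap interval; cases and controls are resampled separately."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        boot_cases = [rng.choice(cases) for _ in cases]
        boot_controls = [rng.choice(controls) for _ in controls]
        stats.append(auc(boot_cases, boot_controls))
    stats.sort()
    lo_idx = int((1 - level) / 2 * n_boot)
    hi_idx = int((1 + level) / 2 * n_boot) - 1
    return stats[lo_idx], stats[hi_idx]


# Hypothetical marker values, for illustration only (not thesis data).
case_values = [3, 4, 5, 6, 7]
control_values = [1, 2, 3, 4, 5]
print(auc(case_values, control_values))
print(bootstrap_auc_ci(case_values, control_values))
```

The pairwise count is the Mann-Whitney U statistic divided by the number of case-control pairs, which is why the AUC and the Mann-Whitney test are closely related.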

The ROC analysis was first conducted on individual markers and then in

combination, to explore the potential that a marker panel can lead to improved

performance. We considered an algorithm that renders a single composite score

using the linear predictor fitted from a binary regression model. This algorithm has

been shown to be optimal under the linearity assumption231 in the sense that the ROC

curve is maximized (i.e., best sensitivity) at every threshold value. Since an

independent validation series was not available for this study, the predictive

accuracy of the composite scores was evaluated based on re-sampling of the

original data. All analyses were performed using S-PLUS 8.0 software (Insightful Corp.,

Seattle, WA).
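
The composite-score idea described above can be sketched in a few lines. This is a minimal illustration in Python, not the S-PLUS analysis performed in the study: the logistic model is fitted by plain gradient ascent (an assumption chosen to keep the example dependency-free), the function names are hypothetical, and the marker pairs are invented values standing in for log-transformed concentrations.

```python
import math


def fit_logistic(rows, labels, lr=0.1, n_iter=5000):
    """Fit intercept and coefficients by gradient ascent on the log-likelihood."""
    n_feat = len(rows[0])
    beta = [0.0] * (n_feat + 1)  # beta[0] is the intercept
    for _ in range(n_iter):
        grad = [0.0] * (n_feat + 1)
        for x, y in zip(rows, labels):
            z = beta[0] + sum(b * v for b, v in zip(beta[1:], x))
            p = 1.0 / (1.0 + math.exp(-z))
            grad[0] += y - p
            for j, v in enumerate(x):
                grad[j + 1] += (y - p) * v
        beta = [b + lr * g / len(rows) for b, g in zip(beta, grad)]
    return beta


def composite_score(beta, x):
    """Linear predictor: one score that combines all markers."""
    return beta[0] + sum(b * v for b, v in zip(beta[1:], x))


def auc(case_scores, control_scores):
    """Probability that a random case scores above a random control."""
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in case_scores for b in control_scores)
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical (marker 1, marker 2) pairs on a log scale; not thesis data.
controls = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
cases = [(2, 2), (3, 0), (1, 3), (2, 3), (1, 1)]
rows = controls + cases
labels = [0] * len(controls) + [1] * len(cases)

beta = fit_logistic(rows, labels)
case_scores = [composite_score(beta, x) for x in cases]
control_scores = [composite_score(beta, x) for x in controls]
print(round(auc(case_scores, control_scores), 2))
```

The AUC printed here is computed on the same data used for fitting and is therefore optimistic, which is the reason the text evaluates the composite score by re-sampling.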


5.3 Results

5.3.1 ALCAM ELISA assay development

To investigate the diagnostic potential of ALCAM, we developed a robust

sandwich-type ELISA using two antibodies specific for the human molecule (Figure

5.2). To ensure that the immunoassay was suitable for measuring clinical serum

samples, the recovery, reproducibility, linearity, cross-reactivity and serum sample

stability were examined. Recombinant human ALCAM protein was added into the

general diluent (control), normal serum (male and female) and into serum of breast

cancer patients at different concentrations, and measured with the ALCAM

immunoassay. A recovery of 90-100% was observed in these samples. The assay

also showed negligible cross-reactivity to another adhesion molecule of the Ig-SF, B-

cell adhesion molecule223, displayed excellent linearity with serial dilutions and

showed < 10% CV for intra- and inter-assay variability studies. Finally, the design of

the stability study consisted of collecting serum at different time points (2 weeks, 4

weeks and fresh samples) and storing them at 4°C, -20°C and -80°C. ALCAM levels

were measured in these samples using the immunoassay. No difference was

observed among the samples stored at the different temperature conditions and

among the different time point collections, compared to the freshly obtained samples

(data not shown).
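
For readers unfamiliar with the assay metrics quoted above, the following sketch shows how spike recovery and the coefficient of variation (CV) are conventionally calculated. The formulas are the standard textbook definitions and are an assumption here, since the thesis does not spell them out; the function names and numbers are hypothetical.

```python
import statistics


def percent_recovery(measured_spiked, measured_unspiked, amount_added):
    """Recovered fraction of the added analyte, as a percentage."""
    return 100.0 * (measured_spiked - measured_unspiked) / amount_added


def percent_cv(replicates):
    """Coefficient of variation: sample standard deviation over the mean."""
    return 100.0 * statistics.stdev(replicates) / statistics.mean(replicates)


# Hypothetical concentrations in µg/L, for illustration only.
print(percent_recovery(measured_spiked=69.5, measured_unspiked=60.0, amount_added=10.0))
print(percent_cv([9.0, 10.0, 11.0]))
```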


Figure 5.2

Figure 5.2: Schematic of a sandwich ELISA assay. Ab, antibody; SA-ALP, streptavidin alkaline phosphatase.

YY YY YY YY YY YY B

Coat well with 1st 1o Ab (capture)

Add Antigen Add Biotinylated Ab (detection)

Add SA-ALP Add Substrate

YY B YY B


5.3.2 Association of biomarkers with age

Since cases and controls were not matched for age, we first explored if

marker values differed by age. The comparisons between cases and controls were

based on data from females only. While no change with age was observed for CA

15-3 concentrations, the level of CEA appeared to increase with age for both cases

and controls. With respect to ALCAM, there was a trend for marker level to increase

with age for cases but not for controls (Figure 5.3).
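
The age trends in Figure 5.3 are summarized by the slope of a linear regression of log(marker) on age. A minimal sketch of that calculation is given below; it assumes natural logarithms (the base is not stated in the thesis), uses hypothetical function names, and generates its input from an invented exact relationship so that the fitted slope is known in advance.

```python
import math


def ols_slope_intercept(x, y):
    """Least-squares slope and intercept for y regressed on x."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical data: marker = exp(4.0 + 0.005 * age); not thesis data.
ages = [30, 40, 50, 60, 70]
marker = [math.exp(4.0 + 0.005 * a) for a in ages]

slope, intercept = ols_slope_intercept(ages, [math.log(m) for m in marker])
print(slope, intercept)
# On the natural-log scale, a slope b corresponds to a multiplicative
# change of exp(10 * b) in the marker per decade of age.
print(math.exp(10 * slope))
```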

5.3.3 Correlations among biomarkers

Spearman’s rank correlation coefficients were used to assess the correlations

among markers for female controls and cases, respectively, and the results are

listed in Table 5.1. CEA appeared to be weakly correlated with ALCAM in both cases

(Spearman r = 0.371, p < 0.001) and controls (Spearman r = 0.348, p = 0.001),

whereas CA 15-3 was weakly correlated with ALCAM among cases only (Spearman

r = 0.2, p = 0.015).


Figure 5.3: Scatter plot of individual markers for cases and female controls versus age. Solid lines and dashed lines are loess fit (weighted polynomial regression) for cancer and normal patients respectively. Slopes are based on the fit of linear regression models of log(marker) with age as the predictor. The cancer and normal slope for CA 15-3 is horizontal stating that there is no change with age and CA 15-3 concentration. The levels of CEA increase as age increases for both cases and controls as illustrated by the steep slope. Finally for ALCAM, there is a trend for ALCAM concentrations to increase with age for cases but not for controls.

[Panels: log(CA 15-3), log(CEA) and log(ALCAM) versus age (20 to 80 years). Slopes printed on the panels: CA 15-3, cancer 0.0004 (p = 0.928), normal 0.007 (p = 0.301); CEA, cancer 0.02 (p = 0.001), normal 0.028 (p = 0.012); ALCAM, cancer 0.0053 (p = 0, as printed), normal 0.001 (p = 0.821).]


Table 5.1: Spearman's rank correlation coefficients among 3 markers for female controls and cases

            Female controls               Cases
            CA15-3    CEA       ALCAM     CA15-3    CEA       ALCAM
CA15-3      1                             1
CEA         -0.091    1                   0.161*    1
ALCAM       0.082     0.348*    1         0.2*      0.371*    1

*: p < 0.05


5.3.4 Association of biomarkers with tumor characteristics for cases

The association of ALCAM, CA 15-3 and CEA with patient and tumor

characteristics such as age, tumor diameter, ER and PgR status, grade, histology,

ratio of lymph node positive (lpos) and total lymph nodes (ltot), menopausal status,

and stage were examined. A significant association was obtained for the following

clinicopathologic variables: age (<=50, 51-60, 61-70 and >70), menopausal status

(pre- and post-menopausal), and stage (I, II, III). The distributions of each marker in

cases for these variables are given in Table 5.2. Post-menopausal women displayed

higher values of CEA and ALCAM (both p < 0.001). In contrast, levels of ALCAM were not

significantly associated with stage whereas CEA and CA15-3 were. Finally, while a

statistically significant p-value was not obtained for an association between ALCAM

values and tumor grade, a general trend was observed with elevated ALCAM levels

and tumor grade, with ALCAM levels being elevated as early as grade 1 compared

to control individuals (Figure 5.4).


Table 5.2: Marker distributions by tumor characteristics for cases

                     # of       CA15-3             CEA                ALCAM
                     patients   Median   Q31*      Median   Q31*      Median   Q31*
Age
  <=50               36         20.36    5.11      1.59     0.59      66.00    7.00
  51-60              34         21.78    6.06      1.82     0.99      77.00    11.50
  61-70              40         18.90    4.34      2.02     0.75      75.00    9.00
  70+                40         22.92    6.12      2.62     0.96      82.00    8.25
  p value**                     0.31               0.01               <0.001
Menopausal status
  pre                30         21.30    5.35      1.03     0.56      66.00    6.00
  post               103        20.61    6.06      2.14     0.98      78.00    9.50
  p value**                     0.92               <0.001             <0.001
Stage
  I                  32         17.20    5.93      1.48     0.61      72.00    11.25
  II                 57         19.46    4.66      1.75     0.86      74.00    8.00
  III                27         23.40    10.32     2.47     1.10      72.00    11.50
  p value**                     0.003              0.004              0.88

* Q31, semi-interquartile range: computed as one half the difference between the 75th percentile (Q3) and the 25th percentile (Q1)
** p value: computed from global nonparametric Kruskal-Wallis test for testing the association between a marker and a clinical variable
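
The semi-interquartile range defined in the footnote to Table 5.2 can be computed as in the sketch below. Quartile conventions differ between software packages; this example uses the "inclusive" method of Python's `statistics.quantiles`, which is an assumption and may not match the S-PLUS default used in the study. The function name and data are hypothetical.

```python
import statistics


def semi_interquartile_range(values):
    """Half the distance between the 75th and 25th percentiles."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return (q3 - q1) / 2.0


# Hypothetical marker values, for illustration only (not thesis data).
print(semi_interquartile_range([1, 2, 3, 4, 5]))
```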


Figure 5.4: Scatter plot of ALCAM (y-axis) distribution by tumor grade (x-axis) of the primary breast carcinoma cases examined. The solid horizontal line indicates the median value for each of the groups. [Groups on the x-axis: normal female, normal male, grade 1, grade 2, grade 3; y-axis: ALCAM (µg/L).]


5.3.5 Association of biomarkers with breast cancer

The distributions of the 3 markers, as measured by immunoassays, in cases

and controls, are shown in Figure 5.5. Distributions of the patients with breast cancer

differed from controls (female or male) for ALCAM, but to a lesser degree for the

other two markers. The median values of males and females were similar for all 3

markers. When comparing the ALCAM values between normal women (n = 100) and

patients with breast cancer (n = 150) by the non-parametric Mann-Whitney test

(two-tailed), the medians were significantly different (median normals = 60 µg/L; median

cancer = 74 µg/L; P<0.0001). For CA 15-3, the medians were significantly different

(median normals = 15 units/mL; median cancer = 21 units/mL; P<0.0001). Lastly for

CEA, the medians were different (median normals = 1.3 µg/L; median cancer = 1.9

µg/L; P = 0.0003). The association of the markers with cancer was further

considered with linear regression models of logarithm-transformed marker values as

a function of clinical status (cancer vs non-cancer; females only) and age. Adjusting

for age, the mean levels of log(CA15-3) and log(ALCAM) were significantly higher in

cancer; levels of log(CEA) did not differ between cancer and controls.

We also considered logistic regression models to further characterize the

associations between markers and breast cancer, adjusting for age. Similar to the

results from linear regression, we found that two individual markers, CA 15-3

(OR=1.12, 95% CI [1.04,1.19]) and ALCAM (OR=1.42, 95% CI [1.14,1.77])

univariately predicted breast cancer, but this was not the case for CEA (OR=0.99,

95% CI [0.95,1.05]), whose 95% confidence interval included 1.00. In a logistic

regression model, which included age and all three markers, we found that CA15-3


and ALCAM independently predicted breast cancer. Results from the logistic

regression models are given in Table 5.3.
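
The link between a logistic regression coefficient and the odds ratios reported in Table 5.3 can be illustrated as follows. The sketch assumes Wald-type intervals (coefficient plus or minus 1.96 standard errors on the log-odds scale); the thesis does not state how its intervals were computed, so this is an assumption, and the function names are hypothetical. The input is the univariate ALCAM row of Table 5.3.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval


def odds_ratio_ci(beta, se, z=Z_95):
    """Odds ratio and Wald interval from a coefficient and its standard error."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def implied_beta_se(odds_ratio, ci_low, ci_high, z=Z_95):
    """Back out the coefficient and standard error implied by a reported OR and CI."""
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z)
    return beta, se


# ALCAM, univariate model, as reported in Table 5.3: OR 1.42 (1.14, 1.77).
beta, se = implied_beta_se(1.42, 1.14, 1.77)
or_value, ci_low, ci_high = odds_ratio_ci(beta, se)
print(round(beta, 3), round(se, 3))
print(round(or_value, 2), round(ci_low, 2), round(ci_high, 2))
```

The round trip reproduces the reported interval to two decimals, which indicates that the interval is symmetric on the log-odds scale, as a Wald interval would be.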


Figure 5.5: Distribution of markers A) ALCAM, B) CA 15-3 and C) CEA in the three groups (normal female, normal male and breast carcinoma) examined by an immunoassay specific to the molecule. The solid horizontal line indicates the median value for each of the groups. The dotted horizontal line indicates the cut-off values to discriminate cancer from control subjects A) ALCAM: 78 µg/L, 95% specificity cut-off; B) CA 15-3: 30 U/mL and C) CEA: 5 ng/mL. [Y-axes: A) ALCAM (µg/L); B) CA 15-3 (U/mL); C) CEA (µg/L).]


Table 5.3: Results from logistic regression models

            Univariate*               Multivariate**
Marker      OR      95% CI            OR      95% CI
CA15-3      1.12    (1.04,1.19)       1.09    (1.02,1.18)
CEA         0.99    (0.95,1.05)       0.94    (0.89,1.00)
ALCAM       1.42    (1.14,1.77)       1.39    (1.09,1.78)

*: logistic model with logarithm of the marker and age as predictors.
**: logistic model with logarithm of all three markers and age as predictors.
OR, odds ratio; CI, confidence interval.


5.3.6 The diagnostic values of the three markers

ROC curve analysis (Figure 5.6) was used to quantify the diagnostic value of

the three markers. All three markers have AUC significantly better than 0.5, with

ALCAM having the best performance (AUC=0.78, 95% CI [0.73,0.84]). The

superiority of ALCAM over the other two markers was also evident when we

considered sensitivities at fixed values of 90% and 80% specificities, respectively

(Table 5.4). For example, at specificity of 80%, ALCAM yielded a sensitivity of 60%,

compared with 48% for CA15-3. Likewise, at 90% specificity, ALCAM displayed

higher sensitivity than CA 15-3 and CEA. The incremental values of AUC for ALCAM

over that for CA15-3 are statistically significant (Delong test, p <0.05). Combining

CA15-3 and ALCAM, based on the linear predictors from a logistic regression model,

yielded a ROC curve with an AUC of 0.81 (bootstrap 95% CI [0.75, 0.87]).

Combining CA15-3, ALCAM and CEA did not result in any improvement in ROC

curves compared to CA15-3 and ALCAM. Re-sampling methods which aimed to

adjust for over-fitting232 did not yield substantially different results.
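
Sensitivity at a fixed specificity, as reported in Table 5.4, can be computed by first choosing the cut-off from the control distribution and then counting the cases above it. The sketch below is illustrative only: the rule for picking the cut-off (the smallest observed control value at or below which the requested fraction of controls falls) is an assumption, the function names are hypothetical, and the data are invented.

```python
import math


def cutoff_for_specificity(controls, specificity):
    """Smallest control value with at least `specificity` of controls at or below it."""
    ordered = sorted(controls)
    # The small offset guards against floating-point round-up in the product.
    k = max(math.ceil(specificity * len(ordered) - 1e-9), 1)
    return ordered[k - 1]


def sensitivity_at_specificity(cases, controls, specificity):
    """Fraction of cases whose marker value exceeds the chosen cut-off."""
    cutoff = cutoff_for_specificity(controls, specificity)
    return sum(v > cutoff for v in cases) / len(cases)


# Hypothetical marker values, for illustration only (not thesis data).
control_values = list(range(1, 11))  # 1, 2, ..., 10
case_values = [8, 9, 10, 11, 12]
for spec in (0.90, 0.80):
    print(spec, sensitivity_at_specificity(case_values, control_values, spec))
```

Lowering the required specificity moves the cut-off down and can only keep or raise the sensitivity, which is the pattern seen across the 90% and 80% columns of Table 5.4.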


Figure 5.6: ROC curves for the three markers (CA 15-3, CEA, ALCAM). [Axes: true positive fraction versus false positive fraction. Legend: CA15-3, AUC = 0.70 (0.63, 0.76); CEA, AUC = 0.63 (0.56, 0.70); ALCAM, AUC = 0.78 (0.73, 0.84).]


Table 5.4: ROC analysis for biomarkers

                                      Sensitivity at              Sensitivity at
                                      90% specificity             80% specificity
Marker       AUC     95% CI           Estimate   95% CI           Estimate   95% CI
CA15-3       0.70    (0.64,0.76)      0.32       (0.19,0.44)      0.48       (0.32,0.63)
CEA          0.63    (0.56,0.70)      0.22       (0.12,0.31)      0.32       (0.22,0.41)
ALCAM        0.78    (0.73,0.84)      0.47       (0.38,0.57)      0.60       (0.48,0.73)
Combined*    0.81    (0.75,0.87)      0.52       (0.39,0.64)      0.67       (0.54,0.80)

Combined*: linear combination of CA15-3 and ALCAM
The incremental values of AUC for ALCAM over that for CA15-3 are statistically significant (Delong test, p <0.05)


5.4 Discussion

In general, epithelial cells are less mobile than cells of mesenchymal origin.

Consequently, development of malignant carcinomas that arise from normal

epithelia (such as in breast cancer) affects the intercellular adhesion system233. Most

primary cancers show loss of expression of adhesion molecules to allow for a critical

step in metastasis to occur: detachment of the invading cell from its neighbors 234,235.

Nevertheless, while the long-standing hypothesis that loss of adhesion molecules

permits invasion is predominantly true235,236, studies have shown that loss of

adhesion itself is not sufficient to cause invasiveness237. Specifically, there are

examples of poorly differentiated, non-adhesive carcinomas with unchanged

amounts of E-cadherin (adhesion molecule expressed in epithelial cells), indicating

that the invasive phenotype in these tumors is not due to reduced expression of the

molecule but rather, to interference with its cell adhesive function233.

Therefore, a number of potential reasons exist for observing elevated levels

of adhesion molecules such as ALCAM in cancer patients versus normal individuals.

First, increased homotypic intercellular adhesion (due to elevated levels of these

molecules) may favor the metastatic process since cell aggregates, rather than

single cells breaking away from the primary tumor, have a greater chance of survival

in the circulation and of lodging in other organs238. Second, it is known that cell

adhesion is necessary for the metastatic spread of cancer cells to new organs

(secondary tumor establishment)235. As well, overproduction of adhesion molecules

may disrupt the normally operative intercellular adhesion forces, allowing more cell

movement and the adoption of a less ordered tissue architecture239. As an


illustration, a substance that has been studied extensively as a marker for breast

cancer is CEA. CEA is a member of the immunoglobulin supergene family and is

expressed in a large variety of secretory tissues240,241. Interestingly, expression of

CEA is increased in colon carcinomas and it may be important to processes of

intercellular recognition242,243. It has been suggested that this might either result in

disturbance of normal intercellular adhesion or provide advantages in further steps

of metastasis239 such as conceivably facilitating establishment of a secondary

tumor233,235. These factors may be true for ALCAM but further studies are needed to

confirm these speculations.

Given that ALCAM is a transmembrane protein with an extracellular domain, it

is very likely that membrane shedding may lead to elevated levels of ALCAM in

circulation. MMP-2, a metalloproteinase involved in degrading cell-cell connections,

has been shown to be elevated in serum of breast cancer patients and its levels

correlate with poor prognosis244,245. It is possible that ALCAM is a substrate for

MMP-2. If so, an increase in MMP-2 or other proteases

(such as the kallikreins) may result in increased shedding of ALCAM into the

circulation. Certainly, further studies are warranted to decipher the mechanisms by

which ALCAM is elevated in breast cancer.

ALCAM expression has been explored in a number of different tumor types

displaying a clear upregulation in some tumors and downregulation in others. In

addition, variable levels of ALCAM expression have been found at different stages of

tumor development in the same type of malignancies. In melanoma, ALCAM has

been suggested to exhibit a role in melanoma cell invasion and neoplastic


progression246. In prostate carcinoma, the ALCAM gene was found to be upregulated in high

Gleason grade prostate cancers compared to benign prostatic hyperplasia cases247.

However, one study observed an upregulation of ALCAM in low-grade tumors and a

downregulation in high-grade prostatic tumors248. Yet, another study on prostate

cancer found ALCAM to predict prostate-specific antigen (PSA) relapse226. In colon

cancer, using IHC, no significant correlation with patient age, tumor grade, stage or

nodal status and ALCAM expression was observed, but membranous ALCAM

expression correlated significantly with shortened patient survival228.

There have been a few studies investigating ALCAM expression in breast

cancer. Low levels of ALCAM mRNA correlated with nodal involvement, high grade

and worse prognosis249. In fact, low levels of ALCAM transcripts in the primary

breast tumor correlated with skeletal metastases and poor prognosis250. At the

protein level, laser scanning cytometry and confocal microscopy showed that high

levels of ALCAM correlated with small tumor diameter, low grade and the presence

of hormone receptors, which supported the view that this adhesion molecule is a

tumor suppressor with prognostic significance 221. However, an IHC analysis showed

that high cytoplasmic ALCAM expression was associated with shortened patient

disease-free survival215. Yet a further study found that ALCAM-ALCAM interactions

between breast cancer cells were important for survival in the primary tumor and that

a loss of ALCAM was associated with programmed cell death251. Finally, Ihnen et al

discovered that patients with high ALCAM mRNA expression who did not receive

chemotherapy tended to have a worse prognosis, suggesting that high ALCAM

expression levels may be a marker for prediction of the response to adjuvant


chemotherapy in breast cancer252. Indeed, the discordant data between RNA and

protein levels of ALCAM in breast cancer and even discordance among different

protein expression studies suggest the need for additional research to evaluate the

role of ALCAM in breast cancer.

The finding of decreased levels of ALCAM in breast cancer tissue compared

to normal breast tissue is not contradictory to our results of elevated levels of

ALCAM in serum of breast cancer patients. It is possible that ALCAM levels

decrease in tissue but are elevated in serum. For example, although PSA gene

transcription is down-regulated in prostate cancer, PSA protein levels in the

circulation of prostate cancer patients increase due to disruption of the anatomic

barriers between the glandular lumen and capillaries. Healthy men display serum

PSA range of 0.5-2 µg/L (low levels of PSA enter the circulation by diffusing through

a number of anatomic barriers). In early stage prostate cancer, PSA levels in the

serum rise to 4-10 µg/L (due to destruction of tissue architecture). In late stage

prostate cancer, due to invasion of tumor cells, considerable amounts of PSA leak

into the bloodstream (PSA levels typically range from 10 to 1000 µg/L). In addition,

our finding of increased ALCAM levels in cancer versus healthy controls is in line with

data about ALCAM expression in prostate and colon cancer and melanoma. It

should also be noted that we are the first to report presence of ALCAM in serum of

breast cancer patients. Until now, all studies regarding ALCAM expression have

been performed at the transcript level or using IHC or confocal microscopy. In this

study, we developed a robust and highly sensitive immunoassay to measure

ALCAM in biological fluids.


It is generally agreed upon that no single cancer biomarker will provide all

necessary information for optimal cancer diagnosis. The current trend is to focus on

the identification of multiple biomarkers that can be used in combination. The

present data provide evidence that serum ALCAM represents a novel biomarker for

breast cancer. This biomarker displays higher diagnostic sensitivity for breast cancer

than the currently used tumor markers CA 15-3 and CEA (Table 5.4). Moreover, as a

result of the weak correlation between ALCAM and CA 15-3 (Table 5.1), there

are patients with normal levels of CA 15-3 (< 30 U/mL) who have elevated ALCAM

levels. In fact, among the 120/150 cancer patients examined who displayed normal

levels of CA 15-3, 48 of them (40%) had elevated levels of ALCAM (values of 78

µg/L or greater; the cut-off for 95% specificity). For this reason, combining CA 15-3

with ALCAM measurements should increase the diagnostic sensitivity beyond that of

either marker alone. As well, assuming a 95%

specificity, the statistical power of our study (n ≥ 100 for both controls and cases)

will allow the detection of a 20% difference between mean values of ALCAM levels

in breast cancer patients and controls (data not shown). The difference between the

ALCAM means in this study was >20%, within the power of our study.
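
The complementarity argument above rests on simple counts, which can be made explicit. The sketch below uses only the figures quoted in the text (150 cases, 120 with normal CA 15-3, 48 of those with elevated ALCAM); the "either marker elevated" percentage it derives is not reported in the thesis and is shown purely as arithmetic. The function name is hypothetical.

```python
def either_marker_positive(n_cases, n_first_normal, n_second_pos_among_normal):
    """Counts of cases flagged by the first marker or, failing that, the second."""
    n_first_pos = n_cases - n_first_normal
    n_either = n_first_pos + n_second_pos_among_normal
    return n_first_pos, n_either, n_either / n_cases


# Figures quoted in the text: 120 of 150 cases had normal CA 15-3 (< 30 U/mL),
# and 48 of those 120 had ALCAM at or above the 78 µg/L cut-off.
ca153_pos, either_pos, either_fraction = either_marker_positive(150, 120, 48)
print(48 / 120)         # share of CA 15-3-normal cases with elevated ALCAM
print(ca153_pos)        # cases with elevated CA 15-3
print(either_pos, either_fraction)
```

A rule that calls a subject positive when either marker is elevated will also flag more controls, so its specificity is lower than that of either marker alone; that trade-off is not quantified here.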

Although a statistically significant correlation between ALCAM expression and

tumor grade in breast cancer was not obtained, a general trend towards elevated

ALCAM levels with increasing grade was observed (Figure 5.4). We hypothesize

that the upregulation of ALCAM is an early event in malignant cell transformation in

breast cancer. In addition, a correlation of elevated ALCAM levels with increasing

age was observed in primary breast carcinoma patients (Table 5.2). However, there


was no correlation between ALCAM levels and age in normal women (Figure 5.3).

This suggests that the difference in age between cases and controls is not a

confounding factor in this study. Furthermore, CEA was a weak marker in this study

and this is consistent with its performance in the clinic for breast cancer.

In conclusion, we show evidence that serum ALCAM concentration

represents a novel biomarker for breast carcinoma, which has potential utility as a

diagnostic tool. The combination of ALCAM with CA 15-3 improved the diagnostic

sensitivity. The availability of a reliable immunoassay, such as the one developed in

this study, for measuring serum ALCAM may facilitate further studies to establish the

clinical usefulness of this marker and to clarify the biological roles of ALCAM in

breast cancer.


CHAPTER 6:

SUMMARY AND FUTURE DIRECTIONS


6.1 Summary

A proteomic platform, consisting of discovery, verification and validation

phases, was utilized in this thesis to identify breast cancer biomarkers. Overall, this

thesis has provided initial insights into potential serological breast cancer biomarkers.

Below is a summary of the key findings of this study.

Key Findings

1. Discovery Phase:

a. Two-dimensional liquid chromatography-tandem mass spectrometry

(2D-LC-MS/MS) strategy was utilized to sample the secretome of 3

breast cell lines, MCF-10A, BT474 and MDA-MB-468.

b. Over 1,100 proteins in the conditioned media of all 3 cell lines

combined were identified.

c. Label-free quantification, biological function, cellular localization and

other bioinformatic and data analyses were performed.

d. Various filtering criteria were applied to narrow the list of potential

biomarkers from 400 extracellular and membrane proteins.

e. The top 11 candidates were selected to enter the second phase of the

proteomic platform for biomarker discovery.

2. Verification Phase:

a. Pooled serum samples, other biological fluids and tissue lysates were

used to examine the levels of the following candidates for their ability to

discriminate between breast cancer patients and controls using an

immunoassay:

II. Elafin

III. Kallikreins 5, 6 and 10

IV. Cystatin C, lipocalin-2 and transforming growth factor beta-2

(secreted proteins)

V. ALCAM, BCAM, NrCAM and fractalkine (membrane

proteins; in-house developed ELISA)

b. ALCAM demonstrated the greatest potential to discriminate

between normal and breast cancer serum samples

3. Validation Phase:

a. Examined levels of ALCAM in 300 serum samples with known

clinicopathological parameters

b. ALCAM, with area under the curve (AUC) of 0.78 [95% CI: 0.73,

0.84] outperformed CA15-3 (AUC= 0.70 [95% CI: 0.64, 0.76]) and

CEA (AUC= 0.63 [95% CI: 0.56, 0.70]).

c. 40% of breast cancer patients with normal levels of CA 15-3

displayed elevated ALCAM levels

d. Serum ALCAM concentrations appear to be a new biomarker for

breast cancer and may have value for disease diagnosis


6.2 Future Directions

The experimental data provided in this thesis have led to the identification

of a potential serological breast cancer biomarker, activated leukocyte cell

adhesion molecule (ALCAM). The availability of a reliable immunoassay, such

as the one developed in this study, for measuring serum ALCAM may facilitate

further studies to establish the clinical usefulness of this marker and to clarify the

biological roles of ALCAM in breast cancer. Examining the levels of ALCAM in

other serum samples such as those obtained from patients pre- and post-surgery

as well as serial serum samples collected from patients undergoing therapy may

be beneficial in evaluating the biomarker potential of ALCAM. Indeed, ALCAM

may be a predictive or prognostic marker for breast cancer, rather than a

diagnostic or screening marker for the general population. Also, ALCAM may be

useful only for a sub-population of patients. A larger sample set may facilitate

identifying this sub-population.

Moreover, the growing consensus given the heterogeneity of breast

cancer is that no single cancer biomarker will provide all necessary information

for optimal cancer diagnosis. The current trend is to focus on the identification of

multiple biomarkers that can be used in combination. Hence, ALCAM in

combination with some of the other potential candidates discovered in this study,

which have not yet been evaluated, can be examined. As well, for many of the

potential candidates discovered, reagents were not available to develop an

immunoassay to measure their levels in biological fluids. A highly sensitive and

high-throughput verification platform is desperately needed to bridge the gap


between discovery and validation phases of candidate biomarkers. In addition to

its important role as a discovery platform, the use of MS as a quantification tool is

gaining popularity. It has been suggested that multiple reaction monitoring

(MRM) assays will enable fast, high-throughput verification of candidate

biomarkers in blood. Therefore, for many of the potential candidate molecules

discovered in phase 1 of this study, a quantitative technique based on mass

spectrometry may be developed to measure their levels in serum.

With respect to the verification phase, it was discovered that ALCAM and

BCAM levels were elevated in prostate cancer compared to healthy male serum

samples. The elucidation of whether ALCAM and/or BCAM are potential

diagnostic markers for prostate cancer will involve performing a validation study,

such as the one presented in this thesis, utilizing a large sample set with known

clinical parameters. In addition, brief examination of lipocalin-2 serum levels in

various pancreatic cancer patients revealed that this protein may be elevated in a

sub-population of resected pancreatic ductal adenocarcinoma. Further

investigation into this observation via examining a much larger sample set may

yield a potential pancreatic cancer tumor marker.

Lastly, the work presented in this thesis demonstrated that levels of ALCAM,

a transmembrane protein, were elevated in serum. Studies examining the

mechanism behind ALCAM shedding will be important. A detailed assessment of

the activity of various proteases in cleaving ALCAM, along with an examination

of the biological roles of ALCAM in breast cancer pathogenesis, may provide

important information.

REFERENCES

1. Esteva,F.J. & Hortobagyi,G.N. Adjuvant systemic therapy for primary breast cancer. Surg. Clin. North Am. 79, 1075-1090 (1999).

2. Bray,F., McCarron,P., & Parkin,D.M. The changing global patterns of female breast cancer incidence and mortality. Breast Cancer Res. 6, 229-239 (2004).

3. Mincey,B.A. Genetics and the management of women at high risk for breast cancer. Oncologist. 8, 466-473 (2003).

4. Weir,H.K., Thun,M.J., Hankey,B.F., Ries,L.A., Howe,H.L., Wingo,P.A., Jemal,A., Ward,E., Anderson,R.N., & Edwards,B.K. Annual report to the nation on the status of cancer, 1975-2000, featuring the uses of surveillance data for cancer prevention and control. J. Natl. Cancer Inst. 95, 1276-1299 (2003).

5. Peto,R., Boreham,J., Clarke,M., Davies,C., & Beral,V. UK and USA breast cancer deaths down 25% in year 2000 at ages 20-69 years. Lancet 355, 1822 (2000).

6. Jemal,A., Siegel,R., Ward,E., Hao,Y., Xu,J., Murray,T., & Thun,M.J. Cancer statistics, 2008. CA Cancer J. Clin. 58, 71-96 (2008).

7. Hansen,R.K. & Bissell,M.J. Tissue architecture and breast cancer: the role of extracellular matrix and steroid hormones. Endocr. Relat Cancer 7, 95-113 (2000).

8. Venkitaraman,A.R. Breast cancer genes and DNA repair. Science 286, 1100-1102 (1999).

9. Ford,D., Easton,D.F., Stratton,M., Narod,S., Goldgar,D., Devilee,P., Bishop,D.T., Weber,B., Lenoir,G., Chang-Claude,J., Sobol,H., Teare,M.D., Struewing,J., Arason,A., Scherneck,S., Peto,J., Rebbeck,T.R., Tonin,P., Neuhausen,S., Barkardottir,R., Eyfjord,J., Lynch,H., Ponder,B.A., Gayther,S.A., Zelada-Hedman,M., et al. Genetic heterogeneity and penetrance analysis of the BRCA1 and BRCA2 genes in breast cancer families. The Breast Cancer Linkage Consortium. Am. J. Hum. Genet. 62, 676-689 (1998).

10. Tamoxifen for early breast cancer: an overview of the randomised trials. Early Breast Cancer Trialists' Collaborative Group. Lancet 351, 1451-1467 (1998).

11. Bergh,J. & Holmquist,M. Who should not receive adjuvant chemotherapy? International databases. J. Natl. Cancer Inst. Monogr 103-108 (2001).

12. Esserman,L.J., Shieh,Y., Park,J.W., & Ozanne,E.M. A role for biomarkers in the screening and diagnosis of breast cancer in younger women. Expert. Rev. Mol. Diagn. 7, 533-544 (2007).

13. Shapiro,S., Coleman,E.A., Broeders,M., Codd,M., de Koning,H., Fracheboud,J., Moss,S., Paci,E., Stachenko,S., & Ballard-Barbash,R. Breast cancer screening programmes in 22 countries: current policies, administration and guidelines. International Breast Cancer Screening Network (IBSN) and the European Network of Pilot Projects for Breast Cancer Screening. Int. J. Epidemiol. 27, 735-742 (1998).

14. Boring,C.C., Squires,T.S., & Tong,T. Cancer statistics, 1991. CA Cancer J. Clin. 41, 19-36 (1991).

15. Yang,J.H., Slack,N.H., & Nemoto,T. Effect of axillary nodal status on the long-term survival following mastectomy for breast carcinoma: nodal metastases may not always suggest systemic disease. J. Surg. Oncol. 36, 243-248 (1987).

16. Leitch,A.M. Controversies in breast cancer screening. Cancer 76, 2064-2069 (1995).

17. Esserman,L., Cowley,H., Eberle,C., Kirkpatrick,A., Chang,S., Berbaum,K., & Gale,A. Improving the accuracy of mammography: volume and outcome relationships. J. Natl. Cancer Inst. 94, 369-375 (2002).

18. Mincey,B.A. & Perez,E.A. Advances in screening, diagnosis, and treatment of breast cancer. Mayo Clin. Proc. 79, 810-816 (2004).

19. Antman,K. & Shea,S. Screening mammography under age 50. JAMA 281, 1470-1472 (1999).

20. Elmore,J.G., Barton,M.B., Moceri,V.M., Polk,S., Arena,P.J., & Fletcher,S.W. Ten-year risk of false positive screening mammograms and clinical breast examinations. N. Engl. J. Med. 338, 1089-1096 (1998).

21. Swift,M. Ionizing radiation, breast cancer, and ataxia-telangiectasia. J. Natl. Cancer Inst. 86, 1571-1572 (1994).

22. Sharan,S.K., Morimatsu,M., Albrecht,U., Lim,D.S., Regel,E., Dinh,C., Sands,A., Eichele,G., Hasty,P., & Bradley,A. Embryonic lethality and radiation hypersensitivity mediated by Rad51 in mice lacking Brca2. Nature 386, 804-810 (1997).

23. Smith,R.A., Cokkinides,V., & Eyre,H.J. American Cancer Society guidelines for the early detection of cancer, 2006. CA Cancer J. Clin. 56, 11-25 (2006).

24. Etzioni,R., Urban,N., Ramsey,S., McIntosh,M., Schwartz,S., Reid,B., Radich,J., Anderson,G., & Hartwell,L. The case for early detection. Nat. Rev. Cancer 3, 243-252 (2003).

25. Hayes,D.F., Bast,R.C., Desch,C.E., Fritsche,H., Jr., Kemeny,N.E., Jessup,J.M., Locker,G.Y., Macdonald,J.S., Mennel,R.G., Norton,L., Ravdin,P., Taube,S., & Winn,R.J. Tumor marker utility grading system: a framework to evaluate clinical utility of tumor markers. J. Natl. Cancer Inst. 88, 1456-1466 (1996).

26. Elston,C.W., Ellis,I.O., & Pinder,S.E. Pathological prognostic factors in breast cancer. Crit Rev. Oncol. Hematol. 31, 209-223 (1999).

27. Moore,D.H., Kabat,E.A., & Gutman,A.B. Bence-Jones proteinemia in multiple myeloma. J. Clin. Invest 22, 67-75 (1943).

28. Abelev,G.I., Perova,S.D., Khramkova,N.I., Postnikova,Z.A., & Irlin,I.S. Production of embryonal alpha-globulin by transplantable mouse hepatomas. Transplantation 1, 174-180 (1963).

29. Gold,P. & Freedman,S.O. Specific carcinoembryonic antigens of the human digestive system. J. Exp. Med. 122, 467-481 (1965).

30. Bast,R.C., Jr., Feeney,M., Lazarus,H., Nadler,L.M., Colvin,R.B., & Knapp,R.C. Reactivity of a monoclonal antibody with human ovarian carcinoma. J. Clin. Invest 68, 1331-1337 (1981).

31. Papsidero,L.D., Wang,M.C., Valenzuela,L.A., Murphy,G.P., & Chu,T.M. A prostate antigen in sera of prostatic cancer patients. Cancer Res. 40, 2428-2432 (1980).

32. Diamandis,E.P., Fritsche,H.A., Lilja,H., Chan,D.W., & Schwartz,M.K. Tumor Markers: Physiology, Pathobiology, Technology, and Clinical Applications. (AACC Press, Washington, DC.; 2002).

33. Seregni,E., Coli,A., & Mazzucca,N. Circulating tumour markers in breast cancer. Eur. J. Nucl. Med. Mol. Imaging 31 Suppl 1, S15-S22 (2004).

34. Harris,L., Fritsche,H., Mennel,R., Norton,L., Ravdin,P., Taube,S., Somerfield,M.R., Hayes,D.F., & Bast,R.C., Jr. American Society of Clinical Oncology 2007 update of recommendations for the use of tumor markers in breast cancer. J. Clin. Oncol. 25, 5287-5312 (2007).

35. Lumachi,F. & Basso,S.M. Serum tumor markers in patients with breast cancer. Expert. Rev. Anticancer Ther. 4, 921-931 (2004).

36. Khatcheressian,J.L., Wolff,A.C., Smith,T.J., Grunfeld,E., Muss,H.B., Vogel,V.G., Halberg,F., Somerfield,M.R., & Davidson,N.E. American Society of Clinical Oncology 2006 update of the breast cancer follow-up and management guidelines in the adjuvant setting. J. Clin. Oncol. 24, 5091-5097 (2006).

37. Hilkens,J., Buijs,F., Hilgers,J., Hageman,P., Calafat,J., Sonnenberg,A., & van der Valk,M. Monoclonal antibodies against human milk-fat globule membranes detecting differentiation antigens of the mammary gland and its tumors. Int. J. Cancer 34, 197-206 (1984).

38. Kufe,D., Inghirami,G., Abe,M., Hayes,D., Justi-Wheeler,H., & Schlom,J. Differential reactivity of a novel monoclonal antibody (DF3) with human malignant versus benign breast tumors. Hybridoma 3, 223-232 (1984).

39. Reddish,M.A., Jackson,L., Koganty,R.R., Qiu,D., Hong,W., & Longenecker,B.M. Specificities of anti-sialyl-Tn and anti-Tn monoclonal antibodies generated using novel clustered synthetic glycopeptide epitopes. Glycoconj. J. 14, 549-560 (1997).

40. Mensdorff-Pouilly,S., Snijdewint,F.G., Verstraeten,A.A., Verheijen,R.H., & Kenemans,P. Human MUC1 mucin: a multifaceted glycoprotein. Int. J. Biol. Markers 15, 343-356 (2000).

41. Yin,L., Li,Y., Ren,J., Kuwahara,H., & Kufe,D. Human MUC1 carcinoma antigen regulates intracellular oxidant levels and the apoptotic response to oxidative stress. J. Biol. Chem. 278, 35458-35464 (2003).

42. Schroeder,J.A., Thompson,M.C., Gardner,M.M., & Gendler,S.J. Transgenic MUC1 interacts with epidermal growth factor receptor and correlates with mitogen-activated protein kinase activation in the mouse mammary gland. J. Biol. Chem. 276, 13057-13064 (2001).

43. Cheung,K.L., Graves,C.R., & Robertson,J.F. Tumour marker measurements in the diagnosis and monitoring of breast cancer. Cancer Treat. Rev. 26, 91-102 (2000).

44. Nicolini,A. & Carpi,A. Postoperative follow-up of breast cancer patients: overview and progress in the use of tumor markers. Tumour. Biol. 21, 235-248 (2000).

45. Colomer,R., Ruibal,A., Genolla,J., Rubio,D., Del Campo,J.M., Bodi,R., & Salvador,L. Circulating CA 15-3 levels in the postsurgical follow-up of breast cancer patients and in non-malignant diseases. Breast Cancer Res. Treat. 13, 123-133 (1989).

46. Duffy,M.J., Shering,S., Sherry,F., McDermott,E., & O'Higgins,N. CA 15-3: a prognostic marker in breast cancer. Int. J. Biol. Markers 15, 330-333 (2000).

47. Ebeling,F.G., Stieber,P., Untch,M., Nagel,D., Konecny,G.E., Schmitt,U.M., Fateh-Moghadam,A., & Seidel,D. Serum CEA and CA 15-3 as prognostic factors in primary breast cancer. Br. J. Cancer 86, 1217-1222 (2002).

48. Duffy,M.J. CA 15-3 and related mucins as circulating markers in breast cancer. Ann. Clin. Biochem. 36 (Pt 5), 579-586 (1999).

49. Soletormos,G., Schioler,V., Nielsen,D., Skovsgaard,T., & Dombernowsky,P. Interpretation of results for tumor markers on the basis of analytical imprecision and biological variation. Clin. Chem. 39, 2077-2083 (1993).

50. Yasasever,V., Dincer,M., Camlica,H., Karaloglu,D., & Dalay,N. Utility of CA 15-3 and CEA in monitoring breast cancer patients with bone metastases: special emphasis on "spiking" phenomena. Clin. Biochem. 30, 53-56 (1997).

51. Berling,B., Kolbinger,F., Grunert,F., Thompson,J.A., Brombacher,F., Buchegger,F., von Kleist,S., & Zimmermann,W. Cloning of a carcinoembryonic antigen gene family member expressed in leukocytes of chronic myeloid leukemia patients and bone marrow. Cancer Res. 50, 6534-6539 (1990).

52. Hayes,D.F., Zurawski,V.R., Jr., & Kufe,D.W. Comparison of circulating CA15-3 and carcinoembryonic antigen levels in patients with breast cancer. J. Clin. Oncol. 4, 1542-1550 (1986).

53. Pegram,M.D., Pauletti,G., & Slamon,D.J. HER-2/neu as a predictive marker of response to breast cancer therapy. Breast Cancer Res. Treat. 52, 65-77 (1998).

54. Mehta,R.R., McDermott,J.H., Hieken,T.J., Marler,K.C., Patel,M.K., Wild,L.D., & Das Gupta,T.K. Plasma c-erbB-2 levels in breast cancer patients: prognostic significance in predicting response to chemotherapy. J. Clin. Oncol. 16, 2409-2416 (1998).

55. Colomer,R., Ruibal,A., & Salvador,L. Circulating tumor marker levels in advanced breast carcinoma correlate with the extent of metastatic disease. Cancer 64, 1674-1681 (1989).

56. Anderson,N.L. & Anderson,N.G. The human plasma proteome: history, character, and diagnostic prospects. Mol. Cell Proteomics. 1, 845-867 (2002).

57. Tomlins,S.A., Rhodes,D.R., Perner,S., Dhanasekaran,S.M., Mehra,R., Sun,X.W., Varambally,S., Cao,X., Tchinda,J., Kuefer,R., Lee,C., Montie,J.E., Shah,R.B., Pienta,K.J., Rubin,M.A., & Chinnaiyan,A.M. Recurrent fusion of TMPRSS2 and ETS transcription factor genes in prostate cancer. Science 310, 644-648 (2005).

58. Ono,K., Tanaka,T., Tsunoda,T., Kitahara,O., Kihara,C., Okamoto,A., Ochiai,K., Takagi,T., & Nakamura,Y. Identification by cDNA microarray of genes involved in ovarian carcinogenesis. Cancer Res. 60, 5007-5011 (2000).

59. Welsh,J.B., Zarrinkar,P.P., Sapinoso,L.M., Kern,S.G., Behling,C.A., Monk,B.J., Lockhart,D.J., Burger,R.A., & Hampton,G.M. Analysis of gene expression profiles in normal and neoplastic ovarian tissue samples identifies candidate molecular markers of epithelial ovarian cancer. Proc. Natl. Acad. Sci. U. S. A 98, 1176-1181 (2001).

60. Hellstrom,I., Raycraft,J., Hayden-Ledbetter,M., Ledbetter,J.A., Schummer,M., McIntosh,M., Drescher,C., Urban,N., & Hellstrom,K.E. The HE4 (WFDC2) protein is a biomarker for ovarian carcinoma. Cancer Res. 63, 3695-3700 (2003).

61. Galgano,M.T., Hampton,G.M., & Frierson,H.F., Jr. Comprehensive analysis of HE4 expression in normal and malignant human tissues. Mod. Pathol. 19, 847-853 (2006).

62. Jarjanazi,H., Savas,S., Pabalan,N., Dennis,J.W., & Ozcelik,H. Biological implications of SNPs in signal peptide domains of human proteins. Proteins (2007).

63. Abelev,G.I. & Eraiser,T.L. Cellular aspects of alpha-fetoprotein reexpression in tumors. Semin. Cancer Biol. 9, 95-107 (1999).

64. Slamon,D.J., Clark,G.M., Wong,S.G., Levin,W.J., Ullrich,A., & McGuire,W.L. Human breast cancer: correlation of relapse and survival with amplification of the HER-2/neu oncogene. Science 235, 177-182 (1987).

65. Shak,S. Overview of the trastuzumab (Herceptin) anti-HER2 monoclonal antibody clinical program in HER2-overexpressing metastatic breast cancer. Herceptin Multinational Investigator Study Group. Semin. Oncol. 26, 71-77 (1999).

66. Molina,R., Jo,J., Filella,X., Zanon,G., Pahisa,J., Munoz,M., Farrus,B., Latre,M.L., Gimenez,N., Hage,M., Estape,J., & Ballesta,A.M. C-erbB-2 oncoprotein in the sera and tissue of patients with breast cancer. Utility in prognosis. Anticancer Res. 16, 2295-2300 (1996).

67. Stacker,S.A., Achen,M.G., Jussila,L., Baldwin,M.E., & Alitalo,K. Lymphangiogenesis and cancer metastasis. Nat. Rev. Cancer 2, 573-583 (2002).

68. Quackenbush,J. Microarray analysis and tumor classification. N. Engl. J. Med. 354, 2463-2472 (2006).

69. Eisen,M.B., Spellman,P.T., Brown,P.O., & Botstein,D. Cluster analysis and display of genome-wide expression patterns. Proc. Natl. Acad. Sci. U. S. A 95, 14863-14868 (1998).

70. Golub,T.R., Slonim,D.K., Tamayo,P., Huard,C., Gaasenbeek,M., Mesirov,J.P., Coller,H., Loh,M.L., Downing,J.R., Caligiuri,M.A., Bloomfield,C.D., & Lander,E.S. Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science 286, 531-537 (1999).

71. Perou,C.M., Sorlie,T., Eisen,M.B., van de Rijn,M., Jeffrey,S.S., Rees,C.A., Pollack,J.R., Ross,D.T., Johnsen,H., Akslen,L.A., Fluge,O., Pergamenschikov,A., Williams,C., Zhu,S.X., Lonning,P.E., Borresen-Dale,A.L., Brown,P.O., & Botstein,D. Molecular portraits of human breast tumours. Nature 406, 747-752 (2000).

72. Alizadeh,A.A., Ross,D.T., Perou,C.M., & van de Rijn,M. Towards a novel classification of human malignancies based on gene expression patterns. J. Pathol. 195, 41-52 (2001).

73. Weigelt,B., Hu,Z., He,X., Livasy,C., Carey,L.A., Ewend,M.G., Glas,A.M., Perou,C.M., & van't Veer,L.J. Molecular portraits and 70-gene prognosis signature are preserved throughout the metastatic process of breast cancer. Cancer Res. 65, 9155-9158 (2005).

74. Alizadeh,A.A., Eisen,M.B., Davis,R.E., Ma,C., Lossos,I.S., Rosenwald,A., Boldrick,J.C., Sabet,H., Tran,T., Yu,X., Powell,J.I., Yang,L., Marti,G.E., Moore,T., Hudson,J., Jr., Lu,L., Lewis,D.B., Tibshirani,R., Sherlock,G., Chan,W.C., Greiner,T.C., Weisenburger,D.D., Armitage,J.O., Warnke,R., Levy,R., Wilson,W., Grever,M.R., Byrd,J.C., Botstein,D., Brown,P.O., & Staudt,L.M. Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling. Nature 403, 503-511 (2000).

75. Rosenwald,A., Wright,G., Chan,W.C., Connors,J.M., Campo,E., Fisher,R.I., Gascoyne,R.D., Muller-Hermelink,H.K., Smeland,E.B., Giltnane,J.M., Hurt,E.M., Zhao,H., Averett,L., Yang,L., Wilson,W.H., Jaffe,E.S., Simon,R., Klausner,R.D., Powell,J., Duffey,P.L., Longo,D.L., Greiner,T.C., Weisenburger,D.D., Sanger,W.G., Dave,B.J., Lynch,J.C., Vose,J., Armitage,J.O., Montserrat,E., Lopez-Guillermo,A., Grogan,T.M., Miller,T.P., LeBlanc,M., Ott,G., Kvaloy,S., Delabie,J., Holte,H., Krajci,P., Stokke,T., & Staudt,L.M. The use of molecular profiling to predict survival after chemotherapy for diffuse large-B-cell lymphoma. N. Engl. J. Med. 346, 1937-1947 (2002).

76. Pomeroy,S.L., Tamayo,P., Gaasenbeek,M., Sturla,L.M., Angelo,M., McLaughlin,M.E., Kim,J.Y., Goumnerova,L.C., Black,P.M., Lau,C., Allen,J.C., Zagzag,D., Olson,J.M., Curran,T., Wetmore,C., Biegel,J.A., Poggio,T., Mukherjee,S., Rifkin,R., Califano,A., Stolovitzky,G., Louis,D.N., Mesirov,J.P., Lander,E.S., & Golub,T.R. Prediction of central nervous system embryonal tumour outcome based on gene expression. Nature 415, 436-442 (2002).

77. Iizuka,N., Hamamoto,Y., & Oka,M. Predicting individual outcomes in hepatocellular carcinoma. Lancet 364, 1837-1839 (2004).

78. Chen,H.Y., Yu,S.L., Chen,C.H., Chang,G.C., Chen,C.Y., Yuan,A., Cheng,C.L., Wang,C.H., Terng,H.J., Kao,S.F., Chan,W.K., Li,H.N., Liu,C.C., Singh,S., Chen,W.J., Chen,J.J., & Yang,P.C. A five-gene signature and clinical outcome in non-small-cell lung cancer. N. Engl. J. Med. 356, 11-20 (2007).

79. van de Vijver,M.J., He,Y.D., van't Veer,L.J., Dai,H., Hart,A.A., Voskuil,D.W., Schreiber,G.J., Peterse,J.L., Roberts,C., Marton,M.J., Parrish,M., Atsma,D., Witteveen,A., Glas,A., Delahaye,L., van der Velde,T., Bartelink,H., Rodenhuis,S., Rutgers,E.T., Friend,S.H., & Bernards,R. A gene-expression signature as a predictor of survival in breast cancer. N. Engl. J. Med. 347, 1999-2009 (2002).

80. Paik,S., Shak,S., Tang,G., Kim,C., Baker,J., Cronin,M., Baehner,F.L., Walker,M.G., Watson,D., Park,T., Hiller,W., Fisher,E.R., Wickerham,D.L., Bryant,J., & Wolmark,N. A multigene assay to predict recurrence of tamoxifen-treated, node-negative breast cancer. N. Engl. J. Med. 351, 2817-2826 (2004).

81. Pollack,J.R. A perspective on DNA microarrays in pathology research and practice. Am. J. Pathol. 171, 375-385 (2007).

82. Michiels,S., Koscielny,S., & Hill,C. Prediction of cancer outcome with microarrays: a multiple random validation strategy. Lancet 365, 488-492 (2005).

83. Ioannidis,J.P. Microarrays and molecular research: noise discovery? Lancet 365, 454-455 (2005).

84. Diamandis,E.P., Schmitt,M., & van der Merwe,D. National Academy of Clinical Biochemistry Guidelines: The Use of Microarrays in Cancer Diagnostics. American Association for Clinical Chemistry (2006).

85. Domon,B. & Aebersold,R. Mass spectrometry and protein analysis. Science 312, 212-217 (2006).

86. Wulfkuhle,J.D., Paweletz,C.P., Steeg,P.S., Petricoin,E.F., III, & Liotta,L. Proteomic approaches to the diagnosis, treatment, and monitoring of cancer. Adv. Exp. Med. Biol. 532, 59-68 (2003).

87. Petricoin,E.F., Ardekani,A.M., Hitt,B.A., Levine,P.J., Fusaro,V.A., Steinberg,S.M., Mills,G.B., Simone,C., Fishman,D.A., Kohn,E.C., & Liotta,L.A. Use of proteomic patterns in serum to identify ovarian cancer. Lancet 359, 572-577 (2002).

88. Li,J., Zhang,Z., Rosenzweig,J., Wang,Y.Y., & Chan,D.W. Proteomics and bioinformatics approaches for identification of serum biomarkers to detect breast cancer. Clin. Chem. 48, 1296-1304 (2002).

89. Petricoin,E.F., III, Ornstein,D.K., Paweletz,C.P., Ardekani,A., Hackett,P.S., Hitt,B.A., Velassco,A., Trucco,C., Wiegand,L., Wood,K., Simone,C.B., Levine,P.J., Linehan,W.M., Emmert-Buck,M.R., Steinberg,S.M., Kohn,E.C., & Liotta,L.A. Serum proteomic patterns for detection of prostate cancer. J. Natl. Cancer Inst. 94, 1576-1578 (2002).

90. Chen,Y.D., Zheng,S., Yu,J.K., & Hu,X. Artificial neural networks analysis of surface-enhanced laser desorption/ionization mass spectra of serum protein pattern distinguishes colorectal cancer from healthy population. Clin. Cancer Res. 10, 8380-8385 (2004).

91. Paradis,V., Degos,F., Dargere,D., Pham,N., Belghiti,J., Degott,C., Janeau,J.L., Bezeaud,A., Delforge,D., Cubizolles,M., Laurendeau,I., & Bedossa,P. Identification of a new marker of hepatocellular carcinoma by serum protein profiling of patients with chronic liver diseases. Hepatology 41, 40-47 (2005).

92. Tolson,J., Bogumil,R., Brunst,E., Beck,H., Elsner,R., Humeny,A., Kratzin,H., Deeg,M., Kuczyk,M., Mueller,G.A., Mueller,C.A., & Flad,T. Serum protein profiling by SELDI mass spectrometry: detection of multiple variants of serum amyloid alpha in renal cancer patients. Lab Invest 84, 845-856 (2004).

93. Rosty,C., Christa,L., Kuzdzal,S., Baldwin,W.M., Zahurak,M.L., Carnot,F., Chan,D.W., Canto,M., Lillemoe,K.D., Cameron,J.L., Yeo,C.J., Hruban,R.H., & Goggins,M. Identification of hepatocarcinoma-intestine-pancreas/pancreatitis-associated protein I as a biomarker for pancreatic ductal adenocarcinoma by protein biochip technology. Cancer Res. 62, 1868-1875 (2002).

94. Wadsworth,J.T., Somers,K.D., Stack,B.C., Jr., Cazares,L., Malik,G., Adam,B.L., Wright,G.L., Jr., & Semmes,O.J. Identification of patients with head and neck cancer using serum protein profiles. Arch. Otolaryngol. Head Neck Surg. 130, 98-104 (2004).

95. Diamandis,E.P. Point: Proteomic patterns in biological fluids: do they represent the future of cancer diagnostics? Clin. Chem. 49, 1272-1275 (2003).

96. Karsan,A., Eigl,B.J., Flibotte,S., Gelmon,K., Switzer,P., Hassell,P., Harrison,D., Law,J., Hayes,M., Stillwell,M., Xiao,Z., Conrads,T.P., & Veenstra,T. Analytical and preanalytical biases in serum proteomic pattern analysis for breast cancer diagnosis. Clin. Chem. 51, 1525-1528 (2005).

97. Banks,R.E., Stanley,A.J., Cairns,D.A., Barrett,J.H., Clarke,P., Thompson,D., & Selby,P.J. Influences of blood sample processing on low-molecular-weight proteome identified by surface-enhanced laser desorption/ionization mass spectrometry. Clin. Chem. 51, 1637-1649 (2005).

98. Ransohoff,D.F. Lessons from controversy: ovarian cancer screening and serum proteomics. J. Natl. Cancer Inst. 97, 315-319 (2005).

99. Baggerly,K.A., Morris,J.S., Edmonson,S.R., & Coombes,K.R. Signal in noise: evaluating reported reproducibility of serum proteomic tests for ovarian cancer. J. Natl. Cancer Inst. 97, 307-309 (2005).

100. Chan,D.W., Semmes,O.J., Petricoin,E., Liotta,L., van der Merwe,D., & Diamandis,E.P. National Academy of Clinical Biochemistry Guidelines: The Use of MALDI-TOF Mass Spectrometry Profiling to Diagnose Cancer. American Association for Clinical Chemistry (2006).

101. Lopez,M.F., Mikulskis,A., Kuzdzal,S., Bennett,D.A., Kelly,J., Golenko,E., DiCesare,J., Denoyer,E., Patton,W.F., Ediger,R., Sapp,L., Ziegert,T., Lynch,C., Kramer,S., Whiteley,G.R., Wall,M.R., Mannion,D.P., Della,C.G., Rakitan,J.S., & Wolfe,G.M. High-resolution serum proteomic profiling of Alzheimer disease samples reveals disease-specific, carrier-protein-bound mass signatures. Clin. Chem. 51, 1946-1954 (2005).

102. Liotta,L.A., Ferrari,M., & Petricoin,E. Clinical proteomics: written in blood. Nature 425, 905 (2003).

103. Tirumalai,R.S., Chan,K.C., Prieto,D.A., Issaq,H.J., Conrads,T.P., & Veenstra,T.D. Characterization of the low molecular weight human serum proteome. Mol. Cell Proteomics. 2, 1096-1103 (2003).

104. Harper,R.G., Workman,S.R., Schuetzner,S., Timperman,A.T., & Sutton,J.N. Low-molecular-weight human serum proteome using ultrafiltration, isoelectric focusing, and mass spectrometry. Electrophoresis 25, 1299-1306 (2004).

105. Rai,D.K., Green,B.N., Landin,B., Alvelius,G., & Griffiths,W.J. Accurate mass measurement and tandem mass spectrometry of intact globin chains identify the low proportion variant hemoglobin Lepore-Boston-Washington from the blood of a heterozygote. J. Mass Spectrom. 39, 289-294 (2004).

106. Villanueva,J., Shaffer,D.R., Philip,J., Chaparro,C.A., Erdjument-Bromage,H., Olshen,A.B., Fleisher,M., Lilja,H., Brogi,E., Boyd,J., Sanchez-Carbayo,M., Holland,E.C., Cordon-Cardo,C., Scher,H.I., & Tempst,P. Differential exoprotease activities confer tumor-specific serum peptidome patterns. J. Clin. Invest 116, 271-284 (2006).

107. Lopez,M.F., Mikulskis,A., Kuzdzal,S., Golenko,E., Petricoin,E.F., III, Liotta,L.A., Patton,W.F., Whiteley,G.R., Rosenblatt,K., Gurnani,P., Nandi,A., Neill,S., Cullen,S., O'Gorman,M., Sarracino,D., Lynch,C., Johnson,A., Mckenzie,W., & Fishman,D. A novel, high-throughput workflow for discovery and identification of serum carrier protein-bound peptide biomarker candidates in ovarian cancer samples. Clin. Chem. 53, 1067-1074 (2007).

108. Koomen,J.M., Li,D., Xiao,L.C., Liu,T.C., Coombes,K.R., Abbruzzese,J., & Kobayashi,R. Direct tandem mass spectrometry reveals limitations in protein profiling experiments for plasma biomarker discovery. J. Proteome Res. 4, 972-981 (2005).

109. Diamandis,E.P. Peptidomics for cancer diagnosis: present and future. J. Proteome Res. 5, 2079-2082 (2006).

110. Borgono,C.A. & Diamandis,E.P. The emerging roles of human tissue kallikreins in cancer. Nat. Rev. Cancer 4, 876-890 (2004).

111. Rittenhouse,H.G., Finlay,J.A., Mikolajczyk,S.D., & Partin,A.W. Human Kallikrein 2 (hK2) and prostate-specific antigen (PSA): two closely related, but distinct, kallikreins in the prostate. Crit Rev. Clin. Lab Sci. 35, 275-368 (1998).

112. Diamandis,E.P., Scorilas,A., Fracchioli,S., Van Gramberen,M., De Bruijn,H., Henrik,A., Soosaipillai,A., Grass,L., Yousef,G.M., Stenman,U.H., Massobrio,M., Van Der Zee,A.G., Vergote,I., & Katsaros,D. Human kallikrein 6 (hK6): a new potential serum biomarker for diagnosis and prognosis of ovarian carcinoma. J. Clin. Oncol. 21, 1035-1043 (2003).

113. Liotta,L.A. & Kohn,E.C. The microenvironment of the tumour-host interface. Nature 411, 375-379 (2001).

114. Jung,Y.D., Ahmad,S.A., Liu,W., Reinmuth,N., Parikh,A., Stoeltzing,O., Fan,F., & Ellis,L.M. The role of the microenvironment and intercellular cross-talk in tumor angiogenesis. Semin. Cancer Biol. 12, 105-112 (2002).

115. Celis,J.E., Gromov,P., Cabezon,T., Moreira,J.M., Ambartsumian,N., Sandelin,K., Rank,F., & Gromova,I. Proteomic characterization of the interstitial fluid perfusing the breast tumor microenvironment: a novel resource for biomarker and therapeutic target discovery. Mol. Cell Proteomics. 3, 327-344 (2004).

116. Wang,X., Yu,J., Sreekumar,A., Varambally,S., Shen,R., Giacherio,D., Mehra,R., Montie,J.E., Pienta,K.J., Sanda,M.G., Kantoff,P.W., Rubin,M.A., Wei,J.T., Ghosh,D., & Chinnaiyan,A.M. Autoantibody signatures in prostate cancer. N. Engl. J. Med. 353, 1224-1235 (2005).

117. Nowell,P.C. & Hungerford,D.A. Chromosome studies on normal and leukemic human leukocytes. J. Natl. Cancer Inst. 25, 85-109 (1960).

118. Caprioli,R.M. Deciphering protein molecular signatures in cancer tissues to aid in diagnosis, prognosis, and therapy. Cancer Res. 65, 10642-10645 (2005).

119. Yanagisawa,K., Shyr,Y., Xu,B.J., Massion,P.P., Larsen,P.H., White,B.C., Roberts,J.R., Edgerton,M., Gonzalez,A., Nadaf,S., Moore,J.H., Caprioli,R.M., & Carbone,D.P. Proteomic patterns of tumour subsets in non-small-cell lung cancer. Lancet 362, 433-439 (2003).

120. Faca,V., Pitteri,S.J., Newcomb,L., Glukhova,V., Phanstiel,D., Krasnoselsky,A., Zhang,Q., Struthers,J., Wang,H., Eng,J., Fitzgibbon,M., McIntosh,M., & Hanash,S. Contribution of protein fractionation to depth of analysis of the serum and plasma proteomes. J. Proteome Res. 6, 3558-3565 (2007).

121. Kuick,R., Misek,D.E., Monsma,D.J., Webb,C.P., Wang,H., Peterson,K.J., Pisano,M., Omenn,G.S., & Hanash,S.M. Discovery of cancer biomarkers through the use of mouse models. Cancer Lett. 249, 40-48 (2007).

122. Whiteaker,J.R., Zhang,H., Lei,Z., Wang,P., Kelly-Spratt,K.S., Ivey,R.G., Piening,B.D., Feng,L.C., Kasarda,E., Gurley,K.E., Eng,J.K., Chodosh,L.A., Kemp,C.J., McIntosh,M.W., & Paulovich,A.G. Integrated Pipeline for Mass Spectrometry-Based Discovery and Confirmation of Biomarkers Demonstrated in a Mouse Model of Breast Cancer. J. Proteome Res. (2007).

123. Glish,G.L. & Vachet,R.W. The basics of mass spectrometry in the twenty-first century. Nat. Rev. Drug Discov. 2, 140-150 (2003).

124. Steen,H. & Mann,M. The ABC's (and XYZ's) of peptide sequencing. Nat. Rev. Mol. Cell Biol. 5, 699-711 (2004).

125. Marcotte,E.M. How do shotgun proteomics algorithms identify proteins? Nat. Biotechnol. 25, 755-757 (2007).

126. Wright,G.L., Jr. Two-dimensional acrylamide gel electrophoresis of cancer-patient serum proteins. Ann. Clin. Lab Sci. 4, 281-293 (1974).

127. Rifai,N., Gillette,M.A., & Carr,S.A. Protein biomarker discovery and validation: the long and uncertain path to clinical utility. Nat. Biotechnol. 24, 971-983 (2006).

128. Bertucci,F., Birnbaum,D., & Goncalves,A. Proteomics of breast cancer: principles and potential clinical applications. Mol. Cell Proteomics. (2006).

129. Bose,R., Molina,H., Patterson,A.S., Bitok,J.K., Periaswamy,B., Bader,J.S., Pandey,A., & Cole,P.A. Phosphoproteomic analysis of Her2/neu signaling and inhibition. Proc. Natl. Acad. Sci. U. S. A 103, 9773-9778 (2006).

130. Vargo-Gogola,T. & Rosen,J.M. Modelling breast cancer: one size does not fit all. Nat. Rev. Cancer 7, 659-672 (2007).

131. Pitteri,S.J. & Hanash,S.M. Proteomic approaches for cancer biomarker discovery in plasma. Expert. Rev. Proteomics. 4, 589-590 (2007).

132. Shao,Z.M. & Nguyen,M. Nipple aspiration in diagnosis of breast cancer. Semin. Surg. Oncol. 20, 175-180 (2001).

133. Wrensch,M.R., Petrakis,N.L., Gruenke,L.D., Ernster,V.L., Miike,R., King,E.B., & Hauck,W.W. Factors associated with obtaining nipple aspirate fluid: analysis of 1428 women and literature review. Breast Cancer Res. Treat. 15, 39-51 (1990).

134. Alexander,H., Stegner,A.L., Wagner-Mann,C., Du Bois,G.C., Alexander,S., & Sauter,E.R. Proteomic analysis to identify breast cancer biomarkers in nipple aspirate fluid. Clin. Cancer Res. 10, 7500-7510 (2004).

135. Pawlik,T.M., Hawke,D.H., Liu,Y., Krishnamurthy,S., Fritsche,H., Hunt,K.K., & Kuerer,H.M. Proteomic analysis of nipple aspirate fluid from women with early-stage breast cancer using isotope-coded affinity tags and tandem mass spectrometry reveals differential expression of vitamin D binding protein. BMC. Cancer 6, 68 (2006).

136. Varnum,S.M., Covington,C.C., Woodbury,R.L., Petritis,K., Kangas,L.J., Abdullah,M.S., Pounds,J.G., Smith,R.D., & Zangar,R.C. Proteomic characterization of nipple aspirate fluid: identification of potential biomarkers of breast cancer. Breast Cancer Res. Treat. 80, 87-97 (2003).

137. Alldridge,L., Metodieva,G., Greenwood,C., Al Janabi,K., Thwaites,L., Sauven,P., & Metodiev,M. Proteome Profiling of Breast Tumors by Gel Electrophoresis and Nanoscale Electrospray Ionization Mass Spectrometry. J. Proteome Res. (2008).

138. Hondermarck,H., Tastet,C., Yazidi-Belkoura,I., Toillon,R.A., & Le Bourhis,X. Proteomics of Breast Cancer: The Quest for Markers and Therapeutic Targets. J. Proteome Res. (2008).

139. Hondermarck,H. Breast cancer: when proteomics challenges biological complexity. Mol. Cell Proteomics. 2, 281-291 (2003).

140. Pitteri,S.J., Faca,V.M., Kelly-Spratt,K.S., Kasarda,A.E., Wang,H., Zhang,Q., Newcomb,L., Krasnoselesky,A., Paczesny,S., Choi,G., Fitzgibbon,M., McIntosh,M.W., Kemp,C.J., & Hanash,S.M. Plasma Proteome Profiling of a Mouse Model of Breast Cancer Identifies a Set of Up-Regulated Proteins in Common with Human Breast Cancer Cells. J. Proteome Res. (2008).

141. Nandi,S., Guzman,R.C., & Yang,J. Hormones and mammary carcinogenesis in mice, rats, and humans: a unifying hypothesis. Proc. Natl. Acad. Sci. U. S. A 92, 3650-3657 (1995).

142. Chen,S.T., Pan,T.L., Juan,H.F., Chen,T.Y., Lin,Y.S., & Huang,C.M. Breast Tumor Microenvironment: Proteomics Highlights the Treatments Targeting Secretome. J. Proteome Res. (2008).

143. Hathout,Y. Approaches to the study of the cell secretome. Expert. Rev. Proteomics. 4, 239-248 (2007).

144. Lacroix,M. & Leclercq,G. Relevance of breast cancer cell lines as models for breast tumours: an update. Breast Cancer Res. Treat. 83, 249-289 (2004).

145. Martin,D.B., Gifford,D.R., Wright,M.E., Keller,A., Yi,E., Goodlett,D.R., Aebersold,R., & Nelson,P.S. Quantitative proteomic analysis of proteins released by neoplastic prostate epithelium. Cancer Res. 64, 347-355 (2004).

146. Mbeunkui,F., Fodstad,O., & Pannell,L.K. Secretory protein enrichment and analysis: an optimized approach applied on cancer cell lines using 2D LC-MS/MS. J. Proteome Res. 5, 899-906 (2006).

147. Canelle,L., Bousquet,J., Pionneau,C., Hardouin,J., Choquet-Kastylevsky,G., Joubert-Caron,R., & Caron,M. A proteomic approach to investigate potential biomarkers directed against membrane-associated breast cancer proteins. Electrophoresis 27, 1609-1616 (2006).

148. Xiang,R., Shi,Y., Dillon,D.A., Negin,B., Horvath,C., & Wilkins,J.A. 2D LC/MS analysis of membrane proteins from breast cancer cell lines MCF7 and BT474. J. Proteome Res. 3, 1278-1283 (2004).

149. Adam,P.J., Boyd,R., Tyson,K.L., Fletcher,G.C., Stamps,A., Hudson,L., Poyser,H.R., Redpath,N., Griffiths,M., Steers,G., Harris,A.L., Patel,S., Berry,J., Loader,J.A., Townsend,R.R., Daviet,L., Legrain,P., Parekh,R., & Terrett,J.A. Comprehensive proteomic analysis of breast cancer cell membranes reveals unique proteins with potential roles in clinical cancer. J. Biol. Chem. 278, 6482-6489 (2003).

150. Patwardhan,A.J., Strittmatter,E.F., Camp,D.G., Smith,R.D., & Pallavicini,M.G. Comparison of normal and breast cancer cell lines using proteome, genome, and interactome data. J. Proteome Res. 4, 1952-1960 (2005).

151. Charafe-Jauffret,E., Ginestier,C., Monville,F., Finetti,P., Adelaide,J., Cervera,N., Fekairi,S., Xerri,L., Jacquemier,J., Birnbaum,D., & Bertucci,F. Gene expression profiling of breast cell lines identifies potential new basal markers. Oncogene 25, 2273-2284 (2006).

152. Sorlie,T., Tibshirani,R., Parker,J., Hastie,T., Marron,J.S., Nobel,A., Deng,S., Johnsen,H., Pesich,R., Geisler,S., Demeter,J., Perou,C.M., Lonning,P.E., Brown,P.O., Borresen-Dale,A.L., & Botstein,D. Repeated observation of breast tumor subtypes in independent gene expression data sets. Proc. Natl. Acad. Sci. U. S. A 100, 8418-8423 (2003).

153. Sorlie,T., Perou,C.M., Tibshirani,R., Aas,T., Geisler,S., Johnsen,H., Hastie,T., Eisen,M.B., van de Rijn,M., Jeffrey,S.S., Thorsen,T., Quist,H., Matese,J.C., Brown,P.O., Botstein,D., Lonning,P.E., & Borresen-Dale,A.L. Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. Proc. Natl. Acad. Sci. U. S. A 98, 10869-10874 (2001).

154. Abd El-Rehim,D.M., Ball,G., Pinder,S.E., Rakha,E., Paish,C., Robertson,J.F., Macmillan,D., Blamey,R.W., & Ellis,I.O. High-throughput protein expression analysis using tissue microarray technology of a large well-characterised series identifies biologically distinct classes of breast cancer confirming recent cDNA expression analyses. Int. J. Cancer 116, 340-350 (2005).

155. Jacquemier,J., Ginestier,C., Rougemont,J., Bardou,V.J., Charafe-Jauffret,E., Geneix,J., Adelaide,J., Koki,A., Houvenaeghel,G., Hassoun,J., Maraninchi,D., Viens,P., Birnbaum,D., & Bertucci,F. Protein expression profiling identifies subclasses of breast cancer and predicts prognosis. Cancer Res. 65, 767-779 (2005).

156. Neve,R.M., Chin,K., Fridlyand,J., Yeh,J., Baehner,F.L., Fevr,T., Clark,L., Bayani,N., Coppe,J.P., Tong,F., Speed,T., Spellman,P.T., DeVries,S., Lapuk,A., Wang,N.J., Kuo,W.L., Stilwell,J.L., Pinkel,D., Albertson,D.G., Waldman,F.M., McCormick,F., Dickson,R.B., Johnson,M.D., Lippman,M., Ethier,S., Gazdar,A., & Gray,J.W. A collection of breast cancer cell lines for the study of functionally distinct cancer subtypes. Cancer Cell 10, 515-527 (2006).

157. Bast,R.C., Jr., Klug,T.L., St John,E., Jenison,E., Niloff,J.M., Lazarus,H., Berkowitz,R.S., Leavitt,T., Griffiths,C.T., Parker,L., Zurawski,V.R., Jr., & Knapp,R.C. A radioimmunoassay using a monoclonal antibody to monitor the course of epithelial ovarian cancer. N. Engl. J. Med. 309, 883-887 (1983).

158. Petricoin,E.F. & Liotta,L.A. Clinical applications of proteomics. J. Nutr. 133, 2476S-2484S (2003).

159. Antman,K. & Shea,S. Screening mammography under age 50. JAMA 281, 1470-1472 (1999).

160. Malatesta,M., Mannello,F., Bianchi,G., Sebastiani,M., & Gazzanelli,G. Biochemical and ultrastructural features of human milk and nipple aspirate fluids. J. Clin. Lab Anal. 14, 330-335 (2000).

161. Klein,P., Glaser,E., Grogan,L., Keane,M., Lipkowitz,S., Soballe,P., Brooks,L., Jenkins,J., Steinberg,S.M., DeMarini,D.M., & Kirsch,I. Biomarker assays in nipple aspirate fluid. Breast J. 7, 378-387 (2001).

162. Soule,H.D., Maloney,T.M., Wolman,S.R., Peterson,W.D., Jr., Brenz,R., McGrath,C.M., Russo,J., Pauley,R.J., Jones,R.F., & Brooks,S.C. Isolation and characterization of a spontaneously immortalized human breast epithelial cell line, MCF-10. Cancer Res. 50, 6075-6086 (1990).

163. Lasfargues,E.Y., Coutinho,W.G., & Redfield,E.S. Isolation of two human tumor epithelial cell lines from solid breast carcinomas. J. Natl. Cancer Inst. 61, 967-978 (1978).

164. Cailleau,R., Olive,M., & Cruciger,Q.V. Long-term human breast carcinoma cell lines of metastatic origin: preliminary characterization. In Vitro 14, 911-915 (1978).

165. She,Q.B., Solit,D., Basso,A., & Moasser,M.M. Resistance to gefitinib in PTEN-null HER-overexpressing tumor cells can be overcome through restoration of PTEN function or pharmacologic modulation of constitutive phosphatidylinositol 3'-kinase/Akt pathway signaling. Clin. Cancer Res. 9, 4340-4346 (2003).

166. Panigrahi,A.R., Pinder,S.E., Chan,S.Y., Paish,E.C., Robertson,J.F., & Ellis,I.O. The role of PTEN and its signalling pathways, including AKT, in breast cancer; an assessment of relationships with other prognostic factors and with outcome. J. Pathol. 204, 93-100 (2004).

167. Keller,A., Nesvizhskii,A.I., Kolker,E., & Aebersold,R. Empirical statistical model to estimate the accuracy of peptide identifications made by MS/MS and database search. Anal. Chem. 74, 5383-5392 (2002).

168. Nesvizhskii,A.I., Keller,A., Kolker,E., & Aebersold,R. A statistical model for identifying proteins by tandem mass spectrometry. Anal. Chem. 75, 4646-4658 (2003).

169. Luo,L.Y., Grass,L., Howarth,D.J., Thibault,P., Ong,H., & Diamandis,E.P. Immunofluorometric assay of human kallikrein 10 and its identification in biological fluids and tissues. Clin. Chem. 47, 237-246 (2001).

170. Yousef,G.M., Polymeris,M.E., Grass,L., Soosaipillai,A., Chan,P.C., Scorilas,A., Borgono,C., Harbeck,N., Schmalfeldt,B., Dorn,J., Schmitt,M., & Diamandis,E.P. Human kallikrein 5: a potential novel serum biomarker for breast and ovarian cancer. Cancer Res. 63, 3958-3965 (2003).

171. Kulasingam,V. & Diamandis,E.P. Proteomics analysis of conditioned media from three breast cancer cell lines: a mine for biomarkers and therapeutic targets. Mol. Cell Proteomics. 6, 1997-2011 (2007).

172. Elias,J.E., Haas,W., Faherty,B.K., & Gygi,S.P. Comparative evaluation of mass spectrometry platforms used in large-scale proteomics investigations. Nat. Methods 2, 667-675 (2005).

173. Pinkas-Kramarski,R., Alroy,I., & Yarden,Y. ErbB receptors and EGF-like ligands: cell lineage determination and oncogenesis through combinatorial signaling. J. Mammary. Gland. Biol. Neoplasia. 2, 97-107 (1997).

174. Searle,B.C., Brundege,J.M., & Turner,M. Improving Sensitivity by Combining Results from Multiple MS/MS Search Methodologies with the Scaffold Computer Algorithm. Abstract, Human Proteome Organization (HUPO) 4th Annual World Congress, Munich, Germany, 28 August 2005.

175. Domon,B. & Aebersold,R. Challenges and Opportunities in Proteomics Data Analysis. Mol. Cell Proteomics. 5, 1921-1926 (2006).

176. Zybailov,B., Mosley,A.L., Sardiu,M.E., Coleman,M.K., Florens,L., & Washburn,M.P. Statistical analysis of membrane proteome expression changes in Saccharomyces cerevisiae. J. Proteome Res. 5, 2339-2347 (2006).

177. Rappsilber,J., Ryder,U., Lamond,A.I., & Mann,M. Large-scale proteomic analysis of the human spliceosome. Genome Res. 12, 1231-1245 (2002).

178. Ishihama,Y., Oda,Y., Tabata,T., Sato,T., Nagasu,T., Rappsilber,J., & Mann,M. Exponentially modified protein abundance index (emPAI) for estimation of absolute protein amount in proteomics by the number of sequenced peptides per protein. Mol. Cell Proteomics. 4, 1265-1272 (2005).

179. Yousef,G.M., Scorilas,A., Kyriakopoulou,L.G., Rendl,L., Diamandis,M., Ponzone,R., Biglia,N., Giai,M., Roagna,R., Sismondi,P., & Diamandis,E.P. Human kallikrein gene 5 (KLK5) expression by quantitative PCR: an independent indicator of poor prognosis in breast cancer. Clin. Chem. 48, 1241-1250 (2002).

180. Luo,L.Y., Diamandis,E.P., Look,M.P., Soosaipillai,A.P., & Foekens,J.A. Higher expression of human kallikrein 10 in breast cancer tissue predicts tamoxifen resistance. Br. J. Cancer 86, 1790-1796 (2002).

181. Lipton,A., Ali,S.M., Leitzel,K., Demers,L., Chinchilli,V., Engle,L., Harvey,H.A., Brady,C., Nalin,C.M., Dugan,M., Carney,W., & Allard,J. Elevated serum Her-2/neu level predicts decreased response to hormone therapy in metastatic breast cancer. J. Clin. Oncol. 20, 1467-1472 (2002).

182. Esteva,F.J., Valero,V., Booser,D., Guerra,L.T., Murray,J.L., Pusztai,L., Cristofanilli,M., Arun,B., Esmaeli,B., Fritsche,H.A., Sneige,N., Smith,T.L., & Hortobagyi,G.N. Phase II study of weekly docetaxel and trastuzumab for patients with HER-2-overexpressing metastatic breast cancer. J. Clin. Oncol. 20, 1800-1808 (2002).

183. Jacobs,J.M., Mottaz,H.M., Yu,L.R., Anderson,D.J., Moore,R.J., Chen,W.N., Auberry,K.J., Strittmatter,E.F., Monroe,M.E., Thrall,B.D., Camp,D.G., & Smith,R.D. Multidimensional proteome analysis of human mammary epithelial cells. J. Proteome Res. 3, 68-75 (2004).

184. Sardana,G., Marshall,J., & Diamandis,E.P. Discovery of candidate tumor markers for prostate cancer via proteomic analysis of cell culture-conditioned medium. Clin. Chem. 53, 429-437 (2007).

185. Zolg,W. The proteomic search for diagnostic biomarkers: lost in translation? Mol. Cell Proteomics. 5, 1720-1726 (2006).

186. Anderson,L. & Hunter,C.L. Quantitative mass spectrometric multiple reaction monitoring assays for major plasma proteins. Mol. Cell Proteomics. 5, 573-588 (2006).

187. Anderson,N.L., Anderson,N.G., Haines,L.R., Hardie,D.B., Olafson,R.W., & Pearson,T.W. Mass spectrometric quantitation of peptides and proteins using Stable Isotope Standards and Capture by Anti-Peptide Antibodies (SISCAPA). J. Proteome Res. 3, 235-244 (2004).

188. Barnidge,D.R., Goodmanson,M.K., Klee,G.G., & Muddiman,D.C. Absolute quantification of the model biomarker prostate-specific antigen in serum by LC-MS/MS using protein cleavage and isotope dilution mass spectrometry. J. Proteome Res. 3, 644-652 (2004).

189. O'Brien,N., O'Donovan,N., Foley,D., Hill,A.D., McDermott,E., O'Higgins,N., & Duffy,M.J. Use of a Panel of Novel Genes for Differentiating Breast Cancer from Non-Breast Tissues. Tumour. Biol. 28, 312-317 (2008).

190. Minn,A.J., Gupta,G.P., Siegel,P.M., Bos,P.D., Shu,W., Giri,D.D., Viale,A., Olshen,A.B., Gerald,W.L., & Massague,J. Genes that mediate breast cancer metastasis to lung. Nature 436, 518-524 (2005).

191. Mbeunkui,F., Metge,B.J., Shevde,L.A., & Pannell,L.K. Identification of differentially secreted biomarkers using LC-MS/MS in isogenic cell lines representing a progression of breast cancer. J. Proteome Res. 6, 2993-3002 (2007).

192. Jacobs,J.M., Waters,K.M., Kathmann,L.E., Camp,D.G., II, Wiley,H.S., Smith,R.D., & Thrall,B.D. The Mammary Epithelial Cell Secretome and Its Regulation by Signal Transduction Pathways. J. Proteome Res. 7, 558-569 (2008).

193. Porter,D.A., Krop,I.E., Nasser,S., Sgroi,D., Kaelin,C.M., Marks,J.R., Riggins,G., & Polyak,K. A SAGE (serial analysis of gene expression) view of breast tumor progression. Cancer Res. 61, 5697-5702 (2001).

194. Dombkowski,A.A., Cukovic,D., & Novak,R.F. Secretome analysis of microarray data reveals extracellular events associated with proliferative potential in a cell line model of breast disease. Cancer Lett. 241, 49-58 (2006).

195. Sjoblom,T., Jones,S., Wood,L.D., Parsons,D.W., Lin,J., Barber,T.D., Mandelker,D., Leary,R.J., Ptak,J., Silliman,N., Szabo,S., Buckhaults,P., Farrell,C., Meeh,P., Markowitz,S.D., Willis,J., Dawson,D., Willson,J.K., Gazdar,A.F., Hartigan,J., Wu,L., Liu,C., Parmigiani,G., Park,B.H., Bachman,K.E., Papadopoulos,N., Vogelstein,B., Kinzler,K.W., & Velculescu,V.E. The consensus coding sequences of human breast and colorectal cancers. Science 314, 268-274 (2006).

196. Schalkwijk,J., Wiedow,O., & Hirose,S. The trappin gene family: proteins defined by an N-terminal transglutaminase substrate domain and a C-terminal four-disulphide core. Biochem. J. 340 (Pt 3), 569-577 (1999).

197. Liotta,L.A., Steeg,P.S., & Stetler-Stevenson,W.G. Cancer metastasis and angiogenesis: an imbalance of positive and negative regulation. Cell 64, 327-336 (1991).

198. Zhang,M., Zou,Z., Maass,N., & Sager,R. Differential expression of elafin in human normal mammary epithelial cells and carcinomas is regulated at the transcriptional level. Cancer Res. 55, 2537-2541 (1995).

199. Borgono,C.A. & Diamandis,E.P. The emerging roles of human tissue kallikreins in cancer. Nat. Rev. Cancer 4, 876-890 (2004).

200. Rittenhouse,H.G., Finlay,J.A., Mikolajczyk,S.D., & Partin,A.W. Human Kallikrein 2 (hK2) and prostate-specific antigen (PSA): two closely related, but distinct, kallikreins in the prostate. Crit Rev. Clin. Lab Sci. 35, 275-368 (1998).

201. Henskens,Y.M., Veerman,E.C., & Nieuw Amerongen,A.V. Cystatins in health and disease. Biol. Chem. Hoppe Seyler 377, 71-86 (1996).

202. Laterza,O.F., Price,C.P., & Scott,M.G. Cystatin C: an improved estimator of glomerular filtration rate? Clin. Chem. 48, 699-707 (2002).

203. Friedl,A., Stoesz,S.P., Buckley,P., & Gould,M.N. Neutrophil gelatinase-associated lipocalin in normal and neoplastic human tissues. Cell type-specific pattern of expression. Histochem. J. 31, 433-441 (1999).

204. Yang,J., Goetz,D., Li,J.Y., Wang,W., Mori,K., Setlik,D., Du,T., Erdjument-Bromage,H., Tempst,P., Strong,R., & Barasch,J. An iron delivery pathway mediated by a lipocalin. Mol. Cell 10, 1045-1056 (2002).

205. Devireddy,L.R., Teodoro,J.G., Richard,F.A., & Green,M.R. Induction of apoptosis by a secreted lipocalin that is transcriptionally regulated by IL-3 deprivation. Science 293, 829-834 (2001).

206. Yan,L., Borregaard,N., Kjeldsen,L., & Moses,M.A. The high molecular weight urinary matrix metalloproteinase (MMP) activity is a complex of gelatinase B/MMP-9 and neutrophil gelatinase-associated lipocalin (NGAL). Modulation of MMP-9 activity by NGAL. J. Biol. Chem. 276, 37258-37265 (2001).

207. Furutani,M., Arii,S., Mizumoto,M., Kato,M., & Imamura,M. Identification of a neutrophil gelatinase-associated lipocalin mRNA in human pancreatic cancers using a modified signal sequence trap method. Cancer Lett. 122, 209-214 (1998).

208. Fernandez,C.A., Yan,L., Louis,G., Yang,J., Kutok,J.L., & Moses,M.A. The matrix metalloproteinase-9/neutrophil gelatinase-associated lipocalin complex plays a role in breast tumor growth and is present in the urine of breast cancer patients. Clin. Cancer Res. 11, 5390-5395 (2005).

209. Bartsch,S. & Tschesche,H. Cloning and expression of human neutrophil lipocalin cDNA derived from bone marrow and ovarian cancer cells. FEBS Lett. 357, 255-259 (1995).

210. Hanai,J., Mammoto,T., Seth,P., Mori,K., Karumanchi,S.A., Barasch,J., & Sukhatme,V.P. Lipocalin 2 diminishes invasiveness and metastasis of Ras-transformed cells. J. Biol. Chem. 280, 13641-13647 (2005).

211. Lee,H.J., Lee,E.K., Lee,K.J., Hong,S.W., Yoon,Y., & Kim,J.S. Ectopic expression of neutrophil gelatinase-associated lipocalin suppresses the invasion and liver metastasis of colon cancer cells. Int. J. Cancer 118, 2490-2497 (2006).

212. Lawrence,D.A. Transforming growth factor-beta: a general review. Eur. Cytokine Netw. 7, 363-374 (1996).

213. Takeuchi,M., Alard,P., & Streilein,J.W. TGF-beta promotes immune deviation by altering accessory signals of antigen-presenting cells. J. Immunol. 160, 1589-1597 (1998).

214. Swart,G.W., Lunter,P.C., Kilsdonk,J.W., & Kempen,L.C. Activated leukocyte cell adhesion molecule (ALCAM/CD166): signaling at the divide of melanoma cell clustering and cell migration? Cancer Metastasis Rev. 24, 223-236 (2005).

215. Burkhardt,M., Mayordomo,E., Winzer,K.J., Fritzsche,F., Gansukh,T., Pahl,S., Weichert,W., Denkert,C., Guski,H., Dietel,M., & Kristiansen,G. Cytoplasmic overexpression of ALCAM is prognostic of disease progression in breast cancer. J. Clin. Pathol. 59, 403-409 (2006).

216. Kikkawa,Y. & Miner,J.H. Review: Lutheran/B-CAM: a laminin receptor on red blood cells and in various tissues. Connect. Tissue Res. 46, 193-199 (2005).

217. Luo,L.Y., Grass,L., Howarth,D.J., Thibault,P., Ong,H., & Diamandis,E.P. Immunofluorometric assay of human kallikrein 10 and its identification in biological fluids and tissues. Clin. Chem. 47, 237-246 (2001).

218. Christopoulos,T.K. & Diamandis,E.P. Enzymatically amplified time-resolved fluorescence immunoassay with terbium chelates. Anal. Chem. 64, 342-346 (1992).

219. Johanning,G.L. Modulation of breast cancer cell adhesion by unsaturated fatty acids. Nutrition 12, 810-816 (1996).

220. Ofori-Acquah,S.F. & King,J.A. Activated leukocyte cell adhesion molecule: a new paradox in cancer. Transl. Res. 151, 122-128 (2008).

221. Jezierska,A., Olszewski,W.P., Pietruszkiewicz,J., Olszewski,W., Matysiak,W., & Motyl,T. Activated Leukocyte Cell Adhesion Molecule (ALCAM) is associated with suppression of breast cancer cells invasion. Med. Sci. Monit. 12, BR245-BR256 (2006).

222. Denzinger,T., Diekmann,H., Bruns,K., Laessing,U., Stuermer,C.A., & Przybylski,M. Isolation, primary structure characterization and identification of the glycosylation pattern of recombinant goldfish neurolin, a neuronal cell adhesion protein. J. Mass Spectrom. 34, 435-446 (1999).

223. Swart,G.W. Activated leukocyte cell adhesion molecule (CD166/ALCAM): developmental and mechanistic aspects of cell clustering and cell migration. Eur. J. Cell Biol. 81, 313-321 (2002).

224. Bowen,M.A. & Aruffo,A. Adhesion molecules, their receptors, and their regulation: analysis of CD6-activated leukocyte cell adhesion molecule (ALCAM/CD166) interactions. Transplant. Proc. 31, 795-796 (1999).

225. van Kempen,L.C., Nelissen,J.M., Degen,W.G., Torensma,R., Weidle,U.H., Bloemers,H.P., Figdor,C.G., & Swart,G.W. Molecular basis for the homophilic activated leukocyte cell adhesion molecule (ALCAM)-ALCAM interaction. J. Biol. Chem. 276, 25783-25790 (2001).

226. Kristiansen,G., Pilarsky,C., Wissmann,C., Kaiser,S., Bruemmendorf,T., Roepcke,S., Dahl,E., Hinzmann,B., Specht,T., Pervan,J., Stephan,C., Loening,S., Dietel,M., & Rosenthal,A. Expression profiling of microdissected matched prostate cancer samples reveals CD166/MEMD and CD24 as new prognostic markers for patient survival. J. Pathol. 205, 359-376 (2005).

227. Zimmerman,A.W., Joosten,B., Torensma,R., Parnes,J.R., van Leeuwen,F.N., & Figdor,C.G. Long-term engagement of CD6 and ALCAM is essential for T-cell proliferation induced by dendritic cells. Blood 107, 3212-3220 (2006).

228. Weichert,W., Knosel,T., Bellach,J., Dietel,M., & Kristiansen,G. ALCAM/CD166 is overexpressed in colorectal carcinoma and correlates with shortened patient survival. J. Clin. Pathol. 57, 1160-1164 (2004).

229. Degen,W.G., van Kempen,L.C., Gijzen,E.G., van Groningen,J.J., van Kooyk,Y., Bloemers,H.P., & Swart,G.W. MEMD, a new cell adhesion molecule in metastasizing human melanoma cell lines, is identical to ALCAM (activated leukocyte cell adhesion molecule). Am. J. Pathol. 152, 805-813 (1998).

230. DeLong,E.R., DeLong,D.M., & Clarke-Pearson,D.L. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics 44, 837-845 (1988).

231. McIntosh,M.W. & Pepe,M.S. Combining several screening tests: optimality of the risk score. Biometrics 58, 657-664 (2002).

232. Zheng,Y., Katsaros,D., Shan,S.J., de la Longrais,I., Porpiglia,M., Scorilas,A., Kim,N.W., Wolfert,R.L., Simon,I., Li,L., Feng,Z., & Diamandis,E.P. A multiparametric panel for ovarian cancer diagnosis, prognosis, and response to chemotherapy. Clin. Cancer Res. 13, 6984-6992 (2007).

233. Behrens,J. The role of cell adhesion molecules in cancer invasion and metastasis. Breast Cancer Res. Treat. 24, 175-184 (1993).

234. Stoll,B.A. Biological mechanisms in breast cancer invasiveness: relevance to preventive interventions. Eur. J. Cancer Prev. 9, 73-79 (2000).

235. Sommers,C.L. The role of cadherin-mediated adhesion in breast cancer. J. Mammary. Gland. Biol. Neoplasia. 1, 219-229 (1996).

236. Li,G. & Herlyn,M. Dynamics of intercellular communication during melanoma development. Mol. Med. Today 6, 163-169 (2000).

237. Shimoyama,Y., Hirohashi,S., Hirano,S., Noguchi,M., Shimosato,Y., Takeichi,M., & Abe,O. Cadherin cell-adhesion molecules in human epithelial tissues and carcinomas. Cancer Res. 49, 2128-2133 (1989).

238. Updyke,T.V. & Nicolson,G.L. Malignant melanoma cell lines selected in vitro for increased homotypic adhesion properties have increased experimental metastatic potential. Clin. Exp. Metastasis 4, 273-284 (1986).

239. Benchimol,S., Fuks,A., Jothy,S., Beauchemin,N., Shirota,K., & Stanners,C.P. Carcinoembryonic antigen, a human tumor marker, functions as an intercellular adhesion molecule. Cell 57, 327-334 (1989).

240. Bormer,O.P. Immunoassays for carcinoembryonic antigen: specificity and interferences. Scand. J. Clin. Lab Invest 53, 1-9 (1993).

241. Nap,M., Hammarstrom,M.L., Bormer,O., Hammarstrom,S., Wagener,C., Handt,S., Schreyer,M., Mach,J.P., Buchegger,F., von Kleist,S., et al. Specificity and affinity of monoclonal antibodies against carcinoembryonic antigen. Cancer Res. 52, 2329-2339 (1992).

242. Soletormos,G., Nielsen,D., Schioler,V., Skovsgaard,T., & Dombernowsky,P. Tumor markers cancer antigen 15.3, carcinoembryonic antigen, and tissue polypeptide antigen for monitoring metastatic breast cancer during first-line chemotherapy and follow-up. Clin. Chem. 42, 564-575 (1996).

243. Hostetter,R.B., Augustus,L.B., Mankarious,R., Chi,K.F., Fan,D., Toth,C., Thomas,P., & Jessup,J.M. Carcinoembryonic antigen as a selective enhancer of colorectal cancer metastasis. J. Natl. Cancer Inst. 82, 380-385 (1990).

244. Vihinen,P., Ala-aho,R., & Kahari,V.M. Matrix metalloproteinases as therapeutic targets in cancer. Curr. Cancer Drug Targets. 5, 203-220 (2005).

245. Leppa,S., Saarto,T., Vehmanen,L., Blomqvist,C., & Elomaa,I. A high serum matrix metalloproteinase-2 level is associated with an adverse prognosis in node-positive breast carcinoma. Clin. Cancer Res. 10, 1057-1063 (2004).

246. van Kempen,L.C., van den Oord,J.J., van Muijen,G.N., Weidle,U.H., Bloemers,H.P., & Swart,G.W. Activated leukocyte cell adhesion molecule/CD166, a marker of tumor progression in primary malignant melanoma of the skin. Am. J. Pathol. 156, 769-774 (2000).

247. Stamey,T.A., Warrington,J.A., Caldwell,M.C., Chen,Z., Fan,Z., Mahadevappa,M., McNeal,J.E., Nolley,R., & Zhang,Z. Molecular genetic profiling of Gleason grade 4/5 prostate cancers compared to benign prostatic hyperplasia. J. Urol. 166, 2171-2177 (2001).

248. Kristiansen,G., Pilarsky,C., Wissmann,C., Stephan,C., Weissbach,L., Loy,V., Loening,S., Dietel,M., & Rosenthal,A. ALCAM/CD166 is up-regulated in low-grade prostate cancer and progressively lost in high-grade lesions. Prostate 54, 34-43 (2003).

249. King,J.A., Ofori-Acquah,S.F., Stevens,T., Al Mehdi,A.B., Fodstad,O., & Jiang,W.G. Activated leukocyte cell adhesion molecule in breast cancer: prognostic indicator. Breast Cancer Res. 6, R478-R487 (2004).

250. Davies,S.R., Dent,C., Watkins,G., King,J.A., Mokbel,K., & Jiang,W.G. Expression of the cell to cell adhesion molecule, ALCAM, in breast cancer patients and the potential link with skeletal metastasis. Oncol. Rep. 19, 555-561 (2008).

251. Jezierska,A., Matysiak,W., & Motyl,T. ALCAM/CD166 protects breast cancer cells against apoptosis and autophagy. Med. Sci. Monit. 12, BR263-BR273 (2006).

252. Ihnen,M., Muller,V., Wirtz,R.M., Schroder,C., Krenkel,S., Witzel,I., Lisboa,B.W., Janicke,F., & Milde-Langosch,K. Predictive impact of activated leukocyte cell adhesion molecule (ALCAM/CD166) in breast cancer. Breast Cancer Res. Treat. (2008).
