
Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology 896

Algorithms for Applied Digital Image Cytometry

BY

CAROLINA WÄHLBY

ACTA UNIVERSITATIS UPSALIENSIS
UPPSALA 2003

Dissertation presented at Uppsala University to be publicly examined in Häggsalen (room 10132), Ångström Laboratory, Uppsala, Friday, October 31, 2003 at 10:15 for the Degree of Doctor of Philosophy. The examination will be conducted in English.

Abstract
Wählby, C. 2003. Algorithms for Applied Digital Image Cytometry. Acta Universitatis Upsaliensis. Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology 896. 75 pp. Uppsala. ISBN 91-554-5759-2

Image analysis can provide genetic as well as protein-level information from fluorescence-stained fixed or living cells without losing tissue morphology. Analysis of the spatial, spectral, and temporal distribution of fluorescence can reveal important information on the single cell level. This is in contrast to most other methods for cell analysis, which do not account for inter-cellular variation. Flow cytometry enables single-cell analysis, but tissue morphology is lost in the process, and temporal events cannot be observed.

The need for reproducibility, speed and accuracy calls for computerized methods for cell image analysis, i.e., digital image cytometry, which is the topic of this thesis.

Algorithms for cell-based screening are presented and applied to evaluate the effect of insulin on translocation events in single cells. Algorithms of this type could be the basis for high-throughput drug screening systems, and they have been developed in close cooperation with the biomedical industry.

Image-based studies of cell cycle proteins in cultured cells and tissue sections show that cyclin A has a well preserved expression pattern while the expression pattern of cyclin E is disturbed in tumors. The results indicate that analysis of cyclin E expression provides additional valuable information for cancer prognosis, not visible by standard tumor grading techniques.

Complex chains of events and interactions can be visualized by simultaneous staining of different proteins involved in a process. A combination of image analysis and staining procedures that allow sequential staining and visualization of large numbers of different antigens in single cells is presented. Preliminary results show that at least six different antigens can be stained in the same set of cells.

All image cytometry requires robust segmentation techniques. Clustered objects, background variation, and internal intensity variations complicate the segmentation of cells in tissue. Algorithms for segmentation of 2D and 3D images of cell nuclei in tissue by combining intensity, shape, and gradient information are presented.

The algorithms and applications presented show that fast, robust, and automatic digital image cytometry can increase the throughput and power of image-based single cell analysis.

Carolina Wählby, Centre for Image Analysis, Uppsala University, Lägerhyddsv. 3, SE-752 37 Uppsala, Sweden

© Carolina Wählby 2003

ISBN 91-554-5759-2
ISSN 1104-232X
urn:nbn:se:uu:diva-3344 (http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-3608)

Printed in Sweden by Universitetstryckeriet, Uppsala 2003

To my little and my big family.


List of enclosed papers

The thesis is based on the following publications, which will be referred to in the text by their Roman numerals:

I. C. Wählby, J. Lindblad, M. Vondrus, E. Bengtsson, and L. Björkesten. Algorithms for cytoplasm segmentation of fluorescence labelled cells. Analytical Cellular Pathology, 24:101–111, 2002.

II. J. Lindblad, C. Wählby, E. Bengtsson, and A. Zaltsman. Image analysis for automatic segmentation of cytoplasms and classification of Rac1 activation. Accepted for publication in Cytometry.

III. F. Erlandsson, C. Linnman (-Wählby), S. Ekholm, E. Bengtsson, and A. Zetterberg. A detailed analysis of cyclin A accumulation at the G1/S border in normal and transformed cells. Experimental Cell Research, 259:86–95, 2000.

IV. F. Erlandsson, C. Wählby, S. Ekholm-Reed, A.-C. Hellström, E. Bengtsson, and A. Zetterberg. Abnormal expression pattern of cyclin E in tumor cells. International Journal of Cancer, 104:369–375, 2003.

V. C. Wählby, F. Erlandsson, K. Nyberg, J. Lindblad, A. Zetterberg, and E. Bengtsson. Multiple tissue antigen analysis by sequential immunofluorescence staining and multi-dimensional image analysis. In Proceedings of the 12th Scandinavian Conference on Image Analysis (SCIA), pp. 25–32, Bergen, Norway, June 2001.


VI. C. Wählby, F. Erlandsson, E. Bengtsson, and A. Zetterberg. Sequential immunofluorescence staining and image analysis for detection of large numbers of antigens in individual cell nuclei. Cytometry, 47:32–41, 2001.

VII. C. Wählby and E. Bengtsson. Segmentation of cell nuclei in tissue by combining watersheds with gradient information. In Proceedings of the 13th Scandinavian Conference on Image Analysis (SCIA), volume 2749 of Lecture Notes in Computer Science, pp. 408–414, Göteborg, Sweden, July 2003.

VIII. C. Wählby, I.-M. Sintorn, F. Erlandsson, G. Borgefors, and E. Bengtsson. Combining intensity, edge, and shape information for 2D and 3D segmentation of cell nuclei in tissue sections. Submitted for publication.

All papers published or accepted for publication are reproduced with permission from the publisher.

The author has significantly contributed to the work performed in all the papers. The author has been deeply involved in algorithm discussions, method development, and algorithm implementations which have taken place in connection with the work. Papers I and II have been produced in close cooperation with J. Lindblad, Papers III–VI have been produced in close cooperation with F. Erlandsson, and Paper VIII has been produced in close cooperation with I.-M. Sintorn.

The faculty opponent is Dr. Stephen Lockett, National Cancer Institute, Frederick, Maryland, USA.


Related work

In addition to the papers included in this thesis, the author has also written or contributed to the following publications:

i. H. Söderqvist, G. Imreh, M. Kihlmark, C. Linnman (-Wählby), N. Ringertz, and E. Hallberg. Intracellular distribution of an integral nuclear pore membrane protein fused to green fluorescent protein: Localization of a targeting domain. European Journal of Biochemistry, 250:808–813, 1997.

ii. C. Linnman (-Wählby) and E. Bengtsson. Detection of fluorescent foci and evaluation of spatial relationships in 3D-fluorescence microscopy images of mammalian cells. In Proceedings of the Swedish Society for Automated Image Analysis (SSAB) Symposium on Image Analysis, pp. 57–60, Göteborg, Sweden, March 1999.

iii. C. Linnman (-Wählby), E. Bengtsson, S. Ekholm-Jensen, and A. Zetterberg. Detection of fluorescent foci and evaluation of spatial relationships in 3D-fluorescence microscopy images of mammalian cells. Abstract in Analytical Cellular Pathology, 18:36–37, special issue from the 6th Congress for the European Society for Analytical Cellular Pathology (ESACP), Heidelberg, Germany, April 1999.

iv. C. Linnman (-Wählby), J. Lindblad, M. Vondrus, T. Jarkrans, E. Bengtsson, and L. Björkesten. Automatic cytoplasm segmentation of fluorescence labelled cells. In Proceedings of the Swedish Society for Automated Image Analysis (SSAB) Symposium on Image Analysis, pp. 29–32, Halmstad, Sweden, March 2000.


v. C. Wählby, F. Erlandsson, A. Zetterberg and E. Bengtsson. Multi-dimensional image analysis of sequential immunofluorescence staining. Abstract in Analytical Cellular Pathology, 22:61, special issue from the 7th Congress for the European Society for Analytical Cellular Pathology (ESACP), Caen, France, April 2001.

vi. F. Erlandsson, C. Wählby, E. Bengtsson, and A. Zetterberg. Detection of large numbers of antigens using sequential immunofluorescence staining. Abstract in Analytical Cellular Pathology, 22:56–57, special issue from the 7th Congress for the European Society for Analytical Cellular Pathology (ESACP), Caen, France, April 2001.

vii. J. Lindblad, C. Wählby, M. Vondrus, E. Bengtsson, and L. Björkesten. Statistical quality control for segmentation of fluorescence labelled cells. In Proceedings of the 5th Korea-Germany Joint Workshop on Advanced Medical Image Processing, Seoul, Korea, May 2001.

viii. C. Wählby, F. Erlandsson, J. Lindblad, A. Zetterberg, and E. Bengtsson. Analysis of cells using image data from sequential immunofluorescence staining experiments. In Proceedings of the 5th Korea-Germany Joint Workshop on Advanced Medical Image Processing, Seoul, Korea, May 2001.

ix. E. Bengtsson, C. Wählby, and J. Lindblad. Robust cell image segmentation methods. In Proceedings of the 6th Open Russian-German Workshop on Pattern Recognition and Image Understanding, Village Katun of Altai Region, Russia, August 2003.


Contents

List of enclosed papers
Related work

1 Introduction and Objectives
  1.1 Drug screening
  1.2 Cancer research
  1.3 Segmentation and staining
  1.4 About this thesis
2 Background
  2.1 The cell
    2.1.1 The cell cycle
    2.1.2 Cancer
  2.2 Visualizing the cell
    2.2.1 Fluorescence microscopy
    2.2.2 Cell preparation
    2.2.3 Immunofluorescence staining
    2.2.4 DNA staining
    2.2.5 GFP tagging
    2.2.6 Other fluorescence staining techniques
  2.3 Cytometry – the measurement of cell properties
    2.3.1 Flow cytometry
    2.3.2 Image cytometry
    2.3.3 Flow cytometry vs. image cytometry
3 Digital image cytometry
  3.1 Image acquisition: image quality vs. image quantity
  3.2 Pre-processing
    3.2.1 Reduction of intensity non-uniformities
    3.2.2 Image registration
  3.3 Segmentation
    3.3.1 Thresholding
    3.3.2 Watershed segmentation
    3.3.3 Shape-based watershed segmentation
    3.3.4 Edge-based watershed segmentation
    3.3.5 Merging
    3.3.6 Model-based splitting
    3.3.7 Seeded watershed segmentation
    3.3.8 Extension to 3D
    3.3.9 Other methods
  3.4 Feature extraction
  3.5 Data analysis and evaluation
  3.6 Implementation
4 New methods and applications
  4.1 Papers I and II: Cell segmentation and analysis of Rac1 activation
    4.1.1 Methods and experiments
    4.1.2 Results
    4.1.3 Conclusions and comments
  4.2 Papers III and IV: Studies of cyclin A and E
    4.2.1 Methods and experiments
    4.2.2 Results
    4.2.3 Conclusions and comments
  4.3 Papers V and VI: Sequential immunofluorescence staining
    4.3.1 Methods and experiments
    4.3.2 Results
    4.3.3 Conclusions and comments
  4.4 Papers VII and VIII: Segmentation of cell nuclei in tissue
    4.4.1 Methods and experiments
    4.4.2 Results
    4.4.3 Conclusions and comments
5 Conclusions
  5.1 Summary
  5.2 Discussion and future work
6 Acknowledgments
References

1 Introduction and Objectives

Cells are the building blocks of our bodies and their different functions are of great importance to our overall health. Understanding complex biological systems requires integration of information from many different sources. Genomics provides important tools for studies of individuals on the genetic level and proteomics explores the structure and function of protein interactions. The next level of biological complexity is the cell.

Most analytical methods in cell biology do not account for inter-cellular variation. Material from a single cell is not sufficient for analysis; instead, a mixture of cells is needed. Single-cell analysis using flow cytometry has been widely used, but tissue or cell culture morphology is lost in the process of the analysis. Observations of temporal events taking place in single living cells are also impossible using flow cytometry.

Image-based analysis can provide genetic as well as molecular information (e.g., by immunofluorescence staining, fluorescence in situ hybridization and GFP-tagging) in fixed or living cells without losing tissue morphology or spatial organization in a cell culture. Time sequence analysis of single cells is also possible. Analysis of spatial, spectral and temporal distributions of fluorescence can reveal important information on the single cell level. The need for reproducibility, speed and accuracy calls for computerized methods for data analysis.

Digital image cytometry is the measurement of cell properties from images with the aid of computers. This thesis presents the results of work on the development of algorithms for digital image cytometry applied to fluorescence microscopy images of cells. Essentially, the thesis is based on two projects: one related to drug screening, and one related to cancer research. These two projects have a lot in common, and the basis for all single cell analysis is image segmentation. Much effort has therefore been put into the development of robust image segmentation techniques. Many of the methods developed could be used for a wide range of image analysis applications, not only image cytometry.

1.1 Drug screening
The aim of the first project was to develop methods for efficient and quantitative measurements of the effect of potential drugs on individual cells in culture. In the search for new drugs, genomics together with proteomics can identify large numbers of molecules that interact with proteins produced by disease-related genes. Such molecules have the potential of restoring the normal function of the cell. Before testing the different molecules on animals and at the clinical level, they can be tested on cultured cells. Each type of potential molecule is added to cultured cells and the reaction of the cells is observed by imaging the distribution of fluorescing reporter signals over time. All molecules that are, e.g., toxic, can thus be found and removed from further testing. This is called cell-based screening.

High-throughput cell-based screening requires fast, objective, and automatic methods for analysis of time sequence image data. Digital image cytometry has great potential for removing bottlenecks in the drug discovery process. Development of image-based screening methods was carried out in cooperation with Amersham Biosciences, in Uppsala, Sweden, and Cardiff, Wales.

1.2 Cancer research
The aim of the second project was to develop methods for discovery of relationships between cell cycle aberrations in individual tumor cells and cancer prognosis for patients. No two cancer tumors show the same behavior, and the cells within a tumor are very heterogeneous. These facts call for single cell analysis. Immunofluorescence staining combined with image analysis can reveal genetic changes in individual cells. Reliable results require large numbers of unbiased observations, making quantitative (semi-)automatic digital image cytometry an attractive solution.

The focus of the analysis was the cell cycle and investigation of the exact timing of the expression of different cell cycle regulating proteins. Differences in expression patterns between normal and transformed cells were observed, and set in relation to tumor morphology, treatment, and prognosis for patients. This project was carried out in close cooperation with the Department of Oncology-Pathology, the Karolinska Institute, Stockholm, Sweden.

1.3 Segmentation and staining
One of the first, and most difficult, steps in any image analysis task is to find the objects of interest in the image. The objects have to be separated from each other as well as from the image background before measurements can be performed on a single object basis. This is called image segmentation, and is essential for all single cell analysis. Much effort has therefore been put into development of image segmentation algorithms, and methods for segmentation of living and fixed cells in culture and tissue sections are presented.
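The two sub-tasks described above, separating objects from the background and separating them from each other, can be sketched with a simple thresholding and connected-component labeling step. The toy image and the threshold value below are invented for illustration; the thesis itself uses considerably more robust, watershed-based methods for real cell images.

```python
import numpy as np
from scipy import ndimage

# Toy "fluorescence image": three bright objects on a dark background.
image = np.array([
    [0, 0, 9, 9, 0, 0],
    [0, 0, 9, 9, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [8, 8, 0, 0, 7, 7],
    [8, 8, 0, 0, 7, 7],
], dtype=float)

# Step 1: separate objects from the background by intensity thresholding
# (the threshold value 5 is an illustrative assumption).
foreground = image > 5

# Step 2: separate objects from each other by labeling connected regions.
labels, n_objects = ndimage.label(foreground)

# Measurements can now be made per object, e.g. the area in pixels.
areas = ndimage.sum(foreground, labels, index=range(1, n_objects + 1))
print(n_objects)       # 3
print(areas.tolist())  # [4.0, 4.0, 4.0]
```

Simple thresholding fails as soon as cells touch or the background varies, which is exactly why the watershed-based methods of Papers VII and VIII are needed.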


Complex chains of events and interactions can be visualized by multicolor staining of tissue and cell samples using various techniques. A combination of image analysis and staining procedures that allow for sequential staining and visualization of large numbers of different antigens in single cells was therefore developed.

1.4 About this thesis
Chapter two contains brief overviews of the cell, how it duplicates in the cell cycle, and how a normal cell turns into a cancer cell. Different techniques for staining and visualizing the cell and its constituents are described, and methods for making measurements on single cells are discussed. These introductions give very short descriptions, but hopefully they contain enough information to familiarize the reader with the concepts used throughout the thesis.

Chapter three covers the different steps of digital image analysis in general, and digital image cytometry in particular. The main emphasis is on image segmentation, as this has been the main part of the work leading to this thesis.

The material presented in chapter four is a description and discussion of the papers included in the thesis. The papers have been grouped into four groups of related work.

Chapter five concludes with a summary of the thesis; the results are discussed, and some future research areas are suggested.


2 Background

2.1 The cell
A cell is not just a well-mixed container of chemicals, but a very intricate system of membranes, vesicles and signaling pathways. Much is still left to discover about the spatial organization and mobility of specific macro-molecules (such as proteins, protein complexes, enzymes, signaling substances, etc.) in cells, and the spatial relationship between cells in the tissue. All cells in an organism originate from a single cell: the fertilized egg. The cells of the organism all contain the exact same genetic code, or DNA, organized in chromosomes. Triggered by millions of chemical signals, the cells divide and differentiate into the building blocks of our different body parts.

2.1.1 The cell cycle

Organisms consist of cells that multiply through cell division. In a fully developed organism, cells only need to multiply when worn out or damaged cells have to be replaced. Before a cell can divide, it has to grow in size, duplicate its DNA (contained in the chromosomes) and separate the chromosomes between the two daughter cells. These different processes are coordinated in the cell cycle. The 2001 Nobel Prize in Physiology or Medicine was awarded jointly to Leland Hartwell, Tim Hunt and Paul Nurse for their discoveries of "key regulators of the cell cycle" [20]. Using genetic and biochemical methods, they identified the first cyclin-dependent kinases (CDKs) and cyclins that control the cell cycle in eukaryotic organisms. CDKs and cyclins are the key molecules that control and coordinate DNA synthesis, chromosome separation and cell division. A group of different CDKs and cyclins together drive the cell from one cell cycle phase to the next by activation of the CDKs through association with different cyclins, which function as regulatory subunits. The amount of CDK molecules is constant during the cell cycle, but their activities vary because of the regulatory function of the cyclins. The different cyclins have temporally distinct and highly regulated patterns of expression, and are only present at very specific stages of the cell cycle.

The cell cycle consists of several phases, see Figure 2.1. In the first phase, G1 (stands for Gap 1), the cell grows. When it has reached its appropriate size it enters the phase of DNA synthesis, S, where the chromosomes are duplicated. During the next phase, G2 (Gap 2), the cell prepares for division. In mitosis, M, the chromosome duplicates separate, and the cell divides into two daughter cells. Through this mechanism the daughter cells receive identical sets of chromosomes. After division, the cells are back in G1 and the cell cycle is completed. Although the adult human must manufacture millions of cells each second simply to maintain the status quo, most cells are not growing or dividing but instead are in a resting state, G0, performing their specialized function [2].

Figure 2.1: The cell cycle. During G1, the cell grows. DNA is synthesized during S, resulting in duplication of the chromosomes. Cyclin E is involved in the initiation of S, and cyclin A is present during the progression through S and G2. During G2, the cell prepares for cell division, or M (mitosis). After mitosis, the cell enters G1 again. The cell may either continue cycling or enter a resting state called G0.

Several different cyclins are involved in the cell cycle. Cyclin E is involved in the initiation of S and appears in G1, after the cell has reached the so-called restriction point [17]. The level of cyclin E peaks in late G1 and disappears again in early S. Cyclin A appears at the G1/S border, is present during the progression through S and G2, and disappears at the beginning of mitosis [21], as illustrated in Figure 2.1. The roles of these two cyclins have been investigated in Papers III and IV, as discussed in Section 4.2.

2.1.2 Cancer

To produce and maintain the intricate organization of the body, the component cells must obey strict controls that limit their proliferation. A very intricate system of checkpoints and controls protects the organism from development of cancer. Greatly simplified, one can say that if the cell cycle control is disrupted, the cells may divide before the chromosome duplication is completed, or parts of the chromosomes may be duplicated more than once before each cell division. This leads to genetic instability. This, in turn, may further disrupt the control of the cell cycle. The cells will then start to divide in an uncontrolled way, producing more new cells than the body needs, and a tumor is formed. Lack of control of the cell divisions may lead to even further genetic changes, and the tumor cells may start to invade other parts of the body, i.e., the tumor becomes malignant. The genetic changes associated with a cancer tumor often differ between individuals. This means that the progression of the disease, and the optimal choice of treatment, is different for different individuals. If the links between disrupted cell cycle control and tumor behavior are found, it may be possible to significantly improve treatment and outcome for cancer patients.

2.2 Visualizing the cell
Many methods to study the biochemistry of cells exist. Most of these methods require a large number of cells as input; events taking place in individual cells cannot be studied. Instead, the average of a population of cells is observed. For single cell studies, single cell signal analysis or imaging is required.

Most of the structures within a cell are difficult to detect without adding some kind of dye. There is a wide range of dyes available that can be used to visualize specific structures in normal absorption light microscopy. Despite its common use, it is difficult to analyze the dyes one by one using absorption light microscopy. All of the studies in this thesis therefore make use of fluorescent dyes and fluorescence microscopy. Several fluorescent dyes in the same cell preparation can be imaged one by one using different excitation and emission filters in the microscope, and the fluorescence signal is seen against a dark background.
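To make the idea of imaging dyes one by one concrete, a multichannel fluorescence image can be represented as one 2D intensity array per fluorochrome, each acquired with its own filter combination. The channel names below (DAPI for DNA, FITC and Cy3 for two antibody stains) and the image size are illustrative assumptions, not specifics from the thesis.

```python
import numpy as np

height, width = 32, 32  # illustrative image size

# One 2D intensity image per fluorochrome, acquired with its own
# excitation/emission filter pair (channel names are assumptions).
channels = {
    "DAPI": np.zeros((height, width), dtype=np.uint16),  # DNA / nuclei
    "FITC": np.zeros((height, width), dtype=np.uint16),  # antibody stain 1
    "Cy3": np.zeros((height, width), dtype=np.uint16),   # antibody stain 2
}

# Each channel can then be analyzed independently, e.g. nuclei
# segmented from the DAPI channel and protein levels measured per
# nucleus in the antibody channels.
for name, img in channels.items():
    print(name, img.shape)
```

Keeping the channels separate, rather than merging them into one color image, is what lets each stained structure be measured on its own.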

All real world systems are difficult to study without interfering with the system itself. This is true also for cells. In the studies described in this thesis, the cells have either been cultured in a solution of nutrients in a plastic dish, out of their natural environment, or they have been cut out from their natural environment and fixed (killed) before being studied. Still, we believe that the information we extract, and the theories we construct, are relevant also for cells in their natural environment.

2.2.1 Fluorescence microscopy
In normal absorption light microscopy, Figure 2.2 A, the observed light originates from a light source, and the specimen is seen as darker, or colored (stained), areas where the light is attenuated. All colors are usually registered in the same image. The images used in this thesis were all produced by fluorescence microscopy, Figure 2.2 B. Fluorescence microscopy differs from normal light microscopy mainly in that the light registered by the camera originates from the fluorochromes in the specimen. Light from a light source reaches the specimen, and the fluorochromes in the specimen emit fluorescence, which is detected by the camera. The bright signal from the fluorochromes is seen against a dark background. A dichroic mirror, reflecting high-energy and transmitting low-energy light, together with band pass filters placed between the light source and the specimen and in front of the camera, controls which fluorochromes to excite and which emission spectra to image. Only the part of the specimen labeled with fluorochromes with a specific spectrum will be imaged using a given filter and mirror combination. Each structure, or even protein, in the cell can be imaged separately. The number of different structures that can be stained and imaged is limited by the availability of fluorescent markers and their spectra, as described in Papers V and VI, discussed in Section 4.3.

Fluorescence microscopy images are often blurred by light originating from out-of-focus fluorescence. Different optical and computational methods can be applied to enhance the image quality. Confocal microscopy [10] and multi-photon microscopy [45] are two optical methods, and deconvolution microscopy [1] is a computational method for enhancement of image resolution. All these methods try to improve the image by removing light that originates from unfocused parts of the imaged object, only keeping the light that originates from the current focal plane. In confocal microscopy, Figure 2.2 C, the fluorochromes in the specimen are excited by a well focused laser, and emitted light from out-of-focus planes is blocked by a pinhole. This means that if a three-dimensional (3D) object is observed, a series of well focused two-dimensional (2D) images, or optical sections, can be acquired. The stack of 2D images can then be used as a 3D image.
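The step from optical sections to a 3D image can be sketched in a few lines: each focal plane is a 2D array, and stacking the planes along a new axis yields a volume that 3D algorithms can process directly. The number and size of the sections below are arbitrary illustrative values.

```python
import numpy as np

# Ten simulated optical sections (focal planes); here each plane is
# filled with its own z-index so the stacking is easy to verify.
sections = [np.full((64, 64), z, dtype=np.uint8) for z in range(10)]

# Stacking along a new first axis gives a (z, y, x) volume.
volume = np.stack(sections, axis=0)

print(volume.shape)  # (10, 64, 64)
```

Note that the z-spacing between optical sections is usually larger than the pixel spacing within a section, so 3D algorithms often have to account for this anisotropy.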

Both 2D and 3D images were used in the work leading to this thesis. For the fast screening applications, cultured cells growing on a flat surface were imaged in 2D. Here, 3D imaging would have been too tedious, and 3D data would have provided little additional information about the flat cultured cells.

In tumor material, the cells lie in different planes in the tissue, and 3D imaging is necessary to get a correct view of each cell nucleus. Despite this fact, we used 2D images for most applications on tumor material, as 2D images are easier to acquire and handle. The algorithms developed for segmentation of cell nuclei in tissue were, however, tested on a 3D image as described in Paper VIII, discussed in Section 4.4. As no auto-focusing routines were available, 3D images were used also in the sequential staining experiments of Papers V and VI, discussed in Section 4.3.


Figure 2.2: Simplified ray paths of different types of microscopes. A: Absorption light microscope. B: Fluorescence microscope. C: Confocal microscope. For all sketches there is a light source (a), a specimen (b), and a camera (c). In B, a dichroic mirror (d), together with an excitation filter (e) and an emission filter (f), control what excitation wavelengths reach the specimen and what emission wavelengths reach the camera. In C, emission light from out-of-focus planes (d) is stopped by a pinhole in front of the camera.

2.2.2 Cell preparation

Some proteins emit a natural fluorescence, but this fluorescence is usually in the high-energy UV wavelengths, and special optics are needed to observe it. Instead, different fluorescent markers are used to make the structures of interest fluoresce. There are fluorescent dyes that can be used in living cells, e.g., the green fluorescent protein (GFP) as described below, but most dyes require fixed cells. Cells can either be grown in cell cultures, e.g., attached to plastic or glass, or they can be removed from the human body by biopsy or by operation, e.g., when a cancer tumor is removed. Cultured cells are fixed by exposing the cells to a fixation medium, usually a buffer containing alcohol. Tissue material is often not gathered in immediate connection with the cell analysis, and is therefore first embedded in paraffin for storage. Before the cells can be dyed, the tissue samples are cut in thin (4–30 µm) sections. The sections are attached to glass slides and the paraffin is removed by repeated washing in graded alcohols.

2.2.3 Immunofluorescence staining

Specific cellular structures, such as proteins and protein complexes, can be visualized using immunofluorescence staining. A solution containing primary antibodies highly specific for the structure of interest, or antigen, is added to the cells. The antibodies interact with and bind to the antigen. Antibodies that do not bind are removed by washing the cells in a buffer solution. Fluorescent molecules, or fluorochromes, may be directly attached to the primary antibodies, but more commonly secondary antibodies labeled with fluorochromes are added to the cells. The secondary antibodies bind to the primary antibodies in several copies, leading to signal amplification, see Figure 2.3 A.

Several different primary and secondary antibodies may be used to visualize several different antigens simultaneously. There is, however, a limitation in the number of antigens that can be visualized at the same time. If different primary antibodies produced by the same species are used together, the secondary antibodies will cross-react. If the emission and/or excitation spectra of the fluorochromes cannot be separated using filters, the different fluorochromes cannot be imaged one by one. The number of different stains in one sample is therefore usually limited to three or four. One way to visualize a larger number of different antigens in the same set of cells is by sequential staining, as described in Papers V and VI, and discussed in Section 4.3.

2.2.4 DNA staining

Instead of using antibodies, other molecules that interact with the cell may be used. One example is 4',6-diamidino-2-phenylindole, or DAPI, and the very similar Hoechst dye. DAPI is a fluorescing molecule that binds directly to the DNA chain, see Figure 2.3 B. DAPI can be applied directly to living cells, but as it binds to DNA, it may disturb the behavior of the cells. As DNA is present in the cell nucleus, DAPI is a very good nuclear stain, and it has been used in all papers included in this thesis, except for Paper I. There is, however, one problem with DAPI: it has to be excited by short-wavelength UV light. In confocal microscopy, UV lasers are not very common, and propidium iodide is more common for nuclear staining. Propidium iodide also binds to DNA, but is excited by green light. Propidium iodide was used for the 3D confocal imaging of cell nuclei in Paper VIII.

Another example of a molecule that interacts with DNA is bromodeoxyuridine, or BrdU. BrdU is a synthetic molecule that looks almost exactly like thymidine, one of the four building blocks of DNA. If BrdU is added to living cells that are synthesizing DNA, BrdU will be incorporated in the new DNA chain in place of thymidine. In other words, all cells in the S-phase of the cell cycle will bind BrdU. After the cells have been fixed, fluorescence labeled antibodies directed against BrdU can be used to visualize S-phase cells, see Figure 2.3 C. This staining method was used in Papers III and IV, as discussed in Section 4.2.

2.2.5 GFP tagging

The green fluorescent protein (GFP) originates from the jellyfish Aequorea victoria, and its natural function seems to be the conversion of blue chemiluminescence of the Ca2+-sensitive photo-protein aequorin into emission of green light. It is not well understood how and why jellyfish use their bioluminescent capabilities, or what biological function this serves. GFP is a short amino acid chain that folds into a tight cylinder enclosing a fluorescing structure [12]. The DNA coding for GFP can be inserted into both prokaryotic and eukaryotic cells. GFP will then be synthesized by the cell and can fluoresce without the addition of any substrates or cofactors. The DNA coding for GFP can be fused with a wide range of different DNAs coding for proteins that one wants to study. When inserted into a cell, the fusion protein will be expressed, and its intra-cellular distribution and behavior can be studied in living cells, see Figure 2.3 D. Its extreme stability and low toxicity make GFP a unique marker for studies of living cells.

Figure 2.3: Different techniques for fluorescence labeling. A: A primary antibody (1st Ab) binds to an antigen, e.g., a cellular structure, and secondary antibodies (2nd Ab) labeled with fluorescing molecules bind to the primary antibody. B: DAPI is a fluorescing molecule that binds directly to the DNA chain. C: BrdU is a synthetic molecule that is incorporated in newly synthesized DNA and can be detected by fluorescence labeled antibodies. D: DNA coding for the protein of interest (protein A) can be fused with the DNA coding for green fluorescent protein (GFP). When the DNA is translated, the protein is synthesized together with a tail consisting of the fluorescing GFP.

2.2.6 Other fluorescence staining techniques

There are numerous other methods to stain cells, and new techniques are developed continuously. Two examples are FISH and quantum dots.

FISH, or fluorescence in situ hybridization [57], is a method where specific DNA sequences can be stained by attaching fluorochromes to complementary DNA (cDNA). The fluorescence labeled cDNA is then allowed to hybridize with the cellular DNA in situ, and, e.g., very subtle genetic changes can be detected.

Quantum dots, also called nanocrystals, are semiconductor particles that can glow in a large number of different colors depending on their size. Recent work has shown that biological molecules, such as antibodies, can be attached sturdily to quantum dots [53]. They have been used as bright and specific probes for fluorescence imaging in living tissue, and fluorescence from deep under the skin of mice has been imaged without harming the animal [32].


Quantum dots have broad excitation spectra, but narrow Gaussian emission spectra at wavelengths controllable by the size of the particles, and they show great promise for multicolor fluorescence imaging.

2.3 Cytometry – the measurement of cell properties

There are basically two different approaches to measuring the properties of individual cells: flow cytometry and image cytometry.

2.3.1 Flow cytometry

In flow cytometry, measurements are made on fluorescence labeled cells in solution flowing single-file past a laser beam. A momentary pulse of fluorescence is emitted as a cell crosses the laser beam. The emission is measured by photo-multipliers at a 90° angle from the beam. Typically, two or three detectors are used with different wavelength bandpass filters, allowing the simultaneous detection of emissions from different fluorochromes in a single cell. The main advantage of flow cytometry is the large number of cells that can be analyzed in a short period of time; as many as 10,000 cells per minute can be analyzed. In addition to fluorescence, two types of light scatter can be measured: low-angle forward scatter, roughly proportional to the diameter of the cell, and orthogonal side scatter, which is proportional to the granularity. These parameters can be used for identifying debris or aggregated cells.

One of the problems with flow cytometry is that the cells must be in a single-cell suspension with minimal aggregation. Cells from solid tissue, such as biopsies, can be used only if they can be dissociated and dispersed without breaking the cells. Different types of cells may be affected differently by the dispersion, leading to biased cell counts. Information on the cell organization in the tissue, or tissue morphology, is also lost in the process.

Flow cytometry is widely used in research on basic cellular and molecular mechanisms. It is, however, not possible to quantify more complex staining patterns on the sub-cellular level, such as the precise relative location of different antigens in the cells. Dynamic events, such as the trafficking of GFP complexes described in Paper II, would also be impossible to study by flow cytometry.

2.3.2 Image cytometry

Image cytometry is the measurement of cell properties using images. Images can be analyzed visually, e.g., measuring the size of the cells, or counting the number of stained spots in each cell. Visual analysis can be performed with or without the aid of computers, e.g., by using the mouse to manually outline the boundaries of each cell. The main drawback of visual analysis is the tedious marking and counting, but also the bias caused by a subjective observer. If visual analysis is performed a second time, by the same or a different person, the result will often not be the same as the first time, i.e., the results are not fully reproducible. On the other hand, if the analysis can be automated using digital image cytometry, time, cost, and subjectivity can all be reduced. This will be further discussed in Chapter 3.

2.3.3 Flow cytometry vs. image cytometry

Comparisons between flow and image cytometry have been described [15, 24], and they show discrepancies in the results obtained when applying the two techniques to similar material. These differences may be caused by loss of cells during the dispersion prior to flow cytometry. Such loss of cells is difficult to verify using flow cytometry.

The speed of image-based cytometry is not yet comparable to that of flow cytometry, limiting the number of cells that can be analyzed. There are, however, a large number of parameters detectable by image cytometry that cannot be distinguished by flow cytometry. And, maybe most important of all, in image cytometry cells can stay in their (more or less) natural environment, and the images can easily be re-analyzed or visually inspected if the results are dubious. Fast, automated methods will help increase the throughput of image cytometry. Fast algorithms together with increased computer power can make the speed of image cytometry comparable to that of flow cytometry.


Digital image cytometry

Digital image cytometry is the measurement of cell properties with the aid of a computer. A digital image is not continuous, but consists of discrete picture elements, or pixels. The color of a pixel is also described by a discrete number. In a gray-level image, each shade of gray is represented by a number corresponding to the brightness, or gray-level, of that particular pixel. In color images, each spectral band (usually red, green, and blue) is represented by a separate discrete gray-level image, and in a time sequence, or movie, each time frame is a discrete image. In 3D images, the third dimension is represented by discrete slices of the imaged volume. In this case, the pixels are often referred to as voxels instead, meaning volume elements. In 2D, a pixel can have two types of neighbors: edge and vertex neighbors. In 3D, there are three types of neighbors: face, edge, and vertex neighbors, see Figure 3.1.
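These neighbor counts (8 neighbors per pixel in 2D, 26 per voxel in 3D) can be checked with a short sketch; the function name is this sketch's own:

```python
from itertools import product

def neighbor_types(ndim):
    """Classify the neighbors of a pixel (2D) or voxel (3D) by how many
    coordinates differ from the center: in 2D, 1 -> edge, 2 -> vertex;
    in 3D, 1 -> face, 2 -> edge, 3 -> vertex."""
    counts = {}
    for offset in product((-1, 0, 1), repeat=ndim):
        changed = sum(1 for c in offset if c != 0)
        if changed:  # skip the center itself
            counts[changed] = counts.get(changed, 0) + 1
    return counts

print(neighbor_types(2))  # 4 edge + 4 vertex = 8 neighbors in 2D
print(neighbor_types(3))  # 6 face + 12 edge + 8 vertex = 26 neighbors in 3D
```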

This discrete image environment requires that the image analysis algorithms we develop must work in discrete space. But as the objects we image usually exist in the real, continuous world, the goal is to develop image analysis algorithms that make measurements that are closely correlated with the true features of the continuous world.

A number of image analysis steps have to be considered when making measurements on digital images. Although the applications may vary, the problems encountered and the sequence of steps, or methodology, remain much the same.

The very first step is to acquire an image of the event or feature we want to study. This also involves the preparation of the specimen and staining procedures, as discussed in Section 2.2. In this step, it is important to consider the demands on image quality. The next step is image pre-processing, e.g., compensating for intensity non-uniformities, image rotation, etc., which could not be avoided during image acquisition. In order to make measurements on objects, the objects have to be found and outlined in the image. This is the next, and often most difficult, step, called image segmentation. Once the objects in the image have been segmented, different measurements can be made on them. This step, called feature extraction, leads to numerical data that may or may not be comprehensible for the end user. Therefore, a final step involving data analysis and evaluation is needed.

The described steps will be discussed in this chapter, and the focus will be on the methods used in the research leading to this thesis. A few other methods will be discussed briefly. Many of the methods discussed here are, or can be, applied to completely different image analysis problems. That is, however, not within the scope of this thesis.

Figure 3.1: Neighborhood relations among pixels and voxels. A pixel (2D) can have two types of neighbors: edge and vertex neighbors. A voxel (3D) can have three types of neighbors: face, edge, and vertex neighbors.

3.1 Image acquisition: image quality vs. image quantity

Resolution is the capability of distinguishing between two separate but adjacent objects or between two nearly equal wavelengths, i.e., the resolution of an image determines how small details of the real world we can detect. The number of pixels in an image determines the limit of spatial resolution. The range of values each pixel can have determines the limit for spectral resolution. The higher the sampling frequency, the greater the size of the image, and the more data there is to handle during the image analysis. A high sampling frequency does, however, not necessarily mean high image resolution. If an object is poorly focused, or weakly stained, the resolution will be poor even if the sampling frequency is high. On the other hand, an unnecessarily low sampling frequency on a brightly stained and well focused object will result in loss of information. The optics of the microscope and the wavelength of the light also influence the limits of possible resolution. A choice has to be made between sampling frequency and image size. It may not always be of great importance that the analysis is fast, but if speed can be gained by reducing the image size, without affecting the result of the analysis, it is desirable.

Before image acquisition, one has to ask a number of questions: What resolution can be achieved? How reproducible is the image? How much of the spatial and spectral image information is needed for analysis? Can the image acquisition be automated without reduction of image quality? How much can the image quality be improved by digital pre-processing, and how much of the noise etc. can be avoided by fine-tuning the image acquisition?

All the images used in the work leading to this thesis were acquired outside the Centre for Image Analysis. In most cases, the authors of the papers were involved in the image acquisition. The main focus of the work has, however, not been to produce high quality images, but rather images of sufficient quality for meaningful analysis. For example, high speed was an important factor for the high-throughput analysis described in Papers I and II. This resulted in an image quality that was lower than what could have been achieved with a slower imaging system. The fact that the images were acquired elsewhere often limited the amount of available image data, and the possibility to adjust the image acquisition.

3.2 Pre-processing

Image pre-processing reduces the effect of undesired imperfections introduced by the imaging system. Pre-processing includes methods such as reduction of intensity non-uniformities, smoothing to reduce noise, sometimes sharpening to enhance edge information, and image registration to align shifted image sets [55].

3.2.1 Reduction of intensity non-uniformities

In microscopy, it is very difficult to achieve uniform illumination of the specimen. The resulting intensity non-uniformities have to be reduced, especially if intensity measures from different parts of the image are to be compared. It is sometimes possible to model the intensity variations by imaging some kind of homogeneous phantom in the microscope. It is, however, not always feasible to do this in conjunction with the imaging procedure. Intensity non-uniformities can instead be compensated for by data-driven approaches.

One example of a data-driven approach approximates the image background by a surface. The algorithm iteratively computes a better and better estimate of the intensity variations of the image background by fitting a cubic B-spline surface [31] to the image. One of the nice features of cubic B-splines is that they are continuous and smooth, and their flexibility is controlled by the number of control points used. The distance between the spline surface and the image is minimized by least squares regression. To get a first estimate of the background, the spline surface is initially fitted to the whole image. This first estimate of the background will be too bright, as the regression takes the entire image, including the brighter objects, into consideration. All image pixels that deviate more than a constant number of standard deviations from the background estimate are considered to belong to the objects and are masked away.


Figure 3.2: Intensity non-uniformities can be reduced by pre-processing. A: An image before background correction and B: a spline surface fitted to the image background. C: The image after background subtraction. Note that the intensity scale (same for all images) has been set to visually enhance the contrast of the darker parts of the images.

The second iteration starts again with the original image, but this time, the spline surface is only fitted to the pixels that were not already masked away. Once again, all image pixels that deviate from the background estimate are found and masked away. This iterative procedure continues until the average change in pixel value between two successively calculated backgrounds is less than half the original quantization step of the image. Convergence is fast, and the stop criterion is usually reached after 4–10 iterations. The background approximation is described in more detail in [22] and [36]. The result, when applied to an image of fluorescence stained cytoplasms of cultured cells, can be seen in Figure 3.2.
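As a hedged illustration of this iterative scheme, the sketch below substitutes a low-order polynomial surface for the cubic B-spline of [22, 36]; the function name and parameters are this sketch's own, but the object masking and stop criterion follow the procedure described above:

```python
import numpy as np

def estimate_background(img, degree=2, n_sigma=2.0, max_iter=20):
    """Iteratively fit a smooth surface to the image background.
    After each fit, pixels deviating upwards by more than n_sigma standard
    deviations are taken to be objects and masked away before the next fit.
    Iteration stops when the mean change between two successive background
    estimates is below half a quantization step."""
    yy, xx = np.indices(img.shape)
    # polynomial basis x^i * y^j with i + j <= degree (stand-in for the spline)
    terms = [(xx ** i) * (yy ** j)
             for i in range(degree + 1) for j in range(degree + 1 - i)]
    A = np.stack([t.ravel() for t in terms], axis=1).astype(float)
    b = img.ravel().astype(float)
    mask = np.ones(b.size, dtype=bool)   # first fit uses the whole image
    bg = np.zeros_like(b)
    for _ in range(max_iter):
        coef, *_ = np.linalg.lstsq(A[mask], b[mask], rcond=None)
        new_bg = A @ coef
        converged = np.mean(np.abs(new_bg - bg)) < 0.5
        bg = new_bg
        if converged:
            break
        resid = b - bg
        mask = resid < n_sigma * resid[mask].std()   # mask away bright objects
    return bg.reshape(img.shape)
```

Subtracting the returned surface from the image then gives the background-corrected result, as in Figure 3.2 C.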

3.2.2 Image registration

Image registration, or spatial matching, makes it possible to use the same segmentation mask for automatic analysis of data in subsequent image sets, and it is often used in radiology. We use registration for image series of fluorescence labeled cells. A transformation that maps all voxels in one image to another can be based on image-specific landmarks. It is, however, both difficult and time-consuming to find good landmarks in many image types. Instead, registration can be based on the intensities of the image voxels. Such a registration algorithm needs as input the allowed transformations, a measure of the cost of a given transformation (i.e., a measure of the similarity between two images after transformation), and a minimization method that finds the transformation resulting in the lowest cost [5].

In the experiments of Papers V and VI, discussed in Section 4.3, cells were imaged repeatedly, and removed from the microscope between each imaging step. Rigid translation in the x-, y-, and z-directions as well as rotation around the z-axis was present. The cost of a given transformation was calculated using the inverse of the correlation coefficient, i.e., high correlation represents high similarity, and thereby low cost. The cost of all allowed transformations was thereafter minimized using Powell's method [46]. Powell's method starts by searching for a minimum along one transformation direction ê_i at a time using Brent's one-dimensional minimization method. The search is started in position P0. The minimum found in the preceding search direction ê_(i-1) is used as a starting point for the search in each new direction ê_i. Once all allowed transformation directions have been searched, a new position P1 is reached. A new search direction n̂ is then defined as n̂ = P1 − P0. This new direction can be thought of as a weighted average of all transformations. Once a minimum along n̂ is found, the search continues along all ê_i and is iterated until no lower cost can be found.

Figure 3.3: Image shift can be reduced by registration. A: A reference image and B: an image to be registered. C, D: The difference between A and B before and after registration. Zero difference is shown in mid-gray. The transformations in this particular example were a rotation of -0.2° around the z-axis, and translations of -13.6 pixels in the x-, -60.9 pixels in the y-, and +2.1 pixels in the z-direction. All images are maximum intensity projections of the 3D result and have been contrast enhanced to show the differences.

The registered images were interpolated using trilinear interpolation. All image parts that were not overlapping, i.e., not common to all volumes, were excluded from further analysis. An example showing two images before and after registration, and the difference between the images before and after registration, is shown in Figure 3.3.
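A minimal sketch of intensity-based registration in this spirit, restricted to pure translation and using SciPy's Powell implementation, follows; the function name and parameter choices are this sketch's assumptions, not the thesis code:

```python
import numpy as np
from scipy.ndimage import shift
from scipy.optimize import minimize

def register_translation(reference, moving):
    """Align `moving` to `reference` by minimizing the negative correlation
    coefficient (high correlation -> low cost) over a translation vector,
    using Powell's direction-set method. Rotation is omitted for brevity."""
    def cost(t):
        moved = shift(moving, t, order=1, mode='nearest')  # linear interpolation
        return -np.corrcoef(reference.ravel(), moved.ravel())[0, 1]
    res = minimize(cost, x0=np.zeros(reference.ndim), method='Powell')
    return res.x  # estimated translation
```

For image pairs with larger displacements, a coarse initial estimate (e.g., from cross-correlation) would be needed before the local Powell search.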

3.3 Segmentation

Segmentation is the process in which an image is divided into its constituent objects or parts, and background [23]. It is often the most vital and most difficult step in an image analysis task. The segmentation result usually determines the eventual success of the analysis. For this reason, many segmentation techniques have been developed, and there exist almost as many segmentation algorithms as there are segmentation problems.

The construction of a segmentation algorithm can be thought of as defining a model of the objects that we want to detect in the image. This model is then the basis for the segmentation algorithm.


3.3.1 Thresholding

In the simplest case, we create a model that says that objects are brighter than the image background and that individual objects are well separated from each other. If we can find a suitable intensity threshold that separates the bright objects from the dark background, the segmentation is completed. We simply find all connected components brighter than the threshold, and say that they are our objects. The tricky part is to find a suitable threshold. There are many different thresholding methods, see [51] for an overview. One approach is to look for valleys in the image histogram. An image histogram is created by plotting the number of pixels per intensity level against intensity level. If objects are bright and the background is dark, the histogram will have one peak for objects and one for background, and a valley will be present between the peaks. Figure 3.4 A shows an image of fluorescence labeled nuclei of cultured cells. The image histogram is plotted in Figure 3.4 B, and a threshold is placed at intensity 30. In Figure 3.4 C, the intensity variation along a row in A is plotted against x-position, and the threshold is shown as a horizontal line. The result of thresholding the image at this level, and labeling the different connected components, is shown in Figure 3.4 D. Clustered objects will not be separated by thresholding.
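The threshold-and-label model can be sketched in a few lines with SciPy; the threshold value is assumed to be given, e.g., read from the histogram valley:

```python
import numpy as np
from scipy import ndimage

def threshold_and_label(img, threshold):
    """Keep all pixels brighter than the threshold and assign each connected
    component of the resulting binary image a unique integer label."""
    binary = img > threshold
    labels, n_objects = ndimage.label(binary)  # edge-connectivity by default
    return labels, n_objects
```

As noted above, clustered objects are not separated this way: touching objects come out as a single connected component with a single label.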

3.3.2 Watershed segmentation

If all the objects are brighter than the image background, but clustered, as in the image of cytoplasms in Figure 3.5 A, thresholding will only separate the objects from the background, and not the individual objects from each other. There is no single threshold that will separate all cells and at the same time find all cells. We can, however, create a model that says that objects have high intensity, and are less intense at the borders towards other objects. If image intensity is thought of as height, the cells can be thought of as mountains separated by valleys in an intensity landscape, see Figure 3.5 B. The segmentation task is then to find the mountains in the landscape.

A segmentation algorithm that has proven to be very useful for many areas of image segmentation where landscape-like image models can be used is watershed segmentation. The method was originally suggested by Digabel and Lantuéjoul, and extended to a more general framework by Lantuéjoul and Beucher [4]. Watershed segmentation has since been refined and used in many situations, see, e.g., [40, 59] for an overview. The watershed algorithm works through the image intensity layer by intensity layer and splits the image into regions similar to the drainage regions of a landscape. If the intensity of the image is thought of as the height of a landscape, watershed segmentation can be described as submerging the image landscape in water, allowing water to rise from each minimum in the landscape. Each minimum will thus give rise


Figure 3.4: Image segmentation by thresholding. A: Fluorescence stained nuclei of cultured cells. B: Image histogram of A. A threshold is placed where the histogram shows a local minimum; the vertical line corresponds to a threshold at intensity 30. C: An intensity profile along the row y=300 of A, with the intensity threshold represented by a horizontal line. D: The result after thresholding and labeling of connected components. Not all nuclei are separated by thresholding.

to a catchment basin, and when the water rising from two different catchment basins meet, a watershed, or border, is built in the image landscape. All pixels associated with the same catchment basin are assigned the same label. Watershed segmentation can be implemented with sorted pixel lists [60], so that essentially only one pass through the image is required. This implies that the segmentation can be done very fast.

In the case where we want to find bright mountains separated by less bright valleys, we simply turn the landscape upside-down by inverting the image, and think of the mountains as lakes separated by ridges instead of mountains separated by valleys. The result after applying watershed segmentation to the image of the cytoplasms can be seen in Figure 3.5 C.



Figure 3.5: Image segmentation by watershed segmentation. A: Fluorescence stained cytoplasms of cultured cells. B: The intensity of A plotted as a landscape. C: The result of watershed segmentation of the inverted image.

3.3.3 Shape-based watershed segmentation

If the clustered objects are not separated by less intense borders, they may have some other feature, or combination of features, that can be included in the segmentation model. One example of such a feature is roundness. The cell nuclei in Figure 3.6 A are all fairly round in shape, but have internal intensity variations that are sometimes greater than those between the individual nuclei. The clustered nuclei can easily be separated from the background using thresholding. The thresholded image can then be transformed into a distance image, where the intensity of each object pixel corresponds to the distance to the nearest background pixel. This distance transformation can be performed in two passes through the image [7, 8]. The result will be an image showing bright cones, each corresponding to a round object, see Figure 3.6 B. Watershed segmentation can then be applied to the inverted distance image, and the clustered objects are separated based on roundness, see the result in Figure 3.6 C. Shape-based segmentation has proven useful for segmentation of cell nuclei in a number of studies [28, 38, 43, 49].


Figure 3.6: Shape-based watershed segmentation. A: Free and clustered cell nuclei. B: Distance transformation applied to a thresholded version of A. The distance from each object pixel to the image background is coded as intensity and displayed as height in a landscape. C: The result of watershed segmentation of B together with A.
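A hedged sketch of the distance-transform-plus-watershed idea, using SciPy: marker extraction via a maximum filter is this sketch's simplification, and the two-pass distance transform of [7, 8] is replaced by the exact Euclidean transform:

```python
import numpy as np
from scipy import ndimage

def split_round_clusters(binary):
    """Separate clustered, roughly round objects: distance transform of the
    thresholded image gives one bright cone per round object; watershed on
    the inverted distance image, seeded at the cone tops, splits the cluster."""
    dist = ndimage.distance_transform_edt(binary)
    # seeds: regional maxima of the distance image, restricted to the objects
    maxima = (dist == ndimage.maximum_filter(dist, size=5)) & binary
    markers, _ = ndimage.label(maxima)
    inverted = dist.max() - dist
    scaled = np.interp(inverted, (0, inverted.max()), (0, 255)).astype(np.uint8)
    labels = ndimage.watershed_ift(scaled, markers)
    labels[~binary] = 0   # restore the background
    return labels
```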


3.3.4 Edge-based watershed segmentation

It is seldom easy to separate the objects from the image background using thresholding, due to intensity variations in the image background. In some cases, it is possible to reduce these background variations by pre-processing, as discussed in Section 3.2. In other cases, a better model than one saying that objects are brighter than the background is one that says that the transition between object and background is marked by a fast change in image intensity. In Figure 3.7 A, the background in the upper left corner of the image has the same intensity as the objects in the lower right corner of the image. The objects are still clearly detectable visually, as their local intensity differs from the local background.

Intensity changes can be described by the magnitude of the image gradient. The magnitude of the gradient expresses the local contrast in the image, i.e., sharp edges have a large gradient magnitude, while more uniform areas in the image have a gradient magnitude close to zero. The local maximum of the gradient magnitude marks the position of the strongest edge between object and background. The commonly used Sobel operators are a set of linear filters for approximating gradients in the x, y (and z) directions of an image. The gradient magnitude image is approximated by adding the absolute values of the convolutions of the image with the different Sobel operators [55]. Figure 3.7 B shows the gradient magnitude, where large magnitude is shown as high image intensity. If watershed segmentation is applied to the gradient magnitude image, the water will rise and meet at the highest points of the ridges, as shown in Figure 3.7 C. This corresponds to the location of the fastest change in intensity, just as in our segmentation model.


Figure 3.7: Edge-based watershed segmentation. A: Fluorescence stained cell nuclei in a section from a tumor. Due to background variation, separation of nuclei and background by thresholding is not possible. B: The gradient magnitude of A. C: Result after applying watershed segmentation to the gradient magnitude and overlaying the result with the original image.
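The Sobel-based gradient magnitude described above can be sketched as follows (assumed 2D; a z-term would be added for volumes):

```python
import numpy as np
from scipy import ndimage

def gradient_magnitude(img):
    """Approximate the gradient magnitude as the sum of the absolute Sobel
    responses in the y- and x-directions; large values mark sharp edges,
    near-zero values mark uniform areas."""
    img = np.asarray(img, dtype=float)
    return (np.abs(ndimage.sobel(img, axis=0)) +
            np.abs(ndimage.sobel(img, axis=1)))
```

Applying watershed segmentation to this image places the region borders on the gradient ridges, i.e., at the steepest intensity transitions.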


3.3.5 Merging

When watershed segmentation is applied to an image, water will rise from every minimum in the image, i.e., a unique label will be given to each image minimum. In many cases, not all image minima are relevant. Only the larger intensity variations mark relevant borders of objects. This means that applying watershed segmentation will lead to over-segmentation, i.e., objects in the image will be divided into several parts, see Figure 3.8 A. Over-segmentation can be reduced by a pre-processing step reducing the number of local image minima, e.g., by smoothing the image with a mean or median filter. Smoothing may, however, remove important structures, such as edges, in the image. An alternative to pre-processing is post-processing. After applying watershed segmentation, over-segmented objects can be merged.

Merging can be performed according to different rules, based on the segmentation model. One example is merging based on the height of the ridge separating two catchment basins, as compared to the depth of the catchment basins. The model says that a true separating ridge must have a height greater than a given threshold. All pairs of lakes that at some point along their separating ridge have a height lower than the threshold are merged. The result of merging Figure 3.8 A at height 10 is shown in Figure 3.8 B.

Other merging criteria may also be used. For example, if we know that an object must have a certain size, we can include this in our model and say that every object smaller than this size should be merged with one of its neighbors. If there are several neighbors to choose from, we say that merging should be with the neighbor towards which the small object has, e.g., its weakest ridge. This merging method was used in Paper I, see Section 4.1. The result of this merging method applied to Figure 3.8 B is shown in C. The length of the border between two objects can also be used to decide whether neighboring objects should be merged or not [58].

Defining the strength of a border as the weakest point along the border may lead to merging of many correctly segmented objects, due to single weak border pixels or weak border parts originating from locally less steep gradients. Another simple measure, which is less sensitive to noise and local variations, is the mean value of all pixels along the border. This approach to merging, applied to the gradient magnitude, is described in Papers VII and VIII, as discussed in Section 4.4.
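A hedged sketch of merging on the mean border strength follows; the pair bookkeeping and union-find are this sketch's own, and Papers VII and VIII should be consulted for the actual procedure:

```python
import numpy as np

def merge_on_border_mean(labels, gradient, threshold):
    """Merge neighboring regions whose shared border is weak, where border
    strength is the MEAN gradient magnitude over all border pixels (less
    noise-sensitive than the weakest point along the border)."""
    sums, counts = {}, {}
    def collect(a, b, ga, gb):
        border = a != b
        lo = np.minimum(a[border], b[border])
        hi = np.maximum(a[border], b[border])
        strength = (ga[border] + gb[border]) / 2.0
        for i, j, s in zip(lo, hi, strength):
            key = (int(i), int(j))
            sums[key] = sums.get(key, 0.0) + s
            counts[key] = counts.get(key, 0) + 1
    # horizontal and vertical neighbor pairs crossing a region border
    collect(labels[:, :-1], labels[:, 1:], gradient[:, :-1], gradient[:, 1:])
    collect(labels[:-1, :], labels[1:, :], gradient[:-1, :], gradient[1:, :])
    parent = {}
    def find(r):
        parent.setdefault(r, r)
        while parent[r] != r:
            parent[r] = parent[parent[r]]
            r = parent[r]
        return r
    for (i, j) in sums:
        if sums[(i, j)] / counts[(i, j)] < threshold:
            parent[find(i)] = find(j)     # weak border: merge the pair
    out = labels.copy()
    for lab in np.unique(labels):
        out[labels == lab] = find(int(lab))
    return out
```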

3.3.6 Model-based splitting

Figure 3.8: Edge-based merging. A: Fluorescence labeled cytoplasms with internal intensity variations leading to over-segmentation. B: Result after merging on minimum height of separating ridge. Some over-segmentation still remains. C: Result after further merging of each small object with the neighbor towards which it has its weakest ridge.

Watershed segmentation may also lead to under-segmentation, e.g., if the image has been saturated and no intensity variations are present at the border between clustered objects. Two examples of clusters are shown in Figure 3.9 A. As mentioned above, clusters of round objects may be split using a distance transformation followed by watershed segmentation. If the objects are not round, more advanced models are needed to decide which objects are clusters, and how to split them. In Paper I, discussed in Section 4.1, a number of features, such as area and curvature, are combined to classify each object as a correctly segmented cell or a cluster of cells. All objects classified as clusters are sent to a splitting step. A number of different splitting lines are found by pairing concavities of the clusters as described in Section 4.1. The splitting line resulting in the two new objects best corresponding to the model of a correctly segmented cell is used as the final splitting line. The line detection and the result after splitting are illustrated in Figure 3.9 B and C.


Figure 3.9: Model-based splitting. A: Two objects are under-segmented after watershed segmentation and merging. The clusters are found by comparing each object to a model of a correctly segmented cell using a combination of features. B: Concavities of the clusters (shown in gray) are used to create possible splitting lines (black). C: Result after splitting of clusters.

3.3.7 Seeded watershed segmentation

Both over- and under-segmentation can also be reduced by including a priori information in our model before applying watershed segmentation. Seeded watershed segmentation [3, 40, 59] means that starting regions, called seeds, are given as input to the watershed segmentation. Water is then only allowed to rise from these seeded regions, and all other image minima are flooded by the water rising from the seeds. The water will continue to rise until the water rising from one seeded region meets the water rising from another seeded region, or reaches a pre-defined object/background threshold. This means that we will always end up with exactly as many regions as we had input seeds.
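A minimal sketch of this flooding scheme (pure Python, ordered flooding via a priority queue; not the thesis implementation, and the optional object/background threshold is omitted):

```python
import heapq

def seeded_watershed(image, seeds):
    """`image` is a 2D list of intensities, `seeds` a dict
    {label: [(row, col), ...]}. Water rises only from the seeded pixels;
    every other pixel is flooded by the seed that reaches it at the
    lowest intensity, so the result has exactly one region per seed."""
    rows, cols = len(image), len(image[0])
    label = [[0] * cols for _ in range(rows)]
    heap, order = [], 0
    for lab, pixels in seeds.items():
        for (r, c) in pixels:
            label[r][c] = lab
            heapq.heappush(heap, (image[r][c], order, r, c, lab))
            order += 1
    while heap:
        _, _, r, c, lab = heapq.heappop(heap)
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and label[nr][nc] == 0:
                label[nr][nc] = lab  # flooded from the nearest seed
                heapq.heappush(heap, (image[nr][nc], order, nr, nc, lab))
                order += 1
    return label
```

On a one-row image with two seeds on either side of an intensity ridge, the two regions grow until they meet at the ridge.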

Seeds from h-maxima transformation

Seeds can be set manually [37], or in an automated way. For example, we may know that, despite variations in both object and background, each object has a certain contrast compared to its local neighborhood. Such regions can be detected using morphological filters. One example is the extended h-maxima transform, which filters out the relevant maxima using a contrast criterion [54]. All maxima are compared to their local neighborhood, and only those maxima greater than a given threshold h are kept. A low h will result in many seeds, often more than one seed per object. A high h will result in fewer seeds, and some objects may not get a seed at all.

An example is shown in Figure 3.10. The intensity along a pixel row in A is shown in B, and the h-maxima are marked in gray. Note that maxima that do not contain gray markers do so in a different pixel row. Despite background variation, h-maxima are found in all cells, as shown in C.
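A one-dimensional sketch of the h-maxima transform (grayscale reconstruction by dilation iterated to stability; an intentionally naive illustration, not an efficient or official implementation):

```python
def h_maxima_1d(signal, h):
    """h-maxima transform of a 1D signal: reconstruct (signal - h) under
    signal by repeated dilation; maxima with contrast <= h are
    suppressed, and the extended maxima are the regional maxima of the
    returned signal."""
    marker = [v - h for v in signal]
    changed = True
    while changed:
        changed = False
        for i in range(len(marker)):
            neighbors = [marker[i]]
            if i > 0:
                neighbors.append(marker[i - 1])
            if i < len(marker) - 1:
                neighbors.append(marker[i + 1])
            # dilate the marker, but never rise above the original signal
            new = min(max(neighbors), signal[i])
            if new != marker[i]:
                marker[i] = new
                changed = True
    return marker
```

For the signal `[0, 3, 1, 10, 0]` with `h = 5`, the result is `[0, 1, 1, 5, 0]`: the peak of height 3 is flattened away, and the extended maxima mark only the peak of height 10.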

Seeded watershed segmentation is very useful if we perform our segmentation in the gradient magnitude of the image. We can find seeds in object and background regions based on intensity information in the original image, and then let the water rise from these seeds placed in the gradient magnitude image. Object and background seeds are shown in Figure 3.10 D and the result after watershed segmentation is shown in E. This segmentation approach, combined with merging based on edge-strength (see Figure 3.10 F), was developed and used in Papers VII and VIII, and is discussed in Section 4.4.

Seeds from a parallel image

Cells often vary very much in shape and size, and touch each other. Watershed segmentation will not always give a satisfactory result, as seen in Figure 3.11 A. If we have a single seed per cell, the task of finding the borders of the cell is greatly simplified. The h-maxima transformation results in useless seeds due to great intensity variations within the cells. The nice thing about cells is, however, that each cytoplasm has a natural marker that may be included in the segmentation model: the nucleus. If the nuclei, which are fairly round in shape and usually nicely separated, are stained and imaged in parallel with the cells, they can be used as seeds for watershed segmentation of the cells. The nuclei of the cells in Figure 3.11 A are shown in B, and the result of seeded watershed segmentation using the nuclei as seeds is shown in C.



Figure 3.10: Seeded watershed segmentation. A: Fluorescence stained cell nuclei in tumor tissue. Due to background variation, separation of nuclei and background by thresholding is not possible. B: Intensity profile across one row of pixels of A, and h-maxima at h=5 shown as vertical bars in gray. Nuclei without h-maxima have h-maxima in a different row. C: The original image with the h-maxima overlaid. D: Object seeds found by h-maxima transformation (white) and background seeds found by h-minima transformation of the gradient magnitude image of C followed by removal of small objects (black). E: Result after seeded watershed segmentation of the gradient magnitude image. More than one seed per object leads to over-segmentation. F: Over-segmentation is reduced by merging on edge-strength. Also poorly focused objects may be removed by this step.

In time-lapse experiments, nuclear stains are often undesirable as they may interfere with the natural behavior of living cells. A nuclear stain may, however, still be used for image segmentation if the cells are fairly stationary. The stain is simply added after the completed experiment, and the same image of the nuclei is used for segmentation of all time-lapse images. In Paper II, a nuclear stain was present throughout the experiment, but only one image was needed for segmentation of the full time-lapse experiment. If the cells move, an image of the nuclei can be used as a starting point for back-tracking of cell motion.

3.3.8 Extension to 3D

Most of the discussed methods can be extended to 3D images. For most methods, the only difference is that instead of working with 2D 3 × 3 pixel neighborhoods, we work with 3D 3 × 3 × 3 voxel neighborhoods. As an example, the result of applying seeded watershed segmentation to a 3D image of fluorescence stained cell nuclei in a tumor section is shown in Figure 3.12, taken from Paper VIII.

Figure 3.11: Cell segmentation using nuclei as seeds. A: Clustered fluorescence labeled cells with varying shapes and intensities are difficult to separate from each other. Watershed segmentation will result in both over- and under-segmentation (white lines). B: A parallel image showing the cell nuclei can be used as seeds for watershed segmentation of the cells. C: The result of watershed segmentation (white lines) using the nuclei (black lines) as seeds.

3.3.9 Other methods

Many other methods for image segmentation exist, for related and completely different image analysis applications. We have mainly focused on fast algorithms (especially in Papers I and II) to allow large scale screening studies. Algorithms which are known to be comparatively slow, e.g., diffusion models and iteratively refined active shape models [27], have not been tested. If the time restrictions allow it, additional refinement of the segmentation results can most probably be achieved by post-processing the segmentation with, e.g., a snake algorithm [44]. Improvements have recently been made to the speed of snake-based models, but user interaction is often needed [26]. Segmenting the same image several times with different algorithms, followed by selection of the best result based on shape and intensity features, may also be an attractive approach if the time is not restricted.

3.4 Feature extraction

Once the objects of interest have been segmented from each other and from the image background, a large number of descriptive features can be extracted from the individual objects in the image, see [50] for an overview. All features that we can extract have to be based on the actual pixel values and their spatial arrangements within the object, as that is the only information available. Some examples of features, and suggestions of how they may be calculated, are presented below.




Figure 3.12: Segmentation of a 3D image. A: Maximum intensity projection of 99 z-slices of a 3D image of a cervical carcinoma tumor. B: Surface rendering (using marching cubes) of the final 3D segmentation result. Objects on the border of the image have been removed for better visualization. A densely packed cell layer is clearly visible. C, D: Close-ups showing cases where our method separates clustered cell nuclei.

Morphometric, or shape features; features that are based solely on the spatial arrangements of pixels/voxels.

Area/Volume: The number of pixels/voxels belonging to the object.

Perimeter: The sum of steps taken when walking around the edge pixels of a 2D object. For accuracy, horizontal and vertical steps are optimally weighted by a = 0.948 and diagonal steps are optimally weighted by b = 1.343 [14, 29, 34].

Compactness index: A measure of compactness = Perimeter² / (4π × Area).

Convex area: The area of the convex hull of the object. The convex hull can be approximated as the smallest convex polyhedron that will cover the object [9].


Convex perimeter: The perimeter of the convex hull.
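The perimeter and compactness definitions above can be sketched as follows (the contour format and function names are our own; the step weights are the ones quoted in the text):

```python
import math

A = 0.948  # optimal weight for horizontal/vertical steps
B = 1.343  # optimal weight for diagonal steps

def weighted_perimeter(contour):
    """Perimeter estimate for a closed 8-connected boundary walk, given
    as (row, col) points with consecutive points one chain-code step
    apart. Axis-aligned steps count A, diagonal steps count B."""
    total = 0.0
    for i in range(len(contour)):
        (r0, c0), (r1, c1) = contour[i], contour[(i + 1) % len(contour)]
        total += B if (r0 != r1 and c0 != c1) else A
    return total

def compactness(perimeter, area):
    """Compactness index Perimeter^2 / (4 * pi * Area): exactly 1 for a
    continuous circle, larger for less compact shapes."""
    return perimeter ** 2 / (4 * math.pi * area)
```

For a continuous circle (perimeter 2πr, area πr²) the compactness index evaluates to exactly 1, which is why it is a convenient normalized shape measure.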

Densitometric, or intensity features; features that only describe the gray-level values (without considering the spatial distribution).

Integrated intensity: The sum of the intensity values of all pixels/voxels belonging to the object.

Mean object intensity: The average of the pixel/voxel values.

Intensity range: The difference between the maximum and minimum object intensity.

Textural, or structural features; more complex features, describing combined spatial and gray-level values.

Edge intensity range: The difference between the maximum and minimum intensity along the border of the object.

Mass displacement: The Euclidean distance between the center of mass given by the gray-level image of the object and the center of mass given by a binary mask of the object.
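As an illustrative sketch (the pixel-tuple format is our own), the mass displacement feature can be computed as:

```python
import math

def mass_displacement(pixels):
    """pixels: list of (row, col, intensity) tuples for one object.
    Distance between the intensity-weighted center of mass and the
    centroid of the binary mask (all pixels weighted equally)."""
    n = len(pixels)
    total = sum(v for _, _, v in pixels)
    bin_r = sum(r for r, _, _ in pixels) / n
    bin_c = sum(c for _, c, _ in pixels) / n
    gray_r = sum(r * v for r, _, v in pixels) / total
    gray_c = sum(c * v for _, c, v in pixels) / total
    return math.hypot(gray_r - bin_r, gray_c - bin_c)
```

A two-pixel object with intensities 1 and 3 has its center of mass pulled towards the brighter pixel, giving a displacement of 0.25 pixels.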

All categories of features have been used in the work leading to this thesis. A number of problem-specific features have also been developed, see Paper II, discussed in Section 4.1.

3.5 Data analysis and evaluation

The numerical data produced by feature extraction may not always be the desired end result, and it may be difficult to interpret. A simple example is if the goal of the analysis is to decide the percentage of small, medium-sized and large objects in an image. Numerical data representing the area of each object does not provide the final answer. The numerical data has to be analyzed, and each object has to be classified as small, medium, or large in order to calculate the desired percentages.
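The small/medium/large example can be written out directly (thresholds and names hypothetical):

```python
def size_percentages(areas, small_max, medium_max):
    """Toy version of the example above: classify each object by its
    area feature (thresholds hypothetical) and report the percentage of
    small, medium-sized and large objects."""
    counts = {"small": 0, "medium": 0, "large": 0}
    for a in areas:
        if a <= small_max:
            counts["small"] += 1
        elif a <= medium_max:
            counts["medium"] += 1
        else:
            counts["large"] += 1
    return {k: 100.0 * v / len(areas) for k, v in counts.items()}
```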

In many cases, the goal of the analysis is to retrieve more complicated information from the images, and a single feature like area is not sufficient for object description and classification. A larger number of features is needed, and the different classes have to be separated by multivariate statistical analysis.

It is very important to note that increasing the number of features for object classification will not necessarily improve the classification result. It has been observed that, beyond a certain point, inclusion of additional features leads to worse rather than better performance [16]. One should instead try to pick a limited set of features that can discriminate between the relevant populations as well as possible. This means that feature selection should be problem-specific.


An additional requirement is robustness, so that the results can be reproduced for new, independently collected material.

Most classifiers make use of a priori knowledge, e.g., the classifier consists of a set of intelligently chosen thresholds in feature space. A common way to create a classifier is by training the classifier on data with known classes. This known dataset is often referred to as the “gold standard”. For single cell analysis, few methods other than visual inspection are available for creating the gold standard. This means that the gold standard may be biased by the observer. An unbiased gold standard is very desirable, but often difficult to find in image cytometry. Visual inspections by different persons may give different answers (inter-observer variability). If the same analysis is performed twice by the same person, we may see intra-observer variability as well. The best way to increase the reliability of a gold standard is to increase the number of visual inspections, as well as the number of different professionals that perform the inspection.

Training the classifier on a gold standard (unbiased or not), and achieving good results on the training set, does not always mean that we will get reliable results when classifying new data. That is, if the feature set is very large, and the number of objects in the training set is small, we will always be able to create a classifier that is 100% correct on the training set, but most likely fails on a test set. The classifier has learned to recognize all the individual objects rather than building a generally valid model of their appearance. If the training set is fixed in size, the only way to avoid this problem is by using a smaller number of features. The feature set can be reduced by automatic feature selection, as described in Papers I and II, discussed in Section 4.1. Briefly, one can say that different classifiers trained on different combinations of a reduced number of features are tested on a test set. This test set must also be known. Based on the classifier performance on the test set, an optimal number and combination of features can be found.

In order to see if the final classifier will perform well also on data not used during training or testing, the classifier has to be evaluated. This means that we need yet another set of known data, not previously seen by the classifier. If the classifier performs well also on this data, referred to as the validation set, it is probably useful.

Training the classifier, selecting relevant features, and evaluating the classifier performance can all be combined in one step, as described in Papers I and II, discussed in Section 4.1.

3.6 Implementation

All algorithms tested and used in this thesis have been implemented by the author, alone or in cooperation with Joakim Lindblad and Ida-Maria Sintorn.


The program code is written in C and C++, as well as the macro language for the IPAD/IMP platform [42], where the functions are integrated. IMP (IMage Processing) is a general image analysis software for image data developed at the Centre for Image Analysis. The software runs on any standard UNIX workstation using X-Windows and Motif as user interface. Some of the methods were first implemented and tested using Matlab (The MathWorks, Inc., Natick, MA).

Although very little effort has gone into optimizing the code, some examples of preliminary performance figures can be worth mentioning. With the current implementation in C++, the seeded cytoplasm segmentation presented in Section 3.3.7, when performed on an image of 640 × 640 pixels containing 153 cells, takes 0.68 seconds on a 1.7 GHz Pentium. The preceding segmentation of nuclei takes less than 0.27 seconds.


New methods and applications

4.1 Papers I and II: Cell segmentation and analysis of Rac1 activation

Drugs are often synthetic molecules that affect the dynamic events in individual cells. By studying cells over time, and exposing them to drugs, important information about the applied drugs can be retrieved. Analysis of single cells can thus be used as a screening step in the search for new drugs. Cell-based analysis provides a cheap, fast, and ethical pre-screening step prior to testing new substances on animals and at the clinical level. High-throughput screening requires fast and fully automatic methods for single cell analysis.

The basis for all automated image cytometry is cell segmentation. Before any cell-specific spatial, spectral and temporal features can be extracted, the individual cells have to be segmented from the image background and from each other. While the nuclei of the cells are often quite distinct and easy to detect, e.g., based on their regular shape [38, 49], the cells provide much more of a challenge, especially if no nuclear counter stain is available. The goal of Paper I was therefore to develop an algorithm for fully automatic cell segmentation. This, in turn, led to the development of a cell classifier and a procedure for automatic feature selection.

In Paper II, it was not the cell segmentation in itself that was the goal, but analysis of dynamic events taking place in cells exposed to different stimuli. Rac1, a molecule that is involved in a wide range of cellular processes, such as actin reorganization, cell cycle progression, gene transcription, cell adhesion and migration, was fused with GFP, as described in Section 2.2.5. Translocation and ruffle formation of the GFP-Rac1 fusion protein was imaged over time in living cells. Upon addition of an insulin-like growth factor (IGF-1), GFP-Rac1 was translocated to the cellular membrane. A fully automatic digital image analysis system for quantitative estimation of GFP-Rac1 translocation in single cells was developed. Such a system could be the basis for high-throughput drug screening.

4.1.1 Methods and experiments

A cell classifier was developed during the work leading to Paper I. The same procedures were used for creating a cell activation classifier in Paper II. The procedures for creating a classifier, including automatic feature selection, testing and evaluation, are therefore described after the overview of the methods and experiments of Papers I and II.

Cell segmentation: Paper I

The goal of Paper I was to develop a method for automatic cell segmentation. As cells vary in size and shape, and often touch, simple segmentation techniques often result in partial (over-segmented) and/or clustered (under-segmented) cells, all mixed up with the correctly segmented cells. A method to improve the result was developed by combining a number of processing steps as shown in Figure 4.1 A. Errors such as over- and under-segmentation produced by the initial processing step are, to some extent, detected and corrected by an automatic quality control and feed-back step.

The process is initiated by reduction of intensity non-uniformities using the B-spline-based data-driven approach described in Section 3.2. Once the background has become flat and even, the cells can be separated from the background by intensity thresholding. Cells are separated from each other using watershed segmentation with double thresholds: a lower threshold that represents the rising water-level and an upper threshold describing the allowed variation of the water-level. The distance between the two thresholds decides the minimum intensity valley that can separate two adjacent cells. This reduces over-segmentation due to false valleys that would appear with the standard watershed algorithm. The distance between the upper and the lower threshold depends on the imaging conditions and was set by visual inspection of a set of training images. The result of the initial segmentation step when applied to Figure 4.1 B can be seen in Figure 4.1 C.

Despite the double thresholds, the initial segmentation often results in both over- and under-segmentation, as well as a lot of small noise objects. The over-segmentation is reduced by merging small objects with their neighboring objects. Small objects are defined by a threshold on integrated pixel intensity. Once a small object is found, its neighborhood is examined. If a single touching neighboring object is found, this label is put in a merging list. If no touching objects are found, a zero is put in the merging list. If several touching objects are found, the label of the neighbor with the highest integrated intensity of touching border pixels is put in the merging list. When all small objects have been examined, the merging list is reduced to avoid exchange of labels between groups of small objects. Objects are then merged according to the merging list. A zero in the merging list means that the object should be discarded as noise. The result after merging, and removal of small noise objects, can be seen in Figure 4.1 D.
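The core of this merging rule can be sketched as follows (illustrative only, with a data format of our own; the list-reduction step that prevents label exchange between groups of small objects is omitted for brevity):

```python
def build_merging_list(small_objects, border_intensity):
    """`small_objects` lists the labels below the size threshold;
    `border_intensity[label]` is a dict {neighbor_label: integrated
    intensity of the touching border pixels}. Returns {label: merge
    target}, where target 0 means the object is discarded as noise."""
    merging = {}
    for label in small_objects:
        neighbors = border_intensity.get(label, {})
        if neighbors:
            # merge towards the neighbor with the strongest shared border
            merging[label] = max(neighbors, key=neighbors.get)
        else:
            merging[label] = 0  # isolated object: treat as noise
    return merging
```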

As can be seen in Figure 4.1 D, some cells are not correctly segmented after the initial segmentation and merging step; they are under-segmented. Before splitting clusters of under-segmented objects, all under-segmented clusters have to be found. This is done by the quality control. By extracting descriptive features from all objects, and comparing them with the features of a model of a correctly segmented cell (based on a large number of correctly segmented cells), a quality measure can be extracted. The quality measure is the statistical Mahalanobis distance between the object and the cell model in the multi-parameter space of the selected features [25]. In other words, the quality measure is a classifier, as described below.

Figure 4.1: Segmentation of cells. A: An overview of the steps in the presented cell segmentation algorithm. Objects that are non-cell-like (according to the quality measure) are sent to the splitting step. B: A gray-level image of a cluster of cells. C: The result of the initial segmentation step. Some objects are over-segmented, i.e., split into several smaller parts. D: The result after merging.

Objects that show low similarity with the cell model have low quality and are sent to the splitting step as they may be clusters. Potential splitting lines are found based on concavities in the binary shape of the objects. Using a 5 × 5 neighborhood, the concavities are found and filled, giving a discrete approximation to the convex hull of the object [9]. The deepest points of every concavity, calculated as the distance from the convex hull to the object border, are used as end-points for potential splitting lines. For every object, a set of end-points is obtained. End-points from different concavities are then combined, forming potential splitting lines. For every potential splitting line, the quality of the two new objects is calculated and compared with the quality of the object before the split. A splitting line must result in two high-quality objects, where the worst quality of the two is better than that of the cluster. Should the object contain less than two concavities, or should no splitting line resulting in better object quality be found, nothing is done.

Figure 4.2: CHO-hIR cells expressing GFP-Rac1 fusion protein, imaged on the IN Cell Analyzer 3000. A: Cytoplasms before and B: 4.3 min after incubation with IGF-1. Ruffles appear as bright formations along the edges of the cells.

Every object will have a quality measure at the end of the segmentation procedure. If the application allows for it, objects with low quality, i.e., objects that have little similarity with the model of a correctly segmented cell, can be excluded from further analysis by a threshold on object quality. The obvious next step is analysis of spatial, temporal and spectral cell features, but this was not within the scope of Paper I.

Rac1 activation analysis: Paper II

The objective of Paper II was to create a fully automatic method for measurement of Rac1 activation. Rac1 was visualized using cells expressing GFP-Rac1 fusion protein. Rac1 activation can be described as bright formations appearing at the outer rims of the cytoplasm, referred to as ruffles, see Figure 4.2. Image sequences showing cells before and after exposure to stimuli (IGF-1 or insulin) were segmented to identify individual cells, and a number of features were extracted and used as a basis for classification and evaluation of activation.

Images were acquired using an automated confocal high-speed fluorescent system with built-in auto-focusing, designed for imaging of real-time cellular events (IN Cell Analyzer 3000 by Amersham Biosciences). Intensity non-uniformities as well as a vertical striped pattern appeared in the images due to disturbed sensor response. The stripe pattern was reduced by approximating the background-level of each vertical pixel column as the 20th percentile of the pixel intensities. Each column was then divided by this value, making the background constant for each column. The B-spline based approach described in Section 3.2 was then used for reduction of intensity non-uniformities. The final background estimate was also used for object/background segmentation.
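The column-wise stripe reduction can be sketched as follows (illustrative only; a simple nearest-rank percentile estimate stands in for whatever estimator was actually used):

```python
def flatten_stripes(image):
    """Estimate each pixel column's background level as the (nearest-
    rank) 20th percentile of its intensities and divide the column by
    it, making the background level 1.0 in every column."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for c in range(cols):
        column = sorted(image[r][c] for r in range(rows))
        background = column[int(0.2 * (rows - 1))]  # crude 20th percentile
        for r in range(rows):
            out[r][c] = image[r][c] / background
    return out
```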

Cell segmentation was greatly simplified through the presence of a parallel image showing the nuclear stain Hoechst (similar to the DAPI stain described in Section 2.2.4). The nuclei were relatively easy to segment by intensity thresholding and shape-based watershed segmentation as described in Section 3.3. The cells could thereafter be segmented using seeded watershed segmentation, as described in Section 3.3.

Once the cells were segmented, the individual cells could be classified as showing no, moderate or high activation. A number of general-purpose descriptive features, together with a limited number of problem-specific features, were extracted from each individual cytoplasm. The problem-specific features included watershed-based segmentations of the cytoplasm landscape, using the idea that ruffles can be thought of as coastal mountains of a cell-island, see Figure 4.3. The area of the coastal mountain-like ruffles, as well as the area of the drainage region of the cell-island before and during ruffling, proved to be useful features for classification of Rac1 activation. Features were selected, and a classifier was trained, tested, and evaluated as described below.

The final classification results were compared with those achieved by visual inspection. Variation between visual classifications performed by different persons, as well as the variation between classifications performed by the same person at two different time points, i.e., inter- and intra-observer variability, were also compared with the results from the automatic classification.

Classifier training was complicated due to the variance in the known training data, or gold standard. The gold standard was created by visual inspection of the images, as no other method for evaluation of time-dependent events taking place in single cells is available. The robustness of the developed method was, however, also tested on a separate image set with additional a priori information. This set of image series showed GFP-Rac1 expressing cells that had been exposed to different concentrations of insulin. An increased Rac1 activation could be expected at higher insulin concentration, and therefore provided a priori information about the expected result.

Creating a classifier

Quadratic Discriminant Analysis (QDA) [16] is a classification method derived from Bayes decision theory and is designed to minimize the error rate, i.e., the percentage of incorrectly classified instances in a dataset. Based on a set of discriminant functions g_i(x), i = 1, 2, …, c, a feature vector x is assigned to class ω_i if g_i(x) > g_j(x) for all j ≠ i. General Bayes minimum-error-rate classification can be achieved by use of the discriminant functions g_i(x) = log p(x | ω_i) + log P(ω_i), where P(ω_i) is the a priori probability of class i.

Figure 4.3: Illustration of a problem-specific feature; ruffle regions. A, B: The ruffle regions, shown in a darker shade, before and after stimulation with IGF-1. C, D: 3D visualization of the ruffle regions, where intensity is interpreted as height in a landscape.

Evaluating this expression for a d-dimensional multivariate normal distribution, we arrive at the following discriminant functions

g_i(x) = −½ (x − µ_i)ᵗ Σ_i⁻¹ (x − µ_i) − (d/2) log 2π − ½ log |Σ_i| + log P(ω_i)    (4.1)

Substituting the mean and covariance matrix with the sample means m_i and sample covariance matrices S_i for the c classes, we get a QDA classifier that has the following discriminant functions

g_i(x) = −½ (x − m_i)ᵗ S_i⁻¹ (x − m_i) − ½ log |S_i| + log P(ω_i)    (4.2)

giving hyperquadric decision boundaries.

In Paper I, only one class is used, i.e., the class of a correctly segmented cell, and the quality measure is given by the inverse of the discriminant function.


The more an object deviates from the cell model, the lower its quality. The quality of an object x is

q(x) = [ ½ (x − m)ᵗ S⁻¹ (x − m) ]⁻¹    (4.3)

where m is the mean and S is the covariance matrix of the features describing our cell model. The cell model is based on a large number of correctly (semi-manually) segmented cells. The quality measure is also referred to as the Mahalanobis distance from m to x.
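Equation (4.3) is straightforward to express in code (a direct sketch; the inverse covariance matrix is assumed to be precomputed):

```python
def mahalanobis_sq(x, m, s_inv):
    """Squared Mahalanobis distance (x - m)^t S^{-1} (x - m), with x and
    m as lists and s_inv the inverse covariance matrix as nested lists."""
    d = [xi - mi for xi, mi in zip(x, m)]
    n = len(d)
    return sum(d[i] * s_inv[i][j] * d[j] for i in range(n) for j in range(n))

def quality(x, m, s_inv):
    """Quality measure of Eq. (4.3): the inverse of half the squared
    Mahalanobis distance from x to the cell model mean m, so objects
    close to the model get high quality."""
    return 1.0 / (0.5 * mahalanobis_sq(x, m, s_inv))
```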

The estimates of the class mean values and covariance matrices have to be derived from a “training set” of known data, or gold standard. As described below, the classifier must also be tested on a known “test set”, and finally evaluated on a known “validation set”. In other words, a lot of data with known classes is needed. For analysis of time-dependent events in single cells, the only available gold standard is the classification of cells by visual inspection, limiting the amount of known data.

Feature selection

Having a classifier and a set of features, it is tempting to use the complete set of features in the classification, as every included feature should, in the ideal case, contribute to reducing the probability of error. Unfortunately, this is not the case, as described in Section 3.5. Instead, selecting a limited number of features reduces the tendency for over-training and improves the classifier performance on new, unseen, datasets. Feature selection algorithms can be characterized by a search rule, a selection criterion, and a stopping rule [48].

The search rule that we have applied is the Sequential Floating Backward Selection procedure [47]. First, a large number of features (selected due to their assumed relevance for the current problem) was extracted from the training set and included in the classifier. One feature at a time was then temporarily removed, and the performance of the classifier was tested on a test set. The feature that contributed the least to the classification performance was then removed. This was done over and over again until only one feature was left. After removing a feature, inclusion of each of the previously removed features was tested. In this way, accidental removal of the best feature early in the process was avoided.
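The floating search can be sketched as follows (illustrative only, not the thesis code; `score` stands for the test-set performance of a classifier trained on the given feature subset, and all names are our own):

```python
def sfbs(features, score):
    """Sequential Floating Backward Selection sketch. The best subset
    seen for every subset size is recorded, so the overall winner can be
    picked afterwards (the stopping rule described below)."""
    current = list(features)
    best = {len(current): (score(current), list(current))}
    while len(current) > 1:
        # backward step: drop the feature whose removal hurts least
        s, worst = max((score([f for f in current if f != g]), g) for g in current)
        current.remove(worst)
        if len(current) not in best or s > best[len(current)][0]:
            best[len(current)] = (s, list(current))
        # floating step: re-include a removed feature if that beats the
        # best subset previously recorded at the larger size
        improved = True
        while improved and len(current) < len(features):
            improved = False
            for f in features:
                if f in current:
                    continue
                s2 = score(current + [f])
                if s2 > best[len(current) + 1][0]:
                    current = current + [f]
                    best[len(current)] = (s2, list(current))
                    improved = True
                    break
    return max(best.values())  # (best score, best feature subset)
```

With a toy score that rewards two truly useful features and mildly penalizes subset size, the search recovers exactly the useful pair.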

The selection criterion tells us which features to include or not in the classifier. We used increase in classifier performance as a selection criterion. ROC curves (Receiver Operating Characteristic) [56] were used for evaluation of classifier performance in Paper I. In Paper II, the performance was measured using Cohen's weighted κ_w [19]. Cohen's κ provides a measure of the degree to which two classifiers concur in their respective sortings of items into mutually exclusive categories. On the basis of the proportion in which the judges agree (p_o) and the proportion of agreement expected by chance (p_c), Cohen's κ is given by

κ = (p_o − p_c) / (1 − p_c)    (4.4)

Cohen's weighted κ takes the “off-by-one” mis-classifications into account, and judges them as partially correct. Cohen's κ and Cohen's weighted κ provide performance measures which are more reliable than just percentages of correct sortings.
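Equation (4.4) in code form, for the unweighted case (a sketch; the weighted variant additionally down-weights off-by-one disagreements):

```python
def cohens_kappa(confusion):
    """Cohen's (unweighted) kappa from a square confusion matrix, where
    confusion[i][j] counts items the first judge put in class i and the
    second in class j."""
    n = sum(sum(row) for row in confusion)
    k = len(confusion)
    p_o = sum(confusion[i][i] for i in range(k)) / n          # observed agreement
    row = [sum(confusion[i]) / n for i in range(k)]
    col = [sum(confusion[i][j] for i in range(k)) / n for j in range(k)]
    p_c = sum(row[i] * col[i] for i in range(k))              # chance agreement
    return (p_o - p_c) / (1 - p_c)
```

Perfect agreement gives κ = 1, while agreement at chance level gives κ = 0, which is exactly why κ is preferred over raw percentage agreement.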

As a stopping rule, we picked the set of features leading to the best-performing classifier. During the feature selection, we kept track of the features included in the best-performing classifier for every possible number of features. We could then backtrack this list and pick the feature set resulting in the best classifier.

Training and evaluation

For training and classifier evaluation, a limited number of visually classified images was available. The limited amount of training data was mainly due to the tedious visual classification. The visually classified cells had to be used as training and test sets during feature selection, as well as for the final evaluation of classifier performance. Since unbiased evaluation of the result is of utmost importance, great care was taken to never use the same data for both training and evaluation.

A cross-validation scheme was applied to maximize the use of the data. The classification results were compared with manually classified results. In Paper II, the results were also compared with the inter- and intra-observer variability of visual classifications.
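The exact cross-validation scheme is not detailed here; a plain k-fold split, which keeps training and test data disjoint in every round, could look like this illustrative sketch:

```python
def k_fold_splits(n_items, k):
    """Yield (train, test) index lists for plain k-fold cross-validation.
    Illustrative sketch; the thesis does not specify the fold layout."""
    folds = [list(range(n_items))[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        # Training set: everything outside the held-out fold.
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test
```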

4.1.2 Results

Cell segmentation

The cell segmentation method of Paper I was tested on images of cells stained with a cytoplasmic stain and imaged by a CCD camera attached to a fluorescence microscope. A 10× and a 20× objective were used, resulting in two image sets with different resolution. Features for the quality measure were automatically selected using the methodology previously outlined, and four features were found to be enough to give a good trade-off between performance and generalization. Examples of features that were selected are mass displacement, integrated pixel intensity, edge intensity range, and convex perimeter.

The result of the segmentation was compared with a manual segmentation of the same image. For the 10× image set, 89% of a total of 909 cells were correctly segmented. However, if the quality measure of the individual objects was used to throw away, e.g., the 11% least cell-like objects, 92% of the remaining cells were correctly segmented. There is of course a risk that the most interesting cells are thrown away, and this should only be done if the features of interest are independent of the features of the quality measure. For the 20× image set, 93% out of 251 cells were correctly segmented before the splitting step was applied. The splitting step proved not to be robust for the higher-resolution images, and resulted in over-segmentation. Rejecting the 12% least cell-like objects increased the success rate to 97% in the remaining sample.

Table 4.1: Comparing automatic and visual classifiers

                         all classes             class 1+2 grouped
classifier        percent correct  Cohen's κ  percent correct  Cohen's κ
C vs W                 46.2          0.241          74.0          0.313
C vs A1                43.8          0.251          68.3          0.304
C vs A2                44.2          0.209          72.2          0.249
C vs B                 47.0          0.237          76.4          0.295
A1 vs B                44.9          0.243          80.4          0.510
A1 vs A2               70.1          0.485          82.3          0.592
C vs all               65.9          0.494          79.5          0.475

Rac1 activation analysis

The automatic classification of Rac1 activation based on the described method was compared with each of three visual classifications, and with a weighted sum (W) of all the visual classifications, see Table 4.1. The three visual classifications were performed by two different persons, A and B, and person A repeated the visual classification once, 9 months after the first classification. The result of the automatic classification, i.e., the computer result, is denoted classifier C. This resulted in classifiers A1, A2, B, and C. Comparing A1 (or A2) versus B gives an estimate of the inter-observer variability, and A1 versus A2 gives an estimate of the intra-observer variability.

The last row of the table, 'C vs all', is a comparison of the computer result and the most similar of the visual results. For example, if the three visual classifications of a cell are 1, 1, and 2, the weighted sum W will say class 1; a computer classification of 2 would then normally be considered an error, but if the computer classification is instead compared with the most similar visual result, i.e., the 2, it is not counted as an error. This can also be seen in Figure 4.4,


Figure 4.4: Left: Classification results. Classes: -1: no GFP-Rac1 expression, 0: no ruffle formation, 1: low–medium response, 2: high response, 3: unclear, 4: no visual assessment. V: XXX => X corresponds to visual classifiers A1, A2, B, and the weighted sum W. C: X corresponds to the automatic classification provided by the described method. Right: Relative distribution of the different classes for increasing insulin concentration. The proportion of activated cells (classes 1 and 2) increases with increasing insulin concentration.

left, where the visual and computer-calculated classes are plotted on top of each cell. The number after the arrow corresponds to the weighted sum W of the visual results.

As shown by the table, the fully automatic computer-based classification (C)resulted in 46.2% correct classification as compared to the weighted result ofthe visual classifications. This is comparable to the inter-observer variability,which was 44.9%.

For the second dataset, an increased proportion of activated cells (classes 1 and 2) as compared to inactive cells (class 0) was, a priori, expected for higher insulin concentrations. Figure 4.4, right, shows the proportion of the different classes at increasing concentrations of insulin. The classifier clearly detects a larger proportion of activated cells (two shades of light gray) at higher insulin concentrations.

4.1.3 Conclusions and comments

The described cell segmentation method of Paper I rests on the assumption that all cells are similar, or belong to one out of a limited number of classes of similar cells. This may appear to be a problem when the goal of the analysis is to find variations in staining patterns, structures, intensities, etc. However, if it is possible to stain for several antigens in parallel, segmentation can be performed on one image, and the stain of interest, imaged at a different wavelength, can be analyzed using the segmentation result as a template for feature extraction. Cell segmentation can also be simplified by parallel staining of the cell nucleus, as is the case in Paper II.

The watershed implementation with double thresholds described in Paper I turned out to be rather slow. After publication of Paper I, the method was re-implemented by combining watershed segmentation based on sorted pixels [60] with a merging step based on the weakest border pixel, described in Section 3.3. A threshold on the weakest border pixel, corresponding to the distance between the upper and the lower threshold in the double-threshold method, gave the same results, but approximately 100 times faster.

The number of images available during development of the cell segmentation method was limited, causing variations in the choice of features depending on the choice of test image. It would be interesting to try other, more advanced classifiers than just a statistical distance measure for the quality measure. This, together with features that better separate the correctly segmented cells from the incorrectly segmented clusters, would probably improve the result. A two-class approach, where one class describes the single-cell model and one class describes the cluster, may be better than the single-cell quality measure used.

Including a nuclear stain greatly simplified the segmentation of the cells in Paper II, but is not always practical, since a nuclear stain may disturb the behavior of the cells. A nuclear stain could, however, be applied after the time-lapse experiment, when the cells have been fixed.

The difference between Rac1 classification made by the fully automatic image analysis procedure described here and the gold standard was roughly as large as the difference between two visual classifications of the same material. This indicates that the image analysis procedure is almost as reliable as the manual classification. It should be noted that visual classification of the complex Rac1 activation is very subjective, and the distinction between "moderate" and "high" responding cells was not easy to make by eye. Visual classification is also very time-consuming compared to fully automatic classification, making the automatic system attractive for analysis of large datasets. As the difficulty in classifying the cells lies not in the delineation of the cells (which is performed in a satisfactory way by the automatic system), but rather in the quantification of the ruffling, methods for semi-automatic classification based on user interaction are difficult to define.

The cell analysis methodologies described in Papers I and II are very generic in nature and applicable in a great variety of situations. The two suggested cell segmentation methods (with and without a nuclear stain) have proved to be both robust and versatile, and should be useful in a wide range of similar situations. The described problem-specific features based on ruffle regions and internal drainage regions of Paper II are explicitly designed to capture the ruffling of the cells and may thus be of limited value for studying other events. The use of watershed segmentation as a tool for finding and defining


sub-cellular regions should, however, be of importance also for other studies.

Analysis of time-dependent events, such as ruffle formation, in single cells is of great importance for cell-based drug screening and evaluation. Fully automated image-based systems, such as the one described in Paper II, promise a rapid route for the analysis of biological events that demand high spatial and temporal resolution.


4.2 Papers III and IV: Studies of cyclin A and E

Cells multiply through cell division, and the different phases of cell division are coordinated in the cell cycle. A disrupted cell cycle may lead to cancer, and detailed knowledge of the cell cycle will provide information that can be of use for diagnosis as well as treatment of cancer patients. The onset and passage through S-phase, where DNA is replicated, are controlled by cyclins A and E. In many earlier studies of cyclins A and E, cell populations have been observed as a group, and the behavior of individual cells has not been taken into consideration. In the two presented studies, the pattern of cyclin A and E expression, in relation to cell cycle position, has been analyzed in single cells growing asynchronously in unperturbed cell cultures from a wide range of tumors and normal tissue. The expression patterns of cyclin A and E in individual cells in tumors from cancer patients have also been analyzed.

The aim of Paper III was to find the precise temporal relationship between the start of cyclin A accumulation and the onset of DNA replication. In Paper IV, the cyclin E expression in normal cells was compared to the expression pattern in tumor cells.

4.2.1 Methods and experiments

Cell cultures from a wide range of tumors (malignant melanomas, breast tumors, cancer of the colon, oral epithelial cancer, and small cell lung carcinoma) and normal tissue (fibroblasts of several kinds, retinal pigment epithelial cells, and osteoblasts) were seeded on microscope cover-slips. Before the cells were fixed, cells in S-phase were labeled using BrdU, see Section 2.2.4. The cells were then stained for cyclin A, cyclin E, and BrdU using immunofluorescent stains with fluorochromes emitting fluorescence in different wavelength intervals. The DNA was also stained using DAPI. An epifluorescence microscope with a mercury lamp and conventional optics was used for viewing the cells. A CCD camera was attached to the microscope, and the different stains were imaged one by one, using different excitation and emission filters.

The goal was to analyze the expression level of cyclin A and E in individual cells, so the first step in the analysis was to find the individual cells. As cyclin A and E are active in the cell nucleus, we chose to only analyze the expression levels within the cell nuclei. The individual cell nuclei were segmented by manual intensity thresholding of the image showing the DNA stain DAPI. Connected components were labeled, and objects cut by the image border were automatically excluded. Damaged as well as overlapping nuclei, and nuclei with condensed DNA (i.e., cells in M-phase), were manually removed from further analysis. The segmentation result from the DAPI image was then used as a template for finding the nuclei in the images of cyclin A, E, and BrdU. The detected light signal was assumed to be directly proportional to the concentrations of cyclin A, E, and BrdU, and the mean pixel intensity value over each individual cell nucleus was used as a measure of concentration [6, 52]. True concentrations are difficult to measure, as different cell types may have different permeability to antibodies, different amounts of non-specific antibody binding, and varying autofluorescence. The staining intensity of each individual nucleus was therefore investigated in relation to the distribution of the staining intensities in the population as a whole.

Cells close to the G1/S border have very low amounts of cyclin A and BrdU, as synthesis has just begun and very little BrdU has been incorporated in the DNA. The majority of cells do not express cyclin E or cyclin A, or incorporate BrdU, as only a small portion of the cells in an asynchronously growing population are close to or in S-phase. In order to find the few positive (sometimes weakly staining) cells and separate them from the large number of non-staining cells, an unbiased automated thresholding method was needed. Non-staining cells also showed background fluorescence, and the staining intensities varied between different cell lines and different fluorochromes. Individual thresholds were therefore needed for each cell line and fluorochrome. A histogram of the nuclear staining intensities was created for each of the cell lines and fluorochromes, see example in Figure 4.5. The staining intensity of the non-staining cells was expected to be normally distributed, and a threshold between the staining and the non-staining populations should be placed at the upper end of this distribution. If the bending of the histogram is approximated by iteratively fitting second-degree polynomials to small "window views" of the histogram, a good threshold proved to be at the maximum of the second derivative of the bending of the histogram. The resulting threshold depends on the bin size of the histogram and the window size used for the polynomial fitting. The stability of the thresholds in relation to bin and window size was examined, and the same parameters were used for all cell lines and stains. The thresholds were also shifted ±10% to control the stability of the final results.
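The histogram-bending threshold can be sketched as follows. The bin count, the window size, and the restriction of the search to intensities above the background peak are illustrative choices, not the parameter values used in the papers.

```python
import numpy as np

def histogram_threshold(intensities, n_bins=64, window=9):
    """Place a threshold at the maximum of the local second derivative
    of the intensity histogram (illustrative sketch)."""
    counts, edges = np.histogram(intensities, bins=n_bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    half = window // 2
    second_deriv = np.full(n_bins, -np.inf)
    for i in range(half, n_bins - half):
        # Fit a second-degree polynomial to a small "window view".
        x = centers[i - half:i + half + 1]
        y = counts[i - half:i + half + 1]
        a = np.polyfit(x - x.mean(), y, 2)[0]  # leading coefficient
        second_deriv[i] = 2.0 * a              # local second derivative
    # Search only above the background peak, i.e., at the upper bend
    # of the (approximately normal) non-staining distribution.
    mode = int(np.argmax(counts))
    second_deriv[:mode + 1] = -np.inf
    return centers[int(np.argmax(second_deriv))]
```

On a large background population with a small positive tail, the maximum curvature lands at the upper bend of the background distribution, separating the two populations.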

The expression of cyclin E during S-phase was also studied using a method unrelated to BrdU and cyclin A staining intensity. During the progression of S-phase, DNA synthesis takes place in different regions of the cell nucleus. This results in variations in the intra-nuclear distribution of the BrdU staining pattern [39]. The BrdU staining pattern, imaged using 3D deconvolution microscopy, was visually inspected to determine the location within S-phase.

Figure 4.5: An example of a histogram of the BrdU staining intensity of normal fibroblasts. The fitted second-degree polynomial (solid line) and its second derivative (broken line) have been plotted as well. The vertical line represents the threshold that separates the positive cells from the negative population.

We also wanted to study whether cyclin E was present in S- or G2-phase cells in human tumors in vivo. As BrdU incorporation is not possible in fixed tumor material, a different marker for S-phase was needed. Paper III showed that cyclin A can be used as a reliable marker for S- and G2-phase cells in both normal and tumor cells. A double staining of cyclin E and cyclin A was therefore performed on samples of cervical carcinoma. Individual cell nuclei in the tumors were analyzed in a semi-automatic way, using manual delineation and mean pixel intensity as a measure of cyclin concentration. In an attempt to study the clinical relevance of the varying cyclin E expression, 12 matched samples of cervical carcinoma in different tumor stages were investigated. The samples were matched with regard to age, tumor stage, and tumor grade. The patients had either died from metastasising disease within 3 years after receiving the initial treatment, or were still alive and disease-free after at least six years of follow-up.

4.2.2 Results

Cyclin A: Paper III

From examination of the cell populations, we could see that nuclear cyclin A staining is strictly limited to the S, G2, and early M fractions of cells in both normal and transformed cell cultures. The number of cyclin A-positive cells was highly correlated with the number of BrdU-positive cells for all examined cultures. Cultures with short G2 phases show a 1:1 relationship between the numbers of cyclin A- and BrdU-positive cells. Cultures with longer G2 phases have a larger proportion of cyclin A-positive cells, see Figure 4.6, left. Very few of the cells positive for cyclin E, i.e., cells in late G1 and early S, were either positive only for cyclin A or only for BrdU, see Figure 4.6, right. This indicates that cyclin A appears at the same time as BrdU incorporation starts, i.e., as S-phase is initiated.


Figure 4.6: The number of cyclin A-positive cells plotted against the number of BrdU-positive cells. Every dot represents one cell culture. Left: All cells included, irrespective of cell cycle position. Cultures with short G2 phases show a 1:1 relationship between the numbers of cyclin A- and BrdU-positive cells. Cultures with longer G2 phases have a larger proportion of cyclin A-positive cells. Right: Only cells positive for cyclin E are included, i.e., cells in late G1 or early S-phase. All cultures show a one-to-one relationship between the numbers of cyclin A- and BrdU-positive cells. The close correlation between positive nuclear staining for cyclin A and BrdU indicates that cyclin A appears in the nucleus very close to the beginning of DNA synthesis.

Cyclin E: Paper IV

Six out of the nine examined tumor cell lines showed a high fraction of cyclin E-positive cells in S-phase (19–77%), as compared to normal cell cultures, where on average only 8% of the cells were cyclin E-positive in S-phase. Four out of the nine tumor cell lines also showed an increased number of cyclin E-positive cells in G2 as compared to the normal cell lines. As a complement to counting the number of positive and negative cells based on a threshold, the average cyclin E staining intensity in each phase of the cell cycle was examined. In all normal cell cultures studied, the level of cyclin E decreased when the cells entered S-phase. In the tumor cell lines, the level of cyclin E instead increased during S-phase.

The visual inspection of 3D deconvolution microscopy data, used to findthe location within S-phase based on BrdU incorporation patterns, showed fewnormal cells staining for cyclin E in mid and late S-phase. Large numbers ofthe tumor cells did, conversely, stain positive for cyclin E throughout S-phase,in agreement with the results above.

Analysis of cervical carcinoma samples showed that cells in S- or G2-phasedid indeed stain strongly for cyclin E. There were, however, some tumorswhere cyclin E was expressed as in normal cells in culture, see Figure 4.7.


The expression patterns occasionally varied within a single tumor. When the percentage of cells in S- or G2-phase (i.e., cyclin A-positive cells) was correlated with the survival of the patients, it was clear that the deceased patients had many more cyclin E-positive cells in S- or G2-phase than the survivors. Neither the percentage of cells staining positive for cyclin E out of all cells, nor the percentage of cells staining positive for cyclin A out of all cells, was significantly different between the deceased and the survivors. The difference between the two patient groups could only be detected when combining cyclin A and E staining and performing the analysis on single cells.

4.2.3 Conclusions and comments

From Paper III, we conclude that cyclin A accumulation does not precede initiation of DNA replication, nor does DNA replication precede cyclin A accumulation; the onset of cyclin A accumulation and DNA replication therefore occur at the same time. The same pattern is seen in normal as well as transformed cell lines. The mechanistic role of cyclin A in DNA replication is still unknown. These results suggest that a stable expression pattern of cyclin A is crucial for cell survival, as both normal and transformed cells show the same expression pattern. This also means that nuclear cyclin A can be used as a reliable marker for S and G2 in normal as well as transformed cells.

In Paper IV, analysis of cyclin E shows that an abnormal expression patterncan be observed in tumor cell lines. Examination of tumor material indicatesthat abnormal cyclin E expression is associated with increased malignancy incervical carcinomas. However, more extensive studies are required to deter-mine whether the cyclin E expression pattern is an independent prognostic orpredictive factor.

After publication, the thresholding method described in Paper II was further developed by Joakim Lindblad [35]. Using kernel density estimates for threshold placement, each input value to the distribution is represented by a Gaussian with a given standard deviation, and the sum of Gaussians replaces the histogram. In this way, no histogram bin size is needed. A suitable standard deviation (or bandwidth) can be derived from the dataset, and the second derivative of the distribution is found analytically by differentiating the sum of Gaussians. This improved thresholding method was used in Paper III.
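A sketch of the kernel-density variant: each value contributes a Gaussian, and the second derivative of the resulting density is available in closed form, so no histogram bin size enters. The grid size and the mode-restricted search are illustrative assumptions.

```python
import numpy as np

def kde_threshold(values, bandwidth, grid_size=512):
    """Threshold at the maximum of the second derivative of a
    Gaussian kernel density estimate (illustrative sketch)."""
    values = np.asarray(values, dtype=float)
    x = np.linspace(values.min(), values.max(), grid_size)
    u = (x[:, None] - values[None, :]) / bandwidth
    kernel = np.exp(-0.5 * u**2)
    density = kernel.sum(axis=1)
    # d2/dx2 of each Gaussian is (u^2 - 1)/bandwidth^2 * exp(-u^2/2);
    # the positive constant factor does not move the argmax.
    d2 = ((u**2 - 1.0) * kernel).sum(axis=1)
    mode = int(np.argmax(density))
    d2[:mode + 1] = -np.inf  # search above the background mode
    return x[int(np.argmax(d2))]
```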

BrdU incorporation patterns and cyclin E staining were evaluated by visual inspection of 3D deconvolution microscopy data. Automatic methods, e.g., 3D distance transforms for evaluation of BrdU distribution and S-phase state, could have been used instead, and the bias introduced by visual inspection would have been eliminated.

Segmentation of images of cell nuclei in tissue sections was performed by a combination of thresholding and manual delineation. Thresholding alone did not produce a satisfying result due to intensity variations in the tissue, structures within the cell nuclei, and clustering of cell nuclei. The tedious manual delineation inspired us to look further into segmentation of cell nuclei in tissue. This led to Papers VII and VIII, described in Section 4.4.

Figure 4.7: The abnormal cyclin E expression found in some tumors was not strictly related to the morphology of the tumor. The morphology is visualized using HTX-eosin staining and absorption microscopy. A: A tumor from a patient who was disease-free more than six years after treatment. B: A tumor that killed its host within three years from the date of diagnosis. At diagnosis, both patients were judged to have tumors of the same grade and stage, and received similar initial treatment. However, immunofluorescence staining reveals that the tumor in A only expresses cyclin E in G1 cells (cyclin E and cyclin A stains do not coincide), whereas the tumor in B expresses cyclin E also in S and/or G2 cells (cyclin E and cyclin A stains do coincide).


4.3 Papers V and VI: Sequential immunofluorescence staining

Properties of specifically stained cells in culture or tissue sections can be quantified by combining immunohistochemistry with digital image analysis. Simultaneous visualization of more than one antigen by multicolor immunostaining is often desirable, or even necessary, for analysis of complex spatial and temporal relationships. The number of different antigens that can be stained and imaged in the same set of cells using a fluorescence microscope is limited by two factors. The first is the spectra of the fluorochromes, which have to be separated by band-pass filters for the excitation and emission spectra. If the spectra overlap, it is very difficult to discriminate between signals originating from different fluorochromes. The second limiting factor is the availability of primary antibodies originating from different species. The majority of commercially available primary antibodies are of either rabbit or mouse origin, limiting the number of fluorescence-labeled secondary antibodies that can be used without cross-reactivity.

The goal of Papers V and VI was to develop a method that would increase the number of antigens that can be analyzed in the same set of cells. Inspired by two methods developed for absorption microscopy [30, 41], a method where the fluorescence stain is washed away after imaging the tissue section is proposed. If the stain can be removed without affecting the antigens, a new set of specific stains, targeting a new set of antigens, can be applied and imaged. By staining, imaging, and washing repeatedly, and using digital image analysis, the number of different antigens that can be imaged and quantitatively analyzed in the same set of cells in a tissue section can be increased.

Papers V and VI describe the same sequential immunofluorescence staining protocol, but Paper V focuses on the image analysis, while Paper VI brings up the details of the staining methodology. As Paper VI was published after Paper V, some improvements were also made to the data analysis.

4.3.1 Methods and experiments

Sequential staining

Tissue sections from cervical carcinoma and carcinoma of the prostate were used to test and optimize the proposed protocol. Only two antigens were stained and imaged in each round of the sequential staining protocol, reducing many of the problems that occur when large numbers of stains are used simultaneously. A third stain, DAPI, was used to visualize the cell nuclei before and after each staining step. After staining and imaging, the primary and secondary antibodies were removed by a combination of elution and denaturation.


Elution was performed by washing the tissue sections in a glycine-hydrochloric acid buffer [41]. Denaturation was performed by cooking the cells in a citrate buffer in a microwave oven [30]. (Note the typing error in Paper VI, section SIFS, where lysine was accidentally written instead of glycine.) After the denaturation and/or elution of the stain, the tissue sections were photographed to check the remaining levels of previously applied antibodies and fluorochromes. Remaining primary antibodies were visualized by addition of fluorescence-labeled secondary antibodies. The fluorescence present before any stain was applied, and the stain that was left after each washing step, were also imaged. One monochrome 3D image volume (1024 × 1024 × 7 voxels) was acquired for each stain and staining step, by use of appropriate excitation and emission filters and moving the focal plane of the microscope through the tissue section. Working with 3D images reduces the need for auto-focusing procedures, which are otherwise essential in automatic quantitative microscopy.

Image analysis

The tissue sample, attached to a microscope slide, was removed from the microscope after each imaging step. In order to image the same cells after each washing or staining step, it is necessary to re-mount the slide in the same position in the microscope. Despite great care and use of an automatic repositioning function of the microscope, the slide was often translated in the x-, y-, and z-directions. Automatic and objective quantification of the antigen content in the individual cells required exact repositioning. We used a four-parameter rigid registration algorithm, as described in Section 3.2.2, for repositioning of the images. Each staining or washing step resulted in an image set consisting of three images acquired in parallel without inter-image transformation. The DAPI image of the cell nuclei in each set was used as a reference image for automatic registration to the first image set. The same translation and rotation parameters could then be applied to the other images within each set.
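Once the four parameters have been estimated from the DAPI reference, applying them to the other images of a set could look like the sketch below. Nearest-neighbour resampling and rotation about the image centre are assumptions; the thesis does not specify these details.

```python
import numpy as np

def apply_rigid(volume, tx, ty, tz, theta):
    """Apply a four-parameter rigid transform: translation in x, y, z
    plus in-plane rotation theta (radians) about the image centre.
    Nearest-neighbour resampling; illustrative sketch."""
    nz, ny, nx = volume.shape
    out = np.zeros_like(volume)
    cy, cx = (ny - 1) / 2.0, (nx - 1) / 2.0
    zz, yy, xx = np.meshgrid(np.arange(nz), np.arange(ny), np.arange(nx),
                             indexing="ij")
    # Inverse mapping: for every output voxel, find the source voxel.
    ys = cy + np.cos(theta) * (yy - cy) + np.sin(theta) * (xx - cx) - ty
    xs = cx - np.sin(theta) * (yy - cy) + np.cos(theta) * (xx - cx) - tx
    zs = zz - tz
    ys = np.rint(ys).astype(int)
    xs = np.rint(xs).astype(int)
    zs = np.rint(zs).astype(int)
    valid = ((0 <= zs) & (zs < nz) & (0 <= ys) & (ys < ny)
             & (0 <= xs) & (xs < nx))
    out[valid] = volume[zs[valid], ys[valid], xs[valid]]
    return out
```

The same `(tx, ty, tz, theta)` found for the DAPI channel is simply reused for the other two images of the set.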

After registration, a common reference image was created by a 4D-to-3D maximum intensity projection (MIP) of the registered DAPI image volumes. The individual cell nuclei were then identified by semi-automatic segmentation. The segmentation was initialized by an algorithm based on the watershed algorithm, working on a 2D MIP of the common reference image. Errors in the segmentation process were then corrected by the operator, using a set of digital cutting and merging tools. The 2D segmentation was then used directly, or extended to an approximate 3D segmentation. Once the nuclei had been segmented from the common reference image, the same segmentation mask could be used for analysis of the different antigens. The relative antigen concentration in each cell nucleus was evaluated by calculating the mean of the voxel intensities in 3D, or of the pixel intensities in a 2D MIP.
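Both projections are voxel-wise maxima and reduce to one-liners in numpy; a minimal sketch:

```python
import numpy as np

def common_reference_mip(volumes):
    """4D-to-3D maximum intensity projection over a list of registered
    volumes, each of shape (z, y, x); minimal sketch."""
    return np.stack(volumes).max(axis=0)

def mip_2d(volume):
    """3D-to-2D MIP along the optical (z) axis."""
    return volume.max(axis=0)
```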

The detected signal is affected by variations in tissue fixation and thickness, antigenic recovery, temperature and time at the staining and washing steps, etc. This variability makes it difficult to quantitatively compare cells from different tissue sections. In order to make comparisons, it is necessary to classify the cells according to their intensity values relative to the other cells in the same tissue section. Using the thresholding method described in Paper II and [35], combined with fuzzy classification [13] of the cell intensities, a fuzzy class membership value was found for each cell. Cells were classified as strongly negative, negative, neutral, positive, or strongly positive for each stain.

4.3.2 Results

Elution and denaturation

In order to adjust the parameters for elution and denaturation, a series of experiments was performed. Elution results after washing the tissue sections in glycine buffer for between zero and three hours were compared. These experiments showed that although the stain had disappeared after one hour, the primary antibodies were still present (as visualized by addition of fluorescence-labeled secondary antibodies). Washing for a total time of two hours also did not completely remove primary and secondary antibodies. The experiments further showed that the antigens (in this case cyclin A and p27) were not destroyed by the elution. In fact, the staining was stronger the second time, indicating that the antigens may be more readily available after elution.

Elution did not remove all the primary antibodies, but remaining antibodies were destroyed by denaturation. Samples that had been eluted for two hours were denatured for 0 to 15 minutes by microwave cooking. Cooking for 2 × 5 min was sufficient to denature all the primary antibodies. Denaturation alone, or denaturation followed by elution, was also tested. Both these experiments gave worse results. The majority of the tested antibodies and their respective antigens reacted similarly, i.e., the antibodies were removed by the combination of elution and denaturation, and the antigenicity of the specimen remained. There were, however, some antigens that reacted differently. For example, p21 lost all its antigenicity through elution.

Cross-over experiment

A cross-over experiment was designed in an attempt to test whether the stain is completely washed away without affecting the antigenicity, and whether the same cells will be classified as positive/intermediate/negative if the same stain is applied again. Only two antigens (cyclin A and p27) were stained throughout the experiment, but two different primary antibodies were used for each antigen. One of the antibodies was of mouse origin, and one was of rabbit origin. After each stain removal, the antibodies used to detect each of the antigens were switched. Thus, the secondary antibodies carrying the fluorochrome (FITC or Cy-3) shifted between the antigens after each staining step, see Figure 4.8. This setup gave a priori knowledge of the expected result of the second and third rounds of staining, since the result should be the same as after the first round of staining, but with shifted fluorochromes.

Labeled secondary antibodies were re-applied after each stain removal to visualize remaining primary antibodies. Thereby the removal of the previously applied antibodies could be measured. The cross-over experiment provided us with a method to show both that the antigenicity of the investigated antigens was not destroyed by elution or denaturation, and that the antibodies and fluorochromes could be satisfactorily removed after each of the first two rounds of staining.

Each cell was segmented and classified as positive, weakly positive, neutral, weakly negative, or negative. All cells were classified the same way after each of the three staining rounds for cyclin A (with two minor changes from neutral to weakly positive). This shows that the antigenicity is kept after the subsequent staining steps, and the classification is not affected by previous staining steps. The variations were larger for the p27 stain. The results are, however, largely similar after the first and the third staining steps, where rabbit polyclonal primary antibodies were used, as compared to the second staining step, in which mouse monoclonal primary antibodies were used. The most likely explanation is differences in the specificity of the used monoclonal and polyclonal antibodies.

4.3.3 Conclusions and comments

The conclusion of the described cross-over experiment is that previously applied stains can be removed at least twice, and that at least two rounds of stain removal can be performed without necessarily losing the antigenicity of the investigated antigens. This opens the possibility to stain one section of tissue for at least six different antigens, using two primary antibodies of different species origin in each staining step.

The segmentation method used in these experiments needs user-interaction, and the extension of 2D segmentation to 3D resulted in small errors when cells overlap. Automatic segmentation techniques for 3D images of tissue sections were later developed as described in Paper VIII. This type of segmentation, together with more robust classification methods, would improve the described technique.

An alternative to sequential staining is quantum dots, as mentioned in Section 2.2.3. Quantum dots have only recently been used for fluorescence staining of biological material, but seem to be a very promising alternative for multicolor fluorescence staining as they have very distinct and narrow emission spectra. If they can provide sufficient signal when attached directly to a primary antibody, cross reactions between secondary antibodies and primary antibodies from the same species can be avoided.

In standard fluorescence microscopy, the signals originating from different fluorochromes are separated based on the spectral properties of the fluorochromes. It has been shown that the signals can be separated even further by also monitoring the lifetime properties of the fluorochromes [11]. Combining spectral and lifetime properties is yet another way to increase the number of observable antigens in a single specimen.


Setup of the cross-over experiment (secondary-antibody channels: FITC | Cy-3):

  Before staining (only 2nd ab and DAPI added):         stain: DAPI, 1st ab: none | none
  After first staining (DAPI, 1st and 2nd ab added):    stain: DAPI, 1st ab: rabbit α p27 | mouse α cyclin A
  After first destaining (only 2nd ab and DAPI added):  stain: DAPI, 1st ab: none | none
  After second staining (DAPI, 1st and 2nd ab added):   stain: DAPI, 1st ab: rabbit α cyclin A | mouse α p27
  After second destaining (only 2nd ab and DAPI added): stain: DAPI, 1st ab: none | none
  After third staining (DAPI, 1st and 2nd ab added):    stain: DAPI, 1st ab: rabbit α p27 | mouse α cyclin A

Figure 4.8: Setup of the cross-over experiment. The tissue was stained with DAPI, and the secondary antibodies (ab) FITC α rabbit and Cy-3 α mouse were applied in each staining or washing step (horizontal rows). The primary antibodies were excluded after each stain removal step, and then exchanged before the next round of staining, according to the text in the image. All images are projections of 3D images. They were acquired using the same exposure time, and subsequently scaled using the same gray-level interval. Only a small part of the full registered image set is shown.


4.4 Papers VII and VIII: Segmentation of cell nuclei in tissue

Understanding complex biological systems requires integration of molecular, cellular, and tissue-level information. Automatic segmentation of cell nuclei from 2D and 3D images of cells in tissue allows the study of individual cell nuclei within their natural tissue context. Compared to manual methods based on drawing the outlines of the nuclei with a mouse, automatic methods need far less interaction, and the result is more objective and easily reproduced. Automation also increases the amount of data that can be processed.

The goal of Paper VII was to develop a fully automatic segmentation algorithm for cell nuclei in fluorescence microscopy images of tissue sections. In Paper VIII, the segmentation method was further improved and extended to 3D.

The difficulties in automatic segmentation of images of cell nuclei in tissue produced by fluorescence microscopy usually have three causes. First, the image background intensity is often uneven due to auto-fluorescence from the tissue and fluorescence from out-of-focus objects. This unevenness makes the separation of objects and background a non-trivial task. Second, intensity variations within the nuclei further complicate the segmentation as each nucleus may be split into more than one object, leading to over-segmentation. Third, cell nuclei are often clustered, making it difficult to separate the individual nuclei. In Paper VII, image intensity and gradient information was combined to improve previously described segmentation techniques. In Paper VIII, the segmentation method was extended to 3D and shape information was included to further improve the segmentation result.

4.4.1 Methods and experiments

Segmentation

The segmentation method can be described by the following steps. First, extended h-maxima transformation of the intensity image was used for finding object seeds [54]. In Paper VII, background seeds were found by extended h-minima transformation of the intensity image. This caused problems when the image borders were dark, as all the extended h-minima ended up close to the image borders, and no background seeds were present in the image center. In Paper VIII, better background seeds were found by extended h-minima transformation of the gradient magnitude image. Small extended minima were detected inside the objects, but they could easily be removed using a size threshold. Seeded watershed segmentation, as described in Section 3.3, was thereafter applied to the gradient magnitude image, and region borders were created at the crest lines in the gradient image. More than one seed in an object resulted in over-segmentation. After watershed segmentation, over-segmentation was


reduced by only keeping those borders that correspond to strong edges. If two seeds are in the same object, the magnitude of the gradient at the region border will usually be small.
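The extended h-maxima seed detection described above can be sketched in pure NumPy. This is an illustrative implementation via grayscale reconstruction, not the thesis code; all function names are my own:

```python
import numpy as np

def _dilate(img):
    """Grayscale dilation with a 3x3 flat structuring element."""
    p = np.pad(img, 1, mode="edge")
    h, w = img.shape
    return np.max([p[dy:dy + h, dx:dx + w]
                   for dy in range(3) for dx in range(3)], axis=0)

def reconstruct(marker, mask):
    """Morphological reconstruction by iterative geodesic dilation
    (simple but slow; iterates until the image no longer changes)."""
    cur = np.minimum(marker, mask)
    while True:
        nxt = np.minimum(_dilate(cur), mask)
        if np.array_equal(nxt, cur):
            return cur
        cur = nxt

def h_maxima_seeds(image, h):
    """Extended h-maxima of `image`: binary mask of the regional maxima
    that rise more than h above their surroundings, usable as object seeds."""
    hmax = reconstruct(image - h, image)                # h-maxima transform
    return (hmax - reconstruct(hmax - 1, hmax)) > 0     # its regional maxima
```

Background seeds would analogously come from the extended h-minima of the gradient magnitude image (equivalently, the h-maxima of its negation), followed by the size threshold mentioned above.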

Associating region boundaries with border strength requires some careful definitions. The strength of a border separating two regions should be calculated in such a way that the strength of the border between regions A and B is the same as the strength of the border between B and A. This is achieved by the following method, where the image of the segmentation result is traversed once. If the current pixel/voxel has a label which is different from that of a “forward” neighbor (2 edge and 2 vertex neighbors in the 2D case, and 3 face, 6 edge, and 4 vertex neighbors in the 3D case), the pixel/voxel intensities from the corresponding two positions in the gradient magnitude image are retrieved. The intensity of the brighter of the two is used to define the local border strength between the two neighboring pixels/voxels and saved in a table for border data. We choose the brightest value since it represents the strongest border value. If a pixel/voxel has several forward neighbors with different labels, each label will result in a new value in the table of border data.

The strength of the complete border between two regions can be measured in many different ways. A simple measure is to define the strength of a border as the weakest point along the border. This is often used for reducing over-segmentation resulting from watershed segmentation. However, many correctly segmented objects are then merged, due to single weak border pixels or weak border parts originating from locally less steep gradients. Another measure, which is less sensitive to noise and local variations, is the mean value of all border pixels of two neighboring objects. This measure proved to give good results for merging of over-segmented nuclei. The mean value of the border of each merged object must be updated after merging. This is done by adding the border data (mean value and number of pixels) of the merged objects to the new, larger, object and its neighbors. The merging is continued until each remaining object border is stronger than a given threshold. Instead of defining the border strength as a mean value, one might consider the median, or some other percentile, but this would need storage of more data than just the number of pixels and the pixel sum.
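As a sketch (my own illustrative code, not the implementation from the papers), the single-pass border-data collection and the mean-strength merging could look as follows in 2D. For simplicity the table is recomputed after each merge instead of being updated incrementally as described above:

```python
import numpy as np

# "Forward" neighbors in 2D: 2 edge + 2 vertex neighbors per pixel.
FORWARD = [(0, 1), (1, -1), (1, 0), (1, 1)]

def border_table(labels, grad):
    """One pass over a 2D label image: for each pair of touching regions,
    accumulate sum and count of the brighter gradient magnitude at every
    border pixel pair. Returns {(a, b): [sum, count]} with a < b."""
    table = {}
    h, w = labels.shape
    for y in range(h):
        for x in range(w):
            for dy, dx in FORWARD:
                ny, nx = y + dy, x + dx
                if not (0 <= ny < h and 0 <= nx < w):
                    continue
                a, b = labels[y, x], labels[ny, nx]
                if a != b:
                    s = table.setdefault((min(a, b), max(a, b)), [0.0, 0])
                    s[0] += max(grad[y, x], grad[ny, nx])  # brighter of the two
                    s[1] += 1
    return table

def merge_weak_borders(labels, grad, threshold):
    """Repeatedly merge the region pair whose mean border strength is
    weakest, until every remaining border mean exceeds `threshold`."""
    labels = labels.copy()
    while True:
        table = border_table(labels, grad)
        if not table:
            return labels
        (a, b), (s, n) = min(table.items(), key=lambda kv: kv[1][0] / kv[1][1])
        if s / n >= threshold:
            return labels
        labels[labels == b] = a  # merge the weakest pair
```

The incremental update described in the text (adding the merged objects' pixel sums and counts to the new object's border entries) avoids the full re-traversal per merge and is what makes the approach efficient in practice.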

A strong border means that the object is well-focused. Merging based on border strength therefore means that not only over-segmented objects are merged, but also poorly focused objects will be merged with the background, and disappear. This is an important feature if well-focused objects are required in the further analysis of fluorescent signals. In this case, a rather high threshold is suitable. If also poorly focused objects are of interest, their removal can be avoided by not allowing merging of objects and background.

If the nuclei are tightly clustered, no edge is present where they touch, and they will not be separated. In Paper VIII, objects found by the first steps of



Figure 4.9: The graph shows the mean value of voxel intensities > 10 for each image slice (z-slice) in a 3D image volume. Low mean values in the initial slices are due to the fact that they were acquired outside the focal plane of the tissue sample. Once inside the tissue sample, the signal is attenuated. The attenuation can be described by a linear degradation of the signal. Slice-dependent contrast enhancement as described in the text resulted in the mean values of the dashed line. A: Slices 20 and 80 before and B: after contrast enhancement.

the segmentation process are further separated based on shape. Shape-based cluster separation using the distance transform, as described in Section 3.3, is applied to all objects found by the previous steps, but only those separation lines that go through deep enough valleys in the distance map are kept. In the presented method, a true object must contain at least one seed, its borders to neighboring objects must have a sufficiently strong gradient magnitude average, and it must have a fairly round shape.
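The distance transform underlying the shape-based separation can be computed with the classical two-pass chamfer algorithm with 3-4 weights [7]. A minimal 2D sketch (my own illustrative code, not the thesis implementation):

```python
import numpy as np

def chamfer_34(binary):
    """Two-pass 3-4 chamfer distance transform of a 2D binary object mask:
    approximate (scaled by 3) distance from each object pixel to the
    nearest background pixel."""
    INF = 10**9
    h, w = binary.shape
    d = np.where(binary, INF, 0).astype(np.int64)
    # Forward pass: propagate from the upper-left neighbors.
    for y in range(h):
        for x in range(w):
            if d[y, x] == 0:
                continue
            best = d[y, x]
            for dy, dx, c in ((0, -1, 3), (-1, -1, 4), (-1, 0, 3), (-1, 1, 4)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w:
                    best = min(best, d[ny, nx] + c)
            d[y, x] = best
    # Backward pass: propagate from the lower-right neighbors.
    for y in range(h - 1, -1, -1):
        for x in range(w - 1, -1, -1):
            best = d[y, x]
            for dy, dx, c in ((0, 1, 3), (1, 1, 4), (1, 0, 3), (1, -1, 4)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w:
                    best = min(best, d[ny, nx] + c)
            d[y, x] = best
    return d
```

Watershed segmentation of the inverted distance map then splits a cluster along its valleys, and, as stated above, only separation lines passing through deep enough valleys are kept.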

The described method works in both 2D and 3D; the only difference is that all operations have to take the dimensionality of the neighborhood into account.

3D pre-processing

No pre-processing was necessary in the 2D images, i.e., the segmentation method worked well despite the large background variations. A 3 × 3 smoothing filter was, however, applied to speed up the watershed segmentation. The 3D images had to be pre-processed as the fluorescence signal was attenuated in the optical slices furthest into the tissue. In the 2D case, the contrast between object and background was always greater than the contrast within the nuclei. The signal attenuation in the 3D case caused the contrast within the nuclei of the first optical slices to be greater than the object-background contrast in the last slices, and a contrast enhancement of the last optical slices was needed.

The contrast was enhanced by first finding the function describing the attenuation of the signal. This was done by setting a low global object/background threshold for all optical slices of the image. The mean intensity of the signal above the threshold was thereafter calculated for all slices separately and plotted against slice number, see Figure 4.9. The attenuation appeared to be linear within the information carrying slices, i.e., the optical slices of the image volume that cut through the physical tissue section. A straight line was fitted to the linear part of the data in a least square sense (see Figure 4.9), and all the image slices were contrast enhanced by multiplication using

    I_z,comp = I_z · m / (k·z + m),

where I_z,comp is the compensated version of slice z, I_z is the original slice z, m is the mean intensity value where the fitted line cuts the y-axis, and k is the slope of the line describing the attenuation. The compensation was applied to all voxels in the image, not only those above the initial object-background threshold, but as it is a multiplicative contrast enhancement, values close to zero before enhancement will be close to zero also after enhancement. This makes the method less sensitive to the initial threshold than an additive method, such as the one described in [58]. The mean pixel intensity above the threshold after contrast enhancement is shown by a dashed line in Figure 4.9. Some variation still remains, but the contrast in the last slices is still sufficient for segmentation. An example of slices 20 and 80 before and after contrast enhancement is shown in Figure 4.9 A and B, respectively. Other methods for modeling light attenuation have previously been described [33], but this simple approach gave satisfactory results for the addressed problem.
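The fit-and-compensate procedure can be sketched in a few lines of NumPy. This is an illustrative sketch under the stated linear-attenuation assumption; the function name, the default threshold, and the optional `fit_slices` parameter are my own, not from the paper:

```python
import numpy as np

def compensate_attenuation(volume, threshold=10, fit_slices=None):
    """Slice-dependent multiplicative contrast enhancement.

    volume: 3D array indexed (z, y, x). A straight line mean(z) = k*z + m
    is fitted in the least-squares sense to the per-slice mean of voxels
    above `threshold`, and every slice z is multiplied by m / (k*z + m).
    `fit_slices` optionally restricts the fit to the information-carrying
    slices (the linear part of the data)."""
    z_all = np.arange(volume.shape[0])
    fit = z_all if fit_slices is None else np.asarray(fit_slices)
    means = np.array([volume[z][volume[z] > threshold].mean() for z in fit])
    k, m = np.polyfit(fit, means, 1)      # slope k and y-intercept m
    gain = m / (k * z_all + m)            # multiplicative correction per slice
    return volume * gain[:, None, None]
```

Because the correction is a per-slice gain, voxels near zero stay near zero, which is the property the text credits for the method's insensitivity to the initial threshold.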

The voxel size of the original 3D data was 98 nm in the x- and y-directions and 163 nm in the z-direction, and the image volumes contained 512 × 512 × 99 voxels. To get cubic voxels, the image was rescaled by a factor 98/163 in the x- and y-directions, resulting in an image volume of 307 × 307 × 99 voxels. Rescaling was performed by looping over the output image and calculating the position of the corresponding pixel in the input image, using bilinear interpolation. After contrast enhancement and compensation for non-cubic voxels, the 3D images were smoothed by filtering, just as in the 2D case.
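The backward-mapping rescale described above can be sketched as follows for a single slice (illustrative code with my own function name): for each output pixel, the corresponding input position is computed and the four surrounding input pixels are blended bilinearly.

```python
import numpy as np

def rescale_xy(slice2d, factor):
    """Backward-mapping rescale of one image slice by `factor` in both
    directions, with bilinear interpolation."""
    h, w = slice2d.shape
    oh, ow = int(h * factor), int(w * factor)     # 512 * 98/163 -> 307
    out = np.empty((oh, ow))
    for oy in range(oh):
        for ox in range(ow):
            sy, sx = oy / factor, ox / factor     # position in the input image
            y0, x0 = int(sy), int(sx)
            y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
            fy, fx = sy - y0, sx - x0
            top = (1 - fx) * slice2d[y0, x0] + fx * slice2d[y0, x1]
            bot = (1 - fx) * slice2d[y1, x0] + fx * slice2d[y1, x1]
            out[oy, ox] = (1 - fy) * top + fy * bot
    return out
```

Applying this slice by slice with factor 98/163 turns the 512 × 512 × 99 volume into the 307 × 307 × 99 volume with (approximately) cubic voxels.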

4.4.2 Results

The method described in Paper VII was tested on six 2D images of cell nuclei in tissue. The same images, together with one 3D confocal image, were used for testing the improved algorithm described in Paper VIII. As a comparison, segmentation of object/background by thresholding is illustrated in Figure 4.10 B and C. No threshold that finds all objects and excludes all background can be found. Still, the presented gradient-based seeded segmentation nicely delineates most of the nuclei, as seen in Figure 4.10 I. Out of a total of 689 cell nuclei in the six 2D images, 57% were correctly segmented after the initial seeded watershed segmentation. Merging increased the percentage of correctly segmented cells to 87%. Separation of clusters further increased the


percentage of correctly segmented cells to 91%.

The 3D image contained 90 cell nuclei. Due to extensive seeding, almost every nucleus was over-segmented after the first segmentation step (seeded watershed). Out of the 90 cell nuclei, 82 were correctly segmented, four were over-segmented and four were under-segmented, resulting in 91% correct segmentation after merging and shape-based splitting. No nuclei within the tissue section were missed, but some thin pieces of nuclei near the edge of the tissue section were lost. In addition to the 90 cells, 18 fluorescing objects not looking like proper cell nuclei were also found. Most of them can be discarded based on their small size. The result is shown in Figure 3.12 of Chapter 3.

4.4.3 Conclusions and comments

The presented segmentation method needs very little pre-processing, even if the background variation in the image is large. Poorly focused objects in 2D images are automatically removed, as their edge-strength is low. The number of missing objects can be reduced by not allowing seeded objects to merge with the background. This will, however, mean that poorly focused objects are not removed. At present, the input parameters are manually set for a test image, and the same parameters are thereafter used for fully automatic segmentation of images created under similar imaging conditions. As only five input parameters are required, the parameter adjustments can be done rather quickly. Automatic setting of input parameters may be possible, but has been left for future work.

The segmentation method can be useful for many different tasks where an object/background threshold is not sufficient. Further processing, such as removal of nuclei that are damaged or under-segmented, by thresholding on size, or more advanced statistical methods, may further improve the result. When applied to 3D images, the main difficulty is to acquire input images that have sufficient contrast in the z-direction. If the contrast in the z-direction is very weak, a gradient filter that takes this into consideration may improve the result.

For further testing of the 3D segmentation, a new set of 3D images was acquired of the same tissue sample as was used in Paper VIII. A closer view of the images showed that the tissue sample had been damaged and degraded during storage, but preparation of new specimens was left as future work. The segmentation method was also tested on other types of images, ranging from 2D light absorption microscopy images of cells to 3D electron microscopy reconstructions of virus particles. The results varied, and image-specific pre-processing may be necessary to achieve satisfactory results. Seeded watershed segmentation was also extensively and successfully used on atomic force microscopy (AFM) images of wood fibers [18].



Figure 4.10: Segmentation combining intensity, edge, and shape information. A: Part of an original 2D fluorescence microscopy image of a section of a tumor. B: Result after thresholding at intensity 60. Most objects are detected, but a lot of background is also above the threshold. C: Result after thresholding at intensity 100. Only a little background is above the threshold, and some nuclei are nicely delineated, but many are not detected at all. D: The gradient magnitude of A. E: Object (white) and background (black) seeds found by the extended h-transformation of the original image and the gradient magnitude image, respectively. Small components were removed from the background seed. F: Result of seeded watershed segmentation; some objects are over-segmented. G: Result after merging seeded objects based on edge-strength. Poorly focused objects are removed in this step. H: The distance transform of the objects in the segmented image provides information on object shape. The brighter the intensity, the further away from the background, or a neighboring object, the pixel is. Watershed segmentation of this image separated clustered objects. I: Final segmentation result based on intensity, edge, and shape information.


Conclusions

5.1 Summary

This thesis presents a number of algorithms and applications of digital image cytometry. The first two papers present algorithms for cell-based drug screening and analysis of dynamic events taking place in single cells. Cell segmentation is improved, in Paper I by a statistical cell model, and in Paper II by seeded watershed segmentation. A system for feature selection that will optimize the trade-off between good descriptive power and over-training is also presented. The final sequence of algorithms presented in Paper II was used for evaluation of the effect of insulin on translocation of proteins in single cells, and could be the basis for a high-throughput drug screening system.

Papers III and IV present algorithms for studies of cyclin expression in cultured cells as well as in tissue sections. Very low levels of cyclin expression could be detected from parallel images of different stains using a segmentation mask based on a nuclear dye. A method for unbiased classification of cells as cyclin-expressing or non-expressing, based on a histogram of the fluorescence intensities of the individual cells, is presented. The studies of cyclin A showed that cyclin A expression is not disturbed in tumor cell lines, and its well preserved expression pattern can be used as a reliable marker for the S- and G2-phases of the cell cycle. Analysis of the cyclin E expression pattern in normal as well as transformed cell lines and tumor cells showed that the down-regulation of cyclin E at entry of the S-phase is disrupted in some tumor cells. The results also indicate that the analysis of cyclin E expression may provide additional information, not visible by standard tumor grading techniques, for cancer prognosis.

Analysis of complicated interactions and signaling pathways in single cells requires staining of many antigens in parallel. The number of antigens that can be visualized is limited by the availability of antibodies of different species origin as well as the spectra of the fluorochromes. Papers V and VI present a combination of image analysis and staining procedures that allows for sequential staining and visualization of different antigens. Experimental data indicate that by repeated imaging, staining, washing and re-staining, at least six different antigens can be stained in the same set of cells.

All image cytometry requires robust segmentation techniques. The most difficult images to segment in cytometry are probably those of cells in tissue, as clustered objects, background variation and internal intensity variations complicate the problem. Papers VII and VIII present a segmentation algorithm that combines intensity, shape, and gradient information to segment 2D and 3D images of cell nuclei in tissue. Fast, robust and automatic segmentation methods can increase the throughput and power of image-based single cell analysis of cells within their natural tissue context. For example, by combining robust segmentation with the sequential staining protocol described in Papers V and VI, the more complex signaling pathways involved in the cell cycle control investigated in Papers III and IV could be analyzed.

5.2 Discussion and future work

The more knowledge we gather, the more questions we can formulate. More advanced staining and imaging techniques provide new means to visualize activities and interactions taking place in cells, in culture, in tissue samples, and in vivo. The presented methods can be thought of as a set of tools in a toolbox that may be combined and used for a wide range of applications. Many modern methods in the field of biotechnology and biomedicine produce image-like data, and the methods presented in this thesis may very well be useful also for analysis of such data.

The studies of cyclin expression-levels of Papers III and IV in relation to tumor malignancy show promising results. The segmentation methods presented in Papers VII and VIII are currently being used in a larger study on the clinical implications of deregulated cyclin E expression. Further development of the sequential staining techniques presented in Papers V and VI, combined with automatic image segmentation and analysis, will provide means for analysis and evaluation of large numbers of antigens in relationship to each other and to the morphology of the tissue.

The methods developed for studies of dynamic events in living cells, initiated by Paper I and further developed in the study of protein (Rac1) activation and translocation of Paper II, provide a basis for high-throughput drug screening. When combined with high-speed high-resolution imaging platforms, the presented methods allow efficient screening studies, revealing how drug candidates affect the dynamics in living cells. Cell-based drug screening is a cheap and fast alternative to more ethically complicated testing on, e.g., animals and humans. By efficient cell-based pre-screening, large numbers of potential drug candidates can be tested in a short time. More knowledge can be gathered before drug candidates showing desired effects are tested on animals and at the clinical level.

Digital image cytometry is not limited to the fields of cancer prognosis and drug screening. Many interesting problems such as analysis of sub-cellular compartments, dot-splitting, and co-localization studies have a large number of applications in the field of cell and molecular biology as well as biotechnology. The growing field of stem-cell studies has also proved to be in need of digital image cytometry, e.g., for stem-cell tracking and analysis. To conclude, robust, automatic image-based extraction of data is a fast-growing field with a wide range of applications.


Acknowledgments

This thesis would not have been completed without the help and friendship from my colleagues at the Centre for Image Analysis (CBA), and from family and friends in the “outside world”. In particular, I would like to thank the following people:

• Ewert Bengtsson, my supervisor, for offering me help and guidance when necessary, but also for letting me be independent and work with the things I've enjoyed the most. In addition, I would like to thank you for being fair and understanding, which are not universal characteristics of supervisors, I've heard.

• The VISIT-program, for financing my PhD studies, and providing interesting courses and great fellow PhD-students.

• Joakim Lindblad, for excellent collaborations. 2.10!
• Fredrik Erlandsson, for providing me with the most interesting images and data! Without you there would not have been much of a thesis.

• Ida-Maria Sintorn, for still being the best of friends after the “mangling” of Paper VIII.

• Gunilla Borgefors, for offering great working facilities and knowledge!
• Lennart Thurfjell, for introducing me to the world of image analysis!
• Lena Wadelius, for making every complicated administrative procedure very, very simple, and for making CBA a wonderful place to be!

• Olle Eriksson, for always taking your time to undo my accidental rm *s.
• Bosse Nordin, for answering every IMP-question with a smile.
• Ingela Nystrom, for help and advice with work as well as life in general!
• All my proofreaders; Jocke, Ingela, Ida, Xavier, Fredrik and Bosse, for correcting my Swinglish and making me aware of all my favorite words and expressions!

• Lennart Bjorkesten, for initiating the drug-screening project and the contacts with the people in Cardiff: Alla, Stuart, Dietrich, Liz, Gareth, Simon, Nick. Thanks for interesting, intensive and fun work together!

• Nils Ringertz, who is unfortunately no longer around, for encouraging me to engage in science in general and microscopy in particular, and Hans, Agneta, Gudrun, Lore, Nina, Wei-Qin, Einar, Henrik and Gabriella for making my time at CMB, SU and the Nobel e-Museum very challenging, fun, and fruitful.

• Anders Zetterberg, and the people at the department of Oncology-Pathology at the Karolinska Institute, for letting me take part in your research!
• Professor Choi, Kim, Hyun-Joo, Byong-Il, Hae-Gil, Jae-Yung, Moon-Young, and Jin-Hee for taking such good care of me during my time at MITL, Inje University, Korea. Kam-sa ham-ni-da!

• Felix, Cathe, Rogrer L&H, Micke, Ola, Robin, Lucia, Stina, Patrick, Mats, Natasa, Peter, Mattias A&M, Petra F, Erik, Karl, Johan, April and all other past and present friends and colleagues at CBA. Seminars, conferences, Friday and Wednesday pubs, fika, lunch, and WORK would not have been as much fun without you!

• Ida, Petra and Anna for sharing work and life with me!! You are the best!

• Lena, Lennart, Urban, Eva, Ingrid, Lovisa, Ulrika, and Bengt, my wonderful Norwegian-Italian-Gothenburg-family!
• My brothers Clas and Johannes, and Steph, David and Ida! It's great to have you around in Uppsala!

• My mother Carola and my father Sven for being the very best parents in the world! Stor kram!
• My husband Anders, and my son Jonatan, for all the love and happiness you give me! Next summer there will be more time for real watersheds!

Tack!

Uppsala, September 2003


References

[1] D. A. Agard, Y. Hiraoka, P. Shaw, and J. W. Sedat. Fluorescence microscopy in three dimensions. Methods in Cell Biology, 30:353–377, 1989.

[2] B. Alberts, D. Bray, J. Lewis, M. Raff, K. Roberts, and J. D. Watson. Molecular Biology of the Cell. Garland Publishing, Inc, New York, NY, 3rd edition, 1994.

[3] S. Beucher. The watershed transformation applied to image segmentation. Scanning Microscopy, 6:299–314, 1992.

[4] S. Beucher and C. Lantuéjoul. Use of watersheds in contour detection. In International Workshop on Image Processing: Real-time and Motion Detection/Estimation, Rennes, France, Sept. 1979.

[5] B. Bjorknas. Non-rigid registration of medical images. Master's thesis, Centre for Image Analysis, Uppsala University, Uppsala, Sweden, 2000. Available from the author.

[6] F. Boddeke. Quantitative Fluorescence Microscopy. PhD thesis, Delft University of Technology, Delft, The Netherlands, Jan. 1999.

[7] G. Borgefors. Distance transformations in digital images. Computer Vision, Graphics and Image Processing, 34:344–371, 1986.

[8] G. Borgefors. On digital distance transforms in three dimensions. Computer Vision and Image Understanding, 44(3):368–376, 1996.

[9] G. Borgefors and G. Sanniti di Baja. Analyzing nonconvex 2D and 3D patterns. Computer Vision and Image Understanding, 63:145–157, 1996.

[10] K. Carlsson and N. Åslund. Confocal imaging for 3-D digital microscopy. Applied Optics, 26(16):3232–3238, 1987.

[11] K. Carlsson and A. Liljeborg. Confocal fluorescence microscopy using spectral and lifetime information to simultaneously record four fluorophores with high channel separation. Applied Optics, 185(1):37–46, 1996.

[12] M. Chalfie, Y. Tu, G. Euskirchen, W. W. Ward, and D. C. Prasher. Green fluorescent protein as a marker for gene expression. Science, 263:802–805, 1994.


[13] Z. Chi, H. Yan, and T. Pham.Fuzzy algorithms with applications to imageprocessing and pattern recognition. World Scientific, Singapore, 1996.

[14] L. Dorst and A. W. M. Smeulders. Length estimators for digitized contours.Computer Vision, Graphics and Image Processing, 40:311–333, 1987.

[15] K. E. Dreinhofer, B. Baldetorp, M.Akerman, M. Fern¨o, A. Rydholm, andP. Gustafson. DNA ploidy in soft tissue sarcoma: comparison of flow and imagecytometry with clinical follow-up in 93 patients.Cytometry, 50:19–24, 2002.

[16] R. O. Duda and P. E. Hart.Pattern Classification and Scene Analysis. Wiley,New York, 1973.

[17] S. V. Ekholm, P. Zickert, S. I. Reed, and A. Zetterberg. Accumulation of cyclinE is not a prerequisit for passage through the restriction point.Molecular andCell Biology, 21(9):3256–3265, 2001.

[18] J. Fahlen and L. Salm´en. Cross-sectional structure of the secondary wall of woodfibers as affected by processing.Journal of Material Science, 38(1):119–126,2003.

[19] J. L. Fleiss, J. Cohen, and B. S. Everitt. Large sample standard errors of kappaand weighted kappa.Psychological Bulletin, 72(5):323–327, 1969.

[20] T. Frangsmyr, editor.Les Prix Nobel 2001. Almqvist & Wiksell Int, Stock-holm, Sweden, 2001. Also available at www.nobel.se.

[21] M. Furuno, N. den Elzen, and J. Pines. Human cyclin A is required for mitosisuntil mid prophase.Journal of Cell Biology, 147(2):295–306, 1999.

[22] S. Gilles, M. Brady, J. Declerck, J. Thirion, and N. Ayache. Bias field correctionof breast MR images. InProceedings of the Fourth International Confer-ence on Visualization in Biomedical Computing (VBC), pages 153–158,Hamburg, Germany, Sept. 1996.

[23] R. C. Gonzalez and R. E. Woods. Digital Image Processing. Prentice Hall, Inc., Upper Saddle River, NJ, 2nd edition, 2002.

[24] T. Heiden, G. Auer, and B. Tribukait. Reliability of DNA cytometric S-phase analysis in surgical biopsies: Assessment of systematic and sampling errors and comparison between results obtained by image and flow cytometry. Cytometry, 42:169–208, 2000.

[25] R. A. Johnson and D. W. Wichern. Applied Multivariate Statistical Analysis. Prentice-Hall, 4th edition, 1998.

[26] A. Karlsson, K. Stråhlen, and A. Heyden. Segmentation of histopathological sections using snakes. In J. Bigun and T. Gustavsson, editors, Proceedings of the 13th Scandinavian Conference on Image Analysis (SCIA), volume 2749 of Lecture Notes in Computer Science, pages 595–602, Gothenburg, Sweden, June 2003. Springer-Verlag.

[27] M. Kass, A. Witkin, and D. Terzopoulos. Snakes: Active contour models. International Journal of Computer Vision, 1:321–331, 1988.

[28] A. Krtolica, C. Ortiz de Solorzano, S. Lockett, and J. Campisi. Quantification of epithelial cells in coculture with fibroblasts by fluorescence image analysis. Cytometry, 49:73–82, 2002.

[29] Z. Kulpa. Area and perimeter measurement of blobs in discrete binary pictures. Computer Graphics and Image Processing, 6:434–454, 1977.

[30] H. Y. Lan, W. Mu, D. J. Nikolic-Paterson, and C. Atkins. A novel, simple, reliable and sensitive method for multiple immunoenzyme staining: use of microwave oven heating to block antibody crossreactivity and retrieve antigens. Journal of Histochemistry and Cytochemistry, 43:97–102, 1995.

[31] P. Lancaster and K. Šalkauskas. Curve and Surface Fitting, an Introduction. Academic Press, London, 1986.

[32] D. R. Larson, W. R. Zipfel, R. M. Williams, S. W. Clark, M. P. Bruchez, F. W. Wise, and W. W. Webb. Water-soluble quantum dots for multiphoton fluorescence imaging in vivo. Science, 300:1434–1436, 2003.

[33] A. Liljeborg, M. Csader, and A. Porwit. A method to compensate for light attenuation with depth in three-dimensional DNA image cytometry using a confocal scanning microscope. Journal of Microscopy, 177(2):108–114, 1995.

[34] J. Lindblad. Perimeter and area estimates for digitized objects. In Proceedings of the SSAB (Swedish Society for Automated Image Analysis) Symposium on Image Analysis, pages 113–117, Norrköping, Sweden, March 2001.

[35] J. Lindblad. Development of Algorithms for Digital Image Cytometry. PhD thesis, Uppsala University, Uppsala, Sweden, Jan. 2003. Available from the author.

[36] J. Lindblad and E. Bengtsson. A comparison of methods for estimation of intensity nonuniformities in 2D and 3D microscope images of fluorescence stained cells. In I. Austvoll, editor, Proceedings of the 12th Scandinavian Conference on Image Analysis (SCIA), pages 264–271, Bergen, Norway, June 2001. NOBIM.

[37] S. J. Lockett, D. Sudar, C. T. Thompson, D. Pinkel, and J. W. Gray. Efficient, interactive, and three-dimensional segmentation of cell nuclei in thick tissue sections. Cytometry, 31:275–286, 1998.

[38] N. Malpica, C. Ortiz de Solorzano, J. J. Vaquero, A. Santos, I. Vallcorba, J. M. Garcia-Sagredo, and F. del Pozo. Applying watershed algorithms to the segmentation of clustered nuclei. Cytometry, 28(4):289–297, 1997.

[39] E. Manders, J. Stap, G. Brakenhoff, R. van Driel, and J. Aten. Dynamics of three-dimensional replication patterns during the S-phase, analysed by double labelling of DNA and confocal microscopy. Journal of Cell Science, 103:857–862, 1992.

[40] F. Meyer and S. Beucher. Morphological segmentation. Journal of Visual Communication and Image Representation, 1(1):21–46, 1990.

[41] P. K. Nakane. Simultaneous localization of multiple tissue antigens using the peroxidase-labeled antibody method: a study on pituitary glands of the rat. Journal of Histochemistry and Cytochemistry, 16:557–560, 1968.

[42] B. Nordin. IPAD, version 2.0 and IMP - an IPAD application. Internal report No. 6, Centre for Image Analysis, Uppsala, Sweden, 1997. Available from the author.

[43] C. Ortiz de Solorzano, E. Garcia Rodriguez, A. Jones, D. Pinkel, J. Gray, D. Sudar, and S. Lockett. Segmentation of confocal microscope images of cell nuclei in thick tissue sections. Journal of Microscopy, 193:212–226, 1999.

[44] J. Park and J. M. Keller. Snakes on the watershed. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(10):1201–1205, 2001.

[45] S. M. Potter. Vital imaging: two photons are better than one. Current Biology, 6(12):1595–1598, 1996.

[46] W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery. Numerical Recipes in C: The Art of Scientific Computing. Cambridge University Press, 2nd edition, 1992.

[47] P. Pudil, F. J. Ferri, J. Novovičová, and J. Kittler. Floating search methods for feature selection with nonmonotonic criterion functions. In Proceedings of the 12th International Conference on Pattern Recognition (ICPR), pages 279–283, 1994.

[48] C. E. Queiros and E. S. Gelsema. A note on some feature selection criteria. Pattern Recognition Letters, 10(3):155–158, 1989.

[49] P. Ranefall, K. Wester, and E. Bengtsson. Automatic quantification of immunohistochemically stained cell nuclei using unsupervised image analysis. Analytical Cellular Pathology, 16:29–43, 1998.

[50] K. Rodenacker and E. Bengtsson. A feature set for cytometry on digitized microscopic images. Analytical Cellular Pathology, 25:1–36, 2003.

[51] P. K. Sahoo, S. Soltani, A. K. C. Wong, and Y. C. Chen. A survey of thresholding techniques. Computer Vision, Graphics and Image Processing, 41:233–260, 1988.

[52] K. Schauenstein, E. Schauenstein, and G. Wick. Fluorescence properties of free and protein bound fluorescein dyes. I. Macrospectrofluorometric measurements. Journal of Histochemistry and Cytochemistry, 26(4):277–283, 1978.

[53] C. Seydel. Quantum dots get wet. Science, 300:80–81, 2003.

[54] P. Soille. Morphological Image Analysis: Principles and Applications. Springer-Verlag, 1999.

[55] M. Sonka, V. Hlavac, and R. Boyle. Image Processing, Analysis and Machine Vision. Brooks/Cole Publishing Company, Pacific Grove, CA, 2nd edition, 1999.

[56] J. A. Swets, R. M. Dawes, and J. Monahan. Better decisions through science. Scientific American, 283(4):70–75, 2000.

[57] B. J. Trask. Fluorescence in situ hybridization: applications in cytogenetics and gene mapping. Trends in Genetics, 7(5):149–154, 1991.

[58] P. S. Umesh Adiga and B. B. Chaudhuri. An efficient method based on watershed and rule-based merging for segmentation of 3-D histo-pathological images. Pattern Recognition, 34:1449–1458, 2001.

[59] L. Vincent. Morphological grayscale reconstruction in image analysis: Applications and efficient algorithms. IEEE Transactions on Image Processing, 2(2):176–201, 1993.

[60] L. Vincent and P. Soille. Watersheds in digital spaces: An efficient algorithm based on immersion simulations. IEEE Transactions on Pattern Analysis and Machine Intelligence, 13(6):583–597, 1991.

Acta Universitatis Upsaliensis

Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology

Editor: The Dean of the Faculty of Science and Technology

Distribution:

Uppsala University Library

Box 510, SE-751 20 Uppsala, Sweden

www.uu.se, [email protected]

ISSN 1104-232X

ISBN 91-554-5759-2

A doctoral dissertation from the Faculty of Science and Technology, Uppsala University, is usually a summary of a number of papers. A few copies of the complete dissertation are kept at major Swedish research libraries, while the summary alone is distributed internationally through the series Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology. (Prior to October, 1993, the series was published under the title "Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science".)