image analysis and segmentation of anatomical features of cervix

8/6/2019 Image Analysis and Segmentation of Anatomical Features of Cervix

1/4

IMAGE ANALYSIS AND SEGMENTATION OF ANATOMICAL FEATURES OF

CERVIX UTERI IN COLOR SPACE

Viara Van Raad

STI-Medical Systems, Honolulu, HI 96 813, USAe-mail: [email protected]

ABSTRACT

We propose and verify a method for color-based clus-ter segmentation of the various tissues from ectocervix.That method uses a simplified compartment-like analy-sis, aiming for a Gaussian Mixture Model (GMM)-tailored segmentation. The tissues of interest are thecervical canal (CC), the columnar epithelium (CE),the squamous epithelium (SE) and the transformationzone (TZ) the latter known as area where pre-canceris often found [1]. We used an optimization algorithm(maximum-a priori algorithm or MAP) for bimodal seg-mentation in normalized RGB color-space, as initiallywe estimated the deterministic values of CC, TZ, CEand SE as pixel sets in a compartmentallike mode. Weassessed the MAP algorithm via automatic segmenta-tion of squamous intraepithelial lesions (SIL) and CC.Our segmentation method is based on the estimates ofthe GMM boundaries for CC, TZ, and SE and their ad-jacent area-ratios for healthy ectocervices. We demon-strated a segmentation algorithm for CC and pre-cancerlesion detection that performed with high accuracy .

1. INTRODUCTION

Cervical cancer is the second most common canceramongst women worldwide [2]. It is also the thirdmost common cause of cancerrelated deaths [3]. How-ever, with early detection, cancer can be prevented andtreated, so the risk of death is greatly reduced. Cur-rently, in the clinics, there are two primary ways to findout if there is a pre-cancer on the cervix. One way is bytaking a sample of cells from the surface of the womanscervix and later performing a cytotest in vitro, known

also as the Pap smear test. The other one is named col-poscopy and it examines the cervix in vivo using broad-band light and this can be done after positive Pap smear.Forming an overall diagnostic impression based on vi-sual impressions cervical cancer precursors is a complextask. Experts consider factors such as color, textureand location of the shape of lesion(s), their borders fordiagnosis. Accuracy in colposcopy for staging neoplasiavaries among experts [4]. For these precursors to be diag-nosed successfully, one has to understand and interpretthese usual signs, introducing quantitative measures [5].Extensive training is required for colposcopists to ac-curately evaluate if signs of cancer are present [1] ontoectocervix. The colposcopists training is based of how

the human visual system (HVS) perceives color, textureand shapes on the cervix. This is often too complex tobe simulated by computer algorithms. In order to mimicthe colposcopists performance automatically, our aim is

to find a simplified approach for automatically discrim-inating important anatomical features of the cervix byusing quantitatively estimated values in a color-basedsegmentation task. Colorbased classification and seg-mentation can provide an increased accuracy in com-puter aided diagnosis (CAD) and besides textural infor-

mation of the appearance of vessels and other lesions, ituses the color intensity information from the image. Wesuggest a model that uses a-priori information basedon assumed Gaussian Mixture Model (GMM) in colorspace in the current paper and in addition, we proposeda simplified compartmental-like modelling of region ofinterest (ROI), assessing the color space values againstthe complementary (ROIC). For a normal cervix, wedefined at least four classes of image pixels to be ana-lyzed, while for disease diagnosis there are more featuresand tissue types ([1] and [6]). The tissues of interest in ahealthy cervix are the cervical canal (CC), the columnarepithelium (CE), the squamous epithelium (SE) and theTransformation Zone (TZ). The latter is known as area

where pre-cancer is often found [1].Previously, Pogue et al. evaluated the contribution

of color-based analysis in digital colposcopic images[7].However, they concluded that color analysis contributedonly minimally to staging cervical intraepithelial neo-plasias (CIN). Colorbased segmentation and artefactcorrection using probability and a-priori estimates wasproposed previously in [5]. In a similar probabilis-tic approach using projection of convex set (POCS)method, color was used for segmentation in [9].

The outline of this paper is as follows. Section 2 de-scribes the estimation of parameters, that sets the scenefor the MAP algorithm. Section 3 describes the the-

ory behind MAP. Section 4 contains a detailed descrip-tion of the bimodal approach used in the current exper-iments. Section 5 contains the experimental results forcolor-based segmentation, and Section 6 discusses therecommended method and concludes the article.

2. ESTIMATED PROBABILITY VALUES

Predominantly, the cervical landmarks in the colposcopyimages are nested types of anatomical areas: the CCis placed centrally, often surrounded by the TZ. The lat-ter is surrounded by the squamous epithelium (SE) [1].We depicted the described spatial relationship in Figure1, representing a stylized illustration of the spatial re-

lationships between these areas in colposcopy images.The relative size of each anatomical feature appearing asa 2D projection on the image plane can be used for prob-abilistic modelling for a-priori estimation of the area


2/4

Figure 1: Stylized relationship of the 2D projection of acervical image. The features are often nested one inanother, which is a typical for endoscopy type of image.

occupied by each tissue type. The probability of thatpixel that belongs to a type of tissue is proportional to

the area after each tissue class occupies. That is why weestablished standard occupied relative areas of tissues asby measuring the ratio to the total image area of fiftyhealthy cervices. It is possible that the area of the TZor texture lesions might be indicative for the state ofthe health of the cervix. These measurements were per-formed via a semi-automatic (interactive) Matlabbasedprogram, which asks a trained operator to delineate theouter polygons associated with each tissue class. Thefact that the polygon areas are nested, makes the in-ner boundary of the surrounding tissue also the outerboundary of the inner tissue type. The sample averageof the area (the mean), the variance and the area oc-cupied by the tissues were calculated using the Greens

lemma for area on a discrete grid. The measurementwas performed using the simplified model of CC, TZand SE displayed on Figure 1. The portion of the areaoccupied by the CC, TZ, SE and SR are as follows: CCoccupies about 2% or less of the 2D projection of theectocervical area; TZ occupies approximately 28% orless from the full area of the normal cervix; SE occupiesaround 70% or less from the normal cervix. The re-sults represent the pooled average values taken from thetraining set, selected out offifty normal cervices. Moretest pixels were taken from the smaller area regions, ap-proximately proportional to the area of the feature.

3. MAP THEORETICAL BACKGROUND

The digitized color cervical images are a 2D projection(onto image plane) of the cervix, represented as RGBtriplets. Segmentation based on RGB triplets (3-tuples)is difficult because of large variation in color among thesame type of tissue represented on color image (as an ex-ample SE) in healthy cervices. This was illustrated onFigure 2. Assuming that the data has normally distrib-uted probability density function (pdf), each of the datasets under investigation is represented by a setvariableSZ, where Z describes separate random processes. Letz is a vector of the corresponding random variable with3tuple values for the three-color components, that in-terchangeably belongs to each of the tissue types, such

as the CC, the TZ, the SE and the artifact named spec-ular reflection (SR). Our experimental study is based

on large pixel samples. The number of samples bN fortraining data for each tissue type lies within the range

Figure 2: Variation of the color of the normal SE in twohealthy cervices.

of 2.5E+ 6 - 10E+ 6 pixels for each of the experi-ments for empirical assessment of the multiple compo-nent p df(s). The sample size was selected to be pro-

portional to the probability of occurrence of each of theSZ=CC|TZ|SE|SR. The GMM of Ddimensional multi-variate distribution can be parameterized as:

P(x|) =DXj=1

wj1

(2)D/2

j

1/2 ..exp

1

2(xjj)

0

j1(xjj), (1)

where wj are the component mixture weights; xjand j are vectors of

bntuple values from the jth color

class, as the latter is the estimated pooled mean from

the training image set. The former are the bntuple testvector of pixels under classification. j is the covariance

matrix of jth andj

is the determinant of the covari-

ance matrix j .The joint likelihood of the independentand identically distributed (i.i.d) feature vector obser-vations may be specified as a product of each one of themodelled probability:

P(x|) =DYj=1

p(xj|), (2)

which in logarithmic terms is the corresponding sum:

L() =DXj=1

lnp(xj |). (3)

Thus we can model each of the D - elements of themixture in color space and we can determine the bn(bn=3)-tuple RGB cluster boundaries of the normally distrib-uted data with mean and covariance. The set of pixelsY = {y1,y2,....yn} on a colposcopy image (n is the num-ber of pixels in the image) is assumed to belong to arandom process X= {x1, x2,..xn}, where an underlyingsegmentation is fulfilling the simple model:

xJ L = {1, 2,...D} (4)

xj= z and Z is the underlying and sought random

process. This will indicate that the pixel i belongs to


3/4

Figure 3: A stylized image (the far left image) and thefour compartments-like description of the four classesfor CC, TZ, SE and Background.

tissue type z. We define the probability that pixel be-longs to a tissue of type z by the nonnegative quantityzj , so P(xj = z) =

zj , which also is the probability for

the set of pixels representing a random process Z and itcan be expressed as PZ(z). We know that:

DXj=1

zj = 1. (5)

The observed image model is X, while ideally themost probable selection of pixel sets in the image ZMAPis to arise (to be discovered). The latter is in somesense the ideally sought image segmentation, given a-priori knowledge about the observation process X. Thisis known as the MAP approach, while the Maximumlikelihood estimate (ML) is to estimate z (a vector) withthe image zML which maximizes the likelihood that ourobservation will occur. This is:

ZML = arg maxz

PX|Z(x,z) (6)

The formula for the MAP in that case is:

ZMAP = arg maxz

PZ|X(x,z), (7)

thats using the Bayesian rule, the relationship be-tween ML and MAP is:

PZ|X(z,x) =PZ|X(x, z)

PX(x)= PX|Z(x,z)

PZ(z)

PX(x). (8)

As an emerging random process, that leads to the occur-

rence ofz

, we can estimate the approximation ofPZ(z

)deterministically using MAP method (for example, us-ing the estimates for the ratios of the areas occupiedform the training set for CC, TZ, SE and SR). We aresuggesting even more simplified model (Figure 3). Weconsider that we can estimate empirically the boundsfor these tissue types (as pixel values) by using the 50training images. Each time we are assuming bimodaldistribution - e.g. a foreground and a complemen-tary background, for which the foreground is SZ andthe background is SCZ (Figure 3). In that way, wewill be not interested in the latter estimate, but the es-timates of SCZ are included for the calculations in theMAP algorithm. Our focus is on establishment of the

empirical values of the pixels that belong to SZ. The col-lected information can be used for MAP segmentationas in equations 7 and 8 for PZ(z) with pooled valuesfor CC as SC; TZ as STZ; SE as SSE ; and SR as SSR.

Thus we will be determining the bounds ofN(z,z) in

color space using the absolute pixel intensities (R,G,B)or their chromaticity, as the latter are lessilluminationdependent.

4. THE MAP ALGORITHM FOR

SEGMENTATION

The tissue class set Z can be modelled as if it belongs toa bivariate GMM. The two-feature vectors (SZ and itscomplimentary SCZ) are assumed as setvariables mem-bers of a bivariate distribution that consists of fore-ground and background (Figure 3). The joint like-lihood of the i.i.d vectors observations as product ofthese probabilities will yield:

P(x|) = p(xZ|)p(xCZ|), (9)

as this equation is a partial case of equation 2 for thebimodal case, which yields the equation 9. The lattercauses the selected modelled process z to be tested outin order to maximize the chances of the pixel under testto belong to particular class. This, according MAPstheory is to be judged by the logarithmic distance Zbetween the clusters (equation 10), in order to find thesought pixel set. The distance is:

Z = 2ln

"Z |Z|

1/2

(1 Z)CZ

1/2

#(10)

Thus, the following inequality (equation 10) states the

criteria for the bntuple variable xj . The inequity isa mathematical derivation from the logarithmic expres-sion in equation 3 for the bimodal case (equation 11):

L() = lnp(xZ|) + lnp(xCZ|). (11)

The inequality that represents the criteria if a se-lected vector is chosen to belong to ZMAP:

xi SZMAP if

(xi Z)0

Z1(xi Z)

(xi CZ)

0

CZ1

(xi CZ) Z

otherwise xi SCZ

,

(12)where Zand

1Z are the mean vector and covariance

matrix of the 3-tuples for the pixels that belong to theSZ. The

CZand

C1Z are the deterministic values from

the training image for the pixels that belong to SCZ.

5. EXPERIMENTAL RESULTS

MAP-based experiments for detection and correction ofan artifact in cervical images known as glare or specularreflection (SR) were presented previously in [8]. An-other attempt to remove the glare was the ad-hocmethod described in [10]. In this research, we used

the MAP approach, testing the ability of our bimodalmodel to achieve an automated differentiation of thepixel set SCC (the cervical canal) from the rest of theimage SCCC, which was based on the equations (10) and


4/4

Figure 4: a) Each of the 3 color RGB images on the left isfrom healthy woman. b) Illustration of segmented cervi-cal canal (CC) binary images c) Each of the 3 color RGBimages has a lesion. d) Segmented CIN I SIL lesions asbinary images of the right. The image is segmented bypreviously described MAP algorithm for SCC and S

CCC

differentiation. This algorithm detects lesions thathave bloody appearance.

(12) for twenty five images. Thus we calculated if a pixelbelongs to that set, using a probability based distancemeasure, described before as the mean and the adja-

cent covariance {bCCC; bCCC } SCCC in chromaticityor rgb space. Thus, the local chromaticity for pixel yior the local Illumination independent chromaticity valueis: { yiRyiR+yiG+yiB ,

yiGyiR+yiG+yiB

, yiByiR+yiG+yiB }. The valuesare 3-tuples. We achieved fully automatic CC segmenta-tion shown on Figure 4 b) and lesion segmentation fromthe images with pre-cancer in the third column.The ac-curacy of the segmented area for the normal group ofselected test images (Figure4 b)) was 95%, which wasevaluated by the method for comparison of the groundtruth from the pre-defined masks. The accuracy of thesegmented area from the pre-cancerous test images was86% successful accuracy, evaluated by the same method.

This verifi

cation method uses pre-annotated masks thatdefines target based on colposcopy impressions only.

6. DISCUSSION AND CONCLUSIONS

We studied the GMM based MAP probability modelin cervical images in color space. This probabilisticapproach was novel to cervical image analysis, consid-ered the specific application. The suggested methodis a general statistical method that can be applied tomany other problems such as machine vision and pat-tern recognition. It has shown success in other appli-cations, it is simple to implement, and it can pe im-plemented in algorithm that is easy to follow. On the

other hand, there is a disadvantage of this method be-cause the method is not an adaptive method and in thetraining phase it uses device-specific color-based val-ues. The method can be improved if the colposcopy

images are pre-calibrated. The method can be used incolor-based image analysis if a detail annotations areavailable within the training set.

Acknowledgements: The work presented inthis paper was funded by the School of Electrical En-gineering and Telecommunications, University of NewSouth Wales, Kensington and the Australian ResearchCouncil (ARC1 ). This paper was motivated by Prof. B.Celler and A. Prof. H. Outhred, for which I am thankful.I appreciate the suggestions and comments regardingthe contents, sturcture and style of the article, made byGary Bignami (Science and Technology International).

REFERENCES

[1] Campion M. J. , Ferris D. G., di Paola F. M.,Modern Colposcopy, Educational Systems, Inc.Augusta, Georgia, USA 1991.

[2] American Cancer Society. Cancer Facts andfigures 2003. Atlanta (GA):ACS; 2003. data-base (http://www.cancer.org/downloads/STT/CAFF2003PWSecured.pdf).

[3] Sawaya G.F., Brown A.D., Washington A. E. Gar-ber A.M.,Clinical Practice. Current Approachesto cervical cancer screening, N.Eng. J. Med. Vol.344(21), pp.1603-1607, 2001.

[4] Massad L. S. and CollinsY. C., Strength of Cor-relations Between Colposcopic Impressions andBiopsy Histology, Gynecol. Oncol., Vol. 89, pp.424-428, 2003.

[5] Viara Van Raad and Andrew Bradley, Emerg-ing Technologies, Signal Processing and StatisticalMethods for Screening of Cervical Cancer in vivo -Are they good candidates for Cervical Screening?,IEEE - IEE Proc. of Advances in Medical Signaland Inf. Processing, Vol. 2, pp. 210-217, 2004.

[6] Apgar, B. S., M. M. Rubin, et al, Ch. 5: Principlesand technique of the colposcopic examination. Col-poscopy: Principles and Practice. B. S. Apgar, G.L. Brotzman and M. Spitzer. Philadelphia, W.B.Saunders Company, pp. 115-146, 2002.

[7] Pogue B. W. , Mycek Marry-Ann, D. Harper, Im-age Analysis for Discrimination of Cervical Neopla-sia, J. BioMed. Optics, Vol. 5(1), pp. 72-82, 2000.

[8] Van Raad V. , Frequency Space Analysis of Cer-vical Images using Short Time Fourer Transform,Proc. of IASTED, Intern. Conf. of BioMed Eng.,Vol(1), pp. 77-81, 2003.

[9] Gordon S., Zimmerman G., and Greenspan H.,Image segmentation of Uterine Cervix images forindexing in PACS, Proceedings of the 17th IEEESymposium on Computer-Based Medical Systems(CBMS04), 2004.

[10] Lange H., Automatic Glare Removal in Re-flectance Imagery of the Uterine Cervix, to appearin SPIE Proc. 5747, Medical Imaging, San Diego,Feb. 12-17, 2005, in press.

1 ARC-Strategic Partnership with Industry Research andTraining (SPRIT), Project ID: C0001944, 2000-2004.

image analysis and segmentation of anatomical features of cervix

Documents