

Page 1: Comparative study of shape, intensity and texture features and support vector machine for white

Journal of Theoretical and Applied Computer Science, Vol. 7, No. 1, 2013, pp. 20–35
ISSN 2299-2634, http://www.jtacs.org

Comparative study of shape, intensity and texture features and support vector machine for white blood cell classification

Mehdi Habibzadeh, Adam Krzyżak, Thomas Fevens
Department of Computer Science & Software Engineering, Concordia University, Montréal, Québec

{me habi,krzyzak,fevens}@encs.concordia.ca

Abstract: The complete blood count (CBC) is a widely used test for counting and categorizing various peripheral particles in the blood. The main goal of the paper is to count and classify white blood cells (leukocytes) in microscopic images into five major categories using shape, intensity and texture features. The first critical step of the counting and classification procedure involves segmentation of individual cells in cytological images of thin blood smears. The quality of segmentation has a significant impact on cell type identification, but poor quality, noise, and/or low resolution images make segmentation less reliable. We analyze the performance of our system for three different sets of features and determine that the best performance is achieved by wavelet features using the Dual-Tree Complex Wavelet Transform (DT-CWT), which is based on multi-resolution characteristics of the image. These features are combined with a Support Vector Machine (SVM), which classifies white blood cells into their five primary types. This approach was validated with experiments conducted on low resolution digital images of normal blood smears.

Keywords: Complete Blood Count (CBC), White Blood Cell Classification, Dual-Tree Complex Wavelet Transform (DT-CWT), Shape, intensity and texture features, Kernel-PCA, Support Vector Machine (SVM)

1. Introduction

The complete blood count (CBC) is a widely used pathology screening test to detect abnormalities such as infections, allergies, and clotting disorders, and to diagnose and report numerous diseases. It examines different particles in blood smears and focuses particularly on the Leukocyte or White Blood Cell (WBC) count, the Erythrocyte or Red Blood Cell (RBC) count, WBC differential analysis, evidence of disease, and the number of infected cells. This is accomplished by staining a blood film using, e.g., Wright, Giemsa, or May-Grünwald staining [1] and then imaging it with a transmission light microscope.

A major purpose of the CBC is WBC counting. For example, if the total number of white blood cells (leukocytes), or the number of one of the specific types of leukocytes, is above or below normal, then this may indicate abnormalities. An increased number of WBCs is called leukocytosis and may indicate the presence of inflammation, leukemia (a cancer of the blood that is characterized by an unusual increase of immature white blood cells), trauma, intense exercise, or stress. Conversely, a decreased leukocyte count, called leukopenia, may result from chemotherapy, radiation therapy, diseases of the immune system, pregnancy (final months), or heavy smoking. The leukocyte differential is the total number of WBCs expressed as thousands/µl in a volume of blood. There are five mature types of WBCs (with typical percentage of occurrence in normal blood): Basophil (<1%); Eosinophil (<5%); Monocyte (3-9%); Lymphocyte (25-35%); and Neutrophil (40-75%) [1] (see Fig. 1).

Figure 1. Mature WBCs, left to right: Neutrophil; Monocyte; Lymphocyte; Eosinophil; and Basophil.

The resulting count is the total number of leukocytes expressed in a volume of blood. Automatic counting systems have been available in medical laboratories for the last 25 years. Current counting methods include manual WBC counting by a medical expert, RBC impedance counters, and flow cytometry machine counters. The most recent systems rely on flow cytometry, whereby blood flows through a detector [2]. This technique requires costly chemicals and relies on dedicated hardware. The most decisive and solid diagnosis of peripheral blood film infection is done by manually finding disorders and atypical cells in blood samples through quantitative and qualitative microscopic analysis, particularly by looking at the shape (e.g., the nucleus and cytoplasm of the cells), the occlusion, degree of contact and overlap between cells, and by counting blood smear particles. Pathologists use such manual visual screening tests to recognize and interpret the minor and major changes that help identify different disorders in the blood cells.

1.1. Background and literature review

Counting and classifying each class manually is a tedious, time-consuming and labor-intensive activity in medical laboratories that can be eased by image processing software. Such automated, software-based counters improve the reproducibility and accuracy of the WBC differential results and relieve the burden of these clinical activities. The first computerized steps in automated blood smear examination go back to Bentley and Lewis [3] in 1975. The first fully automated processing of blood smear slides was introduced by Rowan [4] in 1986. The literature on automatic WBC classification using computer vision concepts is extensive and involves different feature extractors, classifiers, and quantitative and qualitative processes, e.g., [1, 5–7].

Ongun et al. [8] proposed an approach using active contours to track the boundaries of WBCs, although occluded cells were not precisely handled. Lezoray [9] introduced region-based White Blood Cell segmentation using extracted markers (or seeds). However, this method requires prior knowledge of color information for proper seed extraction. Kumar [10] applied a novel cell edge detector in an attempt to determine the boundary of the nucleus precisely. Sinha and Ramakrishnan [11] suggested a two-step segmentation framework using k-means clustering of the data mapped to HSV color space and a neural network classifier using shape, color and texture features. WBC segmentation was achieved by means of mean-shift-based color segmentation in the work of Comaniciu and Meer [12], while Jiang et al. [13] used watershed segmentation.

Chan et al. [14] proposed combining a conventional wavelet transform with morphology to improve the segmentation of touching or adjacent blood cells. In more recent work (2012), Dorini et al. [5] performed automatic differential cell counting at two levels: segmenting the WBC nucleus and identifying the cytoplasm region. To improve the correctness and performance of two known segmentation approaches, watershed transformation and level sets, gray-scale inputs were pre-processed with a Self-Dual Multiscale Morphological Toggle (SMMT) filter, which combines scaled erosion and dilation morphological operations. In addition, the cell cytoplasm region was separated using granulometry and morphological transformations. In that work, five mature WBC types were classified using a K-Nearest Neighbor (K-NN) classifier with geometrical shape features, and a reasonable accuracy was achieved (78% performance vs 85% when classified manually by a specialist).

The present work addresses the problem of classifying normal white blood cells. After a comparative analysis of different sets of features, an automated computerized system is used to improve the recognition and counting of mature WBC types in white blood cell images by using wavelet coefficients computed by the Dual-Tree Complex Wavelet Transform (DT-CWT) [15] in combination with a Support Vector Machine (SVM) [16] for WBC type classification. Experimentation indicates that this approach is also effective on imagery with low magnification and quality.

2. Proposed approach

For this work, only normal digital thin blood smears will be considered. The proposed approach is composed of three general steps: 1. image acquisition and discrimination of WBCs from RBCs; 2. extraction of shape, intensity, and texture features; 3. classification into the five White Blood Cell types by means of the Support Vector Machine. Details of each step are presented in the following sections.

3. Image acquisition and leukocyte separation

The inputs to our framework are digital color images of thin peripheral blood smears. A robust de-noising algorithm based on bivariate wavelet shrinkage is applied to these blood images, and Kuwahara edge-preserving filtering is performed to compensate for the blurring side effect of the bivariate shrinkage. These images are then enhanced by a combination of the Otsu and Niblack binarization algorithms. Next, White Blood Cells are localized and segmented by the following procedure, introduced in previous work [17, 18]:

1. Extract sub-images holding individual closed White Blood Cell regions. The algorithm approximately determines the location of each WBC nucleus and enhances WBC boundaries.

2. Use a step-by-step iterative method based on Red Blood Cell size estimation, a circular mask, saturation value and noise removal to separate WBCs and RBCs into two individual sub-images.

The computational cost of this process is dominated by determining a mask capable of separating the WBCs from the RBCs. The separated White Blood Cells are collected into a database of JPEG images (see Fig. 5).

4. Feature extraction

The main task of feature selection is to choose those features which are best correlated with the class under recognition. Different selections are possible; each has its own importance according to its role in classification, and none is necessarily globally optimal. For better judgment and comparison, the following three sets of features will be investigated: a primary feature vector composed of the shape, intensity and texture features; a lower dimensional projection of this primary feature vector determined by kernel-PCA [19]; and DT-CWT wavelet coefficients.

4.1. Primary feature vector

The features we propose to use for the primary feature vector can be grouped into three categories: shape, intensity, and texture features.

Shape features: The following features express the overall size and shape of the White Blood Cell. Among the possible selections, we choose features which are invariant under translation, changes in scale, and rotation. First, two definitions. The central moment of a digital image [20] is

$$\mu_{ij} = \sum_{x} \sum_{y} (x - \bar{x})^{i} (y - \bar{y})^{j} I(x, y)$$

where $I(x, y)$ is the intensity value of the respective pixel over a WBC. Also, the normalized $(ij)$th central moment, where $i + j \geq 2$, is

$$\eta_{ij} = \frac{\mu_{ij}}{\mu_{00}^{\left(\frac{i+j}{2} + 1\right)}}.$$

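Then the set of shape features that we will use is:

1. The Hu set of invariant moments [20, 21], features computed from normalized central moments up to order three:

$$\phi_1 = \eta_{20} + \eta_{02}.$$

$$\phi_2 = (\eta_{20} - \eta_{02})^2 + 4\eta_{11}^2.$$

$$\phi_3 = (\eta_{30} - 3\eta_{12})^2 + (3\eta_{21} - \eta_{03})^2.$$

$$\phi_4 = (\eta_{30} + \eta_{12})^2 + (\eta_{21} + \eta_{03})^2.$$

$$\phi_5 = (\eta_{30} - 3\eta_{12})(\eta_{30} + \eta_{12})\left[(\eta_{30} + \eta_{12})^2 - 3(\eta_{21} + \eta_{03})^2\right] + (3\eta_{21} - \eta_{03})(\eta_{21} + \eta_{03})\left[3(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2\right].$$

$$\phi_6 = (\eta_{20} - \eta_{02})\left[(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2\right] + 4\eta_{11}(\eta_{30} + \eta_{12})(\eta_{21} + \eta_{03}).$$

$$\phi_7 = (3\eta_{21} - \eta_{03})(\eta_{30} + \eta_{12})\left[(\eta_{30} + \eta_{12})^2 - 3(\eta_{21} + \eta_{03})^2\right] - (\eta_{30} - 3\eta_{12})(\eta_{21} + \eta_{03})\left[3(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2\right].$$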

2. The relative shape measurements vector (Relative Area, Relative Perimeter). Let $r$ be the estimated average blood cell radius, estimated from the approximate diagonal of a square mask bounding the separated WBC. The length of the diagonal can be calculated by the Pythagorean theorem from the known side length of the square. Let $(x_i, y_i)$ be the boundary pixels and let $N$ be the total number of pixels. Then the aforementioned relative shape features are as follows [22]:

a) Relative Area $A_r$:

$$A_r = \frac{\sum_{x,y} I(x, y)}{\pi r^2}.$$

b) Relative Perimeter $P_r$, where $x_{i_{max}}$ and $y_{i_{max}}$ are maximum intensity values in the image:

$$P_r = \frac{\sum_{i=0}^{N-1} \sqrt{(x_i - x_{i+1})^2 + (y_i - y_{i+1})^2} + \sqrt{(x_{i_{max}} - x_0)^2 + (y_{i_{max}} - y_0)^2}}{2\pi r}.$$
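As an illustration of the moment-based shape features, the following minimal Python sketch computes central moments, normalized central moments, and the first two Hu invariants for a small intensity grid. The function names and the `img[x][y]` indexing convention are assumptions made for this example and are not taken from the paper's implementation.

```python
def centroid(img):
    """Return (x_bar, y_bar) of a 2-D intensity grid indexed as img[x][y]."""
    mass = sum(sum(row) for row in img)
    x_bar = sum(x * v for x, row in enumerate(img) for v in row) / mass
    y_bar = sum(y * v for row in img for y, v in enumerate(row)) / mass
    return x_bar, y_bar


def central_moment(img, i, j):
    """mu_ij = sum_x sum_y (x - x_bar)^i (y - y_bar)^j I(x, y)."""
    x_bar, y_bar = centroid(img)
    return sum((x - x_bar) ** i * (y - y_bar) ** j * v
               for x, row in enumerate(img)
               for y, v in enumerate(row))


def normalized_moment(img, i, j):
    """eta_ij = mu_ij / mu_00^((i + j) / 2 + 1), intended for i + j >= 2."""
    mu00 = central_moment(img, 0, 0)
    return central_moment(img, i, j) / mu00 ** ((i + j) / 2 + 1)


def hu_first_two(img):
    """Return (phi_1, phi_2) of the Hu invariant set."""
    n20 = normalized_moment(img, 2, 0)
    n02 = normalized_moment(img, 0, 2)
    n11 = normalized_moment(img, 1, 1)
    return n20 + n02, (n20 - n02) ** 2 + 4 * n11 ** 2
```

Because the moments are taken about the centroid, moving the same blob to another position in the grid should leave both values unchanged up to floating-point error, which is the translation invariance the text relies on.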

Intensity features: These features are based only on the absolute value of the intensity measurements in the image. A histogram describes the relative frequency of occurrence of the intensity values of the pixels in an image. The intensity features that we consider are the first four central moments of this histogram: mean, standard deviation, skewness, and kurtosis [22]. The mean ($\mu$) gives an estimate of the average intensity level in the region of the cell, and the standard deviation ($\sigma$) is a measure of the dispersion of intensity. Skewness ($\gamma_1$) is a measure of histogram symmetry, while kurtosis ($K$) is a measure of the tail of the histogram [22]. Let the histogram be represented by $P(h)$, the relative frequency of the pixel intensity value $h$, where $h \in [0, \ldots, L-1]$. Let $I(x, y)$ be the intensity value of the pixel at row $x$ and column $y$, and let $N$ be the total number of pixels.

1. Mean:

$$\mu = \sum_{h=0}^{L-1} h\,P(h) = \sum_{x=0}^{\mathrm{Row}} \sum_{y=0}^{\mathrm{Column}} \frac{I(x, y)}{N}.$$

2. Standard deviation:

$$\sigma = \sqrt{\sum_{h=0}^{L-1} (h - \bar{h})^2 P(h)}.$$

3. Skewness:

$$\gamma_1 = \frac{1}{\sigma^3} \sum_{h=0}^{L-1} (h - \bar{h})^3 P(h).$$

4. Kurtosis:

$$K = \frac{\sum_{h=0}^{L-1} (h - \bar{h})^4}{(L - 1)\,\sigma^4}.$$
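The histogram statistics above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function name `intensity_features` is hypothetical, $\bar{h}$ is taken to be the histogram mean $\mu$, and kurtosis is computed as the $P(h)$-weighted fourth central moment divided by $\sigma^4$, which is the common textbook convention and differs from the $(L-1)$ normalization printed above.

```python
import math


def intensity_features(pixels, levels=256):
    """Histogram mean, standard deviation, skewness and kurtosis.

    pixels: flat list of integer intensities in [0, levels - 1].
    Assumes the region is not constant (sigma > 0).
    """
    n = len(pixels)
    counts = [0] * levels
    for h in pixels:
        counts[h] += 1
    p = [c / n for c in counts]  # P(h), relative frequency of level h

    mean = sum(h * p[h] for h in range(levels))
    sigma = math.sqrt(sum((h - mean) ** 2 * p[h] for h in range(levels)))
    skew = sum((h - mean) ** 3 * p[h] for h in range(levels)) / sigma ** 3
    # Assumption: P(h)-weighted fourth moment, not the (L - 1) form in the text.
    kurt = sum((h - mean) ** 4 * p[h] for h in range(levels)) / sigma ** 4
    return mean, sigma, skew, kurt
```

For a 28 × 28 cell image, `pixels` would be the 784 gray values of the segmented region flattened into one list.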

Texture features: The following features aim to quantify the overall local density variability inside the object. It is often difficult to visualize textural features and to associate feature values with the appearance of cells. These features [22] include gradient transformation features $\nabla f(x, y) = \left(\frac{\partial f(x,y)}{\partial x}, \frac{\partial f(x,y)}{\partial y}\right)$, Laplacian transformation features $\nabla^2 f(x, y) = \frac{\partial^2 f(x,y)}{\partial x^2} + \frac{\partial^2 f(x,y)}{\partial y^2}$, flat texture features, and co-occurrence matrix features [23]. The co-occurrence matrix is defined over an image as the distribution of co-occurring values at a given offset. Let $n \times m$ be the size of the input image $I$, and let $(\Delta x, \Delta y)$ be the parameters of an offset. Mathematically, a primary co-occurrence matrix definition is given by:

$$C_{\Delta x, \Delta y}(i, j) = \sum_{x=1}^{n} \sum_{y=1}^{m} \begin{cases} 1, & \text{if } I(x, y) = i \text{ and } I(x + \Delta x, y + \Delta y) = j \\ 0, & \text{otherwise.} \end{cases}$$

Each entry is therefore considered to be the probability that a pixel with value $i$ will be found adjacent to a pixel of value $j$. It estimates the probability that pixel $I(k, l)$ has intensity $i$ and a pixel $I(m, n)$ has intensity $j$. Various combinations of the matrix entries are taken to generate the Haralick features [23] (namely, the angular second moment, contrast, correlation, sum of squares: variance, and inverse difference moment), and features based upon the Dual-Tree Complex Wavelet Transform [15] are used as well. Specifically, the image-based texture features used in this research are primarily the Dual-Tree Complex Wavelet Transform coefficients, while the other feature combinations defined below are also investigated in this work.

1. Gradient transformation [22], where $L$ is the local neighbourhood of the pixel $(i, j)$, and the input pixels are weighted by coefficients $H$:

$$G(i, j) = \sum_{(m, n) \in L} H(i - m, j - n)\, I(m, n).$$

2. Laplacian transformation [22] in a digital image:

$$\nabla^2 I(i, j) = \frac{1}{4}\bigl(I(i+1, j) + I(i-1, j) + I(i, j+1) + I(i, j-1) - 4I(i, j)\bigr).$$

3. Flat texture [22], where $r$ is the window size of the median filter:

$$I_{FT} = I(x, y) - \mathrm{median}\bigl(I(x + \nu, y + \xi);\ \nu, \xi = -r, \ldots, r\bigr).$$

4. Angular second moment, where $N_G$ is the quantized gray scale and $P(i, j)$ is the probability value:

$$f_1 = \sum_{i=1}^{N_G} \sum_{j=1}^{N_G} P(i, j)^2.$$

5. Contrast, where $P(i, j)$ is the $(i, j)$th entry in a normalized spatial dependence matrix [23]:

$$f_2 = \sum_{n=0}^{N_G - 1} n^2 \left\{ \sum_{i=1}^{N_G} \sum_{\substack{j=1 \\ |i - j| = n}}^{N_G} P(i, j) \right\}.$$

6. Correlation, where $\mu_x$, $\mu_y$, $\sigma_x$, and $\sigma_y$ are the means and standard deviations of $P_x$ and $P_y$:

$$f_3 = \frac{\sum_{i=1}^{N_G} \sum_{j=1}^{N_G} (ij)\, P(i, j) - \mu_x \mu_y}{\sigma_x \sigma_y}.$$

7. Sum of squares: variance

$$f_4 = \sum_{i=1}^{N_G} \sum_{j=1}^{N_G} (i - \mu_x)^2 P(i, j).$$

8. Inverse difference moment

$$f_5 = \sum_{i=1}^{N_G} \sum_{j=1}^{N_G} \frac{1}{1 + (i - j)^2} P(i, j).$$
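To make the co-occurrence features concrete, the sketch below builds the matrix $C_{\Delta x, \Delta y}$ for one offset, normalizes it to $P(i, j)$, and evaluates three of the Haralick features listed above. The function names are hypothetical, gray levels are assumed to be already quantized to integers in `[0, levels - 1]`, and pixel pairs that fall outside the image are simply skipped, which is one possible border convention.

```python
def cooccurrence(img, dx, dy, levels):
    """Count pairs I(x, y) = i and I(x + dx, y + dy) = j, with img[x][y]."""
    n, m = len(img), len(img[0])
    counts = [[0] * levels for _ in range(levels)]
    for x in range(n):
        for y in range(m):
            u, v = x + dx, y + dy
            if 0 <= u < n and 0 <= v < m:  # skip pairs leaving the image
                counts[img[x][y]][img[u][v]] += 1
    return counts


def normalize(counts):
    """Turn raw counts into probabilities P(i, j) that sum to 1."""
    total = sum(map(sum, counts))
    return [[c / total for c in row] for row in counts]


def haralick_subset(p):
    """Angular second moment, contrast and inverse difference moment."""
    ng = len(p)
    cells = [(i, j) for i in range(ng) for j in range(ng)]
    asm = sum(p[i][j] ** 2 for i, j in cells)
    # Grouping terms by n = |i - j| gives the same value as the n^2 form of f2.
    contrast = sum((i - j) ** 2 * p[i][j] for i, j in cells)
    idm = sum(p[i][j] / (1 + (i - j) ** 2) for i, j in cells)
    return asm, contrast, idm
```

In practice several offsets (for example horizontal, vertical and the two diagonals) are often evaluated and their features concatenated or averaged; the sketch handles a single offset only.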


4.2. Features based on Dual-Tree Complex Wavelet

When the camera is swept across the peripheral blood slides, particles may be found at different magnifications, angles, staining and quality. Local descriptors of image regions that are robust and invariant to the imaging conditions are needed to develop an efficient strategy for storing the set of features. Among the aforementioned feature vectors, the dual-tree wavelet transform is an enhanced method that calculates the complex transform of a signal using two separate Discrete Wavelet Transform decompositions, providing a useful invariant characterization of the structure of a digital blood image.

Wavelet transform analysis provides well-organized tools for capturing local image structure and details, with powerful analysis performance and multi-resolution properties, which makes it suitable for image analysis, although it has several inherent drawbacks. The wavelet transform has four fundamental structural problems [15]: Oscillations, see Fig. 2 (the coefficients tend to oscillate between positive and negative values around singular points, so wavelet coefficient values tend to be exaggerated); Shift variance, see Fig. 3 (a minor shift or rotation of the signal leads to significant variations in the distribution of energy between wavelet coefficients at different scales); Aliasing (the coefficients are computed via down-sampling with non-ideal low-pass and high-pass filters, which tends to alias the signals with one another so that they can no longer be identified as distinct); and Lack of directionality (the lack of directional selectivity makes the analysis of geometric image features such as ridges and edges particularly difficult). To overcome these four weaknesses of the Discrete Wavelet Transform (DWT), the dual-tree, double-density, and complex wavelet transforms were introduced. The Dual-Tree Wavelet Transform was introduced as an extended and enhanced version of the typical Discrete Wavelet Transform (DWT), with additional properties: shift invariance and directional selectivity in two and higher dimensions. To date, there exist two somewhat similar versions, namely the Kingsbury DT-DWT [15, 24] and the Selesnick DT-DWT [25], named DT-DWT(K) and DT-DWT(S), respectively, after their authors. These two redundant transforms consist of two conventional DWT filter-bank trees working in parallel, with the respective filters of the two trees in approximate quadrature. The double-density Discrete Wavelet Transform (DT-DWT(S)) and the dual-tree complex Discrete Wavelet Transform (DT-DWT(K)) are similar in several respects. A single wavelet transform can have the properties of both the double-density DWT and the dual-tree complex DWT, which motivates the research and development of both approaches.

Figure 2. Oscillations (more information: [15]): At the edge border, the conventional real Discrete Wavelet Transform has both large and small coefficients (right figure), whereas the Complex Wavelet Transform provides only coefficients that are more related to their nearness to the edge (left figure).

Figure 3. Shift variance (details: [15]): Coefficients for both the conventional real DWT and the Complex WT. The conventional real DWT is highly sensitive to translations of the signal (see figures labeled real DWT). In the Dual-Tree CWT the amount of energy at scale j is almost constant (see figures labeled complex DWT), unlike in the typical DWT.

The Selesnick DT-DWT(S) simultaneously carries the properties of the double-density discrete wavelet transform and the dual-tree discrete wavelet transform. It is based on two scaling functions $\{\phi_h(t), \phi_g(t)\}$ and four distinct wavelets ($\psi_{h,c}(t), \psi_{g,c}(t)$; $c = 1, 2$). The double-density discrete wavelet transform is implemented by a 3-channel $(h_0, h_1, h_2)$ analysis filter bank applied alternately, first to the rows and then to the columns of an image. Hence, nine 2-D sub-bands are computed in each tree $(h, g)$. One of them is the 2-D low pass scaling filter ($\phi(t)$), while the other eight form the eight 2-D wavelet filters ($\psi(t)$). These functions are defined implicitly as follows, where $N_G$ is the quantized gray scale:

$$\phi_h(t) = \sqrt{2} \sum_{n=1}^{N_G} h_0(n)\, \phi_h(2t - n).$$

$$\psi_{h,1}(t) = \sqrt{2} \sum_{n=1}^{N_G} h_1(n)\, \phi_h(2t - n).$$

$$\psi_{h,2}(t) = \sqrt{2} \sum_{n=1}^{N_G} h_2(n)\, \phi_h(2t - n).$$

Figure 4. Q-shift DT-CWT [27], giving real and imaginary parts of complex coefficients from two trees (α, β). The approximate delay for each filter is shown in brackets in the figure, where q = 1/4 sample period.

Scaling and wavelet functions for tree (g) are defined similarly.

1. Selesnick wavelets are offset from one another by one half, as below:

$$\psi_{h,1}(t) = \psi_{h,2}(t - 0.5), \qquad \psi_{g,1}(t) = \psi_{g,2}(t - 0.5).$$

2. Selesnick wavelets form an approximate Hilbert transform ($\mathcal{H}$) pair:

$$\psi_{g,1}(t) \approx \mathcal{H}\{\psi_{h,1}(t)\}, \qquad \psi_{g,2}(t) \approx \mathcal{H}\{\psi_{h,2}(t)\}.$$

The double-density discrete wavelet transform has two useful properties: low computational complexity and approximate shift invariance, and so this kind of wavelet can be used to implement texture features in image classification. A short review of DT-DWT(S) and its mathematical properties is given in [25]. The well-known applications of the double-density dual-tree Discrete Wavelet Transform are the same as those of the dual-tree complex Discrete Wavelet Transform, for example, signal modeling, enhancement, image segmentation, compression, coding, watermarking and image denoising.

In this paper the application of DT-CWT(K) to blood smear detection is investigated. The proposed framework using DT-CWT(K) is similar to our related previous work [26].

DT-CWT(K) is inspired by the Fourier transform applied to wavelet concepts. The coefficients of the DWT are real-valued, whereas the Fourier transform is based on complex-valued oscillating sinusoids (with real and imaginary parts); in DT-CWT(K) the wavelet functions likewise form a Hilbert transform pair with a π/2 phase difference. It performs two different sub-band filtering schemes, one for the real part and one for the imaginary part. DT-CWT(K) is faster than the traditional template matching method [2] and also improves on wavelet thresholding [14] through its degrees of freedom in variance and directional selectivity.

In practice, DT-CWT combines two Discrete Wavelet Transforms, using even and odd wavelets to provide complex coefficients. Each tree (α, β) contains purely real filters, whereby the two trees produce the real and imaginary parts, respectively, of each complex wavelet coefficient. For the trees (α, β) we need low pass filters with group delays that differ by half a sample period. The Q-shift (quarter shift) filter attains the required group delays (see Fig. 4). This leads to low aliasing energy and good shift invariance. The DT-CWT analysis is applied in 1-D, along rows and columns, and six oriented 2-D complex wavelets are constructed from different combinations of the outputs. The outcome of the DT-CWT is thus a set of complex coefficients that gives a sufficiently rich representation of local structure at each pixel for six different orientations (sub-bands), $\pm\frac{\pi}{12}, \pm\frac{\pi}{4}, \pm\frac{5\pi}{12}$, and for each of a number of scales differing by a factor of 2. To use this information in the feature vectors for the SVM, the complex values (real and imaginary) are converted to polar form (magnitude, phase); placing alternating values into the feature vector (magnitude1, phase1, magnitude2, phase2 and so on) gives the best results in the classifier. For our segmented cell images, DT-CWT is applied to the image samples with 6 scales (the number of levels of wavelet decomposition) and 14-tap Q-shift filters [15, 24], giving a total of 3204 features (6 scales (14×14, 7×7, 4×4, 2×2, 1×1, 1×1) × 6 sub-bands ($\pm\frac{\pi}{12}, \pm\frac{\pi}{4}, \pm\frac{5\pi}{12}$) × 2 components (magnitude, phase)) for each 28×28 sample (low magnification images).
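The feature count and the interleaved polar layout described above can be checked with a short Python sketch. It does not compute a DT-CWT; it only assumes that some transform has already produced complex coefficients. The names `SUBBAND_SIZES`, `feature_count` and `to_feature_vector` are hypothetical.

```python
import cmath

# Sub-band sizes per scale for a 28 x 28 input, as listed in the text.
SUBBAND_SIZES = [(14, 14), (7, 7), (4, 4), (2, 2), (1, 1), (1, 1)]
N_ORIENTATIONS = 6  # +-pi/12, +-pi/4, +-5pi/12


def feature_count(sizes=SUBBAND_SIZES, orientations=N_ORIENTATIONS):
    """Coefficients over all scales x orientations x 2 (magnitude, phase)."""
    per_orientation = sum(rows * cols for rows, cols in sizes)
    return per_orientation * orientations * 2


def to_feature_vector(coefficients):
    """Interleave polar components: magnitude1, phase1, magnitude2, phase2, ..."""
    features = []
    for z in coefficients:
        magnitude, phase = cmath.polar(z)
        features.extend((magnitude, phase))
    return features
```

The sub-band sizes sum to 267 coefficients per orientation, so 267 × 6 = 1602 complex coefficients and 1602 × 2 = 3204 real features, matching the total stated in the text.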

4.3. Dimensionally reduced feature vector via Kernel-PCA

Kernel Principal Component Analysis (K-PCA) is used to non-linearly project data by means of a kernel in the feature space and to reduce the high dimensionality of the feature vectors. K-PCA is an extension of linear PCA. It is used to generate a new set of features from the above primary feature vector of shape, intensity, and texture features as follows. Let $(x_1, \ldots, x_n)$ be a set of features in the input space. Let the function $\phi(\cdot)$ be a non-trivial mapping to an $M$-dimensional feature space [19], where $M > n$. We choose a centred $\phi(\cdot)$ such that the mean of the mapped features $\phi(x_i)$ is zero.

Kernel PCA performs the typical linear PCA in the feature space corresponding to the non-linear $n \times n$ kernel matrix $K = k(\cdot, \cdot)$, defined by the inner product between two mapped features in the feature space, $k(x, y) = \langle \phi(x), \phi(y) \rangle$. For the kernel $K$, we determine its eigenvectors $[a_{i1}, \ldots, a_{in}]$ and corresponding eigenvalues $\lambda_i$, where $i = 1, \ldots, n$. For a lower dimensional embedding, we choose the eigenvectors $a_{ij}$, $i = 1, \ldots, l$, $j = 1, \ldots, n$, corresponding to the $l < n$ largest eigenvalues. Then the projection of a feature $x$ is given by

$$y_i = \sum_{j=1}^{n} a_{ij}\, k(x, x_j), \quad i = 1, \ldots, l.$$

There are many possible kernel candidates. Among them, the three most popular non-linear kernels are: polynomial, Gaussian (exponential) and sigmoidal (tanh). The polynomial kernel is given by $k(x, y) = \langle x, y \rangle^{D} = (x \cdot y + C_1)^{D}$, and the sigmoidal kernel is given by $k(x, y) = \tanh(C_1\, x \cdot y + C_2)$, where $D$ is the polynomial degree, and $C_1$ and $C_2$ are constants. The Gaussian kernel is given by $k(x, y) = \exp\left(-\frac{d^2(x, y)}{2\delta^2}\right)$, where $d(x, y)$ is the Euclidean distance between $x$ and $y$. A short review of PCA and its mathematical properties is given in [28, 29].

To create a new set of features from the primary feature vector composed of the shape, intensity and texture features, Kernel-PCA is applied with a polynomial kernel of degree D = 1, 2, keeping the eigenvectors of the l = 50 largest eigenvalues to build a new lower dimensional feature vector.
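The three kernels and the centring step of Kernel-PCA can be sketched as follows. This is only a partial illustration: the eigendecomposition of the centred kernel matrix, which yields the coefficients $a_{ij}$, is left to a numerical library and is not shown. All function names and default parameter values are assumptions made for the example.

```python
import math


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def polynomial_kernel(x, y, degree=2, c1=1.0):
    """k(x, y) = (x . y + C1)^D"""
    return (dot(x, y) + c1) ** degree


def gaussian_kernel(x, y, delta=1.0):
    """k(x, y) = exp(-d(x, y)^2 / (2 delta^2)), d = Euclidean distance."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-d2 / (2 * delta ** 2))


def sigmoid_kernel(x, y, c1=1.0, c2=0.0):
    """k(x, y) = tanh(C1 x . y + C2)"""
    return math.tanh(c1 * dot(x, y) + c2)


def centered_gram(samples, kernel):
    """n x n kernel matrix, centred so the mapped features have zero mean."""
    n = len(samples)
    k = [[kernel(a, b) for b in samples] for a in samples]
    row_mean = [sum(row) / n for row in k]  # equals column mean: k is symmetric
    total_mean = sum(row_mean) / n
    return [[k[i][j] - row_mean[i] - row_mean[j] + total_mean
             for j in range(n)]
            for i in range(n)]
```

Centring in kernel space replaces the explicit requirement that $\phi(x_i)$ have zero mean, since $\phi$ itself is never evaluated. Every row and column of the centred matrix sums to zero.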

5. Machine learning approach

Machine learning and pattern recognition play a critical role in the digital medical imaging field, including computer-aided diagnosis and medical image analysis. Medical pattern recognition essentially requires "learning from samples". Objects such as white blood cells are classified into specific WBC classes based on input features (e.g., shape, intensity, and texture) obtained from segmented WBC candidates.

Some commonly used classifiers are non-parametric approaches such as the Artificial Neural Network (ANN) and K-Nearest Neighbor, non-metric approaches such as the Decision Tree, parametric techniques such as Bayesian classifiers [29], and the Support Vector Machine (SVM). Below, we will use the SVM, which is very popular in biomedical and biological applications.

5.1. Support Vector Machines

Support vector machines are an example of a well-known linear/non-linear two-class classifier. Let $x_i$ (a pattern) be the $i$th vector in a dataset sample $\{(x_i, y_i)\}_{i=1}^{n}$, where $y_i$ is the label associated with $x_i$. A linear discriminant function is defined implicitly by $f(x) = \omega^T x + b$. A simple and naive non-linear classifier is obtained by mapping data from the input space using $f(x) = \omega^T \phi(x) + b$, where $\phi$ is a kernel mapping function. The weight vector can be expressed as a linear combination of the training samples, $\omega = \sum_{i=1}^{n} \alpha_i x_i$. The classifier in the non-linear approach takes the form $f(x) = \sum_{i=1}^{n} \alpha_i \phi(x_i)^T \phi(x) + b$. The maximum margin classifier in the support vector machine is the discriminant function that maximizes the geometric margin $1/\|\omega\|$. To allow errors and misclassified inputs, the optimization problem can be formulated as a minimization over $\omega$ and $b$ of the function $\frac{1}{2}\|\omega\|^2 + C \sum_{i=1}^{n} \zeta_i$, where $C$ is a constant value, subject to the inequality constraints $y_i(\omega^T x_i + b) \geq 1 - \zeta_i$ and $\zeta_i \geq 0$. This optimization problem can be solved in dual form using Lagrange multipliers as follows [16]:

$$\underset{\alpha}{\text{maximize}} \quad \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} y_i y_j \alpha_i \alpha_j x_i^T x_j$$

$$\text{subject to} \quad \sum_{i=1}^{n} y_i \alpha_i = 0; \quad 0 \leq \alpha_i \leq C.$$
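To make the soft-margin formulation concrete, the following sketch evaluates the primal objective for a given linear discriminant, taking each slack variable as the smallest value that satisfies the constraints, i.e. max(0, 1 − y_i(ω^T x_i + b)). The function names and the toy data are hypothetical and serve only as an illustration.

```python
def decision(w, b, x):
    # Linear discriminant f(x) = w^T x + b.
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def slacks(w, b, samples, labels):
    # Smallest slack satisfying y_i * f(x_i) >= 1 - zeta_i and zeta_i >= 0.
    return [max(0.0, 1.0 - y * decision(w, b, x))
            for x, y in zip(samples, labels)]


def soft_margin_objective(w, b, samples, labels, C):
    # 0.5 * ||w||^2 + C * (sum of slacks).
    norm_sq = sum(wi * wi for wi in w)
    return 0.5 * norm_sq + C * sum(slacks(w, b, samples, labels))


# Toy two-class data with labels in {-1, +1}.
samples = [[2.0, 0.0], [0.5, 0.0], [-2.0, 0.0]]
labels = [1, 1, -1]
w, b = [1.0, 0.0], 0.0

zeta = slacks(w, b, samples, labels)
objective = soft_margin_objective(w, b, samples, labels, C=10.0)
print(zeta, objective)
```

Only the second sample lies inside the margin, so it is the only one with a non-zero slack; a larger C penalizes such margin violations more heavily relative to the margin term.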

We classify white blood cells into five sub-types based on their dual-tree complex wavelet coefficients, using a support vector machine classifier, which is well suited to and robust in non-linear classification in a high-dimensional space and efficient in modeling diverse sources of data. In this work, the performance and efficiency of different feature selection strategies are investigated over a small (28 samples per class), low-quality dataset (input image size 28 × 28). Three different sets of training and testing data are introduced, with feature vectors consisting of, respectively, only the intensity gray values of the inputs; combined features comprising the intensity gray values of the inputs and the aforementioned intensity, shape and texture features (section 4.1); and the DT-CWT(K) coefficients.


In the first experiment, only the intensity gray values are used: the dataset comprises 5×28 samples, each a 784-dimensional vector (28 × 28 pixels), giving D1 = 109760 values in total (5 × 28 × 784). In the second experiment, the training and testing data (intensity gray values of the inputs together with the intensity, shape and texture features) consist of 5×28 samples, each a 3161-dimensional feature vector labelled as belonging to one of the WBC classes, giving D2 = 442540 values (5 × 28 × 3161). In the third and last experiment, the training and testing data use the DT-CWT(K) coefficients and consist of 5×28 samples, each a 3204-dimensional feature vector, giving D3 = 448560 values (5 × 28 × 3204 = 5 × 28 × 2 × 1602) (see Fig. 1).
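The three sizes quoted above follow directly from the number of samples and the per-sample feature length. The short check below reproduces the arithmetic; the variable names are illustrative only.

```python
n_classes, per_class = 5, 28
n_samples = n_classes * per_class  # 140 images in total

# Feature length of one sample in each experiment.
per_sample = {
    "D1": 28 * 28,   # 784 gray values of a 28x28 image
    "D2": 3161,      # gray values plus intensity, shape and texture features
    "D3": 2 * 1602,  # DT-CWT(K) coefficients, 3204 per image
}

# Total number of values in each dataset.
totals = {name: n_samples * length for name, length in per_sample.items()}
print(totals)
```

In all three cases the per-sample dimension (784, 3161 or 3204) is far larger than the 140 available samples.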

Training an SVM [16] requires finding the large-margin hyperplane, and the kernel parameters also have a strong effect on the decision boundary. As in most practical machine learning problems, several kernels (Gaussian and polynomial) can be tried to find the best one. The lowest-degree polynomial, i.e., the linear kernel, typically provides a baseline; on our database it gives the best performance, as in many other bioinformatics applications, because polynomial and Gaussian kernels frequently lead to over-fitting on high-dimensional data sets (primary feature vector, DT-CWT(K) coefficients) with a small number of examples (28 samples per class) [16]. A multi-class SVM can be implemented with the one-vs-all (1AA) strategy or the one-vs-one (AAA) strategy. In this work, support vector machines with a linear kernel, a soft margin (slack variables allow some points to be misclassified) and the 1AA strategy [16] are used, with 5-fold validation over the selected features and the DT-CWT output. The data in each class are divided into two subsets, 23 samples (82%) for training and 5 samples (18%) for tuning, so that the database stays balanced.
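The per-class split and the one-vs-all decision rule can be sketched as follows. This is a minimal illustration with hypothetical helper names (`split_per_class`, `one_vs_all_targets`, `predict_one_vs_all`) and a random split; it does not reproduce which images were actually selected in this work.

```python
import random


def split_per_class(labels, n_train, seed=0):
    # Pick n_train indices per class for training; the rest are held out.
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    train, held_out = [], []
    for label in sorted(by_class):
        idxs = by_class[label][:]
        rng.shuffle(idxs)
        train.extend(idxs[:n_train])
        held_out.extend(idxs[n_train:])
    return train, held_out


def one_vs_all_targets(labels, positive):
    # Binary targets of one 1AA sub-problem: +1 for `positive`, -1 otherwise.
    return [1 if label == positive else -1 for label in labels]


def predict_one_vs_all(scores):
    # scores maps class name -> decision value f(x) of that class's binary
    # SVM; the class with the largest value wins.
    return max(scores, key=scores.get)


classes = ["B", "E", "L", "M", "N"]
labels = [c for c in classes for _ in range(28)]  # 28 samples per class
train, held_out = split_per_class(labels, n_train=23)
print(len(train), len(held_out))
```

With five classes, the 1AA strategy trains five binary classifiers, one per call to `one_vs_all_targets`, and assigns a test sample to the class whose classifier returns the largest decision value.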

Figure 5. WBC testing data; each row, top to bottom: Basophil (B), Lymphocyte (L), Monocyte (M), Neutrophil (N), Eosinophil (E)

6. Experiments

In the current study, 140 digital blood smear images of five different types of WBC were chosen for the experiments. The implementation was done using MATLAB 7. We performed tests on sets of normal blood slides with different characteristics and inherent limitations. These can arise during blood smear slide preparation (different staining, varying degrees of cell overlap, dirty slides, presence of platelets, dust, microbial or other artefacts) and during digital image acquisition (inaccurate angle of image capture, blurred and noisy images, low magnification, and poor resolution), and can lead to various problems in the processing, analysis, and pattern recognition of blood slides.

The high variability of the samples is intended to simulate the large variation of real blood samples and to accommodate diversity. The raw input leukocyte images are converted into JPEG format, and the experiments (training & testing) are carried out on low-quality, poorly sampled images normalized to 28×28 pixels (see Fig. 5). Given an SVM classifier and three feature sets for the test instances (intensity values; combined shape, intensity and texture features; and dual-tree complex wavelet transform coefficients), a 5×5 confusion matrix (also called a contingency table), which relates the known classes of all WBC objects to their assigned classes, is used to determine the effectiveness and accuracy of classification of the proposed framework. In this work we also compare the reliability of the DT-CWT coefficients with our previous work based on Convolutional Neural Networks (CNN) [30], i.e., feed-forward networks which extract topological features in their first layers and classify patterns with their last layers. In addition, an SVM using features extracted by Kernel Principal Component Analysis (K-PCA) [19] and a simple linear SVM are applied. For 115 (5 × 23) training images and 25 (5 × 5) testing images, the confusion matrices (with normalized rows) for normal testing WBC images for hybrid DT-CWT & SVM; CNN (recognition rate after 105 epochs); K-PCA (with polynomial D = 2) & SVM; and linear SVM are summarized in Table 1.
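A row-normalized confusion matrix of this kind can be built as in the sketch below. The function name and the labels are hypothetical toy data chosen for illustration; they are not results from this study.

```python
def confusion_matrix(known, assigned, classes):
    # Rows: known class; columns: assigned class.
    # Each non-empty row is normalized so that it sums to 1.
    index = {c: i for i, c in enumerate(classes)}
    counts = [[0] * len(classes) for _ in classes]
    for k, a in zip(known, assigned):
        counts[index[k]][index[a]] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


# Made-up example: two known classes with 5 test samples each.
classes = ["B", "L", "M"]
known = ["B"] * 5 + ["L"] * 5
assigned = ["B", "B", "L", "B", "B"] + ["L", "L", "L", "B", "M"]
matrix = confusion_matrix(known, assigned, classes)
for name, row in zip(classes, matrix):
    print(name, row)
```

With 5 test samples per class, every entry of a normalized row is a multiple of 0.20, and the diagonal entry of a row is the classification rate of that class.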

In particular, for normal WBCs using linear SVM & DT-CWT, 84% of known WBCs were classified correctly. This classification rate decreases to 72% for the CNN and likewise for linear SVM with the primary feature vector (see section 4.1), and then to 60% for linear SVM with only the intensity gray-level values. Given the small number of examples in high-dimensional feature sets, using non-linear kernels (e.g., polynomial with degree K = 2) leads to over-fitting. With dimensionality reduction, in the worst case, the hybrid K-PCA & SVM & DT-CWT, only 24% of WBCs were classified correctly, whereas this classification rate increased to 28% for K-PCA & SVM & primary feature vector, and then to 52% for hybrid K-PCA & SVM & only intensity gray-level values. So, based on the confusion matrices with five classes, the proposed linear SVM & DT-CWT classifier is much more reliable and accurate, even in the presence of similarity among classes (especially between Lymphocyte and Basophil cells) in this difficult database, yielding acceptable accuracy when compared to SVM with the other feature sets (refer to the diagonals of the confusion matrices, e.g., Basophil and Lymphocyte classification rates of 100% & 60% for DT-CWT & linear SVM versus 40% & 40% for linear SVM & primary feature vector). The false positive rate (FPR), i.e., the proportion of negative samples incorrectly classified as positive, of the SVM & DT-CWT is also lower than the FPR of the CNN or of an SVM using the selected features mentioned above. Classification using SVM & DT-CWT has a small false positive rate of 16%, with this FPR increasing to 28% for the CNN and for linear SVM & selected primary feature vector, and then to 40% for linear SVM (without feature dimensionality reduction via K-PCA) & only intensity gray-scale values.
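Because every class contributes the same number of test images, the overall classification rate is simply the mean of the diagonal of the row-normalized confusion matrix, and the rates reported above as FPR are its complement. The sketch below recomputes these figures from the diagonals listed in Table 1; the helper name `mean_diagonal` is illustrative.

```python
def mean_diagonal(diagonal):
    # With equally sized classes, the overall classification rate is the
    # mean of the per-class rates on the diagonal.
    return sum(diagonal) / len(diagonal)


# Diagonals of the row-normalized matrices in Table 1, in the order
# Basophil, Eosinophil, Lymphocyte, Monocyte, Neutrophil.
diagonals = {
    "linear SVM & DT-CWT (D3)": [1.00, 0.80, 0.60, 0.80, 1.00],
    "CNN": [0.60, 0.80, 0.60, 0.80, 0.80],
    "linear SVM (D2)": [0.40, 1.00, 0.40, 0.80, 1.00],
    "linear SVM (D1)": [0.40, 1.00, 0.20, 0.80, 0.60],
}

rates = {name: mean_diagonal(diag) for name, diag in diagonals.items()}
for name, rate in rates.items():
    print(f"{name}: rate {rate:.0%}, complement {1 - rate:.0%}")
```

This reproduces the 84%, 72%, 72% and 60% classification rates and the corresponding 16%, 28%, 28% and 40% figures quoted in the text.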

Table 1. Confusion matrices (rows normalized) for the classifiers over the testing images for three different feature sets (section 5.1): (D1) intensity pixel values, (D2) primary feature vector (section 4.1), and (D3) DT-CWT(K) (section 4.2), each with linear SVM and with K-PCA & SVM; and for the CNN.

Linear SVM (D1): Assigned WBC classes
Known       Basophil  Eosinophil  Lymphocyte  Monocyte  Neutrophil
Basophil    0.40      0.20        0.40        0.00      0.00
Eosinophil  0.00      1.00        0.00        0.00      0.00
Lymphocyte  0.60      0.00        0.20        0.00      0.20
Monocyte    0.20      0.00        0.00        0.80      0.00
Neutrophil  0.00      0.20        0.00        0.20      0.60

SVM & K-PCA (D1): Assigned WBC classes
Known       Basophil  Eosinophil  Lymphocyte  Monocyte  Neutrophil
Basophil    0.80      0.00        0.20        0.00      0.00
Eosinophil  0.00      1.00        0.00        0.00      0.00
Lymphocyte  0.40      0.40        0.20        0.00      0.20
Monocyte    0.00      0.00        0.40        0.60      0.00
Neutrophil  0.80      0.00        0.20        0.00      0.00

Linear SVM (D2): Assigned WBC classes
Known       Basophil  Eosinophil  Lymphocyte  Monocyte  Neutrophil
Basophil    0.40      0.40        0.20        0.00      0.00
Eosinophil  0.00      1.00        0.00        0.00      0.00
Lymphocyte  0.60      0.00        0.40        0.00      0.00
Monocyte    0.20      0.00        0.00        0.80      0.00
Neutrophil  0.00      0.00        0.00        0.00      1.00

SVM & K-PCA (D2): Assigned WBC classes
Known       Basophil  Eosinophil  Lymphocyte  Monocyte  Neutrophil
Basophil    0.00      0.00        0.60        0.20      0.20
Eosinophil  0.00      0.80        0.20        0.00      0.00
Lymphocyte  0.40      0.00        0.20        0.00      0.40
Monocyte    0.00      0.20        0.60        0.00      0.20
Neutrophil  0.00      0.20        0.20        0.20      0.40

Linear SVM & DT-CWT (D3): Assigned WBC classes
Known       Basophil  Eosinophil  Lymphocyte  Monocyte  Neutrophil
Basophil    1.00      0.00        0.00        0.00      0.00
Eosinophil  0.00      0.80        0.00        0.00      0.20
Lymphocyte  0.04      0.00        0.60        0.20      0.20
Monocyte    0.00      0.00        0.00        0.80      0.20
Neutrophil  0.00      0.00        0.00        0.00      1.00

SVM & K-PCA & DT-CWT (D3): Assigned WBC classes
Known       Basophil  Eosinophil  Lymphocyte  Monocyte  Neutrophil
Basophil    0.20      0.40        0.20        0.20      0.00
Eosinophil  0.00      1.00        0.00        0.00      0.00
Lymphocyte  0.00      0.04        0.96        0.00      0.00
Monocyte    0.00      0.20        0.00        0.80      0.00
Neutrophil  0.00      1.00        0.00        0.00      0.00

CNN: Assigned WBC classes
Known       Basophil  Eosinophil  Lymphocyte  Monocyte  Neutrophil
Basophil    0.60      0.20        0.20        0.00      0.00
Eosinophil  0.00      0.80        0.20        0.00      0.00
Lymphocyte  0.20      0.00        0.60        0.20      0.00
Monocyte    0.00      0.00        0.00        0.80      0.20
Neutrophil  0.00      0.00        0.00        0.20      0.80

7. Conclusions

The main contributions of this work aim at the development of publicly available analysis software for the CBC blood test, with automatic processing of blood slide images to produce the data required for the diagnosis of blood diseases. The computed blood cell counts are compared with human observer counts of the number of WBCs in each of the five classes. The method as outlined gives a detailed computerized description of a laboratory task that relies on visual perception, in a setting where automated and semi-automated clinical instruments, such as those used in flow methods, have become dominant in WBC classification. WBC differential counting methods such as manual counting, impedance counters and flow cytometry techniques, however, would be expected to have acceptable false-positive rates; these rates are also very small in the current computerized image processing solution, even for very low-quality images. As the confusion matrices show, even for poor samples (messy, small and faded), the WBC counts are much more accurate when the hybrid DT-CWT & SVM classifier is used rather than the CNN or the K-PCA & SVM classifier (see the confusion matrices given in Table 1). Experimental results indicate that the current analysis offers remarkable recognition accuracy even in the presence of poor-quality samples and multiple classes. Advances in implementation open the possibility of extending this framework to quantitatively measure the subtypes of cells (sub-differentiation) across the entire field of hematology analysis or in other similar research.

8. Acknowledgements

The authors would like to thank Prof. Nick Kingsbury from the University of Cambridge, UK, for providing his DT-CWT code. We also thank the two anonymous reviewers whose comments and suggestions helped improve and clarify this manuscript.

References

[1] Ramoser, H., Laurain, V., Bischof, H., Ecker, R.: Leukocyte segmentation and classification in blood-smear images. In: 27th IEEE Annual Conference Engineering in Medicine and Biology, pp. 3371–3374. Shanghai, China, 2005.

[2] Ushizima, D., Lorena, A., de Carvalho, A.: Support Vector Machines Applied to White Blood Cell Recognition. In: 5th International Conference on Hybrid Intelligent Systems, pp. 379–384. Rio de Janeiro, Brazil, 2005.

[3] Bentley, S., Lewis, S.: The use of an image analyzing computer for the quantification of red cell morphological characteristics. British Journal of Hematology, 29, pp. 81–88, 1975.

[4] Rowan, R., England, J. M.: Automated examination of the peripheral blood smear. In: Automation and quality assurance in hematology, chapter 5, pp. 129–177. Blackwell Scientific Oxford, 1986.

[5] Dorini, L., Minetto, R., Leite, N.: Semi-automatic white blood cell segmentation based on multiscale analysis. IEEE Transactions on Information Technology in Biomedicine, 17(1), pp. 250–256, 2013. ISSN 2168-2194.

[6] Shitong, W., Min, W.: A new detection algorithm (NDA) based on fuzzy cellular neural networks for white blood cell detection. IEEE Transactions on Information Technology in Biomedicine, 10(1), pp. 5–10, 2006.

[7] Theera-Umpon, N., Dhompongsa, S.: Morphological Granulometric Features of Nucleus in Automatic Bone Marrow White Blood Cell Classification. IEEE Transactions on Information Technology in Biomedicine, 11(3), pp. 353–359, 2007.

[8] Ongun, G., Halici, U., Leblebicioglu, K., Atalay, V., Beksac, M., Beksac, S.: Feature extraction and classification of blood cells for an automated differential blood count system. In: International Joint Conference on Neural Networks, pp. 2461–2466. Washington, DC, USA, 2001.

[9] Lezoray, O., Elmoataz, A., Cardot, H., Gougeon, G., Lecluse, M., Elie, H., Revenu, M.: Segmentation of cytological images using color and mathematical morphology. Acta Stereologica, 18(1), pp. 1–14, 1999.

[10] Kumar, B., Joseph, D., Sreenivas, T.: Teager energy based blood cell segmentation. In: 14th International Conference on Digital Signal Processing, pp. 619–622. Santorini, Greece, 2002.

[11] Sinha, N., Ramakrishnan, A.: Automation of differential blood count. In: IEEE International Conference on Convergent Technologies for Asia-Pacific Region, pp. 547–551. Bangalore, India, 2003.

[12] Comaniciu, D., Meer, P.: Cell image segmentation for diagnostic pathology. In: Advanced algorithmic approaches to medical image segmentation, pp. 541–558. Springer, New York, NY, USA, 2002.


[13] Jiang, K., Liao, Q.-M., Dai, S.-Y.: A novel white blood cell segmentation scheme using scale-space filtering and watershed clustering. In: IEEE International Conference on Machine Learning and Cybernetics, pp. 2820–2825. Xi’an, China, 2003.

[14] Chan, H., Li-Jun, J., Jiang, B.: Wavelet transform and morphology image segmentation algorism for blood cell. In: 4th IEEE International Conference on Industrial Electronics and Applications, pp. 542–545. Xi’an, China, 2009.

[15] Selesnick, I., Baraniuk, R., Kingsbury, N.: The dual-tree complex wavelet transform. IEEE Signal Processing Magazine, 22(6), pp. 123–151, 2005. ISSN 1053-5888.

[16] Ben-Hur, A., Weston, J.: A User’s Guide to Support Vector Machines. In: Carugo, O., Eisenhaber, F. (eds.), Data Mining Techniques for the Life Sciences, volume 609 of Methods in Molecular Biology, pp. 223–239. Humana Press, 2010. ISBN 978-1-60327-241-4.

[17] Habibzadeh, M., Krzyzak, A., Fevens, T.: Application of pattern recognition techniques for the analysis of thin blood smear images. Journal of Medical Informatics & Technologies, 18, pp. 29–40, 2011.

[18] Habibzadeh, M., Krzyzak, A., Fevens, T., Sadr, A.: Counting of RBCs and WBCs in noisy normal blood smear microscopic images. In: SPIE Medical Imaging: Computer-Aided Diagnosis, volume 7963, p. 79633I. Orlando, FL, USA, 2011.

[19] Rathi, Y., Dambreville, S., Tannenbaum, A.: Statistical shape analysis using kernel PCA. In: SPIE Conferences: IS&T Electronic Imaging, volume 6064, pp. 425–432. San Jose, CA, USA, 2006.

[20] Hu, M.: Visual pattern recognition by moment invariants. IRE Transactions on Information Theory, 8(2), pp. 179–187, 1962. ISSN 0096-1000.

[21] Muralidharan, R., Chandrasekar, C.: Scale invariant feature extraction for identifying an object in the image using Moment invariants. In: International Conference on Communication and Computational Intelligence (INCOCCI), pp. 452–456. 2010.

[22] Rodenacker, K., Bengtsson, E.: A feature set for cytometry on digitized microscopic images. Analytical Cellular Pathology, 25(1), pp. 1–36, 2001.

[23] Haralick, R., Shanmugam, K., Dinstein, I.: Textural Features for Image Classification. IEEE Transactions on Systems, Man and Cybernetics, SMC-3(6), pp. 610–621, 1973. ISSN 0018-9472.

[24] Kingsbury, N.: Design of Q-shift complex wavelets for image processing using frequency domain energy minimization. In: International Conference on Image Processing (ICIP), volume 1, pp. I-1013–16. 2003. ISSN 1522-4880.

[25] Selesnick, I.: The double-density dual-tree DWT. IEEE Transactions on Signal Processing, 52(5), pp. 1304–1314, 2004. ISSN 1053-587X.

[26] Habibzadeh, M., Krzyzak, A., Fevens, T.: Analysis of White Blood Cell Differential Counts Using Dual-Tree Complex Wavelet Transform and Support Vector Machine Classifier. In: ICCVG International Conference on Computer Vision and Graphics, volume 7594, pp. 414–422. Springer, Warsaw, Poland, 2012.

[27] Kingsbury, N.: Complex wavelets for shift invariant analysis and filtering of signals. Applied and Computational Harmonic Analysis, 10(3), pp. 234–253, 2001.

[28] Jolliffe, I.: Principal Component Analysis. Springer-Verlag (New York Inc), 2nd edition, 2002.

[29] Duda, R., Hart, P., Stork, D.: Pattern Classification. Wiley-Interscience, New York, 2nd edition, 2001.

[30] Habibzadeh, M., Krzyzak, A., Fevens, T.: White Blood Cell Differential Counts Using Convolutional Neural Networks for Low Resolution Images. In: Artificial Intelligence and Soft Computing, volume 7895 of Lecture Notes in Computer Science, pp. 263–274. Springer Berlin Heidelberg, 2013. ISBN 978-3-642-38609-1.