[ieee 2010 ieee embs conference on biomedical engineering and sciences (iecbes) - kuala lumpur,...

Gaussian Bayes Classifier for Medical Diagnosis and Grading: Application to Diabetic Retinopathy

Ahmad Fadzil M Hani

Intelligent Signal and Imaging Research Centre

Department of Electrical and Electronic Engineering

Universiti Teknologi PETRONAS, Perak, Malaysia

Hanung Adi Nugroho, Hermawan Nugroho

Intelligent Signal and Imaging Research Centre

Department of Electrical and Electronic Engineering

Universiti Teknologi PETRONAS, Perak, Malaysia

Abstract-Data from medical imaging systems need to be analysed for diagnostic and clinical purposes. In a computerised system, the analysis normally involves a classification process to determine the disease and its condition. In an earlier work based on a database of 315 fundus images (FINDeRS), it was found that foveal avascular zone (FAZ) enlargement strongly correlates with diabetic retinopathy (DR) progression, with a correlation factor of up to 0.883 at significance levels better than 0.01. However, it was also found that a given FAZ area can belong to different DR severity levels with different levels of certainty, the areas within each level following a Gaussian distribution. In this research work, the suitability of the Gaussian Bayes classifier for determining DR severity level is investigated. A V-fold cross-validation (VFCV) process is applied to the FINDeRS database to evaluate the performance of the classifier. It is shown that the classifier achieves sensitivity of >84%, specificity of >97% and accuracy of >95% for all DR stages. Given the high values of sensitivity (>95%), specificity (>97%) and accuracy (>98%) obtained for the No DR and Severe NPDR/PDR stages, the Gaussian Bayes classifier is suitable as part of a computerised DR grading and monitoring system for early detection of DR and for effective treatment of severe cases.

Keywords-Bayes classifier, diabetic retinopathy, fundus image, Gaussian distribution

I. INTRODUCTION

In a computerised medical system, a classifier is required as part of a decision process to assist physicians in diagnosing and monitoring a disease. Manual data analysis for diagnosis in medical practice has become inadequate as the number of patients increases. Computer-based medical systems are needed to speed up medical data analysis and to diagnose disease accurately and consistently. Hence, the development of classification techniques for effective computer-based medical analysis to help physicians in diagnosis is essential.

In pattern recognition, there are two types of classification techniques, namely supervised classification and unsupervised classification [1-3]. Supervised classification uses a priori knowledge or information to determine the data structure. In contrast, unsupervised classification extracts the structure of the data from the data itself.

Numerous classification techniques have been developed, for example Support Vector Machine (SVM) [4], neural networks [5], k-Nearest Neighbour (kNN) [6], decision tree [7] and Bayesian classifier [8]. Generally, neural networks and SVM perform much better when dealing with multi-dimensional and continuous features. Neural networks and SVM can also be used to model a complex system as these techniques can deal with more parameters than the remaining techniques [4-5]. However, neural networks and SVM require a large sample size to achieve their maximum prediction accuracy.

The kNN algorithm is quite transparent: users such as physicians, who find that probabilistic explanations replicate their way of diagnosing, grasp it easily [6]. Nevertheless, kNN is usually considered intolerant to noise and is easily distorted by errors in attribute values. Most decision trees are considered resistant to noise because their pruning strategies avoid over-fitting the data in general and noisy data in particular.

The Gaussian Bayes classifier is a Bayes classifier for input data whose classes have Gaussian distributions [8]. The classifier learns from training data and estimates the posterior probabilities of the classes given a particular instance of the features, using Bayes theorem and assuming a Gaussian PDF for the data features. The class is predicted by identifying the class with the highest posterior probability.
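The training and prediction steps described above can be sketched for a single feature as follows. This is a minimal illustration rather than the authors' implementation: the function names and the toy data are hypothetical, and the priors are assumed to be the class proportions in the training set.

```python
import math
from collections import defaultdict

def fit_gaussian_bayes(samples, labels):
    """Estimate mean, variance and prior for each class from 1-D training data."""
    grouped = defaultdict(list)
    for x, y in zip(samples, labels):
        grouped[y].append(x)
    total = len(samples)
    model = {}
    for y, xs in grouped.items():
        mean = sum(xs) / len(xs)
        # Sample variance (n - 1 denominator); needs at least two samples per class.
        var = sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)
        # Assumption: the prior is the class proportion in the training data.
        model[y] = (mean, var, len(xs) / total)
    return model

def log_posterior(x, mean, var, prior):
    """Unnormalised log posterior: Gaussian log likelihood plus log prior."""
    log_likelihood = -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)
    return log_likelihood + math.log(prior)

def predict(model, x):
    """Return the class with the highest posterior probability."""
    return max(model, key=lambda y: log_posterior(x, *model[y]))

# Hypothetical toy data (not from the paper): two well-separated classes.
samples = [1.0, 1.2, 0.8, 1.1, 5.0, 5.3, 4.7, 5.1]
labels = ["low"] * 4 + ["high"] * 4
model = fit_gaussian_bayes(samples, labels)
```

Calling `predict(model, 1.05)` on the toy model returns the label of the class whose fitted Gaussian and prior give the larger posterior for that value.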

The major advantage of the Bayes classifier is its short computational time for training, since it requires a relatively small amount of training data to estimate the parameters for classification. The Bayes classifier is also robust to missing values because these values are simply ignored in computing probabilities and thus have no impact on the final decision.

One of the medical applications of disease severity classification is diabetic retinopathy. Diabetic retinopathy (DR) is a sight-threatening complication of diabetes mellitus that affects the retina. There are five levels of DR severity, namely no-DR, mild non-proliferative diabetic retinopathy (NPDR), moderate NPDR, severe NPDR and proliferative diabetic retinopathy (PDR) [9]. According to the Malaysia National Eye Database 2007, among 10,856 registered diabetic patients, 36.8% have some form of DR, of which 7.1% comprises PDR [10].

It has been reported in medical literature that biologically, the foveal avascular zone (FAZ) enlarges in diabetic retinopathy (DR) cases as a result of loss of capillaries in the perifoveal capillary network (Figure 1) [11-13]. This is not readily observable in colour fundus images but the effects are seen in fluorescein angiograms for non-proliferative DR (NPDR) and also for proliferative DR (PDR) cases [14].

The research work was funded by the Ministry of Science, Technology and Innovation, Malaysia under the Techno Fund grant TF0206C129.

2010 IEEE EMBS Conference on Biomedical Engineering & Sciences (IECBES 2010), Kuala Lumpur, Malaysia, 30th November - 2nd December 2010.

978-1-4244-7600-8/10/$26.00 ©2010 IEEE

Figure 1. The perifoveal capillary network in enlarged FAZ

A study by Fadzil et al. also found that FAZ enlargement strongly correlates with DR progression, with a correlation factor of up to 0.883 that is significant at the 0.01 level [15]. Nevertheless, it was also found that a given FAZ area can belong to different DR severity levels with different levels of certainty, the areas within each level following a Gaussian distribution [15]. The problem of these overlapping FAZ areas is depicted in Figure 2.

As shown in Figure 2, probability mass functions for every DR severity are plotted with their corresponding fitted Gaussian probability density functions (PDFs). The overlapping FAZ areas across DR stages occur because we use the ophthalmologists' diagnosis based on the direct ophthalmology method as our reference in grading DR severity. Therefore, a classifier to handle these problems needs to be developed.

In this research work, we investigate the suitability of a Bayes classifier for Gaussian distributed data in classifying DR severity based on FAZ area obtained from digital colour fundus images. The performance of the developed classifier is evaluated using a V-fold cross-validation technique.

[Figure 2: data density plotted against FAZ area (pixels, ×10^4) for No DR, Mild NPDR, Moderate NPDR and Severe NPDR/PDR, each shown with its fitted Gaussian curve]

Figure 2. FAZ area probability mass function of DR stages

II. APPROACH

The approach taken in this research is as follows. First, using colour fundus images from FINDeRS, the FAZ is analysed by a computerised DR system to determine its area [16-19]. The FAZ area distribution for each DR severity class is determined and modelled with a Gaussian probability density function. A Gaussian Bayes classifier is then developed to determine DR severity based on the measured FAZ area (in pixels) obtained from digital colour fundus images. Finally, a cross-validation technique is used to evaluate the performance of the classifier.

The Fundus Image for Non-invasive Diabetic Retinopathy System (FINDeRS) database was developed from an observational clinical study and consists of 315 fundus images (175 No DR, 52 mild NPDR, 32 moderate NPDR, 18 severe NPDR and 38 PDR) [19]. These fundus images are analysed using a computerised DR system to obtain the FAZ areas [20]. Table 1 shows the statistics of the FAZ areas according to DR severity for the 315 fundus images from the FINDeRS database [20]; the severe NPDR and PDR images are combined into a single class of 56 images.

TABLE I. STATISTICS OF FAZ AREAS FOR DR GRADING SYSTEM

| FAZ area | No DR | Mild NPDR | Moderate NPDR | Severe NPDR/PDR |
|---|---|---|---|---|
| Sample size | 175 | 52 | 32 | 56 |
| Mean (pixels) | 13644.20 | 21041.17 | 27198.31 | 33933 |
| Std. dev. (pixels) | 2727.29 | 3709.95 | 3180.89 | 6787.10 |
| Median (pixels) | 13817.00 | 20177.50 | 27271.00 | 32211.50 |
| Min (pixels) | 6124 | 14002 | 21132 | 27051 |
| Max (pixels) | 18667 | 28202 | 33358 | 66558 |

It can be seen from Table 1 that the FAZ mean and median increase as the DR stage progresses to a more severe level. The maximum and minimum FAZ areas of each DR stage show that the ranges of FAZ area for adjacent DR stages overlap. Therefore, an effective and reliable DR severity classification technique has to be developed to handle the overlapping ranges.
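The overlap can be read directly from the minimum and maximum rows of Table 1. The short sketch below computes the shared interval for each pair of adjacent stages; the dictionary layout and the helper name `overlap` are illustrative choices, not part of the original work.

```python
# Minimum and maximum FAZ areas (pixels) per stage, copied from Table 1.
faz_range = {
    "No DR": (6124, 18667),
    "Mild NPDR": (14002, 28202),
    "Moderate NPDR": (21132, 33358),
    "Severe NPDR/PDR": (27051, 66558),
}

def overlap(a, b):
    """Return the interval shared by ranges a and b, or None if they are disjoint."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None

# Compare each stage with the next more severe stage.
stages = list(faz_range)
overlaps = {
    (s, t): overlap(faz_range[s], faz_range[t])
    for s, t in zip(stages, stages[1:])
}
for pair, shared in overlaps.items():
    print(pair, shared)
```

Every adjacent pair shares a non-empty interval, so a single hard cut-off on FAZ area cannot separate the stages without error.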

A classifier is used to determine DR severity based on the measured FAZ area (in pixels) obtained from digital colour fundus images. The classifier uses Bayes theorem with a Gaussian distribution for pattern classification [8, 21]. According to Bayes theorem, the probability that continuous data x belongs to class ωc is determined as

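$$P(\omega_c \mid x) = \frac{p(x \mid \omega_c)\,P(\omega_c)}{p(x)} = \frac{p(x \mid \omega_c)\,P(\omega_c)}{\sum_{k=1}^{C} p(x \mid \omega_k)\,P(\omega_k)} \tag{1}$$

$$P(\omega_c \mid x) \propto p(x \mid \omega_c)\,P(\omega_c) \tag{2}$$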

To simplify the above equation, we take the logarithm of the equation.

$$\log P(\omega_c \mid x) = \log p(x \mid \omega_c) + \log P(\omega_c) \tag{3}$$

$$LP(\omega_c \mid x) = LL(x \mid \omega_c) + LP(\omega_c) \tag{4}$$

where LP(ωc |x) is the log posterior probability, LL(x| ωc) is the log likelihood and LP(ωc) is the log prior probability. The log posterior probability ratio (LPPR) is then defined as

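$$\log \frac{P(\omega_a \mid x)}{P(\omega_b \mid x)} = LP(\omega_a \mid x) - LP(\omega_b \mid x) = \bigl(LL(x \mid \omega_a) - LL(x \mid \omega_b)\bigr) + \bigl(LP(\omega_a) - LP(\omega_b)\bigr) \tag{5}$$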

For the one dimensional case, the Gaussian probability density function has the form

$$p(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right) \tag{6}$$

If we assume the probability density function is Gaussian, the log likelihood will become

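$$LL(x \mid \mu, \sigma^2) = \log p(x \mid \mu, \sigma^2) = -\frac{1}{2}\log\left(2\pi\sigma^2\right) - \frac{(x-\mu)^2}{2\sigma^2} \tag{7a}$$

Dropping the constant term $-\tfrac{1}{2}\log 2\pi$, which is the same for every class, gives

$$LL(x \mid \mu, \sigma^2) = -\frac{1}{2}\left(\log \sigma^2 + \frac{(x-\mu)^2}{\sigma^2}\right) \tag{7b}$$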

Therefore, the log posterior probability will be

$$LP(\omega_c \mid x) = -\frac{1}{2}\left(\log \sigma_c^2 + \frac{(x-\mu_c)^2}{\sigma_c^2}\right) + \log P(\omega_c) \tag{8}$$

In the case of the Gaussian Bayes classifier, if class ωa and class ωb are modelled by Gaussian distributions with means µa and µb and variances σa² and σb², the log posterior probability ratio (LPPR) can be written as

$$\log \frac{P(\omega_a \mid x)}{P(\omega_b \mid x)} = -\frac{1}{2}\left(\frac{(x-\mu_a)^2}{\sigma_a^2} - \frac{(x-\mu_b)^2}{\sigma_b^2} + \log \sigma_a^2 - \log \sigma_b^2\right) + \bigl(\log P(\omega_a) - \log P(\omega_b)\bigr) \tag{9}$$

If the ratio is greater than 0, then data x belongs to class ωa. Otherwise, data x belongs to class ωb [22].
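As an illustration of Equation (9) and the decision rule, the sketch below evaluates the LPPR for the No DR and Mild NPDR stages using the means, standard deviations and sample sizes from Table 1. The priors are assumed here to be the class proportions in the 315-image database, which the text does not state explicitly; the names `STATS`, `lppr` and `classify` are hypothetical.

```python
import math

# Mean, standard deviation (pixels) and sample size per stage, from Table 1.
STATS = {
    "No DR": (13644.20, 2727.29, 175),
    "Mild NPDR": (21041.17, 3709.95, 52),
}
TOTAL = 315  # all images in the FINDeRS database

def lppr(x, a, b):
    """Log posterior probability ratio of class a over class b, Equation (9)."""
    mu_a, sd_a, n_a = STATS[a]
    mu_b, sd_b, n_b = STATS[b]
    # Assumption: priors are the class proportions in the database.
    log_prior_a, log_prior_b = math.log(n_a / TOTAL), math.log(n_b / TOTAL)
    quad = (x - mu_a) ** 2 / sd_a ** 2 - (x - mu_b) ** 2 / sd_b ** 2
    log_var = math.log(sd_a ** 2) - math.log(sd_b ** 2)
    return -0.5 * (quad + log_var) + (log_prior_a - log_prior_b)

def classify(x, a, b):
    """Assign x to class a when the LPPR is positive, otherwise to class b."""
    return a if lppr(x, a, b) > 0 else b
```

For example, `classify(15000, "No DR", "Mild NPDR")` evaluates the ratio for an FAZ area of 15000 pixels and returns the stage with the larger posterior.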

In grading DR severity, the FAZ area (in pixels) is measured for several fundus images of known DR stage to obtain FAZ area ranges corresponding to the severity of DR. The FAZ area ranges that overlap show progression of the disease from one DR stage to the next. The categories of the ranges used in this work are as follows:

(a) Range 1 – No DR stage

(b) Range 2 – Progression range from No DR to mild NPDR

(c) Range 3 – Mild NPDR

(d) Range 4 – Progression range from mild NPDR to moderate NPDR

(e) Range 5 – Moderate NPDR

(f) Range 6 – Progression range from moderate NPDR to severe NPDR/ PDR

(g) Range 7 – Severe NPDR/ PDR

In this work, V-Fold Cross Validation (VFCV) is used to evaluate the performance of the classifier [1, 23-24]. VFCV is chosen because the number of samples is quite small (there are only 32 samples for moderate NPDR). The VFCV algorithm randomly divides the data set D into V disjoint subsets Tv, v = 1, 2, 3, ..., V, of approximately equal size and performs the cross-validation V times. In each iteration, V-1 of the subsets are used as a learning set and the one remaining subset is used as a test set. The average of the results is used to measure the performance of the developed system. VFCV is also computationally feasible since V can be chosen (generally between 5 and 10). In this work, V is set to 5 (i.e. each subset consists of 20% of the total number of data) in order to maintain a sufficient training sample size, since the smallest DR severity class (moderate NPDR) has only 32 samples.
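The splitting procedure can be sketched as below. This is an unstratified illustration: the text does not say whether the subsets were balanced by DR stage, and the function name `v_fold_splits` and the fixed seed are assumptions made for the example.

```python
import random

def v_fold_splits(n_samples, v=5, seed=0):
    """Yield (train_indices, test_indices) for each of the v folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # random assignment to subsets
    # Strided slices give v disjoint subsets whose sizes differ by at most one.
    folds = [indices[i::v] for i in range(v)]
    for i, test in enumerate(folds):
        # The learning set is the union of the other v - 1 subsets.
        train = [j for k, fold in enumerate(folds) if k != i for j in fold]
        yield train, test

# 315 images and V = 5, as in this work: each test subset holds 63 images.
splits = list(v_fold_splits(315, v=5))
```

Each image appears in exactly one test subset, so averaging the five test results uses every image once for evaluation.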

III. RESULT AND ANALYSIS

Using the Gaussian Bayes classifier, the LPPR between two selected stages can be computed using the corresponding mean and standard deviation data from Table 1 and applying Equation (9). The selected pairs of stages, as shown in Figure 3, are: No DR and mild NPDR, mild NPDR and moderate NPDR, and moderate NPDR and severe NPDR/PDR. From the LPPR, the thresholds of FAZ area ranges for DR grading are determined based on the analysis of the 315 fundus images.

Figure 3. LPPRs in Gaussian Bayes classifier

For example, if LPPR between No DR and Mild NPDR is greater than 0, then the DR grade is categorised as No DR. Otherwise, the DR grade is categorised as Mild NPDR. The LPPR is also calculated for other stages of DR grades.

Table 2 shows the range of FAZ area (in pixels) for the DR grade if LPPR = 0 for the Gaussian Bayes classifier. Table 2 does not include progression (in between) stages of DR grades. Progression stages are important to give early indication to patients of the DR progression to more severe stages. In Gaussian Bayes classifier, progression stages can be obtained by setting LPPR ≠ 0. The receiver operating characteristic (ROC) analysis is used to find the optimum non-zero LPPR setting.

TABLE II. DR GAUSSIAN BAYES CLASSIFIER (LPPR=0)

| DR Grade | FAZ area range (in pixels) |
|---|---|
| Normal | 1-18702 |
| Mild NPDR | 18703-25002 |
| Moderate NPDR | 25003-29939 |
| Severe NPDR/PDR | 29940-45431 |
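The boundaries between grades in Table 2 can be checked numerically by solving LPPR = 0 in Equation (9) with the Table 1 statistics. The sketch below does this by bisection between the two class means. It assumes priors proportional to the class sample sizes, which is not stated in the text; the helper names `lppr` and `solve_lppr` are hypothetical.

```python
import math

# (name, mean, standard deviation, sample size) per stage, from Table 1.
STATS = [
    ("No DR", 13644.20, 2727.29, 175),
    ("Mild NPDR", 21041.17, 3709.95, 52),
    ("Moderate NPDR", 27198.31, 3180.89, 32),
    ("Severe NPDR/PDR", 33933.0, 6787.10, 56),
]

def lppr(x, a, b):
    """Equation (9) for stages a and b given as (name, mean, sd, n) tuples."""
    _, mu_a, sd_a, n_a = a
    _, mu_b, sd_b, n_b = b
    quad = ((x - mu_a) / sd_a) ** 2 - ((x - mu_b) / sd_b) ** 2
    log_var = 2 * (math.log(sd_a) - math.log(sd_b))
    # Assumption: priors proportional to sample sizes, so their ratio is n_a / n_b.
    return -0.5 * (quad + log_var) + math.log(n_a / n_b)

def solve_lppr(a, b, target=0.0, tol=1e-6):
    """Bisection for the FAZ area between the two means where LPPR equals target.

    Requires LPPR > target at the mean of a and LPPR < target at the mean of b.
    """
    lo, hi = a[1], b[1]
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if lppr(mid, a, b) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# One boundary per pair of adjacent stages.
boundaries = [solve_lppr(a, b) for a, b in zip(STATS, STATS[1:])]
```

Under these assumptions the three solutions fall within a pixel or two of the boundaries 18702, 25002 and 29939 listed in Table 2. Calling `solve_lppr` with a non-zero `target` (for example 2.0 and -2.0 for the No DR and Mild NPDR pair) gives the two ends of a progression range in the same way, although the values will differ from those obtained on a single training fold.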

For ROC, the sensitivity and specificity must be determined [25-26]. Sensitivity measures the proportion of actual positives which are correctly identified and specificity measures the proportion of negatives which are correctly identified. Based on the nearest distance between an operating point and the reference point in the ROC curve, the optimum classifier for each DR stage can be determined.
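The selection of an operating point can be sketched as follows. The reference point is assumed here to be the ideal corner of the ROC plot, (1 - specificity, sensitivity) = (0, 1), which the text does not name explicitly; the function names and the setting labels are hypothetical.

```python
import math

def sensitivity_specificity(y_true, y_pred):
    """Compute (sensitivity, specificity) from sequences of booleans.

    True means positive for the stage under test. Both classes must be
    present in y_true, otherwise a division by zero occurs.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t and p for t, p in pairs)
    tn = sum((not t) and (not p) for t, p in pairs)
    fn = sum(t and (not p) for t, p in pairs)
    fp = sum((not t) and p for t, p in pairs)
    return tp / (tp + fn), tn / (tn + fp)

def distance_to_ideal(sensitivity, specificity):
    """Euclidean distance from the operating point to the corner (0, 1)."""
    return math.hypot(1 - specificity, 1 - sensitivity)

def best_setting(operating_points):
    """Pick the setting whose ROC point lies nearest to (0, 1).

    operating_points maps a setting label to (sensitivity, specificity).
    """
    return min(operating_points, key=lambda k: distance_to_ideal(*operating_points[k]))
```

In use, each candidate LPPR setting would be evaluated on the training folds to obtain its sensitivity and specificity, and `best_setting` would return the label of the candidate closest to the ideal corner.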

In this work, VFCV is applied to evaluate the performance of the Gaussian Bayes classifier. Using VFCV, the data are divided into 5 subsets to perform 5 iterations. At each iteration, four subsets are used to train the system while the remaining subset is used to test the system. An example of the optimal Gaussian Bayes classifiers obtained from one iteration is shown in Table 3.

TABLE III. OPTIMAL GAUSSIAN BAYES CLASSIFIERS FOR ALL DR STAGES

| Optimum Gaussian Bayes classifiers for DR stages | (1 – Specificity) | Sensitivity |
|---|---|---|
| No DR (-2 < LPPR < 2) | 0.02 | 1 |
| Mild (-2 < LPPR < 2) | 0.07 | 0.93 |
| Moderate (-0.75 < LPPR < 0.75) | 0.23 | 0.77 |
| Severe/PDR (-0.75 < LPPR < 0.75) | 0.03 | 0.98 |

Based on the above settings, the corresponding FAZ area range of DR stages can be clearly defined (Table 4).

TABLE IV. DR GAUSSIAN BAYES CLASSIFIER WITH PROGRESSION RANGE

| Stage | FAZ area range (pixels) |
|---|---|
| No DR | 1 – 16204 |
| Progression No DR to Mild | 16205 – 20912 |
| Mild | 20913 – 20654 |
| Progression Mild to Moderate | 20655 – 26866 |
| Moderate | 26867 – 27054 |
| Progression Moderate to Severe/PDR | 27055 – 31673 |
| Severe/PDR | 31674 – 100000 |

As shown in Table 4, the DR grading system is able to identify DR severity even when the FAZ area lies in a progression range (the rows labelled Progression). These ranges can be used to indicate to doctors and patients that the DR condition is progressing to a more severe level or is at the borderline between two stages.

Finally, the performance of the DR system is measured based on the average values of sensitivity, specificity and accuracy from all 5 subsets as shown in Table 5.

TABLE V. PERFORMANCE ANALYSIS OF THE DR SYSTEM CLASSIFIER

| Classifier | Sensitivity | Specificity | Accuracy |
|---|---|---|---|
| No DR - DR | 100 % | 97.9 ± 3.1 % | 99.1 ± 1.4 % |
| Mild NPDR - other stages | 84.1 ± 11.4 % | 99.2 ± 1.0 % | 96.8 ± 1.9 % |
| Moderate NPDR - other stages | 84.2 ± 16.8 % | 97.1 ± 3.6 % | 95.9 ± 4.5 % |
| Severe/PDR - other stages | 95 ± 7.5 % | 98.8 ± 1.1 % | 98.1 ± 2.0 % |

As shown in Table 5, the values of sensitivity, specificity and accuracy vary among DR stages. The sensitivity of the Mild NPDR classifier is similar to that of the Moderate NPDR classifier (around 84%). This indicates that the classifier is less able to detect Mild NPDR or Moderate NPDR in a patient who actually has that stage than it is to detect the other stages (No DR and Severe/PDR). This happens because the FAZ areas of Mild and Moderate NPDR overlap more than those of the other DR stages. The effect can be reduced by increasing the amount of training data for Mild and Moderate NPDR. However, the classifier shows high specificity for Mild and Moderate NPDR (>97%), which implies that it rarely assigns cases from other DR stages to these two stages. The high accuracy for all DR stages implies that the system can detect a particular stage with high sensitivity and specificity.

In general, the classifier consistently maintains high sensitivity (>84%), specificity (>97%) and accuracy (>95%) for all DR stages. Moreover, the high values of sensitivity (>95%), specificity (>97%) and accuracy (>98%) obtained for the No DR and Severe NPDR/PDR stages indicate that the Gaussian Bayes classifier is suitable for early detection of DR and for effective treatment of severe cases.

ACKNOWLEDGMENT

We would like to acknowledge the collaboration with Department of Ophthalmology, Hospital Selayang, Malaysia in developing the FINDeRS database.

REFERENCES

[1] E. Micheli-Tzanakou, Supervised and unsupervised pattern recognition: feature extraction and computational intelligence. Boca Raton, FL: CRC Press, 2000.

[2] A. R. Webb, Statistical pattern recognition. West Sussex, England; New Jersey: Wiley, 2002.

[3] L. Costaridou, Medical image analysis methods. Boca Raton: CRC Press/Taylor & Francis, 2005.

[4] V. N. Vapnik, Statistical learning theory. New York: John Wiley & Sons, Inc., 1998.

[5] V. Kecman, Learning and soft computing : support vector machines, neural networks, and fuzzy logic models. Cambridge, Massachusetts: MIT Press, 2001.

[6] T. Cover and P. Hart, "Nearest neighbor pattern classification," Information Theory, IEEE Transactions on, vol. 13, pp. 21-27, 1967.

[7] W. Mueller and F. Wysotzki, "Automatic construction of decision trees for classification," Annals of operations research., vol. 52, p. 231, 1994.

[8] S. Theodoridis and K. Koutroumbas, Pattern recognition. London,UK: Elsevier Inc. , 2009.

[9] American Academy of Ophthalmology, "Preferred Practice Pattern: Diabetic Retinopathy 2003," American Academy of Ophthalmology, San Francisco, California, 2003.

[10] P. P. Goh, "Status of Diabetic Retinopathy Among Diabetics Registered to the Diabetic Eye Registry, National Eye Database, 2007," The Medical Journal of Malaysia, vol. 63, pp. 24-28, September 2008.

[11] G. H. Bresnick, et al., "Abnormalities of the foveal avascular zone in diabetic retinopathy," Arch Ophthalmol, vol. 102, pp. 1286-1293, September 1 1984.

[12] J. Conrath, et al., "Foveal avascular zone in diabetic retinopathy: quantitative vs qualitative assessment," Eye, vol. 19, pp. 322-326, 2004.

[13] J. Conrath, et al., "Semi-automated detection of the foveal avascular zone in fluorescein angiograms in diabetes mellitus," Clinical & Experimental Ophthalmology, vol. 34, pp. 119-123, 2006.

[14] A. K. Khurana, Ophthalmology. New Delhi: New Age International Publishers, 2003.

[15] M. H. Ahmad Fadzil, et al., "Observational Clinical Study on Computerised Diabetic Retinopathy Monitoring and Grading System," Universiti Teknologi PETRONAS Seri Iskandar, Tronoh, Malaysia 19 Jan 2010.

[16] M. H. Ahmad Fadzil and L. I. Izhar, "A Non-Invasive Method for Analysing the Retina for Ocular Manifested Diseases," Malaysia Patent patent filing no. PI20083503 September, 2008.

[17] M. H. Ahmad Fadzil and I. Lila Iznita, "An Apparatus for Monitoring and Grading Diabetic Retinopathy," Malaysia Patent patent filing no. PI20091936 May, 2009.

[18] M. H. Ahmad Fadzil and I. Lila Iznita, "A Non-Invasive Method for Analysing the Retina for Ocular Manifested Diseases," Malaysia Patent patent filing no. PCT/MY2009/000025, 2009.

[19] M. H. Ahmad Fadzil, et al. Fundus Image Database for Non Invasive Diabetic Retinopathy Monitoring and Grading System (FINDeRS).

[20] M. H. Ahmad Fadzil, et al., "Analysis of Foveal Avascular Zone in Colour Fundus Images for Grading of Diabetic Retinopathy Severity," presented at the 32nd Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Buenos Aires, Argentina, 2010.

[21] R. O. Duda, et al., Pattern classification. New York: Wiley, 2001.

[22] C. E. Rasmussen and C. K. I. Williams, Gaussian processes for machine learning. Cambridge, Mass.: MIT Press, 2006.

[23] M. Stone, "Cross-Validatory Choice and Assessment of Statistical Predictions," Journal of the Royal Statistical Society. Series B (Methodological), vol. 36, pp. 111-147, 1974.

[24] A. J. Izenman, Modern Multivariate Statistical Techniques : Regression, Classification, and Manifold Learning. Berlin: Springer New York, 2008.

[25] M. H. Zweig and G. Campbell, "Receiver-operating characteristic (ROC) plots: a fundamental evaluation tool in clinical medicine," Clin Chem, vol. 39, pp. 561-577, April 1, 1993.

[26] C. Metz, "ROC analysis in medical imaging: a tutorial review of the literature," Radiological Physics and Technology, vol. 1, pp. 2-12, 2008.
