

International Journal of Video & Image Processing and Network Security IJVIPNS-IJENS Vol:12 No:04 9

127404-3737-IJVIPNS-IJENS © August 2012 IJENS I J E N S

Development of Algorithm Tuberculosis Bacteria Identification Using Color Segmentation and Neural Networks

Ibnu Siena , Kusworo Adi , Rahmat Gernowo and Nelly Mirnasari

Abstract-- Tuberculosis (TB) is one of the primary causes of death in developing countries, and it is drawing attention in developed and developing countries alike. Direct microscopic examination of sputum using the Ziehl-Neelsen stain (ZN-stain) method is the primary examination still used throughout the world, including in Indonesia, in accordance with the recommendation of the World Health Organization (WHO). This examination depends on the expertise of the available personnel and is time-consuming. In developing countries, limited facilities, a small number of experts, and high cost are some of the reasons why it is difficult to suppress the spread of tuberculosis. Automated examination of TB bacteria from digital images of ZN-stained samples is therefore needed to reduce examination cost, time required, and human error. In this research, an image processing algorithm for identification of TB bacteria is developed using a neural network. Testing with 15 hidden layers gave an accuracy of about 88%.

Index Terms-- Tuberculosis (TB), Ziehl-Neelsen stain (ZN-stain) method, Image Processing, Neural Network

Ibnu Siena, Department of Physics, Faculty of Science and Mathematics, Diponegoro University
Kusworo Adi, Department of Physics, Faculty of Science and Mathematics, Diponegoro University
Rahmat Gernowo, Department of Physics, Faculty of Science and Mathematics, Diponegoro University
Nelly Mirnasari, Department of Physics, Faculty of Science and Mathematics, Diponegoro University
Jl. Prof. H. Soedarto, SH, Tembalang, Semarang
[email protected] [email protected]

I. INTRODUCTION

Tuberculosis (TBC or TB) is an infectious disease caused by Mycobacterium tuberculosis, a bacterium of the family Mycobacteriaceae and the order Actinomycetales. The Mycobacterium tuberculosis complex includes M. tuberculosis, M. bovis, M. africanum, M. microti, and M. canettii. These bacteria infect the lungs more often than other parts of the human body. Anyone can contract the disease, but it most commonly affects people aged 15 to 35, particularly those who are physically weak or malnourished, or who live in the same house and in close contact with TB patients.

The WHO Global Report 2010 provides data on TB in Indonesia. The total number of TB cases in 2009 was about 294,731, of which 169,213 were positive TB cases, 108,616 were negative TB cases, 11,215 were extrapulmonary TB cases, 3,709 were relapse cases, and 1,978 were retreatment cases other than relapse. The treatment success rate was 87% in 2003, 90% in 2004, and 91% in each year from 2005 to 2008 [1,2]. The examination of TB bacteria has so far been done manually, which requires considerable time and trained laboratory staff. This research therefore develops a system that can detect TB bacteria using microscope imaging.

Several researchers around the world have studied the examination of sputum samples using image processing techniques. Segmenting bacteria of a particular species requires a complex process. Bacterial shape alone is insufficient as a discriminating feature, because almost every person carries bacterial species and particles with similar morphology. Therefore, besides shape, the color of the bacillus is widely used. Veropoulos et al. used an identification method based on shape descriptors and a neural network classifier, and reported a sensitivity (the ratio of correct positive decisions to the total number of positive cases) of about 94.1% [3,4]. Wilkinson then proposed a fast segmentation with a multiresolution technique based on different thresholds for different areas of a gray-level image. Other researchers used color information as a key discriminating factor, either for bacteria segmentation and identification or for cancer cell segmentation in lung diagnosis [5,6,7].

More recently, Khutlang et al. studied TB screening for low-income countries, with microscope-based automatic identification of Mycobacterium tuberculosis in images of Ziehl-Neelsen (ZN) stained sputum smears obtained with a bright-field microscope. Bacillus objects were segmented using a combination of two-class pixel classifiers. The algorithm produced output that agreed with manual segmentation, as measured by the modified Hausdorff distance and the Williams index. Geometric-transformation-invariant feature


extraction and feature optimization were then performed, the latter by feature subset selection and the Fisher transformation. Sensitivity and specificity were above 95% for all classifiers tested for identifying bacillus objects represented by Fisher-mapped features. This result can be used to reduce the involvement of technicians in TB screening [8]. Automatic screening offers several advantages, such as a substantial reduction in the workload of laboratory staff, increased test sensitivity, and better diagnostic accuracy through the larger number of images that a computer can analyze [9]. Segmentation and classification based on the hue color component have been used for TB bacteria identification. This method was developed using color saturation thresholding on the pixels of Ziehl-Neelsen (ZN) TB bacteria images [10,11].

Building on the methods developed by the researchers mentioned above, this research develops an algorithm for TB bacteria identification using color markers and a neural network applied to Ziehl-Neelsen (ZN) TB bacteria images. The use of ZN images is practical because the required equipment is cheap and simple. The inputs of the neural network are the eccentricity and compactness of the bacterial shape. The output of the neural network is positive TB or negative TB; the system reports positive TB when at least three TB bacteria are found in a microscope field of view.

II. METHODS

2.1. Color Segmentation

Segmentation divides an image into its constituent regions. The level of subdivision depends on the problem being solved, and segmentation stops when the object of interest has been isolated. Image segmentation algorithms are generally based on one of two basic properties of intensity values: discontinuity and similarity [12]. The segmentation process here includes decorrelation stretching, which significantly increases the color separation of a multi-channel image. This improves visual interpretation and makes feature discrimination easier. The original color values of the image are mapped to a new set of color values with a wider range [13].

2.2. Morphology

Morphological processing is used in image processing to extract or modify information about the shape and structure of the objects contained in an image (Dougherty, 2009). It consists of dilation and erosion. Dilation is an operation that "grows" or "thickens" the objects (foreground) in a binary image. Mathematically, dilation is defined as a set operation. The dilation of A by B is written as equation 1 [12]:

$$A \oplus B = \{\, z \mid (\hat{B})_z \cap A \neq \emptyset \,\} \qquad (1)$$

where A is the input matrix, B is the structuring element that shapes the output matrix, $\hat{B}$ is the reflection of B about its origin, and $(\hat{B})_z$ is that reflection translated by z. Equation 1 is applied by placing the structuring element over each element of the input matrix (the input image) so that the center of the structuring element coincides with the input pixel position [12].

Fig. 1. (a) The set A, (b) the structuring element B, (c) the dilation of A by B [12]

Erosion is the dual of dilation: it "shrinks" the objects in a binary image. The erosion of A by B is written as equation 2:

$$A \ominus B = \{\, z \mid (B)_z \subseteq A \,\} \qquad (2)$$

where A is the input matrix, B is the structuring element, and $(B)_z$ is B translated by z. Equation 2 is the dual of dilation: eroding the objects is equivalent to dilating the background of the input image (A) [11].

Fig. 2. (a) The set A, (b) the structuring element B, (c) the erosion of A by B [12]
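As an illustration of the two set definitions above, the following is a minimal pure-Python sketch of binary dilation and erosion on small 0/1 matrices. The function names, the choice of the structuring element's center as its origin, and the treatment of pixels outside the image as background are assumptions made for this example; the paper does not specify its implementation.

```python
def _is_foreground(image, r, c):
    """True if (r, c) lies inside the image and is a foreground pixel."""
    return 0 <= r < len(image) and 0 <= c < len(image[0]) and image[r][c] == 1


def _offsets(selem):
    """Offsets of the structuring element's 1-entries relative to its center."""
    cr, cc = len(selem) // 2, len(selem[0]) // 2
    return [(i - cr, j - cc)
            for i, row in enumerate(selem)
            for j, value in enumerate(row) if value]


def dilate(image, selem):
    """Dilation: z is kept if the reflected element translated to z hits A."""
    offs = _offsets(selem)
    return [[int(any(_is_foreground(image, r - dr, c - dc) for dr, dc in offs))
             for c in range(len(image[0]))]
            for r in range(len(image))]


def erode(image, selem):
    """Erosion: z is kept if the element translated to z fits entirely in A."""
    offs = _offsets(selem)
    return [[int(all(_is_foreground(image, r + dr, c + dc) for dr, dc in offs))
             for c in range(len(image[0]))]
            for r in range(len(image))]


# A single foreground pixel grows into a 3x3 block and shrinks back again.
point = [[0] * 5 for _ in range(5)]
point[2][2] = 1
square = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
grown = dilate(point, square)
shrunk = erode(grown, square)
```

In practice a library routine would normally be used instead of nested Python loops; the sketch is only meant to mirror equations 1 and 2 term by term.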

2.3. Regional Descriptor

1. Eccentricity

Eccentricity is a measure based on the ratio between the lengths of the major axis and the minor axis of the region's ellipse [12].

Fig. 3. Illustration of an ellipse region

The eccentricity value of an ellipse can be written as equation 3 [12]:

$$e = \sqrt{1 - \left(\frac{b}{a}\right)^{2}} \qquad (3)$$

where e is the eccentricity value, a is the length of the major axis, and b is the length of the minor axis.
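A short sketch of how the eccentricity could be computed is given below. The first function evaluates the formula directly from the two axis lengths. The second estimates the axes from the second-order central moments of a region's pixel coordinates, which is one common way to fit an ellipse to a binary object; that fitting step, and both function names, are assumptions for illustration and are not taken from the paper.

```python
import math


def eccentricity_from_axes(a, b):
    """e = sqrt(1 - (b/a)^2) for major axis length a and minor axis length b."""
    if a <= 0 or b < 0 or b > a:
        raise ValueError("expected a > 0 and 0 <= b <= a")
    return math.sqrt(1.0 - (b / a) ** 2)


def eccentricity_from_pixels(pixels):
    """Estimate eccentricity from (row, col) coordinates of a binary region.

    The eigenvalues of the coordinate covariance matrix are proportional to
    the squared axis lengths of the ellipse with the same second moments,
    so (b/a)^2 equals lambda_min / lambda_max.
    """
    n = len(pixels)
    mean_r = sum(r for r, _ in pixels) / n
    mean_c = sum(c for _, c in pixels) / n
    s_rr = sum((r - mean_r) ** 2 for r, _ in pixels) / n
    s_cc = sum((c - mean_c) ** 2 for _, c in pixels) / n
    s_rc = sum((r - mean_r) * (c - mean_c) for r, c in pixels) / n
    # Closed-form eigenvalues of the symmetric 2x2 matrix [[s_rr, s_rc], [s_rc, s_cc]].
    half_trace = (s_rr + s_cc) / 2.0
    root = math.sqrt(((s_rr - s_cc) / 2.0) ** 2 + s_rc ** 2)
    lam_max, lam_min = half_trace + root, half_trace - root
    if lam_max == 0:
        return 0.0  # a single pixel has no preferred direction
    return math.sqrt(max(0.0, 1.0 - lam_min / lam_max))


block = [(r, c) for r in range(3) for c in range(3)]  # compact 3x3 region
bar = [(0, c) for c in range(5)]                      # thin rod-like region
```

A rod-shaped object such as a bacillus gives a value close to 1, while a round object gives a value close to 0.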

2. Compactness


Compactness is the ratio of the squared perimeter to the area of an object. This value is not affected by changes in object scale or rotation angle, because it is a dimensionless ratio [12].

$$C = \frac{P^{2}}{A} \qquad (4)$$

where C is the compactness value, P is the perimeter of the object, and A is the area of the object.
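The sketch below shows one way the compactness of a binary object could be evaluated. The area is taken as the pixel count and the perimeter as the number of pixel edges exposed to the background; this edge-counting perimeter is a simple estimator assumed for the example (it overestimates the length of diagonal boundaries), and the paper does not state which estimator was used.

```python
def area_and_perimeter(mask):
    """Return (area, perimeter) of the foreground in a 0/1 matrix.

    Area is the number of foreground pixels; perimeter is the number of
    pixel edges that border background or the image boundary.
    """
    rows, cols = len(mask), len(mask[0])
    area = 0
    perimeter = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                continue
            area += 1
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, c2 = r + dr, c + dc
                inside = 0 <= rr < rows and 0 <= c2 < cols
                if not (inside and mask[rr][c2]):
                    perimeter += 1
    return area, perimeter


def compactness(mask):
    """C = P^2 / A, a dimensionless shape ratio."""
    area, perimeter = area_and_perimeter(mask)
    if area == 0:
        raise ValueError("mask contains no foreground pixels")
    return perimeter ** 2 / area


small_square = [[1] * 2 for _ in range(2)]
large_square = [[1] * 6 for _ in range(6)]
rod = [[1] * 8]
```

The two squares give the same value even though their sizes differ, which illustrates the scale invariance mentioned above, while the elongated rod gives a larger value.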

2.4. Neural Networks

A neural network is a network of small processing units modeled on the human nervous system. It is an adaptive system that can change its structure to solve a problem based on the internal or external information flowing through the network. In essence, a neural network is a tool for non-linear statistical data modeling. It can be used to model complex relationships between inputs and outputs and to find patterns in data. The inputs of the neural network are the features produced in the previous stage, which are used in a learning process with training data. Through this learning process, the system is expected to recognize TB bacteria present in a sputum sample. The neural network consists of an input layer, a hidden layer, and an output layer [14,15].
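To make the layered structure concrete, the following sketch runs a forward pass through a small feed-forward network with two inputs (eccentricity and compactness) and two outputs (positive and negative). Several points are assumptions made for illustration only: the paper's "15 hidden layer" is read here as 15 units in a single hidden layer, the activation is a sigmoid, the weights are random and untrained, and all function names are hypothetical. The last function applies the image-level rule from the Introduction (at least three detected bacteria).

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_layer(n_in, n_out, rng):
    """One weight row per output unit; the last entry of each row is the bias."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_out)]


def layer_forward(weights, inputs):
    return [sigmoid(sum(w * x for w, x in zip(row[:-1], inputs)) + row[-1])
            for row in weights]


def forward(network, features):
    """Propagate a feature vector through every layer in turn."""
    activations = features
    for layer in network:
        activations = layer_forward(layer, activations)
    return activations


def is_tb_positive(object_labels, min_bacteria=3):
    """Image-level decision: positive when enough objects are labeled as bacteria."""
    return sum(object_labels) >= min_bacteria


rng = random.Random(0)
network = [make_layer(2, 15, rng), make_layer(15, 2, rng)]  # assumed 2-15-2 layout
outputs = forward(network, [0.96904, 24.89174])             # features of one object
```

With untrained weights the two outputs carry no meaning; the sketch only shows how data flows from the input layer through the hidden layer to the output layer.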

Fig. 4. Block diagram of image processing with neural networks

Fig. 5. Flowchart in the network testing process.

Figure 5 shows the flowchart of the training and testing process for identification of TB bacteria. The trained network is tested again by presenting input patterns to it. Two kinds of input data are tested in this process. The first is the data used in the training process, that is, the data set itself. This is intended to assess the memorization ability of the network, meaning its ability to recall the data on which it was trained. The neural network is then tested using data that it has never seen during training.

III. RESULT

3.1. Color Segmentation

The objects used in this research are digital microscope images of Ziehl-Neelsen stained sputum samples obtained from the Centers for Disease Control and Prevention, Public Health Image Library, Atlanta, GA, USA: CDC, 2007, http://phil.cdc.gov/phil/home.asp [16]. A set of 9 sample images is used for system testing. A data set of 929 TB bacteria shapes, used as neural network training data, was obtained from [17,18]. A Ziehl-Neelsen stained image has a complex background, so the objects (TB bacteria) must be separated from the background before they can be identified. In these images the background ranges from dark blue to light blue, while the TB bacteria appear pink to scarlet. The results show that applying the de-corrstretch function gives the sample image better contrast. With this better contrast, the segmentation of TB bacteria becomes easier, because the range of color intensities within the bacteria becomes narrower. Examples of the contrast enhancement results are shown in Fig. 6.

Fig. 6. Results of contrast enhancement using the de-corrstretch function: (a), (c), and (e) are the original images; (b), (d), and (f) are the enhanced images, in which the TB bacteria have a narrower intensity range.

A narrower intensity range within the bacteria makes iterative color clustering easier. This method uses two color clusters as the basis of clustering: the color of the TB bacteria and the color of the background. Examples of segmentation results are shown in Fig. 7.
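The contrast enhancement step applied before clustering can be sketched as follows. This is a simplified decorrelation stretch written with NumPy, not the de-corrstretch function used by the authors: it rotates the pixel colors onto their principal axes, equalizes the variance along each axis, and rotates back. The function name and the `target_sigma` parameter are assumptions for this example.

```python
import numpy as np


def decorrelation_stretch(image, target_sigma=50.0):
    """Decorrelate the color channels of an H x W x C image.

    Returns a float array of the same shape whose channels are uncorrelated,
    each with standard deviation target_sigma and the original channel mean.
    """
    h, w, c = image.shape
    pixels = image.reshape(-1, c).astype(np.float64)
    mean = pixels.mean(axis=0)
    cov = np.cov(pixels, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    eigvals = np.maximum(eigvals, 1e-12)  # guard against flat channels
    # Rotate to principal axes, scale each axis to unit variance, rotate back.
    whitening = eigvecs @ np.diag(1.0 / np.sqrt(eigvals)) @ eigvecs.T
    stretched = (pixels - mean) @ whitening * target_sigma + mean
    return stretched.reshape(h, w, c)


# Synthetic image with strongly correlated channels (hypothetical data).
rng = np.random.default_rng(0)
shared = rng.normal(size=(20, 20, 1))
image = 128.0 + 20.0 * shared + 2.0 * rng.normal(size=(20, 20, 3))
result = decorrelation_stretch(image)
result_cov = np.cov(result.reshape(-1, 3), rowvar=False)
```

For display, the result would typically be clipped to the valid intensity range, for example with `np.clip(result, 0, 255)`.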

Fig. 7. Results of color-based image segmentation using the K-means clustering method: (a), (c), and (e) are the original images; (b), (d), and (f) are the segmentation results.
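The two-cluster color segmentation can be illustrated with a minimal K-means sketch. It assumes that each pixel is an RGB triple and that the two initial centers are rough guesses for the blue background and the pink bacteria; the pixel values, the initial centers, and the function names are hypothetical, and the paper does not state which color space or initialization was used.

```python
def nearest_center(pixel, centers):
    """Index of the center with the smallest squared Euclidean distance."""
    distances = [sum((p - q) ** 2 for p, q in zip(pixel, center)) for center in centers]
    return distances.index(min(distances))


def kmeans(pixels, centers, iterations=20):
    """Iteratively assign pixels to centers and move centers to cluster means."""
    centers = [tuple(float(v) for v in center) for center in centers]
    labels = []
    for _ in range(iterations):
        labels = [nearest_center(pixel, centers) for pixel in pixels]
        new_centers = []
        for k, old in enumerate(centers):
            members = [p for p, label in zip(pixels, labels) if label == k]
            if members:
                new_centers.append(tuple(sum(ch) / len(members) for ch in zip(*members)))
            else:
                new_centers.append(old)  # keep an empty cluster where it was
        if new_centers == centers:
            break  # assignments have stabilized
        centers = new_centers
    return labels, centers


# Cluster 0: background (blue tones); cluster 1: bacteria (pink to scarlet tones).
pixels = [(40, 60, 200), (60, 90, 220), (30, 50, 180), (230, 60, 140), (210, 40, 120)]
labels, centers = kmeans(pixels, centers=[(0, 0, 255), (255, 0, 128)])
```

The label of each pixel can then be reshaped back to the image grid to obtain a binary mask of candidate bacteria.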

3.2. Neural Networks Training

Neural network training in this research uses the back-propagation method. This method was chosen because it is suited to non-linear and predictive problems. It learns from the input data presented during training, so the more data are used, the better the network becomes at solving the problem. Performance is also influenced by the number of hidden layers used in the network, which affects both the accuracy and the duration of training. A separate study is therefore needed to determine the number of hidden layers. In this research, the eccentricity and compactness values of 929 TB bacteria shapes are used as input to the training process. Eccentricity and compactness were chosen because they are ratios, and are therefore unaffected by image magnification or by the orientation and location of the bacteria. Examples of the input data are given in Table I.

TABLE I
Example of the data set extraction used as input data for neural network training

Bacteria   Eccentricity   Compactness
1          0.96904        24.89174
2          0.94706        18.58705
3          0.95395        19.51604
4          0.96391        21.96085
5          0.91198        15.89329
6          0.97303        25.41394
7          0.977          29.53841
8          0.91443        17.13787
9          0.9447         18.88333
10         0.91171        15.76337

In addition to the testing above, the system is also tested on recognizing TB bacteria in Ziehl-Neelsen images. This testing is carried out by varying the number of hidden layers in order to find the optimal number. The results show that the network works properly with 15 hidden layers, with an accuracy of about 88%.
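The back-propagation training described in this section can be sketched in pure Python as below. The sketch rests on several assumptions that are not stated in the paper: a single hidden layer whose size `n_hidden` stands in for the quantity that was varied, sigmoid activations, a squared-error loss, online weight updates, and a tiny hypothetical set of standardized feature pairs in place of the 929-shape data set.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_network(n_in, n_hidden, n_out, rng):
    """Weight rows include a trailing bias weight."""
    def layer(n_from, n_to):
        return [[rng.uniform(-0.5, 0.5) for _ in range(n_from + 1)] for _ in range(n_to)]
    return layer(n_in, n_hidden), layer(n_hidden, n_out)


def forward(net, x):
    w_hidden, w_out = net
    hidden = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0]))) for row in w_hidden]
    out = [sigmoid(sum(w * v for w, v in zip(row, hidden + [1.0]))) for row in w_out]
    return hidden, out


def mean_squared_error(net, samples):
    total = 0.0
    for x, target in samples:
        _, out = forward(net, x)
        total += sum((o - t) ** 2 for o, t in zip(out, target))
    return total / len(samples)


def train(net, samples, epochs=500, rate=0.5):
    w_hidden, w_out = net
    for _ in range(epochs):
        for x, target in samples:
            hidden, out = forward(net, x)
            # Error signal at the output units (squared error, sigmoid units).
            d_out = [(o - t) * o * (1 - o) for o, t in zip(out, target)]
            # Error signal propagated back to the hidden units.
            d_hidden = [h * (1 - h) * sum(d * w_out[k][j] for k, d in enumerate(d_out))
                        for j, h in enumerate(hidden)]
            for k, d in enumerate(d_out):
                for j, v in enumerate(hidden + [1.0]):
                    w_out[k][j] -= rate * d * v
            for j, d in enumerate(d_hidden):
                for i, v in enumerate(x + [1.0]):
                    w_hidden[j][i] -= rate * d * v
    return net


# Hypothetical standardized (eccentricity, compactness) pairs; target [1, 0] = bacterium.
samples = [([1.0, 1.2], [1, 0]), ([0.8, 0.9], [1, 0]), ([1.1, 0.7], [1, 0]),
           ([-1.0, -1.1], [0, 1]), ([-0.9, -0.8], [0, 1]), ([-1.2, -1.0], [0, 1])]
net = init_network(2, 15, 2, random.Random(1))
before = mean_squared_error(net, samples)
train(net, samples)
after = mean_squared_error(net, samples)
```

Repeating the run with different values of `n_hidden` and comparing the error on held-out data is one way the search for a suitable network size could be organized.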

IV. CONCLUSIONS

The image processing algorithm developed for identification of TB bacteria using a neural network has succeeded in recognizing TB bacteria. The neural network is configured with two input layers, 15 hidden layers, and 2 output layers. Testing with the test images gave an accuracy of about 88%. In future work, other classification methods and improvements in pattern recognition will be developed to increase the accuracy of TB bacteria identification.

ACKNOWLEDGEMENT

This research was funded by the Ministry of Research and Technology of the Republic of Indonesia through the Insentif SINAS Program in 2012. The authors would like to thank the PIMOD group for the TB bacteria dataset used for neural network training, and the Centers for Disease Control and Prevention (CDC) Public Health Image Library (PHIL) for the ZN TB bacteria images.

REFERENCES

[1]. Avicenna, 2011, "Lung tuberculosis (Lung TB)" (in English), http://rajawana.com/artikel/kesehatan/264-tuberculosis-paru-tb-paru.html Last access 13 April 2011 14.59.

[2]. WHO Report 2009, 2009, Global Tuberculosis Control 2009: Epidemiology, Strategy, Financing, WHO Press, World Health Organization, ISBN 978 92 4 156380 2, Geneva, Switzerland.

[3]. Veropoulos, K., Campbell, C., and Learmonth, G., 1998, Image Processing And Neural Computing Used In The Diagnosis Of Tuberculosis, Proc. IEE Colloquium on Intelligent Methods in Healthcare and Medical Applications (Digest No. 1998/514), pp. 8/1 - 8/4, York, UK.

[4]. Veropoulos, K., Learmonth, G., Campbell, C., and Knight, B., Simpson, J., 1999, Automatic Identification of Tubercle Bacilli in Sputum. A preliminary investigation, Analytical and Quantitative Cytology and Histology, Vol. 21, No. 4, (Aug. 1999), pp. 277–81, ISSN 0884-6812, York, UK.

[5]. Alvarez-Borrego J., Mourino R., Cristóbal G., and Pech J., 2000, “Invariant Optical Color Correlation for Recognition of Vibrio Cholerae o1,” in Int. Conf. on Pattern Recognition, 2847, p. 283, (Barcelona, Spain)

[6]. Sammouda R., Niki N., Nishitani H., Nakamura S., and Mor S. ,1997, “Segmentation of Sputum Color Image for Lung Cancer Diagnosis,” in Int. Conf. on Image Processing, 1, p. 243, (Washington, USA), 1997.

[7]. Sammouda R., Niki N., Nishitani H, and Kyokage E., 1998,“Segmentation of Sputum Color Image for Lung Cancer Diagnosis Based on Neural Network,” IEICE transactions on information and systems (8), 1998.

[8]. Khutlang, R., Krishnan, S., Dendere, R., Whitelaw, A., Veropoulos, K., Learmonth, G., and Douglas T. S., 2010, Classification of Mycobacterium Tuberculosis in Images of ZN-stained Sputum Smears. IEEE Transactions on Information Technology in Biomedicine, Vol. 14, No. 4, (July 2010), pp. 949-957, ISSN 1089-7771.

[9]. Forero, M. G.; Cristobal, G. and Borrego, J. A., 2003, Automatic Identification Techniques of Tuberculosis Bacteria, SPIE Proceedings Of The Applications Of Digital Image Processing XXVI, Vol.5203, pp. 71-81, ISBN 0-8194-5076-6, San Diego, CA, Aug. 2003, SPIE, Bellingham WA.

[10]. Vishnu Makkapati and Ravindra Agrawal, Raviraja Acharya: Segmentation and classification of tuberculosis bacilli from ZN-stained sputum smear images. CASE 2009: 217-220

[11]. Osman M.K., Mashor M.Y., Saad Z., and Jaafar H., 2010,"Colour Image Segmentation of Tuberculosis Bacilli in Ziehl-Neelsen-Stained Tissue Images Using Moving K-Mean Clustering Procedure,", 2010 Fourth Asia International Conference on Mathematical/Analytical Modelling and Computer Simulation, pp.215-220

[12]. Gonzalez, R. C. and Woods, R. E., 2002, Digital Image Processing Second Edition, Pearson Education, ISBN 978-0201180756, New Jersey.

[13]. Chitade A.Z et. al., 2010, Colour Based Image Segmentation Using K-Means Clustering, International Journal of Engineering Science and Technology Vol. 2(10), 2010, 5319-5325, Madhyapradesh, India.

[14]. Dougherty, G., 2009, Digital Image Processing for Medical Applications, Cambridge University Press, ISBN 978-0-521-86085-7, New York.

[15]. Bishop, C.M., 1995, “Neural Networks for Pattern Recognition”, Oxford: Oxford University Press. ISBN 0-19-853849-9 (hardback)

[16]. “Centers for disease control and prevention (cdc) public health image library (phil),” http://phil.cdc.gov/phil/home.asp.

[17]. Forero, M. G.; Sroubek, F. and Cristobal, G., 2004, Identification of Tuberculosis Bacteria Based on Shape and Color. Real-Time Imaging, Vol. 10, No. 4, (August 2004), pp. 251-262, ISSN 1077-2014.

[18]. Forero, M G. Cristóbal and Desco, M., “Automatic identification of Mycobacterium tuberculosis by Gaussian Mixture models”, J. Microscopy, 223, pp.