
Superpixels Using Morphology for Rock Image Segmentation

Sree Ramya S. P. Malladi, Sundaresh Ram, and Jeffrey J. Rodriguez
Department of Electrical and Computer Engineering, University of Arizona, Tucson, AZ, USA

Email: {rmalladi, ram, jjrodrig}@email.arizona.edu

Abstract-Detection and segmentation of rocks is an important first task in many applications such as geological analysis, planetary science, and mining processes. Rocks are usually segmented using a variety of features such as texture, shading, shape, and edges. It is easier to compute these features for rock superpixels than for every pixel in the image. A superpixel is a group of spatially coherent pixels that form a meaningful homogeneous region, usually belonging to the same object. In this paper, we perform a comparative study of some of the current superpixel algorithms on rock images with regard to their ability to adhere to image boundaries, their speed, and their impact on rock segmentation performance. We also propose a new and very simple superpixel algorithm, Superpixels Using Morphology (SUM), which adapts a watershed transformation approach to efficiently generate superpixels. We show that SUM achieves performance comparable to that of the recent superpixel algorithms on the rock images.

Index Terms-morphology, watershed segmentation, area closing, superpixels, rock particles.

I. INTRODUCTION

Detection and segmentation of rock particles is important for measuring the size distribution of rock particles in mining processes, which is used to monitor the blasting quality, optimize the blast design, and reduce costs and environmental impact. Also, rock shape, weathering, and dispersion carry important information about environmental characteristics and need to be identified for efficient route planning in planetary science. Superpixel segmentation is an attempt to capture the low-level details in an image by grouping the pixels such that they provide spatial support for extracting features. Superpixels have shown benefits in applications such as object tracking [1], detection [2], segmentation [3], depth estimation [4], and object-based compression.

Rock image segmentation can be thought of as a two-step process. First, superpixels are computed, and then a region merging scheme is used to merge the superpixels into a final segmentation. In this paper, we focus on computing superpixels. The superpixels computed for the rock images are expected to have the following properties:

1) Superpixel boundaries should accurately represent the edges of the rocks.

2) A superpixel should not include portions of more than one rock. (Because post-processing is expected to include a merging step, it is easier to recover from over-segmentation than under-segmentation.)

3) The technique for computing superpixels should be computationally simple and memory efficient.

978-1-4799-4053-0/14/$31.00 ©2014 IEEE

Many algorithms have been recently developed to divide an image into superpixels [5]-[8], [10]-[15]. We compare four existing algorithms, normalized cuts (Ncuts) [5], turbopixels [12], simple linear iterative clustering (SLIC) [13], and entropy rate superpixels (ERS) [14], with the proposed method, superpixels using morphology (SUM). Ncuts treats the image as a graph G = (V, E), where the vertices V represent the pixel locations and the edges E represent the relation between the pixels (vertices). This graph is partitioned using contour and texture cues, globally minimizing a cost function defined on the edges at the partition boundaries. The run time of Ncuts is relatively long. Turbopixels uses a level-set based geometric flow to dilate a set of initially placed seeds defined by the user. This algorithm relies on other algorithms of varying complexity and sometimes exhibits relatively poor adherence to the boundaries. SLIC is a variation of k-means clustering for generating superpixels. SLIC optimizes the distance calculations by limiting the search space, and uses a weight parameter to control the compactness of superpixels. For SLIC, the segmentation accuracy critically depends on how well this weight parameter is tuned. ERS is a graph topology selection method where pixels and their pairwise relations are respectively mapped to the vertices and edges in the graph. Superpixels are then formed via graph topology by maximizing an objective function. The objective function has two components: (1) the entropy rate, which favors compact and homogeneous superpixels, and (2) the balancing term, which favors superpixels with similar sizes. ERS produces a segmentation with reasonable accuracy, but the run time may be too long for many applications.

In this paper, we describe a new superpixel segmentation method using a modified watershed segmentation algorithm. The algorithm is composed of three key steps: (1) compute the gradient magnitude of the original image, (2) perform an area closing operation on this gradient magnitude image, and (3) apply a watershed transformation to the resulting image to obtain the desired superpixels.
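The three steps can be sketched with off-the-shelf operators. The following is only an illustration under stated assumptions, not the authors' implementation: it assumes NumPy and scikit-image are installed, the function name `sum_superpixels` and the synthetic test image are hypothetical, and scikit-image's `area_closing` and `watershed` stand in for the implementations cited in [18] and [9].

```python
import numpy as np
from skimage.morphology import area_closing, dilation, erosion
from skimage.segmentation import watershed


def sum_superpixels(gray, area=100):
    """Three-step sketch: morphological gradient, area closing, watershed."""
    footprint = np.ones((3, 3), dtype=bool)  # 3 x 3 square structuring element
    # Step 1: dilation minus erosion (never negative, so uint8 is safe).
    gradient = dilation(gray, footprint) - erosion(gray, footprint)
    # Step 2: fill regional minima that cover fewer than `area` pixels.
    closed = area_closing(gradient, area_threshold=area, connectivity=1)
    # Step 3: with no markers given, flooding starts at the local minima.
    return watershed(closed, connectivity=1)


# Hypothetical test image: two flat regions separated by one vertical edge.
gray = np.full((40, 40), 50, dtype=np.uint8)
gray[:, 20:] = 200
labels = sum_superpixels(gray, area=10)
print(np.unique(labels))
```

The returned array assigns one integer label per superpixel. The `area` argument plays the role of the area parameter of the closing step, so larger values should yield fewer, larger superpixels.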

II. SUPERPIXELS USING MORPHOLOGY

We describe a new method for computing the superpixels in an image, using morphology-based operators, which is faster than the existing algorithms and is very memory efficient. The new method, superpixels using morphology, is adapted from the watershed segmentation algorithm [9].

SSIAI2014

Fig. 1. (a) Original image. (b) Morphological gradient image (inverted) with a 3 x 3 square. (c) Closing of gradient magnitude image (inverted) with a disk (r = 10). (d) Area closing of gradient magnitude image (inverted) ($\alpha$ = 100). (e) Watershed lines obtained from image (c) superimposed onto original image. (f) Watershed lines obtained from image (d) superimposed onto original image.

A. Image Gradient

Object boundaries are often characterized by intensity transitions (edges) in the image. Various gradient operators are widely used in image processing to detect these edges, the basic principle being that large gradients indicate points where there is a rapid intensity change. We compute the morphological gradient of the input image f defined by

$$\mathrm{grad}_b(f) = (f \oplus b) - (f \ominus b), \tag{1}$$

where $b$ is a structuring element, usually symmetric and having a short support, $\oplus$ is the morphological dilation operation, and $\ominus$ is the morphological erosion operation. We use a 3 x 3 square-shaped structuring element $b$ for computing the morphological gradient image. Fig. 1(a) shows an example image, and Fig. 1(b) shows the corresponding morphological gradient magnitude image.
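For a flat 3 x 3 square, dilation reduces to a local maximum and erosion to a local minimum, so equation (1) becomes a max-minus-min filter. The sketch below illustrates this in plain Python. The function name, the list-of-rows image format, and the border rule (neighbors outside the image are ignored) are assumptions made for the example.

```python
def morphological_gradient(image):
    """Morphological gradient with a 3 x 3 square structuring element.

    `image` is a list of equal-length rows of numbers. Neighbors that fall
    outside the image are ignored (an assumed border rule).
    """
    rows, cols = len(image), len(image[0])
    grad = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                image[rr][cc]
                for rr in range(max(r - 1, 0), min(r + 2, rows))
                for cc in range(max(c - 1, 0), min(c + 2, cols))
            ]
            # Dilation is the local maximum, erosion the local minimum.
            grad[r][c] = max(window) - min(window)
    return grad


# Hypothetical test image: a vertical step edge between columns 2 and 3.
step_edge = [
    [10, 10, 10, 90, 90],
    [10, 10, 10, 90, 90],
    [10, 10, 10, 90, 90],
]
gradient = morphological_gradient(step_edge)
print(gradient[0])
```

The gradient is nonzero only in the two columns that touch the step, which is the two-pixel-wide edge response expected from a 3 x 3 structuring element.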

B. Morphological Area Closing

The morphological gradient image may contain spurious strong local gradients within a single rock region due to gray-level variations, in addition to the true strong gradients near the rock edges. A classical approach to suppressing such spurious gradients is the morphological closing operator. When there is no prior information about the shape of an object in an image, morphological closing is usually performed with a disk-shaped structuring element to preserve isotropy. Fig. 1(c) shows the effect of applying morphological closing to the gradient image in Fig. 1(b). The closing operation is then followed by watershed segmentation to obtain the segmented image. However, artifacts may appear in the segmented image. Fig. 1(e) shows an example of such artifacts, where the crest lines have moved many pixels away from the actual boundary of the object. The extent of the deviation depends on the filtering strength (i.e., the radius of the disk).

In order to suppress the spurious gradients in the morphological gradient image, along with the associated segmentation artifacts, we employ Vincent's morphological area closing operator [17]. The gray-scale formulation of this operator relies on the threshold superposition principle and is given by

(2)

where

$$\mathcal{B}_{B_e,\alpha} = \{X \subset E : X \text{ is } B_e\text{-connected},\ \mathrm{Area}(X) \geq \alpha\}.$$

This operator removes connected components whose area is smaller than a given area parameter $\alpha$. This morphological filter is shape preserving because it acts on connected components and, therefore, does not typically change the shape of the structures in the image. We use a fast implementation of this operator by Meijster and Wilkinson [18]. Fig. 1(d) shows the result of applying the area closing operator to the gradient image in Fig. 1(b).
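To make the behavior of the area closing concrete, the following plain-Python sketch fills every regional minimum whose basin covers fewer than `area` pixels. It uses a union-find structure in the spirit of the approach in [18], but it is a simplified illustration and not that implementation. The function name, the list-of-rows image format, and the choice of 4-connectivity are assumptions.

```python
def area_closing_sketch(image, area):
    """Gray-scale area closing of a 2-D list of numbers (4-connectivity).

    Regional minima covering fewer than `area` pixels are raised to the
    lowest level at which their basin reaches `area` pixels.
    """
    rows, cols = len(image), len(image[0])
    flat = [v for row in image for v in row]
    # parent[p] < 0: p is a root and -parent[p] is the area of its set.
    # parent[p] >= 0: index of another pixel in the same set.
    parent = [-1] * len(flat)
    done = [False] * len(flat)

    def find_root(p):
        root = p
        while parent[root] >= 0:
            root = parent[root]
        while parent[p] >= 0:  # path compression
            parent[p], p = root, parent[p]
        return root

    order = sorted(range(len(flat)), key=lambda p: flat[p])  # dark to bright
    for p in order:
        r, c = divmod(p, cols)
        for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if not (0 <= rr < rows and 0 <= cc < cols):
                continue
            q = rr * cols + cc
            if not done[q]:
                continue
            root = find_root(q)
            if root == p:
                continue
            if flat[root] == flat[p] or -parent[root] < area:
                parent[p] += parent[root]  # absorb a plateau or a small basin
                parent[root] = p
            else:
                parent[p] = -area  # neighbor basin is large enough: keep it
        done[p] = True

    # Resolve output levels from the brightest pixel down to the darkest.
    out = [0] * len(flat)
    for p in reversed(order):
        out[p] = flat[p] if parent[p] < 0 else out[parent[p]]
    return [out[r * cols:(r + 1) * cols] for r in range(rows)]


# Hypothetical 1 x 7 profile: a one-pixel pit and a three-pixel basin.
profile = [[5, 1, 5, 3, 3, 3, 5]]
closed = area_closing_sketch(profile, 2)
print(closed)
```

With `area = 2`, the one-pixel pit is filled while the three-pixel basin keeps its level and its extent, which is the shape-preserving behavior described above.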

C. Watershed Segmentation

The watershed transformation [9] is a popular segmentation algorithm, which divides the gray-level image into regions that are each associated with one local minimum. Consider the gray-level image as a topographic map. For each regional minimum of this map, define a catchment basin (i.e., a region) as all those points whose steepest-slope paths reach this minimum. The watershed lines are then defined as the closed one-pixel-thick crest lines that separate the adjacent catchment basins. Due to numerous local minima present within an image, applying watershed segmentation directly to the image ends up in extreme over-segmentation. We apply the watershed segmentation algorithm to the area-closed gradient image to obtain the desired superpixels. Fig. 1(f) shows the watershed lines obtained by applying the watershed segmentation algorithm to the area-closed gradient image in Fig. 1(d), superimposed onto the original image.
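The catchment-basin idea can be illustrated with a small priority-queue flooding in plain Python. This is a simplified sketch and not the immersion algorithm of [9]: it labels every pixel with a basin and does not produce one-pixel-thick watershed lines. The function name, the list-of-rows image format, and 4-connectivity are assumptions.

```python
import heapq
from itertools import count


def watershed_labels(image):
    """Label catchment basins of a 2-D list of numbers (4-connectivity)."""
    rows, cols = len(image), len(image[0])

    def neighbors(r, c):
        for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= rr < rows and 0 <= cc < cols:
                yield rr, cc

    labels = [[0] * cols for _ in range(rows)]  # 0 means "not flooded yet"
    seen = [[False] * cols for _ in range(rows)]
    heap, ticket, n_basins = [], count(), 0

    # Step 1: a regional minimum is a flat plateau with no lower neighbor.
    for r in range(rows):
        for c in range(cols):
            if seen[r][c]:
                continue
            level, plateau, stack, is_minimum = image[r][c], [], [(r, c)], True
            seen[r][c] = True
            while stack:
                pr, pc = stack.pop()
                plateau.append((pr, pc))
                for nr, nc in neighbors(pr, pc):
                    if image[nr][nc] < level:
                        is_minimum = False
                    elif image[nr][nc] == level and not seen[nr][nc]:
                        seen[nr][nc] = True
                        stack.append((nr, nc))
            if is_minimum:
                n_basins += 1
                for pr, pc in plateau:
                    labels[pr][pc] = n_basins
                    heapq.heappush(heap, (level, next(ticket), pr, pc))

    # Step 2: flood upward, always growing from the lowest pixel reached so far.
    while heap:
        level, _, r, c = heapq.heappop(heap)
        for nr, nc in neighbors(r, c):
            if labels[nr][nc] == 0:
                labels[nr][nc] = labels[r][c]
                flood_level = max(level, image[nr][nc])
                heapq.heappush(heap, (flood_level, next(ticket), nr, nc))
    return labels


# Hypothetical relief: two valleys (columns 0 and 4) separated by a ridge.
relief = [
    [0, 1, 5, 1, 0],
    [0, 1, 5, 1, 0],
]
basins = watershed_labels(relief)
print(basins)
```

Each regional minimum seeds exactly one label, so the number of superpixels equals the number of minima that survive the area closing step. Ridge pixels that are equally close to two basins go to whichever basin reaches them first in this sketch.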

III. PERFORMANCE EVALUATION

To analyze the quality of superpixel segmentation, two metrics are used: under-segmentation error and boundary recall. Manually segmented "ground truth" images are used as a reference to compute the metrics.

A. Under-Segmentation Error (U)

Fig. 2. (a) Under-segmentation error. (b) Boundary recall. Both are plotted against the number of segments for turbopixels (TP), SLIC, ERS, SUM, and Ncuts.

Under-segmentation error [16] measures false merging of superpixels across the ground truth borders. A superpixel is considered to be falsely merged if it spans a ground truth border. Consider a ground truth segmentation comprising $M$ segments $\{g_1, g_2, \ldots, g_M\}$ and a corresponding automatic superpixel segmentation comprising $L$ segments $\{s_1, s_2, \ldots, s_L\}$. Let $N$ be the number of pixels in the image, and let the operator $|\cdot|$ represent the size of a segment in pixels. The under-segmentation error is a value within the range [0,1], with 0 meaning no under-segmentation error, and is defined as

$$U = \frac{1}{N} \sum_{i=1}^{M} \sum_{\{s_j \mid s_j \cap g_i \neq \emptyset\}} \min\{|s_j| - |s_j \cap g_i|,\; |s_j \cap g_i|\}. \tag{3}$$
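Equation (3) can be evaluated with one pass over the two label maps. The sketch below is a minimal illustration: the function name and the representation of both segmentations as lists of rows of labels are assumptions, and the toy label maps are hypothetical.

```python
from collections import Counter


def under_segmentation_error(ground_truth, superpixels):
    """Under-segmentation error of equation (3) for two 2-D label maps."""
    n_pixels = 0
    superpixel_size = Counter()  # |s_j|
    overlap = Counter()          # |s_j intersect g_i|, keyed by (g_i, s_j)
    for gt_row, sp_row in zip(ground_truth, superpixels):
        for g, s in zip(gt_row, sp_row):
            n_pixels += 1
            superpixel_size[s] += 1
            overlap[(g, s)] += 1
    # Only pairs with a non-empty intersection appear in `overlap`.
    total = sum(
        min(superpixel_size[s] - inter, inter)
        for (g, s), inter in overlap.items()
    )
    return total / n_pixels


# Hypothetical example: superpixel 1 leaks two pixels into ground truth segment 2.
gt = [[1, 1, 2, 2],
      [1, 1, 2, 2]]
sp = [[1, 1, 1, 2],
      [1, 1, 1, 2]]
print(under_segmentation_error(gt, sp))
```

In this example superpixel 1 has six pixels, four in the first ground truth segment and two in the second, so it contributes min(2, 4) + min(4, 2) = 4, and dividing by N = 8 gives 0.5. A superpixel map identical to the ground truth gives 0.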

B. Boundary Recall

Boundary recall measures the fraction of ground truth boundaries that fall within a fixed distance from a superpixel boundary. Consider a ground truth segmentation GT and a superpixel segmentation S. Let TP represent the number of boundary pixels in GT that have a boundary pixel in S within a distance of 2 pixels. Let FN represent the number of boundary pixels in GT for which there does not exist a boundary pixel in S within a distance of 2 pixels. Boundary recall is a value in the range [0,1], and is defined as

$$\text{Boundary recall} = \frac{TP}{TP + FN}. \tag{4}$$
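A minimal plain-Python sketch of equation (4) follows. Two details are not specified in the text and are assumptions of this example: a pixel counts as a boundary pixel when its right or lower neighbor carries a different label, and the distance of 2 pixels is measured with a square (Chebyshev) window. The function names are hypothetical.

```python
def boundary_pixels(labels):
    """Pixels whose right or lower neighbor has a different label (assumed rule)."""
    rows, cols = len(labels), len(labels[0])
    edges = set()
    for r in range(rows):
        for c in range(cols):
            for rr, cc in ((r + 1, c), (r, c + 1)):
                if rr < rows and cc < cols and labels[rr][cc] != labels[r][c]:
                    edges.add((r, c))
    return edges


def boundary_recall(ground_truth, superpixels, tolerance=2):
    """Fraction of ground truth boundary pixels matched by a superpixel boundary."""
    gt_edges = boundary_pixels(ground_truth)
    sp_edges = boundary_pixels(superpixels)
    if not gt_edges:
        return 1.0  # no boundaries to miss (a convention assumed here)
    offsets = range(-tolerance, tolerance + 1)
    tp = sum(
        1
        for r, c in gt_edges
        if any((r + dr, c + dc) in sp_edges for dr in offsets for dc in offsets)
    )
    fn = len(gt_edges) - tp
    return tp / (tp + fn)


# Hypothetical 1 x 8 label maps: the true boundary lies after column 3.
gt = [[1, 1, 1, 1, 2, 2, 2, 2]]
near = [[1, 1, 1, 1, 1, 2, 2, 2]]  # boundary one pixel off
far = [[1, 2, 2, 2, 2, 2, 2, 2]]   # boundary three pixels off
print(boundary_recall(gt, near), boundary_recall(gt, far))
```

The first superpixel map places its boundary within the 2-pixel tolerance and is fully recalled, while the second misses the only ground truth boundary pixel.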

C. Experiments

The recent superpixel algorithms [5], [12], [13], [14] and the proposed method SUM were tested on a set of 10 rock images. Each image has size 480 x 640 pixels. A careful manual segmentation of the rock images was considered as ground truth for all subsequent analysis. On average, each rock image has around 50 to 100 ground truth rock regions. The dataset of rock images includes rocks with varying illumination, shading, shape, and texture. The goal of the superpixel algorithms should be to produce a minimum number of superpixels with good segmentation quality (low under-segmentation error and high boundary recall). The run time of the algorithms is also an important factor.

We used open source implementations of the superpixel algorithms available online. The original implementation of the Ncuts algorithm resizes the image to 160 x 160 for faster computation. We disabled the image resizing property of the Ncuts algorithm and kept the size of the image fixed for all the methods to have a fair comparison. The number of superpixels is the only parameter used by the turbopixel algorithm. SLIC has two parameters: region size (to produce uniformly sized regions) and regularizer (to control the compactness). The region size was fixed, and the regularizer that gave the least under-segmentation error was chosen. The region sizes were chosen so as to give the same number of segments as the other methods for fair comparison. ERS has four parameters: number of superpixels, weighting factor for the balancing function, Gaussian kernel parameter, and connectedness (4-connected or 8-connected). The number of superpixels was fixed, and the combination of other parameters that gave the least under-segmentation error was chosen. For the rock images tested, the optimal range of superpixels lies between 250 and 500. Thus, we compared the performance of all the algorithms in this operating range.

TABLE I
RUN TIME IN SECONDS FOR DIFFERENT ALGORITHMS

# of Segments   Turbo     ERS     SLIC    Ncuts      SUM
25              14.310    2.045   0.650   158.349    0.020
500             39.110    2.910   1.699   1802.600   0.024
1000            41.201    3.641   1.731   -          0.027
2500            44.036    5.741   1.960   -          0.044

The under-segmentation error measures the amount of false merging, so we want to minimize this error. Fig. 2(a) shows a plot of the under-segmentation error vs. the number of superpixels for all the automated methods. In the range of 250-500 superpixels, the difference between the SUM algorithm and the other methods is very small. The under-segmentation error of Ncuts and ERS is 0.03 less than that of SUM. The under-segmentation error of SLIC and the turbopixel algorithm is just 0.007 less than that of SUM.

Boundary recall measures the adherence of superpixel boundaries to ground truth image boundaries, so we want to maximize this quantity. Fig. 2(b) shows that the boundary recall of SUM is comparable to that of the other superpixel algorithms. The boundary recall of SUM is 0.2 greater than that of turbopixels. The boundary recall of ERS, SLIC, and Ncuts is 0.07 greater than that of SUM.

All the automated algorithms were run on a 2.5 GHz Intel Core i5 processor with 4 GB RAM. Table I compares the run time of all the algorithms. SUM outperforms all the algorithms under study with respect to run time. In the operating range, SUM is 1.674 seconds faster than SLIC and 2.886 seconds faster than ERS. The algorithms are ranked in the following order with respect to run time: SUM, SLIC, ERS, Turbopixels, and Ncuts.
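The ranking can be reproduced directly from the 500-segment row of Table I. The short sketch below also expresses each run time as a multiple of the SUM run time. The ratios are derived here from the table values and are not figures reported by the authors, and the variable names are chosen only for this example.

```python
# Run times in seconds at 500 segments, copied from Table I.
run_time = {
    "Turbopixels": 39.110,
    "ERS": 2.910,
    "SLIC": 1.699,
    "Ncuts": 1802.600,
    "SUM": 0.024,
}

# Rank the algorithms from fastest to slowest.
ranking = sorted(run_time, key=run_time.get)

# Express every run time as a multiple of the SUM run time.
ratio_to_sum = {name: run_time[name] / run_time["SUM"] for name in ranking}

for name in ranking:
    print(f"{name:<12} {run_time[name]:>9.3f} s  {ratio_to_sum[name]:>9.1f}x SUM")
```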

Fig. 3 shows the superpixel segmentation results for an example rock image. Typically, the superpixel segmentation would next undergo post-processing in order to merge the over-segmented regions.

IV. CONCLUSION

In this paper, we compared recent superpixel segmentation algorithms on rock images. The Ncuts algorithm gives results comparable to those of the other automated algorithms in terms of under-segmentation and boundary recall, but requires more computation time. ERS and SLIC perform well in terms of under-segmentation and boundary recall and are also faster than turbopixels and Ncuts. SUM is the fastest of all the algorithms and is simple to implement compared to the other methods. At the same time, its under-segmentation and boundary recall are comparable to those of ERS and SLIC. Next, we plan to implement various region merging schemes to combine with these superpixel algorithms in order to determine which algorithm gives the most accurate final segmentation.

Fig. 3. (a) Original image. Results of the automated algorithms: (b) Ncuts, (c) turbopixels, (d) SLIC, (e) ERS, (f) SUM (proposed method).

ACKNOWLEDGMENT

We would like to thank Split Engineering LLC for providing the rock image data set.

REFERENCES

[1] S. Wang, H. Lu, L. Yang, and M.-H. Yang, "Superpixel tracking," in Proc. IEEE Int. Conf. Computer Vision, pp. 1323-1330, 2011.

[2] B. Fulkerson, A. Vedaldi, and S. Soatto, "Class segmentation and object localization with superpixel neighborhoods," in Proc. IEEE Int. Conf. Computer Vision, pp. 670-677, 2009.

[3] P. Mehrani and O. Veksler, "Saliency segmentation based on learning and graph cut refinement," in Proc. British Machine Vision Conf., pp. 110.1-110.12, 2010.

[4] B. Mičušík and J. Košecká, "Multi-view superpixel stereo in urban environments," Int. J. Computer Vision, vol. 89, no. 1, pp. 106-119, Aug. 2010.

[5] J. Shi and J. Malik, "Normalized cuts and image segmentation," IEEE Trans. Pattern Anal. Mach. Intell., vol. 22, no. 8, pp. 888-905, Aug. 2000.

[6] P. F. Felzenszwalb and D. P. Huttenlocher, "Efficient graph-based image segmentation," Int. J. Computer Vision, vol. 59, no. 2, pp. 167-181, Sep. 2004.

[7] D. Comaniciu and P. Meer, "Mean shift: A robust approach toward feature space analysis," IEEE Trans. Pattern Anal. Mach. Intell., vol. 24, no. 5, pp. 603-619, May 2002.

[8] A. Vedaldi and S. Soatto, "Quick shift and kernel methods for mode seeking," in Proc. European Conf. Computer Vision, pp. 705-718, 2008.

[9] L. Vincent and P. Soille, "Watersheds in digital spaces: an efficient algorithm based on immersion simulations," IEEE Trans. Pattern Anal. Mach. Intell., vol. 13, no. 6, pp. 593-598, Jun. 1991.

[10] O. Veksler, Y. Boykov, and P. Mehrani, "Superpixels and supervoxels in an energy optimization framework," in Proc. European Conf. Computer Vision, pp. 211-224, 2010.

[11] M. Van den Bergh, X. Boix, G. Roig, B. de Capitani, and L. Van Gool, "SEEDS: superpixels extracted via energy-driven sampling," in Proc. European Conf. Computer Vision, pp. 213-26, 2012.

[12] A. Levinshtein, A. Stere, K. N. Kutulakos, D. J. Fleet, S. J. Dickinson, and K. Siddiqi, "Turbopixels: Fast superpixels using geometric flows," IEEE Trans. Pattern Anal. Mach. Intell., vol. 31, no. 12, pp. 2290-2297, Dec. 2009.

[13] R. Achanta, A. Shaji, A. Lucchi, P. Fua, and S. Süsstrunk, "SLIC superpixels compared to state-of-the-art superpixel methods," IEEE Trans. Pattern Anal. Mach. Intell., vol. 34, no. 11, pp. 2274-2282, Nov. 2012.

[14] M. Y. Liu, O. Tuzel, S. Ramalingam, and R. Chellappa, "Entropy-rate clustering: cluster analysis via maximizing a submodular function subject to a matroid constraint," IEEE Trans. Pattern Anal. Mach. Intell., vol. 36, no. 1, pp. 99-112, Jan. 2014.

[15] C. Conrad, M. Mertz, and R. Mester, "Contour-related superpixels," in Proc. Int. Conf. on Energy Minimization Methods in Computer Vision and Pattern Recognition, pp. 280-293, 2013.

[16] P. Neubert and P. Protzel, "Evaluating superpixels in video: metrics beyond figure-ground segmentation," in Proc. British Machine Vision Conf., pp. 1-11, 2013.

[17] L. Vincent, "Grayscale area openings and closings: their applications and efficient implementation," in EURASIP Workshop on Mathematical Morphology and its Applications to Signal Processing, pp. 22-27, 1993.

[18] A. Meijster and M. H. F. Wilkinson, "A comparison of algorithms for connected set openings and closings," IEEE Trans. Pattern Anal. Mach. Intell., vol. 24, no. 4, pp. 484-494, Apr. 2002.