
Indexing of Remote Sensing Images With Different Resolutions by Multiple Features

Bin Luo, Shujing Jiang, and Liangpei Zhang

Abstract—The indexing of the images from huge remote sensing databases is a key issue for the space agencies. A particularity of remote sensing image databases is that the images have different but known spatial resolutions. In this paper, the joint indexing of remote sensing images with different spatial resolutions is investigated. The main contribution of this paper consists in proposing approaches for comparing the features extracted from images with different spatial resolutions, which are usually not comparable. The experimental results obtained on satellite images taken by SPOT5 and SPOT validate the proposed comparison approaches.

Index Terms—Image analysis, image recognition, image retrieval, remote sensing.

I. INTRODUCTION

SPACE agencies have collected databases with huge amounts of remote sensing images over the last decades. The indexing of the images from such databases is a key issue for the space agencies. Since the 1990s, Content Based Image Retrieval (CBIR) approaches have been proposed for natural image or multimedia databases (see [1], [2] for a review). CBIR systems for remote sensing images, such as KES [3] and KIM [4], have also been developed.

For image retrieval, features are first extracted as descriptors of the images. Three families of features can often be found in the literature: radiometric features, texture features and shape features. The radiometric features, including the statistics of the gray values of the images, are the most widely used for image retrieval [5]. Since the patterns of the ground objects in remote sensing images are often complicated, the radiometric features are not adequate for describing the spatial relations between the pixels. Therefore, texture features, such as Gabor features [6]–[9], Gray Level Co-occurrence Matrix (GLCM) features [10]–[12] and wavelet features [13], have been proposed for describing the complex patterns in remote sensing images. However, the spatial accuracy of texture features is usually low, since they require a sufficiently large neighborhood to describe the spatial information of a single pixel. Recently, object-oriented approaches have been proposed for the analysis of remote sensing images [14], [15]. These approaches separate the

Manuscript received February 14, 2012; revised June 11, 2012; accepted October 28, 2012. This work was supported by the Chinese NSFC project 61102129 and NSFC project 41061130553.

The authors are with the State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing (LIESMARS), Wuhan University, 430079 Wuhan, China (e-mail: [email protected], [email protected], [email protected]).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/JSTARS.2012.2228254

structures corresponding to ground objects in the images and can therefore extract the shape features of the extracted structures (such as the areas and the perimeters of the ground objects). These shape features are descriptive since they are related directly to the objects on the ground.

One particularity of remote sensing image databases, when compared to natural image databases, is that they are composed of images with different but known spatial resolutions. Classical features, such as Gabor features, GLCM features, wavelet features, etc., extracted from images with different resolutions are not directly comparable. Thus the images with different resolutions cannot be jointly indexed. Fortunately, many feature extraction methods are either scale invariant or resolution invariant. For example, the Gabor filter bank is scale invariant [9]. In [13], the authors have proposed a resolution invariance for the Gaussian derivative wavelets in order to compare the Gaussian wavelet features obtained from images with different resolutions. These invariances make it possible to compare the features extracted from images with different resolutions, which can hence be jointly indexed.

The contributions of this paper are two-fold: i) we propose approaches for comparing the texture features and the shape features extracted from images with different resolutions, which are usually not comparable; and ii) we evaluate the performances of different combinations of the radiometric, texture and shape features for the indexing of remote sensing images with different resolutions.

The paper is organized as follows. In Section II, we briefly introduce the radiometric, texture and shape features used for indexing the remote sensing images. In the same section, we present how to compare the features extracted from images with different resolutions for the indexing. In Section III.A, we present the data sets used for the experiments, as well as the parameters used for extracting the features. In Section III, classification and retrieval experiments are carried out on images taken by SPOT5 over the same scene but with different resolutions. In Section III.B, the results of the classifications are presented in order to evaluate the efficiency of the different feature sets. In Section III.C, the most efficient feature sets are used for the retrieval of remote sensing images with different resolutions. In Section IV, classification and retrieval experiments are carried out on images taken respectively by SPOT2 and SPOT5 over two different scenes and with different resolutions. Finally, we conclude in Section V.

II. EXTRACTIONS AND COMPARISONS OF THE FEATURES OBTAINED AT DIFFERENT RESOLUTIONS

Three families of features are used for the indexing in this paper: the radiometric features, the texture features and the



shape features. In this section, we briefly introduce the methods for extracting the aforementioned features. Since the low level features extracted on images with different resolutions are not always comparable, approaches which allow comparing the features extracted from images with different resolutions are also proposed for each type of feature in this section.

A. Radiometric Features

The mean values and the standard deviations of the images are computed as radiometric features. More concretely, the radiometric features for an image I are defined by:

F_r(I) = (\mu_I, \sigma_I)    (1)

where \mu_I = \frac{1}{N} \sum_{x} I(x) (I(x) is the grey value of the pixel x of the image and N is the number of pixels), and \sigma_I = \sqrt{\frac{1}{N} \sum_{x} (I(x) - \mu_I)^2}.

If the distributions of the gray values of remote sensing images of the same scene but with different spatial resolutions are similar, their radiometric features can be directly compared without any additional step.
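For illustration only, a minimal sketch of the radiometric feature of (1) is given below. It is not the authors' code: the function name radiometric_features is ours, NumPy is assumed, and the image is assumed to be a single-band patch stored as an array.

```python
import numpy as np

def radiometric_features(image: np.ndarray) -> np.ndarray:
    """Radiometric feature vector (mean, standard deviation) of a grey-level patch."""
    img = image.astype(np.float64)
    return np.array([img.mean(), img.std()])

# Patches of the same scene at different resolutions can be compared directly,
# provided their grey-value distributions are similar:
# d = np.linalg.norm(radiometric_features(patch_10m) - radiometric_features(patch_2m5))
```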

B. Texture Features

Three classical methods are used for extracting texture features in this paper: the continuous Gaussian wavelets, the Gabor filter bank and the Gray Level Co-occurrence Matrix (GLCM).

1) Gaussian Wavelet Features: The Gaussian scale-space representation of an image I is defined as:

L(x, y, t) = (I * G_t)(x, y)    (2)

where * represents the convolution, G_t(x, y) = \frac{1}{2\pi t^2} \exp\left(-\frac{x^2 + y^2}{2t^2}\right) and t is the scale parameter. The features of the image I_r (with resolution r) at scale t are computed on the Gaussian scale-space representation by:

f_t(I_r) = \frac{1}{N} \sum_{x, y} \sqrt{L_x(x, y, t)^2 + L_y(x, y, t)^2}    (3)

where L_x = D_x * L(\cdot, \cdot, t) and L_y = D_y * L(\cdot, \cdot, t); D_x and D_y represent the discrete first order derivatives in the horizontal and vertical directions. For the image I_r, the features are extracted on a sequence of successive scales t_1 < t_2 < \cdots < t_n.

It has been shown in [13] that the image acquisition process

can be approximately modeled by a Gaussian convolution followed by a sampling. An image I_r of resolution r is obtained by:

I_r = S_r (s * G_{\alpha r})    (4)

where S_r is the sampling at resolution r, s is the continuous function representing the scene, G_{\alpha r} is a Gaussian function with standard deviation \alpha r representing the MTF of the instrument, and \alpha is the characteristic parameter of the MTF. The larger \alpha is, the smoother the image is. The Gaussian scale-space is scale invariant.

According to [13], by using the causality of the Gaussian convolution and (2) and (4), we can obtain the resolution invariance of the Gaussian scale-space features. More concretely, the features extracted on images of the same scene with different spatial resolutions can be the same. For two images I_{r_1} (with resolution r_1) and I_{r_2} (with resolution r_2) of the same scene, the features f_{t_1}(I_{r_1}) and f_{t_2}(I_{r_2}) extracted respectively on I_{r_1} and I_{r_2} are equal if

t_1^2 + (\alpha r_1)^2 = t_2^2 + (\alpha r_2)^2    (5)

where the scales t_1 and t_2 are expressed in ground units. It is shown in [13] that the resolution invariance of the Gaussian wavelet features yields better indexing results than using only the scale invariance of the Gaussian scale-space. Therefore, for the images of two different resolutions r_1 and r_2, we extract the Gaussian wavelet features at respectively two series of scales that satisfy (5); the extracted features at different resolutions can thus be compared for indexing.
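As an illustration of how the scale series can be matched across resolutions, the sketch below (ours, not the authors' implementation) computes mean gradient magnitudes of the Gaussian scale-space with SciPy. The feature definition of (3) and the scale relation (5) used here are reconstructions, and the scale values are illustrative, so this should be read as one plausible instantiation under those assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, sobel

def gaussian_wavelet_features(image, scales_m, resolution_m, alpha=0.4):
    """Mean gradient magnitude of the Gaussian scale-space at scales given in meters.

    Each ground-unit scale is converted to a residual filtering scale in pixels,
    discounting the acquisition blur alpha * resolution (cf. (4)-(5)), so that
    features from images with different resolutions become comparable.
    """
    img = image.astype(np.float64)
    feats = []
    for t in scales_m:
        t2 = t ** 2 - (alpha * resolution_m) ** 2      # residual blur (ground units)
        sigma_px = np.sqrt(max(t2, 0.0)) / resolution_m
        L = gaussian_filter(img, sigma_px)
        gx, gy = sobel(L, axis=1), sobel(L, axis=0)
        # divide by the resolution so the derivative is per ground unit
        feats.append(np.mean(np.hypot(gx, gy)) / resolution_m)
    return np.array(feats)

# Same ground-unit scales for both resolutions -> comparable feature vectors:
# f10 = gaussian_wavelet_features(patch_10m, [15, 20, 30, 45], resolution_m=10)
# f25 = gaussian_wavelet_features(patch_2m5, [15, 20, 30, 45], resolution_m=2.5)
```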

2) Gabor Features: In computer vision, the Gabor filter bank

is considered to be similar to the human vision system. It is often used for extracting the orientation and frequency information of the texture. A Gabor filter in the spatial domain is defined by:

g_{\lambda,\theta}(x, y) = \exp\left(-\frac{x'^2 + \gamma^2 y'^2}{2\sigma^2}\right) \cos\left(2\pi \frac{x'}{\lambda} + \psi\right)    (6)

where x' = x\cos\theta + y\sin\theta and y' = -x\sin\theta + y\cos\theta, \theta is the orientation parameter, \lambda represents the sinusoid wavelength, \psi is the phase offset, \sigma is the Gaussian scale parameter, and \gamma is the spatial ratio. The features extracted on an image I_r (with resolution r) by the Gabor filter are defined as:

f_{\lambda,\theta}(I_r) = \frac{1}{N} \sum_{x, y} |(I_r * g_{\lambda,\theta})(x, y)|    (7)

The Gabor filters are scale invariant [9]. More concretely, for two bi-dimensional functions \xi(x, y) and \xi'(x, y) = \xi(cx, cy), where c is a scaling factor, we have:

f_{\lambda,\theta}(\xi') = f_{c\lambda,\theta}(\xi)    (8)

up to a normalization constant, when the Gaussian scale parameter \sigma is scaled accordingly. Note that the Gabor filter has two parameters related to the scale, \sigma and \lambda. The causality of the Gaussian kernel cannot be applied to the Gabor filter. Therefore the image acquisition model (4) cannot be integrated with Gabor filters. In this paper, we attempt


to use the scale invariance of the Gabor filters for comparing the features extracted on images of different resolutions by supposing that:

f_{\lambda_1,\theta}(I_{r_1}) \approx K \, f_{\lambda_2,\theta}(I_{r_2})    (9)

where \lambda_1 r_1 = \lambda_2 r_2 (and \sigma_1 r_1 = \sigma_2 r_2), and K is a normalization constant related to the sizes of the images with different resolutions. This approximation differs from the resolution invariance proposed in [13] in the sense that the convolution with the MTF of the instrument is not considered and the change of resolution of the images is approximated simply by a resampling of the image.
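For illustration, the sketch below (ours, using scikit-image's Gabor filter rather than the exact filter bank of the paper) extracts mean Gabor response magnitudes and scales the wavelength by the resolution ratio in the spirit of (9). The wavelength values are illustrative assumptions.

```python
import numpy as np
from skimage.filters import gabor

def gabor_features(image, wavelengths_px, thetas_deg=(0, 45, 90, 135)):
    """Mean magnitude of Gabor responses for each (wavelength, orientation) pair."""
    img = image.astype(np.float64)
    feats = []
    for lam in wavelengths_px:
        for theta in thetas_deg:
            real, imag = gabor(img, frequency=1.0 / lam, theta=np.deg2rad(theta))
            feats.append(np.mean(np.hypot(real, imag)))
    return np.array(feats)

# Wavelengths in pixels are scaled by the resolution ratio (10 m / 2.5 m = 4),
# so both feature vectors probe the same structures in ground units:
# base = [2, 4, 8]                      # illustrative values, in pixels at 10 m
# f10 = gabor_features(patch_10m, base)
# f25 = gabor_features(patch_2m5, [4 * w for w in base])
```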

3) Gray Level Co-Occurrence Matrix (GLCM) Features: The Gray Level Co-occurrence Matrix computed on an image I_r (with resolution r) for a displacement \Delta is defined as:

C_\Delta(i, j) = \#\{p : I_r(p) = i, \ I_r(p + \Delta) = j\}    (10)

where \# represents the number of elements contained in the set. Usually, for an image I_r, four co-occurrence matrices are computed at four orientations:

\Delta_{0^\circ} = (d, 0)    (11)

\Delta_{45^\circ} = (d, d)    (12)

\Delta_{90^\circ} = (0, d)    (13)

\Delta_{135^\circ} = (-d, d)    (14)

where d is the distance parameter. In this paper, 5 features—the Angular Second Moment, the Contrast, the Variance, the Inverse Difference Moment and the Prominence of Clustering—are computed from each matrix. These features were originally proposed in [10], and their definitions are given in the Appendix. We note h_k(C_\Delta) the k-th feature extracted on the GLCM C_\Delta. The GLCM feature set computed on an image with resolution r is defined as:

F_c(I_r, d) = \{h_k(C_\Delta) : k = 1, \ldots, 5, \ \Delta \in \{\Delta_{0^\circ}, \Delta_{45^\circ}, \Delta_{90^\circ}, \Delta_{135^\circ}\}\}    (15)

where d is the distance used for the four displacements.

For the indexing of images with different resolutions, we extract the GLCM features from images with resolution r_1 by using distance d_1 and the GLCM features from images with resolution r_2 by using distance d_2. Though equality between the GLCM features extracted on two images I_{r_1} and I_{r_2} with two different resolutions (but of the same scene) cannot be rigorously established, we can make a rough approximation between F_c(I_{r_1}, d_1) and F_c(I_{r_2}, d_2) by setting

d_1 r_1 = d_2 r_2    (16)

In Section III.B, it will be seen that this rough approximation can provide satisfactory results for the indexing of remote sensing images with different resolutions.
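The distance scaling of (16) can be sketched with scikit-image as below. This is our own illustration, not the paper's code: graycoprops only exposes a subset of the five features used here (the full set is sketched after the Appendix), the grey-level quantization to 32 levels is an assumption, and the images are assumed to be 8-bit patches.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def glcm_features(image_u8, distances, levels=32):
    """ASM and contrast of the GLCM at four orientations, for each distance."""
    # quantize to a small number of grey levels to keep the matrices dense
    img = (image_u8.astype(np.float64) / 256.0 * levels).astype(np.uint8)
    angles = [0, np.pi / 4, np.pi / 2, 3 * np.pi / 4]
    glcm = graycomatrix(img, distances=distances, angles=angles,
                        levels=levels, symmetric=False, normed=True)
    feats = [graycoprops(glcm, prop).ravel() for prop in ("ASM", "contrast")]
    return np.concatenate(feats)

# Distances scaled as d1 * r1 = d2 * r2 (16): 1, 2, 3 at 10 m <-> 4, 8, 12 at 2.5 m.
# f10 = glcm_features(patch_10m_u8, distances=[1, 2, 3])
# f25 = glcm_features(patch_2m5_u8, distances=[4, 8, 12])
```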

C. Shape Features

In [15], the authors propose a method based on the topographic map of the image to estimate the local scale of each pixel in the case of gray scale remote sensing images. For each

Fig. 1. Examples of the SPOT5 image patches taken over Nanjing used for the experiments © SPOTIMAGE. Top row: image patches of the three classes with 2.5 m resolution (512 × 512 pixels). Bottom row: image patches of the three classes with 10 m resolution (128 × 128 pixels).

TABLE I
SIX CLASSIFICATION EXPERIMENTS FOR EVALUATING THE FEATURES

pixel x, the most contrasted shape s^*(x) containing x is extracted. The scale of the shape s^*(x) defines the characteristic scale of this pixel. The topographic map [16], which can be obtained by the Fast Level Set Transformation (FLST) [17], represents an image by an inclusion tree of shapes (which are defined as the connected components of the level sets). For each pixel x, there is a branch of shapes s_1(x) \subset s_2(x) \subset \cdots \subset s_n(x) containing it. Note \mu(s_i) the gray level of the shape s_i, A(s_i) its area and P(s_i) its perimeter. The contrast of the shape s_i is defined as c(s_i) = |\mu(s_{i+1}) - \mu(s_i)|.

The most contrasted shape s^*(x) of a given pixel x is defined as the shape containing this pixel of which the contrast is the most important, i.e.,

s^*(x) = \arg\max_{s_i(x)} c(s_i(x))    (17)

Since the optical instruments always blur remote sensing images, several shapes with very low contrasts can belong to the same structure. In order to deal with the blur, the authors of [15] propose a geometrical criterion to accumulate the contrasts of the shapes corresponding to one given structure. The idea is that the difference of the areas of two successive shapes (for example s_i and s_{i+1}) corresponding to one given structure is proportional to the perimeter of the smaller shape, i.e., A(s_{i+1}) - A(s_i) \le \beta P(s_i), where \beta is a constant characterizing the blur of the image contours (which is usually equal to 1 for most instruments according to [15]). It is shown in [15] that the most contrasted shapes extracted in an image form a partition of this image. The area and the perimeter values of the most contrasted shape s^*(x) are very pertinent geometrical features for the classification task [18].


Fig. 2. Mean and standard deviations of the κ coefficients of the classification results obtained by combining two feature sets (the radiometric features combined with the Gaussian wavelet, Gabor, GLCM or shape features). See the protocol of the experiments in Section III.B.1 for better comprehension of the results. Panels (a)–(f) correspond to the six classification experiments of Table I.

In this paper, for an image I_r, we at first compute the most contrasted shape s^*(x) for each pixel x. The area value A(s^*(x)) of the shape is associated to each pixel x contained in s^*(x). The feature used for describing the image is the histogram of the logarithmic area values of all the pixels in the image:

F_s(I_r) = \mathrm{hist}\{a(x) : x \in I_r\}    (18)

where

a(x) = \log A(s^*(x))    (19)

On two images I_{r_1} and I_{r_2} of the same scene, for the same object (such as a building, a farm, etc.), the area values of the shapes s^*_1 and s^*_2 extracted respectively on the two images I_{r_1} and I_{r_2} have the following relation:

A(s^*_1) \cdot r_1^2 = A(s^*_2) \cdot r_2^2    (20)

Therefore, if the logarithmic areas are expressed in ground units,

a_1(x) = \log A(s^*_1(x)) + 2\log r_1, \quad a_2(x) = \log A(s^*_2(x)) + 2\log r_2    (21)

we have

F_s(I_{r_1}) = F_s(I_{r_2})    (22)
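The FLST itself is beyond a short sketch, but the resolution compensation of (19)–(22) reduces to histogramming log areas in ground units. The sketch below (ours) assumes that a per-pixel area map A(s*(x)), in pixels, has already been produced by some FLST-based segmentation; the bin range is an illustrative assumption.

```python
import numpy as np

def shape_feature(area_map_px, resolution_m, bins=np.linspace(0, 16, 33)):
    """Histogram of log areas (in square meters) of the most contrasted shapes.

    area_map_px[y, x] is assumed to hold the pixel area A(s*(x)) of the most
    contrasted shape containing pixel x; multiplying by resolution**2 expresses
    it in ground units, so histograms from different resolutions share bins.
    """
    log_area_m2 = np.log(area_map_px.astype(np.float64) * resolution_m ** 2)
    hist, _ = np.histogram(log_area_m2, bins=bins, density=True)
    return hist

# h10 = shape_feature(area_map_10m, resolution_m=10)
# h25 = shape_feature(area_map_2m5, resolution_m=2.5)
```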

Fig. 3. Image pairs of three classes with different resolutions and the ratios of the Gabor features and the GLCM features extracted from these image pairs. The classes of the image pairs are (from left to right) Building, Farm and Vegetation.


Fig. 4. Mean and standard deviations of the κ coefficients of the classification results obtained by combining more than two feature sets (the radiometric features combined with the Gaussian wavelet, GLCM and shape features). See the protocol of the experiments in Section III.B.1 for better comprehension of the results. Panels (a)–(f) correspond to the six classification experiments of Table I.

III. EXPERIMENTS ON IMAGES OF SPOT5

A. Data Sets and Parameters

1) Data Sets: The SPOT5 image taken over Nanjing, China on Oct. 3, 2002 is used. The spatial resolution of the panchromatic image is 2.5 m. Since SPOT5 also provides multispectral data with 10 m resolution on the same region, we simulate a panchromatic image with 10 m resolution from the multispectral product, in order to ensure that the terrain types and the weather conditions are similar for the classification and to avoid, as much as possible, the influence of factors other than the difference of spatial resolutions. More concretely, since SPOT5 does not have a blue channel, the panchromatic image with 10 m resolution is simulated by a weighted combination of G and R (where G and R are respectively the green and the red channels of the multispectral SPOT5 image with 10 m resolution), in order to ensure that the histograms of the simulated panchromatic image with 10 m resolution and of the panchromatic image with 2.5 m resolution are similar.

The images are then cropped into small image patches. For the images with a resolution of 2.5 m, the size of each patch is 512 × 512 pixels, while for the resolution of 10 m, the size of each patch is 128 × 128 pixels. Among all the image patches of each resolution, 3 classes of terrain are chosen: the Building class (which contains 97 images representing urban areas), the Vegetation class (which contains 97 images representing forest areas) and the Farm class (which contains 110 images representing rural and agricultural areas). In total, 304 image patches with 2.5 m resolution and the same number of image patches with 10 m resolution are used for the classification. The radiometric, texture and shape features are extracted on each image patch. The classification and retrieval experiments are carried out on the databases composed of these image patches. Some examples of the image patches at different resolutions

are shown in Fig. 1.

2) Parameters: Radiometric, texture and shape features are extracted from the image patches for the experiments.

For extracting the Gaussian wavelet features, the parameter \alpha (see (4)) is set to 0.4, which is experimentally found to be the most appropriate for the SPOT5 instrument. More specifically, we classify the images with different resolutions with several candidate values of \alpha; according to the results of cross validation, \alpha = 0.4 has been found to be the most appropriate. For extracting the features from the images with 10 m resolution, a series of successive scales is used. The scale parameters used for the images with 2.5 m resolution are computed by using (5).

For extracting the Gabor features, the Gabor filter bank divides the frequency domain into four directions (i.e., the orientation parameter \theta is set to 0, 45, 90 and 135 degrees). The spatial ratio \gamma is set to 3. For extracting the Gabor features for the images with 10 m resolution, a series of wavelengths \lambda (with the corresponding Gaussian scale parameters \sigma) is used, while for the images with 2.5 m resolution, according to (9), the corresponding


Fig. 5. Mean and standard deviations of the κ coefficients of the classification results obtained with different feature sets (the radiometric features, the Gaussian wavelet features, the GLCM features and the SIFT features). Panels (a)–(f) correspond to the six classification experiments of Table I.

parameters are scaled by the ratio of the resolutions (i.e., multiplied by a factor of 4). For each image, the dimension of the Gabor feature vector is the number of orientations times the number of wavelengths.

For extracting the GLCM features, the GLCMs at the four directions \Delta_{0^\circ}, \Delta_{45^\circ}, \Delta_{90^\circ}, \Delta_{135^\circ} are computed. For the images with 10 m resolution, at each direction, the GLCM features at the distances 1, 2 and 3 are extracted. According to (16), for the images with 2.5 m resolution, at each direction, the GLCM features at the distances 4, 8 and 12 are extracted. We have computed 5 features from each Gray-Level Co-occurrence Matrix: the Angular Second Moment, the Contrast, the Variance, the Inverse Difference Moment and the Prominence of Clustering (see the Appendix for the definitions of these features). Therefore, for each image, the dimension of the GLCM feature vector is 4 × 3 × 5 = 60.

B. Experiments and Results I: Classification

1) Protocol of the Experiments: In this section, the remote sensing images with different resolutions are classified by using different features. In order to show the efficiency of the features, we classify the images by using different combinations of the feature sets. The classifier used is the Support Vector Machine [19]. During the experiments, we found that the images in the database are linearly separable; the linear kernel is thus used. The classification results are evaluated by the κ coefficient [20]. For each choice of feature set, the classification results obtained with the 6 schemes (see Table I) are shown.

For each classification task, the training sets are randomly selected from the whole image patch database. The sizes of the training sets vary from 10% to 60% of all the sample image patches with an increment of 10%. For each percentage, in order to avoid the bias caused by the selection of the training sets, 10 different sets are randomly selected for training the classifier. The mean value and the standard deviation of the κ coefficients obtained with the 10 training sets are shown as a quantitative evaluation.
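A compact version of this protocol can be sketched with scikit-learn as below. It is ours, not the authors' code; X and y are assumed to be the precomputed feature matrix and class labels, and the sketch only shows the random-split part of the protocol (the six train/test resolution schemes of Table I would additionally restrict the training and test sets to the appropriate resolution subsets).

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics import cohen_kappa_score
from sklearn.model_selection import train_test_split

def evaluate_feature_set(X, y, train_fractions=np.arange(0.1, 0.7, 0.1),
                         repeats=10, seed=0):
    """Mean and std of the kappa coefficient for several training-set sizes."""
    rng = np.random.RandomState(seed)
    results = {}
    for frac in train_fractions:
        kappas = []
        for _ in range(repeats):
            X_tr, X_te, y_tr, y_te = train_test_split(
                X, y, train_size=float(frac), stratify=y,
                random_state=rng.randint(1 << 30))
            clf = SVC(kernel="linear").fit(X_tr, y_tr)   # linear kernel, as in the paper
            kappas.append(cohen_kappa_score(y_te, clf.predict(X_te)))
        results[round(float(frac), 1)] = (np.mean(kappas), np.std(kappas))
    return results
```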

ture Sets: In this section, the radiometric features ( , see (1))are combined with one of the Gauss scale-space ( , see (3)),Gabor ( , see (7)), GLCM ( , see (15)), and Shape features( , see (18)) in order to see the gain of the texture and shapefeatures for classification task when compared to radiometricfeatures.In Fig. 2, the mean values and standard deviations of the

coefficients of the classification results by combining two fea-ture sets are shown.Several remarks can be drawn based on the results shown in

Fig. 2:• The radiometric features can produce good classifi-cation results when the training set and the test set are allwith 2.5 m resolution (Fig. 2(f)). While for other cases, theresults obtained by are moderate, especially for the ex-periment , the results are quite bad.

• When we combine other features with the radiometric fea-tures, in most of the cases, the classification results im-prove a lot, especially when the training sets and the testsets are with different resolutions. The only exception is


Fig. 6. Retrieval results for an image of the Farm class. The key image, which is on the top left, belongs to the Farm class with 10 m resolution. On top of each image patch, the class which the image patch belongs to and its spatial resolution are shown on the first row, and the number of the image patch in the database is shown on the second row. The images are ordered from left to right and from top to bottom by their similarities with the key image on the top left. The precision is 48/48.

the Gabor features. For the images with the same resolution, the improvement of the results obtained by adding the Gabor features is quite slight, while when the training set and the test set are with different resolutions, the results obtained by adding the Gabor features are even worse.

• The combination of the radiometric features and the GLCM features always produces the best classification results. The approximation made by (16) is empirically validated by this experiment.

• The performance of the combination of the radiometric features and the Gaussian wavelet features always follows the results obtained by the combination of the radiometric and GLCM features, because the comparison of the Gaussian wavelet features obtained on the images with different resolutions is rigorously based on the resolution invariance proposed in [13] and [21].

• By adding the shape features to the radiometric features, the classification results improve a lot. For the case when the training and test sets are all with 10 m resolution, this combination is one of the feature sets which give the best results. The shape features are based on the scale-adaptive segmentation proposed in [15], of which the effectiveness for the classification of remote sensing images has been tested in [18]. It is not surprising that the addition of the shape features can improve the classification results. When the resolution of the images is 2.5 m, the structures in the images are much more complicated than in the images with 10 m resolution, and more segmentation errors may occur. Therefore, for the images with 10 m resolution, the improvements related to the shape features are more significant.

Based on the above observations, we can conclude that the approaches proposed in the previous section for comparing features extracted from images with different resolutions are efficient, except for the Gabor features. It has to be remarked that the comparison of the Gabor features extracted at different resolutions is based on the scale invariance of Gabor filters. The scale invariance is efficient in many cases when the resolutions of the images are not specifically given, for example for object recognition purposes [9]. However, when we want to


Fig. 7. Retrieval results for an image patch of the Building class. The key image patch, which is on the top left, belongs to the Building class with 2.5 m resolution. On top of each image patch, the class which the image patch belongs to and its spatial resolution are shown on the first row, and the number of the image patch in the database is shown on the second row. The images are ordered from left to right and from top to bottom by their similarities with the key image on the top left. The precision is 47/48.

compare the features obtained from images whose resolutions are known, the scale invariance is not sufficiently accurate [13]. In Fig. 3, we compute the ratios between the Gabor features from 3 image pairs captured by SPOT5 with respectively 2.5 m and 10 m resolutions, using the parameters shown in Section III.A. Denote by f^{2.5} and f^{10} the features extracted from the 2.5 m and 10 m resolutions respectively. We compute the ratio f^{2.5}/f^{10} for each feature. It can be seen that, though the ratios are quite constant for each image pair, the ratio values vary a lot when the classes of the images change (0.6 for the Building class, 0.9 for the Farm class and 1 for the Vegetation class). In particular, when the image is homogeneous (for example, the Vegetation image, which has very little variation when compared to the Building image), the ratios are close to one. This variation is the cause of the large classification errors when the image resolutions are different.

For comparison, we also compute the ratios between the GLCM features extracted on images with two different resolutions (but of the same scene), g^{2.5}/g^{10}, where g^{2.5} and g^{10} are the GLCM feature vectors computed at 2.5 m and 10 m resolutions respectively. The results are shown on the fourth row of Fig. 3. It can be seen that, though the equality between the GLCM features extracted on images with two different resolutions is only established by a simple linear approximation (16), the ratios always vary around 1 for the images of the three classes.

3) Classification Based on the Combinations of More

Than Two Feature Sets: In the previous section, we have observed that the addition of the GLCM features, the Gaussian wavelet features or the shape features to the radiometric features can improve the classification results when compared to the radiometric features alone. In this section, we at first choose two feature sets among the GLCM, Gaussian wavelet and shape features to combine with the radiometric features for the classification. Then all four feature sets are combined for the classification. The mean values and the standard deviations of the κ coefficients of the classification experiments using the combinations of 3 and 4 feature sets are shown in Fig. 4. The results obtained by the combination of the radiometric features and the GLCM features serve as the baseline for comparison.


Fig. 8. Retrieval results for an image patch of the Vegetation class. The key image patch, which is on the top left, belongs to the Vegetation class with 2.5 m resolution. On top of each image patch, the class which the image patch belongs to and its spatial resolution are shown on the first row, and the number of the image patch in the database is shown on the second row. The images are ordered from left to right and from top to bottom by their similarities with the key image on the top left. The precision is 47/48.

Several remarks can be drawn from the results:

• The results obtained by the combination of the radiometric, GLCM and Gaussian wavelet features are always better than the results obtained by the combination of the radiometric and GLCM features, which indicates that the addition of the Gaussian wavelet features can still improve the classification results. This is coherent with the observation drawn in Section III.B.2, since the combinations of the radiometric features with the Gaussian wavelet features and with the GLCM features are always the two best two-feature combinations.

• The results obtained with and without the shape features are very similar, which indicates that the shape features can hardly improve the classification results. This may be caused by the coarse resolutions of the images: the shape features extracted from images with 2.5 m or 10 m resolution cannot provide supplementary information to the combination of the radiometric, GLCM and Gaussian wavelet features.

4) Comparison With the Classification Results Obtained by the Scale Invariant Feature Transformation (SIFT): The Scale Invariant Feature Transformation (SIFT) proposed in [22], [23] has been widely used in computer vision for image matching, object tracking, scene classification, etc. Recently, applications of SIFT to remote sensing have also been proposed, including object detection [24], image classification [25], etc. The features

Fig. 9. Examples of the image patches used for the experiments. Top row: image patches with 2.5 m resolution taken by SPOT5 at Nanjing, China (512 × 512 pixels). Bottom row: image patches with 10 m resolution taken by SPOT at Wuhan, China (128 × 128 pixels).

are extracted from key points in an image along the principal directions of the gradients, which makes the extracted features invariant to scale and rotation changes.


Fig. 10. Mean and standard deviations of the κ coefficients of the classification results obtained with different feature sets (the radiometric features, the Gaussian wavelet features, the GLCM features and the SIFT features). Panels (a)–(f) correspond to the six classification experiments of Table I.

In this paragraph, we compare the results obtained by the features presented in Section II with the results obtained by SIFT, in order to evaluate the efficiency of SIFT for the indexing of remote sensing images with different resolutions. We extract the SIFT features from key points selected by the DoG operator in the image. Since many key points can be detected, we compute the mean value of the SIFT descriptors of all the key points as the SIFT feature of the whole image, which is then used for the classification.

In Fig. 5, the results obtained by using the combination of the radiometric, GLCM and Gaussian wavelet features and by using SIFT are shown. It can be seen that for the classifications of the images with the same resolution, the results obtained by SIFT are quite similar to the results obtained by this combination. However, when the images are with different resolutions, the accuracies obtained by the SIFT features decrease dramatically. This observation is mainly due to the fact that the key points selected from the images of 10 m resolution and 2.5 m resolution can be quite different, which makes the features from the key points not comparable. More intensive studies on the influence of the key point selection for scene classification can be found in [26].
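The per-image SIFT feature described above can be sketched as follows. This is our own illustration and assumes OpenCV (opencv-python, version 4.4 or later, where SIFT is available); it is not the authors' implementation.

```python
import cv2
import numpy as np

def mean_sift_descriptor(image_u8):
    """Average 128-D SIFT descriptor over all DoG key points (zeros if none found)."""
    sift = cv2.SIFT_create()
    _, descriptors = sift.detectAndCompute(image_u8, None)
    if descriptors is None or len(descriptors) == 0:
        return np.zeros(128)
    return descriptors.mean(axis=0)
```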

C. Experiments and Results II: Image Retrieval

In the previous section, the performances of the different feature sets were evaluated by the classification results obtained on the images with different resolutions. It has been seen that the combination of the radiometric features, the GLCM features and the Gaussian wavelet features can provide the best classification results. In this section, we apply

this feature combination to image retrieval. For each retrieval, a key image patch (with 2.5 m or 10 m resolution) is selected from the database as the query. Its radiometric, GLCM and Gaussian scale-space features are then compared with the features of the other image patches in the database. The Euclidean distance between the features is computed as the similarity measurement. The most similar image patches are selected as the retrieval result. For each retrieval, 48 image patches (including of course the key image itself) are shown. In Figs. 6 to 8, the results of three retrieval experiments are shown. The three key image patches belong to the three different classes: Building, Farm and Vegetation.
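This nearest-neighbor retrieval can be sketched in a few lines; the code below is ours, and the feature matrix is assumed to already hold the concatenated radiometric, GLCM and Gaussian wavelet features of every patch in the database.

```python
import numpy as np

def retrieve(features, query_index, n_results=48):
    """Rank database patches by Euclidean distance to the query's feature vector.

    features: (n_patches, n_dims) array of concatenated feature vectors.
    Returns the indices of the n_results most similar patches (query included).
    """
    dists = np.linalg.norm(features - features[query_index], axis=1)
    return np.argsort(dists)[:n_results]

# Retrieval precision, given the class label of each patch:
# hits = labels[retrieve(features, q)] == labels[q]
# precision = hits.mean()
```

The remarks below refer to retrievals performed with exactly this ranking on the features described above.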

Several remarks can be drawn from the retrieval results:

• Generally speaking, the retrieval results are quite good. For the Farm class, all the retrieved images belong to the same class as the key image, while for the Building and the Vegetation classes, only one retrieved image belongs to a different class. The results show that the schemes proposed in Section II for comparing the features extracted from images with different resolutions are accurate enough for the indexing of images at different resolutions. This observation is coherent with the results obtained in the classification experiments.

• For the images of the Farm and the Vegetation classes, the images of the same scenes as the key images but with different resolutions are the first ones retrieved by using this feature combination, which indicates that their features are the most similar to those of the key images among all the images in the database, though their spatial


Fig. 11. Retrieval results for an image patch of the Building class. The key image patch, which is on the top left, belongs to the Building class with 10 m resolution. The ID number, the class and the spatial resolution of the image patch are shown on top of each image patch. The images are ordered from left to right and from top to bottom by their similarities with the key image on the top left. The precision is 48/48.

resolutions are very different. For the Building class, though the first retrieved image is not of the same scene as the key image, the image of the same scene is still among the retrieved images.

IV. EXPERIMENTS ON IMAGES ACQUIRED BY DIFFERENT SENSORS

A. Data Sets

In this experiment, the data set is composed of images taken by two sensors over two different regions. We use the SPOT image taken over Wuhan, China on Jul. 22, 2000 with 10 m resolution and the SPOT5 image taken over Nanjing, China on Oct. 3, 2002 with 2.5 m resolution. The two images are then cropped into image patches of 128 × 128 pixels (SPOT image of Wuhan) and 512 × 512 pixels (SPOT5 image of Nanjing).

For the image patches of each resolution, 237 are manually labeled into three classes: the Building class (97 images), the Farm class (97 images) and the Water class (43 images). Therefore, there are in total 474 image patches in the database.

Some examples of the image patches at different resolutions are shown in Fig. 9.

The parameters are the same as in the experiments carried out in Section III. The classifier used is again the SVM with a linear kernel.

B. Experiments and Results I: Classification

Since we have found in the previous section that the combination of the radiometric, GLCM and Gaussian wavelet features can provide the best results, the results obtained by three different feature sets are shown: i) the radiometric features alone; ii) this combination; and iii) the SIFT features. As usual, the mean values and standard deviations of the κ coefficients are shown in Fig. 10.

It can be seen that the results obtained by using the combination are always the best when compared to the radiometric features alone and to the SIFT features. The combination can greatly improve the results when compared to the radiometric features alone. The SIFT features always give poor results due to the difficulty of key point selection on images with different resolutions. All the above observations are coherent with the results obtained on the database composed of SPOT5 images in Section III.B.


Fig. 12. Retrieval results for an image patch of the Water class. The key image patch, which is on the top left, belongs to the Water class with 10 m resolution. The ID number, the class and the spatial resolution of the image patch are shown on top of each image patch. The images are ordered from left to right and from top to bottom by their similarities with the key image on the top left. The precision is 48/48.

C. Experiments and Results II: Image Retrieval

In this section, the retrieval experiments are carried out on the database composed of SPOT and SPOT5 images by using the combination of the radiometric, GLCM and Gaussian wavelet features. The protocol of the experiment is the same as in Section III.C.

The retrieval results obtained on the images of the three different classes are shown in Figs. 11–13.

The retrieval results are generally good. For the Building and the Water classes, all the retrieved images belong to the same classes as the query images, and the retrieved results are composed of images with both 10 m and 2.5 m resolutions. For the Farm class, 9 images are wrongly retrieved. However, it has to be remarked that the Farm class is very difficult to separate from the other two classes, since the fields in the key image are very small, which makes it quite similar to the Building class.

V. CONCLUSION

In this paper, we have proposed methods for indexing remote sensing images with different resolutions. The radiometric features, the texture features (including the Gaussian wavelet features, the Gabor features and the GLCM features) and the shape features have been used. For the different kinds of features, we have proposed to use either the scale invariance or the resolution invariance in order to compare the features extracted from images with different resolutions. According to the classification results on remote sensing images with different resolutions, the combination of the radiometric features, the GLCM features and the Gaussian wavelet features is the most efficient. The image retrieval experiments confirm that this combination can retrieve very well the images of the same class as the query image, even though the differences between the spatial resolutions of the query image and the retrieved images are quite large.

APPENDIX

The 5 GLCM features computed on a matrix C are the following, where p(i, j) denotes the normalized entry of C (i.e., C(i, j) divided by the sum of all the entries):

• Angular Second Moment

h_1 = \sum_i \sum_j p(i, j)^2    (23)


Fig. 13. Retrieval results for an image patch of the Farm class. The key image patch, which is on the top left, belongs to the Farm class with 10 m resolution. The ID number, the class and the spatial resolution of the image patch are shown on top of each image patch. The images are ordered from left to right and from top to bottom by their similarities with the key image on the top left. The precision is 39/48.

• Contrast

h_2 = \sum_{n} n^2 \sum_{|i - j| = n} p(i, j)    (24)

• Variance

h_3 = \sum_i \sum_j (i - \mu)^2 p(i, j)    (25)

where \mu is the mean of p.

• Inverse Difference Moment

h_4 = \sum_i \sum_j \frac{1}{1 + (i - j)^2} p(i, j)    (26)

• Prominence of Clustering

h_5 = \sum_i \sum_j (i + j - \mu_x - \mu_y)^4 p(i, j)    (27)

where \mu_x = \sum_i \sum_j i \, p(i, j) and \mu_y = \sum_i \sum_j j \, p(i, j).
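As a direct NumPy transcription of (23)–(27) as reconstructed above (our own sketch, not the authors' code), the five features can be computed from a co-occurrence matrix as follows.

```python
import numpy as np

def haralick5(p):
    """Five GLCM features (23)-(27) from a (possibly unnormalized) co-occurrence matrix."""
    p = p.astype(np.float64)
    p = p / p.sum()                                            # normalize to probabilities
    i, j = np.indices(p.shape)
    asm        = np.sum(p ** 2)                                # (23) Angular Second Moment
    contrast   = np.sum(((i - j) ** 2) * p)                    # (24) Contrast
    mu         = np.sum(i * p)                                 # mean of p (row index)
    variance   = np.sum(((i - mu) ** 2) * p)                   # (25) Variance
    idm        = np.sum(p / (1.0 + (i - j) ** 2))              # (26) Inverse Difference Moment
    mu_x, mu_y = np.sum(i * p), np.sum(j * p)
    prominence = np.sum(((i + j - mu_x - mu_y) ** 4) * p)      # (27) Prominence of Clustering
    return np.array([asm, contrast, variance, idm, prominence])
```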

REFERENCES

[1] M. S. Lew, N. Sebe, C. Djeraba, and R. Jain, "Content-based multimedia information retrieval: State of the art and challenges," ACM Trans. Multimedia Computing, Communications and Applications, vol. 2, no. 1, pp. 1–19, 2006.

[2] R. Datta, D. Joshi, J. Li, and J. Z. Wang, "Image retrieval: Ideas, influences, and trends of the new age," ACM Computing Surveys, vol. 40, no. 2, 2008.

[3] A. Colapicchioni, "KES: Knowledge enabled services for better EO information use," in Proc. IEEE Int. Geoscience and Remote Sensing Symp. (IGARSS'04), 2004, pp. 176–179.

[4] M. Datcu, H. Daschiel, A. Pelizzari, M. Quartulli, A. Galoppo, A. Colapicchioni, M. Pastori, K. Seidel, P. G. Marchetti, and S. D'Elia, "Information mining in remote sensing image archives: System concepts," IEEE Trans. Geoscience and Remote Sensing, vol. 41, no. 12, pp. 2923–2936, 2003.

[5] J. Flusser and T. Suk, "Degraded image analysis: An invariant approach," IEEE Trans. Pattern Anal. Mach. Intell., vol. 20, no. 6, pp. 590–603, 1998.

[6] S. D. Newsam and C. Kamath, "Retrieval using texture features in high resolution multi-spectral satellite imagery," in Data Mining and Knowledge Discovery: Theory, Tools, and Technology VI, Proc. SPIE, vol. 5433, B. V. Dasarathy, Ed., 2004, pp. 21–32.

[7] A. P. Wang and S. G. Wang, "Content-based high resolution remote sensing image retrieval with local binary patterns," in Geoinformatics 2006: Remotely Sensed Data and Information, Proc. SPIE, vol. 6419, L. Zhang and X. L. Chen, Eds., 2006.


[8] H. Y. Yao, B. C. Li, W. Cao, and Phei, "Remote sensing imagery retrieval based on Gabor texture feature classification," in Proc. 2004 7th Int. Conf. Signal Processing, 2004, vol. 1–3, pp. 733–736.

[9] J. K. Kamarainen, V. Kyrki, and H. Kalviainen, "Invariance properties of Gabor filter-based features—Overview and applications," IEEE Trans. Image Processing, vol. 15, no. 5, pp. 1088–1099, 2006.

[10] R. M. Haralick, K. Shanmugam, and I. Dinstein, "Textural features for image classification," IEEE Trans. Systems, Man and Cybernetics, vol. SMC-3, no. 6, pp. 610–621, 1973.

[11] R. F. Walker, P. T. Jackway, and I. D. Longstaff, "Recent developments in the use of the co-occurrence matrix for texture recognition," in Proc. DSP'97: 1997 13th Int. Conf. Digital Signal Processing, New York, 1997, vol. 1 and 2, pp. 63–65.

[12] R. F. Walker, P. T. Jackway, and D. Longstaff, "Genetic algorithm optimization of adaptive multi-scale GLCM features," Int. J. Pattern Recognition and Artificial Intelligence, vol. 17, no. 1, pp. 17–39, 2003.

[13] B. Luo, J. F. Aujol, Y. Gousseau, and S. Ladjal, "Indexing of satellite images with different resolutions by wavelet features," IEEE Trans. Image Processing, vol. 17, no. 8, pp. 1465–1472, 2008.

[14] S. Niebergall, A. Loew, and W. Mauser, "Integrative assessment of informal settlements using VHR remote sensing data—The Delhi case study," IEEE J. Selected Topics in Applied Earth Observations and Remote Sensing, vol. 1, no. 3, pp. 193–205, 2008.

[15] B. Luo, J. F. Aujol, and Y. Gousseau, "Local scale measure from the topographic map and application to remote sensing images," Multiscale Modeling and Simulation, vol. 8, no. 1, pp. 1–29, 2009.

[16] V. Caselles, B. Coll, and J. M. Morel, "Topographic maps and local contrast changes in natural images," Int. J. Computer Vision, vol. 33, no. 1, pp. 5–27, 1999.

[17] P. Monasse and F. Guichard, "Fast computation of a contrast-invariant image representation," IEEE Trans. Image Processing, vol. 9, no. 5, pp. 860–872, 2000.

[18] B. Luo and J. Chanussot, "Supervised hyperspectral image classification based on spectral unmixing and geometrical features," J. Signal Processing Systems for Signal, Image and Video Technology, vol. 65, no. 3, pp. 457–468, 2011.

[19] V. Vapnik, Statistical Learning Theory. New York: Wiley, 1998.

[20] J. Carletta, "Assessing agreement on classification tasks: The kappa statistic," Computational Linguistics, vol. 22, pp. 249–254, 1996.

[21] B. Luo, J. F. Aujol, Y. Gousseau, S. Ladjal, and H. Maitre, "Resolution independent characteristic scale dedicated to satellite images," IEEE Trans. Image Processing, vol. 16, no. 10, pp. 2503–2514, 2007.

[22] D. G. Lowe, "Object recognition from local scale-invariant features," in Proc. Int. Conf. Computer Vision, 1999, vol. 2, pp. 1150–1157.

[23] D. G. Lowe, "Distinctive image features from scale-invariant keypoints," Int. J. Computer Vision, vol. 60, pp. 91–110, 2004.

[24] B. Sirmacek and C. Unsalan, "Urban-area and building detection using SIFT keypoints and graph theory," IEEE Trans. Geoscience and Remote Sensing, vol. 47, no. 4, pp. 1156–1167, 2009.

[25] S. Xu, T. Fang, D. R. Li, and S. W. Wang, "Object classification of aerial images with bag-of-visual words," IEEE Geoscience and Remote Sensing Lett., vol. 7, no. 2, pp. 366–370, 2010.

[26] W. J. Xie, D. Xu, S. Y. Liu, and Y. G. Tang, "How the number of interest points affect scene classification," IEICE Trans. Information and Systems, vol. E93-D, no. 4, pp. 930–933, 2010.

Photographs and biographies of the authors were not available at the time of publication.