
ISSN 2075-1087, Gyroscopy and Navigation, 2013, Vol. 4, No. 4, pp. 188–197. © Pleiades Publishing, Ltd., 2013. Published in Russian in Giroskopiya i Navigatsiya, 2013, No. 3, pp. 59–71.

Building Localization from Forward-Looking Infrared Images for UAV Guidance*

Y. Qin (a, b), Zh. Cao (a, b), H. Li (a, b), X. Wang (a, b), and W. Zhuo (a, b)

(a) National Key Laboratory of Science and Technology on Multi-Spectral Information Processing,
(b) School of Automation, Huazhong University of Science and Technology, Wuhan, China

Received April 22, 2013

Abstract—This paper proposes a new approach to localizing a designated building from forward-looking infrared images under complex scenes, which can be used in UAV guidance. The approach makes full use of the scene information and is able to localize small or occluded buildings. The experimental results prove the efficiency of the algorithm.

DOI: 10.1134/S2075108713040093

* The article is published in the original.

I. INTRODUCTION

Inertial navigation systems (INS) are widely used in unmanned aircraft, such as unmanned aerial vehicles (UAVs), to navigate to a designated target [1]. A stand-alone INS, however, is inadequate for long-distance navigation, since its errors accumulate over time. Therefore, many aiding systems have been proposed and integrated with INS to ensure high-accuracy aircraft navigation. Among the existing aiding systems, camera-based (or vision-aided) navigation has drawn much attention. For example, a vision-aided inertial navigation algorithm via entropy-like relative pose estimation was presented by Corato et al. [2], and Scherbinin et al. proposed a color vision-based correlation-extremal aircraft navigation system [3]. The present paper focuses on one aspect of the camera-based navigation component of this problem: how to localize a designated building from forward-looking infrared (FLIR) imagery. The main reason for using an infrared imaging system is that it can work nearly 24 hours a day in passive imaging mode, which also supports operation under “electromagnetic silence”.

Knowing the coordinates of the designated building in FLIR imagery is very important for UAV guidance. However, localizing a designated building in FLIR images is a big challenge. Generally speaking, the main challenges lie in the following three aspects.

1) It is very difficult to predict the intensity of the infrared radiation emitted by the designated building, so prior knowledge of the luminance of the designated building in FLIR images is unavailable. As a result, many methods, such as those used in [4–6], which recognize or extract warm/cool targets (e.g., cars, tanks, and aircraft) against a cooler/warmer background in FLIR images, are not suitable for building localization.

2) FLIR images usually contain other buildings that look similar to the designated building.

3) The designated building can be small, i.e., short and low, or occluded by other objects.

As a result of challenges 2) and 3), the method proposed in [7] does not work well when the designated building is small or when other similar buildings appear in the FLIR image. The main reason is that the method in [7] uses only the information of the designated building and ignores the spatial relationship between the designated building and the other objects in the FLIR image.

Researchers have noticed that the shape of buildings can provide stable information for building localization. For this reason, it is very common to obtain shape priors for buildings from 3D building models. Iwaszczuk et al. [8] proposed a method to match 3D building models with IR images. However, this method works only when GPS, INS, and an accurate system calibration are all available at the same time. Using multi-scale structuring elements generated from 3D building models, Yang et al. [7] demonstrated a morphology-based approach to suppressing the background for building recognition in FLIR images. This method can only recognize large buildings, which should be huge (e.g., a 40-storey building or a building covering an entire city block). Wang et al. [9] proposed an indirect building localization method based on a prominent solid landmark. This method can localize small buildings, which are low-rise and occupy a small area, but it fails in the absence of a prominent solid landmark in the visual field. Furthermore, the precision of localization drops sharply when the designated building is far away from the prominent solid landmark, because of the error in measuring the position and attitude of the camera.

The existing methods to localize buildings use only the information of the designated building and treat the other objects in the FLIR image, such as rivers, roads, and other buildings, as cluttered background.

Therefore, the information provided by the other objects is underused. In fact, the other objects in the FLIR image contain a wealth of information that can be exploited for building localization.

In this paper, we propose a new approach to localizing a designated building by using 3D building models and remote sensing images. A schematic overview of the proposed approach is depicted in Fig. 1. First, we generate a brand-new reference image, in which the designated building is appointed in advance. Then, by matching the FLIR image with the corresponding reference image, the correspondence between the two images is found and the designated building in the FLIR image is localized. Unlike the existing methods, our approach makes full use of the scene information to localize buildings: everything in the FLIR image can provide useful information for finding the correspondence. Therefore, even when the designated building in the FLIR image is small or occluded, or when other similar buildings exist, the approach can still yield good results. In other words, the approach localizes the scene in the FLIR image and then localizes the designated building, because the spatial relationship between the designated building and the scene is substantially unchangeable.

Although out of the scope of the current work, we point out that the described approach can also be used in applications other than UAV guidance, such as texturing an existing 3D building model with FLIR image sequences. It has been reported that in European countries 40% of the energy is consumed by buildings and 47% of it is used for heating [8]. Therefore, texturing an existing 3D building model with FLIR image sequences can help detect unusual heat loss from buildings, with the goal of constructing green buildings.

The rest of this paper is organized as follows. Section II gives an overview of the proposed approach; section III introduces the generation of the reference image; section IV presents our method to match the FLIR image with the reference image; section V discusses the choice of parameters. The experimental results are shown in section VI, and section VII contains the conclusions.

II. METHOD OVERVIEW

For each FLIR image from the IR camera onboard an aircraft, the corresponding position and attitude (pitch, yaw, and roll) of the IR camera are captured by the INS. Let (x, y, z) be the coordinates of the aircraft in the physical world and (θ, ψ, φ) be the corresponding pitch, yaw, and roll angles of the IR camera, respectively. Meanwhile, the parameters of the IR camera (e.g., focal length f, horizontal angle of view α, vertical angle of view β) are also available.

A straightforward approach to localizing the building is to map the building in the physical world onto the projection screen, which is also called a projective transform. However, this approach works poorly even if the imaging system is accurately calibrated [8]. The main reason is that the position (x, y, z) and attitude (θ, ψ, φ) of the IR camera are inaccurate because of measurement errors.


Fig. 1. A schematic overview of the proposed approach.


Therefore, in this study we present a new method for building localization. The approach consists of the following steps.

First, a reference image is generated by projecting the 3D building models and the remote sensing image (both geo-referenced), based on the camera position (x, y, z) and attitude (θ, ψ, φ). The point of interest on the designated building is appointed manually. Then, the correspondence, or mapping, between the FLIR image and the reference image is found via image matching. As a result, the projection parameters relating the reference image to the corresponding FLIR image are obtained. Finally, the target is localized based on these projection parameters. In the first step, the projection is based on the pinhole camera model; as the projective transform is a mature technique in computer vision, we will not go into further detail on it in this study.
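Although the projective transform is standard, a minimal sketch may help fix the notation of this first step. The following Python fragment is only an illustration under assumed conventions: the rotation order, the helper names euler_to_rotation and project_points, and the use of a pixel-unit focal length are our assumptions, not the authors' code.

```python
import numpy as np

def euler_to_rotation(pitch, yaw, roll):
    """Camera rotation from pitch (theta), yaw (psi), roll (phi), in radians.
    Axis conventions vary between systems; this is one common ordering."""
    ct, st = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    cr, sr = np.cos(roll), np.sin(roll)
    Rx = np.array([[1, 0, 0], [0, ct, -st], [0, st, ct]])   # pitch about x
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])   # yaw about y
    Rz = np.array([[cr, -sr, 0], [sr, cr, 0], [0, 0, 1]])   # roll about z
    return Rz @ Ry @ Rx

def project_points(points_world, cam_pos, R, f_px, cx, cy):
    """Pinhole projection of geo-referenced 3D points into image coordinates.
    f_px is the focal length expressed in pixels (e.g., 19 mm divided by the
    pixel pitch for the camera of Table 1); (cx, cy) is the principal point."""
    pts_cam = (R.T @ (np.asarray(points_world, float) - cam_pos).T).T  # world -> camera
    u = f_px * pts_cam[:, 0] / pts_cam[:, 2] + cx                      # perspective divide
    v = f_px * pts_cam[:, 1] / pts_cam[:, 2] + cy
    return np.stack([u, v], axis=1)
```

Every 3D model vertex and remote-sensing pixel projected this way contributes to the synthetic reference image.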

Finding the mapping between the FLIR image and the reference image is the key to the target localization. For each pixel (x_r, y_r) in the reference image, let (x_I, y_I) be the corresponding pixel in the FLIR image. Here we write this mapping as

(x_I, y_I) = F_1(x_r, y_r). (1)

Because the measurement errors of the camera position (x, y, z) and attitude (θ, ψ, φ) are small, a 2-D affine mapping is a good approximation of F_1. Therefore, F_1 has the form

\begin{bmatrix} x_I \\ y_I \end{bmatrix} = \begin{bmatrix} m_1 & m_2 \\ m_3 & m_4 \end{bmatrix} \begin{bmatrix} x_r \\ y_r \end{bmatrix} + \begin{bmatrix} t_x \\ t_y \end{bmatrix}, (2)

where the translation is [t_x, t_y] and the affine rotation, scale, and stretch are represented by the parameters m_i. Given F_1 and the coordinates (x_t^r, y_t^r) of the target in the reference image, the coordinates (x_t^I, y_t^I) of the target in the FLIR image are

(x_t^I, y_t^I) = F_1(x_t^r, y_t^r). (3)

To determine F_1, we propose the following novel scheme (a code sketch is given after the list).

1) For each FLIR image I, we generate a set of image patches (see Fig. 2a). Each image patch I_i is of size n × n and centered at (x_i, y_i) (see Fig. 2b). The image patches are m pixels apart from each other.

2) Each image patch I_i is matched with the reference image; let (u_i, v_i) be the matching result (see Figs. 2c, 2d, and 2e).

3) Using the well-known random sample consensus (RANSAC) algorithm, the parameters (t_x, t_y, and m_i) are obtained by fitting the 2-D affine mapping between (x_i, y_i) and (u_i, v_i).
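As an illustration of these three steps, a compact sketch is given below. It is not the authors' implementation: match_patch is a placeholder for the HOG-based matching of section IV, and OpenCV's estimateAffine2D with the RANSAC flag stands in for the fitting of step 3.

```python
import numpy as np
import cv2

def localize_target(flir, reference, target_ref_xy, match_patch, n=120, m=40):
    """Patch grid (step 1), patch matching (step 2), robust affine fit (step 3)."""
    h, w = flir.shape
    flir_pts, ref_pts = [], []
    for y in range(0, h - n + 1, m):          # n x n patches, m pixels apart
        for x in range(0, w - n + 1, m):
            u, v = match_patch(flir[y:y + n, x:x + n], reference)
            flir_pts.append((x + n / 2.0, y + n / 2.0))   # patch center (x_i, y_i)
            ref_pts.append((u + n / 2.0, v + n / 2.0))    # match result (u_i, v_i)
    # Fit the 2-D affine mapping F1 of Eq. (2), reference -> FLIR, rejecting outliers.
    A, _ = cv2.estimateAffine2D(np.float32(ref_pts), np.float32(flir_pts),
                                method=cv2.RANSAC)
    # Eq. (3): map the appointed target point into the FLIR image.
    x_t, y_t = A @ np.array([target_ref_xy[0], target_ref_xy[1], 1.0])
    return x_t, y_t
```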

One thing about this scheme is noteworthy. By generating a set of image patches from the FLIR image, all the objects in the FLIR image are described and become useful information sources for finding the correspondence. This is the essential difference between the new approach and the existing methods, and it is why the proposed approach still works when the designated building is very small or occluded by other objects. For example, when the designated building is occluded by clouds in the FLIR image, the correspondences between the FLIR image and the reference image can still be found, provided there are other unoccluded objects in the FLIR image, such as a river, a road, or other buildings.

In the next section, we introduce our reference image. The method to match the FLIR image with the reference image is presented in section IV.

III. REFERENCE IMAGE GENERATION

The first step of the described approach is reference image generation. For every FLIR image, the corresponding position and attitude (pitch, yaw, and roll) of the IR camera are captured by the INS. Knowing the camera position and attitude, the 3D building models and remote sensing images, which are both geo-referenced, can be projected into the image based on the pinhole camera model. Figure 3a demonstrates the projection; Fig. 3b shows a FLIR image, while Fig. 3c shows the corresponding projection result.

Because of the measurement errors, the position and attitude of the IR camera provided by the INS are not accurate. In fact, the attitude measurement is much more accurate than the position measurement. As a result, the projected model does not exactly match the objects in the FLIR image. The main geometric relationship between a FLIR image and its corresponding projection image is a translation. This phenomenon is also mentioned by D. Iwaszczuk et al. [8].


Fig. 2. Matching the FLIR image with the corresponding reference image. (a): Some image patches generated from the FLIR image. (b): Centers of the image patches. (c): An example of an incorrect match. (d) and (e): Two correct matches.



Comparing Figs. 3b and 3c, we can see that projection images do not match FLIR images well. To localize a designated building, we must find the correspondences between FLIR images and projection images. However, this is a big challenge. The main problem in finding correspondences lies in the fact that 3D building models and FLIR images are heterogeneous, so buildings in the projection images look very different from the corresponding buildings in FLIR images. Moreover, it is difficult to distinguish the sides of the buildings in the projection images; as a consequence, there is no gradient information on the buildings in the projection images, while abundant gradient information exists on the buildings in the FLIR images.

To solve this problem, we propose a simple but efficacious method. The 3D building models are projected into an image with different colors on different sides, so that every side can be distinguished. For each side of a building, let i be the index of the building and V = (v_x, v_y, v_z) be the normal vector of the side. The color C = (R, G, B) of this side is

\begin{cases} R = 255 \times \sin(v_x + i), \\ G = 255 \times \sin(v_y + i), \\ B = 255 \times \sin(v_z + i). \end{cases} (4)

Equation (4) makes every building and every side of a building in the projection image distinguishable from the others. As a consequence, the shapes of the buildings in the projection image are very clear and easy to describe. As 3D projection is a mature technique in computer vision and has been used by many researchers, e.g., D. Iwaszczuk et al. [8], we will not go into further detail on 3D projection or on the spatial resolution of the projection image in this paper.

As FLIR images are grey, we convert the RGB values of the colorful projection image to gray-scale values in the common way, by forming a weighted sum of the R, G, and B components. The grey version (such as Fig. 4) of the projection image is used as the reference image.
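For concreteness, a sketch of the coloring and gray-scale conversion is given below. Equation (4) is reconstructed here as a per-building phase-shifted sine; the clipping of out-of-range values and the BT.601 gray weights are our assumptions, not details given in the paper.

```python
import numpy as np

def side_color(i, normal):
    """Color of one building side per Eq. (4): building index i, unit normal
    (vx, vy, vz). Clipping to [0, 255] is an assumed post-processing step."""
    vx, vy, vz = normal
    rgb = 255.0 * np.sin(np.array([vx, vy, vz]) + i)
    return np.clip(rgb, 0, 255)

def to_reference_gray(rgb_image):
    """Weighted RGB-to-gray sum; the BT.601 weights are one common choice."""
    r, g, b = rgb_image[..., 0], rgb_image[..., 1], rgb_image[..., 2]
    return 0.299 * r + 0.587 * g + 0.114 * b
```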


Fig. 3. 3D projection and a FLIR image. (a): Demonstration of 3D projection. (b): FLIR image. (c): The corresponding projection image.

Fig. 4. The reference image.



IV. MATCHING THE FLIR IMAGE PATCH AND THE REFERENCE IMAGE

We believe that an appropriate representation is the key to the success of an image matching task. A number of visual descriptors have been proposed in the literature, such as SIFT [10], SURF [11], BRIEF [12], and ORB [13]. These descriptors, although some are a decade old, have proven remarkably successful in a number of applications. However, they are not suitable for matching the FLIR image patches with our reference images. In this study, the well-known Histogram of Oriented Gradients (HOG) [14] is used to represent FLIR image patches and reference images. Based on this representation, we match FLIR image patches with their corresponding reference images.

It should be noticed that a pixel value in the reference image may not be the same as, or similar to, the corresponding pixel value in the FLIR image. Consequently, the gradient orientation of a pixel in the reference image could be just opposite to the gradient orientation of the corresponding pixel in the FLIR image. To make the HOG feature able to capture the essential shape information, the proposed approach modifies the gradient orientation as

r = \begin{cases} r_0 + \pi, & \text{when } r_0 \le 0, \\ r_0, & \text{when } r_0 > 0, \end{cases} (5)

where r_0 is the common version of the gradient orientation. Figure 5 shows an example of our gradient orientation modification. In Fig. 5, each circle stands for a pixel and each arrow stands for a gradient. If the gradient orientation belongs to the third or fourth quadrant, e.g., the central and the bottom-left pixels in Fig. 5, we add π to its value. Owing to this modification, the HOG feature captures the essential shape information without being affected by the dramatic luminance differences between the FLIR images and the reference images.
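A direct transcription of Eq. (5), assuming the common orientation r_0 = arctan2(g_y, g_x) in (−π, π], could look as follows; this folding is what makes the HOG bins insensitive to contrast reversal between the two image types.

```python
import numpy as np

def modified_orientation(img):
    """Fold gradient orientations into (0, pi] per Eq. (5), so that opposite
    gradients in the FLIR and reference images vote into the same HOG bin."""
    gy, gx = np.gradient(img.astype(float))   # row and column image gradients
    r0 = np.arctan2(gy, gx)                   # common orientation in (-pi, pi]
    return np.where(r0 <= 0, r0 + np.pi, r0)  # add pi when r0 <= 0
```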

Let I_i be a FLIR image patch, T be the reference image, and F(I_i) be the modified HOG feature of image I_i. F(I_i) is a vector whose dimension is directly proportional to the size of the image I_i (see Fig. 6). The best match \hat{P} between I_i and T is

\hat{P} = \arg\min_{p} \{ d(F(I_i), F(T_p)) \}, (6)

where d(a, b) is the Euclidean distance between a and b, and p = (x, y)^T is a vector of parameters. T_p denotes a patch of the reference image T that has the same size as the FLIR image patch; for any pixel indexed by (u, v) in T_p, the equation T_p(u, v) = T(u + x, v + y) holds. To solve Eq. (6), the traversal method is used.
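The traversal (exhaustive search) of Eq. (6) amounts to a sliding-window loop. In the sketch below, hog_feature is a stand-in for the modified HOG descriptor of this section, and the step argument (1 for a full traversal) is our addition.

```python
import numpy as np

def match_patch(patch, reference, hog_feature, step=1):
    """Slide a patch-sized window over the reference image T and return the
    offset p = (x, y) minimizing the Euclidean distance d(F(I_i), F(T_p))."""
    n = patch.shape[0]
    f_i = hog_feature(patch)
    H, W = reference.shape
    best_p, best_d = (0, 0), np.inf
    for y in range(0, H - n + 1, step):
        for x in range(0, W - n + 1, step):
            d = np.linalg.norm(f_i - hog_feature(reference[y:y + n, x:x + n]))
            if d < best_d:
                best_p, best_d = (x, y), d
    return best_p
```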

Figure 7 shows the initial matching results. It can be noticed that most matches are correct. Based on the matching results, we have a candidate mapping (x_i, y_i) ↔ (u_i, v_i). As there are a few incorrect matches, least squares fitting cannot generate a good result. Therefore, a best fit to this mapping is generated using the random sample consensus (RANSAC) algorithm. This helps reject outliers and makes the algorithm robust. Once the best-fit affine parameters are found, the target is localized based on Eq. (3).
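The outlier rejection can be sketched in plain NumPy as a minimal RANSAC loop (an illustration only; three point pairs is the minimal sample for an affine model, and the iteration count and inlier tolerance are assumed values):

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares 2-D affine fit (Eq. 2) from point pairs src -> dst."""
    A = np.hstack([src, np.ones((len(src), 1))])   # rows [u_i, v_i, 1]
    M, *_ = np.linalg.lstsq(A, dst, rcond=None)    # 3 x 2 parameter matrix
    return M

def ransac_affine(src, dst, iters=500, tol=2.0):
    """Sample 3 pairs, fit, count inliers, keep the best; refit on all inliers."""
    src, dst = np.asarray(src, float), np.asarray(dst, float)
    ones = np.ones((len(src), 1))
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(iters):
        idx = np.random.choice(len(src), 3, replace=False)
        M = fit_affine(src[idx], dst[idx])
        residual = np.linalg.norm(np.hstack([src, ones]) @ M - dst, axis=1)
        inliers = residual < tol
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    return fit_affine(src[best_inliers], dst[best_inliers]), best_inliers
```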


Fig. 5. An example of gradient orientation modification.


Fig. 6. HOG feature for an image. There are several “cells” in each “block”. By concatenating all the histograms from the “cells”, a histogram is obtained that describes the “block”. After processing all the “blocks”, the HOG feature is obtained by concatenating all the histograms from all the “blocks”.



V. THE PARAMETERS IN THE DESCRIBED APPROACH

There are two parameters in the described approach: n and m. If n is too small, the number of incorrect matches increases; if n is too large, the number of incorrect matches decreases, but at the expense of a larger computational burden. The running time and the correct matching rate (CMR) of matching a FLIR image patch of size n × n pixels with the reference image (320 × 256 pixels) are shown in Fig. 8.

The experiments were performed on a laptop PC (CPU T7500, 2.20 GHz, 2 GB of RAM). The CMR is defined as

CMR = #(correctly matched FLIR image patches) / #(total FLIR image patches). (7)

To ensure a relatively high CMR and a small computational burden at the same time, we set the parameter n to 120. In this case, (320 – 120 + 1) × (256 – 120 + 1) = 27537 patches are available, since the size of the FLIR image is 320 × 256. However, the computational burden would be huge if all of these patches were matched with the reference image. The running time and the average probability of correct localization (PCL) of localizing the target building in FLIR images are shown in Fig. 9. The PCL is defined as

PCL = #(correctly localized FLIR images) / #(total FLIR images). (8)

From Fig. 9, it can be seen that both the running time and the PCL decrease as the parameter m increases. Therefore, we set the parameter m to 40 to ensure a relatively high PCL and a small computational burden at the same time.


Fig. 7. The results of matching the FLIR image patches with the reference image. The white line segments denote correct matches and the black line segments denote incorrect matches. Only one quarter of the matches are shown for better visibility.

Fig. 8. Running times, s (a) and CMRs (b) for different n.

Fig. 9. Running times, s (a) and PCLs (b) for different m.



VI. EXPERIMENTAL RESULTS

About 32000 FLIR images like the ones shown in Fig. 10 were used in our building localization experiment. The FLIR images were obtained with the camera detailed in Table 1 under two kinds of weather conditions: sunny and cloudy. These FLIR images composed 29 image sequences; 20 image sequences were taken in the daytime and the rest were taken at night. The altitude of the UAV was about 1000 meters and the range to the target was from about 6 km down to 1 km. The 3D building models used in the experiment were generated from a target area of size 1 km × 1 km. The spatial resolution of the remote sensing images used in the experiments was 1 meter. The image coordinates of the points of interest on the designated buildings in these FLIR images were selected manually and used as the ground truth.

The parameters of the HOG were set as follows. There were 2 × 2 cells in one block, the cells were of size 8 × 8, and the number of orientation bins was 5. There was no overlap between cells; the overlap between blocks was fixed at half of the block size.

Figure 11 shows the probability of correct localization (PCL) for Yang's method [7], Wang's method [9], and the proposed method. From Fig. 11, we can see that the proposed method performs much better, especially when short and low buildings, or buildings occluded by clouds, have to be localized. The average correct localization rate of the proposed approach was 95.12% with an allowable error of ±3 pixels, against 50.68% and 64.47% for Yang's method [7] and Wang's method [9], respectively. We did not test the method proposed by D. Iwaszczuk et al. [8], since its system calibration requirements are too rigorous.

Figure 12 shows three groups of building localization results in sunny weather. In Fig. 12, two points of interest on the designated buildings are marked in the reference images, and the localization results are marked in the corresponding FLIR images. Some buildings in the FLIR images do not exist in the 3D building model data, because those buildings had not yet been built when the models were generated, while the FLIR images were captured after those buildings had been finished. However, the described approach is tolerant of this fault and can still localize the designated building correctly.

Figure 13 shows the localization results for a small building in cloudy weather. In Fig. 13, (a) is the reference image, where the point of interest on the designated building is marked with a black arrow; (b), (c), and (d) are the localization results for different frames of a FLIR image sequence. The localization is very accurate, although the designated building is small and the FLIR images are degraded by clouds at the same time. Therefore, the proposed approach can localize small, deeply buried, or carefully camouflaged buildings under complex scenes.


Fig. 10. Examples of our FLIR images.

Fig. 11. Performance of the different methods (PCL, %, by image sequence) for the proposed method, Yang's method [7], and Wang's method [9]. The designated buildings in the image sequences labeled 1–17, 18–25, and 26–29 are big buildings, short and low buildings, and buildings occluded by clouds, respectively.

Table 1. Camera specifications

  Resolution    Focal length    Angle of view
  320 × 256     19 mm           3.7° × 2.9°



We also performed experiments to illustrate how robust the described approach is to measurement errors in the camera position and attitude. The camera position is described by the coordinates (x, y) in the Gauss-Kruger coordinate system and the altitude h; the camera attitude is described by three parameters: the pitch, yaw, and roll angles. The measurement errors of these six parameters all affect building localization. We added Gaussian white noise to these six parameters one by one and then generated reference images based on the noisy parameters. Specifically, for a FLIR image there are six parameters (x, y, h, pitch, yaw, roll); we added Gaussian white noise to one of these parameters without changing the other five, thus obtaining six versions of the noisy parameters for one FLIR image, and then generated six reference images based on these noisy parameters. The PCL was used again to see which parameter the described approach is most sensitive to. The standard deviation of the Gaussian white noise added to the Gauss-Kruger coordinates (x, y) and the altitude h ranged from 0 to 100, and from 0 to 3 for the pitch, yaw, and roll. The results are shown in Fig. 14, from which we can see that the altitude h is the most sensitive of the three position parameters, while the roll is the most sensitive of the three attitude parameters. Fortunately, a modern INS can easily measure the altitude h and the roll angle with high precision. In fact, the measurement errors of the Gauss-Kruger coordinates (x, y) are the main INS errors. However, the described approach is robust to these measurement errors, which makes it practical.


Fig. 12. Results of building localization.




Fig. 13. An example of building localization in cloudy weather. (a): The reference image. (b), (c), and (d): Building localization results for different frames.

Fig. 14. Probability of correct localization (PCL) versus the standard deviation of the noise added to the position parameters x, y, h (left, 0–100) and the attitude parameters pitch, yaw, roll (right, 0–3).
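The sensitivity test of Fig. 14 can be paraphrased in code as follows (a sketch under assumed interfaces: localize() stands for the whole pipeline of sections III and IV, the parameter dictionary is hypothetical, and the ±3 pixel tolerance follows section VI):

```python
import numpy as np

def pcl_under_noise(images, truths, cam_params, name, sigma, localize, tol=3.0):
    """Add zero-mean Gaussian noise to one camera parameter at a time and
    re-measure the probability of correct localization, Eq. (8)."""
    correct = 0
    for img, (xt, yt) in zip(images, truths):
        noisy = dict(cam_params)                 # x, y, h, pitch, yaw, roll
        noisy[name] += np.random.normal(0.0, sigma)
        xe, ye = localize(img, noisy)            # regenerates the reference image
        if np.hypot(xe - xt, ye - yt) <= tol:
            correct += 1
    return correct / len(images)

# One sensitivity curve per parameter, as plotted in Fig. 14:
# curves = {p: [pcl_under_noise(imgs, gts, cam, p, s, localize) for s in sigmas[p]]
#           for p in ("x", "y", "h", "pitch", "yaw", "roll")}
```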


VII. CONCLUSIONS

A novel approach to localizing buildings in FLIR images has been proposed, which can localize small or occluded buildings under complex scenes. The approach has been tested on real FLIR image sequences. The experimental results show that the new approach can precisely localize the designated building in FLIR images under complex scenes. Moreover, it is robust in situations where the FLIR images are degraded by clouds.

ACKNOWLEDGMENTS

This work was supported by the Fund of Advanced Research Projects of China (9140A01060111JW0505).

REFERENCES

1. Farooq, A. and Limebeer, D.J.N., Bank-to-Turn Missile Guidance with Radar Imaging Constraints, Journal of Guidance, Control, and Dynamics, 2005, vol. 28, pp. 1157–1170.

2. Corato, F.D., Innocenti, M., and Pollini, L., Robust Vision-Aided Inertial Navigation Algorithm via Entropy-like Relative Pose Estimation, Gyroscopy and Navigation, 2013, vol. 4, no. 1, pp. 1–13.

3. Scherbinin, V.V., Shevtsova, E.V., Vasil'eva, Y.S., and Chizhevskaya, O.M., Functioning Methods and Algorithms of Color Vision-Based Correlation-Extremal Aircraft Navigation System, Gyroscopy and Navigation, 2013, vol. 4, no. 1, pp. 39–49.

4. Lee, H.-Y., Kim, T.-H., and Park, K.-H., Target Extraction in Forward-Looking Infrared Images Using Fuzzy Thresholding via Local Region Analysis, Optical Review, 2011, vol. 18, pp. 383–388.

5. Yang, L., Yang, J., and Zheng, Z.L., Detecting Infrared Small Targets Based on Adaptive Local Energy Threshold under Sea-Sky Complex Backgrounds, Journal of Infrared and Millimeter Waves, 2006, vol. 25, pp. 41–45.

6. Mahalanobis, A., Muise, R.R., and Stanfill, R.R., Quadratic Correlation Filter Design Methodology for Target Detection and Surveillance Applications, Applied Optics, 2004, vol. 43, pp. 5198–5205.

7. Yang, X.Y., Zhang, T.X., and Lu, Y., Method for Building Recognition from FLIR Images, IEEE Aerospace and Electronic Systems Magazine, 2011, vol. 26, pp. 28–33.

8. Iwaszczuk, D., Hoegner, L., and Stilla, U., Matching of 3D Building Models with IR Images for Texture Extraction, in Proceedings of the Joint Urban Remote Sensing Event (JURSE), 2011, pp. 25–28.

9. Wang, X.P., Zhang, T.X., and Yang, X.Y., Indirect Building Localization Based on a Prominent Solid Landmark from Forward-Looking Infrared Imagery, Chinese Optics Letters, 2011, vol. 9.

10. Lowe, D.G., Distinctive Image Features from Scale-Invariant Keypoints, International Journal of Computer Vision, 2004, vol. 60, pp. 91–110.

11. Bay, H., Ess, A., Tuytelaars, T., and Van Gool, L., Speeded-Up Robust Features (SURF), Computer Vision and Image Understanding, 2008, vol. 110, pp. 346–359.

12. Calonder, M., Lepetit, V., Strecha, C., and Fua, P., BRIEF: Binary Robust Independent Elementary Features, in Proceedings of ECCV, 2010, pp. 778–792.

13. Rublee, E., Rabaud, V., Konolige, K., and Bradski, G., ORB: An Efficient Alternative to SIFT or SURF, in Proceedings of the IEEE International Conference on Computer Vision, 2011, pp. 2564–2571.

14. Dalal, N. and Triggs, B., Histograms of Oriented Gradients for Human Detection, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2005, vol. 1, pp. 886–893.