

Contents lists available at ScienceDirect

Neurocomputing

journal homepage: www.elsevier.com/locate/neucom

SAR target configuration recognition based on the biologically inspired model

Xiayuan Huang a, Xiangli Nie a, Wei Wu a, Hong Qiao a, Bo Zhang b,⁎

a The State Key Lab of Management and Control for Complex System, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
b Institute of Applied Mathematics, AMSS, Chinese Academy of Sciences, Beijing 100190, China

ARTICLE INFO

Communicated by Bo Shen

Keywords:
Biologically inspired model
SAR target configuration recognition
Episodic features
Semantic features
Aspect angle estimation

ABSTRACT

How to extract proper features is very important for synthetic aperture radar (SAR) target configuration recognition. However, most feature extraction methods are hand-designed and usually cannot achieve a satisfactory performance. In this paper, we propose a novel method based on the biologically inspired model to extract features automatically from limited data. Specifically, we learn episodic features (containing the key components and their spatial relations) and semantic features (i.e., semantic descriptions of the key components), which are two important types of features for the human cognition process. Episode features are learned through a deep neural network (DNN), and then semantic geometric features of the key components are defined. Moreover, SAR images are very sensitive to aspect angles. Therefore, we use episode features to estimate the aspect angles of testing samples for the final recognition. This paper is a preliminary study, and the preliminary experimental results on the moving and stationary target automatic recognition (MSTAR) database demonstrate the effectiveness of the proposed method.

1. Introduction

Synthetic aperture radar (SAR) can produce high-resolution images in all weather conditions and at all times [1,2]; therefore, it plays an important role in civil and military uses. Conventional SAR target recognition recognizes the target type, but one type may consist of a few different configurations. In this paper, we aim to recognize targets belonging to the same type but with different configurations [3,4]. Based on the standard terminology in [2–4], "configuration" means "serial number". Targets of the same type with different configurations are considered as variants. Therefore, SAR target configuration recognition is able to provide much more detailed information about the targets than SAR target type recognition and thus is widely used in military and civil fields [1–4].

There exist mainly three types of techniques for SAR target recognition, namely: template matching methods, model-based methods, and machine learning methods. Template matching methods [1] usually utilize the mean square error (MSE) criterion to classify the target. These methods can achieve satisfactory results if there are enough samples in the template database; otherwise, their performance degrades greatly [5]. Model-based methods can help solve this problem of template matching methods [6]. These methods mainly contain three steps: 1) first, generate some hypotheses about the target class and pose; 2) then, use a computer-aided design model and electromagnetic simulation software to obtain SAR images of the targets under those hypotheses; 3) finally, compare the obtained features with those of the actual SAR images.

Recently, machine learning methods have been widely applied to SAR target recognition and have achieved satisfactory results. These methods contain two main steps: feature extraction and classifier design. Many commonly-used classifiers have been applied to SAR target recognition, such as neural networks [7], support vector machines [8] and AdaBoost [2]. This paper focuses on feature extraction, which is very important for the final recognition. Various feature extraction methods have been proposed in previous studies. The gray values of SAR images were commonly used as features for SAR target recognition [7,8,2]. Moreover, the magnitudes of the Discrete Fourier Transform coefficients of the SAR images were adopted as the features in [2]. The scattering center features obtained from the global scattering center model were utilized in [9]. Eleven discriminative features were proposed in [10]. A linear dimensionality reduction method, locality preserving projection (LPP), was used to extract features of SAR images in [3]. A multi-linear dimensionality reduction approach based on the tensor global and local discriminant embedding was proposed in [11]. In addition, SAR images are very sensitive to aspect angles, which can affect the performance of target recognition. In several

http://dx.doi.org/10.1016/j.neucom.2016.12.054
Received 23 September 2016; Received in revised form 7 November 2016; Accepted 20 December 2016

⁎ Corresponding author.
E-mail addresses: [email protected] (X. Huang), [email protected] (X. Nie), [email protected] (W. Wu), [email protected] (H. Qiao), [email protected] (B. Zhang).

Neurocomputing 234 (2017) 185–191

Available online 28 December 2016
0925-2312/ © 2017 Elsevier B.V. All rights reserved.



studies [2,12], the aspect angles were first estimated before target recognition, which helped to improve the recognition accuracy. In conclusion, most previous feature extraction methods are hand-designed and usually cannot achieve satisfactory results.

On the other hand, with the development of interdisciplinary studies between neuroscience and information science, and based on the circuit model of the visual cortex proposed by Hubel and Wiesel [13] and other biological studies, researchers have proposed many computational visual models for different visual tasks, such as the saliency-based visual attention model [14], the neocognitron model [15] and the hierarchical max-pooling model (HMAX) [16]. The HMAX model is a hierarchical model of the visual cortex which can produce position- and scale-invariant features by alternating between template matching and a max-pooling operation. Based on the HMAX model, Qiao et al. [17] recently proposed a framework to mimic the recognition and learning process of the human visual cortex. Since episodic features (containing the key components and their spatial relations) and semantic features (i.e., semantic descriptions of the key components) are two important types of features for the human cognition process, the biologically inspired model in [17] learned episode and semantic features for object recognition. Specifically, a deep neural network (DNN) model was used to learn episodic features, that is, the key components and their spatial relations. Then various types of semantic geometric features were defined after finding the contour of each key component using an edge detection method. This model has achieved a good performance for face recognition.

Therefore, we attempt to use the biologically inspired model [17] to learn episodic and semantic features of SAR images automatically from limited training data. Firstly, episodic features of SAR images (i.e., the two key components: the target and its shadow) are learned through the DNN. The DNN is an advanced feature extraction method which has achieved success in many optical image tasks, but it often needs large amounts of training data; however, the number of SAR images of specific targets is often limited. Thus, the features extracted by the DNN are not directly used for the final classification. In addition, because of the sensitivity of SAR images to aspect angles, the episode features are utilized to estimate the aspect angles of the testing samples. Next, for each test sample, we select those training samples whose aspect angles are near the aspect angle of this test sample for the final recognition, to alleviate the influence of the aspect angle. Finally, the semantic features of the key components are obtained for classification; that is, a test sample is classified by simply comparing its semantic features with those of the chosen training samples. Experimental results on the moving and stationary target automatic recognition (MSTAR) database demonstrate that the proposed method can achieve a higher recognition accuracy than the other approaches compared.

The rest of this paper is organized as follows. In Section 2, we give a brief introduction to the biologically inspired model for visual cognition. Section 3 presents the proposed feature extraction method for SAR target configuration recognition. Experimental results on the MSTAR database are shown in Section 4. Conclusions are given in Section 5.

2. Related works

In the early 1970s, Tulving [18] proposed that declarative memory can be divided into episodic memory and semantic memory. Episodic memory relies mainly on the hippocampus, while semantic memory relies mainly on the underlying cortices of the medial temporal lobe [19]. Moreover, some research reveals that semantic memory is generally derived from episodic memory [20]. Therefore, episodic and semantic features should be extracted for the later memory formation process in human cognition.

In order to mimic the recognition and learning process of the human visual cortex, Qiao et al. [17] proposed a framework for the cognition of a special class of objects. In this framework, episodic features and semantic features are learned for the higher-level cognition of an object. The framework is evaluated on face recognition, where the corresponding key components are the two eyes, the nose and the mouth. To learn episodic features, the convolutional deep belief network (CDBN) is exploited. The learned key components can be shown by weight visualization, and the activations in the feature maps indicate the regions of each component and their spatial relations. Based on the episode features, the contour of each key component is found using an edge detection method. Then four types of semantic geometric features of the key components (i.e., shape, area, curvature and ratio) are defined.

We now give a brief introduction to CDBN. The CDBN model was proposed in [21], based on restricted Boltzmann machines (RBM) and deep belief networks (DBN). It consists of a few max-pooling CRBMs stacked on top of one another; that is, the output of one max-pooling CRBM is used as the input of the next. Here, CRBM means convolutional RBM.

A convolutional RBM with probabilistic max-pooling contains three layers: a visible layer V, a hidden layer H and a pooling layer P, as shown in Fig. 1. The input layer, i.e., the visible layer V, consists of N_V × N_V units, each denoted as v_{i,j}. The hidden layer H consists of K groups of feature maps, denoted as {H^k, k = 1, 2, …, K}. Each feature map has N_H × N_H units, so there are N_H² K hidden units in total, each denoted as h^k_{i,j}. The two layers V and H^k are connected by a filter W^k (a convolutional kernel) of size N_W × N_W. The weights are shared across all the hidden units of layer H^k; therefore, N_H = N_V − N_W + 1. Moreover, the units of layer H^k share a bias b_k, and all units of layer V share a bias c. The pooling layer P also has K groups of feature maps, i.e., {P^k, k = 1, 2, …, K}, each of size N_P × N_P. We use B_α to denote all units in a C × C block α of H^k. A unit p^k_α of P^k is obtained by max-pooling all units in B_α; therefore, N_P = N_H / C.
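The layer-size relations above can be checked with a small sketch (not from the paper's code; the function name is ours):

```python
# Sketch: the CRBM layer-size relations N_H = N_V - N_W + 1 (valid
# convolution) and N_P = N_H / C (non-overlapping C x C max-pooling).

def crbm_layer_sizes(n_v: int, n_w: int, c: int):
    """Return (N_H, N_P) for an N_V x N_V visible layer, an N_W x N_W
    filter and C x C pooling blocks."""
    n_h = n_v - n_w + 1              # hidden feature-map side length
    assert n_h % c == 0, "hidden map must tile into C x C pooling blocks"
    return n_h, n_h // c             # (N_H, N_P)

# With the settings used later in Section 4.1 (23x23 patches,
# 12x12 kernels, 2x2 pooling blocks): N_H = 12, N_P = 6.
print(crbm_layer_sizes(23, 12, 2))   # -> (12, 6)
```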

The CRBM is an energy-based model. For a real-valued input layer V and a binary hidden layer H, the energy of a state (v, h) is defined by

E(v, h) = − Σ_k Σ_{i,j} h^k_{i,j} (W̃^k ∗ v)_{i,j} − Σ_k b_k Σ_{i,j} h^k_{i,j} − c Σ_{i,j} v_{i,j} + (1/2) Σ_{i,j} v²_{i,j},
s.t. Σ_{(i,j)∈B_α} h^k_{i,j} ≤ 1, ∀ k, α, (1)

where W̃^k denotes the filter W^k flipped horizontally and vertically, and ∗ denotes the convolution operation. Then the probability that the model stays in state (v, h) is defined by

P(v, h) = (1/Z) exp(−E(v, h)), (2)

where Z is a normalization constant. Given the visible layer V, we can sample the hidden layer H and the pooling layer P with the following conditional probabilities:

P(h^k_{i,j} = 1 | v) = exp(I(h^k_{i,j})) / (1 + Σ_{(i′,j′)∈B_α} exp(I(h^k_{i′,j′}))), (3)

Fig. 1. Convolutional RBM with probabilistic max-pooling, which contains three layers: a visible layer V, a hidden layer H and a pooling layer P.



P(p^k_α = 0 | v) = 1 / (1 + Σ_{(i′,j′)∈B_α} exp(I(h^k_{i′,j′}))), (4)

where

I(h^k_{i,j}) = b_k + (W̃^k ∗ v)_{i,j}. (5)

In the same way, given the hidden layer H, we can sample the visible layer V with the conditional probability

P(v_{i,j} = 1 | h) = σ( (Σ_k W^k ∗ H^k)_{i,j} + c ), (6)

where σ is the sigmoid function, i.e., σ(x) = 1/(1 + e^{−x}). More details of CDBN can be found in [21].
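The conditionals in Eqs. (3)–(5) can be sketched for a single filter W^k with non-overlapping pooling blocks; this is an illustrative NumPy sketch with toy sizes, not the authors' implementation:

```python
import numpy as np

# Sketch of the CRBM conditionals (3)-(5): within each C x C block,
# the hidden units and the "all off" state form a softmax, so the
# block probabilities plus the pooling-off probability sum to one.

def valid_corr(v, w):
    """Valid cross-correlation of v with w, i.e. (W~ * v) with the flipped filter."""
    nh = v.shape[0] - w.shape[0] + 1
    out = np.empty((nh, nh))
    for i in range(nh):
        for j in range(nh):
            out[i, j] = np.sum(v[i:i + w.shape[0], j:j + w.shape[1]] * w)
    return out

def pooling_probs(v, w, b, c=2):
    """Return P(h^k_{i,j}=1 | v) and P(p^k_alpha=0 | v) per Eqs. (3)-(5)."""
    expo = np.exp(b + valid_corr(v, w))      # exp(I(h^k_{i,j}))
    nh = expo.shape[0]
    p_h = np.empty_like(expo)
    p_pool_off = np.empty((nh // c, nh // c))
    for i0 in range(0, nh, c):
        for j0 in range(0, nh, c):
            block = expo[i0:i0 + c, j0:j0 + c]
            denom = 1.0 + block.sum()        # softmax over block + "off" state
            p_h[i0:i0 + c, j0:j0 + c] = block / denom
            p_pool_off[i0 // c, j0 // c] = 1.0 / denom
    return p_h, p_pool_off

rng = np.random.default_rng(0)
p_h, p_off = pooling_probs(rng.normal(size=(6, 6)),
                           0.1 * rng.normal(size=(3, 3)), b=0.0)
# Each 2x2 block of p_h plus the matching entry of p_off sums to one.
```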

CDBN has been widely applied to visual recognition tasks, such as face recognition, handwritten digit classification and other object recognition [21]. It can also be used for the unsupervised learning of object parts [21]. In [17], two max-pooling CRBMs are stacked to learn the four key components of a human face.

3. The proposed method

It has been demonstrated that episodic features and semantic features are two important types of features for the human cognition process. We propose a method based on the biologically inspired model of [17] to learn episode and semantic features of SAR images for target configuration recognition. The proposed method mainly contains four steps: 1) learning episode features, 2) estimating aspect angles, 3) learning semantic features and 4) classification, as shown in Fig. 2. We describe these four steps in detail below.

3.1. Learning episode features

From the perspective of the human vision system, it is obvious that each SAR image mainly contains two parts, that is, the target and its shadow, as shown in the top row of Fig. 3. Therefore, we aim to learn these two parts, and we apply the CDBN to do so. Here, we use only one max-pooling CRBM. Given the training samples X_i, i = 1, 2, …, n, where n is the number of training samples, the corresponding aspect angles are denoted as α_i, i = 1, 2, …, n, with 0° < α_i ≤ 360°.

By randomly sampling from all training samples, we obtain a number of patches which are used as the input. For X_i, we obtain the output layer P_{X_i}, which has K groups. Observing these K feature maps, we find that two of them show information about the target part and the shadow part, respectively, as seen in the second row of Fig. 3. The feature map in which (approximately) only the location of the target part is activated is denoted as P^{ta}_{X_i}; the feature map in which (approximately) only the location of the shadow part is not activated is denoted as P^{sh}_{X_i}. Since both feature maps are binary (activated or not activated), they can be combined in a simple way as follows:

E_{X_i} = 1 − P^{ta}_{X_i} + P^{sh}_{X_i}, i = 1, 2, …, n, (7)

where E_{X_i} is of the same size as P^{ta}_{X_i} and P^{sh}_{X_i}. We use E_{X_i} as the episode features of X_i.

Meanwhile, the two main parts of the testing samples can also be obtained with the trained CDBN. We denote the testing samples as Y_j, j = 1, 2, …, m, where m is the number of testing samples. The two feature maps corresponding to the target and shadow parts of Y_j are denoted as P^{ta}_{Y_j} and P^{sh}_{Y_j}. Therefore, the episode features of the testing samples can be computed as follows:

E_{Y_j} = 1 − P^{ta}_{Y_j} + P^{sh}_{Y_j}, j = 1, 2, …, m. (8)

Thus the episode features of Y_j are E_{Y_j}.
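As an illustration, the combination in Eqs. (7) and (8) can be sketched with toy binary maps (hypothetical values, not real CRBM output); under this combination the target and shadow pixels map to one value and the background to another:

```python
import numpy as np

# Sketch of Eqs. (7)/(8): P^ta has the target location activated,
# P^sh has the shadow location NOT activated (toy 2x2 maps).

def episode_features(p_ta: np.ndarray, p_sh: np.ndarray) -> np.ndarray:
    """Combine the two binary pooling maps: E = 1 - P^ta + P^sh."""
    return 1 - p_ta + p_sh

p_ta = np.array([[0, 1], [0, 0]])   # target in the top-right pixel
p_sh = np.array([[1, 1], [0, 1]])   # shadow in the bottom-left pixel (inactive)
print(episode_features(p_ta, p_sh))
# target pixel -> 1, shadow pixel -> 1, background -> 2
```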

3.2. Estimating aspect angles

SAR images are very sensitive to aspect angles, so varying aspect angles will influence the performance of target configuration recognition. It can be observed that SAR images from one class with very different aspect angles appear very different, while SAR images from different classes with similar aspect angles look similar, as seen in Fig. 4. Therefore, samples from one class but with very different aspect angles are easily misclassified as different classes, while samples from different classes but with similar aspect angles are likely to be misclassified as one class. Certainly, SAR images from one class with similar aspect angles are very similar, as shown in Fig. 5.

Due to these facts, to alleviate the influence of aspect angles, we need to find those training samples whose aspect angles are close to that of a testing sample before classifying it. Therefore, we first estimate the aspect angles of the testing samples. Since episode features present the information of the two main parts, we use episode features to estimate the aspect angles of the testing samples with a nearest neighbor (NN) classifier. The distances between the training set and the testing set are:

d^e_{ij} = ‖E_{Y_j} − E_{X_i}‖_F, i = 1, 2, …, n; j = 1, 2, …, m. (9)

For a testing sample Y_j, we find the training sample whose episode features are nearest to E_{Y_j}, that is,

i* = arg min_i d^e_{ij}. (10)

Since E_{Y_j} and E_{X_i} are binary, problem (10) can be solved quickly.

Fig. 2. The process of the proposed method, which mainly contains four steps: 1) learning episode features, 2) estimating aspect angles, 3) learning semantic features and 4) classification.

Fig. 3. The process of learning episode features, where the two figures in the second row (i.e., the two feature maps in the pooling layer) correspond to the shadow and the target.



Then the aspect angle α_{i*} is assigned as the aspect angle of the testing sample Y_j. We denote the estimated aspect angle of Y_j as β_j, that is,

β_j = α_{i*}, j = 1, 2, …, m. (11)

Subsequently, we select the training samples whose aspect angles are close to β_j. An aspect angle interval [β_j − δ, β_j + δ] is set, where δ is a parameter, and we search for the training samples whose aspect angles fall in that interval. The indices of the selected training samples are denoted as a set I_j = {i_1, i_2, …, i_{k_j}}, where k_j is the number of training samples whose aspect angles are in that interval, that is,

α_i ∈ [β_j − δ, β_j + δ], ∀ i ∈ I_j. (12)

Therefore, the subset of the training set prepared to classify Y_j is XS_j = {X_i, i ∈ I_j}.
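The estimation and selection in Eqs. (9)–(12) can be sketched as follows; the data here are hypothetical toy arrays, and the simple interval test ignores wrap-around at 0°/360° for brevity:

```python
import numpy as np

# Sketch of Section 3.2: NN aspect-angle estimation on episode features
# (Eqs. (9)-(10)) followed by interval-based subset selection (Eqs. (11)-(12)).

def estimate_and_select(e_test, e_train, alphas, delta=10.0):
    """Return (beta_j, indices of training samples with angle within delta)."""
    # Eq. (9): Frobenius distances; Eq. (10): nearest training sample.
    dists = [np.linalg.norm(e_test - e, ord="fro") for e in e_train]
    beta = alphas[int(np.argmin(dists))]                 # Eq. (11)
    # Eq. (12): keep training samples inside [beta - delta, beta + delta]
    # (a full treatment would also handle the 0/360 degree wrap-around).
    idx = [i for i, a in enumerate(alphas) if beta - delta <= a <= beta + delta]
    return beta, idx

e_train = [np.zeros((2, 2)), np.ones((2, 2)), np.full((2, 2), 2.0)]
alphas = [5.0, 95.0, 100.0]
beta, idx = estimate_and_select(np.full((2, 2), 1.1), e_train, alphas)
print(beta, idx)   # nearest map has angle 95; angles 95 and 100 fall inside
```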

3.3. Learning semantic features

Based on the episode features, the following four types of semantic features are defined for the final recognition:

• Area: the area of the shadow part is the number of pixels that are not activated in P^{sh}, while the area of the target part is the number of pixels that are activated in P^{ta}; these are denoted by a^{sh} and a^{ta}, respectively.

• Length: the lengths of the longest segments in the target and shadow parts, denoted as l^{ta} and l^{sh}, respectively; the longest segment is generated by linking the two pixels that are farthest from each other in the target or shadow part.

• Slope: the slopes of the longest segments in the target and shadow parts, denoted as s^{ta} and s^{sh}, respectively.

• Gray value: the gray values of an image, denoted as a gray value vector G.

The first three types of features are semantic geometric features of the two main parts and describe them roughly. The gray value contains more detailed information about the target and is therefore also added to describe the sample. These four types of features are then combined to represent a sample, that is, (a^{sh}, a^{ta}, l^{ta}, l^{sh}, s^{ta}, s^{sh}, G^T)^T, where G^T stands for the transpose of the vector G. Therefore, the semantic features of the training and testing sets are denoted as S_{X_i}, i = 1, 2, …, n, and S_{Y_j}, j = 1, 2, …, m, respectively.
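The geometric features above can be sketched for a binary target mask (the shadow part is handled symmetrically); the brute-force farthest-pair search below is an illustration, not necessarily the authors' implementation:

```python
import numpy as np

# Sketch of the Section 3.3 features for one binary mask: area, and the
# length and slope of the longest segment (the farthest pair of pixels),
# plus the flattened gray-value vector G.

def semantic_features(mask: np.ndarray, gray: np.ndarray):
    """Return (area, length, slope, G) for a binary component mask."""
    ys, xs = np.nonzero(mask)
    area = len(xs)                                   # activated-pixel count
    pts = np.stack([xs, ys], axis=1)
    # Longest segment: the two activated pixels farthest from each other.
    d2 = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1)
    i, j = np.unravel_index(np.argmax(d2), d2.shape)
    length = float(np.sqrt(d2[i, j]))
    dx, dy = pts[j] - pts[i]
    slope = float(dy / dx) if dx != 0 else float("inf")
    return area, length, slope, gray.ravel()

mask = np.zeros((5, 5), dtype=int)
mask[0, 0] = mask[1, 1] = mask[4, 3] = 1
area, length, slope, g = semantic_features(mask, np.zeros((5, 5)))
print(area, length, slope)   # 3 pixels; 3-4-5 segment of length 5, slope 4/3
```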

Fig. 4. SAR images from seven classes with varying aspect angles: (20° × i), i = 1, 2, …, 9. Each row corresponds to one class and each column corresponds to one aspect angle. The numbers "1", "2", …, "7" correspond to BTR70 (sn-c71), BMP2 (sn-9563), BMP2 (sn-9566), BMP2 (sn-c21), T72 (sn-132), T72 (sn-812) and T72 (sn-s7).

Fig. 5. Forty SAR images from one class: BTR70 (sn-c71). Each row shows ten SAR images whose aspect angles are close to one angle: 0°, 30°, 60°, 90° or 120°.



3.4. Classification

Based on the above three steps, for an arbitrary testing sample Y_j, its semantic features S_{Y_j} and its corresponding training subset XS_j have been obtained. We then search XS_j for the training sample whose semantic features are nearest to those of the test sample, and assign the label of this training sample to Y_j to achieve recognition.
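This final step can be sketched as a nearest-neighbour search over the pre-selected subset; the feature vectors and configuration labels below are hypothetical:

```python
import numpy as np

# Sketch of the final nearest-neighbour classification of Section 3.4
# over the training subset XS_j (toy semantic feature vectors).

def classify(s_test, subset_feats, subset_labels):
    """Assign the label of the nearest training sample in the subset."""
    dists = [np.linalg.norm(s_test - s) for s in subset_feats]
    return subset_labels[int(np.argmin(dists))]

feats = [np.array([10.0, 5.0]), np.array([12.0, 6.0]), np.array([30.0, 1.0])]
labels = ["sn-132", "sn-812", "sn-s7"]
print(classify(np.array([11.8, 5.9]), feats, labels))  # -> sn-812
```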

4. Experiments

The proposed method is evaluated on the moving and stationary target automatic recognition (MSTAR) public database [5]. This database includes three types of ground vehicles: the armored car BTR70, the tank BMP2, and the tank T72. For the three target types, there are seven configurations: one for BTR70 (sn-c71), three for BMP2 (sn-9563, sn-9566 and sn-c21), and three for T72 (sn-132, sn-812 and sn-s7). For every target configuration, SAR images of size 128×128 are produced over the 360° of aspect angles at two depression angles (17° and 15°). The SAR images at the 17° depression angle are used as the training set, and those at the 15° depression angle are used as the testing set [1]. The detailed information of the training and testing sets is shown in Table 1.

4.1. The process of experiments

For the SAR images, subimages of size 90×90 are first obtained by discarding part of the background and are then equalized via histogram equalization.

1) Learning episode features. To learn episode features, we need to train the max-pooling CRBM. All training samples are divided into 10,000 patches of size 23×23. The size of the convolutional kernel is 12×12, i.e., N_W = 12. We set K = 12, i.e., the hidden layer H and the pooling layer P both have 12 groups of feature maps. In addition, C = 2, i.e., a 2×2 block of the hidden layer corresponds to one unit of the pooling layer by max-pooling. The learned weights between the visible layer and the hidden layer are shown in Fig. 6. We can see that they mainly present features about the bright and dark spots caused by speckle noise. Observing the output of the CRBM, we find that P^3 and P^7 correspond to the target and the shadow, respectively, that is, P^{ta} = P^3 and P^{sh} = P^7. Then the episode features of the training and testing samples can be computed following Eqs. (7) and (8).
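The patch preparation for CRBM training can be sketched as random cropping; the toy images and function name below are illustrative, not the authors' code:

```python
import numpy as np

# Sketch of the training-data preparation in Section 4.1: random 23x23
# patches sampled from the 90x90 preprocessed images (toy blank images).

def sample_patches(images, n_patches=10_000, size=23, seed=0):
    """Randomly crop n_patches size x size patches from a list of images."""
    rng = np.random.default_rng(seed)
    h, w = images[0].shape
    patches = np.empty((n_patches, size, size))
    for t in range(n_patches):
        img = images[rng.integers(len(images))]   # pick a random image
        i = rng.integers(h - size + 1)            # random top-left corner
        j = rng.integers(w - size + 1)
        patches[t] = img[i:i + size, j:j + size]
    return patches

patches = sample_patches([np.zeros((90, 90)) for _ in range(4)], n_patches=100)
print(patches.shape)   # -> (100, 23, 23)
```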

2) Estimating aspect angles. Based on the episode features, we estimate the aspect angles of the testing samples. We set δ = 10°; for each testing sample Y_j, the aspect angle interval is [β_j − 10°, β_j + 10°]. Thus, XS_j is obtained.

3) Learning semantic features. According to the definitions of the semantic features, we compute the area, length, slope and gray value.

4) Classification. The semantic features and the subset of the training set are used for the final classification.

4.2. Experimental results

The proposed method is compared with two feature extraction methods: principal component analysis (PCA) and local discriminant embedding (LDE). PCA is a global, unsupervised dimensionality reduction method, while LDE is a local, supervised one. For PCA and LDE, the feature vector is made up of the gray values and the retained dimensionality is 50. The classification results are shown in Table 2 and the corresponding confusion matrices are given in Fig. 7.

From the experimental results, it can be seen that the proposed method outperforms the other compared methods. LDE performs better than PCA because PCA is an unsupervised method and cannot preserve discriminative information, while LDE is a supervised method that preserves the local discriminative information of the data. Feature extraction is the key factor in target configuration recognition, and the proposed method applies the biologically inspired model to learn "good" features. Since episodic features and semantic features are two important types of features for the human cognition process, we learn episode features and semantic features for the final recognition. Moreover, we exploit episode features to estimate the aspect angles of the testing samples to alleviate the influence of aspect angles. For these reasons, the proposed method performs better than the other methods.

5. Conclusion

This paper exploited the biologically inspired model to automatically learn episode and semantic features of SAR images for target configuration recognition for the first time. First, the CRBM is exploited to extract the two main parts of SAR images and compute the episode features. Then, the episode features are used to estimate the aspect angles of the testing samples approximately and to find the appropriate subset of the training set. Finally, the semantic features are computed based on the episode features to achieve classification over the corresponding subset. Experimental results show that the proposed method gives a better performance than the other compared methods. This paper only demonstrates a preliminary result of applying the biologically inspired model to SAR target configuration recognition. There is still further work to be done based on this paper, such as introducing the memory and association mechanism.

Table 1
Configurations and sizes of the training and testing sets.

    Training set (17°)        Testing set (15°)
    Configuration     Size    Configuration     Size
1   BTR70 (sn-c71)    233     BTR70 (sn-c71)    196
2   BMP2 (sn-9563)    233     BMP2 (sn-9563)    195
3   BMP2 (sn-9566)    232     BMP2 (sn-9566)    196
4   BMP2 (sn-c21)     233     BMP2 (sn-c21)     196
5   T72 (sn-132)      232     T72 (sn-132)      196
6   T72 (sn-812)      231     T72 (sn-812)      195
7   T72 (sn-s7)       228     T72 (sn-s7)       191

Fig. 6. Visualization results of the learned weights W of the CDBN.

Table 2
Total recognition accuracy of the three comparing methods.

                             PCA      LDE      The proposed method
BTR70 (sn-c71)               0.9847   0.9796   0.9847
BMP2 (sn-9563)               0.7692   0.8308   0.9333
BMP2 (sn-9566)               0.8112   0.8418   0.9796
BMP2 (sn-c21)                0.9082   0.9337   0.9184
T72 (sn-132)                 0.9184   0.9541   0.9643
T72 (sn-812)                 0.9641   0.9744   0.9744
T72 (sn-s7)                  0.9319   0.9581   0.9686
Total recognition accuracy   0.8982   0.9245   0.9604
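As a consistency check on Table 2, each total recognition accuracy equals the test-size-weighted average of the per-configuration accuracies, with the test-set sizes taken from Table 1. A short Python sketch (rounding recovers the integer count of correctly classified samples per class):

```python
# Test-set sizes per configuration (Table 1) and per-class accuracies (Table 2).
sizes = [196, 195, 196, 196, 196, 195, 191]
acc = {
    "PCA":      [0.9847, 0.7692, 0.8112, 0.9082, 0.9184, 0.9641, 0.9319],
    "LDE":      [0.9796, 0.8308, 0.8418, 0.9337, 0.9541, 0.9744, 0.9581],
    "Proposed": [0.9847, 0.9333, 0.9796, 0.9184, 0.9643, 0.9744, 0.9686],
}

totals = {}
for method, a in acc.items():
    # Recover the integer number of correct samples per class, then pool.
    correct = sum(round(ai * si) for ai, si in zip(a, sizes))
    totals[method] = round(correct / sum(sizes), 4)

print(totals)
```

This reproduces 0.8982, 0.9245, and 0.9604 for PCA, LDE, and the proposed method, respectively, matching the totals reported in Table 2.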

X. Huang et al. Neurocomputing 234 (2017) 185–191



References

[1] L.M. Novak, G.J. Owirka, The automatic target-recognition system in SAIP, Linc. Lab. J. 10 (2) (1997) 187–202.

[2] Y. Sun, et al., Adaptive boosting for SAR automatic target recognition, IEEE Trans. Aerosp. Electron. Syst. 43 (1) (2007) 112–125.

[3] M. Liu, Y. Wu, P. Zhang, Q. Zhang, Y. Li, M. Li, SAR target configuration recognition using locality preserving property and Gaussian mixture distribution, IEEE Geosci. Remote Sens. Lett. 10 (2) (2013) 268–272.

[4] M. Liu, Y. Wu, W. Zhao, Q. Zhang, Y. Li, M. Li, G. Liao, Dempster-Shafer fusion of multiple sparse representation and statistical property for SAR target configuration recognition, IEEE Geosci. Remote Sens. Lett. 11 (6) (2014) 1106–1110.

[5] T.D. Ross, S.W. Worrell, V.J. Velten, J.C. Mossing, Standard SAR ATR evaluation experiments using the MSTAR public release data set, in: Proceedings SPIE Conference Algorithms SAR Imagery V, 1998, pp. 566–573.

[6] Q.H. Pham, et al., A new end-to-end SAR ATR system, in: Proceedings SPIE Conference Algorithms SAR Imagery VI, 1999, pp. 293–301.

[7] A. Hirose, Complex-Valued Neural Networks: Advances and Applications, Wiley-IEEE Press, Hoboken, NJ, USA, 2013.

[8] M. Bryant, F. Garber, SVM classifier applied to the MSTAR public data set, in: Proceedings SPIE Conference Algorithms SAR Imagery VI, 1999, pp. 355–360.

[9] J.X. Zhou, Z.G. Shi, X. Cheng, Q. Fu, Automatic target recognition of SAR images based on global scattering center model, IEEE Trans. Geosci. Remote Sens. 49 (10) (2011) 3713–3729.

[10] J.I. Park, S.H. Park, K.T. Kim, New discrimination features for SAR automatic target recognition, IEEE Geosci. Remote Sens. Lett. 10 (3) (2013) 476–480.

[11] X. Huang, H. Qiao, B. Zhang, SAR target configuration recognition using tensor global and local discriminant embedding, IEEE Geosci. Remote Sens. Lett. 13 (2) (2016) 222–226.

[12] S. Chen, J. Yang, X. Song, A new method for target aspect estimation in SAR images, in: International Conference Multimedia Technology, 2010, pp. 1–4.

[13] D.H. Hubel, T.N. Wiesel, Receptive fields of single neurones in the cat's striate cortex, J. Physiol. Lond. 148 (3) (1959) 574–591.

[14] L. Itti, C. Koch, E. Niebur, A model of saliency-based visual attention for rapid scene analysis, IEEE Trans. Pattern Anal. Mach. Intell. 20 (11) (1998) 1254–1259.

[15] K. Fukushima, Neocognitron: a hierarchical neural network capable of visual pattern recognition, Neural Netw. 1 (2) (1988) 119–130.

[16] T. Serre, L. Wolf, S. Bileschi, M. Riesenhuber, T. Poggio, Robust object recognition with cortex-like mechanisms, IEEE Trans. Pattern Anal. Mach. Intell. 29 (3) (2007) 411–426.

[17] H. Qiao, Y. Li, F. Feng, X. Xi, W. Wu, Biologically inspired model for visual cognition achieving unsupervised episodic and semantic feature learning, IEEE Trans. Cybernetics ⟨http://dx.doi.org/10.1109/TCYB.2015.2476706⟩.

[18] E. Tulving, Episodic and Semantic Memory, Academic Press, Waltham, MA, USA, 1972.

[19] J.R. Binder, R.H. Desai, W.W. Graves, Where is the semantic system? A critical review and meta-analysis of 120 functional neuroimaging studies, Cerebr. Cortex 19 (12) (2009) 2767–2796.

[20] M. Moscovitch, et al., Functional neuroanatomy of remote episodic, semantic and spatial memory: a unified account based on multiple trace theory, J. Anat. 207 (1) (2005) 35–66.

[21] H. Lee, R. Grosse, R. Ranganath, A.Y. Ng, Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations, in: Proceedings of the 26th International Conference Machine Learning, 2009, pp. 609–616.

Fig. 7. Confusion matrices of the three comparing methods: (a) PCA, (b) LDE, (c) the proposed method. In each confusion matrix, the numbers on the x-axis are the real labels of the samples and those on the y-axis are the predicted labels. The seven target configurations are listed in Table 1.


Xiayuan Huang received the B.Sc. degree in information and computing science from the Minzu University of China, Beijing, China, in 2011, and the Ph.D. degree in applied mathematics from the Institute of Applied Mathematics, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China, in 2016. She is currently a postdoctoral researcher with the State Key Lab of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China.

Xiangli Nie received the B.Sc. degree in mathematics from Shandong University, Jinan, China, in 2010, and the Ph.D. degree in computational mathematics from the Institute of Applied Mathematics, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China, in 2015. She is currently an Assistant Professor with the State Key Lab of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China. Her research interests include (polarimetric) synthetic aperture radar image understanding and online classification.

Wei Wu received the B.Sc. degree in physics and the M.Sc. degree in theoretical physics from Beijing Normal University, Beijing, China, in 2001 and 2004, respectively, and the Ph.D. degree in computational neuroscience from Johann Wolfgang Goethe University, Frankfurt, Germany, in 2008. He is currently a Vice Professor with the State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing.

Hong Qiao received the B.Eng. degree in hydraulics and control and the M.Eng. degree in robotics from Xi'an Jiaotong University, Xi'an, China, the M.Phil. degree in robotics control from the Industrial Control Center, University of Strathclyde, Strathclyde, U.K., and the Ph.D. degree in robotics and artificial intelligence from De Montfort University, Leicester, U.K., in 1995. She was a University Research Fellow with De Montfort University from 1995 to 1997. She was a Research Assistant Professor from 1997 to 2000 and an Assistant Professor from 2000 to 2002 with the Department of Manufacturing Engineering and Engineering Management, City University of Hong Kong, Kowloon, Hong Kong. Since January 2002, she has been a Lecturer with the School of Informatics, University of Manchester, Manchester, U.K. Currently, she is also a Professor with the State Key Lab of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China. She first proposed the concept of "the attractive region in strategy investigation," which she has successfully applied in robot assembly, robot grasping, and part recognition. The work has been reported in Advanced Manufacturing Alert (Wiley, 1999). Her current research interests include information-based strategy investigation, robotics and intelligent agents, animation, machine learning, and pattern recognition.

Dr. Qiao is currently a Member of the Administrative Committee of the IEEE Robotics and Automation Society (RAS), a Member of the IEEE Medal for Environmental and Safety Technologies Committee, and a Member of the Early Career Award Nomination Committee, the Most Active Technical Committee Award Nomination Committee, and the Industrial Activities Board for RAS. She is currently an Associate Editor of the IEEE TRANSACTIONS ON CYBERNETICS and the IEEE TRANSACTIONS ON AUTOMATION SCIENCE AND ENGINEERING, and the Editor-in-Chief of ASSEMBLY AUTOMATION.

Bo Zhang received the B.Sc. degree in mathematics from Shandong University, Jinan, China, the M.Sc. degree in mathematics from Xi'an Jiaotong University, Xi'an, China, and the Ph.D. degree in applied mathematics from the University of Strathclyde, Glasgow, U.K., in 1983, 1985, and 1992, respectively.

From 1985 to 1988, he was a Lecturer with the Department of Mathematics, Xi'an Jiaotong University. He was a Postdoctoral Research Fellow with the Department of Mathematics, Keele University, Keele, U.K., from January 1992 to December 1994, and with the Department of Mathematical Sciences, Brunel University, Uxbridge, U.K., from January 1995 to October 1997. In November 1997, he joined the School of Mathematical and Informational Sciences, Coventry University, Coventry, U.K., as a Senior Lecturer, where he was promoted to Reader in Applied Mathematics in September 2000 and to Professor of Applied Mathematics in September 2003. Currently, he is a Professor with the Institute of Applied Mathematics, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China. He is currently an Associate Editor of the IEEE TRANSACTIONS ON CYBERNETICS and APPLICABLE ANALYSIS. His current research interests include direct and inverse scattering problems, radar and sonar imaging, machine learning, and pattern recognition.
