
Small Feature Recognition of Moving Targets

Andre Sokolnikov, “Visual Solutions & Applications”, South Bend, IN

Keywords: Automatic Target Recognition, moving targets, adaptive modeling, prediction modeling, small feature recognition;

ABSTRACT

This paper presents an approach to the automated recognition of small features of movable targets, including fast-moving objects such as airplanes. Small feature recognition is a challenging problem in two respects: pattern recognition of particular configurations, and of complexes comprising a number of configurations. Specific target details, although well characterized by their features, are often arranged in an elaborate way that makes the recognition task very difficult and welcomes new approaches. On the other hand, the variety of small features is intrinsically linked to the technological development of the identified targets and is unavoidable. Due to the complexity of possible technological designs, feature representation is one of the key issues in optical pattern recognition. A flexible hierarchical prediction modeling is proposed, with application examples.

Introduction

The small feature recognition problem for Automatic Target Recognition (ATR) remains difficult, especially for moving targets. Difficulties arise both from small dimensions and from constantly changing coordinates. Moreover, since the process is nonlinear and non-stationary, modeling the target features is problematic. The initial idea for small feature modeling came from the introduction of modular or hierarchical architectures (Singh, 1992; Dayan and Hinton, 1993, et al.). The main problem for modular or hierarchical structures used for recognition is how to decompose a complex task into simple operations. In particular, our approach is to use linear functions to model a largely nonlinear and non-stationary process. In addition, the notion of a probability field is used for the spatial localization of a small feature moving in space. The model that makes the smallest prediction error is selected for building the architecture, which evolves in a number of steps, each within the probability field.

* [email protected]

Optical Pattern Recognition XXIV, edited by David Casasent, Tien-Hsin Chao, Proc. of SPIE Vol. 8748, 87480M · © 2013 SPIE · CCC code: 0277-786X/13/$18 · doi: 10.1117/12.2016947


1. The hierarchical recognition process

The recognition system uses a model-based learning architecture. The idea that the architecture is based upon implies breaking a complex target image recognition process into multiple domains/modules [1]. Each module has a prediction model and a correction procedure. The output signal of each module is evaluated and specified within preset boundaries. The functions used for recognition are in general nonlinear and non-stationary [2]. In order to overcome these two qualities, each processing module uses the temporal and spatial dimensions for local prediction of small elements of the target image; we thus use a flexible, adaptive prediction process. The resulting architecture employs an interaction of all the local prediction model algorithms [3].

Fig. 1 shows the overall structure of the hierarchical recognition process, which consists of identifying the small features that comprise the resulting target image. The basic idea is to represent a complex task as a combination of processes that are stationary and linear in space and time. The module outputs are evaluated with weight assignments that determine their contribution to the recognition model of the target. A skeletal view of this loop is sketched below.
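As a rough Python sketch of this modular loop (the Module class, the softmax-style weighting, and all names are our own illustrative choices, not the paper's implementation), each module predicts, is scored against the observation, and contributes to a weighted estimate:

```python
import numpy as np

class Module:
    """One local prediction model with a simple correction step (illustrative)."""
    def __init__(self, predict_fn):
        self.predict_fn = predict_fn   # this module's local model
        self.error = 0.0               # running prediction error

    def predict(self, x):
        return self.predict_fn(x)

    def correct(self, x_pred, x_obs):
        # Score how well this module explained the observation.
        self.error = float(np.linalg.norm(x_obs - x_pred))

def recognize_step(modules, x_prev, x_obs):
    """Predict with every module, then weight modules by prediction quality."""
    preds = np.array([m.predict(x_prev) for m in modules])
    for m, p in zip(modules, preds):
        m.correct(p, x_obs)
    errors = np.array([m.error for m in modules])
    weights = np.exp(-errors)          # smaller error -> larger weight
    weights /= weights.sum()
    return weights @ preds             # weighted estimate of the target state

mods = [Module(lambda x, a=a: a * x) for a in (0.8, 1.0, 1.2)]
print(recognize_step(mods, np.array([1.0]), np.array([1.1])))
```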

Figure 1. Schematic diagram of the recognition process. (The diagram shows the target image feeding parallel modules (Module 1, Module 2, ...), each with a parameter $V_i(x)$, a function $\mu_i(x)$, and a weighting stage; the module outputs $\lambda(t)$, $x_i(t)$, and $u(t)$ pass to a prediction unit that handles the obscured values.)

The probability function $P(x(t))$ is nonlinear in general. We present it as a function that depends on a number of temporal components that include noise (Eq. (1.1)):

$$P(x(t) \mid x(t-1), u(t-1)) = F(x(t), x(t-1), u(t-1)); \quad (1.1)$$

where $t = 1, 2, \ldots$ and

$$x(t) = f(x(t), u(t)) + \nu(t), \quad (1.2)$$

where $t \in [0, \infty)$; $x$ and $u$ represent temporal variables; $\nu$ is noise at each value of $x \in [1 \ldots N]$, $u \in [1 \ldots M]$.

The above functions may be stochastic or deterministic, i.e.

- Stochastic: $P(u(t) \mid x(t)) = G(u(t), x(t));$ (1.3)

- Deterministic: $u(t) = g(x(t));$ (1.4)
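A minimal simulation of the state transition (1.2), with an assumed linear stand-in for $f$ and Gaussian noise for $\nu$ (the paper leaves both unspecified):

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x, u):
    # Stand-in transition function; the paper leaves f unspecified.
    return 0.9 * x + 0.1 * u

def step(x, u, sigma=0.05):
    # x(t) = f(x(t), u(t)) + nu(t), Eq. (1.2), with Gaussian nu.
    return f(x, u) + rng.normal(0.0, sigma)

x = 1.0
for t in range(5):
    x = step(x, u=0.5)
    print(t, round(x, 4))
```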

The process (Fig. 1) continues until the prediction value reaches the preset predictability setting. The evaluated prediction model may be defined for the discrete case:

$$V(x(t)) = E\left[\sum_{k=0}^{\infty} \gamma^k \, r(t+k)\right]; \quad (1.5)$$

In the continuous case:

$$V(x(t)) = E\left[\int_0^{\infty} e^{-s/\tau} \, r(t+s) \, ds\right]; \quad (1.6)$$

where $0 \le \gamma \le 1$ and $\tau > 0$ are the parameters for weighting correction, and $s$ is the weighting correction variable.
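For a single sampled reward sequence, the quantities inside the expectations of (1.5) and (1.6) can be computed directly; a sketch, with made-up reward profiles and a truncated integration horizon:

```python
import numpy as np

def discrete_return(rewards, gamma=0.9):
    # Inner sum of Eq. (1.5): sum_k gamma^k r(t+k)
    return sum(gamma**k * r for k, r in enumerate(rewards))

def continuous_return(r, tau=2.0, t_max=50.0, dt=0.01):
    # Inner integral of Eq. (1.6): int_0^inf exp(-s/tau) r(t+s) ds,
    # truncated at t_max and evaluated with a simple Riemann sum.
    s = np.arange(0.0, t_max, dt)
    return np.sum(np.exp(-s / tau) * r(s)) * dt

print(discrete_return([1.0, 0.5, 0.25, 0.0]))    # 1.6525
print(continuous_return(lambda s: np.exp(-s)))   # ~ tau/(tau+1) = 2/3
```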

For the discrete case, the new state is defined as:

$$y(t) = x(t); \quad (1.7)$$

For the continuous case, the temporal derivative is:

$$y(t) = x'(t); \quad (1.8)$$

From Bayes' rule:

$$\lambda_i(t) = P(i \mid y(t)) = \frac{P(i) \, P(y(t) \mid i)}{\sum_{j=1}^{n} P(j) \, P(y(t) \mid j)}, \quad (1.9)$$

where $\lambda_i(t)$ is the resulting signal's probability, $P(i)$ is the probability of choosing the unit $i$, and $P(y(t) \mid i)$ is the probability that the observation $y(t)$ corresponds to the unit $i$. For the discrete case, the probability distribution resulting from the prediction model (based on the previous state) for the new state $\hat{x}(t)$ is calculated on the basis of the probability of the initial state $x(t-1)$. For the modified state (whose parameters have been changed) $u(t-1)$, the probability is:

$$P(\hat{x}(t) \mid x(t-1), u(t-1)) = F_i(\hat{x}(t), x(t-1), u(t-1)), \quad (1.10)$$

where $i = 1, \ldots, n$. As the starting model, a uniform probability distribution may be assumed. In this case, we calculate the probability average of the function in (1.10) over the sum of all present probabilities of $n$ events:

$$\lambda_i(t) = \frac{F_i(\hat{x}(t), x(t-1), u(t-1))}{\sum_{j=1}^{n} F_j(\hat{x}(t), x(t-1), u(t-1))}; \quad (1.11)$$

$F_i(x(\ldots))$, therefore, relates to a new state that contains no previous information about the target. Further, the change of the state is a derivative of the above initial function, i.e.

$$\hat{x}(t) = f_i(x(t), u(t)); \quad (1.12)$$

In the general case, with no known tendencies, the signal from the target is:

$$\lambda_i(t) = \frac{e^{-\|x(t) - \hat{x}_i(t)\|^2 / 2\sigma^2}}{\sum_{j=1}^{n} e^{-\|x(t) - \hat{x}_j(t)\|^2 / 2\sigma^2}}; \quad (1.13)$$

In the above equation, $\sigma^2$ is the variance of the assumed Gaussian distribution.
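Equation (1.13) is effectively a softmax over negative squared prediction errors. A direct numpy transcription (array shapes are our assumption):

```python
import numpy as np

def responsibilities(x, x_hat, sigma=1.0):
    """lambda_i(t) per Eq. (1.13); x: observed state (d,), x_hat: predictions (n, d)."""
    sq_err = np.sum((x - x_hat) ** 2, axis=1)   # ||x(t) - x_hat_i(t)||^2
    w = np.exp(-sq_err / (2.0 * sigma**2))      # unnormalized Gaussian likelihoods
    return w / w.sum()                          # normalize over the n modules

lam = responsibilities(np.array([1.0, 0.0]),
                       np.array([[1.1, 0.0], [0.0, 1.0], [2.0, 2.0]]))
print(lam)   # the module whose prediction is closest dominates
```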

2. Weight assignment [4]

The resulting signals from the previous discussion are evaluated by calculating the prediction probability function:

$$P(x'(t)) = \sum_{i=1}^{n} \lambda_i(t) \, F_i(x'(t), x(t-1), u(t-1)); \quad (2.1)$$

The predicted value of the new state is calculated as:

$$\hat{x}(t) = \sum_{i=1}^{n} \lambda_i(t) \, x_i'(t); \quad (2.2)$$
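Given those responsibilities, (2.2) reduces to a weighted average of the module predictions, e.g.:

```python
import numpy as np

def predicted_state(lam, x_prime):
    # Eq. (2.2): x_hat(t) = sum_i lambda_i(t) x'_i(t)
    return lam @ x_prime   # lam: (n,), x_prime: (n, d) -> (d,)

print(predicted_state(np.array([0.7, 0.2, 0.1]),
                      np.array([[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]])))
```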


Further, the adaptivity function $\alpha(t)$ is defined from the valued prediction model function $V(x(t))$: its value at each point $x$ at time $t$ plus its derivative weighted by $\tau$:

$$\alpha(t) = V(x(t)) + \tau V'(t) \quad (2.3)$$

If the initial states are known, we can define adaptivity predictors. In this case we assume that the probability of selecting the initial modules is (1.9) plus $\tau(t)$, i.e.

$$\Lambda_i(t) = \lambda_i(t) + \tau(t); \quad (2.4)$$

In the case of a transition from one module to another, it is advantageous to assume temporal and spatial continuity.

Temporal continuity: Temporal continuity assumes that the prediction model for the present module is based on the evaluated prediction from the previous mode:

$$\lambda_i'(t) = \delta \, \hat{\lambda}_i(t); \quad (2.5)$$

From the preceding equations, we can write:

$$\hat{\lambda}_i(t) = \frac{1}{z(t)} \prod_{m=0}^{t} P(x(t-m) \mid i)^{\alpha^m}; \quad (2.6)$$

where $0 < \alpha < 1$ determines the degree of dependence between consecutive prediction models and

$$z(t) = \sum_{j=1}^{n} \prod_{m=0}^{t} P(x(t-m) \mid j)^{\alpha^m}; \quad (2.7)$$

For the continuous case, the resulting signal probability is:

$$\hat{\lambda}_i(t) = \lambda_i(t - \Delta t)^m; \quad (2.8)$$

where $\Delta t$ is the increment of time.
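Under our reading of (2.6) and (2.7), each module's responsibility is a normalized product of its recent likelihoods raised to exponentially decaying powers $\alpha^m$. A log-space sketch (the per-module likelihood history is assumed to be stored externally):

```python
import numpy as np

def temporal_responsibilities(lik_history, alpha=0.8):
    """lambda_hat_i(t) per our reading of Eqs. (2.6)-(2.7).

    lik_history: shape (n, T); lik_history[i, m] = P(x(t-m) | i),
    with column 0 the most recent observation.
    """
    T = lik_history.shape[1]
    decay = alpha ** np.arange(T)                    # exponents alpha^m
    log_w = (decay * np.log(lik_history)).sum(axis=1)
    w = np.exp(log_w - log_w.max())                  # stabilized exponentiation
    return w / w.sum()                               # division by z(t), Eq. (2.7)

print(temporal_responsibilities(np.array([[0.9, 0.8, 0.7],
                                          [0.2, 0.6, 0.9]])))
```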

Spatial continuity: Considering space continuous, the resulting signal probability becomes:

$$\lambda_i(t) = \frac{M_i\left(x_i(t) - x_{i-1}(t)\right)}{\sum_{j=1}^{m} M_j\left(x_j(t) - x_{j-1}(t)\right)}; \quad (2.9)$$

where $M_i$ is a covariance matrix that determines the spatial coordinates and point transposition. The matrix parameters reflect the parameters of the weighted input signal.


3. Implementation

Example: Multiple linear quadratic models

Implementing a small-change architecture can cause problems if we try to use universal nonlinear function approximations with large numbers of degrees of freedom: with this approach, we are likely to encounter a situation where one module controls/affects all the others, and then small-change adjustments are not possible. Linear models are still applicable for prediction because (local) linear models are flexible and suitable for generalization. For example, an expression for a local linear dynamic model is:

$$\hat{x}'(t) = A_i(x(t) - x_i^n) + B_i u(t); \quad (3.1)$$

The value function is given as:

$$V_i(x) = m_i - \frac{1}{2}(x - x_i^m)' P_i (x - x_i^m); \quad (3.2)$$

where $m_i = \text{constant} = 0, 1, 2, \ldots$; $P_i(\ast)$ is the veracity matrix;

$P_i(x)$ is found by solving the Riccati equation:

$$\frac{1}{\tau} P_i = Q_i + A_i' P_i + P_i A_i - P_i B_i R_i^{-1} B_i' P_i; \quad (3.3)$$
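Under our reconstruction of (3.3), the $1/\tau$ term can be folded into the drift matrix, after which SciPy's standard continuous-time algebraic Riccati solver applies; the shift $A - I/(2\tau)$ is our algebraic rearrangement, not something stated in the paper:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

def solve_discounted_riccati(A, B, Q, R, tau=10.0):
    # (1/tau) P = Q + A'P + PA - P B R^{-1} B' P   (our reading of Eq. (3.3))
    # is the standard CARE with the drift shifted to A - I/(2 tau):
    # 0 = Q + (A - I/(2tau))'P + P(A - I/(2tau)) - P B R^{-1} B' P
    A_shift = A - np.eye(A.shape[0]) / (2.0 * tau)
    return solve_continuous_are(A_shift, B, Q, R)

A = np.array([[0.0, 1.0], [0.0, 0.0]])   # toy double-integrator dynamics
B = np.array([[0.0], [1.0]])
P = solve_discounted_riccati(A, B, np.eye(2), np.eye(1))
print(P)
```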

The center value $x_i^m$ and the small change $m_i$ of the value function are given by:

$$x_i^m = (Q_i + A_i' P_i)^{-1} \left(Q_i x_i^e + A_i' P_i x_i^n\right); \quad (3.4)$$

$$m_i = \tau r_i^0 - \frac{1}{2}(x_i^m - x_i^e)' Q_i (x_i^m - x_i^e); \quad (3.5)$$

The adaptive output is given by weighting the outputs by the resulting signal probability $\lambda_i(t)$:

$$u(t) = \sum_{i=1}^{n} \lambda_i(t) \, u_i(t); \quad (3.6)$$

The parameters of the local linear models $A_i$, $B_i$, and $x_i^m$ are updated by the weighted prediction expressions for the (assumed) errors:

$$\lambda_i(t)\left(x'(t) - \hat{x}_i(t)\right);$$


In order to update the above models, the Riccati equations may be recalculated.
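One plausible way to realize this weighted update of $A_i$ and $B_i$ is a responsibility-weighted gradient step on the squared prediction error; this is our illustration, not the paper's stated procedure:

```python
import numpy as np

def update_linear_model(A_i, B_i, lam_i, x_prev, u_prev, x_obs, lr=0.1):
    """Nudge one module's (A_i, B_i) along its lambda-weighted prediction error."""
    x_pred = A_i @ x_prev + B_i @ u_prev
    err = lam_i * (x_obs - x_pred)            # lambda_i(t) (x'(t) - x_hat_i(t))
    A_i = A_i + lr * np.outer(err, x_prev)    # gradient step on the squared error
    B_i = B_i + lr * np.outer(err, u_prev)
    return A_i, B_i

A, B = np.eye(2), np.zeros((2, 1))
A, B = update_linear_model(A, B, lam_i=0.9,
                           x_prev=np.array([1.0, 0.0]),
                           u_prev=np.array([0.5]),
                           x_obs=np.array([1.2, 0.1]))
print(A, B, sep="\n")
```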

Example:

Figure 2. Example of value functions and prediction models with respect to probability distribution for small feature identification. (Panel (a) shows the state prediction models, Modules 1 through 4; panel (b) shows the state value functions, Models 1 through 4.)

The position of the small feature (a blank rectangle in Fig. 2) fluctuates, and its position is determined by a state value function. The state prediction models give a field of probability distribution that assigns a probability distribution field to every position of the small feature. The probability field distribution is not constant in time; its function evaluates the precision with which the small feature is localized. The state value function is two-dimensional, and its source of data is the signal from the target. Further, each location of the small feature moves in space, and so do the location's coordinates and the probability distribution field for each location. Thus, the exact location of the small feature, although never exactly determined in the general case, is always pinned by the probability distribution field (see Fig. 3). As the small feature moves in space, the probability distribution for all components involved changes. However, the large feature probability changes much more slowly than the one for the small feature. In general, the small feature movement is a nonlinear, non-stationary problem. Nevertheless, the solution is approximated by several linear prediction models.
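A probability distribution field of this kind can be sketched as a Gaussian bump centered on the predicted feature location, with a broader bump for the large feature; the widths and grid here are arbitrary:

```python
import numpy as np

def gaussian_field(grid_x, grid_y, center, sigma):
    # Probability field pinning a feature near `center` with spread `sigma`.
    d2 = (grid_x - center[0])**2 + (grid_y - center[1])**2
    field = np.exp(-d2 / (2.0 * sigma**2))
    return field / field.sum()

xs, ys = np.meshgrid(np.linspace(0, 10, 101), np.linspace(0, 10, 101))
small = gaussian_field(xs, ys, center=(4.0, 5.0), sigma=0.3)   # tight, fast-moving
large = gaussian_field(xs, ys, center=(4.5, 5.0), sigma=2.0)   # broad, slow-moving
print(small.max(), large.max())   # the small-feature field is far more peaked
```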


Figure 3. The small feature probability space, in X-Y-Z coordinates, is surrounded by the large feature probability space.

Figure 4. Conversion of two prediction models: the small feature probability is shown by the dotted line and the big feature probability below it as a solid line ($\lambda$ from 0.5 to 1.0, plotted over time from 0 to 10 sec).

The linearization of the oscillating part (circled in Fig. 4) on the left is a possible application of the classical nonlinear oscillation equation:

$$\frac{\partial^2 \theta}{\partial t^2} + \sin\theta = 0; \quad (3.7)$$

where $\theta$ is the phase of the elementary oscillation of the first approximation (for the prediction model). The real solution is more complicated than the simple harmonic oscillator function, but in the simple case it may be approximated by a Taylor series. Another linearization would be at $\theta = \pi$, corresponding (by analogy to a pendulum) to the pendulum being in the vertical position, i.e.:



$$\frac{\partial^2 \theta}{\partial t^2} + \pi - \theta = 0; \quad (3.8)$$

since $\sin\theta \approx \pi - \theta$ for $\theta \approx \pi$. Please note that, unlike the small-angle approximation, this approximation is unstable: $\theta$ will usually grow without limit, although bounded solutions are possible. This corresponds to the difficulty of balancing the pendulum in the vertical position (using the pendulum analogy), or of achieving a linear solution for the conversion of the above two prediction models. In a more complicated case, the solution involves hyperbolic sinusoids.

Conclusion:

In this paper, a new approach was introduced to account for small changes in target images. Basing the approach on linear equations helped simplify the calculations, while the hierarchical structure allowed the use of recurrent series to improve the prediction model. The modular structure of the recognition process helps to diminish error accumulation in case the initial measurement was erroneous. The modular structure also allows an independent analysis of different parts of a complex image. Another advantage lies in the possibility of a conversion of several prediction models, creating a multilayer hierarchical structure and resulting in a more sophisticated recognition process. Future work may involve a different choice for the linear approximation, such as hidden Markov chains or time series analysis.

References:

[1] Foroutan, I. and Sklansky, J., "Feature Selection for Automatic Classification of Non-Gaussian Data," IEEE Transactions on Systems, Man, and Cybernetics 17(2), 187-198 (1987).

[2] Duda, R., Hart, P. and Stork, D., [Pattern Classification], 2nd edition, Wiley, New York (2001).

[3] McLachlan, G., [Discriminant Analysis and Statistical Pattern Recognition], Wiley Series in Probability and Statistics, NJ (2004).

[4] Sokolnikov, A., "Time Series Modeling for Automatic Target Recognition," Proc. SPIE 8391 (2012).
