

CONTROL STRATEGIES FOR AN ACTIVE VISION SYSTEM

Kaouthar Benameur

Department of Electrical Engineering

McGill University, Montréal

A Thesis submitted to the Faculty of Graduate Studies and Research in partial fulfilment of the requirements for the degree of

Doctor of Philosophy

National Library of Canada (Bibliothèque nationale du Canada)
Acquisitions and Bibliographic Services
395 Wellington Street, Ottawa ON K1A 0N4, Canada

The author has granted a non-exclusive licence allowing the National Library of Canada to reproduce, loan, distribute or sell copies of this thesis in microform, paper or electronic formats.

The author retains ownership of the copyright in this thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without the author's permission.


Abstract

This dissertation addresses the problem of information gathering with a visual system. We formalize a probabilistic framework for information gathering by defining three components: geometric models, sensor observation models and models for prior information. Geometric models consist mainly of the object model, which describes the structure of the rigid body in a certain fixed coordinate system. Sensor models mathematically describe the mapping between the rigid body coordinates and the statistically corrupted sensor observations. Prior information is encoded in a structured environment assumption, as geometric and dynamic models are assumed known.

Within this framework and using an optimization criterion that relates the task at hand to the information gathering process, we demonstrate that a commonly used method, fixation, is not optimal for the class of active perception. This motivates the development of an adaptive strategy for observations. This strategy accounts for sensor and dynamic model uncertainties as well as the task description.

For the task of recovering the position and velocity of a moving object, the adaptive strategy is based on a procedure to choose sensor actions which yield the best estimation performance for a receding observation horizon. We derive and discuss the optimization approach, then show how to compute it efficiently, and demonstrate some of its properties through simulations. We also show how this method can be extended to include the internal parameters of the visual system, in particular the focus and the zoom.

In conjunction with sensor control, a Bayesian decision approach is considered to deal with the occlusion problem. Occlusion occurs when the underlying noisy measurement process no longer contains information relevant to the task at hand. Within the considered framework, information from the occlusion event is derived and used to update the estimates.

Finally, we describe a modified dual control strategy that coordinates observation control and robotic arm command for the task of intercepting a moving object. This control approach is compared to more classical control approaches. It is shown that the proposed strategy allows a significant improvement in the convergence rate of the end-effector to the desired trajectory without altering the quality of the estimates.

Résumé

This thesis addresses the problem of defining measurement strategies whose main objective is to determine the motion of an object in a given environment. One of the contributions of this study is the introduction of adaptive mechanisms into active vision. These adaptive mechanisms act on both the external parameters of the sensor (position and orientation) and its internal parameters (focus and magnification). A metric for evaluating the performance of the sensor is introduced, which allows visual information to be fed back to the sensor parameters. Unlike other methods, the proposed one adapts easily to different tasks; it also allows easy fusion of information received from other sensors, visual or otherwise. The proposed measurement strategies allow autonomous interaction with a structured environment in which occlusions are predictable. The whole problem is then modelled in a statistical framework based on optimal estimation techniques, in which the probability of occlusion is computed and the expression of the information derived from the occlusion is included in the estimates.

To conclude, the use of dynamic visual information in the control loops of a manipulator whose task is to grasp an object attached to a pendulum is presented in a dual estimation-control framework.

Acknowledgements

I thank my thesis advisor, Dr. Pierre R. Bélanger, for his ideas and guidance. During my stay at McGill, he helped me discover the joys of research and the necessity for clear and positive presentation.

I would also like to thank my fellow students at the Centre for Intelligent Machines for many interesting and enjoyable discussions which covered a broad range of topics except women in engineering.

The support of this research by the Natural Sciences and Engineering Research Council of Canada (NSERC) and the Tunisian Government is gratefully acknowledged.

Most of all, I would like to thank my parents, my brothers, Hichem and Kameleddine, and my sister Hend, whose love and support over the years have brought joy to my daily endeavors and made the completion of this thesis possible.

Claims of Originality

This research presents new results regarding automated camera control and the visual servoing of a robot manipulator. To the best of my knowledge, the following are original contributions:

• A moving horizon observer is considered for recovering the position and the velocity of the target object. The estimates are obtained by minimizing, or approximately minimizing, the trace-dependent cost function over the observation interval.
• The suboptimal approach to the control of the external parameters of the visual system is considered to be more general than the fixation approach.
• Including internal parameters in addition to the external ones, the considered approach involves the integration of tracking, focusing and zooming by means of optimal estimation and prediction, which is a challenging problem in dynamic vision systems.
• A probabilistic algorithm, based on information derived from the occlusion event, is introduced to improve the performance of the visual system in occlusion cases.
• An adaptive control approach is presented. It allows an interaction between active vision and visual servoing in an interception problem.

TABLE OF CONTENTS

Résumé
Acknowledgements
LIST OF FIGURES
CHAPTER 1. Introduction
  1. Research objectives
  2. The concept of active vision
    2.1. Definitions
  3. Gaze Control
    3.1. Gaze stabilization
    3.2. Gaze change
  4. State of the art
  5. Overview of the Thesis
CHAPTER 2. Problem Statement and optimization approach
  1. Modelling
    1.1. Introduction
    1.2. Geometric Model
  2. Bayesian Framework
    2.1. Dynamic Model
  3. Imaging Model
  4. Recursive State Estimation
  5. Task Model
  6. Optimization Approach
  7. Reformulation of the Optimization Problem
  8. Simulation Example
  9. Constrained Optimization
  10. Simulations
    10.1. First Example
    10.2. Second Example
    10.3. Third Example
    10.4. Fourth Example
  11. Conclusion
CHAPTER 3. Variations of Optical Parameters
  1. Analysis of Defocus
    1.1. Definition and Assumptions
    1.2. Evaluation
  2. Varifocal lens
    2.1. Derivation and properties of the equivalent lens
    2.2. Definitions
    2.3. Radius of blur: varifocal lens
  3. Optimization Approach
  4. Simulations
    4.1. Simulations Results
    4.2. Second Example
    4.3. Third Example
  5. Conclusion
CHAPTER 4. The Occlusion Problem
  1. Introduction
  2. Description of the Problem
  3. Filtering: Single Measurement Problem
  4. Information from the Occlusion Case
  5. Modified Nonlinear Estimation: Occlusion Information
  6. Description of the Control Strategy
    6.1. First Approach
    6.2. Second Approach
  7. Simulations
    7.1. Filtering
    7.2. Observation strategies
    7.3. Second example
  8. Gaze Change Strategy
  9. Conclusion
CHAPTER 5. Grasping of a Moving Object with a Robotic Hand-Eye System
  1. Introduction
  2. Problem Description
    2.1. Setup Description
    2.2. Modelling
  3. Platform and Robot Control
    3.1. Open-Loop Approach
    3.2. Closed-Loop Approach
    3.3. The Dual Approach
  4. Simulations
    4.1. Simulation Parameters
    4.2. Simulations Results
  5. Conclusion
CHAPTER 6. Conclusions and Future Research
  1. Future Research
    1.1. Sensing extension
    1.2. Task extension
    1.3. Cooperation
APPENDIX A. Matrix method in paraxial optics
  1. Ray-transfer matrices
  2. The translation matrix T
  3. The refraction matrix R
  4. Thin lens approximation
  5. Cardinal points of an optical system
APPENDIX B. Modified nonlinear estimation
  0.1. Derivation of Po(k + 1)
  1. Simulations
APPENDIX C. The closed-loop optimization
APPENDIX D. Observability

LIST OF FIGURES

Geometry of the camera pinhole model
Performances of the extended Kalman filter for T = 0.2s, t_u = 0.1s and R = diag[10 10]
Position of the particle in the image plane and trajectory of the visual system: R = diag[10 10], t_u = 0.1s, T = 0.2s
Double integrator example: position of the particle in the image frame and trace of the state error covariance matrix, R = diag[10 10], t_u = 0.1s, T = 0.5s
Trajectory of the particle in the image plane, T = 0.2s and t_u = 0.1s: R = diag[1 1] (dashed), R = diag[10 10] (solid)
Trajectory of the camera, T = 0.2s and t_u = 0.1s: R = diag[1 1] (dashed), R = diag[10 10] (solid)
Trajectory of the particle in the image plane for R = diag[10 10], t_u = 0.1s: T = 0.2s (dashed) and T = 0.5s (solid)
Trajectory of the particle in the image plane for R = diag[10 10], t_u = 0.1s: a_i = 1 (dashed) and a_i = 10 (solid)
Performances of the extended Kalman filter: trace of the state error covariance matrix, R = diag[10 10]: a_i = 10, T = 0.5s and t_u = 0.1s (solid); a_i = 1, T = 0.2s and t_u = 0.1s (dashed)
Trajectory of the particle in the image plane for R = diag[10 10]: a_i = 10, T = 0.5s and t_u = 0.1s (solid); a_i = 1, T = 0.2s and t_u = 0.1s (dashed)
Performances of the extended Kalman filter for R = diag[10 10], a_i = 10, T = 0.5s and t_u = 0.1s
Trajectories of the particle in the image plane for T = 0.5s and for different update times: t_u = 0.05 (dashed), t_u = 0.1 (dashdot) and t_u = 0.2 (solid)
Trajectory coordinates of the particle in the image frame for T = 1s and t_u = 0.1s: with respect to the y axis (dotted), with respect to the x axis (solid)
Orientation of the visual system with respect to the z_i axis: T = 1s and t_u = 0.1s
Performances of the Kalman filter: T = 1s and t_u = 0.1s
Performances of the Kalman filter: limited image plane (60x60), T = 1s and t_u = 0.1s
Trace of the state error covariance matrix
Trajectory coordinates of the particle in the image plane: with respect to the y axis (dotted), with respect to the x axis (solid)
Trajectories of the visual system with respect to the x_i axis (dashdot) and the y_i axis (solid)
Trajectory coordinates of the particle in the image plane: with respect to the y axis (dotted), with respect to the x axis (solid)
Trajectories of the visual system with respect to the x_i axis (dashdot) and the y_i axis (solid)
Displacement of the detector plane causes blurring: a point feature at infinity will be imaged as a point on the focal plane. Blurring results if the detector plane and focal plane do not coincide
Simulation results for an oscillating particle parallel to the optic axis at a "far" distance from the lens
Simulation results for an oscillating particle parallel to the optic axis at a "close" distance from the lens
Simulation results for an oscillating particle parallel to the optic axis at a "close" distance from the lens for a different focal length
The two-component zoom system
Geometry defining the radius of the confusion circle
Performances of the Kalman filter: time varying internal parameters
Position of the particle in the image plane: time varying internal parameters
Trajectory of the particle in the image plane
Performances of the Extended Kalman filter: time varying external and internal parameters
Orientation of the camera with respect to the z_i axis
Variations of the optical parameters
Blurring radius difference: two features in the image plane
Performance of the Kalman filter: log of the trace of the state error covariance matrix for the example of two features with the same dynamics
Blurring radius difference: two particles with different dynamics
Variations of the optical parameters and performances of the extended Kalman filter: real positions (solid), estimated positions (dashed)
The configuration of the spinning target with respect to the observer
Real trajectory of the object and performances of the Kalman filter: probabilistic approach, Po = diag[0.1 0.1]
Extended Kalman filter performances: error in the trajectory between the real orientation and the estimated one for Po = diag[1 1]: classical Kalman filter (dashed) and probabilistic approach (solid)
Extended Kalman filter performances: error in the trajectory between the real state and the estimated one for Po = diag[10 10]: classical Kalman filter (dashed) and probabilistic approach (solid)
Trajectory of the visual system for different strategies with an observation horizon N = 50, Po = diag[1 1] and Q = [0 0; 0 0.1]; NOIFC (dashdot) and OIFC (solid)
Performances of the Kalman filter for different strategies with an observation horizon N = 50, Po = diag[1 1] and Q = [0 0; 0 0.1]; NOIFC (dashdot) and OIFC (solid)
Trajectory of the visual system for different strategies with an observation horizon N = 50, Po = diag[0.1 0.1] and Q = [0 0; 0 0.01]; NOIFC (dashdot), OIFNC (dashed) and OIFC (solid)
Trajectory of the visual system for different strategies with an observation horizon N = 50, Po = diag[1 1] and Q = [0 0; 0 0.01]; NOIFC (dashdot), OIFNC (dashed) and OIFC (solid)
Trajectory of the visual system for the OIFC strategy, Po = diag[1 1]: different observation horizons, N = 20 (dashdot) and N = 50 (solid)
Extended Kalman filter performances: trace of the state error covariance: neither occlusion information nor observation strategy
Extended Kalman filter performances: state estimates: "servo" approach (dashed) and optimal control (solid)
Trajectory of the camera for a translating and spinning target: "servo" approach (dashed) and optimal control (solid)
Performance of the Extended Kalman filter: log of trace of the state error covariance matrix: N = 50
Trajectory of the visual system: gaze change approach: N = 50
Position-based visual servoing
Image-based visual servoing
Setup description
Variables for the pendulum
Geometry of the platform-borne camera
The Closed-loop Control Approach
States and parameters estimation errors for the grasping task with dual adaptive control (solid), open-loop control (dashed) and closed-loop control (dashdot)
States and parameters errors for the grasping task: closed-loop control, threshold equals 0.1 (solid) and threshold equals 0.001 (dashdot)
Error between the target trajectory and the robot end-effector trajectory in the grasping process: closed-loop control, threshold equals 0.001 (dashdot) and threshold equals 0.1 (solid)
States and parameters errors for the grasping task with dual adaptive control, N = 20 (solid), N = 100 (dashed)
Error between the target trajectory and the robot end-effector trajectory in the grasping process with dual adaptive control: N = 20 (solid), N = 100 (dashed)
Error between the target trajectory and the robot end-effector trajectory in the grasping process: adaptive control (solid), open-loop control (dashed), and closed-loop control (dashdot)
Error between the target trajectory and the robot end-effector trajectory in the grasping process: adaptive control (solid), and closed-loop control (dashdot)
Error between the target trajectory and the robot end-effector trajectory in the grasping process: dual adaptive control (solid), and closed-loop control (dashdot)
Translation of a ray by a distance t to the right between two reference planes
Refraction of a ray between two surfaces of refractive index n_1 and n_2
Real object trajectory and filter performances: first probabilistic approach (dashed) and second probabilistic approach (solid)
B.2 O(·): the probability of occlusion of the tracked feature

CHAPTER 1

Introduction

It is well known that every creature conditions its actions on the perceived state of the environment. The simplest interaction is carried out in an open-loop manner, as a direct response to a "stimulus". As the scale of complexity rises, an element of "intent" enters the process. Some creatures use action to favourably alter the relationship between themselves and the environment. With this increase in alternatives comes an associated increase in the information needed to make informed, intelligent choices. Eventually situations arise where the most appropriate action is more informative than performative. That is, the creature is engaged in purposeful, dynamic information gathering activities.

Robot systems are still at the level of a uni-directional sensor-action relationship. Nearly all current and past robots are constructed without the ability to gather or evaluate information dynamically. Without this ability to actively sense their environments, robots are restricted to situations where the environment, tasks and sensing systems are pre-engineered so that the relevant information is available at the appropriate time. If this is not the case and information is incomplete or imprecise, the robot may become inconsistent and enter an error state.

1. Research objectives

Operating in an inherently uncertain environment, with increasing task and environment complexity, requires sensor systems which can dynamically interpret observations of the environment in terms of the task to be performed, in minimum time. Unlike other sensing devices, vision mimics the human sense of sight and allows for non-contact measurements of the environment, which makes it a useful robotic sensor. It seems clear then that the ability to actively gather information based on perception is central to the enterprise of building intelligent, sensor-based robots.

Building an intelligent sensor poses several problems. The construction of such a system involves sensing, decision making and acting. Robotics has worked with models of position and basic spatial relationships; perception has developed a number of feature-based or position-based representations. For these components to interact in a suitable fashion, we must establish clear relationships among the models. One requirement is to develop a framework for sensor interaction which provides criteria for making intelligent choices about what to observe and how to observe it. Certainly, the goals of the task should influence the choice of the observations.

Establishing models and criteria for integrating sensing and actuation into an intelligent robot control system is the subject of this thesis. Within this framework, we address the problem of recovering the 3D orientation and motion of a tracked target from a sequence of monocular images. We are concerned with issues specifically related to the coordination of visual information gathering, a task in itself, and the achievement of the task at hand. In this coordination we seek good estimator performance as well as a short time to complete the task. To focus on this concrete problem of sensor interaction and control, we consider these issues in the more general context of active vision.

2. The concept of active vision

2.1. Definitions. In machine vision, we can distinguish three main categories:

• Passive vision
• Dynamic vision
• Active vision

Passive vision is the study of static, off-line information gathered from a visual system. It includes scene analysis and stereo vision from two or more views. Dynamic vision relates to the study of a time-varying sequence of views. Active vision relates to the feedback of the information gained from visual sources to affect the perception process [58, 70, 20, 2].

The signal obtained from a camera system contains more information than is required by many applications. This gives us far more data than can be analyzed in real time, much of it redundant. Although active vision techniques include those applied to passive and dynamic vision processes, they are adaptive to the system requirements and to the environment of the visual system. Taking this one step further, these techniques can then be made fully adaptive, such that system parameters are allowed to readjust after the initial calibration as a function of the gathered information [29].

It follows that active vision is defined as active or adaptive control of camera parameters. This adaptive control leads to more efficient processing of the digital image, which in turn enables feedback to the camera system. More broadly, active vision encompasses attention, that is, selective sensing in space, resolution and time, whether it is achieved by adjusting physical camera parameters or by changing the way data is processed after leaving the camera. Important areas in active vision include:

• Attention: processing of regions in a selective manner such that location, motion or depth is reduced. Attentional aspects are complex and have a direct bearing on early vision.
• Foveal sensing: mainly oriented to the development of specific visual sensors with high resolution at the center and lower resolution at the periphery.
• Gaze control: manipulation of the camera parameters such that the images acquired are specific to the vision task. This can include stabilization of the image, augmenting the field of view and range estimation. Gaze control can be divided into two main classes: gaze stabilization and gaze change.
• Hand-eye coordination: the use of vision to predict the motion of a robotic sub-system, while data from this sub-system is fed back to correct the visual perception. This includes manipulation of objects to aid vision analysis, auto-calibration, visually guided assembly tasks, etc.

3. Gaze Control

Of all the aspects of active vision, gaze control is considered the most important [58]. Since the primary goal of gaze control is to actively manipulate the camera parameters so that the images acquired are directly suited to the task specified, it is clear that this is indeed an integral part of active vision.


When the imaging system has no controllable parameters, the modelling and calibration problems are those commonly referred to as camera calibration [69]. The controllable parameters can be thought of as defining a configuration space for the imaging system [29]. Automated camera control consists of modelling the family of images and selecting a set of points or a trajectory in the control space.

The commonly found controllable parameters are:

• Pose: The imaging system may be movable, in which case the position and orientation are controllable.

• Lens: Automated lenses allow the control of the focus distance, zoom and aperture [47].

• Filters: In color imaging, several color filters are inserted into the optical system.

• Cameras: Some visual systems allow the control of exposure time.

• Illumination: Some systems present the possibility of moving, dimming or switching light sources.

There are several features that one might expect to enhance by controlling the visual system parameters [42], [71], [68], [81]:

• exposing important views of the scene;

• improving the observability of key features;

• measuring shape by inverting optical laws while varying one or more imaging system parameters. This class includes many different branches in the literature:

  - shape from controlling the camera pose [54];

  - shape from controlling the illumination: shape from shadows and photometric stereo [80, 35];

  - shape from controlling the focus and the aperture [61, 26, 78];

  - reflection analysis while controlling color filters [57].

In this dissertation, we concentrate particularly on the first two features for camera control: visibility and image quality. The problem of gaze control can then be partitioned in two ways. A common approach is to divide the problem up according to the biological mechanisms: vergence, smooth pursuit, vestibulo-ocular reflex, etc. These modules are integrated by the brain with a certain coordination; one of the questions in this formulation is how they should be coordinated in a machine vision sense.


An easier and more abstract approach is to divide gaze control into two primary categories, gaze stabilization and gaze change. Gaze stabilization refers generally to fixation, reducing the disparity between frames in order to maintain clear images of some world point that may be stationary or in motion with respect to the camera. For moving targets, this typically involves target tracking.

Gaze change is closely related to the issue of attention, in that to actuate any attentional algorithm a corresponding change in camera gaze is required. This change may be used to transfer stabilized gaze to new fixation points. This problem, also referred to as "where to look next?" [15], is very complex and can be determined according to the same criteria as attention.

For real-world situations, solutions to the problems of gaze stabilization and gaze change can only be accomplished using active control mechanisms. In order to perform 3-D reconstruction of a general scene, the visual system must be moved so that the different portions of the scene come into view, and lens parameters must be tuned so that sharp, focused images are acquired with appropriate brightness values. For moving objects, the need to control gaze is even more compelling. The ability to stabilize the image of a moving object may not only be necessary for robust processing, but can also provide significant computational advantages over the passive case. Gaze control presents several advantages including:

• image stabilization for fixation, where a moving object can be efficiently tracked to avoid blurring; enhancement of the object due to the blurring of the background, making figure/background separation simpler; and tracking of the object, which can keep the object within the field of view.

• range calculations can be performed on a scene by use of optical flow.

• moving the camera can allow the vision system to use simpler calculations to determine object properties, to see beyond occlusions and to gain more information about the object it is interested in.

3.1. Gaze stabilization. Gaze stabilization or fixation is a control problem: determining the estimation error and feeding it back to cause the error signal to tend to zero. Swain and Stricker [58] state some current issues that can be specific to gaze stabilization:

• models of noise, sensitivity, stability, and performance must be appropriate for spatio-temporal signals.

• gaze stabilization is a multi-input, multi-output control problem. For example, a tracking problem may involve inputs from different algorithms (position estimate, blur, etc.).

• a high level of adaptation is required in order to fully adapt the control to changes in knowledge about the target or plant acquired through time.
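The feedback idea behind gaze stabilization can be illustrated with a minimal sketch. It assumes a single pan angle, a stationary target and a purely proportional correction; the function name, the gain and the step count are hypothetical and are not taken from the systems cited here.

```python
def stabilize(target_angle, camera_angle=0.0, gain=0.5, steps=20):
    """Drive the pointing error toward zero with proportional feedback.

    Illustrative assumptions: one pan axis, a stationary target, and a
    fixed gain between 0 and 1. Returns the error seen at each step.
    """
    errors = []
    for _ in range(steps):
        error = target_angle - camera_angle   # pointing (estimation) error
        errors.append(error)
        camera_angle += gain * error          # feed the error back to the actuator
    return errors


errors = stabilize(target_angle=10.0)
print(errors[0], errors[1], errors[-1])
```

With a gain of 0.5 the error is halved at every step, so it tends to zero geometrically. A moving target, noise or actuator delay would call for the richer controllers discussed in the text.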

3.2. Gaze change. A change in gaze is required in order to direct the various components of the active vision system to new parts of the visual field. Integration of the various inputs into the vision system is required to fully incorporate available resources and use this as the control specification for the camera parameters. Two steps can be followed at this level: first, define the point "where to look next", which is reminiscent of attention; second, change gaze, which involves the mechanics and the control related to moving the camera parameters to the new fixation point defined by the first step.

Again, a few relevant areas have been highlighted in [58]:

• dynamic optimal integration between sources of visual information based on different control approaches.

• strategies for acquiring images and processing them to examine different points of the scene given the task at hand.

• issues on gaze change including robustness of the control approach against noise, dropout, and delays.

4. State of the art

Because gaze control is intrinsic to any active vision system, a lot of work has been done in this area.

Much research in this field has concentrated on the solution of the visual tracking problem. It appears that visual tasks are facilitated when the camera motion is actively controlled. If the vision system consists of a single camera, the visual input to the control system should be directly extracted from the image. The parameters of the camera are then actively controlled. If a stereo head, composed of two cameras, is available, then more degrees of freedom can be exploited: independent vergences. Motivated by image stabilization, most of this research considers fixation as the objective in the control of camera parameters [56], [79], [59], [44], [36], [43], [19], [30]. Different control approaches have been used in the control of the visual system. They cover the area of classical control, from a simple PD controller [23] and adaptive controllers [64], [31], [59] to a predictive controller [41].

Different vision cues have been used in these control approaches. Existing methods can be divided into two groups:

• optical flow field,

• correspondence of discrete features.

The philosophy of optical flow is to work in two steps: first, estimate from the variation of image intensities the projection of the three-dimensional velocities and, second, compute those velocities and the depth from the optical flow [82]. Fermüller and Aloimonos [32] use the spatio-temporal derivatives of the image intensity function to provide for fixation on a target object for a robot-mounted camera.

In the area of robotic visual servoing, Papanikolopoulos et al. [59, 44] and Nelson and Khosla [56] have introduced the SSD algorithm (sum of squared differences). It is a correlation approach which tracks a region of an image by exploiting its temporal consistency. SSD is sensitive to background changes, mainly due to occlusions [38]. Luo, Mullen and Wessel [43] have presented a robot conveyor tracking system that incorporates a combination of visual and acoustic sensing. They have used a modified version of the Horn-Schunck algorithm for the calculation of the optical flow. Zhuang et al. [39] have utilized optical flow and depth information to determine the movement of a tracked object. Viéville et al. [28] have used optical flow as measurement data in their tracking strategy. Precise positioning of camera pan, tilt and vergence is ensured by a kinematic chain of few joints. The INRIA robotic head is made up of two different cameras. The gaze control implemented does not focus on the object to be tracked but instead finds the region of highest residual disparity and selects the target object accordingly.
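As an illustration of the sum-of-squared-differences idea mentioned above, the following sketch searches a new frame exhaustively for the window that best matches a template taken from the previous frame. It is a toy version under stated assumptions (tiny grey-level images as nested lists, full-frame search, hypothetical function names) and is not the implementation used in the cited work.

```python
def ssd(patch_a, patch_b):
    """Sum of squared differences between two equally sized 2-D patches."""
    return sum((a - b) ** 2
               for row_a, row_b in zip(patch_a, patch_b)
               for a, b in zip(row_a, row_b))


def crop(image, top, left, height, width):
    """Extract a height x width window whose corner is (top, left)."""
    return [row[left:left + width] for row in image[top:top + height]]


def track(template, image):
    """Return (top, left) of the window in `image` with the lowest SSD."""
    h, w = len(template), len(template[0])
    best = None
    for top in range(len(image) - h + 1):
        for left in range(len(image[0]) - w + 1):
            score = ssd(template, crop(image, top, left, h, w))
            if best is None or score < best[0]:
                best = (score, top, left)
    return best[1], best[2]


# Two hypothetical frames: the bright 2x2 region moves down and right by one pixel.
frame0 = [[0, 0, 0, 0, 0],
          [0, 9, 8, 0, 0],
          [0, 7, 6, 0, 0],
          [0, 0, 0, 0, 0]]
frame1 = [[0, 0, 0, 0, 0],
          [0, 0, 0, 0, 0],
          [0, 0, 9, 8, 0],
          [0, 0, 7, 6, 0]]

template = crop(frame0, 1, 1, 2, 2)
print(track(template, frame1))
```

The match relies on the region keeping its appearance between frames, which is why changes in the background or occlusions degrade this kind of tracker.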

A limitation to the recovery of the 3D structure and motion from the optical flow is the hypothesis of a single motion in the image. If a camera is moving in a scene containing moving objects, then the resulting global image flow is the superposition of each independent flow created by each moving object. Before attempting to analyze and interpret the motion of each object from the optical flow, the entire flow should be decomposed into different flows corresponding to the distinct movements occurring. A second limitation is the hypothesis of a limited change of the desired features over time. It follows that the appearance of a small region in an image sequence changes very little.

Rather than using the optical flow approach, a second approach based on extracting discriminatory features has been proposed. A correspondence is then established between the successive projections in the image of the same physical 3D features in the scene. A system of equations can then be derived, with respect to the motion parameters of the object to which the features belong.

In the correspondence class comes the seminal research of Feddema et al. [31], [49]. They have addressed the problem of feature selection and feature-based control for the tracking of a moving target by a robot-mounted camera. Dickmanns and Graefe [27] have dealt with the tracking problem as well as the feature selection problem. In their approach, they have considered a Kalman filter in tracking objects in monocular image sequences and they have investigated the problem of selecting subsets of available feature points by maximizing the weighted determinant of the measurement Jacobian [66]. Wilson [79] and Broida et al. [14] have considered the image of a number of features as measurements in a Kalman filter used for motion estimation. Despite all this research effort, little has been done on the selection of locations of interest, other than the fixation point, to automatically and dynamically identify the changing requirements of the task [58, 72, 21].

5. Overview of the Thesis

In this dissertation we adopt the correspondence class to address the problem of recovering the position of a moving object. This work contributes to the tracking problem of a moving object by providing techniques more general than fixation. Considering a structured environment, in which dynamic and geometric models are assumed to be known a priori, we investigate in these techniques how the sensor trajectory can be controlled to reduce the errors on the estimates. This work contributes to the problem of gaze control by providing a novel method based on the integration of gaze stabilization and gaze change in a unique strategy of measurement.

In this development of the observation strategy, we define an optimization criterion that rests on the tenet that this is a purposeful, directed process: the priorities of the current goal should influence the information gathering process. This can be viewed as a way of optimizing the use of limited computational resources [73] or as a way of gathering information about the system [5].

These model-based techniques offer a general framework to derive observation strategies for a wide range of immediate applications. In robot vision systems, there arise many situations in which recovering the position of the target is required, for example intercepting and catching an object on a conveyor belt or recovering the movement of a tumbling satellite. Of course, there are many other special-purpose systems, such as in the medical imaging area, tracking the left ventricle of the heart from a sequence of images, that would benefit from a simple model-based observation strategy.

We have outlined several issues to be addressed. In the next chapter a probabilistic, model-based framework is considered. It is characterized by three components: geometric models, sensor observation models and models for prior information. This chapter presents a detailed description of these models. It reviews some fundamentals of Bayesian approaches and optimization theory relevant to the rest of the dissertation. It also presents some optimization results for observation strategies which will be used later in the dissertation.

In chapter 3, we present a more detailed model for the visual system. It integrates the focal length and the zoom in the proposed model of the camera. A gaze control strategy including these optical parameters is derived. We provide the results of simulations which show the form of the observer trajectories as well as the variations of the internal parameters produced by our approach. These parameter variations depend critically on the measurement noise model. The camera trajectories exhibit both a target following behaviour and a gaze change behaviour. Appendix A introduces the mathematical background for matrix methods in paraxial optics. We review the derivation and the properties of an equivalent lens to a varifocal one.

Chapter 4 introduces the notion of using occlusion as a source of information in the update of the state error covariance matrix. The chapter begins with a definition of occlusion. Then, a probabilistic framework for the problem is defined. Different observation strategies are developed in this framework. We investigate the advantages of using the occlusion information. Simulation results present a comparative study between the already presented approaches and a third one based on gaze change between observable features.


Appendix B presents a second filtering approach based on the occlusion information and

derived in a probabilistic editing framework.

In chapter 5 of this dissertation, we examine the problem of tracking and catching a

moving object. We present three new control approaches to the grasping process. The main

result of this chapter is a new control strategy of the robot manipulator that allows the

simultaneous achievement of two goals: the observation of the moving object to define its

movement and the command of the end-effector to move to a grasping position.

Appendix C is devoted to deriving an approximate expression for the cost function of

the stochastic closed-loop control method defined in chapter 5. The main feature of this

derivation is an expression that includes a term of the future uncertainty associated with

the state.

Appendix D treats the observability of the testbed system considered in chapter 5 of

this thesis. It shows that unless special values are considered for the rod length, one feature

can ensure the observability of the target.

Finally, conclusions and future research directions are given in chapter 6.

CHAPTER 2


Problem Statement and optimization approach

Representations of information play a central role in the study of any system [50]. Representations make certain types of information explicit, while requiring that other information be computed when needed. The choice of representation becomes crucial when the information being presented is uncertain. The purpose of this chapter is to outline the framework in which we will be working. First we will state and discuss our basic assumptions about this framework, mainly the introduction of dynamical models in our approach rather than being limited to geometric models.

We will examine representations suitable to these dynamical models. These representations embrace three classes of modelling assumptions: geometric models, sensor models and task models.

Second, a general optimization approach will be presented. In this approach, the visual system motion during a receding observation time interval will be defined so that an estimation accuracy criterion is optimized. This problem of strategy computation will be transformed into a deterministic control problem, and a measurement policy will be precomputed before the measurements actually occur.

1. Modelling

1.1. Introduction. One objective of this dissertation is to study the use of visual sensors to locate a desired object in some region of space. This problem is characterized by two features. The first is the ability to recognize the desired object when it comes into view. This ability has received a great deal of attention from vision researchers, as object recognition is of importance to a variety of tasks in addition to localization. However, simply being able to recognize the object when it appears in the field of view is not enough. The second feature consists of controlling the sensors so as to determine the location of the object. This assumes that, in a correspondence framework, the image processing system has identified and located some features in the image plane. For simplicity, we consider point features in our localization strategy. It is evident that if the object is moving then its location is time-varying. It follows that the sensor has to be controlled to keep the object in the field of view. This dynamic aspect of active vision rests primarily on the observed scene characteristics as well as the physical properties of the target.

When discussing a dynamic scene, we usually consider a spatial interpretation of the scene both in position and velocity whenever it is possible. Similarly, in the considered approach, a direct spatial interpretation of image sequences is achieved by using spatial geometric models and adding to this aspect the temporal one. This immediate inclusion of temporal aspects is essential since it allows a proper definition of state variables. Geometric shape descriptions and models for motion constitute the basis for our geometric model. This means that not just objects are being observed but also their dynamics.

1.2. Geometric Model. Throughout this thesis, we consider a correspondence approach in our development of the observation and the control strategies. Thus, special attention is given to geometric modelling and coordinate transformation.

Assuming a known geometry for the target, this geometry is first described in a certain structure coordinate system fixed on the object. This description depicts mainly the positions of the chosen features in the object frame. We consider an inertial frame as a common fixed reference frame. Therefore, for a given feature i characterized by its position ${}^{O}P_i$ in the object frame, its position in the inertial frame is given by

We use a transformation function to denote applying a change of coordinates to a point. ${}^{I}R_O$ is the rotation matrix that defines the orientation of the object frame with respect to the inertial frame. The location of the origin of the object frame with respect to the inertial frame is denoted by the vector ${}^{I}t_O$. Together, the position and orientation specify the transformation of the representation of feature i from the object frame to the inertial frame.

Often, we must compose multiple coordinate transformations to obtain a desired change of coordinates, mainly an additional transformation between the inertial frame and the camera frame. It ensues that, for an inertial-frame-to-camera-frame transformation characterized by ${}^{C}R_I$ and ${}^{C}t_I$, the position of the feature i in the camera frame is then

It follows that
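As an illustration of the chain of coordinate changes described in this subsection, the following sketch applies an object-to-inertial transformation followed by an inertial-to-camera transformation. It assumes the usual rigid-body convention p' = R p + t; the helper names and the numerical poses are hypothetical and are not taken from the thesis.

```python
import math


def rot_z(angle):
    """Rotation matrix for a rotation of `angle` radians about the z axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def transform(R, t, p):
    """Apply the rigid change of coordinates p' = R p + t to a 3-vector p."""
    return [sum(R[r][k] * p[k] for k in range(3)) + t[r] for r in range(3)]


# Hypothetical poses, chosen only for the example.
R_io, t_io = rot_z(math.pi / 2), [1.0, 0.0, 0.0]   # object frame -> inertial frame
R_ci, t_ci = rot_z(0.0), [0.0, 0.0, -2.0]          # inertial frame -> camera frame

p_object = [1.0, 0.0, 0.0]                         # feature position in the object frame
p_inertial = transform(R_io, t_io, p_object)
p_camera = transform(R_ci, t_ci, p_inertial)
print(p_inertial, p_camera)
```

Composing the two steps gives a single transformation with rotation R_ci R_io and translation R_ci t_io + t_ci, which is the sense in which several changes of coordinates are chained.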

2. Bayesian Framework

Most of the work in this dissertation will be carried out within the framework of Bayesian decision theory. A Bayesian model is a statistical description of an estimation problem which consists of two separate components. The first component, the prior model, is a probabilistic description of the prior information on the estimates. The second component is a description of the noisy or stochastic processes that relate the unknown states to sensor values. These two probabilistic models can be combined to obtain a posterior model, which is a probabilistic description of the current estimate given current data. To compute this posterior model we use Bayes' Rule.
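For a scalar state with a Gaussian prior and a direct measurement corrupted by Gaussian noise, the combination of prior and sensor model by Bayes' Rule has a closed form. The sketch below shows that special case only; the function name and the numbers are illustrative assumptions.

```python
def gaussian_posterior(prior_mean, prior_var, z, noise_var):
    """Posterior of x given z = x + v.

    Assumes x ~ N(prior_mean, prior_var) and v ~ N(0, noise_var),
    with v independent of x. Returns (posterior mean, posterior variance).
    """
    gain = prior_var / (prior_var + noise_var)        # weight given to the data
    post_mean = prior_mean + gain * (z - prior_mean)
    post_var = (1.0 - gain) * prior_var
    return post_mean, post_var


mean, var = gaussian_posterior(prior_mean=0.0, prior_var=4.0, z=2.0, noise_var=4.0)
print(mean, var)
```

The posterior variance is never larger than the prior variance, which is the scalar counterpart of the covariance reduction exploited by the Kalman filter.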

Maintaining this probabilistic description of our estimate is particularly useful in the context of dynamic vision. In such a system, new information is continually being acquired due to either observer or target motion or both, and estimates are continually being updated. A useful formalism for modelling such a system is the Kalman filter, which we will examine in the following section. The Kalman filter extends the Bayesian framework introduced in this section by adding the system model to the prior and measurement models. This system model describes the evolution of the state being estimated, and contains a stochastic component to account for unknown disturbances and model errors.

One of the most compelling arguments for using a Bayesian framework is its mathe-

matical clarity and elegance. Furthermore we believe that:

• In the situations we are considering, some prior information is available.

• We have a well-justified basis for choosing a performance criterion that will be defined in section 6.

• It defines a powerful tool to maximize the speed of convergence of the estimates to true values by application of advance knowledge.

Of course, Kalman filtering requires a system dynamic model and measurement equations. Furthermore, using a dynamic model to describe all rigid body motions which yield motions in the image sequence will lead to

• elimination of the need to access past images,

• determination of the position and the velocity of the tracked object,

• circumvention of the non-unique inversion of the perspective projection by using an extended Kalman filter for state estimation exploiting the Jacobian matrix of the measured image features.

2.1. Dynamic Model. The objective of a mathematical model is to generate an adequate, tractable representation of the behavior of all outputs of interest from the real physical system [51]. Since no model is perfect, one attempts to generate models that closely approximate the behavior of observed quantities. In our Bayesian framework, it is almost impossible to derive or evaluate the joint probability distribution or density function for the state and obtain the corresponding joint functions for the measurements unless we assume Gaussian noise for both the dynamic model input and the measurements. Moreover, for a tractable depiction of a joint distribution or density, we need a Markov property for our system [51]. This motivates the form of the system model defined by equation 2.4.

It is a time-varying system describing the position and the motion of the object in the inertial frame. We assume that the system state vector $x(t)$ is an n-dimensional column vector. The system driving noise $\xi(t)$ is assumed to be a Gaussian white noise $N(0, Q)$ with zero mean and a known covariance matrix. The initial state vector $x_0$ is modelled as a random vector with known mean $\hat{x}_0$ and known covariance $P_0$. This model may be as simple as $\dot{x}(t) = \xi(t)$, the state of an object on a conveyor belt, and as complex as a model for a tumbling satellite.
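To make the role of the driving noise concrete, the following sketch simulates a discretized constant-velocity target whose velocity is perturbed by white Gaussian noise. The constant-velocity structure, the Euler discretization, the intensity `q` and all names are assumptions made for illustration; they are not the model of equation 2.4.

```python
import math
import random


def simulate(x0, v0, dt, steps, q=0.0, seed=0):
    """Simulate position x and velocity v of a target on a line.

    Assumed model: dx/dt = v, dv/dt = white noise of intensity q,
    discretized with step dt (the noise sample has variance q * dt).
    """
    rng = random.Random(seed)
    x, v = x0, v0
    path = [(x, v)]
    for _ in range(steps):
        w = math.sqrt(q * dt) * rng.gauss(0.0, 1.0)   # driving noise sample
        x += v * dt
        v += w
        path.append((x, v))
    return path


nominal = simulate(0.0, 2.0, 0.1, 10)                  # noise-free trajectory
noisy = simulate(0.0, 2.0, 0.1, 10, q=0.05, seed=1)    # one noisy realization
print(nominal[-1], noisy[-1])
```

Without noise the position simply advances at the initial velocity, as for an object on a conveyor belt; the noise term accounts for unknown disturbances and model errors.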

3. Imaging Model

A sensor model defines the hardware and software involved in extracting properties of the observed target. A complete model of a sensor would, of course, include a description of the effect of all influences on the output of the sensor. We can divide these factors into those which are under control of the system and those which cannot be controlled. For the most part, we assume that the influence of factors not in the geometric model or under control of the system is negligible.

The imaging model combines 3D geometric models of the scene with the laws of perspective projection in order to describe the positions of relevant scene features in the image as a function of the relative spatial position and orientation; this procedure is well known from 3D computer graphics and is not detailed here [7].

For each rigid-body object in the scene, a 3D geometric model is used to describe the coordinates of relevant features in the object-centred coordinate frame. This yields for the position ${}^{O}P_i$ of a feature i three space components, summarized as

\[ {}^{O}P_i = [\, X_i \;\; Y_i \;\; Z_i \,]^{T} \tag{2.5} \]

The position of the camera projection center in the inertial frame is given by

\[ {}^{I}P_c = [\, X_c \;\; Y_c \;\; Z_c \,]^{T} \tag{2.6} \]

This yields, for the given state vector $x(t)$ of the dynamic model, the horizontal coordinate $x_i$ and the vertical coordinate $y_i$ of the feature i in the image plane as one special two-component contribution to the measurement vector:

It follows that for a given number of features $1, \ldots, j$ in the image plane, the measurement vector is defined by


with the output variable dimension $m = 2j$. The measurement noise vector $v(t)$ is assumed to have a Gaussian distribution and a diagonal measurement covariance matrix $R(t)$. Because ${}^{O}P$ is constant, since we are not considering a flexible object, we can drop this term from the expression of the measurement vector.
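The way each feature contributes two components to the measurement vector can be sketched with an ideal pinhole camera. The projection formula used below (image coordinates equal to focal length times X/Z and Y/Z) is the standard textbook model and is an assumption here, since the thesis camera model may include further terms; the function names are hypothetical.

```python
def project(p_cam, focal_length=1.0):
    """Ideal pinhole projection of a camera-frame point (X, Y, Z) with Z > 0."""
    X, Y, Z = p_cam
    if Z <= 0:
        raise ValueError("point must lie in front of the camera")
    return focal_length * X / Z, focal_length * Y / Z


def measurement_vector(points_cam, focal_length=1.0):
    """Stack the image coordinates (x_i, y_i) of j features into a 2j-vector."""
    z = []
    for p in points_cam:
        z.extend(project(p, focal_length))
    return z


# Two hypothetical features expressed in the camera frame.
z = measurement_vector([(1.0, 2.0, 4.0), (0.0, -1.0, 2.0)], focal_length=2.0)
print(z, len(z))
```

With j = 2 features the measurement vector has dimension m = 4, in line with m = 2j. The division by depth is what makes the measurement equation nonlinear in the state.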

4. Recursive State Estimation

So far, we have defined the dynamic model and the sensor model for our system. Given these models, it is desired to generate an optimal state estimate of the imaged object. For this, and because of the nonlinearities of the models, we consider an extended Kalman filter to derive the estimates.

The basic idea of the extended Kalman filter [52] is to relinearize the sensor and dynamic models about each predicted estimate once it has been computed. As soon as a new state estimate is made, a new and better reference state trajectory is incorporated into the estimation process. This preserves the validity of the assumption that deviations from the nominal trajectory are small enough to allow linear perturbation techniques.

Under the assumption of a continuous-time process and continuous-time measurements:

The continuous nonlinear system dynamics

where

The continuous nonlinear measurement system


where

\[
\begin{aligned}
E\{v(t)\} &= 0 \\
E\{v(t)\,v^{T}(t')\} &= R(t)\,\delta(t - t') \\
E\{(x(t_0) - \hat{x}_0)\,v^{T}(t)\} &= 0 \\
E\{\xi(t)\,v^{T}(t')\} &= 0
\end{aligned}
\]

The extended Kalman filter cycle is given by the following equations [1]

where

and

In this continuous-time linearization, $P(t)$ cannot be precalculated as it can if we linearize with respect to a nominal trajectory. It has to be calculated in real time, since it is coupled to the current estimate $\hat{x}(t)$ through the linearization procedure.
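The coupling between the covariance and the current estimate can be seen in a small discrete-time analogue of the filter. The sketch below is not the continuous-time formulation used in this chapter: it assumes a scalar random-walk state (a depth), a hypothetical measurement h(x) = b / x, and illustrative noise variances `q` and `r`.

```python
def ekf_step(x_est, P, z, q, r, b=1.0):
    """One predict/update cycle of a scalar extended Kalman filter.

    Assumed models: x[k+1] = x[k] + w (variance q) and
    z = b / x + v (variance r). All names are illustrative.
    """
    # Prediction: random-walk dynamics, so the state Jacobian is 1.
    x_pred = x_est
    P_pred = P + q
    # Relinearize the measurement about the predicted estimate.
    H = -b / x_pred ** 2                 # dh/dx evaluated at x_pred
    innovation = z - b / x_pred
    S = H * P_pred * H + r               # innovation variance
    K = P_pred * H / S                   # Kalman gain
    x_new = x_pred + K * innovation
    P_new = (1.0 - K * H) * P_pred       # depends on x_pred through H
    return x_new, P_new


x_new, P_new = ekf_step(x_est=2.0, P=1.0, z=0.5, q=0.01, r=0.01)
print(x_new, P_new)
```

Because the Jacobian `H` is evaluated at the latest estimate, the updated variance cannot be tabulated in advance, which mirrors the remark above about $P(t)$.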

It is clear from equations 2.8 and 2.24 that the measurement vector and the measurement Jacobian both depend on the position of the camera in the inertial frame. Therefore we can depict two situations:

• The camera position is a function of time known a priori (including a constant). This is the "classical" case to which extended Kalman filtering is applied.

• The camera position in the inertial frame is not known a priori but can be varied, instantaneously or through some dynamics. This case is "non-classical", because the observation can itself be controlled.

Throughout this dissertation the second situation will be adopted.

5. Task Model

Information gathering within our defined framework consists of choosing a location and determining the values of the unknown states. Our strategy rests on the tenet that this is a directed process: instead of gathering all possible information about the target object and the environment, the measurement system should concentrate on those geometric aspects which are the most relevant to the task at hand.

The location of the measurement device is to be governed by the task: what information are we seeking? This point of view implies the definition of a criterion that reflects our objective for the gathering process. Obtaining the position and the velocity of the target object with minimum error suggests a criterion function that depends on the state error covariance matrix $P(t)$. The covariance matrix $P(t)$ provides a measure of the amount of uncertainty in the estimate of the state variables.

In order to optimize the estimation process, we can define as optimization criterion the trace of the state error covariance matrix. It follows that, considering an observation time interval (horizon) $[t, t + T]$ such that as t advances so does the horizon, our objective is then to minimize

where $x(t + T)$ represents the true state of the object at time $t + T$ and $\hat{x}(t + T)$ the estimated one. Note that the smaller $\mathrm{tr}(P(t + T))$, the more accurate is the estimation.
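In the usual Kalman filter sense, the trace of the error covariance equals the expected squared estimation error at the end of the horizon. A standard way of writing this, offered as an assumption about the intended form rather than as the author's exact equation, is:

```latex
\operatorname{tr} P(t+T)
  = E\left\{ \left[ x(t+T) - \hat{x}(t+T) \right]^{T}
             \left[ x(t+T) - \hat{x}(t+T) \right] \right\}
```

The identity follows from $\operatorname{tr}(E\{e\,e^{T}\}) = E\{e^{T}e\}$ with $e = x(t+T) - \hat{x}(t+T)$, which is why a smaller trace corresponds to a more accurate estimate.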

Obviously, the behavior of the sensor system is then entirely governed by the optimiza-

tion function, and by the information available to the system through the dynamics of the

object.

6. Optimization Approach

The definition of an optimal observation policy during the time interval [t, t + T] is

based on the fact that the measurement equations are explicit functions of the camera

position. Hence, the quality of the measurements, consequently the estimates, depends on

the relative location of the camera with respect to the object. More precisely, to maximize

the information content of the observations, we may vary the camera position under some

constraint on its command. Therefore an optimal observation policy has to reflect the

conformity of the estimation and the fact that special constraints on the camera command

may be required to carry out measurements [3], [63].

It follows that the cost function can be defined by

The term defines the weighting on the command $u_c(t)$ of the camera.

The moving horizon approach was originally formulated as a method of stabilizing time-varying systems, without requiring information about the system model over all future time, by minimizing a cost function up to some finite horizon ahead at each instant. The resulting controller defined on the interval $[t, t + T]$ is only actually used for a time interval $[t, t + t_u]$ where $t_u < T$ before being recalculated [53].

In this thesis, we consider a receding horizon observation strategy with each finite hori-

zon optimization based upon an L2 optimization. We define the update time as the time

to initiate a new computation of the control strategy after a certain number of measure-

ments. The current measurement history is used in control strategy computation for a

defined observation horizon. The advantage of this procedure in deriving the observation

strategy is that we can deal with short term variation of the system, mainly due to the error

covariance matrix. The moving horizon approach can then be viewed as a good compro-

mise since in the extended Kalman filter the error covariance matrix cannot be accurately

precomputed because it is coupled to the estimation equation and an approximation on a

short observation time interval is needed.
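The receding horizon procedure described above can be sketched as follows. This is only a minimal illustration under loud assumptions: the scalar dynamics, the tracking-type cost and the exhaustive search over constant commands are stand-ins chosen for brevity, and the names `horizon_cost`, `plan_command` and `receding_horizon` are hypothetical. It does not reproduce the covariance-based cost of the thesis; it only shows how a plan computed over $[t, t+T]$ is applied over $[t, t+t_u]$ and then recomputed.

```python
def horizon_cost(c0, u, target, t0, T, dt, k):
    """Stand-in L2 cost of holding command u constant over [t0, t0 + T]."""
    c, cost = c0, 0.0
    for j in range(1, round(T / dt) + 1):
        c += u * dt                      # toy dynamics (assumption): dc/dt = u
        err = target(t0 + j * dt) - c    # predicted tracking error
        cost += (err ** 2 + k * u ** 2) * dt
    return cost


def plan_command(c0, target, t0, T, dt, k, candidates):
    """Finite horizon optimization by exhaustive search over candidate commands."""
    return min(candidates, key=lambda u: horizon_cost(c0, u, target, t0, T, dt, k))


def receding_horizon(c0, target, n_updates, T, t_u, dt, k, candidates):
    """Re-plan every t_u seconds; each plan is applied only over [t, t + t_u]."""
    assert t_u < T
    steps_per_update = round(t_u / dt)
    c, history = c0, []
    for i in range(n_updates):
        t = i * t_u
        u = plan_command(c, target, t, T, dt, k, candidates)
        for _ in range(steps_per_update):
            c += u * dt
        history.append((t + t_u, c, u))
    return history


candidates = [0.25 * i for i in range(-8, 9)]
history = receding_horizon(0.0, lambda t: 1.0, 20, T=0.5, t_u=0.1,
                           dt=0.01, k=0.01, candidates=candidates)
final_time, final_state, final_command = history[-1]
print(f"t = {final_time:.1f} s, state = {final_state:.3f}")
```

A shorter update time means the plan is refreshed with new information more often, at the price of more optimizations per second.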

Therefore, given the derived equations of the extended Kalman filter, and given the dynamics of the visual system

$$\dot{x}_c(t) = A_c x_c(t) + B_c u_c(t) \qquad (2.27)$$

a linear time invariant model with constant and known $A_c$ and $B_c$. It is evident that the state vector $x_c(t)$ presents the dynamics of the camera. The defined cost function can be reduced to the following expression

$$J = \mathrm{tr}(P(t+T)) + \int_{t}^{t+T} \cdots$$


Obviously, no command $u_c(t)$ can be calculated to minimize the previous cost function, because the solution $P(t)$ of the Riccati equation is defined with respect to the estimate. Thus one approximation is to derive the extended Kalman filter with respect to a nominal trajectory, which still allows us to take into account the disturbances that will occur in the future.

This approach consists of the linearization of the system with respect to some nominal trajectory [55]. This nominal trajectory is defined as the predicted trajectory. The problem is then to find the command $u_c(t)$ that minimizes the following cost function,

where $P_n(t+T)$ is the solution of the Riccati equation derived with respect to the nominal predicted trajectory. This trajectory is defined considering the prediction of the state vector $x_n(t)$.

Given the previous equations, we can see that for a given observation approach, we have a defined estimation error covariance matrix $P_n(t)$, solution of the Riccati differential equation, and a corresponding value of the cost function. Hence, we can transform the optimization problem to a deterministic control problem [4].
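To illustrate why the problem becomes deterministic, the sketch below integrates a Riccati differential equation along a fixed, known trajectory of the measurement Jacobian and returns the resulting trace. It assumes the standard continuous-time filter form $\dot{P} = AP + PA^T + Q - PH^T R^{-1} H P$ with a scalar measurement, a double integrator target and explicit Euler integration; these choices, the numerical values and the names `riccati_rhs` and `final_trace` are illustrative assumptions, not the equations of the thesis.

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def riccati_rhs(P, A, Q, H, r):
    """dP/dt = A P + P A^T + Q - P H^T (1/r) H P for a scalar measurement."""
    AP = matmul(A, P)
    PAt = matmul(P, transpose(A))
    PHt = matmul(P, transpose(H))   # n x 1
    HP = matmul(H, P)               # 1 x n
    corr = matmul(PHt, HP)          # n x n
    n = len(P)
    return [[AP[i][j] + PAt[i][j] + Q[i][j] - corr[i][j] / r
             for j in range(n)] for i in range(n)]


def final_trace(P0, A, Q, jacobian, r, T, dt=1e-3):
    """Euler-integrate the Riccati equation over [0, T] and return tr(P(T))."""
    P = [row[:] for row in P0]
    n = len(P)
    for s in range(round(T / dt)):
        dP = riccati_rhs(P, A, Q, jacobian(s * dt), r)
        P = [[P[i][j] + dt * dP[i][j] for j in range(n)] for i in range(n)]
    return sum(P[i][i] for i in range(n))


A = [[0.0, 1.0], [0.0, 0.0]]        # double integrator (assumed target model)
Q = [[0.0, 0.0], [0.0, 0.01]]
P0 = [[0.1, 0.0], [0.0, 0.1]]
weak = final_trace(P0, A, Q, lambda t: [[1.0, 0.0]], r=1.0, T=0.5)
strong = final_trace(P0, A, Q, lambda t: [[3.0, 0.0]], r=1.0, T=0.5)
print(f"tr(P) with small Jacobian: {weak:.4f}, with large Jacobian: {strong:.4f}")
```

Because the Jacobian trajectory is fixed in advance, the final trace is an ordinary function of the camera command history, which is what a deterministic optimal control method needs.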

7. Reformulation of the Optimization Problem

Given the deterministic Riccati differential equation

where $H[x_n(t), x_c(t)]$ is the Jacobian of the observation equations generated along the nominal predicted trajectory. This trajectory is defined considering the prediction of the state vector $x_n(t)$. And given the camera dynamics, for simplicity a linear time invariant model,

find the optimal $u_c(t)$ such that the cost function (equation 2.29) is minimized. For this we consider the Hamiltonian function

where $G_1(t)$ and $G_2(t)$ define the corresponding costate matrices that satisfy the following

equations,

and the boundary conditions:

At t = t: Pn(t) = Pt, xn(t) =

At t = t + T: $G_1(t+T) = I$,

The equations stated above, which define the properties of the optimal observation strategy, present a nonlinear two point boundary value problem for which it is hard to find other than a numerical solution. Since we have nonlinear matrix differential equations, we use a variable metric technique [33] to solve the problem.

8. Simulation Example

Throughout this dissertation, we limit the dynamics of each degree of freedom of the target object to a second order linear system. For simplicity we assume that each motion

component is independent.

To examine the properties of the proposed optimization approach to observation strategies, we consider the simple example of a particle moving along the $y_i$ axis and a camera rotating with respect to the $z_i$ axis. The ratio of focal length to pixel size is taken equal to 600. Let $x = [x_1\ x_2]$ be the system state vector, where $x_1$ defines the position of the particle along the $y_i$ axis, $x_2$ its velocity with respect to the same axis, and $Q$ the constant covariance of the model disturbance noise. Then the continuous time system model is

FIGURE 2.1. Geometry of the camera pinhole model

where

The camera dynamics are limited to a second order system modelling its rotation with respect to the $z_i$ axis. For $x_c = [x_{c1}\ x_{c2}]$ we have,

where

The measurements y ( t ) are required to depend on the states according to the measure-

ment equations,

FIGURE 2.2. Performances of the extended Kalman filter for T = 0.2 s, $t_u$ = 0.1 s and R = diag[10 10] (panels plotted against time (s); one panel shows the log of the trace of the state error covariance matrix)

$x_a$, $y_a$ and $z_a$ define the coordinates of the particle in the inertial frame such that

$d_x$ = 1 m and $d_z$ = 0 m define the constant positions of the particle along the $x_i$ and $z_i$ axes of the inertial frame. Simulation results are presented for this example.

FIGURE 2.3. Position of the particle in the image plane and trajectory of the visual system (panels plotted against time (s), including the orientation of the camera)

FIGURE 2.4. Double integrator example: position of the particle in the image frame and trace of the state error covariance matrix, R = diag[10 10], $t_u$ = 0.1 s, T = 0.5 s (panels: position of the particle in the image plane; log of the trace of the state error covariance matrix)

From the obtained results, it is clear that controlling the camera allows a small trace of the state error covariance (figure 2.2). At the same time it leads to an eccentric image of the particle (figure 2.3). In fact, from figure 2.3 we realize that the camera is moving in the opposite direction to the one followed by the particle (figure 2.2), causing the projection of the particle in the image plane to be the furthest from the center. These conclusions are more evident for a simple double integrator model. Considering similar simulation parameters as in the previous example, we obtain a particle image exceeding the extent of the image plane of any practical camera and a fast decreasing state error covariance matrix (figure 2.4). From equations 2.32 and 2.33, it is evident that minimizing the Hamiltonian expression with respect to the camera position $x_c(t)$ is equivalent to maximizing the measurement Jacobian $H[x_n(t), x_c(t)]$ with respect to $x_c(t)$. It follows that a maximum value of $H[x_n(t), x_c(t)]$ induces an eccentric image.

The obtained results illustrate the fact that to gather the maximum of information useful for the task of recovering the position and velocity of a moving object, we need to keep the image of the target object far from the image plane center. It follows that the fixation strategy, used for this kind of task and usually with respect to the image center, is not the optimal approach. In fact, keeping an eccentric image allows a maximum signal to noise ratio. Therefore, better performances of the extended Kalman filter can be reached.
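The link between eccentricity and information content can be checked numerically on a simple pinhole model. The sketch below is an assumption-laden illustration, not the measurement equation of the thesis: it supposes a camera at the origin panning about the $z_i$ axis, a particle at lateral position `y` and fixed depth `D_X` = 1 m, and an image coordinate $600 \tan(\text{bearing} - \text{pan})$ in pixels. The function names are hypothetical.

```python
import math

F_PIXELS = 600.0   # ratio of focal length to pixel size, as in the simulations
D_X = 1.0          # assumed constant depth of the particle (metres)


def image_coordinate(y, pan):
    """Pixel coordinate of a particle at lateral position y for a given pan angle."""
    return F_PIXELS * math.tan(math.atan2(y, D_X) - pan)


def sensitivity(y, pan):
    """Analytic derivative of the pixel coordinate with respect to y."""
    bearing = math.atan2(y, D_X)
    return F_PIXELS * D_X / (D_X ** 2 + y ** 2) / math.cos(bearing - pan) ** 2


y = 0.2
bearing = math.atan2(y, D_X)
for offset in (0.0, 0.3, 0.6):       # angular offset from perfect fixation (rad)
    pan = bearing - offset
    print(f"offset {offset:.1f} rad: image at {image_coordinate(y, pan):7.1f} px, "
          f"sensitivity {sensitivity(y, pan):7.1f} px/m")
```

Under these assumptions the sensitivity is smallest when the particle is fixated at the image center and grows as the image becomes eccentric, which is consistent with the behaviour reported for the unconstrained optimization.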

9. Constrained Optimization

Such trajectories of the object in the image plane are impractical in many applications. Thus, we need to apply constraints to the problem to ensure that the image of the target is kept within reasonable limits without altering the performances of the extended Kalman filter. It follows that we need to consider an additional term in the cost function that reflects this constraint. For this, we consider the following cost function

$$J = E\{(x(t+T) - \hat{x}(t+T))^T (x(t+T) - \hat{x}(t+T))\} + \cdots$$

where $k_1$ defines the weighting on the trace of the measurements error covariance matrix; $y(t)$ represents the real camera output and $\hat{y}(t)$ the estimated one derived from the extended Kalman filter. This term is added to the expression of the cost function to reflect a minimum error variance objective during the observation interval. The second term $k_2$ defines the weighting on the command $u_c(t)$ of the camera.

Given the derived equations of the extended Kalman filter, and given the dynamics of the visual system (equation 2.27), the defined cost function can be reduced to the following expression

Because of the properties of the solution $P(t)$ to the Riccati equation, defined with respect to the estimate, no command $u_c(t)$ can be calculated to minimize the previous cost function. One approximation is again to derive the extended Kalman filter with respect to the predicted trajectory. The problem is then to find the command $u_c(t)$ that minimizes the following cost function,

where $H[x_n(t), x_c(t)]$ is the Jacobian of the observation equations generated along the nominal predicted trajectory. This trajectory is defined considering the prediction of the state vector $x_n(t)$.

It follows that, given the deterministic Riccati differential equation

and the camera dynamics (equation 2.27), our objective is to find the optimal $u_c(t)$ such that the cost function (equation 2.46) is minimized.

For this we consider the Hamiltonian function

where $G_1(t)$ and $G_2(t)$ define the corresponding costate matrices that satisfy the following equations,

and the boundary conditions:

Similar to the first optimization problem, the equations stated above present a nonlinear two point boundary value problem, for which we use a variable metric technique [33] to find a solution.

10. Simulations

To investigate the properties of the proposed constrained optimization approach to observation strategies, different simulation examples are considered, with special attention granted to the effects of different observation horizons and of different levels of uncertainty on the object dynamic model, the observations and the initial state. We also give special attention to the effects of the weighting parameters. These properties are investigated in the following examples:

• a particle moving along the $y_i$ axis and a camera rotating with respect to the $z_i$ axis;

• a particle moving in the plane $x_i y_i$ parallel to the optic axis and a camera moving with respect to the $z_i$ axis;

• a particle moving in the plane parallel to the optic axis and a camera with two degrees of freedom;

• a particle moving in 3D and a controlled camera that rotates with respect to the $z_i$ and the $y_i$ axes.

A constant ratio of focal length to pixel size, equal to 600, is considered in the simulations. We will assume that we have control over the camera rotation movement with respect to the $z_i$ and the $y_i$ axes. This allows us to optimize the moving target analysis through the filtering process.

FIGURE 2.5. Trajectory of the particle in the image plane, T = 0.2 s and $t_u$ = 0.1 s: R = diag[1 1] (dashed), R = diag[10 10] (solid)

10.1. First Example. In the following simulations, we consider again the simple example of a particle moving in the plane $x_i y_i$ along the $y_i$ axis, defined in section 8. In figures 2.5, 2.6, 2.7, 2.8, 2.10, 2.11, and 2.12, simulation results are presented for the first example with a constant model disturbance covariance $Q$, $P_0 = \mathrm{diag}[0.1\ 0.1]$, $x_0 = [0\ 0.5]$ and $\hat{x}_0 = [0\ 0.3]$.

It is clear from figure 2.5 that for a small observation horizon T = 0.2 s, increasing the covariance of the measurement noise R leads to a more eccentric image. The image plane, in this situation, is almost parallel to the line between the focal point and the particle. This permits the maximum signal to noise ratio, allowing a fast decrease of the trace of the state error covariance matrix. Such images are impractical since the particle position still exceeds the extent of the image plane of any practical camera. The plots of figure 2.6 show that the trajectory of the visual system is now tracking the movement of the particle, in contrast to the trajectory obtained without the measurement error variance constraint expression (figure 2.3). Moreover, the trajectory of the visual system depends on the covariance of the measurement noise: an increase in this covariance leads to a more marked movement of the visual system (figure 2.6). This can be explained by two factors which can be implicitly linked to the observation horizon: (i) the large uncertainty on the target trajectory for a small horizon; (ii) the objective of a fast reduction of the trace of the error covariance matrix.

FIGURE 2.6. Trajectory of the camera, T = 0.2 s and $t_u$ = 0.1 s: R = diag[1 1] (dashed), R = diag[10 10] (solid)

FIGURE 2.7. Trajectory of the particle in the image plane for R = diag[10 10], $t_u$ = 0.1 s: T = 0.2 s (dashed) and T = 0.5 s (solid)

Thus, we need to study the significance of certain parameters on the performance of the visual system. These parameters are the observation horizon, the weighting $k_1$ in the cost function (equation 2.46) and the chosen update time of the receding horizon observer. It is evident from figure 2.7 that increasing the observation horizon to T = 0.5 s allows a more symmetric behavior in the image plane. In fact, increasing the observation horizon allows a better assessment of the predicted trajectory of the particle and of the errors on these predictions. The trajectory of the visual system can therefore be controlled using this added information.

FIGURE 2.8. Trajectory of the particle in the image plane for R = diag[10 10], $t_u$ = 0.1 s: $k_1$ = 1 (dashed) and $k_1$ = 10 (solid)

FIGURE 2.9. Performances of the extended Kalman filter: log of the trace of the state error covariance matrix, R = diag[10 10]: $k_1$ = 10, T = 0.5 s and $t_u$ = 0.1 s (solid); $k_1$ = 1, T = 0.2 s and $t_u$ = 0.1 s (dashed)

From figure 2.8, it is evident that increasing the weighting on the measurements estimation error in the cost function (equation 2.28) induces a more eccentric image, allowing a faster decrease of the trace of the state error covariance matrix (figure 2.9). Consequently, the camera oscillates more. Combining both parameter changes, i.e. observation horizon and weighting factor, we obtain a well centred image (figure 2.10). Figure 2.11 presents the performances of the extended Kalman filter in this case as well as the trajectory of the observer. It is evident that the estimates converge rapidly to the real values. Moreover, we realize that the observation system is adopting more or less a fixation strategy. It follows that the additional constraint term allows good performances of the extended Kalman filter, fast converging estimates (figure 2.11) and a simultaneous centralization of the particle trajectory in the image plane.

FIGURE 2.10. Trajectory of the particle in the image plane for R = diag[10 10]: $k_1$ = 10, T = 0.5 s and $t_u$ = 0.1 s (solid); $k_1$ = 1, T = 0.2 s and $t_u$ = 0.1 s (dashed)

Figure 2.12 presents the plots of the trajectories of the particle in the image plane for different update times $t_u$ = 0.05, $t_u$ = 0.1 and $t_u$ = 0.2. It is very clear that for a smaller update time, we obtain a more centred image. This is an expected result, since decreasing the update time allows a faster consideration of information from new measurements in the computation of the new observation strategy.

FIGURE 2.11. Performances of the extended Kalman filter for R = diag[10 10], $k_1$ = 10, T = 0.5 s and $t_u$ = 0.1 s (panels: log of the trace of the state error covariance; trajectory of the visual system)

FIGURE 2.12. Trajectories of the particle in the image plane for T = 0.5 s and for different update times: $t_u$ = 0.05 (dashed), $t_u$ = 0.1 (dashdot) and $t_u$ = 0.2 (solid)

Despite the limited nature of the simulations presented, it is possible to draw certain conclusions from these results, mainly:

(i) The determination of the observation horizon is very important in optimizing the contribution of the observation strategy to the performances of the extended Kalman filter, even though there is no rule to determine the observation horizon.

(ii) The trajectory of the visual system does not track the oscillations of the target, in contrast to a fixation tracking approach where the trajectory of the visual system is derived such that the predicted position of the particle is in a fixed position in the image plane, usually the centre. It follows that with this observation strategy, we are able to handle fast moving targets.

(iii) A suitable choice of some model parameters, mainly observation horizon, weighting factors and update time, allows the target image to be kept within reasonable limits in the image plane.

10.2. Second Example. In this simulation example, we add a second degree of freedom to the particle, which moves in a plane parallel to the $x_i y_i$ plane, and we keep the same camera motion as in the previous example. That is, the particle oscillates along the $y_i$ axis while moving toward the camera along the $x_i$ axis. Therefore, the model and measurement equations are defined as follows:

• Each degree of freedom of the target is represented by a second order system. For simulations, we consider the following model

where

• Given the position of the particle in the inertial frame

it follows that the measurement equations are

FIGURE 2.13. Trajectory coordinates of the particle in the image frame for T = 1 s and $t_u$ = 0.1 s: with respect to the y axis (dotted), with respect to the z axis (solid)

$d_z$ is taken equal to 10 cm. In the simulation results presented in figures 2.13 and 2.14, we consider the following initial conditions: $Q$ equal to $10^{-3}$ times a constant matrix, $R = \mathrm{diag}[1\ 1]$, $P_0 = \mathrm{diag}[1\ 1\ 1\ 1]$, $\hat{x}_0 = [-3.3\ 0\ 0\ 0]$ and $x_0 = [-3\ 0\ 0\ 0]$. It is evident from the obtained results that, because the particle is moving toward the observer, its position in the image plane is drifting from the center (figure 2.13).

At this level, it is very clear that the visual system is no longer capable of keeping a centered projection of the particle in the image frame. Of course, setting limits on the extent of the image plane induces a deterioration in the performances of the extended Kalman filter (figure 2.16). It follows that to attain the goal of a centered image and good and fast converging estimates, we need to consider an additional degree of freedom for the visual system.

FIGURE 2.14. Orientation of the visual system with respect to the $z_i$ axis: T = 1 s and $t_u$ = 0.1 s

FIGURE 2.15. Performances of the Kalman filter: T = 1 s and $t_u$ = 0.1 s (panels: errors in the position estimates; errors in the velocity estimates; log of the trace of the state error covariance matrix)

10.3. Third Example. In this example, we add a second degree of freedom to the visual system observer. The target dynamic model is similar to the one defined in the previous example. Similar initial conditions are considered for simulations. The measurement equations are defined by

where the camera dynamics are limited to two second order systems modelling its rotation with respect to the $z_i$ axis and the $y_i$ axis, respectively. For $x_c = [x_{c1}\ x_{c2}\ x_{c3}\ x_{c4}]$ we have,

where

FIGURE 2.16. Performances of the Kalman filter: limited image plane (60x60), T = 1 s and $t_u$ = 0.1 s

From figures 2.17, 2.18, and 2.19, it is clear that the additional degree of freedom for the visual system allows better performances of the extended Kalman filter, since the observation control strategy keeps a relatively centred image of the moving target (figure 2.19) compared to the results obtained in the second simulation example.

FIGURE 2.17. Trace of the state error covariance matrix (plotted against time (s))

FIGURE 2.18. Trajectory coordinates of the particle in the image plane: with respect to the y axis (dotted), with respect to the x axis (solid)

10.4. Fourth Example. In this example, we keep the same dynamics for the camera as in the third example and we assume that the particle is moving in 3D. Consequently its dynamics are defined as follows

where for simulations, we choose A equals

Given

FIGURE 2.19. Trajectories of the visual system with respect to the $z_i$ axis (dashdot) and the $y_i$ axis (solid)

FIGURE 2.20. Trajectory coordinates of the particle in the image plane: with respect to the y axis (dotted), with respect to the x axis (solid)

Thus, the measurement equations are

$$y_1(t) = -f\,\frac{\cos(x_{c1}(t))\sin(x_{c3}(t))\,x_a + \sin(x_{c1}(t))\sin(x_{c3}(t))\,y_a + \cos(x_{c3}(t))\,z_a}{-\cos(x_{c1}(t))\cos(x_{c3}(t))\,x_a - \sin(x_{c1}(t))\cos(x_{c3}(t))\,y_a + \sin(x_{c3}(t))\,z_a}$$

Adding one more degree of freedom to the particle brings about a more eccentric image

of the particle (figure 2.20). This drift in the image position from the center does not affect

the performances of the filtering process as long as we consider neither physical limits on

the position of the particle in the image frame nor a measurement noise covariance matrix

that depends on the position of the particle in the image frame.

From the simulation results we can conclude that keeping the image of a tracked feature within reasonable limits in the image plane, without constraining the performances of the extended Kalman filter [22], can be ensured by associating with each degree of freedom of the tracked target one degree of freedom of the visual system.

FIGURE 2.21. Trajectories of the visual system with respect to the $z_i$ axis (dashdot) and the $y_i$ axis (solid)

11. Conclusion

We have presented an approach for optimal measurement determination using a vision system. Simulation results suggest that this technique gives satisfactory results for the specification of the optimal measurement strategy. The developed approach is based on using a recursive extended Kalman filter to provide estimates of the target position and velocity. Even though the approach is suboptimal, since we linearize with respect to a predicted nominal trajectory because the covariance matrix of an extended Kalman filter is coupled to the estimation equation, the proposed technique allows an object to be tracked with only a predicted motion.

We have presented results from a number of simulated experiments which reveal many interesting properties of the proposed strategy. The results show that the estimation through the extended Kalman filter is sufficiently efficient to keep a highly manoeuvring target in view within reasonable image limits, a case that is excluded when assuming fixation in a tracking problem. The visual system trajectories generated by this approach depend not only on the dynamics of the target but also on the considered uncertainties and the observation horizon.

The recursive filter used in our algorithm implicitly allows the fusion of data from different sensors whenever needed. The extension of the algorithm to include the variation of the internal optical parameters of the camera is presented in the next chapter.

CHAPTER 3

Variations of Optical Parameters

In the relationship between viewer and viewed object, research has generally placed an emphasis on the object to be observed. The problems of inspecting an object, reconstructing a scene, or finding an object's pose or trajectory have been studied extensively from the viewed side: physical properties of the target object, features, texture, etc. On the other hand, the issues related to determining the properties of the observer that will be suitable for the vision task at hand have received considerably less attention. This latter area includes questions such as: what should the observer pose be? What values should the observer's optical parameters have? In the previous chapter we introduced a strategy to control the position of the observer. It was shown that the control of the object-observer position relationship is crucial in meeting the requirements of the task at hand. However, it is useful to define the concept of the object-observer relationship in a broader sense that includes not only observer position and orientation but also the optical settings associated with the viewpoint at hand. These settings are also the visual system attributes affecting the quality of the image. Examples of such attributes include the focus, the aperture and the zoom.

In this chapter, the optical system is modelled by a varifocal lens system. For such a lens model, the optical or perspective center of the lens defined in the pinhole model is replaced by the front and back nodal points (appendix A). Other characteristic properties of the optical system that are used in this research are the aperture stop and the distance to the image plane. The viewpoint parameters that are employed in this analysis are geometric and optical in nature. The geometric parameters correspond to the positional degrees of freedom of the visual system. On the other hand, the optical parameters include the back nodal point to image plane distance and the focal length. This chapter discusses two specific problems arising in this context:

• how the optical parameters can affect the quality of the image, and consequently the quality of the estimates;

• how to adjust the optical parameters to the task specification of recovering the position and the velocity of the moving target.

We decompose the first problem into two parts: how to measure the sharpness of focus with some geometric properties of the camera, and how to optimally determine the optical parameters within their physical constraints. After analyzing defocus as the variance of an additive Gaussian white measurement noise, defined as a function of the state error covariance matrix (chapter 2), we find that a method based on minimizing the cost criterion (equation 2.45) proves superior to other more classical approaches [24]. We solve the second problem by application of the proposed optimization approach to the case of a varifocal lens system.

This chapter is in five parts. Section 1 analyzes the effects of defocusing an image and presents an evaluation of the effects of focal lengths on the quality of the image, and mainly their effects on the estimates. Section 2 describes the model for the varifocal lens. Section 3 then develops the optimization approach that includes the geometric as well as the optical parameters in the optimization process. Simulation results are presented in section 4 for three particular cases. Finally, section 5 summarizes the results.

1. Analysis of Defocus

In this section we provide a framework for understanding the effects of defocusing

an optical system. First, we present some definitions and assumptions, and then employ

geometric optics to define the effects of defocus on the image.

1.1. Definition and Assumptions. There are two traditional ways to think about the image formation process. The first, classical geometric optics, relies upon ray-tracing and its results are a first order approximation. The second, classical physical optics, relies upon diffraction theory and its results are exact [48]. Since our approach in chapter 2 has been based on different geometric models, we consider in this chapter only geometric effects.

The amount of defocus or blurring depends solely on the distance to the surface of exact focus and the characteristics of the lens system; as the distance between the imaged point and the surface of exact focus increases, the imaged objects become progressively more defocused. If we could compute the distance to an imaged point, therefore, it seems possible that we could use our knowledge of the parameters of the lens system to compute the radius of the blurring circle [61].

Assuming a simple thin lens model [46], then following the Gaussian lens law we have

$$\frac{1}{u} + \frac{1}{v} = \frac{1}{f}$$

where $u$ is the distance between a point in the scene and the lens, $v$ the distance between the lens and the plane on which the image is in perfect focus, and $f$ the focal length of the lens. Thus

$$u = \frac{f v}{v - f}$$

For a simple thin lens, $f$ is a constant. If we fix the distance $v$ between the lens and the image plane to the value $v = v_0$, we have also determined a locus of points at distance $u = u_0$ that will be in perfect focus, i.e.,

$$u_0 = \frac{f v_0}{v_0 - f}$$

To explore what happens when a point at a distance $u > u_0$ is imaged, figure 3.1 shows the situation in which a lens of radius $a$ is used to project a point at distance $u$ onto an image plane at distance $v_0$ behind the lens. Given this configuration, the point will be focused at distance $v$ behind the lens, but in front of the image plane. Thus a blur circle is formed on the image plane. Note that a point at distance $u < u_0$ also forms a blur circle.

From the geometry of figure 3.1, we see that

Combining equations 3.4 and 3.3 and substituting the distance D for the variable u we obtain

FIGURE 3.1. Displacement of the detector plane causes blurring: a point feature at infinity will be imaged as a point on the focal plane. Blurring results if the detector plane and focal plane do not coincide (thin lens)

where $f_n$ is the f-number of the lens [13]. The blurring of the image is better described by a point spread function than by a blur circle radius [61]. Although this point spread function is quite complex, Pentland [61] shows that for white light the sum of the various functions obtained at different wavelengths has the general shape of a two-dimensional Gaussian. He defines the net effect, in light of the central limit theorem and the analysis of the sum of single-wavelength focus patterns, by a two dimensional Gaussian $N(\sigma, \sigma)$ with spatial constant $\sigma$. He defines $\sigma$ to be equal to the blurring radius $r$. This formulation of the blurring as a two dimensional Gaussian turns out to be very useful and easy to incorporate in our optimization approach derived in chapter 2.
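The thin lens geometry discussed in this section can be made concrete with a short numerical sketch. It assumes the usual similar triangles argument: a lens of radius `a` focuses a point at distance `u` at $v = fu/(u-f)$, so on an image plane at `v0` the blur circle has radius $a\,|v_0 - v| / v$. This expression, the numerical values and the function names are illustrative assumptions and are not claimed to be the exact equations 3.3 to 3.5 of the thesis.

```python
def focus_distance(u, f):
    """Image distance v from the Gaussian lens law 1/u + 1/v = 1/f (needs u > f)."""
    if u <= f:
        raise ValueError("object must be farther from the lens than the focal length")
    return f * u / (u - f)


def blur_radius(u, f, v0, a):
    """Blur circle radius on an image plane at v0 for a lens of radius a."""
    v = focus_distance(u, f)
    return a * abs(v0 - v) / v


f_mm, v0_mm, a_mm = 9.0, 9.5, 0.8      # lens radius 0.8 mm is an assumed value
u0_mm = f_mm * v0_mm / (v0_mm - f_mm)  # object distance in perfect focus
for u_mm in (100.0, u0_mm, 1000.0, 2000.0):
    print(f"u = {u_mm:7.1f} mm -> blur radius {blur_radius(u_mm, f_mm, v0_mm, a_mm):.4f} mm")
```

Points nearer or farther than the in-focus distance both produce a nonzero blur radius, which is the quantity that can then be treated as the spatial constant of a Gaussian measurement noise.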

1.2. Evaluation. In this paragraph, we consider the simple case of a point feature moving parallel to the optic axis. The visual system is characterized by a thin lens of constant focal length $f$ that obeys the Gaussian law (equation 3.1). The approach consists of adjusting the distance between the lens and the image plane of the camera so as to keep the feature in sharp focus, and of studying the effects of the focal length values on the quality of the estimates. It is evident that for a thin lens, the quality of the image depends directly on the distance between the lens and the image plane: if this distance is bounded, it follows that the quality of the image is a function only of the constant focal length of the lens and of the distance between the object and the lens. This relationship is depicted in the expression of the blurring radius (equation 3.4). It follows that the quality of the image depends on the amplitude of the blurring radius.

To examine the relationship between the blurring radius and the performances of the extended Kalman filter, we consider the simple example of a particle oscillating parallel to the optic axis. The dynamics of the particle are limited to a second order system. We assume a constant input to the system, allowing an oscillatory movement of the particle with respect to a fixed position (figures 3.2, 3.3 and 3.4). Let $x = [x_1\ x_2]$ be the system state vector, where $x_1$ defines the position of the particle along the $x_i$ axis (figure 3.3). The measurement vector $y(t)$ depends on the state and the optical parameters as follows

where $y_a$ = 0.3 m and $z_a$ = 0.3 m define the constant position coordinates of the particle with respect to the $y_i$ and $z_i$ axes, and $x_a = x_1(t)$ the position of the particle with respect to the $x_i$ axis. $v(t)$ defines the measurement noise, characterized by a time varying covariance matrix. This covariance matrix is defined as the sum of two terms, $R = R_1 + R_2$: the first term $R_1$ is derived from expression 3.5 as a function of the predicted position of the particle; the second term is taken equal to $R_2 = \mathrm{diag}[1\ 1]$. $f_e(t)$ corresponds to the equivalent focal length obtained for a fixed $f$ and under bound constraints on the displacement of the lens $v$. For each new predicted position value of the particle, $f_e(t)$ is computed such that the trace of the measurement error covariance matrix is minimized.

Simulation results are presented for P_0 = diag[10 10] and a constant disturbance noise
covariance matrix Q. The aperture is taken equal to 1.6mm. In the first simulation
(figure 3.2), we consider as initial conditions x_0 = [-0.5 0] and x̂_0 = [-0.6 0].
We consider for this simulation and the next one a thin lens with a focal length f = 9mm
and bounds on v of v_min = 3mm and v_max = 12.5mm.

Figure 3.3 shows a deterioration in the performance of the extended Kalman filter,
which is due to the quality of the measurements: the measurements are too noisy because of a
large blurring radius. In these simulations, the constraints defined by the lens Gaussian law
and the bounds on v cause this large blurring radius for a particle getting too close to the
lens (figure 3.3).

FIGURE 3.2. Simulation results for an oscillating particle parallel to the optic axis at a "far" distance from the lens
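The effect of the bounds on v can be illustrated numerically. The following is a minimal sketch, not the expression of equation 3.4 (which is not reproduced in this section): it assumes the standard thin-lens law 1/f = 1/u + 1/v and the usual similar-triangles blur circle, and it treats the 1.6mm aperture as a diameter. The function names and the two object distances are hypothetical.

```python
def sharp_image_distance(f, u):
    """Gaussian thin-lens law 1/f = 1/u + 1/v solved for the image distance v."""
    if u <= f:
        raise ValueError("object must lie beyond the focal point")
    return f * u / (u - f)


def clamp(value, low, high):
    """Restrict value to the interval [low, high]."""
    return max(low, min(high, value))


def blur_radius(f, u, aperture, v_min, v_max):
    """Blur-circle radius when the image plane is confined to [v_min, v_max]."""
    v_sharp = sharp_image_distance(f, u)
    v = clamp(v_sharp, v_min, v_max)  # best reachable image-plane position
    # Similar triangles: a cone of half-width aperture/2 converging at v_sharp.
    return 0.5 * aperture * abs(v_sharp - v) / v_sharp


# Values in millimetres: f, aperture and bounds on v are taken from the text;
# the object distances 300 and 20 are illustrative assumptions.
F, APERTURE, V_MIN, V_MAX = 9.0, 1.6, 3.0, 12.5
far = blur_radius(F, 300.0, APERTURE, V_MIN, V_MAX)
near = blur_radius(F, 20.0, APERTURE, V_MIN, V_MAX)
print(far, near)
```

Under these assumptions the far object can be focused exactly, while the near object would need an image distance beyond v_max, so a nonzero blur radius remains.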

Changing the focal length of the thin lens yields better performance of the
extended Kalman filter under the same constraints. Simulation results, presented in figure 3.4,
show that a smaller focal length allows the particle to get closer to the lens.

It follows that unless the feature position is in the field of focus of the lens [24, 67],
the quality of the estimates decreases. For all features of interest in the scene
to be in focus, or at least to allow a maximum of information on the position of the object,
we therefore need to plan a camera placement and a lens setting that keep the features of
interest within the depth of field, so that the blurring radius stays within the limits of the
adopted resolution in the image plane. For a thin lens model, however, changing the lens
setting amounts to carrying a bag full of interchangeable fixed thin lenses. Of course, we
assume here that the depth of field problem cannot be circumvented by varying the placement
of the sensor. Moreover, moving the sensor along the optic axis does not induce the
same image variation as a zooming process [45]. One approach to obtaining a variable focal
length, giving the visual system a zooming feature, is to consider a thick lens model for
the camera [47, 62]. In this dissertation we consider a more general model for the lens: a
varifocal, or zoom, lens model in which the focal length can be continuously varied by
moving one or more of the lens elements along the axis of the lens.

FIGURE 3.3. Simulation results for an oscillating particle parallel to the optic axis at a "close" distance from the lens

2. Varifocal lens

This section covers a simple type of varifocal lens that includes a positive and a negative
lens [34]. We assume that we can vary the distance d_1 between the two lenses as well as
the distance d_p between the back lens and the image plane.

2.1. Derivation and properties of the equivalent lens. In appendix A we
review the matrix method in paraxial optics. This method gives a simple way to
determine the properties of a single lens equivalent to a set of lenses. It also determines
the transformation between the object frame and the image frame for a given set of lenses
under the condition of paraxial optics (appendix A). In this chapter, this approach is used
to derive the optical properties of a varifocal lens.

FIGURE 3.4. Simulation results for an oscillating particle parallel to the optic axis at a "close" distance from the lens for a different focal length

FIGURE 3.5. The two-component zoom system

2.2. Definitions. The first step consists of defining the equivalent lens for the set
f_1, d_1, f_2. Based on the matrix approach, we have

where C defines the power of the equivalent lens in dioptres.
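As an illustration of the matrix approach, the sketch below builds the system matrix of two thin lenses separated by d_1 and reads off the cardinal data. It is a sketch under an assumed convention, not the derivation of appendix A: rays are written as (height, angle) column vectors, a gap is ((1, d), (0, 1)) and a thin lens is ((1, 0), (-1/f, 1)), so the equivalent focal length is -1/C. The sign attached to C in the thesis may differ. All function names and the numerical values are hypothetical.

```python
def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def thin_lens(f):
    """Refraction matrix of a thin lens of focal length f (assumed convention)."""
    return ((1.0, 0.0), (-1.0 / f, 1.0))


def gap(d):
    """Translation matrix over an axial distance d."""
    return ((1.0, d), (0.0, 1.0))


def two_lens_system(f1, d1, f2):
    """Matrix from the front lens (RP1) to the back lens (RP2)."""
    # Light meets lens 1 first, so its matrix is the rightmost factor.
    return matmul2(thin_lens(f2), matmul2(gap(d1), thin_lens(f1)))


def cardinal_data(m):
    """Equivalent focal length and plane positions measured from RP2."""
    (a, _b), (c, _d) = m
    return {
        "equivalent_focal_length": -1.0 / c,
        "back_focal_length": -a / c,  # RP2 to the second focal point
        "rp2_to_second_principal_plane": (1.0 - a) / c,
    }


# Illustrative positive/negative pair (arbitrary length units).
m = two_lens_system(f1=50.0, d1=30.0, f2=-25.0)
data = cardinal_data(m)
print(data)
```

With these values the back focal length minus the offset of the second principal plane equals the equivalent focal length, which is a quick consistency check on the matrix.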

2.2.1. Relationship object-image. After deriving the matrix M representing the optical
properties of the equivalent lens, and assuming that the object is at a distance p_1 from the
front lens, the position q_1 of the image plane that gives an exactly in-focus image of the
object is computed from the object-image transformation (appendix A)

For the object-image relationship to hold, we must have

It follows that the distance from the back lens to the image plane, allowing a well focused
image, is defined by

It is clear from equation 3.12 that the position of the image plane for a well focused image
depends on the optical properties of the visual system and on the distance of the imaged
object to the front lens.
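The dependence of the in-focus image distance on the object distance can be sketched as follows. This is not equation 3.12 itself. It assumes the same (height, angle) ray convention as the standard paraxial matrix method, in which the object-to-image matrix is T(q) M T(p) and imaging requires its upper-right element to vanish, giving q = -(A p + B)/(C p + D). The function name and the numerical matrices are hypothetical.

```python
def image_distance(m, p):
    """Distance from the back reference plane to the sharp image.

    m is the 2x2 system matrix ((A, B), (C, D)) and p the object distance
    in front of the first reference plane (both in the same length unit).
    """
    (a, b), (c, d) = m
    denom = c * p + d
    if denom == 0.0:
        raise ValueError("image at infinity")
    # Imaging condition: B + A*p + q*(D + C*p) = 0.
    return -(a * p + b) / denom


# Single thin lens with f = 9: must reduce to the Gaussian law.
single = ((1.0, 0.0), (-1.0 / 9.0, 1.0))
# Illustrative two-lens system (f1 = 50, d1 = 30, f2 = -25).
zoom = ((0.4, 30.0), (-0.004, 2.2))

q_single = image_distance(single, 300.0)
q_zoom = image_distance(zoom, 1000.0)
print(q_single, q_zoom)
```

For the single lens the result coincides with f p/(p - f), which is a useful check before applying the same function to the two-lens matrix.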

Defining d_p as the varying distance between the back lens and the image plane, it is
evident that for a visual system focusing on only one feature, the best value for the distance
d_p corresponds to the one defined by q_1 (equation 3.12). For more than one feature in the
image plane, lying at different distances from the camera, a problem arises: how to
choose the distance d_p that allows good performance of the visual system? To answer
this question we need to define, as in chapter 2, a metric to measure the performance of the
visual system. This metric needs to take into account the effects of variations in the
external as well as the internal parameters of the visual system.

Because the radius of the blurring circle is defined as a function of the internal parameters
of the system, and its effect is modelled by a Gaussian white noise centred at the focus
point with a covariance equal to the radius of blur, we are still able to use the same
metric as in chapter 2 to measure the performance of the visual system.

FIGURE 3.6. Geometry defining the radius of the confusion circle

The covariance of the measurement noise is defined as the sum of two terms. One term
equals the radius of blur; the second defines the covariance of the measurement noise due to
different sources. We can quote some of these sources:
• overload noise due to the limited size of the image plane of the camera;
• quantization error due to the pixel-level accuracy;
• image processing and computational errors.

2.3. Radius of blur: varifocal lens. As derived in the first section, the amount
of defocus depends on the distance of the image plane to the surface of exact focus. In
our definition of the varifocal lens, we consider as variables the distance d_1 between both
thin lenses and the distance d_p between the back lens and the image plane. More precisely,
d_p defines the distance between the second principal plane H_2 and the image plane (figure
3.6). To derive the expression of the radius of the confusion circle, we assume known:
• The predicted position of the object and its related uncertainty, described by the
error covariance matrix P_n of the state estimate.
• The f-number of the lenses, which is assumed to be constant. Its value defines the ratio
of the focal length to the aperture, or diameter, of the lens. It is a measure of
the amount of light which passes through the system. Small f-numbers imply large
openings; thus the f-number is sometimes referred to as the speed of the lens.

The distance between the back lens and the image plane allowing a well focused image is
defined by q_1. From equation 3.12, q_1 is derived as a function of the predicted position
p_1 of the object with respect to the front lens. It follows that the error on q_1 is also a
function of the state error covariance matrix. The expression of the back focal length f_b,
defined with respect to RP_2 (equation 3.6) and derived in appendix A, equals

Thus, the distance between RP_2 and the second principal plane H_2 equals (1 - A)/C.

Given q_1, p_1, f_b and the error covariance of the predicted position, and given d_p, the
distance between the second principal plane and the image plane, the blurring radius is
then defined for the worst case as the sum of the errors due to:
(1) The difference between the distance from the second principal plane H_2 to the image
plane and the distance of sharp focus defined with respect to H_2 (figure 3.6).
(2) The axial displacement of the image corresponding to a small displacement of the
object. This small displacement is caused by the uncertainty on the object position
and derived as a function of the state error covariance matrix.
In this definition of the blurring radius, terms due to different sorts of aberrations, mainly
geometric aberrations caused by the paraxial-ray assumption breaking down, are not
considered.
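The two contributions listed above can be combined in a short numerical sketch. The thesis expression is not reproduced in this section, so the following rests on stated assumptions: distances are measured from the principal planes of the equivalent lens, the sharp-focus distance follows q = f p/(p - f), the axial image shift is the first-order term |dq/dp| times the position standard deviation, and the axial error is converted to a radius by similar triangles with an aperture f/N. All function and parameter names are hypothetical.

```python
def sharp_focus_distance(f, p):
    """Sharp-image distance from the second principal plane (thin-lens form)."""
    return f * p / (p - f)


def axial_image_shift(f, p, sigma_p):
    """First-order image displacement for an object displacement sigma_p."""
    # |dq/dp| = f^2 / (p - f)^2 for q = f p / (p - f).
    return (f / (p - f)) ** 2 * sigma_p


def worst_case_blur_radius(f, f_number, p, sigma_p, d_p):
    """Sum of the defocus and uncertainty terms, scaled to a blur radius."""
    q = sharp_focus_distance(f, p)
    axial_error = abs(d_p - q) + axial_image_shift(f, p, sigma_p)
    aperture = f / f_number  # constant f-number, so the aperture follows f
    return 0.5 * aperture * axial_error / q


# Illustrative numbers (millimetres): f = 10, f-number 4, object at 510.
q_sharp = sharp_focus_distance(10.0, 510.0)
r_ideal = worst_case_blur_radius(10.0, 4.0, 510.0, 0.0, q_sharp)
r_uncertain = worst_case_blur_radius(10.0, 4.0, 510.0, 5.0, q_sharp)
r_defocused = worst_case_blur_radius(10.0, 4.0, 510.0, 5.0, q_sharp + 0.1)
print(r_ideal, r_uncertain, r_defocused)
```

Even with the image plane at the predicted sharp-focus distance, the position uncertainty alone leaves a nonzero radius in this sketch, which is why the measurement noise covariance ends up depending on the state error covariance.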

The position x_1 defines the real position of the object at an instant t; we drop t from
the equations for simplicity. x_1 is defined with respect to the inertial frame, with a
development similar to the one defined in chapter 2. Assume that the position of the object
with respect to the front lens, along the z axis, is defined by g(x_1). Then the mean value
p_1 of g(x_1) is defined by

of the random variable g(x_1) [60] in terms of the two first moments of x_1. The density
f(x_1) takes significant values in an interval near its mean E{x_1} of the order of its
standard deviation σ. We take σ to be equal to the square root of the position error
variance. This position is defined with respect to the inertial frame.

Considering that g(x_1) is "smooth", E{g(x_1)} can be approximated by [60]

Also, the approximation to the standard deviation of g(x_1) is defined by:

\[ \sigma_g^2 \approx \left[ g'(E\{x_1\}) \right]^2 \sigma^2 \qquad (3.16) \]
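The moment approximations for a smooth function of a random variable can be checked numerically. The sketch below uses the usual second-order expansion about the mean, E{g(x)} ≈ g(η) + g''(η)σ²/2, together with the first-order variance of equation 3.16. The form of the mean approximation is an assumption here, since that equation is not legible in this section. Derivatives are taken by central differences; the helper name and the test function x² are hypothetical.

```python
def approx_moments(g, mean, sigma, h=1e-3):
    """Approximate E{g(x)} and the standard deviation of g(x).

    Only the mean and standard deviation of x are used, as in the text.
    """
    g0 = g(mean)
    g1 = (g(mean + h) - g(mean - h)) / (2.0 * h)          # g'(mean)
    g2 = (g(mean + h) - 2.0 * g0 + g(mean - h)) / h ** 2  # g''(mean)
    mean_g = g0 + 0.5 * g2 * sigma ** 2   # second-order mean (assumed form)
    std_g = abs(g1) * sigma               # first-order spread, equation 3.16
    return mean_g, std_g


# Illustrative check with g(x) = x^2, mean 2 and standard deviation 0.1.
mean_g, std_g = approx_moments(lambda x: x * x, 2.0, 0.1)
print(mean_g, std_g)
```

For g(x) = x² the mean approximation is exact (4 + 0.01), while the standard deviation 0.4 is only the first-order value, which shows the kind of error the approximation accepts.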

It follows that the distance from the object to the front lens is a function of the estimated
position as well as its variance.

The second term considered in the definition of the radius of confusion is taken equal to
the standard deviation defined with respect to the second plane RP_2 by using the following
transformation

where Δp_1 represents the small displacement along the optic axis.

Consequently the covariance of the measurement noise is a function of the state error
covariance matrix.

3. Optimization Approach

In chapter 2, we derived an optimization procedure that yields the best position of a
pinhole camera for gathering the information needed to determine the dynamics of a tracked
object. In this chapter we consider, in addition to the external parameters, some internal
parameters, which include: (i) the focal length, obtained by varying the distance d_1 defined
in the previous section; (ii) the aperture, which varies indirectly since we keep a constant
f-number; (iii) the zoom, which can be defined by the simultaneous variations of the focal
length and the distance d_p between the back lens and the image plane. In our approach, we
consider an instantaneous variation of each optical parameter rather than including a
dynamic model for each one.

It follows that, in this chapter, in addition to the input commands controlling the
position and orientation of the visual system, we consider the optical parameters as inputs.
In the computation of the observation strategy, the measurement noise covariance matrix
is the sum of two terms: the first term is defined by the blurring radius and derived as a
function of the state error covariance matrix; the second term represents the covariance of
measurement noise caused by different noise sources.

Given the cost function defined by equation 2.29 and the dynamics of the camera
defined in equation 2.27 as the first constraint, the second constraint of our optimization
problem can be formulated as follows

which defines a "modified Riccati equation" where the noise covariance matrix is an inherent
function of the error covariance matrix P_n(t). As in chapter 2, and because it is a nonlinear
two-point boundary value problem, we use a numerical technique to solve the problem.
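To make the idea of a Riccati equation with a covariance-dependent noise term concrete, the sketch below integrates dP/dt = F P + P F' + Q - P H' R(P)^{-1} H P for a double integrator with a scalar position measurement. Everything specific is an assumption for illustration and is not the constraint of the thesis: the double-integrator dynamics, the affine form R(P) = r_const + r_gain * P[0][0], the explicit Euler scheme and all names. It only integrates the covariance forward and does not solve the two-point boundary value problem.

```python
def riccati_rhs(P, q_vel, r_const, r_gain):
    """dP/dt for the symmetric matrix P stored as (p00, p01, p11)."""
    p00, p01, p11 = P
    r = r_const + r_gain * p00  # measurement noise grows with position variance
    return (
        2.0 * p01 - p00 * p00 / r,   # position variance
        p11 - p00 * p01 / r,         # position/velocity covariance
        q_vel - p01 * p01 / r,       # velocity variance
    )


def integrate(P0, horizon, dt, q_vel, r_const, r_gain):
    """Explicit Euler integration of the covariance over the horizon."""
    P = P0
    for _ in range(int(round(horizon / dt))):
        dP = riccati_rhs(P, q_vel, r_const, r_gain)
        P = tuple(p + dt * d for p, d in zip(P, dP))
    return P


# Horizon T = 0.5 s as in the simulations; the other values are illustrative.
clean = integrate((10.0, 0.0, 10.0), 0.5, 1e-4, 0.01, 1.0, 0.0)
noisy = integrate((10.0, 0.0, 10.0), 0.5, 1e-4, 0.01, 1.0, 1.0)
print(clean, noisy)
```

The run with a covariance-dependent R ends with a larger position variance than the run with a constant R, which is the coupling that the optimization over the optical parameters is meant to reduce.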

4. Simulations

As simple but illustrative examples, we present simulation results for different cases:
(i) one particle moving in the horizontal x_i y_i plane; (ii) two particles that evolve with
the same dynamics in the same plane but at different distances from the sensor; (iii) two
particles that evolve in the same plane but with different dynamics.

In these simulations, the optic axis of the camera is considered to lie in a plane parallel
to the x_i y_i plane; our objective is to adjust the camera's parameters to minimize
the cost functional already defined (equation 2.29). For the first two cases, the dynamics
of the particle as well as the dynamics of the object are both defined as second-order
systems. Errors in these models are represented by disturbance white Gaussian noise inputs,
characterized by a constant covariance Q. Let x = [x_1 x_2] be the system state vector and
Q the constant covariance of the disturbance noise. Then the continuous-time system
model for each case is defined by:

where

The camera dynamics are limited to a second-order system modelling its rotation with
respect to the z_i axis. For x_c = [x_c1 x_c2], where x_c1 defines the orientation of the camera
with respect to the inertial frame and x_c2 its angular velocity, we have

where

Assuming the focal lengths of the two thin lenses are f_1 and f_2, their separation being
d_1, the first optical optimization parameter, the back focal length of the combination
is given by

Similarly, the second optical optimization parameter is defined by the distance d_p between
the back lens and the image plane. Although we consider a varifocal system,
this distance is important in defining a minimum radius of blur in the case of more than
one feature in the image plane. The distance between the back lens and the image plane
explicitly affects the covariance of the measurement noise R(P_n(t)). In fact, in addition to
being a function of P_n(t), R(P_n(t)), characterized by a diagonal matrix, is explicitly related
to the internal parameters. For one feature point in the image plane, the element R(1,1)
of R(P_n(t)) is defined by

where

β defines the product of the ratio of the back focal distance f_b to the aperture by the
discretization constant p (pixel size). The measurements y_j at time t are required to depend
on the states and the optical parameters according to the measurement equations,

where d_x and d_z define the constant positions of the particle along the x_i and z_i axes of the
inertial frame.

4.1. Simulation Results.

4.1.1. First Example. In this case we assume only one particle in the field of view of
the camera. We consider a two-component zoom system characterized by a focal length for
the lenses equal to 3mm and limits on the possible values of the two optical parameters:
1cm ≤ d_1 ≤ 2.5cm and 0.1cm ≤ d_p ≤ 15cm. We consider a constant observation horizon
T = 0.5s, a constant model disturbance noise covariance matrix Q and the
initial conditions P_0 = diag[0.1 0.1], x_0 = [0 0.5] and x̂_0 = [0 0.3]. First, we assume a fixed
camera; it follows that the properties of the Kalman filter are governed by the values of
the optical parameters. The simulation results show that good performance is
obtained for the extended Kalman filter (figure 3.7). Figure 3.8 shows that the position of
the particle in the image plane is not centered. This result can be explained by the fact
that the variations of the optical parameters allow the camera to zoom on the particle and,
since the direction of the optic axis is fixed, we obtain an image that is not centered.

FIGURE 3.7. Performances of the Kalman filter: time varying internal parameters (panels: position estimation error; velocity estimation error; log of the trace of the state error covariance matrix)

FIGURE 3.8. Position of the particle in the image plane: time varying internal parameters (horizontal axis: time (s))

FIGURE 3.9. Trajectory of the particle in the image plane

FIGURE 3.10. Performances of the extended Kalman filter: time varying external and internal parameters (horizontal axis: time (s))

The rotation of the camera with respect to the z_i axis allows a more centered image with
similar performance of the extended Kalman filter (figures 3.9 and 3.10). In fact, varying
the direction of the optic axis (figure 3.11) offers a sort of tracking of the particle, allowing
a more centered image in a high-resolution image frame: p = 1.5μm. It is important
to mention that at this level of simulations, we cannot distinguish between the real effects
of the zooming process and the effects due to a non-centered image on the position of the
feature in the image plane. It is also significant to refer to the resolution value considered
in the simulations. In fact, the position of the particle in the image frame depends on the
chosen value of the resolution: a high resolution induces a larger position value of the
particle in the image frame.

FIGURE 3.11. Orientation of the camera with respect to the z_i axis

FIGURE 3.12. Variations of the optical parameters (panels: variations of the optical parameter d_1; variations of the optical parameter d_p)

4.2. Second Example. In this example, we consider two features in the field of
view of the camera. These two features belong to the same rigid body, undergoing the
same movement as the one defined in the first example. Similar parameters and limits are
assumed for the optical system as in the previous example. We take a constant observation
horizon T = 0.5s, a constant model covariance matrix Q and the initial
conditions P_0 = diag[1 1], x_0 = [0 0] and x̂_0 = [-0.01 0]. The features are considered at a
distance of 20cm from each other. The behavior of the camera is completely different from the
previous example. In this example, to keep an optimal performance of the extended
Kalman filter, the internal parameters are adjusted to allow an equal radius of blur for both
particles (figure 3.13). Consequently the camera is not zooming on any specific feature and,
unless other constraints on the parameters are considered, an optimal position is the one
allowing minimum measurement noise for the two features.

FIGURE 3.13. Blurring radius difference: two features in the image plane (plot title: difference in the blurring radii between the two features)
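The equal-blur behaviour can be reproduced with a simplified stand-in for the actual cost. The sketch below is not the optimization of the thesis, which minimizes the trace of the error covariance. It assumes a blur radius of the form (aperture/2)|d - q|/q for each feature and minimizes the larger of the two radii over the image-plane distance d. Under that assumption the optimum equalizes the two radii, at d = 2 q_a q_b/(q_a + q_b). All names and numerical values are hypothetical.

```python
def blur(d, q, aperture):
    """Blur radius of a feature whose sharp image lies at distance q."""
    return 0.5 * aperture * abs(d - q) / q


def equal_blur_plane(q_a, q_b):
    """Image-plane distance at which both features have the same blur radius."""
    return 2.0 * q_a * q_b / (q_a + q_b)


def minimax_plane(q_a, q_b, aperture, steps=20000):
    """Grid search for the distance minimizing the larger of the two radii."""
    lo, hi = min(q_a, q_b), max(q_a, q_b)
    best_d, best_cost = lo, float("inf")
    for i in range(steps + 1):
        d = lo + (hi - lo) * i / steps
        cost = max(blur(d, q_a, aperture), blur(d, q_b, aperture))
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d


# Illustrative sharp-focus distances of the two features (millimetres).
Q_A, Q_B, APERTURE = 10.2, 10.5, 2.5
d_equal = equal_blur_plane(Q_A, Q_B)
d_search = minimax_plane(Q_A, Q_B, APERTURE)
print(d_equal, d_search)
```

The closed-form distance and the grid search agree, which matches the observation above that the camera settles between the two features instead of focusing on either one.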

FIGURE 3.14. Performance of the Kalman filter: log of the trace of the state error covariance matrix for the example of two features with the same dynamics

4.3. Third Example. In this example we consider two particles with dynamics
characterized by

where

Simulation results are presented for a constant disturbance noise covariance matrix Q and
the following initial conditions: P_0 = diag[1 1 1 1], x_0 = [-0.2 0 0.2 0] and
x̂_0 = [-0.01 0 -0.02 0]. The limits on the possible values of the two optical parameters are:
1cm ≤ d_1 ≤ 10cm and 0cm ≤ d_p ≤ 10cm. We consider a constant observation horizon
T = 0.5s. The measurement equations are

where the constant term of the measurement noise covariance matrix is taken equal to
diag[1 1 1 1 1]. Similar conclusions are reached in this example too: unless we add
another constraint to the cost function, the camera does not zoom on either of the particles
but adjusts its parameters to allow a minimum measurement noise (figure 3.15) and good
performance of the extended Kalman filter (figure 3.16).

FIGURE 3.15. Blurring radius difference: two particles with different dynamics

5. Conclusion

In this chapter, we present a strategy of optimal measurements using a vision system.
The considered approach allows the integration of different modules of active vision:
tracking, focusing and zooming. It presents a performance function that permits the
system, at the same time, to coordinate between different variables, internal and external,
and to ensure an optimal filtering accuracy. Simulation results show that the presented
strategy dynamically identifies the changing requirements of the vision task and accommodates
the imaging parameters to these requirements. The simulation results for two features show
that, unless more constraints are added, the variations of the optical parameters allow a
lower sensitivity to errors in the measurements by reducing the blurring radii, without
zooming on any particular feature.

The introduction of the optical parameters in the optimization process allows better
performance of the visual system in situations where the filter otherwise diverges, for
example a particle getting too close to the camera, thereby alleviating the need for more
degrees of freedom in the visual system.

FIGURE 3.16. Variations of the optical parameters and performances of the extended Kalman filter: real positions (solid), estimated positions (dashed) (panels: variations of d_1; variations of d_p; log of the trace of the state error covariance matrix; trajectories of the particles; horizontal axis: time (s))

CHAPTER 4

The Occlusion Problem

1. Introduction

Sensor systems have received widely varying treatment in the systems literature
and the perception literature. Systems theory views sensors as relatively simple, known
measurement systems corrupted by some type of noise process. The emphasis is on gaining
enough information from the sensor to control a system or estimate parameters. This has
resulted in a number of methods for estimating parameters from noisy observations.

Perception has tended to disregard the "low level" aspect of the sensor in favor of a
description of what geometric primitives, features, or high-level information it returns. The
interface between these two views has come primarily at the level of feature extraction.
However, feature extraction does have to yield to the inherent limitations of the sensor,
including the noise, the ambiguity and, more importantly, the limitations of the field of view,
factors many perception programs assume do not exist.

The effects of sensor constraints must be explicitly evaluated throughout a system using
sensor information. A complete model of a sensor would, of course, include a description
of the effect of all influences on the output of the sensor. We can divide these factors into
those which are under the control of the system, and those which cannot be controlled.
The uncontrolled factors include quantization, model uncertainty and unavoidable physical
constraints. The physical constraints are due either to the geometry of the target object or
to properties of the environment.

For the purposes of this dissertation, we describe in this chapter three approaches
to defining the strategies of measurements. Two approaches are based on a probabilistic
technique in conjunction with sensor control to deal with observability problems that arise
under sensor constraints, mainly field of view constraints due either to the geometry of
the observed target or to the environment. These limitations, characterized by the
non-observability of the tracked target, are dealt with in a Bayesian framework. The third
approach, presented at the end of this chapter, deals with the same kind of constraints.
However, it is based on an approach of gaze change between different tractable features.

2. Description of the Problem

In chapter 2, the problem of automatically generating camera locations for observing
an object is defined, and an approach to its solution is presented. Our approach, which uses
models of the object and the visual system, is based on meeting the requirement that the
trace of the estimation error covariance matrix be minimum. This assumes that the tracked
features are observable at each instant of time. Considering the physical constraints on the
camera and the target:
• geometry of the target,
• limited field of view of the camera,
• limited number of degrees of freedom for the camera: for example, only a rotation
with respect to a certain axis is available,
• an unstructured environment: appearance of unexpected features in the image plane;
the observability assumption does not hold anymore. Our objective is then to develop
algorithms based on engineering judgement on how to modify the basic observation algorithm
to handle some of these "real life" problems and to deal with the observation problem.
Special attention is given to occlusion problems due to the geometry of the target. In the
next section we develop the approach to deal with this problem at the filtering level.

3. Filtering: Single Measurement Problem

Assuming a vector of measurements is carried out at instant k + 1, we consider two

cases.

• Case 1: The measurement vector y(k + 1) contains images of tracked features. Therefore
y(k + 1) carries information about the state x(k + 1). We refer to this case as
tracking.
• Case 2: The measurement vector y(k + 1) contains no information about x(k + 1).
It is a total occlusion situation: none of the tracked features appears in the image
plane. We refer to this case as an occlusion.

At instant k + 1, y(k + 1) contains images of tracked features. From a probabilistic point of
view, all information about x(k + 1) given the measurement y(k + 1) is contained in the
posterior conditional probability density function (pdf) p(x(k + 1)|y(k + 1)). We define the
event V by
• V = T(k + 1): denotes the tracking event,
• V = O(k + 1): denotes the occlusion event.

For simplicity we limit the number of tracked features to one. Otherwise, we need to
associate an event with each feature.

In the update process, we assume we know at k + 1 if V = T(k + 1) or V = O(k + 1).
It follows that

p(y(k + 1)|x(k + 1), T(k + 1)) is derived as a function of the probability of the measurement
noise. p(x(k + 1)|T(k + 1)) can be written

As a function of x(k + 1), p(T(k + 1)|x(k + 1)) is either 1 or 0 and

For the occlusion case, the measurement y(k + 1) carrying no information on the state,
we have

At this level of the development, we may consider two cases:
• p(x(k + 1)|O(k + 1)) = p(x(k + 1)): no information can be considered in the update
of the state vector x(k + 1).
• p(x(k + 1)|O(k + 1)): some information from the occlusion can be added in the update
of the state vector x(k + 1).

4. Information from the Occlusion Case

For a visual system, the fact that the tracked feature does not appear in the image plane
in a given time interval provides information, namely that no tracked feature was in the
field of view of the sensor during that particular interval. Usually, in an estimation problem,
that information is not used. This is not important for a fast moving object, but becomes
relevant at low speeds. A modification useful for this kind of problem is introduced to
improve estimation. Given O(k + 1), the event defined as the absence of the tracked feature
in the image plane between two samples, by Bayes' rule we define the following pdf:

\[ p(x(k+1) \mid O(k+1)) = \frac{p(O(k+1) \mid x(k+1))\, p(x(k+1))}{p(O(k+1))} \qquad (4.5) \]

In order to keep complexity within reasonable limits, p(x(k + 1)) is assumed to be Gaussian.
The probability density function p(O(k + 1)|x(k + 1)) is not Gaussian, but will be
approximated by a Gaussian, so that the posterior pdf p(x(k + 1)|O(k + 1)) is Gaussian. Given
the dynamics of the object as well as its geometry, one can write

where x_1(k + 1) defines the position state of the target at instant k + 1. B_1 and B_2 define
the limits of the pulse interval, derived as a function of the geometry of the object. The
definition of B_1 and B_2 can be extended to include the physical properties of the camera,
mainly its field of view. This conditional probability is a rectangular pulse whose location
along the axis x_1(k + 1) depends on x(k + 1). Since the denominator p(O(k + 1)) is only
a normalizing factor, equation (4.5) calls for multiplying the Gaussian p(x(k + 1)) by a
different pulse function for each value of x(k + 1). If each rectangular pulse is replaced by
a Gaussian approximation, the result consists of different Gaussian pieces over the different
intervals of x_1(k + 1). However, if p(x(k + 1)) is negligibly small for all x_1(k + 1) outside
the interval B_1 ≤ x_1(k + 1) < B_2, then only one of the pulses of p(O(k + 1)|x(k + 1)) needs
to be approximated. The simplest way to derive an approximation is to fit the mean and
the variance. The function to be approximated is a uniform distribution, running from B_1
to B_2. The mean is η = (B_1 + B_2)/2, and the variance is defined by

\[ \sigma^2 = \frac{(B_2 - B_1)^2}{12} \]

Therefore, one uses

Now to calculate p(x(k + 1)|O(k + 1)) according to equation (4.5), and assuming a
two-component state vector, the result is clearly Gaussian, so consideration may be limited to
the mean and variance terms. Let p(x(k + 1)) = N(x̂(k + 1|k), P(k + 1|k)) and
p(x(k + 1)|O(k + 1)) = N(m, M^{-1}), and let S = P^{-1}(k + 1|k). Equating exponents on each
side of equation (4.5) yields

Matching terms gives the following:

Those equations are written more compactly as

Let P_o(k + 1|k + 1) = M^{-1}, i.e. P_o(k + 1|k + 1) is the updated P(k + 1|k) given the
occlusion information. Using the matrix inversion lemma,

where

\[ \alpha(k+1) = \frac{1}{\sigma^2 + v^T(k+1)\, P(k+1|k)\, v(k+1)} \]

and
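The occlusion update of this section can be sketched for a two-component state. The sketch fits a Gaussian to the rectangular pulse by matching the mean and variance of a uniform density on [B_1, B_2], then fuses it with the Gaussian prediction through the rank-one form given by the matrix inversion lemma, with α = 1/(σ² + v'Pv). It assumes that v is the constant vector [1, 0]', which selects the position component. The definition of v(k + 1) in the thesis is not recoverable from this section, and all function names are hypothetical.

```python
def pulse_moments(b1, b2):
    """Mean and variance of a uniform density on [b1, b2]."""
    return 0.5 * (b1 + b2), (b2 - b1) ** 2 / 12.0


def occlusion_update(x_pred, P_pred, b1, b2):
    """Fuse N(x_pred, P_pred) with the Gaussian-approximated pulse on x_1."""
    eta, var = pulse_moments(b1, b2)
    # v = [1, 0] (assumed), so P v is the first column of P.
    pv = (P_pred[0][0], P_pred[1][0])
    alpha = 1.0 / (var + P_pred[0][0])  # 1 / (sigma^2 + v' P v)
    resid = eta - x_pred[0]
    x_new = tuple(x_pred[i] + alpha * pv[i] * resid for i in range(2))
    P_new = tuple(
        tuple(P_pred[i][j] - alpha * pv[i] * pv[j] for j in range(2))
        for i in range(2)
    )
    return x_new, P_new


# Illustrative prediction and pulse limits.
x_upd, P_upd = occlusion_update((0.0, 0.0), ((4.0, 0.0), (0.0, 1.0)), 1.0, 3.0)
print(x_upd, P_upd)
```

In this example the position variance drops from 4 to 4/13, the same value as the information-form result 1/(1/4 + 3), while the uncorrelated velocity component is left unchanged.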

5. Modified Nonlinear Estimation: Occlusion Information

The discrete measurements y(k + 1) of the system (2.9) are defined by

where v_d(k + 1) is the equivalent discrete-time zero-mean white Gaussian noise sequence
with constant covariance R_d. The size of the vector h(x(k + 1), x_c(k + 1)) depends on the
number of features in view. Our objective is to estimate the system state x(k + 1) given the
noisy observations and the system dynamics. In this chapter, for simplicity, we consider
only one tracked feature and we treat the occlusion as the absence of the feature from the
image plane. We consider again the extended Kalman filter solution. Based on a first-order
linearization, we can extend to this solution the results of the probabilistic approach defined
in the previous sections 2 and 4. The algorithm is then defined as follows:

Initialization: at k = 0

x̂(0|0) = x(0); P(0|0) = P_0

Prediction:

where f_d[x̂(k|k)] represents the equivalent discrete-time dynamics of the object,
F_d[x̂(k + 1|k)] the corresponding Jacobian of f_d[x̂(k|k)] derived with respect to
x̂(k|k), and Q_d(k + 1) the covariance of the equivalent discrete-time zero-mean
white Gaussian noise sequence.

Update step: The gain vector is then defined by

Since V = T(k + 1) it follows that

consequently, the update process is similar to the one used in a classical extended
Kalman filter

(h[x(k + 1), x_c(k + 1)] - h[x̂(k + 1|k), x_c(k + 1)])

\[ P(k+1|k+1) = P(k+1|k) - K(k+1)\, H(\hat{x}(k+1|k), x_c(k+1))\, P(k+1|k) \qquad (4.24) \]

Occlusion V = O(k + 1):

The presented algorithm is a suboptimal solution since the probability distribution function
of the state is no longer Gaussian. This approach is based on the fact that we know, at each
new measurement, whether or not it carries a projection of the tracked feature. For a single
filter formulation that does not need to check the properties of each measurement, we propose
a probabilistic approach derived from the probabilistic editing algorithm [76] and taking
into consideration the information derived from occlusion (section 4). This approach is
presented in appendix B.
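The structure of the algorithm, a common prediction followed by either a tracking update or an occlusion update, can be sketched as below. The sketch is not the filter of the thesis. It assumes constant-velocity dynamics for a (position, velocity) state, a scalar bearing measurement atan2(x_1, depth) in place of the actual h, and the Gaussian-approximated pulse of section 4 with a selection vector [1, 0]'. A measurement of None stands for the occlusion event. All names are hypothetical.

```python
import math


def predict(x, P, dt, q_vel):
    """Constant-velocity prediction: x = F x, P = F P F' + Q."""
    x_pred = (x[0] + dt * x[1], x[1])
    p00, p01, p11 = P[0][0], P[0][1], P[1][1]
    n00 = p00 + 2.0 * dt * p01 + dt * dt * p11
    n01 = p01 + dt * p11
    n11 = p11 + q_vel
    return x_pred, ((n00, n01), (n01, n11))


def rank_one_update(x, P, direction, innovation, noise_var):
    """Kalman-type update for a scalar (pseudo-)measurement direction . x."""
    pv = tuple(P[i][0] * direction[0] + P[i][1] * direction[1] for i in range(2))
    s = noise_var + direction[0] * pv[0] + direction[1] * pv[1]
    gain = (pv[0] / s, pv[1] / s)
    x_new = (x[0] + gain[0] * innovation, x[1] + gain[1] * innovation)
    P_new = tuple(
        tuple(P[i][j] - gain[i] * pv[j] for j in range(2)) for i in range(2)
    )
    return x_new, P_new


def filter_step(x, P, dt, q_vel, y, depth, r_meas, bounds):
    """One step: tracking update if y is given, occlusion update otherwise."""
    x_pred, P_pred = predict(x, P, dt, q_vel)
    if y is not None:  # tracking event: linearize h about the prediction
        h = math.atan2(x_pred[0], depth)
        jac = (depth / (depth ** 2 + x_pred[0] ** 2), 0.0)
        return rank_one_update(x_pred, P_pred, jac, y - h, r_meas)
    # Occlusion event: fuse with the moment-matched pulse on the position.
    b1, b2 = bounds
    eta, var = 0.5 * (b1 + b2), (b2 - b1) ** 2 / 12.0
    return rank_one_update(x_pred, P_pred, (1.0, 0.0), eta - x_pred[0], var)


x0, P0 = (0.0, 0.0), ((1.0, 0.0), (0.0, 1.0))
y_seen = math.atan2(0.2, 2.0)
x_trk, P_trk = filter_step(x0, P0, 0.1, 0.01, y_seen, 2.0, 0.01, (0.5, 1.5))
x_occ, P_occ = filter_step(x0, P0, 0.1, 0.01, None, 2.0, 0.01, (0.5, 1.5))
print(x_trk, x_occ)
```

Both branches reduce the predicted position variance: the tracking branch moves the estimate toward the observed bearing, and the occlusion branch moves it toward the centre of the pulse interval.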

6. Description of the Control Strategy

Two approaches are considered to define the control strategy of the visual system. In the first, we do not consider the occlusion information in the derivation of the command of the visual system. The second approach, on the other hand, relies on the Bayesian framework defined in section 2 and section 4.

6.1. First Approach. In this approach, the no-occlusion-information controller (NOIC) is based on the development of chapter 2. The linearized Kalman filter defined with respect to the nominal trajectory $x_n(\cdot)$ is characterized by the following Riccati equation


where

$x_n(\cdot)$ corresponds to the predicted trajectory of the target. In this development, we associate with such a nominal state trajectory a sequence of nominal measurements

It follows that in the development of the linearized Kalman filter with respect to the nominal trajectory, we take into consideration the presence of measurements at each instant of time, disregarding the occlusion problem.

Therefore, an optimization technique similar to the one derived in chapter 2 is used to calculate the suboptimal command of the visual system.

6.2. Second Approach. The second approach, the occlusion-information-based controller (OIC), is characterized by the introduction of weighting terms in the computation of the error covariance matrix of the linearized Kalman filter. These weighting terms are computed as a function of the probability that an occlusion occurs. In fact, given the state prediction, which defines the nominal trajectory in our approximation, two situations can arise in computing the solution of the Riccati equation: either there is an occlusion, and the solution of the Riccati equation is the updated term $P(k|k)$ in equation 4.13, or there is no occlusion, and the solution of the Riccati equation is the updated expression defined in equation 4.24. Since we do not know a priori whether $V = I(k+1)$ or $V = O(k+1)$, we can only compute the probability of the occlusion event. This probability is derived as a function of the updated state and covariance matrix. For this, we assume that we update the state and the covariance matrix at instant $k = 0$ and we define the observation strategy for a receding horizon time interval $k = 1 \ldots N$. At instant $k$ we have


Therefore, considering the initial conditions $x_n(0) = \hat{x}(0|0)$ and $P(0) = P(0|0)$, at instant $k$ we have

and

where $x_{n1}(k)$ defines the position component of the vector $x_n(k)$ and

defines the normalized error function. Then

In the same spirit we can compute the posterior covariance matrix given the nominal trajectory

$$
\begin{aligned}
P_n(k+1|k+1) ={}& \Pr(O(k))\,P_O(k+1|k+1) + \big(1 - \Pr(O(k))\big)\,P(k+1|k+1) \\
&+ \Pr(O(k+1))\,\big(1 - \Pr(O(k+1))\big)\,\big[\mu(x(k+1)\,|\,y(k+1), V = O(k+1)) - \mu(x(k+1)\,|\,y(k+1), V = I(k+1))\big] \\
&\quad\times \big[\mu(x(k+1)\,|\,y(k+1), V = O(k+1)) - \mu(x(k+1)\,|\,y(k+1), V = I(k+1))\big]^T
\end{aligned} \tag{4.35}
$$

It is evident from the expression of $\Pr(O(k))$ that its value depends not only on the predicted state but also on the uncertainty of this prediction. Thus we are able to condition the weighting not only on the predicted value but also on the uncertainty of that prediction.
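The weighting of the two hypotheses can be illustrated numerically. The sketch below makes two assumptions that are not taken from the text: the occlusion probability is modelled as the probability that a Gaussian-distributed position falls inside a single interval $[B_1, B_2]$, computed with the error function, and the two conditional estimates are merged by standard moment matching of a two-component mixture. The function and argument names are hypothetical.

```python
import math

def prob_in_interval(mean, var, b1, b2):
    """Probability that a Gaussian N(mean, var) variable lies in [b1, b2]."""
    scale = math.sqrt(2.0 * var)
    return 0.5 * (math.erf((b2 - mean) / scale) - math.erf((b1 - mean) / scale))

def mixture_moments(p_occ, x_occ, P_occ, x_vis, P_vis):
    """Mean and covariance of a two-hypothesis mixture.

    p_occ is the weight of the occluded hypothesis; (x_occ, P_occ) and
    (x_vis, P_vis) are the conditional estimates and covariances.
    """
    n = len(x_occ)
    mean = [p_occ * a + (1.0 - p_occ) * b for a, b in zip(x_occ, x_vis)]
    diff = [a - b for a, b in zip(x_occ, x_vis)]
    cov = [[p_occ * P_occ[i][j]
            + (1.0 - p_occ) * P_vis[i][j]
            # spread term: grows with the distance between the two estimates
            + p_occ * (1.0 - p_occ) * diff[i] * diff[j]
            for j in range(n)] for i in range(n)]
    return mean, cov

p = prob_in_interval(0.0, 1.0, -1.0, 1.0)
P_same = [[0.1, 0.0], [0.0, 0.1]]
mean, cov = mixture_moments(0.5, [1.0, 0.0], P_same, [0.0, 0.0], P_same)
```

Both the predicted value (through `mean`) and its uncertainty (through `var`) enter `prob_in_interval`, so a broad prediction spreads the weight between the two hypotheses even when its mean lies clearly on one side of the bounds.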

7. Simulations

The theory is verified by performing a number of simulations. As simple but illustrative examples, we present the results of simulations for the case of a cubic target evolving in the plane parallel to the perception axis of the visual system and normal to the axis $Z_i$.

FIGURE 4.1. The configuration of the spinning target with respect to the observer

We suppose one degree of freedom for the visual system, a rotation with respect to the $Z_i$ axis (figure 4.1), and a spinning cubic target. One feature, corresponding to the upper right corner of the cube, is considered for the observation process. It is evident that this feature leaves the field of view during certain time intervals.

First, a comparative study of the performance of the proposed filter and that of a classical extended Kalman filter is presented.

Second, observation strategies are considered for the visual system. The two control approaches defined in the previous section are simulated and their performances compared. Their performances are also compared to the performance of a control strategy where the presence of an occlusion is ignored in all steps of the observation process: computation of the observation strategy and filtering.

The probabilistic algorithms considering the occlusion event are based on two simple hypotheses: the measurement either carries the image of the tracked feature or it does not. The prior and posterior probabilities of an occlusion are defined as functions of $B_1$ and $B_2$, the pulse limits. In the considered simulation examples, these bounds depend on the geometry of the cube and are defined as follows

$B_2 = (n + 1)\pi + 5\pi/4$, $n$ even integer

7.1. Filtering. We first consider that the camera is fixed, in order to focus on the filtering aspect. Consequently, the simulations presented in this section concern the filtering process. A double integrator is considered to define the dynamics of the object. The input noise is taken equal to $w = N(0, 0.01)$ and the measurement noise equals $v = N(0, 1)$. The initial conditions are defined by $P(0|0) = \mathrm{diag}[0.1\ 0.1]$, $\hat{x}(0) = [0\ 0.3]$ and $x(0) = [0\ 0.05]$. A simple pinhole model is assumed for the camera. This model assumes one perfect lens, with fixed optical properties, obeying the perspective projection law. The measurements at time $k+1$ are defined by

$y_1(k+1) =$

where

$d_x$, $d_y$, $d_z$ define the geometric properties of the cube in the inertial frame, taken equal to 10 cm, and $d = 1$ m is the initial distance between the object and the visual system.
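The double-integrator dynamics used in these simulations can be sketched as follows. This is an illustrative discretization, assuming the input noise acts as an acceleration held constant over one sample and using a sampling time of 0.01 s; the function name and the seeded random generator are choices made for the example only.

```python
import random

def simulate_double_integrator(x0, dt, steps, noise_var=0.0, seed=0):
    """Simulate x(k+1) = F x(k) + G w(k) with w ~ N(0, noise_var).

    The state is (position, velocity); F = [[1, dt], [0, 1]] and
    G = [dt^2 / 2, dt] follow from holding w constant over one sample.
    """
    rng = random.Random(seed)
    sigma = noise_var ** 0.5
    pos, vel = x0
    trajectory = [(pos, vel)]
    for _ in range(steps):
        w = rng.gauss(0.0, sigma) if sigma > 0.0 else 0.0
        pos, vel = pos + dt * vel + 0.5 * dt * dt * w, vel + dt * w
        trajectory.append((pos, vel))
    return trajectory

# Noise-free and noisy runs from the true initial state [0, 0.05].
clean = simulate_double_integrator((0.0, 0.05), 0.01, 100)
noisy = simulate_double_integrator((0.0, 0.05), 0.01, 100, noise_var=0.01, seed=1)
```

Fixing the seed makes a noisy run reproducible, which is convenient when the same trajectory has to be fed to two filters for comparison.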

Simulation results are presented in figures 4.2, 4.3 and 4.4. Figure 4.2 presents the real trajectory of the object in the inertial frame and the time intervals corresponding to the occlusion event. It is clear from figure 4.2 that the faster the object rotates, the shorter the time interval corresponding to the out-of-view state.

It is evident that without the probabilistic approach, the extended Kalman filter presents deteriorating performance: the larger the initial uncertainty, characterized by the error covariance $P_0$, the worse the performance of the classical extended Kalman filter (figures 4.3 and 4.4). Moreover, this performance degrades as a function of the duration of the occlusion: as long as the tracked feature is out of view, the uncertainty on the state estimate increases. It follows that the performance of the filter depends, in this case, also on the angular velocity of the object.

FIGURE 4.2. Real trajectory of the object and performance of the Kalman filter: probabilistic approach, $P_0 = \mathrm{diag}[0.1\ 0.1]$

FIGURE 4.3. Extended Kalman filter performance: error in the trajectory between the real orientation and the estimated one for $P_0 = \mathrm{diag}[1\ 1]$: classical Kalman filter (dashed) and probabilistic approach (solid)

Consequently, the probabilistic algorithm substantially improves the state estimate. In fact, the larger the initial standard deviation of the trajectory of the target, the broader the probability peak, spreading the probability weight over a larger range of state values. Considering the information carried by the occlusion, its variance is assumed to be smaller than the one on the trajectory. Therefore, the conditional density of the updated orientation presents a narrower peak due to a smaller variance, indicating that we are increasing the certainty of our estimate.

FIGURE 4.4. Extended Kalman filter performance: error in the trajectory between the real state and the estimated one for $P_0 = \mathrm{diag}[10\ 10]$: classical Kalman filter (dashed) and probabilistic approach (solid)

7.2. Observation strategies. In this section, we consider simulation results for the two previously defined control approaches. The same dynamics are considered for the object and similar initial conditions are considered for the simulations. Three combinations of filtering and control are proposed, depending on where the occlusion information is considered:
- neither at the filtering level nor at the observation control level: no occlusion information in filtering and control (NOIFC);
- at the filtering level but not at the observation control level (OIFNC);
- at both levels (OIFC).

The following simulation results are presented in a discrete-time framework. We consider a sampling time $dt = 0.01$ s. Consequently, the update time of the observation control strategy, taken equal to 10, and the observation horizon $N$ are defined in numbers of samples.

It is very clear from figures 4.5 and 4.6 that the introduction of the occlusion information allows a faster convergence of the estimates and a different trajectory of the visual system. This trajectory (figure 4.6) can be explained by a camera fixing its field of view on a predicted position of a visible feature.

FIGURE 4.5. Trajectory of the visual system for different strategies with an observation horizon $N = 50$: $P_0 = \mathrm{diag}[1\ 1]$ and $Q = [0\ 0;\ 0\ 0.1]$; NOIFC (dashdot) and OIFC (solid)

FIGURE 4.6. Performance of the Kalman filter for different strategies with an observation horizon $N = 50$, $P_0 = \mathrm{diag}[1\ 1]$ and $Q = [0\ 0;\ 0\ 0.1]$; NOIFC (dashdot) and OIFC (solid)

Figures 4.7 and 4.8 show plots of the trajectories of the camera corresponding to different values of the initial state error covariance matrix. It is clear that, depending on the combination used, we obtain different trajectories for the camera. From figures 4.7 and 4.8, we see that for the NOIFC strategy the camera tracks the predicted trajectory of the object without taking into account the disappearance of the tracked feature from its field of view. On the other hand, the OIFNC and OIFC methods allow a different measurement strategy for the visual system. In fact, the plots in figures 4.7 and 4.8 reveal that as soon as the predicted trajectory of the object shows that the probability of an occlusion is prevailing, the camera trajectory changes from tracking the feature to being adjusted such that the field of view of the camera covers the region of reappearance of the feature. This use of occlusion information is better illustrated by the OIFC method.

FIGURE 4.7. Trajectory of the visual system for different strategies with an observation horizon $N = 50$, $P_0 = \mathrm{diag}[0.1\ 0.1]$ and $Q = [0\ 0;\ 0\ 0.01]$; NOIFC (dashdot), OIFNC (dashed) and OIFC (solid)

FIGURE 4.8. Trajectory of the visual system for different strategies with an observation horizon $N = 50$, $P_0 = \mathrm{diag}[1\ 1]$ and $Q = [0\ 0;\ 0\ 0.01]$; NOIFC (dashdot), OIFNC (dashed) and OIFC (solid)

FIGURE 4.9. Trajectory of the visual system for the OIFC strategy, $P_0 = \mathrm{diag}[1\ 1]$: different observation horizons, $N = 20$ (dashdot) and $N = 50$ (solid)

Of course, this measurement strategy depends on the observation horizon: from figure 4.9 it is clear that decreasing the horizon directly affects the trajectory of the visual system: the smaller the horizon, the slower the visual system is to assess the occurrence of the occlusion event. Therefore, the visual system presents a larger period of movement and takes longer to settle to zero.

7.3. Second example. In this example, we extend the first example to include, in addition to the rotation of the object, another degree of freedom corresponding to a translation along the $x$-axis. We consider a second-order system to model the rotation of the object with respect to the $Z_i$ axis and another second-order system to model its translation along the $x$-axis. System properties, initial conditions and noise characteristics similar to the ones taken in the first example are considered for the following simulations. The measurement equations are then defined by

where

FIGURE 4.10. Extended Kalman filter performances: Trace of the state error covariance: neither occlusion information nor observation strategy

Figure 4.10 illustrates the trace of the state error covariance matrix when the occlusion information is excluded and no control strategy for the observer is considered. It is evident that without measurement control, the performance of the filtering algorithm, limited in this example to the classical Kalman filter, deteriorates as a function of the distance of the tracked object from the camera: if the object is far from the visual system, the performance of the algorithm is poor. The measurement control allows the visual system to track the target. The rotation of the visual system is presented in figure 4.12. It is evident that the camera tracks the target to keep it in the field of view.

Figure 4.11 shows plots corresponding to two control approaches. The first one, a tracking approach, is a classical control approach where the first step consists of defining the desired trajectory $x_m(k) = [x_{m1}(k)\ x_{m2}(k)]$; it corresponds to a fixation approach on the tracked feature. In the second step, we use a finite-horizon cost function defined by

where $x_c(k) = [x_{c1}(k)\ x_{c2}(k)]$ defines the actual position and angular velocity of the visual system. $N$ defines the control horizon, which we limit in this approach to the observation horizon.

The second control approach corresponds to the OIFC strategy, where our objective is to minimize the estimation error while weighting the command of the visual system over the defined receding observation horizon. It is clear from figures 4.11 and 4.12 that our approach to the control of the visual system performs clearly better than the classical control approach. Also, the occlusion information included at the filtering level allows the convergence of the extended Kalman filter, otherwise impossible (figure 4.10).

FIGURE 4.11. Extended Kalman filter performance: state estimates; "servo" approach (dashed) and optimal control (solid)

8. Gaze Change Strategy

A gaze change approach is presented to deal with the occlusion problem in a different manner. It is motivated by the following consideration: to ensure the observability of the tracked object, it is not necessary to get measurements from the same features in the field of view. This points to the possibility of a gaze change between features to preserve the observability of the system. The gaze change, of course, is defined as a function of the geometric properties of the target, assuming that at least one feature can be kept in the field of view.

FIGURE 4.12. Trajectory of the camera for a translating and spinning target: "servo" approach (dashed) and optimal control (solid)

The observation control strategy is derived, under the assumption of only two tracked features, as follows: given the predicted nominal trajectory and the geometric model of the tracked object, we define the gaze change to correspond to the "switch" between two features: one that is leaving the field of view and one that is assumed to be in the field of view. It follows that the expression of the Riccati equation, mainly the measurement Jacobian in this equation, depends on the feature in view.

In terms of modelling, this is equivalent to a hypothesis testing approach where, given the predicted position of the object, we consider different expressions of the Riccati equation for different in-view features.

Similar assumptions are considered at the filtering stage, for which we assume that we can associate with each measurement the corresponding feature in view. Consequently, we derive the expression of the Riccati equation as well as the expressions of the estimated measurements given the predicted position of the feature in view.

Simulations are considered for the first example of section 7. The second feature corresponds to the upper left corner of the box. Simulation results show that the proposed approach performs similarly to the OIFC observation strategy (figure 4.13).


FIGURE 4.13. Performance of the extended Kalman filter: log of the trace of the state error covariance matrix, $N = 50$

Although the obtained performance is close to that obtained by the OIFC observation strategy, we note that the OIFC presents a faster convergence rate of the estimates and a different trajectory of the camera. From figure 4.14, we note that the visual system tries to keep track of the observed feature. It changes its trajectory as soon as an occlusion is detected given the predicted state trajectory.

FIGURE 4.14. Trajectory of the visual system: gaze change approach: N = 50

9. Conclusion

In this chapter, we derived a strategy for robust optimal measurements using a vision system (OIFC). Even though we used a suboptimal approach and linearized with respect to a predicted nominal trajectory, the proposed filtering and control techniques, defined in a Bayesian framework, make it possible to deal with the occlusion problem due to the geometry of the tracked target, and can be easily extended to include the field-of-view constraints. In this framework, the prior and posterior probabilities are used to sub-optimally balance each measurement.

A comparative study of different observation strategies is presented. The first observation strategy, NOIFC, is based mainly on two assumptions: no information is gained from the occlusion stage, and the occlusion is not considered in the solution of the Riccati equation defined with respect to the nominal trajectory. The second strategy, OIFNC, is defined such that the occlusion information is included at the filtering level. The third observation strategy, OIFC, is derived with the occlusion information considered at both the filtering and the control levels. A different observation method has also been considered in this chapter; it is based on a gaze control approach that deals with the occlusion problem in a different way.

The obtained results show that the estimation through a modified extended Kalman filter is sufficiently efficient to track a highly manoeuvring target. Typical simulation results show that significant improvements in the tracking performance were achieved using the OIFC strategy. These improvements were observed in terms of more accurate estimates of the position and velocity of the target.

Simulation results also showed that control strategies assuming a certain kind of information on the occlusion event dynamically identify the changing requirements of the vision task and accommodate the imaging dynamics to these requirements. Moreover, as an extension to this work, the OIFC strategy can be considered to deal with more general occlusion problems.

CHAPTER 5

Grasping of a Moving Object with a Robotic

Hand-Eye System

1. Introduction

Vision sensors (e.g. CCD cameras) have revolutionized the area of sensor-based robotics by introducing flexibility in conventional robotic systems. Although preprogrammed operation, in which a robot is "taught" to perform repetitive tasks via a set of programmed functions, is the predominant form of operation of present industrial robots, the use of sensing technology to endow machines with a greater degree of intelligence in dealing with their environment is an active topic of research. Considerable effort has been devoted to the visual control of robot manipulators.

Vision is a useful robotic sensor since it mimics the human sense of vision and allows for non-contact measurement of the environment. Typically, visual sensing and manipulation are combined in an open-loop fashion, "look" then "move". The resulting operation depends directly on the accuracy of the visual sensor and the robot end-effector. Visual servoing, or visual feedback, has a less extensive history than the "look" then "move" approach, mainly due to the lack of computational resources available for processing the large amount of data contained in an image. Although previous researchers had considered fast visual feedback for guiding manipulator motion [65], the visual servoing field was first well defined by Weiss [64]. Two types of visual servoing have since emerged: eye-in-hand configurations and static camera configurations. Eye-in-hand visual servoing tracks an object of interest with a camera mounted on a manipulator's end-effector [64, 31, 59, 29, 79, 23]. Static camera visual servoing guides manipulator motion based on feedback from a camera observing the end-effector [41], [56]. Most of this past work has been with monocular systems, though recently stereo systems have been used for visual servoing [40, 37].

FIGURE 5.1. Position-based visual servoing

FIGURE 5.2. Image-based visual servoing

A typical visual servoing feedback loop is shown in figures 5.1 and 5.2. Differences between the various approaches to visual servoing include the space in which reference inputs are provided, the dimensionality of the control space, the structure of the controller, the physical configuration of the system, the derivation of the control law, and the feature tracking algorithms used [23]. Most researchers have concentrated on the tracking problem: given an object, at rest or moving, manipulate the translational and/or rotational degrees of freedom of a camera so as to stabilize the image. The camera may be mounted on a robot, or may have its own motion system.

Papanikolopoulos et al. [59] use an optic-flow model and control a single camera in such a way as to null the relative motion. The data is reduced to two translation errors plus a roll error. The initial depth is assumed to be known. From a control point of view, the models, essentially single integrators, are simple. Much of their research is concerned with the application of all known control methods.

Espiau [17], Rives and Chaumette [18] use the robot control technique developed by Samson [16]. It is also a tracking scheme, but one where the reference is defined in the image plane (feature-based approach). The specific features may include geometric curves as well as joints. Kalman filtering is used, but to track a model in the image plane only. The model is a constant-velocity model, i.e. it consists of independent integrators. In more recent work [12], constant-velocity and constant-acceleration models with jumps are used, and techniques developed in [77] are used to detect the jumps.

In Allan et al. [30], two stationary cameras are used to obtain the position and velocity of a moving object. Sequential images are used to obtain optic-flow spatial and temporal derivatives. The information is used to isolate the images of the moving object and to calculate the two centroids which, by triangulation, are then used to calculate the object location. The method is limited to the tracking of motions in a plane normal to the camera axes. Grasping is initiated when the trajectory estimates converge to real values, and is done open-loop, i.e. without vision updates.

Wilson [79] proposes to estimate the relative robot pose with respect to the target object through a robot-mounted camera pointed at the object. He considers the images of a number of feature points as measurements. A minimum of three non-collinear features is needed to solve for three relative positions and three orientations. A recursive Kalman filter is used in the estimation procedure where, here again, each unknown is assumed to follow a constant-velocity model. A closed-loop tracking process is defined where a pre-computed desired trajectory is tracked. Classical PD controllers are designed in the joint space for the visual servoing loop, and the controller outputs are the commanded joint positions.

Image-tracking schemes are reminiscent of classical control: the quantity to be nulled, the error signal, is directly observed, and the tracking is usually fairly robust. Simple models, which are in essence low-frequency approximations, may be used to describe the motion. Several cameras may be used, each one independently tracking the object; however, in the case of a feature-based approach it is necessary to know a priori, for each camera, what the desired image should be in order to define an error. Put another way, it must be possible for all cameras to move in such a way as to attain a given desired relative position with respect to the object. However, it is difficult to see how disparate information from different cameras could be fused, since the image-motion models are unrelated.

In this chapter, new approaches to visual servoing are presented where the reference trajectory is not fixed a priori. An ideal reference trajectory is defined as a function of the real states of the target object. We present three approaches to the control of the robot manipulator and the platform to track a desired trajectory. This trajectory is an approximation to the ideal one and is defined as a function of the state estimates. Two classical control approaches, both based on position-tracking schemes, are considered. The first is an open-loop method similar to the one proposed by Allan [30]. In the second, defined by a closed-loop approach, the desired trajectory keeps being updated even after the start of the grasping process. The third approach combines visual servoing and active vision by pursuing two goals at the same time, observation and control, to achieve the grasping task. In this approach, we present an adaptive control scheme where the control is chosen both to regulate and to bring out information. This is done by minimizing a cost function that exhibits the dual purpose of the command, i.e., observation and control. An approximate solution to the full dynamic programming solution is presented. This approximation involves the selection of the observation strategy and the associated prediction algorithm and, at the same time, the selection of a stochastic closed-loop control method. Simulation results present a comparative study of the three control methods.

2. Problem Description

The proposed problem starts with a dynamic model for the motion of the object:

where $x(k)$ is the state vector and $w(k)$ is the noise input, characterized by a zero mean and a constant covariance $Q$. The state vector for the dynamic model is chosen as the position and orientation of the work-piece in the inertial frame; it may also include some unknown parameters.


The object carries a number of feature points with well-defined coordinates in the object frame. The object is assumed to be rigid, but that assumption could be removed. Since $x(\cdot)$ describes the translations and rotations of the object in the inertial frame, it is possible to express the location of each feature point in that frame in terms of its coordinates in the object frame and of $x(k)$.

Assume a defined number of cameras moving in space, with the position and rotation of the $j$th camera represented by the vector $x_{cj}(k)$. Knowing this and the position of the $i$th feature point, and assuming a simple pinhole model for each camera, characterized by a fixed focal distance, we may express the location in the image plane of the image of the $i$th feature as

where $v_{1ij}(k)$ and $v_{2ij}(k)$ are noise signals, reflecting the fact that the results of the image processing are imperfect.
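The pinhole measurement of a feature point can be illustrated as follows. This is a generic perspective projection under stated assumptions, not the observation equation of this chapter: the point is assumed to be already expressed in the camera frame with the optical axis along `z`, and the function name, argument names and additive Gaussian noise model are hypothetical.

```python
import random

def pinhole_project(point_cam, focal, noise_std=0.0, rng=None):
    """Project a point (x, y, z) given in the camera frame onto the image plane.

    The optical axis is taken along z, so the image coordinates are
    focal * x / z and focal * y / z, each optionally corrupted by noise.
    """
    x, y, z = point_cam
    if z <= 0.0:
        raise ValueError("point is not in front of the camera")
    y1 = focal * x / z
    y2 = focal * y / z
    if noise_std > 0.0:
        rng = rng or random.Random(0)
        y1 += rng.gauss(0.0, noise_std)   # image-processing error, first axis
        y2 += rng.gauss(0.0, noise_std)   # image-processing error, second axis
    return y1, y2

# A feature 1 m in front of the camera, seen with a 50 mm focal distance.
u, v = pinhole_project((0.1, 0.2, 1.0), 0.05)
```

Because the image coordinates scale with `1 / z`, the same image-plane noise corresponds to a larger position error for a distant feature, which is one reason the camera motion matters for the quality of the estimate.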

Equation (5.1) describes the dynamics of the target object while equation (5.4) describes the observation equations. As mentioned in chapter 2, we can distinguish two situations:
(i) The camera positions $x_{cj}(\cdot)$ are functions known a priori, including a constant. This is the "classical" case to which extended Kalman filtering is applied.
(ii) The vectors $x_{cj}(\cdot)$ are not known a priori, but can be manipulated, either directly or through some dynamics. This case corresponds to active observers where the cameras can be moved so as to optimize the state estimate.

In the case where $w(k) = 0$ in equation 5.1, the object trajectory is entirely determined by its initial conditions. The Kalman filter may then be recast in the simpler form of a parameter estimation problem. As mentioned in the previous section, different approaches are considered for the grasping process. In the first two methods, the grasping is initiated when the estimates are considered to be satisfactory. The third approach presents an adaptive strategy that brings together observation and control without the need to shift from the observation process to the grasping process. It is assumed that the joints are provided with their local control systems; our approach is to provide reference trajectories to each joint servo. The reference trajectories must have a bandwidth that is sufficiently small to allow good tracking by the joint servo system.

2.1. Setup Description. Figure 5.3 shows the proposed simulation testbed. The pendulum may be either a free pendulum, or one where the axis of rotation is driven by a servo-motor. The object, at the end of a rigid bar, is assumed to be a box with square cross-section. There is a camera mounted on a platform, and another on the robot. Both the platform and the robot stand in the plane of motion of the pendulum. The platform-mounted camera is free to rotate in the vertical plane, about a horizontal axis. The robot, a two-degree-of-freedom manipulator, moves in the same vertical plane. It is equipped with a fixed gripper at its end effector. Thus, the platform has one degree of freedom, the robot has two.

FIGURE 5.3. Setup description

2.2. Modelling.

2.2.1. Pendulum. Figure 5.4 defines variables for the pendulum. The equations of the pendulum are

FIGURE 5.4. Variables for the pendulum

where $L_p$ is the length of the rod. We use the small-angle approximation

8, = -O,O, (5-6)
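As an illustration of a small-angle pendulum model, the sketch below integrates the standard linearized equation $\ddot{\theta}_p = -(g / L_p)\,\theta_p$. This textbook form is an assumption of the example and may differ in notation from equation (5.6); the integration scheme (semi-implicit Euler), the function name and the numerical values are likewise chosen for illustration only.

```python
import math

def simulate_pendulum(theta0, omega0, length, dt, steps, g=9.81):
    """Integrate theta'' = -(g / length) * theta with semi-implicit Euler.

    Returns the list of (theta, omega) pairs, including the initial state.
    """
    theta, omega = theta0, omega0
    history = [(theta, omega)]
    for _ in range(steps):
        omega += -(g / length) * theta * dt   # update the angular velocity first
        theta += omega * dt                   # then the angle, using the new velocity
        history.append((theta, omega))
    return history

# One full period of a 1 m pendulum released from 0.1 rad.
period = 2.0 * math.pi * math.sqrt(1.0 / 9.81)
n_steps = round(period / 0.001)
history = simulate_pendulum(0.1, 0.0, 1.0, 0.001, n_steps)
```

The semi-implicit update is used here because it keeps the oscillation amplitude bounded over long runs, whereas the explicit Euler scheme slowly gains energy.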

The pivot of the pendulum, point O, is taken to be the origin of the coordinate system in the inertial frame. Taking the lower vertices of the object as the feature points, they may be located in inertial space as:

Point a: $x_a = (L_p + 2D)\sin\theta_p - \frac{H}{2}\cos\theta_p$

Point b:

Points c and d are the same as a and b, respectively, except for a change of sign in z.

2.2.2. Platform-Borne Camera. Figure 5.5 shows the geometry of the platform-borne camera. The origin of the local coordinate system coincides with the center of the camera lens. We use homogeneous transformations to locate a point in inertial space, given its location in camera space

FIGURE 5.5. Geometry of the platform-borne camera

which may be inverted to yield

Using the pinhole model for the platform camera, we may write the location $(y_{1cf}, y_{2cf})$ on the image plane of a point $(x_f, y_f, z_f)$ in inertial space:

$L_{xc}$ and $L_{yc}$ are considered unknown parameters; they define the position of the platform in the inertial frame. Similarly, C,, and S, define the position of the camera in the local frame attached to the platform.
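The use of a homogeneous transformation and its inverse can be sketched for the planar case. The example assumes a rotation by an angle followed by a translation in the plane of motion; it does not reproduce the transformation matrices of this section, and the function names and numerical values are hypothetical.

```python
import math

def make_transform(angle, tx, ty):
    """Planar homogeneous transform: rotation by angle, then translation (tx, ty)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, tx], [s, c, ty], [0.0, 0.0, 1.0]]

def invert_transform(T):
    """Invert a rigid transform using R^T and -R^T t instead of a general inverse."""
    r00, r01, tx = T[0]
    r10, r11, ty = T[1]
    return [[r00, r10, -(r00 * tx + r10 * ty)],
            [r01, r11, -(r01 * tx + r11 * ty)],
            [0.0, 0.0, 1.0]]

def apply_transform(T, point):
    """Apply a planar homogeneous transform to a point (x, y)."""
    x, y = point
    return (T[0][0] * x + T[0][1] * y + T[0][2],
            T[1][0] * x + T[1][1] * y + T[1][2])

# Camera frame rotated by 30 degrees and placed at (2.0, 0.5) in the inertial frame.
camera_to_inertial = make_transform(math.radians(30.0), 2.0, 0.5)
inertial_to_camera = invert_transform(camera_to_inertial)
p_inertial = apply_transform(camera_to_inertial, (1.0, 0.0))
p_camera = apply_transform(inertial_to_camera, p_inertial)
```

Exploiting the orthogonality of the rotation block avoids a general matrix inversion and keeps the inverse exact up to rounding.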

2.2.3. The robot. The kinematics of the robot are modelled following Craig's notation [25]. The location of the feature point $[x_f, y_f, z_f]$ in the image frame of the camera carried by the manipulator is derived using homogeneous transformations. First, the location of the feature in the camera coordinate frame has to be defined given the robot kinematics

carried by the manipulator is given by:

The parameters $L_{xr}$ and $L_{yr}$ define the uncertain position of the robot base in the inertial frame. $\theta_1$ and $\theta_2$ define respectively the position angle of the first and second joint of the manipulator.

Including unknown parameters, a state vector $x = [\theta_p, \omega_p, L_{xc}, L_{yc}, L_{xr}, L_{yr}]$ can be defined; the dynamic equations are then

The observations are described by the functions $y_{1cf}$, $y_{2cf}$, $y_{1rf}$ and $y_{2rf}$. There are two observations per feature, per camera, so the total is 16 expressions if all four features are in the fields of view of both cameras. The observations are redundant: points a

and c (b and d) give strongly dependent results. Each observation is corrupted by noise,

corresponding to the error in the location on the image plane given by the image-processing

system.


3. Platform and Robot Control

We make the basic assumption that the platform and the robot are provided with adequate joint control systems. Our approach is to provide reference trajectories to each joint servo. The joint trajectories must allow rendezvous with the target with both a position and a velocity match. Three approaches are proposed for the grasping of the moving target object with the robot hand-eye system. All these approaches are based on double-integrator models.
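As an illustration of what a double-integrator joint model looks like in discrete time, the following Python sketch propagates a joint angle and velocity under a commanded acceleration. The sample period, the function names and the zero-order-hold discretisation are assumptions chosen for the example, not values taken from the thesis.

```python
def double_integrator_step(q, qd, u, dt):
    """One sample of q'' = u with the input held constant over dt.

    q: joint position, qd: joint velocity, u: commanded acceleration.
    """
    q_next = q + dt * qd + 0.5 * dt * dt * u
    qd_next = qd + dt * u
    return q_next, qd_next

def predict(q, qd, controls, dt):
    """Roll the model forward over a sequence of commands (a control horizon)."""
    trajectory = [(q, qd)]
    for u in controls:
        q, qd = double_integrator_step(q, qd, u, dt)
        trajectory.append((q, qd))
    return trajectory

# Ten samples of unit acceleration from rest, with an assumed dt of 0.1 s.
trajectory = predict(0.0, 0.0, [1.0] * 10, 0.1)
```

A position and velocity match with the target, as required for rendezvous, then amounts to driving both components of the final pair to the target's values.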

3.1. Open-Loop Approach. This is a classical approach: the target is tracked until a good estimate of its dynamics is obtained, and the grasping process then starts, with the reference trajectory defined as a function of the target estimates. Two steps are followed in this approach.

3.1.1. Estimation and Active Observation. Fixation is considered as the observation strategy during the estimation process. For this purpose, and given the observation equations derived from both visual systems, we define the desired trajectories for both cameras as follows:

The estimation is done by extended Kalman filtering. Observability is discussed in appendix D, where it is shown that the system is observable provided that the pendulum moves and at least one feature is kept in the fields of view of both cameras.
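For readers unfamiliar with the filter, the measurement-update step of an extended Kalman filter can be sketched as below for a single scalar measurement. This is a generic textbook update written in plain Python, not the thesis's estimator; the function name `ekf_update` and its arguments are illustrative assumptions.

```python
def ekf_update(x, P, z, h, H_row, r):
    """One EKF measurement update for a scalar measurement z = h(x) + noise.

    x: state estimate (list), P: symmetric covariance (list of lists),
    h: measurement function, H_row: Jacobian of h evaluated at x (list),
    r: measurement noise variance.
    """
    n = len(x)
    # P H^T (a column, stored as a list)
    PHt = [sum(P[i][j] * H_row[j] for j in range(n)) for i in range(n)]
    # Innovation variance S = H P H^T + r
    s = sum(H_row[i] * PHt[i] for i in range(n)) + r
    # Kalman gain K = P H^T / S
    K = [PHt[i] / s for i in range(n)]
    innovation = z - h(x)
    x_new = [x[i] + K[i] * innovation for i in range(n)]
    # P_new = (I - K H) P; since P is symmetric, H P equals (P H^T)^T
    P_new = [[P[i][j] - K[i] * PHt[j] for j in range(n)] for i in range(n)]
    return x_new, P_new

# A measurement of the first state only leaves the second state's variance untouched.
x1, P1 = ekf_update([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], 2.0,
                    lambda s: s[0], [1.0, 0.0], 1.0)
```

The example hints at why observability matters: a single measurement direction reduces uncertainty only along that direction, so the remaining states must be made visible through motion or additional features.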

At each instant of time k, and based on the current estimates, we can define a predicted state

$\hat{x}(j+1|k) = \Phi\,\hat{x}(j|k), \quad j = k, \ldots, N+k$

where N defines the control horizon and

Given this prediction, the desired trajectories for both cameras q,(.) and $\theta_{1r}(\cdot)$, $\theta_{2r}(\cdot)$ can be derived from

$[x_{fp}(j), y_{fp}(j), z_{fp}(j)]$ defines the predicted coordinates of the feature at instant j and $L_{xcp}(j)$, $L_{ycp}(j)$, $L_{xrp}(j)$, and $L_{yrp}(j)$ the predicted parameters. For simplicity, we consider an instantaneous motion for the platform. Therefore, the trajectory of the platform corresponds to the desired one. In this phase, the robot controllers attempt to track $\theta_{1r}(\cdot)$ and $\theta_{2r}(\cdot)$ respectively. Different classical controllers, such as PD or optimal control, can be considered for this tracking problem. In the optimal control approach, similar cost functions with equal weightings are assumed in the derivation of the finite-horizon linear quadratic regulators. The cost function is defined by

3.1.2. Grasping. When the estimates converge to their true values, the manipulator should be commanded to approach the target. The procedure should be such that the object is grabbed smoothly, i.e., both the end-effector and the object have the same position and velocity. This phase, called grasping, starts just after the previous tracking phase. The switch between the two phases is determined by a criterion that implies the convergence of the estimator within a certain range of accuracy. This criterion is defined by the trace of the state error covariance matrix. As will be shown in section 4, simulation results depend on the chosen value of the threshold.
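The switching rule described above can be sketched in a few lines of Python. The helper names are hypothetical, and the example covariance is made up; the two thresholds are the values compared later in the simulations (0.1 and 0.001).

```python
def covariance_trace(P):
    """Sum of the diagonal of the state error covariance matrix."""
    return sum(P[i][i] for i in range(len(P)))

def should_start_grasping(P, threshold):
    """Switch from tracking to grasping once the trace falls below the threshold."""
    return covariance_trace(P) < threshold

# Illustrative covariance with trace 0.05.
P = [[0.02, 0.0], [0.0, 0.03]]
loose = should_start_grasping(P, 0.1)     # switches early
strict = should_start_grasping(P, 0.001)  # keeps observing longer
```

A smaller threshold delays the switch but hands the grasping phase a more accurate estimate, which is the trade-off examined in section 4.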

No more measurements are then taken to update the estimates. The reference trajectory for the manipulator corresponds to the predicted trajectory of the target defined in the robot frame. Inverse kinematics and the inverse Jacobian are used to define the corresponding desired trajectory for each joint. A finite-horizon quadratic controller is used for each joint,

FIGURE 5.6. The Closed-loop Control Approach

characterized by the cost function

with $q_{ir}(k) = [\theta_{ir}(k), \dot{\theta}_{ir}(k)]$ and $q_i(k) = [\theta_i(k), \dot{\theta}_i(k)]$. Even though we have a controlled observation, the open-loop approach is basically a "look" then "move" approach where the accuracy of the operation depends directly on the accuracy of the visual system and on the manipulator and its controller.
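A finite-horizon quadratic controller for one double-integrator joint can be sketched with the standard backward Riccati recursion. The sketch below regulates the tracking error (position and velocity error relative to a constant-velocity reference, for which the error obeys the same double-integrator dynamics). The weights, horizon and sample period are assumptions for illustration; the thesis's own cost function is not reproduced here.

```python
def matmul(X, Y):
    """Product of two small matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(row) for row in zip(*X)]

def lqr_gains(A, B, Q, R, Qf, N):
    """Time-varying feedback gains for a single-input finite-horizon LQ problem."""
    n = len(A)
    P = Qf
    gains = [None] * N
    At, Bt = transpose(A), transpose(B)
    for k in reversed(range(N)):
        BtP = matmul(Bt, P)                        # 1 x n
        s = R + matmul(BtP, B)[0][0]               # scalar R + B^T P B
        K = [[v / s for v in matmul(BtP, A)[0]]]   # 1 x n gain
        BK = matmul(B, K)
        Acl = [[A[i][j] - BK[i][j] for j in range(n)] for i in range(n)]
        AtPAcl = matmul(matmul(At, P), Acl)
        # Riccati step: P = Q + A^T P (A - B K)
        P = [[Q[i][j] + AtPAcl[i][j] for j in range(n)] for i in range(n)]
        gains[k] = K[0]
    return gains

def simulate_error(A, B, gains, e0):
    """Apply u = -K e along the horizon and return the final tracking error."""
    e = list(e0)
    for K in gains:
        u = -(K[0] * e[0] + K[1] * e[1])
        e = [A[0][0] * e[0] + A[0][1] * e[1] + B[0][0] * u,
             A[1][0] * e[0] + A[1][1] * e[1] + B[1][0] * u]
    return e

dt = 0.1                                   # assumed sample period
A = [[1.0, dt], [0.0, 1.0]]
B = [[0.5 * dt * dt], [dt]]
Q = [[1.0, 0.0], [0.0, 1.0]]               # equal weights on position and velocity error
Qf = [[10.0, 0.0], [0.0, 10.0]]
gains = lqr_gains(A, B, Q, 0.1, Qf, 50)
final_error = simulate_error(A, B, gains, [1.0, 0.0])
```

Weighting both position and velocity error reflects the requirement that the end-effector match the target in position and velocity at the moment of grasping.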

3.2. Closed-Loop Approach. As shown in figure 5.6, this approach is similar to the first one except for the grasping process, during which measurements are still taken. We keep updating the reference trajectory by considering the new measurements.

3.3. The Dual Approach. In this approach, and given the peculiarity of the problem, we have posed the design of a dynamic visual feedback controller in the framework of stochastic optimal control theory. It is well known in this case that the estimation and control functions can seldom be separated. The combined control-estimation problem leads to a complex dynamic programming problem. There have been several attempts at suboptimal schemes. An approximation proposed by Tse et al. [8] considers a dual controller using some approximation to the full dynamic programming solution. Based on this approach, we consider the following formulation of the problem. Given the dynamics of the system defined by equation 5.1 (in the simulation example, x(k) defines the position of the robot and the platform in the inertial frame, in addition to the position and velocity of the target in the same frame), we assume that we have a set of measurements collected by both visual systems, $Y^k = \{y(1), y(2), \ldots, y(k)\}$, up to instant k such that

where v(k) is the measurement noise, defined as Gaussian white noise with zero mean and a covariance that can be considered as an optimization parameter [11]; $x_p(k)$ specifies the dynamics of the visual system mounted on the platform and $x_r(k)$ defines the state vector of the visual system in the robot frame. The origin of this frame corresponds to the base of the manipulator. In our example, we consider a camera mounted on the last link of the manipulator. The position of the camera in the robot frame is then directly linked to the dynamics of the manipulator, i.e., the positions and velocities of its different joints with respect to its base, following

u(k) defines the input command for each joint. To simplify results, as in the previous section, we consider an instantaneous movement for the platform camera. Our objective is to find, at each instant of time, the u(k) that allows both a good estimation of the unknown parameters and a good tracking of a desired trajectory $f_d(x(\cdot))$. This desired trajectory corresponds to the trajectory of the target in the robot frame and is defined by the position and velocity of the object in this frame. Our objective is then to find the command u(k) of the manipulator joints at each instant of time k such that the robot end-effector trajectory coincides with the desired trajectory, allowing the grasping process to start at any time. It follows that the performance index to be minimized is:

where $W_1(k)$ and $W_2(k)$ define the weightings on the tracking errors and the command. The inherent uncertainty on $x(\cdot)$ prevents the exact determination of the desired trajectory. To achieve our goal, one approach is to consider the expected loss as the minimization criterion:

composed of the expected value of the sum of loss functions associated with each sample period plus a loss function associated with the state values at the final time. The optimal control solution to equation 5.23 can be obtained by applying Bellman's principle of optimality. Of course, it is desirable to be able to derive closed-loop control laws, by means other than attempting a direct solution of the dynamic programming algorithm, that exhibit both the probing and caution characteristics of the full-scale optimal stochastic controller [51].

The proposed closed-loop control approach anticipates subsequent feedback. Whenever a control is computed, the expectation of the cost conditioned on the available information state is first obtained. The expectation is over the subsequent measurements, which are averaged out. In fact, in a posterior analysis, the value of the future information is estimated based on the available data, and this knowledge can be used in deciding which control is the most desirable. Since this is done at every step, the resulting control depends on the future observation strategy and the associated statistics [8]. For this reason, rather than using the exact information state $\{Y^k, U^{k-1}\}$, defined by the set of all measurements $Y^k$ up to instant k and $U^{k-1}$, the set of commands applied to the different joints until time (k - 1), the information state is limited to $\mathcal{P}^k = \{\hat{x}(k|k), P(k|k)\}$, the conditional mean and covariance of x(k). The computation of this information state is done by an extended Kalman filter. Then the control to apply at instant k is generated as follows:

A desired trajectory, based on the predicted values of the state x, is generated. A control function derived assuming a certainty equivalence design is assumed to be used for instants k+1, k+2, ..., N, thereby generating a nominal trajectory for the manipulator and a control time history. Stochastic effects are then described via a first-order perturbation about this nominal, allowing evaluation of the expected cost associated with the nominal plus a specific perturbation control choice through a set of recursions. An iteration is then conducted to find the control that yields the minimum expected cost. Since the state covariance is precomputed along a nominal trajectory, and this trajectory depends on the current control u(k), the effect of u(k) on the quality of future information is embedded into the determination of the control. We consider a double-integrator system to model the dynamics of each joint of the manipulator.

where

In this testbed modelling, we consider the state vector x(k) such that

it evolves according to the equation

with observations

where x(0) is the initial condition, a random variable with mean $\hat{x}(0|0)$ and covariance $P(0|0)$. Our objective is then to find $u(\cdot)$, the command for each joint, to minimize the performance index (equation 5.23). The principle of optimality and the information state yield the following stochastic dynamic programming equation for the closed-loop optimal expected performance at time k

To obtain an approximate expression for $J^*(N-k-1)$ while preserving its closed-loop feature, the performance for the last N - k - 1 steps is expanded about a nominal trajectory defined by:

where

$x_n(k+1) = \hat{x}(k+1|k)$

The cost is expanded with terms up to first order:

$J(N-k-1) = E\{C_n(N-k-1) + \Delta C_n(N-k-1)\}$

where

and $\Delta C_n(N-k-1)$ is the variation of the cost about the nominal. The approximation of the closed-loop optimal expected cost for the last N - k - 1 steps is then defined by:

where

$\Delta J_n^*(N-k-1) = \min_{\delta u(k+1)} E\{\cdots \min_{\delta u(N-1)} E[\Delta C_n(N-k-1) \mid \mathcal{P}^{N-1}] \cdots \mid \mathcal{P}^{k+1}\}$ (5.36)

$J_n(N-k-1)$ is minimized by a sequence of nominal controls $u_n(j)$, $j = k+1, \ldots, N-1$.

From the definition of the nominal trajectory and the dynamics of the system, the perturbations obey the following dynamic equations (with terms up to first order):

$\delta x_r(j+1) = A\,\delta x_r(j) + B\,\delta u(j)$

$\delta x_n(j+1) = \Phi\,\delta x_n(j) + v(j)$

for $j = k+1, \ldots, N-1$ with initial conditions

Thus the problem defined by equation 5.36 consists of the minimization of $\Delta C_n(N-k-1)$ for the system (equation 5.38). The solution of this problem can be assumed to be of the form:

which can be written in a manner that emphasizes the closed-loop property:

$P_n(j|j)$ is the solution of the extended Kalman filter Riccati equation along the nominal trajectory defined by:

Combining equation 5.41 and equation 5.24, the stochastic dynamic programming equation is then

The recursions that yield $g_n(k+1)$, $r_n(k+1)$, $H_n(j+1)$, $J_n(k+1)$, $K_n(k+1)$, $L_n(k+1)$, $M_n(j+1)$, $N_n(k+1)$ and Yi, as well as the detailed derivation, can be found in appendix C. From equation 5.31 and equation 5.42 it follows that

and

Dropping from equation 5.43 the first term, which does not depend on u(k), we obtain a deterministic expression that has to be minimized with respect to u(k)

The active learning feature of the closed-loop control has an important property: learning is done only to the extent required by the overall performance. Since learning may conflict with the control purpose, the optimal stochastic control balances its learning and control effects so as to minimize the overall cost. This property appears in the expression of equation 5.46, where the estimation and control parts of the cost are separated.

4. Simulations

This section reports on several simulation experiments that were conducted to demonstrate the properties of the open-loop control approach, the closed-loop control approach and the dual control approach. The simulations are meant to be tools to gain insight into visual servoing and to support claims about the possibility of merging visual servoing and active vision. They are not meant to be proofs in any strict sense.

4.1. Simulation Parameters. In this section, we study the experimental testbed via simulation. The purposes of this study are to investigate the feasibility of dual adaptive control in the context of active vision and to compare the performance of this approach with that of the two more classical approaches, i.e., the open-loop approach and the closed-loop one. The simulations illustrate the dual control algorithm's feature of being actively adaptive. In particular, we shall see how a closed-loop controller presents a better performance in learning unknown parameters. It should be pointed out, however, that the performance of the dual approach is not limited to the estimation process. A drastic improvement in convergence rate compared to the other methods can be obtained under certain conditions.

The following parameters are considered in the simulations: the tracked object is a cube of side 20 cm. The length of the rod is 1 m, and a fictitious model for the manipulator is considered where $l_1 = 1.5$ m and $l_2 = 1.2$ m. We consider a simple pinhole model for both cameras with a focal distance f = 9 mm. The initial conditions for the state vector $x = [\theta_p, \omega_p, L_{xc}, L_{yc}, L_{xr}, L_{yr}]$ are characterized by normal a priori statistics having mean and variance $\hat{x}(0|0) = [0.5\ \ 0.1\ \ -3.2\ \ -3.2\ \ 1.5\ \ -2]$, $P(0|0) =$

[Figure 5.7 panels: error in orientation, error in angular velocity, error in Lyc, error in Lxr and error in Lyr, each plotted against time (s)]

FIGURE 5.7. State and parameter estimation errors for the grasping task with dual adaptive control (solid), open-loop control (dashed) and closed-loop control (dash-dot)

4.2. Simulation Results. Figure 5.7 shows the performance of the estimator when the predicted parameters and states are used to generate the desired trajectory. A better performance of the estimator is clearly obtained in the closed-loop context, i.e., with either the closed-loop approach or the dual control approach. Although we assumed a fixation approach in the learning process of the closed-loop approach, the learning is about the same for both strategies, i.e., closed-loop control and dual control. This can be explained by the learning properties of the dual control approach and by the fact that decreasing the error on the estimates is anticipated by the control from the initial time, because it is of the closed-loop stochastic control type.

[Figure 5.8 panels: position error, Lxc error, Lyc error and Lyr error, each plotted against time (s)]

FIGURE 5.8. State and parameter errors for the grasping task: closed-loop control, threshold equals 0.1 (solid) and threshold equals 0.001 (dash-dot)

It follows that, if we consider the open-loop approach, we have to deal with cumulative errors due to modelling errors and measurement noise. In this case, the control scheme works in an open-loop fashion with respect to the environment and cannot compensate for these errors. In the closed-loop approach, even though the desired trajectory is updated at each new measurement, the convergence rate is a function of the chosen threshold: the smaller the threshold, the better the performance of the estimator and the controller (figures 5.8, 5.9).

FIGURE 5.9. Error between the target trajectory and the robot end-effector trajectory in the grasping process: closed-loop control, threshold equals 0.001 (dash-dot) and threshold equals 0.1 (solid)

The learning process of the dual adaptive controller is directly affected by the horizon of the cost function. Increasing the horizon allows more information on the time evolution of the parameters and the states, through the information carried by the nominal trajectory (figures 5.10, 5.11). To further illustrate the properties of the dual adaptive controller, we present the results of the simulations related to the grasping process. As shown in figure 5.12, the error between the real position of the target and the robot end-effector is smaller when considering the adaptive control approach.

This indicates that the dual control does use the information from the future strategy

of measurements and its related statistics in addition to the gathered information from the

measurements.

Until now, we have been discussing the performance of the three control approaches

without considering the effects of the uncertainties on the performance of these proposed

control methods. To examine the effect of these uncertainties, two simulation results are presented. First, the variance of the measurement noise is decreased, which indicates that the measurements are subject to lower corruptive noise and so should be weighted more by the filter. Furthermore, there is now smaller uncertainty in the measurements, so the filter is expected to track the measurements more closely.

From figure 5.13, it is clear that the dual control achieves a dramatic improvement in the time to reach the target. This indicates that the dual control is less sensitive to noise variations. However, figure 5.13 also shows that the dual control presents a certain periodic error. This error can be explained by the trade-off that exists between observation and control in the cost function.

[Figure 5.10 panels: position error, Lxc error, Lyc error, Lxr error and Lyr error, each plotted against time (s)]

FIGURE 5.10. State and parameter errors for the grasping task with dual adaptive control, N = 20 (solid), N = 100 (dashed)

FIGURE 5.11. Error between the target trajectory and the robot end-effector trajectory in the grasping process with dual adaptive control: N = 20 (solid), N = 100 (dashed)

FIGURE 5.12. Error between the target trajectory and the robot end-effector trajectory in the grasping process: adaptive control (solid), open-loop control (dashed), and closed-loop control (dash-dot)

FIGURE 5.13. Error between the target trajectory and the robot end-effector trajectory in the grasping process: adaptive control (solid), and closed-loop control (dash-dot)

FIGURE 5.14. Error between the target trajectory and the robot end-effector trajectory in the grasping process: dual adaptive control (solid), and closed-loop control (dash-dot)

Of course, tuning the weighting $W_1$ in the first term of J (equation 5.23) will allow a better performance for the adaptive dual control.

Figure 5.14 presents the error between the target trajectory and the robot end-effector trajectory for the second simulation testing the performance of the dual control. In this simulation, we increase the uncertainty on the initial estimate by increasing the variance of the initial conditions, $P(0|0) = \mathrm{diag}[0.5\ \ 0.5\ \ 0.5\ \ 0.5\ \ 0.5\ \ 0.5]$. Results show that we still have a better convergence rate for the adaptive dual controller. This performance is due to the properties of this suboptimal feedback law. In fact, the considered system is not neutral (the dual effect is present); it follows that this control law will investigate the system: it will operate in such a manner as to enhance estimation precision, so as to improve the overall performance in the future.

5. Conclusion

In this chapter, we proposed and simulated a novel method for visual trajectory planning and adaptive dual visual feedback control. The method requires no prior information about the desired trajectory or about the placement or calibration of the cameras, and imposes no limitations on the number of degrees of freedom controlled. The approach provides not only a means of low-level servoing but also a means to integrate it in the more general context of purposive vision. In fact, the proposed control is an approximation based on the principle of optimality that allows a regulation of the learning process as required by the control objective. It is inspired by previous work in visual servoing but differs in that, while previous work has concentrated only on the fixation of selected features, defined either in the image plane or in the task plane, we integrate visual servoing with a higher level of task-oriented active vision. The proposed approach has numerous applications in the robotics domain, mainly autonomous satellite docking and recovery, or the improvement of navigation techniques for a mobile robot.

Simulation results have been presented which demonstrate the general performance

and abilities of the proposed approach. First the extended Kalman filter performances were

shown to be at least as accurate as with the tracking strategy. Secondly and simultaneously

with the estimation process, the ability to reach the target and follow it was demonstrated.

CHAPTER 6

Conclusions and Future Research

In many fields, autonomous operation by a robot would be desirable under special circumstances, for example, in space-based or extremely hazardous activities. Furthermore, a robotic task might be complex and the environment may be unstructured to the extent that some degree of "intelligence" would be required for satisfactory performance of the task. There is an increased awareness of this, and the trend has been to include intelligence, which would encompass abilities to perceive, reason, learn and infer from incomplete information, as a requirement in characterizing a robot. In short, there has been a significant effort in making robots more intelligent. Sensing, actuation, and control are the three components of "robot intelligence". Intelligent control can significantly improve the performance of a robotic manipulator.

Both active vision and visual servoing have been the focus of significant attention in recent research on intelligent sensing and actuation devices, and it seems useful to relate our work to that larger body of research.

This thesis contains a mixture of theory and applications. Our approach to setting up the duality between active vision and visual servoing was built up incrementally. In the first part of this dissertation, we were concerned with the performance of the estimator. The estimator has been defined so as to adjust its parameters in a manner that minimizes the state error covariance matrix over a receding observation horizon. Our perceptual loop was designed using a priori knowledge of the object dynamics. This decision to rely on a dynamical model of object motion, rather than a kinematic transformation, leads to a nonlinear estimation procedure. In consequence, we have chosen to limit our approach to simple features in the image plane. In studying the properties of the developed observation strategy, we have proved that the fixation approach, the one most often adopted in the visual servoing domain, is not an optimal approach. In fact, as a function of the observation horizon and the initial value of the state error covariance matrix, the observation strategy changes from keeping a tracked feature at the edge of the image plane to centering it.

The introduction of internal parameters, such as the focal length, in the optimization loop of the observation process has given us a closer insight into biological vision systems. In fact, simulation results have shown that in a tracking process, varying internal parameters simultaneously with the external ones allows better-centered images and, depending on the physical limits adopted on these internal parameters, we have observed a domination of internal parameter variations over the external ones. Moreover, simulation results have shown that, for more than one feature to track, the internal parameters are adjusted to keep an equal radius of blurring for each of them. Consequently, as for a human being, and unless another attention criterion is added to its observation criterion, an optimal camera position is the one allowing minimum measurement noise.

The development and application of a new strategy to deal with the occlusion problem in an observation process has been successful in dealing with occlusion problems related to the geometry of the target. In this strategy, and based on a known geometry of the tracked object, the occlusion has been considered as an information-carrying process that can be used in the computation of the observation strategy.

The second part of this dissertation was devoted to the development of a dual control strategy that combines trajectory planning and adaptive dual visual feedback control. Throughout, we have tried to derive an integrated purposive strategy, stressing the implications of our strategy on the quality of the estimates, their convergence time, and the grasping time. This strategy has allowed a significant improvement in the convergence time of the end-effector to the desired trajectory without altering the quality of the estimates.

In addition to these contributions, we point out several other important observations. First, a model-based approach to the sensing process turns out to be the right decision, since modern control theory can be easily included in the development of perceptual strategies. This gives us tools to deal with many problems facing vision in the robotics field, mainly time delays and sensor fusion.

6.1 FUTURE RESEARCH

Second, the observation strategy developed in this thesis can be extended to other kinds of sensors, such as sonar and radar. A common measurement strategy allows an easier integration of different observation modules, leading to a more autonomous behavior.

The coverage of this dissertation is quite broad, and so we have left many interesting and important avenues unexplored:

1.1. Sensing extension. To this point, we have discussed the information-gathering process mainly from a single-sensor, single-model, single-task point of view. The purpose of this section is to argue that the proposed approach extends to more complex task domains with multiple sensing modalities. We first argue that complex tasks such as recognition [74, 75] can be addressed in our framework. In fact, it suffices to add a term in the cost functional measuring the recognition ambiguity.

The fusion process extends directly to many sensors. The fusion is carried out by the extended Kalman filter. Actually, the filter statistically minimizes the errors in the estimates: on an ensemble-average basis, no other means of combining the data will outperform it, assuming, of course, that the internal model in the filter is adequate.
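The simplest instance of this idea is the fusion of several independent noisy measurements of the same scalar quantity. In the linear-Gaussian case, a Kalman update reduces to the inverse-variance weighting sketched below in Python. The function name `fuse` and the numbers are illustrative assumptions, not data from the thesis.

```python
def fuse(estimates):
    """Combine independent estimates of one scalar quantity.

    estimates: list of (value, variance) pairs, one per sensor.
    Returns the minimum-variance linear combination and its variance.
    """
    information = sum(1.0 / var for _, var in estimates)
    value = sum(v / var for v, var in estimates) / information
    return value, 1.0 / information

# Two hypothetical sensors observing the same distance, the second more precise.
fused_value, fused_variance = fuse([(1.0, 0.04), (1.2, 0.01)])
```

The fused variance is never larger than the smallest individual variance, which is the sense in which adding a sensor cannot hurt when its noise model is correct.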

1.2. Task extension. During task execution, other problems arise. The two most common problems are occlusion of features and singularities, either at the perception level or the robot level. Solutions to the former include intelligent observers that note the disappearance of features and predict their locations based on dynamics and geometric information, or redundant feature specifications that can perform even with some loss of information. A solution to the latter requires some combination of intelligent path planning for the manipulator and/or intelligent acquisition and focus-of-attention to maintain the observability of the system. This issue needs to be explored further. From the perspective of dual control, an approach to deal with singularity problems, which can be defined as constraints in the considered cost function, needs to be elaborated and, if possible, tested on a setup.

1.3. Cooperation. "Cooperation is the process of taking different observations and a priori information from the shared world and combining it in such a way that the process achieves a common task" [6]. This includes cooperative sensing and cooperative processing. As an extension to our present work in the cooperation domain, we propose the adoption of model-based approaches in the development of cooperative sensing and cooperative processing. This approach allows an easy integration of observations from many distributed, non-homogeneous sensors and cues. Also, the consideration of a common reference frame for similar sensors, as well as a common representation, makes this integration process easier. Further, the dual control approach can be extended to coordinate between cooperative sensing and cooperative processing.

Overall, the problem of information-gathering is an extremely rich and promising area of research that has ramifications in many domains, from object recognition to autonomous navigation. We hope that the research presented in this thesis can provide a common approach to this area of research.

REFERENCES

[1] A. E. Bryson, Jr. and Y. Ho. Applied Optimal Control. Hemisphere Publishing Corporation, New York, 1975.

[2] J. K. Aggarwal and N. Nandhakumar. On the computation of motion from sequences of images - a review. In Proceedings of the IEEE, August 1988.

[3] M. Athans. Optimal control: an introduction to the theory and its applications. McGraw-Hill, New York, 1966.

[4] M. Athans. The matrix minimum principle. Information and Control, 11:592-606, 1968.

[5] R. Bajcsy. Active perception. In Proceedings of the IEEE, August 1988.

[6] R. Bajcsy. From active perception to active cooperation: fundamental processes of intelligent behavior. Technical report, General Robotics and Active Sensory Perception Laboratory, Department of Computer and Information Science, University of Philadelphia, 1996.

[7] D. H. Ballard and C. M. Brown. Computer Vision. Prentice Hall, Englewood Cliffs, New Jersey 07632, 1982.

[8] E. Tse, Y. Bar-Shalom and L. Meier. Wide-sense adaptive dual control for nonlinear stochastic systems. IEEE Trans. on Automatic Control, AC-18(2):98-108, 1973.

[9] P. R. Bélanger. Contribution to the definition of a McGill-LAAS project. Outcome of a visit to LAAS, 1994.

[10] P. R. Bélanger. Control Engineering: A modern approach. Saunders College Publishing, 1995.

[11] K. Benameur and P. R. Bélanger. Optical parameters variations in an active visual estimator. In Proceedings of the IEEE International Conference on Decision and Control, pages 2571-2577, 1996.

[12] F. Bensalah and F. Chaumette. Detection de rupture de modèle appliquée à l'asservissement visuel. Technical report, IRISA Publication Interne No. 886, 1994.

[13] J. W. Blaker. Geometric optics: the matrix theory. Marcel Dekker, Inc., New York, 1972.

[14] Ted J. Broida and R. Chellappa. Estimating the kinematics and structure of a rigid object from a sequence of monocular images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 13, 1991.

[15] P. Burt. Algorithms and architectures for smart sensing. In Proceedings of the 1988 DARPA Image Understanding Workshop, pages 139-153, 1988.

[16] B. Espiau, C. Samson and M. Le Borgne. Robot Control: The Task Function Approach. Clarendon Press, Oxford, England, 1991.

[17] B. Espiau, F. Chaumette and P. Rives. A new approach to visual servoing in robotics. IEEE Trans. on Robotics and Automation, 8:313-326, 1992.

[18] F. Chaumette and A. Santos. Tracking a moving object by visual servoing. In IFAC 12th World Congress, Sydney, Australia, 1993.

[19] P. Rives, F. Chaumette and B. Espiau. Visual servoing based on a task function approach. In Experimental Robotics I: The First International Symposium, Montreal, June 1989.

[20] J. J. Clark and N. J. Ferrier. Modal control of an attentive vision system. pages -, 1988.

[21] J. J. Clark and N. J. Ferrier. Attentive visual servoing. In An Introduction to Active Vision, A. Blake and A. L. Yuille, eds, pages -. MIT Press, 1992.

[22] J. J. Clark and N. Twum-Danso. Visual target motion analysis using controlled camera motion. Technical Report 945, Harvard Robotics Lab., 1994.

[23] P. Corke and M. Good. Controller design for high-performance visual servoing. In IFAC 12th World Congress, Sydney, Australia, 1993.

[24] C. K. Cowan and P. D. Kovesi. Automatic sensor placement from vision task requirements. IEEE Transactions on Pattern Analysis and Machine Intelligence, 10, 1988.

[25] J. J. Craig. Introduction to robotics: mechanics and control. Addison-Wesley, Reading, Mass, 1989.

[26] T. Darrel and A. Pentland. On the representation of occluded shapes. In Proceedings of the 1991 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 728-729, 1991.

[27] E. D. Dickmanns and V. Graefe. Applications of dynamic monocular machine vision. Technical Report, Univ. BwM/LRT/WE 13/FB/88-3, 1988.

[28] T. Viéville, E. Clergue, R. Enciso and H. Mathieu. Experimenting with 3d vision on a robotic head. Robotics and Autonomous Systems, 14:1-27, 1995.

[29] B. Espiau. Effect of camera calibration errors on visual servoing in robotics. In Preprints of the Third International Symposium on Experimental Robotics, Kyoto, Japan, Oct. 28-30, 1993.

[30] P. K. Allan et al. Automated tracking and grasping of a moving object with a robotic hand-eye system. IEEE Trans. on Robotics and Automation, 9:152-165, 1993.

[31] J. T. Feddema and O. R. Mitchell. Vision-guided servoing with feature-based trajectory generation. IEEE Trans. on Robotics and Automation, 5:691-700, 1989.

[32] C. Fermuller and Y. Aloimonos. The role of fixation in visual motion analysis. International Journal of Computer Vision, 11:165-186, 1993.

[33] R. Fletcher. A new approach to variable metric algorithms. The Computer Journal, 13, 1970.

[34] A. Gerrard and J. M. Burch. Introduction to Matrix Methods in Optics. John Wiley & Sons, 1975.

[35] A. Zisserman, P. Giblin and A. Blake. Information available to a moving observer from specularities. Image & Vision Computing, 7(1):38-42, 1989.

[36] D. B. Zhang, L. V. Gool and A. Oosterlinck. Stochastic predictive control of robot tracking systems with dynamic visual feedback. In IEEE Int. Conf. Robotics and Automation, pages 610-615, 1990.

[37] G. D. Hager. Real-time feature tracking and projective invariance. In IEEE Conf. on Computer Vision and Pattern Recognition, pages 533-539, 1994.

[38] S. Hutchinson, G. D. Hager and P. I. Corke. A tutorial on visual servo control. IEEE Trans. on Robotics and Automation, 12(5):651-670, 1996.

[39] X. Zhuang, R. M. Haralick and Y. Zhao. From depth and optical flow to rigid body motion. In IEEE Int. Conf. Robotics and Automation, pages 393-397, 1988.

[40] K. Hosoda and M. Asada. Versatile visual servoing without knowledge of true jacobian. In Proc. IROS, 1990.

[41] N. Houshangi. Control of a robotic manipulator to grasp a moving target using vision. In IEEE Int. Conf. Robotics and Automation, pages 604-609, 1990.

[42] S. K. Nayar, K. Ikeuchi and T. Kanade. Surface reflection: Physical and geometrical perspectives. IEEE Trans. on Pattern Analysis and Machine Intelligence, 13:611-634, 1991.

[43] R. C. Luo, R. E. Mullen Jr. and D. E. Wessell. An adaptive robotic tracking system using optical flow. In IEEE Int. Conf. Robotics and Automation, pages 568-573, 1988.

[44] N. P. Papanikolopoulos, P. K. Khosla and T. K. Kanade. Visual tracking of a moving target by a camera mounted on a robot: a combination of control and vision. IEEE Trans. on Robotics and Automation, 9:14-35, 1993.

[45] R. Kingslake. The development of the zoom lens. Journal of the SMPTE, 69:534-544, 1960.

[46] R. Kingslake. Optical System Design. Academic Press, Inc., 24/28 Oval Road, London NW1 7DX, 1983.

[47] E. Krotkov and R. Bajcsy. Active vision for reliable ranging: Cooperating focus, stereo, and vergence. International Journal of Computer Vision, 11(2):187-203, 1993.

[48] E. P. Krotkov. Exploratory visual sensing with an agile camera system: Ph.D. Dissertation, TR-87-29. PhD thesis, University of Pennsylvania, Philadelphia, 1987.

[49] J. T. Feddema, C. S. Lee and O. R. Mitchell. Automatic selection of image features for visual servoing of a robot manipulator. In IEEE Int. Conf. Robotics and Automation, pages 833-837, 1990.

[50] D. Marr. Vision, A Computational Investigation into the Human Representation and Processing of Visual Information. W. H. Freeman, San Francisco, 1982.

[51] P. S. Maybeck. Stochastic models, estimation, and control, volume 1. Academic Press, 1979.

[52] P. S. Maybeck. Stochastic models, estimation, and control, volume 2. Academic Press, 1982.

[53] D. Q. Mayne and H. Michalska. Receding horizon control of nonlinear systems. IEEE Transactions on Automatic Control, 35:814-824, 1990.

[54] A. Blake, T. Michael and A. Cox. Grasping visual symmetry. In Proceedings of the 1993 4th Int. Conference on Computer Vision, pages 724-733, Los Alamitos, CA, USA, 1993.

[55] S. K. Mitter. Successive approximation methods for the solution of optimal control problems. Automatica, 3:135-149, 1966.

[56] B. Nelson and P. K. Khosla. Integrating sensor placement and visual tracking strategies. In Preprints of the Third International Symposium on Experimental Robotics, Kyoto, Japan, Oct. 28-30, 1993.

[57] C. L. Novak and S. A. Shafer. Method for estimating scene parameters from color histograms. Journal of the Optical Society of America. A, Optics, Image Science, & Vision, 11(11):3020-3036, 1994.

[58] Attendees of the NSF Active Vision Workshop. Promising directions in active vision. International Journal of Computer Vision, 11:2, pages 109-126, 1993.

[59] N. P. Papanikolopoulos and P. K. Khosla. Feature based robotic visual tracking of 3-d translational motion. In Proceedings of the 30th Conference on Decision and Control, pages 1877-1882, 1991.

[60] A. Papoulis. Probability, Random Variables and Stochastic Processes. McGraw-Hill, Inc., New York, 1965.

[61] A. P. Pentland. A new sense for depth of field. IEEE Trans. on Pattern Analysis and Machine Intelligence, PAMI-9(4), 1987.

[62] J. M. Lavest, G. Rives and M. Dhome. Three-dimensional reconstruction by zooming. IEEE Transactions on Robotics and Automation, 10(2), 1993.

[63] A. P. Sage. Optimum Systems Control. Prentice-Hall, Inc., Englewood Cliffs, N.J., 1968.

[64] L. E. Weiss, A. C. Sanderson and C. P. Neuman. Dynamic sensor-based control of robots with visual feedback. In IEEE Journal of Robotics and Automation, pages 404-417, October 1987.

[65] Y. Shirai and H. Inoue. Guiding a robot by visual feedback in assembling tasks. IEEE Trans. on Pattern Recognition, 5:99-108, 1973.

[66] O. Silven. Estimating the pose and motion of a known object for real-time robotic tracking. Technical report, University of Maryland, 1990.

[67] K. Tarabanis and R. Y. Tsai. Computing viewpoints that satisfy optical constraints. In Proceedings of the 1992 IEEE Conference on Robotics and Automation, pages 399-405, 1992.

[68] C. Tomasi and T. Kanade. Shape and motion from image streams under orthography: a factorization method. Int. Journal of Computer Vision, 9(2):137-154, 1992.

[69] R. Y. Tsai. A versatile camera calibration technique for high-accuracy 3d machine vision metrology using off-the-shelf tv cameras and lenses. IEEE Journal of Robotics and Automation, RA-3(4):323-344, 1987.

[70] T. Viéville. A Few Steps Toward 3D Active Vision. Technical Report, INRIA, France, 1994.

[71] W. K. Wai and J. K. Tsotsos. Directing attention to onset and offset of image events for eye-head movement control. In Proceedings of the Int. Conference on Pattern Recognition, pages 274-279, 1994.

[72] J. J. Clark, M. J. Weisman and A. L. Yuille. Using viewpoint consistency information in active stereo vision. In Proceedings of SPIE Conference on Intelligent Robots and Computer Vision XI, Boston, MA, 1992.

[73] J. Aloimonos, I. Weiss and A. Bandyopadhyay. Active vision. Int. Journal of Computer Vision, pages 332-355, 1988.

[74] P. Whaite and F. P. Ferrie. From uncertainty to visual exploration. IEEE Transactions on Pattern Analysis and Machine Intelligence, 13(10):1038-1049, 1991.

[75] P. Whaite and F. P. Ferrie. Autonomous exploration: Driven by uncertainty. In Proceedings of the Conference on Computer Vision and Pattern Recognition, pages 339-346, 1994.

[76] M. Athans, R. H. Whiting and M. Gruber. A suboptimal estimation algorithm with probabilistic editing for false measurements with applications to target tracking with wake phenomena. IEEE Transactions on Automatic Control, AC-22(3):372-384, 1977.

[77] A. S. Willsky and H. L. Jones. A generalized likelihood ratio approach to the detection and estimation of jumps in linear systems. IEEE Trans. on Automatic Control, pages 108-112, 1976.

[78] R. G. Willson and S. A. Shafer. Active lens control for high precision computer imaging. In Proceedings of the Int. Conference on Robotics and Automation, pages 2063-2070, 1991.

[79] W. J. Wilson. Visual servo control of robots using kalman filter estimates of relative pose. In The 12th World Congress International Federation of Automatic Control, pages 399-404, 1993.

[80] R. J. Woodham. Gradient and curvature from the photometric-stereo method, including local confidence estimation. Journal of the Optical Society of America. A, Optics, Image Science, & Vision, 11(11):3050-3068, 1994.

[81] Y. Yeshurun and E. L. Schwartz. Shape description with a space-variant sensor: Algorithms for scan-path, fusion, and convergence over multiple scans. IEEE Trans. on Pattern Analysis and Machine Intelligence, 11(11):1217-1222, 1989.

[82] Z. Zhang and O. D. Faugeras. Three-dimensional motion computation and object segmentation in a long sequence of stereo frames. Int. Journal of Computer Vision, 7(3):211-241, 1992.

APPENDIX A

Matrix method in paraxial optics

In this appendix, we review how matrices can be used to describe the geometric formation of images by a centered lens system [34]. The presented results are valid under two assumptions:

• The wavelength of light is negligibly small, so that the propagation of light can be described in terms of individual rays.

• Only paraxial rays are considered, so that first-order approximations for the sines or tangents of any angle can be used. The optics of paraxial imaging is often referred to as Gaussian optics.

1. Ray-transfer matrices

The trajectory of a ray as it passes through the various refracting surfaces of the system consists of a series of straight lines, each of which can be specified by the coordinate of one of its points and the angle it makes with the optic axis z. If we choose in advance any plane which is perpendicular to the z axis, we can consider it as a reference plane RP. With respect to any particular reference plane, a ray is specified by the height $y$ at which it intersects the reference plane and the angle $v$ which it makes with the z direction. In practice a new RP is chosen for each stage of the initial calculation. This means that ray data are continually transferred from one RP to the next as we consider the various elements of the system. However, once this initial calculation has been done right through the system, we emerge with an overall ray-transfer matrix which converts all the ray data we wish to consider from the chosen input RP to the chosen output RP. It makes the computation more convenient to replace the ray angle $v$ by the corresponding "optical-direction cosine" $V = nv$, where $n$ is the refractive index of the medium in which the ray is travelling. This greatly simplifies calculations and ensures that all matrices involved are unimodular.

FIGURE A.1. Translation of a ray by a distance t to the right between two reference planes

As a ray passes through a refracting lens system, there are only two basic types of process that we need to consider in order to determine its progress:

(i) A translation, or gap across which the ray simply launches itself in a straight line. The gap is specified by the thickness $t$ and also the refractive index $n$ of the medium.

(ii) Refraction at the boundary surface between two regions of different refractive index. To determine how much bending the ray undergoes, we need to know the radius of curvature of the refracting surface and the two values of the refractive index.

2. The translation matrix T

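Figure A.1 shows an example of a ray travelling a distance $t$ to the right between two reference planes. Referring to figure A.1, under the paraxial approximation we have

\[ y_2 = y_1 + t \tan v_1 \approx y_1 + t\,v_1, \qquad v_2 = v_1 \tag{A.1} \]

It has already been pointed out that the ray quantities upon which the translation matrix is going to operate are the height of a ray and its optical direction. So if $n$ is the refractive index of the medium between RP1 and RP2, equation A.1 can be rewritten

\[ y_2 = y_1 + \frac{t}{n}(n v_1) = y_1 + T V_1 \tag{A.2} \]

where $T = t/n$ is the reduced thickness of the gap. It is apparent from (A.1) that $v_1$ equals $v_2$, so the equation for the new optical direction can be written

\[ V_2 = n v_2 = n v_1 = V_1 \]

It follows that

\[ \begin{bmatrix} y_2 \\ V_2 \end{bmatrix} = \begin{bmatrix} 1 & T \\ 0 & 1 \end{bmatrix} \begin{bmatrix} y_1 \\ V_1 \end{bmatrix} \]

So the matrix representing a translation to the right through a reduced distance $T$ is

\[ \mathcal{T} = \begin{bmatrix} 1 & T \\ 0 & 1 \end{bmatrix} \]

Provided that each $i$th layer is represented by its reduced thickness $T_i = t_i/n_i$, all the individual translation matrices can be multiplied together to produce a single matrix for the effect of the whole gap. The $T$ value appearing in the product matrix is just the sum of the $T$ values in the individual matrices:

\[ \prod_i \begin{bmatrix} 1 & T_i \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & \sum_i T_i \\ 0 & 1 \end{bmatrix} \]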
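The summation property of translation matrices can be checked numerically. The following Python sketch is only an illustration: the helper names `translation` and `matmul` and the two layer thicknesses are assumptions introduced here, not values from the thesis.

```python
def translation(t, n=1.0):
    """Translation matrix for a gap of thickness t in a medium of index n."""
    reduced = t / n  # reduced thickness T = t / n
    return ((1.0, reduced), (0.0, 1.0))


def matmul(a, b):
    """Product a @ b of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


# Hypothetical layers: 30 mm of glass (n = 1.5) followed by 10 mm of air.
glass = translation(0.030, 1.5)
air = translation(0.010, 1.0)

# The ray meets the glass first, so the glass matrix stands on the right.
gap = matmul(air, glass)

# Sum of the reduced thicknesses of the two layers.
total_reduced = 0.030 / 1.5 + 0.010 / 1.0
```

The upper-right element of `gap` equals `total_reduced` up to rounding, and the other three elements are unchanged, so the product is again a translation matrix.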

3. The refraction matrix

In this section, we present the action of a curved surface separating two regions of refractive index $n_1$ and $n_2$. The radius of curvature $r$ of the surface is taken as positive if the centre of curvature lies to the right of the surface. The situation illustrated in figure A.2 shows a surface of positive curvature with the refractive index $n_2$ on the right of the surface greater than that on the left, $n_1$. The presented ray has positive $y$ and $V$ values on both sides of the surface.

FIGURE A.2. Refraction of a ray between two surfaces of refractive index n1 and n2

Applying Snell's law under the paraxial assumption, we have

\[ n_1 i_1 = n_2 i_2 \tag{A.7} \]

By the exterior angle theorem, we have

\[ i_1 = v_1 + \alpha = v_1 + \frac{y_1}{r} \quad \text{and} \quad i_2 = v_2 + \alpha = v_2 + \frac{y_1}{r} \tag{A.8} \]

It follows from equations A.7 and A.8 that

\[ V_2 = n_2 v_2 = n_1 v_1 - \frac{(n_2 - n_1)}{r}\, y_1 = V_1 - \frac{(n_2 - n_1)}{r}\, y_1 \]

Rearranging the equations into a matrix form, and noting that $y_2 = y_1$ at the surface, we obtain

\[ \begin{bmatrix} y_2 \\ V_2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ -(n_2 - n_1)/r & 1 \end{bmatrix} \begin{bmatrix} y_1 \\ V_1 \end{bmatrix} \]

The quantity $(n_2 - n_1)/r$ is usually defined as the refracting power of the surface. Investigating the other cases, where the change of refractive index or the curvature is reversed or where the $y$ or $V$ values are negative, shows that the same refraction matrix represents the refraction process.

4. Thin lens approximation

If each $i$th refracting surface possesses a radius of curvature $r_i$ and refractive indices $n_i$ and $n_{i+1}$, we shall be able to represent its refracting power by $P_i = (n_{i+1} - n_i)/r_i$, and the matrix for the thin lens combination will be

\[ \begin{bmatrix} 1 & 0 \\ -\sum_i P_i & 1 \end{bmatrix} \]

The refraction matrix of a single thin lens, for example, is the same whichever way round it is mounted. Its refracting power is defined by

\[ P = \frac{1}{f} = (n - 1)\left(\frac{1}{r_1} - \frac{1}{r_2}\right) \]

The refracting power is measured in dioptres; the focal length $f$ and the radii of curvature are expressed in metres. In general, if calculations are to be made with a series of thin lenses, we shall encounter an alternating sequence of R and T matrices and must consider carefully the order in which they arise. For any refraction-translation product, matrix multiplication is not commutative.
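A short numerical sketch may help fix both points: two refracting surfaces with no gap between them add their powers, and a refraction followed by a translation differs from the reverse order. The lens data (index 1.5, radii of ±0.1 m, a 0.05 m gap) and the helper names are hypothetical, chosen only for illustration.

```python
def matmul(a, b):
    """Product a @ b of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def translation(t, n=1.0):
    """Translation matrix for a gap of reduced thickness t / n."""
    return ((1.0, t / n), (0.0, 1.0))


def refraction(n1, n2, r):
    """Refraction matrix of a surface of radius r between indices n1 and n2."""
    power = (n2 - n1) / r
    return ((1.0, 0.0), (-power, 1.0))


# Hypothetical biconvex thin lens in air: n = 1.5, r1 = +0.1 m, r2 = -0.1 m.
front = refraction(1.0, 1.5, 0.1)
back = refraction(1.5, 1.0, -0.1)
lens = matmul(back, front)      # no gap between the two surfaces
power = -lens[1][0]             # P1 + P2, in dioptres
focal_length = 1.0 / power      # in metres

# Order matters once a translation is involved.
gap = translation(0.05)
lens_then_gap = matmul(gap, lens)   # ray meets the lens first
gap_then_lens = matmul(lens, gap)   # ray crosses the gap first
```

Here `power` comes out at about 10 dioptres (a focal length of about 0.1 m), and the two products differ in their diagonal elements even though both remain unimodular.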

For a given optical system, to obtain an overall ray-transfer matrix $M$ that enables us to convert an input ray vector $\begin{bmatrix} y_1 \\ V_1 \end{bmatrix}$ into an output ray vector $\begin{bmatrix} y_2 \\ V_2 \end{bmatrix}$, we need to:

(i) Choose reference planes with respect to the refracting surfaces of the system.

(ii) Write down the translation or refraction matrices that represent each of the elements between the various reference planes, working from left to right through the system.

(iii) Calculate the output ray produced by a given input ray by using the matrix product of the successive matrices defining the transformation from one reference plane to the next. The product is written starting from the right side, so that the matrix of the first element met by the ray stands rightmost. It may be helpful to visualize this order as that which is seen by an observer looking back from the output reference plane towards the light source. The matrices which follow can then be regarded as the links in a mathematical chain that brings us back to the input.

Assuming that we have determined the optical system matrix $M = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$, where $AD - BC = 1$, a better understanding of the role of each element of the matrix $M$ can be obtained by considering what happens if one of them vanishes:

• If $D = 0$, all rays entering the input plane at the same height $y_1$ emerge at the output plane making the same angle. It follows that RP1 is the first focal plane under these conditions.

• If $B = 0$, the equation for $y_2$ reads $y_2 = A y_1$. This means that all rays leaving the point O characterized by $y_1$ will pass through the point I characterized by $y_2$ in RP2. Thus O and I are object and image points. It follows that $B = 0$ is required to assure an object-image relationship.

• If $C = 0$, all rays which enter the system parallel to one another will emerge parallel to one another in a new direction.

• If $A = 0$, the equation for $y_2$ reads $y_2 = B V_1$. This means that all rays entering the system at the same angle will pass through the same point in the image plane. It follows that RP2 is the second focal plane of the system under these conditions.
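Two of these cases can be illustrated with a single thin lens in air. The sketch below is an assumption-laden example rather than a result from the thesis: the focal length, the object distance and the use of the thin-lens equation to place the output plane are all choices made here.

```python
def matmul(a, b):
    """Product a @ b of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def translation(t):
    """Translation matrix for a gap of thickness t in air (n = 1)."""
    return ((1.0, t), (0.0, 1.0))


def thin_lens(f):
    """Refraction matrix of a thin lens of focal length f in air."""
    return ((1.0, 0.0), (-1.0 / f, 1.0))


def det(m):
    """Determinant of a 2x2 matrix; 1 for any ray-transfer matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


f = 0.1        # hypothetical focal length, metres
s_obj = 0.3    # hypothetical object distance (RP1 to lens), metres
s_img = 1.0 / (1.0 / f - 1.0 / s_obj)  # thin-lens equation places RP2

# Matrices are written from the output side back to the input side.
imaging = matmul(translation(s_img), matmul(thin_lens(f), translation(s_obj)))
(A, B), (C, D) = imaging   # B vanishes: RP1 and RP2 are conjugate planes

# RP1 at the lens and RP2 one focal length behind it: A vanishes.
focal = matmul(translation(f), thin_lens(f))
```

With these numbers `B` is zero to rounding error and `A` is about -0.5, the transverse magnification of the inverted image, while `focal[0][0]` vanishes because RP2 then lies in the second focal plane.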

5. Cardinal points of an optical system

Following the analysis of the meaning of each term of the matrix $M = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$, which links a chosen output plane RP2 to a chosen input plane RP1, we now present, with respect to these reference planes, the two focal points, the principal planes of unit transverse magnification and the nodal planes of unit angular magnification. $n_1$ and $n_2$ are assumed to be, respectively, the refractive indices to the left and to the right of the system. For convenience, we recapitulate all these relationships in this table:

System parameter          Measured         Function of          Special case
described                 from    to       matrix elements      n1 = n2 = 1

First focal point         RP1     F1       n1 D / C             D / C
First focal length        F1      H1       -n1 / C              -1 / C
First principal point     RP1     H1       n1 (D - 1) / C       (D - 1) / C
First nodal point         RP1     L1       (D n1 - n2) / C      (D - 1) / C
Second focal point        RP2     F2       -n2 A / C            -A / C
Second focal length       H2      F2       -n2 / C              -1 / C
Second principal point    RP2     H2       n2 (1 - A) / C       (1 - A) / C
Second nodal point        RP2     L2       (n1 - A n2) / C      (1 - A) / C
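The tabulated expressions translate directly into a small routine. The sketch below is illustrative only: the function name `cardinal_points`, the dictionary keys and the example system (a thin lens of focal length 0.1 m in air, with RP1 placed 0.05 m in front of it and RP2 0.02 m behind it) are assumptions made here.

```python
def cardinal_points(m, n1=1.0, n2=1.0):
    """Signed distances to the cardinal points from a ray-transfer matrix.

    m is ((A, B), (C, D)); n1 and n2 are the refractive indices to the
    left and to the right of the system. Distances are positive to the right.
    """
    (a, _b), (c, d) = m
    if c == 0:
        raise ValueError("C = 0: the system has no finite focal points")
    return {
        "RP1->F1": n1 * d / c,          # first focal point
        "F1->H1": -n1 / c,              # first focal length
        "RP1->H1": n1 * (d - 1.0) / c,  # first principal point
        "RP1->L1": (d * n1 - n2) / c,   # first nodal point
        "RP2->F2": -n2 * a / c,         # second focal point
        "H2->F2": -n2 / c,              # second focal length
        "RP2->H2": n2 * (1.0 - a) / c,  # second principal point
        "RP2->L2": (n1 - a * n2) / c,   # second nodal point
    }


# Hypothetical system: T(0.02) * L(f = 0.1) * T(0.05), all in air.
system = ((0.8, 0.06), (-10.0, 0.5))
points = cardinal_points(system)
```

For this system both principal points fall on the lens itself (0.05 m to the right of RP1 and 0.02 m to the left of RP2), and both focal lengths come out at 0.1 m, as expected for a thin lens.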

APPENDIX B

Modified nonlinear estimation

In this approach, rather than assuming known properties of the new measurement, mainly whether it carries an image of the tracked feature, we limit our assumption to a probability term. We assume that we know a priori

• $P_{\mathcal{T}}(k+1)$, the probability that it is the tracking situation,

• $P_{\mathcal{O}}(k+1)$, the probability that there is an occlusion,

such that

\[ P_{\mathcal{T}}(k+1) + P_{\mathcal{O}}(k+1) = 1 \]

We define the event $V$ such that

• $V = \mathcal{T}(k+1)$ denotes the tracking event,

• $V = \mathcal{O}(k+1)$ denotes the occlusion event.

Standard Bayesian manipulations [76] yield the following expression for the posterior pdf $p(x(k+1) \mid y(k+1))$:

\[ p(x(k+1) \mid y(k+1)) = \frac{p(x(k+1) \mid y(k+1), \mathcal{T}(k+1))\, p(y(k+1) \mid \mathcal{T}(k+1))\, P_{\mathcal{T}}(k+1) + p(x(k+1) \mid y(k+1), \mathcal{O}(k+1))\, p(y(k+1) \mid \mathcal{O}(k+1))\, P_{\mathcal{O}}(k+1)}{p(y(k+1))} \tag{B.2} \]

where

• $p(y(k+1))$ is the unconditional pdf of the measurement at instant $k+1$,

• $p(x(k+1) \mid y(k+1), \mathcal{T}(k+1))$ is the conditional pdf of $x(k+1)$ given the measurement $y(k+1)$ and given the event $V = \mathcal{T}(k+1)$,

• $p(x(k+1) \mid y(k+1), \mathcal{O}(k+1))$ is the conditional pdf of $x(k+1)$ given the measurement $y(k+1)$ and given the event $V = \mathcal{O}(k+1)$,

• $p(y(k+1) \mid \mathcal{T}(k+1))$ is the conditional pdf of $y(k+1)$ given that it is a tracking case,

• $p(y(k+1) \mid \mathcal{O}(k+1))$ is the conditional pdf of $y(k+1)$ given that there is an occlusion.

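The pdf $p(y(k+1))$ can be decomposed further. Use of the marginal density and Bayes' rule yields

\[ p(y(k+1)) = p(y(k+1) \mid \mathcal{T}(k+1))\, P_{\mathcal{T}}(k+1) + p(y(k+1) \mid \mathcal{O}(k+1))\, P_{\mathcal{O}}(k+1) \tag{B.3} \]

We calculate the probabilities

(i) $\Pr(\mathcal{T}(k+1) \mid y(k+1))$: probability that $y(k+1)$ carries the tracked feature's image.

(ii) $\Pr(\mathcal{O}(k+1) \mid y(k+1))$: probability that $y(k+1)$ corresponds to the occlusion case.

Evidently

\[ \Pr(\mathcal{T}(k+1) \mid y(k+1)) + \Pr(\mathcal{O}(k+1) \mid y(k+1)) = 1 \]

The pdf of the random variable is then

\[ p(x(k+1) \mid y(k+1)) = \Pr(\mathcal{T}(k+1) \mid y(k+1))\, p(x(k+1) \mid y(k+1), \mathcal{T}(k+1)) + \Pr(\mathcal{O}(k+1) \mid y(k+1))\, p(x(k+1) \mid y(k+1), \mathcal{O}(k+1)) \tag{B.5} \]

By substituting equation B.3 into equation B.2 and by matching the coefficients of the resultant functions with those of equation B.5, we deduce that

\[ \Pr(\mathcal{T}(k+1) \mid y(k+1)) = \frac{p(y(k+1) \mid \mathcal{T}(k+1))\, P_{\mathcal{T}}(k+1)}{p(y(k+1))}, \qquad \Pr(\mathcal{O}(k+1) \mid y(k+1)) = \frac{p(y(k+1) \mid \mathcal{O}(k+1))\, P_{\mathcal{O}}(k+1)}{p(y(k+1))} \]

Using the conditional posterior probability $\beta(k+1) = \Pr(\mathcal{T}(k+1) \mid y(k+1))$, which can be explicitly calculated from (B.7) and (B.5):

\[ \beta(k+1) = \frac{p(y(k+1) \mid \mathcal{T}(k+1))\, P_{\mathcal{T}}(k+1)}{p(y(k+1) \mid \mathcal{T}(k+1))\, P_{\mathcal{T}}(k+1) + p(y(k+1) \mid \mathcal{O}(k+1))\, P_{\mathcal{O}}(k+1)} \tag{B.8} \]

the posterior probability density is then defined by

\[ p(x(k+1) \mid y(k+1)) = \beta(k+1)\, p(x(k+1) \mid y(k+1), \mathcal{T}(k+1)) + (1 - \beta(k+1))\, p(x(k+1) \mid y(k+1), \mathcal{O}(k+1)) \]

Once the conditional posterior probability density $p(x(k+1) \mid y(k+1))$ is obtained, the conditional posterior mean of $x(k+1)$ can be computed:

\[ \hat{x}(k+1 \mid k+1) = \beta(k+1)\, \hat{x}_{\mathcal{T}}(k+1 \mid k+1) + (1 - \beta(k+1))\, \hat{x}_{\mathcal{O}}(k+1 \mid k+1) \]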

We consider that $\hat{x}_{\mathcal{T}}(k+1 \mid k+1)$ represents the posterior estimate of $x(k+1)$ under the assumption of a tracking situation. For the occlusion case, the measurement $y(k+1)$ carries no information on the state, and we have

\[ \hat{x}_{\mathcal{O}}(k+1 \mid k+1) = \hat{x}(k+1 \mid k) \]

At this level, information derived from the occlusion case is added in the update of the state $x(k+1)$.

In this paragraph we apply the derived results to the special case of the extended Kalman filter. As a first assumption, we consider the following relationship for the occlusion $V = \mathcal{O}(k+1)$:

\[ y(k+1) = \gamma(k+1) \]

where $y(k+1)$ is assumed to be taken from a Gaussian population with large variance, the pdf of $\gamma$ being $N(0, \sigma(k+1))$. $\gamma$ is independent of the state vector estimate. It is evident that

The probabilistic algorithm is then defined as follows:

Initialization: at $k = 0$

\[ \hat{x}(0 \mid 0) = x(0); \quad P(0 \mid 0) = P_0 \]

Prediction:

Update: We compute the residual

and its covariance matrix

For the case where $V = \mathcal{T}(k+1)$ we have

The gain vector is then defined by

Tracking:

Given $\beta(k+1)$, the conditional posterior probability defined by (B.8), we have

Update state estimate:

Update covariance matrix:
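The structure of one update step can be sketched for a scalar state with a linear measurement $y = h x + \text{noise}$. This is a minimal illustration under stated assumptions, not the thesis's algorithm: the function name `weighted_update`, the zero-mean, large-variance occlusion likelihood (`occ_var`), and in particular the moment-matched mixture covariance are choices made here; the thesis's own prediction, gain and covariance expressions are not reproduced.

```python
import math


def gaussian_pdf(z, mean, var):
    """Density of a scalar Gaussian N(mean, var) evaluated at z."""
    return math.exp(-0.5 * (z - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def weighted_update(x_pred, p_pred, y, h, r, p_track, occ_var):
    """One probabilistically weighted measurement update (scalar sketch).

    x_pred, p_pred : predicted state and variance
    y, h, r        : measurement, measurement gain, noise variance
    p_track        : prior probability of the tracking event
    occ_var        : assumed large variance of y under occlusion
    """
    residual = y - h * x_pred
    s = h * p_pred * h + r              # residual variance (tracking case)
    gain = p_pred * h / s

    # Posterior probability of the tracking event given y (Bayes' rule).
    like_track = gaussian_pdf(y, h * x_pred, s)
    like_occ = gaussian_pdf(y, 0.0, occ_var)
    beta = like_track * p_track / (
        like_track * p_track + like_occ * (1.0 - p_track)
    )

    # Tracking-case update; the occlusion case keeps the prediction.
    x_track = x_pred + gain * residual
    p_trk = (1.0 - gain * h) * p_pred
    x_post = beta * x_track + (1.0 - beta) * x_pred

    # Assumed mixture covariance: weighted variances plus spread of the means.
    spread = beta * (1.0 - beta) * (gain * residual) ** 2
    p_post = beta * p_trk + (1.0 - beta) * p_pred + spread
    return x_post, p_post, beta
```

With `p_track = 1` the step reduces to the ordinary Kalman update, and with `p_track = 0` the prediction is returned unchanged, which matches the two limiting cases described in the text.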

0.1. Derivation of $P_{\mathcal{O}}(k+1)$. In this suboptimal approach, we derive the expression of $P_{\mathcal{O}}(k+1)$ as a function of the predicted state and covariance matrix. It follows that at instant $k+1$

where

\[ \Phi^*(z) = \int \exp(-t^2/2)\, dt \tag{B.24} \]

defines the normalized error function. This is a more conservative approach to the filtering process than the one derived in chapter 4, since both update terms of the state vector are weighted. One interesting application of this approach is in unstructured environments, where we can only fix to a certain value the probability of an occlusion.
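The idea of obtaining an event probability from the predicted state and its variance can be sketched with the standard normal distribution function. Everything specific below is an assumption: the scalar Gaussian prediction, the interval $[0, \pi]$, the names `normal_cdf` and `prob_inside`, and the choice of the standard normal CDF as the normalization of the error function. Which of the two events (tracking or occlusion) corresponds to the predicted state lying inside the bounds is left open here.

```python
import math


def normal_cdf(z):
    """Standard normal distribution function, written with math.erf."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_inside(mean, var, lower, upper):
    """Probability that a Gaussian N(mean, var) variable lies in [lower, upper]."""
    sigma = math.sqrt(var)
    return normal_cdf((upper - mean) / sigma) - normal_cdf((lower - mean) / sigma)


# Hypothetical predicted angle: mean pi/2, variance 0.25, bounds [0, pi].
p_in = prob_inside(math.pi / 2.0, 0.25, 0.0, math.pi)
p_out = 1.0 - p_in
```

As the predicted variance grows, `p_in` decreases and `p_out` increases, so a less certain prediction puts more weight on the complementary event.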

1. Simulations

Simulations are presented for the same example defined in chapter 4, section 7, except for the bounds $B_1$ and $B_2$, which are defined as follows:

\[ B_1 = n\pi, \qquad B_2 = (n+1)\pi, \qquad n \text{ even integer} \]

A comparative study of the performance of the proposed filter and that of the filter defined in chapter 4 is presented.

Simulation results are presented in figures B.1 and B.2. Figure B.1 presents the real trajectory of the object in the inertial frame and the time intervals corresponding to the occlusion event. The plot of the trace of the state error covariance shows that similar performances are obtained from both modified Kalman filters. We can explain these similar performances by the fact that the approach including occlusion information and probabilistic editing assigns a weight to each measurement based upon the statistical measure of the presence of the object in the field of view. In this manner, information which lies in the "gray" areas, i.e., those corresponding to state estimates with a large probability of error, is utilized to an extent dependent upon the computed weighting factor. From figure B.2, it is evident that the weighting factor flips almost at the same time as the occlusion takes place. This result is obtained by considering the covariance of the measurement noise in the occlusion as tending to infinity. Moreover, it is clear from figure B.2 that the faster the object rotates, the smaller is the time interval corresponding to the out-of-view state. Therefore we can conclude that the proposed method can be considered as the filtering approach in a measurement strategy where we only know the probability of occurrence of an occlusion.

FIGURE B.1. Real object trajectory and filter performances (trace of the state error covariance matrix): first probabilistic approach (dashed) and second probabilistic approach (solid)

FIGURE B.2. P(.) The probability of occlusion of the tracked feature

APPENDIX C

The closed-loop optimization

In this appendix, we present the detailed derivation of the deterministic expression that has to be minimized with respect to $u(k)$.

Rewriting equation 5.40 in the stochastic dynamic programming form, with the assumed closed-loop optimal expected cost-to-go in the form of equation 5.41, yields

Given the fact that the perturbations obey the following dynamic equations (with terms up to the first order):

for $j = k+1, \dots, N-1$ with initial conditions

we have:

and rearranging terms in equation C.5 it becomes

The optimal perturbation control is derived from

It is defined by:

where

Reinserting equation C.8 into equation C.6 yields

and notice that

where $P_n(j \mid j)$ is the covariance of the future updated state along the nominal. With this, equation C.12 can be rewritten as:

Thus it can be seen that equation C.15 is indeed the assumed form of equation 5.46, and the recursions for $g_n$, $H_n$, $J_n$, $K_n$, $L_n$, $M_n$ and $N_n$ are, using the notation (equation


for j = N - 1, - - - , k + 1; K,(N) = - ~ & ( N ) W I ( N ) ~ +X~(N)W~P Z n


for j = N - 1 - - - - ,k+ 1; Ln(N) = ~ w L ( N ) ~

for j = N - 1, - - - , k + 1; Mn(N) = -WI(N)%

for j = N - 1, - , k + 1; Nn(N) = - . g ~ l ( ~ )

In order to separate the stochastic effects in the expected cost, we introduce

for j = N - 1, - - - , k + 1; ï n ( N ) = O then

This completes the proof of equation 5.46.

APPENDIX D

Observability

In state form, the parameter estimation problem is

We limit the study to the observability of the system linearized about a trajectory; i.e.

where the Jacobian is

evaluated along some reference trajectory $x_n(\cdot)$. For observability, it is necessary and sufficient that there exist no vector $\Delta x(0) \neq 0$ such that $\Delta y(k) = 0$ for all $k > 0$ [9, 10].
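For constant parameters, this condition amounts to requiring that the measurement Jacobians stacked over time have full column rank. The sketch below checks this numerically on a toy model that is purely hypothetical and much simpler than the camera geometry of this appendix: a measurement $y(k) = a \sin q_k + b \cos q_k$ with unknown constants $(a, b)$ and a known angle $q_k$. The function names are also assumptions.

```python
import math


def rank(rows, tol=1e-9):
    """Rank of a small dense matrix by Gaussian elimination with pivoting."""
    m = [list(r) for r in rows]
    ncols = len(m[0])
    rk = 0
    for col in range(ncols):
        # Pick the largest remaining entry in this column as pivot.
        pivot = max(range(rk, len(m)), key=lambda i: abs(m[i][col]), default=None)
        if pivot is None or abs(m[pivot][col]) < tol:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(rk + 1, len(m)):
            factor = m[i][col] / m[rk][col]
            for j in range(col, ncols):
                m[i][j] -= factor * m[rk][j]
        rk += 1
        if rk == len(m):
            break
    return rk


def stacked_jacobian(angles):
    """Rows dy(k)/d[a, b] for the toy model y(k) = a*sin(q_k) + b*cos(q_k)."""
    return [[math.sin(q), math.cos(q)] for q in angles]


moving = stacked_jacobian([0.1 * k for k in range(5)])  # angle varies with time
fixed = stacked_jacobian([0.3] * 5)                     # angle held constant
```

`rank(moving)` is 2, so no nonzero perturbation of $(a, b)$ is invisible in the measurements, whereas `rank(fixed)` is 1: with a constant angle one direction in parameter space cannot be observed.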

We consider the observability with measurements from the platform-mounted camera.

The parameters to be estimated are: z = [O,, w,, L,,, L,,, Ln, Lw]

The condition for non-observability is then

Let us use a single observation, the image of feature point a, and in fact only the y-component. Recall that

and

Point a:

\[ x_a = \left(L_p + \tfrac{1}{2}D\right)\sin\theta_p - \frac{H}{2}\cos\theta_p \]

\[ y_a = -\left(L_p + \tfrac{1}{2}D\right)\cos\theta_p - \frac{H}{2}\sin\theta_p \]

d y 1 ca -- axa

f D - cos(qc) + N sin(qc)

D2

&a - - 1 H - ( L , + D) sin(#,) - - cos(8,) 30, 2 axa - = O

\[ \frac{\partial x_a}{\partial L_p} = \sin(\theta_p), \qquad \frac{\partial y_a}{\partial L_p} = -\cos(\theta_p) \]

We introduce the small angle approximations $\sin(\theta_p) \approx \theta_p$, $\cos(\theta_p) \approx 1$. We write

To establish linear independence, we reason as follows. First: because and contain sines and cosines of $\theta_p$, Equation D.9 is the sum of terms in sines of $\theta_p$ and cosines of $\theta_p$, it follows that it is independent of the last two. This leaves the final two, for $L_{xc}$ and $L_{yc}$. For these to be dependent, there must exist non-zero $\alpha$ and $\beta$ such that

(D.12)

The $\theta_p$ time dependence cancels out if

However, if $q_c$, the camera angle, is not constant, it will be impossible to cancel the $q_c$ time variation for all $t$, unless

and, as well,

We need to have

for equation D.13 to work, and very special values of L, and L,, to satisfy equation D.14. It is assumed that such an event does not occur.

Document Log:

Manuscript Version 1

Typeset by AMS-LaTeX, 30 January 1998

CENTRE FOR INTELLIGENT MACHINES, MCGILL UNIVERSITY, 3480 UNIVERSITY ST., MONTRÉAL (QUÉBEC) H3A 2A7, CANADA, Tel.: (514) 398-8202

E-mail address: benamear@cim.mcgill.ca
