Verification & Validation of a Semantic Image Tagging Framework via Generation of Geospatial Imagery Ground Truth

Shaun S. Gleason 1, Mesfin Dema 2, Hamed Sari-Sarraf 2, Anil Cheriyadat 1, Raju Vatsavai 1, Regina Ferrell 1

1 Oak Ridge National Laboratory, Oak Ridge, TN
2 Texas Tech University, Lubbock, TX

Upload: richard-ford

Post on 25-Dec-2015




Contents

Motivation
Existing Approaches
Proposed Approach
 Generative Model Formulation
 General Framework
Preliminary Results
Conclusions


Motivation

Automated identification of complex facilities in aerial imagery is an important and challenging problem.

In our application area, nuclear proliferation, the facilities of interest can be complex.

Such facilities are characterized by: the presence of known structures, their spatial arrangement, their geographic location, and their location relative to natural resources.

Development, verification, and validation of semantic classification algorithms for such facilities are hampered by the lack of available sample imagery with ground truth.


Motivation (cont.)

Semantics: a set of objects, such as a switchyard, containment building, turbine generator, and cooling towers, AND their spatial arrangement may imply a semantic label like “nuclear power plant”.

[Figure: aerial image with labeled objects: Switchyard, Turbine Building, Cooling Towers, Containment Building]


Motivation (cont.)

Many algorithms are being developed to extract and classify regions of interest from images, such as in [1].

V & V of the algorithms has not kept pace with their development due to the lack of image datasets with high-accuracy ground truth annotations.

The community needs research techniques that can provide images with accurate ground truth annotation at a low cost.

[1] Gleason SS, et al., “Semantic Information Extraction from Multispectral Geospatial Imagery via a Flexible Framework,” IGARSS, 2010.


Existing Approaches

Manual ground truth annotation of images
 Very tedious for large volumes of images
 Highly subjective

Using synthetic images with corresponding ground truth data
 Digital Imaging and Remote Sensing Image Generation (DIRSIG) [2]
 ▪ Capable of generating hyperspectral images in the range of 0.4-20 microns.
 ▪ Capable of generating accurate ground truth data.
 ▪ Very tedious 3D scene construction stage.
 ▪ Incapable of producing training images in sufficient quantities.

[2] Digital Imaging and Remote Sensing Image Generation (DIRSIG): http://www.dirsig.org/.


Existing Approaches

In [3,4], researchers attempted to partially automate the cumbersome 3D scene construction of the DIRSIG model.
 A LIDAR sensor is used to extract 3D objects from a given location.
 Other modalities are used to correctly identify object types.
 3D CAD models of objects and object locations are extracted.
 Extracted CAD models are placed at their respective positions to reconstruct the 3D scene and, finally, to generate a synthetic image with corresponding ground truth.

Availability of 3D model databases, such as Google SketchUp [5], reduces the need for approaches like [3,4].

[3] S.R. Lach, et al., “Semi-automated DIRSIG Scene Modeling from 3D LIDAR and Passive Imaging Sources,” in Proc. SPIE Laser Radar Technology and Applications XI, vol. 6214, 2006.
[4] P. Gurram, et al., “3D scene reconstruction through a fusion of passive video and Lidar imagery,” in Proc. 36th AIPR Workshop, pp. 133–138, 2007.
[5] Google SketchUp: http://sketchup.google.com/.


Proposed Approach

To generate synthetic images with ground truth annotation at low cost, we need a system that can learn from a few training examples.

This system must be generative so that one can sample a plausible scene from the model.

The system must also be capable of producing synthetic images with corresponding ground truth data in sufficient quantity.

Our contribution to the problem is two-fold.
 We incorporated expert knowledge into the problem with less effort.
 We adapted a generative model to the synthetic image generation process.


Knowledge Representation: And-Or Graph

[Figure: And-Or graph [6]. Root node: Nuclear Power Plant. Child nodes: Reactor Building, Turbine Building, Switchyard, Cooling Tower. Children of Cooling Tower: CT Type 1, CT Type 2.]

[6] S.C. Zhu and D. Mumford, “A Stochastic Grammar of Images,” Foundations and Trends in Computer Graphics and Vision, 2(4), pp. 259–362, 2006.
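To make the And-Or graph idea concrete, here is a minimal sketch of how one scene configuration could be drawn from the graph on this slide. It assumes that "Nuclear Power Plant" acts as an And node (all children are present) and "Cooling Tower" as an Or node (exactly one type is chosen); that reading of the figure, the `GRAPH` dictionary, the `sample_parse` helper, and the uniform Or-node choice are all illustrative assumptions, not the authors' implementation.

```python
import random

# Hypothetical encoding of the slide's And-Or graph.
# An "and" node expands into all of its children; an "or" node picks exactly one.
GRAPH = {
    "Nuclear Power Plant": ("and", ["Reactor Building", "Turbine Building",
                                    "Switchyard", "Cooling Tower"]),
    "Cooling Tower": ("or", ["CT Type 1", "CT Type 2"]),
}

def sample_parse(node, rng):
    """Return the terminal objects of one parse graph rooted at `node`."""
    if node not in GRAPH:                    # terminal node: nothing to expand
        return [node]
    kind, children = GRAPH[node]
    if kind == "or":
        children = [rng.choice(children)]    # uniform choice is an assumption
    leaves = []
    for child in children:
        leaves.extend(sample_parse(child, rng))
    return leaves

scene = sample_parse("Nuclear Power Plant", random.Random(0))
print(scene)
```

In a learned model the Or-node choice would follow the learned branching frequencies rather than a uniform draw, and each sampled object would additionally carry geometric attributes constrained by the And-node relationships.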


Generative Model Formulation: Maximum Entropy Principle (MEP)

Given observed constraints (i.e. the hierarchical and contextual information) of an unobserved distribution f, a probability distribution p which best approximates f is the one with maximum entropy [7,8].

$$p^{*}(X) = \arg\max_{p} \Big\{ -\sum_{x \in X} p(x) \log\big(p(x)\big) \Big\}$$

$$\text{subject to} \quad E_{f}\big[\phi^{(\alpha)}(X)\big] = E_{p}\big[\phi^{(\alpha)}(X)\big], \qquad E_{f}\big[\phi^{(\beta)}(X)\big] = E_{p}\big[\phi^{(\beta)}(X)\big]$$

where $\phi$ denotes a feature, $\alpha$ refers to Or nodes, and $\beta$ refers to And nodes.

[7] J. Porway, et al., “Learning Compositional Models for Object Categories From Small Sample Sets,” 2009.
[8] J. Porway, et al., “A Hierarchical and Contextual Model for Aerial Image Parsing,” 2010.
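As a brief sketch of why the maximum entropy principle leads to an exponential (Gibbs) form, the standard Lagrange-multiplier argument is outlined below. The generic feature index $k$, the multipliers $\lambda_k$, and $\mu$ are notation introduced here for illustration and are not taken from the slides.

```latex
\begin{align*}
\mathcal{L}(p) &= -\sum_{x} p(x)\log p(x)
  - \sum_{k}\lambda_k\Big(\sum_{x} p(x)\,\phi_k(x) - E_f[\phi_k]\Big)
  - \mu\Big(\sum_{x} p(x) - 1\Big) \\
\frac{\partial \mathcal{L}}{\partial p(x)} &= -\log p(x) - 1 - \sum_{k}\lambda_k\,\phi_k(x) - \mu = 0 \\
p(x) &= \frac{1}{Z}\exp\Big\{-\sum_{k}\lambda_k\,\phi_k(x)\Big\},
\qquad Z = \sum_{x}\exp\Big\{-\sum_{k}\lambda_k\,\phi_k(x)\Big\}
\end{align*}
```

Setting the derivative to zero and absorbing $e^{1+\mu}$ into the normalizer $Z$ gives the exponential form; the multipliers $\lambda_k$ are then chosen so that the model's feature expectations match the observed ones.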


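Generative Model Formulation: Optimization of MEP

Gibbs Distribution

$$p(I;\Theta,S) = \frac{1}{Z(\Theta)} \exp\big\{-E(I;\Theta,S)\big\}$$

$$E(I;\Theta,S) = \sum_{i=1}^{N_{Or}} \big\langle \lambda_i^{(\alpha)}, H_{i,p}^{(\alpha)}(I) \big\rangle + \sum_{j=1}^{N_{And}} \big\langle \lambda_j^{(\beta)}, H_{j,p}^{(\beta)}(I) \big\rangle$$

Parameter Learning

$$\lambda_i^{(\alpha)} = -\log H_{i,f}^{(\alpha)}$$

$$\lambda_j^{(\beta)(t+1)} = \lambda_j^{(\beta)(t)} + \eta \big( H_{j,p}^{(\beta)} - H_{j,f}^{(\beta)} \big)$$

where $\alpha$ and $\beta$ refer to Or nodes and And nodes, the subscripts $f$ and $p$ refer to the observed distribution and the model, and $\eta$ is a step size.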
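The two learning rules on this slide (a closed-form rule for Or nodes and an iterative rule for And nodes) can be illustrated with a toy, single-histogram sketch. The symbol names `lam`, `h_f`, `h_p`, and `eta`, the four-bin example histogram, and the function names are assumptions made for illustration; in particular, the model histogram is computed exactly here, whereas a full system would presumably estimate it from sampled scenes.

```python
import math

def model_histogram(lam):
    """Normalized histogram implied by p(k) proportional to exp(-lam[k])."""
    weights = [math.exp(-l) for l in lam]
    z = sum(weights)                       # partition function Z
    return [w / z for w in weights]

def learn_or_parameters(h_f):
    """Closed-form rule: lam_i = -log(H_f[i]) for an observed frequency histogram."""
    return [-math.log(h) for h in h_f]

def learn_and_parameters(h_f, eta=1.0, steps=2000):
    """Iterative rule: lam <- lam + eta * (H_p - H_f), repeated `steps` times."""
    lam = [0.0] * len(h_f)                 # flat start gives a uniform model histogram
    for _ in range(steps):
        h_p = model_histogram(lam)
        lam = [l + eta * (p - f) for l, p, f in zip(lam, h_p, h_f)]
    return lam

h_f = [0.1, 0.2, 0.3, 0.4]                 # hypothetical observed histogram
lam_or = learn_or_parameters(h_f)
lam_and = learn_and_parameters(h_f)
h_p = model_histogram(lam_and)
print([round(p, 4) for p in h_p])
```

Both rules aim at the same fixed point: the update stops changing the parameters once the model histogram matches the observed one.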


General Framework


General Framework

[Figure: general framework diagram, with components citing Google SketchUp [5] and POV-Ray [9]]

[5] Google SketchUp: http://sketchup.google.com/.
[9] Persistence of Vision Raytracer (POV-Ray): http://www.povray.org/.


Preliminary Results

We are currently working with experts on annotating training images of nuclear power plant sites.

To demonstrate the idea of the proposed approach, we have used a simple example as a proof-of-principle.

Using this example, we illustrate how the generative framework can sample plausible scenes, and finally generate synthetic images with corresponding ground truth annotation.


Proof-of-Principle: Training Images


Proof-of-Principle: Manually Annotated Training Images

Proof-of-Principle: Manually Annotated Training Images

[Figure: orientation-corrected images]

Proof-of-Principle: Manually Annotated Training Images

[Figure: orientation-corrected images followed by ellipse fitting]


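Relationships

No.  Relationship            Function
1    Relative Position in X  $(x_1 - x_2) / \mathrm{Maj}_2$
2    Relative Position in Y  $(y_1 - y_2) / \mathrm{Min}_2$
3    Relative Major Axis     $\mathrm{Maj}_1 / \mathrm{Maj}_2$
4    Relative Minor Axis     $\mathrm{Min}_1 / \mathrm{Min}_2$
5    Relative Orientation    $\theta_1 - \theta_2$
6    Aspect Ratio            $\mathrm{Maj}_1 / \mathrm{Min}_1$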
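The six relationship functions listed on this slide can be sketched as a small routine over two fitted ellipses. This assumes that subscripts 1 and 2 denote the two objects being compared, that Maj and Min are the major and minor axis lengths of each fitted ellipse, and that orientation is an angle in radians; the `Ellipse` class, its field names, and the `relationships` function are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass

@dataclass
class Ellipse:
    """Hypothetical container for one fitted ellipse."""
    x: float        # center x
    y: float        # center y
    major: float    # major axis length (Maj)
    minor: float    # minor axis length (Min)
    theta: float    # orientation angle in radians

def relationships(e1: Ellipse, e2: Ellipse) -> dict:
    """Compute the six pairwise relationship values for objects 1 and 2."""
    return {
        "relative_position_x": (e1.x - e2.x) / e2.major,
        "relative_position_y": (e1.y - e2.y) / e2.minor,
        "relative_major_axis": e1.major / e2.major,
        "relative_minor_axis": e1.minor / e2.minor,
        "relative_orientation": e1.theta - e2.theta,
        "aspect_ratio": e1.major / e1.minor,
    }

e1 = Ellipse(x=4.0, y=3.0, major=6.0, minor=2.0, theta=1.0)
e2 = Ellipse(x=1.0, y=1.0, major=3.0, minor=1.0, theta=0.25)
rel = relationships(e1, e2)
print(rel)
```

Normalizing positions and sizes by the second object's axes makes the values independent of image scale, which is one plausible reason for expressing the relationships as ratios.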


Synthesized Images

[Figure panels: Before Learning, After Learning]


Synthesized Images

[Figure panels: Before Learning, After Learning]

Synthesized Images

[Figure: synthesized image after learning]

Synthesized Images

[Figure panels: part-level ground truth image, object-level ground truth image]


Manually Created Example

[Figure: 3D Google SketchUp model of a nuclear plant, the Pickering Nuclear Plant, Canada (left), and the model manually overlaid on an image (right).]


Conclusions

The maximum entropy model has proven to be an elegant framework for learning patterns from training data and generating synthetic samples with similar patterns.

Using the proposed framework, generating synthetic images with accurate ground truth annotation comes at relatively low cost.

The proposed approach is very promising for algorithm verification and validation.


Challenges Ahead

The current model generates some results that do not represent a well-learned configuration of objects.

We believe that constraint representation using histograms contributes to invalid results, since some values are averaged out while generating histograms.

To avoid invalid results, we are currently studying a divide-and-conquer strategy by introducing on-the-fly clustering approaches. This separates the bad samples from the good ones, which helps tune the parameters during the learning phase.


Acknowledgements

Funding for this work is provided by the Simulations, Algorithms, and Modeling program within the NA-22 office of the National Nuclear Security Administration, U.S. Department of Energy.


Thank You!