
Training and Evaluating of Object Bank Models

Presenter: Changyu Liu
Advisor: Prof. Alex
Interest: Multimedia Analysis

May 16th, 2013

CMU - Language Technologies Institute 2

Contents

Dataset Setting
Model Training
Model Evaluation in Deformable Part
Model Evaluation in Object Bank
Conclusion and Plan


Dataset Setting--- Object Lists

In this experiment, we first chose 10 objects:

OB ID    Object Name    WNID
10477    knife          n02973904
1253     balloon        n02782093
12498    snail          n01944390
11515    candle         n02948072
1176     soccer ball    n04254680
1190     laptop         n03642806
1232     airplane       n02690373
12982    car            n02701002
1329     boat           n03329663
1103     cow            n01887787

Table 1 Selected 10 Objects


Dataset Setting--- Sample Configuration

1. We then chose 961 images in total (about 100 per object) for training, 958 images for evaluation, and 1331 images for testing.

2. The images are split 1:4 between positive and negative samples. All of them come from ImageNet (http://www.image-net.org/), and most have a bounding box annotation.


3. We used these images in place of the VOC 2008 dataset and have generated and evaluated four deformable part models (the other six models are in progress).


Model Training---Overview

To use Object Bank features, the object models must be trained first. For this training we used the Deformable Part Model (Felzenszwalb et al., CVPR 2008). The version adopted was voc-release 3.1.


Model Training--- Deformable Part

The deformable model includes both a coarse global template and higher-resolution part templates. The templates represent histogram of gradient features.

Fig. 1 Deformable Part Model: (a) person detection example, (b1) coarse template, (b2) part templates, (b3) spatial model
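How the root template, part templates, and spatial model combine can be sketched as a single score, in the spirit of [1, 2]: the root filter response plus, for each part, its filter response minus a quadratic deformation cost. This is a minimal sketch, not the voc-release code. The `Part` class, the function name, and the toy numbers are hypothetical, the filter responses are assumed to be precomputed, and the sketch ignores that part filters live at twice the root resolution.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass
class Part:
    # Ideal offset of the part relative to the root position (hypothetical units).
    anchor: Tuple[int, int]
    # Deformation weights applied to (dx, dy, dx^2, dy^2).
    deform: Tuple[float, float, float, float]


def hypothesis_score(root_response: float,
                     root_pos: Tuple[int, int],
                     parts: Sequence[Part],
                     part_responses: Sequence[float],
                     part_positions: Sequence[Tuple[int, int]],
                     bias: float = 0.0) -> float:
    """Score one placement of the root and all parts."""
    score = root_response + bias
    for part, response, (px, py) in zip(parts, part_responses, part_positions):
        # Displacement of the part from its ideal location.
        dx = px - (root_pos[0] + part.anchor[0])
        dy = py - (root_pos[1] + part.anchor[1])
        cost = sum(w * f for w, f in zip(part.deform, (dx, dy, dx * dx, dy * dy)))
        score += response - cost
    return score


# Toy example: one part displaced by one cell in x from its anchor.
parts = [Part(anchor=(1, 1), deform=(0.0, 0.0, 1.0, 1.0))]
example = hypothesis_score(3.0, (0, 0), parts, [2.0], [(2, 1)])
```

A detector would maximize this score over part positions for every root location; the sketch only evaluates one fixed placement.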


Model Training--- Results

On average, training produced 1.5 models per day on the CQ-series desktop. After training, we obtained 9 .mat model files:

balloon_final.mat
snail_final.mat
candle_final.mat
soccer ball_final.mat
laptop_final.mat
airplane_final.mat
car_final.mat
boat_final.mat
cow_final.mat


Model Evaluation--- Deformable Part

We then evaluated each object model on the 958 selected evaluation images and obtained the following average precision (AP) plots:


Fig. 2 AP of Airplane

Here AP is the average precision, Bbox 1 is the bounding box obtained from the root placement, and Bbox 2 is the bounding box obtained from the bounding box predictor function.
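For readers unfamiliar with the metric, the sketch below shows one common way to compute AP for a detector. It assumes the VOC-style convention of that period (a detection counts as correct when its overlap with a ground-truth box exceeds 0.5, and AP is the 11-point interpolated average of precision over recall). The slides do not state which variant was used, so the function names and the convention are assumptions. Boxes are `(x1, y1, x2, y2)` in continuous coordinates.

```python
from typing import Sequence, Tuple

Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def average_precision_11pt(scores: Sequence[float],
                           is_true_positive: Sequence[bool],
                           num_positives: int) -> float:
    """11-point interpolated AP over detections ranked by score."""
    if num_positives <= 0:
        raise ValueError("num_positives must be positive")
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    tp = 0
    recalls, precisions = [], []
    for rank, i in enumerate(order, start=1):
        if is_true_positive[i]:
            tp += 1
        recalls.append(tp / num_positives)
        precisions.append(tp / rank)
    total = 0.0
    for k in range(11):
        threshold = k / 10
        # Best precision achieved at any recall at or above the threshold.
        reachable = [p for r, p in zip(recalls, precisions) if r >= threshold]
        total += max(reachable) if reachable else 0.0
    return total / 11
```

Under this reading, the Bbox 1 and Bbox 2 columns would differ only in which predicted box is passed to the overlap test.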


Fig. 3 AP of Balloon


Finally, we obtained the average precision of the 9 objects:

Object         AP of Bbox1    AP of Bbox2
balloon        0.428          0.439
snail          0.184          0.201
candle         0.203          0.196
soccer ball    0.376          0.376
laptop         0.472          0.479
airplane       0.644          0.652
car            0.518          0.526
boat           0.495          0.488
cow            0.416          0.405

Table 2 Average precision of nine objects
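To compare the two bounding box variants at a glance, the per-object values in Table 2 can be summarized as a mean AP per column and a list of objects where Bbox2 beats Bbox1. The snippet below is only a convenience sketch over the numbers copied from the table; the variable names are hypothetical.

```python
# Per-object AP copied from Table 2: (AP of Bbox1, AP of Bbox2).
table2 = {
    "balloon": (0.428, 0.439),
    "snail": (0.184, 0.201),
    "candle": (0.203, 0.196),
    "soccer ball": (0.376, 0.376),
    "laptop": (0.472, 0.479),
    "airplane": (0.644, 0.652),
    "car": (0.518, 0.526),
    "boat": (0.495, 0.488),
    "cow": (0.416, 0.405),
}

# Mean AP over the nine objects for each bounding box variant.
mean_bbox1 = sum(v[0] for v in table2.values()) / len(table2)
mean_bbox2 = sum(v[1] for v in table2.values()) / len(table2)

# Objects for which the predicted box (Bbox2) scores higher than the root box (Bbox1).
improved = [name for name, (b1, b2) in table2.items() if b2 > b1]

print(f"mean AP Bbox1: {mean_bbox1:.3f}")
print(f"mean AP Bbox2: {mean_bbox2:.3f}")
print("Bbox2 higher for:", ", ".join(improved))
```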

We then collected 9 images from Google (1 image per object) for a bounding box test.


Fig. 4 Balloon


Fig. 5 Candle


Fig. 6 Cow


Fig. 7 Laptop


Fig. 8 Soccer ball


Model Evaluation--- Object Bank

The second evaluation was carried out on Object Bank.

Object         Correlation Coefficient
balloon        0.71806
snail          0.86498
candle         0.85893
soccer ball    0.84165
laptop         0.73821
airplane       0.79783
car            0.48926
boat           0.75255
cow            0.71712

Table 3 Correlation Coefficient
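Table 3 reports one correlation coefficient per object. The slides do not state which two quantities are correlated or which coefficient is used. Assuming it is a Pearson correlation between two equal-length vectors (for example, responses from a newly trained model and from a reference model, which is a guess), the computation could look like this sketch. The function name is hypothetical.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)
```

A value near 1 means the two vectors rise and fall together; under this assumption, the car entry in Table 3 would indicate the weakest agreement of the nine objects.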


Conclusion

1) The width or height of a selected image must be >= 4 HOG bins (4*8 pixels).
2) It is feasible to use the v3.1 (not v5) code to generate object models for obtaining Object Bank features; it took 1/1.5 day to get one model.

The plan for next steps is:

1) Move the code to PSC for further testing in order to improve processing speed.
2) Find out the names of the 1000 objects needed.
3) Choose and build the dataset from ImageNet.
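The size constraint in the first conclusion can be applied as a filter while building the dataset. The sketch below assumes 8-pixel HOG bins (so 4 bins is 32 pixels) and reads the constraint as applying to both the width and the height, which is one interpretation of the slide. The function names and the sample tuples are hypothetical.

```python
from typing import Iterable, List, Tuple

HOG_BIN_PIXELS = 8   # bin size implied by "4*8 pixels"
MIN_BINS = 4
MIN_SIDE_PIXELS = MIN_BINS * HOG_BIN_PIXELS


def is_large_enough(width: int, height: int) -> bool:
    """True when both sides span at least MIN_BINS HOG bins (assumed reading)."""
    return min(width, height) >= MIN_SIDE_PIXELS


def filter_images(sizes: Iterable[Tuple[str, int, int]]) -> List[str]:
    """Keep the names of images whose (width, height) pass the size check."""
    return [name for name, w, h in sizes if is_large_enough(w, h)]


# Toy example with made-up file names and sizes.
kept = filter_images([("a.jpg", 500, 375), ("b.jpg", 31, 200), ("c.jpg", 32, 32)])
```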


Reference

[1] P. Felzenszwalb, D. McAllester, D. Ramanan. A Discriminatively Trained, Multiscale, Deformable Part Model. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2008.

[2] P. Felzenszwalb, R. Girshick, D. McAllester, D. Ramanan. Object Detection with Discriminatively Trained Part Based Models. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 32, No. 9, Sep. 2010.

[3] L.-J. Li, H. Su, E. P. Xing, L. Fei-Fei. Object Bank: A High-Level Image Representation for Scene Classification and Semantic Feature Sparsification. Proceedings of the Neural Information Processing Systems (NIPS), 2010.


Thank you!