Dissimilarity-based people re-identification and search for intelligent video surveillance

Riccardo Satta [email protected]
PhD final dissertation, PhD School on Information Engineering, April 2013
University of Cagliari, Department of Electrical and Electronic Engineering
Pattern Recognition and Applications Lab



Outline

• General context: Intelligent Video Surveillance, and in particular
  – Person Re-identification
  – Appearance-based People Search

• A framework for constructing descriptors of people
  – dissimilarity-based representations and their advantages
  – the Multiple Component Dissimilarity (MCD) framework

• MCD and person re-identification

• MCD and people search

• Discussion and conclusions

Intelligent Video Surveillance

Machine Learning
Biometrics and pattern recognition
Novel sensor technologies

Useful tools for operators and forensic investigators:
• person identification
• on-line tracking of persons and objects
• detection of events of interest
• detection of suspicious actions
• summarisation of long video footage
• …

Person re-identification

Person re-identification is the ability to determine whether an individual has already been observed over a network of video-surveillance cameras.

Scenarios
- on-line (e.g. tracking people across different cameras)
- off-line (e.g. retrieving all the frames showing an individual of interest)

Person re-identification

Face recognition cannot be used:
- bad quality images (low resolution, blur, …)
- unconstrained pose

Other cues must be used:
- clothing appearance (easy to extract, good uniqueness in limited time spans)
- other ones (e.g. gait) are impractical in real-world scenarios

Clothing appearance descriptors

Pipeline: blob detection and tracking → BG/FG segmentation → descriptor computation

Descriptor = body part subdivision + appearance features

Each body part is automatically detected and described separately by, e.g.
- colour (e.g., histograms)
- texture (e.g., DCT, LBP)
- local/global features

Appearance-based people search

Clothing appearance descriptors can enable another useful task, appearance-based people search (a novelty in the literature).

Retrieve images of people via a query expressed as a high-level description of the clothes (e.g. “people with red shirt and blue trousers”), instead of as an image.


THE MULTIPLE COMPONENT DISSIMILARITY FRAMEWORK

Dissimilarity representations

An alternative way [1] to represent objects in pattern recognition, useful when it is unclear how to choose the features, or when it is difficult to find a good feature set.

Feature-based representation: object → feature extraction → feature vector [ x1 x2 … xn ]

Dissimilarity-based representation: object → dissimilarities computation against the prototypes P1, P2, …, Pn → dissimilarity vector [ d1 d2 … dn ]

[1] Pekalska and Duin. The Dissimilarity Representation for Pattern Recognition: Foundations and Applications. World Scientific Publishing, 2005
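A minimal sketch of the dissimilarity-based representation described on this slide: an object is described by its dissimilarities to a fixed set of prototypes rather than by its own features. The Euclidean distance and the toy prototype values are assumptions chosen for illustration; any dissimilarity measure could take their place.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def dissimilarity_vector(obj, prototypes, dist=euclidean):
    """Represent obj by its dissimilarities [d1 ... dn] to the n prototypes."""
    return [dist(obj, p) for p in prototypes]

# Toy prototypes P1, P2, P3 and a toy object (hypothetical values).
prototypes = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
obj = [0.0, 1.0]
d = dissimilarity_vector(obj, prototypes)
```

The length of the resulting vector depends only on the number of prototypes, not on the dimensionality of the underlying features.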

The Multiple Component Dissimilarity framework

Extension of the dissimilarity-based approach to objects represented by
- multiple parts
- multiple local features (components)

Figure labels: prototypes for body part #1; prototypes for body part #2; dissimilarity vectors (one for each body part); local appearance; global appearance

The Multiple Component Dissimilarity framework

Prototype construction
From a design set of images of people; various approaches are possible, e.g. clustering.

Clustering-based prototype creation example (two body parts), starting from the design set:
(i) create a set of all the components of body part 1, and a set of all the components of body part 2
(ii) cluster each set
(iii) take the centroids as prototypes
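The clustering-based prototype creation above could be sketched as follows. This is an illustration under assumptions, not the author's implementation: it uses a plain k-means written with the standard library, and the data layout (each person as a dictionary mapping a body part name to a list of component vectors) is hypothetical.

```python
import math
import random

def kmeans(points, k, iters=20, seed=0):
    """Plain k-means; returns the k centroids."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centroids[j]))
            clusters[nearest].append(p)
        for j, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return centroids

def build_prototypes(design_set, k_per_part):
    """Pool the components of each body part over the design set, cluster, keep centroids."""
    prototypes = {}
    for part, k in k_per_part.items():
        pool = [c for person in design_set for c in person[part]]
        prototypes[part] = kmeans(pool, k)
    return prototypes

# Toy design set with two people and two body parts (hypothetical values).
design_set = [
    {"torso": [[0, 0], [0, 1]], "legs": [[5, 5]]},
    {"torso": [[10, 10], [10, 11]], "legs": [[6, 5]]},
]
protos = build_prototypes(design_set, {"torso": 2, "legs": 1})
```

Each body part gets its own prototype set, so the number of prototypes can differ between parts.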

The Multiple Component Dissimilarity framework

MCD representations will be exploited for
- person re-identification
- appearance-based people search

Example descriptor (four prototypes for body part 1, three for body part 2):
[ d1,1 d1,2 d1,3 d1,4 d2,1 d2,2 d2,3 ]
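A sketch of how a descriptor such as [ d1,1 … d1,4 d2,1 … d2,3 ] could be assembled. The slides do not state how the dissimilarity between a body part (a set of components) and a prototype is computed; the minimum component-to-prototype distance used here is an assumption, and the data layout and values are hypothetical.

```python
import math

def part_dissimilarities(components, prototypes):
    """One dissimilarity per prototype: distance of the closest component (assumed rule)."""
    return [min(math.dist(c, p) for c in components) for p in prototypes]

def mcd_descriptor(person, prototypes, part_order):
    """Concatenate the per-part dissimilarity vectors in a fixed part order."""
    descriptor = []
    for part in part_order:
        descriptor.extend(part_dissimilarities(person[part], prototypes[part]))
    return descriptor

# Four prototypes for part 1, three for part 2, as in the example vector.
prototypes = {
    "part1": [[0, 0], [1, 0], [0, 1], [1, 1]],
    "part2": [[0, 0], [2, 0], [0, 2]],
}
person = {"part1": [[0, 0], [1, 1]], "part2": [[0, 0]]}
descriptor = mcd_descriptor(person, prototypes, ["part1", "part2"])
```

Whatever the number of components extracted from each person, the descriptor has a fixed length equal to the total number of prototypes.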


MCD FOR PERSON RE-IDENTIFICATION

MCD and person re-identification

Person re-identification (figure): a probe is matched against the template gallery, producing a ranked list of templates (w.r.t. the degree of similarity); example scores: 0.03, 0.28, 0.33, 0.34

MCD salient features for person re-identification:
- a very compact representation: descriptors are small real vectors (low storage requirements, fast matching)
- dissimilarity vectors are representation-independent: they can be used to combine different features and modalities

Applications:
1) Speed up person re-identification methods
2) Feature combination for person re-identification
3) Multimodal person re-identification

MCD-based matching

A novel weighted Euclidean distance for dissimilarity spaces

RATIONALE:
- each dissimilarity is a degree of relevance of the corresponding prototype;
- lower dissimilarity values carry more information; in fact, they encode the most relevant characteristics of the sample.

x and y: dissimilarity vectors (xi, yi in the range [0,1])

Weights: [equation not preserved in the transcript]

W such that [equation not preserved in the transcript]

The weighting rule f() is a monotonically increasing function; its choice governs the difference between relevant and non-relevant prototypes.
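The formulas of this slide are not present in the transcript, so the following is only a sketch of a weighted Euclidean distance that is consistent with the stated rationale, not the author's definition. It assumes that the weight of prototype i is w_i = f(1 - min(x_i, y_i)) / W, that W normalises the weights so that they sum to one, and that f(t) = t² is the (monotonically increasing) weighting rule; all three choices are hypothetical.

```python
import math

def weighted_distance(x, y, f=lambda t: t ** 2):
    """Weighted Euclidean distance between dissimilarity vectors with entries in [0, 1].

    Assumed weighting: a prototype counts more when at least one of the two
    samples is close to it (low dissimilarity).
    """
    raw = [f(1.0 - min(xi, yi)) for xi, yi in zip(x, y)]
    W = sum(raw)  # assumed normalisation: weights sum to 1
    if W == 0:
        return 0.0  # every entry equals 1 in both vectors, so they coincide
    return math.sqrt(sum((r / W) * (xi - yi) ** 2 for r, xi, yi in zip(raw, x, y)))

a = [0.1, 0.9]
b = [0.3, 0.9]  # differs from a on a relevant (low-dissimilarity) prototype
c = [0.1, 0.7]  # differs from a by the same amount on a less relevant prototype
d_ab = weighted_distance(a, b)
d_ac = weighted_distance(a, c)
```

With these assumptions the same absolute difference weighs more when it falls on a prototype that is relevant to the sample, so `d_ab` is larger than `d_ac`.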

USING MCD TO SPEED UP EXISTING METHODS

MCD to speed up existing methods

MCD has been applied to an existing method, MCMimpl [2]

MCMimpl in short:
- part subdivision: torso – legs, exploiting symmetry and anti-symmetry properties, discarding the head
- multiple component representation: for each part, randomly taken and partly overlapping patches

Four data sets of increasing size:
- i-LIDS (119 pedestrians)
- VIPeR-316 (316 pedestrians)
- VIPeR-474 (474 pedestrians)
- VIPeR-632 (632 pedestrians)

[2] Satta, Fumera, Roli, Cristani, and Murino. A Multiple Component Matching Framework for Person Re-Identification. In: ICIAP, 2011


Experimental evaluation


Trade-off between accuracy and computational time

It can be shown that the overall re-identification time* in a practical search scenario is much lower when using MCD

* sum of processing time plus the average search time spent by the operator

Experimental evaluation


Impact of the number and source of prototypes

USING MCD TO COMBINE FEATURE SETS

Fusion of different feature sets by MCD

Prototypes in MCD are representation-independent:
- MCD dissimilarity vectors can be used to combine different kinds of features, either global or local
- each feature set will be responsible for a different sub-set of prototypes

Fusion of different feature sets by MCD

This technique has been used to combine five different feature sets, exploiting a 4-body-parts subdivision:
• RandPatchesHSV
• RandPatchesLBP
• FCTH [3]
• EdgeHistogram [4]
• SCD [4]

First two feature sets: 200 prototypes per feature set per body part
Last three feature sets: 100 prototypes per feature set per body part
3200 prototypes in total

[3] Chatzichristofis and Boutalis. FCTH: Fuzzy Color and Texture Histogram – a Low Level Feature for Accurate Image Retrieval. In: WIAMIS, 2008
[4] Sikora. The MPEG-7 Visual Standard for Content Description – an Overview. IEEE Transactions on Circuits and Systems for Video Technology, 2001

Performance of the single feature sets

i-LIDS: 119 individuals

Comparison with the state-of-the-art

Comparison with two state-of-the-art methods:
- SDALF [5]
- CPS [6]

[5] Farenzena, Bazzani, Perina, Murino, and Cristani. Person Re-Identification by Symmetry-Driven Accumulation of Local Features. In: CVPR, 2010
[6] Cheng, Cristani, Stoppa, Bazzani, and Murino. Custom Pictorial Structures for Re-Identification. In: BMVC, 2011

USING MCD TO PERFORM MULTI-MODAL PERSON RE-IDENTIFICATION

Multi-modal person re-identification

• Appearance is a widely used cue for person re-identification; other cues (e.g., gait) pose constraints that limit their applicability in real-world scenarios

• However, the recent introduction of RGB-D sensors makes it possible to extract anthropometric measures that can be combined with appearance

Example: MS Kinect™

By processing RGB-D data, it is possible to estimate a 3D model of a person in real-time [7]

From this model, one can extract various anthropometric measures (e.g., height, arm length)

[7] Shotton, Fitzgibbon, Cook, Sharp, Finocchio, Moore, Kipman, and Blake. Real-time Pose Recognition in Parts from Single Depth Images. In: CVPR, 2011


Multi-modal person re-identification

A proper fusion strategy must be used to combine different modalities.

Score-level fusion: each modality (1, 2, …, n) produces its own matching score; the scores are then fused into a single fusion score.

Feature-level fusion: the modalities (1, 2, …, n) are fused first, and a single matching step follows.

- Performance of score-level fusion is affected by the choice of the fusion rule (e.g., mean, min); a suitable choice for re-id is not trivial
- Feature-level fusion requires homogeneous features
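To illustrate why the choice of the fusion rule matters in score-level fusion, here is a small sketch with two modalities and two templates. It treats matching scores as distances (lower is better), which is an assumption, and the score values are invented for the example.

```python
def fuse_scores(scores, rule="mean"):
    """Combine the per-modality matching scores of one template into a single score."""
    if rule == "mean":
        return sum(scores) / len(scores)
    if rule == "min":
        return min(scores)
    raise ValueError(f"unknown fusion rule: {rule}")

def rank_templates(scores_by_template, rule):
    """Rank template identifiers by fused score, best (lowest) first."""
    return sorted(scores_by_template,
                  key=lambda t: fuse_scores(scores_by_template[t], rule))

# Matching scores of one probe against two templates, one score per modality.
scores_by_template = {"A": [0.2, 0.8], "B": [0.45, 0.45]}
ranking_mean = rank_templates(scores_by_template, "mean")
ranking_min = rank_templates(scores_by_template, "min")
```

With the mean rule template B is ranked first, with the min rule template A is, so the two rules return different top matches for the same scores.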

Multi-modal person re-identification

MCD provides a way to combine non-homogeneous modalities at feature level, by exploiting its representation independence
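A sketch of the idea: each modality is turned into dissimilarities to its own prototypes, after which the vectors are homogeneous and can simply be concatenated. The use of anthropometric measures as the second modality follows the earlier slide; the distance function, the prototype values and the function names are assumptions made for illustration.

```python
import math

def anthropometric_dissimilarities(measures, prototypes):
    """Dissimilarities of a tuple of measures to the anthropometric prototypes."""
    return [math.dist(measures, p) for p in prototypes]

def fuse_modalities(*dissimilarity_vectors):
    """Feature-level fusion: concatenate the dissimilarity vectors of all modalities."""
    fused = []
    for vector in dissimilarity_vectors:
        fused.extend(vector)
    return fused

# Appearance dissimilarities (hypothetical, e.g. from an appearance descriptor).
appearance = [0.2, 0.7, 0.5]

# Anthropometric prototypes and measures (hypothetical normalised values).
anthro_prototypes = [(0.5, 0.5, 0.5), (0.9, 0.4, 0.6)]
measures = (0.5, 0.5, 0.5)

fused = fuse_modalities(
    appearance,
    anthropometric_dissimilarities(measures, anthro_prototypes),
)
```

The fused vector can then be matched with a single distance in the dissimilarity space, so no score-level fusion rule has to be chosen.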

Multi-modal person re-identification

This MCD-based approach has been used to combine appearance with anthropometry.

Appearance: two descriptors, MCMimpl v2 and SDALF

Anthropometry: three measures from the skeleton:
- normalised height
- normalised average arm length
- normalised average leg length

Experimental evaluation

Experiments have been carried out on a novel dataset acquired using Kinect cameras, Kinect4REID:
- video sequences of 80 individuals taken at different locations
- different lighting conditions and view points
- 2 to 7 different video sequences per person
- many persons are carrying bags or accessories

Experimental evaluation

Experiments: one video sequence per person taken as template, the remaining ones as probes; 20 repetitions
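The protocol could be sketched as below. The slide does not say how the template sequence is chosen or which accuracy measure is reported; random selection and rank-1 accuracy are assumptions, and each "sequence" is reduced to a toy descriptor for illustration.

```python
import random

def split_template_probe(sequences_by_person, rng):
    """Pick one sequence per person as template; the remaining ones become probes."""
    templates, probes = {}, []
    for person, sequences in sequences_by_person.items():
        chosen = rng.randrange(len(sequences))
        templates[person] = sequences[chosen]
        probes.extend((person, s) for i, s in enumerate(sequences) if i != chosen)
    return templates, probes

def rank1_accuracy(templates, probes, dist):
    """Fraction of probes whose closest template belongs to the right person."""
    hits = 0
    for true_id, probe in probes:
        best = min(templates, key=lambda pid: dist(probe, templates[pid]))
        hits += (best == true_id)
    return hits / len(probes)

def evaluate(sequences_by_person, dist, repetitions=20, seed=0):
    """Average rank-1 accuracy over repeated random template/probe splits."""
    rng = random.Random(seed)
    accuracies = []
    for _ in range(repetitions):
        templates, probes = split_template_probe(sequences_by_person, rng)
        accuracies.append(rank1_accuracy(templates, probes, dist))
    return sum(accuracies) / len(accuracies)

# Toy data: one-dimensional descriptors, two well-separated people.
data = {"p1": [[0.0], [0.1], [0.2]], "p2": [[5.0], [5.1]]}
accuracy = evaluate(data, dist=lambda a, b: abs(a[0] - b[0]))
```

Every person needs at least two sequences for the split to produce probes, which holds for a data set with 2 to 7 sequences per person.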

Experimental evaluation


Comparison of MCD-based fusion with other fusion rules

Similar results have been obtained with SDALF + Anthropometry

USING MCD TO PERFORM APPEARANCE-BASED PEOPLE SEARCH

MCD for people search


Implementation by MCD: high-level concepts that describe certain clothing characteristics (e.g., “red shirt”) may be encoded by one or more visual prototypes, according to the low-level features and part subdivision used

Prototypes (rectangular patches) extracted from a set of 24 people (upper body part)

Correlation with the presence of the concept “red shirt”

MCD for people search

How to implement people search:

(i) define a set of basic queries

(ii) construct a detector for each basic query, using dissimilarity values as input: [ d1 d2 … dn ] → Detector → SCORE

Complex queries can be built by connecting basic ones through Boolean operators, e.g., “red shirt AND (blue trousers OR black trousers)”
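A sketch of how complex queries might be composed from basic ones. It assumes that each basic detector has already produced a real-valued score per image and that a score above a threshold (0 here) means the concept is present; the helper names and the scores are hypothetical.

```python
def basic(name, threshold=0.0):
    """Basic query: true when the detector score for `name` exceeds the threshold."""
    return lambda scores: scores[name] > threshold

def AND(*queries):
    """True when every sub-query is satisfied."""
    return lambda scores: all(q(scores) for q in queries)

def OR(*queries):
    """True when at least one sub-query is satisfied."""
    return lambda scores: any(q(scores) for q in queries)

# "red shirt AND (blue trousers OR black trousers)"
query = AND(basic("red shirt"),
            OR(basic("blue trousers"), basic("black trousers")))

# Detector scores per image (invented values).
gallery = {
    "img_001": {"red shirt": 1.3, "blue trousers": -0.4, "black trousers": 0.9},
    "img_002": {"red shirt": 0.7, "blue trousers": -1.1, "black trousers": -0.2},
    "img_003": {"red shirt": -0.5, "blue trousers": 2.0, "black trousers": -0.8},
}
matches = [image for image, scores in gallery.items() if query(scores)]
```

Only `img_001` satisfies the query here: `img_002` has a red shirt but neither kind of trousers, and `img_003` has blue trousers but no red shirt.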

Experimental evaluation

Dataset: a subset of 512 images taken from the VIPeR data-set, tagged with respect to 14 different basic queries

Three descriptors:
i) MCMimpl
ii) SDALF
iii) MCMimpl-PS, which uses a pictorial structure [8] to subdivide the body into nine parts

Figures: body subdivision for MCMimpl and SDALF; body subdivision for MCMimpl-PS

[8] Andriluka, Roth, and Schiele. Pictorial Structures Revisited: People Detection and Articulated Pose Estimation. In: CVPR, 2009

Experimental evaluation

For each basic query:
(i) the VIPeR-Tagged data set is subdivided into a training set and a testing set of equal size
(ii) a linear SVM is trained on the training images to implement a detector
(iii) the P-R curve is evaluated on the testing images, by varying the SVM decision threshold
This procedure is repeated ten times.

Break-even points for all classes:
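A sketch of step (iii) and of the break-even point, taken here as the point of the P-R curve where precision and recall are closest. It starts from detector scores that are assumed to be already available (the SVM training itself is not shown), ignores ties between scores, and assumes at least one positive test image; the scores and labels are invented.

```python
def precision_recall_curve(scores, labels):
    """Precision and recall obtained by lowering the threshold one test image at a time.

    labels are 1 (concept present) or 0 (absent).
    """
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    total_positives = sum(labels)
    curve, true_positives = [], 0
    for k, (_, label) in enumerate(ranked, start=1):
        true_positives += label
        curve.append((true_positives / k, true_positives / total_positives))
    return curve

def break_even_point(curve):
    """Value at the curve point where precision and recall are closest."""
    precision, recall = min(curve, key=lambda pr: abs(pr[0] - pr[1]))
    return (precision + recall) / 2

scores = [0.9, 0.8, 0.7, 0.6]
labels = [1, 0, 1, 0]
curve = precision_recall_curve(scores, labels)
bep = break_even_point(curve)
```

Precision equals recall exactly when the number of retrieved images equals the number of positive images, which is where this sketch finds the break-even point.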

Experimental evaluation

Example queries: red shirt; black trousers; short sleeves

Conclusions

What has been done
(i) MCD, a novel dissimilarity-based framework for describing individuals
(ii) an approach based on MCD to speed up any existing person re-identification method
(iii) a state-of-the-art re-identification method that combines different features obtained through the use of MCD
(iv) a method to perform multi-modal person re-identification based on MCD and using RGB-D cameras, and a novel data set to assess the performance of multi-modal re-identification systems
(v) a method that uses MCD to perform the novel task of “appearance-based people search”

Conclusions

What to do next (long list…!)

THE FRAMEWORK
(i) explore the commonalities between MCD and Visual Words and Fisher Vectors
(ii) extend MCD to other domains

MULTIMODAL RE-ID
(iii) explore the use of other cues (other anthropometries, skeleton-based gait…)
(iv) extend the approach to support missing cues

PEOPLE SEARCH
(v) address the problem of ambiguity of concepts
(vi) add semantic interpretation (Natural Language Processing) to support queries in natural language


QUESTION TIME!