problem statement a pair of images or videos in which one is close to the exact duplicate of the...

Upload: godwin-booker

Post on 11-Jan-2016

TRANSCRIPT

Page 1: Problem Statement A pair of images or videos in which one is close to the exact duplicate of the other, but different in conditions related to capture,

[Figure: (a) Query image; (b) Near duplicate image; (c) Distance matrix between blocks; (d) Integer-value constrained flow matrix between blocks]

Problem Statement

A pair of images or videos in which one is close to the exact duplicate of the other, but different in conditions related to capture, edits, and rendering.

Problem: Spatial shift and scale variations

Page 2:

Tasks and Applications

• Near Duplicate Retrieval (NDR):
– Copyright infringement detection
– Query-by-example application

• Near Duplicate Detection (NDD):
– Link news stories and group them into threads
– Filter out the redundant near duplicate images or videos in the top results from text-keyword-based web search

Page 3:

Prior Work

• Attributed Relational Graph (ARG) matching: ACM Multimedia 2004

• Point set matching: ACM Multimedia 2004

• One-to-one symmetric matching algorithm: T-MM 2007 and ACM Multimedia 2007

• Large-scale near duplicate detection: CIVR 2007

Page 4:

Prior Work: Spatial pyramid match kernel

First quantize descriptors (SIFT) into words, then do one pyramid match per word in image coordinate space.

Lazebnik, Schmid & Ponce, CVPR 2006
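
As a rough illustration of the idea on this slide, the sketch below computes a spatial pyramid match score from already-quantized features. It assumes each feature is a hypothetical `(x, y, word)` tuple with coordinates normalized to [0, 1], uses histogram intersection per level, and applies the usual pyramid weights (1/2^L for level 0 and 1/2^(L-l+1) for level l). The function names and the input layout are illustrative choices, not taken from the slides.

```python
from collections import Counter


def level_histogram(features, level):
    """Count features per (visual word, grid row, grid column) at one pyramid level."""
    cells = 2 ** level  # the grid is cells x cells at this level
    hist = Counter()
    for x, y, word in features:
        col = min(int(x * cells), cells - 1)  # clamp x == 1.0 into the last cell
        row = min(int(y * cells), cells - 1)
        hist[(word, row, col)] += 1
    return hist


def intersection(h1, h2):
    """Histogram intersection: number of features matched bin by bin."""
    return sum(min(count, h2[key]) for key, count in h1.items())


def spatial_pyramid_match(features_x, features_y, num_levels=2):
    """Weighted sum of per-level intersections; finer levels get larger weights."""
    L = num_levels
    matches = [
        intersection(level_histogram(features_x, l), level_histogram(features_y, l))
        for l in range(L + 1)
    ]
    score = matches[0] / 2 ** L
    for l in range(1, L + 1):
        score += matches[l] / 2 ** (L - l + 1)
    return score
```

Because every grid cell is compared only with the cell at the same position, a feature that moves to a different cell stops matching at finer levels, which is the fixed block-to-block behaviour of SPM.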

Page 5:

Temporal Pyramid Matching for Event Recognition in News Video

D. Xu & S.-F. Chang, CVPR 2007 and T-PAMI 2008

• Temporally Constrained Hierarchical Agglomerative Clustering

• Fusion of information from different levels.

• Alignment of different subclips (Level-1 as an example)

[Figure: two clips split into sub-clips at Level-0, Level-1, and Level-2 (sub-clip labels: Smoke, Fire); EMD distance matrix between sub-clips; integer-value alignment]

Page 6:

Earth Mover’s Distance (EMD)

[Figure: flows from supplier P to receiver Q, labelled with ground distances $d_{ij}$ and weights 1/m and 1/2]

Supplier P has a given amount of goods.

Receiver Q has a given limited capacity.

Solved by linear programming.
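
For reference, the standard linear-programming formulation of the EMD can be written as follows. The symbols $w_{p_i}$ and $w_{q_j}$ (the weights of the i-th element of P and the j-th element of Q), $f_{ij}$ (the flow from i to j), and the sizes $m$ and $n$ are notation assumed here; only $d_{ij}$ appears on the slide.

```latex
\begin{aligned}
\mathrm{EMD}(P, Q) &= \frac{\sum_{i=1}^{m}\sum_{j=1}^{n} \hat{f}_{ij}\, d_{ij}}
                          {\sum_{i=1}^{m}\sum_{j=1}^{n} \hat{f}_{ij}}, \\
\hat{f} &= \arg\min_{f} \sum_{i=1}^{m}\sum_{j=1}^{n} f_{ij}\, d_{ij} \\
\text{s.t.}\quad & f_{ij} \ge 0, \qquad
  \sum_{j=1}^{n} f_{ij} \le w_{p_i}, \qquad
  \sum_{i=1}^{m} f_{ij} \le w_{q_j}, \\
& \sum_{i=1}^{m}\sum_{j=1}^{n} f_{ij}
  = \min\Big(\sum_{i=1}^{m} w_{p_i},\; \sum_{j=1}^{n} w_{q_j}\Big).
\end{aligned}
```

The supplier constraint bounds what each element of P can ship, the receiver constraint bounds what each element of Q can accept, and the last constraint forces as much total flow as the smaller side allows.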

Page 7:

[Figure: (a) Query image; (b) Near duplicate image; (c) Distance matrix between blocks; (d) Integer-value constrained flow matrix between blocks]

Spatially Aligned Pyramid Matching

Non-overlapped and overlapped partition at multiple levels:

• Divide images into $4^l$ non-overlapped blocks.

• Divide images into overlapped blocks with size equaling $1/2^l$ of the original image (in width and height), sampled at a fixed interval, say 1/8 of the image width and height.
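
The two partitions can be sketched as below, assuming blocks are described as `(x0, y0, width, height)` in normalized image coordinates and that overlapped blocks are kept only when they lie fully inside the image. The function names and the block representation are hypothetical; the 1/8 sampling interval is the example value from the slide.

```python
from fractions import Fraction


def non_overlapped_blocks(level):
    """Split the unit square into a 2**level x 2**level grid, i.e. 4**level blocks."""
    n = 2 ** level
    size = Fraction(1, n)
    return [(c * size, r * size, size, size) for r in range(n) for c in range(n)]


def overlapped_blocks(level, step=Fraction(1, 8)):
    """Slide a block of side 1/2**level over the unit square at a fixed interval."""
    size = Fraction(1, 2 ** level)
    positions = []
    pos = Fraction(0)
    while pos + size <= 1:  # assumption: keep only blocks fully inside the image
        positions.append(pos)
        pos += step
    return [(x, y, size, size) for y in positions for x in positions]


for level in (0, 1, 2):
    print(level, len(non_overlapped_blocks(level)), len(overlapped_blocks(level)))
```

Exact fractions are used instead of floats so that the boundary check `pos + size <= 1` is not affected by rounding.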

Page 8:

First Stage Matching

[Figure: (a) Query image; (b) Near duplicate image; (c) Distance matrix between blocks; (d) Integer-value constrained flow matrix between blocks]

Objective: Compute the pairwise distances between any two blocks $x_r$ and $y_c$.

Solution: We represent each block as an orderless bag of SIFT descriptors and use the EMD to measure the distance between two sets of descriptors of unequal cardinality.

Jianguo Zhang et al. Local Features and Kernels for Classification of Texture and Object Categories: A Comprehensive Study, IJCV, 2007
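
A minimal sketch of a block-to-block EMD is given below. It assumes every descriptor in a block carries the same weight (1/m in one block, 1/n in the other) and uses Euclidean ground distance; both are assumptions made for illustration. Instead of a general linear-programming solver it scales the weights to integers and solves the resulting transportation problem with a small min-cost-flow routine, which gives the same optimum in this uniform-weight case. The function name `emd_uniform` is hypothetical, and both inputs must be non-empty.

```python
import math


def emd_uniform(set_p, set_q):
    """EMD between two non-empty point sets with uniform weights 1/m and 1/n."""
    m, n = len(set_p), len(set_q)
    source, sink = m + n, m + n + 1
    # Each edge is [target, residual capacity, cost, index of the reverse edge].
    graph = [[] for _ in range(m + n + 2)]

    def add_edge(u, v, cap, cost):
        graph[u].append([v, cap, cost, len(graph[v])])
        graph[v].append([u, 0, -cost, len(graph[u]) - 1])

    # Scale by m * n: every point of P ships n units, every point of Q accepts m.
    for i in range(m):
        add_edge(source, i, n, 0.0)
        for j in range(n):
            add_edge(i, m + j, min(m, n), math.dist(set_p[i], set_q[j]))
    for j in range(n):
        add_edge(m + j, sink, m, 0.0)

    total_cost, total_flow = 0.0, 0
    while True:
        # Bellman-Ford shortest path in the residual graph.
        dist = [math.inf] * len(graph)
        prev = [None] * len(graph)
        dist[source] = 0.0
        for _ in range(len(graph) - 1):
            updated = False
            for u in range(len(graph)):
                if dist[u] == math.inf:
                    continue
                for k, (v, cap, cost, _rev) in enumerate(graph[u]):
                    if cap > 0 and dist[u] + cost < dist[v] - 1e-12:
                        dist[v] = dist[u] + cost
                        prev[v] = (u, k)
                        updated = True
            if not updated:
                break
        if prev[sink] is None:
            break  # no augmenting path left: all goods have been shipped
        # Find the bottleneck capacity along the path, then push that much flow.
        push, v = math.inf, sink
        while v != source:
            u, k = prev[v]
            push = min(push, graph[u][k][1])
            v = u
        v = sink
        while v != source:
            u, k = prev[v]
            graph[u][k][1] -= push
            graph[v][graph[u][k][3]][1] += push
            v = u
        total_flow += push
        total_cost += push * dist[sink]
    return total_cost / total_flow  # normalizing by total flow undoes the scaling
```

Calling this for every pair of blocks fills a distance matrix like the one in panel (c). Real SIFT descriptors are 128-dimensional, and `math.dist` handles any dimension as long as both points have the same length.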

Page 9:

Second Stage Matching (1)

Objective: Align the blocks from one query image x to corresponding blocks in its near duplicate image y.

SAPM (our work): One block may be matched to another block at a different position and/or scale level to robustly handle piecewise spatial translations and scale variations.

SPM: fixed block-to-block matching.

Page 10:

Second Stage Matching (2)

Eq. (3) can be re-established from Eq. (4) (assume R < C) by:

1) adding C − R virtual blocks in image x;
2) setting $D_{rc} = 0$ for all r satisfying R < r ≤ C.

Solution: Integer-flow EMD
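
The padding step can be sketched as follows, assuming every block carries unit weight so that an integer flow reduces to a one-to-one assignment. The search below is a brute-force enumeration that is only practical for a handful of blocks; it is a stand-in for illustration, not the solver used in the slides, and the function name is hypothetical.

```python
from itertools import permutations


def integer_flow_alignment(dist):
    """Align R query blocks to C >= R candidate blocks with minimum total distance.

    dist is an R x C list of lists; returns (total_cost, match) with match[r] = c.
    """
    R, C = len(dist), len(dist[0])
    # Add C - R virtual blocks whose distance to every candidate block is zero.
    padded = [list(row) for row in dist] + [[0.0] * C for _ in range(C - R)]
    best_cost, best_perm = min(
        (sum(padded[r][perm[r]] for r in range(C)), perm)
        for perm in permutations(range(C))
    )
    # Drop the virtual rows: only the first R assignments are real matches.
    return best_cost, list(best_perm[:R])


cost, match = integer_flow_alignment([[5.0, 1.0, 4.0],
                                      [2.0, 6.0, 3.0]])
print(cost, match)
```

The virtual rows absorb the unmatched candidate blocks at zero cost, so the square problem has the same optimal cost as the original rectangular one.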

Page 11:

Comparison of SAPM, SPM and TPM

Three blocks in the query images (i.e., (a)) and their matched counterparts in the near duplicate images (i.e., (b), (c), (d)) are highlighted and associated by the same color outlines.

Page 12:

Third Stage Matching

• Extension to Near Duplicate Video Identification:

One video clip V1 comprises {x(1), x(2), …, x(M)}, where x(i) is the i-th frame and M is the total number of frames of V1; another video clip V2 comprises {y(1), y(2), …, y(N)}, where N and y(j) are similarly defined.

Solution: temporal matching with EMD again.

Page 13:

Discussion

• If the query image is divided into non-overlapped blocks (e.g., L2-N) and the corresponding database images are divided into overlapped blocks (e.g., L2-O) at the same level, spatial shifts and some degree of scale change are addressed (e.g., $S(x^{L2\text{-}N} \rightarrow y^{L2\text{-}O})$).

• A broad range of scale variations is considered by matching the query image and the database images at different levels (e.g., $S(x^{L1\text{-}N} \rightarrow y^{L2\text{-}O})$).

• Ideally, SAPM can deal with any amount of spatial shift and scale variation by using denser scales and spatial spacings.

Page 14:

Near Duplicate Retrieval and Detection

NDR: We directly fuse the distances from different levels:

NDD: Generalized Neighborhood Component Analysis (GNCA)

We use p = {0, 1, 2, 3, 4} to indicate partitions designated as level-0 non-overlapped (L0-N), level-1 non-overlapped (L1-N), level-1 overlapped (L1-O), level-2 non-overlapped (L2-N), and level-2 overlapped (L2-O).

Page 15:

Experiments: Three Datasets

• Columbia Near Duplicate Image Database: TRECVID 2003 corpus

• New Image Dataset: TRECVID 2005 and 2006 corpus
– 150 near duplicate pairs (300 images) and 300 non-duplicate images

• New Video Dataset: TRECVID 2005 and 2006 corpus
– 50 near duplicate pairs (100 videos) and 200 non-duplicate videos

The images are collected from real broadcast news (rather than edits of the same image by the authors).

Page 16:

Comparison of SAPM with SPM and TPM for Image NDR

Columbia Database

New Image Dataset

Page 17:

SAPM and GNCA for Image NDD

Performance Measure: Equal Error Rate (EER)

SAPM+NCA and SAPM+GNCA: 20 positive and 80 negative samples were used to train the projection matrices in NCA and GNCA; another 40 positive and 160 negative samples were used for SVM training.

SPM, TPM and SAPM: all training samples (60 positive and 240 negative) were used for SVM training.

Test samples: 90 (positive) and 4840 (negative).
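
For readers unfamiliar with the measure, the sketch below shows one common way to estimate an equal error rate from pairwise distances. It assumes a pair is declared a near duplicate when its distance is at or below a threshold, sweeps every observed distance as a candidate threshold, and reports the average of the two error rates at the threshold where they are closest. The function name and this particular tie-handling are illustrative choices and are not taken from the slides.

```python
def equal_error_rate(positive_distances, negative_distances):
    """Estimate the EER from distances of true duplicate and non-duplicate pairs."""
    thresholds = sorted(set(positive_distances) | set(negative_distances))
    best = None
    for t in thresholds:
        # False rejection: a true duplicate pair whose distance exceeds the threshold.
        frr = sum(d > t for d in positive_distances) / len(positive_distances)
        # False acceptance: a non-duplicate pair at or below the threshold.
        far = sum(d <= t for d in negative_distances) / len(negative_distances)
        gap = abs(frr - far)
        if best is None or gap < best[0]:
            best = (gap, (frr + far) / 2)
    return best[1]


print(equal_error_rate([0.1, 0.2], [0.8, 0.9]))
```

With heavily imbalanced test sets such as 90 positive and 4840 negative samples, both rates are normalized by their own class size, so the measure is not dominated by the negative class.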

Page 18:

Comparison of SAPM+TM, SPM+TM and TPM+TM for Video NDR

1: Single-level L0-N -> L0-N; 2: Single-level L1-N -> L1-N (or L1-O); 3: Multi-level.

Two weighting schemes in temporal matching: normalized weight (NW) and unit weight (UW)

Page 19:

Conclusion

• A multi-level spatial matching framework for image and video near duplicate identification.

• GNCA outperforms NCA for near duplicate detection.