video object segmentation with re-identificationvideo object segmentation with re-identification...

30
Video Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen Change Loy, Xiaoou Tang The Chinese University of Hong Kong, SenseTime Group Limited

Upload: others

Post on 11-Jun-2020

14 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen

Video Object Segmentation with Re-identification

Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping ShiPing Luo, Chen Change Loy, Xiaoou Tang

The Chinese University of Hong Kong, SenseTime Group Limited

Page 2: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen

Semi-supervised Segmentation• Input :Video sequence, ground-truth label of the first frame

• Output : Masks of all instances

Page 3: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen

Challenge

• Instance Segmentation• Small objects and fine structures• Scale & pose-variations

• Tracking• Frequent occlusions

Page 4: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen

• Instance Segmentation• Small objects and fine structures• Scale & pose-variations

• Tracking• Frequent occlusions

Mask Propagation Module

Re-identification Module

Short Term

Long Term

Challenge

Page 5: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen

Video Object Segmentation with Re-identification (VS-ReID)

Iterative InferenceMask Initialization

Proposed Framework

Mask Propagation Module

Re-identification Module

Input Video Sequence

Mask Propagation Module

Page 6: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen

Video Object Segmentation with Re-identification (VS-ReID)

Iterative InferenceMask Initialization

Mask Propagation Module

Re-identification Module

Input Video Sequence

Mask Propagation Module

Mask Propagation Module

Page 7: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen

Mask Propagation Module

• Inspired by MSK[1] and LucidTracker[2]

• Use the temporal continuity property of the video sequence

• Propagate the mask from the previous frame to the current frame

[1] Perazzi F, Khoreva A, Benenson R, et al. Learning video object segmentation from static images[C]. CVPR, 2017. [2] Khoreva A, Benenson R, Ilg E, et al. Lucid Data Dreaming for Object Tracking[J]. arXiv preprint arXiv:1703.09554, 2017.

Page 8: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen

Mask Propagation Module

Image

Optical Flow

RGB Branch

Flow Branch Prediction

Guided Probability Map

Page 9: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen

Mask Propagation Module

RGB Branch

Flow Branch Prediction

Image

Optical Flow

Guided Probability Map

Page 10: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen

Mask Propagation Module

Previous Frame Current Frame

Previous Mask

Page 11: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen

Mask Propagation Module

Previous Frame Current Frame

FlowNet

Warping

Optical Flow

Guided Probability MapPrevious Mask

Page 12: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen

Mask Propagation Module

Previous Frame

FlowNet

Warping

Previous Mask

Current FrameOptical Flow

Guided Probability Map

Page 13: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen

Mask Propagation Module

RGB Branch

Flow Branch Prediction

Image

Optical Flow

Guided Probability Map

Page 14: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen

Mask Propagation Module

Image

Optical Flow

Guided Probability Map

RGB Branch

Flow Branch Prediction

Page 15: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen

Video Fram

eG

uided Probability M

apPrediction

Warping

Mask Propagation Module

Page 16: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen

Mask Propagation Module• Deeper Backbone Network

• ResNet101

• RGB-branch• Pre-trained on the MS-COCO and PASCAL VOC dataset

• Augmented ground-truth label as the guided probability map• Fine-tuned on the DAVIS dataset

• Flow-branch• Initialized with RGB-Branch’s weights• Trained on the DAVIS dataset

• Multi-instance• Inference on each instance individually

Page 17: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen

Mask Propagation Module

Page 18: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen

Video Object Segmentation with Re-identification (VS-ReID)

Alternating InferenceMask Initialization

Proposed Framework

Input Video Sequence

Mask Propagation Module

Mask Propagation Module

Re-identification Module

Page 19: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen

Re-identification Module• Detection and re-identification

First Frame Rest Frames

Template Candidate Bounding Boxes

Re-identification

Most Confident Candidate

Page 20: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen

Re-identification Module• Recover the mask from a bounding box

Template

Most Confident Candidate & Flow

Guided Probability Map

Mask Propagation Module

Recovered Mask

Page 21: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen

Re-identification Module

• Detection Model• Faster RCNN• Trained on the ImageNet

• Re-identification Model• ‘Identification Net’ in Person Search[1]• For the person category, we directly use the ‘Identification Net’ in Person

Search[1]• Trained on the ImageNet VID

• Retrieve an instance in a single frame each time[1] Xiao T, Li S, Wang B, et al. Joint detection and identification feature learning for person search[C] CVPR. 2017.

Page 22: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen

Video Object Segmentation with Re-identification (VS-ReID)

Iterative InferenceMask Initialization

Mask Propagation Module

Re-identification Module

Input Video Sequence

Mask Propagation Module

Mask Propagation Module

Page 23: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen

VS-ReID

1 8 20 37 52 64 82

Input Frames

InitializationM

ask

Prop

agat

ion

• Mask Initialization

Page 24: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen

1 8 20 37 52 64 82Input Fram

esInitialization

Mas

k Pr

opag

atio

nR

e-Id

entif

icat

ion 1

stRound

21

Page 25: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen

1 8 20 37 52 64 82Input Fram

es

21

Re-

Iden

tific

atio

n 1stR

ound

Page 26: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen

1 8 20 37 52 64 82Input Fram

es

21

1stR

oundR

e-Id

entif

icat

ion

Mas

k Pr

opag

atio

n

𝒙"

Page 27: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen

1 8 20 37 52 64 82Input Fram

es

80

2ndR

oundR

e-Id

entif

icat

ion

Mas

k Pr

opag

atio

n

𝒙"

Page 28: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen

Performance

J Mean F Mean Global MeanVoigt 54.8 60.5 57.7

Haamo 59.8 63.2 61.5Vanta 61.5 66.2 63.8Apata 65.1 70.6 67.8Ours 67.9 71.9 69.9

(DAVIS 2017 Challenge test-challenge set)

Page 29: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen

Visualization

Page 30: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen

Thanks!