Rogerio Feris, April 17, 2014 EECS 6890 – Topics in Information Processing Spring 2014, Columbia University http://rogerioferis.com/VisualRecognitionAndSearch2014 Class 11: Smart Surveillance


Page 1:

Rogerio Feris, April 17, 2014 EECS 6890 – Topics in Information Processing

Spring 2014, Columbia University http://rogerioferis.com/VisualRecognitionAndSearch2014

Class 11: Smart Surveillance

Page 2:

Visual Recognition And Search Columbia University, Spring 2014

Deadlines

Page 3:

Final Project Presentation

Page 4:

Project Paper

Page 5:

What we have seen so far

Part I: From Low-level to Semantic Visual Representations

• Low-Level Features

• Feature Coding and Pooling

• Encoding Structure: Part-based Models

• Attributes and Semantic Features

Part II: Tools for Large-Scale Image Classification and Retrieval

• Deep Learning

• Similarity-based Image Search

• Learning-based Hashing for Large-Scale Image Search

• Large-scale Active Learning

Page 6:

Case Studies

IBM Smart Surveillance System [Today]

(second half of the class: project update II)

IBM Multimedia Analysis and Retrieval [Next Class]

Part III: Case Studies

Page 7:

Video Analytics for Smart Surveillance

Page 8:

IBM Smart Vision Suite (SVS)

Video Capture / Encoding & Management

• DVR: records and streams video

Sensors & Transactions

Analytics & Framework: watches the video for alerts and events

• Analytics modules:

- Object tracking and classification

- Face capture and recognition

- License plate recognition

- Many others

• Gathers event metadata and makes it searchable

• Provides a plug-and-play framework for analytics

Real-time alerts

• Perimeter violation

• Tailgating attempt

• Red car on service road

User-driven queries

• Find red cars

• Find tailgating incidents involving this person

Page 9:

Mode of Operation: Real-time Alerts

Examples of user-configurable real-time alerts:

• Tripwire: triggers on the cat crossing the blue line

• Directional Motion: triggers on right turns, when the cars move in the direction of the arrow

• Removed Object: triggers when the object outlined in blue is removed from its position
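A tripwire alert of this kind can be sketched as a segment-intersection test between the user-drawn line and the motion of a tracked object between two consecutive frames. This is only an illustrative sketch: the function names and the use of object centroids are assumptions, not details given in the slides.

```python
from typing import Tuple

Point = Tuple[float, float]  # (x, y) in image coordinates


def _side(a: Point, b: Point, p: Point) -> float:
    """Cross product sign: tells on which side of the directed line a->b the point p lies."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def crosses_tripwire(wire_a: Point, wire_b: Point,
                     prev_pos: Point, curr_pos: Point) -> bool:
    """True if the move prev_pos -> curr_pos strictly crosses the segment wire_a-wire_b."""
    d1 = _side(wire_a, wire_b, prev_pos)
    d2 = _side(wire_a, wire_b, curr_pos)
    d3 = _side(prev_pos, curr_pos, wire_a)
    d4 = _side(prev_pos, curr_pos, wire_b)
    # The two endpoints of each segment must lie on opposite sides of the other segment.
    return d1 * d2 < 0 and d3 * d4 < 0


def crossing_direction(wire_a: Point, wire_b: Point,
                       prev_pos: Point, curr_pos: Point) -> int:
    """+1 or -1 depending on which side the object ends up on; 0 if there is no crossing."""
    if not crosses_tripwire(wire_a, wire_b, prev_pos, curr_pos):
        return 0
    return 1 if _side(wire_a, wire_b, curr_pos) > 0 else -1
```

The sign returned by `crossing_direction` is one simple way to restrict an alert to a single crossing direction, in the spirit of the directional-motion example.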

Page 10:

Mode of Operation: Search After the Fact

“Show me all large vehicles with yellow color that crossed this road in the past 5 days” [Finds DHL delivery trucks]

“Show me all events with duration greater than 30 seconds” [Finds people loitering]
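Both queries amount to filtering stored event metadata. The sketch below assumes a hypothetical `Event` record with a start time, duration, object type, size, and color; the actual metadata schema of the system is not described in the slides.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Event:
    # Hypothetical metadata fields, chosen only to express the two example queries.
    start: datetime
    duration_s: float
    object_type: str   # e.g. "vehicle", "person"
    object_size: str   # e.g. "small", "large"
    color: str


def search(events: List[Event], *,
           object_type: Optional[str] = None,
           object_size: Optional[str] = None,
           color: Optional[str] = None,
           min_duration_s: Optional[float] = None,
           since: Optional[datetime] = None) -> List[Event]:
    """Return the events that satisfy every filter that was supplied."""
    results = []
    for e in events:
        if object_type is not None and e.object_type != object_type:
            continue
        if object_size is not None and e.object_size != object_size:
            continue
        if color is not None and e.color != color:
            continue
        # "duration greater than N seconds" is a strict comparison
        if min_duration_s is not None and e.duration_s <= min_duration_s:
            continue
        if since is not None and e.start < since:
            continue
        results.append(e)
    return results
```

With this layout, the first query becomes `search(events, object_type="vehicle", object_size="large", color="yellow", since=now - timedelta(days=5))` and the second becomes `search(events, min_duration_s=30)`.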

Page 11:

Traditional Pipeline: Blob-Based Analytics

Background Subtraction → Blob Tracking → High-Level Processing

Background Subtraction: Moving Object Detection (Blobs)

Most existing smart surveillance systems in the market rely on blob-based video analysis. They are efficient and work well in low-activity scenarios.
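The first stage of such a pipeline can be sketched in a few lines. The example below is a deliberately simplified stand-in: it differences each frame against a fixed reference image (not the adaptive per-pixel model discussed later) and groups foreground pixels into blobs with a 4-connected flood fill. The threshold and minimum-area values are arbitrary assumptions.

```python
from collections import deque
from typing import List, Tuple

Image = List[List[int]]            # grayscale image as rows of intensities
Box = Tuple[int, int, int, int]    # (row_min, col_min, row_max, col_max)


def foreground_mask(frame: Image, background: Image, threshold: int = 30) -> List[List[bool]]:
    """Mark pixels whose intensity differs from the reference background by more than threshold."""
    return [[abs(f - b) > threshold for f, b in zip(frow, brow)]
            for frow, brow in zip(frame, background)]


def extract_blobs(mask: List[List[bool]], min_area: int = 2) -> List[Box]:
    """Group 4-connected foreground pixels into blobs and return their bounding boxes."""
    height = len(mask)
    width = len(mask[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes: List[Box] = []
    for r in range(height):
        for c in range(width):
            if not mask[r][c] or seen[r][c]:
                continue
            queue = deque([(r, c)])
            seen[r][c] = True
            rmin = rmax = r
            cmin = cmax = c
            area = 0
            while queue:
                y, x = queue.popleft()
                area += 1
                rmin, rmax = min(rmin, y), max(rmax, y)
                cmin, cmax = min(cmin, x), max(cmax, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < height and 0 <= nx < width and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if area >= min_area:   # drop tiny blobs, which are usually noise
                boxes.append((rmin, cmin, rmax, cmax))
    return boxes
```

Because blobs are defined purely by pixel connectivity, two objects that touch in the mask come out as one bounding box.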

Page 12:

Background Modeling: Challenges

Pixel-wise noise

Example: Latecki et al.

Page 13:

Background Modeling: Challenges

Lighting changes

• Gradual

• Sudden

Shadows and reflections

Camouflage / low-contrast

Crowded Scenes

Removed objects

[Figure panels: Shadow, Reflection, Crowded Scene]

Page 14:

Background Modeling: Challenges

Scatter plots of red and green values of a single pixel over time

Multimodal Backgrounds (swaying trees, water, flickering, …)

Page 15:

Background Modeling

Gaussian Mixture Model (GMM) for each pixel location

Stauffer and Grimson, “Adaptive Background Mixture Models for Real-time Tracking,” CVPR 1999

Page 16:

Background Modeling

Given a new video frame, the task is to classify each pixel as foreground or background

Each pixel has an associated GMM with K Gaussians (usually K ranges from 3 to 5)

Consider a single pixel as an example, with the associated model containing 5 Gaussians

Let’s assume for now that 3 Gaussians correspond to the background and the other two correspond to the foreground

Page 17:

Background Modeling

Check which Gaussian best represents the pixel value:

• If “background” Gaussian, then classify the pixel as “background” and update Gaussian parameters.

• If “foreground” Gaussian, then classify the pixel as “foreground” and update Gaussian parameters.

• If none of them, classify the pixel as “foreground” and replace the least probable distribution with a new Gaussian (centered at the pixel value, with low weight and high variance)

Is the pixel foreground or background?

Matching Process
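The matching step for a single grayscale pixel can be sketched as follows. Several details are assumptions rather than content from the slides: a match is taken to mean "within 2.5 standard deviations of the mean" (the criterion used in the cited Stauffer and Grimson paper), the "least probable" Gaussian is taken to be the one with the lowest weight, and the initial weight and variance of a new Gaussian are arbitrary. Parameter updates and weight renormalization are left out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Gaussian:
    weight: float
    mean: float
    var: float
    is_background: bool


MATCH_SIGMAS = 2.5   # assumed match threshold, in standard deviations
INIT_WEIGHT = 0.05   # assumed low prior weight for a newly created Gaussian
INIT_VAR = 900.0     # assumed high initial variance


def best_match(model: List[Gaussian], x: float) -> Optional[int]:
    """Index of the closest Gaussian (in standard deviations) that matches x, or None."""
    best, best_d = None, MATCH_SIGMAS
    for i, g in enumerate(model):
        d = abs(x - g.mean) / g.var ** 0.5
        if d <= best_d:
            best, best_d = i, d
    return best


def classify_pixel(model: List[Gaussian], x: float) -> str:
    """Classify pixel value x; if nothing matches, replace the weakest Gaussian in place."""
    i = best_match(model, x)
    if i is None:
        weakest = min(range(len(model)), key=lambda k: model[k].weight)
        model[weakest] = Gaussian(INIT_WEIGHT, x, INIT_VAR, is_background=False)
        return "foreground"
    return "background" if model[i].is_background else "foreground"
```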

Page 18:

Background Modeling

Adaptive GMMs

Important to handle gradual lighting changes

Once a pixel value “matches” a Gaussian, the corresponding Gaussian parameters are updated:

Prior Weight

Mean

Variance
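The update equations themselves did not survive in this transcript. For reference, the cited Stauffer and Grimson paper writes them in the form below. The symbols follow that paper and may differ from the slide: $\alpha$ is the learning rate, $M_{k,t}$ is 1 for the matched Gaussian and 0 for the others, $X_t$ is the current pixel value, and $\eta$ is the Gaussian density.

```latex
\begin{align}
\omega_{k,t} &= (1-\alpha)\,\omega_{k,t-1} + \alpha\,M_{k,t} \\
\mu_t &= (1-\rho)\,\mu_{t-1} + \rho\,X_t \\
\sigma_t^2 &= (1-\rho)\,\sigma_{t-1}^2 + \rho\,(X_t-\mu_t)^{\top}(X_t-\mu_t) \\
\rho &= \alpha\,\eta(X_t \mid \mu_k, \sigma_k)
\end{align}
```

The weight update applies to every Gaussian of the pixel, while the mean and variance updates apply only to the matched one. A small $\alpha$ lets the model absorb gradual lighting changes without reacting to every passing object.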

Page 19:

Background Modeling

Background and Foreground Gaussians

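The Gaussians are ordered by $\omega/\sigma$, the ratio of prior weight to standard deviation (high support and low variance)

Then the first B distributions are labeled as “Background”, where

$$B = \arg\min_b \left( \sum_{k=1}^{b} \omega_k > T \right)$$

and T is the minimum portion of the data that the background should account for (notation from the cited Stauffer and Grimson paper)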
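A minimal sketch of this labeling step, assuming the ranking key and threshold rule of the cited Stauffer and Grimson paper: rank by weight divided by standard deviation, then take the shortest prefix whose cumulative weight exceeds a threshold `T`. The default value of `T` here is an arbitrary assumption.

```python
from typing import List, Tuple


def label_background(gaussians: List[Tuple[float, float]], T: float = 0.7) -> List[bool]:
    """Given (weight, std) pairs, return an is_background flag per Gaussian, in input order."""
    # Rank by weight / std: well-supported, low-variance Gaussians come first.
    order = sorted(range(len(gaussians)),
                   key=lambda i: gaussians[i][0] / gaussians[i][1],
                   reverse=True)
    flags = [False] * len(gaussians)
    cumulative = 0.0
    for i in order:
        flags[i] = True
        cumulative += gaussians[i][0]
        if cumulative > T:   # the first B Gaussians now account for more than T of the data
            break
    return flags
```

A larger `T` admits more Gaussians into the background, which is what allows multimodal backgrounds such as swaying trees to be represented.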

Page 20:

Blob-Based Analytics: Limitations

Dealing with Crowded Scenes

Objects close to each other are clustered into a single blob

Environmental Conditions

Quick lighting changes, reflections, and shadows cause spurious blobs

[Figure panels: Original Video, Background Subtraction, Tracking]

Page 21:

Object-Centric Video Analytics

Vehicle Detection in Crowded Scenes


Page 22:

Object-Centric Video Analytics

Pedestrian Detection and Tracking in Crowded Scenes

Page 23:

Object-Centric Video Analytics

Limitations / Challenges

Detector accuracy: dealing with appearance variations

Different object poses, lighting changes, etc.

Detector efficiency / cost

State-of-the-art approaches usually run at low frame rates

How many object classes are needed?

Page 24:

Large-Scale Detector Learning

[Feris et al, Large-Scale Vehicle Detection, Indexing, and Search in Urban Surveillance Videos, IEEE Transactions on Multimedia, 2012]

Page 25:

Semi-Automatic Training Data Collection

User-defined Region of Interest (ROI)

Prior information about motion direction and size of cars in the region

Classifier is applied based on motion direction and blob shape (via background modeling, no appearance) and only high-confidence samples are selected
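The selection rule can be sketched as a simple filter over foreground blobs. Everything concrete here is an assumption for illustration: the `Blob` and `RoiPrior` fields, the use of hard thresholds as the "classifier", and the particular tolerance values; the slides only state that motion direction and blob shape are used and that appearance is not.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Blob:
    direction_deg: float   # motion direction of the blob
    width: int
    height: int


@dataclass
class RoiPrior:
    # Hypothetical per-ROI prior supplied by the user.
    direction_deg: float
    max_direction_error_deg: float
    min_width: int
    max_width: int
    min_height: int
    max_height: int


def angular_difference(a: float, b: float) -> float:
    """Smallest absolute difference between two angles, in degrees (0 to 180)."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def is_high_confidence(blob: Blob, prior: RoiPrior) -> bool:
    """Accept a blob only if its motion direction and size both agree with the ROI prior."""
    return (angular_difference(blob.direction_deg, prior.direction_deg)
            <= prior.max_direction_error_deg
            and prior.min_width <= blob.width <= prior.max_width
            and prior.min_height <= blob.height <= prior.max_height)


def collect_samples(blobs: List[Blob], prior: RoiPrior) -> List[Blob]:
    """Keep only the blobs that pass every check; the rest are discarded, not labeled negative."""
    return [b for b in blobs if is_high_confidence(b, prior)]
```

Tight thresholds trade recall for precision, which is acceptable when hours of video are available to harvest samples from.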

[Figure panels: Original Video, Captured Samples (video)]

Page 26:

Semi-Automatic Training Data Collection

Training data: ~5 hours of video [demo video]

Page 27:

Synthetic Occlusion Generator

Page 28:

Synthetic Occlusion Generator

Page 29:

Huge Vehicle Dataset

Nearly one million images (50+ cameras)! The largest public dataset to date has ~5000 images

Page 30:

Automatic Dataset Semantic Partitioning

Large variations in pose cause drastic appearance variations, which make learning difficult

Clustering based on motion direction (related to vehicle pose) → motionlet clusters

Multiple detectors are learned (for each motionlet cluster) rather than a single monolithic detector
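The slides do not say which clustering algorithm produces the motionlet clusters. As a stand-in, the sketch below simply quantizes each sample's motion vector into a fixed number of equal angular bins, so that each bin could serve as the training set for one detector. The function names and the default of 8 bins are assumptions.

```python
import math
from collections import defaultdict
from typing import Dict, List, Tuple


def motionlet_bin(dx: float, dy: float, num_bins: int = 8) -> int:
    """Quantize a motion vector into one of num_bins direction bins; bin 0 is centered on +x."""
    angle = math.atan2(dy, dx) % (2 * math.pi)      # direction in [0, 2*pi)
    bin_width = 2 * math.pi / num_bins
    return int((angle + bin_width / 2) // bin_width) % num_bins


def partition_by_motion(samples: List[Tuple[str, Tuple[float, float]]],
                        num_bins: int = 8) -> Dict[int, List[str]]:
    """Group sample ids by direction bin; samples are (id, (dx, dy)) pairs."""
    clusters: Dict[int, List[str]] = defaultdict(list)
    for sample_id, (dx, dy) in samples:
        clusters[motionlet_bin(dx, dy, num_bins)].append(sample_id)
    return dict(clusters)
```

Note that in image coordinates the y axis usually points down, so the bin numbering runs clockwise on screen.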

Clustering Based on Motionlets

Page 31:

Core Detector Model

Cascade of Adaboost Classifiers with Haar-like Features

A feature pool containing a huge set (order of millions) of feature configurations is generated over multiple feature planes

Similar to Integral channel features (Dollar et al), but instead of randomization, we use massively parallel feature selection to select a compact set of discriminative features through Adaboost learning
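For readers unfamiliar with Haar-like features, the sketch below shows why millions of them can be evaluated cheaply: once an integral image is built, any rectangle sum costs four lookups. It uses a single intensity plane and one two-rectangle feature type; the multiple feature planes and the parallel feature selection mentioned above are not modeled.

```python
from typing import List


def integral_image(img: List[List[int]]) -> List[List[int]]:
    """Summed-area table with an extra zero row and column: ii[r][c] = sum of img[:r][:c]."""
    height, width = len(img), len(img[0])
    ii = [[0] * (width + 1) for _ in range(height + 1)]
    for r in range(height):
        row_sum = 0
        for c in range(width):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def rect_sum(ii: List[List[int]], top: int, left: int, height: int, width: int) -> int:
    """Sum of the pixels inside a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def haar_two_rect_horizontal(ii: List[List[int]], top: int, left: int,
                             height: int, width: int) -> int:
    """Two-rectangle feature: left half minus right half (width is assumed even)."""
    half = width // 2
    return (rect_sum(ii, top, left, height, half)
            - rect_sum(ii, top, left + half, height, half))
```

A feature pool in this style is produced by enumerating feature types, positions, and sizes inside the detection window; boosting then picks a small discriminative subset.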

Page 32:

Deep Cascade Detectors

Significant accuracy improvement by training deep cascades with a huge number of bootstrapped negative samples [200,000 negative samples]
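Bootstrapping negatives for a cascade usually means running the stages trained so far over object-free windows and keeping the ones that are still accepted, since those false positives are the hard cases the next stage must learn to reject. The sketch below illustrates that loop with stages represented as hypothetical (scoring function, threshold) pairs; it does not reproduce the training procedure of the cited work.

```python
from typing import Callable, List, Sequence, Tuple

Window = Sequence[float]
Stage = Tuple[Callable[[Window], float], float]   # (scoring function, acceptance threshold)


def cascade_accepts(stages: List[Stage], window: Window) -> bool:
    """A window is accepted only if every stage scores it at or above its threshold."""
    for score, threshold in stages:
        if score(window) < threshold:
            return False          # early rejection: later stages are never evaluated
    return True


def mine_hard_negatives(stages: List[Stage],
                        negative_windows: List[Window],
                        limit: int) -> List[Window]:
    """Collect up to limit negative windows that the current cascade wrongly accepts."""
    hard: List[Window] = []
    for window in negative_windows:
        if cascade_accepts(stages, window):
            hard.append(window)
            if len(hard) >= limit:
                break
    return hard
```

Early rejection is also what keeps a deep cascade fast at run time: most windows are discarded by the first few stages.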

Page 33:

Large-Scale Multi-Pose Vehicle Detection

100+ frames per second!

Page 34:

Other Visual Analytics Modules

Page 35:

Abandoned Object Detection

[Q. Fan et al, Relative Attributes For Large-scale Abandoned Object Detection, ICCV 2013]

[Y. L. Tian, R. S. Feris, and A. Hampapur. Real-time detection of abandoned and removed objects in complex environments, VS 2008]

Main Issues

Approach

Page 36:

Attribute-based People Search

[Feris et al, Indexing and searching according to attributes of a person, US Patent 20100106707, 2008]

[Feris et al, Attribute-based People Search: Lessons Learnt from a Practical Surveillance System, ICMR 2014]

[B. Siddiquie, R. S. Feris, and L. Davis, Image Ranking and Retrieval Based on Multi-Attribute Queries, CVPR 2011 (Oral)]

[D. Vaquero, R. S. Feris, D. Tran, L. Brown, A. Hampapur, and M. Turk, Attribute-based People Search in Surveillance Environments, WACV 2009]

Query Example: “Show me all people entering IBM last month with beard, dark skin, using sunglasses, wearing a red jacket and blue pants”

Page 37:

Attribute-based Vehicle Search

[Feris et al, Attribute-based Vehicle Search in Crowded Surveillance Environments, ICMR 2011]

Page 38:

Surveillance Event Detection (SED)

Qiang Chen et al, CMU-IBM-NUS@TRECVID 2012: Surveillance Event Detection, 2012

We ranked 1st in 4 out of 7 surveillance event detection tasks

Page 39:

Sweethearting Detection

[Fan et al, Recognition of Repetitive Sequential Human Activity, CVPR 2009]

Page 40:

Resources

AMOS: the archive of many outdoor scenes (http://amos.cse.wustl.edu/)

Many public traffic cameras available online! For example, you can check: http://www.chart.state.md.us/travinfo/trafficcams.php#

Page 41:

Project Update II

1) Flower Recognition (Wenqian Liu & Shun-Xuan Wang)

2) Axon Segmentation (Mo Zhou & John Bowler)

3) Safer Driving Through Gesture Control (Kartik Darapuneni, Jianze Wang, Shuheng Gong)

4) Recognition of Animal Skin Texture Attributes in the Wild (Amey Dharwadker & Kai Zhang)

5) Identifying Animals in the Wild (Chia Kang Chao & Yen-Cheng Chou)

6) From ImageNet to Serengeti: Recognizing Animals in Wild Scenes (Guangnan Ye & Maja Rudolph)