talk 2009-monash-seminar-intelligent-video-surveillance
TRANSCRIPT
www.monash.edu.au
Gippsland School of Information Technology (GSIT)
Behaviour Recognition Framework for Intelligent Visual Surveillance
06 April 2009
Project Team
• Mahfuzul Haque, PhD student (2 yrs, 1 m)
• A/Prof. Manzur Murshed
• Dr. Manoranjan Paul
Project Motivation
“Behaviour Recognition Framework for
Intelligent Visual Surveillance”
Why “Intelligent” Surveillance?
Why “Behaviour Recognition”?
What type of “Behaviours”?
Surveillance Everywhere
Are we really protected?
Too Many Cameras
A large number of surveillance cameras have been deployed in recent years.
London Heathrow airport has more than 5000 cameras!
Behind the Scene: Worried Human Monitor
Dependence on human monitors has increased.
Reliability of surveillance systems has decreased.
Project Goals
• Aiding human monitors by automatic detection of specific abnormal behaviours
• Decreasing dependence on human monitors
• Improving reliability of surveillance systems for ensuring human security
Project Scope
Mob Violence
Crowding
Sudden Group Formation
Sudden Group Deformation
Shooting
Panic Driven Behaviours
Group Behaviours
Research Question
How to recognize specific group
behaviours from surveillance video
streams in real-time?
Research Area:
– Computer Vision
– Application of Machine Learning
System Architecture
Surveillance Video Stream → Behaviour Recognition Framework → Behaviour Profile
Behaviour Profile
[Figure: timeline mapping the Surveillance Video Stream (System Input) to the Behaviour Profile (System Output). Time axis marks: 0, 20, 60, 140, 320. Segment labels, in order: Unknown, Group Appearing, Group Appearing, Group Merging, Group Splitting.]
Behaviour Recognition Framework
Framework Components
• Background Modelling
• Frame Level Feature Extraction
• Temporal Feature Extraction
• Behaviour Classification
Behaviour Recognition Framework
Background Modelling → Frame Level Feature Extraction → Temporal Feature Extraction → Behaviour Classification
Background Modelling
How to extract the active regions from surveillance video stream?
Background Subtraction
Current frame − Background = Moving foreground
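The subtraction above can be sketched in a few lines. This is only an illustration with a fixed background image and a hypothetical intensity threshold of 30; neither the function name nor the threshold comes from the slides, and the framework itself uses an adaptive background model rather than a single static image.

```python
def subtract_background(frame, background, threshold=30):
    """Return a binary mask: 1 where the frame differs from the background.

    frame and background are equally sized 2-D lists of grey-level
    intensities. The threshold is an assumed value for illustration.
    """
    return [
        [1 if abs(f - b) > threshold else 0 for f, b in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


# Toy 3x4 images: a bright object appears in the middle of a dark scene.
background = [[10, 10, 10, 10],
              [10, 10, 10, 10],
              [10, 10, 10, 10]]
frame = [[10, 12, 10, 10],
         [10, 200, 210, 10],
         [10, 10, 9, 10]]

mask = subtract_background(frame, background)
print(mask)  # [[0, 0, 0, 0], [0, 1, 1, 0], [0, 0, 0, 0]]
```

Small intensity changes (12 versus 10) stay below the threshold, which is why a single static background breaks down once lighting, shadows or moving leaves enter the scene.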
Challenges!!
Background Modelling
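[Figure: three example scenes with labelled regions: (Sky, Cloud, Leaf, Moving Person), (Road, Shadow, Moving Car) and (Floor, Shadow, Walking People).]
[Plots: P(x) against x (pixel intensity). Each Gaussian has mean µ and variance σ². The combined plot shows one Gaussian each for Sky, Cloud, Person and Leaf.]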
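The idea in the plots, one Gaussian per surface that a pixel observes, corresponds to a mixture density P(x) = Σ ω_k N(x; µ_k, σ_k²). A minimal sketch follows; the weights, means and variances are hypothetical values chosen for illustration, not figures from the talk.

```python
import math


def gaussian_pdf(x, mean, variance):
    """Density of a one-dimensional Gaussian at x."""
    return math.exp(-(x - mean) ** 2 / (2 * variance)) / math.sqrt(2 * math.pi * variance)


def mixture_pdf(x, components):
    """Weighted sum of Gaussians; components are (weight, mean, variance)."""
    return sum(w * gaussian_pdf(x, m, v) for w, m, v in components)


# Hypothetical model for one pixel that mostly sees sky, sometimes cloud,
# and occasionally a leaf passing in front of it.
pixel_model = [
    (0.6, 200.0, 25.0),   # sky
    (0.3, 230.0, 16.0),   # cloud
    (0.1, 80.0, 100.0),   # leaf
]

# An intensity near one of the means is far more likely than one between modes.
print(mixture_pdf(200, pixel_model) > mixture_pdf(150, pixel_model))  # True
```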
Background Modelling
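[Figure: from Frame 1 to Frame N a pixel observes road, shadow, car, road, shadow. The background model separates the current frame into background and moving foreground.]
Background Models
Model    Weight      Mean  Variance
road     ω1 = 65%    µ1    σ1²
shadow   ω2 = 20%    µ2    σ2²
car      ω3 = 15%    µ3    σ3²
Models are ordered by ω/σ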
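The ω/σ ordering can be sketched as below. The weights (65%, 20%, 15%) are the ones on the slide; the means and variances are hypothetical. The selection rule (keep the leading models until their cumulative weight exceeds a threshold) and the 2.5 standard deviation match test follow the commonly described Stauffer and Grimson scheme and are assumptions here, not details confirmed by the slides.

```python
import math


def order_models(models):
    """Sort (weight, mean, variance) triples by weight / std-dev, highest first."""
    return sorted(models, key=lambda m: m[0] / math.sqrt(m[2]), reverse=True)


def background_models(models, min_weight=0.7):
    """Keep the leading models until their cumulative weight exceeds min_weight."""
    selected, total = [], 0.0
    for model in order_models(models):
        selected.append(model)
        total += model[0]
        if total > min_weight:
            break
    return selected


def is_foreground(x, models, min_weight=0.7, max_sigmas=2.5):
    """A pixel value is foreground if it matches none of the background models."""
    for _, mean, variance in background_models(models, min_weight):
        if abs(x - mean) <= max_sigmas * math.sqrt(variance):
            return False
    return True


# (weight, mean, variance); means and variances are made up for illustration.
models = [
    (0.15, 40.0, 400.0),   # car: rare and highly variable
    (0.65, 120.0, 25.0),   # road: frequent and stable
    (0.20, 70.0, 64.0),    # shadow
]

print([m[1] for m in order_models(models)])  # [120.0, 70.0, 40.0]
print(is_foreground(118, models))            # False: matches the road model
print(is_foreground(40, models))             # True: matches only the car model
```

Frequent, low-variance surfaces such as the road rise to the top of the ordering, while rare and highly variable ones such as a passing car sink to the bottom and are treated as foreground.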
Background Modelling
[Figure: detection results on five test sequences: (1) PETS2000; (2) PETS2006-B1; (3) PETS2006-B2; (4) PETS2006-B3; and (5) PETS2006-B4. Columns: First Frame, Test Frame, Ground Truth, S&G, Lee, Proposed.]
Frame Level Feature Extraction
• Feature Categories:
– Count
– Area
– Density
– Bounding Box
– Filling Ratio
– Aspect Ratio
• 30 frame level features
Bounding Boxes
www.monash.edu.au
23
Frame Level Feature Extraction
Foreground Count
• FC (Foreground Count)
Foreground Area
• TFA (Total Foreground Area)
• AFA (Average Foreground Area)
• VFA (Variance of Foreground Area)
• MAXFA (Maximum Foreground Area)
• MINFA (Minimum Foreground Area)
Foreground Density
• AFD (Average Foreground Density)
• VFD (Variance of Foreground Density)
Filling Ratio
• TFR (Total Filling Ratio)
• AFR (Average Filling Ratio)
• VFR (Variance of Filling Ratio)
• MAXFR (Maximum Filling Ratio)
• MINFR (Minimum Filling Ratio)
Bounding Box – Area
• TBBA (Total Bounding Box Area)
• ABBA (Average Bounding Box Area)
• VBBA (Variance of Bounding Box Area)
• MAXBBA (Maximum Bounding Box Area)
• MINBBA (Minimum Bounding Box Area)
Bounding Box – Width
• ABBW (Average Bounding Box Width)
• VBBW (Variance of Bounding Box Width)
• MAXBBW (Maximum Bounding Box Width)
• MINBBW (Minimum Bounding Box Width)
Bounding Box – Height
• ABBH (Average Bounding Box Height)
• VBBH (Variance of Bounding Box Height)
• MAXBBH (Maximum Bounding Box Height)
• MINBBH (Minimum Bounding Box Height)
Aspect Ratio
• AAR (Average Aspect Ratio)
• VAR (Variance of Aspect Ratio)
• MAXAR (Maximum Aspect Ratio)
• MINAR (Minimum Aspect Ratio)
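A rough sketch of how a subset of these features could be computed from a binary foreground mask is given below. The slides do not define the features, so several choices are assumptions: regions are 4-connected blobs, filling ratio is taken as blob area divided by bounding box area, aspect ratio as width divided by height, and variance as population variance. The density features are omitted because their definition is not given.

```python
from collections import deque
from statistics import mean, pvariance


def foreground_blobs(mask):
    """Find 4-connected foreground regions; return (area, width, height) each."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    blobs = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill, tracking area and bounding box.
            queue = deque([(r, c)])
            seen[r][c] = True
            area, r0, r1, c0, c1 = 0, r, r, c, c
            while queue:
                y, x = queue.popleft()
                area += 1
                r0, r1 = min(r0, y), max(r1, y)
                c0, c1 = min(c0, x), max(c1, x)
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            blobs.append((area, c1 - c0 + 1, r1 - r0 + 1))
    return blobs


def frame_features(mask):
    """Compute some of the listed frame level features for one mask."""
    blobs = foreground_blobs(mask)
    if not blobs:
        return {"FC": 0}
    areas = [a for a, _, _ in blobs]
    box_areas = [w * h for _, w, h in blobs]
    filling = [a / (w * h) for a, w, h in blobs]   # assumed definition
    aspect = [w / h for _, w, h in blobs]          # assumed definition
    return {
        "FC": len(blobs),
        "TFA": sum(areas), "AFA": mean(areas), "VFA": pvariance(areas),
        "MAXFA": max(areas), "MINFA": min(areas),
        "TBBA": sum(box_areas), "ABBA": mean(box_areas),
        "AFR": mean(filling), "MAXFR": max(filling), "MINFR": min(filling),
        "AAR": mean(aspect), "MAXAR": max(aspect), "MINAR": min(aspect),
    }


mask = [
    [1, 1, 0, 0, 0],
    [1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
]
features = frame_features(mask)
print(features["FC"], features["TFA"], features["MINFR"])  # 2 7 0.75
```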
Temporal Feature Extraction
• Fixed length, partially overlapped sliding window
• Temporal data smoothing – polynomial curve fitting
• 9 temporal features for each frame level feature
• Output: 270 temporal features
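The windowing and smoothing steps might look roughly like the sketch below. The window length of 100 frames appears on a later slide; the step of 50 frames (50% overlap) and the polynomial degree of 3 are assumptions, since the slides give neither. The least-squares fit is written out with the normal equations to keep the example dependency-free; a numerical library would normally be used instead.

```python
def sliding_windows(series, length=100, step=50):
    """Yield fixed-length windows; neighbours overlap by length - step frames."""
    for start in range(0, len(series) - length + 1, step):
        yield series[start:start + length]


def polyfit(xs, ys, degree):
    """Least-squares polynomial fit; returns coefficients, lowest order first."""
    n = degree + 1
    # Normal equations: A[i][j] = sum(x^(i+j)), b[i] = sum(y * x^i).
    A = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= factor * A[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(A[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / A[i][i]
    return coeffs


def smooth(window, degree=3):
    """Replace a window by the values of its fitted polynomial."""
    xs = [i / (len(window) - 1) for i in range(len(window))]  # scale to [0, 1]
    coeffs = polyfit(xs, window, degree)
    return [sum(c * x ** k for k, c in enumerate(coeffs)) for x in xs]


# 250 frames of a hypothetical frame level feature give four windows.
series = [float(i % 40) for i in range(250)]
windows = list(sliding_windows(series))
print(len(windows), len(windows[0]))  # 4 100
smoothed = smooth(windows[0])
```

Each frame level feature yields one such series, so every window is processed once per feature.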
Temporal Feature Extraction
[Plot: TFA (Total Foreground Area, %) against time (window = 100 frames), with MAX, MIN, TIME(MAX) and TIME(MIN) marked on the curve.]
Temporal Features
• MAX
• MIN
• AVG
• VAR
• RATE
• TIME(MAX)
• TIME(MIN)
• D = TIME(MAX) - TIME(MIN)
• SLOPE ( D/2 )
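Most of the listed temporal features follow directly from their names and can be sketched as below for one window of one frame level feature. RATE and SLOPE are left out because the slides do not define them precisely. Using the frame index within the window as the time unit, and population variance for VAR, are assumptions.

```python
from statistics import mean, pvariance


def temporal_features(window):
    """Compute seven of the nine temporal features for one (smoothed) window."""
    t_max = max(range(len(window)), key=window.__getitem__)  # first index of max
    t_min = min(range(len(window)), key=window.__getitem__)  # first index of min
    return {
        "MAX": window[t_max],
        "MIN": window[t_min],
        "AVG": mean(window),
        "VAR": pvariance(window),
        "TIME(MAX)": t_max,
        "TIME(MIN)": t_min,
        "D": t_max - t_min,
    }


# Hypothetical TFA values (%) over a short ten-frame window.
tfa = [2, 3, 5, 9, 14, 12, 8, 4, 1, 1]
features = temporal_features(tfa)
print(features["TIME(MAX)"], features["TIME(MIN)"], features["D"])  # 4 8 -4
```

A negative D means the peak comes before the trough, which is the kind of ordering information that can separate, for example, a group forming from a group dispersing.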
Behaviour Classification
• Individual classifiers for each behaviour
• Supervised training
• Feature ranking
• Top 100 features from 270 features
• Dimension reduction (PCA)
• Max dimension 30
• SVM classifier
• Output: Behaviour Profile
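The feature ranking step could be sketched as follows. The slides do not say which ranking criterion was used, so the Fisher score here is purely an assumption for illustration, as are the function names and toy data. PCA and the SVM are not shown; they would normally come from a machine learning library, with one such pipeline trained per behaviour.

```python
from statistics import mean, pvariance


def fisher_score(values, labels):
    """Squared gap between class means divided by the summed class variances."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    spread = pvariance(pos) + pvariance(neg)
    gap = (mean(pos) - mean(neg)) ** 2
    if spread > 0:
        return gap / spread
    return float("inf") if gap > 0 else 0.0


def top_features(samples, labels, k=100):
    """Return the indices of the k highest-scoring features, best first."""
    n_features = len(samples[0])
    scores = [
        fisher_score([sample[j] for sample in samples], labels)
        for j in range(n_features)
    ]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


# Toy data: 4 windows with 3 temporal features each (the real input has 270);
# label 1 marks windows that contain the behaviour, 0 those that do not.
samples = [
    [0.1, 5.0, 1.0],
    [0.2, 4.0, 1.1],
    [0.9, 5.5, 0.9],
    [1.0, 4.5, 1.0],
]
labels = [0, 0, 1, 1]

best = top_features(samples, labels, k=2)
print(best)  # [0, 2]
```

Feature 0 separates the two classes cleanly and ranks first; feature 1 overlaps heavily between classes and is dropped.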
Behaviour Classification
Experiments
GROUP FORMING
• Accuracy: 0.9767
• Top 3 features:
– TIME(MAX)-VFD
– TIME(MAX)-AFD
– (TIME(MAX) - TIME(MIN))-VFD
GROUP SPLITTING AND SPREADING
• Accuracy: 0.8488
• Top 3 features:
– TIME(MAX)-VFD
– TIME(MIN)-ABBA
– TIME(MIN)-AFA
BLOCKED EXIT
• Accuracy: 0.9651
• Top 3 features:
– TIME(MIN)-TFA
– MIN-MINAR
– TIME(MAX)-TFA
Summary: Framework Components
Background Modelling
• Multiple Background Models
• Gaussian Mixture Models (GMM)
• Unsupervised
• Output: Foreground Region/Mask
Frame Level Feature Extraction
• Feature Categories:
– Count
– Area
– Density
– Bounding Box
– Filling Ratio
– Aspect Ratio
• Output: 30 Frame Level Features
Temporal Feature Extraction
• Fixed Length, Partially Overlapped Sliding Window
• Temporal Data Smoothing – Polynomial Curve Fitting
• 9 Temporal Features for Each Frame Level Feature
• Output: 270 Temporal Features
Behaviour Classification
• Individual Classifiers for Each Behaviour
• Each Classifier is Trained Using Supervised Learning
• Feature Ranking
• Top 100 Features
• Dimension Reduction (PCA)
• Max Dimension 30
• SVM classifier
• Output: Behaviour Profile
Research Challenges
• No tracking/trajectory
• Simple behaviours
– Group appear/disappear
– Group merge/split
• Panic driven behaviours
– Fire/Blocked exit
– Fighting/Shooting
• Context variation
– Speed
– Direction
– Object Resolution
Implemented System: VSTK
Publications
1. Mahfuzul Haque, Manzur Murshed, and Manoranjan Paul, Improved Gaussian Mixtures for Robust Object Detection by Adaptive Multi-Background Generation, International Conference on Pattern Recognition (ICPR), Tampa, Florida, USA, 2008. (CORE A)
2. Mahfuzul Haque, Manzur Murshed, and Manoranjan Paul, A Hybrid Object Detection Technique from Dynamic Background Using Gaussian Mixture Models, IEEE International Workshop on Multimedia Signal Processing (MMSP), Cairns, Australia, 2008. (CORE A)
3. Mahfuzul Haque, Manzur Murshed, and Manoranjan Paul, On Stable Dynamic Background Generation Technique using Gaussian Mixture Models for Robust Object Detection, IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), Santa Fe, New Mexico, USA, 2008. (CORE A)
CORE - COmputing Research and Education Association
Acknowledgments
• http://www.fotosearch.com/DGV464/766029/
• http://www.cyprus-trader.com/images/alert.gif
• http://security.polito.it/~lioy/img/einstein8ci.jpg
• http://www.dtsc.ca.gov/PollutionPrevention/images/question.jpg
• http://www.unmikonline.org/civpol/photos/thematic/violence/streetvio2.jpg
• http://www.airports-worldwide.com/img/uk/heathrow00.jpg
• http://www.highprogrammer.com/alan/gaming/cons/trips/genconindy2003/exhibit-hall-crowd-2.jpg
• http://www.bhopal.org/fcunited/archives/fcu-crowd.jpg
• http://img.dailymail.co.uk/i/pix/2006/08/passaPA_450x300.jpg
• http://www.defenestrator.org/drp/files/surveillance-cameras-400.jpg
• http://www.cityofsound.com/photos/centre_poin/crowd.jpg
• http://www.hindu.com/2007/08/31/images/2007083156401501.jpg
• http://paulaoffutt.com/pics/images/crowd-surfing.jpg
• http://msnbcmedia1.msn.com/j/msnbc/Components/Photos/070225/070225_surveillance_hmed.hmedium.jpg
URLs of the images used in this presentation