Human Activity Recognition from Video
TRANSCRIPT
-
8/3/2019 Human Activity Recognition From Video
1/24
Guide: Prof. Vijay Bhosale
Presented by: Ms. Rajashri S., Ms. Bhagyashri S., Ms. Dipashri S.
HUMAN ACTIVITY RECOGNITION from VIDEO
-
Human Activity Recognition
Focus is on three fundamental issues:
1. Design of a classifier & data modeling for activity recognition
2. How to perform feature selection
3. How to define the structure of the classifier
-
Introduction
Human movement at different levels:
Analysis of the movement of body parts
Single person activities
Over increasing temporal windows: large-scale interaction
Common tasks in human motion analysis:
Person detection & tracking
Activity classification
Behavior interpretation & person identification
-
Human action interpretation
Three approaches:
1. Generic model recovery: try to fit a 3D model to the person's pose
2. Appearance-based model: based on extraction of a 2D shape model directly from the image
3. Motion-based model: relies on characteristics of people's motion
-
Bobick and Davis [1] used Motion Energy and Motion History Images (MEI and MHI) to classify aerobic-type exercises.
Efros et al. [3] compute optical flow measurements in a spatio-temporal volume to recognize human activities in a nearest-neighbor framework.
The CAVIAR [13] sequences are used in [7] to recognize a set of activities, scenarios and roles. The approach generates a list of features and automatically chooses the smallest set that accurately identifies the desired class.
For the design of the classifier, we use a Bayesian classifier.
The likelihood functions are modeled as Gaussian mixtures.
-
Low level activities & features
The activities can be detected from relatively short video sequences and are described below:

id  #frames  Activity  Description
1   3,211    Inactive  A static person/object
2   1,974    Active    Person making movements but without translating in the image
3   9,831    Walking   There are movements & overall image translation
4   297      Running   As in walking but with larger translation
5   594      Fighting  Large quantities of movement with little translation
-
Features
There are two large sets of features, each organized in several subgroups.
1. A subset of features codes the instantaneous position & velocity of the tracked subject.
Organized in 3 groups:
i) instantaneous measurements
ii) average speed/velocity-based features
iii) 2nd-order moments/energy-related indicators
-
2. A subset based on estimates of the optic flow, or instantaneous pixel motion, inside the bounding box.
Organized in 4 subgroups:
i) instantaneous measurements
ii) spatial 2nd-order moments
iii) temporally averaged quantities
iv) temporal 2nd-order moments/energy-related indicators
-
Feature Selection & Recognition
1. The recognition strategy:
Given a set of activities Aj, j = 1, ..., n, the posterior probability of a certain activity taking place can be computed using Bayes' rule:

P(Aj | F(t)) = p(F(t) | Aj) P(Aj) / p(F(t))

where
P(Aj | F(t)) is the posterior probability of activity Aj given the features F(t),
p(F(t) | Aj) is the likelihood of the features given activity Aj,
P(Aj) is the prior probability of the same activity, and
p(F(t)) is the probability of observing F(t), irrespective of the underlying activity.
-
To build the Bayesian classifier, estimate the likelihood function of the features, given each class.
The likelihood function is approximated by a mixture of Gaussians:

p(F(t) | Ak) ≈ Σj αj N(μj, Σj)

where
N(μj, Σj) denotes a normal distribution, and
αj represents the weight of that Gaussian in the mixture for each listed activity.
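Under this model, classification reduces to picking the activity with the highest posterior. A minimal sketch in Python, where the 2-D features, mixture weights, means, and covariances are all illustrative placeholders (in the real system these would come from training):

```python
import numpy as np
from scipy.stats import multivariate_normal

# Class-conditional likelihood p(F | Ak) as a Gaussian mixture.
def gmm_likelihood(f, weights, means, covs):
    return sum(w * multivariate_normal.pdf(f, mean=m, cov=c)
               for w, m, c in zip(weights, means, covs))

# Two hypothetical activities with 2-D features; parameters are made up.
classes = {
    "walking": dict(weights=[0.6, 0.4],
                    means=[np.array([1.0, 1.0]), np.array([2.0, 1.5])],
                    covs=[np.eye(2), np.eye(2)]),
    "running": dict(weights=[1.0],
                    means=[np.array([4.0, 3.0])],
                    covs=[np.eye(2)]),
}
priors = {"walking": 0.5, "running": 0.5}

def classify(f):
    # Bayes rule: the posterior is proportional to likelihood x prior;
    # the evidence p(F(t)) is common to all classes and can be dropped.
    scores = {a: gmm_likelihood(f, **p) * priors[a] for a, p in classes.items()}
    return max(scores, key=scores.get)
```

A feature vector near the "walking" mixture means is assigned to walking; one near the "running" mean goes to running.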
-
2. Selecting promising features:
Three approaches:
1. Brute search
2. Lite search
3. Lite-lite search

The following table summarizes the cost of these different methods, for M = 29:

Nf  Brute search: C(M,Nf) = M!/(Nf!(M-Nf)!)  Lite search: M+(M-1)+... (Nf terms)  Lite-lite: M+1
1   29                                        29                                   29
2   406                                       57                                   30
3   3654                                      84                                   30
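The cost columns can be reproduced directly. A short sketch, where the lite-lite reading (rank all M features once, then one extra joint evaluation) is an assumption inferred from the table values:

```python
from math import comb

M = 29  # total number of candidate features

def brute(nf):
    # Evaluate every Nf-element subset of the M features.
    return comb(M, nf)

def lite(nf):
    # Greedy search: M trials, then M-1, ... (Nf terms in total).
    return sum(M - k for k in range(nf))

def lite_lite(nf):
    # Assumed reading: rank features once (M trials), then one joint test.
    return M if nf == 1 else M + 1

for nf in (1, 2, 3):
    print(nf, brute(nf), lite(nf), lite_lite(nf))
```

The printed rows match the table: 29/29/29, 406/57/30, and 3654/84/30.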
-
The Relief algorithm creates a weight vector over all features to quantify their quality.
This vector is updated according to:

w_i = w_i + (x_i - nearmiss(x)_i)^2 - (x_i - nearhit(x)_i)^2

where
w_i represents the weight of the ith feature,
x_i is the ith feature of data point x, and
nearhit(x) & nearmiss(x) denote the nearest point to x from the same & a different class, respectively.
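The update above can be sketched directly. A minimal single-pass Relief, assuming Euclidean distance and a toy dataset (both choices are illustrative, not from the slides):

```python
import numpy as np

def relief_weights(X, y):
    # One pass of w_i += (x_i - nearmiss(x)_i)^2 - (x_i - nearhit(x)_i)^2
    n, d = X.shape
    w = np.zeros(d)
    for i in range(n):
        dists = np.linalg.norm(X - X[i], axis=1)
        dists[i] = np.inf                     # never pick the point itself
        same = y == y[i]
        hit = X[np.argmin(np.where(same, dists, np.inf))]    # same class
        miss = X[np.argmin(np.where(~same, dists, np.inf))]  # other class
        w += (X[i] - miss) ** 2 - (X[i] - hit) ** 2
    return w

# Toy data: feature 0 separates the classes, feature 1 is noise.
X = np.array([[0.0, 5.0], [0.1, 1.0], [1.0, 5.1], [1.1, 1.1]])
y = np.array([0, 0, 1, 1])
w = relief_weights(X, y)
```

On this toy data the discriminative feature ends up with a positive weight and the noise feature with a negative one, which is exactly how Relief ranks feature quality.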
-
The following table shows the results obtained using these different feature search criteria, for 1, 2, or 3 features:

     Brute search       Lite search        Lite-lite search   Relief
Nf   Feat.     R. rate  Feat.    R. rate   Feat.    R. rate   Feat.     R. rate
1    7         83.9%    7        83.9%     7        83.9%     14        46.8%
2    9 18      93.5%    7 25     89.8%     7 18     89.6%     14 18     59.2%
3    3 9 20    94.0%    7 19 25  92.1%     7 18 23  86.7%     14 18 23  57.1%
-
Classifier Structure

Group activities in subsets & perform classification in a hierarchical manner.
The figure shows a binary hierarchical classifier:
Classifier 1 (95.5%): {Inactive, Active} vs. {Walking, Running, Fighting}
Classifier 2 (99.3%): Inactive vs. Active
Classifier 3 (98.8%): Fighting vs. {Walking, Running}
Classifier 4 (100%): Walking vs. Running
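The tree can be sketched as nested binary decisions. The node tests below use hypothetical thresholds on made-up "movement" and "translation" features, standing in for the trained classifiers 1-4 of the figure:

```python
# Hierarchical classification sketch; thresholds and feature names are
# illustrative assumptions, not the trained classifiers from the slides.
def hierarchical_classify(movement, translation):
    if movement < 0.2:                               # classifier 1: static vs. moving
        return "inactive" if movement < 0.05 else "active"    # classifier 2
    if movement > 0.8 and translation < 0.2:         # classifier 3: fighting split
        return "fighting"
    return "running" if translation > 0.5 else "walking"      # classifier 4
```

One design note: resolving the easy coarse distinction first (static vs. moving) lets each node solve a simpler two-way problem, which is consistent with the high per-node rates in the figure.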
-
Human Activity Analysis

Categories:
1. Gestures: elementary movements of a person's body parts; the atomic components describing the meaningful motion of a person.
E.g.: stretching an arm, raising a leg
2. Actions: single-person activities that may be composed of multiple gestures organized temporally, such as walking, waving and punching.
3. Interactions: involve 2 or more persons or objects.
E.g.: 2 persons fighting
4. Group activities: performed by a conceptual group composed of multiple persons or objects.
E.g.: a group having a meeting, 2 groups fighting
-
Human Activity Recognition methodologies
-
1. Single-layered approach
Represent and recognize human activities directly based on sequences of images.
Analyze sequential movements of humans such as walking, jumping and waving.
Categorized into 2 classes:
Space-time approach
i) Space-time volume
ii) Space-time trajectories
iii) Space-time features
Sequential approach
i) Exemplar-based approach
ii) State model-based approach
-
Space-time approach
Approaches that recognize human activities by analyzing space-time volumes of activity videos.
The video volumes are constructed by concatenating image frames along a time axis, and are compared to measure their similarities.
Figure 4 shows example 3-D XYT volumes corresponding to a human action of 'punching'.
-
Various recognition algorithms using space-time representations:
Template matching, which constructs a representative model (i.e. a volume) per action using training data.
Neighbor-based matching, in which the system maintains a set of sample volumes (or trajectories) to describe an activity.
Statistical modeling algorithms, which match videos by explicitly modeling a probability distribution of an activity.
-
1. Action recognition with space-time volumes:
The core of the recognition is the similarity measurement between two volumes.
Bobick & Davis constructed a real-time action recognition system using template matching. It represents each action with a template composed of two 2-dimensional images: a binary motion-energy image (MEI) and a scalar-valued motion-history image (MHI).
These images are constructed from a sequence of foreground images, which essentially are weighted 2-D (XY) projections of the original 3-D XYT space-time volume.
Shechtman and Irani estimated motion flows from a 3-D space-time volume to recognize human actions.
Rodriguez analyzed 3-D space-time volumes by synthesizing filters: the maximum average correlation height (MACH) filters used for the analysis of images (e.g. object recognition) were adopted to solve the action recognition problem.
Disadvantage: difficulty in recognizing actions when multiple persons are present in the scene.
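The MEI/MHI templates are easy to sketch from frame differencing. A minimal version, assuming grayscale frames and a simple absolute-difference threshold in place of proper foreground extraction:

```python
import numpy as np

def mei_mhi(frames, tau=10, thresh=30):
    # MEI: binary union of everywhere motion occurred.
    # MHI: recent motion gets value tau; older motion decays by 1 per frame.
    mei = np.zeros(frames[0].shape, dtype=bool)
    mhi = np.zeros(frames[0].shape, dtype=float)
    for prev, cur in zip(frames, frames[1:]):
        moved = np.abs(cur.astype(int) - prev.astype(int)) > thresh
        mhi = np.where(moved, tau, np.maximum(mhi - 1, 0))
        mei |= moved
    return mei, mhi

# Tiny synthetic clip: a bright pixel moves one column per frame.
frames = [np.zeros((4, 4), dtype=np.uint8) for _ in range(3)]
for t, f in enumerate(frames):
    f[1, t] = 255
mei, mhi = mei_mhi(frames, tau=5)
```

After the clip, the MEI marks every pixel that ever moved, while the MHI is brightest where motion happened most recently, encoding how (not just where) the motion occurred.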
-
2. Action recognition with space-time trajectories:
Interpret an activity as a set of space-time trajectories.
A person is generally represented as a set of 2-dimensional (XY) or 3-dimensional (XYZ) points corresponding to his/her joint positions.
Advantage: ability to analyze detailed levels of human movements.
-
3. Action recognition using space-time local features:
Approaches using local features extracted from 3-dimensional space-time volumes to represent and recognize activities.
Focus is on three aspects:
i) what 3-D local features the approaches extract,
ii) how they represent an activity in terms of the extracted features, and
iii) what methodology they use to classify activities.
Advantages: by its nature, background subtraction or other low-level components are generally not required, and the local features are scale, rotation, and translation invariant in most cases.
Suitable for recognizing simple periodic actions such as 'walking' and 'waving'.
-
Sequential approaches
Recognize human activities by analyzing sequences of features.
Two categories:
1. Exemplar-based recognition approaches
2. State model-based recognition approaches
Exemplar-based sequential approaches describe classes of human actions using training samples directly.
State model-based sequential approaches represent a human action by constructing a model which is trained to generate sequences of feature vectors corresponding to the activity.
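Exemplar-based matching needs a distance between feature sequences of different lengths; dynamic time warping is one common choice, used here purely for illustration (the surveyed systems may use other measures). A minimal sketch with made-up 1-D feature tracks:

```python
import numpy as np

def dtw(a, b):
    # Dynamic time warping distance between two 1-D feature sequences.
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# Exemplar-based recognition: label a query with its nearest exemplar.
exemplars = {"walking": [0, 1, 0, 1, 0], "waving": [0, 2, 0, 2, 0]}
query = [0, 1, 1, 0, 1, 0]
label = min(exemplars, key=lambda k: dtw(query, exemplars[k]))
```

Because DTW allows time-axis stretching, the six-frame query still aligns perfectly with the five-frame walking exemplar, which is why exemplar methods tolerate variations in execution speed.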
-
*** THANK YOU ***