VISUAL HUMAN TRACKING AND GROUP ACTIVITY ANALYSIS:
A VIDEO MINING SYSTEM FOR RETAIL MARKETING
PhD Thesis by:
Alex Leykin, Indiana University
Motivation
• Automated tracking and activity recognition is missing from marketing research
• The hardware is already in place
• Visual information can reveal a lot about how humans interact with each other
• Helps in making intelligent marketing decisions
Goals
• Process visual information to obtain a formal representation of human locations (Visual Tracking)
• Extract semantic information from the tracks (Activity Analysis)
Related Work: Detection and Tracking
• Yacoob and Davis, “Learned models for estimation of rigid and articulated human motion from stationary or moving camera”, IJCV 2000
• Zhao and Nevatia “Tracking multiple humans in crowded environment” CVPR 2004
• Haritaoglu, Harwood, and Davis “W-4: Real-time surveillance of people and their activities” PAMI 2000
• J. Deutscher, B. North, B. Bascle and A. Blake “Tracking through singularities and discontinuities by random sampling”, ICCV 1999
• A. Elgammal and L. S. Davis, “Probabilistic Framework for Segmenting People Under Occlusion”, ICCV 2001.
• M. Isard, J. MacCormick, “BraMBLe: a Bayesian multiple-blob tracker”, ICCV 2001
Related Work: Activity Recognition
• Haritaoglu and Flickner “Detection and tracking of shopping groups in stores” CVPR 2001
• Oliver, Rosario, and Pentland “A bayesian computer vision system for modeling human interactions” PAMI 2000
• Buzan, Sclaroff, and Kollios “Extraction and clustering of motion trajectories in video” ICPR 2004
• Hongeng, Nevatia, and Bremond “Video-based event recognition: activity representation and probabilistic recognition methods” CVIU 2004
• Bobick and Ivanov “Action recognition using probabilistic parsing” CVPR 1998
System Components
Low-level Processing
Camera Model
Obstacle Model
Foreground Segmentation
Head Detection
Background Modeling
Each codeword stores a color mean μRGB and brightness bounds Ilow and Ihi; each pixel keeps a codebook (a list of codewords).
Adaptive Background Update
Match pixel p to the codebook b. A codeword matches when:
• I(p) > Ilow and I(p) < Ihigh
• (RGB(p) · μRGB) < TRGB
• t(p)/thigh > Tt1 and t(p)/tlow > Tt2
If there is no match:
– if the codebook is saturated, then the pixel is foreground
– else create a new codeword
Else, update the matching codeword with the new pixel information.
If there is more than one match, merge the matching codewords.
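A per-pixel sketch of this match-and-update loop, assuming a simplified codeword layout (color mean plus brightness bounds; the temporal fields, exact thresholds, and merge rule of the thesis are omitted or simplified here):

```python
import math

def matches(codeword, pixel, t_rgb=0.1):
    """A pixel matches a codeword when its brightness lies within the codeword's
    [I_low, I_hi] bounds and its color lies close to the codeword mean mu_RGB."""
    brightness = sum(pixel) / 3.0
    if not (codeword["i_low"] <= brightness <= codeword["i_hi"]):
        return False
    dot = sum(p * m for p, m in zip(pixel, codeword["mu_rgb"]))
    norm_p = math.sqrt(sum(p * p for p in pixel))
    norm_m = math.sqrt(sum(m * m for m in codeword["mu_rgb"]))
    if norm_p == 0.0 or norm_m == 0.0:
        return False
    cos2 = min(1.0, (dot / (norm_p * norm_m)) ** 2)
    color_dist = norm_p * math.sqrt(1.0 - cos2)   # distance to the codeword color axis
    return color_dist < t_rgb * norm_p

def classify(codebook, pixel, max_codewords=8):
    """One adaptive-update step for a single pixel's codebook."""
    matched = [cw for cw in codebook if matches(cw, pixel)]
    if not matched:
        if len(codebook) >= max_codewords:        # codebook saturated -> foreground
            return "foreground"
        b = sum(pixel) / 3.0                      # else create a new codeword
        codebook.append({"mu_rgb": list(pixel), "i_low": 0.8 * b, "i_hi": 1.2 * b})
        return "background"
    cw = matched[0]                               # update with new pixel information
    cw["mu_rgb"] = [0.9 * m + 0.1 * p for m, p in zip(cw["mu_rgb"], pixel)]
    for extra in matched[1:]:                     # >1 matches -> merge codewords
        codebook.remove(extra)
    return "background"
```

Repeated colors reinforce an existing codeword, while a new color either spawns a codeword or, once the codebook is saturated, is flagged as foreground.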
Background Subtraction
Head Detection: Vanishing Point Projection (VPP) Histogram
Vanishing Point in Z-direction
Camera Setup
• Two camera types: perspective and spherical
• A mixture of indoor and outdoor scenes
• Color and thermal image sensors
• Varying lighting conditions (daylight, cloud cover, incandescent, etc.)
Camera Modeling: Perspective and Spherical Projection
Perspective: recover X, Y, Z from [sx; sy; s] = P·[X; Y; Ż; 1] using SVD, where P is the 3×4 projection matrix. Assumption: floor plane Zf = 0.
Spherical:
X = cos(θ)·tan(π−φ)·(Zc−Ż)
Y = sin(θ)·tan(π−φ)·(Zc−Ż)
Z = Ż
[Figure: perspective and spherical camera geometry; camera at (Xc, Yc, Zc), world axes X, Y, Z; spherical angles given as latitude/longitude]
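The spherical-projection formulas above can be sketched as a small helper; the angle conventions (θ as azimuth, φ such that π−φ is the depression angle) and the function name are assumptions:

```python
import math

def floor_point(theta, phi, z_c, z_dot=0.0):
    """Project a ray at spherical angles (theta, phi) from a camera at height z_c
    onto the horizontal plane Z = z_dot, following X = cos(theta)*tan(pi-phi)*(Zc-Z.)."""
    r = math.tan(math.pi - phi) * (z_c - z_dot)   # horizontal range to the plane hit
    return (math.cos(theta) * r, math.sin(theta) * r, z_dot)
```

For example, a camera 3 m above the floor looking 45° below the horizon hits the floor 3 m out along the viewing azimuth.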
Tracking
Goal: find a correspondence between the bodies already detected in the current frame and the bodies that appear in the next frame.
Apply Markov Chain Monte Carlo (MCMC) to estimate the next state.
[Figure: hidden-state diagram with previous state xt−1, current state xt, and observation zt]
Jump-diffuse transitions:
• Add body
• Delete body
• Recover deleted
• Change size
• Move
Tracking
The location of each pedestrian is estimated probabilistically based on: the current image, the previous state of the system, and physical constraints.
The goal of our tracking system is to find the candidate state x´ (a set of bodies along with their parameters) which, given the last known state x, will best fit the current observation z
P(x′ | z, x) = L(z | x′) · P(x′ | x)
where L(z | x′) is the observation likelihood and P(x′ | x) is the state prior probability.
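A generic Metropolis-style sampler over this posterior might look like the following sketch; the proposal, likelihood, and prior callables stand in for the jump-diffuse moves and the terms L(z|x′) and P(x′|x), and are placeholders rather than the thesis implementation:

```python
import math
import random

def metropolis_hastings(x0, propose, log_likelihood, log_prior, n_iter=1000, seed=0):
    """Sample candidate states x' and accept each with probability
    min(1, [L(z|x')*P(x'|x)] / [L(z|x)*P(x|...)]); track the best state seen."""
    rng = random.Random(seed)
    x = x0
    log_p = log_likelihood(x) + log_prior(x)
    best, best_lp = x, log_p
    for _ in range(n_iter):
        cand = propose(x, rng)                      # e.g. add/delete/move a body
        cand_lp = log_likelihood(cand) + log_prior(cand)
        # Accept if log(u) < log posterior ratio (symmetric proposal assumed)
        if math.log(rng.random() + 1e-300) < cand_lp - log_p:
            x, log_p = cand, cand_lp
            if log_p > best_lp:
                best, best_lp = x, log_p
    return best
```

The sketch assumes a symmetric proposal; the actual jump-diffuse sampler would weight the add/delete moves by their proposal ratios.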
Tracking: Priors
N(hμ, hσ2) and N(wμ,wσ
2) body width and height
U(x)R and U(y)R body coordinates are weighted uniformly within the rectangular region R of the floor map.
d(wt, wt−1) and d(ht, ht−1) variation from the previous size
d(xt, x’t−1) and d(y, y’t−1) variation from Kalman predicted position
N(μdoor, σdoor) distance to the closest door (for new bodies)
Constraints on the body parameters:
Temporal continuity:
Tracking Likelihoods: Distance Weight Plane
Problem: blob trackers ignore blob position in 3D (see Zhao and Nevatia, CVPR 2004).
Solution: employ a “distance weight plane” Dxy = |Pxyz − Cxyz|, where P and C are the world coordinates of the camera and the reference point, respectively.
Tracking Likelihoods: Z-buffer
Each pixel is labeled with body order: 0 = background, 1 = furthermost body, 2 = next closest body, etc.
Let I be the set of all blob pixels and O the set of body pixels. Implementing the z-buffer (Z) together with the distance weight plane (D) allows the multiple-body configuration likelihood to be computed in one computationally efficient step, by accumulating Dxy over the pixels of I and O at which Z = 0.
Tracking Likelihoods: Color Histogram
The color observation likelihood is based on the Bhattacharyya distance between the candidate and observed color histograms:
Pcolor = 1 − wcolor · B(ct, ct−1)
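The Bhattacharyya comparison between two color histograms can be sketched as follows (bin layout and normalization are assumptions):

```python
import math

def bhattacharyya_distance(h1, h2):
    """Bhattacharyya distance between two histograms over the same bins:
    d = sqrt(1 - sum_i sqrt(p1[i] * p2[i])) after normalizing each histogram.
    Identical histograms give 0; non-overlapping ones give 1."""
    s1, s2 = float(sum(h1)), float(sum(h2))
    bc = sum(math.sqrt((a / s1) * (b / s2)) for a, b in zip(h1, h2))
    return math.sqrt(max(0.0, 1.0 - bc))
```

A small distance means the candidate body's colors match the previously observed ones, so the color likelihood is high.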
Tracking: Anisotropic Weighted Mean Shift
[Figure: classic mean shift vs. our mean shift; kernel H tracks the body from frame t−1 to frame t]
Actors and Events
• Shopper groups are formed by individual shoppers who shop together for some amount of time:
– more than a fleeting crossing of paths
– dwelling together
– splitting and uniting after a period of time
Swarming
• Shopper groups are detected based on the “swarming” idea in reverse:
– Swarming is used in graphics to generate flocking behaviour in animations.
– Rules define the flocking behaviour:
• Avoid collisions with neighbors.
• Maintain a fixed distance from neighbors.
• Coordinate the velocity vector with neighbors.
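A minimal boids-style update implementing these three rules might look like this sketch; all gains, radii, and the time step are made-up illustration values:

```python
def swarm_step(positions, velocities, neighbor_radius=2.0, sep=0.5, dt=0.1):
    """One flocking update in 2-D: each agent (i) repels from too-close neighbors,
    (ii) coheres toward a preferred separation, (iii) aligns velocity with neighbors."""
    new_v = []
    for i, (p, v) in enumerate(zip(positions, velocities)):
        steer = [0.0, 0.0]   # combined separation/cohesion force
        align = [0.0, 0.0]   # summed neighbor velocities
        n = 0
        for j, (q, u) in enumerate(zip(positions, velocities)):
            if i == j:
                continue
            dx, dy = q[0] - p[0], q[1] - p[1]
            dist = (dx * dx + dy * dy) ** 0.5
            if 0.0 < dist < neighbor_radius:
                n += 1
                gain = (dist - sep) / dist   # negative (repel) when closer than sep
                steer[0] += gain * dx
                steer[1] += gain * dy
                align[0] += u[0]
                align[1] += u[1]
        if n:
            v = (0.8 * v[0] + 0.1 * steer[0] + 0.1 * align[0] / n,
                 0.8 * v[1] + 0.1 * steer[1] + 0.1 * align[1] / n)
        new_v.append(v)
    new_p = [(p[0] + v[0] * dt, p[1] + v[1] * dt) for p, v in zip(positions, new_v)]
    return new_p, new_v
```

Running the rules in reverse, as the slides describe, means asking which observed tracks are consistent with agents following such rules, rather than generating motion from them.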
Tracking Customer Groups
• We treat customers as swarming agents, acting according to simple rules (e.g. stay together with swarm members)
Customer groups
Terminology
• Actors: shoppers (bodies detected in tracking): (x, y, id)
• Swarming events are defined as short-time activity sequences of multiple agents interacting with each other.
– They could be fleeting (crossing paths).
– Later analysis sorts this out and ignores chance encounters.
Swarming
• The actors that best fit this model signal a Swarming Event
• Multiple swarming events are further clustered with fuzzy weights to find out shoppers in the same group over long periods.
• Two actors come sufficiently close according to some distance measure, based on:
– the relative position pi = (xi, yi) of actor i on the floor
– body orientations αi
– the dwelling state δi = {T, F}
Event detection
The distance between two agents is a linear combination of co-location, co-ordination, and co-dwelling:
d(bi, bj) = w1·|pi, pj| + w2·|αi, αj| + w3·|δi, δj|
Event detection
Perform agglomerative clustering of actors a into clusters C:
• Initialize: N singleton clusters.
• Do: merge the two closest clusters.
• While not: the validity index I reaches its maximum.
The validity index I combines an isolation term and a compactness term.
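The clustering loop above can be sketched as follows; the validity index is supplied by the caller, since the exact isolation and compactness terms are not spelled out in the transcript, and the 1-D single-linkage distance is an illustration choice:

```python
def agglomerate(points, validity):
    """Agglomerative clustering sketch over 1-D points: start from singletons,
    repeatedly merge the two closest clusters (single linkage), and return the
    clustering with the best validity index seen along the way."""
    clusters = [[p] for p in points]              # initialize: N singleton clusters
    best, best_score = [list(c) for c in clusters], validity(clusters)
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest inter-point distance
        pair, d_min = (0, 1), float("inf")
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if d < d_min:
                    pair, d_min = (i, j), d
        i, j = pair
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
        score = validity(clusters)                # stop criterion: index maximum
        if score > best_score:
            best, best_score = [list(c) for c in clusters], score
    return best
```

Any index that rewards well-separated (isolated) and tight (compact) clusters can be plugged in as `validity`.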
Event detection
Final events
Activity Detection
• Shopper group detection is accomplished by clustering the short-term events over long time periods.
– The events may be separated in time, but they belong to the same shopper group if the actors are the same (the first term of the similarity measure).
Activity detection
• Higher-level activities (shopper groups) are detected using these events as building blocks over longer time periods.
• Some definitions:
– Bei = {b ∈ ei}: the set of all bodies taking part in an event ei.
– τei and τej: the average times at which events ei and ej happen.
Activity Detection
Define a measure of similarity D(ei, ej) between two events, combining two terms:
• the overlap between the two sets of actors, |Bei ∩ Bej| / |Bei ∪ Bej|
• the separation in time, |τei − τej|²
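A sketch of an event dissimilarity built from these two terms; the exact weighting in the slide is not recoverable from the transcript, so a simple additive combination is assumed here:

```python
def event_distance(bodies_i, times_i, bodies_j, times_j):
    """Dissimilarity between two events: small when the same actors take part
    (large Jaccard overlap of body-ID sets) and the events are close in time."""
    bi, bj = set(bodies_i), set(bodies_j)
    overlap = len(bi & bj) / len(bi | bj)       # actor-set overlap in [0, 1]
    tau_i = sum(times_i) / len(times_i)         # average event times
    tau_j = sum(times_j) / len(times_j)
    gap = (tau_i - tau_j) ** 2                  # separation in time
    return gap + (1.0 - overlap)
```

Two events sharing all actors at the same average time get distance 0, so clustering on this distance groups recurring encounters of the same shoppers.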
Activity Detection
• Perform fuzzy agglomerative clustering.
• Minimize an objective function, where wij are fuzzy weights, ρ(·) is the loss function from robust statistics, and ψ(·) is the weight function; asymmetric variants of Tukey’s biweight estimators are used.
• Adaptively choose only strong fuzzy clusters.
• Label the remaining clusters as activities.
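Tukey's biweight loss ρ and its weight function ψ can be sketched as below; this is the symmetric textbook form with the conventional cutoff, whereas the thesis uses asymmetric variants:

```python
def tukey_rho(r, c=4.685):
    """Tukey's biweight loss: roughly quadratic near zero, constant beyond c,
    so outlying residuals contribute a bounded penalty."""
    if abs(r) >= c:
        return c * c / 6.0
    t = 1.0 - (r / c) ** 2
    return c * c / 6.0 * (1.0 - t ** 3)

def tukey_psi(r, c=4.685):
    """Influence (weight) function, the derivative of rho: zero for |r| >= c,
    so gross outliers have no pull on the fuzzy cluster centers."""
    if abs(r) >= c:
        return 0.0
    return r * (1.0 - (r / c) ** 2) ** 2
```

The bounded loss is what makes the clustering robust: an event that fits no cluster well simply saturates ρ instead of dragging a cluster toward it.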
Results: Swarming Activities Detected in Space-Time
• Dot location: average event location
• Dot size: validity
• Dots of the same color belong to the same activity
Group Detection Results
Quantitative Results: Tracking

Sequence   Frames   People   People missed   False hits   Identity switches
1          1054     15       3               1            3
2          0601     8        0               0            0
3          1700     16       5               1            2
4          1506     3        0               0            0
5          2031     2        0               0            0
6          1652     4        0               0            0
Total (%)  8544     48       12.5            4.1          10.4
Group Detection

Sequence   Groups   P+    P−     Partial
1          20       0     7      0
2          17       1     3      1
3          17       0     7      0
Total      54       1     12     2
Percent    100      1.8   22.2   3.7

Groups: ground truth (manually determined); P+: false positives; P−: false negatives (groups missed); Partial: partially identified groups (≥2 people in the group correctly identified).
Qualitative Assessments
• Longer paths provide better group detection (p-value << 1).
• Two-person groups are the easiest to detect.
• Simple one-step clustering of trajectories is not sufficient for long-term group detection.
• Employee tracks pose a significant problem and have to be excluded.
• Several groups were missed by the operator in the initial ground truth: the system caught groups missed by the human expert after inspection of the results.
Contributions
– Background subtraction based on a codebook (RGB + thermal)
– Introduced a head-candidate selection method based on the VPP histogram
– Resolved track-initialization ambiguity and non-unique body-blob correspondence
– Informed jump-diffuse transitions in the MCMC tracker
– Weight plane and z-buffer improve likelihood estimation
– Anisotropic mean shift with an obstacle model
– Two-layer formal framework for high-level activity detection
– Implemented robust fuzzy clustering to group events into activities
Future Work
• Improved tracking (via feature points)
• Demographic analysis
• Focus of attention
• Sensor fusion
• Other types of swarming activities
Questions?
Thank you!