![Page 1: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/1.jpg)
CS 1699: Intro to Computer Vision
Human Pose and Actions
Prof. Adriana Kovashka
University of Pittsburgh
December 8, 2015
![Page 2: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/2.jpg)
Today
• Human pose and actions: Introduction
• Estimating human pose
• Recognizing human actions
– Using specialized features
– Using pose
– Using objects
– From ego-centric video
![Page 3: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/3.jpg)
Next time (Last class)
• Review for the final exam + OMETs
• By Wednesday night, post on Piazza questions or anything you want me to review, for participation credit
• Extra office hours on Friday, 2-3pm
![Page 4: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/4.jpg)
Final Exam
• Monday, Dec. 14, 12pm
• Same room (5502 Sennott Square)
• Similar to midterm exam (mostly short questions and a few problems), but longer (100 points)
• Will only cover topics discussed after midterm (but some of these use topics from first half)
![Page 5: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/5.jpg)
Homework 4
Mean = 77.16, median = 99, max = 123
![Page 6: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/6.jpg)
Homework 5
• Due Thursday
• See Piazza for correction about how to get probabilities for Part III
![Page 7: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/7.jpg)
Participation
• Tentative grades entered on CourseWeb
• Median is 80%
![Page 8: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/8.jpg)
What is an action/activity?
Action: a transition from one state to another
• Who is the actor?
• How is the state of the actor changing?
• What (if anything) is being acted on?
• How is that thing changing?
• What is the purpose of the action (if any)?
• Could be more or less complex
Adapted from Derek Hoiem
![Page 9: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/9.jpg)
Terminology: Human activity in video
No universal terminology, but approximately:
• “Actions”: atomic motion patterns – often gesture-like, single clear-cut trajectory, single nameable behavior (e.g., sit, wave arms)
• “Activity”: series or composition of actions (e.g., interactions between people)
• “Event”: combination of activities or actions (e.g., a football game, a traffic accident)
Adapted from Venu Govindaraju
![Page 10: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/10.jpg)
How do we represent actions?
Categories: Walking, hammering, dancing, skiing, sitting down, standing up, jumping
Poses
Nouns and Predicates: <man, swings, hammer>, <man, hits, nail, w/ hammer>
Derek Hoiem
![Page 11: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/11.jpg)
How can we identify actions?
Motion Pose
Held Objects
Nearby Objects
Derek Hoiem
![Page 12: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/12.jpg)
Today
• Human pose and actions: Introduction
• Estimating human pose
• Recognizing human actions
– Using specialized features
– Using pose
– Using objects
– From ego-centric video
![Page 13: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/13.jpg)
Jamie Shotton, Andrew Fitzgibbon, Mat Cook, Toby Sharp, Mark Finocchio, Richard Moore, Alex Kipman, Andrew Blake
Best paper award at CVPR 2011
Adapted from Jamie Shotton
![Page 14: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/14.jpg)
Recognize large variety of human poses, all shapes & sizes
Limited compute budget
super-real time on Xbox 360 to allow games to run concurrently
Adapted from Jamie Shotton
![Page 15: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/15.jpg)
Body part labels: right elbow, right hand, left shoulder, neck
Jamie Shotton
![Page 16: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/16.jpg)
No temporal information
frame-by-frame
Local pose estimate of parts
each pixel & each body joint treated independently
reduced training data and computation time
Very fast
simple depth image features
parallel decision forest classifier
Jamie Shotton
![Page 17: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/17.jpg)
capture depth image & remove bg → infer body parts per pixel → cluster pixels to hypothesize body joint positions → fit model & track skeleton
Jamie Shotton
![Page 18: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/18.jpg)
Compute P(ci|wi)
pixels i = (x, y)
body part ci
image window wi
Discriminative approach
learn classifier P(ci|wi) from training data
Jamie Shotton
![Page 19: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/19.jpg)
Train invariance to:
Record mocap: 500k frames distilled to 100k poses
Retarget to several models
Render (depth, body parts) pairs
Jamie Shotton
![Page 20: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/20.jpg)
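Depth comparisons
Very fast to compute
$f_\Theta(I, \mathbf{x}) = d_I(\mathbf{x}) - d_I(\mathbf{x} + \Delta)$
• $f_\Theta(I, \mathbf{x})$: feature response
• $I$: input depth image
• $\mathbf{x}$: image coordinate
• $\Delta$: offset
• $d_I(\mathbf{x})$: image depth
• $d_I(\mathbf{x} + \Delta)$: offset depth
Adapted from Jamie Shotton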
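A minimal sketch of the depth comparison feature, assuming the depth image is a list of rows, pixels are addressed as (column, row), and the offset Δ is a fixed pixel offset. The names `depth_at`, `depth_feature`, and `FAR_DEPTH` are hypothetical, and treating probes outside the image as very far away is an assumption of this sketch, not something stated on the slide. Any depth-dependent scaling of the offset is left out.

```python
from typing import List, Tuple

# Assumption: a probe that falls outside the image reads as a very large depth.
FAR_DEPTH = 1e6


def depth_at(depth: List[List[float]], x: Tuple[int, int]) -> float:
    """Return the depth at pixel x = (col, row), or FAR_DEPTH outside the image."""
    col, row = x
    if 0 <= row < len(depth) and 0 <= col < len(depth[0]):
        return depth[row][col]
    return FAR_DEPTH


def depth_feature(depth: List[List[float]],
                  x: Tuple[int, int],
                  delta: Tuple[int, int]) -> float:
    """Feature response f(I, x) = d_I(x) - d_I(x + delta)."""
    probe = (x[0] + delta[0], x[1] + delta[1])
    return depth_at(depth, x) - depth_at(depth, probe)


# Toy 3x3 depth image: a near surface (depth 2) in front of a far wall (depth 4).
depth_image = [
    [4.0, 4.0, 4.0],
    [4.0, 2.0, 2.0],
    [4.0, 2.0, 2.0],
]

# Probe one pixel to the left of the centre pixel: near surface minus far wall.
response = depth_feature(depth_image, (1, 1), (-1, 0))
```

Each response costs two array reads and one subtraction, which is consistent with the "very fast to compute" claim.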
![Page 21: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/21.jpg)
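Toy example: distinguish left (L) and right (R) sides of the body
To classify pixel x, start at the root of the tree
• Split node tests: $f_\Theta(I, \mathbf{x}; \Delta_1) > t_1$ and $f_\Theta(I, \mathbf{x}; \Delta_2) > t_2$
• Each test has a "yes" branch and a "no" branch
• Each of the three leaves stores a distribution P(c) over L and R
Adapted from Jamie Shotton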
![Page 22: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/22.jpg)
Figure: results for tree depths 1 through 18. Panels: input depth, ground truth parts, inferred parts (soft)
Jamie Shotton
![Page 23: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/23.jpg)
Figure: average per-class accuracy (axis 30% to 65%) vs. depth of trees. Panels: synthetic test data, real test data
Jamie Shotton
![Page 24: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/24.jpg)
Trained on different random subset of images
“bagging” helps avoid over-fitting
Average tree posteriors
[Amit & Geman 97][Breiman 01]
[Geurts et al. 06]
Trees 1, …, T each map (I, x) to a posterior: P_1(c), …, P_T(c)
$P(c \mid I, \mathbf{x}) = \frac{1}{T} \sum_{t=1}^{T} P_t(c \mid I, \mathbf{x})$
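A small sketch of averaging tree posteriors, assuming each tree has already returned a distribution over the same class labels for one pixel. The function name `average_posteriors` and the example numbers are hypothetical.

```python
from typing import Dict, List


def average_posteriors(tree_posteriors: List[Dict[str, float]]) -> Dict[str, float]:
    """P(c | I, x) = (1 / T) * sum over t of P_t(c | I, x)."""
    num_trees = len(tree_posteriors)
    classes = tree_posteriors[0].keys()
    return {c: sum(p[c] for p in tree_posteriors) / num_trees for c in classes}


# Made-up leaf distributions reached by one pixel in a forest of T = 3 trees.
per_tree = [
    {"L": 0.9, "R": 0.1},
    {"L": 0.6, "R": 0.4},
    {"L": 0.75, "R": 0.25},
]
forest_posterior = average_posteriors(per_tree)
```

Because every tree outputs a distribution that sums to one, the average is also a valid distribution.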
Jamie Shotton
![Page 25: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/25.jpg)
Figure: inferred body parts (most likely) for 1 tree, 3 trees, and 6 trees, shown next to ground truth
Plot: average per-class accuracy (axis 40% to 55%) vs. number of trees (1 to 6)
Jamie Shotton
![Page 26: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/26.jpg)
front view, top view, side view
input depth inferred body parts
inferred joint positions (modes found using mean shift)
no tracking or smoothing
Jamie Shotton
![Page 27: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/27.jpg)
front view, top view, side view
input depth inferred body parts
no tracking or smoothing
inferred joint positions (modes found using mean shift)
Jamie Shotton
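Finding a mode with mean shift can be sketched as below. This is a generic weighted mean shift with a Gaussian kernel, not necessarily the exact variant used in the system described here. The function name `mean_shift_mode`, the bandwidth, and the idea that the weights stand for per-pixel body part probabilities are assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple


def mean_shift_mode(points: List[Tuple[float, ...]],
                    weights: Sequence[float],
                    start: Tuple[float, ...],
                    bandwidth: float = 1.0,
                    max_iters: int = 50,
                    tol: float = 1e-6) -> Tuple[float, ...]:
    """Repeatedly move `start` to the kernel-weighted mean of the points."""
    mode = tuple(start)
    for _ in range(max_iters):
        numerator = [0.0] * len(mode)
        denominator = 0.0
        for point, weight in zip(points, weights):
            dist_sq = sum((a - b) ** 2 for a, b in zip(point, mode))
            k = weight * math.exp(-dist_sq / (2.0 * bandwidth ** 2))
            denominator += k
            numerator = [n + k * a for n, a in zip(numerator, point)]
        new_mode = tuple(n / denominator for n in numerator)
        shift = math.sqrt(sum((a - b) ** 2 for a, b in zip(new_mode, mode)))
        mode = new_mode
        if shift < tol:
            break
    return mode


# Pixels voting for one joint: a tight cluster near (1, 1) plus a far outlier.
votes = [(0.9, 1.0), (1.1, 1.0), (1.0, 0.9), (1.0, 1.1), (10.0, 10.0)]
vote_weights = [1.0, 1.0, 1.0, 1.0, 1.0]
joint_estimate = mean_shift_mode(votes, vote_weights, start=(0.5, 0.5))
```

Unlike a plain average, the mode is barely affected by the outlier, which is one reason to prefer mode finding when some pixels are misclassified.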
![Page 28: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/28.jpg)
Today
• Human pose and actions: Introduction
• Estimating human pose
• Recognizing human actions
– Using specialized features
– Using pose
– Using objects
– From ego-centric video
![Page 29: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/29.jpg)
Representing Actions
Tracked Points
Matikainen et al. 2009
Adapted from Derek Hoiem
![Page 30: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/30.jpg)
Representing Actions
Space-Time Interest Points
Laptev 2005
• Corner detectors in space+time
Adapted from Derek Hoiem
![Page 32: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/32.jpg)
“Talk on phone”
“Get out of car”
Derek Hoiem
Learning realistic human actions from movies, Laptev et al. 2008
![Page 33: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/33.jpg)
Approach
• Space-time interest point detectors
• Descriptors
– HOG, HOF
• Pyramid histograms (3x3x2)
• SVMs with Chi-Squared Kernel
Interest Points
Spatio-Temporal Binning
Derek Hoiem, figures from Ivan Laptev
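As an illustration of the last step, a chi-squared kernel between two histograms might look like the sketch below. The exponential form with a `gamma` scale is one common variant and is an assumption here; the slide does not give the exact formula, and the function names are hypothetical.

```python
import math
from typing import Sequence


def chi2_distance(h1: Sequence[float], h2: Sequence[float]) -> float:
    """0.5 * sum over bins of (a - b)^2 / (a + b), skipping empty bins."""
    return 0.5 * sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)


def chi2_kernel(h1: Sequence[float], h2: Sequence[float], gamma: float = 1.0) -> float:
    """Similarity in (0, 1]: exp(-gamma * chi-squared distance)."""
    return math.exp(-gamma * chi2_distance(h1, h2))


# Made-up normalized histograms of quantized HOG/HOF descriptors for two clips.
clip_a = [0.5, 0.5, 0.0]
clip_b = [0.5, 0.5, 0.0]
clip_c = [0.0, 0.0, 1.0]

same = chi2_kernel(clip_a, clip_b)       # identical histograms
different = chi2_kernel(clip_a, clip_c)  # no overlapping bins
```

An SVM that accepts a precomputed kernel matrix can then be trained on the pairwise kernel values between clips.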
![Page 34: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/34.jpg)
Results
Derek Hoiem, figures from Ivan Laptev
![Page 35: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/35.jpg)
Today
• Human pose and actions: Introduction
• Estimating human pose
• Recognizing human actions
– Using specialized features
– Using pose
– Using objects
– From ego-centric video
![Page 36: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/36.jpg)
Human-Object Interaction
Torso
Head
• Human pose estimation
Holistic image based classification
Integrated reasoning
Yao/Fei-Fei
![Page 37: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/37.jpg)
Human-Object Interaction
Tennis racket
• Human pose estimation
Holistic image based classification
Integrated reasoning
• Object detection
Yao/Fei-Fei
![Page 38: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/38.jpg)
Human-Object Interaction
• Human pose estimation
Holistic image based classification
Integrated reasoning
• Object detection
Torso
Head
Tennis racket
Activity: Tennis Forehand
• Action categorization
Yao/Fei-Fei
![Page 39: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/39.jpg)
Human pose estimation & Object detection
Human pose estimation is challenging:
Difficult part appearance
Self-occlusion
Image region looks like a body part
• Felzenszwalb & Huttenlocher, 2005
• Ren et al, 2005
• Ramanan, 2006
• Ferrari et al, 2008
• Yang & Mori, 2008
• Andriluka et al, 2009
• Eichner & Ferrari, 2009
Yao/Fei-Fei
![Page 40: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/40.jpg)
Human pose estimation & Object detection
Human pose estimation is challenging.
• Felzenszwalb & Huttenlocher, 2005
• Ren et al, 2005
• Ramanan, 2006
• Ferrari et al, 2008
• Yang & Mori, 2008
• Andriluka et al, 2009
• Eichner & Ferrari, 2009
Yao/Fei-Fei
![Page 41: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/41.jpg)
Human pose estimation & Object detection
Facilitate, given the object is detected.
Yao/Fei-Fei
![Page 42: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/42.jpg)
Human pose estimation & Object detection
Object detection is challenging:
Small, low-resolution, partially occluded
Image region similar to detection target
• Viola & Jones, 2001
• Lampert et al, 2008
• Divvala et al, 2009
• Vedaldi et al, 2009
Yao/Fei-Fei
![Page 43: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/43.jpg)
Human pose estimation & Object detection
Object detection is challenging
• Viola & Jones, 2001
• Lampert et al, 2008
• Divvala et al, 2009
• Vedaldi et al, 2009
Yao/Fei-Fei
![Page 44: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/44.jpg)
Human pose estimation & Object detection
Facilitate, given the pose is estimated.
Yao/Fei-Fei
![Page 45: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/45.jpg)
Human pose estimation & Object detection
Mutual Context
Yao/Fei-Fei
![Page 46: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/46.jpg)
Learning Results
Tennis serve
Volleyball smash
Tennis forehand
Yao/Fei-Fei
![Page 47: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/47.jpg)
Activity Classification Results
Classification accuracy: Our model 83.3%, Gupta et al, 2009 78.9%, Bag-of-Words 52.5%
Figure: accuracy axis from 0.5 to 0.9; class labels: Cricket shot, Tennis forehand; methods compared: Bag-of-words SIFT+SVM, Gupta et al, 2009, Our model
Yao/Fei-Fei
![Page 48: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/48.jpg)
Today
• Human pose and actions: Introduction
• Estimating human pose
• Recognizing human actions
– Using specialized features
– Using pose
– Using objects
– From ego-centric video
![Page 49: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/49.jpg)
Detecting Activities of Daily Living
in First-person Camera Views
Hamed Pirsiavash, Deva Ramanan
CVPR 2012
Hamed Pirsiavash
![Page 50: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/50.jpg)
Motivation
A sample video of Activities of Daily Living
![Page 51: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/51.jpg)
Applications
Tele-rehabilitation
• Kopp et al, Arch. of Physical Medicine and Rehabilitation, 1997.
• Catz et al, Spinal Cord 1997.
Long-term at-home monitoring
Hamed Pirsiavash
![Page 52: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/52.jpg)
Applications
Life-logging
• Gemmell et al, “MyLifeBits: a personal database for everything.” Communications of the ACM 2006.
• Hodges et al, “SenseCam: A retrospective memory aid”, UbiComp, 2006.
So far, mostly “write-only” memory!
This is the right time for the computer vision community to get involved.
Hamed Pirsiavash
![Page 53: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/53.jpg)
Wearable ADL detection
ADL actions derived from medical literature on patient rehabilitation
It is easy to collect natural data
Hamed Pirsiavash
![Page 54: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/54.jpg)
Challenges: What features to use?
Spectrum from low level features (weak semantics) to high level features (strong semantics):
• Space-time interest points (Laptev, IJCV’05)
• Human pose
Difficulties of pose:
• Detectors are not accurate enough
• Not useful in first person camera views
Hamed Pirsiavash
![Page 55: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/55.jpg)
Challenges: What features to use?
Spectrum from low level features (weak semantics) to high level features (strong semantics):
• Space-time interest points (Laptev, IJCV’05)
• Object-centric features
• Human pose
Difficulties of pose:
• Detectors are not accurate enough
• Not useful in first person camera views
Hamed Pirsiavash
![Page 56: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/56.jpg)
Challenges: Long-scale temporal structure
“Classic” data: boxing
Wearable data: making tea
Timeline: Start boiling water → Do other things (while waiting) → Pour in cup → Drink tea
Adapted from Hamed Pirsiavash
![Page 57: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/57.jpg)
Appearance feature: bag of objects
Video clip → Bag of detected objects (e.g., fridge, stove, TV) → SVM classifier
Hamed Pirsiavash
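A rough sketch of a bag-of-objects descriptor, assuming an object detector has already produced a set of labels for each frame. Counting the fraction of frames in which each object appears is a simplification chosen for this example, and `bag_of_objects` and the toy clip are hypothetical.

```python
from collections import Counter
from typing import List, Set

VOCABULARY = ["fridge", "stove", "TV"]


def bag_of_objects(frame_detections: List[Set[str]], vocabulary: List[str]) -> List[float]:
    """Fraction of frames in which each vocabulary object was detected."""
    counts = Counter(label for frame in frame_detections for label in frame)
    num_frames = max(len(frame_detections), 1)
    return [counts[label] / num_frames for label in vocabulary]


# Hypothetical detector output for a four-frame clip.
clip = [{"fridge"}, {"fridge", "stove"}, set(), {"stove"}]
descriptor = bag_of_objects(clip, VOCABULARY)
```

The resulting fixed-length vector is what the SVM classifier would receive for the clip. It discards when in the clip each object appeared.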
![Page 58: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/58.jpg)
Temporal pyramid
Coarse to fine correspondence matching with a multi-layer pyramid
Inspired by “Spatial Pyramid” CVPR’06 and “Pyramid Match Kernels” ICCV’05
Video clip → Temporal pyramid descriptor → SVM classifier
Hamed Pirsiavash
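One way to sketch a temporal pyramid descriptor is to compute an object histogram over the whole clip, then over each half, and so on, and concatenate the results. The number of levels, the equal-length splits, the frame-fraction histogram, and all names below are assumptions made for illustration.

```python
from collections import Counter
from typing import List, Set

VOCABULARY = ["fridge", "stove", "TV"]


def object_histogram(frames: List[Set[str]], vocabulary: List[str]) -> List[float]:
    """Fraction of frames in which each vocabulary object was detected."""
    counts = Counter(label for frame in frames for label in frame)
    num_frames = max(len(frames), 1)
    return [counts[label] / num_frames for label in vocabulary]


def temporal_pyramid(frames: List[Set[str]], vocabulary: List[str],
                     levels: int = 2) -> List[float]:
    """Concatenate histograms over 1, 2, 4, ... equal temporal segments."""
    descriptor: List[float] = []
    n = len(frames)
    for level in range(levels):
        segments = 2 ** level
        for k in range(segments):
            start = k * n // segments
            end = (k + 1) * n // segments
            descriptor.extend(object_histogram(frames[start:end], vocabulary))
    return descriptor


# Hypothetical detector output for a four-frame clip.
clip = [{"fridge"}, {"fridge", "stove"}, set(), {"stove"}]
pyramid = temporal_pyramid(clip, VOCABULARY, levels=2)
```

In this toy clip the fridge shows up only in the first half, which the second pyramid level records and a single whole-clip histogram would not.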
![Page 59: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/59.jpg)
Accuracy on 18 action categories
• Our model: 40.6%
• STIP baseline: 22.8%
Hamed Pirsiavash
![Page 60: CS 1699: Intro to Computer Vision Introduction · probabilities for Part III. ... Derek Hoiem, figures from Ivan Laptev. Results Derek Hoiem, ... • Action recognition still an open](https://reader031.vdocuments.us/reader031/viewer/2022030921/5b795aa97f8b9a331e8d9a06/html5/thumbnails/60.jpg)
Summary: Human actions
• Action recognition still an open problem
– How to represent actions?
• Types of data: atomic and more complex actions, ego-centric video
• Common representations
– Space-time interest points
– Pose
– Objects (and temporal pyramids of objects)
• Pose
– Can be approached as a classification problem using depth data