
Page 1:

Mutually Reinforcing Motion-Pose Framework for Pose Invariant Action Recognition

22nd November 2016, Tuesday

Manoj Ramanathan

Research Engineer, IMI

IMI Research Seminar

Page 2:

Contents

• Introduction
• Literature Review
  – Motion
  – Pose
  – Motion + Pose
• Proposed Framework
  – Propagate Motion Forward (PMF) Path
  – Canonical Pose Feedback (CPF) Path
• Experimental Results & Discussion
• Conclusion

Page 3:

Introduction

• For several applications, it is necessary for a device to understand its environment and the humans in it.
• Recognition of human actions is essential.
• RGB-camera-based action recognition is not easy.

Page 4:

Introduction & Motivation

Motivating challenges & factors:
• Occlusion
• Background clutter
• View invariance
• Execution rate
• Anthropometric variations
• Moving cameras
• Generalizability
• Action localization

Page 5:

Introduction & Motivation

• Objectives:
  – RGB-camera action recognition that can handle the following challenges:
    • View angle changes
    • Occlusion
    • Pose variations
    • Background clutter
  – Generalized to handle actions performed in non-upright human postures.

Page 6:

Literature Review

Page 7:

Motion Based Approaches

• Motion History Images & Motion Energy Images [1] – indicate the presence and recency of motion
• Trajectories [2]
• Optical Flow [3]
• Kinematic Features [4,5]

[1] A. F. Bobick and J. W. Davis, "The recognition of human movement using temporal templates," IEEE Trans. on Pattern Analysis and Machine Intelligence, 23(3): 257-267, March 2001.
[2] H. Wang, A. Kläser, C. Schmid, and C.-L. Liu, "Dense trajectories and motion boundary descriptors for action recognition," Intl. Journal on Computer Vision, vol. 103, pp. 60-79, May 2013.
[3] L. Liu, L. Shao, and P. Rockett, "Boosted key-frame selection and correlated pyramidal motion-feature representation for human action recognition," Pattern Recognition 46, Elsevier, pp. 1810-1818, July 2013.
[4] M. Jain, H. Jégou, and P. Bouthemy, "Better exploiting motion for better action recognition," in IEEE Conf. on Computer Vision and Pattern Recognition, June 2013.
[5] S. Ali and M. Shah, "Human action recognition in videos using kinematic features and multiple instance learning," IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 32, pp. 288-303, February 2010.
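As a quick illustration of the temporal-template idea in [1], here is a minimal sketch of the MHI/MEI update (the standard formulation from the cited paper; the source of the per-frame motion mask is assumed):

```python
import numpy as np

def update_mhi(mhi, motion_mask, tau=30.0):
    # Temporal template of Bobick & Davis [1]: pixels where motion is
    # detected are set to tau; elsewhere the history decays by 1 toward 0,
    # so brighter pixels indicate more recent motion.
    return np.where(motion_mask, tau, np.maximum(mhi - 1.0, 0.0))

# The Motion Energy Image is the binary union of where motion occurred:
# mei = mhi > 0
```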

Page 8:

Pose Based Approaches

• Shape [1], Contours [2]
• Extraction and representation of key poses [5]
• Silhouette [4]
• Poselets [3] – body part detectors in 3D appearance space

[1] H. Zhang and L. E. Parker, "4-dimensional local spatio-temporal features for human activity recognition," in IEEE Intl. Conf. on Intelligent Robots and Systems, pp. 2044-2049, September 2011.
[2] S. Cheema, A. Eweiwi, C. Thurau, and C. Bauckhage, "Action recognition by learning discriminative key poses," in IEEE Intl. Conf. on Computer Vision Workshops, pp. 1302-1309, November 2011.
[3] M. Raptis and L. Sigal, "Poselet key-framing: A model for human activity recognition," in IEEE Conf. on Computer Vision and Pattern Recognition, pp. 2650-2657, October 2013.
[4] F. Lv and R. Nevatia, "Single view human action recognition using key pose matching and Viterbi path searching," in IEEE Conf. on Computer Vision and Pattern Recognition, pp. 1-8, June 2007.
[5] N. Ikizler-Cinbis and S. Sclaroff, "Web-based classifiers for human action recognition," IEEE Trans. on Multimedia, vol. 14, pp. 1031-1045, August 2012.


Page 9:

Motion + Pose Based Approaches

• Shape-Motion Prototypes [1]
• Motionlets [2] – mid-level spatio-temporal parts that form tight clusters in motion and appearance space, corresponding to individual body part movements

[1] Z. Jiang, Z. Lin, and L. S. Davis, "Recognizing human actions by learning and matching shape-motion prototype trees," IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 34, pp. 533-547, March 2012.
[2] L. Wang, Y. Qiao, and X. Tang, "Motionlets: mid-level 3D parts for human motion recognition," in IEEE Conf. on Computer Vision and Pattern Recognition, June 2013.


Page 10:

Proposed Pose Invariant Action Recognition Framework

Consists of two components, namely a motion component and a pose component, in a mutually reinforcing framework.

Page 11:

Framework for Action Recognition

• Actions are manifested as movements of body parts.
• Detecting body parts and analyzing their motion provides a good framework.
• Mutually assistive components improve each other's performance.
• The motion of each body part is represented with respect to the body (for pose-invariance).

Page 12:

[Framework block diagram]

Propagate Motion Forward (PMF) Path:
Input Video → Preprocessing (foreground detection) → Propagation Mechanism (grid division) → Human-body-centric space conversion → Kinematic features (Div, Curl, Proj, Rot) → ELM Classifier (with action model from training videos) → Recognized Action

Canonical Pose Feedback (CPF) Path:
Canonical Pose Hypothesis (identify the pose in the frame, using canonical sticks from training videos) → Temporal Stick Features (implicitly capture the dynamics of pose evolution) → Realign grids based on head size → Kinematic features (Div, Curl, Proj, Rot, BodyProj, BodyRot) → back into the ELM Classifier

Page 13:

Propagate Motion Forward Path

Parameters assumed to be available or estimated:
• Foreground
• Neck point
• Major viewing direction

Page 14:

Propagate Motion Forward Path

• Propagation Mechanism (grid division): requires neck point and foreground
• Human-body-centric space conversion: requires viewing direction
• Kinematic features (Div, Curl, Proj, Rot): require neck point

[Figure: divide into grids based on body proportion, with region extents labelled 2x, 5x, and 6x]
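To make the grid-division step concrete, here is a hypothetical sketch; the six-grid layout, the exact proportions, and the helper's signature are illustrative assumptions, not the slide's specification:

```python
import numpy as np

def body_grids(foreground_mask, neck, head_size):
    # Hypothetical grid division: split the foreground bounding box into
    # grids whose extents are proportional to the head size (the slide's
    # 2x/5x/6x labels suggest head-relative proportions; values assumed).
    ys, xs = np.nonzero(foreground_mask)
    top, bottom = int(ys.min()), int(ys.max())
    left, right = int(xs.min()), int(xs.max())
    nx, ny = neck
    # Row bands: above the neck (head), torso, legs -- proportions assumed.
    row_edges = [top, int(ny), min(int(ny + 2.5 * head_size), bottom), bottom]
    col_edges = [left, int(nx), right]  # split left/right of the body axis
    grids = [(r0, r1, c0, c1)
             for r0, r1 in zip(row_edges[:-1], row_edges[1:])
             for c0, c1 in zip(col_edges[:-1], col_edges[1:])]
    return grids  # six (row_start, row_end, col_start, col_end) rectangles
```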


Page 15:

Propagate Motion Forward Path

• Optical Flow [1] is used in the framework.
• Kinematic features [2] are extracted from the optical flow to represent and characterize actions:
  – Divergence
  – Vorticity (Curl)
  – Projection
  – Rotation

[1] T. Brox, A. Bruhn, N. Papenberg, and J. Weickert, "High accuracy optical flow estimation based on a theory for warping," ECCV, vol. 3024, pp. 25-36, 2004.
[2] S. Shojaeilangari, W. Y. Yau, K. Nandakumar, J. Li, and E. K. Teoh, "Robust representation and recognition of facial emotions using extreme sparse learning," IEEE Trans. on Image Processing, vol. 24, no. 7, pp. 2140-2152, March 2015.
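A minimal sketch of computing such kinematic features from a dense flow field; divergence and curl follow the standard definitions, while the radial/tangential decomposition about the neck point for Proj and Rot is an assumption about the slide's intent:

```python
import numpy as np

def kinematic_features(u, v, neck):
    # u, v: optical-flow components per pixel, each of shape (H, W)
    du_dy, du_dx = np.gradient(u)          # np.gradient returns axis 0 (y) first
    dv_dy, dv_dx = np.gradient(v)
    divergence = du_dx + dv_dy             # local expansion / contraction
    vorticity = dv_dx - du_dy              # local rotation (curl)
    # Unit vectors pointing from the neck point to each pixel.
    H, W = u.shape
    ys, xs = np.mgrid[0:H, 0:W]
    rx, ry = xs - neck[0], ys - neck[1]
    norm = np.hypot(rx, ry) + 1e-8
    rx, ry = rx / norm, ry / norm
    projection = u * rx + v * ry           # flow toward / away from the neck
    rotation = -u * ry + v * rx            # flow circulating around the neck
    return divergence, vorticity, projection, rotation
```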


Page 16:

Propagate Motion Forward Path

• Weighted & unweighted histograms of the motion features were used for pose-invariant emotion recognition [1].
• That work assumed a frontal face and dealt only with 2D motion.
• For action recognition:
  – The method should handle 3D motion.
  – The human performing the action need not be frontal.

[1] S. Shojaeilangari, W. Y. Yau, K. Nandakumar, J. Li, and E. K. Teoh, "Robust representation and recognition of facial emotions using extreme sparse learning," IEEE Trans. on Image Processing, vol. 24, no. 7, pp. 2140-2152, March 2015.


Page 17:

Propagate Motion Forward Path

[Figure: human-body-centric space with axes Up, Left, and Front; grids 1-6 shown around the body]

• Convert to the human-body-centric space.
• Encode the grids based on the view.

Page 18:

Canonical Pose Feedback Path

• Initially, the head size is assumed in order to divide the frame into grids.
• The initial motion features are used to recognize an initial action.

Page 19:

Canonical Pose Feedback Path

• Canonical Pose Hypothesis: identify the pose in the frame, using the initial action and a body part detector
• Canonical sticks from training videos: offline training, done only once
• Temporal Stick Features: implicitly capture the dynamics of pose evolution
• Realign grids based on head size
• Kinematic features: Div, Curl, Proj, Rot, BodyProj, BodyRot

Page 20:

Canonical Pose Feedback Path

Canonical Stick Extraction (from available training videos):

1) Crop the foreground region in each frame
2) Convert to grayscale
3) Resize to a fixed dimension
4) Collect all resized images across all videos
5) Apply NNMF to this data and extract the top N (= 100) basis components (the NNMF analogue of eigenvectors / principal components)

• Manually mark the sticks of each basis component.
• Use the neck point and head size to obtain a normalized stick representation that can be compared with test frames.
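A sketch of steps 1-5 using scikit-learn's NMF; the OpenCV preprocessing calls and the masks argument are assumptions about the pipeline, and the manual stick marking on the returned components is not shown:

```python
import cv2
import numpy as np
from sklearn.decomposition import NMF

def canonical_components(frames, masks, n_components=100, size=(64, 64)):
    # Steps 1-4: crop the foreground, convert to grayscale, resize, and
    # stack one flattened image per row of the data matrix.
    rows = []
    for frame, mask in zip(frames, masks):
        ys, xs = np.nonzero(mask)
        crop = frame[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
        gray = cv2.cvtColor(crop, cv2.COLOR_BGR2GRAY)
        rows.append(cv2.resize(gray, size).ravel().astype(np.float64))
    X = np.vstack(rows)                      # non-negative, as NNMF requires
    # Step 5: factorize and keep the top N basis components.
    model = NMF(n_components=n_components, init='nndsvd', max_iter=500)
    model.fit(X)
    return model.components_.reshape(n_components, size[1], size[0])
```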


Page 21:

Canonical Pose Feedback Path

[Figure: example canonical stick poses]
• Weizmann: Bend, Wave2, Run
• KTH: Boxing, Hand Clapping, Running
• UCF Sports: Golf Swing, Kicking, Skateboarding

Page 22:

Canonical Pose Feedback Path

1) Compare each canonical stick of the action with the image to identify the most likely pose.
2) A hypothesis for each canonical pose is computed as follows.

Canonical pose hypothesis:

PH_n = ∑_i ∑_j L(i,j)

where
L(i,j) = l(i,j), if d_j ≤ T_d and (Θ_j − Θ_s) ≤ T_Θ
       = 0, otherwise

• l(i,j) – likelihood score of body part i in segment j
• T_d – distance threshold
• T_Θ – orientation threshold
• i – index over body parts
• j – index over motion-consistent segments

Page 23:

Canonical Pose Feedback Path

Algorithm:
1. Start with likelihood score L_i = 0 for each part i in the stick pose.
2. Using the kinematic motion features, obtain an initial segmentation of the foreground region.
3. Pass each segment through the body part detector [1] to determine whether the segment is a body part i.
4. If segment m is detected as body part i, an associated likelihood score l_{i,m} is obtained.
5. If segment m satisfies the distance and orientation constraints, the likelihood score L_i for body part i in the stick pose is accumulated by l_{i,m}. (Distance constraints are imposed in normalized stick coordinates.)
6. Repeat steps 3-5 for every segment to obtain the final L_i for the canonical stick pose.
7. The pose hypothesis PH_n for canonical stick pose n is the sum of all L_i.
8. Repeat steps 1-7 for every canonical pose n and compute its pose hypothesis.
9. Choose the top 3 poses with the highest pose hypothesis and compute the mean pose.
10. Perform a pixel-wise segmentation into one of the body parts based on the distance from each body part's stick in the mean pose.
11. Compute the body orientation using the obtained torso region and neck point.
12. Compute the head size using the obtained head region and body orientation.
13. Repeat steps 5-12 if the computed head size and the initial approximate head size differ.

[1] M. Ramanathan, W.-Y. Yau, and E. K. Teoh, "Human body part detection using likelihood score computations," IEEE Symposium on Computational Intelligence in Biometrics and Identity Management (CIBIM), pp. 160-166, December 2014.
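A hedged sketch of the hypothesis computation (steps 1-8 above, matching the PH_n formula on the previous slide); the segment/stick dictionary fields and the part_likelihood callable stand in for the body part detector [1] and are assumptions:

```python
import numpy as np

def pose_hypothesis(segments, sticks, part_likelihood,
                    t_dist=0.5, t_orient=np.deg2rad(30.0)):
    # PH_n = sum over body parts i and segments j of L(i, j), where the
    # detector score l(i, j) is accumulated only if segment j lies close
    # enough to stick i (in normalized stick coordinates) and is
    # similarly oriented.
    ph = 0.0
    for i, stick in enumerate(sticks):          # body parts of pose n
        L_i = 0.0
        for seg in segments:                    # motion-consistent segments
            l_ij = part_likelihood(seg, i)      # likelihood score l(i, j)
            d = np.linalg.norm(np.subtract(seg['center'], stick['center']))
            dtheta = abs(seg['orientation'] - stick['orientation'])
            if d <= t_dist and dtheta <= t_orient:
                L_i += l_ij
        ph += L_i                               # step 7: sum all L_i
    return ph

# Steps 8-9 (sketch): evaluate pose_hypothesis for every canonical pose,
# then average the top-3 scoring poses to obtain the mean pose.
```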


Page 24:

Temporal Stick Features

• The input video is divided into T temporal segments; for each segment, the average stick pose and neck point are computed (T stick poses in total).
• The motion of each stick joint between consecutive segments is computed.
• Proj & Rot features are computed with respect to the neck point.
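A minimal sketch under the assumption that each stick pose is a (J, 2) array of joint coordinates; Proj/Rot are taken as the radial/tangential components of joint displacement about the neck point, mirroring the motion features:

```python
import numpy as np

def temporal_stick_features(stick_poses, neck_points):
    # stick_poses: (T, J, 2) average joint positions per temporal segment
    # neck_points: (T, 2) average neck point per temporal segment
    feats = []
    for t in range(len(stick_poses) - 1):
        motion = stick_poses[t + 1] - stick_poses[t]   # joint displacement
        radial = stick_poses[t] - neck_points[t]       # joints rel. to neck
        norm = np.linalg.norm(radial, axis=1, keepdims=True) + 1e-8
        r_hat = radial / norm
        t_hat = np.stack([-r_hat[:, 1], r_hat[:, 0]], axis=1)
        proj = np.sum(motion * r_hat, axis=1)   # toward / away from the neck
        rot = np.sum(motion * t_hat, axis=1)    # around the neck
        feats.append(np.concatenate([proj, rot]))
    return np.asarray(feats)                    # (T-1, 2J) feature matrix
```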


Page 25:

Canonical Pose Feedback Path

• The pose component helps the motion component:
  – Re-align the grids according to the canonical pose identified for each frame.
  – Compute body-part-referenced kinematic features using the pixel-wise segmentation.
  – The action is recognized based on the original motion features and the newly computed features.

• The framework forms a loop-like structure that can be repeated until the action recognition converges (see the sketch below).
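A structural sketch of that loop; classify, pmf_features, and cpf_refine are placeholder callables standing in for the framework's components, not the authors' API:

```python
def recognize_action(video, classify, pmf_features, cpf_refine, max_iters=3):
    # PMF pass: grid division + kinematic features -> initial action.
    feats = pmf_features(video)
    action = classify(feats)
    # CPF passes: identified canonical poses realign the grids and add
    # body-part-referenced features; repeat until recognition converges.
    for _ in range(max_iters):
        new_feats = cpf_refine(video, action)
        new_action = classify(new_feats)
        if new_action == action:
            break
        action = new_action
    return action
```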


Page 26:

NUAD Dataset

• Focus on non-upright actions (NUAD) instead of the usual set of upright actions
• 35 actors
• 8 actions: Bend, Squat, Push-up, Climber, Knee bending, Single-hand wave, Double-hand wave, Lying-down wave
• 3 views (Front, Left, Right)
• Ground-truth markings indicate all body parts and neck points in the frames

Page 27:

Experiments & Discussion

• Datasets:
  – Simple: KTH & Weizmann
  – Challenging: UCF Sports & Hollywood
  – Cross-dataset: MSR Action
  – Posture variation: NUAD
• Tolerance range for neck markings

Page 28:

Experiments

Weizmann Dataset
• 9 actors, 10 actions, simple background
• Leave-one-actor-out evaluation

Method                     | Performance (%)
Proposed (only PMF)        | 92.47
Proposed (PMF + CPF)       | 100
Shape-Motion Prototype [1] | 100
Kinematic Features [2]     | 95.75
MHI & MEI based [3]        | 93

KTH Dataset
• 25 actors, 6 actions, 4 scenarios
• Leave-one-out (LOO) and (16+9) test/validation protocols

Method                     | 16+9 (%) | LOO (%)
Proposed (only PMF)        | 87       | 90
Proposed (PMF + CPF)       | 90       | 93.32
Shape-Motion Prototype [1] |          | 95.77
Kinematic Features [2]     |          | 87.77
Motionlets [4]             |          | 93.3

[1] Z. Jiang, Z. Lin, and L. S. Davis, "Recognizing human actions by learning and matching shape-motion prototype trees," IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 34, pp. 533-547, March 2012.
[2] S. Ali and M. Shah, "Human action recognition in videos using kinematic features and multiple instance learning," IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 32, pp. 288-303, February 2010.
[3] Y. Lu, Y. Li, Y. Chen, F. Ding, X. Wang, J. Hu, and S. Ding, "A human action recognition method based on Tchebichef moment invariants and temporal templates," in Intl. Conf. on Intelligent Human-Machine Systems and Cybernetics, pp. 76-79, August 2012.
[4] L. Wang, Y. Qiao, and X. Tang, "Motionlets: mid-level 3D parts for human motion recognition," in IEEE Conf. on Computer Vision and Pattern Recognition, June 2013.

Page 29:

Experiments

UCF Sports Dataset
• 152 videos of 10 sports-based actions
• Dynamic backgrounds, view changes, camera motion, and pose variations
• Confusions: Skateboarding vs. Walk and Run vs. Kick

Method                          | Performance (%)
Proposed (only PMF)             | 76.4
Proposed (PMF + CPF)            | 87.4
Shape-Motion Prototype [1]      | 88
Grassmann kernel analysis [2]   | 96.6
Spatiotemporal orientations [3] | 81.7

Hollywood Dataset
• 8 actions (interactions included)
• Dynamic backgrounds, view changes, camera motion, and pose variations
• Occlusion handling: stick limits & grid based
• Test & train sets provided

Method                             | Performance (%)
Proposed (only PMF)                | 54.12
Proposed (PMF + CPF)               | 56.87
Hierarchical compound features [4] | 53.3
Laptev et al. [5]                  | 38.2

[1] Z. Jiang, Z. Lin, and L. S. Davis, "Recognizing human actions by learning and matching shape-motion prototype trees," IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 34, pp. 533-547, March 2012.
[2] M. T. Harandi, C. Sanderson, S. Shirazi, and B. C. Lovell, "Kernel analysis on Grassmann manifolds for action recognition," Pattern Recognition Letters, vol. 34, pp. 1906-1913, November 2013.
[3] K. G. Derpanis, M. Sizintsev, K. J. Cannons, and R. P. Wildes, "Action spotting and recognition based on spatiotemporal orientation analysis," IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 35, pp. 527-540, March 2013.
[4] A. Gilbert, J. Illingworth, and R. Bowden, "Action recognition using mined hierarchical compound features," IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 33, pp. 883-897, May 2011.
[5] I. Laptev, M. Marszalek, C. Schmid, and B. Rozenfeld, "Learning realistic human actions from movies," in IEEE Conf. on Computer Vision and Pattern Recognition, pp. 1-8, June 2008.

Page 30:

Experiments

Testing pose effectiveness
• Reducing the available canonical stick poses for each action
• Adding sticks extracted from other datasets for certain actions

Dataset    | All canonical sticks | Fewer sticks
UCF Sports | 87.4%                | 83.4%
Hollywood  | 56.87%               | 51.89%

NUAD Dataset
• View-invariance only with mirror-image cases (around 92.5%)
• For frontal & side cases, the accuracy is much lower (around 58%), because the human-body-centric space created from 2D images is not the same across different views

Cross-dataset applicability of canonical sticks
• Tested using the MSR Action Dataset
• Only 3 actions, the same as in the KTH dataset
• Canonical sticks extracted from KTH were used

Method               | Performance (%)
Proposed (only PMF)  | 90.2
Proposed (PMF + CPF) | 92.46

Method               | Performance (%)
Proposed (only PMF)  | 90.1
Proposed (PMF + CPF) | 91.4

Page 31:

Experiments

[Figure: failure-case examples]
• Errors when the neck is not available or the action is not visible
• Erroneous pose identification

Page 32:

Discussion

• Availability of different canonical sticks for each action.
• Based on the available 2D stick, project to a 3D stick so that it can be compared with any view.
• Estimation of the neck point and viewing direction.
• Tolerance for the neck region: only a 3% performance drop.
• Foreground estimated using background averaging.
• Body part detection errors result in wrong pose estimation.

Page 33:

Conclusion

• A mutually reinforcing motion-pose framework for action recognition:
  – Pose-invariant
  – Partially view-invariant
  – Partial occlusion handling
• The forward path handles motion; the feedback path handles pose.
• The motion of each body part is represented in a body-centric space, which allows pose-invariance.
• Motion determines an initial action, which determines the canonical stick poses to be used.
• The identified canonical stick poses help realign the grids and add motion features for each body part's motion.

Page 34:

Thank you!!

Q & A ??