learning 3d object models from 2d images · segmentation” george papandreou, liang-chiehchen,...
TRANSCRIPT
![Page 1: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/1.jpg)
Learning from Imperfect Data WorkshopIasonas Kokkinos
Learning 3D object models from 2D images
Cropped Input Image
Latent VectorResNet-50 Spatial Mesh Convolutional
Decoder
MeshLoss
Predicted Mesh Generated Ground Truth Predicted Landmarks
Iterative Model Fitting
![Page 2: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/2.jpg)
Ariel AI
R. A. Guler S. ZafeiriouG. Papandreou H. Wang
D. Stoddard
D. Kulon
Z. ShuStony Brook
M. SahasrabudheINRIA
D. SamarasStony Brook
Natalia NeverovaFAIR
E. BartrumUCL
N. ParagiosINRIA
E. Skordos H. TamA. KakolyrisP. Koutras
E. Schmitt
A. LazarouS. Galanakis
B. Fulkerson
M. BronsteinImperial College
UCL, Imperial College, FAIR, INRIA, Stony Brook
![Page 3: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/3.jpg)
Image Classification
Is there a person in this image?Yes? No?
Object Detection
Localize persons in the image.
Pose Estimation
Localize joints of the persons in the images.
DensePose (our work)
Find correspondence between all pixels and a 3D model.
Part Segmentation
Segment semantically meaningful body parts.
Input Image
Image Classification
Is there a person in this image?Yes? No?
Object Detection
Localize persons in the image.
Pose Estimation
Localize joints of the persons in the images.
DensePose (our work)
Find correspondence between all pixels and a 3D model.
Part Segmentation
Segment semantically meaningful body parts.
Image Classification
Is there a person in this image?Yes? No?
Image Classification
Human analysis: from coarse to fine
![Page 4: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/4.jpg)
Image Classification
Is there a person in this image?Yes? No?
Object Detection
Localize persons in the image.
Pose Estimation
Localize joints of the persons in the images.
DensePose (our work)
Find correspondence between all pixels and a 3D model.
Part Segmentation
Segment semantically meaningful body parts.
Input Image
Image Classification
Is there a person in this image?Yes? No?
Object Detection
Localize persons in the image.
Pose Estimation
Localize joints of the persons in the images.
DensePose (our work)
Find correspondence between all pixels and a 3D model.
Part Segmentation
Segment semantically meaningful body parts.
Person Detection
Localize persons in the image.
Image Classification Person Detection
Human analysis: from coarse to fine
![Page 5: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/5.jpg)
Image Classification
Is there a person in this image?Yes? No?
Object Detection
Localize persons in the image.
Pose Estimation
Localize joints of the persons in the images.
DensePose (our work)
Find correspondence between all pixels and a 3D model.
Part Segmentation
Segment semantically meaningful body parts.
Input Image
Part Segmentation
Segment semantically meaningful body parts.
Image Classification Person Detection
Image Classification
Is there a person in this image?Yes? No?
Object Detection
Localize persons in the image.
Pose Estimation
Localize joints of the persons in the images.
DensePose (our work)
Find correspondence between all pixels and a 3D model.
Part Segmentation
Segment semantically meaningful body parts.
Image Classification
Is there a person in this image?Yes? No?
Object Detection
Localize persons in the image.
Pose Estimation
Localize joints of the persons in the images.
DensePose (our work)
Find correspondence between all pixels and a 3D model.
Part Segmentation
Segment semantically meaningful body parts.
Part Segmentation
Human analysis: from coarse to fine
![Page 6: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/6.jpg)
Image Classification
Is there a person in this image?Yes? No?
Object Detection
Localize persons in the image.
Pose Estimation
Localize joints of the persons in the images.
DensePose (our work)
Find correspondence between all pixels and a 3D model.
Part Segmentation
Segment semantically meaningful body parts.
Input Image
Pose Estimation
Localize joints of the persons in the images.
Image Classification Person Detection Image Classification
Is there a person in this image?Yes? No?
Object Detection
Localize persons in the image.
Pose Estimation
Localize joints of the persons in the images.
DensePose (our work)
Find correspondence between all pixels and a 3D model.
Part Segmentation
Segment semantically meaningful body parts.
Part Segmentation Pose Estimation
Human analysis: from coarse to fine
![Page 7: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/7.jpg)
Image Classification
Is there a person in this image?Yes? No?
Object Detection
Localize persons in the image.
Pose Estimation
Localize joints of the persons in the images.
DensePose (our work)
Find correspondence between all pixels and a 3D model.
Part Segmentation
Segment semantically meaningful body parts.
Input Image
Dense Pose Estimation
Find correspondence between all pixels and a 3D model.
Image Classification Person Detection Image Classification
Is there a person in this image?Yes? No?
Object Detection
Localize persons in the image.
Pose Estimation
Localize joints of the persons in the images.
DensePose (our work)
Find correspondence between all pixels and a 3D model.
Part Segmentation
Segment semantically meaningful body parts.
Part Segmentation Pose Estimation DensePose
Human analysis: from coarse to fine
![Page 8: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/8.jpg)
“Wide Open” (The Mill, 2015)
8
Holy grail: 3D human reconstruction
![Page 9: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/9.jpg)
9
Ariel AI: 3D human reconstruction on mobile
![Page 10: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/10.jpg)
1010
Holographic telepresence
Universal motion capture
Seamless augmented reality
Personalised, experiential retail
Kinetic learning
Immersive gaming
Ariel AI: 3D human reconstruction on mobile
![Page 11: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/11.jpg)
1111
Challenges
Depth/height ambiguity
3D from 2D: fundamentally ill-posed problem
Scarce 3D supervision – almost impossible in-the-wild
![Page 12: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/12.jpg)
From imperfect vision to imperfect data
Computer Vision before deep learning:- Your `local evidence’ is imperfect (classifier scores, unary terms, ..)- Compensate for it by model-based prior during inference (AAMs, MRFs,..)
Computer Vision after deep learning:- Your `local evidence’ can become perfect- Your training data is imperfect- Compensate for it by some model-based prior, prior or during training
![Page 13: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/13.jpg)
Imperfect Data for Semantic Segmentation
“Weakly- and Semi-Supervised Learning of a Deep Convolutional Network for Semantic Image Segmentation” George Papandreou, Liang-Chieh Chen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015
Bounding boxes + occupancy priors
![Page 14: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/14.jpg)
Imperfect Data for Instance Segmentation
Deep Extreme Cut: From Extreme Points to Object Segmentation, Kevis-Kokitsi Maninis, Sergi Caelles, Jordi Pont-Tuset, Luc Van Gool
4 points + segmentation system
![Page 15: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/15.jpg)
Imperfect Data for Pose Estimation
Learning Temporal Pose Estimation from Sparsely Labeled Videos, Bertasius, Gedas and Feichtenhofer, Christoph, and Tran, Du and Shi, Jianbo, and Torresani, Lorenzo(NeurIPS 2019)
Keypoints + temporal correspondence
![Page 16: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/16.jpg)
Part 1: Weakly- and semi- supervised learning for 3D
HoloPose: Holistic 3D Human Reconstruction In-the-Wild, A. Guler and I. Kokkinos, CVPR 2019
Weakly-Supervised Mesh-Convolutional Hand Reconstruction in the Wild, D. Kulon et al CVPR 2020
![Page 17: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/17.jpg)
Part 2: Fully unsupervised learning for 3D
Includes all previous tasks as special cases
Unstructured face dataset
deep magic happens 3D model comes out
Lifting AutoEncoders: Unsupervised Learning of 3D Morphable Models Using Deep Non-Rigid Structure from Motion, M. Sahasrabudhe, Z. Shu, E. Bartrum, A. Guler, D. Samaras and I. Kokkinos, ICCV GMDL 2019
![Page 18: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/18.jpg)
DenseReg: From Image to Template to Task
R. A. Guler, G. Trigeorgis, E. Antonakos, P. Snape, S. Zafeiriou, I. Kokkinos,DenseReg: Fully Convolutional Dense Shape Regression In-the-Wild, CVPR 2017
![Page 19: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/19.jpg)
DenseReg, Frame-by-Frame
![Page 20: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/20.jpg)
2D canonical coordinates
Supervision: from parametric model fitting to 2D keypoints
Annotation effort: a few 2D landmarks per image Density: morphable model prior
![Page 21: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/21.jpg)
R. A. Guler, N. Neverova, I. Kokkinos “DensePose: Dense Human Pose Estimation In The Wild”, CVPR’18
DensePose-RCNN: ~25 FPS
DensePose: dense image-to-body correspondence
http://densepose.org/
![Page 22: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/22.jpg)
Surface CorrespondenceTASK 2: Marking CorrespondencesTASK 1: Part Segmentation
...
...
sampled pointsinput image
segmented parts rendered images for the specific part
Surface CorrespondenceTASK 2: Marking CorrespondencesTASK 1: Part Segmentation
...
...
sampled pointsinput image
segmented parts rendered images for the specific part
Annotation pipeline-II
![Page 23: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/23.jpg)
Quantization replaced by part assignment.densepose.org
U coordinates V coordinates
Image
DensePose-COCO dataset
![Page 24: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/24.jpg)
Quantization replaced by part assignment.DensePose-RCNN ResultsVisualization
DensePose-RCNN in action
![Page 25: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/25.jpg)
HoloPose: multi-person 3D reconstruction results
R. A. Guler, I. Kokkinos “HoloPose: Holistic 3D Human Reconstruction In The Wild”, CVPR’19
![Page 26: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/26.jpg)
Surface-level human understanding, CVPR 2018
DensePose: Dense Human Pose Estimation In The Wild, CVPR 2018R. A. Güler, N. Neverova, I. Kokkinos,
End-to-end Recovery of Human Shape and Pose, CVPR 2018 A. Kanazawa M. J Black D. W. Jacobs J. MalikLearning to Estimate 3D Human Pose and Shape from a Single Image, CVPR 2018G. Pavlakos, L. Zhu, X. Zhou, K. DaniilidisMonocular 3D Pose and Shape Estimation of Multiple People, CVPR 2018, Andrei Zanfir, Elisabeta Marinoiu, Cristian Sminchisescu
SMPL parameter regression Dense UV coordinate regression
Robust & accurate, “in-the-wild”Not 3D
Parametric and 3DAlignment
![Page 27: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/27.jpg)
Bottom-up human body reconstruction
![Page 28: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/28.jpg)
Bottom-up 2D Keypoint localization
![Page 29: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/29.jpg)
Bottom-up/Top-down Synergistic Refinement
![Page 30: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/30.jpg)
Synergistic Refinement
![Page 31: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/31.jpg)
![Page 32: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/32.jpg)
Ariel Holopose 2019• In-the-wild human 3D reconstruction
![Page 33: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/33.jpg)
Ariel Holopose 2020
![Page 34: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/34.jpg)
Weakly-Supervised Mesh-ConvolutionalHand Reconstruction in the WildDominik Kulon Riza Alp Güler Iasonas Kokkinos Michael Bronstein Stefanos Zafeiriou
arielai.com/mesh_hands
![Page 35: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/35.jpg)
youtu.be/aQ4shIsQabo
Motivation - hand pose estimation
Broad array of applications:••••
Existing approaches do not always:•
••
![Page 36: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/36.jpg)
Hand Reconstruction System
Cropped Input Image
Latent VectorResNet-50 Spatial Mesh Convolutional
Decoder
MeshLoss
Predicted Mesh Generated Ground Truth Predicted Landmarks
Iterative Model Fitting
![Page 37: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/37.jpg)
Hand Reconstruction System
Cropped Input Image
Latent VectorResNet-50 Spatial Mesh Convolutional
Decoder
MeshLoss
Predicted Mesh Generated Ground Truth Predicted Landmarks
Iterative Model Fitting
![Page 38: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/38.jpg)
Hand Reconstruction System
Cropped Input Image
Latent VectorResNet-50 Spatial Mesh Convolutional
Decoder
MeshLoss
Predicted Mesh Generated Ground Truth Predicted Landmarks
Iterative Model Fitting
![Page 39: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/39.jpg)
Hand Reconstruction System
Cropped Input Image
Latent VectorResNet-50 Spatial Mesh Convolutional
Decoder
MeshLoss
Predicted Mesh Generated Ground Truth Predicted Landmarks
Iterative Model Fitting
![Page 40: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/40.jpg)
Parametric Hand Model Fitting
2D Reprojection Term
Bone Length Preservation Term
Regularization Term
K-Means Prior
Cropped Input Image
Latent VectorResNet-50 Spatial Mesh Convolutional
Decoder
MeshLoss
Predicted Mesh Generated Ground Truth Predicted Landmarks
Iterative Model Fitting
![Page 41: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/41.jpg)
Novel Dataset
We release a dataset of meshes aligned with in the wild images.
-
-
-
-
![Page 42: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/42.jpg)
![Page 43: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/43.jpg)
Evaluation - standard benchmarks
We also obtain state-of-the-artperformance on popularlaboratory datasets.
Method(synthetic, 3D)
RHD (AUC)
Mesh Speed(FPS)
Zimm. and Brox (2017) 0.675
Yang and Yao (2019) 0.849
Spurr et al. (2018) 0.849
Zhou et al. (2020) 0.856 100 (GPU)
Cai et al. (2018) 0.887
Zhang et al. (2019) 0.901
Ge et al. (2019) 0.92 50 (GPU)
Baek et al. (2019) 0.926
Yang et al. (2019) 0.943
Ours 0.956 70 (GPU)
![Page 44: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/44.jpg)
2D PCK ON MPII+NZSL
Evaluation - in the wild
We largely outperform otherapproaches on an in the wildbenchmark.
2D PCK ON MPII+NZSL
Evaluation - in the wild
We largely outperform otherapproaches on an in the wildbenchmark.
![Page 45: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/45.jpg)
Our MethodInput Video
![Page 46: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/46.jpg)
Egocentric Perspective
Our MethodInput Video
![Page 47: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/47.jpg)
AR Effects
Our MethodInput Video
AR Effects
Our MethodInput Video
![Page 48: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/48.jpg)
Part 2: Lifting AutoEncoders: Unsupervised 2D-to-3D
Includes all previous tasks as special cases
Unstructured face dataset
deep magic happens 3D model comes out
Lifting AutoEncoders: Unsupervised Learning of 3D Morphable Models Using Deep Non-Rigid Structure from Motion, M. Sahasrabudhe, Z. Shu, E. Bartrum, A. Guler, D. Samaras and I. Kokkinos, Arxiv 2019
![Page 49: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/49.jpg)
A canonical appearance template
deformed into
A class of images (MNIST 3)
Learning a template and the deformation for a class of images.
?
Unsupervised learning of deformable models
![Page 50: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/50.jpg)
Goal: learn a template and the deformation for a class of images.
?A canonical appearance template
deformed into
A class of images (Faces)
Unsupervised learning of deformable models
![Page 51: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/51.jpg)
Deforming AutoEncoder (DAE) model
Z. Zhu, M. Saha, A. Guler, D. Samaras, I. Kokkinos, Deforming Autoencoders: Unsupervised Shape and Appearance Disentangling, ECCV 2018
![Page 52: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/52.jpg)
input
decoded texture
decoded deformation
reconstruction
DAE for MNIST: single-class template
![Page 53: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/53.jpg)
input
decoded texture
decoded deformation
reconstruction
DAE for Faces-in-the-Wild
![Page 54: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/54.jpg)
DAE-based unsupervised face alignmentD
efor
mat
ion
Alig
ned
Inpu
tLa
ndm
arks
![Page 55: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/55.jpg)
Unsupervised alignment with DAE on MAFL dataset
![Page 56: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/56.jpg)
Goal: learn a 3D model from unstructured image set
Unstructured face dataset
Something deep happens 3D model comes out
![Page 57: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/57.jpg)
3D Reconstruction: Structure-from-Motion
Assumption: Rigid Scene
Methods: Factorization, Bundle Adjustment
Input: Point Correspondences (e.g. through SIFT & Ransac)
Noah Snavely, Steven M. Seitz, Richard Szeliski. Modeling the World from Internet Photo Collections. IJCV, 2007.Yasutaka Furukawa and Jean Ponce, Accurate, Dense, and Robust Multi-View Stereopsis, CVPR 2007
![Page 58: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/58.jpg)
3D Reconstruction: Non-Rigid Structure-from-Motion
https://www.youtube.com/watch?v=35wCPFyS3QQNon-Rigid Structure-From-Motion: Estimating Shape and Motion with Hierarchical Priors, Bregler et al, PAMI 2008Dense Reconstruction of Non-Rigid Surfaces from Monocular Video, Garg et al, CVPR 2013
![Page 59: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/59.jpg)
DAEs: Turn Images to Corresponding Sets of Points
![Page 60: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/60.jpg)
Lifting AutoEncoder: NRSfM with DAEs
Deforming AutoEncoderNon-Rigid Structure-from-Motion
![Page 61: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/61.jpg)
Lifting AutoEncoder: NRSfM with DAEs
Deforming AutoEncoder
![Page 62: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/62.jpg)
Lifting Auto-Encoders: end-to-end 3D generative model
![Page 63: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/63.jpg)
What Lifting AutoEncoders: Unsupervised Learning of a Fully-Disentangled 3D Morphable M§odel
Pose modification
Controllable image modification using LAEs
![Page 64: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/64.jpg)
What
Pose modification Expression modification
Lifting AutoEncoders: Unsupervised Learning of Fully-Disentangled 3D Morphable model
Controllable image modification using LAEs
![Page 65: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/65.jpg)
What
Pose modification Expression modification
Lifting AutoEncoders: Unsupervised Learning of Fully-Disentangled 3D Morphable model
Controllable image modification using LAEs
![Page 66: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/66.jpg)
What
Pose modification Illumination modificationExpression modification
Lifting AutoEncoders: Unsupervised Learning of Fully-Disentangled 3D Morphable model
Controllable image modification using LAEs
![Page 67: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/67.jpg)
Part 1: Weakly- and semi- supervised learning for 3D
HoloPose: Holistic 3D Human Reconstruction In-the-Wild, A. Guler and I. Kokkinos, CVPR 2019
Weakly-Supervised Mesh-Convolutional Hand Reconstruction in the Wild, D. Kulon et al CVPR 2020
![Page 68: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/68.jpg)
Part 2: Fully unsupervised learning for 3D
Includes all previous tasks as special cases
Unstructured face dataset
deep magic happens 3D model comes out
Lifting AutoEncoders: Unsupervised Learning of 3D Morphable Models Using Deep Non-Rigid Structure from Motion, M. Sahasrabudhe, Z. Shu, E. Bartrum, A. Guler, D. Samaras and I. Kokkinos, ICCV GMDL 2019
![Page 69: Learning 3D object models from 2D images · Segmentation” George Papandreou, Liang-ChiehChen, Kevin P. Murphy, Alan L. Yuille, ICCV 2015 Bounding boxes + occupancy priors. Imperfect](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dd9ad52645b38ae3ec8d7/html5/thumbnails/69.jpg)
R. A. Guler S. ZafeiriouG. Papandreou H. Wang
D. Stoddard
D. Kulon
Z. ShuStony Brook
M. SahasrabudheINRIA
D. SamarasStony Brook
Natalia NeverovaFAIR
E. BartrumUCL
N. ParagiosINRIA
E. Skordos H. TamA. KakolyrisP. Koutras
E. Schmitt
A. LazarouS. Galanakis
B. Fulkerson
M. BronsteinImperial College
Thank you! arielai.com/mesh_hands