
Cloth Capture (sketches 0390)

Ryan White   Anthony Lobay   D.A. Forsyth
UC Berkeley

Overview We present a method for capturing the geometry and parameterization of fast-moving cloth using multiple video cameras, without requiring camera calibration. Our cloth is printed with a multi-scale pattern that allows capture at both high speed and high spatial resolution even though self-occlusion might block any individual camera from seeing the majority of the cloth. We show how to incorporate knowledge of this pattern into conventional structure-from-motion approaches, and use a novel scheme for camera calibration using the pattern. We use bundle adjustment to obtain accurate reconstruction results and use the parameterization to drive several stages of post processing, including strain reduction and a cloth simulation to fill in gaps. We demonstrate our algorithm by capturing, retexturing and displaying several sequences of fast-moving cloth.

Pattern Selection Our cloth is printed with a pattern that contains information at multiple scales. At the larger scale, our pattern includes unique multi-colored triangles with parametric information about the location of the triangle in the domain of the cloth, as well as surface normals. Within each large triangle, there are 15 vertices that allow us to produce a fine mesh with a high degree of spatial accuracy. The pattern consists of areas of constant color to avoid problems with motion blur, a small number of unique colors to reduce confusion, and a layout that can scale to hundreds of unique large triangles containing thousands of unique points.
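To make the scaling claim concrete, here is a minimal Python sketch of one way a small color palette can index hundreds of triangles: read the colors of a triangle's sub-regions as digits in a small base. The palette, the number of color slots, and the function names are illustrative assumptions, not the paper's actual pattern design.

```python
# Hypothetical identity code: k colors and d color slots give k**d unique
# triangles (5 colors, 4 slots -> 625 IDs, comfortably "hundreds").
PALETTE = ["red", "green", "blue", "yellow", "magenta"]  # assumed palette

def encode_triangle_id(triangle_id: int, slots: int = 4) -> list[str]:
    """Map an integer ID to a color sequence (base-k digits, least significant first)."""
    k = len(PALETTE)
    assert 0 <= triangle_id < k ** slots, "ID out of range for this palette"
    digits = []
    for _ in range(slots):
        digits.append(PALETTE[triangle_id % k])
        triangle_id //= k
    return digits

def decode_triangle_id(colors: list[str]) -> int:
    """Invert encode_triangle_id from the observed color sequence."""
    k = len(PALETTE)
    triangle_id = 0
    for color in reversed(colors):
        triangle_id = triangle_id * k + PALETTE.index(color)
    return triangle_id

assert decode_triangle_id(encode_triangle_id(123)) == 123
```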

Localization In each view, we perform a hierarchical search for points. First, we identify the locations and normals of the large triangles. The normals are resolved up to a two-fold ambiguity. Next, we estimate the colors of the triangle to determine where in the cloth it appears. Finally, we search for the locations of the corners of the smaller triangles.
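The two-fold normal ambiguity can be made concrete. Under a scaled orthographic camera, the affine map from a triangle's known printed coordinates to its image determines the plane orientation up to a depth reversal. The sketch below reconstructs that computation from the geometry described here; it is an illustration under a scaled-orthographic assumption, not the paper's implementation.

```python
import numpy as np

def triangle_normal_candidates(canon_xy, image_xy):
    """Two candidate normals for a printed triangle under scaled orthography.

    canon_xy : (3, 2) vertex coordinates in the cloth plane (known pattern)
    image_xy : (3, 2) corresponding image locations
    """
    # Fit the affine map image ~ A @ canon + t from the three correspondences.
    X = np.hstack([canon_xy, np.ones((3, 1))])
    sol, *_ = np.linalg.lstsq(X, image_xy, rcond=None)
    A = sol[:2].T  # 2x2 linear part

    # The largest singular value of A is the orthographic scale; after
    # normalizing, the columns are projections of the plane's unit basis.
    B = A / np.linalg.svd(A, compute_uv=False)[0]
    a1, a2 = B[:, 0], B[:, 1]

    # Lift each column back to 3D; the z components are fixed up to a joint flip.
    z1 = np.sqrt(max(0.0, 1.0 - a1 @ a1))
    z2 = np.sqrt(max(0.0, 1.0 - a2 @ a2))
    if a1 @ a2 > 0:  # enforce orthogonality of the lifted basis vectors
        z2 = -z2
    n = np.cross([a1[0], a1[1], z1], [a2[0], a2[1], z2])
    n /= np.linalg.norm(n)
    return n, np.array([-n[0], -n[1], n[2]])  # depth-reversed twin
```

The two candidates are mirror images about the viewing direction; the calibration stage below is what resolves this sign.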

Camera Calibration Extending work in the shape-from-texture literature [Lobay and Forsyth 2004], we calibrate our cameras using both points and normals. Choosing two large triangles at random, we estimate an orthographic camera by using information from the locations of the triangles and the associated normals. We prove that, aside from an ambiguity in the direction of the normals, a metric reconstruction is available from only two points and two normals. We search over normal ambiguities and a number of randomly chosen triangle pairs to select the orthographic calibration that produces the lowest reconstruction error for the remaining triangles.
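A simplified sketch of that search, assuming the per-triangle normals have already been extracted: each observed triangle yields a camera-frame normal up to sign, the cloth-frame normal is known from the printed pattern, and two correspondences determine a rotation by the Kabsch/SVD method, which the remaining triangles then score. This is a stand-in for the paper's two-point, two-normal solver (it recovers only the rotation, not the full metric calibration), and all names here are assumptions.

```python
import itertools
import numpy as np

def rotation_from_normals(cloth_normals, camera_normals):
    """Kabsch/SVD: best rotation R with R @ cloth_i ~ camera_i."""
    H = camera_normals.T @ cloth_normals
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.linalg.det(U @ Vt)])  # guard against reflections
    return U @ D @ Vt

def calibrate_by_pair_search(cloth_normals, camera_normals, trials=200, seed=0):
    """Search random triangle pairs and per-normal signs; keep the rotation
    that best explains the remaining triangles.

    cloth_normals  : (N, 3) normals known from the printed pattern
    camera_normals : (N, 3) measured normals, each correct only up to sign
    """
    rng = np.random.default_rng(seed)
    best_R, best_err = None, np.inf
    for _ in range(trials):
        i, j = rng.choice(len(cloth_normals), size=2, replace=False)
        for si, sj in itertools.product([1.0, -1.0], repeat=2):
            R = rotation_from_normals(
                cloth_normals[[i, j]],
                np.vstack([si * camera_normals[i], sj * camera_normals[j]]))
            pred = cloth_normals @ R.T
            # Score all triangles; the min over sign resolves each ambiguity.
            err = np.minimum(
                np.linalg.norm(pred - camera_normals, axis=1),
                np.linalg.norm(pred + camera_normals, axis=1)).sum()
            if err < best_err:
                best_R, best_err = R, err
    return best_R, best_err
```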

3D Reconstruction With a fairly accurate estimate of an orthographic camera model, we can reconstruct the locations of the triangles in each camera. However, when our cameras observe the scene from radically different angles, there are perspective effects. To compensate, we upgrade to a perspective camera model. We initialize with rough values from physical measurements and use nonlinear optimization (bundle adjustment) to compute more accurate camera parameters. Bundle adjustment reduces reprojection errors from above 10 pixels to less than 3 pixels.
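A minimal bundle-adjustment sketch in this spirit, written with SciPy's nonlinear least squares: each camera is parameterized by a rotation vector, a translation and a focal length, and the reprojection residuals over all cameras and points are jointly minimized. The synthetic data, the 7-parameter camera model and the noise levels are assumptions chosen for illustration.

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

def project(points, rvec, tvec, f):
    """Pinhole projection: rotate, translate, divide by depth, scale by f."""
    cam = Rotation.from_rotvec(rvec).apply(points) + tvec
    return f * cam[:, :2] / cam[:, 2:3]

def residuals(params, obs, n_cams, n_pts):
    """Stacked reprojection errors for all cameras and all 3D points."""
    cams = params[:7 * n_cams].reshape(n_cams, 7)
    pts = params[7 * n_cams:].reshape(n_pts, 3)
    res = [project(pts, c[:3], c[3:6], c[6]) - obs[k] for k, c in enumerate(cams)]
    return np.concatenate(res).ravel()

# Synthetic example: jitter the true cameras and points, then let the
# optimizer pull the reprojection error back down, as bundle adjustment does.
rng = np.random.default_rng(1)
n_cams, n_pts = 4, 50
true_pts = rng.normal(size=(n_pts, 3)) + [0.0, 0.0, 10.0]
true_cams = np.hstack([rng.normal(0, 0.1, (n_cams, 6)),
                       np.full((n_cams, 1), 800.0)])  # assumed focal length
obs = np.stack([project(true_pts, c[:3], c[3:6], c[6]) for c in true_cams])

x0 = np.concatenate([(true_cams + rng.normal(0, 0.02, true_cams.shape)).ravel(),
                     (true_pts + rng.normal(0, 0.05, true_pts.shape)).ravel()])
fit = least_squares(residuals, x0, args=(obs, n_cams, n_pts))
print("mean reprojection error:", np.abs(fit.fun).mean(), "pixels")
```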

Post Processing After reconstructing the 3D location and parametric value of each observed point on the cloth, minor errors remain: image noise in localization, temporal noise, errors in camera calibration and missing observations. While some of these errors can be corrected with simple smoothing and minor validation, regions of missed detections and disagreements between cameras present more significant problems. We use a cloth model to fill in gaps. Because the gaps are small and well constrained, our cloth simulation does not suffer from typical stiffness and stability problems. We also use the surface parameterization to drive a strain-reduction step.
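As a rough stand-in for the gap-filling step, the sketch below pins the observed mesh vertices and relaxes each unobserved vertex toward the average of its grid neighbors, a membrane-like Laplacian fill rather than a full cloth simulation. Because the holes are small and surrounded by pinned vertices, the relaxation is stable, mirroring the observation above. The grid representation and names are assumptions.

```python
import numpy as np

def fill_gaps(grid_pts, observed_mask, iters=500):
    """Fill missing vertices of a regular cloth mesh by pinned relaxation.

    grid_pts      : (H, W, 3) reconstructed positions (NaN where missing)
    observed_mask : (H, W) True where the vertex was actually reconstructed
    """
    pts = grid_pts.copy()
    pts[~observed_mask] = pts[observed_mask].mean(axis=0)  # crude initialization
    for _ in range(iters):
        acc = np.zeros_like(pts)
        cnt = np.zeros(pts.shape[:2] + (1,))
        acc[1:] += pts[:-1]        # neighbor above
        cnt[1:] += 1
        acc[:-1] += pts[1:]        # neighbor below
        cnt[:-1] += 1
        acc[:, 1:] += pts[:, :-1]  # neighbor to the left
        cnt[:, 1:] += 1
        acc[:, :-1] += pts[:, 1:]  # neighbor to the right
        cnt[:, :-1] += 1
        avg = acc / cnt
        pts[~observed_mask] = avg[~observed_mask]  # observed vertices stay pinned
    return pts
```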

Results We capture and retexture several sequences that would be difficult to simulate due to complex environmental interactions, including mechanical contact and wind (shown in the figure). We filmed using four video cameras capturing at 30 Hz.

References

LOBAY, A., AND FORSYTH, D. 2004. Recovering shape and irradiance maps from rich dense texton fields. In Proceedings of Computer Vision and Pattern Recognition (CVPR).