

MULTIPLE VIEW SEGMENTATION AND MATTING

Markus Kettern, David C. Schneider and Peter Eisert

{markus.kettern, david.schneider, peter.eisert}@hhi.fraunhofer.de – Fraunhofer Heinrich Hertz Institute, Berlin, Germany

Abstract

We propose a robust and fully automatic method for extracting a highly detailed, transparency-preserving segmentation of a person's head from multiple view recordings, including a background plate for each view. At first, trimaps containing a rough segmentation into foreground, background and unknown image regions are extracted exploiting the visual hull of an initial foreground-background segmentation. The background plates are adapted to the lighting conditions of the recordings and the trimaps are used to initialize a state-of-the-art matting method adapted to a setup with precise background plates available. From the alpha matte, foreground colours are inferred for realistic rendering of the recordings onto novel backgrounds.

Keywords: Multiview Segmentation, Visual Hull, Matting.

Initial Segmentation. An initial binary segmentation is based on the intensity and RGB variance of the difference image between image I (fig. 1a) and the respective background plate B (fig. 1b). These cues are combined into a foreground likeliness image P (fig. 1c). Segmentation is performed by thresholding P at the most stable region of its cumulative histogram such that between 20% and 90% of the image is covered by the foreground (fig. 1d).
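
The following Python sketch illustrates the thresholding idea. How the intensity and RGB-variance cues are combined into P, and the interpretation of the "most stable region" as the flattest part of the cumulative histogram, are assumptions on my part; the paper does not spell these details out.

```python
import numpy as np

def initial_segmentation(I, B, min_fg=0.2, max_fg=0.9, bins=256):
    """Rough binary segmentation of image I against background plate B (sketch).

    Builds a foreground-likeliness image P from the difference of I and B and
    thresholds it where the cumulative histogram of P is flattest ("most stable"),
    restricted to thresholds that keep the foreground coverage between min_fg
    and max_fg of the image area.
    """
    I = I.astype(np.float64)
    B = B.astype(np.float64)
    diff = I - B
    intensity = np.abs(diff.mean(axis=2))   # per-pixel intensity difference
    rgb_var = diff.var(axis=2)              # variance of the RGB difference
    P = intensity + rgb_var                 # simple additive cue combination (assumption)
    P /= P.max() + 1e-12

    hist, edges = np.histogram(P, bins=bins, range=(0.0, 1.0))
    cum = np.cumsum(hist) / P.size          # fraction of pixels below each bin edge
    fg_fraction = 1.0 - cum                 # foreground coverage if thresholded here
    valid = (fg_fraction >= min_fg) & (fg_fraction <= max_fg)
    if not valid.any():
        raise ValueError("no threshold yields the requested foreground coverage")

    # "Most stable" threshold: the bin where the cumulative histogram changes least,
    # i.e. where the fewest pixels lie close to the threshold value.
    stability = np.where(valid, hist, hist.max() + 1)
    t = edges[np.argmin(stability)]
    return P > t
```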

Trimap Generation. A trimap is a common initialization for image matting and consists of three regions: the foreground region Mf will have an α-value of 1, the background Mb will have an α of zero, and Mu contains the pixels for which α has to be inferred. To obtain a trimap, the visual hull [2] of the object is created from the initial segmentations employing an efficient octree representation (fig. 1e). Only voxels visible in the initial segmentations of all views are included in the hull. The backprojection of this hull into each view yields region Mf of the trimap. A dilated version of the initial segmentation yields the region Mu, completing the trimap (fig. 1f).
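
As an illustration of how the trimap can be assembled once the hull has been backprojected, here is a minimal Python sketch. The visual hull computation itself (octree carving and backprojection into the view) is assumed to be available from elsewhere, and the dilation radius is a made-up tuning parameter, not a value from the paper.

```python
import numpy as np
from scipy import ndimage

def build_trimap(hull_projection, initial_mask, dilation_radius=15):
    """Build a trimap for one view (sketch).

    hull_projection : bool array, backprojection of the visual hull into this view
    initial_mask    : bool array, the initial binary segmentation of this view
    Returns a uint8 trimap: 255 = foreground Mf, 128 = unknown Mu, 0 = background Mb.
    """
    M_f = hull_projection
    # Dilating the initial segmentation gives the outer bound of Mu ∪ Mf.
    structure = np.ones((2 * dilation_radius + 1,) * 2, dtype=bool)
    dilated = ndimage.binary_dilation(initial_mask, structure=structure)
    M_u = dilated & ~M_f

    trimap = np.zeros(initial_mask.shape, dtype=np.uint8)
    trimap[M_u] = 128
    trimap[M_f] = 255
    return trimap
```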

Background Plate Adaption. Due to shadows, daylight and other factors arising during a recording session, the illumination of B may differ significantly from I. To make the matting process more robust and precise, we estimate the true background colors behind the object to be segmented by a Poisson interpolation [4] over region Mu ∪ Mf of I, using the gradient of B as guidance field. The adapted background plate is called B∗.
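
A minimal sketch of this adaptation step: inside the region Mu ∪ Mf it solves the discrete Poisson equation with the Laplacian of B as right-hand side (i.e. the gradient of B as guidance field) and the observed image I as Dirichlet boundary condition. The paper does not prescribe a solver; the plain Jacobi iteration and its parameters below are assumptions chosen for brevity.

```python
import numpy as np

def adapt_background(I, B, region, iterations=2000):
    """Estimate an adapted background plate B* behind the object (sketch).

    I, B    : float arrays of shape (H, W, 3)
    region  : bool array of shape (H, W), the unknown region Mu ∪ Mf
    Inside `region` the Laplacian of B* is matched to the Laplacian of B;
    outside, B* is fixed to the observed image I.
    """
    I = I.astype(np.float64)
    B = B.astype(np.float64)
    B_star = np.where(region[..., None], B, I)   # initialise: I outside, B inside
    # Discrete Laplacian of B (np.roll wraps at the image border; this periodic
    # simplification is harmless as long as the region does not touch the border).
    lap_B = (np.roll(B, 1, 0) + np.roll(B, -1, 0) +
             np.roll(B, 1, 1) + np.roll(B, -1, 1) - 4.0 * B)

    inside = region[..., None]
    for _ in range(iterations):
        neighbours = (np.roll(B_star, 1, 0) + np.roll(B_star, -1, 0) +
                      np.roll(B_star, 1, 1) + np.roll(B_star, -1, 1))
        update = (neighbours - lap_B) / 4.0      # Jacobi step for ΔB* = ΔB
        B_star = np.where(inside, update, B_star)  # keep boundary values from I fixed
    return B_star
```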

Alpha Matting and Foreground Estimation. A matte associates each image pixel p ∈ Mu with a transparency value αp, typically thought of as a blending factor between two unknown colours, F̂p and B̂p. The SampleMatch algorithm [1] stochastically searches for these colors in a space spanning all combinations of image pixels located at the borders δ(Mu,Mf) and δ(Mu,Mb). Instead of using the whole border δ(Mu,Mb) as candidate background colors for each pixel p, we only use the colors of a small region around p in B∗. In this way, we decrease the size of the search space and avoid many ambiguities arising from foreground/background similarities. To obtain smoother results, the mattes are regularized with the matting Laplacian of [3] (fig. 1g). Once the transparency αp of each pixel p is known, the foreground colour can be directly estimated by

F∗p = (Ip − (1 − αp) B∗p) / αp    (1)

and used to create new composites (fig. 1h).
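
A short Python sketch of eq. (1) and the final compositing step. The clipping of the recovered colours and the guard against division by very small αp are practical assumptions, not part of the paper.

```python
import numpy as np

def estimate_foreground_and_composite(I, B_star, alpha, new_background, eps=1e-3):
    """Recover foreground colours via eq. (1) and composite onto a new background.

    I, B_star, new_background : float arrays in [0, 1], shape (H, W, 3)
    alpha                     : float array in [0, 1], shape (H, W)
    Where alpha is (near) zero the foreground colour is undefined; those pixels
    simply take the new background after compositing.
    """
    a = alpha[..., None]
    # Eq. (1): F* = (I - (1 - alpha) * B*) / alpha, guarded against division by ~0.
    F_star = (I - (1.0 - a) * B_star) / np.maximum(a, eps)
    F_star = np.clip(F_star, 0.0, 1.0)

    # Standard compositing equation onto the novel background plate.
    composite = a * F_star + (1.0 - a) * new_background
    return F_star, composite
```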


Figure 1: Processing sequence of trimap and matte generation.

References

[1] K. He, C. Rhemann, C. Rother, X. Tang, and J. Sun. A global sampling method for alpha matting. In CVPR, 2011.

[2] A. Laurentini. The visual hull concept for silhouette-based image understanding. PAMI, 16:150–162, 1994.

[3] A. Levin, D. Lischinski, and Y. Weiss. A closed-form solution to natural image matting. PAMI, 30, 2008.

[4] P. Pérez, M. Gangnet, and A. Blake. Poisson image editing. In ACM SIGGRAPH, 2003.