multi-view real-time depth estimation based on combination of visual-hull and hybrid recursive...

Post on 26-Mar-2015


Multi-view real-time depth estimation based on combination of visual-hull and hybrid recursive matching

HHI

Wolfgang Waizenegger

Overview

• Field of application: 3D Presence
  – 2D Videoconferencing
  – 3D Videoconferencing
  – 3D Presence concept and 3D displays
  – The camera system
• 3D Analysis
  – 3D algorithmic chain
  – Hybrid recursive matching (HRM)
  – Visual Hull (VH)
  – HRM and VH combination
• Results
• Hardware
• Conclusion and Outlook

3D Presence Consortium

SoA of Telepresence Systems

Polycom TPX System

Telepresence System by CISCO

HP Halo Telepresence System

Drawbacks of conventional telepresence systems

• Drawbacks:
  – No eye contact, e.g. it is hard to recognize who is talking to whom
  – Misleading gestures and body language
• Ideal situation: every local participant has their own view of each remote conferee
• Solution: immersive 3D videoconferencing

Missing eye contact (CISCO system)

SoA of 3D Videoconferencing

MultiView by Univ. of California, Berkeley, 2004

Virtue/im.point by Fraunhofer HHI, 2003/2004

Real Meet Room, France Telecom R&D, 2001

The concept of 3D Presence

Three parties, two conferees per party

• Multi-party 3D videoconferencing
• 3D multi-user auto-stereoscopic display technology
• Multi-party eye contact and gesture-based interaction

Replace remote conferees by 3D displays

Multi-View 3D Displays

Multiple 3D views from different perspectives

Advantages:
- Own view for each local conferee
- Adapted viewing perspective
- 3D impression
- Multiple views allow conferees to switch perspective by moving the head

multiple viewing cones

Multi-View 3D Display

The Multi-View Camera System

Narrow baseline system
• Robust disparity estimation
• Consistency check by trifocal matching

[Figure: combined trifocal system with baselines b and kb; labels: horizontal narrow baseline system, vertical narrow baseline system, horizontal wide baseline system, vertical wide baseline system]

Wide baseline system
• Increased depth resolution
• Option to combine with Visual Hull
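The trade-off between the two baselines follows from the standard rectified-stereo relation Z = f·b/d. The sketch below assumes a pinhole model with the focal length f given in pixels. The function names and all numeric values (f = 1000 px, b = 0.1 m, k = 4, object at 2 m) are illustrative assumptions and are not taken from the 3D Presence setup.

```python
def depth_from_disparity(f_px, baseline_m, disparity_px):
    """Depth Z = f * b / d for a rectified stereo pair."""
    return f_px * baseline_m / disparity_px


def depth_step(f_px, baseline_m, depth_m, disparity_step_px=1.0):
    """Approximate depth change per disparity step: dZ ~ Z^2 / (f * b) * dd."""
    return depth_m ** 2 / (f_px * baseline_m) * disparity_step_px


# Hypothetical values: narrow baseline b = 0.1 m, wide baseline k*b with k = 4.
f_px, z = 1000.0, 2.0
narrow = depth_step(f_px, 0.1, z)  # depth change for one pixel of disparity
wide = depth_step(f_px, 0.4, z)    # four times finer than the narrow baseline
```

With these assumed numbers one pixel of disparity corresponds to about 4 cm of depth for the narrow baseline and about 1 cm for the wide one, which is the "increased depth resolution" listed above.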

The Mock-up for Camera Configuration Testing

3D Analysis Chain

[Diagram, block labels: n stereo streams; segmentation; disparity estimation; volumetric reconstruction; head tracking; hand tracking; data fusion; depth maps; 3D modeling; data; occlusion information etc.; video + depth (n)]

Hybrid-Recursive Matching (HRM)

[Diagram, block labels: left image; right image; block recursion; 3 candidates; pixel recursion; start vector; update vector; choice of best disparity; disparity memory; disparity vector]
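The candidate test at the heart of the block recursion can be sketched as follows: a few candidate disparities are evaluated for the current pixel and the one with the lowest matching cost is kept. This is a minimal sketch under assumptions, not the HHI implementation. The sum of absolute differences (SAD) as cost, the block size, and the names `block_sad` and `choose_best_disparity` are hypothetical. Which three candidates are tested is described in Atzpadin et al. (2004).

```python
import numpy as np


def block_sad(left, right, y, x, d, half=2):
    """Sum of absolute differences between a left block and the right block shifted by d."""
    h, w = left.shape
    y0, y1 = max(y - half, 0), min(y + half + 1, h)
    x0, x1 = max(x - half, 0), min(x + half + 1, w)
    # Assumed convention: left pixel x corresponds to right pixel x - d.
    xs = np.clip(np.arange(x0, x1) - d, 0, w - 1)
    return float(np.abs(left[y0:y1, x0:x1] - right[y0:y1, xs]).sum())


def choose_best_disparity(left, right, y, x, candidates):
    """Test all candidate disparities and keep the one with the lowest SAD."""
    return min(candidates, key=lambda d: block_sad(left, right, y, x, d))


# Usage with hypothetical candidates, e.g. values read from the disparity
# memory plus the update vector delivered by the pixel recursion.
rng = np.random.default_rng(0)
left = rng.random((20, 40))
right = np.roll(left, -3, axis=1)  # synthetic pair with disparity 3 everywhere
best = choose_best_disparity(left, right, y=10, x=20, candidates=[0, 3, 5])
```

Because only a handful of candidates is evaluated per pixel instead of the full disparity range, this kind of scheme is well suited to real-time processing.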

Trifocal system

[Figure panels: vertical narrow baseline; horizontal narrow baseline; after consistency check]
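One simple way to realize such a consistency check is sketched below. It assumes that the horizontal and the vertical pair share the same reference camera, focal length and baseline b, so that a scene point produces the same disparity magnitude in both pairs. The tolerance, the averaging of agreeing values, and the name `trifocal_consistency` are assumptions made for illustration. The slides do not specify the exact criterion.

```python
import numpy as np


def trifocal_consistency(disp_h, disp_v, tol=1.0, invalid=np.nan):
    """Keep disparities on which the horizontal and vertical pair agree within tol pixels."""
    disp_h = np.asarray(disp_h, dtype=float)
    disp_v = np.asarray(disp_v, dtype=float)
    agree = np.abs(disp_h - disp_v) <= tol
    # Agreeing pixels get the mean of both estimates, all others are marked invalid.
    return np.where(agree, 0.5 * (disp_h + disp_v), invalid)


checked = trifocal_consistency([[10, 20], [30, 40]], [[10.5, 25], [30, 38]])
```

Pixels rejected here are holes in the depth map that a later stage has to fill.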

Multi-View Video Analysis Chain

[Diagram, block labels: n stereo streams; segmentation; disparity estimation; volumetric reconstruction; head tracking; hand tracking; data fusion; depth maps; 3D modeling; data; occlusion information etc.; video + depth (n)]

Colored Visual Hull reconstruction

Visual Hull Techniques

• Polygonal
• Volume based space carving (VH)
• Image based (IBVH)

3D Presence demands real-time processing!

Parallelization of the last two approaches on graphics hardware is straightforward!
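The volume based variant is easy to parallelize because every voxel is tested independently. A minimal NumPy sketch of space carving, assuming calibrated 3x4 projection matrices and binary silhouette masks, is given below. The function name `carve_visual_hull` and the nearest-pixel lookup are illustrative choices, and a GPU version would run the same per-voxel test in one thread per voxel.

```python
import numpy as np


def carve_visual_hull(voxels, projections, silhouettes):
    """Keep the voxel centres whose projection falls inside every silhouette.

    voxels:       (N, 3) array of 3D points
    projections:  list of 3x4 camera matrices
    silhouettes:  list of boolean (H, W) masks, True on the foreground
    """
    homog = np.hstack([voxels, np.ones((len(voxels), 1))])
    inside = np.ones(len(voxels), dtype=bool)
    for P, mask in zip(projections, silhouettes):
        uvw = homog @ P.T                       # (N, 3) homogeneous pixels
        in_front = uvw[:, 2] > 0
        w_safe = np.where(in_front, uvw[:, 2], 1.0)
        u = np.rint(uvw[:, 0] / w_safe).astype(int)
        v = np.rint(uvw[:, 1] / w_safe).astype(int)
        h, w = mask.shape
        visible = in_front & (u >= 0) & (u < w) & (v >= 0) & (v < h)
        hit = np.zeros(len(voxels), dtype=bool)
        hit[visible] = mask[v[visible], u[visible]]
        inside &= hit                           # carved if any camera misses
    return inside
```

A voxel that projects outside an image is treated as carved here, which is an assumption that fits setups where every camera sees the whole working volume.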

IBVH Algorithm

Our implementation is based on the initial work of Matusik et al. (2000)

Advantages of our algorithm
• Improved caching strategy that allows pixel pre-selection, which significantly speeds up the computation
• GPU-only implementation using CUDA
• Establishes an interconnection to voxel based implementations by applying cameras at infinity
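The core operation of an image based visual hull in the sense of Matusik et al. is an interval intersection per viewing ray: each silhouette contributes the depth intervals in which the ray lies inside that silhouette cone, and the hull along the ray is the intersection over all cameras. The sketch below shows only this intersection step on given intervals. The computation of the intervals from the silhouettes, the caching and the pixel pre-selection are not shown, and the function names are hypothetical.

```python
def intersect_intervals(a, b):
    """Intersect two sorted lists of (start, end) intervals along one viewing ray."""
    result, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        lo, hi = max(a[i][0], b[j][0]), min(a[i][1], b[j][1])
        if lo < hi:
            result.append((lo, hi))
        # Advance the list whose current interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return result


def ray_visual_hull(per_camera_intervals):
    """Intersect the silhouette intervals of all cameras for one ray."""
    hull = per_camera_intervals[0]
    for intervals in per_camera_intervals[1:]:
        hull = intersect_intervals(hull, intervals)
    return hull


# Three cameras, depth intervals along one ray (made-up numbers).
hull = ray_visual_hull([
    [(1.0, 3.0), (5.0, 8.0)],
    [(2.0, 6.0)],
    [(0.0, 2.5), (5.5, 10.0)],
])
```

The start of the first remaining interval is the depth of the visible hull surface for that pixel, so a depth map results directly without building a volume.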

IBVH interconnection to voxel based methods

VH vs. IBVH

Timings for two GPU based implementations at different resolutions. The image upload time is included.

• Volume based approach from Ladikos et al. 2008 (VH_Lad)
• Our image based approach (PPSIBVH: with pixel pre-selection; IBVH: without pixel pre-selection)

Input: Middlebury dinoRig dataset (48 images, 640 x 480)

          Hardware       128³        256³         512³
VH_Lad    4 x 8800GTX    99.89 ms    296.71 ms    -
IBVH      1 x GTX280     47.9 ms     82.5 ms      280.6 ms
PPSIBVH   1 x GTX280     41.6 ms     60.9 ms      150.6 ms
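Expressed as speedups, the effect of pixel pre-selection becomes easier to read. The snippet below only divides the IBVH timings by the PPSIBVH timings listed above. The dictionary layout and variable names are illustrative.

```python
# Timings in milliseconds, keyed by the edge length of the reconstruction volume.
timings_ms = {
    "IBVH":    {128: 47.9, 256: 82.5, 512: 280.6},
    "PPSIBVH": {128: 41.6, 256: 60.9, 512: 150.6},
}

# Speedup of PPSIBVH over plain IBVH for each resolution.
speedup = {
    n: timings_ms["IBVH"][n] / timings_ms["PPSIBVH"][n]
    for n in (128, 256, 512)
}
```

The gain grows with resolution: roughly 1.15x at 128³, 1.35x at 256³ and 1.86x at 512³.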

IBVH result for the dinoRig dataset

Left: voxel representation of the IBVH result (512³). Right: image based depth map.

IBVH result for a 3D Presence conferee

Timing for a typical 3D Presence setup with depth maps of 192x256 and 8 Visual Hull cameras: 10–20 msec on a single GTX280.

For comparison, Soares et al. use an eight-CPU dual Opteron 2.2 GHz machine to achieve almost the same results with 5 cameras and an octree based Visual Hull algorithm.

Combination HRM and VH

Result for the combination of HRM and VH

Combination HRM and VH (cont.)

Realization: Hardware Overview for the 3D Presence setup

• 5 x PCs with dual Nehalem Xeon CPUs
• 2 x Geforce GTX295 per cluster node
• Infiniband 40GB/s interconnection

3D Presence System Architecture

[Diagram, cluster nodes: Node_VH, Node_0, Node_1, Node_2, Node_3]

Node_N
- Capture (4 cameras)
- Segmentation
- Lens un-distortion
- Rectification
- HRM (trifocal)
- Bilateral filtering
- Virtual view generation
- Encoding (video+depth)
- Networking
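Of the per-node stages, bilateral filtering is the one whose purpose is least obvious from its name: it smooths the depth map while keeping depth discontinuities sharp. A plain NumPy sketch of a standard bilateral filter is given below. The parameter values and the name `bilateral_filter_depth` are assumptions, and the filter actually used in the system may differ (for example by taking the colour image as guidance).

```python
import numpy as np


def bilateral_filter_depth(depth, radius=2, sigma_s=1.5, sigma_r=0.05):
    """Edge-preserving smoothing: weights fall off with pixel distance and depth difference."""
    depth = np.asarray(depth, dtype=float)
    h, w = depth.shape
    padded = np.pad(depth, radius, mode="edge")
    acc = np.zeros_like(depth)
    norm = np.zeros_like(depth)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Neighbour image shifted by (dy, dx), same size as the input.
            shifted = padded[radius + dy:radius + dy + h, radius + dx:radius + dx + w]
            w_s = np.exp(-(dx * dx + dy * dy) / (2 * sigma_s ** 2))       # spatial weight
            w_r = np.exp(-((shifted - depth) ** 2) / (2 * sigma_r ** 2))  # range weight
            acc += w_s * w_r * shifted
            norm += w_s * w_r
    return acc / norm
```

Every output pixel depends only on a small neighbourhood of the input, so the filter maps naturally to one GPU thread per pixel.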

Indispensability of GPUs

• Hardware:
  – CPU: Intel 3.0GHz (single core computation)
  – GPU: Geforce GTX280
• Input:
  – Images: 1024 x 768, RGB24
  – Depth Maps: 1024 x 768, float
• GPU results include up- and download times

                                      GPU        CPU
Lens un-distortion + rectification    2 msec     68 msec
Bilateral filtering of depth map      11 msec    1000 msec
Virtual view synthesis (RGB)          1 msec     150 msec

Demo

Virtual view generation based on estimated depth maps
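The slides do not state which rendering method is used for the demo. As a generic illustration of depth based view generation, the sketch below forward-warps a rectified view along the horizontal axis by a fraction of its disparity and resolves collisions with a z-buffer. The function name `synthesize_view`, the parameter `alpha` and the sign convention are assumptions.

```python
import numpy as np


def synthesize_view(image, disparity, alpha):
    """Forward-warp a rectified view by alpha * disparity with a z-buffer (nearest wins).

    image:     (H, W) or (H, W, C) array
    disparity: (H, W) array in pixels, larger values are closer to the camera
    alpha:     virtual camera position, 0 = original view, 1 = neighbouring view
    Returns the warped image and a mask of the holes left by disocclusions.
    """
    h, w = disparity.shape
    out = np.zeros_like(image)
    best = np.full((h, w), -np.inf)   # largest disparity written so far per target pixel
    for y in range(h):
        for x in range(w):
            xt = int(round(x - alpha * disparity[y, x]))
            if 0 <= xt < w and disparity[y, x] > best[y, xt]:
                best[y, xt] = disparity[y, x]
                out[y, xt] = image[y, x]
    holes = np.isinf(best)            # target pixels that received no source pixel
    return out, holes


# One scanline: two foreground pixels (disparity 2) in front of a flat background.
row = np.arange(8).reshape(1, 8)
disp = np.array([[0.0, 0.0, 0.0, 0.0, 2.0, 2.0, 0.0, 0.0]])
warped, holes = synthesize_view(row, disp, alpha=1.0)
```

In the example the foreground pixels move two positions to the left and cover the background there, while the positions they leave behind are reported as holes that need inpainting or data from another camera.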

Conclusion and Outlook

• Three party immersive 3D videoconferencing system
• Real-time 3D analysis for a 16 camera setup
• Fast IBVH algorithm which runs entirely on a single GPU
• Combination of trifocal HRM and VH significantly improves the results
• All processing runs in real-time on only 5 PCs
• System allows rapid testing of various camera configurations
• First real-time demonstrator prototype available by October 2009
• Future: Full HD real-time 3D processing chain

Thank you!

Contact: Wolfgang.Waizenegger@fraunhofer.hhi.de

Web: www.3dpresence.eu

References

Atzpadin, N., Kauff, P. and Schreer, O.: Stereo Analysis by Hybrid Recursive Matching for Real-Time Immersive Video Conferencing, IEEE Transactions on Circuits and Systems for Video Technology, Special Issue on Immersive Telecommunications, vol. 14, no. 3, pp. 321-334, January 2004.

Matusik, W., Buehler, C., Raskar, R., Gortler, S. J. and McMillan, L.: Image-Based Visual Hulls, Proceedings of the 27th Annual Conference on Computer Graphics and Interactive Techniques, 2000.

Ladikos, A., Benhimane, S. and Navab, N.: Efficient Visual Hull Computation for Real-Time 3D Reconstruction using CUDA, IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Workshop on Visual Computer Vision on GPUs (CVGPU), Anchorage, Alaska (USA), June 2008.

Soares, L., Menier, C., Raffin, B. and Roch, J.L.: Parallel Adaptive Octree Carving for Real-Time 3D Modeling, Poster at IEEE VR 2007 - Virtual Reality, Charlotte, North Carolina, USA, March 2007.
