
D1 - 21/04/23

This document contains information that is the property of France Télécom. Acceptance of this document by its recipient implies that the recipient acknowledges the confidential nature of its contents and undertakes not to reproduce it, transmit it to third parties, disclose it, or use it commercially without the prior written consent of France Télécom Recherche et Développement.

France Télécom Recherche & Développement

Raphaèle Balter

Construction of a scalable and evolving 3D model for video coding

27/05/2005

Context

complementary background related to the subject

Context: Evolution of the digital video world

  - New sources/terminals (DV, more powerful computers or terminals, IP...)
  - Various networks (Internet, PSTN telephone networks, GSM…)
  - New functionalities (interactivity, 3D games, DVD, broadcasting...)

New orientations in video coding: compact transmission with added functionalities; content-adapted coding

Transmission

coding/decoding

Context: Representations from images

[Figure: spectrum of image-based representations, from less geometry to more geometry: rendering with no geometry, rendering with implicit geometry, rendering with explicit geometry. Techniques shown: light field, Lumigraph, mosaics, view interpolation, view morphing, LDIs, texture-mapped 3D model. End points: video and metric 3D model]

Less geometry: + photorealistic rendering, - dataset volume, - acquisition system

More geometry: + compact representation, + acquisition system, - rendering

Context: 3D model based coding

• Principle:

[Figure: processing chain with stages capture, digitization, analysis and display; data: real world, video camera, original sequence, 3D model, reconstructed sequence]

Context: 3D model based representations

• Goal:
  - 3D extraction
  - Camera parameters computation

• Assisted modeling:
  - Human intervention [debevec96][debevec00]
  - Specific acquisition system
    – Turntable [niem94][debevec96][gibson98]
    – Robot [mellor00][zisserman01]
  - Knowledge of the scene contents
    – Faces [preteux00][girod02]
    – Architectural scenes [faugeras95][hartley00][dick00][bazin01][werner02][sturm02]

Context: 3D model based representations

• Non-assisted modeling:
  - Limits: only for static scenes, without reflections or pure-rotation camera motion
  - Single model [fitzgibbon99][roning99][pollefeys00][yao02][Nis03][yu04]
    [Figure: the original sequence I0, I5, ..., In is represented by one 3D model]
  - 3D model stream [galpin02]
    [Figure: the original sequence I0, I3, I8, I20, ..., In is represented by successive 3D models M0, M1, ...]

Objectives

• 3D representation suited for coding

• Envisioned applications
  - Video coding for distant real-time visualization on heterogeneous terminals

[Figure: service providers]

• Constraints:
  - Non-assisted modeling
  - No assumptions on camera parameters or on scene content
  - No assumptions on video length
  - Scalability

Problems

Solution = tradeoff between a single model and a stream

• Representations:
  - Single realistic model
    + Realistic, consistent representation
    – Incompatible with the video coding constraint on video length
  - 3D model stream
    + No assumption on video length
    + Adapted to streaming
    – Inconsistency of the representation
    – Transitions between models

Problems

• Scalability:
  - To represent a signal with several levels of information
  - Allowing adaptation of a signal to
    – the capabilities of the networks
    – transmission losses
    – the capabilities of the terminals

[Figure: the video bitstream is organized as a base stream plus refinement layers, to cope with network bandwidth, losses, and terminal computational and rendering capabilities]

Need for a multi-resolution representation

Proposed scheme:

[Figure: processing chain from video to bitstream: 3D reconstruction [galpin02], evolving model construction (a priori or a posteriori morphing), hierarchical coding (wavelet analysis), compression]

! Evolving model construction (morphing): automatic algorithm

! Hierarchical coding: evolving structure

Overview

• 3D information extraction

• Evolving model construction

• Evolving model coding

• Evolving model compression

• Conclusions/Perspectives

3D extraction: principle of Galpin algorithm

• 3D model stream:
  - Classical structure-from-motion algorithm [faugeras93][horaud93][hartley-zisserman2004]
  - Model valid for a portion of the original sequence: a GOF (Group of Frames)
  - GOF delimited by keyframes used as texture images
  - Keyframe selection based on several criteria:
    – Global motion
    – 3D validity: epipolar residual
    – Ratio of outgoing points
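The "3D validity" keyframe criterion relies on an epipolar residual. A minimal sketch of such a residual is given here, assuming a known fundamental matrix `F` between two frames and matched interest points in pixel coordinates. The function name and the choice of the mean algebraic error |m2ᵀ F m1| are illustrative assumptions, not necessarily the measure used in [galpin02].

```python
def epipolar_residual(F, matches):
    """Mean algebraic epipolar error |m2^T F m1| over point matches.

    F is a 3x3 fundamental matrix given as a list of rows; matches is a
    list of ((x1, y1), (x2, y2)) pixel correspondences.
    Hypothetical helper, for illustration only.
    """
    total = 0.0
    for (x1, y1), (x2, y2) in matches:
        m1 = (x1, y1, 1.0)
        m2 = (x2, y2, 1.0)
        # Epipolar line of m1 in the second image: l = F m1
        line = [sum(F[r][c] * m1[c] for c in range(3)) for r in range(3)]
        total += abs(sum(m2[r] * line[r] for r in range(3)))
    return total / len(matches)


# Pure horizontal camera translation (identity intrinsics assumed):
# F = [t]_x with t = (1, 0, 0), so matched points must share the same row.
F_translation = [[0.0, 0.0, 0.0],
                 [0.0, 0.0, -1.0],
                 [0.0, 1.0, 0.0]]
consistent = [((10.0, 5.0), (14.0, 5.0)), ((30.0, 20.0), (31.0, 20.0))]
inconsistent = [((10.0, 5.0), (14.0, 8.0))]
```

A residual close to zero indicates that the two views are consistent with a rigid scene, while a large value suggests that the current model is no longer valid and that a new keyframe may be needed.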

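[Figure: two-view geometry: a 3D point M(X,Y,Z) projects to m1 and m2 in the cameras C1 and C2]

[Figure: the original sequence I0, I3, I8, I20, ..., In is split into GOF 1, GOF 2, ... delimited by keyframes; each GOF yields a 3D model (M0, M1, ...). The 3D models, camera positions and texture images give the reconstructed sequence]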

3D extraction: global scheme [galpin]

[Figure: block diagram of the 3D extraction chain, from the input images to the coder]
  - Keyframe selection and saving of the keyframes
  - Extraction and tracking of interest points [harris88]: interest point motion
  - Estimation of the intra-GOF camera poses [dementhon95]: camera positions
  - Motion estimation of pixels [marquant00]: dense motion field
  - Estimation of the depth image [huang84]
  - 3D mesh computation from a triangulation of the keyframe associated with the GOF: mesh and depth
  - Estimation of the textured 3D model: textured 3D models, sent to the coder

3D extraction: limits of Galpin 3D model stream

• Stream of independent 3D models:
  - Uniform regular meshes
  - Different fields of view

Abrupt transitions between models:
  - Geometric jump
  - Texture jump (texture image k vs. texture image k+1)
  - Connectivity jump


Construction: a posteriori morphing

• Evolving model:
  - Tradeoff between a single model and a 3D model stream
  - Model stream with 3D morphing to link models together

• Morphing [hong88][parent92][lazarus98][alexa02]
  - Two-step process:
    – Vertex mapping
    – Interpolation between corresponding vertices
  - Efficient methods are semi-automatic [bethel89][kent92][delingette93][decarlo96][lee99][zockler00][kanai00][michikawa01]
    => not compatible with our scheme

• Contributions not detailed here:
  - A posteriori morphing of meshed depth maps [balter03]
  - A posteriori 3D model morphing [leguen04]

Construction: a priori morphing

• Principle of the new encoding scheme:
  - No more uniform grid
  - Corresponding vertices: vertices of successive models are the same physical 3D points of the scene
  - Implicit morphing based on those corresponding vertices = simple linear interpolation

$$M_c = (1-\alpha)\,M_n + \alpha\,M_{n+1} \quad\text{with}\quad \alpha = \frac{t_c - t_n}{t_{n+1} - t_n}$$
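The implicit morphing can be sketched in a few lines. The example assumes that the two successive models are given as lists of corresponding 3D vertices in the same order, and that `t_n`, `t_n1` and `t_c` are the time stamps of model n, model n+1 and the current frame. The function name `morph` is hypothetical.

```python
def morph(vertices_n, vertices_n1, t_n, t_n1, t_c):
    """Linearly interpolate corresponding vertices of two successive models.

    vertices_n[i] and vertices_n1[i] are assumed to be the same physical
    3D point of the scene, seen in model n and in model n+1.
    """
    alpha = (t_c - t_n) / (t_n1 - t_n)
    return [
        tuple((1.0 - alpha) * a + alpha * b for a, b in zip(p, q))
        for p, q in zip(vertices_n, vertices_n1)
    ]


model_n = [(0.0, 0.0, 0.0), (2.0, 0.0, 4.0)]
model_n1 = [(2.0, 2.0, 2.0), (4.0, 0.0, 0.0)]
halfway = morph(model_n, model_n1, t_n=0, t_n1=10, t_c=5)
```

Because the connectivity is shared, only the vertex positions change between the two models, so no explicit vertex mapping step is needed.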

Construction: inputs

[Figure: images go through the 3D extraction [galpin02], which outputs the camera positions, texture images, depth maps and dense motion field]

Construction: proposed algorithm

• Fixed connectivity and time-evolving geometry
  1. Initialization with a uniform regular mesh covering the whole image surface
  2. Tracking and update of the vertices still visible from the next point of view, to get the corresponding mesh (CM)
  3. Integration of the new parts appearing in the next model, to get the new-vertices mesh (NVM)
  4. Merge
  5. Reinitialization of the model for long sequences to avoid drift => GGOF (group of GOFs)

Construction: additional constraints

• Merge of CMn and NVMn:
  - Must not call the existing connectivity into question
  - Must not create a non-manifold mesh

• Vertices must be valid:
  - Validity map

Construction: constrained merge

• Manifold merge:
  - New vertices are triangulated under the CMn envelope constraint
  - CMn envelope vertices are included in the Delaunay triangulation
  - Some resulting faces overlap the CMn mask

[Figure legend: CMn envelope, CMn face, CMn vertex, CMn mask, NVMn vertices, NVMn face, superimposed NVMn face]

Construction: constrained merge

• Proposed solution for a 2-manifold merge:
  - Elimination of all the faces containing only vertices of the CMn envelope
  - Recovery of the eliminated faces that do not overlap the CMn mask
    – Convex areas of the envelope
    – Detection of holes in the mesh (Euler formula: $S - A + F = 2(1 - g)$, with $S$ vertices, $A$ edges, $F$ faces and $g$ the genus)

[Figure legend: CMn envelope, CMn face, CMn vertex, CMn mask, NVMn vertices, NVMn face, superimposed NVMn face]
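Hole detection with the Euler formula can be illustrated as follows. The sketch assumes a connected triangle mesh of genus 0 given as a list of vertex-index triples, and uses the generalized relation S - A + F = 2(1 - g) - b for a surface with b boundary loops. The function names are hypothetical, and this is not necessarily the exact test used in the original work.

```python
def euler_characteristic(faces):
    """Return S - A + F for a triangle mesh given as index triples."""
    vertices = {v for face in faces for v in face}
    edges = {
        frozenset((face[i], face[(i + 1) % 3]))
        for face in faces
        for i in range(3)
    }
    return len(vertices) - len(edges) + len(faces)


def boundary_loops(faces, genus=0):
    """Number of boundary loops b, from S - A + F = 2(1 - g) - b.

    Assumes a connected, manifold mesh of the given genus.
    """
    return 2 * (1 - genus) - euler_characteristic(faces)


tetrahedron = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]
opened = tetrahedron[:3]  # one face removed: one hole appears
```

For a planar merged mesh, the expected count is one loop (the outer border); any additional loop reveals a hole left by the face elimination step.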

Construction: matching information

• How to transmit the matching information?
  - No additional information to transmit
  - Known at the encoding stage from the motion field
  - Retrieved at the decoding stage by:
    – reprojecting the model onto the following point of view
    – identifying the vertices that have the same 2D coordinates

[Figure: cameras Cn and Cn+1]
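The decoder-side retrieval of the matching can be sketched as below. The example assumes a simplified pinhole camera (no rotation, unit focal length, looking along +Z), which is a simplification introduced here; `project` and `match_vertices` are hypothetical names.

```python
def project(point, cam_pos, focal=1.0):
    """Pinhole projection for a camera at cam_pos looking along +Z.

    No rotation is modelled (assumption made to keep the sketch short).
    """
    x, y, z = (p - c for p, c in zip(point, cam_pos))
    return (focal * x / z, focal * y / z)


def match_vertices(model_n, model_n1, cam_n1, tol=1e-6):
    """Pair vertices of model n and model n+1 that share 2D coordinates
    once model n is reprojected onto the point of view of camera n+1."""
    projected_n1 = [project(v, cam_n1) for v in model_n1]
    pairs = []
    for i, vertex in enumerate(model_n):
        u = project(vertex, cam_n1)
        for j, w in enumerate(projected_n1):
            if abs(u[0] - w[0]) <= tol and abs(u[1] - w[1]) <= tol:
                pairs.append((i, j))
                break
    return pairs


cam_n1 = (1.0, 0.0, 0.0)
model_n = [(0.0, 0.0, 4.0), (1.0, 0.0, 4.0), (-3.0, 0.0, 4.0)]
# Model n+1: same view lines for the first two points, re-estimated depths,
# plus one new vertex that has no counterpart in model n.
model_n1 = [(1.0, 0.0, 5.0), (0.5, 0.0, 2.0), (3.0, 0.0, 3.0)]
pairs = match_vertices(model_n, model_n1, cam_n1)
```

Since both encoder and decoder can run this reprojection on decoded data, the correspondences do not need to be written in the bitstream.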

Construction: validity map

Validity map: to ensure matching consistency

• Uncertainty on the motion and on the decoded models, due to errors in the 3D estimation

[Figure: cameras Cn and Cn+1]

Construction: results

• Stair sequence: lateral translation
  - Green: current mesh
  - Yellow: next mesh
  - Red: morphing source (subset of the current mesh)
  - Blue: morphing target (subset of the next mesh)

Construction: results

• Stair sequence: virtual navigation

Tradeoff between single model and model stream => evolving model = consistent 3D model stream


Coding: wavelet analysis

• Goal: scalable multi-resolution representation

• Classical efficient signal processing tool: wavelets [mallat89][derose96]
  - Interest:
    – hierarchical representation of a signal => provides multiresolution
    – good compression
  - Principle:
    – low-frequency representation refined by well-localized high frequencies (details)
    – successive filterings
  - Example: image case [jpeg00]
  - Surface case

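Coding: 2nd generation wavelet analysis

• 2nd generation wavelets [loop87][dyn90][schröder95][lounsbery97][sweldens98]:
  - For non-regular surfaces
  - Coarse base mesh + refinements (wavelet coefficients)

[Figure: a surface and its base mesh. On an edge AB of the base mesh, the midpoint prediction is $C = \frac{A+B}{2}$; on the surface, the refined vertex is $C = \frac{A+B}{2} + W$, where $W$ is the wavelet coefficient]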

Coding: 2nd generation wavelets analysis

• Filters:
  - Generated by the "lifting scheme"
  - Can have various sizes according to the properties wanted for the wavelets
    – Compression requires a minimal size filter
  - Examples of reconstruction high-pass filters: Butterfly lifted, Midpoint lifted, Midpoint non-lifted

[Figure: stencil weights of the three filters on the triangular mesh neighbourhood]

Envisioned applications: real-time reconstruction in an adaptive way

Need for a fast algorithm => tradeoff between compression and speed

Coding: Independent analysis

=> not satisfactory

Coding: proposed representation

• Proposed representation:
  - Decompositions based on the same support
  - Transformation of each dense depth map into consistent hierarchical triangular meshes
  - Support dissociated from geometry: the single connectivity mesh (SCM)

Coding: base meshes construction

• Base mesh = coarse mesh
  - Evolving model construction
  - Large faces: need to represent the scene accurately despite the face sizes => content-based vertices ≠ regular vertices
  - Time evolution: management of increased size and stretched faces

Coding: base meshes construction

• New evolving model generation

[Figure: Init and Update stages, built from the following blocks: Harris corner detector, Canny edge detector, Delaunay triangulation, validity map computation]

Coding: wavelet decomposition

• Decomposition scheme:
  - Inputs: base model MBn and dense model MDn
  - Hierarchical 3D meshes construction: canonical facet quadrisection to define the scale and wavelet spaces
  - Information computation: depth difference $p$
  - Filtering
  - Wavelet coefficients computation

[Figure: camera and associated view lines through the meshes $M_n^i$, $M_n^{i+1}$ and $M_m^i$]

Coding: Consistent wavelet decomposition

• Single Connectivity Mesh (SCM)
  - Common connectivity decomposition support:
    – sufficient, since wavelet coefficients are added on edges by face quadrisection
  - Purpose:
    – To gather connectivity information
    – Easy to construct thanks to the evolving model structure with consistent connectivity

=> correspondences/implicit morphing at each level
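Face quadrisection, on which the common decomposition support relies, can be sketched as follows. Each triangle is split into four by inserting one new vertex per edge, and an edge shared by two faces receives a single new vertex. The function name and the index layout are assumptions made for the illustration, not the SCM data structure itself.

```python
def quadrisect(faces, n_vertices):
    """Split every triangle into four by inserting edge midpoints.

    Returns the refined faces and a dict mapping each edge (a, b), a < b,
    to the index of the vertex created on it. New indices start at
    n_vertices. Only connectivity is handled here, not geometry.
    """
    midpoint = {}

    def mid(a, b):
        key = (min(a, b), max(a, b))
        if key not in midpoint:
            midpoint[key] = n_vertices + len(midpoint)
        return midpoint[key]

    refined = []
    for a, b, c in faces:
        ab, bc, ca = mid(a, b), mid(b, c), mid(c, a)
        refined += [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]
    return refined, midpoint


# Two triangles sharing the edge (1, 2)
refined, midpoint = quadrisect([(0, 1, 2), (1, 3, 2)], n_vertices=4)
```

Because the new vertices are attached to edges, a connectivity shared by all models of the stream is enough to attach one wavelet coefficient per edge at each level.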


[Figure: SCM numbering example showing face global indices and vertex global indices across successive models. Legend: global face index, k (max resolution), unique global index]

Coding: Results

• Street sequence: travelling
  - Green: current mesh
  - Yellow: next mesh
  - Red: morphing source (subset of the current mesh)
  - Blue: morphing target (subset of the next mesh)

[Figure: base mesh vs. base mesh + refinements]


Compression: media interrelations

• Redundancies => exploiting the interrelations between media

[Figure: the mesh geometry & connectivity (3D) go to the 3D encoder, the texture (2D) to the 2D encoder (EBCOT), and the camera positions (1D) to the 1D encoder; the outputs are multiplexed (Mux) into the bitstream. Links (1) and (2) between the media are described below]

(1) Texture image prediction using the previous texture image + 3D model + camera position

(2) Texture coordinates of the vertices retrieved using the camera position
    => by reprojection of the model onto the camera position
    => 3 coordinates per vertex (x,y,z) instead of 5 or more (x,y,z + u,v, or x,y,z + u,v,p)

Compression: camera position compression

• Camera positions compression [galpin02]
  - Intra
    – All camera positions are encoded in intra mode
  - Inter / predictive scheme
    – The first camera is encoded in intra mode
    – Key cameras are encoded incrementally relative to the previous key position
    – Other cameras are encoded incrementally relative to a linear prediction

Key positions:

$$C(R_{t_n}) = R_{t_n} - R_{t_{n-1}}, \qquad C(T_{t_n}) = T_{t_n} - T_{t_{n-1}}$$

Intermediate positions ($t_n < t < t_{n+1}$):

$$C(R_t) = R_t - \Big(R_{t_n} + \frac{t - t_n}{t_{n+1} - t_n}\,(R_{t_{n+1}} - R_{t_n})\Big)$$

$$C(T_t) = T_t - \Big(T_{t_n} + \frac{t - t_n}{t_{n+1} - t_n}\,(T_{t_{n+1}} - T_{t_n})\Big)$$
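The inter / predictive scheme for camera positions can be sketched as follows. The example assumes that each camera position is a plain tuple of parameters (here a translation vector) that can be subtracted component-wise, and that the key positions are given by their indices in the sequence. The function name and the data layout are hypothetical; quantization and entropy coding are left out.

```python
def encode_positions(times, positions, keys):
    """Residuals of a predictive camera-position coder.

    - first key camera: intra (its value is kept as is)
    - other key cameras: difference with the previous key position
    - intermediate cameras: difference with the linear prediction
      between the two surrounding key positions
    """
    residuals = {keys[0]: positions[keys[0]]}
    for prev, cur in zip(keys, keys[1:]):
        p_prev, p_cur = positions[prev], positions[cur]
        residuals[cur] = tuple(c - p for c, p in zip(p_cur, p_prev))
        for i in range(prev + 1, cur):
            a = (times[i] - times[prev]) / (times[cur] - times[prev])
            predicted = tuple(p + a * (c - p) for p, c in zip(p_prev, p_cur))
            residuals[i] = tuple(x - y for x, y in zip(positions[i], predicted))
    return residuals


times = [0, 1, 2, 3, 4]
translations = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.5, 0.0, 0.0),
                (3.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
residuals = encode_positions(times, translations, keys=[0, 4])
```

For a smooth camera path the intermediate residuals stay close to zero, which is what makes them cheap to encode.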

Compression: geometry compression

• Geometry compression:
  - Base mesh: Topological Surgery (TS) encoder, also known as MPEG4-3DMC, for (XYZ) encoding
  - Wavelets: adaptation [koda00] of the SPIHT algorithm (Set Partitioning In Hierarchical Trees) [said96]:
    – bitplane scalability is added to spatial and temporal scalability
    – based on a clever partitioning of the coefficient hierarchy
    – hierarchy obtained not by face subdivision but through an edge-based hierarchy
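Bitplane scalability can be illustrated independently of SPIHT itself. The sketch below only shows what a decoder obtains when it receives the most significant bitplanes of integer coefficient magnitudes; it does not implement set partitioning, and the function name is hypothetical.

```python
def bitplane_reconstruct(coeffs, n_planes, planes_received):
    """Approximate integer coefficients from their top bitplanes.

    Magnitudes are assumed to fit in n_planes bits; the sign is assumed
    to be sent with the first significant bit of each coefficient.
    """
    dropped = n_planes - planes_received
    approx = []
    for c in coeffs:
        magnitude = (abs(c) >> dropped) << dropped
        approx.append(magnitude if c >= 0 else -magnitude)
    return approx


wavelet_coeffs = [13, -6, 3]
coarse = bitplane_reconstruct(wavelet_coeffs, n_planes=4, planes_received=1)
finer = bitplane_reconstruct(wavelet_coeffs, n_planes=4, planes_received=2)
exact = bitplane_reconstruct(wavelet_coeffs, n_planes=4, planes_received=4)
```

Each additional bitplane halves the maximum reconstruction error, so the bitstream can be truncated at any plane to match the available bitrate.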

Compression: texture compression

• Predictive scheme [galpin02]

[Figure: the previous texture image is projected by texturing to give the predicted image, which is compared with the next texture image]

Compression: texture compression

[Figure: predictive coding loop for the texture images. For each original image (Tn+1, Tn+2), a predicted image is built from the previous reconstructed (degraded) image T'n+1; the difference image (En+1, En+2) between the original image and the predicted image is sent over the network; the received (degraded) difference image (E'n+1) is added to the predicted image to give the reconstructed image (T'n+1, T'n+2)]

Compression: texture compression

• Base layers

[Figure: texture images K, K+1, K+2, K+3, each split into a base layer and higher-level layers, with prediction and difference steps between layers]

The base layer ensures no error on the predicted image
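The reason for predicting from data that the decoder actually owns can be shown with a small closed-loop sketch. It assumes textures reduced to short lists of pixel values, a uniform quantizer standing in for the real texture codec, and an identity prediction (the previous decoded image) standing in for the texturing projection. All names are hypothetical.

```python
def quantize(values, step):
    """Uniform quantizer used as a stand-in for the lossy texture codec."""
    return [step * round(v / step) for v in values]


def encode_closed_loop(frames, step):
    """Code each frame as a quantized difference with the prediction,
    the prediction being the previously *decoded* frame."""
    reference = [0] * len(frames[0])
    residuals, decoded = [], []
    for frame in frames:
        difference = [o - p for o, p in zip(frame, reference)]
        q = quantize(difference, step)
        residuals.append(q)
        reference = [p + e for p, e in zip(reference, q)]
        decoded.append(reference)
    return residuals, decoded


def decode(residuals):
    """Decoder: accumulate the received difference images."""
    reference = [0] * len(residuals[0])
    out = []
    for q in residuals:
        reference = [p + e for p, e in zip(reference, q)]
        out.append(reference)
    return out


frames = [[10, 20], [13, 26], [17, 31]]
residuals, encoder_view = encode_closed_loop(frames, step=4)
decoder_view = decode(residuals)
```

Encoder and decoder build the same predicted image, so the reconstruction error stays bounded by the quantizer instead of accumulating from image to image.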

Compression: bitrate allocation

• Contributions not detailed here:
  - Optimal bitrate allocation between base mesh accuracy and the necessary wavelet decomposition level
    => favour base mesh accuracy over the wavelet decomposition level or additional bitplanes
  - Optimal bitrate allocation between geometry and texture
    => favour texture over geometry (once a minimal accuracy has been obtained)

Compression: low bitrate results

• Stairs sequence at 125 kb/s for CIF 25 Hz images

[Figure: PSNR (axis from 15 to 35) versus image index (1 to 101). Curves: PSNR image Galpin, PSNR texture Galpin, PSNR image H264, PSNR image ours, PSNR texture ours]

[Figure: visual comparison: original, ours, Galpin, H264]
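The quality measure plotted in these results is the PSNR. For reference, a minimal sketch of its standard definition is given below, assuming 8-bit images flattened to lists of pixel values; the function name is illustrative.

```python
import math


def psnr(original, decoded, peak=255):
    """Peak signal-to-noise ratio in dB between two images given as
    flat lists of pixel values of equal length."""
    mse = sum((o - d) ** 2 for o, d in zip(original, decoded)) / len(original)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(peak ** 2 / mse)


reference = [100, 120, 140, 160]
off_by_one = [101, 119, 141, 159]
value = psnr(reference, off_by_one)
```

With an error of one grey level on every pixel, the mean squared error is 1 and the PSNR is about 48.13 dB.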

Compression: very low bitrate results

• Street sequence at 16 kb/s for CIF 25 Hz images

[Figure: PSNR (axis from 15 to 29) versus image index (1 to 162). Curves: PSNR image Galpin, PSNR texture Galpin, PSNR image ours, PSNR texture ours]

[Figure: visual comparison: original, Galpin, ours]

Compression: virtual reality results

• Street sequence at 82 kb/s

Compression: augmented reality results

• Street sequence at 82 kb/s

• Stairs sequence at 125 kb/s

Compression: conclusions

• Good compression results / better virtual reality results
  - Better than the state-of-the-art generic coder H264
  - Comparable to the state of the art for 3D model based coding
  - But with consistency of the representation and scalability, which are essential for the envisioned applications

Conclusions

• Contributions
  - Evolving model construction:
    – A posteriori morphing
    – Consistent 3D model stream
    – Fixed connectivity and evolving geometry
  - Scalable encoding of the representation to obtain a multi-resolution evolving model
    – Use of second generation wavelets
    – Consistent decompositions of the models of the stream based on a common topological support, the SCM
    – Introduction of a global indexing system
  - Efficient and scalable compression of the representation
    – Efficient codecs
    – Exploitation of the interrelations existing between the media
    – Bitrate allocation optimization

Perspectives

• Short-term perspectives
  - Occlusion zones management:
    – Better detection of the features characteristic of depth discontinuities
    – Use of an adapted motion estimator => discontinuous disparity maps [cammas03]
  - Wavelet filters optimization
  - Smoothing of the models to avoid noise peaks
  - Adaptive correction of the texture image depending on the size of the 3D face

• Longer-term perspectives
  - Use of this representation in a dynamic coder to transmit the static background of the scene
  - Mix between this representation and a synthetic scene in order to be more generic (town representation for example)

Communications and patents

• 7 international conference papers

• 5 national conference papers

• 1 patent

• 2 contributions to MPEG4 standardization

• 1 submitted journal paper (review in process)


[Figure: global indexing example. A vertex located on a face border has 2 global indices, e.g. (1,1,0,1) and (2,1,1,0): those coordinates do not give the same index. Legend: global face index, k (max resolution), unique global index]

Coding: wavelet filters choice

• Envisioned applications:
  - Network between client and server
  - Real-time applications

=> need for a reconstruction that is:
  - Scalable and locally adaptive
  - Very fast

• Our choice: the simplest filters, the lazy wavelet
  - Filters:
    – Low-pass filter = average filter
    – High-pass filter = identity
  - Best compromise between filter simplicity and compression
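The behaviour of such a simple filter pair can be illustrated on a 1D analogue. The sketch assumes depth samples along a polyline where even positions are coarse vertices and odd positions are edge midpoints: each midpoint is predicted by the average of its two neighbours, and the detail (wavelet coefficient) is the prediction error. This is an illustration under those assumptions, not the exact mesh filters of the original work.

```python
def analyse(depths):
    """Split an odd-length depth signal into coarse samples and details.

    Detail i is the difference between the midpoint value and the
    average of the two coarse samples around it.
    """
    coarse = depths[0::2]
    details = [
        depths[2 * i + 1] - (coarse[i] + coarse[i + 1]) / 2
        for i in range(len(coarse) - 1)
    ]
    return coarse, details


def synthesise(coarse, details):
    """Rebuild the fine signal: midpoint = average of neighbours + detail."""
    fine = [coarse[0]]
    for i, w in enumerate(details):
        fine.append((coarse[i] + coarse[i + 1]) / 2 + w)
        fine.append(coarse[i + 1])
    return fine


depths = [2.0, 3.5, 4.0, 4.0, 6.0]
coarse, details = analyse(depths)
smooth = synthesise(coarse, [0.0] * len(details))  # details not received
```

Reconstruction needs only one average and one addition per new vertex, and can be applied locally, which fits the real-time and adaptive constraints listed above.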

Still image coding: non-lifted Butterfly wavelet (synthesis)

[Figure: stencil weights of the high-pass filter and of the low-pass filter]
