sgm-nets: semi-global matching with neural...

1
SGM-Nets: Semi-global matching with neural networks Akihito Seki 1 and Marc Pollefeys 2,3 1 Toshiba Corporation 2 ETH Zurich 3 Microsoft 1. Introduction Our contributions: (1) A first learning based penalties estimation for SGM (Sec.3) (2) New SGM parameterization that separates Pos/Neg disparity changes (Sec.4) (3) State-of-the-art accuracy (Sec.6) 2. Related work I. Hand tuned penalties. Ex. Smaller penalties on edges = obj. boundary II. Learning based penalties for MRF. How to minimize Eq.(1) by SGM ? C. All costs (a) Input image [1] H.Hirschmuller, “Stereo Processing by Semiglobal Matching and Mutual Information”, PAMI 2008. [2] J.Zbontar et al., “Stereo Matching by Training a Convolutional Neural Network to Compare Image Patches”, JMLR2016. Stereo correspondence (ex. ZNCC, MC-CNN[2]) Stereo images Matching cost volume width height SGM (Penalties P1/P2 are needed) Disparity map Disparity estimation pipeline Proposed method ( ) ( ) [ ] [ ] > + = + = x y y x y y x x x x x N N D d d T P d d T P d c D E 1 1 , min ˆ 2 1 Disparity map Matching cost Penalty for slant surface Penalty for discontinuity T [] : Kronecker delta Unary Pairwise Semi-global matching (SGM)[1] is the most popular regularization method for stereo because of its high accuracy in real-time. However SGM penalties are difficult to be tuned. This work deals with deep neural networks for predicting the penalties. SGM: P1/P2 control a smoothness / discontinuity of a disparity map. Energy function E for SGM: (1) Position x = (u, v) Disparity d (b) Minimum cost path ( ) d L , x r Accumulated costs (a) 4 paths from all directions r Position u Position v 0 x Image ( ) ( ) = r x x d L D r d , min arg 0 0 Winner-takes-all (WTA) over accumulated costs Lr 0 x 1 x 2 x 3 x 1 x 2 x ( ) ( ) ( ) { , , min , , 1 0 0 d L d c d L r r x x x + = ( ) , 1 , 1 1 P d L r + ± x ( ) } , min 2 1 1 P i L r d i + ± x Matching cost (2) ( ) ( ) ( ) + = 0 0 0 0 , , , 0 max 0 0 x x x x x x gt i d d i gt g m d L d L E Hinge loss Flat Slant Border (3). 0 x 1 x Position Disparity Accumulation direction 2 x 3 x ( ) 1 1 x P ( ) 2 1 x P ( ) 2 2 x P 0 0 0 x gt d 1 4 x d 1 1 x d 2 3 x d 3 3 x d ( ) 3 2 x P 0 Backward direction 1 d 2 d 3 d 4 d 5 d 0 5 x d ( ) ( ) ( ) ( ) ( ) ( ) 2 2 3 3 3 2 1 1 0 0 3 2 1 0 0 , , , , , x x x x x x x x x x x P d c d c d c d c d L gt gt + + + + = ( ) ( ) ( ) ( ) ( ) ( ) ( ) 2 1 1 1 3 3 3 2 4 1 5 0 5 0 3 2 1 0 0 , , , , , x x x x x x x x x x x x P P d c d c d c d c d L + + + + + = Accumulated cost at pixel of disparity 0 x gt d 0 x Accumulated cost at pixel of disparity 0 5 x d 0 x ( ) ( ) ( ) 1 , 0 , 1 2 2 1 2 1 1 = = = x x x P E P E P E g g g ( ) ( ) 0 , , if 0 0 0 0 > + m d L d L i gt x x x x Derivatives of Eq.(3) wrt. P are needed to train the networks Derivatives wrt. P Correctly estimated pixel < ( ) 1 1 1 , x x d L ( ) ( ) 1 1 2 1 1 , x x x P d L + ( ) ( ) 1 2 3 1 1 , x x x P d L + ) ( N Border case ( ) ( ) 1 2 1 1 , x x x P d L gt + 0 x 1 x 2 d 3 d 4 d = 1 d 1 x gt d 0 x gt d Minimum ) ( N 4. Signed parameterization 5. SGM-Net architecture 6. Results Ex.) ( ) ( ) () ( ) + + = 0 0 1 1 2 1 , , 0 max x x x x x gt i d d gt nb m N P d L E Margin A. Path cost B. Neighbor cost Sparse GT Acceptable Unacceptable Influenced pixels Beyond neighbor pixels Only neighbor pixels Disparity ambiguity Yes No Initial parameters Influenced Uninfluenced + + + = r flat nf slant ns border nb g E E E E E α ( ) , 0 or 1 1 2 = x P E b ( ) 0 or 1 1 1 = x P E b 0 if > nb E Derivatives wrt. P nx E B. Neighbor cost A. Path cost g E A. Path B. Neighbor E + 1 P + 2 P 1 P 2 P Position Disparity positive negative + P P Position Disparity Position + > 1 1 P P Vertical Horizontal + < 1 1 P P Disparity ( ) ( ) [ ] [ ] = + = + = + x y y x x x x N N D d T P d T P d c D E 1 1 , min ˆ 1 1 δ δ P1/P2 have different penalties depending on either positive or negative disparity change. Eq.(1) becomes to [ ] [ ] < + > + + x x y y N N d T P d T P 1 1 2 2 δ δ y x d d d = δ SGM-Net is also applicable to the parameterization Synthetic dataset Real dataset (a) GT disparity map (b) Initial state of SGM-Net (c) SGM-Net (Path cost) (d) SGM-Net(Neighbor cost) A B (e) SGM-Net with all costs Original image B A + 1 P 1 P Original image Disparity map Large Small ZNCC / MC-CNN as a matcher Various settings of SGM. MC-CNN as a matcher (67 [sec.] for the matching) Got 1 st rank at the CVPR deadline # Standard and Signed SGM-Nets output 8 and 16 values, respectively. 3x3 16 filters and 128 dim FCs Computation takes 0.02 seconds on GPU # 18/10/2016 except anonymous submission (b) GT disparity map (c) Winner-tales-all (d) SGM with hand tuned penalties (e) SGM with our SGM-Net Appropriate SGM penalties are important for accurate disparity maps Original image Hand tuned SGM Standard SGM-Net Signed SGM-Net Percentage of erroneous pixels on non-occluded areas with an error threshold of 3 pixels KITTI: http://www.cvlibs.net/datasets/kitti/ SceneFlow: https://lmb.informatik.uni-freiburg.de/ KITTI 2012 For semantic segmentation and difficult to apply to SGM. positive positive negative negative (c) Flat/Slant/Border Penalty map 8.29 3.95 3.60 Set correct path by checking adjacent pixels Minimize loss over the accumulated path. From Eq.(2), we formulate 3. SGM-Nets FC + ReLU Convolution + ReLU Constant Patch (5x5) Direction1 Direction2 Direction3 Direction4 Normalized patch position Concatenate Convolution + ReLU FC + ELU P1+, P1- P2+, P2- P1+, P1- P2+, P2- P1+, P1- P2+, P2- P1+, P1- P2+, P2- Predict 16 values for each pixel. SGM-Net : Predict P1/P2 at each pixel with a image Signed SGM-Net Path-ambiguity

Upload: others

Post on 27-Jun-2020

13 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: SGM-Nets: Semi-global matching with neural networksopenaccess.thecvf.com/content_cvpr_2017/poster/0087_POSTER.pdfSGM-Nets: Semi-global matching with neural networks Akihito Seki1 and

SGM-Nets: Semi-global matching with neural networksAkihito Seki1 and Marc Pollefeys2,3 1Toshiba Corporation 2ETH Zurich 3Microsoft

1. Introduction

Our contributions:(1) A first learning based penalties estimation for SGM (Sec.3)

(2) New SGM parameterization that separates Pos/Neg disparity changes (Sec.4)

(3) State-of-the-art accuracy (Sec.6)

2. Related workI. Hand tuned penalties. Ex. Smaller penalties on edges = obj. boundaryII. Learning based penalties for MRF.

How to minimize Eq.(1) by SGM ?

C. All costs

(a) Input image

[1] H.Hirschmuller, “Stereo Processing by Semiglobal Matching and Mutual Information”, PAMI 2008.[2] J.Zbontar et al., “Stereo Matching by Training a Convolutional Neural Network to Compare Image Patches”, JMLR2016.

Stereo correspondence (ex. ZNCC, MC-CNN[2])

Stereo images

Matching cost volume

width

heig

ht

SGM(Penalties P1/P2 are needed)

Disparity map

Disparity estimation pipeline

Proposed method

( ) ( ) [ ] [ ]∑ ∑∑

>−+=−+=

∈∈x y

yx

y

yxx

xx

xNND

ddTPddTPdcDE 11,minˆ21

Disparity map Matching cost Penalty for slant surface Penalty for discontinuity

T [] : Kronecker deltaUnary Pairwise

Semi-global matching (SGM)[1] is the most popular regularization method for stereo because of its high accuracy in real-time. However SGM penalties are difficult to be tuned. This work deals with deep neural networks for predicting the penalties.

SGM: P1/P2 control a smoothness / discontinuity of a disparity map.Energy function E for SGM:

(1)

Position x = (u, v)

Dis

parit

y d

(b) Minimum cost path ( )dL ,xr

Accumulated costs(a) 4 paths from all directions r

Position u

Pos

ition

v

0x

Image( ) ( )∑=r

xx dLD rd,minarg 00

Winner-takes-all (WTA) over accumulated costs Lr

0x1x2x3x 1−x 2−x

( ) ( ) ( ){ ,, min, , 100 dLdcdL rr xxx ′+=′ ( ) ,1, 11 PdLr +±′ x ( ) } , min 211PiLrdi

+′±≠

x

Matching cost

(2)

( ) ( )( )∑≠

+−=00

00 ,,,0max 00xx

xx xxgti dd

igtg mdLdLEHinge loss

Flat Slant Border

(3).

0x1x

Position

Dis

parit

y

Accumulation direction

2x3x( )11 xP

( )21 xP

( )22 xP

0

0

0xgtd

14xd

11xd

23xd3

3xd

( )32 xP0

Backward direction

1d

2d

3d

4d

5d 05xd

( ) ( ) ( ) ( ) ( ) ( )223332110032100 ,,,,, xxxxxx xxxxx PdcdcdcdcdL gtgt ++++=

( ) ( ) ( ) ( ) ( ) ( ) ( )2111333241505032100 ,,,,, xxxxxxx xxxxx PPdcdcdcdcdL +++++=

Accumulated cost at pixel of disparity 0xgtd0x

Accumulated cost at pixel of disparity 05xd0x

( ) ( ) ( ) 1,0,1221211

=∂

∂=

∂−=

xxx PE

PE

PE ggg ( ) ( ) 0,, if 00

00 >+− mdLdL igtxx xx

Derivatives of Eq.(3) wrt. Pare needed to train the networks

Derivatives wrt. P

Correctly estimated pixel

<

( )111,xx dL

( ) ( )11211, xx x PdL +

( ) ( )12311, xx x PdL +

)(⋅NBorder case

( ) ( )1211, xx x PdL gt +

0x1x

2d

3d

4d

=1d

1xgtd

0xgtd

Minimum

)(⋅N

4. Signed parameterization

5. SGM-Net architecture

6. Results

Ex.)

( ) ( ) ( )( )∑≠

+⋅−+=00

1121, ,0max

xx

xx x

gti ddgtnb mNPdLE

Margin

A. Path cost B. Neighbor costSparse GT Acceptable UnacceptableInfluenced pixels Beyond neighbor pixels Only neighborpixelsDisparity ambiguity Yes NoInitial parameters Influenced Uninfluenced

∑ ∑∑∑∑

+++=

r flatnf

slantns

bordernbg EEEEE α

( ) ,0or 112

=∂∂

xPEb

( ) 0or 111

−=∂∂

xPEb 0 if >nbEDerivatives wrt. P

nxEB. Neighbor cost

A. Path cost gE

A. Path B. Neighbor

E

+1P

+2P

−1P

−2P

Position

Dis

parit

y

positive

negative

+P

−P

Posit

ion

Disp

arity

Position

−+ > 11 PP

VerticalHorizontal

−+ < 11 PP

Disparity

( ) ( ) [ ] [ ]∑ ∑∑

−=+=+=

+

x yy

x

xx

xNND

dTPdTPdcDE 11,minˆ11 δδ

P1/P2 have different penaltiesdepending on either positive ornegative disparity change.Eq.(1) becomes to

[ ] [ ]

−<+>+ ∑∑

+

xx yy NNdTPdTP 11 22 δδ

yx ddd −=δSGM-Net is also applicable to the parameterization

Synthetic dataset

Real dataset

(a) GT disparity map

(b) Initial state of SGM-Net (c) SGM-Net (Path cost)

(d) SGM-Net(Neighbor cost)

AB

(e) SGM-Net with all costs

Original image

BA

+1P −

1P

Original imageDisparity map

LargeSmall

• ZNCC / MC-CNN as a matcher• Various settings of SGM.

• MC-CNN as a matcher (67 [sec.] for the matching)• Got 1st rank at the CVPR deadline#

• Standard and Signed SGM-Nets output 8 and 16 values, respectively.

• 3x3 16 filters and 128 dim FCs• Computation takes 0.02 seconds on GPU

#18/10/2016 except anonymous submission

(b) GT disparity map

(c) Winner-tales-all (d) SGM with hand tuned penalties (e) SGM with our SGM-Net

Appropriate SGM penalties are important for accurate disparity maps

Original image Hand tuned SGM

Standard SGM-Net Signed SGM-Net

Percentage of erroneous pixels on non-occluded areas with an error threshold of 3 pixels

KITTI: http://www.cvlibs.net/datasets/kitti/

SceneFlow: https://lmb.informatik.uni-freiburg.de/

KITTI 2012

For semantic segmentation and difficult to apply to SGM.

positive positivenegative negative

(c) Flat/Slant/Border

Penalty map

8.29

3.95 3.60

Set correct path by checking adjacent pixels

Minimize loss over the accumulated path.

From Eq.(2), we formulate

3. SGM-Nets

FC +

ReL

U

Conv

olut

ion

+ Re

LU

Cons

tant

Patch(5x5)

Direction1

Direction2

Direction3

Direction4Normalized patch position

Conc

aten

ate

Conv

olut

ion

+ Re

LU

FC +

ELU

P1+, P1-P2+, P2-

P1+, P1-P2+, P2-

P1+, P1-P2+, P2-

P1+, P1-P2+, P2-

Predict 16 values for each pixel.

SGM-Net : Predict P1/P2 at each pixel with a image

Signed SGM-Net

Path-ambiguity