a region-based randomized voting scheme for stereo matching

International Symposium on Visual Computing 2010

A Region-Based Randomized Voting Scheme for Stereo Matching

Guillaume GALES, Alain CROUZIL

TOULOUSE INSTITUTE OF COMPUTER SCIENCE RESEARCH (IRIT)

Sylvie CHAMBON

FRENCH PUBLIC WORKS LABORATORY (LCPC)

Monday, December 6, 2010

ISVC 2010, G. GALES, A. CROUZIL, S. CHAMBON

Problem statement

• 3D reconstruction from stereo pairs
• Acquisition
• Pixel matching
• Calibration
• Reconstruction

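[Figure: stereo acquisition geometry. A scene point P, in the world frame (X, Y, Z), is seen by the left camera (frame xl, yl, zl) and the right camera (frame xr, yr, zr). It projects to pl in the left image (axes il, jl) and to pr in the right image (axes ir, jr). Panels: left image, right image, disparity map.]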



Outline

• Pixel matching methods
• Proposed algorithm based on a randomized voting scheme
• Results
• Conclusion




Pixel matching

• Local methods - based on correlation measures in the vicinity of pixels

• Global methods - based on mathematical optimization techniques (belief propagation, graph cut, simulated annealing, etc.) to minimize errors made by the computed disparities

• Region-based methods - assume that pixels within a homogeneous color region of an image belong to the same surface
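As a minimal sketch of the local family listed above, the following Python code picks, for one pixel, the disparity whose window in the right image best matches the window in the left image. It assumes rectified images stored as 2-D lists and uses the sum of absolute differences (SAD) as a stand-in score. The function names, the window size and the SAD measure are illustrative assumptions, not the correlation measure used in this work.

```python
def sad(left, right, i, jl, jr, half):
    """Sum of absolute differences between two (2*half+1)^2 windows."""
    total = 0
    for di in range(-half, half + 1):
        for dj in range(-half, half + 1):
            total += abs(left[i + di][jl + dj] - right[i + di][jr + dj])
    return total


def local_disparity(left, right, i, j, d_max, half=1):
    """Winner-takes-all disparity of the left pixel (i, j).

    Assumption: the images are rectified, so the match of left pixel
    (i, j) lies on the same row of the right image, at column j - d
    with 0 <= d <= d_max. The caller must keep the left window inside
    the image.
    """
    best_d, best_cost = None, None
    for d in range(d_max + 1):
        jr = j - d
        if jr - half < 0:
            break  # the window would leave the right image
        cost = sad(left, right, i, j, jr, half)
        if best_cost is None or cost < best_cost:
            best_d, best_cost = d, cost
    return best_d
```

Calling `local_disparity(left, right, i, j, d_max)` for every pixel gives a dense map; a local method typically produces errors in textureless or occluded areas, which is what the other two families try to correct.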




Region-based methods

[Diagram: typical region-based pipeline. Inputs: left image and right image. Stages and their outputs: segmentation (region map), local matching (initial disparity map), estimation of the parameters of a surface model (surface model-based disparity map), global optimization (final disparity map).]



Proposed algorithm

[Diagram: proposed pipeline. Several segmentations (region map 1, region map 2, ...) and local matching with robust correlation (initial disparity map) feed a randomized surface model computation. A voting scheme then produces the final disparity map.]



Initial disparities

• The initial disparities do not need to be dense; however, correct initial disparities must be well distributed within each region

• A robust correlation measure is used to improve results near occluded areas

[Figure: two matched pixels (labels: Match, Match), with outliers marked (label: Outliers).]



Multi-segmentations

• Over-segmentation (more regions) - hard to find enough correct initial disparities in some segments

• Under-segmentation (fewer regions) - errors are larger when, for instance, a plane model is fitted to a segment of a conic surface

[Figure: over-segmentation versus under-segmentation. Legend: initial disparities, region boundaries, false initial disparities.]



Randomized voting scheme

• Estimation of a surface model-based disparity map:
  • For each region map:
    • For each region:
      • Randomly select 3 points $(i_n, j_n, d_n)$, $n \in \{1, 2, 3\}$, in the disparity space within the region
      • Compute for this region the plane parameters $a, b, c$ such that $a i_n + b j_n + d_n = c$
      • Assign to each pixel $(k, l)$ within the region its estimated disparity: $p_{k,l} \leftarrow c - a k - b l$
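The per-region step above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function names, the dictionary-based data layout and the handling of degenerate (collinear) triplets are assumptions. The plane constraint $a i_n + b j_n + d_n = c$ is rewritten as $a i_n + b j_n - c = -d_n$ and solved with Cramer's rule.

```python
import random


def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_plane(points):
    """Return (a, b, c) with a*i + b*j + d = c for the 3 points (i, j, d).

    Returns None when the 3 image positions are collinear (assumed
    handling: the caller simply skips this draw).
    """
    rows = [[float(i), float(j), -1.0] for i, j, _ in points]
    rhs = [-float(d) for _, _, d in points]
    det = det3(rows)
    if abs(det) < 1e-12:
        return None
    solution = []
    for col in range(3):
        m = [list(r) for r in rows]
        for r in range(3):
            m[r][col] = rhs[r]  # Cramer's rule: replace one column by rhs
        solution.append(det3(m) / det)
    return tuple(solution)


def region_disparities(region_pixels, initial, rng=random):
    """One random selection for one region.

    region_pixels: list of (k, l) pixels belonging to the region.
    initial: dict {(i, j): d} of initial disparities inside the region
             (may be sparse).
    Returns {(k, l): c - a*k - b*l}, or None if the draw is unusable.
    """
    if len(initial) < 3:
        return None
    triplet = rng.sample(sorted(initial), 3)
    plane = fit_plane([(i, j, initial[(i, j)]) for i, j in triplet])
    if plane is None:
        return None
    a, b, c = plane
    return {(k, l): c - a * k - b * l for k, l in region_pixels}
```

Repeating `region_disparities` over all regions of all region maps, with a fresh random triplet each time, yields one surface model-based disparity map per random selection.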




[Figure: surface model-based disparity maps computed for each region map (region map 1, region map 2, ...) and each random selection. Legend: surface model-based disparities, region boundaries, false disparities.]




• Estimation of the disparity density function
  • n surface model disparity maps are computed. For each pixel, each computed disparity represents a vote
  • Final disparity is the most voted one
  • Sub-pixel accuracy can be obtained by computing a disparity density function using a kernel density estimation method
  • Let $x_i$ be one vote

$$p_{k,l} \leftarrow \arg\max_{x \le d_{\max}} \frac{1}{n} \sum_{i=1}^{n} \begin{cases} \frac{3}{4}\left(1 - \|x - x_i\|^2\right) & \text{if } \|x - x_i\| \le 1 \\ 0 & \text{otherwise} \end{cases}$$
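The vote for one pixel can be sketched in Python as below. The kernel is the one in the formula above (an Epanechnikov kernel of unit bandwidth); the grid search used to approximate the argmax, its step size and the function names are assumptions of this sketch, since the slides do not say how the maximum is located.

```python
def epanechnikov(u):
    """Kernel of the formula: 3/4 * (1 - u^2) if |u| <= 1, else 0."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def density(x, votes):
    """Disparity density at x: mean of the kernels centred on the votes."""
    return sum(epanechnikov(x - xi) for xi in votes) / len(votes)


def vote_disparity(votes, d_max, step=0.01):
    """Approximate the x in [0, d_max] that maximizes the density.

    Assumption: the argmax is found by an exhaustive grid search with
    spacing `step`; ties are resolved in favour of the smallest x.
    """
    n_steps = int(round(d_max / step))
    best_x, best_f = 0.0, -1.0
    for s in range(n_steps + 1):
        x = s * step
        f = density(x, votes)
        if f > best_f:
            best_x, best_f = x, f
    return best_x
```

With votes clustered around one value plus an isolated wrong vote, the isolated vote falls outside the kernel support of the cluster and does not shift the mode, which is why the scheme tolerates some bad random selections.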




[Figure 3 panels: one row per segmentation (S1, S2, ...), one column per random selection (random selection 1, 2, 3).]

Fig. 3. Disparity maps computed by our algorithm with different random selections and segmentations (S1−2). Region boundaries as well as errors are shown in white. Each map presents different errors. However, these errors are approximations of the true disparities. Let’s take a closer look at the cone in front of the mask. The plane model does not fit the whole cone, nevertheless, at each random selection, a different part of the cone obtains correct disparities. As a final result, we take the mode of the disparity density function computed from all the approximations.

[Plot: density f (vertical axis, ticks 0.1 to 0.5) versus disparity (horizontal axis, ticks 33 to 39).]

Fig. 4. Density function built from the different estimated disparities (black circles) for a given pixel. The mode (36.3 here) is shown with a bold vertical line.

Error rate vs. number of segmentations. We evaluate the result of our algorithm with different numbers of segmentations. According to our early experiments, 4 different segmentations seem to be enough to obtain good results. Thus, we use up to 4 different segmentations for this evaluation. We compute the error rate with 1, then 2, then 3 and finally 4 different segmentations. When using only 1 segmentation, early tests showed that the best scores are obtained with the finest segmentation rather than the coarser ones, thus we use this segmentation in our protocol. In the same manner, we determined that when using 2 segmentations, it is better to use the coarsest and the finest ones. With 3 segmentations, the best configuration is obtained with the coarsest segmentation, the finest segmentation and one in between. For each test, the total number of random selections is fixed to 100 and, since our algorithm is based on a stochastic process, the experiment is repeated 30 times, which is enough to compute a representative average error rate and a standard deviation. The results are computed with different error thresholds t. Since the behaviour is the same for all the t, we

[Slide annotations on the plot: disparity density function; horizontal axis: disparity x, up to dmax; votes xi; argmax.]



Results

• Data set and evaluation - Middlebury protocol. The error rate is given by the percentage of bad pixels:

$$BP = \frac{1}{N} \sum_{i,j} \left( |d(i,j) - d_{\text{theoretical}}(i,j)| > \text{threshold} \right)$$

• BP versus the number of region maps
• BP versus the number of random selections (votes)
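A small Python sketch of the bad-pixel measure above, assuming disparity maps stored as 2-D lists. The optional `mask` argument is an assumption added here to restrict the evaluation to a subset of pixels (for example the non-occluded ones); the result is multiplied by 100 to express it as a percentage.

```python
def bad_pixel_rate(disparity, ground_truth, threshold=1.0, mask=None):
    """Percentage of pixels whose absolute disparity error exceeds threshold.

    disparity, ground_truth: 2-D lists of the same size.
    mask: optional 2-D list of booleans selecting the evaluated pixels;
          all pixels are evaluated when it is None.
    """
    bad = 0
    total = 0
    for i, row in enumerate(ground_truth):
        for j, truth in enumerate(row):
            if mask is not None and not mask[i][j]:
                continue
            total += 1
            if abs(disparity[i][j] - truth) > threshold:
                bad += 1
    return 100.0 * bad / total if total else 0.0
```

Here N is the number of evaluated pixels (`total` in the code), so the same function covers different evaluation areas by changing the mask.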



Results (cont’d)

[Plots: error rate (%), vertical axis from 2.5 to 3.75, versus # segmentations (1, 2, 3, 4) on the left and versus # random selections (5, 15, 25, 50, 100) on the right.]

Fig. 5. Error rate for the images cones with t > 1 over the non-occluded pixels versus the number of segmentations (left) and versus the number of random selections (right). The mean values are given by the black circles and the standard deviations are given by the vertical lines.

present the results for t > 1 in figure 5 and table 1. We can see that the use of at least 2 different segmentations significantly improves the results. The best results are obtained with the four segmentations. The only exception is the pair tsukuba, which is hard to segment, especially in the background, where there are different objects (i.e. with different disparities) that nevertheless have low intensity differences and low contrast. We can notice that the standard deviation does not decrease much as the number of segmentations changes. In fact, it depends on the number of random selections, see § 4. Therefore, if the target application requires a low error rate, the choice of at least 2 segmentations is recommended. Better precision can still be achieved with more segmentations.

Error rate vs. number of random selections. We apply our algorithm with different numbers of random selections (5, 15, 25, 50 and 100) to find the best solution. As explained in § 4, for each tested number of random selections, we repeat the test 30 times and compute the mean and the standard deviation of the error rates. The results are also computed with different error thresholds t, but only the results for t > 1 are presented, see figure 5 and table 2. We can notice that after 25 random selections, the mean value of the error rate decreases slowly. However, the standard deviation is still high. It tends to decrease with the number of random selections. At 100 random selections, we can see that the standard deviation is low (i.e. the probability of obtaining a better solution increases). Therefore, if the target application requires a low error rate, the number of random selections must be set between 25 and 100. The higher this number is, the lower the standard deviation is.
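The repeated-run protocol described above can be sketched as follows. `run_once` is a hypothetical callable that runs the stochastic algorithm with the given random generator and returns one error rate. The use of the sample standard deviation (`statistics.stdev`) and the seeding scheme are assumptions, since the text does not specify them.

```python
import random
import statistics


def repeat_experiment(run_once, repetitions=30, seed=0):
    """Run a stochastic evaluation several times; return (mean, std).

    run_once: hypothetical callable taking a random.Random instance and
              returning an error rate (float).
    Each repetition gets its own seeded generator, so the whole
    experiment is reproducible.
    """
    rates = [run_once(random.Random(seed + r)) for r in range(repetitions)]
    return statistics.mean(rates), statistics.stdev(rates)
```

A low standard deviation across the repetitions indicates that a single run is likely to be close to the mean error rate.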

Ranking. Among all the results, we take for each pair the one that gives the best results and we submit them to the Middlebury website. According to the Middlebury ranking, see table 3 and figure 6, the proposed method gives very competitive results on the images teddy and cones. Our method is ranked in the

[Slide axis labels for the two plots: # region maps (left), # random selections (right).]



Results (cont’d)

                  # random selections
Pair     Pixels   5              15             25             50             100
tsukuba  nonocc   9.89 ± 0.94    7.33 ± 0.71    6.74 ± 0.34    5.91 ± 0.28    5.47 ± 0.18
         all      10.66 ± 0.93   8.05 ± 0.68    7.49 ± 0.35    6.65 ± 0.27    6.22 ± 0.18
         disc     19.88 ± 1.24   17.58 ± 0.82   17.45 ± 0.75   16.45 ± 0.63   16.22 ± 0.46
venus    nonocc   0.49 ± 0.31    0.20 ± 0.03    0.17 ± 0.02    0.16 ± 0.01    0.15 ± 0.01
         all      0.89 ± 0.34    0.55 ± 0.06    0.50 ± 0.04    0.48 ± 0.02    0.48 ± 0.02
         disc     3.21 ± 0.82    2.56 ± 0.37    2.22 ± 0.27    2.13 ± 0.18    2.09 ± 0.16
teddy    nonocc   7.17 ± 1.03    6.04 ± 0.25    5.95 ± 0.21    5.74 ± 0.15    5.65 ± 0.12
         all      11.56 ± 1.30   10.26 ± 0.38   10.13 ± 0.28   9.77 ± 0.28    9.80 ± 0.26
         disc     17.38 ± 1.20   15.70 ± 0.52   15.62 ± 0.42   15.29 ± 0.39   15.19 ± 0.27
cones    nonocc   3.05 ± 0.12    2.75 ± 0.09    2.68 ± 0.07    2.64 ± 0.04    2.61 ± 0.02
         all      8.88 ± 0.22    8.30 ± 0.18    8.12 ± 0.12    8.02 ± 0.08    7.94 ± 0.04
         disc     8.58 ± 0.29    7.88 ± 0.23    7.71 ± 0.18    7.60 ± 0.10    7.52 ± 0.06

Table 2. Error rate and standard deviation for the 4 stereo pairs (t > 1) versus the number of random selections (5–100). The evaluation is performed over the non-occluded pixels (nonocc), all the pixels (all) and the pixels within the depth-discontinuity areas (disc) with 4 segmentations. The best result of each row is written in bold.

Pair     Pixels   t1         t2         t3
tsukuba  nonocc   16(34)     9.83(27)   4.85(77)
         all      16.7(34)   10.6(26)   5.54(70)
         disc     27.1(73)   23.8(64)   17.7(74)
venus    nonocc   2.48(10)   0.39(6)    0.13(6)
         all      2.95(8)    0.78(9)    0.45(13)
         disc     8.13(8)    3.97(13)   1.86(8)
teddy    nonocc   8.73(3)    6.41(7)    5.40(12)
         all      13.9(4)    11.1(6)    9.54(12)
         disc     22.1(3)    17.1(9)    14.8(13)
cones    nonocc   4.42(1)    2.95(2)    2.62(4)
         all      10.2(1)    8.40(4)    7.93(7)
         disc     11.5(2)    8.45(3)    7.54(6)

Table 3. Middlebury error rates, with rankings in parentheses, for different thresholds t: t1 = 0.5, t2 = 0.75 and t3 = 1. The best ranking for each t is shown in bold.

independent they can be computed in parallel. Besides, each selection processes regions independently, thus parallelism can also be achieved.
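Because the random selections do not depend on each other, they can be dispatched to a pool of workers. The sketch below uses Python's standard `concurrent.futures` module; `one_selection` is a hypothetical callable standing for the computation of one surface model-based disparity map, and the per-selection seeding is an assumption made so that results do not depend on scheduling order.

```python
from concurrent.futures import ThreadPoolExecutor
import random


def run_selections(one_selection, n_selections, max_workers=4, seed=0):
    """Run independent random selections concurrently.

    one_selection: hypothetical callable taking a random.Random instance
                   and returning one surface model-based disparity map.
    Returns the results in selection order.
    """
    rngs = [random.Random(seed + s) for s in range(n_selections)]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(one_selection, rngs))
```

In CPython, threads mainly illustrate the structure: pure-Python numeric code would need a process pool or native code to obtain an actual speed-up.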

5 Conclusion

We proposed a region-based stereo matching algorithm which uses different segmentations and a new method to select the final disparity based on a randomized voting scheme. Our algorithm gives good results at sub-pixel accuracy even with non polyhedral objects. It is easy to implement yet very effective, giving competitive results in the Middlebury evaluation protocol. Nevertheless, we believe that we can still improve our method by using a confidence measure to have a weighted random selection of the disparity triplet to be able to reduce the influence of incorrect disparities.


[Figure 6 panels: columns tsukuba, venus, teddy, cones; error-map rows t > 0.5 and t > 1.]

Fig. 6. The first row shows the final disparity maps computed by our algorithm. The second and third rows show the error maps with t > 0.5 and t > 1 (black = errors, grey = errors in occluded areas).




Results (cont’d)




Conclusion

• Our algorithm gives good results at sub-pixel accuracy even with non polyhedral objects
• Easy to implement yet very effective
• Algorithm highly parallelizable
• We are working on improvements by using a confidence measure to make a weighted random selection (to avoid the selection of initial disparities with low confidence)

