
On Building an Accurate Stereo Matching System on Graphics Hardware

Xing Mei; Xun Sun; Mingcai Zhou; Shaohui Jiao; Haitao Wang; Xiaopeng Zhang
Samsung Advanced Institute of Technology, China Lab
Computer Vision Workshops, 2011 IEEE










Outline
• Introduction
• Related Works
• Algorithm
• CUDA Implementation
• Experimental Results
• Conclusion


Introduction
Dense two-frame stereo matching
• Compute a disparity map from stereo images.
• Broad applications: 3D reconstruction, view interpolation


Related Works


Related Works
• Local methods
  • Compute each pixel’s disparity independently over a local support region.
  • Fast but inaccurate.
• Global methods
  • Solve the stereo problem as an energy minimization process.
  • Accurate but slow because of the time-consuming global optimizer (graph cuts, belief propagation).


Related Works
• Propagation-based methods
  • Produce quasi-dense or dense disparity results from a set of seed pixels.
  • Relatively fast, but sensitive to early wrong matches.
  • Some methods use segmented regions as the guided propagation unit, at an expensive computational cost.


Related Works
• Introduce a simple guided unit for propagation: pixel-wise 1D line segments.
  • No image segmentation is required.
  • Simple, fast and accurate.


Algorithm
• Framework

Input: Stereo images
→ AD-Census Cost Initialization
→ Cross-based Cost Aggregation
→ Scanline Optimization
→ Multi-step Disparity Refinement
Output: Disparity map



Disparity Cost Computing
• Cost measures: AD, BT, gradient-based measures, non-parametric transforms (rank/census [3]), ...
• Combinations: SAD + gradient [6], AD + Census
• AD (Absolute Difference)
  • Relies on the constant color assumption
  • Better suited to repetitive structures
• Census
  • Encodes local image structures
  • Better suited to textureless regions

[3] H. Hirschmuller and D. Scharstein. “Evaluation of stereo matching costs on images with radiometric differences.” IEEE TPAMI, 31(9), 2009.
[6] A. Klaus, M. Sormann, and K. Karner. “Segment-based stereo matching using belief propagation and a self-adapting dissimilarity measure.” ICPR, 2006.


AD-Census Cost Initialization
• p: pixel
• d: disparity level
• pd = (x-d, y): the pixel corresponding to p = (x, y) in the right image
• Census cost: Hamming distance [22]
• The AD and census costs are combined through a robust function on the variable 𝑐.

[22] R. Zabih and J. Woodfill. “Non-parametric local transforms for computing visual correspondence.” In Proc. ECCV, 1994.

[Figure: left image and right image]


Census Transform

Intensity window (X marks the center pixel, value 78):

121 130  26  31  39
109 115  33  40  30
 98 102  78  67  45
 47  67  32 170 198
 39  86  99 159 210

Census bits:

1 1 0 0 0
1 1 0 0 0
1 1 X 0 0
0 0 0 1 1
1 1 1 1 1

Bit string: 1 1 0 0 0 1 1 0 0 0 1 1 0 0 0 0 0 1 1 1 1 1 1 1

Census transform window :
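A minimal sketch of how such a bit string can be computed, assuming the common convention that a neighbor contributes 1 when it is strictly brighter than the center pixel (the function name `census_bits` and the strict comparison are illustrative choices, not taken from the slides). Applied to the 5×5 intensity window of this slide, it reproduces the slide's bits for the first four rows; for the bottom-left value 39, which is darker than the center 78, it yields 0, whereas the slide shows 1.

```python
def census_bits(window):
    """Return the census bit string of a window, row-major, center skipped."""
    cy, cx = len(window) // 2, len(window[0]) // 2
    center = window[cy][cx]
    return "".join(
        "1" if value > center else "0"      # assumed rule: neighbor brighter than center
        for y, row in enumerate(window)
        for x, value in enumerate(row)
        if (y, x) != (cy, cx)
    )

# Intensity window from the slide (center value 78).
WINDOW = [
    [121, 130, 26, 31, 39],
    [109, 115, 33, 40, 30],
    [98, 102, 78, 67, 45],
    [47, 67, 32, 170, 198],
    [39, 86, 99, 159, 210],
]

bits = census_bits(WINDOW)
print(bits)
```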


Census Hamming Distance

• Left image:  1 1 0 0 0 1 1 0 0 0 1 1 0 0 0 0 0 1 1 1 1 1 1 1
• Right image: 1 1 1 0 0 1 1 0 0 1 0 1 0 0 0 0 0 1 1 1 1 1 1 1
• XOR:         0 0 1 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0

Hamming Distance = 3
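The distance on this slide can be checked with a few lines. This is only an illustrative sketch; the names `hamming_distance`, `LEFT` and `RIGHT` are hypothetical, and the two strings are the left and right census bit strings from the slide with the spaces removed.

```python
def hamming_distance(a, b):
    """Count the positions at which two equal-length bit strings differ."""
    if len(a) != len(b):
        raise ValueError("bit strings must have equal length")
    return sum(x != y for x, y in zip(a, b))

LEFT = "110001100011000001111111"
RIGHT = "111001100101000001111111"

# Bitwise XOR, written out as a string for comparison with the slide.
xor_bits = "".join("1" if x != y else "0" for x, y in zip(LEFT, RIGHT))
print(xor_bits, hamming_distance(LEFT, RIGHT))
```

A real implementation would more likely pack the bits into an integer and count the set bits of the XOR, which maps well to GPU hardware.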


AD-Census Cost Initialization


• A robust function on the variable 𝑐
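The formula on this slide did not survive in the transcript. The sketch below follows the definition given in the published paper as far as I can reconstruct it: the AD cost is the mean absolute color difference over the three channels, the census cost is a Hamming distance, and each is passed through ρ(c, λ) = 1 − exp(−c/λ) before the two are summed. The function names and the default values λ_AD = 10 and λ_Census = 30 (the paper's reported settings) are assumptions with respect to the slides.

```python
import math

def robust(c, lam):
    """rho(c, lam) = 1 - exp(-c / lam): maps any cost c >= 0 into [0, 1)."""
    return 1.0 - math.exp(-c / lam)

def ad_cost(left_rgb, right_rgb):
    """Mean absolute difference over the color channels."""
    return sum(abs(l - r) for l, r in zip(left_rgb, right_rgb)) / 3.0

def ad_census_cost(left_rgb, right_rgb, left_census, right_census,
                   lam_ad=10.0, lam_census=30.0):
    """Combined cost in [0, 2); lam_* defaults are assumed, not from the slides."""
    c_ad = ad_cost(left_rgb, right_rgb)
    c_census = sum(a != b for a, b in zip(left_census, right_census))
    return robust(c_census, lam_census) + robust(c_ad, lam_ad)

cost = ad_census_cost((100, 100, 100), (110, 110, 110),
                      "110001100011000001111111", "111001100101000001111111")
print(round(cost, 4))
```

Because ρ saturates at 1, neither measure can dominate the sum through a single outlier, and λ sets how quickly each measure saturates.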


AD-Census Cost Initialization

• AD-Census measure produces proper disparity results for both repetitive structures and textureless regions.



Cross-based Cost Aggregation [23]

• Cross construction
  • The arm endpoints P1, P2 for pixel P are located where rule R1 or R2 is violated:
    R1: Color self-similarity in the line region (smooth depth assumption)
    R2: Arm length limitation (avoids over-smoothing)

[23] K. Zhang, J. Lu, and G. Lafruit. “Cross-based local stereo matching using orthogonal integral images.” IEEE TCSVT, 2009.
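A minimal sketch of the arm construction under rules R1 and R2, for one scanline of RGB pixels. The names `color_diff` and `arm_length`, the choice of the maximum channel difference as the color measure, and the default thresholds `tau=20` and `max_len=17` are illustrative assumptions rather than values stated on the slide.

```python
def color_diff(a, b):
    """Largest absolute difference over the color channels (assumed measure)."""
    return max(abs(x - y) for x, y in zip(a, b))

def arm_length(row, x, step, tau=20, max_len=17):
    """Length of the arm grown from pixel x in direction step (-1 left, +1 right).

    The arm stops at the image border, when the next pixel's color differs
    from row[x] by tau or more (R1), or when max_len is reached (R2).
    """
    length = 0
    q = x + step
    while 0 <= q < len(row) and length < max_len and color_diff(row[q], row[x]) < tau:
        length += 1
        q += step
    return length

row = [(10, 10, 10)] * 3 + [(200, 200, 200)] * 3
print(arm_length(row, 1, +1), arm_length(row, 1, -1))
```

The same routine, run on columns instead of rows, gives the up and down arms, so every pixel ends up with a cross of four arms.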


Cross-based Cost Aggregation
• Enhanced cross construction
  (using pixel p’s left arm and its endpoint pixel pl as an example)
  • Three rules bound how far the arm may extend.
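The rules themselves are missing from the transcript. As far as the published paper states them, the left-arm endpoint p_l must satisfy the three conditions below, where D_c is the color difference (the maximum absolute difference over the color channels), D_s is the spatial distance, τ2 < τ1 are color thresholds and L2 < L1 are length thresholds. The notation is reconstructed from the paper, not from this slide, and should be read as a hedged summary.

```latex
\begin{align*}
&1.\quad D_c(p_l, p) < \tau_1 \ \text{ and } \ D_c\bigl(p_l,\; p_l + (1, 0)\bigr) < \tau_1 \\
&2.\quad D_s(p_l, p) < L_1 \\
&3.\quad D_c(p_l, p) < \tau_2 \quad \text{if } L_2 < D_s(p_l, p) < L_1
\end{align*}
```

Rule 1 also compares the endpoint with its predecessor on the arm so that the arm does not run across an edge; rule 3 applies the stricter threshold τ2 only to long arms, which lets arms grow far in textureless regions while staying short elsewhere.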


Cross-based Cost Aggregation
• Cost aggregation
  • Run this step for 4 iterations to get stable cost values.
  • For iterations 1 and 3, costs are aggregated horizontally and then vertically.
  • For iterations 2 and 4, costs are aggregated vertically and then horizontally.
  • Alternating the order reduces the errors at depth discontinuities.
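The alternating schedule can be sketched as follows for the cost slice of a single disparity. This is an illustration under stated assumptions: `arms[y][x]` is a hypothetical tuple `(left, right, up, down)` of arm lengths that never reach outside the image, and the sums are left unnormalized for brevity.

```python
def aggregate_pass(cost, arms, horizontal):
    """Sum the cost along each pixel's horizontal (or vertical) arm span."""
    h, w = len(cost), len(cost[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            left, right, up, down = arms[y][x]
            if horizontal:
                out[y][x] = sum(cost[y][x - left:x + right + 1])
            else:
                out[y][x] = sum(cost[q][x] for q in range(y - up, y + down + 1))
    return out

def aggregate(cost, arms, iterations=4):
    """Odd iterations: horizontal then vertical; even iterations: the reverse."""
    for it in range(1, iterations + 1):
        first_horizontal = it % 2 == 1
        cost = aggregate_pass(cost, arms, first_horizontal)
        cost = aggregate_pass(cost, arms, not first_horizontal)
    return cost

cost = [[1.0, 2.0, 3.0]]
arms = [[(0, 1, 0, 0), (1, 1, 0, 0), (1, 0, 0, 0)]]
print(aggregate(cost, arms, iterations=1))
```

Swapping the order changes the shape of the effective support region, so the two orders see different neighborhoods around a depth edge.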


Cross-based Cost Aggregation

• Our aggregation method can better handle large textureless regions and depth discontinuities.


Cross-based Cost Aggregation

[21] K.-J. Yoon and I.-S. Kweon. “Adaptive support-weight approach for correspondence search.” IEEE TPAMI, 2006.
[23] K. Zhang, J. Lu, and G. Lafruit. “Cross-based local stereo matching using orthogonal integral images.” IEEE TCSVT, 2009.



Scanline Optimization [2]

• 4 scanline optimization processes are performed independently.
  • 2 horizontal directions
  • 2 vertical directions

[2] H. Hirschmuller. “Stereo processing by semiglobal matching and mutual information.” IEEE TPAMI, 2008.


Scanline Optimization

• r: direction
• p-r: the previous pixel along the same direction
• 𝑃1, 𝑃2: penalties on the disparity changes between neighboring pixels (𝑃1 ≤ 𝑃2) [8]

[8] S. Mattoccia, F. Tombari, and L. D. Stefano. “Stereo vision enabling precise border localization within a scanline optimization framework.” In Proc. ACCV, pages 517–527, 2007.
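The update formula is not preserved on this slide. The sketch below uses the standard semiglobal-matching recurrence from [2], which the paper appears to follow: the path cost of pixel p at disparity d is its aggregated cost plus the cheapest transition from the previous pixel p-r, minus the previous pixel's minimum so that values stay bounded. The function name `scanline_pass` and the fixed penalties `p1=1.0`, `p2=3.0` are simplifying assumptions; the paper adapts the penalties to local color differences.

```python
def scanline_pass(costs, p1=1.0, p2=3.0):
    """costs[x][d]: aggregated cost of pixel x at disparity d along one scanline.

    Returns the path costs for a single direction (increasing x).
    """
    num_disp = len(costs[0])
    path = [list(costs[0])]                 # the first pixel has no predecessor
    for x in range(1, len(costs)):
        prev = path[-1]
        prev_min = min(prev)
        row = []
        for d in range(num_disp):
            best = min(
                prev[d],                                            # same disparity
                prev[d - 1] + p1 if d > 0 else float("inf"),        # change by one level
                prev[d + 1] + p1 if d + 1 < num_disp else float("inf"),
                prev_min + p2,                                      # larger jump
            )
            row.append(costs[x][d] + best - prev_min)
        path.append(row)
    return path

path = scanline_pass([[0.0, 5.0, 5.0], [5.0, 0.0, 5.0]])
print(path[1])
```

Each pixel depends on its predecessor, so a scanline is processed sequentially, while different scanlines and disparities are independent of each other.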



Scanline Optimization
• The final cost 𝐶2(p, d) combines the path costs of the four directions.
• The disparity with the minimum 𝐶2 value is selected as pixel p’s intermediate result.
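A small sketch of this last step for one pixel, assuming (as in the published paper, to the best of my reading) that the final cost is the plain average of the four directional path costs. The names `final_cost` and `winner_take_all` are hypothetical.

```python
def final_cost(path_costs):
    """Average the per-direction costs of one pixel, disparity by disparity.

    path_costs[r][d] is the path cost for direction r at disparity d.
    """
    num_dirs = len(path_costs)
    num_disp = len(path_costs[0])
    return [sum(c[d] for c in path_costs) / num_dirs for d in range(num_disp)]

def winner_take_all(cost_per_disparity):
    """Index of the disparity with the minimum cost."""
    return min(range(len(cost_per_disparity)), key=cost_per_disparity.__getitem__)

paths = [[4.0, 1.0, 3.0], [2.0, 1.0, 5.0], [6.0, 3.0, 1.0], [4.0, 3.0, 3.0]]
c2 = final_cost(paths)
print(c2, winner_take_all(c2))
```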







Multi-step Disparity Refinement
• Outlier Handling
  • Outlier Detection
  • Iterative Region Voting
  • Proper Interpolation
• Depth Discontinuity Adjustment
• Sub-pixel Enhancement


Outlier Handling--Detection
• Outliers: pixels p for which 𝐷𝐿(p) ≠ 𝐷𝑅(p − (𝐷𝐿(p), 0))
• Outliers are further classified into occlusion and mismatch points:
  • For an outlier p, the intersection of its epipolar line with 𝐷𝑅 is checked.
  • If there is no intersection, p is labelled “occlusion”, otherwise “mismatch”.
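One way to read the two bullets above, sketched for a single scanline with integer disparities. The function name `classify_outliers`, the label strings, and the interpretation of "intersection" (some disparity k at which p would be consistent with the right disparity map) are assumptions made for illustration.

```python
def classify_outliers(d_left, d_right, max_disp):
    """Label each pixel of one row as 'ok', 'occlusion' or 'mismatch'.

    d_left[x] and d_right[x] are integer disparities of the left and right maps.
    """
    width = len(d_left)
    labels = []
    for x, d in enumerate(d_left):
        xr = x - d
        if 0 <= xr < width and d_right[xr] == d:
            labels.append("ok")                      # passes the left-right check
            continue
        # Outlier: search p's epipolar line for any disparity consistent with d_right.
        intersects = any(
            0 <= x - k < width and d_right[x - k] == k
            for k in range(max_disp + 1)
        )
        labels.append("mismatch" if intersects else "occlusion")
    return labels

labels = classify_outliers([0, 1, 1, 2], [0, 1, 0, 0], max_disp=2)
print(labels)
```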


Outlier Handling--Iterative Region Voting

• Construct cross-based regions and a robust voting scheme

• Sp :

• 𝜏𝑆, 𝜏𝐻 : threshold values

• 5 iterations
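The voting condition is not preserved on this slide. In the published paper, as far as I can tell, an outlier p receives the most frequent disparity among the reliable pixels of its cross-based region when the region holds more than 𝜏𝑆 reliable pixels and the winning disparity collects more than a fraction 𝜏𝐻 of their votes. The sketch below encodes that reading; the name `vote` and the defaults `tau_s=20`, `tau_h=0.4` (the paper's reported settings) are assumptions with respect to the slides.

```python
from collections import Counter

def vote(region_disparities, tau_s=20, tau_h=0.4):
    """Return the voted disparity for an outlier, or None if the vote is not robust.

    region_disparities: disparities of the reliable pixels in p's cross region.
    """
    s_p = len(region_disparities)
    if s_p <= tau_s:
        return None                          # too few reliable pixels
    best_d, votes = Counter(region_disparities).most_common(1)[0]
    return best_d if votes / s_p > tau_h else None

print(vote([5] * 15 + [7] * 10))
```

Pixels that receive a disparity become reliable themselves, which is why repeating the step (5 iterations on the slide) lets the filled area grow inward.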



Outlier Handling--Proper Interpolation

• Occlusion points
  • The pixel with the lowest disparity value is selected for interpolation.
  • It most likely comes from the background.
• Mismatch points
  • The pixel with the most similar color is selected for interpolation.
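The selection rule can be sketched as below. How the candidate pixels are gathered is not stated on the slide, so `candidates` is a hypothetical list of `(disparity, color)` pairs for nearby reliable pixels, and the maximum channel difference used as the color distance is likewise an assumption.

```python
def interpolate(label, pixel_color, candidates):
    """Pick a disparity for an outlier from nearby reliable pixels.

    label: 'occlusion' or 'mismatch'; candidates: (disparity, color) pairs.
    """
    if label == "occlusion":
        # Occluded pixels most likely belong to the background: lowest disparity.
        return min(d for d, _ in candidates)

    def color_distance(color):
        return max(abs(a - b) for a, b in zip(color, pixel_color))

    # Mismatches: copy the disparity of the most similar-looking candidate.
    return min(candidates, key=lambda c: color_distance(c[1]))[0]

cands = [(12, (200, 10, 10)), (4, (20, 20, 20)), (9, (100, 100, 100))]
print(interpolate("occlusion", (90, 90, 90), cands),
      interpolate("mismatch", (90, 90, 90), cands))
```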


Depth Discontinuity Adjustment
• For each pixel p on a disparity edge, two pixels p1, p2 from both sides of the edge are collected.
• 𝐷𝐿(p) is replaced by 𝐷𝐿(p1) or 𝐷𝐿(p2) if one of the two pixels has a smaller matching cost than 𝐶2(p, 𝐷𝐿(p)).
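A compact sketch of this rule, under one possible reading: the candidates 𝐷𝐿(p1) and 𝐷𝐿(p2) are evaluated in p's own cost vector, so the comparison is between 𝐶2(p, 𝐷𝐿(p1)), 𝐶2(p, 𝐷𝐿(p2)) and 𝐶2(p, 𝐷𝐿(p)). That interpretation and the name `adjust_edge_pixel` are assumptions.

```python
def adjust_edge_pixel(cost_at_p, d_p, d_p1, d_p2):
    """Return the disparity to keep for an edge pixel p.

    cost_at_p[d] is the final cost C2(p, d); d_p, d_p1, d_p2 are the current
    disparities of p and of its two neighbors across the edge.
    """
    # min() returns the first minimum, so d_p is kept unless a neighbor's
    # disparity is strictly cheaper at p.
    return min((d_p, d_p1, d_p2), key=lambda d: cost_at_p[d])

cost = [5.0, 2.0, 7.0, 4.0]
print(adjust_edge_pixel(cost, 0, 1, 2))
```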






Sub-pixel Enhancement [20]

• Quadratic polynomial interpolation
• Followed by a 3 × 3 median filter

[20] Q. Yang, L. Wang, R. Yang, H. Stewenius, and D. Nister. “Stereo matching with color-weighted correlation, hierarchical belief propagation and occlusion handling.” IEEE TPAMI, 2009.
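A sketch of both steps, assuming the usual parabola fit through the costs at d−1, d and d+1, whose minimum lies at d − (C₊ − C₋) / (2(C₊ + C₋ − 2C₀)). The function names are hypothetical, and leaving border pixels unfiltered is a simplification.

```python
from statistics import median

def subpixel_disparity(cost, d):
    """Refine integer disparity d using the costs at d - 1, d and d + 1."""
    if d <= 0 or d >= len(cost) - 1:
        return float(d)                     # no neighbors on both sides
    c_minus, c0, c_plus = cost[d - 1], cost[d], cost[d + 1]
    denom = 2.0 * (c_plus + c_minus - 2.0 * c0)
    if denom <= 0:
        return float(d)                     # flat or non-convex: keep d
    return d - (c_plus - c_minus) / denom

def median_filter_3x3(disp):
    """3 x 3 median filter; border pixels are copied unchanged."""
    h, w = len(disp), len(disp[0])
    out = [row[:] for row in disp]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = median(disp[y + dy][x + dx]
                               for dy in (-1, 0, 1) for dx in (-1, 0, 1))
    return out

cost = [(d - 2.25) ** 2 for d in range(5)]   # true minimum at 2.25
print(subpixel_disparity(cost, 2))
```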


Multi-step Disparity Refinement

• The average error percentages after performing each refinement step.


CUDA Implementation


CUDA Implementation
• Compute Unified Device Architecture (CUDA) is a programming interface for parallel computation tasks on NVIDIA graphics hardware.
• The computation task is coded into a kernel function.
• The allocation of the threads is controlled with two hierarchical concepts: grid and block.
• A kernel creates a grid with multiple blocks, and each block consists of multiple threads.

[Figure: a kernel, its grid of blocks, and the threads within a block]


CUDA Implementation
• Cost Initialization:
  • Parallelize with 𝑊 × 𝐻 threads.
  • Organize the threads into a 2D grid; the block size is set to 32 × 32.
  • Each thread computes a cost value for a pixel at a given disparity.
  • For the census transform, a square window is required for each pixel, which requires loading more data into shared memory for fast access.
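The launch arithmetic behind a 𝑊 × 𝐻 thread layout with 32 × 32 blocks can be sketched as follows. This is host-side bookkeeping written in Python for illustration only; the function names and the 450 × 375 image size are hypothetical, and the index formula mirrors the usual CUDA convention of block index times block size plus thread index.

```python
def launch_config(width, height, block=(32, 32)):
    """Grid size (in blocks) needed to cover a width x height image."""
    grid = (-(-width // block[0]), -(-height // block[1]))   # ceiling division
    return grid, block

def pixel_of_thread(block_idx, thread_idx, block=(32, 32)):
    """Pixel (x, y) handled by a thread, given its block and thread indices."""
    x = block_idx[0] * block[0] + thread_idx[0]
    y = block_idx[1] * block[1] + thread_idx[1]
    return x, y

grid, block = launch_config(450, 375)
print(grid, pixel_of_thread((14, 11), (1, 22)))
```

Because the grid is rounded up, threads whose pixel falls outside the image must return early inside the kernel.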







CUDA Implementation
• Cross-based Cost Aggregation:
  • A grid with 𝑊 × 𝐻 threads.
  • Cross construction: the block size is 𝑊 or 𝐻, to efficiently handle a scanline.
  • Cost aggregation: the block size is 32 × 32.
  • Data reuse with shared memory is considered in both steps.


CUDA Implementation
• Scanline Optimization:
  • This step is different, because the process is sequential in the scanline direction and parallel in the orthogonal direction.
  • 𝑊 × 𝐷 or 𝐻 × 𝐷 threads
• Disparity Refinement:
  • 𝑊 × 𝐻 threads


Experimental Results


Experimental Results
• Device: a PC with a Core 2 Duo 2.20 GHz CPU and an NVIDIA GeForce GTX 480 graphics card
• Parameter settings:
• Sources:
  • Middlebury
  • HHI database (book arrival)
  • Microsoft i2i database (Ilkay)


Experimental Results

• The GPU-friendly system brings an impressive 140× speedup.
• The average proportions of the GPU running time for the four computation steps are 1%, 70%, 28% and 1% respectively.
• The iterative cost aggregation step and the scanline optimization process dominate the running time.

       Tsukuba   Venus   Teddy   Cones
CPU    2.5       4.5     15      15
GPU    0.015     0.032   0.095   0.094
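The speedup figure can be checked against the table with a short calculation. The dictionary names are hypothetical; the values are copied from the table, and since the ratio is unit-free the result does not depend on the time unit.

```python
CPU = {"Tsukuba": 2.5, "Venus": 4.5, "Teddy": 15.0, "Cones": 15.0}
GPU = {"Tsukuba": 0.015, "Venus": 0.032, "Teddy": 0.095, "Cones": 0.094}

# Per-dataset ratio of CPU to GPU running time.
speedup = {name: CPU[name] / GPU[name] for name in CPU}

for name, ratio in speedup.items():
    print(f"{name}: {ratio:.1f}x")
```

Every data set reaches at least roughly 140×, with Venus the lowest, which matches the figure quoted in the first bullet.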


Experimental Results

• First row: disparity maps generated with our system.
• Second row: disparity error maps with threshold 1.
• Errors in unoccluded and occluded regions are marked in black and gray respectively.


Experimental Results
• Video


Experimental Results

Snapshots from the ‘book arrival’ stereo video


Experimental Results

Snapshots from the ‘Ilkay’ stereo video


Conclusion
• Contributions
  • Present a near real-time stereo system with accurate disparity results.
  • Combine known techniques, without sacrificing performance or parallelism, to obtain a high-quality disparity map.
• Future work
  • Improve the system so that it can be applied in real-world applications
  • Robust parameter setting methods
