

Page 1: OCCLUSION BOUNDARIES ESTIMATION FROM A HIGH-RESOLUTION SAR IMAGE · 2009-07-08 · Wenju He, Marc Jäger, and Olaf Hellwich

OCCLUSION BOUNDARIES ESTIMATION FROM A HIGH-RESOLUTION SAR IMAGE

Wenju He, Marc Jäger, and Olaf Hellwich

Berlin University of Technology, FR3-1, Franklinstr. 28, 10587 Berlin, Germany
{wenjuhe, jaeger, hellwich}@fpk.tu-berlin.de

ABSTRACT

Occlusion occurs when several objects interfere with one another in an image. The phenomenon is prevalent in high-resolution Synthetic Aperture Radar (SAR) images of urban areas. Geometric contents, which enable us to analyze occlusion, are partially observable in high-resolution SAR images. Estimating occlusion boundaries helps to discriminate different objects and localize their extents. An occlusion boundary map also corresponds to an efficient figure/ground segmentation, which is promising for further object analysis. This paper applies a hierarchical framework [1] to extract occlusion boundaries between different objects, e.g. buildings and trees. The framework uses Conditional Random Fields to reason simultaneously about boundaries and segments.

Key words: SAR; urban; occlusion; boundary.

1. INTRODUCTION

A Synthetic Aperture Radar (SAR) image is a projection of the scattering reflections of a 3D scene into a slant-range representation. Object extents, i.e. geometric information, are usually missing in SAR images. Speckle, the SAR imaging mechanism and the geographical configuration of objects make the analysis of SAR images very difficult. In contrast to optical images, SAR images do not allow objects to be reconstructed directly. However, geometric information is partially observable in high-resolution SAR images. Their application in urban environments is therefore promising, e.g. when combined with interferometric SAR data, which provide height information.

Occlusion is a common phenomenon in optical images due to the projection of the 3D scene onto the 2D image plane. Occlusion reasoning is an important aspect of intrinsic 3D understanding from a single image. This effect is handled in [1] by extracting potential occlusion boundaries. The occlusion boundaries define a figure/ground labeling. The algorithm can naturally be adjusted to strengthen the consistency of objects of interest.

SAR images are occluded in a different way. The propagation of electromagnetic waves in urban areas is complicated by the complex geometric configuration of man-made structures and their surroundings. Multiple reflections occur between objects, and electromagnetic waves obstructed by one object cannot reach some adjacent objects. Scatterers located at the bottom of a building may fall behind scatterers at the top in the image. It is therefore difficult to discriminate neighboring objects in SAR images of urban areas, and the boundaries between different objects are usually occluded. For example, buildings and trees are sometimes situated together and have similar characteristics along their boundaries. Estimating occlusion boundaries helps to discriminate different objects and localize their extents, which is very important for scene understanding from SAR images. An occlusion boundary map also corresponds to a foreground segmentation, which is promising for object analysis despite the constraints of the SAR imaging mechanism.

This paper studies the occlusion between different objects in high-resolution SAR images of urban areas. For instance, we estimate that buildings occlude trees and shadow, that trees occlude grass, and so on. We adopt an iterative strategy [1] that coherently exploits boundary strength and region characteristics to solve this difficult problem, integrating occlusion boundary estimation with segmentation. An initial segmentation is obtained by applying the watershed method to polarimetric amplitude data. The boundaries of the generated segments are potential occlusion boundaries. Weak boundaries that are unlikely to be occlusions are removed, and small regions are grouped if they share the same surface type. We adopt many effective features that characterize boundaries and regions efficiently. The boundary and region likelihoods are integrated into a Conditional Random Field (CRF) framework, which models the interaction of boundaries, junctions and regions. CRF inference outputs the occlusion boundary map.

Our goal is to find the boundaries and the occlusion relationships. The recovered occlusion boundary map shows the major occlusions in a SAR image and is therefore helpful for 3D scene understanding from a single high-resolution SAR image. An accurate occlusion boundary map also defines a high-quality segmentation. The segmentation formed by the boundaries gives an efficient figure/ground segmentation for further object analysis.

2. ALGORITHM

Occlusion boundary analysis and image segmentation are integrated and interleaved in the algorithm [1]. Segmentation provides the initial boundaries and regions. We gradually estimate occlusion boundaries by iteratively removing weak boundaries and running inference on the new segmentation. The growing segments provide better spatial support for feature extraction. After several iterations we obtain an occlusion boundary map. Each iteration consists of three steps: (1) compute multiple features for boundaries and regions; (2) infer confidences for boundaries and regions; and (3) compute a hierarchical segmentation by iteratively removing boundaries whose strength falls below a given threshold. Regions are merged to form a new segmentation. Each time a weak boundary is removed, the boundary likelihoods of the enlarged region are re-estimated. The new segmentation serves as the initial segmentation for the next iteration, enabling more complex feature extraction on the larger regions.
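The remove-and-merge step can be illustrated with a toy sketch. Everything below is illustrative rather than the paper's implementation: the segmentation is a small label array, the "occlusion likelihood" is replaced by a normalised mean-intensity gap between neighbouring regions, and `adjacent_pairs` / `merge_weak` are hypothetical helper names.

```python
import numpy as np

def adjacent_pairs(labels):
    """Unordered pairs of region labels that touch horizontally or vertically."""
    pairs = set()
    for a, b in ((labels[:, :-1], labels[:, 1:]), (labels[:-1, :], labels[1:, :])):
        diff = a != b
        pairs |= {tuple(sorted(p)) for p in zip(a[diff], b[diff])}
    return pairs

def merge_weak(labels, image, threshold):
    """Merge neighbouring regions whose boundary is weak.

    Toy stand-in for one iteration of step (3): the boundary
    'likelihood' is the normalised gap between region mean intensities,
    and weak boundaries are removed by union-find merging.
    """
    means = {l: image[labels == l].mean() for l in np.unique(labels)}
    parent = {l: l for l in means}
    def find(x):                          # union-find with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    span = image.max() - image.min() + 1e-12
    for a, b in adjacent_pairs(labels):
        if abs(means[a] - means[b]) / span < threshold:
            parent[find(a)] = find(b)     # weak boundary: merge the regions
    return np.vectorize(lambda l: find(l))(labels)
```

In the real algorithm the threshold is applied to classifier-derived occlusion likelihoods, and the likelihoods of the enlarged regions are re-estimated after each merge; the union-find merge shown here only captures the bookkeeping.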

Our estimation framework consists of three iterations. The first iteration performs minimum merging using unary likelihood estimation: boundaries with the smallest occlusion likelihoods are eliminated. In the second iteration, we use a CRF model to integrate the unary likelihood with the conditional dependency of a boundary on its preceding boundaries. In the third iteration, the CRF model is extended to model the surface evidence on both sides of each boundary. In each iteration we apply the three steps above to obtain fewer, more reliable boundaries, which are more likely to be occlusions. The new probabilistic boundary map is thresholded to give the initial segmentation for the next iteration. The third iteration produces the final occlusion likelihoods and boundaries.

2.1. Minimum merging

At the beginning, we adopt the watershed segmentation method to segment an image into small regions, which provide an initial hypothesis of the occlusion boundaries. An example is shown in Fig. 1(b). Watershed segmentation generates an over-segmentation with several thousand regions from the intensity gradients of the polarimetric SAR data. These regions provide nearly true boundaries and are conservative estimates of the occlusion boundaries. Most of the boundaries are smooth and thus facilitate efficient junction analysis.
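A minimal over-segmentation sketch in the spirit of this step, using SciPy's `watershed_ift` on the gradient magnitude of a (log-)span image. The marker-seeding heuristic (local minima of the gradient) and the name `oversegment` are assumptions for illustration, not the paper's exact procedure.

```python
import numpy as np
from scipy import ndimage as ndi

def oversegment(log_span, marker_size=5):
    """Watershed over-segmentation of a (log-)span image.

    Seeds are the local minima of the gradient magnitude, so every
    basin of the gradient surface becomes one small region.
    """
    gx, gy = np.gradient(log_span)
    grad = np.hypot(gx, gy)
    # watershed_ift requires an unsigned 8-bit cost image
    rng = grad.max() - grad.min() + 1e-12
    g8 = (255 * (grad - grad.min()) / rng).astype(np.uint8)
    # local minima of the gradient surface act as region seeds
    minima = g8 == ndi.minimum_filter(g8, size=marker_size)
    markers, n_seeds = ndi.label(minima)
    return ndi.watershed_ift(g8, markers.astype(np.int32)), n_seeds
```

On real polarimetric data this produces the several-thousand-region over-segmentation described above; the region count is controlled by the marker-filter size.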

We extract features for all boundaries and use a boundary classifier to estimate boundary likelihoods. The likelihoods are thresholded to provide a new hypothesis of the occlusion boundaries and a new segmentation. The boundaries, together with the segmentation, are the input of the second iteration.

2.2. CRF model

Both boundaries and regions indicate whether an occlusion boundary exists. On the one hand, the initial boundary map contains a large number of edges, and occlusion boundaries tend to be strong edges; we calculate strength, length and other features for boundaries. On the other hand, the initial segmentation contains many small regions, and regions with the same surface label are usually not occluded. Occlusion estimation therefore benefits from integrating boundaries and regions.

In the second and third iterations we use a CRF to model the interaction of adjacent boundaries and the surfaces on both sides. The CRF performs inference over boundaries and junctions, modeling boundary strength and enforcing closure and boundary consistency. The model is defined as

P(labels | data) = (1/Z) ∏_{j=1}^{N_j} φ_j ∏_{e=1}^{N_e} γ_e        (1)

where φ_j denotes the junction factor, γ_e denotes the surface factor, N_j is the number of junctions, N_e is the number of boundaries, and 1/Z is the normalization term.
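For a tiny number of boundaries, Eq. (1) can be evaluated directly by enumerating all labelings; the sketch below does exactly that. It is a brute-force illustration of the factor product and the normalization 1/Z, not the belief-propagation inference used in the paper.

```python
import itertools

def crf_probability(junction_factors, surface_factors, labeling):
    """P(labels | data) from Eq. (1), by brute-force normalization.

    Each factor is a callable of the full boundary labeling (a tuple of
    0/1 occlusion labels). Feasible only for toy numbers of boundaries,
    since Z sums over all 2^n labelings.
    """
    def score(lab):
        s = 1.0
        for phi in junction_factors:   # junction factors
            s *= phi(lab)
        for gamma in surface_factors:  # surface factors
            s *= gamma(lab)
        return s

    n = len(labeling)
    Z = sum(score(lab) for lab in itertools.product((0, 1), repeat=n))
    return score(tuple(labeling)) / Z
```

With one junction factor that weights label 1 on a single boundary twice as heavily as label 0, the normalized probability of label 1 is 2/3, as expected from the factor product.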

The factor graph of the CRF consists of a junction factor and a surface factor. The junction factor models the strength and continuity of boundaries, i.e. the likelihood of each boundary's label given the data, conditioned on its preceding boundaries if they exist. It consists of a unary boundary likelihood and a conditional continuity likelihood. The surface factor models the likelihood of a boundary conditioned on the region types on each side. A boundary between two regions assigned the same surface label is less likely to be an occlusion and thus has a low occlusion likelihood. We learn to detect whether a boundary between two regions is likely due to occlusion.

The CRF model achieves joint inference over the two factors. Confidences for boundaries and surfaces are computed simultaneously, which is expected to be more stable. It enforces boundary consistency: the left side is the figure and occludes the right side. The surface evidence map also helps to guarantee the consistency of object boundaries. In addition, the model can improve the surface estimation at the same time. CRF inference gives the occlusion likelihood of each boundary; boundaries with low likelihood are removed, yielding a new probabilistic boundary map.

Given a labeling of boundaries, and excluding the surface factor, the CRF model decomposes into a single likelihood term per boundary. This property allows us to learn the boundary likelihood and the conditional likelihood of the junction factor using boosted decision trees, which are able to perform feature selection and give probabilistic results. A boundary classifier and a boundary continuity classifier are trained to generate the potentials in the junction factor. Sum-product belief propagation is used for inference. The CRF outputs occlusion likelihoods of boundaries; boundaries with low likelihoods are removed.

The surface evidence maps used in the model are computed from a smaller set of low-level features by the algorithm in [2]. The maps indicate five surface types: layover, shadow, tree, grass and an unknown class. The unknown class reflects that, in meter-resolution SAR images, some regions are hard to interpret even by eye. An example surface map is shown in Fig. 1(d). The maps allow us to infer boundaries between different object types and to penalize non-occlusion boundaries. They help to enforce consistency between region labels and boundary labels.

2.3. Feature extraction

Table 1. Features extracted for a boundary.

Region features:
R1. Polarimetric entropy, anisotropy and α differences
R2. Sublook coherence and entropy differences
R3. Optimized coherence difference
R4. HH, VV and HV: amplitude differences
R5. Span image: amplitude difference
R6. Span histogram: Kullback-Leibler (KL) divergence
R7. Log span histogram: KL divergence
R8. Filter bank responses of span: differences
R9. Filter bank responses of log span: differences
R10. Texton histogram of span: KL divergence
R11. Texton histogram of log span: KL divergence
R12. HOG of span: KL divergence
R13. HOG of log span: KL divergence
R14. Dense SIFT of log span: KL divergence
R15. Area: area of region on each side, area ratio
R16. Lines: difference of line pixels
R17. Parallel lines: percentage difference
R18. Position: differences of bounding box coordinates
R19. Alignment: horizontal and vertical overlaps

Boundary features:
B1. Strength: average Pb
B2. Length: length / (perimeter of smaller side)
B3. Smoothness: length / (endpoint distance)
B4. Orientation: directed orientation
B5. Continuity: angle difference at each junction

Surface features:
S1. Surface evidences: confidences of each side
S2. Surface evidences: differences of S1
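Several of the histogram features above (R6-R14) compare the distributions of two neighbouring regions with the Kullback-Leibler divergence. The paper does not say which direction of the (asymmetric) KL divergence is used; the sketch below computes a symmetrised variant, which is a common choice for features.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """Symmetrised KL divergence between two histograms.

    Histograms are smoothed by eps and renormalised so empty bins do
    not produce infinities; the symmetrised form averages both
    directions of the (asymmetric) KL divergence.
    """
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p, q = p / p.sum(), q / q.sum()
    kl_pq = np.sum(p * np.log(p / q))
    kl_qp = np.sum(q * np.log(q / p))
    return 0.5 * (kl_pq + kl_qp)
```

Identical histograms give a divergence near zero, and the value grows as the two regions' distributions separate, which is what makes it a useful boundary-strength feature.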

We extract a rich set of features for the regions in a segmentation. The region features are used to generate boundary features. They include polarimetric, amplitude, texture, shape and other types. We believe that comprehensive feature extraction better characterizes the different objects in the images, and that more robust features emerge as regions evolve. Besides these low-level features, we also use surface evidence maps as additional cues.

We extract 204 features for each region. Polarimetric SAR data reveal more scattering physics than a single-channel image; polarimetric decomposition therefore provides informative indicators of the main scattering processes in a region. We extract polarimetric entropy, anisotropy and the α angle. Sub-aperture coherence, entropy and optimized coherence are also helpful; e.g. the most coherent scatterers are targets formed by buildings, alone or together with the ground. The amplitude of polarimetric SAR data is the most important information for discriminating different objects, since all derived products of polarimetric SAR data, e.g. coherence, are strongly influenced by intensity, i.e. reflection strength. The distribution of SAR amplitude data can be modeled by the K distribution, the log-normal distribution, and so on. For simplicity, we use features extracted from the log-span image of the polarimetric SAR data; the log features are very effective for SAR image segmentation. For the span and log-span images, we use a filter bank [3] to generate texton histograms. The histogram of oriented gradients (HOG) [4] is another effective feature, and we also apply the scale-invariant feature transform (SIFT) descriptor [5] to SAR images. Furthermore, we use area, small lines generated by a line detector [2], position and bounding box as additional region features.
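The span and log-span images referred to above can be computed as follows, assuming the standard polarimetric span definition |HH|² + 2|HV|² + |VV|² (the paper does not spell out its exact definition):

```python
import numpy as np

def log_span(hh, hv, vv):
    """Log of the polarimetric span |HH|^2 + 2|HV|^2 + |VV|^2.

    The log compresses SAR's large dynamic range, which is why
    log-span features work well for segmentation; the small constant
    guards against log(0) in shadow regions.
    """
    span = np.abs(hh) ** 2 + 2 * np.abs(hv) ** 2 + np.abs(vv) ** 2
    return np.log(span + 1e-12)
```

The inputs are the complex scattering channels; taking the magnitude first makes the function equally applicable to amplitude data.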

Boundary features are used to learn occlusions. Occlusion boundaries often have strong amplitude gradients, so the distances between the features of two neighboring regions are efficient boundary features. The probabilistic boundary map produced from the polarimetric amplitude image by the Pb algorithm [6] provides an important cue of boundary strength; an example Pb map is shown in Fig. 1(c). We use the mean Pb along the boundary pixels as a boundary feature. We also extract boundary length, smoothness, orientation and alignment continuity [1]. The 88 extracted boundary features are listed in Tab. 1. We expect boundary reasoning to benefit from these effective features.

We calculate continuity features to describe the conditional dependency of a boundary on its preceding one. The continuity features are the concatenation of the boundary features of two adjacent boundaries, plus the relative angle between them.
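The continuity-feature construction described above can be sketched as follows; wrapping the relative angle to [0, π] is an assumption about a detail the text leaves open.

```python
import numpy as np

def continuity_features(feat_a, feat_b, angle_a, angle_b):
    """Concatenate the feature vectors of two adjacent boundaries and
    append the relative angle between them, wrapped to [0, pi]."""
    rel = abs(angle_a - angle_b) % (2 * np.pi)
    rel = min(rel, 2 * np.pi - rel)   # fold into [0, pi]
    return np.concatenate([feat_a, feat_b, [rel]])
```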

3. EXPERIMENTS

3.1. Dataset

The polarimetric SAR data of Copenhagen acquired by EMISAR are used in the experiments. We extract 98 images (384×352) from the data and generate ground-truth occlusion boundaries for 41 of them, using 31 images for training and 10 images to evaluate the estimation accuracies. The ground truth contains object labels for each region. To generate it, we first segment an image into thousands of regions and manually group them into object regions; we then manually label the occlusion types of adjacent regions.

Figure 1. (a) A polarimetric SAR image, (b) watershed segmentation (4915 segments), (c) probabilistic boundaries, (d) surface evidences.
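Assigning each region the object label that covers most of its pixels, as used both in ground-truth generation here and in the label transfer between training iterations (Sec. 3.2), can be sketched as:

```python
import numpy as np

def transfer_labels(segments, ground_truth):
    """Majority-vote label transfer: each segment gets the ground-truth
    object label covering the most of its pixels."""
    out = {}
    for s in np.unique(segments):
        objs, counts = np.unique(ground_truth[segments == s], return_counts=True)
        out[s] = objs[np.argmax(counts)]  # most frequent object in the segment
    return out
```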

3.2. Training

We train three boundary classifiers and two boundary continuity classifiers over the three iterations, using a logistic regression version of AdaBoost [1]. In each iteration, the classifiers are trained on the segmentation result of the previous iteration, and the trained classifiers are then applied to the training data. We transfer the ground truth from the previous iteration to the current one in order to train new classifiers on the new regions. In the transfer process, we label each region with the object that covers the most pixels in the region, and then label the occlusion types between regions.
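A boosted classifier with a logistic output, in the spirit of the boundary classifiers used here, can be sketched with decision stumps. This is a stand-in (discrete AdaBoost with a sigmoid over the ensemble margin), not the exact "logistic regression version of AdaBoost" of [1].

```python
import numpy as np

def train_stumps(X, y, n_rounds=20):
    """Boosted decision stumps for binary labels y in {0, 1}.

    Returns a list of (feature, threshold, sign, alpha) weak learners,
    fitted with the standard AdaBoost reweighting.
    """
    y_pm = 2 * y - 1                       # map labels to {-1, +1}
    w = np.full(len(y), 1.0 / len(y))      # sample weights
    ensemble = []
    for _ in range(n_rounds):
        best = None
        for f in range(X.shape[1]):        # exhaustive stump search
            for t in np.unique(X[:, f]):
                for sign in (1, -1):
                    pred = sign * np.where(X[:, f] <= t, 1, -1)
                    err = w[pred != y_pm].sum()
                    if best is None or err < best[0]:
                        best = (err, f, t, sign, pred)
        err, f, t, sign, pred = best
        err = min(max(err, 1e-9), 1 - 1e-9)
        alpha = 0.5 * np.log((1 - err) / err)
        ensemble.append((f, t, sign, alpha))
        w *= np.exp(-alpha * y_pm * pred)  # upweight misclassified samples
        w /= w.sum()
    return ensemble

def predict_proba(ensemble, X):
    """Sigmoid of the ensemble margin: an occlusion likelihood in (0, 1)."""
    score = sum(a * s * np.where(X[:, f] <= t, 1, -1)
                for f, t, s, a in ensemble)
    return 1.0 / (1.0 + np.exp(-2.0 * score))
```

The probabilistic output is what makes thresholding (0.08, 0.12, 0.2 below) and the CRF potentials possible; a hard classifier would not compose into Eq. (1).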

In the three iterations, we set the thresholds for removing weak boundaries to 0.08, 0.12 and 0.2, respectively. Setting the thresholds is a trade-off between retaining more segments and obtaining smoother, more sensible objects.

In the second iteration, we restrict the CRF model to the junction factor; the model is extended to the full set of factors in the third iteration. We impose a penalty (e^{-0.3}) for the lack of a boundary between different surface classes, for shadow occluding other classes, for grass occluding layover or tree, and for the unknown class occluding layover or tree.

Figure 2. Precision-recall curve for classifying whether a boundary is an occlusion boundary in the first iteration.

3.3. Inference

For a test image, we apply the classifiers and the models to estimate the occlusion boundaries. The image is initially over-segmented by the watershed method. In the first iteration, we extract features for the boundaries and apply the first boundary classifier; weak boundaries are removed and a new segmentation is formed. In the second iteration, we extract boundary features and continuity features, and inference over the junction factor terms gives boundary probabilities. We perform inference over the full model in the third iteration to obtain the final occlusion likelihoods.

Fig. 3 and Fig. 4 show two examples of occlusion boundary estimation and the corresponding segmentations. Fig. 3(f) shows the segmentation result of the second iteration, which contains more segments and is slightly more accurate for small objects than the final segmentation shown in Fig. 3(e). Nonetheless, the final segmentation contains 290 fewer segments, which reduces the computational burden in further applications. This demonstrates the effectiveness of joint inference over junctions and surface evidence in the CRF model.

3.4. Evaluation

Table 2. Overall segmentation accuracy (BSS) and averaged number of segments.

Method               BSS      Number
Normalized cuts      42.48%   400
Our method, Iter. 2  43.72%   830
Our method, Iter. 3  42.33%   582

The algorithm is evaluated by measuring the accuracy of boundary classification and of the final segmentation. Fig. 2 shows the precision-recall curve for detecting whether an initial boundary is an occlusion boundary; boundaries are weighted by length in computing precision and recall. We measure the overall segmentation accuracy in terms of the best spatial support (BSS) score [7]. For each ground-truth region, the BSS is the maximum overlap score across all segments; it measures how well the best segment covers the region. The segmentation accuracy is shown in Tab. 2. The algorithm is comparable to Normalized cuts, which segments each image into 400 segments. In the Normalized cuts segmentation, only the log span of the polarimetric SAR data is used as a feature, and Euclidean distance is used in constructing the distance matrix.
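The BSS score can be sketched as follows: for each ground-truth region, take the best overlap over all segments, then average over regions. Using intersection-over-union as the overlap measure and an unweighted average are assumptions about details the text leaves open.

```python
import numpy as np

def bss_score(segments, ground_truth):
    """Best spatial support: per ground-truth region, the maximum
    intersection-over-union over all segments, averaged over regions."""
    scores = []
    for g in np.unique(ground_truth):
        gmask = ground_truth == g
        best = 0.0
        # only segments that actually overlap the region can score > 0
        for s in np.unique(segments[gmask]):
            smask = segments == s
            inter = np.logical_and(gmask, smask).sum()
            union = np.logical_or(gmask, smask).sum()
            best = max(best, inter / union)
        scores.append(best)
    return float(np.mean(scores))
```

A perfect segmentation scores 1.0; over-segmentation lowers the score because each ground-truth region is at best covered by one of several small fragments.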


Figure 3. An example boundary result: (a) original image, with RGB colors representing the HH, VV and HV channels, (b) ground-truth occlusion boundaries, (c) estimated occlusion boundaries, (d) probabilistic boundaries, (e) segmentation defined by the boundaries (629 segments), (f) segmentation result of iteration 2 (919 segments).

4. CONCLUSIONS

This paper extracts occlusion boundaries from high-resolution SAR images of urban areas. Segmentation and boundary estimation are integrated in the framework. An iterative strategy is adopted to estimate occlusion likelihoods, which are then thresholded to generate occlusion boundaries and segmentations. The growing regions provide better spatial support, which helps to determine whether a boundary is caused by occlusion. The algorithm jointly reasons about the boundaries and surfaces that influence occlusions in SAR images. The promising results obtained for boundary extraction and segmentation are applicable to further tasks, e.g. object detection. The occlusion boundary map is a probabilistic output, which can be integrated into statistical geometric models for urban scene analysis using SAR data. Occlusion boundaries will play an important role in urban scene understanding using SAR images.

Figure 4. Another example boundary result: (a) original image, (b) ground-truth occlusion boundaries, (c) estimated occlusion boundaries, (d) probabilistic boundaries, (e) segmentation defined by the boundaries (594 segments), (f) segmentation result of iteration 2 (860 segments).

REFERENCES

[1] Hoiem, D., Stein, A. N., Efros, A. A. & Hebert, M. (2007). Recovering Occlusion Boundaries from a Single Image. In International Conference on Computer Vision.

[2] Hoiem, D., Efros, A. & Hebert, M. (2007). Recovering Surface Layout from an Image. International Journal of Computer Vision, 75(1), 151-172.

[3] Varma, M. & Zisserman, A. (2005). A Statistical Approach to Texture Classification from Single Images. International Journal of Computer Vision, 62(1-2), 61-81.

[4] Dalal, N. & Triggs, B. (2005). Histograms of Oriented Gradients for Human Detection. In IEEE Conference on Computer Vision and Pattern Recognition, 2, pp. 886-893.

[5] Lowe, D. G. (2004). Distinctive Image Features from Scale-Invariant Keypoints. International Journal of Computer Vision, 60(2), 91-110.

[6] Martin, D. R., Fowlkes, C. C. & Malik, J. (2003). Learning to Detect Natural Image Boundaries Using Brightness and Texture. In Advances in Neural Information Processing Systems 15 (NIPS), pp. 1255-1262.

[7] Malisiewicz, T. & Efros, A. (2007). Improving Spatial Support for Objects via Multiple Segmentations. In British Machine Vision Conference, pp. 282-289.